From ned at nedbatchelder.com  Sun May  1 00:49:11 2011
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Sat, 30 Apr 2011 18:49:11 -0400
Subject: [Python-Dev] sys.settrace: behavior doesn't match docs
Message-ID: <4DBC91E7.9060402@nedbatchelder.com>

This week I learned something new about trace functions (how to write a 
C trace function that survives a sys.settrace(sys.gettrace()) 
round-trip), and while writing up what I learned, I was surprised to 
discover that trace functions don't behave the way I thought, or the way 
the docs say they behave.

The docs say:

    The trace function is invoked (with /event/ set to 'call') whenever
    a new local scope is entered; it should return a reference to a
    local trace function to be used that scope, or None if the scope
    shouldn't be traced.

    The local trace function should return a reference to itself (or to
    another function for further tracing in that scope), or None to turn
    off tracing in that scope.

It's that last part that's wrong: returning None from the trace function 
only has an effect on the first call in a new frame.  Once the trace 
function returns a function for a frame, returning None from subsequent 
calls is ignored.  A "local trace function" can't turn off tracing in 
its scope.

To demonstrate:

    import sys

    UPTO_LINE = 1

    def t(frame, event, arg):
         num = frame.f_lineno
         print("line %d" % num)
         if num < UPTO_LINE:
             return t

    def try_it():
         print("twelve")
         print("thirteen")
         print("fourteen")
         print("fifteen")

    UPTO_LINE = 1
    sys.settrace(t)
    try_it()

    UPTO_LINE = 13
    sys.settrace(t)
    try_it()

Produces:

    line 11
    twelve
    thirteen
    fourteen
    fifteen
    line 11
    line 12
    twelve
    line 13
    thirteen
    line 14
    fourteen
    line 15
    fifteen
    line 15

In the first call to try_it(), the trace function returns None 
immediately, preventing tracing for the rest of the function.  In the 
second call it returns None at line 13, but the rest of the function is 
traced anyway.  This behavior is the same in all versions from 2.3 to 
3.2; in fact, the 100 lines of code in sysmodule.c responsible for 
Python tracing functions are completely unchanged through those 
versions.  (A deeper mystery that I haven't looked into yet is why 
Python 3.x intersperses all of these lines with "line 18" interjections.)

I'm writing this email because I'm not sure whether this is a behavior 
bug or a doc bug.  One of them is wrong, since they disagree.  The 
documented behavior makes sense, and is what people have all along 
thought the trace function did.  The actual behavior is a bit more 
complicated to explain, but is what people have actually been 
experiencing.  FWIW, PyPy implements the documented behavior.
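
A more self-contained way to observe the difference is sketched below. 
It is only an illustration: record_events, stop_after and sample are 
names made up for this sketch, and events are collected in a list with 
line numbers relative to the function's def line, so the result does not 
depend on where the code sits in a file.  The trace function returns 
None once it has seen stop_after events.

```python
import sys


def record_events(func, stop_after):
    """Run func under a trace function that returns None once it has
    recorded stop_after events; return the (event, relative line) pairs."""
    seen = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore every frame except func's own
        seen.append((event, frame.f_lineno - code.co_firstlineno))
        if len(seen) < stop_after:
            return tracer
        return None  # the docs say this turns tracing off for the scope

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(old)  # always restore the previous trace function
    return seen


def sample():
    a = 1
    b = 2
    c = 3
    return a + b + c


# None returned on the 'call' event: the frame is not traced at all.
only_call = record_events(sample, stop_after=1)

# None returned from a later 'line' event: whether anything is recorded
# after that point depends on the implementation.
later_none = record_events(sample, stop_after=3)

print(only_call)
print(later_none)
```

With the behavior described above, later_none should keep growing past 
its third entry; an implementation that follows the docs would stop 
there.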

Should we fix the code or the docs?  I'd be glad to supply a patch for 
either.

--Ned.



From guido at python.org  Sun May  1 02:43:27 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 30 Apr 2011 17:43:27 -0700
Subject: [Python-Dev] sys.settrace: behavior doesn't match docs
In-Reply-To: <4DBC91E7.9060402@nedbatchelder.com>
References: <4DBC91E7.9060402@nedbatchelder.com>
Message-ID: <BANLkTikvTVP_-V7g86BgU9oPX3dUGE3eSw@mail.gmail.com>

I think you need to go back farther in time. :-) In Python 2.0 the
call_trace function in ceval.c has a completely different signature
(but the docs are the same). I haven't checked all history but
somewhere between 2.0 and 2.3, SET_LINENO-less tracing was added, and
that's where the implementation must have gone wrong. So I think we
should fix the code.

--Guido




-- 
--Guido van Rossum (python.org/~guido)

From techtonik at gmail.com  Sun May  1 12:40:43 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Sun, 1 May 2011 13:40:43 +0300
Subject: [Python-Dev] 2to3 status, repositories and HACKING guide
In-Reply-To: <AANLkTi=frkAceLLEtBYWDsQCKSOsVrQPv1jJ=G3h_jT1@mail.gmail.com>
References: <AANLkTi=Uh7f56LdZ_kgvwQVF7t_FYNgqpfH34DdgppNG@mail.gmail.com>
	<AANLkTingP4ZwRv=ta6RC_C59nbS6gN-hkzVm15Xdd-2Y@mail.gmail.com>
	<AANLkTi=frkAceLLEtBYWDsQCKSOsVrQPv1jJ=G3h_jT1@mail.gmail.com>
Message-ID: <BANLkTi=87D_89D67swsy4gLO6-U2MGZPSg@mail.gmail.com>

Is there any high-level overview of the 2to3 tool that people can use as a
quick start for writing their own fixers?

Source doesn't explain much (to me at least), and some kind of "learn
by example" would really help a lot. In particular, I find the syntax of
tree matchers the most unclear part.
--
anatoly t.


On Fri, Mar 25, 2011 at 9:12 PM, Benjamin Peterson <benjamin at python.org> wrote:
> The main cpython repo.
>
> 2011/3/25 anatoly techtonik <techtonik at gmail.com>:
>> Hi, Benjamin,
>>
>> Is your repository for 2to3 still current?
>> http://svn.python.org/view/sandbox/trunk/2to3/
>>
>> Which should I use to start hacking on 2to3?
>>
>> --
>> anatoly t.
>>
>>
>>
>> On Wed, Mar 23, 2011 at 9:01 AM, anatoly techtonik <techtonik at gmail.com> wrote:
>>> Hi,
>>>
>>> Currently 2to3 page at http://wiki.python.org/moin/2to3 lists
>>> http://svn.python.org/view/sandbox/trunk/2to3 as a repository for 2to3
>>> tool. There is also an outdated repository at http://hg.python.org/
>>> and the page says that the code is finally integrated into CPython 2.6
>>> - you can see it at
>>> http://hg.python.org/cpython/file/default/Lib/lib2to3. So, what
>>> version is more up-to-date?
>>>
>>> The svn repository has a HACKING guide that advises using the
>>> find_pattern.py script when writing a new fixer. However, the CPython
>>> repository has no find_pattern.py, no HACKING guide, and no
>>> documentation on how to write fixers or on the PATTERN format.
>>> Did I miss something?
>>> --
>>> anatoly t.
>>>
>>
>
>
>
> --
> Regards,
> Benjamin
>

From ncoghlan at gmail.com  Sun May  1 13:27:44 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 1 May 2011 21:27:44 +1000
Subject: [Python-Dev] Not-a-Number (was PyObject_RichCompareBool
	identity shortcut)
In-Reply-To: <BANLkTi=v98ZLbqTGSBED-MdE4V4X6JoTdg@mail.gmail.com>
References: <4DB7E3EA.3030208@avl.com>
	<BANLkTik6Fr0e=5PLNTu4x=CT+v12tt3Tsg@mail.gmail.com>
	<87d3k79jvt.fsf@uwakimon.sk.tsukuba.ac.jp>
	<BANLkTi=AusPRDsf2zKDGteZ5dGxs0EEuXw@mail.gmail.com>
	<4DB90748.4030501@g.nevcal.com>
	<BANLkTi=eAug-2n+MsQvSpaet5PM4NQDHSg@mail.gmail.com>
	<4DB916DE.1050302@g.nevcal.com>
	<BANLkTikGVfox3dXkO7B5f5iQbX5L8ypNgw@mail.gmail.com>
	<4DB927F4.3040206@dcs.gla.ac.uk> <ipchp6$1ba$1@dough.gmane.org>
	<871v0la5yg.fsf@uwakimon.sk.tsukuba.ac.jp>
	<BANLkTi=v98ZLbqTGSBED-MdE4V4X6JoTdg@mail.gmail.com>
Message-ID: <BANLkTi=dcK5fewo0bdoDgySBhOqYcxz=uQ@mail.gmail.com>

On Sat, Apr 30, 2011 at 3:11 AM, Guido van Rossum <guido at python.org> wrote:
> Decimal, for that reason, has a context that lets one specify
> different behaviors when a NaN is produced. Would it make sense to add
> a float context that also lets one specify what should happen? That
> could include returning Inf for 1.0/0.0 (for experts), or raising
> exceptions when NaNs are produced (for the numerically naive like
> myself).
>
> I could see a downside too, e.g. the correctness of code that
> passingly uses floats might be affected by the context settings.
> There's also the question of whether the float context should affect
> int operations; floats vs. ints is another can of worms since (in
> Python 3) we attempt to tie them together through 1/2 == 0.5, but ints
> have a much larger range than floats.

Given that we delegate most float() behaviour to the underlying CPU
and C libraries (and then the math module tries to cope with any
cross-platform discrepancies), introducing context handling isn't
easy, and would likely harm the current speed advantage that floats
hold over the decimal module.

We decided that losing the speed advantage of native integers was
worthwhile in order to better unify the semantics of int and long for
Py3k, but both the speed differential and the semantic gap between
float() and decimal.Decimal() are significantly larger.

However, I did find Terry's suggestion of using the warnings module to
report some of the floating point corner cases that currently silently
produce unexpected results to be an interesting one. If those
operations issued a FloatWarning, then users could either silence them
or turn them into errors as desired.
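
To make the idea concrete, here is a rough sketch of how such a warning 
could be silenced or escalated through the existing warnings machinery. 
FloatWarning does not exist in Python; both the class and the check() 
helper below are hypothetical stand-ins for whatever the float 
implementation would do internally.

```python
import math
import warnings


class FloatWarning(Warning):
    """Hypothetical category; not part of Python."""


def check(value):
    # Stand-in for the interpreter reporting a silent NaN result.
    if math.isnan(value):
        warnings.warn("operation produced NaN", FloatWarning, stacklevel=2)
    return value


inf = float("inf")

# Users who do not care can silence the warning...
with warnings.catch_warnings():
    warnings.simplefilter("ignore", FloatWarning)
    silent = check(inf - inf)  # NaN comes back, nothing is reported

# ...and users who do care can turn it into an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error", FloatWarning)
    try:
        check(inf - inf)
        raised = False
    except FloatWarning:
        raised = True

print(silent, raised)
```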

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From benjamin at python.org  Sun May  1 17:44:10 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 1 May 2011 10:44:10 -0500
Subject: [Python-Dev] 2to3 status, repositories and HACKING guide
In-Reply-To: <BANLkTi=87D_89D67swsy4gLO6-U2MGZPSg@mail.gmail.com>
References: <AANLkTi=Uh7f56LdZ_kgvwQVF7t_FYNgqpfH34DdgppNG@mail.gmail.com>
	<AANLkTingP4ZwRv=ta6RC_C59nbS6gN-hkzVm15Xdd-2Y@mail.gmail.com>
	<AANLkTi=frkAceLLEtBYWDsQCKSOsVrQPv1jJ=G3h_jT1@mail.gmail.com>
	<BANLkTi=87D_89D67swsy4gLO6-U2MGZPSg@mail.gmail.com>
Message-ID: <BANLkTi=p3epLD_F6gPDVdiL3ihGTnwm7JA@mail.gmail.com>

2011/5/1 anatoly techtonik <techtonik at gmail.com>:
> Is there any high-level overview of the 2to3 tool that people can use as a
> quick start for writing their own fixers?

No.

>
> Source doesn't explain much (to me at least), and some kind of "learn
> by example" would really help a lot. In particular, I find the syntax of
> tree matchers the most unclear part.

I think you can learn a lot by reading through the current fixers in
lib2to3/fixers/.


-- 
Regards,
Benjamin

From g.brandl at gmx.net  Sun May  1 18:31:20 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 01 May 2011 18:31:20 +0200
Subject: [Python-Dev] Issue Tracker
In-Reply-To: <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
References: <4D90EA06.3030003@stoneleaf.us>
	<AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com>
	<20110328223112.76482a9d@pitrou.net>
	<20110329013756.99EB8D64A7@kimball.webabinitio.net>
	<BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
Message-ID: <ipk1st$nfm$1@dough.gmane.org>

On 30.04.2011 16:53, anatoly techtonik wrote:
> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote:
>>
>> The hardest part is debugging the TAL when you make a mistake, but
>> even that isn't a whole lot worse than any other templating language.
> 
> How much in % is it worse than Django templating language?

I'm just guessing here, but I'd say 47.256 %.

Georg


From g.brandl at gmx.net  Sun May  1 19:57:51 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 01 May 2011 19:57:51 +0200
Subject: [Python-Dev] Python 3.2.1
Message-ID: <ipk6v4$h54$1@dough.gmane.org>

Hi,

I'd like to release Python 3.2.1 on May 21, with a release candidate
on May 14.  Please bring any issues you think need to be fixed in it
to my attention by assigning "release blocker" status in the tracker.

Georg



From raymond.hettinger at gmail.com  Sun May  1 20:22:02 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 1 May 2011 11:22:02 -0700
Subject: [Python-Dev] Python 3.2.1
In-Reply-To: <ipk6v4$h54$1@dough.gmane.org>
References: <ipk6v4$h54$1@dough.gmane.org>
Message-ID: <5D8F6095-D052-47F6-A65B-D578A4460F20@gmail.com>


On May 1, 2011, at 10:57 AM, Georg Brandl wrote:

> I'd like to release Python 3.2.1 on May 21, with a release candidate
> on May 14.  Please bring any issues you think need to be fixed in it
> to my attention by assigning "release blocker" status in the tracker.


Thanks to http://www.python.org/dev/daily-dmg/ , I've been able
to work off of the head every day.  Python 3.2.1 is in pretty good shape :-)


Raymond


From tjreedy at udel.edu  Sun May  1 20:45:06 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 01 May 2011 14:45:06 -0400
Subject: [Python-Dev] Not-a-Number (was PyObject_RichCompareBool
	identity shortcut)
In-Reply-To: <BANLkTi=dcK5fewo0bdoDgySBhOqYcxz=uQ@mail.gmail.com>
References: <4DB7E3EA.3030208@avl.com>	<BANLkTik6Fr0e=5PLNTu4x=CT+v12tt3Tsg@mail.gmail.com>	<87d3k79jvt.fsf@uwakimon.sk.tsukuba.ac.jp>	<BANLkTi=AusPRDsf2zKDGteZ5dGxs0EEuXw@mail.gmail.com>	<4DB90748.4030501@g.nevcal.com>	<BANLkTi=eAug-2n+MsQvSpaet5PM4NQDHSg@mail.gmail.com>	<4DB916DE.1050302@g.nevcal.com>	<BANLkTikGVfox3dXkO7B5f5iQbX5L8ypNgw@mail.gmail.com>	<4DB927F4.3040206@dcs.gla.ac.uk>
	<ipchp6$1ba$1@dough.gmane.org>	<871v0la5yg.fsf@uwakimon.sk.tsukuba.ac.jp>	<BANLkTi=v98ZLbqTGSBED-MdE4V4X6JoTdg@mail.gmail.com>
	<BANLkTi=dcK5fewo0bdoDgySBhOqYcxz=uQ@mail.gmail.com>
Message-ID: <ipk9nh$9l$1@dough.gmane.org>

On 5/1/2011 7:27 AM, Nick Coghlan wrote:

> However, I did find Terry's suggestion of using the warnings module to
> report some of the floating point corner cases that currently silently
> produce unexpected results to be an interesting one. If those
> operations issued a FloatWarning, then users could either silence them
> or turn them into errors as desired.

I would like to take credit for that, but I was actually seconding 
Alexander's insight and idea. I may have added the specific name after 
looking at the current list and seeing UnicodeWarning and 
BytesWarning, so why not a FloatWarning. I did read the warnings doc 
more carefully to verify that it would really put the user in control, 
which was apparently the intent of the committee.

I am not sure whether FloatWarnings should be ignored or printed by 
default. Ignored would, I guess, match current behavior, unless 
something else is changed as part of a more extensive overhaul. -f and 
-ff are available to turn ignored FloatWarning into print or raise 
exception, as with BytesWarning. I suspect that these would get at least 
as much usage as -b and -bb.

So I see 4 questions:
1. Add FloatWarning?
2. If yes, default disposition?
3. Add command line options?
4. Use the addition of FloatWarning as an opportunity to change other 
defaults, given that user will have more options?

-- 
Terry Jan Reedy


From brian.curtin at gmail.com  Sun May  1 22:51:55 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Sun, 1 May 2011 15:51:55 -0500
Subject: [Python-Dev] Windows 2000 Support
Message-ID: <BANLkTik3w5jD+dC1tx2zTjOxSXbcmkfGPw@mail.gmail.com>

I'm currently writing a post about the process of removing OS/2 and VMS
support and thought about a discussion of Windows 2000 some time back.
http://mail.python.org/pipermail/python-dev/2010-March/098074.html makes a
proposal for beginning to walk away from 2000, but doesn't appear to come to
any conclusion.

Was anything decided off the list? I don't see anything in PEP-11 and don't
see any changes in the installer made around Windows 2000.

If nothing was decided, should anything be done for 3.3?

From victor.stinner at haypocalc.com  Mon May  2 12:06:47 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 2 May 2011 12:06:47 +0200
Subject: [Python-Dev] Raise OSError or RuntimeError in the OS module?
Message-ID: <201105021206.47384.victor.stinner@haypocalc.com>

Hi,

I recently introduced the signal.pthread_sigmask() function (issue #8407). 
pthread_sigmask() (the C function) returns an error code using errno codes. I 
chose to raise a RuntimeError using this error code, but I am not sure that 
RuntimeError is the best choice. It is more an OS error than a runtime error: 
should signal.pthread_sigmask() raise an OSError instead?

signal.signal() raises a RuntimeError if setting the signal handler failed. 
signal.siginterrupt() also raises a RuntimeError on error.

signal.setitimer() and signal.getitimer() have their own exception class: 
signal.ItimerError, raised on setitimer() and getitimer() error.
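
For comparison, this is roughly what the OSError alternative would look 
like from Python code. It is a sketch only: raise_from_errno is a 
made-up helper standing in for the C-level error path, and EINVAL is 
just an example code.

```python
import errno
import os


def raise_from_errno(err):
    # An OSError built this way keeps the numeric code in .errno and
    # the message in .strerror, so callers can test for specific codes.
    raise OSError(err, os.strerror(err))


try:
    raise_from_errno(errno.EINVAL)
except OSError as exc:
    code = exc.errno
    message = exc.strerror

print(code == errno.EINVAL, message)
```

A RuntimeError carrying the same number would have no errno attribute, 
so callers would have to dig the code out of the exception arguments.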

Victor

From ned at nedbatchelder.com  Mon May  2 13:27:40 2011
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Mon, 02 May 2011 07:27:40 -0400
Subject: [Python-Dev] sys.settrace: behavior doesn't match docs
In-Reply-To: <BANLkTikvTVP_-V7g86BgU9oPX3dUGE3eSw@mail.gmail.com>
References: <4DBC91E7.9060402@nedbatchelder.com>
	<BANLkTikvTVP_-V7g86BgU9oPX3dUGE3eSw@mail.gmail.com>
Message-ID: <4DBE952C.2070005@nedbatchelder.com>

Indeed, the 2.0 code is very different, and got this case right.

I'm a little surprised no one is arguing that changing this code now 
could break some applications.  Maybe the fact no one noticed the docs 
were wrong proves that no one ever tried returning None from a local 
trace function.

--Ned.

On 4/30/2011 8:43 PM, Guido van Rossum wrote:
> I think you need to go back farther in time. :-) In Python 2.0 the
> call_trace function in ceval.c has a completely different signature
> (but the docs are the same). I haven't checked all history but
> somewhere between 2.0 and 2.3, SET_LINENO-less tracing was added, and
> that's where the implementation must have gone wrong. So I think we
> should fix the code.
>
> --Guido
>

From mhammond at skippinet.com.au  Mon May  2 14:47:11 2011
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Mon, 02 May 2011 22:47:11 +1000
Subject: [Python-Dev] sys.settrace: behavior doesn't match docs
In-Reply-To: <4DBE952C.2070005@nedbatchelder.com>
References: <4DBC91E7.9060402@nedbatchelder.com>
	<BANLkTikvTVP_-V7g86BgU9oPX3dUGE3eSw@mail.gmail.com>
	<4DBE952C.2070005@nedbatchelder.com>
Message-ID: <4DBEA7CF.4030307@skippinet.com.au>

On 2/05/2011 9:27 PM, Ned Batchelder wrote:
...
> Maybe the fact no one noticed the docs
> were wrong proves that no one ever tried returning None from a local
> trace function.

Or if they did, they should have complained by now.  IMO, if the 
behaviour regresses from how it is documented and how it previously 
worked and no reports of the regression exist, we should just fix it 
without regard to people relying on the "new" functionality...

Mark

From ncoghlan at gmail.com  Mon May  2 15:12:32 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 2 May 2011 23:12:32 +1000
Subject: [Python-Dev] sys.settrace: behavior doesn't match docs
In-Reply-To: <4DBEA7CF.4030307@skippinet.com.au>
References: <4DBC91E7.9060402@nedbatchelder.com>
	<BANLkTikvTVP_-V7g86BgU9oPX3dUGE3eSw@mail.gmail.com>
	<4DBE952C.2070005@nedbatchelder.com>
	<4DBEA7CF.4030307@skippinet.com.au>
Message-ID: <BANLkTi=_wg7KqgQXBFAOz3YoHpYvHyE-UA@mail.gmail.com>

On Mon, May 2, 2011 at 10:47 PM, Mark Hammond <mhammond at skippinet.com.au> wrote:
> On 2/05/2011 9:27 PM, Ned Batchelder wrote:
> ...
>>
>> Maybe the fact no one noticed the docs
>> were wrong proves that no one ever tried returning None from a local
>> trace function.
>
> Or if they did, they should have complained by now.  IMO, if the behaviour
> regresses from how it is documented and how it previously worked and no
> reports of the regression exist, we should just fix it without regard to
> people relying on the "new" functionality...

+1

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From vinay_sajip at yahoo.co.uk  Mon May  2 16:26:56 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Mon, 2 May 2011 14:26:56 +0000 (UTC)
Subject: [Python-Dev] Socket servers in the test suite
References: <loom.20110427T230704-75@post.gmane.org>
	<BANLkTimqCY02e+iy-OcV4nzZa1BTiC_sOQ@mail.gmail.com>
Message-ID: <loom.20110502T155417-507@post.gmane.org>

Nick Coghlan <ncoghlan <at> gmail.com> writes:

> sure the urllib tests already fire up a local server). Starting down
> the path of standardisation of that test functionality would be good.

I've made a start with test_logging.py by implementing some potential server
classes for use in tests: in the latest test_logging.py, the servers are between
comments containing the text "server_helper".

The basic approach for implementing socket servers is traditionally to use a
request handler class which implements the custom logic, but for some testing
applications this is overkill - you just want to be able to pass a handling
callable which is, say, a test case method. So the signatures of the servers are
all like this:

__init__(self, listen_addr, handler, poll_interval ...)

Initialise using the specified listen address and handler callable. Internally,
a RequestHandler subclass will be used whose handle() delegates to the handler
callable passed in.
A zero port number can be passed in, and a port attribute will (after binding)
have the actual port number used, so that clients can connect on that port.

start()

Start the server on a separate thread, using the poll_interval specified in the
underlying poll()/select() call. Before this is called, the request handler
class could be replaced with a subclass if need be.

stop(timeout=None)

Ask the server to stop and wait for the server thread to terminate.

The server also has a ready attribute which is a threading.Event, set just when
the server is entering its service loop. Typical mode of use would be:

class ClientTestCase(unittest.TestCase):
    def setUp(self):
        self.server = TheAppropriateServerClass(('localhost', 0),
                                                self.handle_request, 0.01, ...)
        self.server.start()
        self.server.ready.wait()
        self.handled = threading.Event()

    def tearDown(self):
        self.server.stop(1.0) # wait up to 1 sec for thread to stop

    def handle_request(self, request):
        # Handle the request, e.g. by setting some attributes based on what
        # was received at the server
        # Set the flag to say we finished handling
        self.handled.set()

    def test_xxx(self):
        # set up client and send stuff to server
        # Wait for server to finish doing stuff
        self.handled.wait()
        # make assertions based on the attributes
        # set during request handling

The server classes provided are TestSMTPServer, TestTCPServer, TestUDPServer and
TestHTTPServer. There are examples of actual usage in test_logging.py:
SMTPHandlerTest, SocketHandlerTest, DatagramHandlerTest, SysLogHandlerTest,
HTTPHandlerTest.
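
For readers who want to see the shape of such a class without opening 
test_logging.py, a minimal TCP-only sketch of the signature described 
above might look like the following. It is not the code from 
test_logging.py: SketchTCPServer and DelegatingHandler are names 
invented here, and it assumes the stdlib socketserver module is an 
acceptable base.

```python
import socket
import socketserver
import threading


class SketchTCPServer(socketserver.TCPServer):
    """Illustrative only; not the TestTCPServer from test_logging.py."""

    allow_reuse_address = True

    def __init__(self, listen_addr, handler, poll_interval=0.5):
        class DelegatingHandler(socketserver.StreamRequestHandler):
            def handle(self):
                # Hand the request object to the callable passed in.
                self.server.handler(self)

        super().__init__(listen_addr, DelegatingHandler)
        self.handler = handler
        self.poll_interval = poll_interval
        # With port 0 the OS picks a free port; expose the one in use.
        self.port = self.server_address[1]
        self.ready = threading.Event()
        self._thread = None

    def start(self):
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        self.ready.set()  # set just before entering the service loop
        self.serve_forever(self.poll_interval)

    def stop(self, timeout=None):
        self.shutdown()  # ask serve_forever() to exit
        self._thread.join(timeout)
        self.server_close()


received = []
handled = threading.Event()


def handle_request(request):
    # Record what arrived, then flag that handling has finished.
    received.append(request.rfile.readline())
    handled.set()


server = SketchTCPServer(("127.0.0.1", 0), handle_request, 0.01)
server.start()
server.ready.wait()
with socket.create_connection(("127.0.0.1", server.port)) as client:
    client.sendall(b"hello\n")
handled.wait(5.0)
server.stop(1.0)
print(received)
```

Requests are handled on the server thread here, which keeps the sketch 
short; a real helper might prefer a threading mix-in.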

I'd like some comments on this suggested API. I have not yet looked at how to
adapt other stdlib code than test_logging to use these classes, but the above
usage mode seems convenient and sufficient for testing applications. No doubt
people will be able to suggest problems with/improvements to the approach
outlined above.

Regards,

Vinay Sajip


From techtonik at gmail.com  Mon May  2 18:06:58 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 2 May 2011 19:06:58 +0300
Subject: [Python-Dev] Issue Tracker
In-Reply-To: <ipk1st$nfm$1@dough.gmane.org>
References: <4D90EA06.3030003@stoneleaf.us>
	<AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com>
	<20110328223112.76482a9d@pitrou.net>
	<20110329013756.99EB8D64A7@kimball.webabinitio.net>
	<BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
	<ipk1st$nfm$1@dough.gmane.org>
Message-ID: <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>

On Sun, May 1, 2011 at 7:31 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> On 30.04.2011 16:53, anatoly techtonik wrote:
>> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote:
>>>
>>> The hardest part is debugging the TAL when you make a mistake, but
>>> even that isn't a whole lot worse than any other templating language.
>>
>> How much in % is it worse than Django templating language?
>
> I'm just guessing here, but I'd say 47.256 %.

That means switching to Django templates will make Roundup design
plumbing work 47.256% more attractive for potential contributors.
--
anatoly t.

From benjamin at python.org  Mon May  2 18:17:59 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 2 May 2011 11:17:59 -0500
Subject: [Python-Dev] Issue Tracker
In-Reply-To: <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>
References: <4D90EA06.3030003@stoneleaf.us>
	<AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com>
	<20110328223112.76482a9d@pitrou.net>
	<20110329013756.99EB8D64A7@kimball.webabinitio.net>
	<BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
	<ipk1st$nfm$1@dough.gmane.org>
	<BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>
Message-ID: <BANLkTinnksUMptvmWatwDpSDS08HwJrOYw@mail.gmail.com>

2011/5/2 anatoly techtonik <techtonik at gmail.com>:
> On Sun, May 1, 2011 at 7:31 PM, Georg Brandl <g.brandl at gmx.net> wrote:
>> On 30.04.2011 16:53, anatoly techtonik wrote:
>>> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote:
>>>>
>>>> The hardest part is debugging the TAL when you make a mistake, but
>>>> even that isn't a whole lot worse than any other templating language.
>>>
>>> How much in % is it worse than Django templating language?
>>
>> I'm just guessing here, but I'd say 47.256 %.
>
> That means switching to Django templates will make Roundup design
> plumbing work 47.256% more attractive for potential contributors.

Perhaps some of those eager contributors would like to volunteer for the task.



-- 
Regards,
Benjamin

From brian.curtin at gmail.com  Mon May  2 18:19:28 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Mon, 2 May 2011 11:19:28 -0500
Subject: [Python-Dev] Issue Tracker
In-Reply-To: <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>
References: <4D90EA06.3030003@stoneleaf.us>
	<AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com>
	<20110328223112.76482a9d@pitrou.net>
	<20110329013756.99EB8D64A7@kimball.webabinitio.net>
	<BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
	<ipk1st$nfm$1@dough.gmane.org>
	<BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>
Message-ID: <BANLkTimFPoo2BDZYyAJV8m0q41oFYzbJ6A@mail.gmail.com>

On Mon, May 2, 2011 at 11:06, anatoly techtonik <techtonik at gmail.com> wrote:

> On Sun, May 1, 2011 at 7:31 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> > On 30.04.2011 16:53, anatoly techtonik wrote:
> >> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com>
> wrote:
> >>>
> >>> The hardest part is debugging the TAL when you make a mistake, but
> >>> even that isn't a whole lot worse than any other templating language.
> >>
> >> How much in % is it worse than Django templating language?
> >
> > I'm just guessing here, but I'd say 47.256 %.
>
> That means switching to Django templates will make Roundup design
> plumbing work 47.256% more attractive for potential contributors.


What if these "potential contributors" never surface? Then we've made
a 47.256% change in attractiveness, which is a 1423.843% waste of time.

From techtonik at gmail.com  Mon May  2 19:14:50 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 2 May 2011 20:14:50 +0300
Subject: [Python-Dev] PEP 386 and dev repository versions workflow
Message-ID: <BANLkTi=0NAuSRM=MELYAiko45BnA=u-HLw@mail.gmail.com>

http://guide.python-distribute.org/quickstart.html proposes suffixing
the version of a module in the repository with 'dev', so that after the
release of version '1.0' the repository version becomes '2.0dev'.
This makes sense, but it is not compatible with PEP 386, which
suggests using 2.0.devN, where N is a repository revision number.
I'd expand PEP 386 to include the 2.0dev use case.
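
To illustrate the difference, here is a rough sketch. The regular
expression is a simplified reading of the pseudo-format given in PEP 386
(N.N[.N]+[{a|b|c|rc}N[.N]+][.postN][.devN]); it is an assumption made for
this example, not the verlib implementation, and the name
looks_like_pep386 is made up.

```python
import re

# Simplified approximation of the PEP 386 pseudo-format
# (an assumption for illustration, not the verlib code).
PEP386_LIKE = re.compile(
    r"^\d+\.\d+(\.\d+)*"            # release part: N.N[.N]+
    r"((a|b|c|rc)\d+(\.\d+)*)?"     # optional pre-release part
    r"(\.post\d+)?"                 # optional post-release marker
    r"(\.dev\d+)?$"                 # optional dev marker, number required
)


def looks_like_pep386(version):
    """Return True if the string fits the simplified pattern above."""
    return PEP386_LIKE.match(version) is not None


for candidate in ("1.0", "2.0.dev42", "2.0dev"):
    print(candidate, looks_like_pep386(candidate))
```

Under this reading '2.0.dev42' is accepted while '2.0dev' is not, which
is the mismatch with the guide.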

--
anatoly t.

From ziade.tarek at gmail.com  Mon May  2 19:19:28 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 2 May 2011 19:19:28 +0200
Subject: [Python-Dev] PEP 386 and dev repository versions workflow
In-Reply-To: <BANLkTi=0NAuSRM=MELYAiko45BnA=u-HLw@mail.gmail.com>
References: <BANLkTi=0NAuSRM=MELYAiko45BnA=u-HLw@mail.gmail.com>
Message-ID: <BANLkTindgivLDc=-yPOf75v1iVuKUnjGCw@mail.gmail.com>

On Mon, May 2, 2011 at 7:14 PM, anatoly techtonik <techtonik at gmail.com> wrote:
> http://guide.python-distribute.org/quickstart.html proposes suffixing
> version of a module in repository with 'dev' in a way that after
> release of '1.0' version, the repository version is changed to
> '2.0dev'. This makes sense, but it is not compatible with PEP 386,
> which suggests using 2.0.devN, where N is a repository revision
> number. I'd expand PEP 386 to include 2.0dev use case.

This is a typo that I'll fix. Thanks for noticing.


> --
> anatoly t.
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ziade.tarek%40gmail.com
>



-- 
Tarek Ziadé | http://ziade.org

From g.rodola at gmail.com  Mon May  2 20:27:57 2011
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Mon, 2 May 2011 20:27:57 +0200
Subject: [Python-Dev] Issue Tracker
In-Reply-To: <BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
References: <4D90EA06.3030003@stoneleaf.us>
	<AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com>
	<20110328223112.76482a9d@pitrou.net>
	<20110329013756.99EB8D64A7@kimball.webabinitio.net>
	<BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
Message-ID: <BANLkTimW43h6PUFm8U4bzC+mO1Dsrzzm9Q@mail.gmail.com>

2011/4/30 anatoly techtonik <techtonik at gmail.com>:
> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote:
>>
>> The hardest part is debugging the TAL when you make a mistake, but
>> even that isn't a whole lot worse than any other templating language.
>
> How much in % is it worse than Django templating language?
> --
> anatoly t.

Knowing both of them, I can say ZPT is one of the few things I like
about Zope, and I find it a lot more powerful than the Django templating
system.
Other than that, I don't see how changing the templating language can
make any difference.
If one does not contribute something because of the language used in
templates... well, I think it wouldn't have been a particularly good
contribution anyway. =)


--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/

From g.brandl at gmx.net  Mon May  2 20:41:12 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 02 May 2011 20:41:12 +0200
Subject: [Python-Dev] Issue Tracker
In-Reply-To: <BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>
References: <4D90EA06.3030003@stoneleaf.us>
	<AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com>
	<20110328223112.76482a9d@pitrou.net>
	<20110329013756.99EB8D64A7@kimball.webabinitio.net>
	<BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
	<ipk1st$nfm$1@dough.gmane.org>
	<BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>
Message-ID: <ipmtsc$ao9$1@dough.gmane.org>

On 02.05.2011 18:06, anatoly techtonik wrote:
> On Sun, May 1, 2011 at 7:31 PM, Georg Brandl <g.brandl at gmx.net> wrote:
>> On 30.04.2011 16:53, anatoly techtonik wrote:
>>> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote:
>>>>
>>>> The hardest part is debugging the TAL when you make a mistake, but
>>>> even that isn't a whole lot worse than any other templating language.
>>>
>>> How much in % is it worse than Django templating language?
>>
>> I'm just guessing here, but I'd say 47.256 %.
> 
> That means switching to Django templates will make Roundup design
> plumbing work 47.256% more attractive for potential contributors.

That's not true actually.

It'll be 89.595 % more attractive.

Georg



From sijinjoseph at gmail.com  Mon May  2 17:27:49 2011
From: sijinjoseph at gmail.com (Sijin Joseph)
Date: Mon, 2 May 2011 11:27:49 -0400
Subject: [Python-Dev] Convert Py_Buffer to Py_UNICODE
Message-ID: <BANLkTi==+x8SChR6y=wEjBQxhsjXEciWeQ@mail.gmail.com>

Hi - I am working on a patch where I have an argument that can be either a
unicode string or binary data. I parse the argument with PyArg_ParseTuple,
using the s* format specification, and get a Py_buffer.

I now need to convert this Py_buffer to a Py_UNICODE string and pass it into
a function. What is the best way to do this? If another flag parameter tells
me that the passed argument was binary, I pass Py_buffer->buf as a pointer to
the start of the data.

This is in the winsound module; here's the relevant code snippet:

sound_playsound(PyObject *s, PyObject *args)
{
    Py_buffer *buffer;
    int flags;
    int ok;
    LPCWSTR pszSound;

    if (PyArg_ParseTuple(args, "s*i:PlaySound", &buffer, &flags)) {
        if (flags & SND_ASYNC && flags & SND_MEMORY) {
            /* Sidestep reference counting headache; unfortunately this also
               prevent SND_LOOP from memory. */
            PyBuffer_Release(buffer);
            PyErr_SetString(PyExc_RuntimeError, "Cannot play asynchronously from memory");
            return NULL;
        }

        if(flags & SND_MEMORY) {
            pszSound = buffer->buf;
        }
        else {
            /* pszSound = ????; */
        }

-- Sijin

From mal at egenix.com  Mon May  2 21:12:27 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 02 May 2011 21:12:27 +0200
Subject: [Python-Dev] Convert Py_Buffer to Py_UNICODE
In-Reply-To: <BANLkTi==+x8SChR6y=wEjBQxhsjXEciWeQ@mail.gmail.com>
References: <BANLkTi==+x8SChR6y=wEjBQxhsjXEciWeQ@mail.gmail.com>
Message-ID: <4DBF021B.90602@egenix.com>

Sijin Joseph wrote:
> Hi - I am working on a patch where I have an argument that can either be a
> unicode string or binary data, I parse the argument using the
> PyArg_ParseTuple method using the s* format specification and get a
> Py_Buffer.
> 
> I now need to convert this Py_Buffer object to a Py_Unicode and pass it into
> a function. What is the best way to do this? If I determine that the passed
> argument was binary using another flag parameter then I am passing
> Py_Buffer->buf as a pointer to the start of the data.

I don't understand why you'd want to convert PyUnicode to PyBytes
(encoded as UTF-8), only to decode it again afterwards in order
to pass it to some other PyUnicode API.

It'd be more efficient to use the "O" parser marker and then
use PyObject_GetBuffer() to convert non-PyUnicode objects to
a Py_buffer.
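
At the Python level, the same dispatch can be sketched roughly as
follows: check for a unicode string first, and only ask for a buffer
otherwise. This is an analogy rather than the C code; memoryview()
stands in for PyObject_GetBuffer(), and the function name
describe_sound_argument is hypothetical.

```python
def describe_sound_argument(obj):
    """Return ('unicode', obj) for str, ('buffer', memoryview) otherwise."""
    if isinstance(obj, str):
        # Already unicode: no encode/decode round trip is needed.
        return ("unicode", obj)
    # memoryview() requests the object's buffer and raises TypeError
    # if the object does not support the buffer protocol.
    return ("buffer", memoryview(obj))


kind, value = describe_sound_argument("SystemExit")
kind2, view = describe_sound_argument(b"RIFF....WAVE")
print(kind, kind2, len(view))
```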

> This is in winsound module, here's the relevant code snippet
> 
> sound_playsound(PyObject *s, PyObject *args)
> {
>     Py_buffer *buffer;
>     int flags;
>     int ok;
>     LPCWSTR pszSound;
> 
>     if (PyArg_ParseTuple(args, "s*i:PlaySound", &buffer, &flags)) {
>         if (flags & SND_ASYNC && flags & SND_MEMORY) {
>             /* Sidestep reference counting headache; unfortunately this also
>                prevent SND_LOOP from memory. */
>             PyBuffer_Release(buffer);
>             PyErr_SetString(PyExc_RuntimeError, "Cannot play asynchronously from memory");
>             return NULL;
>         }
> 
>         if(flags & SND_MEMORY) {
>             pszSound = buffer->buf;
>         }
>         else {
>             /* pszSound = ????; */
>         }
> 
> -- Sijin

-- 
Marc-Andre Lemburg
eGenix.com


From benjamin at python.org  Mon May  2 21:25:44 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 2 May 2011 14:25:44 -0500
Subject: [Python-Dev] Issue Tracker
In-Reply-To: <ipmtsc$ao9$1@dough.gmane.org>
References: <4D90EA06.3030003@stoneleaf.us>
	<AANLkTikK=4Js-4Z2NRgmkhhkfKX_CufXTi3E0A2MhTPe@mail.gmail.com>
	<20110328223112.76482a9d@pitrou.net>
	<20110329013756.99EB8D64A7@kimball.webabinitio.net>
	<BANLkTi=ppYhHd4hAHMGeByTN1aUcBF2WNg@mail.gmail.com>
	<ipk1st$nfm$1@dough.gmane.org>
	<BANLkTikX_vTjy09x35mWBDb2P_aqRFsMQg@mail.gmail.com>
	<ipmtsc$ao9$1@dough.gmane.org>
Message-ID: <BANLkTimwmUJvwwzMz=C4jdiGOPD3_ABrQw@mail.gmail.com>

2011/5/2 Georg Brandl <g.brandl at gmx.net>:
> On 02.05.2011 18:06, anatoly techtonik wrote:
>> On Sun, May 1, 2011 at 7:31 PM, Georg Brandl <g.brandl at gmx.net> wrote:
>>> On 30.04.2011 16:53, anatoly techtonik wrote:
>>>> On Tue, Mar 29, 2011 at 4:37 AM, R. David Murray <rdmurray at bitdance.com> wrote:
>>>>>
>>>>> The hardest part is debugging the TAL when you make a mistake, but
>>>>> even that isn't a whole lot worse than any other templating language.
>>>>
>>>> How much in % is it worse than Django templating language?
>>>
>>> I'm just guessing here, but I'd say 47.256 %.
>>
>> That means switching to Django templates will make Roundup design
>> plumbing work 47.256% more attractive for potential contributors.
>
> That's not true actually.
>
> It'll be 89.595 % more attractive.

I don't understand why you're truncating to 3 digits. Let's be honest
in that it will be sqrt(2)^(13e/2) % more attractive.


-- 
Regards,
Benjamin

From tjreedy at udel.edu  Mon May  2 22:49:54 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 02 May 2011 16:49:54 -0400
Subject: [Python-Dev] running/stepping python backwards
In-Reply-To: <BANLkTinSQtdpOVKn0GhH4=cP6NnhGgOD0A@mail.gmail.com>
References: <BANLkTinSQtdpOVKn0GhH4=cP6NnhGgOD0A@mail.gmail.com>
Message-ID: <4DBF18F2.9040202@udel.edu>

On 4/29/2011 10:13 PM, Adrian Johnston wrote:
> This may seem like an odd question, but I'm intrigued by the idea of
> using Python as a data definition language with "undo" support.
>
> If I were to try and instrument the Python interpreter to be able to
> step backwards, would that be an unduly difficult or inefficient thing
> to do?

The pydev list is for development of the next version of Python. Please 
direct your question to a more appropriate forum such as python-list.

 > (Please reply to me directly.)

I did this time, but you should not expect that when posting to a public 
list.

-- 
Terry Jan Reedy



From martin at v.loewis.de  Mon May  2 23:14:06 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 02 May 2011 23:14:06 +0200
Subject: [Python-Dev] Windows 2000 Support
In-Reply-To: <BANLkTik3w5jD+dC1tx2zTjOxSXbcmkfGPw@mail.gmail.com>
References: <BANLkTik3w5jD+dC1tx2zTjOxSXbcmkfGPw@mail.gmail.com>
Message-ID: <4DBF1E9E.5000006@v.loewis.de>

On 01.05.2011 22:51, Brian Curtin wrote:
> I'm currently writing a post about the process of removing OS/2 and VMS
> support and thought about a discussion of Windows 2000 some time
> back. http://mail.python.org/pipermail/python-dev/2010-March/098074.html makes
> a proposal for beginning to walk away from 2000, but doesn't appear to
> come to any conclusion.
> 
> Was anything decided off the list? I don't see anything in PEP-11 and
> don't see any changes in the installer made around Windows 2000.

That's what you get for not following your own processes. It seems the
discussion just stopped, with no action. I vaguely recall having made
changes to the installer to produce a warning, but apparently never
got to commit these changes.

> If nothing was decided, should anything be done for 3.3?

Most certainly. It seems we missed the chance of dropping support for
W2k, so we still can't actively remove any code. However, I'd

a) add it to PEP 11, and
b) add a warning to the installer

I stand by

http://mail.python.org/pipermail/python-dev/2010-March/098101.html

i.e. if there are patches that happen not to work on W2k, I'd accept
them anyway - anybody interested in W2k would then have to provide
fixes before 3.3rc1.

So please go ahead and change PEP 11. While you are at it, also threaten
to remove support for systems where the COMSPEC points to command.com
(#2405).

Regards,
Martin

From drsalists at gmail.com  Mon May  2 23:19:38 2011
From: drsalists at gmail.com (Dan Stromberg)
Date: Mon, 2 May 2011 14:19:38 -0700
Subject: [Python-Dev] running/stepping python backwards
In-Reply-To: <4DBF18F2.9040202@udel.edu>
References: <BANLkTinSQtdpOVKn0GhH4=cP6NnhGgOD0A@mail.gmail.com>
	<4DBF18F2.9040202@udel.edu>
Message-ID: <BANLkTimSg33jOErPr_+9D7=0z47Vw_2KYw@mail.gmail.com>

On Mon, May 2, 2011 at 1:49 PM, Terry Reedy <tjreedy at udel.edu> wrote:

> > (Please reply to me directly.)
>
> I did this time, but you should not expect that when posting to a public
> list.


Actually, this is not only appropriate on some lists, on some lists one is
actually strongly discouraged from doing anything else.

E.g. sun-managers, where replies are expected to be private, and the
originator of the thread is expected to collect all (private) replies and
summarize them, to keep the list traffic low and the S/N ratio high.

From barry at python.org  Tue May  3 00:35:20 2011
From: barry at python.org (Barry Warsaw)
Date: Mon, 2 May 2011 18:35:20 -0400
Subject: [Python-Dev] Python 2.6.7 schedule
Message-ID: <20110502183520.1c9efdc0@neurotica.wooz.org>

I'd like to make a Python 2.6.7 release candidate this Friday, May 6, with a
final release scheduled for May 20.  I've put these dates on the Python
Release Schedule calendar.

This will be a source-only security release.  I see no release blockers for
Python 2.6, so if you know of anything that must go into 2.6.7, please be sure
there is a tracker issue for it, that 2.6 is marked as being affected, and
that it has release blocker priority.

Cheers,
-Barry

From martin at v.loewis.de  Tue May  3 01:09:42 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 03 May 2011 01:09:42 +0200
Subject: [Python-Dev] Fwd: viewVC shows traceback on non utf-8 module
 markup
In-Reply-To: <4DBB19A5.4010409@voidspace.org.uk>
References: <4DBB19A5.4010409@voidspace.org.uk>
Message-ID: <4DBF39B6.3050100@v.loewis.de>

On 29.04.2011 22:03, Michael Foord wrote:
> I know that the svn repo is now for legacy purposes only, but I doubt it
> is intended that the online source browser should raise exceptions.

It's certainly not. However, I don't plan to do anything about it,
either (nor would I know that anybody else would). To view the source
code of the file, use

http://svn.python.org/view/python/trunk/Lib/heapq.py?view=co&content-type=text/plain

Regards,
Martin

From brian.curtin at gmail.com  Tue May  3 02:39:33 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Mon, 2 May 2011 19:39:33 -0500
Subject: [Python-Dev] Windows 2000 Support
In-Reply-To: <4DBF1E9E.5000006@v.loewis.de>
References: <BANLkTik3w5jD+dC1tx2zTjOxSXbcmkfGPw@mail.gmail.com>
	<4DBF1E9E.5000006@v.loewis.de>
Message-ID: <BANLkTi=OFubGKXv0nXaM+aNGxda2_vwtPg@mail.gmail.com>

On Mon, May 2, 2011 at 16:14, "Martin v. Löwis" <martin at v.loewis.de> wrote:

> On 01.05.2011 22:51, Brian Curtin wrote:
> > I'm currently writing a post about the process of removing OS/2 and VMS
> > support and thought about a discussion of Windows 2000 some time
> > back. http://mail.python.org/pipermail/python-dev/2010-March/098074.html makes
> > a proposal for beginning to walk away from 2000, but doesn't appear to
> > come to any conclusion.
> >
> > Was anything decided off the list? I don't see anything in PEP-11 and
> > don't see any changes in the installer made around Windows 2000.
>
> That's what you get for not following your own processes. It seems the
> discussion just stopped, with no action. I vaguely recall having made
> changes to the installer to produce a warning, but apparently never
> got to commit these changes.
>
> > If nothing was decided, should anything be done for 3.3?
>
> Most certainly. It seems we missed the chance of dropping support for
> W2k, so we still can't actively remove any code. However, I'd
>
> a) add it to PEP 11, and
> b) add a warning to the installer
>
> I stand by
>
> http://mail.python.org/pipermail/python-dev/2010-March/098101.html
>
> i.e. if there are patches that happen not to work on W2k, I'd accept
> them anyway - anybody interested in W2k would then have to provide
> fixes before 3.3rc1.
>
> So please go ahead and change PEP 11. While you are at it, also threaten
> to remove support for systems where the COMSPEC points to command.com
> (#2405).
>

Done and done - http://hg.python.org/peps/rev/b9390aa12855
I'll have a look at the installer and add some type of message.

From nadeem.vawda at gmail.com  Tue May  3 16:22:27 2011
From: nadeem.vawda at gmail.com (Nadeem Vawda)
Date: Tue, 3 May 2011 16:22:27 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <E1QHFVm-0002kp-TV@dinsdale.python.org>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>
Message-ID: <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>

On Tue, May 3, 2011 at 3:19 PM, victor.stinner
<python-checkins at python.org> wrote:
> +# Issue #10276 - check that inputs of 2 GB are handled correctly.
> +# Be aware of issues #1202, #8650, #8651 and #10276
> +class ChecksumBigBufferTestCase(unittest.TestCase):
> +    int_max = 0x7FFFFFFF
> +
> +    @unittest.skipUnless(mmap, "mmap() is not available.")
> +    def test_big_buffer(self):
> +        if sys.platform[:3] == 'win' or sys.platform == 'darwin':
> +            requires('largefile',
> +                     'test requires %s bytes and a long time to run' %
> +                     str(self.int_max))
> +        try:
> +            with open(TESTFN, "wb+") as f:
> +                f.seek(self.int_max-4)
> +                f.write("asdf")
> +                f.flush()
> +                try:
> +                    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
> +                    self.assertEqual(zlib.crc32(m), 0x709418e7)
> +                    self.assertEqual(zlib.adler32(m), -2072837729)
> +                finally:
> +                    m.close()
> +        except (IOError, OverflowError):
> +            raise unittest.SkipTest("filesystem doesn't have largefile support")
> +        finally:
> +            unlink(TESTFN)
> +
> +

0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be
0x80000000. However, if you make this change, crc32() and adler32()
raise OverflowErrors (see changeset a0681e7a6ded). This makes the test
erroneously report that the filesystem doesn't support large files.
The assertEqual() tests should probably be changed to
assertRaises(..., OverflowError).
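
The arithmetic behind the first point, as a quick check. This uses
plain Python integers and nothing specific to zlib; the names INT_MAX
and TWO_GB are just labels for this example.

```python
INT_MAX = 0x7FFFFFFF          # largest signed 32-bit value
TWO_GB = 2 * 1024 ** 3        # 2 GB counted in bytes

# 0x7FFFFFFF is 2**31 - 1, i.e. one byte short of 2 GB.
print(INT_MAX == 2 ** 31 - 1)
print(TWO_GB == 0x80000000)
print(TWO_GB - INT_MAX)
```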

Also, the assignment to m needs to be moved outside of the inner
try...finally block. If mmap() fails, the call to m.close() raises a
new exception because m has not yet been bound. This seems to be
causing failures on some of the 32-bit buildbots.
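
A minimal sketch of the second problem, independent of mmap. Resource
is a made-up stand-in for mmap.mmap(), and broken()/fixed() are
hypothetical names; the point is only where the assignment sits
relative to the try...finally block.

```python
class Resource:
    """Stand-in for mmap.mmap(); can be told to fail on creation."""

    def __init__(self, fail=False):
        if fail:
            raise OSError("cannot map file")
        self.closed = False

    def close(self):
        self.closed = True


def broken(fail):
    try:
        m = Resource(fail)   # if this raises, m is never bound
        return "ok"
    finally:
        m.close()            # unbound name: a new exception replaces the OSError


def fixed(fail):
    m = Resource(fail)       # a failure here propagates unchanged
    try:
        return "ok"
    finally:
        m.close()


def outcome(func, fail):
    """Run func and report either its result or the exception type name."""
    try:
        return func(fail)
    except Exception as exc:
        return type(exc).__name__


print(outcome(broken, True), outcome(fixed, True))
```

With the assignment inside the block, the caller sees the error from
m.close() rather than the original one, so an except clause written for
the original error no longer matches.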

As an aside, in this sort of situation is it better to just go and
commit a fix myself, or is raising it on the mailing list first the
right way to do things?

Cheers,
Nadeem

From g.brandl at gmx.net  Tue May  3 20:30:22 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 03 May 2011 20:30:22 +0200
Subject: [Python-Dev] Raise OSError or RuntimeError in the OS module?
In-Reply-To: <201105021206.47384.victor.stinner@haypocalc.com>
References: <201105021206.47384.victor.stinner@haypocalc.com>
Message-ID: <ipphk3$iui$2@dough.gmane.org>

On 02.05.2011 12:06, Victor Stinner wrote:
> Hi,
> 
> I recently introduced the signal.pthread_sigmask() function (issue #8407).
> pthread_sigmask() (the C function) returns an error code using errno codes. I
> chose to raise a RuntimeError using this error code, but I am not sure that
> RuntimeError is the best choice. It is more an OS error than a runtime error:
> should signal.pthread_sigmask() raise an OSError instead?
> 
> signal.signal() raises a RuntimeError if setting the signal handler failed.
> signal.siginterrupt() also raises a RuntimeError on error.
> 
> signal.setitimer() and signal.getitimer() have their own exception class:
> signal.ItimerError, raised on setitimer() and getitimer() error.

If it has an errno, it should be a subclass of EnvironmentError.
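
A small sketch of what that buys the caller. failing_call() is a
made-up stand-in for a wrapper around a C function that reports an
errno; it is not the signal module's code.

```python
import errno
import os


def failing_call():
    # Stand-in for a wrapper that turns a C-level errno into an exception.
    raise OSError(errno.EINVAL, os.strerror(errno.EINVAL))


try:
    failing_call()
except EnvironmentError as exc:
    # The errno travels with the exception, so callers can dispatch on it.
    caught_errno = exc.errno
    print(errno.errorcode[caught_errno], exc.strerror)
```

A RuntimeError carries no errno attribute, so the caller would have to
parse the message to find out what went wrong.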

Georg



From brian.curtin at gmail.com  Tue May  3 20:39:40 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Tue, 3 May 2011 13:39:40 -0500
Subject: [Python-Dev] Windows 2000 Support
In-Reply-To: <BANLkTi=OFubGKXv0nXaM+aNGxda2_vwtPg@mail.gmail.com>
References: <BANLkTik3w5jD+dC1tx2zTjOxSXbcmkfGPw@mail.gmail.com>
	<4DBF1E9E.5000006@v.loewis.de>
	<BANLkTi=OFubGKXv0nXaM+aNGxda2_vwtPg@mail.gmail.com>
Message-ID: <BANLkTi=9tQiFZbGFSBsXro=bXaggxSiR9g@mail.gmail.com>

On Mon, May 2, 2011 at 19:39, Brian Curtin <brian.curtin at gmail.com> wrote:

> On Mon, May 2, 2011 at 16:14, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>
>> On 01.05.2011 22:51, Brian Curtin wrote:
>> > I'm currently writing a post about the process of removing OS/2 and VMS
>> > support and thought about a discussion of Windows 2000 some time
>> > back.
>> http://mail.python.org/pipermail/python-dev/2010-March/098074.html makes
>> > a proposal for beginning to walk away from 2000, but doesn't appear to
>> > come to any conclusion.
>> >
>> > Was anything decided off the list? I don't see anything in PEP-11 and
>> > don't see any changes in the installer made around Windows 2000.
>>
>> That's what you get for not following your own processes. It seems the
>> discussion just stopped, with no action. I vaguely recall having made
>> changes to the installer to produce a warning, but apparently never
>> got to commit these changes.
>>
>> > If nothing was decided, should anything be done for 3.3?
>>
>> Most certainly. It seems we missed the chance of dropping support for
>> W2k, so we still can't actively remove any code. However, I'd
>>
>> a) add it to PEP 11, and
>> b) add a warning to the installer
>>
>> I stand by
>>
>> http://mail.python.org/pipermail/python-dev/2010-March/098101.html
>>
>> i.e. if there are patches that happen not to work on W2k, I'd accept
>> them anyway - anybody interested in W2k would then have to provide
>> fixes before 3.3rc1.
>>
>> So please go ahead and change PEP 11. While you are at it, also threaten
>> to remove support for systems where the COMSPEC points to command.com
>> (#2405).
>>
>
> Done and done - http://hg.python.org/peps/rev/b9390aa12855
> I'll have a look at the installer and add some type of message.
>

It turns out that you did make the change at some point for 2.7 being the
last, but there was no corresponding 3.x version chosen.
http://hg.python.org/cpython/rev/de53c52fbcbf changed the installer to list
3.3.0 as the last Windows 2000 release on the default branch.

From solipsis at pitrou.net  Tue May  3 20:57:47 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 3 May 2011 20:57:47 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>
	<BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>
Message-ID: <20110503205747.65a76522@pitrou.net>


Hello,

On Tue, 3 May 2011 16:22:27 +0200
Nadeem Vawda <nadeem.vawda at gmail.com> wrote:
> 
> As an aside, in this sort of situation is it better to just go and
> commit a fix myself, or is raising it on the mailing list first the
> right way to do things?

Raising it on the mailing-list makes it serve as a kind of post-commit
review. Also, it ensures that the committer of the original patch
understands the issues with it.

cheers

Antoine.



From victor.stinner at haypocalc.com  Tue May  3 22:38:43 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 03 May 2011 22:38:43 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>
	<BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>
Message-ID: <1304455123.1971.5.camel@marge>

On Tuesday, 3 May 2011 at 16:22 +0200, Nadeem Vawda wrote:
> On Tue, May 3, 2011 at 3:19 PM, victor.stinner
> <python-checkins at python.org> wrote:
> > +# Issue #10276 - check that inputs of 2 GB are handled correctly.
> > +# Be aware of issues #1202, #8650, #8651 and #10276
> > +class ChecksumBigBufferTestCase(unittest.TestCase):
> > +    int_max = 0x7FFFFFFF
> > +
> > +    @unittest.skipUnless(mmap, "mmap() is not available.")
> > +    def test_big_buffer(self):
> > +        if sys.platform[:3] == 'win' or sys.platform == 'darwin':
> > +            requires('largefile',
> > +                     'test requires %s bytes and a long time to run' %
> > +                     str(self.int_max))
> > +        try:
> > +            with open(TESTFN, "wb+") as f:
> > +                f.seek(self.int_max-4)
> > +                f.write("asdf")
> > +                f.flush()
> > +                try:
> > +                    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
> > +                    self.assertEqual(zlib.crc32(m), 0x709418e7)
> > +                    self.assertEqual(zlib.adler32(m), -2072837729)
> > +                finally:
> > +                    m.close()
> > +        except (IOError, OverflowError):
> > +            raise unittest.SkipTest("filesystem doesn't have largefile support")
> > +        finally:
> > +            unlink(TESTFN)
> > +
> > +
> 
> 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be
> 0x80000000. However, if you make this change, crc32() and adler32()
> raise OverflowErrors (see changeset a0681e7a6ded).

I don't want to check OverflowError: the test is supposed to compute the
checksum of a buffer of 0x7FFFFFFF bytes, to check crc32() and
adler32(). 0x7FFFFFFF is the biggest size supported by these functions
(zlib doesn't use Py_ssize_t in Python 2.7).

If you use a buffer of 0x80000000 bytes, you test the PyArg_Parse*()
functions, which already have a dedicated test (in test_xml_etree_c;
it's not the best file to store such a test...).

> Also, the assignment to m needs to be moved outside of the inner
> try...finally block.

Yeah, I noticed this with buildbots: already fixed by dd58f8072216.

> As an aside, in this sort of situation is it better to just go and
> commit a fix myself, or is raising it on the mailing list first the
> right way to do things?

I'm not sure that you understood the test, so I think that it's better
to ask first on IRC and/or the mailing list.

Victor


From nadeem.vawda at gmail.com  Tue May  3 23:11:48 2011
From: nadeem.vawda at gmail.com (Nadeem Vawda)
Date: Tue, 3 May 2011 23:11:48 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <1304455123.1971.5.camel@marge>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>
	<BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>
	<1304455123.1971.5.camel@marge>
Message-ID: <BANLkTimcU=-826rXGaPqM8LD0fvuWW-GPw@mail.gmail.com>

On Tue, May 3, 2011 at 10:38 PM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> I don't want to check OverflowError: the test is supposed to compute the
> checksum of a buffer of 0x7FFFFFFF bytes, to check crc32() and
> adler32(). 0x7FFFFFFF is the biggest size supported by these functions
> (zlib doesn't use Py_ssize_t in Python 2.7).

I see. Since you mentioned issue 10276 in the commit message, I assumed
you were testing for the underlying C functions truncating their
arguments. It seems that I was mistaken. Sorry for the confusion.

Cheers,
Nadeem

From victor.stinner at haypocalc.com  Wed May  4 10:58:42 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 04 May 2011 10:58:42 +0200
Subject: [Python-Dev] The zombi thread of the Tcl library
Message-ID: <1304499523.15694.11.camel@marge>

Hi,

I have a question: would it be possible to mask all signals in the Tcl
thread? To understand the question, let's see the context...

I'm working on signals, especially on pthread_sigmask(), and I'm trying
to understand test_signal failures.

test_signal fails if the _tkinter module is loaded, because _tkinter
loads the Tcl library, which creates a thread waiting for events in select().
For example, "python -m test test_pydoc test_signal" fails, because
test_pydoc loads ALL Python modules. I opened an issue for test_pydoc:
http://bugs.python.org/issue11995

_tkinter.c contains the following code:
#if 0
    /* This was not a good idea; through <Destroy> bindings,
       Tcl_Finalize() may invoke Python code but at that point the
       interpreter and thread state have already been destroyed! */
    Py_AtExit(Tcl_Finalize);
#endif

Tcl_Finalize() exits the thread, but this function is never called in
Python. Anyway, it is not possible to unload a module implemented in C.

I would like to know if it would be possible to mask all signals in the
Tcl thread, or if Tcl supports/uses signals.

It is possible to mask all signals in the Tcl thread using:
----------
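import signal

# Block every signal while _tkinter (and thus the Tcl thread) is loaded,
# so that the new thread inherits a fully blocked mask.
allsignals = range(1, signal.NSIG)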
oldmask = signal.pthread_sigmask(signal.SIG_BLOCK, allsignals)
import _tkinter
signal.pthread_sigmask(signal.SIG_SETMASK, oldmask)
----------
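
The snippet relies on a new thread inheriting the signal mask of the
thread that creates it. A minimal sketch of that behaviour, assuming a
POSIX platform and Python 3.3 or later; the helper name
blocked_in_new_thread is made up for illustration, and a plain
threading.Thread stands in for the Tcl thread:

```python
import signal
import threading


def blocked_in_new_thread(signum):
    """Start a thread while signum is blocked; report whether the thread sees it blocked."""
    seen = []

    def worker():
        # SIG_BLOCK with an empty set changes nothing and returns the
        # calling thread's current mask.
        seen.append(signal.pthread_sigmask(signal.SIG_BLOCK, []))

    oldmask = signal.pthread_sigmask(signal.SIG_BLOCK, [signum])
    try:
        t = threading.Thread(target=worker)
        t.start()
        t.join()
    finally:
        # Restore the creating thread's original mask, as in the snippet above.
        signal.pthread_sigmask(signal.SIG_SETMASK, oldmask)
    return signum in seen[0]


print(blocked_in_new_thread(signal.SIGUSR1))
```

The creating thread gets its old mask back, but the thread started in
between keeps the blocked mask it inherited.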

I'm not asking the question for test_signal: I have a patch fixing
test_signal even if the Tcl zombie thread is present (it uses
pthread_kill() to send the signal directly to the main thread).

(I wrote "zombie" thread because I was not aware that Tcl uses a thread,
nor that test_pydoc loads all modules. The thread is valid and alive;
"zombie" is just a joke. The thread is more hidden than zombie.)

Victor


From marks at dcs.gla.ac.uk  Wed May  4 11:08:33 2011
From: marks at dcs.gla.ac.uk (Mark Shannon)
Date: Wed, 04 May 2011 10:08:33 +0100
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <1304499523.15694.11.camel@marge>
References: <1304499523.15694.11.camel@marge>
Message-ID: <4DC11791.2000109@dcs.gla.ac.uk>

Hi,

The online documentation specifies which API functions borrow and/or
steal references (as opposed to the default behaviour).
Yet, I cannot find this information anywhere in the source.

Any clues as to where I should look?

Cheers,
Mark

From amauryfa at gmail.com  Wed May  4 11:35:19 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 4 May 2011 11:35:19 +0200
Subject: [Python-Dev]  Borrowed and Stolen References in API
In-Reply-To: <4DC11791.2000109@dcs.gla.ac.uk>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
Message-ID: <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>

Hi,

On Wednesday, 4 May 2011, Mark Shannon <marks at dcs.gla.ac.uk> wrote:
> The online documentation specifies which API functions borrow and/or steal references (as opposed to the default behaviour).
> Yet, I cannot find this information anywhere in the source.
>
> Any clues as to where I should look?

It's in the file Doc/data/refcounts.dat
in some custom format.

-- 
Amaury

-- 
Amaury Forgeot d'Arc

From solipsis at pitrou.net  Wed May  4 12:05:19 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 4 May 2011 12:05:19 +0200
Subject: [Python-Dev] The zombi thread of the Tcl library
References: <1304499523.15694.11.camel@marge>
Message-ID: <20110504120519.7a1bc105@pitrou.net>

On Wed, 04 May 2011 10:58:42 +0200
Victor Stinner <victor.stinner at haypocalc.com> wrote:
> 
> Tcl_Finalize() exits the thread, but this function is never called in
> Python. Anyway, it is not possible to unload a module implemented in C.

You could expose Tcl_Finalize() for debug purposes and call it in
test_signal.

Regards

Antoine.



From victor.stinner at haypocalc.com  Wed May  4 13:54:20 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 04 May 2011 13:54:20 +0200
Subject: [Python-Dev] The zombi thread of the Tcl library
In-Reply-To: <20110504120519.7a1bc105@pitrou.net>
References: <1304499523.15694.11.camel@marge>
	<20110504120519.7a1bc105@pitrou.net>
Message-ID: <1304510060.15694.13.camel@marge>

On Wednesday, 4 May 2011 at 12:05 +0200, Antoine Pitrou wrote:
> On Wed, 04 May 2011 10:58:42 +0200
> Victor Stinner <victor.stinner at haypocalc.com> wrote:
> > 
> > Tcl_Finalize() exits the thread, but this function is never called in
> > Python. Anyway, it is not possible to unload a module implemented in C.
> 
> You could expose Tcl_Finalize() for debug purposes and call it in
> test_signal.

Good idea. I opened an issue with a patch implementing Tcl_Finalize():
http://bugs.python.org/issue11998

I also added a workaround for the _tkinter side effect in test_signal.
The buildbots look happy.

Victor


From ncoghlan at gmail.com  Wed May  4 19:01:58 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 5 May 2011 03:01:58 +1000
Subject: [Python-Dev] New interest areas in Experts Index
Message-ID: <BANLkTiku43f7MaTnv=4N+NBa6sBa+5cJrg@mail.gmail.com>

I just added two new interest areas to the Experts Index [1]:

context managers: for any issues relating to proposals to add context
management capabilities to objects in the stdlib, triagers should feel
free to add me to the nosy list
test coverage: this is specifically for anyone willing to help review
and commit test coverage improvement patches (rather than the more
general "testing" interest area that was already present)

Cheers,
Nick.

[1] http://docs.python.org/devguide/experts

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From solipsis at pitrou.net  Wed May  4 21:35:11 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 4 May 2011 21:35:11 +0200
Subject: [Python-Dev] cpython (2.7): Issue #11277: test_zlib tests a
 buffer of 1 GB on 32	bits
References: <E1QHhji-0003D9-5t@dinsdale.python.org>
Message-ID: <20110504213511.07e9f2bf@pitrou.net>

On Wed, 04 May 2011 21:27:50 +0200
victor.stinner <python-checkins at python.org> wrote:
> http://hg.python.org/cpython/rev/7f3cab59ef3e
> changeset:   69834:7f3cab59ef3e
> branch:      2.7
> parent:      69827:affec521b330
> user:        Victor Stinner <victor.stinner at haypocalc.com>
> date:        Wed May 04 21:27:39 2011 +0200
> summary:
>   Issue #11277: test_zlib tests a buffer of 1 GB on 32 bits

What's the point? The issue with 2GB or 4GB buffers is that they cross
the potential limit of a machine type (a signed or unsigned integer).
I don't see any benefit in testing a 1GB buffer; the test could
probably be removed instead.

Regards

Antoine.



From greg.ewing at canterbury.ac.nz  Thu May  5 00:04:51 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 05 May 2011 10:04:51 +1200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <4DC11791.2000109@dcs.gla.ac.uk>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
Message-ID: <4DC1CD83.3000603@canterbury.ac.nz>

Mark Shannon wrote:

> The online documentation specifies which API function borrow and/or 
> steal references (as opposed to the default behaviour).
> Yet, I cannot find this information anywhere in the source.

There are comments in some places, e.g. in listobject.h:

   *** WARNING *** PyList_SetItem does not increment the new item's reference
   count, but does decrement the reference count of the item it replaces,
   if not nil.  It does *decrement* the reference count if it is *not*
   inserted in the list.  Similarly, PyList_GetItem does not increment the
   returned item's reference count.

If you're looking for evidence in the actual code, there's
nothing particular to look for -- it's implicit in the
way the function works overall.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Thu May  5 00:23:01 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 05 May 2011 10:23:01 +1200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
Message-ID: <4DC1D1C5.9010507@canterbury.ac.nz>

Amaury Forgeot d'Arc wrote:

> It's in the file Doc/data/refcounts.dat
> in some custom format.

However, it doesn't seem to quite convey the same information.
It lists the "refcount effect" on each parameter, but translating
that into the notion of borrowed or stolen references seems
to require knowledge of what the function does.

For example, PyDict_SetItem has:

PyDict_SetItem:PyObject*:p:0:
PyDict_SetItem:PyObject*:key:+1:
PyDict_SetItem:PyObject*:val:+1:

All of these parameters take borrowed references, but the
key and val get incremented because they're being stored
in the dict.
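
The effect is visible from Python with sys.getrefcount(). A rough
sketch, assuming CPython (the variable names are only for
illustration): the caller keeps its own reference, and the dict takes
an additional one for as long as it stores the value.

```python
import sys

val = object()
before = sys.getrefcount(val)

d = {}
d["k"] = val                      # the dict stores val and takes its own reference
after_store = sys.getrefcount(val)

del d["k"]                        # the dict releases that reference again
after_delete = sys.getrefcount(val)

print(after_store - before)       # extra references while the dict holds val
print(after_delete - before)      # extra references once the entry is gone
```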

So this file appears to be of limited usefulness.

-- 
Greg


From ethan at stoneleaf.us  Thu May  5 00:40:42 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 04 May 2011 15:40:42 -0700
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <1304455123.1971.5.camel@marge>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>	<BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>
	<1304455123.1971.5.camel@marge>
Message-ID: <4DC1D5EA.7060608@stoneleaf.us>

Victor Stinner wrote:
> On Tuesday, 3 May 2011 at 16:22 +0200, Nadeem Vawda wrote:
>> On Tue, May 3, 2011 at 3:19 PM, victor.stinner
>> <python-checkins at python.org> wrote:
>>> +# Issue #10276 - check that inputs of 2 GB are handled correctly.
>>> +# Be aware of issues #1202, #8650, #8651 and #10276
>>> +class ChecksumBigBufferTestCase(unittest.TestCase):
>>> +    int_max = 0x7FFFFFFF
>>> +
>>> +    @unittest.skipUnless(mmap, "mmap() is not available.")
>>> +    def test_big_buffer(self):
>>> +        if sys.platform[:3] == 'win' or sys.platform == 'darwin':
>>> +            requires('largefile',
>>> +                     'test requires %s bytes and a long time to run' %
>>> +                     str(self.int_max))
>>> +        try:
>>> +            with open(TESTFN, "wb+") as f:
>>> +                f.seek(self.int_max-4)
>>> +                f.write("asdf")
>>> +                f.flush()
>>> +                try:
>>> +                    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
>>> +                    self.assertEqual(zlib.crc32(m), 0x709418e7)
>>> +                    self.assertEqual(zlib.adler32(m), -2072837729)
>>> +                finally:
>>> +                    m.close()
>>> +        except (IOError, OverflowError):
>>> +            raise unittest.SkipTest("filesystem doesn't have largefile support")
>>> +        finally:
>>> +            unlink(TESTFN)
>>> +
>>> +
>> 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be
>> 0x80000000. However, if you make this change, crc32() and adler32()
>> raise OverflowErrors (see changeset a0681e7a6ded).
> 
> I don't want to check OverflowError: the test is supposed to compute the
> checksum of a buffer of 0x7FFFFFFF bytes

The comment says 'check that inputs of 2 GB are handled correctly' but 
the file created is 1 byte short of 2Gb.  Is the test wrong, or just 
wrongly commented?  Or am I not understanding?

~Ethan~

From victor.stinner at haypocalc.com  Thu May  5 11:33:27 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 05 May 2011 11:33:27 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <4DC1D5EA.7060608@stoneleaf.us>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>
	<BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>
	<1304455123.1971.5.camel@marge>  <4DC1D5EA.7060608@stoneleaf.us>
Message-ID: <1304588007.22418.7.camel@marge>

On Wednesday, 4 May 2011 at 15:40 -0700, Ethan Furman wrote:
> Victor Stinner wrote:
> > On Tuesday, 3 May 2011 at 16:22 +0200, Nadeem Vawda wrote:
> >> On Tue, May 3, 2011 at 3:19 PM, victor.stinner
> >> <python-checkins at python.org> wrote:
> >>> +# Issue #10276 - check that inputs of 2 GB are handled correctly.
> >>> +# Be aware of issues #1202, #8650, #8651 and #10276
> >>> +class ChecksumBigBufferTestCase(unittest.TestCase):
> >>> +    int_max = 0x7FFFFFFF
> >>> +
> >>> +    @unittest.skipUnless(mmap, "mmap() is not available.")
> >>> +    def test_big_buffer(self):
> >>> +        if sys.platform[:3] == 'win' or sys.platform == 'darwin':
> >>> +            requires('largefile',
> >>> +                     'test requires %s bytes and a long time to run' %
> >>> +                     str(self.int_max))
> >>> +        try:
> >>> +            with open(TESTFN, "wb+") as f:
> >>> +                f.seek(self.int_max-4)
> >>> +                f.write("asdf")
> >>> +                f.flush()
> >>> +                try:
> >>> +                    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
> >>> +                    self.assertEqual(zlib.crc32(m), 0x709418e7)
> >>> +                    self.assertEqual(zlib.adler32(m), -2072837729)
> >>> +                finally:
> >>> +                    m.close()
> >>> +        except (IOError, OverflowError):
> >>> +            raise unittest.SkipTest("filesystem doesn't have largefile support")
> >>> +        finally:
> >>> +            unlink(TESTFN)
> >>> +
> >>> +
> >> 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be
> >> 0x80000000. However, if you make this change, crc32() and adler32()
> >> raise OverflowErrors (see changeset a0681e7a6ded).
> > 
> > I don't want to check OverflowError: the test is supposed to compute the
> > checksum of a buffer of 0x7FFFFFFF bytes
> 
> The comment says 'check that inputs of 2 GB are handled correctly' but 
> the file created is 1 byte short of 2Gb.  Is the test wrong, or just 
> wrongly commented?  Or am I not understanding?

If you write a few bytes after 2 GB of zeros, the file size is 2 GB
plus those few bytes. The trick is a way to create a large file
quickly: some OSes support sparse files, so the zeros are not written
to disk. But on Mac OS X and Windows, you really write 2 GB plus some
bytes.

Victor


From nadeem.vawda at gmail.com  Thu May  5 11:43:19 2011
From: nadeem.vawda at gmail.com (Nadeem Vawda)
Date: Thu, 5 May 2011 11:43:19 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <1304588007.22418.7.camel@marge>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>
	<BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>
	<1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us>
	<1304588007.22418.7.camel@marge>
Message-ID: <BANLkTinPa4aq7Q35JKipLhCPppxxrANBww@mail.gmail.com>

On Thu, May 5, 2011 at 11:33 AM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> On Wednesday, 4 May 2011 at 15:40 -0700, Ethan Furman wrote:
>> The comment says 'check that inputs of 2 GB are handled correctly' but
>> the file created is 1 byte short of 2Gb.  Is the test wrong, or just
>> wrongly commented?  Or am I not understanding?
>
> If you write a byte after 2 GB of zeros, the file size is 2 GB+the few
> bytes. This trick is to create quickly a large file: some OSes support
> sparse files, zeros are not written on disk. But on Mac OS X and
> Windows, you really write 2 GB+some bytes.

Ethan's point is that 0x7FFFFFFF is not 2GB - it is (2G-1) bytes. So the
test and the preceding comment are inconsistent.

From p.f.moore at gmail.com  Thu May  5 11:53:59 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 5 May 2011 10:53:59 +0100
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <1304588007.22418.7.camel@marge>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>
	<BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>
	<1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us>
	<1304588007.22418.7.camel@marge>
Message-ID: <BANLkTikNL=6ry4CPPdH5ZVejNNF10eoiEg@mail.gmail.com>

On 5 May 2011 10:33, Victor Stinner <victor.stinner at haypocalc.com> wrote:
> If you write a byte after 2 GB of zeros, the file size is 2 GB+the few
> bytes. This trick is to create quickly a large file: some OSes support
> sparse files, zeros are not written on disk. But on Mac OS X and
> Windows, you really write 2 GB+some bytes.

FWIW, on Windows you can create sparse files, using DeviceIoControl()
with the FSCTL_SET_SPARSE control code. It's probably too messy to be
worth it for this case, though...

Paul

From giuott at gmail.com  Thu May  5 12:14:34 2011
From: giuott at gmail.com (Giuseppe Ottaviano)
Date: Thu, 5 May 2011 11:14:34 +0100
Subject: [Python-Dev] What if replacing items in a dictionary returns
 the new dictionary?
In-Reply-To: <BANLkTikt4ue3NYBzna3p=GbNr6J6zEtGDA@mail.gmail.com>
References: <BANLkTin8sB+85CicRtqkbrgtN7--Ujh3jQ@mail.gmail.com>
	<20110429143406.GA441@iskra.aviel.ru>
	<BANLkTikt4ue3NYBzna3p=GbNr6J6zEtGDA@mail.gmail.com>
Message-ID: <BANLkTiksDBMJVEzcr27=rwuMdX2=ph-qjA@mail.gmail.com>

On Fri, Apr 29, 2011 at 4:05 PM, Roy Hyunjin Han
<starsareblueandfaraway at gmail.com> wrote:
>>   You can implement this in your own subclass of dict, no?
>
> Yes, I just thought it would be convenient to have in the language
> itself, but the responses to my post seem to indicate that [not
> returning the updated object] is an intended language feature for
> mutable types like dict or list.

In general nothing stops you from using a proxy object that returns
itself after each method call, something like:


class using(object):
    def __init__(self, obj):
        self._wrappee = obj

    def unwrap(self):
        return self._wrappee

    def __getattr__(self, attr):
        def wrapper(*args, **kwargs):
            getattr(self._wrappee, attr)(*args, **kwargs)
            return self
        return wrapper


d = dict()
print using(d).update(dict(a=1)).update(dict(b=2)).unwrap()
# prints {'a': 1, 'b': 2}
l = list()
print using(l).append(1).append(2).unwrap()
# prints [1, 2]

From amauryfa at gmail.com  Thu May  5 12:38:32 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 5 May 2011 12:38:32 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <4DC1D1C5.9010507@canterbury.ac.nz>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
Message-ID: <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>

Hi,

On Thursday, 5 May 2011, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Amaury Forgeot d'Arc wrote:
>
>
> It's in the file Doc/data/refcounts.dat
> in some custom format.
>
>
> However, it doesn't seem to quite convey the same information.
> It lists the "refcount effect" on each parameter, but translating
> that into the notion of borrowed or stolen references seems
> to require knowledge of what the function does.
>
> For example, PyDict_SetItem has:
>
> PyDict_SetItem:PyObject*:p:0:
> PyDict_SetItem:PyObject*:key:+1:
> PyDict_SetItem:PyObject*:val:+1:
>
> All of these parameters take borrowed references, but the
> key and val get incremented because they're being stored
> in the dict.

This is not always true, for example when the item is already present
in the dict.
It's not important to know what the function does to the object;
only the action on the reference is relevant.
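
A small sketch of the "already present" case, assuming CPython
behaviour (names are illustrative only): when an equal key is already
in the dict, the dict keeps the key object it has and only swaps the
value, so the key passed in sees no net refcount change.

```python
import sys

k1 = tuple([1, 2])    # key that goes into the dict first
k2 = tuple([1, 2])    # equal to k1, but a distinct object
assert k1 == k2 and k1 is not k2

d = {k1: "old"}
before = sys.getrefcount(k2)
d[k2] = "new"         # key already present: only the value is replaced
after = sys.getrefcount(k2)

stored_key = next(iter(d))
print(after - before)         # net refcount change for k2
print(stored_key is k1)       # whether the dict still holds the original key object
```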

>
> So this file appears to be of limited usefulness.

-- 
Amaury

-- 
Amaury Forgeot d'Arc

From ethan at stoneleaf.us  Thu May  5 14:07:04 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 05 May 2011 05:07:04 -0700
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <1304588007.22418.7.camel@marge>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>	
	<BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>	
	<1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us>
	<1304588007.22418.7.camel@marge>
Message-ID: <4DC292E8.9010904@stoneleaf.us>

Victor Stinner wrote:
> On Wednesday, 4 May 2011 at 15:40 -0700, Ethan Furman wrote:
>> Victor Stinner wrote:
>>> On Tuesday, 3 May 2011 at 16:22 +0200, Nadeem Vawda wrote:
>>>> On Tue, May 3, 2011 at 3:19 PM, victor.stinner
>>>> <python-checkins at python.org> wrote:
>>>>> 
>>>>> +    int_max = 0x7FFFFFFF
>>>>> 
>>>>> +            with open(TESTFN, "wb+") as f:
>>>>> +                f.seek(self.int_max-4)
>>>>> +                f.write("asdf")
>>>>> +                f.flush()
>>>> 
>>>> 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be
>>>> 0x80000000. However, if you make this change, crc32() and adler32()
>>>> raise OverflowErrors (see changeset a0681e7a6ded).
 >>>
>>> I don't want to check OverflowError: the test is supposed to compute the
>>> checksum of a buffer of 0x7FFFFFFF bytes
 >>
>> The comment says 'check that inputs of 2 GB are handled correctly' but 
>> the file created is 1 byte short of 2Gb.  Is the test wrong, or just 
>> wrongly commented?  Or am I not understanding?
> 
> If you write a byte after 2 GB of zeros, the file size is 2 GB+the few
> bytes. This trick is to create quickly a large file: some OSes support
> sparse files, zeros are not written on disk. But on Mac OS X and
> Windows, you really write 2 GB+some bytes.

True, but that's not what's happening -- four bytes are being written at
int_max - 4, and int_max is one less than 2GB; hence the resulting file
is one byte less than 2GB.
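
The arithmetic can be checked on a small scale. This is only a sketch:
the helper name size_after_seek_write is made up, and 0x7FFF stands in
for 0x7FFFFFFF so that no large file is needed.

```python
import os
import tempfile


def size_after_seek_write(limit, payload=b"asdf"):
    """Seek to limit - len(payload), write payload, and return the file size."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb+") as f:
            f.seek(limit - len(payload))
            f.write(payload)
            f.flush()
        return os.path.getsize(path)
    finally:
        os.unlink(path)


limit = 0x7FFF                     # small stand-in for 0x7FFFFFFF
size = size_after_seek_write(limit)
print(size == limit)               # the file is exactly `limit` bytes long
print(size == 0x8000 - 1)          # i.e. one byte short of the next power of two
```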

~Ethan~

From victor.stinner at haypocalc.com  Thu May  5 14:27:43 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 05 May 2011 14:27:43 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <4DC292E8.9010904@stoneleaf.us>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>
	<BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>
	<1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us>
	<1304588007.22418.7.camel@marge>  <4DC292E8.9010904@stoneleaf.us>
Message-ID: <1304598463.27042.0.camel@marge>

On Thursday, 5 May 2011 at 05:07 -0700, Ethan Furman wrote:
> ... hence the resulting file is one less than 2GB.

Yep, it's 0x7FFFFFFF because it's INT_MAX, the biggest value storable in
an int. The zlib module stores the buffer size into an int in Python 2.7
(and Py_ssize_t in Python 3.3).
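
A quick illustration of that limit, using ctypes as a stand-in for the
C int inside zlibmodule (this is a sketch, not the module's code, and
it assumes a 32-bit C int, which the assert checks):

```python
import ctypes

INT_MAX = 0x7FFFFFFF

# Assumption: C int is 32 bits wide here, as on most current platforms.
assert ctypes.sizeof(ctypes.c_int) == 4

fits = ctypes.c_int(INT_MAX).value         # largest size that survives the conversion
wrapped = ctypes.c_int(INT_MAX + 1).value  # 0x80000000 (a full 2 GB) wraps around

print(fits)
print(wrapped)
```

ctypes does no overflow checking, so the second value shows what a
silent truncation to int would produce for a full 2 GB length.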

Victor


From ethan at stoneleaf.us  Thu May  5 17:17:27 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 05 May 2011 08:17:27 -0700
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276:
 test_zlib checks that inputs of 2 GB are handled correctly by
In-Reply-To: <1304598463.27042.0.camel@marge>
References: <E1QHFVm-0002kp-TV@dinsdale.python.org>	
	<BANLkTinsj8Y4SuQ0iSi3um-8TtCgJkGWPg@mail.gmail.com>	
	<1304455123.1971.5.camel@marge> <4DC1D5EA.7060608@stoneleaf.us>	
	<1304588007.22418.7.camel@marge> <4DC292E8.9010904@stoneleaf.us>
	<1304598463.27042.0.camel@marge>
Message-ID: <4DC2BF87.40100@stoneleaf.us>

Victor Stinner wrote:
> On Thursday, 5 May 2011 at 05:07 -0700, Ethan Furman wrote:
 >>
>> ... hence the resulting file is one less than 2GB.
> 
> Yep, it's 0x7FFFFFFF because it's INT_MAX, the biggest value storable in
> an int. The zlib module stores the buffer size into an int in Python 2.7
> (and Py_ssize_t in Python 3.3).

So we are agreed that the file is not, in fact, 2GB in size...

 > On Tue, May 3, 2011 at 3:19 PM, victor.stinner
 > <python-checkins at python.org> wrote:
 >> +# Issue #10276 - check that inputs of 2 GB are handled correctly.
 >> +# Be aware of issues #1202, #8650, #8651 and #10276

So why do the comments say we are testing a 2GB input?

~Ethan~

From starsareblueandfaraway at gmail.com  Thu May  5 16:37:04 2011
From: starsareblueandfaraway at gmail.com (Roy Hyunjin Han)
Date: Thu, 5 May 2011 10:37:04 -0400
Subject: [Python-Dev] What if replacing items in a dictionary returns
 the new dictionary?
In-Reply-To: <BANLkTiksDBMJVEzcr27=rwuMdX2=ph-qjA@mail.gmail.com>
References: <BANLkTin8sB+85CicRtqkbrgtN7--Ujh3jQ@mail.gmail.com>
	<20110429143406.GA441@iskra.aviel.ru>
	<BANLkTikt4ue3NYBzna3p=GbNr6J6zEtGDA@mail.gmail.com>
	<BANLkTiksDBMJVEzcr27=rwuMdX2=ph-qjA@mail.gmail.com>
Message-ID: <BANLkTim0EkdVum9vkgDYBhCEpekSVc4+Ow@mail.gmail.com>

>> 2011/4/29 Roy Hyunjin Han <starsareblueandfaraway at gmail.com>:
>> It would be convenient if replacing items in a dictionary returns the
>> new dictionary, in a manner analogous to str.replace().  What do you
>> think?
>>
>>    # Current behavior
>>    x = {'key1': 1}
>>    x.update(key1=3) == None
>>    x == {'key1': 3} # Original variable has changed
>>
>>    # Possible behavior
>>    x = {'key1': 1}
>>    x.replace(key1=3) == {'key1': 3}
>>    x == {'key1': 1} # Original variable is unchanged
>>
> 2011/5/5 Giuseppe Ottaviano <giuott at gmail.com>:
> In general nothing stops you to use a proxy object that returns itself
> after each method call, something like
>
> class using(object):
>    def __init__(self, obj):
>        self._wrappee = obj
>
>    def unwrap(self):
>        return self._wrappee
>
>    def __getattr__(self, attr):
>        def wrapper(*args, **kwargs):
>            getattr(self._wrappee, attr)(*args, **kwargs)
>            return self
>        return wrapper
>
>
> d = dict()
> print using(d).update(dict(a=1)).update(dict(b=2)).unwrap()
> # prints {'a': 1, 'b': 2}
> l = list()
> print using(l).append(1).append(2).unwrap()
> # prints [1, 2]

Cool!  I never thought of that.  That's a great snippet.
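
For the non-mutating behaviour from my original post, a minimal sketch
that needs no new dict method (the helper name replaced is
hypothetical; it relies only on the dict(mapping, **kwargs)
constructor form, so the keys must be strings):

```python
def replaced(mapping, **changes):
    """Return a new dict with changes applied; mapping itself is left untouched."""
    return dict(mapping, **changes)


x = {'key1': 1}
y = replaced(x, key1=3)
print(y)   # the updated copy
print(x)   # the original, unchanged
```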

I'll forward this to the python-ideas list.  I don't think the
python-dev people want this discussion to continue on their mailing
list.

From guido at python.org  Thu May  5 19:00:54 2011
From: guido at python.org (Guido van Rossum)
Date: Thu, 5 May 2011 10:00:54 -0700
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
Message-ID: <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>

On Thu, May 5, 2011 at 3:38 AM, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote:
> Hi,
>
> On Thursday, 5 May 2011, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>> Amaury Forgeot d'Arc wrote:
>>
>>
>> It's in the file Doc/data/refcounts.dat
>> in some custom format.
>>
>>
>> However, it doesn't seem to quite convey the same information.
>> It lists the "refcount effect" on each parameter, but translating
>> that into the notion of borrowed or stolen references seems
>> to require knowledge of what the function does.
>>
>> For example, PyDict_SetItem has:
>>
>> PyDict_SetItem:PyObject*:p:0:
>> PyDict_SetItem:PyObject*:key:+1:
>> PyDict_SetItem:PyObject*:val:+1:
>>
>> All of these parameters take borrowed references, but the
>> key and val get incremented because they're being stored
>> in the dict.
>
> This is not always true, for example when the item is already present
> in the dict.
> It's not important to know what the function does to the object,
> Only the action on the reference is relevant.
>
>>
>> So this file appears to be of limited usefulness.

Seems you're in agreement with this. IMO when references are borrowed
it is not very interesting. The interesting thing is when calling a
function *steals* a reference. The other important thing to know is
whether the caller ends up owning the return value (if it is an
object) or not. I *think* you can tell the latter from the +1 for the
return value; but the former (whether it steals a reference) is
unclear from the data given. There's even an XXX comment about this in
the file:

# XXX NOTE: the 0/+1/-1 refcount information for arguments is
# confusing!  Much more useful would be to indicate whether the
# function "steals" a reference to the argument or not.  Take for
# example PyList_SetItem(list, i, item).  This lists as a 0 change for
# both the list and the item arguments.  However, in fact it steals a
# reference to the item argument!

-- 
--Guido van Rossum (python.org/~guido)

From amauryfa at gmail.com  Thu May  5 19:17:30 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 5 May 2011 19:17:30 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
Message-ID: <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>

2011/5/5 Guido van Rossum <guido at python.org>:
> Seems you're in agreement with this. IMO when references are borrowed
> it is not very interesting. The interesting thing is when calling a
> function *steals* a reference. The other important thing to know is
> whether the caller ends up owning the return value (if it is an
> object) or not. I *think* you can tell the latter from the +1 for the
> return value; but the former (whether it steals a reference) is
> unclear from the data given. There's even an XXX comment about this in
> the file:
>
> # XXX NOTE: the 0/+1/-1 refcount information for arguments is
> # confusing!  Much more useful would be to indicate whether the
> # function "steals" a reference to the argument or not.  Take for
> # example PyList_SetItem(list, i, item).  This lists as a 0 change for
> # both the list and the item arguments.  However, in fact it steals a
> # reference to the item argument!

Should we change this file then, and list only the functions that
don't follow the usual conventions?

But I'm sure that there are external tools which already use refcounts.dat
in its present format.

-- 
Amaury Forgeot d'Arc

From guido at python.org  Thu May  5 19:18:54 2011
From: guido at python.org (Guido van Rossum)
Date: Thu, 5 May 2011 10:18:54 -0700
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
	<BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
Message-ID: <BANLkTimvbe5MzYPL7ptNEs9kE8CYXfr6Lg@mail.gmail.com>

On Thu, May 5, 2011 at 10:17 AM, Amaury Forgeot d'Arc
<amauryfa at gmail.com> wrote:
> 2011/5/5 Guido van Rossum <guido at python.org>:
>> Seems you're in agreement with this. IMO when references are borrowed
>> it is not very interesting. The interesting thing is when calling a
>> function *steals* a reference. The other important thing to know is
>> whether the caller ends up owning the return value (if it is an
>> object) or not. I *think* you can tell the latter from the +1 for the
>> return value; but the former (whether it steals a reference) is
>> unclear from the data given. There's even an XXX comment about this in
>> the file:
>>
>> # XXX NOTE: the 0/+1/-1 refcount information for arguments is
>> # confusing!  Much more useful would be to indicate whether the
>> # function "steals" a reference to the argument or not.  Take for
>> # example PyList_SetItem(list, i, item).  This lists as a 0 change for
>> # both the list and the item arguments.  However, in fact it steals a
>> # reference to the item argument!
>
> Should we change this file then?
> And only list functions that don't follow the usual conventions.
>
> But I'm sure that there are external tools which already use refcounts.dat
> in its present format.

Maybe we can *add* a column with the desired information?

-- 
--Guido van Rossum (python.org/~guido)

From g.brandl at gmx.net  Thu May  5 20:08:51 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 05 May 2011 20:08:51 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
Message-ID: <ipup3i$c9v$1@dough.gmane.org>

On 05.05.2011 19:00, Guido van Rossum wrote:
> On Thu, May 5, 2011 at 3:38 AM, Amaury Forgeot d'Arc <amauryfa at gmail.com> wrote:
>> Hi,
>>
>> On Thursday, 5 May 2011, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>>> Amaury Forgeot d'Arc wrote:
>>>
>>>
>>> It's in the file Doc/data/refcounts.dat
>>> in some custom format.
>>>
>>>
>>> However, it doesn't seem to quite convey the same information.
>>> It lists the "refcount effect" on each parameter, but translating
>>> that into the notion of borrowed or stolen references seems
>>> to require knowledge of what the function does.
>>>
>>> For example, PyDict_SetItem has:
>>>
>>> PyDict_SetItem:PyObject*:p:0:
>>> PyDict_SetItem:PyObject*:key:+1:
>>> PyDict_SetItem:PyObject*:val:+1:
>>>
>>> All of these parameters take borrowed references, but the
>>> key and val get incremented because they're being stored
>>> in the dict.
>>
>> This is not always true, for example when the item is already present
>> in the dict.
>> It's not important to know what the function does to the object,
>> Only the action on the reference is relevant.
>>
>>>
>>> So this file appears to be of limited usefulness.
> 
> Seems you're in agreement with this. IMO when references are borrowed
> it is not very interesting. The interesting thing is when calling a
> function *steals* a reference. The other important thing to know is
> whether the caller ends up owning the return value (if it is an
> object) or not. I *think* you can tell the latter from the +1 for the
> return value; but the former (whether it steals a reference) is
> unclear from the data given. There's even an XXX comment about this in
> the file:
> 
> # XXX NOTE: the 0/+1/-1 refcount information for arguments is
> # confusing!  Much more useful would be to indicate whether the
> # function "steals" a reference to the argument or not.  Take for
> # example PyList_SetItem(list, i, item).  This lists as a 0 change for
> # both the list and the item arguments.  However, in fact it steals a
> # reference to the item argument!

We're not using the information about arguments anyway in the doc build.
So we're free to change the file to list only return types, and parameters
in the event of stolen references.

Georg



From solipsis at pitrou.net  Thu May  5 20:09:30 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 5 May 2011 20:09:30 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
	<BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
Message-ID: <20110505200930.0412d200@pitrou.net>

On Thu, 5 May 2011 19:17:30 +0200
"Amaury Forgeot d'Arc" <amauryfa at gmail.com> wrote:

> 2011/5/5 Guido van Rossum <guido at python.org>:
> > Seems you're in agreement with this. IMO when references are borrowed
> > it is not very interesting. The interesting thing is when calling a
> > function *steals* a reference. The other important thing to know is
> > whether the caller ends up owning the return value (if it is an
> > object) or not. I *think* you can tell the latter from the +1 for the
> > return value; but the former (whether it steals a reference) is
> > unclear from the data given. There's even an XXX comment about this in
> > the file:
> >
> > # XXX NOTE: the 0/+1/-1 refcount information for arguments is
> > # confusing!  Much more useful would be to indicate whether the
> > # function "steals" a reference to the argument or not.  Take for
> > # example PyList_SetItem(list, i, item).  This lists as a 0 change for
> > # both the list and the item arguments.  However, in fact it steals a
> > # reference to the item argument!
> 
> Should we change this file then?
> And only list functions that don't follow the usual conventions.

+1

Regards

Antoine.



From raymond.hettinger at gmail.com  Thu May  5 20:12:55 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Thu, 5 May 2011 11:12:55 -0700
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTimvbe5MzYPL7ptNEs9kE8CYXfr6Lg@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
	<BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
	<BANLkTimvbe5MzYPL7ptNEs9kE8CYXfr6Lg@mail.gmail.com>
Message-ID: <D5F03F3E-C4A5-481F-BAFA-8C3C00E62665@gmail.com>


On May 5, 2011, at 10:18 AM, Guido van Rossum wrote:

> On Thu, May 5, 2011 at 10:17 AM, Amaury Forgeot d'Arc
> <amauryfa at gmail.com> wrote:
>> 2011/5/5 Guido van Rossum <guido at python.org>:
>>> Seems you're in agreement with this. IMO when references are borrowed
>>> it is not very interesting. The interesting thing is when calling a
>>> function *steals* a reference. The other important thing to know is
>>> whether the caller ends up owning the return value (if it is an
>>> object) or not. I *think* you can tell the latter from the +1 for the
>>> return value; but the former (whether it steals a reference) is
>>> unclear from the data given. There's even an XXX comment about this in
>>> the file:
>>> 
>>> # XXX NOTE: the 0/+1/-1 refcount information for arguments is
>>> # confusing!  Much more useful would be to indicate whether the
>>> # function "steals" a reference to the argument or not.  Take for
>>> # example PyList_SetItem(list, i, item).  This lists as a 0 change for
>>> # both the list and the item arguments.  However, in fact it steals a
>>> # reference to the item argument!
>> 
>> Should we change this file then?
>> And only list functions that don't follow the usual conventions.
>> 
>> But I'm sure that there are external tools which already use refcounts.dat
>> in its present format.
> 
> Maybe we can *add* a column with the desired information?

+1


Raymond


From benjamin at python.org  Thu May  5 20:41:50 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 5 May 2011 13:41:50 -0500
Subject: [Python-Dev] [Python-checkins] cpython (3.2): Avoid codec
 spelling issues by just using the utf-8 default.
In-Reply-To: <E1QI3RQ-00050Z-JW@dinsdale.python.org>
References: <E1QI3RQ-00050Z-JW@dinsdale.python.org>
Message-ID: <BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com>

2011/5/5 raymond.hettinger <python-checkins at python.org>:
> http://hg.python.org/cpython/rev/1a56775c6e54
> changeset:   69857:1a56775c6e54
> branch:      3.2
> parent:      69855:97a4855202b8
> user:        Raymond Hettinger <python at rcn.com>
> date:        Thu May 05 11:35:50 2011 -0700
> summary:
>  Avoid codec spelling issues by just using the utf-8 default.

Out of curiosity, what is the issue?
>
> files:
>  Lib/random.py |  2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
>
> diff --git a/Lib/random.py b/Lib/random.py
> --- a/Lib/random.py
> +++ b/Lib/random.py
> @@ -114,7 +114,7 @@
>         if version == 2:
>             if isinstance(a, (str, bytes, bytearray)):
>                 if isinstance(a, str):
> -                    a = a.encode("utf8")
> +                    a = a.encode()


-- 
Regards,
Benjamin

From solipsis at pitrou.net  Thu May  5 20:44:04 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 5 May 2011 20:44:04 +0200
Subject: [Python-Dev] cpython (merge 3.2 -> default): Avoid codec
 spelling issues by just	using the utf-8 default.
References: <E1QI3RT-00050j-Ml@dinsdale.python.org>
Message-ID: <20110505204404.5cfa02f2@pitrou.net>

On Thu, 05 May 2011 20:38:27 +0200
raymond.hettinger <python-checkins at python.org> wrote:
> http://hg.python.org/cpython/rev/2bc784057226
> changeset:   69858:2bc784057226
> parent:      69856:b06ad8458b32
> parent:      69857:1a56775c6e54
> user:        Raymond Hettinger <python at rcn.com>
> date:        Thu May 05 11:38:06 2011 -0700
> summary:
>   Avoid codec spelling issues by just using the utf-8 default.
> 
> files:
>   Lib/random.py |  2 +-
>   1 files changed, 1 insertions(+), 1 deletions(-)
> 
> 
> diff --git a/Lib/random.py b/Lib/random.py
> --- a/Lib/random.py
> +++ b/Lib/random.py
> @@ -114,7 +114,7 @@
>          if version == 2:
>              if isinstance(a, (str, bytes, bytearray)):
>                  if isinstance(a, str):
> -                    a = a.encode("utf-8")
> +                    a = a.encode()

Isn't explicit better than implicit? By reading the new code it is not
obvious that any thought was given to the choice of a codec, while
stating "utf-8" explicitly hints that a decision was made.

(also, I don't understand the spelling issue: "utf-8" just works)

Regards

Antoine.



From alexander.belopolsky at gmail.com  Thu May  5 21:01:29 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 5 May 2011 15:01:29 -0400
Subject: [Python-Dev] cpython (merge 3.2 -> default): Avoid codec
 spelling issues by just using the utf-8 default.
In-Reply-To: <20110505204404.5cfa02f2@pitrou.net>
References: <E1QI3RT-00050j-Ml@dinsdale.python.org>
	<20110505204404.5cfa02f2@pitrou.net>
Message-ID: <BANLkTikXHQGqS04Uwc=WTys-6L7Wrufdhw@mail.gmail.com>

On Thu, May 5, 2011 at 2:44 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
..
> (also, I don't understand the spelling issue: "utf-8" just works)

This is probably referring to the fact that while encode() accepts
many spelling variants, some are short-circuited in C code while
others require codec lookup implemented in python.

From solipsis at pitrou.net  Thu May  5 21:07:07 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 05 May 2011 21:07:07 +0200
Subject: [Python-Dev] cpython (merge 3.2 -> default): Avoid codec
 spelling issues by just using the utf-8 default.
In-Reply-To: <BANLkTikXHQGqS04Uwc=WTys-6L7Wrufdhw@mail.gmail.com>
References: <E1QI3RT-00050j-Ml@dinsdale.python.org>
	<20110505204404.5cfa02f2@pitrou.net>
	<BANLkTikXHQGqS04Uwc=WTys-6L7Wrufdhw@mail.gmail.com>
Message-ID: <1304622427.3564.12.camel@localhost.localdomain>

On Thursday, 5 May 2011 at 15:01 -0400, Alexander Belopolsky wrote:
> On Thu, May 5, 2011 at 2:44 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> ..
> > (also, I don't understand the spelling issue: "utf-8" just works)
> 
> This is probably referring to the fact that while encode() accepts
> many spelling variants, some are short-circuited in C code while
> others require codec lookup implemented in python.

This sounds like a bug to fix (isn't it fixed already, btw?) rather
than something to add hackish workarounds for in stdlib code.

Regards

Antoine.



From benjamin at python.org  Thu May  5 21:13:34 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 5 May 2011 14:13:34 -0500
Subject: [Python-Dev] cpython (merge 3.2 -> default): Avoid codec
 spelling issues by just using the utf-8 default.
In-Reply-To: <BANLkTikXHQGqS04Uwc=WTys-6L7Wrufdhw@mail.gmail.com>
References: <E1QI3RT-00050j-Ml@dinsdale.python.org>
	<20110505204404.5cfa02f2@pitrou.net>
	<BANLkTikXHQGqS04Uwc=WTys-6L7Wrufdhw@mail.gmail.com>
Message-ID: <BANLkTinei4otVrp==7Q6CROodKs416z1Ng@mail.gmail.com>

2011/5/5 Alexander Belopolsky <alexander.belopolsky at gmail.com>:
> On Thu, May 5, 2011 at 2:44 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> ..
>> (also, I don't understand the spelling issue: "utf-8" just works)
>
> This is probably referring to the fact that while encode() accepts
> many spelling variants, some are short-circuited in C code while
> others require codec lookup implemented in python.

Isn't it cached after the first run? If this is the reasoning, I find
it hard to believe that seed() is a large bottleneck in random.


-- 
Regards,
Benjamin

From g.brandl at gmx.net  Thu May  5 22:45:13 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 05 May 2011 22:45:13 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
	<BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
Message-ID: <ipv28o$6e9$1@dough.gmane.org>

On 05.05.2011 19:17, Amaury Forgeot d'Arc wrote:
> 2011/5/5 Guido van Rossum <guido at python.org>:
>> Seems you're in agreement with this. IMO when references are borrowed
>> it is not very interesting. The interesting thing is when calling a
>> function *steals* a reference. The other important thing to know is
>> whether the caller ends up owning the return value (if it is an
>> object) or not. I *think* you can tell the latter from the +1 for the
>> return value; but the former (whether it steals a reference) is
>> unclear from the data given. There's even an XXX comment about this in
>> the file:
>>
>> # XXX NOTE: the 0/+1/-1 refcount information for arguments is
>> # confusing!  Much more useful would be to indicate whether the
>> # function "steals" a reference to the argument or not.  Take for
>> # example PyList_SetItem(list, i, item).  This lists as a 0 change for
>> # both the list and the item arguments.  However, in fact it steals a
>> # reference to the item argument!
> 
> Should we change this file then?
> And only list functions that don't follow the usual conventions.
> 
> But I'm sure that there are external tools which already use refcounts.dat
> in its present format.

I doubt it.  And even if there are, the information in there is in parts
highly outdated (because the docs don't use parameter info), and large
numbers of functions are missing.

Let's remove the cruft, and only keep interesting info.  This will also make
the file much more manageable.

Georg



From raymond.hettinger at gmail.com  Thu May  5 22:55:07 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Thu, 5 May 2011 13:55:07 -0700
Subject: [Python-Dev] [Python-checkins] cpython (3.2): Avoid codec
	spelling issues by just using the utf-8 default.
In-Reply-To: <BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com>
References: <E1QI3RQ-00050Z-JW@dinsdale.python.org>
	<BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com>
Message-ID: <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com>


On May 5, 2011, at 11:41 AM, Benjamin Peterson wrote:

> 2011/5/5 raymond.hettinger <python-checkins at python.org>:
>> http://hg.python.org/cpython/rev/1a56775c6e54
>> changeset:   69857:1a56775c6e54
>> branch:      3.2
>> parent:      69855:97a4855202b8
>> user:        Raymond Hettinger <python at rcn.com>
>> date:        Thu May 05 11:35:50 2011 -0700
>> summary:
>>  Avoid codec spelling issues by just using the utf-8 default.
> 
> Out of curiosity, what is the issue?

IIRC, the performance depended on how you spelled it.
I believe that is why the spelling got changed in Py3.3.
Either way, the code is simpler by just using the default.


Raymond


From mal at egenix.com  Fri May  6 00:32:59 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 06 May 2011 00:32:59 +0200
Subject: [Python-Dev] [Python-checkins] cpython (3.2): Avoid
 codec	spelling issues by just using the utf-8 default.
In-Reply-To: <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com>
References: <E1QI3RQ-00050Z-JW@dinsdale.python.org>	<BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com>
	<926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com>
Message-ID: <4DC3259B.5020804@egenix.com>

Raymond Hettinger wrote:
> 
> On May 5, 2011, at 11:41 AM, Benjamin Peterson wrote:
> 
>> 2011/5/5 raymond.hettinger <python-checkins at python.org>:
>>> http://hg.python.org/cpython/rev/1a56775c6e54
>>> changeset:   69857:1a56775c6e54
>>> branch:      3.2
>>> parent:      69855:97a4855202b8
>>> user:        Raymond Hettinger <python at rcn.com>
>>> date:        Thu May 05 11:35:50 2011 -0700
>>> summary:
>>>  Avoid codec spelling issues by just using the utf-8 default.
>>
>> Out of curiosity, what is the issue?
> 
> IIRC, the performance depended on how you spelled it.
> I believe that is why the spelling got changed in Py3.3.

Not really. It got changed because we have canonical names
for the codecs which the stdlib should use rather than
rely on aliases. Performance-wise it only makes a difference
if you use it in tight loops.

> Either way, the code is simpler by just using the default.

... as long as the casual reader knows what the default is :-)

I think it's better to make the choice explicit, if the code
relies on a particular non-ASCII encoding. If it doesn't,
then the default is fine.
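
To illustrate the alias point, here is a small sketch under the assumption
of a Python 3 interpreter, where str.encode() defaults to UTF-8. It only
shows that several spellings resolve to one canonical codec name and
produce the same bytes as the default; it says nothing about which
spellings are short-circuited in C.

```python
import codecs

text = "caf\u00e9"
spellings = ("utf-8", "utf8", "UTF-8", "utf_8")

# codecs.lookup() reports the canonical name that each alias maps to.
canonical_names = {codecs.lookup(s).name for s in spellings}

# Every spelling yields the same bytes as the implicit default.
encoded_variants = {text.encode(s) for s in spellings}
default_encoded = text.encode()
```

So the result is the same either way; the open question is only whether
the reader should have to know the default.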

-- 
Marc-Andre Lemburg
eGenix.com

From tjreedy at udel.edu  Fri May  6 00:52:34 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 05 May 2011 18:52:34 -0400
Subject: [Python-Dev] cpython (3.2): Avoid codec spelling issues by just
 using the utf-8 default.
In-Reply-To: <926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com>
References: <E1QI3RQ-00050Z-JW@dinsdale.python.org>	<BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com>
	<926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com>
Message-ID: <ipv9nh$f5b$1@dough.gmane.org>

On 5/5/2011 4:55 PM, Raymond Hettinger wrote:

> Either way, the code is simpler by just using the default.

I thought about this and decided that the purpose of having defaults is 
so one does not have to always spell it out. So use it. Readers can 
always look it up and learn.

-- 
Terry Jan Reedy


From alexander.belopolsky at gmail.com  Fri May  6 00:54:11 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 5 May 2011 18:54:11 -0400
Subject: [Python-Dev] [Python-checkins] cpython (3.2): Avoid codec
 spelling issues by just using the utf-8 default.
In-Reply-To: <4DC3259B.5020804@egenix.com>
References: <E1QI3RQ-00050Z-JW@dinsdale.python.org>
	<BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com>
	<926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com>
	<4DC3259B.5020804@egenix.com>
Message-ID: <BANLkTim-QnEyJ23ZbMeTN7M2+AkDVf9JEQ@mail.gmail.com>

On Thu, May 5, 2011 at 6:32 PM, M.-A. Lemburg <mal at egenix.com> wrote:
..
>> Either way, the code is simpler by just using the default.
>
> ... as long as the casual reader knows what the default is :-)
>

.. or cares.  In this particular case, it hardly matters how random
bits are encoded.

From victor.stinner at haypocalc.com  Fri May  6 01:14:14 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 06 May 2011 01:14:14 +0200
Subject: [Python-Dev] [Python-checkins] cpython (3.2): Avoid codec
 spelling issues by just using the utf-8 default.
In-Reply-To: <BANLkTim-QnEyJ23ZbMeTN7M2+AkDVf9JEQ@mail.gmail.com>
References: <E1QI3RQ-00050Z-JW@dinsdale.python.org>
	<BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com>
	<926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com>
	<4DC3259B.5020804@egenix.com>
	<BANLkTim-QnEyJ23ZbMeTN7M2+AkDVf9JEQ@mail.gmail.com>
Message-ID: <1304637254.12569.4.camel@marge>

On Thursday, 5 May 2011 at 18:54 -0400, Alexander Belopolsky wrote:
> On Thu, May 5, 2011 at 6:32 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> ..
> >> Either way, the code is simpler by just using the default.
> >
> > ... as long as the casual reader knows what the default is :-)
> >
> 
> .. or cares.  In this particular case, it hardly matters how random
> bits are encoded.

You don't get the same random number sequence if you use a different
encoding.

>>> r=random.Random()
>>> r.seed('\xe9'.encode('iso-8859-1')); r.randint(0, 1000)
639
>>> r.seed('\xe9'.encode('utf-8')); r.randint(0, 1000)
992

So it is useful to know how the seed was computed. The real question is
which encoding gives the most random numbers? :-)

Victor
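
A sketch of the same point that avoids hard-coding output values, since
the exact numbers depend on the Python version. The first_values helper is
only for this illustration. It assumes the version-2 seeding shown in the
diff under discussion, where a str seed is encoded to bytes before being
hashed.

```python
import random

seed_text = "\xe9"
latin1_seed = seed_text.encode("iso-8859-1")  # one byte
utf8_seed = seed_text.encode("utf-8")         # two bytes


def first_values(seed, count=5):
    """Return the first few values produced after seeding a fresh generator."""
    rng = random.Random()
    rng.seed(seed)
    return [rng.randint(0, 1000) for _ in range(count)]


from_latin1 = first_values(latin1_seed)
from_utf8 = first_values(utf8_seed)
# Seeding with the str itself goes through the default encoding (UTF-8).
from_str = first_values(seed_text)
```

Here from_str matches from_utf8 but not from_latin1, which is why the
encoding used for a str seed is part of the reproducibility contract.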


From greg.ewing at canterbury.ac.nz  Fri May  6 03:28:11 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 06 May 2011 13:28:11 +1200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
Message-ID: <4DC34EAB.9050001@canterbury.ac.nz>

Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]:

> This is not always true, for example when the item is already present
> in the dict.
> It's not important to know what the function does to the object,
> Only the action on the reference is relevant.

Yes, that's the whole point. When using a function,
what you need to know is whether it borrows or steals
a reference.

But this file *doesn't tell* you that -- rather it
assigns either 0 or +1 to a borrowed reference,
apparently based on some notion of what the function
"usually" does with that parameter.

There does not seem to be enough information in that
file to work out the borrowed/stolen statuses, which
makes it seem rather useless.
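
The effect being discussed can be observed from Python. This is a sketch
that assumes CPython with its ordinary reference counting, and it uses
sys.getrefcount() only to compare deltas. Storing into a dict adds a
reference to the key and value the first time, but not when the same key
and value are stored again, so a fixed "+1" in the table describes what
usually happens rather than who owns the reference.

```python
import sys

key = object()
val = object()
d = {}

key_before = sys.getrefcount(key)
val_before = sys.getrefcount(val)

d[key] = val  # first store: the dict takes its own references
key_delta_first = sys.getrefcount(key) - key_before
val_delta_first = sys.getrefcount(val) - val_before

d[key] = val  # same key and value again: no net change
key_delta_second = sys.getrefcount(key) - key_before
val_delta_second = sys.getrefcount(val) - val_before
```

In both cases the caller still owns its own references afterwards, which
is exactly the "borrowed" behaviour the file does not state.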

-- 
Greg

From skip at pobox.com  Fri May  6 03:52:08 2011
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 5 May 2011 20:52:08 -0500
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <ipv28o$6e9$1@dough.gmane.org>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
	<BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
	<ipv28o$6e9$1@dough.gmane.org>
Message-ID: <19907.21576.751581.958722@montanaro.dyndns.org>


    Georg> Let's remove the cruft, and only keep interesting info.  This
    Georg> will also make the file much more manageable.

If I was to do this from scratch I'd think hard about annotating the source
code.  No matter how hard you try, if you keep this information separate
from the code and maintain it manually, it's going to get out-of-date.

Skip

From marks at dcs.gla.ac.uk  Fri May  6 09:44:11 2011
From: marks at dcs.gla.ac.uk (Mark Shannon)
Date: Fri, 06 May 2011 08:44:11 +0100
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <19907.21576.751581.958722@montanaro.dyndns.org>
References: <1304499523.15694.11.camel@marge>
	<4DC11791.2000109@dcs.gla.ac.uk>	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>	<4DC1D1C5.9010507@canterbury.ac.nz>	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>	<BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>	<BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>	<ipv28o$6e9$1@dough.gmane.org>
	<19907.21576.751581.958722@montanaro.dyndns.org>
Message-ID: <4DC3A6CB.5020809@dcs.gla.ac.uk>

skip at pobox.com wrote:
>     Georg> Let's remove the cruft, and only keep interesting info.  This
>     Georg> will also make the file much more manageable.
> 
> If I was to do this from scratch I'd think hard about annotating the source
> code.  No matter how hard you try, if you keep this information separate
> from the code and maintain it manually, it's going to get out-of-date.
> 
What about #defining PY_STOLEN in some header?

Then any stolen parameter can be prefixed with PY_STOLEN in the signature.

For return values, similarly #define PY_BORROWED.

Cheers,
Mark.

From amauryfa at gmail.com  Fri May  6 10:18:32 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Fri, 6 May 2011 10:18:32 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <4DC3A6CB.5020809@dcs.gla.ac.uk>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
	<BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
	<ipv28o$6e9$1@dough.gmane.org>
	<19907.21576.751581.958722@montanaro.dyndns.org>
	<4DC3A6CB.5020809@dcs.gla.ac.uk>
Message-ID: <BANLkTimh4a9BE+1HmuckMXf8yboxtc9m0w@mail.gmail.com>

On Friday, 6 May 2011, Mark Shannon <marks at dcs.gla.ac.uk> wrote:
> What about #defining PY_STOLEN in some header?
>
> Then any stolen parameter can be prefixed with PY_STOLEN in signature.
>
> For return values, similarly #define PY_BORROWED.

Header files are harder to parse, and I don't see how it would apply to macros.
What about additional tags in the .rst files?

-- 
Amaury

-- 
Amaury Forgeot d'Arc

From solipsis at pitrou.net  Fri May  6 12:27:03 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 6 May 2011 12:27:03 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<4DC34EAB.9050001@canterbury.ac.nz>
Message-ID: <20110506122703.17c4d889@pitrou.net>

On Fri, 06 May 2011 13:28:11 +1200
Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

> Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]:
> 
> > This is not always true, for example when the item is already present
> > in the dict.
> > It's not important to know what the function does to the object,
> > Only the action on the reference is relevant.
> 
> Yes, that's the whole point. When using a function,
> what you need to know is whether it borrows or steals
> a reference.

Doesn't "borrow" mean the same as "steal" in that context?
If an API borrows a reference, I expect it to take it from me.

Regards

Antoine.



From marks at dcs.gla.ac.uk  Fri May  6 12:45:38 2011
From: marks at dcs.gla.ac.uk (Mark Shannon)
Date: Fri, 06 May 2011 11:45:38 +0100
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <20110506122703.17c4d889@pitrou.net>
References: <1304499523.15694.11.camel@marge>
	<4DC11791.2000109@dcs.gla.ac.uk>	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>	<4DC1D1C5.9010507@canterbury.ac.nz>	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>	<4DC34EAB.9050001@canterbury.ac.nz>
	<20110506122703.17c4d889@pitrou.net>
Message-ID: <4DC3D152.601@dcs.gla.ac.uk>

Antoine Pitrou wrote:
> On Fri, 06 May 2011 13:28:11 +1200
> Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
>> Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]:
>>
>>> This is not always true, for example when the item is already present
>>> in the dict.
>>> It's not important to know what the function does to the object,
>>> Only the action on the reference is relevant.
>> Yes, that's the whole point. When using a function,
>> what you need to know is whether it borrows or steals
>> a reference.
> 
> Doesn't "borrow" mean the same as "steal" in that context?
> If an API borrows a reference, I expect it to take it from me.

"Stealing" takes the ownership. Borrowing does not.

This explains it better:
http://docs.python.org/py3k/c-api/intro.html#reference-count-details

Cheers,
Mark.

From jimjjewett at gmail.com  Fri May  6 15:49:19 2011
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 6 May 2011 09:49:19 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Userlist.copy() wasn't
	returning a UserList.
In-Reply-To: <E1QI6C6-0003Xg-UH@dinsdale.python.org>
References: <E1QI6C6-0003Xg-UH@dinsdale.python.org>
Message-ID: <BANLkTikphTzgiyGu9wZBwSnOv6W2Mwt97A@mail.gmail.com>

Do you also want to assert that u is not v, or would that sort of
"copy" be acceptable by some subclasses?
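
A sketch of what the stronger test could check, assuming the fixed copy()
shown in the changeset below (return self.__class__(self)). Tagged is a
hypothetical subclass used only for this illustration.

```python
from collections import UserList


class Tagged(UserList):
    """Hypothetical subclass; stands in for self.type2test in the test."""


u = Tagged([6, 8, 1, 9, 1])
v = u.copy()

same_contents = u == v
same_type = type(u) is type(v)
distinct_objects = u is not v

# Mutating the copy should leave the original untouched.
v.append(0)
original_length = len(u)
```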

On 5/5/11, raymond.hettinger <python-checkins at python.org> wrote:
> http://hg.python.org/cpython/rev/f20373fcdde5
> changeset:   69865:f20373fcdde5
> user:        Raymond Hettinger <python at rcn.com>
> date:        Thu May 05 14:34:35 2011 -0700
> summary:
>   Userlist.copy() wasn't returning a UserList.
>
> files:
>   Lib/collections/__init__.py |  2 +-
>   Lib/test/test_userlist.py   |  6 ++++++
>   2 files changed, 7 insertions(+), 1 deletions(-)
>
>
> diff --git a/Lib/collections/__init__.py b/Lib/collections/__init__.py
> --- a/Lib/collections/__init__.py
> +++ b/Lib/collections/__init__.py
> @@ -887,7 +887,7 @@
>      def pop(self, i=-1): return self.data.pop(i)
>      def remove(self, item): self.data.remove(item)
>      def clear(self): self.data.clear()
> -    def copy(self): return self.data.copy()
> +    def copy(self): return self.__class__(self)
>      def count(self, item): return self.data.count(item)
>      def index(self, item, *args): return self.data.index(item, *args)
>      def reverse(self): self.data.reverse()
> diff --git a/Lib/test/test_userlist.py b/Lib/test/test_userlist.py
> --- a/Lib/test/test_userlist.py
> +++ b/Lib/test/test_userlist.py
> @@ -52,6 +52,12 @@
>                  return str(key) + '!!!'
>          self.assertEqual(next(iter(T((1,2)))), "0!!!")
>
> +    def test_userlist_copy(self):
> +        u = self.type2test([6, 8, 1, 9, 1])
> +        v = u.copy()
> +        self.assertEqual(u, v)
> +        self.assertEqual(type(u), type(v))
> +
>  def test_main():
>      support.run_unittest(UserListTest)
>
>
> --
> Repository URL: http://hg.python.org/cpython
>

From ndbecker2 at gmail.com  Fri May  6 16:04:09 2011
From: ndbecker2 at gmail.com (Neal Becker)
Date: Fri, 06 May 2011 10:04:09 -0400
Subject: [Python-Dev] Linus on garbage collection
Message-ID: <iq0v4q$ubm$1@dough.gmane.org>

http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html


From solipsis at pitrou.net  Fri May  6 16:12:33 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 6 May 2011 16:12:33 +0200
Subject: [Python-Dev] Linus on garbage collection
References: <iq0v4q$ubm$1@dough.gmane.org>
Message-ID: <20110506161233.1ed647ec@pitrou.net>

On Fri, 06 May 2011 10:04:09 -0400
Neal Becker <ndbecker2 at gmail.com> wrote:
> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html

Since we're sharing links, here's Matt Mackall's take:
http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html

cheers

Antoine.




From marks at dcs.gla.ac.uk  Fri May  6 16:46:08 2011
From: marks at dcs.gla.ac.uk (Mark Shannon)
Date: Fri, 06 May 2011 15:46:08 +0100
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <iq0v4q$ubm$1@dough.gmane.org>
References: <iq0v4q$ubm$1@dough.gmane.org>
Message-ID: <4DC409B0.60909@dcs.gla.ac.uk>



Neal Becker wrote:
> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html
> 
Being famous does not necessarily make you right.

OS kernels are pretty atypical software; even if Linus is right about
Linux, it doesn't apply to Python.

I have empirical evidence, not opinion, that PyPy and my own HotPy
are a *lot* faster (x5 or better) on Unladen Swallow's gcbench benchmark 
(which stresses the memory management subsystem).

(Note that gcbench does not introduce any cycles, so it's being easy on 
CPython)

In fact, for gcbench CPython spends over twice as long in the 
cycle-collector as HotPy takes in total!
I don't have such detailed results for PyPy.

For other benchmarks, the HotPy GC times are often smaller than the 
inter-run variations in runtime, for example:

HotPy GC stats for pystones (on a slow machine with a small cache):

Total memory allocated: 20 Mbytes.
20 minor collections, 0 major collections
Max heap size 2.4 Mbytes.
Total time spent in GC: 3.5 milliseconds. ( <1% of execution time)

My GC is quick, but it's not the fastest.

Evidence trumps opinion IMHO ;)

Cheers,
Mark.

> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/marks%40dcs.gla.ac.uk


From solipsis at pitrou.net  Fri May  6 17:33:51 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 6 May 2011 17:33:51 +0200
Subject: [Python-Dev] Linus on garbage collection
References: <iq0v4q$ubm$1@dough.gmane.org>
	<4DC409B0.60909@dcs.gla.ac.uk>
Message-ID: <20110506173351.4aef8145@pitrou.net>

On Fri, 06 May 2011 15:46:08 +0100
Mark Shannon <marks at dcs.gla.ac.uk> wrote:
> 
> Neal Becker wrote:
> > http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html
> > 
> Being famous does not necessarily make you right.
> 
> OS kernels are pretty atypical software,
> even if Linus is right about Linux, it doesn't apply to Python.
> 
> I have empirical evidence, not opinion, that PyPy and my own HotPy
> are a *lot* faster (x5 or better) on Unladen Swallow's gcbench benchmark 
> (which stresses the memory management subsystem).
> 
> (Note that gcbench does not introduce any cycles, so it's being easy on 
> CPython)
> 
> In fact, for gcbench CPython spends over twice as long in the 
> cycle-collector as HotPy takes in total!

The thing is, it would be easy to change our collection heuristics
so that the cycle collector gets called less often (actually, you can
already do so using gc.set_threshold, IIRC). That is much more
delicate for a "full" GC, where it would grow memory consumption a
lot.
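
For reference, a minimal sketch of the tuning mentioned above, using the
standard gc module. The value 10000 is an arbitrary assumption chosen for
illustration, not a recommendation; the useful range depends on the
workload and on the CPython version.

```python
import gc

# Thresholds are reported as a 3-tuple, one entry per generation.
old = gc.get_threshold()

# Raise only the first threshold so that the cycle collector is
# triggered less often; the other two values are left unchanged.
gc.set_threshold(10000, old[1], old[2])
new = gc.get_threshold()

# Put the original settings back once the experiment is over.
gc.set_threshold(*old)
restored = gc.get_threshold()
```

Since reference counting still frees acyclic objects immediately, running
the cycle collector less often mostly delays reclamation of cyclic
garbage.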

Regards

Antoine.



From status at bugs.python.org  Fri May  6 18:07:23 2011
From: status at bugs.python.org (Python tracker)
Date: Fri,  6 May 2011 18:07:23 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20110506160723.04A101CFD5@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (2011-04-29 - 2011-05-06)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2783 (+23)
  closed 21017 (+41)
  total  23800 (+64)

Open issues with patches: 1201 


Issues opened (47)
==================

#11955: 3.3 : test_argparse.py fails 'make test'
http://bugs.python.org/issue11955  opened by Jason.Vas.Dias

#11956: 3.3 : test_import.py causes 'make test' to fail
http://bugs.python.org/issue11956  opened by Jason.Vas.Dias

#11957: re.sub confusion between count and flags args
http://bugs.python.org/issue11957  opened by mindauga

#11959: smtpd cannot be used without affecting global state
http://bugs.python.org/issue11959  opened by vinay.sajip

#11962: Buildbot reliability
http://bugs.python.org/issue11962  opened by skrah

#11963: Use real assert* for test_trigger_memory_error (test_parser)
http://bugs.python.org/issue11963  opened by eric.araujo

#11964: Undocumented change to indent param of json.dump in 3.2
http://bugs.python.org/issue11964  opened by eric.araujo

#11965: Simplify context manager in os.popen
http://bugs.python.org/issue11965  opened by eric.araujo

#11968: wsgiref's wsgi application sample code does not work
http://bugs.python.org/issue11968  opened by shimizukawa

#11969: Can't launch Process on built-in static method
http://bugs.python.org/issue11969  opened by cool-RR

#11972: input does not strip a trailing newline correctly on Windows
http://bugs.python.org/issue11972  opened by Michal.Molhanec

#11973: kevent does not accept KQ_NOTE_EXIT (and other (f)flags)
http://bugs.python.org/issue11973  opened by DragonSA

#11974: Class definition gotcha.. should this be documented somewhere?
http://bugs.python.org/issue11974  opened by sleepycal

#11975: Fix referencing of built-in types (list, int, ...)
http://bugs.python.org/issue11975  opened by jonash

#11978: Report correct coverage.py data for tests that invoke subproce
http://bugs.python.org/issue11978  opened by ncoghlan

#11979: Minor improvements to the Sockets readme: typos, wording and s
http://bugs.python.org/issue11979  opened by xmorel

#11980: zipfile.ZipFile.write should accept fp as argument
http://bugs.python.org/issue11980  opened by proppy

#11981: dupe self.fp.tell() in zipfile.ZipFile.writestr
http://bugs.python.org/issue11981  opened by proppy

#11983: Inconsistent hash and comparison for code objects
http://bugs.python.org/issue11983  opened by eltoder

#11984: Wrong "See also" in symbol and token module docs
http://bugs.python.org/issue11984  opened by davipo

#11989: deprecate shutil.copy2
http://bugs.python.org/issue11989  opened by datamuc

#11990: redirected output - stdout writes newline as \n in windows
http://bugs.python.org/issue11990  opened by Jimbofbx

#11992: sys.settrace doesn't disable tracing if a local trace function
http://bugs.python.org/issue11992  opened by nedbat

#11993: Use sub-second resolution to determine if a file is newer
http://bugs.python.org/issue11993  opened by jsjgruber

#11994: [2.7/gcc-4.4.3] Segfault under valgrind in string.split()
http://bugs.python.org/issue11994  opened by skrah

#11995: test_pydoc loads all Python modules
http://bugs.python.org/issue11995  opened by haypo

#11996: libpython.py: nicer py-bt output
http://bugs.python.org/issue11996  opened by haypo

#11998: test_signal cannot test blocked signals if _tkinter is loaded;
http://bugs.python.org/issue11998  opened by haypo

#11999: sporadic failure in test_mailbox on FreeBSD
http://bugs.python.org/issue11999  opened by haypo

#12001: Extend json.dumps to handle N-triples strings
http://bugs.python.org/issue12001  opened by Glenn.Ammons

#12002: ftplib.FTP.abort fails with TypeError on Python 3.x
http://bugs.python.org/issue12002  opened by nneonneo

#12003: documentation: alternate version of xrange seems to fail.
http://bugs.python.org/issue12003  opened by tenuki

#12004: PyZipFile.writepy gives internal error on syntax errors
http://bugs.python.org/issue12004  opened by Ben.Morgan

#12005: modulo result of Decimal differs from float/int
http://bugs.python.org/issue12005  opened by Kotan

#12006: strptime should implement %V or %u directive from libc
http://bugs.python.org/issue12006  opened by Erik.Cederstrand

#12007: Console commands won't work
http://bugs.python.org/issue12007  opened by jake_mcaga

#12008: HtmlParser non-strict goes wrong with unquoted attributes
http://bugs.python.org/issue12008  opened by svilend

#12009: netrc module crashes if netrc file has comment lines
http://bugs.python.org/issue12009  opened by rmstoi

#12010: Compile fails when sizeof(wchar_t) == 1
http://bugs.python.org/issue12010  opened by dcoles

#12011: The signal module should raise OSError for OS-related exceptio
http://bugs.python.org/issue12011  opened by pitrou

#12012: _ssl module doesn't compile with OpenSSL 1.0.0d: SSLv2_method 
http://bugs.python.org/issue12012  opened by haypo

#12013: file /usr/local/lib/python3.1/lib-dynload/_socket.so: symbol i
http://bugs.python.org/issue12013  opened by alex_lai

#12014: str.format parses replacement field incorrectly
http://bugs.python.org/issue12014  opened by Ben.Wolfson

#12015: possible characters in temporary file name is too few
http://bugs.python.org/issue12015  opened by planet36

#12016: Wrong behavior for '\xff\n'.decode('gb2312', 'ignore')
http://bugs.python.org/issue12016  opened by cdqzzy

#12017: Decoding a highly-nested object with json (_speedups enabled) 
http://bugs.python.org/issue12017  opened by ivank

#12018: No tests for ntpath.samefile, ntpath.sameopenfile
http://bugs.python.org/issue12018  opened by ronaldoussoren



Most recent 15 issues with no replies (15)
==========================================

#12018: No tests for ntpath.samefile, ntpath.sameopenfile
http://bugs.python.org/issue12018

#12016: Wrong behavior for '\xff\n'.decode('gb2312', 'ignore')
http://bugs.python.org/issue12016

#12013: file /usr/local/lib/python3.1/lib-dynload/_socket.so: symbol i
http://bugs.python.org/issue12013

#12009: netrc module crashes if netrc file has comment lines
http://bugs.python.org/issue12009

#12003: documentation: alternate version of xrange seems to fail.
http://bugs.python.org/issue12003

#12002: ftplib.FTP.abort fails with TypeError on Python 3.x
http://bugs.python.org/issue12002

#12001: Extend json.dumps to handle N-triples strings
http://bugs.python.org/issue12001

#11992: sys.settrace doesn't disable tracing if a local trace function
http://bugs.python.org/issue11992

#11989: deprecate shutil.copy2
http://bugs.python.org/issue11989

#11984: Wrong "See also" in symbol and token module docs
http://bugs.python.org/issue11984

#11983: Inconsistent hash and comparison for code objects
http://bugs.python.org/issue11983

#11979: Minor improvements to the Sockets readme: typos, wording and s
http://bugs.python.org/issue11979

#11973: kevent does not accept KQ_NOTE_EXIT (and other (f)flags)
http://bugs.python.org/issue11973

#11969: Can't launch Process on built-in static method
http://bugs.python.org/issue11969

#11968: wsgiref's wsgi application sample code does not work
http://bugs.python.org/issue11968



Most recent 15 issues waiting for review (15)
=============================================

#12015: possible characters in temporary file name is too few
http://bugs.python.org/issue12015

#12012: _ssl module doesn't compile with OpenSSL 1.0.0d: SSLv2_method 
http://bugs.python.org/issue12012

#12008: HtmlParser non-strict goes wrong with unquoted attributes
http://bugs.python.org/issue12008

#12004: PyZipFile.writepy gives internal error on syntax errors
http://bugs.python.org/issue12004

#11999: sporadic failure in test_mailbox on FreeBSD
http://bugs.python.org/issue11999

#11998: test_signal cannot test blocked signals if _tkinter is loaded;
http://bugs.python.org/issue11998

#11996: libpython.py: nicer py-bt output
http://bugs.python.org/issue11996

#11989: deprecate shutil.copy2
http://bugs.python.org/issue11989

#11981: dupe self.fp.tell() in zipfile.ZipFile.writestr
http://bugs.python.org/issue11981

#11980: zipfile.ZipFile.write should accept fp as argument
http://bugs.python.org/issue11980

#11973: kevent does not accept KQ_NOTE_EXIT (and other (f)flags)
http://bugs.python.org/issue11973

#11963: Use real assert* for test_trigger_memory_error (test_parser)
http://bugs.python.org/issue11963

#11956: 3.3 : test_import.py causes 'make test' to fail
http://bugs.python.org/issue11956

#11949: Make float('nan') unorderable
http://bugs.python.org/issue11949

#11948: Tutorial/Modules - small fix to better clarify the modules sea
http://bugs.python.org/issue11948



Top 10 most discussed issues (10)
=================================

#11277: Crash with mmap and sparse files on Mac OS X
http://bugs.python.org/issue11277  19 msgs

#8407: expose signalfd(2) and pthread_sigmask in the signal module
http://bugs.python.org/issue8407  18 msgs

#11935: MMDF/MBOX mailbox need utime
http://bugs.python.org/issue11935  17 msgs

#11999: sporadic failure in test_mailbox on FreeBSD
http://bugs.python.org/issue11999  11 msgs

#6721: Locks in python standard library should be sanitized on fork
http://bugs.python.org/issue6721  10 msgs

#9971: Optimize BufferedReader.readinto
http://bugs.python.org/issue9971   9 msgs

#3526: Customized malloc implementation on SunOS and AIX
http://bugs.python.org/issue3526   8 msgs

#11962: Buildbot reliability
http://bugs.python.org/issue11962   8 msgs

#11949: Make float('nan') unorderable
http://bugs.python.org/issue11949   7 msgs

#11954: 3.3 - 'make test' fails
http://bugs.python.org/issue11954   7 msgs



Issues closed (37)
==================

#1856: shutdown (exit) can hang or segfault with daemon threads runni
http://bugs.python.org/issue1856  closed by pitrou

#7517: freeze.py not ported to python3
http://bugs.python.org/issue7517  closed by eric.araujo

#8158: Docstring of optparse.OptionParser incomplete
http://bugs.python.org/issue8158  closed by r.david.murray

#9756: Crash with custom __getattribute__
http://bugs.python.org/issue9756  closed by haypo

#10684: Folders get deleted when trying to change case with shutil.mov
http://bugs.python.org/issue10684  closed by ronaldoussoren

#10775: assertRaises as a context manager should accept a 'msg' keywor
http://bugs.python.org/issue10775  closed by ezio.melotti

#10922: Unexpected exception when calling function_proxy.__class__.__c
http://bugs.python.org/issue10922  closed by haypo

#11034: Build problem on Windows with MSVC++ Express 2008
http://bugs.python.org/issue11034  closed by loewis

#11206: test_readline unconditionally calls clear_history()
http://bugs.python.org/issue11206  closed by ned.deily

#11247: Error sending packets to multicast IPV4 address
http://bugs.python.org/issue11247  closed by neologix

#11335: Memory leak after key function failure in sort
http://bugs.python.org/issue11335  closed by stutzbach

#11834: wrong module installation dir on Windows
http://bugs.python.org/issue11834  closed by brian.curtin

#11849: glibc allocator doesn't release all free()ed memory
http://bugs.python.org/issue11849  closed by pitrou

#11873: test_regexp() of test_compileall fails occassionally
http://bugs.python.org/issue11873  closed by r.david.murray

#11883: Call connect() before sending an email with smtplib
http://bugs.python.org/issue11883  closed by r.david.murray

#11887: unittest fails on comparing str with bytes if python has the -
http://bugs.python.org/issue11887  closed by michael.foord

#11898: Sending binary data with a POST request in httplib can cause U
http://bugs.python.org/issue11898  closed by orsenthil

#11912: PaX triggers a segfault in dlopen
http://bugs.python.org/issue11912  closed by neologix

#11930: Remove time.accept2dyear
http://bugs.python.org/issue11930  closed by belopolsky

#11950: logger use dict for loggers instead of WeakValueDictionary
http://bugs.python.org/issue11950  closed by vinay.sajip

#11958: test.test_ftplib.TestIPv6Environment failure
http://bugs.python.org/issue11958  closed by python-dev

#11960: Python crashes when running numpy test
http://bugs.python.org/issue11960  closed by amaury.forgeotdarc

#11961: Document STARTUPINFO and creationflags options for Windows
http://bugs.python.org/issue11961  closed by brian.curtin

#11966: Typo in PyModule_AddIntMacro's documentation
http://bugs.python.org/issue11966  closed by python-dev

#11967: Left shift and Right shift for floats
http://bugs.python.org/issue11967  closed by loewis

#11970: distutils command 'upload' crashes when --show-response is sel
http://bugs.python.org/issue11970  closed by offby1

#11971: Wrong parameter -O0 instead of -OO in manpage
http://bugs.python.org/issue11971  closed by r.david.murray

#11976: Provide proper documentation for list data type
http://bugs.python.org/issue11976  closed by georg.brandl

#11977: Document int.conjugate, .denominator, ...
http://bugs.python.org/issue11977  closed by python-dev

#11982: json.loads() returns str instead of unicode for empty strings
http://bugs.python.org/issue11982  closed by ezio.melotti

#11985: Document that platform.python_implementation supports PyPy
http://bugs.python.org/issue11985  closed by ezio.melotti

#11986: Min/max not symmetric in presence of NaN
http://bugs.python.org/issue11986  closed by rhettinger

#11987: queue.Queue.put should acquire mutex for unfinished_tasks
http://bugs.python.org/issue11987  closed by rhettinger

#11988: special method lookup docs don't address some important detail
http://bugs.python.org/issue11988  closed by r.david.murray

#11991: test_distutils fails because of bad filename match
http://bugs.python.org/issue11991  closed by eric.araujo

#11997: One typo in Doc/c-api/init.rst
http://bugs.python.org/issue11997  closed by ezio.melotti

#12000: SSL certificate verification failed if no dNSName entry in sub
http://bugs.python.org/issue12000  closed by pitrou

From skip at pobox.com  Fri May  6 18:18:51 2011
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 6 May 2011 11:18:51 -0500
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <20110506161233.1ed647ec@pitrou.net>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
Message-ID: <19908.8043.8921.50222@montanaro.dyndns.org>


    Antoine> Since we're sharing links, here's Matt Mackall's take:
    Antoine> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html

From that note:

    1: You can't have meaningful destructors, because when destruction
    happens is undefined. And going-out-of-scope destructors are extremely
    useful. Python is already a rather broken in this regard, so feel free
    to ignore this point.

Given the presence of cyclic data I don't see how reference counting or
garbage collection win.  Ignoring the fact that in a pure reference counted
system you won't even consider cycles for reclamation, would both RC and GC
have to punt because they can't tell which object's destructor to call
first?

Skip

From fuzzyman at voidspace.org.uk  Fri May  6 18:31:44 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 06 May 2011 17:31:44 +0100
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <19908.8043.8921.50222@montanaro.dyndns.org>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
Message-ID: <4DC42270.1000301@voidspace.org.uk>

On 06/05/2011 17:18, skip at pobox.com wrote:
>      Antoine>  Since we're sharing links, here's Matt Mackall's take:
>      Antoine>  http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
>
> > From that note:
>
>      1: You can't have meaningful destructors, because when destruction
>      happens is undefined. And going-out-of-scope destructors are extremely
>      useful. Python is already a rather broken in this regard, so feel free
>      to ignore this point.
>
> Given the presence of cyclic data I don't see how reference counting or
> garbage collection win.  Ignoring the fact that in a pure reference counted
> system you won't even consider cycles for reclamation, would both RC and GC
> have to punt because they can't tell which object's destructor to call
> first?

pypy and .NET choose to arbitrarily break cycles rather than leave 
objects unfinalised and memory unreclaimed. Not sure what Java does.

All the best,

Michael Foord

> Skip


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From greg at krypto.org  Fri May  6 18:32:51 2011
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 6 May 2011 09:32:51 -0700
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <19908.8043.8921.50222@montanaro.dyndns.org>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
Message-ID: <BANLkTikUwyELPXSC6zSWnP8Xv8OqsBkkhQ@mail.gmail.com>

On Fri, May 6, 2011 at 9:18 AM,  <skip at pobox.com> wrote:
>
>    Antoine> Since we're sharing links, here's Matt Mackall's take:
>    Antoine> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
>
> >From that note:
>
>    1: You can't have meaningful destructors, because when destruction
>    happens is undefined. And going-out-of-scope destructors are extremely
>    useful. Python is already a rather broken in this regard, so feel free
>    to ignore this point.

Python being "broken" in this regard is pretty much exactly why
the with statement and context managers (__enter__ and __exit__) were
added to the language.

That gives the ability to have the equivalent of well defined nested
scopes that destroy something (exit) deterministically much as it is
easy to do in C++ with some {}s and a ~destructor().

It is not broken, just different.
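
A minimal sketch of what that looks like in practice. The Resource class
and the log list are hypothetical names made up for this illustration;
only the __enter__/__exit__ protocol itself comes from the language.

```python
class Resource:
    """Hypothetical resource, used only to show the protocol."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __enter__(self):
        self.log.append("acquire " + self.name)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs when the with block is left, whether normally or through
        # an exception, independently of when the object is collected.
        self.log.append("release " + self.name)
        return False  # do not swallow exceptions


log = []
with Resource("outer", log):
    with Resource("inner", log):
        log.append("work")
```

The releases happen in reverse order of acquisition, much like nested C++
scopes, and they do not depend on reference counts or on the collector.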

-gps

From marks at dcs.gla.ac.uk  Fri May  6 18:33:03 2011
From: marks at dcs.gla.ac.uk (Mark Shannon)
Date: Fri, 06 May 2011 17:33:03 +0100
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <19908.8043.8921.50222@montanaro.dyndns.org>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
Message-ID: <4DC422BF.4010006@dcs.gla.ac.uk>

skip at pobox.com wrote:
>     Antoine> Since we're sharing links, here's Matt Mackall's take:
>     Antoine> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
> 
>>From that note:
> 
>     1: You can't have meaningful destructors, because when destruction
>     happens is undefined. And going-out-of-scope destructors are extremely
>     useful. Python is already a rather broken in this regard, so feel free
>     to ignore this point.
> 
> Given the presence of cyclic data I don't see how reference counting or
> garbage collection win.  Ignoring the fact that in a pure reference counted
> system you won't even consider cycles for reclamation, would both RC and GC
> have to punt because they can't tell which object's destructor to call
> first?

It doesn't matter which is called first.
In fact, the VM could call all the destructors at the same time if the 
machine has enough cores and there's no GIL.

All objects are kept alive by the GC until after the destructors are 
called. Those that are still dead will have their memory reclaimed.
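
A small sketch of that ordering, under the assumption of CPython 3.4 or
later, where PEP 442 gave the cycle collector this behaviour (CPython at
the time of this thread did not finalize such cycles). The Node class and
the log list are hypothetical names for the illustration.

```python
import gc

log = []


class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        # Assumption: CPython >= 3.4. The partner has not been torn
        # down yet, so its attributes are still readable here.
        log.append((self.name, self.partner.name))


gc.disable()  # keep automatic collections out of the demonstration
a = Node("a")
b = Node("b")
a.partner = b
b.partner = a
del a, b      # the cycle keeps both objects alive
gc.collect()  # both finalizers run, then the memory is reclaimed
gc.enable()
```

The order in which the two finalizers run is not specified, which is why
the sketch only records what each one could see.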

> 
> Skip


From stefan_ml at behnel.de  Fri May  6 18:51:37 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 06 May 2011 18:51:37 +0200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC422BF.4010006@dcs.gla.ac.uk>
References: <iq0v4q$ubm$1@dough.gmane.org>
	<20110506161233.1ed647ec@pitrou.net>	<19908.8043.8921.50222@montanaro.dyndns.org>
	<4DC422BF.4010006@dcs.gla.ac.uk>
Message-ID: <iq18uq$s9p$1@dough.gmane.org>

Mark Shannon, 06.05.2011 18:33:
> skip at pobox.com wrote:
>> Antoine> Since we're sharing links, here's Matt Mackall's take:
>> Antoine>
>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
>>
>>> From that note:
>>
>> 1: You can't have meaningful destructors, because when destruction
>> happens is undefined. And going-out-of-scope destructors are extremely
>> useful. Python is already a rather broken in this regard, so feel free
>> to ignore this point.
>>
>> Given the presence of cyclic data I don't see how reference counting or
>> garbage collection win. Ignoring the fact that in a pure reference counted
>> system you won't even consider cycles for reclamation, would both RC and GC
>> have to punt because they can't tell which object's destructor to call
>> first?
>
> It doesn't matter which is called first.

May I quote you on that one the next time my software crashes?

It may not make a difference for the runtime, but the difference for user 
software may be "dead" or "alive".

Stefan


From fuzzyman at voidspace.org.uk  Fri May  6 19:04:53 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 06 May 2011 18:04:53 +0100
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <BANLkTikUwyELPXSC6zSWnP8Xv8OqsBkkhQ@mail.gmail.com>
References: <iq0v4q$ubm$1@dough.gmane.org>
	<20110506161233.1ed647ec@pitrou.net>	<19908.8043.8921.50222@montanaro.dyndns.org>
	<BANLkTikUwyELPXSC6zSWnP8Xv8OqsBkkhQ@mail.gmail.com>
Message-ID: <4DC42A35.6060303@voidspace.org.uk>

On 06/05/2011 17:32, Gregory P. Smith wrote:
> On Fri, May 6, 2011 at 9:18 AM,<skip at pobox.com>  wrote:
>>     Antoine>  Since we're sharing links, here's Matt Mackall's take:
>>     Antoine>  http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
>>
>> > From that note:
>>
>>     1: You can't have meaningful destructors, because when destruction
>>     happens is undefined. And going-out-of-scope destructors are extremely
>>     useful. Python is already a rather broken in this regard, so feel free
>>     to ignore this point.
> Python being "broken" in this regard is pretty much exactly why
> __enter__, __exit__ and with as context managers were added to the
> language.
>

How does that help with cycles? Sure it makes cleaning up some resources 
easier, but not at all in this case. Explicit destruction is of course 
always an alternative to the runtime doing it for you, but it doesn't 
help with (for example) reclaiming memory. For long running processes 
memory leaks due to unreclaimable cycles can be a problem with CPython.
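
To make the problem concrete, here is a minimal CPython-specific sketch
showing that reference counting alone never frees a cycle and that it
takes the cycle collector to do so. Node and probe are hypothetical names.
The sketch deliberately avoids __del__: in CPython before 3.4, cycles
containing objects with __del__ were not collected at all and ended up in
gc.garbage, which is the leak described above.

```python
import gc
import weakref


class Node:
    pass


gc.disable()  # keep automatic collections out of the demonstration

a = Node()
b = Node()
a.other = b
b.other = a
probe = weakref.ref(a)  # lets us observe a without keeping it alive

del a, b
# Each object is still referenced by the other, so neither refcount
# has dropped to zero.
alive_after_del = probe() is not None

gc.collect()
# The cycle collector found the unreachable pair and freed it.
alive_after_collect = probe() is not None

gc.enable()
```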

> That gives the ability to have the equivalent of well defined nested
> scopes that destroy something (exit) deterministically much as it is
> easy to do in C++ with some {}s and a ~destructor().
>
> It is not broken, just different.

+1 QOTW ;-)

Michael
> -gps


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From fuzzyman at voidspace.org.uk  Fri May  6 19:06:35 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 06 May 2011 18:06:35 +0100
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <iq18uq$s9p$1@dough.gmane.org>
References: <iq0v4q$ubm$1@dough.gmane.org>	<20110506161233.1ed647ec@pitrou.net>	<19908.8043.8921.50222@montanaro.dyndns.org>	<4DC422BF.4010006@dcs.gla.ac.uk>
	<iq18uq$s9p$1@dough.gmane.org>
Message-ID: <4DC42A9B.6020000@voidspace.org.uk>

On 06/05/2011 17:51, Stefan Behnel wrote:
> Mark Shannon, 06.05.2011 18:33:
>> skip at pobox.com wrote:
>>> Antoine> Since we're sharing links, here's Matt Mackall's take:
>>> Antoine>
>>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
>>>
>>>> From that note:
>>>
>>> 1: You can't have meaningful destructors, because when destruction
>>> happens is undefined. And going-out-of-scope destructors are extremely
>>> useful. Python is already a rather broken in this regard, so feel free
>>> to ignore this point.
>>>
>>> Given the presence of cyclic data I don't see how reference counting or
>>> garbage collection win. Ignoring the fact that in a pure reference 
>>> counted
>>> system you won't even consider cycles for reclamation, would both RC 
>>> and GC
>>> have to punt because they can't tell which object's destructor to call
>>> first?
>>
>> It doesn't matter which is called first.
>
> May I quote you on that one the next time my software crashes?
>

Arbitrarily breaking cycles *could* cause a problem if a destructor 
attempts to access an already collected object. Not breaking cycles 
*definitely* leaks memory and definitely doesn't call finalizers.

Michael

> It may not make a difference for the runtime, but the difference for 
> user software may be "dead" or "alive".
>
> Stefan
>


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From glyph at twistedmatrix.com  Fri May  6 19:07:44 2011
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Fri, 6 May 2011 13:07:44 -0400
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC42270.1000301@voidspace.org.uk>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
	<4DC42270.1000301@voidspace.org.uk>
Message-ID: <8F83194F-5A5C-496E-920A-A2488F9949E4@twistedmatrix.com>

On May 6, 2011, at 12:31 PM, Michael Foord wrote:

> pypy and .NET choose to arbitrarily break cycles rather than leave objects unfinalised and memory unreclaimed. Not sure what Java does.

I think that's a mischaracterization of their respective collectors; "arbitrarily break cycles" implies that user code would see broken or incomplete objects, at least during finalization, which I'm fairly sure is not true on either .NET or PyPy.

Java definitely has a collector that can handle cycles too.  (None of these are reference counting.)

-glyph

From stephen at xemacs.org  Fri May  6 19:15:33 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 07 May 2011 02:15:33 +0900
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC409B0.60909@dcs.gla.ac.uk>
References: <iq0v4q$ubm$1@dough.gmane.org>
	<4DC409B0.60909@dcs.gla.ac.uk>
Message-ID: <87y62jeone.fsf@uwakimon.sk.tsukuba.ac.jp>

Mark Shannon writes:
 > 
 > 
 > Neal Becker wrote:
 > > http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html
 > > 
 > Being famous does not necessarily make you right.

No, but being a genius sure helps you beat the odds.

 > OS kernels are pretty atypical software,
 > even if Linus is right about Linux, it doesn't apply to Python.

Well, actually he was writing about GCC....

 > I have empirical evidence, not opinion, that PyPy and my own HotPy
 > are a *lot* faster (x5 or better) on Unladen Swallow's gcbench benchmark 
 > (which stresses the memory management subsystem).

You're missing Linus's point, I think.  Linus did *not* claim that
it's impossible to write a fast *GC*.  He claimed that it's hard to
write a fast *program* that uses GC for memory management.  A
benchmark that stresses *only* the memory management system is
unlikely to impress him.


From fuzzyman at voidspace.org.uk  Fri May  6 19:12:51 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 06 May 2011 18:12:51 +0100
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <8F83194F-5A5C-496E-920A-A2488F9949E4@twistedmatrix.com>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
	<4DC42270.1000301@voidspace.org.uk>
	<8F83194F-5A5C-496E-920A-A2488F9949E4@twistedmatrix.com>
Message-ID: <4DC42C13.8070806@voidspace.org.uk>

On 06/05/2011 18:07, Glyph Lefkowitz wrote:
> On May 6, 2011, at 12:31 PM, Michael Foord wrote:
>
>> pypy and .NET choose to arbitrarily break cycles rather than leave 
>> objects unfinalised and memory unreclaimed. Not sure what Java does.
>
> I think that's a mischaracterization of their respective collectors; 
> "arbitrarily break cycles" implies that user code would see broken or 
> incomplete objects, at least during finalization, which I'm fairly 
> sure is not true on either .NET or PyPy.

http://morepypy.blogspot.com/2008/02/python-finalizers-semantics-part-1.html

"Therefore we decided to break such a cycle at an arbitrary place, which 
doesn't sound too insane."

All the best,

Michael Foord
>
> Java definitely has a collector that can handle cycles too.  (None of 
> these are reference counting.)
>
> -glyph


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From marks at dcs.gla.ac.uk  Fri May  6 19:46:37 2011
From: marks at dcs.gla.ac.uk (Mark Shannon)
Date: Fri, 06 May 2011 18:46:37 +0100
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC4321F.3070206@voidspace.org.uk>
References: <iq0v4q$ubm$1@dough.gmane.org>	<20110506161233.1ed647ec@pitrou.net>	<19908.8043.8921.50222@montanaro.dyndns.org>	<4DC422BF.4010006@dcs.gla.ac.uk>	<iq18uq$s9p$1@dough.gmane.org>
	<4DC42A9B.6020000@voidspace.org.uk>
	<4DC42F4B.1050509@dcs.gla.ac.uk>
	<4DC4321F.3070206@voidspace.org.uk>
Message-ID: <4DC433FD.6090803@dcs.gla.ac.uk>

Michael Foord wrote:
> On 06/05/2011 18:26, Mark Shannon wrote:
>> Michael Foord wrote:
>>> On 06/05/2011 17:51, Stefan Behnel wrote:
>>>> Mark Shannon, 06.05.2011 18:33:
>>>>> skip at pobox.com wrote:
>>>>>> Antoine> Since we're sharing links, here's Matt Mackall's take:
>>>>>> Antoine>
>>>>>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
>>>>>>
>>>>>>> From that note:
>>>>>> 1: You can't have meaningful destructors, because when destruction
>>>>>> happens is undefined. And going-out-of-scope destructors are 
>>>>>> extremely
>>>>>> useful. Python is already a rather broken in this regard, so feel 
>>>>>> free
>>>>>> to ignore this point.
>>>>>>
>>>>>> Given the presence of cyclic data I don't see how reference 
>>>>>> counting or
>>>>>> garbage collection win. Ignoring the fact that in a pure reference 
>>>>>> counted
>>>>>> system you won't even consider cycles for reclmation, would both 
>>>>>> RC and GC
>>>>>> have to punt because they can't tell which object's destructor to 
>>>>>> call
>>>>>> first?
>>>>> It doesn't matter which is called first.
>>>> May I quote you on that one the next time my software crashes?
>>>>
>>> Arbitrarily breaking cycles *could* cause a problem if a destructor 
>>> attempts to access an already collected object. Not breaking cycles 
>>> *definitely* leaks memory and definitely doesn't call finalizers.
>> You don't need to break the cycles to call the finalizers. Just call 
>> them, then collect the whole cycle (assuming it is still unreachable).
>>
>> The GC will *never* reclaim a reachable object. Objects awaiting 
>> finalization are reachable, by definition.
>>
> Well it was sloppily worded, so replace it with:
> 
>      if a finalizer attempts to access an already finalized object.

A finalized object will still be a valid object.
Python code cannot make an object unsafe.
Obviously C code can make it unsafe, but that's true of C code anywhere.

For example, a file object will close itself during finalization,
but it's still a valid object, just a closed file rather than an open one.
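
A minimal sketch of the closed-file point, assuming an explicit close()
is a fair stand-in for what the finalizer does. The helper name
use_after_close is hypothetical and only for illustration:

```python
import tempfile


def use_after_close():
    """Close a file, then show that the object itself is still valid."""
    f = tempfile.TemporaryFile("w+")
    f.write("data")
    f.close()                      # stands in for what finalization does
    try:
        f.read()
    except ValueError as exc:      # I/O on a closed file raises; no crash
        outcome = type(exc).__name__
    else:
        outcome = "no error"
    # The object can still be inspected after it has been closed.
    return f.closed, outcome
```

The attribute lookup and the method call both succeed as Python-level
operations; only the I/O itself is refused.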
> 
> Michael
>>> Michael
>>>
>>>> It may not make a difference for the runtime, but the difference for 
>>>> user software may be "dead" or "alive".
>>>>
>>>> Stefan
>>>>
>>>
> 
> 


From merwok at netwok.org  Fri May  6 19:42:11 2011
From: merwok at netwok.org (Éric Araujo)
Date: Fri, 06 May 2011 19:42:11 +0200
Subject: [Python-Dev] cpython (3.2): Avoid codec spelling issues by just
 using the utf-8 default.
In-Reply-To: <ipv9nh$f5b$1@dough.gmane.org>
References: "\"<E1QI3RQ-00050Z-JW@dinsdale.python.org>"
	<BANLkTikamsX+wcTEKu1JtrpzgWzO7H_Huw@mail.gmail.com>"
	<926F0913-8142-430A-8400-6E6F0CD5B8F1@gmail.com>
	<ipv9nh$f5b$1@dough.gmane.org>
Message-ID: <da8189c6ef78d0e47bea356efadae97e@netwok.org>

 On 06/05/2011 00:52, Terry Reedy wrote:
> On 5/5/2011 4:55 PM, Raymond Hettinger wrote:
>> Either way, the code is simpler by just using the default.
> I thought about this and decided that the purpose of having defaults 
> is
> so one does not have to always spell it out. So use it. Readers can
> always look it up and learn.

 Agreed.  I thought about something similar after Victor's commit that 
 changed open(mode='rU') to use just 'r': Why not remove the mode 
 argument entirely when it is the default value?

 Regards

From merwok at netwok.org  Fri May  6 19:51:31 2011
From: merwok at netwok.org (Éric Araujo)
Date: Fri, 06 May 2011 19:51:31 +0200
Subject: [Python-Dev] Problems with regrtest and with logging
Message-ID: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>

 Hi,

 Sorry for the quick email; my battery is dying.

 regrtest helpfully reports when a test leaves the environment unclean 
 (sys.path, os.environ, logging._handlerList), but I think the 
 implementation is buggy: it compares object identity and then value.  
 Why is comparing identity useful?  I'd just use ==.  It makes writing 
 cleanup code easier (just use addCleanup(setattr, obj, 'attr', 
 copy(obj.attr))).
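
 A rough sketch of that cleanup idiom and of why an identity check 
 still complains about it.  FakeEnv is a hypothetical stand-in for 
 shared state such as sys.path; it is not what regrtest itself looks 
 at, and run_and_compare is only an illustrative helper:

```python
import copy
import unittest


class FakeEnv:
    """Hypothetical stand-in for shared state such as sys.path."""
    path = ["a", "b"]


class MutatingTest(unittest.TestCase):
    def test_appends_to_path(self):
        # Restore an equal copy once the test is over.
        self.addCleanup(setattr, FakeEnv, "path", copy.copy(FakeEnv.path))
        FakeEnv.path.append("c")
        self.assertEqual(FakeEnv.path, ["a", "b", "c"])


def run_and_compare():
    original = FakeEnv.path
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MutatingTest)
    result = unittest.TestResult()
    suite.run(result)
    restored = FakeEnv.path
    # (test passed, value restored, same object as before)
    return result.wasSuccessful(), restored == ["a", "b"], restored is original
```

 After the run the value compares equal to the original, but it is a 
 different list object, so a check based on "is" reports a change 
 where a check based on "==" would not.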

 Second: in packaging, we have two modules that create a logging 
 handler.  I'm not sure whether we should change the code or fix the 
 tests to restore the _handlerList, or how to go about it.

 Thanks for advice.

 Regards

From skip at pobox.com  Fri May  6 19:58:34 2011
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 6 May 2011 12:58:34 -0500
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC42C13.8070806@voidspace.org.uk>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
	<4DC42270.1000301@voidspace.org.uk>
	<8F83194F-5A5C-496E-920A-A2488F9949E4@twistedmatrix.com>
	<4DC42C13.8070806@voidspace.org.uk>
Message-ID: <19908.14026.312182.540486@montanaro.dyndns.org>


    Michael> "Therefore we decided to break such a cycle at an arbitrary
    Michael> place, which doesn't sound too insane."

I trust "arbitrary" != "random"?

Skip

From stefan_ml at behnel.de  Fri May  6 20:06:12 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 06 May 2011 20:06:12 +0200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC42A9B.6020000@voidspace.org.uk>
References: <iq0v4q$ubm$1@dough.gmane.org>	<20110506161233.1ed647ec@pitrou.net>	<19908.8043.8921.50222@montanaro.dyndns.org>	<4DC422BF.4010006@dcs.gla.ac.uk>	<iq18uq$s9p$1@dough.gmane.org>
	<4DC42A9B.6020000@voidspace.org.uk>
Message-ID: <iq1dak$mr2$1@dough.gmane.org>

Michael Foord, 06.05.2011 19:06:
> On 06/05/2011 17:51, Stefan Behnel wrote:
>> Mark Shannon, 06.05.2011 18:33:
>>> skip at pobox.com wrote:
>>>> Antoine> Since we're sharing links, here's Matt Mackall's take:
>>>> Antoine>
>>>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
>>>>
>>>>> From that note:
>>>>
>>>> 1: You can't have meaningful destructors, because when destruction
>>>> happens is undefined. And going-out-of-scope destructors are extremely
>>>> useful. Python is already a rather broken in this regard, so feel free
>>>> to ignore this point.
>>>>
>>>> Given the presence of cyclic data I don't see how reference counting or
>>>> garbage collection win. Ignoring the fact that in a pure reference counted
>>>> system you won't even consider cycles for reclmation, would both RC and GC
>>>> have to punt because they can't tell which object's destructor to call
>>>> first?
>>>
>>> It doesn't matter which is called first.
>>
>> May I quote you on that one the next time my software crashes?
>
> Arbitrarily breaking cycles *could* cause a problem if a destructor
> attempts to access an already collected object.

This is more real than the "could" suggests. Remember that CPython includes 
a lot of C code, and is commonly used to interface with C libraries. While 
you will simply get an exception when cycles are broken in Python code, 
cycles that involve C code can suffer quite badly from this problem.

There was a bug in the lxml.etree XML library a while ago that could make it 
crash hard when its Element objects participated in a reference cycle. It's 
based on libxml2, so there's an underlying C tree that potentially involves 
disconnected subtrees, and a Python space representation using Element 
proxies, with at least one Element for each disconnected subtree.

Basically, Elements reference their Document (not the other way round) even 
if they are disconnected from the main C document tree. The Document needs 
to do some final cleanup in the end, whereas the Elements require the 
Document to be alive to do their own subtree cleanup, if only to know what 
exactly to clean up, as the subtrees share some C state through the 
document. Now, if any of the Elements ends up in a reference cycle for some 
reason, the GC will roll its dice and may decide to call the Document 
destructor first. Then the Element destructors are bound to crash, trying 
to access dead memory of the Document.

This was easy to fix in CPython's refcounting environment. A double INCREF 
on the Document for each Element does the trick, as it effectively removes 
the Document from the collectable cycle and lets the Element destructors 
decide when to let the Document refcount go down to 0. A fix in a pure GC 
system is substantially harder to make efficient.

Stefan


From g.brandl at gmx.net  Fri May  6 20:14:28 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 06 May 2011 20:14:28 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTimh4a9BE+1HmuckMXf8yboxtc9m0w@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<BANLkTikG95Qo+5LodJrjC=y3ANei=KkSXg@mail.gmail.com>
	<BANLkTimKxOWhRKUaHV-B1T=D7n39RQ8Lhg@mail.gmail.com>
	<ipv28o$6e9$1@dough.gmane.org>
	<19907.21576.751581.958722@montanaro.dyndns.org>
	<4DC3A6CB.5020809@dcs.gla.ac.uk>
	<BANLkTimh4a9BE+1HmuckMXf8yboxtc9m0w@mail.gmail.com>
Message-ID: <iq1dq3$p6i$1@dough.gmane.org>

On 06.05.2011 10:18, Amaury Forgeot d'Arc wrote:
> On Friday, 6 May 2011, Mark Shannon <marks at dcs.gla.ac.uk> wrote:
>> What about #defining PY_STOLEN in some header?
>>
>> Then any stolen parameter can be prefixed with PY_STOLEN in signature.
>>
>> For return values, similarly #define PY_BORROWED.
> 
> Header files are harder to parse, and I don't see how it would apply to macros.
> What about additional tags in the .rst files?

Possible, of course, and even easier to implement.

Georg


From g.brandl at gmx.net  Fri May  6 20:16:20 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 06 May 2011 20:16:20 +0200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <20110506122703.17c4d889@pitrou.net>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<4DC34EAB.9050001@canterbury.ac.nz>
	<20110506122703.17c4d889@pitrou.net>
Message-ID: <iq1dtj$p6i$2@dough.gmane.org>

On 06.05.2011 12:27, Antoine Pitrou wrote:
> On Fri, 06 May 2011 13:28:11 +1200
> Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> 
>> Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]:
>> 
>> > This is not always true, for example when the item is already present
>> > in the dict.
>> > It's not important to know what the function does to the object,
>> > Only the action on the reference is relevant.
>> 
>> Yes, that's the whole point. When using a function,
>> what you need to know is whether it borrows or steals
>> a reference.
> 
> Doesn't "borrow" mean the same as "steal" in that context?
> If an API borrows a reference, I expect it to take it from me.

Basically, "borrow" is applied to return values (or, more generally,
"out" parameters), and means that *you* borrowed the reference.
"steal", OTOH, is applied to "in" parameters, where it is the exception
rather than the rule.

Georg



From marks at dcs.gla.ac.uk  Fri May  6 20:45:41 2011
From: marks at dcs.gla.ac.uk (Mark Shannon)
Date: Fri, 06 May 2011 19:45:41 +0100
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <iq1dak$mr2$1@dough.gmane.org>
References: <iq0v4q$ubm$1@dough.gmane.org>	<20110506161233.1ed647ec@pitrou.net>	<19908.8043.8921.50222@montanaro.dyndns.org>	<4DC422BF.4010006@dcs.gla.ac.uk>	<iq18uq$s9p$1@dough.gmane.org>	<4DC42A9B.6020000@voidspace.org.uk>
	<iq1dak$mr2$1@dough.gmane.org>
Message-ID: <4DC441D5.2070102@dcs.gla.ac.uk>

Stefan Behnel wrote:
> Michael Foord, 06.05.2011 19:06:
>> On 06/05/2011 17:51, Stefan Behnel wrote:
>>> Mark Shannon, 06.05.2011 18:33:
>>>> skip at pobox.com wrote:
>>>>> Antoine> Since we're sharing links, here's Matt Mackall's take:
>>>>> Antoine>
>>>>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
>>>>>
>>>>>> From that note:
>>>>> 1: You can't have meaningful destructors, because when destruction
>>>>> happens is undefined. And going-out-of-scope destructors are extremely
>>>>> useful. Python is already a rather broken in this regard, so feel free
>>>>> to ignore this point.
>>>>>
>>>>> Given the presence of cyclic data I don't see how reference counting or
>>>>> garbage collection win. Ignoring the fact that in a pure reference counted
>>>>> system you won't even consider cycles for reclmation, would both RC and GC
>>>>> have to punt because they can't tell which object's destructor to call
>>>>> first?
>>>> It doesn't matter which is called first.
>>> May I quote you on that one the next time my software crashes?
>> Arbitrarily breaking cycles *could* cause a problem if a destructor
>> attempts to access an already collected object.
> 
> This is more real than the "could" suggests. Remember that CPython includes 
> a lot of C code, and is commonly used to interface with C libraries. While 
> you will simply get an exception when cycles are broken in Python code, 
> cycles that involve C code can suffer quite badly from this problem.
> 
> There was a bug in the lxml.etree XML library a while ago that could let it 
> crash hard when its Element objects participated in a reference cycle. It's 
> based on libxml2, so there's an underlying C tree that potentially involves 
> disconnected subtrees, and a Python space representation using Element 
> proxies, with at least one Element for each disconnected subtree.
> 
> Basically, Elements reference their Document (not the other way round) even 
> if they are disconnected from the main C document tree. The Document needs 
> to do some final cleanup in the end, whereas the Elements require the 
> Document to be alive to do their own subtree cleanup, if only to know what 
> exactly to clean up, as the subtrees share some C state through the 
> document. Now, if any of the Elements ends up in a reference cycle for some 
> reason, the GC will throw its dices and may decide to call the Document 
> destructor first. Then the Element destructors are bound to crash, trying 
> to access dead memory of the Document.

With a tracing collector it is *impossible* to access dead memory, ever.
If it can be reached the GC will *not* collect it.
This should be a fundamental invariant of *all* GCs.

If an object is finalizable or reachable from any finalizable objects
then it is reachable and its memory should not be reclaimed until it is 
truly unreachable.

Finalization and reclamation are separate phases.

> 
> This was easy to fix in CPython's refcounting environment. A double INCREF 
> on the Document for each Element does the trick, as it effectively removes 
> the Document from the collectable cycle and lets the Element destructors 
> decide when to let the Document refcount go down to 0. A fix in a pure GC 
> system is substantially harder to make efficient.

With a tracing GC:
While the Elements are finalized, the Document is still alive.
While the Document is finalized, the Elements are still alive.
Then, and only then, is the whole lot reclaimed.
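
A small sketch of the "finalize everything, then reclaim" ordering.
It assumes CPython 3.4 or later, where PEP 442 made the cycle collector
run finalizers on objects in a cycle before clearing them; the
interpreters current at the time of this thread did not do that. Node
and collect_cycle are hypothetical names used only for illustration:

```python
import gc

finalized = []   # (name, partner name) pairs recorded by the finalizers


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        # The partner is still a fully valid object at this point.
        finalized.append((self.name, self.other.name))


def collect_cycle():
    del finalized[:]
    a, b = Node("a"), Node("b")
    a.other, b.other = b, a        # reference cycle
    del a, b
    gc.collect()                   # finalize both, then reclaim both
    return sorted(finalized)       # finalizer order is not specified
```

Each finalizer can read its partner's attributes regardless of which
one runs first, because neither object has been reclaimed yet.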

Mark.

From vinay_sajip at yahoo.co.uk  Fri May  6 20:57:24 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Fri, 6 May 2011 18:57:24 +0000 (UTC)
Subject: [Python-Dev] Problems with regrtest and with logging
References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
Message-ID: <loom.20110506T205048-495@post.gmane.org>

Éric Araujo <merwok <at> netwok.org> writes:


>  Second: in packaging, we have two modules that create a logging 
>  handler.  I'm not sure whether we should change the code or fix the tests 
>  to restore the _handlerList, or how.

If you are saying this happens in your unit tests for packaging, then you can
either restore the _handlerList using the approach in test_logging, or else you
can just close the handlers when you're done with them.
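
A sketch of the second option, using only public logging and unittest
calls. The logger name "example.packaging" and the test class are
hypothetical; this is not the actual packaging test code:

```python
import io
import logging
import unittest


class HandlerCleanupTest(unittest.TestCase):
    def setUp(self):
        # Hypothetical logger name, chosen only for this sketch.
        self.logger = logging.getLogger("example.packaging")
        self.stream = io.StringIO()
        self.handler = logging.StreamHandler(self.stream)
        self.logger.addHandler(self.handler)
        # Cleanups run in reverse order: detach first, then close.
        self.addCleanup(self.handler.close)
        self.addCleanup(self.logger.removeHandler, self.handler)

    def test_message_is_captured(self):
        self.logger.warning("hello")
        self.assertIn("hello", self.stream.getvalue())


def run_handler_test():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HandlerCleanupTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```

Registering the cleanups in setUp means the handler is detached and
closed even when the test body fails.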

If you point me at the relevant code (is it on bitbucket or on hg.python.org?) I
can perhaps take a look and advise.

Regards,

Vinay Sajip


From stefan_ml at behnel.de  Fri May  6 21:10:30 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 06 May 2011 21:10:30 +0200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC441D5.2070102@dcs.gla.ac.uk>
References: <iq0v4q$ubm$1@dough.gmane.org>	<20110506161233.1ed647ec@pitrou.net>	<19908.8043.8921.50222@montanaro.dyndns.org>	<4DC422BF.4010006@dcs.gla.ac.uk>	<iq18uq$s9p$1@dough.gmane.org>	<4DC42A9B.6020000@voidspace.org.uk>	<iq1dak$mr2$1@dough.gmane.org>
	<4DC441D5.2070102@dcs.gla.ac.uk>
Message-ID: <iq1h37$dmi$1@dough.gmane.org>

Mark Shannon, 06.05.2011 20:45:
> Stefan Behnel wrote:
>> Michael Foord, 06.05.2011 19:06:
>>> On 06/05/2011 17:51, Stefan Behnel wrote:
>>>> Mark Shannon, 06.05.2011 18:33:
>>>>> skip at pobox.com wrote:
>>>>>> Antoine> Since we're sharing links, here's Matt Mackall's take:
>>>>>> Antoine>
>>>>>> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
>>>>>>
>>>>>>> From that note:
>>>>>> 1: You can't have meaningful destructors, because when destruction
>>>>>> happens is undefined. And going-out-of-scope destructors are extremely
>>>>>> useful. Python is already a rather broken in this regard, so feel free
>>>>>> to ignore this point.
>>>>>>
>>>>>> Given the presence of cyclic data I don't see how reference counting or
>>>>>> garbage collection win. Ignoring the fact that in a pure reference
>>>>>> counted
>>>>>> system you won't even consider cycles for reclmation, would both RC
>>>>>> and GC
>>>>>> have to punt because they can't tell which object's destructor to call
>>>>>> first?
>>>>> It doesn't matter which is called first.
>>>> May I quote you on that one the next time my software crashes?
>>> Arbitrarily breaking cycles *could* cause a problem if a destructor
>>> attempts to access an already collected object.
>>
>> This is more real than the "could" suggests. Remember that CPython
>> includes a lot of C code, and is commonly used to interface with C
>> libraries. While you will simply get an exception when cycles are broken
>> in Python code, cycles that involve C code can suffer quite badly from
>> this problem.
>>
>> There was a bug in the lxml.etree XML library a while ago that could let
>> it crash hard when its Element objects participated in a reference cycle.
>> It's based on libxml2, so there's an underlying C tree that potentially
>> involves disconnected subtrees, and a Python space representation using
>> Element proxies, with at least one Element for each disconnected subtree.
>>
>> Basically, Elements reference their Document (not the other way round)
>> even if they are disconnected from the main C document tree. The Document
>> needs to do some final cleanup in the end, whereas the Elements require
>> the Document to be alive to do their own subtree cleanup, if only to know
>> what exactly to clean up, as the subtrees share some C state through the
>> document. Now, if any of the Elements ends up in a reference cycle for
>> some reason, the GC will throw its dices and may decide to call the
>> Document destructor first. Then the Element destructors are bound to
>> crash, trying to access dead memory of the Document.
>
> With a tracing collector it is *impossible* to access dead memory, ever.
> If it can be reached the GC will *not* collect it.
> This should be a fundamental invariant of *all* GCs.
>
> If an object is finalizable or reachable from any finalizable objects
> then it is reachable and its memory should not be reclaimed until it is
> truly unreachable.
>
> Finalization and reclamation are separate phases.

Sure. However, I'm talking about Python types and C memory here. Even if 
the Python objects are still alive, they may already have freed the 
underlying C memory during their *finalisation*. When an Element goes out 
of scope, it must free its C subtree if it is disconnected, even if the 
Document stays alive. So that's what Elements do in their destructor, and 
they need the Document's C memory for that, which the Document frees during 
its own finalisation.
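
A pure-Python model of that ordering problem. This is an illustration
only and not lxml's code: the Document and Element classes below are
hypothetical stand-ins, a dict models the shared C state, and an
exception models what would be a read of freed memory in C:

```python
class Document:
    """Hypothetical stand-in for a proxy that owns shared C state."""

    def __init__(self):
        self.c_state = {"names": ["a", "b"]}   # models C-level memory

    def finalize(self):
        self.c_state = None                    # models freeing that memory


class Element:
    """Hypothetical proxy whose cleanup needs the Document's state."""

    def __init__(self, document):
        self.document = document

    def finalize(self):
        if self.document.c_state is None:
            # In C this would be a read of freed memory, not an exception.
            raise RuntimeError("Document state already freed")
        return "subtree cleaned up"


def run(order):
    """Finalize one Document and one Element in the given order."""
    doc = Document()
    elem = Element(doc)
    steps = {"element": elem.finalize, "document": doc.finalize}
    try:
        for name in order:
            steps[name]()
    except RuntimeError as exc:
        return str(exc)
    return "ok"
```

Both Python objects stay alive throughout; the failure comes purely
from the order in which the two finalizers run.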

I do agree that CPython's destructor call algorithms could have been 
smarter in this case. After all, the described crash case indicates that 
the Document destructor was called before all of the Element destructors 
had been called, although all Elements reference their Document, but the 
Document does not refer to any of the Elements, so it's basically a dead 
end. That would have provided a detectable hint to call the Document 
destructor last, after the ones of all objects that reference it. 
Apparently, this hint did not lead to an appropriate action, possibly 
because it's an unimplemented special case and there are enough cases where 
multiple objects with destructors are actually part of the 'real' cycle.

Stefan


From drsalists at gmail.com  Fri May  6 21:59:30 2011
From: drsalists at gmail.com (Dan Stromberg)
Date: Fri, 6 May 2011 12:59:30 -0700
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <iq0v4q$ubm$1@dough.gmane.org>
References: <iq0v4q$ubm$1@dough.gmane.org>
Message-ID: <BANLkTimYpqiBtp_EFtmeQ3X9xCAnww+tcw@mail.gmail.com>

On Fri, May 6, 2011 at 7:04 AM, Neal Becker <ndbecker2 at gmail.com> wrote:

> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html
>

Of course, a generational GC improves locality of reference.

From rdmurray at bitdance.com  Fri May  6 22:07:30 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Fri, 06 May 2011 16:07:30 -0400
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
Message-ID: <20110506200734.049872500DF@webabinitio.net>

On Fri, 06 May 2011 19:51:31 +0200, Éric Araujo <merwok at netwok.org> wrote:
> regrtest helpfully reports when a test leaves the environment unclean
> (sys.path, os.environ, logging._handlerList), but I think the
> implementation is buggy: it compares object identity and then value.
> Why is comparing identity useful?  I'd just use ==.  It makes writing
> cleanup code easier (just use addCleanup(setattr, obj, 'attr',
> copy(obj.attr))).

Well, the implementation is intentional.  Nick (I think) added the
identity check, and he had a reason at the time.  I don't remember what
it was, though.

--
R. David Murray           http://www.bitdance.com

From greg.ewing at canterbury.ac.nz  Sat May  7 01:25:09 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 07 May 2011 11:25:09 +1200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <iq0v4q$ubm$1@dough.gmane.org>
References: <iq0v4q$ubm$1@dough.gmane.org>
Message-ID: <4DC48355.2050509@canterbury.ac.nz>

Neal Becker wrote:
> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html

There, Linus says

> For example, if you have an _explicit_ refcounting system, then it is
> quite natural to have operations like ...
> 
> 		note_t *node = *np;
> 		if (node->count > 1)
> 			newnode = copy_alloc(node);

It's interesting to note that, even though you *can* get reference
count information in CPython, it's not all that useful for doing
things like that, because it's hard to be sure how many incidental
references have been created on the way to the code concerned.
So tricks like this at the Python level aren't really feasible in
any robust way.
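
A small sketch of how the count can be read in CPython with
sys.getrefcount. The absolute numbers depend on the interpreter version
and on how the object reached the call, which is exactly why they are
hard to rely on; only the difference is checked here. The helper name
refcount_delta is hypothetical:

```python
import sys


def refcount_delta():
    """Return how much one extra name binding changes the reported count."""
    node = object()
    before = sys.getrefcount(node)
    alias = node                    # one more reference to the same object
    after = sys.getrefcount(node)
    return after - before
```

A test like "count > 1, so copy before modifying" would need to know
every temporary reference of this kind, which Python code generally
cannot know.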

-- 
Greg

From greg.ewing at canterbury.ac.nz  Sat May  7 01:43:16 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 07 May 2011 11:43:16 +1200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <19908.8043.8921.50222@montanaro.dyndns.org>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
Message-ID: <4DC48794.5070808@canterbury.ac.nz>

Antoine> http://www.selenic.com/pipermail/mercurial-devel/2011-May/031055.html
> 
>>From that note:
> 
>     1: You can't have meaningful destructors, because when destruction
>     happens is undefined. And going-out-of-scope destructors are extremely
>     useful. Python is already a rather broken in this regard, so feel free
>     to ignore this point.

It's only broken if you regard RAII as the One True Way to
implement scoped resource management. Python has other approaches
to that, such as the with-statement.

Also, you *can* have destructors that work for objects in cycles,
as long as you don't insist on the destructor having access to
the object that's being destroyed. Weakref callbacks provide a
way of implementing this in CPython.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Sat May  7 01:53:39 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 07 May 2011 11:53:39 +1200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC433FD.6090803@dcs.gla.ac.uk>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
	<4DC422BF.4010006@dcs.gla.ac.uk>
	<iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk>
	<4DC42F4B.1050509@dcs.gla.ac.uk> <4DC4321F.3070206@voidspace.org.uk>
	<4DC433FD.6090803@dcs.gla.ac.uk>
Message-ID: <4DC48A03.2020800@canterbury.ac.nz>

Mark Shannon wrote:

> For example, a file object will close itself during finalization,
> but it's still a valid object, just a closed file rather than an open one.

It might be valid in the sense that you won't get a segfault.
But the point is that the destructors of some objects may be
relying on other objects still being in a certain state,
e.g. a file still being open.

One would have to adopt a highly defensive coding style in
destructors, verging on paranoia, to be sure that one's destructor
code was completely immune to this kind of problem.

All of this worry goes away if the destructor is not a method
of the object being destroyed, but something external that
runs *after* the object has disappeared.

-- 
Greg

From ncoghlan at gmail.com  Sat May  7 02:12:33 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 7 May 2011 10:12:33 +1000
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <iq0v4q$ubm$1@dough.gmane.org>
References: <iq0v4q$ubm$1@dough.gmane.org>
Message-ID: <6163FB60-2F6C-4143-8D9D-EE241DD09081@gmail.com>

Even if he's right (and he probably is) manual memory management is still a premature optimization for most applications. C and C++ data structures are a PITA because you have to be so careful to avoid leaks and double-frees, so people end up using dumb algorithms. Worrying about losing cycles waiting for main memory is stupid if your high level algorithm is O(N^2).

Cheers,
Nick.

--
Nick Coghlan, Brisbane, Australia

On 07/05/2011, at 12:04 AM, Neal Becker <ndbecker2 at gmail.com> wrote:

> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html

From greg.ewing at canterbury.ac.nz  Sat May  7 02:22:22 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 07 May 2011 12:22:22 +1200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC441D5.2070102@dcs.gla.ac.uk>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
	<4DC422BF.4010006@dcs.gla.ac.uk>
	<iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk>
	<iq1dak$mr2$1@dough.gmane.org> <4DC441D5.2070102@dcs.gla.ac.uk>
Message-ID: <4DC490BE.4080002@canterbury.ac.nz>

Mark Shannon wrote:

> With a tracing GC:
> While the Elements are finalized, the Document is still alive.
> While the Document is finalized, the Elements are still alive.
> Then, and only then, is the whole lot reclaimed.

One problem is that, at the C level in CPython, you can't separate
finalisation and reclamation. When an object's refcount drops to
zero, its tp_dealloc method is called, which both finalises the object
and reclaims its memory.

Another problem is that just because an object's memory hasn't
been reclaimed yet doesn't mean it's safe to do anything with that
object. This is doubly true at the C level, where the consequences
can include segfaults.

Seems to me the basic issue here is that the C code wasn't designed
with tracing GC in mind. There is a reference cycle, but it is
assumed that the user is in manual control of deallocation and will
deallocate the Nodes before the Document.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Sat May  7 02:26:10 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 07 May 2011 12:26:10 +1200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <iq1h37$dmi$1@dough.gmane.org>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
	<4DC422BF.4010006@dcs.gla.ac.uk>
	<iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk>
	<iq1dak$mr2$1@dough.gmane.org> <4DC441D5.2070102@dcs.gla.ac.uk>
	<iq1h37$dmi$1@dough.gmane.org>
Message-ID: <4DC491A2.2060309@canterbury.ac.nz>

Stefan Behnel wrote:
> After all, the described crash case indicates that 
> the Document destructor was called before all of the Element destructors 
> had been called, although all Elements reference their Document, but the 
> Document does not refer to any of the Elements,

In that case, why was the GC system regarding this as a cycle
at all? There must be more going on.

-- 
Greg

From glyph at twistedmatrix.com  Sat May  7 03:39:10 2011
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Fri, 6 May 2011 21:39:10 -0400
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <iq0v4q$ubm$1@dough.gmane.org>
References: <iq0v4q$ubm$1@dough.gmane.org>
Message-ID: <AF61F6E0-4A2A-4036-8305-65CDE66458CD@twistedmatrix.com>

Apologies in advance for contributing to an obviously and increasingly off-topic thread, but this kind of FUD about GC is a pet peeve of mine.

On May 6, 2011, at 10:04 AM, Neal Becker wrote:

> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html

Counterpoint: <http://lwn.net/Articles/268783/>.  Sorry Linus, sometimes correctness matters more than performance.

But, even the performance argument is kind of bogus.  See, for example, this paper on real-time garbage collection: <http://domino.research.ibm.com/comm/research_people.nsf/pages/dgrove.ecoop07.html>.  That's just one example of an easy-to-find solution to a problem that Linus holds up as unsolved or unsolvable.  There are solutions to pretty much all of the problems that Linus brings up.  One of these solutions is even famously implemented by CPython!  The CPython "string +=" idiom optimization fixes at least one case of the "you tend to always copy the node" antipattern Linus describes, and lots of languages (especially Scheme and derivatives, IIRC) have very nice optimizations around this area.  One could argue that any functional language without large pools of mutable state (i.e. Erlang) is a massive optimization for this case.

Another example: the "dirty cache" problem Linus talks about can be addressed by having a GC that cooperates with the VMM: <http://www.cs.umass.edu/~emery/pubs/f034-hertz.pdf>.

And the "re-using stuff as fast as possible" thing is exactly the kind of problem that generational GCs address.  When you run out of space in cache, you reap your first generation before you start copying stuff.  One of the key insights of generational GC is that you'll usually reclaim enough (in this case, cache-local) memory that you can keep going for a little while.  You don't have to read a super fancy modern paper on this, Wikipedia explains nicely: <http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Generational_GC_.28ephemeral_GC.29>.  Of course if you don't tune your GC at all for your machine-specific cache size, you won't see this performance benefit play out.
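As a small illustration of the generational idea, CPython's own cycle collector is also organised into generations, and the standard gc module exposes its counters and thresholds. This is only a sketch, assuming a CPython interpreter; it is not a copying collector, so it says nothing about the cache-locality benefit described above.

```python
import gc

# Collection thresholds, one per generation (youngest first).
thresholds = gc.get_threshold()

# Current per-generation counters that are compared against the thresholds.
counts = gc.get_count()

print("thresholds:", thresholds)
print("counts:    ", counts)

# Collect only the youngest generation; the return value is the number of
# unreachable objects the collector found.
found = gc.collect(0)
print("unreachable objects found in generation 0:", found)
```

The actual numbers depend on the interpreter version and on what the process has allocated so far.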

I don't know if there's a programming language and runtime with a real-time, VM-cooperating garbage collector that actually exists today which has all the bells and whistles required to implement an OS kernel, so I wouldn't give the Linux kernel folks too much of a hard time for still using C; but there's nothing wrong with the idea in the abstract.  The performance differences between automatic and manual GC are dubious at best, and with a really good GC and a language that supports it, GC tends to win big.  When it loses, it loses in ways which can be fixed in one area of the code (the GC) rather than millions of tiny fixes across your whole codebase, as is the case with strategies used by manual collection algorithms.

The assertion that "modern hardware" is not designed for big data-structure pointer-chasing is also a bit silly.  On the contrary, modern hardware has evolved staggeringly massive caches, specifically because large programs (whether they're GC'd or not) tend to do lots of this kind of thing, because there's a certain level of complexity beyond which one can no longer avoid it.  It's old hardware, with tiny caches (that were, by virtue of their tininess, closer to the main instruction-processing silicon), that was optimized for the "carefully stack-allocating everything in the world to conserve cache" approach.

You can see this pretty clearly by running your favorite Python benchmark on machines which are similar except for cache size.  The newer machine, with the bigger cache, will run Python considerably faster, but the bigger cache doesn't help the average trivial C benchmark that much - or, for that matter, Linux benchmarks.

-glyph

From g.brandl at gmx.net  Sat May  7 08:55:57 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 07 May 2011 08:55:57 +0200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC48355.2050509@canterbury.ac.nz>
References: <iq0v4q$ubm$1@dough.gmane.org> <4DC48355.2050509@canterbury.ac.nz>
Message-ID: <iq2qds$1um$1@dough.gmane.org>

On 07.05.2011 01:25, Greg Ewing wrote:
> Neal Becker wrote:
>> http://gcc.gnu.org/ml/gcc/2002-08/msg00552.html
> 
> There, Linus says
> 
>> For example, if you have an _explicit_ refcounting system, then it is
>> quite natural to have operations like ...
>> 
>> 		note_t *node = *np;
>> 		if (node->count > 1)
>> 			newnode = copy_alloc(node);
> 
> It's interesting to note that, even though you *can* get reference
> count information in CPython, it's not all that useful for doing
> things like that, because it's hard to be sure how many incidental
> references have been created on the way to the code concerned.
> So tricks like this at the Python level aren't really feasible in
> any robust way.

But they are at the C level, see for example the optimization for

  string += something

if "string"'s reference count is exactly one.
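To make the point about incidental references concrete, here is a minimal sketch at the Python level, assuming CPython. The function name observe_refcounts is made up for illustration. The absolute count reported by sys.getrefcount() includes references created just by making the call, and it varies between interpreter versions, so only the difference between two readings is meaningful.

```python
import sys

def observe_refcounts():
    obj = object()
    # The reported count includes incidental references (for example the
    # argument passed to getrefcount itself), so it is rarely just "1".
    before = sys.getrefcount(obj)
    holder = [obj]                  # one additional, deliberate reference
    after = sys.getrefcount(obj)
    return before, after

before, after = observe_refcounts()
print("before:", before, "after:", after, "delta:", after - before)
```

This is why a "copy only if the count is greater than one" trick is hard to write robustly in Python code, while C code that owns the only reference can check for it directly.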

Georg


From stefan_ml at behnel.de  Sat May  7 09:20:38 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 07 May 2011 09:20:38 +0200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <4DC491A2.2060309@canterbury.ac.nz>
References: <iq0v4q$ubm$1@dough.gmane.org>
	<20110506161233.1ed647ec@pitrou.net>	<19908.8043.8921.50222@montanaro.dyndns.org>	<4DC422BF.4010006@dcs.gla.ac.uk>	<iq18uq$s9p$1@dough.gmane.org>
	<4DC42A9B.6020000@voidspace.org.uk>	<iq1dak$mr2$1@dough.gmane.org>
	<4DC441D5.2070102@dcs.gla.ac.uk>	<iq1h37$dmi$1@dough.gmane.org>
	<4DC491A2.2060309@canterbury.ac.nz>
Message-ID: <iq2rs6$91r$1@dough.gmane.org>

Greg Ewing, 07.05.2011 02:26:
> Stefan Behnel wrote:
>> After all, the described crash case indicates that the Document
>> destructor was called before all of the Element destructors had been
>> called, although all Elements reference their Document, but the Document
>> does not refer to any of the Elements,
>
> In that case, why was the GC system regarding this as a cycle
> at all? There must be more going on.

It's a dead-end that is referenced by a cycle, that's all.
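A pure-Python sketch of that shape, assuming CPython and using made-up Element and Document classes (the real case involves C-level objects, which this does not model): two elements form a cycle, and both point at a document that points back at nothing.

```python
import gc
import weakref

class Element:
    pass

class Document:
    pass

def build_garbage():
    doc = Document()
    a, b = Element(), Element()
    a.sibling, b.sibling = b, a   # the reference cycle
    a.doc = b.doc = doc           # the dead-end hanging off the cycle
    # Only weak references escape, so everything becomes garbage on return.
    return weakref.ref(doc), weakref.ref(a), weakref.ref(b)

doc_ref, a_ref, b_ref = build_garbage()
gc.collect()
print(doc_ref() is None, a_ref() is None, b_ref() is None)
```

The document is not part of the cycle, but it can only be freed once the cycle that keeps it alive has been broken. The order in which the individual objects are torn down is up to the collector.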

Stefan


From greg.ewing at canterbury.ac.nz  Sat May  7 10:20:23 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 07 May 2011 20:20:23 +1200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <iq2rs6$91r$1@dough.gmane.org>
References: <iq0v4q$ubm$1@dough.gmane.org> <20110506161233.1ed647ec@pitrou.net>
	<19908.8043.8921.50222@montanaro.dyndns.org>
	<4DC422BF.4010006@dcs.gla.ac.uk>
	<iq18uq$s9p$1@dough.gmane.org> <4DC42A9B.6020000@voidspace.org.uk>
	<iq1dak$mr2$1@dough.gmane.org> <4DC441D5.2070102@dcs.gla.ac.uk>
	<iq1h37$dmi$1@dough.gmane.org> <4DC491A2.2060309@canterbury.ac.nz>
	<iq2rs6$91r$1@dough.gmane.org>
Message-ID: <4DC500C7.1070608@canterbury.ac.nz>

Stefan Behnel wrote:

> It's a dead-end that is referenced by a cycle, that's all.

But shouldn't it be breaking the cycle by clearing one
of the objects that's actually part of the cycle, rather
than part of the dead-end?

I can't see how the Document could get picked for clearing
unless it was actually in the cycle. Either that or I'm
imagining the cyclic GC algorithm to be smarter than it
actually is.

-- 
Greg

From solipsis at pitrou.net  Sat May  7 10:34:57 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 7 May 2011 10:34:57 +0200
Subject: [Python-Dev] Linus on garbage collection
References: <iq0v4q$ubm$1@dough.gmane.org>
	<AF61F6E0-4A2A-4036-8305-65CDE66458CD@twistedmatrix.com>
Message-ID: <20110507103457.1e586a76@pitrou.net>

On Fri, 6 May 2011 21:39:10 -0400
Glyph Lefkowitz <glyph at twistedmatrix.com> wrote:
> 
> The assertion that "modern hardware" is not designed for big data-structure pointer-chasing is also a bit silly.  On the contrary, modern hardware has evolved staggeringly massive caches, specifically because large programs (whether they're GC'd or not) tend to do lots of this kind of thing, because there's a certain level of complexity beyond which one can no longer avoid it.

"Staggeringly massive"?
The average 4MB L3 cache is very small compared to the heap of
non-trivial Python (or Java) workloads.

And Linus is right: modern hardware is not optimized for random
pointer-chasing, simply because optimizing for it is very hard.

Regards

Antoine.



From doug.hellmann at gmail.com  Sat May  7 13:54:59 2011
From: doug.hellmann at gmail.com (Doug Hellmann)
Date: Sat, 7 May 2011 07:54:59 -0400
Subject: [Python-Dev] Python Insider translations
Message-ID: <23CF5EAF-FA24-48D3-89B5-E6C07F920FD1@gmail.com>

I wanted to take a few minutes to let you all know that the recent call for help with translating Python Insider was met with a wave of enthusiastic contributors. We now have teams prepared to translate all posts to Simplified and Traditional Chinese, German, Japanese, Portuguese, Romanian, and Spanish. Setting up each blog takes a bit of effort, so we are launching them in batches as they are ready. When all of the existing teams are launched, I will be looking for translators for additional languages.

The next time you have Python related information that you would like to share with the community, I hope you will consider working with us and publishing it through Python Insider, so it can reach the widest possible audience. Either Brian Curtin or I can help you get set up, so contact one of us directly when you are ready.

Thanks,
Doug


From merwok at netwok.org  Sat May  7 18:28:21 2011
From: merwok at netwok.org (=?UTF-8?Q?=C3=89ric_Araujo?=)
Date: Sat, 07 May 2011 18:28:21 +0200
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <loom.20110506T205048-495@post.gmane.org>
References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
	<loom.20110506T205048-495@post.gmane.org>
Message-ID: <2ed4b4e7b4fc17cba2162535d2a220d8@netwok.org>

 On 06/05/2011 20:57, Vinay Sajip wrote:
> Éric Araujo <merwok <at> netwok.org> writes:
>>  Second: in packaging, we have two modules that create a logging
>>  handler.  I'm not sure whether we should change the code or fix the
>>  tests to restore the _handlerList, or how.
>
> If you are saying this happens in your unit tests for packaging, then 
> you can
> either restore the _handlerList using the approach in test_logging, 
> or else you
> can just close the handlers when you've done with them.

 We create one handler in a command-line script, not in the lib, which 
 is the Right Way AFAIU, but there is also one module that creates one 
 handler (in order to set its level depending on a verbose attribute) 
 deep in the library code, not in the command-line script, which I think 
 is bad.  Our tests that instantiate that object (dist.Distribution) end 
 up modifying logging._handlerList, but I feel that the code is wrong, 
 not the tests.

 The code is on https://bitbucket.org/tarek/cpython, in Lib/packaging.

 Thanks!

From merwok at netwok.org  Sat May  7 18:28:37 2011
From: merwok at netwok.org (=?UTF-8?Q?=C3=89ric_Araujo?=)
Date: Sat, 07 May 2011 18:28:37 +0200
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <20110506200734.049872500DF@webabinitio.net>
References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
	<20110506200734.049872500DF@webabinitio.net>
Message-ID: <17c0c1bacc61292edec4600e3feb40f9@netwok.org>

 Hi,

 On 06/05/2011 22:07, R. David Murray wrote:
> On Fri, 06 May 2011 19:51:31 +0200, Éric Araujo
> <merwok at netwok.org> wrote:
>> regrtest helpfully reports when a test leaves the environment
>> unclean (sys.path, os.environ, logging._handlerList), but I think the
>> implementation is buggy: it compares object identity and then value.
>> Why is comparing identity useful?  I'd just use ==.  It makes
>> writing cleanup code easier (just use addCleanup(setattr, obj, 'attr',
>> copy(obj.attr))).
>
> Well, the implementation is intentional.  Nick (I think) added the
> identity check, and he had a reason at the time.  I don't remember 
> what
> it was, though.

 Drat.  Nick, if it was indeed you, can you enlighten me?

 /off to replace all those addCleanup/setattr combos :(
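For readers following along, a minimal sketch of why the addCleanup/setattr pattern satisfies an equality check but not an identity check. The Settings class and its search_path attribute are hypothetical stand-ins for something like sys.path; this does not reproduce regrtest's actual checking code.

```python
import copy

class Settings:
    search_path = ["a", "b"]

original = Settings.search_path
saved = copy.copy(original)

# A test body mutates the shared list in place.
Settings.search_path.append("c")

# Cleanup in the style of addCleanup(setattr, obj, 'attr', copy(obj.attr)):
# the value is restored, but the attribute now refers to a different object.
setattr(Settings, "search_path", saved)
equal_after_setattr = Settings.search_path == ["a", "b"]
same_object_after_setattr = Settings.search_path is original

# An identity-preserving cleanup restores the contents of the original
# object and rebinds the attribute to it.
original[:] = saved
Settings.search_path = original
same_object_after_slice = Settings.search_path is original

print(equal_after_setattr, same_object_after_setattr, same_object_after_slice)
```

Anything that grabbed a reference to the original list before the test ran would keep seeing the mutated list after a setattr-style cleanup, which is one plausible reason for an identity check.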

 Regards

From catch-all at masklinn.net  Sat May  7 20:31:45 2011
From: catch-all at masklinn.net (Xavier Morel)
Date: Sat, 7 May 2011 20:31:45 +0200
Subject: [Python-Dev] Linus on garbage collection
In-Reply-To: <AF61F6E0-4A2A-4036-8305-65CDE66458CD@twistedmatrix.com>
References: <iq0v4q$ubm$1@dough.gmane.org>
	<AF61F6E0-4A2A-4036-8305-65CDE66458CD@twistedmatrix.com>
Message-ID: <FB627310-A744-4C09-8035-421157A76757@masklinn.net>

On 2011-05-07, at 03:39 , Glyph Lefkowitz wrote:
> 
> I don't know if there's a programming language and runtime with a real-time, VM-cooperating garbage collector that actually exists today which has all the bells and whistles required to implement an OS kernel, so I wouldn't give the Linux kernel folks too much of a hard time for still using C; but there's nothing wrong with the idea in the abstract.
Not sure it had all those bells and whistles, and there were other issues, but I believe Lisp Machines implemented garbage collection at the hardware (or at least microcode) level, and the OS itself provided a pretty direct interface to it (it was part of the core services).


From solipsis at pitrou.net  Sat May  7 23:52:05 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 7 May 2011 23:52:05 +0200
Subject: [Python-Dev] cpython (2.7): Some tests were incorrectly marked
 as C specific.
References: <E1QIorr-0000Fm-IG@dinsdale.python.org>
Message-ID: <20110507235205.162d414c@pitrou.net>

On Sat, 07 May 2011 23:16:51 +0200
raymond.hettinger <python-checkins at python.org> wrote:
>  
> +class TestErrorHandling_Python(unittest.TestCase):
> +    module = py_heapq

This class contains no tests.

Regards

Antoine.



From vinay_sajip at yahoo.co.uk  Sun May  8 16:22:18 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Sun, 8 May 2011 14:22:18 +0000 (UTC)
Subject: [Python-Dev] Problems with regrtest and with logging
References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
	<loom.20110506T205048-495@post.gmane.org>
	<2ed4b4e7b4fc17cba2162535d2a220d8@netwok.org>
Message-ID: <loom.20110508T153655-217@post.gmane.org>

Éric Araujo <merwok <at> netwok.org> writes:

>  The code is on https://bitbucket.org/tarek/cpython, in Lib/packaging.

The cases you refer to seem to be _set_logger in packaging/run.py (which appears
not to be used at all - there appear to be no other references to it in the
code), Dispatcher.__init__ in packaging/run.py and
Distribution.parse_command_line in packaging/dist.py.

I can't see why the first case is there.

In the second and third cases, can you be sure that only one of these code paths
will be executed, at most once? If not, multiple StreamHandler instances would
be added to the logger, resulting in duplicated messages. If the code paths will
be executed at most once, then the code seems to be acceptable. You may wish to
add a guard using "if not logger.hasHandlers():" so that even if the code is
executed multiple times, a handler isn't added multiple times.
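A minimal sketch of that guard. The helper name ensure_handler and the logger name are made up for illustration, and propagate is switched off here only so that handlers on ancestor loggers do not influence hasHandlers() in the demonstration.

```python
import logging

def ensure_handler(logger):
    """Attach a StreamHandler only if no handler is reachable yet."""
    if not logger.hasHandlers():
        logger.addHandler(logging.StreamHandler())
    return logger

logger = logging.getLogger("packaging.example")  # hypothetical logger name
logger.propagate = False  # assumption: look only at this logger's handlers

ensure_handler(logger)
ensure_handler(logger)    # second call is a no-op
print(len(logger.handlers))
```

Note that hasHandlers() also looks at ancestor loggers when propagation is enabled, so with the default settings the guard skips adding a handler whenever the root logger already has one.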

In the case of the test support code, I'm not really sure that LoggingCatcher is
needed. There is already a TestHandler class in test.support which captures
records in a buffer, and allows flexible matching for assertions, as described in

http://plumberjack.blogspot.com/2010/09/unit-testing-and-logging.html

The _handlerList in logging contains weak references to handlers, and when the
referent is finalised, it's removed from the list. If you want to control this
more finely, you could do something like (untested):

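import logging
import unittest
import weakref
# TestHandler and Matcher live in test.support as of this writing.
from test.support import TestHandler, Matcher

class MyTestCase(unittest.TestCase):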
    def setUp(self):
        self.handler = TestHandler(Matcher())
        logging.getLogger().addHandler(self.handler)

    def tearDown(self):
        logging.getLogger().removeHandler(self.handler)
        self.handler.close()
        refs = weakref.getweakrefs(self.handler)
        for ref in refs:
            logging._removeHandlerRef(ref)
        
    def test_something(self):
        logging.warning('Test')
        self.assertTrue(self.handler.matches(message='Test'))


Regards,

Vinay Sajip


From victor.stinner at haypocalc.com  Mon May  9 12:32:48 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 09 May 2011 12:32:48 +0200
Subject: [Python-Dev] Commit changelog: issue number and merges
Message-ID: <1304937168.22910.21.camel@marge>

Hi,

Commit changelogs are important to understand why the code was changed.
I regularly use hg blame to find which commit introduced a particular
line of code, and I am always happy if I can find an issue number
because it usually contains the whole story.

And since the migration to Mercurial, we also have a great tool that adds a
comment to an issue if the changelog contains an issue number (e.g. a
changelog starting with "Issue #118888: ..."). So if someone watches an
issue (is in the nosy list), (s)he will be notified that a related commit
was pushed. It is not exactly something new: we already did that with
Subversion, except that today it is more automatic.
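As an illustration of the convention, here is a sketch of how a changelog prefix of this form could be recognised. The pattern and the function name issue_number are assumptions made for this example; the real hook on hg.python.org may use a different pattern.

```python
import re

# Assumed convention: the changelog starts with "Issue #<digits>:".
ISSUE_PREFIX = re.compile(r"^Issue #(\d+):")

def issue_number(changelog):
    """Return the issue number from a changelog prefix, or None."""
    match = ISSUE_PREFIX.match(changelog)
    return int(match.group(1)) if match else None

print(issue_number("Issue #118888: fix the frobnicator"))
print(issue_number("merge 3.1"))
```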

I noticed that some recent commits don't contain the issue number:
please try to always prefix your changelog with the issue number. It is
not "mandatory", but it helps me when I dig through the Python history.

--

For merge commits: many developers just write "merge" or "merge 3.1". I
have to go to the parent commit (and sometimes to the grandparent,
3.1->3.2->3.3) to learn more about the commit.

Would it be possible to repeat the changelog of the original commit in
the merge commits? The svnmerge tool prepared a nice changelog containing
the changelogs of all pending commits, even when a commit was "blocked".

For a merge commit, I copy/paste the changelog of the original commit
and I add a "(Merge 3.1) " prefix. I prefer to add a prefix explicitly
because it is not easy to notice that it is a merge commit in a
python-checkins email or in the history of hg.python.org.

We may need new tools to help the process.

--

Use cases needing better changelogs:

 - "All changes" section of a buildbot build
 - hg blame (or just hg log)

Victor


From rdmurray at bitdance.com  Mon May  9 14:40:03 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 09 May 2011 08:40:03 -0400
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <1304937168.22910.21.camel@marge>
References: <1304937168.22910.21.camel@marge>
Message-ID: <20110509124003.EB9D5250044@webabinitio.net>

On Mon, 09 May 2011 12:32:48 +0200, Victor Stinner <victor.stinner at haypocalc.com> wrote:
> For merge commits: many developers just write "merge" or "merge 3.1". I
> have to go to the parent commit (and sometimes to the grandparent,
> 3.1->3.2->3.3) to learn more about the commit.
> 
> Would it be possible to repeat the changelog of the original commit in
> the merge commits? The svnmerge tool prepared a nice changelog containing
> the changelogs of all pending commits, even when a commit was "blocked".
> 
> For a merge commit, I copy/paste the changelog of the original commit
> and I add a "(Merge 3.1) " prefix. I prefer to add a prefix explicitly
> because it is not easy to notice that it is a merge commit in a
> python-checkins email or in the history of hg.python.org.

+1.  What I do is, in the edit window for the commit message, I pull
in .hg/last-message.txt, and just type 'Merge' in front of my previous
first line.  I don't add the merge-from number, because I figure if you
know which branch you are looking at you know which branch the merge
came from, given that there is a strict progression.
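The same convention expressed as a tiny helper, purely as a sketch. The function name merge_message is made up, and reading .hg/last-message.txt is deliberately left out; the caller is assumed to pass in the text of the previous commit message.

```python
def merge_message(previous_message, prefix="Merge "):
    """Build a merge changelog by prefixing the original commit message."""
    return prefix + previous_message

# The plain 'Merge' prefix described above:
print(merge_message("Issue #11277: Remove useless test from test_zlib."))

# The variant with an explicit source branch:
print(merge_message("Issue #11277: Remove useless test from test_zlib.",
                    prefix="(Merge 3.1) "))
```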

--
R. David Murray           http://www.bitdance.com

From jimjjewett at gmail.com  Mon May  9 14:53:52 2011
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 9 May 2011 08:53:52 -0400
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #11277:
 Remove useless test from test_zlib.
In-Reply-To: <E1QIe26-0006OM-07@dinsdale.python.org>
References: <E1QIe26-0006OM-07@dinsdale.python.org>
Message-ID: <BANLkTi=BGGJRbi-i94tj6HmuOkqGq+RL9Q@mail.gmail.com>

Can you clarify (preferably in the commit message as well) exactly
*why* these largefile tests are useless?  For example, is there
another test that covers this already?

-jJ

On 5/7/11, nadeem.vawda <python-checkins at python.org> wrote:
> http://hg.python.org/cpython/rev/201dcfc56e86
> changeset:   69886:201dcfc56e86
> branch:      2.7
> parent:      69881:a0147a1f1776
> user:        Nadeem Vawda <nadeem.vawda at gmail.com>
> date:        Sat May 07 11:28:03 2011 +0200
> summary:
>   Issue #11277: Remove useless test from test_zlib.
>
> files:
>   Lib/test/test_zlib.py |  42 -------------------------------
>   1 files changed, 0 insertions(+), 42 deletions(-)
>
>
> diff --git a/Lib/test/test_zlib.py b/Lib/test/test_zlib.py
> --- a/Lib/test/test_zlib.py
> +++ b/Lib/test/test_zlib.py
> @@ -72,47 +72,6 @@
>                           zlib.crc32('spam',  (2**31)))
>
>
> -# Issue #11277 - check that inputs of 2 GB (or 1 GB on 32 bits system) are
> -# handled correctly. Be aware of issues #1202. We cannot test a buffer of 4
> GB
> -# or more (#8650, #8651 and #10276), because the zlib stores the buffer
> size
> -# into an int.
> -class ChecksumBigBufferTestCase(unittest.TestCase):
> -    if sys.maxsize > _4G:
> -        # (64 bits system) crc32() and adler32() stores the buffer size
> into an
> -        # int, the maximum filesize is INT_MAX (0x7FFFFFFF)
> -        filesize = 0x7FFFFFFF
> -    else:
> -        # (32 bits system) On a 32 bits OS, a process cannot usually
> address
> -        # more than 2 GB, so test only 1 GB
> -        filesize = _1G
> -
> -    @unittest.skipUnless(mmap, "mmap() is not available.")
> -    def test_big_buffer(self):
> -        if sys.platform[:3] == 'win' or sys.platform == 'darwin':
> -            requires('largefile',
> -                     'test requires %s bytes and a long time to run' %
> -                     str(self.filesize))
> -        try:
> -            with open(TESTFN, "wb+") as f:
> -                f.seek(self.filesize-4)
> -                f.write("asdf")
> -                f.flush()
> -                m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
> -                try:
> -                    if sys.maxsize > _4G:
> -                        self.assertEqual(zlib.crc32(m), 0x709418e7)
> -                        self.assertEqual(zlib.adler32(m), -2072837729)
> -                    else:
> -                        self.assertEqual(zlib.crc32(m), 722071057)
> -                        self.assertEqual(zlib.adler32(m), -1002962529)
> -                finally:
> -                    m.close()
> -        except (IOError, OverflowError):
> -            raise unittest.SkipTest("filesystem doesn't have largefile
> support")
> -        finally:
> -            unlink(TESTFN)
> -
> -
>  class ExceptionTestCase(unittest.TestCase):
>      # make sure we generate some expected errors
>      def test_badlevel(self):
> @@ -595,7 +554,6 @@
>  def test_main():
>      run_unittest(
>          ChecksumTestCase,
> -        ChecksumBigBufferTestCase,
>          ExceptionTestCase,
>          CompressTestCase,
>          CompressObjectTestCase
>
> --
> Repository URL: http://hg.python.org/cpython
>

From eliben at gmail.com  Mon May  9 14:56:57 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Mon, 9 May 2011 15:56:57 +0300
Subject: [Python-Dev] more timely detection of unbound locals
Message-ID: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>

Hi all,

It's a known Python gotcha (*) that the following code:

x = 5
def foo():
    print(x)
    x = 1
    print(x)
foo()

Will throw:

       UnboundLocalError: local variable 'x' referenced before assignment

On the usage of 'x' in the *first* print. Recently, while reading the
zillionth question on StackOverflow on some variation of this case, I
started thinking whether this behavior is desired or just an implementation
artifact.

IIUC, the reason it behaves this way is that the symbol table logic goes
over the code before the code generation runs, sees the assignment 'x = 1'
and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST
for all loads of  'x' in 'foo', even though 'x' is actually bound locally
after the first print. When the bytecode is run, since it's LOAD_FAST and no
store was made into the local 'x', ceval.c then throws the exception.
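A minimal sketch showing both halves of that explanation on a current interpreter, assuming CPython: the compiler has already classified 'x' as a local of foo before anything runs, and the failure only shows up when the first load executes. The exact opcode names and the wording of the error message differ between versions, so neither is checked here.

```python
x = 5

def foo():
    print(x)
    x = 1
    print(x)

# The compile-time decision: 'x' is recorded as a local variable of foo.
print("x" in foo.__code__.co_varnames)

# The run-time consequence: the first load fails because the local slot
# has not been assigned yet.
caught = None
try:
    foo()
except UnboundLocalError as exc:
    caught = exc
    print("raised:", type(exc).__name__)
```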

At first sight, it's possible to signal that 'x' truly becomes local only
after it's bound in the scope (and before that LOAD_NAME can be generated
for it instead of LOAD_FAST). To do this, some modifications to the symbol
table creation and usage are required, because we can no longer say "x is
local in this block", but rather should attach scope information to each
instance of "x". This has some overhead, but it's only at the compilation
stage so it shouldn't have a real effect on the runtime of Python code. This
is also less convenient and "clean" than the current approach - this is why
I'm wondering whether the behavior is an artifact of the implementation.

Would it not be worth making Python's behavior more expected in this case,
at the cost of some implementation complexity? What are the cons of making
such a change? At least judging by the number of people getting confused by
it, maybe it's in line with the zen of Python to behave more explicitly
here.

Thanks in advance,
Eli

(*) Variation of this FAQ:
http://docs.python.org/faq/programming.html#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value

From jimjjewett at gmail.com  Mon May  9 15:00:17 2011
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 9 May 2011 09:00:17 -0400
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <E1QIf1U-0001ch-SK@dinsdale.python.org>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
Message-ID: <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>

Are you asserting that all foreign modules (or at least all handled by
this) are in C, as opposed to C++ or even Java or Fortran?  (And the C
won't change?)

Is this ASCII restriction (as opposed to even UTF8) really needed?

Or are you just saying that we need to create an ASCII name for passing to C?
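For context, a Python-level sketch of what the quoted C change appears to do: encode the module name to ASCII (failing if that is impossible) and derive the "PyInit_NAME" symbol from the part after the last dot. The function name init_function_name is made up for this illustration, and the sketch only mirrors the logic visible in the diff below.

```python
def init_function_name(module_name):
    """Return the init symbol a dynamic module is expected to export."""
    # Raises UnicodeEncodeError if the name is not pure ASCII, mirroring
    # PyUnicode_AsEncodedString(name, "ascii", NULL) in the diff.
    module_name.encode("ascii")
    # Only the last dotted component is used for the symbol name.
    shortname = module_name.rpartition(".")[2]
    return "PyInit_" + shortname

print(init_function_name("pkg.sub._speedups"))
```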

-jJ

On 5/7/11, victor.stinner <python-checkins at python.org> wrote:
> http://hg.python.org/cpython/rev/eb003c3d1770
> changeset:   69889:eb003c3d1770
> user:        Victor Stinner <victor.stinner at haypocalc.com>
> date:        Sat May 07 12:46:05 2011 +0200
> summary:
>   _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
>
> The name must be encodable to ASCII because dynamic module must have a
> function
> called "PyInit_NAME", they are written in C, and the C language doesn't
> accept
> non-ASCII identifiers.
>
> files:
>   Python/importdl.c |  40 +++++++++++++++++++++-------------
>   1 files changed, 25 insertions(+), 15 deletions(-)
>
>
> diff --git a/Python/importdl.c b/Python/importdl.c
> --- a/Python/importdl.c
> +++ b/Python/importdl.c
> @@ -20,31 +20,36 @@
>                                             const char *pathname, FILE *fp);
>  #endif
>
> -/* name should be ASCII only because the C language doesn't accept
> non-ASCII
> -   identifiers, and dynamic modules are written in C. */
> -
>  PyObject *
>  _PyImport_LoadDynamicModule(PyObject *name, PyObject *path, FILE *fp)
>  {
> -    PyObject *m;
> +    PyObject *m = NULL;
>  #ifndef MS_WINDOWS
>      PyObject *pathbytes;
>  #endif
> +    PyObject *nameascii;
>      char *namestr, *lastdot, *shortname, *packagecontext, *oldcontext;
>      dl_funcptr p0;
>      PyObject* (*p)(void);
>      struct PyModuleDef *def;
>
> -    namestr = _PyUnicode_AsString(name);
> -    if (namestr == NULL)
> -        return NULL;
> -
>      m = _PyImport_FindExtensionObject(name, path);
>      if (m != NULL) {
>          Py_INCREF(m);
>          return m;
>      }
>
> +    /* name must be encodable to ASCII because dynamic module must have a
> +       function called "PyInit_NAME", they are written in C, and the C
> language
> +       doesn't accept non-ASCII identifiers. */
> +    nameascii = PyUnicode_AsEncodedString(name, "ascii", NULL);
> +    if (nameascii == NULL)
> +        return NULL;
> +
> +    namestr = PyBytes_AS_STRING(nameascii);
> +    if (namestr == NULL)
> +        goto error;
> +
>      lastdot = strrchr(namestr, '.');
>      if (lastdot == NULL) {
>          packagecontext = NULL;
> @@ -60,34 +65,33 @@
>  #else
>      pathbytes = PyUnicode_EncodeFSDefault(path);
>      if (pathbytes == NULL)
> -        return NULL;
> +        goto error;
>      p0 = _PyImport_GetDynLoadFunc(shortname,
>                                    PyBytes_AS_STRING(pathbytes), fp);
>      Py_DECREF(pathbytes);
>  #endif
>      p = (PyObject*(*)(void))p0;
>      if (PyErr_Occurred())
> -        return NULL;
> +        goto error;
>      if (p == NULL) {
>          PyErr_Format(PyExc_ImportError,
>                       "dynamic module does not define init function"
>                       " (PyInit_%s)",
>                       shortname);
> -        return NULL;
> +        goto error;
>      }
>      oldcontext = _Py_PackageContext;
>      _Py_PackageContext = packagecontext;
>      m = (*p)();
>      _Py_PackageContext = oldcontext;
>      if (m == NULL)
> -        return NULL;
> +        goto error;
>
>      if (PyErr_Occurred()) {
> -        Py_DECREF(m);
>          PyErr_Format(PyExc_SystemError,
>                       "initialization of %s raised unreported exception",
>                       shortname);
> -        return NULL;
> +        goto error;
>      }
>
>      /* Remember pointer to module init function. */
> @@ -101,12 +105,18 @@
>          Py_INCREF(path);
>
>      if (_PyImport_FixupExtensionObject(m, name, path) < 0)
> -        return NULL;
> +        goto error;
>      if (Py_VerboseFlag)
>          PySys_FormatStderr(
>              "import %U # dynamically loaded from %R\n",
>              name, path);
> +    Py_DECREF(nameascii);
>      return m;
> +
> +error:
> +    Py_DECREF(nameascii);
> +    Py_XDECREF(m);
> +    return NULL;
>  }
>
>  #endif /* HAVE_DYNAMIC_LOADING */
>
> --
> Repository URL: http://hg.python.org/cpython
>

From orsenthil at gmail.com  Mon May  9 15:08:48 2011
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Mon, 9 May 2011 21:08:48 +0800
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <20110509124003.EB9D5250044@webabinitio.net>
References: <1304937168.22910.21.camel@marge>
	<20110509124003.EB9D5250044@webabinitio.net>
Message-ID: <20110509130848.GA2402@kevin>

On Mon, May 09, 2011 at 08:40:03AM -0400, R. David Murray wrote:
> +1.  What I do is, in the edit window for the commit message, I pull
> in .hg/last-message.txt, and just type 'Merge' in front of my previous

Thanks for this tip. I shall start following this one too.

-- 
Senthil

From ijmorlan at uwaterloo.ca  Mon May  9 15:26:38 2011
From: ijmorlan at uwaterloo.ca (Isaac Morland)
Date: Mon, 9 May 2011 09:26:38 -0400 (EDT)
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
Message-ID: <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>

On Mon, 9 May 2011, Eli Bendersky wrote:

> It's a known Python gotcha (*) that the following code:
>
> x = 5
> def foo():
>    print(x)
>    x = 1
>    print(x)
> foo()
>
> Will throw:
>
>       UnboundLocalError: local variable 'x' referenced before assignment
>
> On the usage of 'x' in the *first* print. Recently, while reading the
> zillionth question on StackOverflow on some variation of this case, I
> started thinking whether this behavior is desired or just an implementation
> artifact.
>
> IIUC, the reason it behaves this way is that the symbol table logic goes
> over the code before the code generation runs, sees the assignment 'x = 1'
> and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST
> for all loads of  'x' in 'foo', even though 'x' is actually bound locally
> after the first print. When the bytecode is run, since it's LOAD_FAST and no
> store was made into the local 'x', ceval.c then throws the exception.
>
> On first sight, it's possible to signal that 'x' truly becomes local only
> after it's bound in the scope (and before that LOAD_NAME can be generated
> for it instead of LOAD_FAST). To do this, some modifications to the symbol
> table creation and usage are required, because we can no longer say "x is
> local in this block", but rather should attach scope information to each
> instance of "x". This has some overhead, but it's only at the compilation
> stage so it shouldn't have a real effect on the runtime of Python code. This
> is also less convenient and "clean" than the current approach - this is why
> I'm wondering whether the behavior is an artifact of the implementation.

x = 5
def foo ():
 	print (x)
 	if bar ():
 		x = 1
 	print (x)

Isaac Morland			CSCF Web Guru
DC 2554C, x36650		WWW Software Specialist

From stefan_ml at behnel.de  Mon May  9 15:27:09 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 09 May 2011 15:27:09 +0200
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
Message-ID: <iq8q3d$kfk$1@dough.gmane.org>

Eli Bendersky, 09.05.2011 14:56:
> It's a known Python gotcha (*) that the following code:
>
> x = 5
> def foo():
>      print(x)
>      x = 1
>      print(x)
> foo()
>
> Will throw:
>
>         UnboundLocalError: local variable 'x' referenced before assignment
>
> On the usage of 'x' in the *first* print. Recently, while reading the
> zillionth question on StackOverflow on some variation of this case, I
> started thinking whether this behavior is desired or just an implementation
> artifact.

Well, basically any compiler these days can detect that a variable is being 
used before assignment, or at least that this is possibly the case, 
depending on prior branching.

ISTM that your suggestion is to let x refer to the outer x up to the 
assignment and to the inner x from that point on. IMHO, that's much worse 
than the current behaviour and potentially impractical due to conditional 
assignments.

However, it's also a semantic change to reject code with unbound locals at 
compile time, as the specific code in question may actually be unreachable 
at runtime. This makes me think that it would be best to discuss this on 
the python-ideas list first.

If nothing else, I'd like to see a discussion on this behaviour being an 
implementation detail of CPython or a feature of the Python language.

Stefan


From ericsnowcurrently at gmail.com  Mon May  9 15:41:43 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 9 May 2011 07:41:43 -0600
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
Message-ID: <BANLkTi=d3JpxxC59ckuxSmE4K-v2woJ_eg@mail.gmail.com>

On May 9, 2011 6:59 AM, "Eli Bendersky" <eliben at gmail.com> wrote:
>
> Hi all,
>
> It's a known Python gotcha (*) that the following code:
>
> x = 5
> def foo():
>     print(x)
>     x = 1
>     print(x)
> foo()
>
> Will throw:
>
>        UnboundLocalError: local variable 'x' referenced before assignment
>
> On the usage of 'x' in the *first* print. Recently, while reading the
> zillionth question on StackOverflow on some variation of this case, I
> started thinking whether this behavior is desired or just an implementation
> artifact.
>
> IIUC, the reason it behaves this way is that the symbol table logic goes
> over the code before the code generation runs, sees the assignment 'x = 1`
> and marks 'x' as local in foo. Then, the code generator generates LOAD_FAST
> for all loads of  'x' in 'foo', even though 'x' is actually bound locally
> after the first print. When the bytecode is run, since it's LOAD_FAST and no
> store was made into the local 'x', ceval.c then throws the exception.
>
> On first sight, it's possible to signal that 'x' truly becomes local only
> after it's bound in the scope (and before that LOAD_NAME can be generated
> for it instead of LOAD_FAST). To do this, some modifications to the symbol
> table creation and usage are required, because we can no longer say "x is
> local in this block", but rather should attach scope information to each
> instance of "x". This has some overhead, but it's only at the compilation
> stage so it shouldn't have a real effect on the runtime of Python code. This
> is also less convenient and "clean" than the current approach - this is why
> I'm wondering whether the behavior is an artifact of the implementation.
>
> Would it not be worth to make Python's behavior more expected in this
> case, at the cost of some implementation complexity? What are the cons to
> making such a change? At least judging by the amount of people getting
> confused by it, maybe it's in line with the zen of Python to behave more
> explicitly here.

This is about mixing scopes for the same name in the same block, right?
Perhaps a more specific error would be enough, unless there is a good use
case for having that mixed scope for the name.

-eric

> Thanks in advance,
> Eli
>
> (*) Variation of this FAQ:
> http://docs.python.org/faq/programming.html#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value

From nadeem.vawda at gmail.com  Mon May  9 16:08:55 2011
From: nadeem.vawda at gmail.com (Nadeem Vawda)
Date: Mon, 9 May 2011 16:08:55 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #11277:
 Remove useless test from test_zlib.
In-Reply-To: <BANLkTi=BGGJRbi-i94tj6HmuOkqGq+RL9Q@mail.gmail.com>
References: <E1QIe26-0006OM-07@dinsdale.python.org>
	<BANLkTi=BGGJRbi-i94tj6HmuOkqGq+RL9Q@mail.gmail.com>
Message-ID: <BANLkTimiA9OzfNob0-b0GfPnymeb0h7zrg@mail.gmail.com>

On Mon, May 9, 2011 at 2:53 PM, Jim Jewett <jimjjewett at gmail.com> wrote:
> Can you clarify (preferably in the commit message as well) exactly
> *why* these largefile tests are useless?  For example, is there
> another test that covers this already?

Ah, sorry about that. It was discussed on the tracker issue, but I guess I
can't expect people to read through 90+ messages to figure it out :P

The short version is that it was supposed to test 4GB+ inputs, but in 2.7,
the functions being tested don't accept inputs that large.

The details:

The test was originally intended to catch the case where crc32() or adler32()
would get a buffer of >=4GB, and then silently truncate the buffer size and
produce an incorrect result (issue10276). It had been written for 3.x, and then
backported to 2.7. However, in 2.7, zlibmodule.c doesn't define
PY_SSIZE_T_CLEAN, so passing in a buffer of >=2GB raises an OverflowError
(see issue8651). This means that it is impossible to trigger the bug in question
on 2.7, making the test pointless.

Of course, the code that was deleted tests with an input sized 2GB-1 or 1GB,
rather than 4GB (the size used in 3.x). When the test was backported, the size
of the input was reduced, to avoid triggering an OverflowError. At the time,
no-one realized that this also would not trigger the bug being tested for; it
only came to light when the test started crashing for unrelated reasons
(issue11277).
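To make the size argument concrete, here is a rough sketch in pure Python of
what "silently truncate the buffer size" means. It only models the arithmetic
of storing a length in a 32-bit unsigned field; the helper name is made up for
illustration and is not taken from zlibmodule.c.

```python
UINT32_MASK = 0xFFFFFFFF  # assumption: the length ends up in a 32-bit unsigned int

def truncated_length(nbytes):
    """Simulate storing a buffer length in a 32-bit unsigned field."""
    return nbytes & UINT32_MASK

GB = 1024 ** 3

# A buffer just over 4GB wraps around: only the excess bytes would be seen.
print(truncated_length(4 * GB + 5))

# The sizes used by the 2.7 backport fit in 32 bits, so nothing is truncated
# and the bug cannot show up.
print(truncated_length(2 * GB - 1) == 2 * GB - 1)
print(truncated_length(1 * GB) == 1 * GB)
```

In other words, an input has to reach 4GB before the truncation changes
anything, which is why the reduced sizes never exercised the original bug.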

Cheers,
Nadeem

From benjamin at python.org  Mon May  9 16:08:53 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 9 May 2011 09:08:53 -0500
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <1304937168.22910.21.camel@marge>
References: <1304937168.22910.21.camel@marge>
Message-ID: <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>

2011/5/9 Victor Stinner <victor.stinner at haypocalc.com>:
> Hi,
>
> Commit changelogs are important to understand why the code was changed.
> I regularly use hg blame to search which commit introduced a particular
> line of code, and I am always happy if I can find an issue number
> because it usually contains the whole story.
>
> And since the migration to Mercurial, we have also a great tool adding a
> comment to an issue if the changelog contains an issue number (e.g.
> changelog starting with "Issue #118888: ..."). So if someone watches an
> issue (is in the nosy list), (s)he will be notified that a related commit
> was pushed. It is not exactly something new: we already do that with
> Subversion except that today it is more automatic.
>
> I noticed that some recent commits don't contain the issue number:
> please try to always prefix your changelog with the issue number. It is
> not "mandatory", but it helps me when I dig the Python history.
>
> --
>
> For merge commits: many developers just write "merge" or "merge 3.1". I
> have to go to the parent commit (and sometimes to the grandparent,
> 3.1->3.2->3.3) to learn more about the commit.

I thought the whole point of merging was that you brought a changeset
from one branch to another. This is why I just write "merge", because
otherwise you're technically duplicating information that is pulled
onto the branch by merging.

It seems like something that should be solved by tools, such as a
visual graph display indicating what is merged (like Bazaar).



-- 
Regards,
Benjamin

From victor.stinner at haypocalc.com  Mon May  9 16:11:15 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 09 May 2011 16:11:15 +0200
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
Message-ID: <1304950275.22910.32.camel@marge>

On Monday 09 May 2011 at 09:00 -0400, Jim Jewett wrote:
> Are you asserting that all foreign modules (or at least all handled by
> this) are in C, as opposed to C++ or even Java or Fortran?  (And the C
> won't change?)

C and C++ identifiers are restricted to ASCII. I don't know about Fortran
or Java.

Is it possible to write a CPython extension module in Java or Fortran?

(My change doesn't concern Jython: it's an implementation detail of
dynamic modules in CPython.)

> Is this ASCII restriction (as opposed to even UTF8) really needed?

I prefer to explicitly limit module names of dynamic modules to ASCII.

If we decide to extend the support to something other than ASCII, we will
need a working module to test it, and maybe also a test.

> Or are you just saying that we need to create an ASCII name for passing to C?

You pass a Unicode module name to import (import h? or
__import__('h?')), and Python encodes the name to ASCII if it is a
dynamic module. It is still possible to use non-ASCII module names, but
only for modules written in Python.
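
As a rough illustration of the rule described above (not the actual C code in
the import machinery), the effect of an explicit ASCII encoding can be
sketched in Python. Both helper names are hypothetical, and so is the choice
of ImportError as the error reported to the caller.

```python
def encode_dynamic_module_name(name):
    """Hypothetical helper: encode a module name explicitly to ASCII."""
    try:
        return name.encode("ascii")
    except UnicodeEncodeError as exc:
        # Assumption: the failure is surfaced as an import error.
        raise ImportError("dynamic module name must be ASCII: %r" % name) from exc

def can_be_dynamic_module(name):
    """Return True if the name would be accepted for a dynamic module."""
    try:
        encode_dynamic_module_name(name)
    except ImportError:
        return False
    return True

print(encode_dynamic_module_name("spam"))
print(can_be_dynamic_module("spam"))
print(can_be_dynamic_module("h\u00e9"))  # non-ASCII name, made up for the example
```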

Victor


From victor.stinner at haypocalc.com  Mon May  9 16:14:03 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 09 May 2011 16:14:03 +0200
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
References: <1304937168.22910.21.camel@marge>
	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
Message-ID: <1304950443.22910.34.camel@marge>

On Monday 09 May 2011 at 09:08 -0500, Benjamin Peterson wrote:
> It seems like something that should be solved by tools like a display
> visual graph indicating what is merged. (like Bazaar)

Yeah, we could fix buildbot, hg.python.org website, improve hg log, and
all other tools using Mercurial. But until that, I would prefer to
duplicate the information.

Victor


From ncoghlan at gmail.com  Mon May  9 16:36:04 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 00:36:04 +1000
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <20110506122703.17c4d889@pitrou.net>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<4DC34EAB.9050001@canterbury.ac.nz>
	<20110506122703.17c4d889@pitrou.net>
Message-ID: <BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com>

On Fri, May 6, 2011 at 8:27 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Fri, 06 May 2011 13:28:11 +1200
> Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>
>> Amaury Forgeot d'Arc wrote [concerning the Doc/data/refcounts.dat file]:
>>
>> > This is not always true, for example when the item is already present
>> > in the dict.
>> > It's not important to know what the function does to the object,
>> > Only the action on the reference is relevant.
>>
>> Yes, that's the whole point. When using a function,
>> what you need to know is whether it borrows or steals
>> a reference.
>
> Doesn't "borrow" mean the same as "steal" in that context?
> If an API borrows a reference, I expect it to take it from me.

Input parameter, borrowed or new reference: caller retains ownership
and must still decref item
Input parameter, stolen reference: caller transfers ownership and must
NOT decref item (or must incref before call to guarantee lifecycle if
planning to continue using the object after the call)

Output parameter or return value, borrowed reference: caller does NOT
receive ownership and does not need to decref item, but needs to be
careful of lifecycle (and promote to a full reference with incref if
the borrowed reference may outlive the original)
Output parameter or return value, stolen or new reference: caller
receives ownership and must decref item

One interesting aspect is that from the caller's point of view, a
*new* reference to the relevant object behaves like a borrowed
reference for input parameters, but like a stolen reference for output
parameters and return values. It is typically the converse cases
(stolen reference to an input parameter, borrowed reference to an
output parameter or return value) that require special attention on
the caller's part.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Mon May  9 16:45:09 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 10 May 2011 00:45:09 +1000
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
Message-ID: <4DC7FDF5.9060003@pearwood.info>

Eli Bendersky wrote:
> Hi all,
> 
> It's a known Python gotcha (*) that the following code:
> 
> x = 5
> def foo():
>     print(x)
>     x = 1
>     print(x)
> foo()
> 
> Will throw:
> 
>        UnboundLocalError: local variable 'x' referenced before assignment

I think part of the problem is that UnboundLocalError is a jargon name, 
while its predecessor NameError (used up to Python 1.5) is far more 
intuitively obvious.
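
For what it's worth, the two names are still related: UnboundLocalError is a
subclass of NameError, so code that catches NameError sees both. A small
sketch reusing the foo() from the quoted example (the variable "caught" is
only there for the demonstration):

```python
x = 5

def foo():
    print(x)   # x is local to foo because of the assignment below
    x = 1
    print(x)

caught = None
try:
    foo()
except NameError as exc:   # also catches UnboundLocalError
    caught = exc

print(type(caught).__name__)
print(issubclass(UnboundLocalError, NameError))
```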


> On the usage of 'x' in the *first* print. Recently, while reading the
> zillionth question on StackOverflow on some variation of this case, I
> started thinking whether this behavior is desired or just an implementation
> artifact.
[...]
> Would it not be worth to make Python's behavior more expected in this case,
> at the cost of some implementation complexity? What are the cons to making
> such a change? At least judging by the amount of people getting confused by
> it, maybe it's in line with the zen of Python to behave more explicitly
> here.

I think you are making an unwarranted assumption about what is "more 
expected". I presume you are thinking that the expected behaviour is 
that foo() should:

print global x (5)
assign 1 to local x
print local x (1)

If we implemented this change, there would be no more questions about 
UnboundLocalError, but instead there would be lots of questions like 
"why is it that globals revert to their old value after I change them in 
a function?".




-- 
Steven


From ncoghlan at gmail.com  Mon May  9 17:00:38 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 01:00:38 +1000
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
Message-ID: <BANLkTikx0KXuo9vyDBi=Ky2mWqY61Vp=6g@mail.gmail.com>

On Sat, May 7, 2011 at 3:51 AM, Éric Araujo <merwok at netwok.org> wrote:
> regrtest helpfully reports when a test leaves the environment unclean
> (sys.path, os.environ, logging._handlerList), but I think the implementation
> is buggy: it compares object identity and then value.  Why is comparing
> identity useful?  I'd just use ==.  It makes writing cleanup code easier
> (just use addCleanup(setattr, obj, 'attr', copy(obj.attr))).

Because changing the identity of any of those global state attributes
that regrtest monitors is itself suggestive of a bug. When it comes to
containers, identity matters at least as much as value does (and
sometimes more so - e.g. sys.modules). Replacing those global
containers with new ones isn't guaranteed to work, as they may be
cached in various places rather than always retrieved fresh from the
relevant module namespace. Modifying them in place, on the other hand,
does the right thing even in the presence of cached references.
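
A minimal sketch of the difference, using a plain list as a stand-in for
something like sys.path (the names global_state and cached are made up for
the example):

```python
import copy

# Stand-in for a module-level container such as sys.path.
global_state = ["a", "b"]

# Some other component caches a reference to the container.
cached = global_state

saved = copy.copy(global_state)

# A test modifies the container, then restores it in place:
global_state.append("temp")
global_state[:] = saved
print(cached is global_state, cached == saved)

# The same modification, "restored" by rebinding the name to a fresh copy:
global_state.append("temp")
global_state = copy.copy(saved)
print(cached is global_state, cached == global_state)
```

After the in-place restore the cached reference and the global still agree.
After the rebinding, the global compares equal to the saved value, but the
cached reference still points at the old, modified list, which is the kind of
breakage an identity check is able to flag.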

A comment to that effect may be a useful addition to regrtest, as I
expect others may have similar questions about those identity checks
in the future. (It may even be a useful addition to the documentation,
but I have no idea where it could be sensibly included).

Also, don't be surprised if wholesale cleanup like that isn't
completely reliable - it's far, far better if the test case
understands the changes it is making (even indirectly) and explicitly
reverses them. Save-and-restore should be a last resort technique
(although context managers that are designed for more general use,
such as warnings.catch_warnings(), use save-and-restore by necessity,
since they have no control over the body of the relevant with
statements).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From eliben at gmail.com  Mon May  9 17:01:06 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Mon, 9 May 2011 18:01:06 +0300
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <4DC7FDF5.9060003@pearwood.info>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<4DC7FDF5.9060003@pearwood.info>
Message-ID: <BANLkTimUOUMfSkTFR+99869RK5Y3nEoZyw@mail.gmail.com>

> I think you are making an unwarranted assumption about what is "more
> expected". I presume you are thinking that the expected behaviour is that
> foo() should:
>
> print global x (5)
> assign 1 to local x
> print local x (1)
>
> If we implemented this change, there would be no more questions about
> UnboundLocalError, but instead there would be lots of questions like "why is
> it that globals revert to their old value after I change them in a
> function?".
>

True, but this is less confusing and follows the rules in a more
straightforward way. x = 1 without a 'global x' assigns a local x; this makes
sense and is similar to what happens in C where an inner declaration
temporarily shadows a global one.

Eli

From ncoghlan at gmail.com  Mon May  9 17:04:21 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 01:04:21 +1000
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
Message-ID: <BANLkTiko98eop8xF18q3KZFSBQFXJ3mdtQ@mail.gmail.com>

On Mon, May 9, 2011 at 11:00 PM, Jim Jewett <jimjjewett at gmail.com> wrote:
> Are you asserting that all foreign modules (or at least all handled by
> this) are in C, as opposed to C++ or even Java or Fortran?  (And the C
> won't change?)

The extension module that interfaces them to CPython will be written
in C, or something that can export a C-compatible library interface
(after reading in the Python C API headers).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From eliben at gmail.com  Mon May  9 17:06:18 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Mon, 9 May 2011 18:06:18 +0300
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
Message-ID: <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>

> x = 5
> def foo ():
>        print (x)
>        if bar ():
>                x = 1
>        print (x)
>

I wish you'd annotate this code sample. What do you intend it to
demonstrate?

It probably shows the original complaint even more strongly. As for being a
problem with the suggested solution, I suppose you're right, although it
doesn't make it much different. Still, before a *possible* assignment to
'x', it should be loaded as LOAD_NAME since it was surely not bound as
local, yet.

Eli

From ncoghlan at gmail.com  Mon May  9 17:17:35 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 01:17:35 +1000
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTimUOUMfSkTFR+99869RK5Y3nEoZyw@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<4DC7FDF5.9060003@pearwood.info>
	<BANLkTimUOUMfSkTFR+99869RK5Y3nEoZyw@mail.gmail.com>
Message-ID: <BANLkTikW9UgdaPvFLjF3_OrzPJ-aUMHAUQ@mail.gmail.com>

On Tue, May 10, 2011 at 1:01 AM, Eli Bendersky <eliben at gmail.com> wrote:
>
>> I think you are making an unwarranted assumption about what is "more
>> expected". I presume you are thinking that the expected behaviour is that
>> foo() should:
>>
>> print global x (5)
>> assign 1 to local x
>> print local x (1)
>>
>> If we implemented this change, there would be no more questions about
>> UnboundLocalError, but instead there would be lots of questions like "why is
>> it that globals revert to their old value after I change them in a
>> function?".
>
> True, but this is less confusing and follows the rules in a more
> straightforward way. x = 1 without a 'global x' assigns a local x, this makes
> sense and is similar to what happens in C where an inner declaration
> temporarily shadows a global one.

However, since flow control constructs in Python don't create new
scopes (unlike C/C++), you run into a fundamental problem with cases
like the one Isaac posted, or even nastier ones like the following:

def f():
  if bar():
    fill = 1
  else:
    fiil = 2
  print(fill)  # Q: What does this do when bool(bar()) is False?

Since we want to make the decision categorically at compile-time, the
simplest, least-confusing option is to say "assignment makes a
variable name local, referencing it before the first assignment is now
an error". I don't know of anyone that particularly *likes*
UnboundLocalError, but it's better than letting errors like the one
above pass silently. (It obviously doesn't trap *all* typo-related
errors, but it at least lets you reason sanely about name bindings)
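
To spell out what the current rules do with that example, here is a runnable
variant. The make_f() and outcome() wrappers are only scaffolding added for
the demonstration, so that bar() can be supplied from outside; the "fiil"
typo is deliberate.

```python
def make_f(bar):
    def f():
        if bar():
            fill = 1
        else:
            fiil = 2   # deliberate typo, as in the example above
        return fill
    return f

def outcome(func):
    """Return the function's result, or the name of the exception raised."""
    try:
        return func()
    except UnboundLocalError as exc:
        return type(exc).__name__

print(outcome(make_f(lambda: True)))    # the 'if' branch binds fill
print(outcome(make_f(lambda: False)))   # fill was never bound on this path
```

With today's semantics the typo surfaces as an error on the path where it
matters, instead of quietly picking up some outer "fill".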

On the reasoning-sanely front, closures likely present a more
compelling argument:

def f():
  def g():
    print(x) # We want this to refer to the closure in f(), thanks
  x = 1
  return g

UnboundLocalError is really about aligning the rules for the current
scope with those for references from nested scopes (i.e. x is a local
variable of f, whether it is referenced from f's local scope, or any
nested scope within f)
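
A quick sketch of the closure case as it behaves today: x is assigned in f()
only after g() has been defined, yet g() still sees f's x when it is finally
called.

```python
def f():
    def g():
        return x   # resolved through f's scope when g is called
    x = 1
    return g

g = f()
print(g())
print(g.__code__.co_freevars)   # names g takes from an enclosing scope
```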

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon May  9 17:22:36 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 01:22:36 +1000
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
Message-ID: <BANLkTin-V6HZ0Z_kxSj9mgjnMgkPE0sTRw@mail.gmail.com>

On Tue, May 10, 2011 at 1:06 AM, Eli Bendersky <eliben at gmail.com> wrote:
> It probably shows the original complaint even more strongly. As for being a
> problem with the suggested solution, I suppose you're right, although it
> doesn't make it much different. Still, before a *possible* assignment to
> 'x', it should be loaded as LOAD_NAME since it was surely not bound as
> local, yet.

Yeah, I've decided I'm happier with the closure based arguments than
the conditional statement related ones. "Assignments create local
variables" is a relatively simple rule to reason about, and is equally
valid for the current scope and for any nested scopes. The symtable
analysis for nested scopes is ordering independent (and can't be
changed for backwards compatibility reasons if nothing else), and
UnboundLocalError is a natural outgrowth of applying those semantics
to the current scope as well.
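
The ordering independence can be checked with the standard symtable module.
This is only a sketch over a small made-up source string: x is reported as
local to f and free in g, even though the assignment comes after the nested
function that reads it.

```python
import symtable

SOURCE = """
def f():
    def g():
        return x
    x = 1
    return g
"""

module_table = symtable.symtable(SOURCE, "<example>", "exec")
f_table = module_table.get_children()[0]   # the block for f
g_table = f_table.get_children()[0]        # the block for g, nested in f

print(f_table.get_name(), f_table.lookup("x").is_local())
print(g_table.get_name(), g_table.lookup("x").is_free())
```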

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rdmurray at bitdance.com  Mon May  9 17:44:15 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 09 May 2011 11:44:15 -0400
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
References: <1304937168.22910.21.camel@marge>
	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
Message-ID: <20110509154416.35BBF250037@webabinitio.net>

On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson <benjamin at python.org> wrote:
> I thought the whole point of merging was that you brought a changeset
> from one branch to another. This is why I just write "merge" because
> otherwise you're technically duplicating information that is pulled
> onto the branch by merging.

No it isn't.  The commit message isn't pulled into the new branch.

> It seems like something that should be solved by tools like a display
> visual graph indicating what is merged. (like Bazaar)

You'd need some extension to hg log that would show the original commit
message for the first changeset in the merge line in order to "fix"
this.  I doubt that is going to happen.

Note that saying just 'merge' makes perfect sense when you are pulling
in a whole group of changesets in order to synchronize two branches.
But if you are applying a single changeset to multiple branches,
as we often do in our workflow, then I think duplicating the commit
message is (1) easy to do and (2) very helpful when looking at
hg log output.

--
R. David Murray           http://www.bitdance.com

From ijmorlan at uwaterloo.ca  Mon May  9 17:44:21 2011
From: ijmorlan at uwaterloo.ca (Isaac Morland)
Date: Mon, 9 May 2011 11:44:21 -0400 (EDT)
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
Message-ID: <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>

On Mon, 9 May 2011, Eli Bendersky wrote:

>> x = 5
>> def foo ():
>>        print (x)
>>        if bar ():
>>                x = 1
>>        print (x)
>>
>
> I wish you'd annotate this code sample, what do you intend it to
> demonstrate?
>
> It probably shows the original complaint even more strongly. As for being a
> problem with the suggested solution, I suppose you're right, although it
> doesn't make it much different. Still, before a *possible* assignment to
> 'x', it should be loaded as LOAD_NAME since it was surely not bound as
> local, yet.

Extrapolating from your suggestion, you're saying before a *possible* 
assignment it will be treated as global, and after a *possible* assignment 
it will be treated as local?

But surely:

print (x)
if False:
 	x = 1
print (x)

should always print the same thing twice (in the absence of actions taken 
by other threads)!
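
For comparison, here is what the existing rules do with that snippet when it
is wrapped in a function. This is a sketch; the names "first" and "raised"
exist only for the demonstration. The assignment under "if False:" can never
run, but it still makes x a local variable of foo, so both uses of x refer to
the same (unbound) local.

```python
x = 5

def foo():
    first = x      # x is local to foo, so this cannot see the global x
    if False:
        x = 1
    return first, x

# The compiler lists x among foo's local variables.
print("x" in foo.__code__.co_varnames)

try:
    foo()
    raised = False
except UnboundLocalError:
    raised = True
print(raised)
```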

Replace "False" by something that is usually (but not always) True, and 
"print (x)" by something that actually does something, and you had best 
put on your helmet because it's going to be a fun ride.

But I won't be on it.

The idea that the same name within the same scope always refers to the 
same value is an idea from functional programming and not part of Python; 
but surely the same name within the same scope should at least always 
refer to the same variable!

If something is to be done here, it occurs to me that the same parser that 
decides that the initial reference to x should use the local x could 
conceivably issue an error right away - "local variable can never be 
assigned before use" rather than waiting until runtime.  But even if I 
haven't confused myself about the possibility of this raising a false 
positive (and it certainly could in the presence of dead code), it 
wouldn't catch cases of conditional premature use of a local variable. I 
think in those cases people would still ask the same questions they do 
with the existing implementation.

Isaac Morland			CSCF Web Guru
DC 2554C, x36650		WWW Software Specialist

From merwok at netwok.org  Mon May  9 17:55:42 2011
From: merwok at netwok.org (Éric Araujo)
Date: Mon, 09 May 2011 17:55:42 +0200
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
References: <1304937168.22910.21.camel@marge>
	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
Message-ID: <7fa082450fb750d082e71d5070a62171@netwok.org>

 Hi,

 On 09/05/2011 16:08, Benjamin Peterson wrote:
> 2011/5/9 Victor Stinner <victor.stinner at haypocalc.com>:
>> For merge commits: many developers just write "merge" or "merge 3.1". I
>> have to go to the parent commit (and sometimes to the grandparent,
>> 3.1->3.2->3.3) to learn more about the commit.

 I follow conventions I've seen elsewhere (maybe Mercurial itself): I 
 use "Branch merge" when I merge anonymous branches on the same named 
 branch, and "Merge x.y" for forward-porting across named branches.

 I also tend to do more than one commit before merging.  It would not be 
 very easy with my current toolchain to get the commit message(s) to 
 insert into the new message, and I think it's not necessary.

> I thought the whole point of merging was that you brought a changeset
> from one branch to another. This is why I just write "merge" because
> otherwise you're technically duplicating information that is pulled
> onto the branch by merging.

 +1.  No interest in manually duplicating available information.

 On 09/05/2011 17:44, R. David Murray wrote:
> No it isn't.  The commit message isn't pulled into the new branch.

 Sorry, your terminology does not make sense.  If you mean that the 
 commit message is not reused in the new commit after the merge, it's 
 true.  However, the commit message with the relevant information is 
 available as part of the changesets that have been pulled and merged.

 Regards

From merwok at netwok.org  Mon May  9 18:36:43 2011
From: merwok at netwok.org (Éric Araujo)
Date: Mon, 09 May 2011 18:36:43 +0200
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <loom.20110508T153655-217@post.gmane.org>
References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
	<loom.20110506T205048-495@post.gmane.org>
	<2ed4b4e7b4fc17cba2162535d2a220d8@netwok.org>
	<loom.20110508T153655-217@post.gmane.org>
Message-ID: <d2381201d78ab4c329487cc9f0c20236@netwok.org>

 Hi,

 Thanks for the help.  I didn't know about handler.close.  (By which I 
 mean that I used logging without re-reading its documentation, which is 
 a testimony to its usability :)

> The cases you refer to seem to be _set_logger in packaging/run.py 
> (which appears
> not to be used at all - there appear to be no other references to it 
> in the
> code),
 Yep, probably dead code.  I think that a handler should be defined 
 only once, in the "if __name__ == '__main__'" block.  Am I right?  Just 
 like you don't call sys.exit from library code (hello optparse!), you 
 don't set logging handlers in library code, only in the outermost layer of 
 the script.

> Dispatcher.__init__ in packaging/run.py and
 This is the new-fangled command line parser, which can run global 
 (Python-wide) commands (search, uninstall, etc.) as well as traditional 
 project-wide commands (build, check, sdist, etc.)

> Distribution.parse_command_line in packaging/dist.py.
 This is the older command line parser, that can handle only 
 project-wide commands.  I'm not sure the work is finished to integrate 
 both parsers; my smoke test used to be --help-commands, which can be 
 hard to run these days.

 The problem is that Dispatcher or Distribution get the quiet or verbose 
 options from the command-line deep in the library code, and want to use 
 it to configure the log level on the handler, which I've just said 
 should be set up at a much higher level.  To solve this, I'm going to 
 add a *logginghandler* argument to Dispatcher/Distribution; that way, 
 the creation of the handler will happen only once and at a high level, 
 but the command-line parsing code will be able to set the log handler 
 from the command-line arguments. :)

> In the second and third cases, can you be sure that only one of these 
> code paths
> will be executed, at most once?

 Gut feeling is yes, but we've learned not to trust our instinct with 
 distutils.

> In the case of the test support code, I'm not really sure that 
> LoggingCatcher is
> needed. There is already a TestHandler class in test.support which 
> captures
> records in a buffer, and allows flexible matching for assertions, as 
> described in

 distutils used its own log module; this mixin was used to intercept 
 messages sent with this system.  When we migrated to stdlib logging, I 
 added a todo comment to update the code to use something less kludgy :)  
 The post you linked to is already in my bookmarks.  Note that this 
 support module also helps with Python 2.4+, so I may have to copy-paste 
 TestHandler.

 So, I will fix the LoggingCatcher mixin to use the much cleaner 
 addHandler/removeHandler combo (I'll avoid calling 
 logging._removeHandlerRef if I don't have to) and try my idea about the 
 handler instantiation in the code.  Thanks a lot!
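
 A minimal sketch of what an addHandler/removeHandler based catcher could
 look like.  This is only an illustration, not the actual packaging code:
 the logger name "packaging", the ListHandler class and the get_logs
 helper are assumptions made for the example, and it uses Python 3 syntax
 (the zero-argument super()), so it would need adapting for older
 versions.

```python
import logging
import unittest


class ListHandler(logging.Handler):
    """Hypothetical handler that stores every record it receives."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


class LoggingCatcher:
    """Mixin sketch: capture records sent to one logger during a test."""

    logger_name = "packaging"  # assumed logger name

    def setUp(self):
        super().setUp()
        self.logger = logging.getLogger(self.logger_name)
        self.handler = ListHandler()
        self.old_level = self.logger.level
        self.logger.addHandler(self.handler)
        self.logger.setLevel(logging.DEBUG)
        # Undo both changes explicitly once the test is over.
        self.addCleanup(self.logger.setLevel, self.old_level)
        self.addCleanup(self.logger.removeHandler, self.handler)

    def get_logs(self, level=logging.WARNING):
        # Return the formatted messages logged at exactly this level.
        return [r.getMessage() for r in self.handler.records
                if r.levelno == level]


class DemoTest(LoggingCatcher, unittest.TestCase):
    def test_warning_is_captured(self):
        logging.getLogger("packaging").warning("spam %s", "eggs")
        self.assertEqual(self.get_logs(), ["spam eggs"])
```

 The mixin goes before unittest.TestCase in the bases so that its setUp
 runs first and can chain to TestCase.setUp through super().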

 Cheers

From steve at pearwood.info  Mon May  9 18:39:14 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 10 May 2011 02:39:14 +1000
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTimUOUMfSkTFR+99869RK5Y3nEoZyw@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>	<4DC7FDF5.9060003@pearwood.info>
	<BANLkTimUOUMfSkTFR+99869RK5Y3nEoZyw@mail.gmail.com>
Message-ID: <4DC818B2.5060508@pearwood.info>

Eli Bendersky wrote:
>> I think you are making an unwarranted assumption about what is "more
>> expected". I presume you are thinking that the expected behaviour is that
>> foo() should:
>>
>> print global x (5)
>> assign 1 to local x
>> print local x (1)
>>
>> If we implemented this change, there would be no more questions about
>> UnboundLocalError, but instead there would be lots of questions like "why is
>> it that globals revert to their old value after I change them in a
>> function?".
>>
> 
> True, but this is less confusing and follows the rules in a more
> straightforward way. x = 1 without a 'global x' assigns a local x, this make
> sense and is similar to what happens in C where an inner declaration
> temporarily shadows a global one.

I disagree that it is less confusing. Instead of a nice, straightforward 
error that you can google, the function will silently do the wrong 
thing, giving no clue that weirdness is happening.

def spam():
     if x < 0:  # refers to global x
         x = 1  # now local
     if x > 0:  # could be either global or local
         x = x - 1  # local on the LHS of the equal
         # sometimes global on the RHS
     else:
         x += 1  # local x, but what value does it have?


Just thinking about debugging the mess that this could make gives me a 
headache.
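
For comparison, here is a small sketch of the existing behaviour, using 
the foo() example from earlier in the thread. The try/except wrapper and 
the name 'caught' are only there for illustration:

```python
x = 5


def foo():
    # Because 'x' is assigned somewhere in this body, the compiler treats
    # every use of 'x' in foo() as local, including the one before the
    # assignment.
    print(x)
    x = 1
    print(x)


caught = None
try:
    foo()
except UnboundLocalError as exc:
    caught = exc
    print("UnboundLocalError:", exc)

# The compiler's decision is visible on the code object: 'x' is listed
# among the local variable names of foo().
print(foo.__code__.co_varnames)
```

The error is raised at the first print, before anything has been 
printed, so there is no silent mixing of the global and the local x.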



-- 
Steven


From merwok at netwok.org  Mon May  9 18:42:06 2011
From: merwok at netwok.org (Éric Araujo)
Date: Mon, 09 May 2011 18:42:06 +0200
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <BANLkTikx0KXuo9vyDBi=Ky2mWqY61Vp=6g@mail.gmail.com>
References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
	<BANLkTikx0KXuo9vyDBi=Ky2mWqY61Vp=6g@mail.gmail.com>
Message-ID: <190b407c9c414b46c8658307a88e5dfa@netwok.org>

 Hi,

> When it comes to
> containers, identity matters at least as much as value does (and
> sometimes more so - e.g. sys.modules). Replacing those global
> containers with new ones isn't guaranteed to work, as they may be
> cached in various places rather than always retrieved fresh from the
> relevant module namespace. Modifying them in place, on the other 
> hand,
> does the right thing even in the presence of cached references.

 That makes sense, thanks for the explanation!
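
 To check my understanding, here is a small sketch of the difference,
 using sys.path as the global container.  The directory name is made up,
 and "cached" stands in for a reference that some other module kept
 around:

```python
import sys

cached = sys.path            # a reference someone cached earlier
saved = list(sys.path)       # snapshot taken before the test

sys.path.append("/hypothetical/test/dir")

# Restoring in place: the cached reference sees the restored contents.
sys.path[:] = saved
in_place_ok = cached is sys.path and cached == saved

# Rebinding the name instead: sys.path is now a different object, so the
# cached reference no longer tracks it.
original = sys.path
sys.path = list(saved)
rebound_is_same = cached is sys.path
sys.path = original          # put the original object back
```

 After the slice assignment the cached list and sys.path are still the
 same object; after the rebinding they are not, which is the case the
 identity checks are there to catch.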

> A comment to that effect may be a useful addition to regrtest, as I
> expect others may have similar questions about those identity checks
> in the future. (It may even be a useful addition to the 
> documentation,
> but I have no idea where it could be sensibly included).

 Somewhere in unittest doc, say in the section about tearDown.  Or maybe 
 it's time for a Python testing best practices howto?

> Also, don't be surprised if wholesale cleanup like that isn't
> completely reliable - it's far, far better if the test case
> understands the changes it is making (even indirectly) and explicitly
> reverses them.

 Yep, I was probably bringing out the big guns too early.  
 self.addCleanup(sys.path.remove, path) is better and even shorter than 
 my previous code!
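
 For the record, a minimal sketch of that pattern (the class name and the
 directory are made up for the example):

```python
import sys
import unittest


class PathTest(unittest.TestCase):
    def setUp(self):
        self.path = "/hypothetical/test/dir"  # illustrative value
        sys.path.append(self.path)
        # Reverse exactly the change made above; cleanups run even if
        # the test itself fails.
        self.addCleanup(sys.path.remove, self.path)

    def test_path_is_visible(self):
        self.assertIn(self.path, sys.path)
```

 The test only undoes the one change it made, instead of saving and
 restoring the whole list.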

 Cheers

From tjreedy at udel.edu  Mon May  9 18:59:29 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 09 May 2011 12:59:29 -0400
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <iq8q3d$kfk$1@dough.gmane.org>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<iq8q3d$kfk$1@dough.gmane.org>
Message-ID: <iq96he$1q8$1@dough.gmane.org>

On 5/9/2011 9:27 AM, Stefan Behnel wrote:
> Eli Bendersky, 09.05.2011 14:56:
>> It's a known Python gotcha (*) that the following code:
>>
>> x = 5
>> def foo():
>>     print(x)
>>     x = 1
>>     print(x)
>> foo()
>>
>> Will throw:
>>
>> UnboundLocalError: local variable 'x' referenced before assignment
>>
>> On the usage of 'x' in the *first* print. Recently, while reading the
>> zillionth question on StackOverflow on some variation of this case, I
>> started thinking whether this behavior is desired or just an
>> implementation
>> artifact.
>
> Well, basically any compiler these days can detect that a variable is
> being used before assignment, or at least that this is possibly the
> case, depending on prior branching.
>
> ISTM that your suggestion is to let x refer to the outer x up to the
> assignment and to the inner x from that point on. IMHO, that's much
> worse than the current behaviour and potentially impractical due to
> conditional assignments.
>
> However, it's also a semantic change to reject code with unbound locals
> at compile time, as the specific code in question may actually be
> unreachable at runtime. This makes me think that it would be best to
> discuss this on the python-ideas list first.
>
> If nothing else, I'd like to see a discussion on this behaviour being an
> implementation detail of CPython or a feature of the Python language.
>
> Stefan
>


-- 
Terry Jan Reedy


From tjreedy at udel.edu  Mon May  9 19:24:20 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 09 May 2011 13:24:20 -0400
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
Message-ID: <iq9802$e9m$1@dough.gmane.org>

A commit (push) partitions time and behavior into before and after (with 
a short change period in between during which behavior is undefined).

Some commit messages have the form 'x does y'. Does 'does' mean before 
or after? Sometimes that is clear. 'x crashes' means before. 'x returns 
correct value' means after. But some messages of this type are unclear 
to me as written.

Consider 'x raises exception'. The temporal reference is obvious to the 
committer but not necessarily to everyone else. It could mean 'x used to 
segfault and now raises a catchable exception'. There was a fix like 
this (with a clear message) just today. It could also mean 'x used to 
raise but now returns an answer'. There have been many fixes like this.

Two minimal fixes are 'x raised exception' or 'make x raise exception'.

-- 
Terry Jan Reedy


From vinay_sajip at yahoo.co.uk  Mon May  9 19:40:03 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Mon, 9 May 2011 17:40:03 +0000 (UTC)
Subject: [Python-Dev] Problems with regrtest and with logging
Message-ID: <loom.20110509T193140-280@post.gmane.org>

Éric Araujo <merwok <at> netwok.org> writes:


>  Yep, probably dead code.  I think that a handler should be defined 
>  only once, in the "if __name__ == '__main__'" block.  Am I right?  Just 
>  like you don't call sys.exit from library code (hello optparse!), you 
>  don't set logging handlers in library code, only in the outermost layer of 
>  the script.

That's right, though it's OK to provide a documented convenience API for adding
handlers.
 
>  The problem is that Dispatcher or Distribution get the quiet or verbose 
>  options from the command-line deep in the library code, and want to use 
>  it to configure the log level on the handler, which I've just said 
>  should be set up at a much higher level.  To solve this, I'm going to 
>  add a *logginghandler* argument to Dispatcher/Distribution; that way, 
>  the creation of the handler will happen only once and at a high level, 
>  but the command-line parsing code will be able to set the log handler 
>  from the command-line arguments. :)

You don't necessarily need to set the level on the handler - why can you not
just set it on the logger? The effect would often be the same: the logger's
level is checked first, and then the handler's level. Generally you set levels
on handlers when you want specific behaviour, such as all ERROR and above to a
particular file, all CRITICAL to an email handler etc.

For command-line scripts outputting to the console and nowhere else, usually you
could just add a StreamHandler (with no level set on it), and set the level on
the logger. Where the functionality may be used in an API, you should perhaps
check logger.hasHandlers() and avoid adding handlers if there are already some
added by a using library or application.
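
A rough sketch of that arrangement follows. The configure_logging helper,
the "packaging" logger name and the -v flag handling are assumptions for
illustration, not the actual packaging code:

```python
import logging
import sys


def configure_logging(verbose=False, name="packaging"):
    """Hypothetical top-level helper; 'packaging' is an assumed logger name."""
    logger = logging.getLogger(name)
    # Only add a handler if neither this logger nor its ancestors have one.
    if not logger.hasHandlers():
        logger.addHandler(logging.StreamHandler(sys.stderr))  # no handler level
    # The logger's level is checked before any handler sees the record.
    logger.setLevel(logging.DEBUG if verbose else logging.WARNING)
    return logger


if __name__ == "__main__":
    log = configure_logging(verbose="-v" in sys.argv[1:])
    log.debug("only shown with -v")
    log.warning("always shown")
```

Library code would then only call logging.getLogger(...) and log through
it, leaving handler creation to the script's entry point.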

Regards,

Vinay Sajip



From rdmurray at bitdance.com  Mon May  9 19:54:45 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 09 May 2011 13:54:45 -0400
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <7fa082450fb750d082e71d5070a62171@netwok.org>
References: <1304937168.22910.21.camel@marge>
	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
	<7fa082450fb750d082e71d5070a62171@netwok.org>
Message-ID: <20110509175447.4DC56250039@webabinitio.net>

On Mon, 09 May 2011 17:55:42 +0200, Éric Araujo <merwok at netwok.org> wrote:
>  On 09/05/2011 16:08, Benjamin Peterson wrote:
> > 2011/5/9 Victor Stinner <victor.stinner at haypocalc.com>:
> >> For merge commits: many developers just write "merge" or "merge
> >> 3.1". I
> >> have to go to the parent commit (and something to the grandparent,
> >> 3.1->3.2->3.3) to learn more about the commit.
> 
>  I follow conventions I've seen elsewhere (maybe Mercurial itself): I
>  use "Branch merge" when I merge anonymous branches on the same named
>  branch, and "Merge x.y" for forward-porting across named branches.
> 
>  I also tend to do more than one commit before merging.  It would not be
>  very easy with my current toolchain to get the commit message(s) to
>  insert into the new message, and I think it's not necessary.
> 
> > I thought the whole point of merging was that you brought a changeset
> > from one branch to another. This why I just write "merge" because
> > otherwise you're technically duplicating information that is pulled
> > onto the branch by merging.
> 
>  +1.  No interest in manually duplicating available information.
> 
>  On 09/05/2011 17:44, R. David Murray wrote:
> > No it isn't.  The commit message isn't pulled into the new branch.
> 
>  Sorry, your terminology does not make sense.  If you mean that the
>  commit message is not reused in the new commit after the merge, it's
>  true.  However, the commit message with the relevant information is
>  available as part of the changesets that have been pulled and merged.

The changesets are in the repository and there are pointers to them
from the merge changeset, sure, but the data isn't in the checkout
(that's how I understood "pulled in to the new branch").

If I do 'hg log' and search for a revno (that I got from hg annotate),
the commit message describing the change is not attached to that revno,
nor as far as I know is there a tool that makes it easy to get from that
revno to the explanatory commit message.  That's what Victor and I are
talking about.  Is there a tool that fixes this problem?  (svnmerge did a
nice job of that from the automate-the-message-generation end of things).

--
R. David Murray           http://www.bitdance.com

From ned at nedbatchelder.com  Mon May  9 20:36:44 2011
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Mon, 09 May 2011 14:36:44 -0400
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <iq9802$e9m$1@dough.gmane.org>
References: <iq9802$e9m$1@dough.gmane.org>
Message-ID: <4DC8343C.2050005@nedbatchelder.com>

On 5/9/2011 1:24 PM, Terry Reedy wrote:
> A commit (push) partition time and behavior into before and after 
> (with a short change period in between during which behavior is 
> undefined).
>
> Some commit messages have the form 'x does y'. Does 'does' mean before 
> or after? Sometimes that is clear. 'x crashes' means before. 'x return 
> correct value' means after. But some messages of this type are unclear 
> to me as written.
>
> Consider 'x raises exception'? The temporal reference is obvious to 
> the committer but not necessary to everyone else. It could mean 'x 
> used to segfault and now raises a catchable exception'. There was a 
> fix like this (with a clear message) just today. It could also mean 'x 
> used to raise but now return an answer. There have been many fixes 
> like this.
>
> Two minimal fixes are 'x raised exception' or 'make x raise exception'.
>
I've always favored "X now properly raises an exception."

--Ned.

From guido at python.org  Mon May  9 21:17:45 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 May 2011 12:17:45 -0700
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <4DC8343C.2050005@nedbatchelder.com>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com>
Message-ID: <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>

On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder <ned at nedbatchelder.com> wrote:
> On 5/9/2011 1:24 PM, Terry Reedy wrote:
>>
>> A commit (push) partition time and behavior into before and after (with a
>> short change period in between during which behavior is undefined).
>>
>> Some commit messages have the form 'x does y'. Does 'does' mean before or
>> after? Sometimes that is clear. 'x crashes' means before. 'x return correct
>> value' means after. But some messages of this type are unclear to me as
>> written.
>>
>> Consider 'x raises exception'? The temporal reference is obvious to the
>> committer but not necessary to everyone else. It could mean 'x used to
>> segfault and now raises a catchable exception'. There was a fix like this
>> (with a clear message) just today. It could also mean 'x used to raise but
>> now return an answer. There have been many fixes like this.
>>
>> Two minimal fixes are 'x raised exception' or 'make x raise exception'.
>>
> I've always favored "X now properly raises an exception."

While my own preference is "make X properly raise an exception" I'm
happy with any of the alternatives proposed here, and grateful to
Terry for calling this out. Checkin comments of the form "X does Y"
are ambiguous and confusing. (Same for feature requests in the
tracker.)

I'm curious where the habit to use the present tense comes from; I
wonder if it originates in some agile development practice?

-- 
--Guido van Rossum (python.org/~guido)

From eric at trueblade.com  Mon May  9 21:36:21 2011
From: eric at trueblade.com (Eric Smith)
Date: Mon, 09 May 2011 15:36:21 -0400
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com>
	<BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
Message-ID: <4DC84235.4060600@trueblade.com>

On 05/09/2011 03:17 PM, Guido van Rossum wrote:
> On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder <ned at nedbatchelder.com> wrote:
>> On 5/9/2011 1:24 PM, Terry Reedy wrote:
>>>
>>> A commit (push) partition time and behavior into before and after (with a
>>> short change period in between during which behavior is undefined).
>>>
>>> Some commit messages have the form 'x does y'. Does 'does' mean before or
>>> after? Sometimes that is clear. 'x crashes' means before. 'x return correct
>>> value' means after. But some messages of this type are unclear to me as
>>> written.
>>>
>>> Consider 'x raises exception'? The temporal reference is obvious to the
>>> committer but not necessary to everyone else. It could mean 'x used to
>>> segfault and now raises a catchable exception'. There was a fix like this
>>> (with a clear message) just today. It could also mean 'x used to raise but
>>> now return an answer. There have been many fixes like this.
>>>
>>> Two minimal fixes are 'x raised exception' or 'make x raise exception'.
>>>
>> I've always favored "X now properly raises an exception."
> 
> While my own preference is "make X properly raise an exception" I'm
> happy with any of the alternatives proposed here, and grateful to
> Terry for calling this out. Checkin comments of the form "X does Y"
> are ambiguous and confusing. (Same for feature requests in the
> tracker.)
> 
> I'm curious where the habit to use the present tense comes from; I
> wonder if it originates in some agile development practice?
> 

Thanks indeed for bringing this up, Terry. It's been on my to-do list
for a while. I think it comes from just copying the title of a bug
report. The bug is "X does Y", and that's what's used in the fix.

Eric.


From guido at python.org  Mon May  9 22:05:30 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 May 2011 13:05:30 -0700
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <4DC84235.4060600@trueblade.com>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com>
	<BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
	<4DC84235.4060600@trueblade.com>
Message-ID: <BANLkTi=BGR3D+RwHFff3rgUfBt4A9pOX5w@mail.gmail.com>

On Mon, May 9, 2011 at 12:36 PM, Eric Smith <eric at trueblade.com> wrote:
> On 05/09/2011 03:17 PM, Guido van Rossum wrote:
>> On Mon, May 9, 2011 at 11:36 AM, Ned Batchelder <ned at nedbatchelder.com> wrote:
>>> On 5/9/2011 1:24 PM, Terry Reedy wrote:
>>>>
>>>> A commit (push) partition time and behavior into before and after (with a
>>>> short change period in between during which behavior is undefined).
>>>>
>>>> Some commit messages have the form 'x does y'. Does 'does' mean before or
>>>> after? Sometimes that is clear. 'x crashes' means before. 'x return correct
>>>> value' means after. But some messages of this type are unclear to me as
>>>> written.
>>>>
>>>> Consider 'x raises exception'? The temporal reference is obvious to the
>>>> committer but not necessary to everyone else. It could mean 'x used to
>>>> segfault and now raises a catchable exception'. There was a fix like this
>>>> (with a clear message) just today. It could also mean 'x used to raise but
>>>> now return an answer. There have been many fixes like this.
>>>>
>>>> Two minimal fixes are 'x raised exception' or 'make x raise exception'.
>>>>
>>> I've always favored "X now properly raises an exception."
>>
>> While my own preference is "make X properly raise an exception" I'm
>> happy with any of the alternatives proposed here, and grateful to
>> Terry for calling this out. Checkin comments of the form "X does Y"
>> are ambiguous and confusing. (Same for feature requests in the
>> tracker.)
>>
>> I'm curious where the habit to use the present tense comes from; I
>> wonder if it originates in some agile development practice?
>>
>
> Thanks indeed for bringing this up, Terry. It's been on my to-do list
> for a while. I think it comes from just copying the title of a bug
> report. The bug is "X does Y", and that's what's used in the fix.

But in bug reports it is also ambiguous, since I've often seen it used
meaning "X should do Y" which is very confusing when it doesn't do Y
yet at the time the bug is created. :-(

-- 
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu  Tue May 10 00:59:41 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 09 May 2011 18:59:41 -0400
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <BANLkTi=BGR3D+RwHFff3rgUfBt4A9pOX5w@mail.gmail.com>
References: <iq9802$e9m$1@dough.gmane.org>
	<4DC8343C.2050005@nedbatchelder.com>	<BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>	<4DC84235.4060600@trueblade.com>
	<BANLkTi=BGR3D+RwHFff3rgUfBt4A9pOX5w@mail.gmail.com>
Message-ID: <iq9rkr$2j7$1@dough.gmane.org>

On 5/9/2011 4:05 PM, Guido van Rossum wrote:
> On Mon, May 9, 2011 at 12:36 PM, Eric Smith<eric at trueblade.com>  wrote:
>> On 05/09/2011 03:17 PM, Guido van Rossum wrote:

>>> While my own preference is "make X properly raise an exception" I'm
>>> happy with any of the alternatives proposed here, and grateful to
>>> Terry for calling this out.

I am willing to admit that I do not know all corners of Python ;-)
I read the commit messages to learn more; in particular what sort of 
errors exist and how are they fixed.

 >>> Checkin comments of the form "X does Y"
>>> are ambiguous and confusing. (Same for feature requests in the
>>> tracker.)

I have always assumed that an issue entitled 'x does y' is a bug report 
about doing y now, before a fix.

>> Thanks indeed for bringing this up, Terry. It's been on my to-do list
>> for a while. I think it comes from just copying the title of a bug
>> report. The bug is "X does Y", and that's what's used in the fix.

I have also seen this type of message for non-tracker-issue commits.

> But in bug reports it is also ambiguous, since I've often seen it used
> meaning "X should do Y" which is very confusing when it doesn't do Y
> yet at the time the bug is created. :-(

If I notice a title that bad, I will try to change it.

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Tue May 10 01:03:24 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 09 May 2011 19:03:24 -0400
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <20110509175447.4DC56250039@webabinitio.net>
References: <1304937168.22910.21.camel@marge>	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>	<7fa082450fb750d082e71d5070a62171@netwok.org>
	<20110509175447.4DC56250039@webabinitio.net>
Message-ID: <iq9rrq$3nh$1@dough.gmane.org>

On 5/9/2011 1:54 PM, R. David Murray wrote:

> If I do 'hg log' and search for a revno (that I got from hg annotate),
> the commit message describing the change is not attached to that revno,
> nor as far as I know is there a tool that makes it easy to get from that
> revno to the explanatory commit message.  That's what Victor and I are
> talking about.  Is there a tool that fixes this problem?  (svnmerge did a
> nice job of that from the automate-the-message-generation end of things).

TortoiseSvn, and I presume TortoiseHg also, has a 'recent messages' box 
that makes it trivial to reuse a message. I used it with svn and will 
make sure to use it, if it exists, when I get started with hg.
-- 
Terry Jan Reedy


From benjamin at python.org  Tue May 10 01:23:45 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 9 May 2011 18:23:45 -0500
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <20110509154416.35BBF250037@webabinitio.net>
References: <1304937168.22910.21.camel@marge>
	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
	<20110509154416.35BBF250037@webabinitio.net>
Message-ID: <BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com>

2011/5/9 R. David Murray <rdmurray at bitdance.com>:
> On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson <benjamin at python.org> wrote:
>> I thought the whole point of merging was that you brought a changeset
>> from one branch to another. This why I just write "merge" because
>> otherwise you're technically duplicating information that is pulled
>> onto the branch by merging.
>
> No it isn't.  The commit message isn't pulled into the new branch.
>
>> It seems like something that should be solved by tools like a display
>> visual graph indicating what is merged. (like Bazaar)
>
> You'd need some extension to hg log that would show the original commit
> message for the first changeset in the merge line in order to "fix"
> this.  I doubt that is going to happen.

*cough* http://mercurial.selenic.com/wiki/GraphlogExtension

>
> Note that saying just 'merge' makes perfect sense when you are pulling
> in a whole group of changesets in order to synchronize two branches.
> But if you are applying a single changeset to multiple branches,
> as we often do in our workflow, then I think duplicating the commit
> message is (1) easy to do and (2) very helpful when looking at
> hg log output.

What's the difference between pulling multiple changesets in and one then?


-- 
Regards,
Benjamin

From nyamatongwe at gmail.com  Tue May 10 01:52:49 2011
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Tue, 10 May 2011 09:52:49 +1000
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <1304950275.22910.32.camel@marge>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
	<1304950275.22910.32.camel@marge>
Message-ID: <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com>

Victor Stinner:

> C and C++ identifiers are restricted to ASCII. I don't know for Fortran
> or Java.

   Some C and C++ implementations currently allow non-ASCII
identifiers and the forthcoming C1X and C++0x language standards
include non-ASCII identifiers. The allowed characters are specified in
Annexes of the respective standards.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E

   Neil

From solipsis at pitrou.net  Tue May 10 02:06:03 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 10 May 2011 02:06:03 +0200
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
	<1304950275.22910.32.camel@marge>
Message-ID: <20110510020603.25774eb9@pitrou.net>

On Mon, 09 May 2011 16:11:15 +0200
Victor Stinner <victor.stinner at haypocalc.com> wrote:
> On Monday 09 May 2011 at 09:00 -0400, Jim Jewett wrote:
> > Are you asserting that all foreign modules (or at least all handled by
> > this) are in C, as opposed to C++ or even Java or Fortran?  (And the C
> > won't change?)
> 
> C and C++ identifiers are restricted to ASCII. I don't know for Fortran
> or Java.

Why is it important, though?
What matters is not what C/C++ can produce, but what a shared library
can export. So the question is: are shared libraries limited to ASCII
symbols?

Regards

Antoine.



From greg.ewing at canterbury.ac.nz  Tue May 10 02:13:47 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 10 May 2011 12:13:47 +1200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<4DC34EAB.9050001@canterbury.ac.nz>
	<20110506122703.17c4d889@pitrou.net>
	<BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com>
Message-ID: <4DC8833B.9080503@canterbury.ac.nz>

Nick Coghlan wrote:

> One interesting aspect is that from the caller's point of view, a
> *new* reference to the relevant behaves like a borrowed reference for
> input parameters, but like a stolen reference for output parameters
> and return values.

I think it's less confusing to use the term "new" only for
output/return values, and "stolen" only for input values.

Inputs are either "borrowed" or "stolen" (by the callee).

Outputs are either "new" (to the caller) or "borrowed"
(by the caller).

(Or maybe the terms for outputs should be "given" and "lent"?-)

-- 
Greg

From victor.stinner at haypocalc.com  Tue May 10 02:57:14 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 10 May 2011 02:57:14 +0200
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
	<1304950275.22910.32.camel@marge>
	<BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com>
Message-ID: <1304989034.29582.7.camel@marge>

On Tuesday 10 May 2011 at 09:52 +1000, Neil Hodgson wrote:
>    Some C and C++ implementations currently allow non-ASCII
> identifiers and the forthcoming C1X and C++0x language standards
> include non-ASCII identifiers. The allowed characters are specified in
> Annexes of the respective standards.
> 
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E

I read these documents but they don't explain which encoding is used in
libraries and programs. Does it mean that Windows and Linux may use
different encodings? At least, the surrogate range (U+D800-U+DFFF) is
excluded, which is good news (the UTF-8 decoder of Python 3 rejects
surrogate characters).
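
A quick sketch of that decoder behaviour, for illustration. The variable
names are arbitrary; the three bytes are what a lone U+DC00 would look
like if surrogates were permitted in UTF-8:

```python
# Bytes that would encode the lone surrogate U+DC00 if UTF-8 allowed it.
raw = b"\xed\xb0\x80"

try:
    raw.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True

# The 'surrogatepass' error handler has to be requested explicitly to
# let such a sequence through.
passed = raw.decode("utf-8", "surrogatepass")
```

With the default strict error handler the decode fails, so a symbol name
containing such bytes could not be turned into a str by accident.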

I discovered the -fextended-identifiers option of gcc: using this option,
you can use \uHHHH and \UHHHHHHHH in identifiers, but not \xHH. On
Linux, identifiers are encoded to UTF-8.

Example:
--------------
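#define _ISOC99_SOURCE
#include <stdio.h>
#include <wchar.h>   /* declares wprintf() */

/* build with e.g.: gcc -std=c99 -fextended-identifiers */
void f\u00E9(void) { wprintf(L"U+00E9 = \xE9\n"); }

void g\U000000E8(void) { wprintf(L"U+00E8 = \xE8\n"); }

int main(void) { f\u00E9(); g\U000000E8(); return 0; }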
--------------

It's not very practical; I would prefer to write Unicode characters
directly (as I can do in Python 3!). I'm not sure that Chinese speakers
will prefer to call \u4f60\u597d() instead of hello().

Ok, I now agree, it is possible to use non-ASCII characters in C. But
what about the encoding of symbols in a dynamic library: is it always
UTF-8?
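
For comparison, here is a minimal Python 3 sketch of the same points. It
is only an illustration, not part of any patch: it shows that a non-ASCII
name is accepted as an identifier, what its UTF-8 bytes look like, and
that the strict UTF-8 codec refuses lone surrogates in both directions.
The names "namespace" and "surrogate_rejected" are made up for the demo.

```python
# A non-ASCII identifier, written with an escape so this file stays ASCII.
name = "f\u00e9"
print(name.isidentifier())

# Python 3 accepts the identifier directly in source code.
namespace = {}
exec("def f\u00e9():\n    return 'U+00E9'\n", namespace)
result = namespace[name]()
print(result)

# UTF-8 encoding of the identifier, as a symbol table on Linux might hold it.
encoded = name.encode("utf-8")
print(encoded)

# The strict UTF-8 codec rejects lone surrogates when encoding...
try:
    "\udc80".encode("utf-8")
    surrogate_rejected = False
except UnicodeEncodeError:
    surrogate_rejected = True

# ...and when decoding the byte sequence that would represent U+DC80.
try:
    b"\xed\xb2\x80".decode("utf-8")
    surrogate_bytes_rejected = False
except UnicodeDecodeError:
    surrogate_bytes_rejected = True

print(surrogate_rejected, surrogate_bytes_rejected)
```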

Victor


From nyamatongwe at gmail.com  Tue May 10 03:08:32 2011
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Tue, 10 May 2011 11:08:32 +1000
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <1304989034.29582.7.camel@marge>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
	<1304950275.22910.32.camel@marge>
	<BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com>
	<1304989034.29582.7.camel@marge>
Message-ID: <BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com>

Victor Stinner:

> I read these documents but they don't explain which encoding is used in
> libraries and programs. Does it mean that Windows and Linux may use
> different encodings?

   Yes, Windows will use UTF-16 as it does for almost everything. From
a user's point of view, these should both just be seen as Unicode.

   Neil

From greg.ewing at canterbury.ac.nz  Tue May 10 03:28:04 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 10 May 2011 13:28:04 +1200
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <20110510005835.GA29281@rectangular.com>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<4DC34EAB.9050001@canterbury.ac.nz>
	<20110506122703.17c4d889@pitrou.net>
	<BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com>
	<4DC8833B.9080503@canterbury.ac.nz>
	<20110510005835.GA29281@rectangular.com>
Message-ID: <4DC894A4.2000709@canterbury.ac.nz>

Marvin Humphrey wrote:

>   incremented: The caller has to account for an additional refcount.
>   decremented: The caller has to account for a lost refcount.

I'm not sure that really clarifies anything. These terms
sound like they're talking about the reference count of the
object, but if they correspond to borrowed/stolen, they
don't necessarily correlate with what actually happens to
the reference count.

-- 
Greg

From marvin at rectangular.com  Tue May 10 02:58:35 2011
From: marvin at rectangular.com (Marvin Humphrey)
Date: Mon, 9 May 2011 17:58:35 -0700
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <4DC8833B.9080503@canterbury.ac.nz>
References: <1304499523.15694.11.camel@marge> <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<4DC34EAB.9050001@canterbury.ac.nz>
	<20110506122703.17c4d889@pitrou.net>
	<BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com>
	<4DC8833B.9080503@canterbury.ac.nz>
Message-ID: <20110510005835.GA29281@rectangular.com>

On Tue, May 10, 2011 at 12:13:47PM +1200, Greg Ewing wrote:
> Nick Coghlan wrote:
>
>> One interesting aspect is that from the caller's point of view, a
>> *new* reference to the relevant behaves like a borrowed reference for
>> input parameters, but like a stolen reference for output parameters
>> and return values.
>
> I think it's less confusing to use the term "new" only for
> output/return values, and "stolen" only for input values.
>
> Inputs are either "borrowed" or "stolen" (by the callee).
>
> Outputs are either "new" (to the caller) or "borrowed"
> (by the caller).
>
> (Or maybe the terms for outputs should be "given" and "lent"?-)

To solve this problem in a similar system (the Clownfish object system used by
Apache Lucy) we used the keywords "incremented" and "decremented".  Applied to
some Python C API function documentation:

  incremented PyObject* PyTuple_New(Py_ssize_t len)

  int PyTuple_SetItem(PyObject *p, Py_ssize_t pos, 
                      decremented PyObject *o)

With "incremented" and "decremented", the perspective is always that of the
caller.  

  incremented: The caller has to account for an additional refcount.
  decremented: The caller has to account for a lost refcount.

Marvin Humphrey


From rdmurray at bitdance.com  Tue May 10 03:32:46 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 09 May 2011 21:32:46 -0400
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com>
References: <1304937168.22910.21.camel@marge>
	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
	<20110509154416.35BBF250037@webabinitio.net>
	<BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com>
Message-ID: <20110510013247.655DA250037@webabinitio.net>

On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson <benjamin at python.org> wrote:
> 2011/5/9 R. David Murray <rdmurray at bitdance.com>:
> > On Mon, 09 May 2011 09:08:53 -0500, Benjamin Peterson <benjamin at python.org> wrote:
> >> I thought the whole point of merging was that you brought a changeset
> >> from one branch to another. This is why I just write "merge" because
> >> otherwise you're technically duplicating information that is pulled
> >> onto the branch by merging.
> >
> > No it isn't.  The commit message isn't pulled into the new branch.
> >
> >> It seems like something that should be solved by tools like a display
> >> visual graph indicating what is merged. (like Bazaar)
> >
> > You'd need some extension to hg log that would show the original commit
> > message for the first changeset in the merge line in order to "fix"
> > this.  I doubt that is going to happen.
> 
> *cough* http://mercurial.selenic.com/wiki/GraphlogExtension

I'm sorry, but I've looked at the output of that and the mental overhead
has so far proven too high for it to be of any use to me.  I apologize for
not having made the full mental transition to "distributed VCS"/DAG
(apparently), but it sounds like I'm not the only one....

> > Note that saying just 'merge' makes perfect sense when you are pulling
> > in a whole group of changesets in order to synchronize two branches.
> > But if you are applying a single changeset to multiple branches,
> > as we often do in our workflow, then I think duplicating the commit
> > message is (1) easy to do and (2) very helpful when looking at
> > hg log output.
> 
> What's the difference between pulling multiple changesets in and one then?

I'm talking about merging trunk to a feature branch, for example.
I'd not expect any message other than 'merge' for that.

I'd be satisfied if the commit messages listed the issue numbers involved
in the merge, especially if someone (like Éric) is merging more than
one change at a time.

But as I think about this, frankly I'd rather see atomic commits, even
on merges.  That was something I disliked about svnmerge, the fact that
often an svnmerge commit involved many changesets from the other branch.
That was especially painful in exactly the same situation:  trying to
backtrack a change starting from 'svn blame'.  I limited my own use
of multiple-changeset-svnmerge to doc changes and changesets that were
actually related, despite the overhead involved in doing it that way.

All that said, I'm not trying to impose my will on the workflow, I'll
certainly live with the consensus (though unless there is an outcry
against it I'll continue putting the full commit message in my own
merges).

--
R. David Murray           http://www.bitdance.com

From marvin at rectangular.com  Tue May 10 03:53:10 2011
From: marvin at rectangular.com (Marvin Humphrey)
Date: Mon, 9 May 2011 18:53:10 -0700
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <4DC894A4.2000709@canterbury.ac.nz>
References: <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<4DC34EAB.9050001@canterbury.ac.nz>
	<20110506122703.17c4d889@pitrou.net>
	<BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com>
	<4DC8833B.9080503@canterbury.ac.nz>
	<20110510005835.GA29281@rectangular.com>
	<4DC894A4.2000709@canterbury.ac.nz>
Message-ID: <20110510015310.GA29407@rectangular.com>

On Tue, May 10, 2011 at 01:28:04PM +1200, Greg Ewing wrote:
> Marvin Humphrey wrote:
>
>>   incremented: The caller has to account for an additional refcount.
>>   decremented: The caller has to account for a lost refcount.
>
> I'm not sure that really clarifies anything. These terms
> sound like they're talking about the reference count of the
> object, but if they correspond to borrowed/stolen, they
> don't necessarily correlate with what actually happens to
> the reference count.

Hmm, they don't correspond to borrowed/stolen.

    stolen from the caller -> decremented
    stolen from the callee -> incremented
    borrowed               -> [no modifier]

We don't have a modifier keyword which is analogous to "borrowed".  The user
is expected to understand object lifespan issues for borrowed references
without explicit guidance.

With regards to "what actually happens to the reference count", I would argue
that "incremented" and "decremented" are accurate descriptions.

  * When a function returns an "incremented" object, that function has added
    a refcount to it.
  * When a function accepts a "decremented" object as an argument, it will
    consume a refcount from it -- either right away, or at some point in the
    future.

In my view, it is not desirable to label arguments or return values as
"borrowed"; it is only necessary to advise the user when they must take action
to account for a refcount, gained or lost.

Cheers,

Marvin Humphrey


From stephen at xemacs.org  Tue May 10 04:51:19 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 10 May 2011 11:51:19 +0900
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <20110510013247.655DA250037@webabinitio.net>
References: <1304937168.22910.21.camel@marge>
	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
	<20110509154416.35BBF250037@webabinitio.net>
	<BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com>
	<20110510013247.655DA250037@webabinitio.net>
Message-ID: <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>

R. David Murray writes:
 > On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson <benjamin at python.org> wrote:

 > > *cough* http://mercurial.selenic.com/wiki/GraphlogExtension
 > 
 > I'm sorry, but I've looked at the output of that and the mental overhead
 > has so far proven too high for it to be of any use to me.

How about the hgk extension, and "hg view"?

http://mercurial.selenic.com/wiki/HgkExtension

 > But as I think about this, frankly I'd rather see atomic commits, even
 > on merges.  That was something I disliked about svnmerge, the fact that
 > often an svnmerge commit involved many changesets from the other branch.
 > That was especially painful in exactly the same situation:  trying to
 > backtrack a change starting from 'svn blame'.

I don't understand the issue.  In my experience, hg annotate will
point to the commit on the branch, not to the merge, unless there was
a conflict, in which case the merge is the "right" place (although not
necessarily the most useful place) to point.


From murman at gmail.com  Tue May 10 05:18:10 2011
From: murman at gmail.com (Michael Urman)
Date: Mon, 9 May 2011 22:18:10 -0500
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
	<1304950275.22910.32.camel@marge>
	<BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com>
	<1304989034.29582.7.camel@marge>
	<BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com>
Message-ID: <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com>

On Mon, May 9, 2011 at 20:08, Neil Hodgson <nyamatongwe at gmail.com> wrote:
>   Yes, Windows will use UTF-16 as it does for almost everything. From
> a user's point of view, these should both just be seen as Unicode.

I'm not convinced this is correct for this case. GetProcAddress takes
an "ANSI" string, meaning while it could theoretically use UTF-8, in
practice I doubt it uses anything outside of ASCII safely. So while
the name of the library would be encoded in UTF-16, the name of the
function loaded from the library would not be.

http://msdn.microsoft.com/en-us/library/ms683212(v=vs.85).aspx
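
To make the concern concrete, here is a rough Python sketch of what
"encode the module name explicitly to ASCII" means for the symbol that
gets looked up. The helper init_symbol_name() is hypothetical and is not
the code in _PyImport_LoadDynamicModule(); only the "PyInit_" prefix is
the real Python 3 naming convention.

```python
def init_symbol_name(module_name):
    """Hypothetical helper: build the init symbol as an ASCII byte string."""
    return ("PyInit_" + module_name).encode("ascii")


# A plain ASCII module name yields a byte string usable as a narrow symbol.
ascii_symbol = init_symbol_name("spam")
print(ascii_symbol)

# A non-ASCII module name fails early with a clear error, instead of
# handing bytes in some unknown encoding to the dynamic loader.
try:
    init_symbol_name("sp\u00e9m")
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print("cannot build symbol:", exc.reason)
```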

-- 
Michael Urman

From nyamatongwe at gmail.com  Tue May 10 06:09:06 2011
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Tue, 10 May 2011 14:09:06 +1000
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
	<1304950275.22910.32.camel@marge>
	<BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com>
	<1304989034.29582.7.camel@marge>
	<BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com>
	<BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com>
Message-ID: <BANLkTi=J4GGFY2FQF37ahnJJ69csmTTi2Q@mail.gmail.com>

Michael Urman:

> I'm not convinced this is correct for this case. GetProcAddress takes
> an "ANSI" string, meaning while it could theoretically use UTF-8, in
> practice I doubt it uses anything outside of ASCII safely. So while
> the name of the library would be encoded in UTF-16, the name of the
> function loaded from the library would not be.

   Yes you are right:
http://scintilla.org/NarrowName.png

   Neil

From murman at gmail.com  Tue May 10 06:23:54 2011
From: murman at gmail.com (Michael Urman)
Date: Mon, 9 May 2011 23:23:54 -0500
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <BANLkTi=J4GGFY2FQF37ahnJJ69csmTTi2Q@mail.gmail.com>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
	<1304950275.22910.32.camel@marge>
	<BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com>
	<1304989034.29582.7.camel@marge>
	<BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com>
	<BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com>
	<BANLkTi=J4GGFY2FQF37ahnJJ69csmTTi2Q@mail.gmail.com>
Message-ID: <BANLkTincMf_YA8exSrr1=R8LWnC3Ty3VvQ@mail.gmail.com>

On Mon, May 9, 2011 at 23:09, Neil Hodgson <nyamatongwe at gmail.com> wrote:
> Michael Urman:
>
>> I'm not convinced this is correct for this case. GetProcAddress takes
>> an "ANSI" string, meaning while it could theoretically use UTF-8, in
>> practice I doubt it uses anything outside of ASCII safely. So while
>> the name of the library would be encoded in UTF-16, the name of the
>> function loaded from the library would not be.
>
>   Yes you are right:
> http://scintilla.org/NarrowName.png
>
>   Neil
>

That screenshot seems to show UTF-8 is being used. This may just be
the literal bytes in the .c file, but could it be something more
dependable?

http://unicode.org/cgi-bin/GetUnihanData.pl?codepoint=6728

From nyamatongwe at gmail.com  Tue May 10 06:35:27 2011
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Tue, 10 May 2011 14:35:27 +1000
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <BANLkTincMf_YA8exSrr1=R8LWnC3Ty3VvQ@mail.gmail.com>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
	<1304950275.22910.32.camel@marge>
	<BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com>
	<1304989034.29582.7.camel@marge>
	<BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com>
	<BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com>
	<BANLkTi=J4GGFY2FQF37ahnJJ69csmTTi2Q@mail.gmail.com>
	<BANLkTincMf_YA8exSrr1=R8LWnC3Ty3VvQ@mail.gmail.com>
Message-ID: <BANLkTinJ-mcY5A-W0EUY7qfJRNu8x7TKHg@mail.gmail.com>

Michael Urman:

> That screenshot seems to show UTF-8 is being used. This may just be
> the literal bytes in the .c file, but could it be something more
> dependable?

   The file is in UTF-8 so the compiler may just be copying the bytes.
There is a setlocale pragma but that seems to be just for string
literals.

   Neil

From eliben at gmail.com  Tue May 10 07:36:38 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Tue, 10 May 2011 08:36:38 +0300
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
Message-ID: <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>

On Mon, May 9, 2011 at 18:44, Isaac Morland <ijmorlan at uwaterloo.ca> wrote:

> On Mon, 9 May 2011, Eli Bendersky wrote:
>
>>> x = 5
>>> def foo ():
>>>       print (x)
>>>       if bar ():
>>>               x = 1
>>>       print (x)
>>>
>>>
>> I wish you'd annotate this code sample, what do you intend it to
>> demonstrate?
>>
>> It probably shows the original complaint even more strongly. As for being a
>> problem with the suggested solution, I suppose you're right, although it
>> doesn't make it much different. Still, before a *possible* assignment to
>> 'x', it should be loaded as LOAD_NAME since it was surely not bound as
>> local, yet.
>>
>
> Extrapolating from your suggestion, you're saying before a *possible*
> assignment it will be treated as global, and after a *possible* assignment
> it will be treated as local?
>
> But surely:
>
> print (x)
> if False:
>        x = 1
> print (x)
>
> [snip]

Alright, I now understand the problems with the suggestion. Indeed,
conditional assignments that are only really resolved at runtime are the big
stumbling block here.
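
A small sketch of why this is decided at compile time rather than at
runtime (CPython behaviour as I understand it; "is_local" and "raised"
are just names for the demo). Even an assignment under "if False:" is
enough for the compiler to treat 'x' as local everywhere in the function:

```python
x = 5


def foo():
    print(x)      # fails: 'x' is already considered local here
    if False:
        x = 1     # never executed, but still makes 'x' a local name
    print(x)


# The compiler recorded 'x' as a local variable of foo().
is_local = "x" in foo.__code__.co_varnames
print("x is local to foo:", is_local)

try:
    foo()
    raised = False
except UnboundLocalError as exc:
    raised = True
    print("UnboundLocalError:", exc)
```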

However, maybe the error message/reporting can still be improved?

ISTM the UnboundLocalError exception gets raised only in those weird and
confusing cases. After all, why would Python decide an access to some name
is to a local? Only if it found an assignment to that local in the scope.
But that assignment clearly didn't happen yet, so the error is thrown. So
cases like these:

x = 2

def foo1():
  x += 1

def foo2():
  print(x)
  x = 10

def foo3():
  if something_that_didnot_happen:
    x = 10
  print(x)

All belong to the category.
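
The three cases above can be checked with a short script. This is only a
sketch: the flag something_that_did_not_happen and the results dict are
added for the demo, and the exact exception text differs between Python
versions, so it is printed rather than compared.

```python
x = 2
something_that_did_not_happen = False


def foo1():
    x += 1          # augmented assignment makes 'x' local


def foo2():
    print(x)        # read before the later assignment
    x = 10


def foo3():
    if something_that_did_not_happen:
        x = 10      # assignment exists, but never runs
    print(x)


# Call each function and record the UnboundLocalError message, if any.
results = {}
for func in (foo1, foo2, foo3):
    try:
        func()
    except UnboundLocalError as exc:
        results[func.__name__] = str(exc)
    else:
        results[func.__name__] = None

for func_name, message in results.items():
    print(func_name, "->", message)
```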

With an unlimited error message length it could make sense to say "Hey, I
see 'x' may be assigned in this scope, so I mark it local. But this access
to 'x' happens before assignment - so ERROR". This isn't realistic, of
course, so I'm wondering:

1. Does this error message (although unrealistic) capture all possible
appearances of UnboundLocalError?
2. If the answer to (1) is yes - could it be usefully shortened to be
clearer than the current "local variable referenced before assignment"?

This may not be possible, of course, but it doesn't hurt to try :-)
Eli

From stefan_ml at behnel.de  Tue May 10 08:16:24 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 10 May 2011 08:16:24 +0200
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <iq8q3d$kfk$1@dough.gmane.org>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<iq8q3d$kfk$1@dough.gmane.org>
Message-ID: <iqal7o$g96$2@dough.gmane.org>

[forwarded to the python-ideas list]

Stefan


From victor.stinner at haypocalc.com  Tue May 10 10:03:29 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 10 May 2011 10:03:29 +0200
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
	<1304950275.22910.32.camel@marge>
	<BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com>
	<1304989034.29582.7.camel@marge>
	<BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com>
	<BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com>
Message-ID: <1305014609.2014.6.camel@marge>

On Monday 09 May 2011 at 22:18 -0500, Michael Urman wrote:
> On Mon, May 9, 2011 at 20:08, Neil Hodgson <nyamatongwe at gmail.com> wrote:
> >   Yes, Windows will use UTF-16 as it does for almost everything. From
> > a user's point of view, these should both just be seen as Unicode.
> 
> I'm not convinced this is correct for this case. GetProcAddress takes
> an "ANSI" string, meaning while it could theoretically use UTF-8, in
> practice I doubt it uses anything outside of ASCII safely.

If GetProcAddress() expects a byte string encoded to the ANSI code page,
my patch is correct because the function used the UTF-8 encoding, not
the ANSI code page. We can maybe use GetProcAddressW() to pass a Unicode
string. I don't know which encoding is used by GetProcAddressW()...

I already patched _PyImport_GetDynLoadFunc() for Windows: the path is
now a Unicode object instead of a byte string encoded to the filesystem
encoding. _PyImport_GetDynLoadWindows() uses GetFullPathNameW() and
LoadLibraryExW(). The work to be fully Unicode compliant (for the path
field, not for the name) is not completely done... but I have a pending
patch, see:
http://bugs.python.org/issue11619

But this patch is huge and creates many functions. I am not sure that we
need it, I will work on this later.

Victor


From orsenthil at gmail.com  Tue May 10 10:37:42 2011
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Tue, 10 May 2011 16:37:42 +0800
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12039: Add
 end_headers() call to avoid BadStatusLine.
In-Reply-To: <E1QJi1H-0003qr-Cd@dinsdale.python.org>
References: <E1QJi1H-0003qr-Cd@dinsdale.python.org>
Message-ID: <20110510083742.GA16239@kevin>

On Tue, May 10, 2011 at 10:10:15AM +0200, vinay.sajip wrote:
> diff --git a/Lib/test/test_logging.py b/Lib/test/test_logging.py
> --- a/Lib/test/test_logging.py
> +++ b/Lib/test/test_logging.py
> @@ -1489,6 +1489,7 @@
>              except:
>                  self.post_data = None
>          request.send_response(200)
> +        request.end_headers()
>          self.handled.set()

This is accurate. It should have resulted from the change made in the
http.server, because the headers are now cached and then written to
the output stream in one shot when end_headers/flush_headers are
called. 
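
A rough way to see the effect without a real server is sketched below.
It feeds one request to a handler over a socket pair and returns the raw
bytes the client would receive. The class names and raw_response() are
made up for the demo, and passing None as the server argument is a
shortcut that assumes the handler never touches self.server.

```python
import socket
from http.server import BaseHTTPRequestHandler


class QuietHandler(BaseHTTPRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress the default request logging to stderr


class WithoutEndHeaders(QuietHandler):
    def do_GET(self):
        self.send_response(200)   # status line is only buffered here


class WithEndHeaders(QuietHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()        # writes the buffered status line and headers


def raw_response(handler_class):
    """Run one GET through handler_class and return the bytes sent back."""
    client, server_side = socket.socketpair()
    try:
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        # Constructing the handler processes the request synchronously.
        handler_class(server_side, ("127.0.0.1", 0), None)
        server_side.close()       # lets the client see end-of-stream
        chunks = []
        while True:
            data = client.recv(4096)
            if not data:
                break
            chunks.append(data)
        return b"".join(chunks)
    finally:
        client.close()
        server_side.close()


print(raw_response(WithoutEndHeaders))
print(raw_response(WithEndHeaders).splitlines()[0])
```

With headers cached, the handler that omits end_headers() sends nothing
at all, which is what a client reports as a bad status line.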

Thanks,
Senthil


From aurelien.campeas at logilab.fr  Tue May 10 13:51:41 2011
From: aurelien.campeas at logilab.fr (=?ISO-8859-1?Q?Aur=E9lien_Camp=E9as?=)
Date: Tue, 10 May 2011 13:51:41 +0200
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1304937168.22910.21.camel@marge>	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>	<20110509154416.35BBF250037@webabinitio.net>	<BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com>	<20110510013247.655DA250037@webabinitio.net>
	<87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4DC926CD.1040108@logilab.fr>

On 10/05/2011 04:51, Stephen J. Turnbull wrote:
> R. David Murray writes:
>   >  On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson<benjamin at python.org>  wrote:
>
>   >  >  *cough* http://mercurial.selenic.com/wiki/GraphlogExtension
>   >
>   >  I'm sorry, but I've looked at the output of that and the mental overhead
>   >  has so far proven too high for it to be of any use to me.
>
> How about the hgk extension, and "hg view"?
>
> http://mercurial.selenic.com/wiki/HgkExtension
>

or, FWIW, "hgview" (http://www.logilab.org/project/hgview)

From ncoghlan at gmail.com  Tue May 10 14:29:58 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 22:29:58 +1000
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <4DC84235.4060600@trueblade.com>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com>
	<BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
	<4DC84235.4060600@trueblade.com>
Message-ID: <BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>

On Tue, May 10, 2011 at 5:36 AM, Eric Smith <eric at trueblade.com> wrote:
> Thanks indeed for bringing this up, Terry. It's been on my to-do list
> for a while. I think it comes from just copying the title of a bug
> report. The bug is "X does Y", and that's what's used in the fix.

I believe I've actually seen it in NEWS entries as well (although
thankfully not often and I can't recall any specific instances off the
top of my head).

I'm also a fan of including the word "now" and describing the new
behaviour, although I'll sometimes use "no longer" and describe the
old behaviour for some bugs where that seems more appropriate.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue May 10 14:44:29 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 22:44:29 +1000
Subject: [Python-Dev] Borrowed and Stolen References in API
In-Reply-To: <20110510015310.GA29407@rectangular.com>
References: <4DC11791.2000109@dcs.gla.ac.uk>
	<BANLkTi=V8FFQBO4LvC_DAMnUinHRkDB77A@mail.gmail.com>
	<4DC1D1C5.9010507@canterbury.ac.nz>
	<BANLkTi=iz++hmnrzJvC0aM56JhcyrfcqTQ@mail.gmail.com>
	<4DC34EAB.9050001@canterbury.ac.nz>
	<20110506122703.17c4d889@pitrou.net>
	<BANLkTikfL1_ocbdVLz5uHttEwLrNVJVWsQ@mail.gmail.com>
	<4DC8833B.9080503@canterbury.ac.nz>
	<20110510005835.GA29281@rectangular.com>
	<4DC894A4.2000709@canterbury.ac.nz>
	<20110510015310.GA29407@rectangular.com>
Message-ID: <BANLkTinzYcvKVzP8heCUy0uC8nerqv7beQ@mail.gmail.com>

On Tue, May 10, 2011 at 11:53 AM, Marvin Humphrey
<marvin at rectangular.com> wrote:
> With regards to "what actually happens to the reference count", I would argue
> that "incremented" and "decremented" are accurate descriptions.
>
>  * When a function returns an "incremented" object, that function has added
>    a refcount to it.

Except that's not quite true in cases like PySet_Pop(). In that case,
the net effect on the refcount is neutral. The significant point is
that the set no longer holds a reference, it has passed that
responsibility back to the caller.
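
The same hand-over can be observed from Python code, as a loose analogy
to the C-level PySet_Pop() case. This sketch is CPython-specific and the
variable names are invented; only the comparison matters, not the
absolute counts reported by sys.getrefcount().

```python
import sys

obj = object()
container = {obj}

# References now: the name 'obj' and the set (plus the call's temporary).
before = sys.getrefcount(obj)

# pop() removes the set's reference and hands one to the caller instead.
popped = container.pop()
after = sys.getrefcount(obj)

print("same object:", popped is obj)
print("refcount unchanged:", before == after)
```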

> In my view, it is not desirable to label arguments or return values as
> "borrowed"; it is only necessary to advise the user when they must take action
> to account for a refcount, gained or lost.

Agreed on this part, though. Callers need to know when:

1. The return value is a new reference that must be decremented
(currently indicated in the docs by "Return value: New reference")
2. An input parameter transfers responsibility for the reference to
the callee (the only example I noticed in the docs is PyList_SetItem,
which uses an explicit note rather than any kind of markup or the
refcount data)

I believe the current refcount data covers the first case reasonably
well, but not the latter (and still has the problem of being separated
from the documentation of the functions themselves).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rdmurray at bitdance.com  Tue May 10 14:50:34 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 10 May 2011 08:50:34 -0400
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1304937168.22910.21.camel@marge>
	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
	<20110509154416.35BBF250037@webabinitio.net>
	<BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com>
	<20110510013247.655DA250037@webabinitio.net>
	<87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20110510125035.8F20B250041@webabinitio.net>

On Tue, 10 May 2011 11:51:19 +0900, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> R. David Murray writes:
>  > On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson <benjamin at python.org> wrote:
> 
>  > > *cough* http://mercurial.selenic.com/wiki/GraphlogExtension
>  > 
>  > I'm sorry, but I've looked at the output of that and the mental overhead
>  > has so far proven too high for it to be of any use to me.
> 
> How about the hgk extension, and "hg view"?

I think the problem is in my brain, not the graphical tools :)  With rare
exceptions I don't use tools that require a mouse to operate, though,
so unless I feel like doing tcl hacking to make good keyboard bindings
that particular tool won't help me anyway.

>  > But as I think about this, frankly I'd rather see atomic commits, even
>  > on merges.  That was something I disliked about svnmerge, the fact that
>  > often an svnmerge commit involved many changesets from the other branch.
>  > That was especially painful in exactly the same situation:  trying to
>  > backtrack a change starting from 'svn blame'.
> 
> I don't understand the issue.  In my experience, hg annotate will
> point to the commit on the branch, not to the merge, unless there was
> a conflict, in which case the merge is the "right" place (although not
> necessarily the most useful place) to point.

That's what I get for reasoning about hg based on my svn experience.
Someone on IRC also pointed this out.  I haven't done this often
enough in HG for the difference to have penetrated my brain.  I have
a feeling I'm still going to get confused occasionally, but then
I'm sure I did with svn too...

--
R. David Murray           http://www.bitdance.com

From ncoghlan at gmail.com  Tue May 10 14:59:02 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 May 2011 22:59:02 +1000
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1304937168.22910.21.camel@marge>
	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
	<20110509154416.35BBF250037@webabinitio.net>
	<BANLkTik3M2y=W3t4pkGYqbc2MbWsWo=h+A@mail.gmail.com>
	<20110510013247.655DA250037@webabinitio.net>
	<87d3jre09k.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <BANLkTinuY1qCp0oQjbqJDyEk2Cg7kH2Spg@mail.gmail.com>

On Tue, May 10, 2011 at 12:51 PM, Stephen J. Turnbull
<stephen at xemacs.org> wrote:
> R. David Murray writes:
>  > On Mon, 09 May 2011 18:23:45 -0500, Benjamin Peterson <benjamin at python.org> wrote:
>
>  > > *cough* http://mercurial.selenic.com/wiki/GraphlogExtension
>  >
>  > I'm sorry, but I've looked at the output of that and the mental overhead
>  > has so far proven too high for it to be of any use to me.
>
> How about the hgk extension, and "hg view"?
>
> http://mercurial.selenic.com/wiki/HgkExtension

I don't think it's really a jump up to the "graphical" level that
we're after. It's more a matter of:

1. Display commit message for current commit
2. Notice that this commit has two parents
3. Ignore any parent commit in the same branch as the current commit
4. For a parent commit in another branch, also display that commit message
5. If the commit in step 4 also has multiple parents, repeat from step
3 (but based off that parent commit)

So a standard 3.1->3.2->default merge could be displayed along the lines of:

Merge from 3.2
  3.2: Merge from 3.1
    3.1: Issue #123456: mod.func now works correctly when argument is negative

It won't help if the last commit on the initial branch was something
boring like "Fix whitespace", but it will be adequate for our typical
single-commit bug fix workflow.
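
The five steps above can be sketched in Python. This is only an
illustration under stated assumptions: Commit is a hypothetical record
type invented for the example (it is not Mercurial's API and not the
actual email hook), and the sample history is made up to reproduce the
3.1->3.2->default display shown above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Commit:
    """Hypothetical stand-in for a changeset; this is not Mercurial's API."""
    branch: str
    message: str
    parents: List["Commit"] = field(default_factory=list)


def describe(commit, indent=0, show_branch=False):
    """Return display lines for a commit, following cross-branch parents."""
    prefix = "  " * indent
    label = commit.branch + ": " if show_branch else ""
    lines = [prefix + label + commit.message]      # step 1: show the message
    if len(commit.parents) > 1:                    # step 2: is it a merge?
        for parent in commit.parents:
            if parent.branch == commit.branch:     # step 3: skip same branch
                continue
            # steps 4 and 5: show the other-branch parent, then recurse
            lines.extend(describe(parent, indent + 1, show_branch=True))
    return lines


# Made-up history: a fix on 3.1, merged into 3.2, merged into default.
fix = Commit("3.1",
             "Issue #123456: mod.func now works correctly when argument is negative",
             [Commit("3.1", "earlier 3.1 work")])
merge_32 = Commit("3.2", "Merge from 3.1",
                  [Commit("3.2", "earlier 3.2 work"), fix])
merge_default = Commit("default", "Merge from 3.2",
                       [Commit("default", "earlier default work"), merge_32])

print("\n".join(describe(merge_default)))
```

A non-merge commit yields just its own message, so the sketch only adds
output for the merge case.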

(If nobody does anything before then, I'll see what I can do with the
email hook next week)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rdmurray at bitdance.com  Tue May 10 15:11:44 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 10 May 2011 09:11:44 -0400
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
	<BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
Message-ID: <20110510131144.C8D75250041@webabinitio.net>

On Tue, 10 May 2011 08:36:38 +0300, Eli Bendersky <eliben at gmail.com> wrote:
> With an unlimited error message length it could make sense to say "Hey, I
> see 'x' may be assigned in this scope, so I mark it local. But this access
> to 'x' happens before assignment - so ERROR". This isn't realistic, of
> course, so I'm wondering:
> 
> 1. Does this error message (although unrealistic) capture all possible
> appearances of UnboundLocalError?
> 2. If the answer to (1) is yes - could it be usefully shortened to be
> clearer than the current "local variable referenced before assignment"?
> 
> This may not be possible, of course, but it doesn't harm trying :-)

How about:

"reference to variable 'y' precedes an assignment that makes it a local
variable"

IMO this still leaves room for confusion, but is better because it
indicates the causation more clearly.  (I don't think it is necessary to
capture the subtlety of conditional assignment in the error message.)

--
R. David Murray           http://www.bitdance.com

From rdmurray at bitdance.com  Tue May 10 15:33:13 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 10 May 2011 09:33:13 -0400
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com>
	<BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
	<4DC84235.4060600@trueblade.com>
	<BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
Message-ID: <20110510133314.66B48250041@webabinitio.net>

On Tue, 10 May 2011 22:29:58 +1000, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Tue, May 10, 2011 at 5:36 AM, Eric Smith <eric at trueblade.com> wrote:
> > Thanks indeed for bringing this up, Terry. It's been on my to-do list
> > for a while. I think it comes from just copying the title of a bug
> > report. The bug is "X does Y", and that's what's used in the fix.
> 
> I believe I've actually seen it in NEWS entries as well (although
> thankfully not often and I can't recall any specific instances off the
> top of my head).
> 
> I'm also a fan of including the word "now" and describing the new
> behaviour, although I'll sometimes use "no longer" and describe the
> old behaviour for some bugs where that seems more appropriate.

I generally don't use the same text for commit and NEWS, because I like
to stick to one-liners for the first line of the commit, possibly with
more detail in the body, while for NEWS items I'm aiming for a one to
three line description.  But in both cases what I'm thinking about is
"what have I *changed*".  In the commit message that will probably focus
more on code changes, while the NEWS item will focus more on behavior
changes, but the results are generally similar.

So for example my most recent two commits look like this:

commit:
    11999: sync based on comparing mtimes, not mtime to system clock
NEWS:
    Issue 11999: fixed sporadic sync failure mailbox.Maildir due to its
    trying to detect mtime changes by comparing to the system clock
    instead of to the previous value of the mtime.

commit:
    #11873: Improve test regex so random directory names don't cause test to fail
NEWS:
    Issue #11873: Change regex in test_compileall to fix occasional
    failures when the randomly generated temporary path happened to
    match the regex.

You will note the *active* verbs "fixed", "improve", and "change"
figure in there prominently :)

(Eh.  And proofreading this email I see I made a grammar error in
that first NEWS example :(

--
R. David Murray           http://www.bitdance.com

From murman at gmail.com  Tue May 10 15:34:38 2011
From: murman at gmail.com (Michael Urman)
Date: Tue, 10 May 2011 08:34:38 -0500
Subject: [Python-Dev] [Python-checkins] cpython:
 _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
In-Reply-To: <1305014609.2014.6.camel@marge>
References: <E1QIf1U-0001ch-SK@dinsdale.python.org>
	<BANLkTimm=KLc0XmCbqM7Vx5tBE=C=Kmneg@mail.gmail.com>
	<1304950275.22910.32.camel@marge>
	<BANLkTinbZ91fp1-d7XxN1Ej+xBbvfts-Ew@mail.gmail.com>
	<1304989034.29582.7.camel@marge>
	<BANLkTim_KJOPR9rddRHOEdxaLYnLPfB3pg@mail.gmail.com>
	<BANLkTi=qR=BKfrfr4_7sdVROfqt3xG9F5Q@mail.gmail.com>
	<1305014609.2014.6.camel@marge>
Message-ID: <BANLkTinZjexmqhFfBshL3NANangZE8o7AA@mail.gmail.com>

On Tue, May 10, 2011 at 03:03, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> If GetProcAddress() expects a byte string encoded to the ANSI code page,
> my patch is correct because the function used the UTF-8 encoding, not
> the ANSI code page. We can maybe use GetProcAddressW() to pass a Unicode
> string. I don't know which encoding is used by GetProcAddressW()...

While I can find references to a GetProcAddressW, most of them seem to
agree it doesn't exist. "My kernel32.dll only exports GetProcAddress."
This suggests to me it accepts a null-terminated bytestring instead of
specifically an ANSI string. What data ends up in the export table is
likely similar to the linux filesystem case, only with less likelihood
of the environment telling you its encoding.

> I already patched _PyImport_GetDynLoadFunc() for Windows: the path is
> now a Unicode object instead of a byte string encoded to the filesystem
> encoding. _PyImport_GetDynLoadWindows() uses GetFullPathNameW() and
> LoadLibraryExW(). The work to be fully Unicode compliant (for the path
> field, not for the name) is not completely done... but I have a pending
> patch, see:
> http://bugs.python.org/issue11619
>
> But this patch is huge and creates many functions. I am not sure that we
> need it, I will work on this later.

I'm comfortable with the idea of requiring UTF-8 encoding for the
initmodule entry points of modules named with non-ASCII identifiers,
especially if there is nothing which works consistently today. I've
only seen pure-ASCII library names in all my C++ work, so I feel it
borders on YAGNI (but I like it in theory).

As an alternate approach, one article I read suggested using ordinals
instead of names if you wanted to use non-ASCII names. Python could
certainly try to load by ordinal on Windows, and fall back to loading
by name. I don't have a clue what the rate of false positives would
be.

-- 
Michael Urman

From phd at phdru.name  Tue May 10 15:45:44 2011
From: phd at phdru.name (Oleg Broytman)
Date: Tue, 10 May 2011 17:45:44 +0400
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <20110510133314.66B48250041@webabinitio.net>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com>
	<BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
	<4DC84235.4060600@trueblade.com>
	<BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
	<20110510133314.66B48250041@webabinitio.net>
Message-ID: <20110510134544.GA9665@iskra.aviel.ru>

On Tue, May 10, 2011 at 09:33:13AM -0400, R. David Murray wrote:
> commit:
>     11999: sync based on comparing mtimes, not mtime to system clock
> NEWS:
>     Issue 11999: fixed sporadic sync failure mailbox.Maildir due to its
>     trying to detect mtime changes by comparing to the system clock
>     instead of to the previous value of the mtime.
> 
> commit:
>     #11873: Improve test regex so random directory names don't cause test to fail
> NEWS:
>     Issue #11873: Change regex in test_compileall to fix occasional
>     failures when the randomly generated temporary path happened to
>     match the regex.
> 
> You will note the *active* verbs "fixed", "improve", and "change"
> figure in there prominently :)

   Why is "fixed" in the past tense, but "improve" and "change" in the
present tense?

   I use the past tense to describe what I did to the code, and the simple
present to describe what the new code does when running. For example:

"Fixed a bug in time comparison: compare mtime to mtime, not mtime to system clock"

   I.e., "fixed" is what I did, and "compare" is what the code does.

(I used an excerpt from above only for the example, not to correct
something.)

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

From eliben at gmail.com  Tue May 10 16:02:07 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Tue, 10 May 2011 17:02:07 +0300
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <20110510131144.C8D75250041@webabinitio.net>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
	<BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
	<20110510131144.C8D75250041@webabinitio.net>
Message-ID: <BANLkTimP6He_jqQjBi0WQ38Ckxk+ZoZ=YQ@mail.gmail.com>

On Tue, May 10, 2011 at 16:11, R. David Murray <rdmurray at bitdance.com> wrote:

> On Tue, 10 May 2011 08:36:38 +0300, Eli Bendersky <eliben at gmail.com>
> wrote:
> > With an unlimited error message length it could make sense to say "Hey, I
> > see 'x' may be assigned in this scope, so I mark it local. But this access
> > to 'x' happens before assignment - so ERROR". This isn't realistic, of
> > course, so I'm wondering:
> >
> > 1. Does this error message (although unrealistic) capture all possible
> > appearances of UnboundLocalError?
> > 2. If the answer to (1) is yes - could it be usefully shortened to be
> > clearer than the current "local variable referenced before assignment"?
> >
> > This may not be possible, of course, but it doesn't harm trying :-)
>
> How about:
>
> "reference to variable 'y' precedes an assignment that makes it a local
> variable"


Yes, this is much better and not too long IMHO
Eli

From rdmurray at bitdance.com  Tue May 10 16:46:18 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 10 May 2011 10:46:18 -0400
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <20110510134544.GA9665@iskra.aviel.ru>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com>
	<BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
	<4DC84235.4060600@trueblade.com>
	<BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
	<20110510133314.66B48250041@webabinitio.net>
	<20110510134544.GA9665@iskra.aviel.ru>
Message-ID: <20110510144618.DEC5B250041@webabinitio.net>

On Tue, 10 May 2011 17:45:44 +0400, Oleg Broytman <phd at phdru.name> wrote:
> On Tue, May 10, 2011 at 09:33:13AM -0400, R. David Murray wrote:
> > commit:
> >     11999: sync based on comparing mtimes, not mtime to system clock
> > NEWS:
> >     Issue 11999: fixed sporadic sync failure mailbox.Maildir due to its
> >     trying to detect mtime changes by comparing to the system clock
> >     instead of to the previous value of the mtime.
> > 
> > commit:
> >     #11873: Improve test regex so random directory names don't cause test to fail
> > NEWS:
> >     Issue #11873: Change regex in test_compileall to fix occasional
> >     failures when the randomly generated temporary path happened to
> >     match the regex.
> > 
> > You will note the *active* verbs "fixed", "improve", and "change"
> > figure in there prominently :)
> 
>    Why is "fixed" in the past tense, but "improve" and "change" in the
> present tense?
> 
>    I use the past tense to describe what I did to the code, and the simple
> present to describe what the new code does when running. For example:
> 
> "Fixed a bug in time comparison: compare mtime to mtime, not mtime to system clock"
> 
>    I.e., "fixed" is what I did, and "compare" is what the code does.
> 
> (I used an excerpt from above only for the example, not to correct
> something.)

Yes, that's a good point.  I'll try to be more consistent about that
in the future.  Change should have been Changed.

--
R. David Murray           http://www.bitdance.com

From ncoghlan at gmail.com  Tue May 10 16:59:08 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 11 May 2011 00:59:08 +1000
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <20110510131144.C8D75250041@webabinitio.net>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
	<BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
	<20110510131144.C8D75250041@webabinitio.net>
Message-ID: <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>

On Tue, May 10, 2011 at 11:11 PM, R. David Murray <rdmurray at bitdance.com> wrote:
> How about:
>
> "reference to variable 'y' precedes an assignment that makes it a local
> variable"

For comparison, the error messages I was able to elicit from 2.7 were
as follows:

# Module level
NameError: name 'bob' is not defined

# Function level reference to implicit global
NameError: global name 'bob' is not defined

# Early reference to local
UnboundLocalError: local variable 'bob' referenced before assignment

# Early reference from closure
NameError: free variable 'bob' referenced before assignment in enclosing scope

Personally, I would just add "in current scope" to the existing error
message for the unbound local case (and potentially collapse the
exception hierarchy a bit by setting UnboundLocalError = NameError).
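
The cases listed above can be reproduced with a short script. This is a
sketch rather than a record of the 2.7 session: the function names are
invented for the example, and only the exception classes are checked
because the message wording differs between Python versions.

```python
def implicit_global():
    # No assignment to 'bob' in this scope, so it is looked up as a global.
    return bob


def early_local():
    # The assignment below makes 'bob' local to the whole function body,
    # so this read happens before the local name is bound.
    value = bob
    bob = 1
    return value


def early_closure():
    def inner():
        # 'bob' is a free variable bound in the enclosing scope.
        return bob
    result = inner()    # called before the enclosing scope binds 'bob'
    bob = 1
    return result


def error_class(func):
    """Return the class name of the NameError (or subclass) raised by func."""
    try:
        func()
    except NameError as exc:    # also catches UnboundLocalError
        return type(exc).__name__
    return None


for func in (implicit_global, early_local, early_closure):
    print(func.__name__, "->", error_class(func))
```

UnboundLocalError is already a subclass of NameError, which is why a
single "except NameError" clause covers all three functions.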

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rdmurray at bitdance.com  Tue May 10 19:31:04 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 10 May 2011 13:31:04 -0400
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
	<BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
	<20110510131144.C8D75250041@webabinitio.net>
	<BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
Message-ID: <20110510173107.74C9B250041@webabinitio.net>

On Wed, 11 May 2011 00:59:08 +1000, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Tue, May 10, 2011 at 11:11 PM, R. David Murray <rdmurray at bitdance.com> wrote:
> > How about:
> >
> > "reference to variable 'y' precedes an assignment that makes it a local
> > variable"
> 
> For comparison, the error messages I was able to elicit from 2.7 were
> as follows:
> 
> # Module level
> NameError: name 'bob' is not defined
> 
> # Function level reference to implicit global
> NameError: global name 'bob' is not defined
> 
> # Early reference to local
> UnboundLocalError: local variable 'bob' referenced before assignment
> 
> # Early reference from closure
> NameError: free variable 'bob' referenced before assignment in enclosing scope
> 
> Personally, I would just add "in current scope" to the existing error
> message for the unbound local case (and potentially collapse the
> exception hierarchy a bit by setting UnboundLocalError = NameError).

I don't think adding that phrase would add any clarity, myself.
The mental disconnect comes from the fact that the UnboundLocal error
message is emitted for the reference, but it is not immediately obvious
*why* the variable is considered local.  My rephrasing emphasizes that it
is the assignment statement that led to that classification and therefore
the error.  This disconnect doesn't apply in the global cases.  It applies
less strongly in the free variable case because there is visibly another
scope involved (that is, the triggering assignment isn't in the same
scope as the reference producing the error message).

--
R. David Murray           http://www.bitdance.com

From tjreedy at udel.edu  Tue May 10 19:56:58 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 10 May 2011 13:56:58 -0400
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>	<BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>	<20110510131144.C8D75250041@webabinitio.net>
	<BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
Message-ID: <iqbu9b$981$1@dough.gmane.org>

On 5/10/2011 10:59 AM, Nick Coghlan wrote:
> On Tue, May 10, 2011 at 11:11 PM, R. David Murray<rdmurray at bitdance.com>  wrote:
>> How about:
>>
>> "reference to variable 'y' precedes an assignment that makes it a local
>> variable"
>
> For comparison, the error messages I was able to elicit from 2.7 were
> as follows:
>
> # Module level
> NameError: name 'bob' is not defined
>
> # Function level reference to implicit global
> NameError: global name 'bob' is not defined
>
> # Early reference to local
> UnboundLocalError: local variable 'bob' referenced before assignment

I would change this to
"local name 'bob' used before the assignment that makes it a local name"

Calling names 'variables' is itself a point of confusion.
>
> # Early reference from closure
> NameError: free variable 'bob' referenced before assignment in enclosing scope
>
> Personally, I would just add "in current scope" to the existing error
> message for the unbound local case (and potentially collapse the
> exception hierarchy a bit by setting UnboundLocalError = NameError).
>
> Cheers,
> Nick.
>


-- 
Terry Jan Reedy


From rdmurray at bitdance.com  Tue May 10 20:31:17 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 10 May 2011 14:31:17 -0400
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <iqbu9b$981$1@dough.gmane.org>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
	<BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
	<20110510131144.C8D75250041@webabinitio.net>
	<BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
	<iqbu9b$981$1@dough.gmane.org>
Message-ID: <20110510183118.6D2B8250041@webabinitio.net>

On Tue, 10 May 2011 13:56:58 -0400, Terry Reedy <tjreedy at udel.edu> wrote:
> On 5/10/2011 10:59 AM, Nick Coghlan wrote:
> > On Tue, May 10, 2011 at 11:11 PM, R. David Murray<rdmurray at bitdance.com>  wrote:
> >> How about:
> >>
> >> "reference to variable 'y' precedes an assignment that makes it a local
> >> variable"
> >
> > For comparison, the error messages I was able to elicit from 2.7 were
> > as follows:
> >
> > # Module level
> > NameError: name 'bob' is not defined
> >
> > # Function level reference to implicit global
> > NameError: global name 'bob' is not defined
> >
> > # Early reference to local
> > UnboundLocalError: local variable 'bob' referenced before assignment
> 
> I would change this to
> "local name 'bob' used before the assignment that makes it a local name"
> 
> Calling names 'variables' is itself a point of confusion.

Yes, your phrasing is much better.

--
R. David Murray           http://www.bitdance.com

From eliben at gmail.com  Tue May 10 20:59:04 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Tue, 10 May 2011 21:59:04 +0300
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <20110510183118.6D2B8250041@webabinitio.net>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
	<BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
	<20110510131144.C8D75250041@webabinitio.net>
	<BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
	<iqbu9b$981$1@dough.gmane.org>
	<20110510183118.6D2B8250041@webabinitio.net>
Message-ID: <BANLkTimk+7Qv-_AJF6oRif1oGX=4uFe1+A@mail.gmail.com>

<snip>

> > > # Early reference to local
> > > UnboundLocalError: local variable 'bob' referenced before assignment
> >
> > I would change this to
> > "local name 'bob' used before the assignment that makes it a local name"
> >
> > Calling names 'variables' is itself a point of confusion.
>
> Yes, your phrasing is much better.
>

+1

From steve at pearwood.info  Wed May 11 00:38:53 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 11 May 2011 08:38:53 +1000
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>	<BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>	<20110510131144.C8D75250041@webabinitio.net>
	<BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
Message-ID: <4DC9BE7D.1070800@pearwood.info>

Nick Coghlan wrote:

> Personally, I would just add "in current scope" to the existing error
> message for the unbound local case (and potentially collapse the
> exception hierarchy a bit by setting UnboundLocalError = NameError).

-0

That was the case prior to Python 2.0. Reverting is potentially a 
semantic change that will break any code that distinguishes between 
(global) NameError and (local) UnboundLocalError. But personally, I 
don't know why it was thought necessary to distinguish between them in 
the first place.




-- 
Steven

From fdrake at acm.org  Wed May 11 01:37:09 2011
From: fdrake at acm.org (Fred Drake)
Date: Tue, 10 May 2011 19:37:09 -0400
Subject: [Python-Dev] more timely detection of unbound locals
In-Reply-To: <4DC9BE7D.1070800@pearwood.info>
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>
	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>
	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>
	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>
	<BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>
	<20110510131144.C8D75250041@webabinitio.net>
	<BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
	<4DC9BE7D.1070800@pearwood.info>
Message-ID: <BANLkTi=MeFiNBKm5RgsYhYrmPisjRugW5A@mail.gmail.com>

On Tue, May 10, 2011 at 6:38 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> I don't know why it was thought necessary to distinguish between them in the
> first place.

New users almost constantly expressed confusion at getting a NameError when
the name was clearly bound at global scope, and a subsequent assignment
caused it to be considered a local in their function.
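
A minimal sketch of the situation described above, with names invented
for the example: the name is bound at module level, but an assignment
later in the function makes it local there, so the read fails.

```python
counter = 0     # clearly bound at global scope


def bump():
    # The assignment makes 'counter' local to bump(), so the read on the
    # right-hand side happens before the local name is bound.
    counter = counter + 1
    return counter


def bump_global():
    global counter      # opt back in to the module-level name
    counter = counter + 1
    return counter


try:
    bump()
except UnboundLocalError as exc:
    print("bump() failed:", exc)

print("bump_global() ->", bump_global())
```

The global declaration is one way to get the behaviour such users
expected; the sketch takes no position on the wording of the message.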


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at acm.org>
"Give me the luxuries of life and I will willingly do without the necessities."
   --Frank Lloyd Wright

From palla74 at gmail.com  Wed May 11 11:51:59 2011
From: palla74 at gmail.com (Palla)
Date: Wed, 11 May 2011 11:51:59 +0200
Subject: [Python-Dev] EuroPython: Early Bird will end in 2 days!
Message-ID: <BANLkTikg6L64aJET4yH1mSs3AgsARNQ7XA@mail.gmail.com>

Hi all,

If you plan to attend, you could save quite a bit on registration fees!

Early bird registration ends on May 12th, Friday, 23:59:59 CEST. We'd like
to ask you to forward this post to anyone who you feel may be
interested.

We have an amazing lineup of tutorials, events and talks. We have some
excellent keynote speakers and a very complete partner program... but
early bird registration ends in 2 days!

Right now, you still get discounts on talks and tutorials so if you
plan to attend Register Now:

http://ep2011.europython.eu/registration/

While you are booking, remember to have a look at the partner program
and our offer for a prepaid, data+voice+tethering SIM.

All the best,


-- 
->PALLA

From merwok at netwok.org  Wed May 11 18:38:53 2011
From: merwok at netwok.org (Éric Araujo)
Date: Wed, 11 May 2011 18:38:53 +0200
Subject: [Python-Dev] Commit changelog: issue number and merges
In-Reply-To: <20110509175447.4DC56250039@webabinitio.net>
References: <1304937168.22910.21.camel@marge>
	<BANLkTimfd7d_U-ffnuH_KOdURRkmFQgM0w@mail.gmail.com>
	<7fa082450fb750d082e71d5070a62171@netwok.org>
	<20110509175447.4DC56250039@webabinitio.net>
Message-ID: <5899b3f80aa1a05b5f6f3364eebde44e@netwok.org>

 On 09/05/2011 19:54, R. David Murray wrote:
>>> No it isn't.  The commit message isn't pulled into the new branch.
>>  Sorry, your terminology does not make sense.  If you mean that the
>>  commit message is not reused in the new commit after the merge, it's
>>  true.  However, the commit message with the relevant information is
>>  available as part of the changesets that have been pulled and merged.
>
> The changesets are in the repository and there are pointers to them
> from the merge changeset, sure, but the data isn't in the checkout
> (that's how I understood "pulled in to the new branch").

 No commit message is ever in the checkout, so I don't follow you.

> If I do 'hg log' and search for a revno (that I got from hg annotate),
> the commit message describing the change is not attached to that revno,

 Ah, I understand your problem now.  I would not object to a policy
 requiring helpful information in merge changeset commit
 messages, like "Merge fixes for #4444 and #5555" or "Merge doc fixes"
 when there are no bug reports.

 I'm not sure about the "atomic" merge changesets idea that someone else
 expressed; I don't think it would be that useful.

> nor as far as I know is there a tool that makes it easy to get from that
> revno to the explanatory commit message.  That's what Victor and I are
> talking about.  Is there a tool that fixes this problem?

 I tend to use graphical tools for history viewing.  I like the GTK
 version of TortoiseHg, or failing that the graph displayed by "hg serve"
 if you enable the graphlog extension and use a browser with JavaScript.

From merwok at netwok.org  Wed May 11 18:39:21 2011
From: merwok at netwok.org (Éric Araujo)
Date: Wed, 11 May 2011 18:39:21 +0200
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <loom.20110509T193140-280@post.gmane.org>
References: <loom.20110509T193140-280@post.gmane.org>
Message-ID: <d1989510eccc69219dc75384faf7be23@netwok.org>

 Hi,

> That's right, though it's OK to provide a documented convenience API for
> adding handlers.

 I think I'll aim for simplicity.  We'll document that we use the logger
 "packaging" throughout and let people use getLogger and addHandler with
 that.

> You don't necessarily need to set the level on the handler - why can you
> not just set it on the logger? The effect would often be the same: the
> logger's level is checked first, and then the handler's level.

 I thought that if we set the level on the logger, we would prevent
 third-party code from getting some messages.  E.g., we set the level to INFO
 but pip uses some packaging functions and would like to get DEBUG messages.
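
 The difference between the two level checks can be sketched with the
 standard logging module. The ListHandler class and the messages are
 invented for the example, and the second handler only stands in for what
 a third-party tool might attach; this is not packaging's actual setup.

```python
import logging


class ListHandler(logging.Handler):
    """Example handler that collects formatted messages in a list."""

    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("packaging")
logger.propagate = False            # keep the sketch independent of root config
logger.setLevel(logging.DEBUG)      # the logger itself lets everything through

own = ListHandler(logging.INFO)             # our handler: INFO and above
third_party = ListHandler(logging.DEBUG)    # someone else's: wants DEBUG too
logger.addHandler(own)
logger.addHandler(third_party)

logger.debug("resolving dependencies")
logger.info("installing")

# With the level on the logger instead, the record is dropped before any
# handler sees it, whatever level the handlers ask for.
logger.setLevel(logging.INFO)
logger.debug("dropped by the logger")

print(own.messages)
print(third_party.messages)
```

 Setting the threshold on the handler keeps DEBUG records available to
 other handlers, which is the concern raised above.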

 Regards

From merwok at netwok.org  Wed May 11 18:39:54 2011
From: merwok at netwok.org (=?UTF-8?Q?=C3=89ric_Araujo?=)
Date: Wed, 11 May 2011 18:39:54 +0200
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <20110510144618.DEC5B250041@webabinitio.net>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com>
	<BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
	<4DC84235.4060600@trueblade.com>
	<BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
	<20110510133314.66B48250041@webabinitio.net>
	<20110510134544.GA9665@iskra.aviel.ru>
	<20110510144618.DEC5B250041@webabinitio.net>
Message-ID: <ed9491d44d238df972576bf8b215c91b@netwok.org>

 On 10/05/2011 16:46, R. David Murray wrote:
> On Tue, 10 May 2011 17:45:44 +0400, Oleg Broytman <phd at phdru.name> wrote:
>>    Why is "fixed" in the past tense, but "improve" and "change" in the
>> present tense?
>>    I use the past tense to describe what I did to the code, and the simple
>> present to describe what the new code does when running. For example:

 Funny, I always use the present tense, to convey what the code does now.

From merwok at netwok.org  Wed May 11 19:05:48 2011
From: merwok at netwok.org (Éric Araujo)
Date: Wed, 11 May 2011 19:05:48 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): (Merge 3.1) Issue
 #12012: ssl.PROTOCOL_SSLv2 becomes optional
In-Reply-To: <E1QJaFY-00046A-Tn@dinsdale.python.org>
References: <E1QJaFY-00046A-Tn@dinsdale.python.org>
Message-ID: <c5683e5e669b57e6645e01eb80501fa9@netwok.org>

 On 10/05/2011 01:52, victor.stinner wrote:
> http://hg.python.org/cpython/rev/3c87a13980be
> changeset:   70001:3c87a13980be
> branch:      2.7
> parent:      69996:c9f07c69b138
> user:        Victor Stinner <victor.stinner at haypocalc.com>
> date:        Tue May 10 01:52:03 2011 +0200
> summary:
>   (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional

 "(Merge 3.1)" is inaccurate for 2.7.

 Regards

From tjreedy at udel.edu  Wed May 11 19:45:39 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 11 May 2011 13:45:39 -0400
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <ed9491d44d238df972576bf8b215c91b@netwok.org>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com>
	<BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
	<4DC84235.4060600@trueblade.com>
	<BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
	<20110510133314.66B48250041@webabinitio.net>
	<20110510134544.GA9665@iskra.aviel.ru>
	<20110510144618.DEC5B250041@webabinitio.net>
	<ed9491d44d238df972576bf8b215c91b@netwok.org>
Message-ID: <iqei04$kic$1@dough.gmane.org>

On 5/11/2011 12:39 PM, Éric Araujo wrote:

> Funny, I always use the present tense, to convey what the code does now.

Which code ;-).

At the moment you write a push message, your private clone does 
something different from the public repository (and other private 
clones). At the moment people read a push message, they may not have 
pulled the change, so that there is a difference between the repository 
and *their* clone. Besides the ambiguity, there is also inconsistency 
between writers. Hence my request for a few clarifying keystrokes when 
needed.

-- 
Terry Jan Reedy



From victor.stinner at haypocalc.com  Wed May 11 20:08:49 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 11 May 2011 20:08:49 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): (Merge 3.1) Issue
 #12012: ssl.PROTOCOL_SSLv2 becomes optional
In-Reply-To: <c5683e5e669b57e6645e01eb80501fa9@netwok.org>
References: <E1QJaFY-00046A-Tn@dinsdale.python.org>
	<c5683e5e669b57e6645e01eb80501fa9@netwok.org>
Message-ID: <1305137329.12577.1.camel@marge>

On Wednesday 11 May 2011 at 19:05 +0200, Éric Araujo wrote:
> On 10/05/2011 01:52, victor.stinner wrote:
> > http://hg.python.org/cpython/rev/3c87a13980be
> > changeset:   70001:3c87a13980be
> > branch:      2.7
> > parent:      69996:c9f07c69b138
> > user:        Victor Stinner <victor.stinner at haypocalc.com>
> > date:        Tue May 10 01:52:03 2011 +0200
> > summary:
> >   (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional
> 
>  "(Merge 3.1)" is inaccurate for 2.7.

Ah, why? I did not use "hg merge" command (but hg export|hg import), but
it's a "merge" between two branches. Which term would you use?

Victor


From guido at python.org  Wed May 11 20:48:50 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 11 May 2011 11:48:50 -0700
Subject: [Python-Dev] Commit messages: please avoid temporal ambiguity
In-Reply-To: <ed9491d44d238df972576bf8b215c91b@netwok.org>
References: <iq9802$e9m$1@dough.gmane.org> <4DC8343C.2050005@nedbatchelder.com>
	<BANLkTi=32fwy6+SC1vrDGZ6he_fvNcrX-g@mail.gmail.com>
	<4DC84235.4060600@trueblade.com>
	<BANLkTinRySS374X9yatULztpG7MLxD_SJg@mail.gmail.com>
	<20110510133314.66B48250041@webabinitio.net>
	<20110510134544.GA9665@iskra.aviel.ru>
	<20110510144618.DEC5B250041@webabinitio.net>
	<ed9491d44d238df972576bf8b215c91b@netwok.org>
Message-ID: <BANLkTi=NSeCknTq4wrJ66J_Ce3x-+otU6A@mail.gmail.com>

On Wed, May 11, 2011 at 9:39 AM, Éric Araujo <merwok at netwok.org> wrote:
> Funny, I always use the present tense, to convey what the code does now.

Yeah, and that's exactly what I am objecting to. Please describe what
changed how, since that is the focus of the patch.

-- 
--Guido van Rossum (python.org/~guido)

From vinay_sajip at yahoo.co.uk  Wed May 11 21:45:12 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Wed, 11 May 2011 19:45:12 +0000 (UTC)
Subject: [Python-Dev] Problems with regrtest and with logging
References: <loom.20110509T193140-280@post.gmane.org>
	<d1989510eccc69219dc75384faf7be23@netwok.org>
Message-ID: <loom.20110511T213726-472@post.gmane.org>

Éric Araujo <merwok <at> netwok.org> writes:

>  I thought that if we set the level on the logger, we would prevent 
>  third-party code from getting some messages.  E.g., we set level to INFO but 
>  pip uses some packaging functions and would like to get DEBUG messages.

Then pip can set the level of the packaging logger as it wishes, perhaps in
response to command-line arguments for verbosity. It'd be easier for pip to do
that, regardless of which handlers are attached. And pip itself might be being
used, say by virtualenv. It's hard in general to say what the top-level code
will be, and generally that's the code which should set the handlers.

The levels set by a library for its loggers are merely defaults. Applications
using the library can choose to override those levels as they wish.
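
A minimal sketch of that division of labour, assuming a hypothetical library
logger named "packaging_demo" and a small list-collecting handler standing in
for whatever the top-level application would really attach. The library only
sets a default level; the application attaches the handler and may override
the level:

```python
import logging

# --- Library side (hypothetical logger name, for illustration only) ---
lib_logger = logging.getLogger("packaging_demo")
lib_logger.setLevel(logging.INFO)              # merely a default
lib_logger.addHandler(logging.NullHandler())   # the library attaches no real handler


def library_work():
    lib_logger.debug("detailed step")
    lib_logger.info("step finished")


# --- Application side: the top-level code decides on handlers ---
class ListHandler(logging.Handler):
    """Hypothetical handler that just collects formatted messages."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
root = logging.getLogger()
root.addHandler(handler)

# With the library default (INFO), the DEBUG message is dropped.
library_work()
default_messages = list(handler.messages)

# The application overrides the library's default level.
handler.messages.clear()
logging.getLogger("packaging_demo").setLevel(logging.DEBUG)
library_work()
verbose_messages = list(handler.messages)

root.removeHandler(handler)
print(default_messages)
print(verbose_messages)
```

In this sketch the application could equally well pick the level from a
command-line verbosity flag; the library code does not need to change.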

Regards,

Vinay Sajip




From tjreedy at udel.edu  Wed May 11 23:12:39 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 11 May 2011 17:12:39 -0400
Subject: [Python-Dev] [Python-checkins] cpython (2.7): (Merge 3.1) Issue
 #12012: ssl.PROTOCOL_SSLv2 becomes optional
In-Reply-To: <1305137329.12577.1.camel@marge>
References: <E1QJaFY-00046A-Tn@dinsdale.python.org>	<c5683e5e669b57e6645e01eb80501fa9@netwok.org>
	<1305137329.12577.1.camel@marge>
Message-ID: <iqeu47$udg$1@dough.gmane.org>

On 5/11/2011 2:08 PM, Victor Stinner wrote:
> On Wednesday 11 May 2011 at 19:05 +0200, Éric Araujo wrote:
>> On 10/05/2011 01:52, victor.stinner wrote:
>>> http://hg.python.org/cpython/rev/3c87a13980be
>>> changeset:   70001:3c87a13980be
>>> branch:      2.7
>>> parent:      69996:c9f07c69b138
>>> user:        Victor Stinner<victor.stinner at haypocalc.com>
>>> date:        Tue May 10 01:52:03 2011 +0200
>>> summary:
>>>    (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional
>>
>>   "(Merge 3.1)" is inaccurate for 2.7.
>
> Ah, why? I did not use "hg merge" command (but hg export|hg import), but
> it's a "merge" between two branches. Which term would you use?

export/import sounds like transport: "(transport from 3.1)" would be 
clear enough to me.

-- 
Terry Jan Reedy



From pythondev at genstein.net  Thu May 12 04:35:16 2011
From: pythondev at genstein.net (Genstein)
Date: Thu, 12 May 2011 03:35:16 +0100
Subject: [Python-Dev] py3k buffered I/O - flush() required between
	read/write?
Message-ID: <4DCB4764.5080902@genstein.net>

Hi all,

Sincere apologies for posting a question without lurking for a while 
first. I'm not sure whether I'm being dumb (which is very plausible) or 
whether this is a potential bug. I asked on comp.lang.python but 
responses were equivocal, so I'm following the README.txt advice and 
asking here. If I'm out of line, do feel free to slap me down viciously, 
remove me from the list, or whatever seems most appropriate.

Under py3k, is it necessary to flush() a file between buffered 
read/write calls in order to see consistent results? I have a case under 
Python 3.2 (r32:88445) where I see different results depending on 
whether buffering is active, on Gentoo Linux and Windows Vista. Perusing 
the docs and PEPs I couldn't seem to find an answer; I did find 
bufferedio.c's comment: "BufferedReader, BufferedWriter and 
BufferedRandom...share a single buffer...this enables interleaved reads 
and writes without flushing" which is suggestive but I may be taking it 
out of context.

The following is the smallest code I can conjure which demonstrates the 
issue I'm seeing:

[code]
START = 0
MID = 1
LENGTH = 4

def test(buffering):
     f = open("test.bin", "w+b", buffering = buffering)
     for i in range(LENGTH):
         f.write(b'\x00')
     f.seek(MID)
     f.read(1)
     f.write(b'\x00')
     f.seek(MID)
     f.write(b'\x01')
     f.seek(START)
     f.seek(MID)
     print(f.read(1))
     f.close()

print("Buffered result: ")
test(-1)
print("Unbuffered result:")
test(0)
[end code]

Output on both Gentoo and Vista is:
     Buffered result:
     b'\x00'
     Unbuffered result:
     b'\x01'

I expected the results to be the same, but they aren't. The issue is 
reproducible with larger files provided that the constants are increased 
~proportionally (START 0, MID 500, LENGTH 1000 for example). Transposing 
the buffered/unbuffered tests and/or using different buffer sizes for 
the buffered test seem to have no effect.
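
For anyone who wants to re-run the comparison without leaving test.bin
behind, here is a hedged variant of the same sequence of calls. It is only a
sketch: the helper name read_back and the use of tempfile are additions for
illustration, and what the buffered case returns depends on the interpreter
version being tested.

```python
import os
import tempfile

START = 0
MID = 1
LENGTH = 4


def read_back(buffering):
    """Run the write/seek/read sequence from the report and return the final byte."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        with open(path, "w+b", buffering=buffering) as f:
            for _ in range(LENGTH):
                f.write(b"\x00")
            f.seek(MID)
            f.read(1)
            f.write(b"\x00")
            f.seek(MID)
            f.write(b"\x01")
            f.seek(START)
            f.seek(MID)
            return f.read(1)
    finally:
        os.remove(path)


buffered = read_back(-1)    # default buffer size
unbuffered = read_back(0)   # raw, unbuffered file object
print("Buffered result:  ", buffered)
print("Unbuffered result:", unbuffered)
print("Results agree:    ", buffered == unbuffered)
```

The unbuffered run goes straight to the raw file, so it serves as the
reference value against which the buffered result can be compared.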

Apologies once more if I'm wasting your time.

All the best,

     -eg.


PS. By way of entirely belated introduction, I'm a UK software developer 
with a background mostly in C#, C++ and Lua in both "real software" and 
commercial games. In my spare time I mostly write code (curiously I 
don't know many developers who do; I suspect I just know the wrong 
people.) I perpetrated the Trizbort mapper for interactive fiction which 
doubtless nobody will have heard of, and with good reason. I'm toying 
with Python as a genuinely portable alternative to C# for my own 
projects, and so far loving it.


From solipsis at pitrou.net  Thu May 12 12:47:27 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 12 May 2011 12:47:27 +0200
Subject: [Python-Dev] py3k buffered I/O - flush() required between
	read/write?
References: <4DCB4764.5080902@genstein.net>
Message-ID: <20110512124727.3f4d921e@pitrou.net>


Hello,

On Thu, 12 May 2011 03:35:16 +0100
Genstein <pythondev at genstein.net> wrote:
> 
> The following is the smallest code I can conjure which demonstrates the 
> issue I'm seeing:

This is a bug indeed. Can you report it on http://bugs.python.org ?

Thanks a lot for finding this,

Antoine.



From pythondev at genstein.net  Thu May 12 15:22:22 2011
From: pythondev at genstein.net (Genstein)
Date: Thu, 12 May 2011 14:22:22 +0100
Subject: [Python-Dev] py3k buffered I/O - flush() required
	between	read/write?
In-Reply-To: <20110512124727.3f4d921e@pitrou.net>
References: <4DCB4764.5080902@genstein.net>
	<20110512124727.3f4d921e@pitrou.net>
Message-ID: <4DCBDF0E.4070504@genstein.net>

On 12/05/2011 11:47, Antoine Pitrou wrote:
> This is a bug indeed. Can you report it on http://bugs.python.org ?
>
> Thanks a lot for finding this,
>
> Antoine.
>
Duly reported as http://bugs.python.org/issue12062.

I'm glad it wasn't me being dumb(er than usual). It took a while to pin 
down to a small reproducible case.

Thanks for the fast and definite response, I'll cheerfully revert to 
lurking now ;)

All the best,

     -eg.


From skip at montanaro.dyndns.org  Thu May 12 18:33:37 2011
From: skip at montanaro.dyndns.org (Skip Montanaro)
Date: Thu, 12 May 2011 11:33:37 -0500 (CDT)
Subject: [Python-Dev] Could these restrictions be removed?
Message-ID: <20110512163337.3758D12B7749@montanaro.dyndns.org>


A friend at work who is new to Python wondered why this didn't work with
pickle:

    class Outer:

        class Inner:

            ...

        def __init__(self):
            self.i = Outer.Inner()

I explained:

> http://docs.python.org/library/pickle.html#what-can-be-pickled-and-unpickled 
> 
> 
>  From that:
> 
>     # functions defined at the top level of a module
>     # built-in functions defined at the top level of a module
>     # classes that are defined at the top level of a module

I've never questioned this, but I wonder, is this a fundamental restriction
or could it be overcome with a modest amount of work?

Just curious...

Skip


From walter at livinglogic.de  Thu May 12 18:58:12 2011
From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=)
Date: Thu, 12 May 2011 18:58:12 +0200
Subject: [Python-Dev] Could these restrictions be removed?
In-Reply-To: <4DCC10A3.9000209@livinglogic.de>
References: <20110512163337.3758D12B7749@montanaro.dyndns.org>
	<4DCC10A3.9000209@livinglogic.de>
Message-ID: <4DCC11A4.4050101@livinglogic.de>

On 12.05.11 18:53, Walter Dörwald wrote:

> On 12.05.11 18:33, skip at pobox.com wrote:
> 
>> A friend at work who is new to Python wondered why this didn't work with
>> pickle:
>>
>>     class Outer:
>>
>>         class Inner:
>>
>>             ...
>>
>>         def __init__(self):
>>             self.i = Outer.Inner()
>>
>> I explained:
>>
>>> http://docs.python.org/library/pickle.html#what-can-be-pickled-and-unpickled 
>>>
>>>
>>>  From that:
>>>
>>>     # functions defined at the top level of a module
>>>     # built-in functions defined at the top level of a module
>>>     # classes that are defined at the top level of a module
>>
>> I've never questioned this, but I wonder, is this a fundamental restriction
>> or could it be overcome with a modest amount of work?
> 
> This is related to http://bugs.python.org/issue633930

See also the thread started at:

   http://mail.python.org/pipermail/python-dev/2005-March/052454.html

Servus,
   Walter

From solipsis at pitrou.net  Thu May 12 19:05:46 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 12 May 2011 19:05:46 +0200
Subject: [Python-Dev] Could these restrictions be removed?
References: <20110512163337.3758D12B7749@montanaro.dyndns.org>
Message-ID: <20110512190546.116a9a91@pitrou.net>

On Thu, 12 May 2011 11:33:37 -0500 (CDT)
Skip Montanaro <skip at montanaro.dyndns.org> wrote:
> 
> A friend at work who is new to Python wondered why this didn't work with
> pickle:
> 
>     class Outer:
> 
>         class Inner:
> 
>             ...
> 
>         def __init__(self):
>             self.i = Outer.Inner()
> 
[...]
> 
> I've never questioned this, but I wonder, is this a fundamental restriction
> or could it be overcome with a modest amount of work?

pickle uses heuristics to try to find out the "official name" of a
class or function. It would be a matter of improving these heuristics.

There are other cases in which pickle similarly fails:

>>> pickle.dumps(random.random)
b'\x80\x03crandom\nrandom\nq\x00.'
>>> pickle.dumps(random.randint)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <class 'method'>: attribute lookup builtins.method failed
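
To make the "official name" problem concrete, here is a rough sketch of a
name-based lookup. It is not pickle's actual code: the SimpleNamespace stands
in for a real module, and the dotted-path lookup is only one hypothetical way
the heuristic could be improved.

```python
import types


class Outer:
    class Inner:
        pass


# Stand-in for the defining module (assumption: the real thing would be an
# imported module object, not a SimpleNamespace).
fake_module = types.SimpleNamespace(Outer=Outer)


def lookup(namespace, dotted_name):
    """Resolve a possibly dotted name by walking attributes from the namespace."""
    obj = namespace
    for part in dotted_name.split("."):
        obj = getattr(obj, part)
    return obj


def can_find(namespace, name, target):
    """Return True if looking up `name` gives back exactly `target`."""
    try:
        return lookup(namespace, name) is target
    except AttributeError:
        return False


# Inner.__name__ is just "Inner", which is not a top-level attribute.
found_by_plain_name = can_find(fake_module, Outer.Inner.__name__, Outer.Inner)

# A dotted path from the top level would locate the nested class.
found_by_dotted_name = can_find(fake_module, "Outer.Inner", Outer.Inner)

print(found_by_plain_name, found_by_dotted_name)
```

The plain-name lookup is the step that fails for Outer.Inner; recording and
resolving a dotted path is the kind of improvement the heuristics would need.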

Regards

Antoine.



From walter at livinglogic.de  Thu May 12 18:53:55 2011
From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=)
Date: Thu, 12 May 2011 18:53:55 +0200
Subject: [Python-Dev] Could these restrictions be removed?
In-Reply-To: <20110512163337.3758D12B7749@montanaro.dyndns.org>
References: <20110512163337.3758D12B7749@montanaro.dyndns.org>
Message-ID: <4DCC10A3.9000209@livinglogic.de>

On 12.05.11 18:33, skip at pobox.com wrote:

> A friend at work who is new to Python wondered why this didn't work with
> pickle:
> 
>     class Outer:
> 
>         class Inner:
> 
>             ...
> 
>         def __init__(self):
>             self.i = Outer.Inner()
> 
> I explained:
> 
>> http://docs.python.org/library/pickle.html#what-can-be-pickled-and-unpickled 
>>
>>
>>  From that:
>>
>>     # functions defined at the top level of a module
>>     # built-in functions defined at the top level of a module
>>     # classes that are defined at the top level of a module
> 
> I've never questioned this, but I wonder, is this a fundamental restriction
> or could it be overcome with a modest amount of work?

This is related to http://bugs.python.org/issue633930

Servus,
   Walter

From dickinsm at gmail.com  Fri May 13 10:14:12 2011
From: dickinsm at gmail.com (Mark Dickinson)
Date: Fri, 13 May 2011 09:14:12 +0100
Subject: [Python-Dev] Python Language Summit at EuroPython: 19th June
In-Reply-To: <4DA9ACB5.6030505@python.org>
References: <4DA9ACB5.6030505@python.org>
Message-ID: <BANLkTinAHe9kAumvJFqqJ6sbtHpW9KJdmg@mail.gmail.com>

Hi Michael,

Sorry for the late reply;  it's been kinda busy around here.

If there are places left, I'll definitely be there at the summit.

Congratulations on your impending doom!  (And sorry to hear that you
might not be there in Florence.)

Mark


On Sat, Apr 16, 2011 at 3:50 PM, Michael Foord <michael at python.org> wrote:
> Hello all,
>
> This is an invite to all core-python developers, and developers of
> alternative implementations, to attend the Python Language Summit at
> EuroPython. The summit will be on June 19th and EuroPython this year will be
> held at the beautiful city of Florence in Italy.
>
>    http://ep2011.europython.eu/
>
> If you are not a core-Python developer but would like to attend then please
> email me privately and I will let you know if spaces are available. If you
> are a core developer, or you have received a direct invitation, then please
> respond by private email to let me know if you are able to attend. A maybe
> is fine, you can always change your mind later. Attending for only part of
> the day is fine.
>
> We expect the summit to run from 10am - 4pm with appropriate breaks.
>
> Like previous language summits it is an opportunity to discuss topics like,
> Python 3 adoption, PEPs and changes for Python 3.3, the future of Python
> 2.7, documentation, package index, web site, etc.
>
> If you have topics you'd like to discuss at the language summit please let
> me know.
>
> Volunteers for taking notes at the language summit, for posting to
> Python-dev and the Python Insiders blog after the event, would be much
> appreciated.
>
> All the best,
>
> Michael Foord
>
> N.B. Due to my impending doom (oops, I mean impending fatherhood) I am not
> yet 100% certain I will be able to attend. If I can't I will arrange for
> someone else to chair.
>
> --
> http://www.voidspace.org.uk/
>
> May you do good and not evil
> May you find forgiveness for yourself and forgive others
> May you share freely, never taking more than you give.
> -- the sqlite blessing http://www.sqlite.org/different.html
>
>

From sandeep.mathew at hp.com  Fri May 13 11:25:44 2011
From: sandeep.mathew at hp.com (Mathew, Sandeep (OpenVMS))
Date: Fri, 13 May 2011 09:25:44 +0000
Subject: [Python-Dev] Python Support on OpenVMS
Message-ID: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net>

Hi  Folks,



I am Sandeep Mathew from OpenVMS engineering in Hewlett-Packard. I have worked on various components of the OpenVMS operating system including MONITOR, TDF, EXEC, LIBRTL, DCL and SYSMAN.  I happened to read this blog post about dropping OpenVMS support for further releases of python here: http://blog.python.org/2011/05/python-33-to-drop-support-for-os2.html.



I am willing to spend time and effort to ensure that python remains supported on OpenVMS. Please let me know what needs to be done for continued OpenVMS Support in Python. Looking forward to working with the Python community.





Regards

Sandeep Mathew




From solipsis at pitrou.net  Fri May 13 12:08:18 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 13 May 2011 12:08:18 +0200
Subject: [Python-Dev] Python Support on OpenVMS
References: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net>
Message-ID: <20110513120818.139dca63@pitrou.net>


Welcome Sandeep,

> I am willing to spend time and effort to ensure that python remains supported
> on OpenVMS. Please let me know what needs to be done for continued
> OpenVMS Support in Python. Looking forward to working with the Python
> community.

The first thing would be to check whether the current development tree
(the future Python 3.3) compiles and works fine for OpenVMS. Given that
3.x has had many changes compared to 2.x, this is not guaranteed.

Instructions for getting the source tree are here:
http://docs.python.org/devguide/setup.html
Once the interpreter compiles cleanly, the second step is to run the test
suite:
http://docs.python.org/devguide/runtests.html

Any compilation errors and test suite failures should be reported to
the bug tracker (http://bugs.python.org/), preferably with patches
since I doubt any of us would be able to fix the issues ourselves.

If you have any questions, don't hesitate to ask.

Regards

Antoine.



From merwok at netwok.org  Fri May 13 17:14:46 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Fri, 13 May 2011 17:14:46 +0200
Subject: [Python-Dev] [Python-checkins] cpython (2.7): (Merge 3.1) Issue
 #12012: ssl.PROTOCOL_SSLv2 becomes optional
In-Reply-To: <1305137329.12577.1.camel@marge>
References: <E1QJaFY-00046A-Tn@dinsdale.python.org>	
	<c5683e5e669b57e6645e01eb80501fa9@netwok.org>
	<1305137329.12577.1.camel@marge>
Message-ID: <4DCD4AE6.7030704@netwok.org>

On 11/05/2011 20:08, Victor Stinner wrote:
>>>   (Merge 3.1) Issue #12012: ssl.PROTOCOL_SSLv2 becomes optional
>>  "(Merge 3.1)" is inaccurate for 2.7.
> Ah, why? I did not use "hg merge" command (but hg export|hg import), but
> it's a "merge" between two branches. Which term would you use?

I prefer to use merge only to refer to hg merges.  The 2.7 and 3.x lines
are independent, so I wouldn't put any marker in the commit message,
just use the same as the message used in 3.1.

Regards

From merwok at netwok.org  Fri May 13 17:35:01 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Fri, 13 May 2011 17:35:01 +0200
Subject: [Python-Dev] [Python-checkins] Python Regression Test Failures
	doc (1)
In-Reply-To: <20110508200046.GA2465@kbk-i386-bb.dyndns.org>
References: <20110508200046.GA2465@kbk-i386-bb.dyndns.org>
Message-ID: <4DCD4FA5.3040607@netwok.org>

Hi,

On 08/05/2011 22:00, Neal Norwitz wrote:
> rm -rf build/*
> rm -rf tools/sphinx
> rm -rf tools/pygments
> rm -rf tools/jinja2
> rm -rf tools/docutils
> Checking out Sphinx...
> svn: PROPFIND request failed on '/projects/external/Sphinx-0.6.5/sphinx'
> svn: PROPFIND of '/projects/external/Sphinx-0.6.5/sphinx': Could not resolve hostname `svn.python.org': Host not found (http://svn.python.org)
> make: *** [checkout] Error 1

I always wonder about these messages.  They're mostly error messages
recently; what are python-checkins subscribers supposed to do in
reaction?  In non-error mode, what are they useful for?

Thanks in advance for enlightening me.

Regards

From merwok at netwok.org  Fri May 13 17:44:00 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Fri, 13 May 2011 17:44:00 +0200
Subject: [Python-Dev] [Python-checkins] cpython (3.1): Fix for issue
 10684: Folders get deleted when trying to change case with
In-Reply-To: <E1QIHOF-0003NK-11@dinsdale.python.org>
References: <E1QIHOF-0003NK-11@dinsdale.python.org>
Message-ID: <4DCD51C0.9080304@netwok.org>

Hi,

On 06/05/2011 11:32, ronald.oussoren wrote:
> http://hg.python.org/cpython/rev/26da299ca88e
> summary:
>   Fix for issue 10684: Folders get deleted when trying to change case with shutil.move (case insensitive file systems only)
> 
> -    except OSError:
> +    except OSError as exc:
>          if os.path.isdir(src):
>              if _destinsrc(src, dst):
>                  raise Error("Cannot move a directory '%s' into itself '%s'." % (src, dst))

Is this change a debugging leftover?

Regards

From status at bugs.python.org  Fri May 13 18:07:22 2011
From: status at bugs.python.org (Python tracker)
Date: Fri, 13 May 2011 18:07:22 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20110513160722.1F2C31CE85@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (2011-05-06 - 2011-05-13)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2784 ( +1)
  closed 21069 (+52)
  total  23853 (+53)

Open issues with patches: 1198 


Issues opened (36)
==================

#6011: python doesn't build if prefix contains non-ascii characters
http://bugs.python.org/issue6011  reopened by haypo

#11786: ConfigParser.[Raw]ConfigParser optionxform()
http://bugs.python.org/issue11786  reopened by eric.araujo

#11873: test_regexp() of test_compileall fails occassionally
http://bugs.python.org/issue11873  reopened by haypo

#11977: Document int.conjugate, .denominator, ...
http://bugs.python.org/issue11977  reopened by georg.brandl

#12019: Dead or buggy code in importlib.test.__main__
http://bugs.python.org/issue12019  opened by eric.araujo

#12020: Attribute error with flush on stdout,stderr
http://bugs.python.org/issue12020  opened by Jimbofbx

#12021: mmap.read requires an argument
http://bugs.python.org/issue12021  opened by rich-noir

#12022: AttributeError should report the same details when raised by l
http://bugs.python.org/issue12022  opened by dholth

#12024: 2.6 svn and hg branches are out of sync
http://bugs.python.org/issue12024  opened by barry

#12026: Support more of MSI api
http://bugs.python.org/issue12026  opened by markm

#12028: threading._get_ident(): remove it in the doc and make it publi
http://bugs.python.org/issue12028  opened by haypo

#12029: ABC registration of Exceptions
http://bugs.python.org/issue12029  opened by acooke

#12034: check_GetFinalPathNameByHandle() suboptimal
http://bugs.python.org/issue12034  opened by pitrou

#12037: test_email failures under Windows with the eol extension activ
http://bugs.python.org/issue12037  opened by pitrou

#12038: assertEqual doesn't display newline differences quite well
http://bugs.python.org/issue12038  opened by pitrou

#12040: Expose a Process.sentinel property (and fix polling loop in Pr
http://bugs.python.org/issue12040  opened by pitrou

#12042: What's New multiprocessing example error
http://bugs.python.org/issue12042  opened by davipo

#12043: Update shutil documentation
http://bugs.python.org/issue12043  opened by sandro.tosi

#12045: external shell command executed twice in ctypes.util._get_sona
http://bugs.python.org/issue12045  opened by pitrou

#12046: Windows build identification incomplete
http://bugs.python.org/issue12046  opened by loewis

#12048: Python 3, ZipFile Bug In Chinese
http://bugs.python.org/issue12048  opened by yaoyu

#12049: expose RAND_bytes() function of OpenSSL
http://bugs.python.org/issue12049  opened by haypo

#12050: unconsumed_tail of zlib.Decompress is not always cleared on de
http://bugs.python.org/issue12050  opened by Takeshi.Yoshino

#12053: Add prefetch() for Buffered IO (experiment)
http://bugs.python.org/issue12053  opened by jcon

#12055: doctest not working on nested functions
http://bugs.python.org/issue12055  opened by dabrahams

#12057: HZ codec has no test
http://bugs.python.org/issue12057  opened by haypo

#12059: hashlib does not handle missing hash functions correctly
http://bugs.python.org/issue12059  opened by Ian.Wienand

#12060: Python doesn't support real time signals
http://bugs.python.org/issue12060  opened by haypo

#12063: tokenize module appears to treat unterminated single and doubl
http://bugs.python.org/issue12063  opened by Devin Jeanpierre

#12065: test_ssl failure when svn.python.org fails to resolve
http://bugs.python.org/issue12065  opened by r.david.murray

#12066: Empty ('') xmlns attribute is not properly handled by xml.dom.
http://bugs.python.org/issue12066  opened by atamyrat

#12067: Doc: remove errors about mixed-type comparisons.
http://bugs.python.org/issue12067  opened by terry.reedy

#12068: test_logging failure in test_rollover
http://bugs.python.org/issue12068  opened by pitrou

#12069: test_signal.test_without_siginterrupt() failure on AMD64 OpenI
http://bugs.python.org/issue12069  opened by haypo

#12070: Unlimited loop in sysconfig._parse_makefile()
http://bugs.python.org/issue12070  opened by haypo

#12071: test_concurrent_futures.test_context_manager_shutdown() hangs 
http://bugs.python.org/issue12071  opened by haypo



Most recent 15 issues with no replies (15)
==========================================

#12071: test_concurrent_futures.test_context_manager_shutdown() hangs 
http://bugs.python.org/issue12071

#12069: test_signal.test_without_siginterrupt() failure on AMD64 OpenI
http://bugs.python.org/issue12069

#12066: Empty ('') xmlns attribute is not properly handled by xml.dom.
http://bugs.python.org/issue12066

#12063: tokenize module appears to treat unterminated single and doubl
http://bugs.python.org/issue12063

#12059: hashlib does not handle missing hash functions correctly
http://bugs.python.org/issue12059

#12055: doctest not working on nested functions
http://bugs.python.org/issue12055

#12053: Add prefetch() for Buffered IO (experiment)
http://bugs.python.org/issue12053

#12045: external shell command executed twice in ctypes.util._get_sona
http://bugs.python.org/issue12045

#12043: Update shutil documentation
http://bugs.python.org/issue12043

#12037: test_email failures under Windows with the eol extension activ
http://bugs.python.org/issue12037

#12034: check_GetFinalPathNameByHandle() suboptimal
http://bugs.python.org/issue12034

#12029: ABC registration of Exceptions
http://bugs.python.org/issue12029

#12024: 2.6 svn and hg branches are out of sync
http://bugs.python.org/issue12024

#12019: Dead or buggy code in importlib.test.__main__
http://bugs.python.org/issue12019

#11992: sys.settrace doesn't disable tracing if a local trace function
http://bugs.python.org/issue11992



Most recent 15 issues waiting for review (15)
=============================================

#12060: Python doesn't support real time signals
http://bugs.python.org/issue12060

#12059: hashlib does not handle missing hash functions correctly
http://bugs.python.org/issue12059

#12057: HZ codec has no test
http://bugs.python.org/issue12057

#12049: expose RAND_bytes() function of OpenSSL
http://bugs.python.org/issue12049

#12040: Expose a Process.sentinel property (and fix polling loop in Pr
http://bugs.python.org/issue12040

#12026: Support more of MSI api
http://bugs.python.org/issue12026

#12018: No tests for ntpath.samefile, ntpath.sameopenfile
http://bugs.python.org/issue12018

#12015: possible characters in temporary file name is too few
http://bugs.python.org/issue12015

#12014: str.format parses replacement field incorrectly
http://bugs.python.org/issue12014

#12008: HtmlParser non-strict goes wrong with unquoted attributes
http://bugs.python.org/issue12008

#12004: PyZipFile.writepy gives internal error on syntax errors
http://bugs.python.org/issue12004

#12002: ftplib.FTP.abort fails with TypeError on Python 3.x
http://bugs.python.org/issue12002

#11999: sporadic failure in test_mailbox
http://bugs.python.org/issue11999

#11998: test_signal cannot test blocked signals if _tkinter is loaded;
http://bugs.python.org/issue11998

#11996: libpython.py: nicer py-bt output
http://bugs.python.org/issue11996



Top 10 most discussed issues (10)
=================================

#11948: Tutorial/Modules - small fix to better clarify the modules sea
http://bugs.python.org/issue11948  15 msgs

#6727: ImportError when package is symlinked on Windows
http://bugs.python.org/issue6727  14 msgs

#8407: expose signalfd(2) and pthread_sigmask in the signal module
http://bugs.python.org/issue8407  14 msgs

#11877: Change os.fsync() to support physical backing store syncs
http://bugs.python.org/issue11877  14 msgs

#12015: possible characters in temporary file name is too few
http://bugs.python.org/issue12015  12 msgs

#9205: Parent process hanging in multiprocessing if children terminat
http://bugs.python.org/issue9205  10 msgs

#10666: OS X installer variants have confusing readline differences
http://bugs.python.org/issue10666  10 msgs

#12057: HZ codec has no test
http://bugs.python.org/issue12057  10 msgs

#5723: Incomplete json tests
http://bugs.python.org/issue5723   9 msgs

#12010: Compile fails when sizeof(wchar_t) == 1
http://bugs.python.org/issue12010   9 msgs



Issues closed (51)
==================

#1195: Problems on Linux with Ctrl-D and Ctrl-C during raw_input
http://bugs.python.org/issue1195  closed by haypo

#1350: IDLE - CallTips enhancement - show full doc-string in new wind
http://bugs.python.org/issue1350  closed by kbk

#5154: OSX broken poll testing doesn't work
http://bugs.python.org/issue5154  closed by ronaldoussoren

#5559: IDLE Output Window 's goto fails when path has spaces
http://bugs.python.org/issue5559  closed by kbk

#8498: Cannot use backlog = 0 for sockets
http://bugs.python.org/issue8498  closed by pitrou

#8808: imaplib should support SSL contexts
http://bugs.python.org/issue8808  closed by pitrou

#9971: Optimize BufferedReader.readinto
http://bugs.python.org/issue9971  closed by pitrou

#10169: socket.sendto raises incorrect exception when passed incorrect
http://bugs.python.org/issue10169  closed by ezio.melotti

#10419: distutils command build_scripts fails with UnicodeDecodeError
http://bugs.python.org/issue10419  closed by python-dev

#11072: Add MLSD command support to ftplib
http://bugs.python.org/issue11072  closed by giampaolo.rodola

#11164: xml shouldn't use _xmlplus
http://bugs.python.org/issue11164  closed by python-dev

#11347: libpython3.so: Broken soname and linking
http://bugs.python.org/issue11347  closed by python-dev

#11607: Apllication crashes when saving file
http://bugs.python.org/issue11607  closed by ronaldoussoren

#11617: Sporadic failure in test_httpservers
http://bugs.python.org/issue11617  closed by haypo

#11743: Rewrite PipeConnection and Connection in pure Python
http://bugs.python.org/issue11743  closed by pitrou

#11799: urllib HTTP authentication behavior with unrecognized auth met
http://bugs.python.org/issue11799  closed by orsenthil

#11888: Add C99's log2() function to the math library
http://bugs.python.org/issue11888  closed by haypo

#11896: Save on Close fails in IDLE, from Linux system
http://bugs.python.org/issue11896  closed by kbk

#11910: test_heapq C tests are not skipped when _heapq is missing
http://bugs.python.org/issue11910  closed by ezio.melotti

#11916: A few errnos from OSX
http://bugs.python.org/issue11916  closed by python-dev

#11927: SMTP_SSL doesn't use port 465 by default
http://bugs.python.org/issue11927  closed by pitrou

#11962: Buildbot reliability
http://bugs.python.org/issue11962  closed by skrah

#11968: wsgiref's wsgi application sample code does not work
http://bugs.python.org/issue11968  closed by orsenthil

#11972: input does not strip a trailing newline correctly on Windows
http://bugs.python.org/issue11972  closed by terry.reedy

#11994: [2.7/gcc-4.4.3] Segfault under valgrind in string.split()
http://bugs.python.org/issue11994  closed by haypo

#12001: Extend json.dumps to handle N-triples strings
http://bugs.python.org/issue12001  closed by terry.reedy

#12011: The signal module should raise OSError for OS-related exceptio
http://bugs.python.org/issue12011  closed by haypo

#12012: _ssl module doesn't compile with OpenSSL 1.0.0d: SSLv2_method 
http://bugs.python.org/issue12012  closed by haypo

#12013: file /usr/local/lib/python3.1/lib-dynload/_socket.so: symbol i
http://bugs.python.org/issue12013  closed by eric.araujo

#12017: Decoding a highly-nested object with json (_speedups enabled) 
http://bugs.python.org/issue12017  closed by ezio.melotti

#12023: non causal behavior
http://bugs.python.org/issue12023  closed by ezio.melotti

#12025: strangely missing separator in "resource" table
http://bugs.python.org/issue12025  closed by jcea

#12027: Optimize import this (patch to make it 10x faster)
http://bugs.python.org/issue12027  closed by rhettinger

#12030: Roundup Refused Update with No text/plain
http://bugs.python.org/issue12030  closed by benjamin.peterson

#12031: subprocess module does not accept file twice
http://bugs.python.org/issue12031  closed by neologix

#12032: Tools/Scripts/crlv.py needs updating for python 3+
http://bugs.python.org/issue12032  closed by python-dev

#12033: AttributeError: 'module' object has no attribute 'scipy'
http://bugs.python.org/issue12033  closed by alex

#12035: problem with installing validator.nu on windows
http://bugs.python.org/issue12035  closed by amaury.forgeotdarc

#12036: ConfigParser: Document items() added the vars dictionary to th
http://bugs.python.org/issue12036  closed by python-dev

#12039: test_logging: bad file descriptor on FreeBSD bot
http://bugs.python.org/issue12039  closed by vinay.sajip

#12041: test_os test_ctypes test_wait3 causes test_wait3 error
http://bugs.python.org/issue12041  closed by pitrou

#12044: subprocess.Popen.__exit__ doesn't wait for process end
http://bugs.python.org/issue12044  closed by brian.curtin

#12047: Expand the style guide
http://bugs.python.org/issue12047  closed by rhettinger

#12051: Segfaults in _json while encoding objects
http://bugs.python.org/issue12051  closed by ezio.melotti

#12052: round() erroneous for some large arguments
http://bugs.python.org/issue12052  closed by mark.dickinson

#12054: test_socket: replace custom _get_unused_port() by support.find
http://bugs.python.org/issue12054  closed by pitrou

#12056: "…" (HORIZONTAL ELLIPSIS) should be an alternative syntax fo
http://bugs.python.org/issue12056  closed by benjamin.peterson

#12058: Minor edits to comments in faulthandler
http://bugs.python.org/issue12058  closed by ezio.melotti

#12061: Remove duplicate 'key functions' entry in Glossary
http://bugs.python.org/issue12061  closed by georg.brandl

#12062: Buffered I/O inconsistent with unbuffered I/O in certain cases
http://bugs.python.org/issue12062  closed by pitrou

#12064: unexpected behavior with exception variable
http://bugs.python.org/issue12064  closed by ezio.melotti

From merwok at netwok.org  Fri May 13 19:56:28 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Fri, 13 May 2011 19:56:28 +0200
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <loom.20110511T213726-472@post.gmane.org>
References: <loom.20110509T193140-280@post.gmane.org>	<d1989510eccc69219dc75384faf7be23@netwok.org>
	<loom.20110511T213726-472@post.gmane.org>
Message-ID: <4DCD70CC.7030406@netwok.org>

Le 11/05/2011 21:45, Vinay Sajip a écrit :
> Éric Araujo <merwok <at> netwok.org> writes:
>>  I thought that if we set the level on the logger, we would prevent 
>>  third-party code to get some messages.  E.g., we set level to INFO but 
>>  pip uses some packaging functions and would like to get DEBUG messages.
> Then pip can set the level of the packaging logger as it wishes, perhaps in
> response to command-line arguments for verbosity. It'd be easier for pip to do
> that, regardless of which handlers are attached. And pip itself might be being
> used, say by virtualenv. It's hard in general to say what the top-level code
> will be, and generally that's the code which should set the handlers.
Okay.  I'll go ahead and remove handlers (except for the command-line
script), and set the level on the logger.  If it turns out that the code
in packaging incorrectly resets the level set by calling code, we'll fix
it later; now we want to fix the tests to produce the patch that will
add packaging to CPython.

> The levels set by a library for its loggers are merely defaults.
The conflict here is that there's a class setting the logging level on
instantiation, which could reset the level set by calling code.
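
A minimal sketch of the arrangement discussed above, assuming a logger
named "packaging"; the helpers library_function and configure_app are
hypothetical and only illustrate the split. The library logs through its
own logger and attaches nothing but a NullHandler, while the top-level
code chooses the handler and the level.

```python
import io
import logging

# Library side: use a named logger, attach only a NullHandler, and
# leave the level and the real handlers to the calling code.
lib_logger = logging.getLogger("packaging")
lib_logger.addHandler(logging.NullHandler())

def library_function():
    # Hypothetical library code that emits messages at two levels.
    lib_logger.debug("resolving dependencies")
    lib_logger.info("installing project")

# Application side (e.g. a command-line script): pick handler and level.
def configure_app(stream, level):
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    lib_logger.addHandler(handler)
    lib_logger.setLevel(level)
    return handler

stream = io.StringIO()
handler = configure_app(stream, logging.DEBUG)
library_function()
lib_logger.removeHandler(handler)
output = stream.getvalue()
```

With this split, a caller that wants DEBUG messages only has to change
the level it passes in; no library code resets it.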

Thanks again for your messages (and blog).

From eric at netwok.org  Fri May 13 20:02:05 2011
From: eric at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Fri, 13 May 2011 20:02:05 +0200
Subject: [Python-Dev] Problems with regrtest and with logging
In-Reply-To: <BANLkTikx0KXuo9vyDBi=Ky2mWqY61Vp=6g@mail.gmail.com>
References: <acbfe5fdfbc9da0ecec6d2591ab3757d@netwok.org>
	<BANLkTikx0KXuo9vyDBi=Ky2mWqY61Vp=6g@mail.gmail.com>
Message-ID: <4DCD721D.1080108@netwok.org>

On Sat, May 7, 2011 at 3:51 AM, Éric Araujo <merwok at netwok.org> wrote:
> regrtest helpfully reports when a test leaves the environment unclean
> (sys.path, os.environ, logging._handlerList)

A quick follow-up: I talked about regrtest with RDM on IRC, and I will
use the context manager that detects changes in the "if __name__ ==
'__main__'" blocks of our test files to find the guilty ones.  Some
warnings are subtle to track down: the test runs a command which
instantiates a class which calls a function and here's the code that
sets an environment variable.
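
The idea of a context manager that detects environment changes can be
sketched as follows. This is not regrtest's actual implementation; the
name report_env_changes and the variable HYPOTHETICAL_VAR are made up for
illustration, and only os.environ and sys.path are checked.

```python
import contextlib
import os
import sys

@contextlib.contextmanager
def report_env_changes(label, report):
    """Snapshot os.environ and sys.path, then report what changed."""
    saved_environ = dict(os.environ)
    saved_path = list(sys.path)
    try:
        yield
    finally:
        if dict(os.environ) != saved_environ:
            report.append("%s changed os.environ" % label)
        if sys.path != saved_path:
            report.append("%s changed sys.path" % label)
        # Restore the snapshot so that later tests start clean.
        os.environ.clear()
        os.environ.update(saved_environ)
        sys.path[:] = saved_path

report = []
with report_env_changes("test_example", report):
    # The kind of leak being hunted: code deep in a call chain
    # that sets an environment variable and never removes it.
    os.environ["HYPOTHETICAL_VAR"] = "1"
```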

In the future, I'll take part in the efforts to reimplement parts of
regrtest with new unittest features.  Right now it's quite painful to
have to use either unittest to run just one file or regrtest to get the
warnings.

Cheers

From sdaoden at googlemail.com  Fri May 13 21:49:01 2011
From: sdaoden at googlemail.com (Steffen Daode Nurpmeso)
Date: Fri, 13 May 2011 21:49:01 +0200
Subject: [Python-Dev] Summary of Python tracker Issues
In-Reply-To: <20110513160722.1F2C31CE85@psf.upfronthosting.co.za>
References: <20110513160722.1F2C31CE85@psf.upfronthosting.co.za>
Message-ID: <20110513194901.GA40824@sherwood.local>

Part 1 of the summary mail was declared as US-ASCII, 8bit, but it
contained a UTF-8 character:

> #12056: "…" (HORIZONTAL ELLIPSIS) should be an alternative syntax fo
> http://bugs.python.org/issue12056  closed by benjamin.peterson

This is handled without any problem by Python 3000 thanks to
David Murray's patch for issue 11605 in 3.2 and 3.3.
(It did, however, break my obviously insufficient non-postman thing :(,
and strictly speaking it's of course not a valid mail.
So I report this just in case your stricken MUAs simply do the
right thing and no one notices it at all.)
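
A small sketch of the mismatch described above: a body declared as
US-ASCII that actually carries UTF-8 bytes. The helper name
is_valid_us_ascii and the sample title are assumptions for illustration,
not code from any mail tool.

```python
def is_valid_us_ascii(payload):
    """Return True if the bytes really fit the declared US-ASCII charset."""
    try:
        payload.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True

# A title containing U+2026 HORIZONTAL ELLIPSIS, encoded as UTF-8.
title = 'Issue title with "\u2026" (HORIZONTAL ELLIPSIS)'
utf8_payload = title.encode("utf-8")

ascii_ok = is_valid_us_ascii(b"plain text")
utf8_ok = is_valid_us_ascii(utf8_payload)
```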

May the juice be with you

--
Steffen, sdaoden(*)(gmail.com)


From ncoghlan at gmail.com  Sat May 14 09:00:30 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 14 May 2011 17:00:30 +1000
Subject: [Python-Dev] Python Support on OpenVMS
In-Reply-To: <20110513120818.139dca63@pitrou.net>
References: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net>
	<20110513120818.139dca63@pitrou.net>
Message-ID: <BANLkTi=-WiQ6EpdiOQCvwMVEsOJ67qB_zQ@mail.gmail.com>

On Fri, May 13, 2011 at 8:08 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> Any compilation errors and test suite failures should be reported to
> the bug tracker (http://bugs.python.org/), preferably with patches
> since I doubt any of us would be able to fix the issues themselves.

For ongoing support, it would also be *really* helpful if HP could
provide an OpenVMS buildbot.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From vinay_sajip at yahoo.co.uk  Sun May 15 10:55:13 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Sun, 15 May 2011 08:55:13 +0000 (UTC)
Subject: [Python-Dev] more timely detection of unbound locals
References: <BANLkTim0Q+3==pb-N8FkWz_0kCg1HBezCQ@mail.gmail.com>	<Pine.GSO.4.64.1105090913360.27288@core.cs.uwaterloo.ca>	<BANLkTimmAwLUzHr_4aoX8HNscQRE83LR1A@mail.gmail.com>	<Pine.GSO.4.64.1105091129450.27288@core.cs.uwaterloo.ca>	<BANLkTik8JCOp_n9WGqazg6r0gdQdGt9Ugg@mail.gmail.com>	<20110510131144.C8D75250041@webabinitio.net>
	<BANLkTi=MPSWxrDR6rU=5nFMtDu1meUJX4A@mail.gmail.com>
	<iqbu9b$981$1@dough.gmane.org>
Message-ID: <loom.20110515T105420-429@post.gmane.org>

Terry Reedy <tjreedy <at> udel.edu> writes:

> I would change this to
> "local name 'bob' used before the assignment that makes it a local name"
> 
> Calling names 'variables' is itself a point of confusion.

+1
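
For readers who have not hit this error, a short sketch of the situation
the message describes. The function name greet is made up; the exact
wording of the error message varies between Python versions, so only the
exception type is inspected here.

```python
def greet():
    # 'bob' is assigned later in this function, so the compiler treats
    # it as a local name everywhere in the body, including the read below.
    try:
        return bob
    except UnboundLocalError as exc:
        error = exc
    bob = "assigned too late"
    return error

result = greet()
```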



From senthil at uthcode.com  Mon May 16 04:15:03 2011
From: senthil at uthcode.com (Senthil Kumaran)
Date: Mon, 16 May 2011 10:15:03 +0800
Subject: [Python-Dev] Python Support on OpenVMS
In-Reply-To: <BANLkTi=-WiQ6EpdiOQCvwMVEsOJ67qB_zQ@mail.gmail.com>
References: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net>
	<20110513120818.139dca63@pitrou.net>
	<BANLkTi=-WiQ6EpdiOQCvwMVEsOJ67qB_zQ@mail.gmail.com>
Message-ID: <20110516021503.GB2808@kevin>

On Sat, May 14, 2011 at 05:00:30PM +1000, Nick Coghlan wrote:
> For ongoing support, it would also be *really* helpful if HP could
> provide an OpenVMS buildbot.

Yes, that would be the best first step in the ongoing struggle to support
the OpenVMS platform. The problem in the first place is that no one has
the hardware to try installing Python, let alone fixing the bugs in it.
So, Sandeep, if you can set up a buildbot (
http://python.org/dev/buildbot/) and be the owner of the buildbot, it
would be really helpful.

-- 
Senthil


From martin at v.loewis.de  Mon May 16 09:20:41 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 16 May 2011 09:20:41 +0200
Subject: [Python-Dev] Python Support on OpenVMS
In-Reply-To: <20110516021503.GB2808@kevin>
References: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net>	<20110513120818.139dca63@pitrou.net>	<BANLkTi=-WiQ6EpdiOQCvwMVEsOJ67qB_zQ@mail.gmail.com>
	<20110516021503.GB2808@kevin>
Message-ID: <4DD0D049.9050407@v.loewis.de>

Am 16.05.2011 04:15, schrieb Senthil Kumaran:
> On Sat, May 14, 2011 at 05:00:30PM +1000, Nick Coghlan wrote:
>> For ongoing support, it would also be *really* helpful if HP could
>> provide an OpenVMS buildbot.
> 
> Yes, that would be best first step in the on-going struggle to support
> OpenVMS platform.

I guess the best first step would be to make it compile at all. Then try
to make it pass the test suite. This may well take several months to
complete.

Regards,
Martin

From ncoghlan at gmail.com  Mon May 16 10:04:05 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 16 May 2011 18:04:05 +1000
Subject: [Python-Dev] Python Support on OpenVMS
In-Reply-To: <4DD0D049.9050407@v.loewis.de>
References: <DB140D138DBD2F42A2768791634E108021064E4A9B@GVW1351EXA.americas.hpqcorp.net>
	<20110513120818.139dca63@pitrou.net>
	<BANLkTi=-WiQ6EpdiOQCvwMVEsOJ67qB_zQ@mail.gmail.com>
	<20110516021503.GB2808@kevin> <4DD0D049.9050407@v.loewis.de>
Message-ID: <BANLkTi=oG6brdSvOP8fZSQsYA8vPXyHRow@mail.gmail.com>

On Mon, May 16, 2011 at 5:20 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Am 16.05.2011 04:15, schrieb Senthil Kumaran:
>> On Sat, May 14, 2011 at 05:00:30PM +1000, Nick Coghlan wrote:
>>> For ongoing support, it would also be *really* helpful if HP could
>>> provide an OpenVMS buildbot.
>>
>> Yes, that would be best first step in the on-going struggle to support
>> OpenVMS platform.
>
> I guess the best first step would be to make it compile at all. Then try
> to make it pass the test suite. This may well take several months to
> complete.

And then make sure the buildbot client runs properly. Still, having
someone start down that path now (with a green stable buildbot as the
target end state) provides a specific goal that any patches can work
towards.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From sandeep.mathew at hp.com  Mon May 16 10:08:27 2011
From: sandeep.mathew at hp.com (Mathew, Sandeep (OpenVMS))
Date: Mon, 16 May 2011 08:08:27 +0000
Subject: [Python-Dev] Python Support on OpenVMS
In-Reply-To: <mailman.53.1305453602.25450.python-dev@python.org>
References: <mailman.53.1305453602.25450.python-dev@python.org>
Message-ID: <DB140D138DBD2F42A2768791634E108021064E4E7B@GVW1351EXA.americas.hpqcorp.net>

Hi All,

Thanks for your responses! The first thing on my radar is to get buildbot working on OpenVMS.
I had a quick glance at the source; although buildbot is written purely in Python, it has many
platform-specific issues. See: https://github.com/buildbot/buildbot/blob/master/master/README.w32
However, I am guessing that they may not be very difficult to resolve.

I will be concentrating on Itanium systems initially and will later port it to Alpha in a similar
way. I have requested an account on HP's OpenVMS cluster meant for open source development.
I will kick off my work after the account has been activated!

Regards
Sandeep Mathew



From solipsis at pitrou.net  Mon May 16 15:46:47 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 16 May 2011 15:46:47 +0200
Subject: [Python-Dev] Python Support on OpenVMS
References: <mailman.53.1305453602.25450.python-dev@python.org>
	<DB140D138DBD2F42A2768791634E108021064E4E7B@GVW1351EXA.americas.hpqcorp.net>
Message-ID: <20110516154647.4a927e2e@pitrou.net>

On Mon, 16 May 2011 08:08:27 +0000
"Mathew, Sandeep (OpenVMS)" <sandeep.mathew at hp.com> wrote:

> Hi All,
> 
> Thanks for your responses!. First thing on my radar is to get buildbot working on OpenVMS.
> I had a quick glance at source, although buildbot is written purely in python it has many
> platform specific issues. See: https://github.com/buildbot/buildbot/blob/master/master/README.w32

I think this file is way out of date. We have Windows buildbots running
fine, and I don't think they required a modification of the buildbot
software.

Furthermore, you only need the buildbot slave, not master.
See http://wiki.python.org/moin/BuildBot for more info

Regards

Antoine.



From dsuch at gefira.pl  Mon May 16 20:31:45 2011
From: dsuch at gefira.pl (Dariusz Suchojad)
Date: Mon, 16 May 2011 20:31:45 +0200
Subject: [Python-Dev] Simple XML-RPC server over SSL/TLS
In-Reply-To: <27392.1304016849@parc.com>
References: <BANLkTinDGtWZsDPZ37U5_zqw9Aio-CpeXw@mail.gmail.com>	<4DB975BB.1040402@netwok.org>
	<27392.1304016849@parc.com>
Message-ID: <4DD16D91.6040805@gefira.pl>

Bill Janssen wrote:

Hello,

>>> But what I would like to know, is if is there any reason why XML-RPC can't
>>> optionally work over TLS/SSL using Python's ssl module. I'll create a
>>> ticket, and send a patch, but I was wondering if it was a reason why this
>>> was not implemented.
>>
>> I think there's no deeper reason than nobody thought about it.  The ssl
>> module is new in 2.6 and 3.x, xmlrpc is an older module for an old
>> technology *cough*, so feel free to open a bug report.  Patch guidelines
>> are found at http://docs.python.org/devguide  Thanks in advance!
>
> What he said.  I'm not a big fan of XMLRPC in the first place, so I
> probably didn't even notice that there wasn't SSL support for it.
>
> Go for it!

I know it's been some time, but I've only now spotted the thread, and just
in case it could be helpful to anyone: the Spring Python project
implemented the feature last year for Python 2.x

http://static.springsource.org/spring-python/1.2.x/sphinx/html/remoting.html#secure-xml-rpc

cheers,

-- 
Dariusz Suchojad

From digitalxero at gmail.com  Tue May 17 01:15:48 2011
From: digitalxero at gmail.com (Dj Gilcrease)
Date: Mon, 16 May 2011 19:15:48 -0400
Subject: [Python-Dev] [OT] Server Side Clone mode
Message-ID: <BANLkTimBFRTyD3us6pPSb=aXCwEuG4Q2cQ@mail.gmail.com>

I was wondering if there was a place I could get the modifications
that have been made at hg.python.org to add the Server Side Clone to
the hgweb interface.


Dj Gilcrease
 ____
( |     \  o    ()   |  o  |`|
  |      |      /`\_/|      | |   ,__   ,_,   ,_,   __,    ,   ,_,
_|      | |    /      |  |   |/   /      /   |   |_/  /    |   / \_|_/
(/\___/  |/  /(__,/  |_/|__/\___/    |_/|__/\__/|_/\,/  |__/
         /|
         \|

From mhammond at skippinet.com.au  Tue May 17 09:38:07 2011
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Tue, 17 May 2011 17:38:07 +1000
Subject: [Python-Dev] Updated version of PEP-0397 - Python launcher for
	Windows.
Message-ID: <4DD225DF.2060605@skippinet.com.au>

Hi all,
   I've updated PEP-0397 to try and address some of the comments from 
the last draft.  I've checked the new version into hg, so you can find a 
full diff there, but the key items I've changed are:

* Spelled out the "version qualifier" rules for the shebang lines.
* Spelled out some customization options, both for fine-tuning the 
specific Python version selected and for supporting other Python 
implementations via "custom" commands.
* Indicated the launcher is not supported at all on Win2k or earlier.
* Removed some cruft.

The new version is attached and I welcome all comments, including 
bike-shedding on the environment variable names and INI section/value names.

Note that the reference implementation has not changed - I'll update 
that once there is general agreement on the functionality described in 
the PEP.

Thanks,

Mark
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pep-0397.txt
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110517/72c828f8/attachment.txt>

From victor.stinner at haypocalc.com  Tue May 17 16:01:35 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 17 May 2011 16:01:35 +0200
Subject: [Python-Dev] Success x86 XP-4 2.7 buildbot without any log and
	should be a failure
Message-ID: <201105171601.35813.victor.stinner@haypocalc.com>

Hi,

I recently broke all tests of the CJK encodings (#12057) in Python 2.7 (sorry,
it is now fixed). But the "x86 XP-4 2.7" buildbot is green, and I don't
understand how (the bug was not yet fixed in build 894):

http://www.python.org/dev/buildbot/all/builders/x86%20XP-4%202.7/builds/894

This build doesn't contain any log.

Victor

From ziade.tarek at gmail.com  Tue May 17 17:36:10 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 17 May 2011 17:36:10 +0200
Subject: [Python-Dev] "packaging" merge imminent
Message-ID: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>

Hello

I am about to merge packaging in the stdlib, and we will continue our
work there :)

The impact is:

- addition of Lib/packaging
- addition of test/test_packaging.py
- changes in Lib/sysconfig.py
- addition of Lib/sysconfig.cfg

For the last one, I would like to make sure again that everyone is ok
with having a .cfg file added in the Lib/ directory. If not, we need
to discuss how to do this differently.

== purpose of sysconfig.cfg ==

The sysconfig.cfg file is an ini-like file that sysconfig.py reads to
get the installation paths. We currently have these paths hardcoded in
the Python module.

The next change I have planned is to allow several levels of
configuration, like distutils.cfg does. sysconfig.py will look for a
sysconfig.cfg file in these places:

1. the current working directory -- so it can potentially be included in
a project source release
2. the user home  (specific location to be defined, maybe in ~/local)
[inherits from the previous one]
3. the global
[inherits from the previous one]

I have decided to make it a .cfg file instead of a .py file for various reasons:

- easier for people to edit, without the danger of ending up with an
over-engineered Python module (that's the problem we have with
setup.py files)
- the override logic is easier to implement and understand: if I want
to change a single path, I add an ini file in my home with this single
path.
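
The override logic described above can be sketched with configparser,
which lets values in files read later replace those read earlier. This
is only an illustration: the [paths] section, the purelib and scripts
keys, and the file names are assumptions, not the layout sysconfig.cfg
actually uses.

```python
import configparser
import os
import tempfile

def load_layered_config(paths):
    """Read INI files in order; values in later files override earlier ones."""
    parser = configparser.ConfigParser()
    parser.read(paths)  # files that do not exist are skipped silently
    return parser

with tempfile.TemporaryDirectory() as tmp:
    global_cfg = os.path.join(tmp, "global.cfg")
    user_cfg = os.path.join(tmp, "user.cfg")
    with open(global_cfg, "w") as f:
        f.write("[paths]\npurelib = /usr/lib/python\nscripts = /usr/bin\n")
    with open(user_cfg, "w") as f:
        # The user file overrides a single path and inherits the rest.
        f.write("[paths]\nscripts = ~/bin\n")
    config = load_layered_config([global_cfg, user_cfg])
    purelib = config.get("paths", "purelib")
    scripts = config.get("paths", "scripts")
```

Here purelib comes from the global file and scripts from the user file,
which is the "change a single path" case mentioned in the list.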

If there are no complaints, the merge will happen tomorrow, my time.

== next moves ==

- make sysconfig.py stop reading Makefile and pyconfig.h, this will be
done by adding a _sysconfig.py file created by the Makefile
- continue our work in packaging for 3.3
- we're planning to merge the doc in Doc/packaging very soon (still
working on it)


Cheers
Tarek

-- 
Tarek Ziadé | http://ziade.org

From lists at cheimes.de  Tue May 17 18:42:59 2011
From: lists at cheimes.de (Christian Heimes)
Date: Tue, 17 May 2011 18:42:59 +0200
Subject: [Python-Dev] "packaging" merge imminent
In-Reply-To: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
Message-ID: <4DD2A593.90203@cheimes.de>

Am 17.05.2011 17:36, schrieb Tarek Ziadé:
> The next change I have planned is to allow several levels of
> configuration, like distutils.cfg does. sysconfig.py will look for a
> sysconfig.cfg file in these places:
> 
> 1. the current working directory -- so can be potentially included in
> a project source release
> 2. the user home  (specific location be defined, maybe in ~/local)
> [inherits from the previous one]
> 3. the global

You may want to study my site package PEP [1] regarding possible
security implications. I recommend that you ignore the current working
directory and user's home directory under conditions like different
effective user or the -E option.

A good place for a local sysconfig.cfg could be the user's stdlib
directory (e.g. ~/.local/lib/python3.2/sysconfig.cfg).
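
One way to picture the recommendation above, as a rough sketch only: the
function config_search_path and the example paths are hypothetical, and
the two checks are merely one reading of "different effective user or
the -E option".

```python
import os
import sys

def config_search_path(global_path, user_path, cwd_path):
    """Return the config files to consult, most trusted first."""
    paths = [global_path]
    # -E asks Python to ignore environment-driven customisation.
    if sys.flags.ignore_environment:
        return paths
    # A differing effective user (e.g. setuid) is also a reason to be strict.
    if hasattr(os, "getuid") and os.getuid() != os.geteuid():
        return paths
    paths.extend([user_path, cwd_path])
    return paths

candidates = config_search_path("/global.cfg", "/home/u/user.cfg", "./local.cfg")
```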

Christian

[1] http://www.python.org/dev/peps/pep-0370

From jdunck at gmail.com  Tue May 17 19:40:04 2011
From: jdunck at gmail.com (Jeremy Dunck)
Date: Tue, 17 May 2011 12:40:04 -0500
Subject: [Python-Dev] Bug in json (the format and the module)
Message-ID: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com>

This blog post describes a bug in a common usage pattern of JSON:

http://timelessrepo.com/json-isnt-a-javascript-subset

That is, there are some characters which are legal in JSON
serializations, but not in JavaScript strings.

This works OK for JSON parsers, but a common use case of JSON is
JSONP, where the result of a request is presumed to be executable
javascript:

<script src="http://someapi.com/jsonp?callback=foo"> might return a response:

foo({"some_json":"which might or might not be legal javascript"})

The post also suggests a solution -- to replace literal U+2028 - Line
separator and U+2029 - Paragraph separator with their escape sequences
\u2028 and \u2029.

This is a nice solution in that it makes the JSON valid JS while
keeping the same semantics.  Of course there's the annoyance of
processing the full string, comparable in overhead to utf-8 encoding,
I presume.
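
A minimal sketch of the replacement the post suggests, written for
Python 3; the helper name dumps_js_safe is made up for illustration and
is not part of the json module.

```python
import json

def dumps_js_safe(obj):
    """Serialize to JSON that is also safe to embed in JavaScript source."""
    text = json.dumps(obj, ensure_ascii=False)
    # U+2028/U+2029 can only appear inside JSON strings, so a plain
    # textual replacement does not alter the decoded value.
    return text.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")

payload = {"some_json": "ro\u2028cks\u2029!"}
encoded = dumps_js_safe(payload)
jsonp = "foo(%s)" % encoded  # JSONP-style wrapping, as in the example above
```

The result still decodes to the same object with any JSON parser, which
is the "same semantics" property mentioned above.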

So, to start with, is there a maintainer for the json module, or how
should I go about discussing implementing this solution?

From bob at redivi.com  Tue May 17 20:18:15 2011
From: bob at redivi.com (Bob Ippolito)
Date: Tue, 17 May 2011 12:18:15 -0600
Subject: [Python-Dev] Bug in json (the format and the module)
In-Reply-To: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com>
References: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com>
Message-ID: <BANLkTim+wEMv_kSBqvtBPM=aHYytd7+N+g@mail.gmail.com>

By default the json module already escapes anything outside of 7-bit
ASCII, so unless you're using ensure_ascii=False then this is a
non-issue.

I implemented a workaround for ensure_ascii=False in simplejson here,
it would be pretty trivial to add this feature to the json module as
well:
https://github.com/simplejson/simplejson/commit/4989e693bab39b1ce5cf6fc0b21dbacd108c312c

On Tue, May 17, 2011 at 11:40 AM, Jeremy Dunck <jdunck at gmail.com> wrote:
> This blog post describes a bug in a common usage pattern of JSON:
>
> http://timelessrepo.com/json-isnt-a-javascript-subset
>
> That is, there are some characters which are legal in JSON
> serializations, but not in JavaScript strings.
>
> This works OK for JSON parsers, but a common use case of JSON is
> JSONP, where the result of a request is presumed to be executable
> javascript:
>
> <script src="http://someapi.com/jsonp?callback=foo"> might return a response:
>
> foo({"some_json":"which might or might not be legal javascript"})
>
> The post also suggests a solution -- to replace literal U+2028 - Line
> separator and U+2029 - Paragraph separator with their escape sequences
> \u2028 and \u2029.
>
> This is a nice solution in that it makes the JSON valid JS while
> keeping the same semantics.  Of course there's the annoyance of
> processing the full string, comparable in overhead to utf-8 encoding,
> I presume.
>
> So, to start with, is there a maintainer for the json module, or how
> should I go about discussing implementing this solution?

From dirkjan at ochtman.nl  Tue May 17 20:21:26 2011
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Tue, 17 May 2011 20:21:26 +0200
Subject: [Python-Dev] Bug in json (the format and the module)
In-Reply-To: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com>
References: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com>
Message-ID: <BANLkTinF-c5mNFkU6_1wZXrrSCg8L-B8Nw@mail.gmail.com>

On Tue, May 17, 2011 at 19:40, Jeremy Dunck <jdunck at gmail.com> wrote:
> So, to start with, is there a maintainer for the json module, or how
> should I go about discussing implementing this solution?

Your subject states that there is an actual bug in the json module,
but your message fails to mention any actual bug. Is this what you
mean?

Python 2.7.1 (r271:86832, Mar 28 2011, 09:54:04)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> print json.dumps(u'foo\u2028bar')
"foo\u2028bar"

Cheers,

Dirkjan

From ronaldoussoren at mac.com  Tue May 17 19:21:26 2011
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Tue, 17 May 2011 19:21:26 +0200
Subject: [Python-Dev] "packaging" merge imminent
In-Reply-To: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
Message-ID: <D3727EF7-AEFC-4335-96FE-53E44A21ACBC@mac.com>


On 17 May, 2011, at 17:36, Tarek Ziadé wrote:

> Hello
> 
> I am about to merge packaging in the stdlib, and we will continue our
> work there :)
> 
> The impact is:
> 
> - addition of Lib/packaging
> - addition of test/test_packaging.py
> - changes in Lib/sysconfig.py
> - addition of Lib/sysconfig.cfg
> 
> For the last one, I would like to make sure again that everyone is ok
> with having a .cfg file added in the Lib/ directory. If not, we need
> to discuss how to do this differently.
> 
> == purpose of sysconfig.cfg ==
> 
> The sysconfig.cfg file is a ini-like file that sysconfig.py reads to
> get the installation paths. We currently have these paths harcoded in
> the python module.
> 
> The next change I have planned is to allow several levels of
> configuration, like distutils.cfg does. sysconfig.py will look for a
> sysconfig.cfg file in these places:
> 
> 1. the current working directory -- so can be potentially included in
> a project source release

Does this mean that Python behaves differently when there happens to be a sysconfig.cfg file in the current working directory? That's a potential security risk.


> 2. the user home  (specific location be defined, maybe in ~/local)
> [inherits from the previous one]

How hard would it be to disable this behavior for tools like virtualenv and py2app?

Ronald

From jdunck at gmail.com  Tue May 17 20:48:44 2011
From: jdunck at gmail.com (Jeremy Dunck)
Date: Tue, 17 May 2011 13:48:44 -0500
Subject: [Python-Dev] Bug in json (the format and the module)
In-Reply-To: <BANLkTinF-c5mNFkU6_1wZXrrSCg8L-B8Nw@mail.gmail.com>
References: <BANLkTi=YmC3RqB6d44pHFHu3A9d=tNxGqA@mail.gmail.com>
	<BANLkTinF-c5mNFkU6_1wZXrrSCg8L-B8Nw@mail.gmail.com>
Message-ID: <BANLkTi=W3Ep_O8E+r9bYtdLpVpfeFffZPg@mail.gmail.com>

On Tue, May 17, 2011 at 1:21 PM, Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:
> On Tue, May 17, 2011 at 19:40, Jeremy Dunck <jdunck at gmail.com> wrote:
>> So, to start with, is there a maintainer for the json module, or how
>> should I go about discussing implementing this solution?
>
> Your subject states that there is an actual bug in the json module,
> but your message fails to mention any actual bug. Is this what you
> mean?
>
> Python 2.7.1 (r271:86832, Mar 28 2011, 09:54:04)
> [GCC 4.4.5] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import json
>>>> print json.dumps(u'foo\u2028bar')
> "foo\u2028bar"

Actually, that would be fine, and Bob's right that this is a non-issue
with ensure_ascii=True (the default).  His change upstream seems good
for the ensure_ascii=False case.

To be complete, this is what I meant:

>>> s = '{"JSON":"ro
cks!"}' # this string has a literal U+2028 in it
>>> s
'{"JSON":"ro\xe2\x80\xa8cks!"}'

>>> import json
>>> json.dumps(s) # fine by default
'"{\\"JSON\\":\\"ro\\u2028cks!\\"}"'

>>> json.dumps(s, ensure_ascii=False) # not fine with ensure_ascii=False
'"{\\"JSON\\":\\"ro\xe2\x80\xa8cks!\\"}"'

From georg at python.org  Tue May 17 20:50:37 2011
From: georg at python.org (Georg Brandl)
Date: Tue, 17 May 2011 20:50:37 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
Message-ID: <4DD2C37D.7000008@python.org>

On behalf of the Python development team, I am pleased to announce the
first release candidate of Python 3.2.1.

Python 3.2.1 will be the first bugfix release for Python 3.2, fixing over 120
bugs and regressions in Python 3.2.

For an extensive list of changes and features in the 3.2 line, see

    http://docs.python.org/3.2/whatsnew/3.2.html

To download Python 3.2.1 visit:

    http://www.python.org/download/releases/3.2.1/

This is a testing release: Please consider trying Python 3.2.1 with your code
and reporting any bugs you may notice to:

    http://bugs.python.org/


Enjoy!

-- 
Georg Brandl, Release Manager
georg at python.org
(on behalf of the entire python-dev team and 3.2's contributors)

From victor.stinner at haypocalc.com  Tue May 17 22:40:35 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 17 May 2011 22:40:35 +0200
Subject: [Python-Dev] "packaging" merge imminent
In-Reply-To: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
Message-ID: <1305664835.29701.2.camel@marge>

On Tuesday, 17 May 2011 at 17:36 +0200, Tarek Ziadé wrote:
> - addition of Lib/packaging
> - addition of test/test_packaging.py
> - changes in Lib/sysconfig.py
> - addition of Lib/sysconfig.cfg

Does setup.py continue to use the "old" distutils module?

I recently fixed some bugs in distutils. Should I also fix them in the
packaging module, or are both modules already "synchronized"?

Victor


From ziade.tarek at gmail.com  Tue May 17 23:20:23 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 17 May 2011 23:20:23 +0200
Subject: [Python-Dev] "packaging" merge imminent
In-Reply-To: <D3727EF7-AEFC-4335-96FE-53E44A21ACBC@mac.com>
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
	<D3727EF7-AEFC-4335-96FE-53E44A21ACBC@mac.com>
Message-ID: <BANLkTikCo4_oYfQCPAwv1M9zM2tUAmJpzg@mail.gmail.com>

On Tue, May 17, 2011 at 7:21 PM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
...
>> 1. the current working directory -- so can be potentially included in
>> a project source release
>
> Does this mean that Python behaves differently when there happens to be a sysconfig.cfg file in the current working directory? That's a potential security risk.

The use case is to have it there at install time so packaging can have
alternative locations if needed.

We could also drop the working dir scanning and have it:

1- passed explicitly to the pysetup script via an option.
2- used only if found in a root of a project at installation time, and only then

>
>> 2. the user home (specific location to be defined, maybe in ~/local)
>> [inherits from the previous one]
>
> How hard would it be to disable this behavior for tools like virtualenv and py2app?

Not hard at all, just an option. And the goal is also to allow
virtualenv to have its own copy, like it does for distutils.cfg

>
> Ronald



-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Tue May 17 23:23:32 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 17 May 2011 23:23:32 +0200
Subject: [Python-Dev] "packaging" merge imminent
In-Reply-To: <1305664835.29701.2.camel@marge>
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
	<1305664835.29701.2.camel@marge>
Message-ID: <BANLkTikjoaU=HoG3OsSOvJ5UzOWPTwT=vg@mail.gmail.com>

On Tue, May 17, 2011 at 10:40 PM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> On Tuesday, 17 May 2011 at 17:36 +0200, Tarek Ziadé wrote:
>> - addition of Lib/packaging
>> - addition of test/test_packaging.py
>> - changes in Lib/sysconfig.py
>> - addition of Lib/sysconfig.cfg
>
> Does setup.py continue to use the "old" distutils module?

Yes. The plan is to keep distutils support, so projects with setup.py
still work.

For the new packaging, people will have to provide new sections in setup.cfg.

The pysetup script will detect at installation time whether the project
has the required bits in the cfg, and if not, will fall back to executing
setup.py.
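A rough sketch of that detection step, for illustration only. The function name `choose_install_path` and the assumption that a `[metadata]` section is the "required bit" are hypothetical and are not taken from the pysetup code.

```python
import configparser


def choose_install_path(setup_cfg_text, required_sections=("metadata",)):
    """Return "packaging" if every required section is present in the
    setup.cfg text, otherwise "setup.py" (the fallback).

    The default section name is an assumption made for this example.
    """
    parser = configparser.RawConfigParser()
    parser.read_string(setup_cfg_text)
    if all(parser.has_section(name) for name in required_sections):
        return "packaging"
    return "setup.py"


new_style = "[metadata]\nname = example\nversion = 1.0\n"
old_style = "[egg_info]\ntag_build = dev\n"

new_choice = choose_install_path(new_style)
old_choice = choose_install_path(old_style)
```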


> I recently fixed some bugs in distutils. Should I also fix them in the
> packaging module, or are both modules already "synchronized"?

They need to be backported, yes. We did some, but we'll need to
double-check the distutils timeline to make sure it's synced.

>
> Victor
>
>



-- 
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com  Tue May 17 23:25:38 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 17 May 2011 23:25:38 +0200
Subject: [Python-Dev] "packaging" merge imminent
In-Reply-To: <4DD2A593.90203@cheimes.de>
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
	<4DD2A593.90203@cheimes.de>
Message-ID: <BANLkTingN8a1my-e5O6soXY9b9uEM2eu_A@mail.gmail.com>

On Tue, May 17, 2011 at 6:42 PM, Christian Heimes <lists at cheimes.de> wrote:
> On 17.05.2011 at 17:36, Tarek Ziadé wrote:
>> The next change I have planned is to allow several levels of
>> configuration, like distutils.cfg does. sysconfig.py will look for a
>> sysconfig.cfg file in these places:
>>
>> 1. the current working directory -- so can be potentially included in
>> a project source release
>> 2. the user home (specific location to be defined, maybe in ~/local)
>> [inherits from the previous one]
>> 3. the global
>
> You may want to study my site package PEP [1] regarding possible
> security implications. I recommend that you ignore the current working
> directory and user's home directory under conditions like different
> effective user or the -E option.

Sounds good, thanks
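To make the suggestion concrete, here is a minimal sketch of the lookup order with the two less-trusted locations skipped under the conditions Christian mentions. The function name and its parameters are hypothetical; nothing here is the actual sysconfig implementation.

```python
import os


def sysconfig_cfg_candidates(cwd, user_dir, global_dir,
                             ignore_environment=False,
                             uid=None, euid=None):
    """Return candidate sysconfig.cfg paths, most specific first.

    The working-directory and per-user files are skipped when -E style
    isolation is requested or when the real and effective user ids differ.
    """
    trusted = not ignore_environment and uid == euid
    dirs = [cwd, user_dir] if trusted else []
    dirs.append(global_dir)
    return [os.path.join(d, "sysconfig.cfg") for d in dirs]


normal = sysconfig_cfg_candidates("proj", "home", "glob", uid=1000, euid=1000)
isolated = sysconfig_cfg_candidates("proj", "home", "glob",
                                    ignore_environment=True,
                                    uid=1000, euid=1000)
```

In real use the caller would presumably pass `sys.flags.ignore_environment` and, on Unix, `os.getuid()` and `os.geteuid()`; they are plain parameters here so the logic can be tested in isolation.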


> A good place for a local sysconfig.cfg could be the user's stdlib
> directory (e.g. ~/.local/lib/python3.2/sysconfig.cfg).

Yes, so part of the upcoming packaging work will be to relocate all
user .cfg files to the same Python-specific place.

That includes pydistutils.cfg and pypirc. I remember we talked
about that a few months ago, and I will restart this discussion ASAP.


> Christian
>
> [1] http://www.python.org/dev/peps/pep-0370
>



-- 
Tarek Ziadé | http://ziade.org

From ethan at stoneleaf.us  Wed May 18 00:27:45 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 17 May 2011 15:27:45 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
Message-ID: <4DD2F661.2050005@stoneleaf.us>

The bytes type in Python 3 does not feel very consistent.

For example:

--> some_var = 'abcdef'
--> some_var
'abcdef'
--> some_var[3]
'd'
--> some_other_var = b'abcdef'
--> some_other_var
b'abcdef'
--> some_other_var[3]
100


On the one hand we have the 'bytes are ascii data' type interface, and 
on the other we have the 'bytes are a list of integers between 0 - 255'
interface.  And trying to use the two is not intuitive:

--> some_other_var[3] == b'd'
False

When I'm parsing a .dbf file and extracting field types from the byte 
stream, I'm not thinking, "okay, 67 is a Character field" -- what I'm 
thinking is, "b'C' is a Character field".
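A short illustration of the difference, and of one way to keep the "b'C' is a Character field" style when parsing. The field-type table and `field_type` helper are hypothetical, only loosely modelled on a .dbf header.

```python
data = b'abcdef'

# Indexing yields an int; slicing yields a bytes object of length 1.
as_int = data[3]          # the integer 100, i.e. ord('d')
as_bytes = data[3:4]      # b'd'

# Hypothetical one-byte field-type codes.
FIELD_TYPES = {b'C': 'Character', b'N': 'Numeric'}


def field_type(record, offset):
    """Look up the type code at `offset` by slicing, so the key stays bytes."""
    return FIELD_TYPES.get(record[offset:offset + 1], 'Unknown')


record = b'..C..'
kind = field_type(record, 2)
```

Slicing with `record[i:i + 1]` instead of indexing with `record[i]` keeps the comparison in terms of bytes literals, at the cost of creating a one-byte object.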

Considering that ord() still works fine, I'm not sure why it was done 
this way.

Is there code out there that is using this "list of int's" interface, or 
is there time to make changes to bytes?

~Ethan~

From benjamin at python.org  Wed May 18 01:04:51 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 17 May 2011 18:04:51 -0500
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD2F661.2050005@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
Message-ID: <BANLkTi=B4EmqRyfUMBFzQVLgWdEY9v7RZw@mail.gmail.com>

2011/5/17 Ethan Furman <ethan at stoneleaf.us>:
> Considering that ord() still works fine, I'm not sure why it was done this
> way.

I agree that this change was unfortunate and not too useful in practice.

>
> Is there code out there that is using this "list of int's" interface, or is
> there time to make changes to bytes?

I don't doubt there is, and I'm afraid it's far too late to change this.



-- 
Regards,
Benjamin

From raymond.hettinger at gmail.com  Wed May 18 01:05:00 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 17 May 2011 18:05:00 -0500
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD2F661.2050005@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
Message-ID: <F0C153BA-29BB-45D0-88F3-0EE5C21D1B19@gmail.com>


On May 17, 2011, at 5:27 PM, Ethan Furman wrote:

> The bytes type in Python 3 does not feel very consistent.
> 
> For example:
> 
> --> some_var = 'abcdef'
> --> some_var
> 'abcdef'
> --> some_var[3]
> 'd'
> --> some_other_var = b'abcdef'
> --> some_other_var
> b'abcdef'
> --> some_other_var[3]
> 100
> 
> 
> On the one hand we have the 'bytes are ascii data' type interface,

This is incidental.  Bytes can and often do contain data with non-ascii encoded text,  plain binary data, or structs, or raw data read off a disk, etc.

> and on the other we have the 'bytes are a list of integers between 0 - 255' interface.  And trying to use the two is not intuitive:
> 
> --> some_other_var[3] == b'd'
> False
> 
> When I'm parsing a .dbf file and extracting field types from the byte stream, I'm not thinking, "okay, 67 is a Character field" -- what I'm thinking is, "b'C' is a Character field".
> 
> Considering that ord() still works fine, I'm not sure why it was done this way.
> 
> Is there code out there that is using this "list of int's" interface,

Yes.

> or is there time to make changes to bytes?

No.


Raymond

From nad at acm.org  Wed May 18 01:25:16 2011
From: nad at acm.org (Ned Deily)
Date: Tue, 17 May 2011 16:25:16 -0700
Subject: [Python-Dev] "packaging" merge imminent
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
	<1305664835.29701.2.camel@marge>
	<BANLkTikjoaU=HoG3OsSOvJ5UzOWPTwT=vg@mail.gmail.com>
Message-ID: <nad-FF73E4.16251617052011@news.gmane.org>

In article <BANLkTikjoaU=HoG3OsSOvJ5UzOWPTwT=vg at mail.gmail.com>,
 Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> On Tue, May 17, 2011 at 10:40 PM, Victor Stinner
> <victor.stinner at haypocalc.com> wrote:
> > On Tuesday, 17 May 2011 at 17:36 +0200, Tarek Ziadé wrote:
> >> - addition of Lib/packaging
> >> - addition of test/test_packaging.py
> >> - changes in Lib/sysconfig.py
> >> - addition of Lib/sysconfig.cfg
> >
> > Does setup.py continue to use the "old" distutils module?
> 
> Yes. The plan is to keep distutils support, so projects with setup.py
> still work.

Just to be clear: what about for the build of the interpreter itself, 
i.e. its setup.py for the standard library extension modules?  Will the 
existing distutils code continue to be used for that?  Or is it being 
replaced by code in packaging?  If so, have Python builds been tested 
yet on the various platforms?

-- 
 Ned Deily,
 nad at acm.org


From ziade.tarek at gmail.com  Wed May 18 01:37:18 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Wed, 18 May 2011 01:37:18 +0200
Subject: [Python-Dev] "packaging" merge imminent
In-Reply-To: <nad-FF73E4.16251617052011@news.gmane.org>
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
	<1305664835.29701.2.camel@marge>
	<BANLkTikjoaU=HoG3OsSOvJ5UzOWPTwT=vg@mail.gmail.com>
	<nad-FF73E4.16251617052011@news.gmane.org>
Message-ID: <BANLkTikhNDSPPdeOce5KF18G+vbjcwqwEw@mail.gmail.com>

On Wed, May 18, 2011 at 1:25 AM, Ned Deily <nad at acm.org> wrote:
...
> Just to be clear: what about for the build of the interpreter itself,
> i.e. its setup.py for the standard library extension modules?  Will the
> existing distutils code continue to be used for that?  Or is it being
> replaced by code in packaging?  If so, have Python builds been tested
> yet on the various platforms?

It will remain distutils-based for now. Moving it to packaging is not
our top priority.


Cheers
Tarek
-- 
Tarek Ziadé | http://ziade.org

From nad at acm.org  Wed May 18 01:53:56 2011
From: nad at acm.org (Ned Deily)
Date: Tue, 17 May 2011 16:53:56 -0700
Subject: [Python-Dev] "packaging" merge imminent
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
	<1305664835.29701.2.camel@marge>
	<BANLkTikjoaU=HoG3OsSOvJ5UzOWPTwT=vg@mail.gmail.com>
	<nad-FF73E4.16251617052011@news.gmane.org>
	<BANLkTikhNDSPPdeOce5KF18G+vbjcwqwEw@mail.gmail.com>
Message-ID: <nad-86F423.16535617052011@news.gmane.org>

In article <BANLkTikhNDSPPdeOce5KF18G+vbjcwqwEw at mail.gmail.com>,
 Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> On Wed, May 18, 2011 at 1:25 AM, Ned Deily <nad at acm.org> wrote:
> > Just to be clear: what about for the build of the interpreter itself,
> > i.e. its setup.py for the standard library extension modules?  Will the
> > existing distutils code continue to be used for that? [...]
> It will remain distutils-based for now. Moving it to packaging is not
> our top priority.

+1.  Thanks!

-- 
 Ned Deily,
 nad at acm.org


From ncoghlan at gmail.com  Wed May 18 05:13:32 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 18 May 2011 13:13:32 +1000
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD2F661.2050005@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
Message-ID: <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>

On Wed, May 18, 2011 at 8:27 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
> On the one hand we have the 'bytes are ascii data' type interface, and on
> the other we have the 'bytes are a list of integers between 0 - 255'
> interface.

No. Bytes are a list of integers between 0-255. End of story. Using
them to represent text as well was precisely the problem with 2.x
8-bit strings, since the boundaries got blurred.

However, as a matter of practicality, many byte-oriented protocols use
ASCII to make elements of the protocol readable by humans. The
"text-like" elements of the bytes and bytearray types are a concession
to the existence of those protocols. However, that doesn't make them
text - they're still binary data streams. If you want to treat them as
text, convert them to "str" objects first (e.g. that's what
urllib.parse does internally in order to operate on bytes and
bytearray instances).
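As a small illustration of that round trip in Python 3 (the URL is a made-up example):

```python
from urllib.parse import urlparse

raw_url = b'http://example.com/path?q=1'

# urlparse accepts bytes and hands back bytes components.
parts = urlparse(raw_url)

# The explicit version of the same idea: decode, work on text, encode back.
text_parts = urlparse(raw_url.decode('ascii'))
netloc_bytes = text_parts.netloc.encode('ascii')
```

Decoding with 'ascii' raises UnicodeDecodeError on any byte above 127, so data that is not really ASCII text fails loudly instead of being silently mangled.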

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From robertc at robertcollins.net  Wed May 18 05:23:07 2011
From: robertc at robertcollins.net (Robert Collins)
Date: Wed, 18 May 2011 15:23:07 +1200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
Message-ID: <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>

On Wed, May 18, 2011 at 3:13 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Wed, May 18, 2011 at 8:27 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
>> On the one hand we have the 'bytes are ascii data' type interface, and on
>> the other we have the 'bytes are a list of integers between 0 - 255'
>> interface.
>
> No. Bytes are a list of integers between 0-255. End of story. Using
> them to represent text as well was precisely the problem with 2.x
> 8-bit strings, since the boundaries got blurred.
>
> However, as a matter of practicality, many byte-oriented protocols use
> ASCII to make elements of the protocol readable by humans. The
> "text-like" elements of the bytes and bytearray types are a concession
> to the existence of those protocols. However, that doesn't make them
> text - they're still binary data streams. If you want to treat them as
> text, convert them to "str" objects first (e.g. that's what
> urllib.parse does internally in order to operate on bytes and
> bytearray instances).

This is not a useful argument - it's an implementation choice in
Python 3, and urlparse converting bytes to 'str' to operate on them is
at best a kludge - you're forcing 5 times the storage (the original
bytes + 4 bytes-per-byte when it's decoded into unicode) to work on
something which is defined as a BNF * that uses ascii *.

The Python 2 confusion was deplorable, but it doesn't make the Python
3 situation better: it's different, but it's still very awkward for
people to write code in that is both correct and fast.

It's probably too late to change, but please don't try to argue that
it's correct: the continued confusion of folk running into this is
evidence that confusion *is happening*. Treat that as evidence and
think about how to fix it going forward.

_Rob

From ncoghlan at gmail.com  Wed May 18 05:40:14 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 18 May 2011 13:40:14 +1000
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
Message-ID: <BANLkTinBiOKZiTknmV5+jxMGbwG39E4Uuw@mail.gmail.com>

On Wed, May 18, 2011 at 1:23 PM, Robert Collins
<robertc at robertcollins.net> wrote:
> The Python 2 confusion was deplorable, but it doesn't make the Python
> 3 situation better: it's different, but still very awkward for people
> to write code that is correct and fast in.

When Python 3 goes wrong, it raises exceptions or executes the wrong
control flow. That's a vast improvement over silently corrupting the
data stream the way that 2.x does.

If it really bothers anyone, they should feel free to implement and
promote their own "ascii" data type on PyPI. If it is explicitly
restricted to 7 bit characters, it may even avoid many of the problems
of silent corruption that the 2.x str had. Speculation on python-dev
isn't going to be convincing here, though: only code in real use will
be effective on that front.

As far as the memory and runtime overhead goes, yes, that's a real
problem (indeed, that overhead is *why* bytes and bytearray have as
many str-like features as they do). PEP 393 is intended to at least
alleviate the memory burden of the Unicode text.
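For a rough feel of that overhead, the sizes can simply be measured. This sketch assumes CPython 3.3 or later, where PEP 393 is in effect; the exact numbers vary by version and platform, so none are quoted here.

```python
import sys

ascii_bytes = b'a' * 1000
ascii_text = ascii_bytes.decode('ascii')
wide_text = ascii_text + '\U0001F40D'   # one character outside the BMP

sizes = {
    'bytes': sys.getsizeof(ascii_bytes),
    'ascii str': sys.getsizeof(ascii_text),
    'str with non-BMP char': sys.getsizeof(wide_text),
}
for label, size in sizes.items():
    print('%-22s %d bytes' % (label, size))
```

With the flexible representation, a pure-ASCII str should cost roughly one byte per character, while a single non-BMP character pushes the whole string to four bytes per character.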

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From greg.ewing at canterbury.ac.nz  Wed May 18 07:39:40 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 18 May 2011 17:39:40 +1200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD2F661.2050005@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
Message-ID: <4DD35B9C.3030702@canterbury.ac.nz>

Ethan Furman wrote:

> On the one hand we have the 'bytes are ascii data' type interface, and 
> on the other we have the 'bytes are a list of integers between 0 - 255'
> interface.

I think the weird part is that there exists a literal for
writing a byte array as an ascii string, and furthermore
that it's the *only* kind of literal available for bytes.

Personally I think that the default literal syntax for
bytes, and also the form produced by repr(), should have
been something more neutral, such as hex, with the ascii
form available for use when it makes sense. Currently if
you want to write a bytes literal in hex, you have to
say something like

    some_var = b'\xde\xad\xbe\xef'

which is ugly and unreadable. Much nicer would be

    some_var = x'deadbeef'

As for

> --> some_other_var[3] == b'd'

there ought to be a literal for specifying an integer
using an ascii character, so you could say something like

   if some_other_var[3] == c'd':

which would be equivalent to

   if some_other_var[3] == ord(b'd'):

but without the overhead of computing the value each time
at run time.
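Since neither `x'...'` nor `c'...'` exists in the language, one workable approximation is to compute the integer once and reuse it. A minimal sketch, with hypothetical names:

```python
# Computed once, at import time, rather than on every comparison.
ORD_D = ord(b'd')

some_other_var = b'abcdef'


def fourth_byte_is_d(data):
    """Compare against the precomputed int instead of calling ord() each time."""
    return data[3] == ORD_D


result = fourth_byte_is_d(some_other_var)
```

`b'd'[0]` gives the same integer as `ord(b'd')`, so either spelling works for defining the constant.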

-- 
Greg

From greg.ewing at canterbury.ac.nz  Wed May 18 07:43:37 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 18 May 2011 17:43:37 +1200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
Message-ID: <4DD35C89.2030807@canterbury.ac.nz>

Robert Collins wrote:
> urlparse converting bytes to 'str' to operate on them is
> at best a kludge - you're forcing 5 times the storage (the original
> bytes + 4 bytes-per-byte when it's decoded into unicode)

That is itself an implementation detail of current Python,
though, due to it only having one internal representation of
unicode.

In principle there could be a form of str that keeps its
data encoded in latin1, in which case constructing it from
a byte string could simply involve storing a pointer to the
original bytes data.

-- 
Greg

From v+python at g.nevcal.com  Wed May 18 07:46:34 2011
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Tue, 17 May 2011 22:46:34 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD35B9C.3030702@canterbury.ac.nz>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>
	<4DD35B9C.3030702@canterbury.ac.nz>
Message-ID: <4DD35D3A.6050004@g.nevcal.com>

On 5/17/2011 10:39 PM, Greg Ewing wrote:
> Personally I think that the default literal syntax for
> bytes, and also the form produced by repr(), should have
> been something more neutral, such as hex, with the ascii
> form available for use when it makes sense.

> Much nicer would be
>
>    some_var = x'deadbeef'
>
> As for
>
>> --> some_other_var[3] == b'd'
>
> there ought to be a literal for specifying an integer
> using an ascii character, so you could say something like
>
>   if some_other_var[3] == c'd':
>
> which would be equivalent to
>
>   if some_other_var[3] == ord(b'd'):
>
> but without the overhead of computing the value each time
> at run time.

+1

Seems this could be added compatibly?

From chris at simplistix.co.uk  Wed May 18 07:51:43 2011
From: chris at simplistix.co.uk (Chris Withers)
Date: Wed, 18 May 2011 06:51:43 +0100
Subject: [Python-Dev] how do you find out what version of Python a PEP
	landed in?
Message-ID: <4DD35E6F.8030901@simplistix.co.uk>

Hi All,

A friend of mine is coming over to Python and asked a question I thought 
would have a better answer than it appears to:

How do I know which version of Python a PEP lands in?

I was expecting there to be a note at the bottom of the PEP, 342 in this 
case, but that doesn't appear to be the case.

What is the policy on this? Where should we be looking?

cheers,

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
            - http://www.simplistix.co.uk

From amauryfa at gmail.com  Wed May 18 08:00:08 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 18 May 2011 08:00:08 +0200
Subject: [Python-Dev] how do you find out what version of Python a PEP
 landed in?
In-Reply-To: <4DD35E6F.8030901@simplistix.co.uk>
References: <4DD35E6F.8030901@simplistix.co.uk>
Message-ID: <BANLkTi=bpk98TXuE2T-dPtsJ6wGKbQOzQA@mail.gmail.com>

Hi,

2011/5/18 Chris Withers <chris at simplistix.co.uk>:
> A friend of mine is coming over to Python and asked a question I thought
> would have a better answer than it appears to:
>
> How do I know which version of Python a PEP lands in?
>
> I was expecting there to be a note at the bottom of the PEP, 342 in this
> case, but that doesn't appear to be the case.
>
> What is the policy on this? Where should we be looking?

Normally PEPs are important enough to be mentioned in the "whatsnew"
document of each release.
Googling for "what's new pep 342" suggests that it was released with Python 2.5.

Now, an "official" way to get this information would probably be better...

-- 
Amaury Forgeot d'Arc

From techtonik at gmail.com  Wed May 18 08:07:19 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 18 May 2011 09:07:19 +0300
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <4DD2C37D.7000008@python.org>
References: <4DD2C37D.7000008@python.org>
Message-ID: <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>

That's great, but where is the list of changes?
--
anatoly t.



On Tue, May 17, 2011 at 9:50 PM, Georg Brandl <georg at python.org> wrote:
> On behalf of the Python development team, I am pleased to announce the
> first release candidate of Python 3.2.1.
>
> Python 3.2.1 will be the first bugfix release for Python 3.2, fixing over 120
> bugs and regressions in Python 3.2.
>
> For an extensive list of changes and features in the 3.2 line, see
>
>     http://docs.python.org/3.2/whatsnew/3.2.html
>
> To download Python 3.2.1 visit:
>
>     http://www.python.org/download/releases/3.2.1/
>
> This is a testing release: Please consider trying Python 3.2.1 with your code
> and reporting any bugs you may notice to:
>
>     http://bugs.python.org/
>
>
> Enjoy!
>
> --
> Georg Brandl, Release Manager
> georg at python.org
> (on behalf of the entire python-dev team and 3.2's contributors)

From amauryfa at gmail.com  Wed May 18 08:18:06 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 18 May 2011 08:18:06 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
Message-ID: <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>

Hi,

2011/5/18 anatoly techtonik <techtonik at gmail.com>:
> That's great, but where is the list of changes?

All changes are always listed in the Misc/NEWS file.
A "Change log" link on every download page displays this file.

-- 
Amaury Forgeot d'Arc

From martin at v.loewis.de  Wed May 18 08:24:29 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 May 2011 08:24:29 +0200
Subject: [Python-Dev] how do you find out what version of Python a PEP
 landed in?
In-Reply-To: <4DD35E6F.8030901@simplistix.co.uk>
References: <4DD35E6F.8030901@simplistix.co.uk>
Message-ID: <4DD3661D.30908@v.loewis.de>

> How do I know which version of Python a PEP lands in?

You should look at the Python-Version header of the PEP.

Regards,
Martin

From g.brandl at gmx.net  Wed May 18 08:31:28 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 May 2011 08:31:28 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD35B9C.3030702@canterbury.ac.nz>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz>
Message-ID: <iqvp43$et3$1@dough.gmane.org>

On 18.05.2011 07:39, Greg Ewing wrote:
> Ethan Furman wrote:
> 
>> On the one hand we have the 'bytes are ascii data' type interface, and 
>> on the other we have the 'bytes are a list of integers between 0 - 255'
>> interface.
> 
> I think the weird part is that there exists a literal for
> writing a byte array as an ascii string, and furthermore
> that it's the *only* kind of literal available for bytes.
> 
> Personally I think that the default literal syntax for
> bytes, and also the form produced by repr(), should have
> been something more neutral, such as hex, with the ascii
> form available for use when it makes sense. Currently if
> you want to write a bytes literal in hex, you have to
> say something like
> 
>     some_var = b'\xde\xad\xbe\xef'
> 
> which is ugly and unreadable. Much nicer would be
> 
>     some_var = x'deadbeef'

We do have

  bytes.fromhex('deadbeef')
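For comparison with the escaped literal quoted above, a brief sketch of the round trip:

```python
import binascii

some_var = bytes.fromhex('deadbeef')
same_value = b'\xde\xad\xbe\xef'

# fromhex ignores spaces between byte pairs, which helps with long constants.
spaced = bytes.fromhex('de ad be ef')

# Going the other way: hexlify returns the hex digits as a bytes object.
hex_digits = binascii.hexlify(some_var)
```

Unlike a literal, `bytes.fromhex` is an ordinary method call evaluated at run time, so a value used in a hot path is best bound to a name once.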

Georg


From martin at v.loewis.de  Wed May 18 08:32:17 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 May 2011 08:32:17 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD2F661.2050005@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
Message-ID: <4DD367F1.2050306@v.loewis.de>

> Is there code out there that is using this "list of int's" interface

Just in case this isn't clear yet: yes, certainly. Any non-trivial piece
of Python 3 code that has been written already (and there is some) will
have run into that issue.

Regards,
Martin

From martin at v.loewis.de  Wed May 18 08:34:07 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 May 2011 08:34:07 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
Message-ID: <4DD3685F.1040503@v.loewis.de>

>> That's great, but where is the list of changes?
> 
> All changes are always listed in the Misc/NEWS file.
> A "Change log" link on every download page displays this file.

I think it would be good if the release announcement made some
summary statement, though, like "NNN bugs have been fixed, in MMM
modules; see NEWS for details", or some such.

Regards,
Martin

From amauryfa at gmail.com  Wed May 18 08:38:17 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 18 May 2011 08:38:17 +0200
Subject: [Python-Dev] how do you find out what version of Python a PEP
 landed in?
In-Reply-To: <4DD3661D.30908@v.loewis.de>
References: <4DD35E6F.8030901@simplistix.co.uk>
	<4DD3661D.30908@v.loewis.de>
Message-ID: <BANLkTiny++ziwc_+UsOfWfiUD-dxM7KZxA@mail.gmail.com>

2011/5/18 "Martin v. Löwis" <martin at v.loewis.de>:
>> How do I know which version of Python a PEP lands in?
>
> You should look at the Python-Version header of the PEP.

But some PEPs don't have it: 341, 342, 343, 353...

-- 
Amaury Forgeot d'Arc

From ncoghlan at gmail.com  Wed May 18 08:39:54 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 18 May 2011 16:39:54 +1000
Subject: [Python-Dev] how do you find out what version of Python a PEP
 landed in?
In-Reply-To: <4DD3661D.30908@v.loewis.de>
References: <4DD35E6F.8030901@simplistix.co.uk>
	<4DD3661D.30908@v.loewis.de>
Message-ID: <BANLkTi=_uW6mo_H=i8YhEve8yexdK8z6mQ@mail.gmail.com>

On Wed, May 18, 2011 at 4:24 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> How do I know which version of Python a PEP lands in?
>
> You should look at the Python-Version header of the PEP.

Which is unfortunately missing from some PEPs (including PEP 342). PEP
344 shows where this information *should* be, though.
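Because the PEP preamble is a block of RFC 822 style headers, the field can be read mechanically when it is present. A minimal sketch; the sample text below is made up for illustration and is not copied from any real PEP.

```python
from email.parser import HeaderParser

# Illustrative header block only; the field values are invented.
SAMPLE_PEP = """\
PEP: 9999
Title: An Example Proposal
Status: Final
Type: Standards Track
Python-Version: 2.5

Abstract
========
"""


def pep_python_version(pep_text):
    """Return the Python-Version header value, or None if it is absent."""
    headers = HeaderParser().parsestr(pep_text)
    return headers.get('Python-Version')


version = pep_python_version(SAMPLE_PEP)
```

A `None` result corresponds to the case discussed here, where the header was never filled in.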

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From g.brandl at gmx.net  Wed May 18 09:57:45 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 May 2011 09:57:45 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <4DD3685F.1040503@v.loewis.de>
References: <4DD2C37D.7000008@python.org>	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de>
Message-ID: <iqvu5t$82s$1@dough.gmane.org>

On 18.05.2011 08:34, "Martin v. Löwis" wrote:
>>> That's great, but where is the list of changes?
>> 
>> All changes are always listed in the Misc/NEWS file.
>> A "Change log" link on every download page displays this file.
> 
> I think it would be good if the release announcement made some
> summary statement, though, like "NNN bugs have been fixed, in MMM
> modules; see NEWS for details", or some such.

It does say "over NNN bugs have been fixed", not sure if the MMM modules
add anything of value.

I agree that a link to the NEWS file should be present though.

Georg


From techtonik at gmail.com  Wed May 18 09:58:11 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 18 May 2011 10:58:11 +0300
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
Message-ID: <BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com>

On Wed, May 18, 2011 at 9:18 AM, Amaury Forgeot d'Arc
<amauryfa at gmail.com> wrote:
> Hi,
>
> 2011/5/18 anatoly techtonik <techtonik at gmail.com>:
>> That's great, but where is the list of changes?
>
> All changes are always listed in the Misc/NEWS file.
> A "Change log" link on every download page displays this file.

I actually followed http://docs.python.org/3.2/whatsnew/3.2.html to
Misc/NEWS, but it doesn't contain any references of 3.2.1
--
anatoly t.

From techtonik at gmail.com  Wed May 18 11:25:51 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 18 May 2011 12:25:51 +0300
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <4DD3685F.1040503@v.loewis.de>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de>
Message-ID: <BANLkTikVJy64DqYqwD5xSJgJRJ7aU19+cg@mail.gmail.com>

On Wed, May 18, 2011 at 9:34 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>>> That's great, but where is the list of changes?
>>
>> All changes are always listed in the Misc/NEWS file.
>> A "Change log" link on every download page displays this file.
>
> I think it would be good if the release announcement made some
> summary statement, though, like "NNN bugs have been fixed, in MMM
> modules; see NEWS for details", or some such.

That's a good idea. But to answer that kind of query, Roundup would
need to be module-aware [1,2]. If Jesse could run a competition for the
best announcement format, we could easily see what information we tend
to skip while preparing releases (and improve the NEWS format [3]).

[1] http://code.google.com/p/pydotorg/issues/detail?id=8
[2] http://psf.upfronthosting.co.za/roundup/meta/issue373
[3] https://convore.com/the-changelog/the-best-changelog/
--
anatoly t.

From ncoghlan at gmail.com  Wed May 18 12:40:09 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 18 May 2011 20:40:09 +1000
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com>
Message-ID: <BANLkTins+pavZbQGTziP-Ba+ba34Nu5EZg@mail.gmail.com>

On Wed, May 18, 2011 at 5:58 PM, anatoly techtonik <techtonik at gmail.com> wrote:
> On Wed, May 18, 2011 at 9:18 AM, Amaury Forgeot d'Arc
> <amauryfa at gmail.com> wrote:
>> Hi,
>>
>> 2011/5/18 anatoly techtonik <techtonik at gmail.com>:
>>> That's great, but where is the list of changes?
>>
>> All changes are always listed in the Misc/NEWS file.
>> A "Change log" link on every download page displays this file.
>
> I actually followed http://docs.python.org/3.2/whatsnew/3.2.html to
> Misc/NEWS, but it doesn't contain any references of 3.2.1

What's New and Misc/NEWS are not the same thing.

Misc/NEWS is the second info link on the download page ("Change log
for this release"). (In this case, it lands at
http://hg.python.org/releasing/3.2.1/file/v3.2.1rc1/Misc/NEWS)

Agreed that What's New isn't a hugely useful thing to link from a
point release announcement, though. It sounds like Georg is going to
change that for the actual release.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed May 18 12:49:08 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 18 May 2011 20:49:08 +1000
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <iqvu5t$82s$1@dough.gmane.org>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org>
Message-ID: <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com>

On Wed, May 18, 2011 at 5:57 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> On 18.05.2011 08:34, "Martin v. Löwis" wrote:
>>>> That's great, but where is the list of changes?
>>>
>>> All changes are always listed in the Misc/NEWS file.
>>> A "Change log" link on every download page displays this file.
>>
>> I think it would be good if the release announcement made some
>> summary statement, though, like "NNN bugs have been fixed, in MMM
>> modules; see NEWS for details", or some such.
>
> It does say "over NNN bugs have been fixed", not sure if the MMM modules
> add anything of value.
>
> I agree that a link to the NEWS file should be present though.

Wishlist item: How hard would it be to run a ReST parser over
Misc/NEWS and create a HTML version for inclusion in the release
pages? (Bonus points if it steals the issue reference linkification
code from the tracker...)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From techtonik at gmail.com  Wed May 18 12:50:03 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 18 May 2011 13:50:03 +0300
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTins+pavZbQGTziP-Ba+ba34Nu5EZg@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com>
	<BANLkTins+pavZbQGTziP-Ba+ba34Nu5EZg@mail.gmail.com>
Message-ID: <BANLkTi=XTL1WVFB8bZ0-BJ1KMvKGPEr9EA@mail.gmail.com>

On Wed, May 18, 2011 at 1:40 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Wed, May 18, 2011 at 5:58 PM, anatoly techtonik <techtonik at gmail.com> wrote:
>> On Wed, May 18, 2011 at 9:18 AM, Amaury Forgeot d'Arc
>> <amauryfa at gmail.com> wrote:
>>> Hi,
>>>
>>> 2011/5/18 anatoly techtonik <techtonik at gmail.com>:
>>>> That's great, but where is the list of changes?
>>>
>>> All changes are always listed in the Misc/NEWS file.
>>> A "Change log" link on every download page displays this file.
>>
>> I actually followed http://docs.python.org/3.2/whatsnew/3.2.html to
>> Misc/NEWS, but it doesn't contain any references of 3.2.1
>
> What's New and Misc/NEWS are not the same thing.

I believe you misunderstood. If you follow the What's New link above, you
will see a link to Misc/NEWS, but this one leads to
http://hg.python.org/cpython/file/default/Misc/NEWS where no
references to 3.2.1 are available.

> Agreed that What's New isn't a hugely useful thing to link from a
> point release announcement, though. It sounds like Georg is going to
> change that for the actual release.

There is nothing bad in linking to major release notes (i.e. What's
New). IIRC, Mozilla does that for their minor releases, but they
explicitly mention changes since last minor release.
--
anatoly t.

From g.brandl at gmx.net  Wed May 18 12:58:18 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 May 2011 12:58:18 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org>
	<BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com>
Message-ID: <ir08oe$5et$1@dough.gmane.org>

On 18.05.2011 12:49, Nick Coghlan wrote:
> On Wed, May 18, 2011 at 5:57 PM, Georg Brandl <g.brandl at gmx.net> wrote:
>> On 18.05.2011 08:34, "Martin v. Löwis" wrote:
>>>>> That's great, but where is the list of changes?
>>>>
>>>> All changes are always listed in the Misc/NEWS file.
>>>> A "Change log" link on every download page displays this file.
>>>
>>> I think it would be good if the release announcement made some
>>> summary statement, though, like "NNN bugs have been fixed, in MMM
>>> modules; see NEWS for details", or some such.
>>
>> It does say "over NNN bugs have been fixed", not sure if the MMM modules
>> add anything of value.
>>
>> I agree that a link to the NEWS file should be present though.
> 
> Wishlist item: How hard would it be to run a ReST parser over
> Misc/NEWS and create a HTML version for inclusion in the release
> pages? (Bonus points if it steals the issue reference linkification
> code from the tracker...)

See

http://dev.pocoo.org/~gbrandl/news.html

which I made as an experiment a while ago.

Georg


From ncoghlan at gmail.com  Wed May 18 13:04:15 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 18 May 2011 21:04:15 +1000
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <ir08oe$5et$1@dough.gmane.org>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org>
	<BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com>
	<ir08oe$5et$1@dough.gmane.org>
Message-ID: <BANLkTimY3r57cspcYp63Xk5GF+iX2pzFqw@mail.gmail.com>

On Wed, May 18, 2011 at 8:58 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> On 18.05.2011 12:49, Nick Coghlan wrote:
>> Wishlist item: How hard would it be to run a ReST parser over
>> Misc/NEWS and create a HTML version for inclusion in the release
>> pages? (Bonus points if it steals the issue reference linkification
>> code from the tracker...)
>
> See
>
> http://dev.pocoo.org/~gbrandl/news.html
>
> which I made as an experiment a while ago.

I quite like that! What would we need to do to make it part of the
docs build process?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From g.brandl at gmx.net  Wed May 18 12:59:55 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 May 2011 12:59:55 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTi=XTL1WVFB8bZ0-BJ1KMvKGPEr9EA@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com>
	<BANLkTins+pavZbQGTziP-Ba+ba34Nu5EZg@mail.gmail.com>
	<BANLkTi=XTL1WVFB8bZ0-BJ1KMvKGPEr9EA@mail.gmail.com>
Message-ID: <ir08re$5et$2@dough.gmane.org>

On 18.05.2011 12:50, anatoly techtonik wrote:
> On Wed, May 18, 2011 at 1:40 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> On Wed, May 18, 2011 at 5:58 PM, anatoly techtonik <techtonik at gmail.com> wrote:
>>> On Wed, May 18, 2011 at 9:18 AM, Amaury Forgeot d'Arc
>>> <amauryfa at gmail.com> wrote:
>>>> Hi,
>>>>
>>>> 2011/5/18 anatoly techtonik <techtonik at gmail.com>:
>>>>> That's great, but where is the list of changes?
>>>>
>>>> All changes are always listed in the Misc/NEWS file.
>>>> A "Change log" link on every download page displays this file.
>>>
>>> I actually followed http://docs.python.org/3.2/whatsnew/3.2.html to
>>> Misc/NEWS, but it doesn't contain any references of 3.2.1
>>
>> What's New and Misc/NEWS are not the same thing.
> 
> I believe you misunderstood. If you follow the What's New link above, you
> will see a link to Misc/NEWS, but this one leads to
> http://hg.python.org/cpython/file/default/Misc/NEWS where no
> references to 3.2.1 are available.

This link is wrong, it should point to /cpython/file/3.2/Misc/NEWS.

(But you'll still not see 3.2.1 changes until the 3.2.1 final release,
because the rc is made from a separate clone.)

Georg


From ncoghlan at gmail.com  Wed May 18 13:12:29 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 18 May 2011 21:12:29 +1000
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTi=XTL1WVFB8bZ0-BJ1KMvKGPEr9EA@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<BANLkTikYY2O-JJwECbq4URxmjLEXKPNXbw@mail.gmail.com>
	<BANLkTins+pavZbQGTziP-Ba+ba34Nu5EZg@mail.gmail.com>
	<BANLkTi=XTL1WVFB8bZ0-BJ1KMvKGPEr9EA@mail.gmail.com>
Message-ID: <BANLkTi=unD89XG2zKtRGpEvdzMwSKnAWKA@mail.gmail.com>

On Wed, May 18, 2011 at 8:50 PM, anatoly techtonik <techtonik at gmail.com> wrote:
> I believe you misunderstood. If you follow the What's New link above, you
> will see a link to Misc/NEWS, but this one leads to
> http://hg.python.org/cpython/file/default/Misc/NEWS where no
> references to 3.2.1 are available.

Ah, I see what you mean. That actually looks to be a bug in the
":source:" tag that generates the file links. It should really
generate version appropriate links, but it currently just always links
to "default". (This wasn't an issue until 3.2 was released and 3.3
development started. Older versions didn't have that tag, and hence
referenced the specific release directly).

The source code links in the module docs have the same problem (e.g.
see http://docs.python.org/3.2/library/functools)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at haypocalc.com  Wed May 18 13:26:40 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 18 May 2011 13:26:40 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <ir08oe$5et$1@dough.gmane.org>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org>
	<BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com>
	<ir08oe$5et$1@dough.gmane.org>
Message-ID: <1305718000.16682.0.camel@marge>

On Wednesday, 18 May 2011 at 12:58 +0200, Georg Brandl wrote:
> On 18.05.2011 12:49, Nick Coghlan wrote:
> > On Wed, May 18, 2011 at 5:57 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> >> On 18.05.2011 08:34, "Martin v. Löwis" wrote:
> >>>>> That's great, but where is the list of changes?
> >>>>
> >>>> All changes are always listed in the Misc/NEWS file.
> >>>> A "Change log" link on every download page displays this file.
> >>>
> >>> I think it would be good if the release announcement made some
> >>> summary statement, though, like "NNN bugs have been fixed, in MMM
> >>> modules; see NEWS for details", or some such.
> >>
> >> It does say "over NNN bugs have been fixed", not sure if the MMM modules
> >> add anything of value.
> >>
> >> I agree that a link to the NEWS file should be present though.
> > 
> > Wishlist item: How hard would it be to run a ReST parser over
> > Misc/NEWS and create a HTML version for inclusion in the release
> > pages? (Bonus points if it steals the issue reference linkification
> > code from the tracker...)
> 
> See
> 
> http://dev.pocoo.org/~gbrandl/news.html
> 
> which I made as an experiment a while ago.

Oh, I like it. But the output should be reST to be able to include it
directly in the Python documentation. Sphinx would generate a new table
of contents with links to each release.

Victor


From orsenthil at gmail.com  Wed May 18 13:33:51 2011
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Wed, 18 May 2011 19:33:51 +0800
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <1305718000.16682.0.camel@marge>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org>
	<BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com>
	<ir08oe$5et$1@dough.gmane.org> <1305718000.16682.0.camel@marge>
Message-ID: <20110518113351.GA3199@kevin>

On Wed, May 18, 2011 at 01:26:40PM +0200, Victor Stinner wrote:
> > http://dev.pocoo.org/~gbrandl/news.html
> > 
> Oh, I like it. But the output should be reST to be able to include it
> directly in the Python documentation. Sphinx would generate a new table

Interesting ideas! It would be really useful too.
+1

-- 
Senthil

From g.brandl at gmx.net  Wed May 18 13:35:51 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 May 2011 13:35:51 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <1305718000.16682.0.camel@marge>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org>
	<BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com>
	<ir08oe$5et$1@dough.gmane.org> <1305718000.16682.0.camel@marge>
Message-ID: <ir0aur$hl4$1@dough.gmane.org>

On 18.05.2011 13:26, Victor Stinner wrote:

>> See
>> 
>> http://dev.pocoo.org/~gbrandl/news.html
>> 
>> which I made as an experiment a while ago.
> 
> Oh, I like it. But the output should be reST to be able to include it
> directly in the Python documentation. Sphinx would generate a new table
> of contents with links to each release.

The output of processing reST should be reST?  Now I'm confused.

Georg


From techtonik at gmail.com  Wed May 18 14:01:17 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 18 May 2011 15:01:17 +0300
Subject: [Python-Dev] Inconsistent case in directory names for installed
	Python on Windows
Message-ID: <BANLkTinxQU=4AeEncfai1K5fdA3sFL6Zhg@mail.gmail.com>

Greetings,

While studying `virtualenv` code I've noticed that in Python directory
tree `include`, `libs` and `tcl` are lowercased while other dirs are
capitalized. It doesn't seem important (especially for developers
here), but it still can leave an unpleasant image for people new to
Python (and programming in general).

[Python27]
├─DLLs
├─Doc
├─include
├─Lib
├─libs
├─Scripts
├─tcl
└─Tools

How about making a consistent lowercased or uppercased scheme? Windows
filesystems are case-insensitive, so the change shouldn't affect
anybody. Another candidate for normalization is Tools/Scripts dir,
which I'd lowercase FWIW:

Tools
├─i18n
├─pynche
├─Scripts
├─versioncheck
└─webchecker


Lowercased dirs on the top level seem to contain files that are
relevant to C developers only. However, I can not say for sure. It
seems that there could be a better place for them like top level
directory named Dev or C-API.
--
anatoly t.

From victor.stinner at haypocalc.com  Wed May 18 14:06:07 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 18 May 2011 14:06:07 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <ir0aur$hl4$1@dough.gmane.org>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org>
	<BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com>
	<ir08oe$5et$1@dough.gmane.org> <1305718000.16682.0.camel@marge>
	<ir0aur$hl4$1@dough.gmane.org>
Message-ID: <1305720367.16682.2.camel@marge>

On Wednesday, 18 May 2011 at 13:35 +0200, Georg Brandl wrote:
> On 18.05.2011 13:26, Victor Stinner wrote:
> 
> >> See
> >> 
> >> http://dev.pocoo.org/~gbrandl/news.html
> >> 
> >> which I made as an experiment a while ago.
> > 
> > Oh, I like it. But the output should be reST to be able to include it
> > directly in the Python documentation. Sphinx would generate a new table
> > of contents with links to each release.
> 
> The output of processing reST should be reST?  Now I'm confused.

Misc/NEWS is already formatted to reST? It doesn't contain any link (to
the issues). We may replace "Issue #xxx" by :issue:`xxx` (directly in
Misc/NEWS) to simplify the process? And maybe move Misc/NEWS to Doc?

http://dev.pocoo.org/~gbrandl/news.html is an HTML document.

Victor


From g.brandl at gmx.net  Wed May 18 14:17:16 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 May 2011 14:17:16 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <1305720367.16682.2.camel@marge>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org>
	<BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com>
	<ir08oe$5et$1@dough.gmane.org> <1305718000.16682.0.camel@marge>
	<ir0aur$hl4$1@dough.gmane.org> <1305720367.16682.2.camel@marge>
Message-ID: <ir0dcg$vok$1@dough.gmane.org>

On 18.05.2011 14:06, Victor Stinner wrote:
> On Wednesday, 18 May 2011 at 13:35 +0200, Georg Brandl wrote:
>> On 18.05.2011 13:26, Victor Stinner wrote:
>> 
>> >> See
>> >> 
>> >> http://dev.pocoo.org/~gbrandl/news.html
>> >> 
>> >> which I made as an experiment a while ago.
>> > 
>> > Oh, I like it. But the output should be reST to be able to include it
>> > directly in the Python documentation. Sphinx would generate a new table
>> > of contents with links to each release.
>> 
>> The output of processing reST should be reST?  Now I'm confused.
> 
> Misc/NEWS is already formatted to reST?

Yes, it is.

> It doesn't contain any link (to
> the issues). We may replace "Issue #xxx" by :issue:`xxx` (directly in
> Misc/NEWS) to simplify the process?

Replacing the issue links is the only preprocessing that I did.
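
The script used for that preprocessing is not shown in this thread. As a
rough sketch only, the substitution of "Issue #xxx" by :issue:`xxx` could
look like the following. The helper name linkify_issues, the regular
expression, and the sample NEWS entry are all made up for illustration.

```python
import re

# Hypothetical pattern: a plain "Issue #NNNN" mention as found in NEWS entries.
ISSUE_RE = re.compile(r"Issue #(\d+)")


def linkify_issues(text):
    """Rewrite every "Issue #NNNN" mention as an :issue:`NNNN` role."""
    return ISSUE_RE.sub(r":issue:`\1`", text)


# Made-up sample entry, not taken from the real Misc/NEWS.
sample = "- Issue #1234: Fix a crash in the frobnicator."
print(linkify_issues(sample))
# prints: - :issue:`1234`: Fix a crash in the frobnicator.
```

Text without an issue mention passes through unchanged, so such a filter
could be run over the whole file before handing it to the reST tools.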

> And maybe move Misc/NEWS to Doc?

I don't think people would like that :)

> http://dev.pocoo.org/~gbrandl/news.html is an HTML document.

As the file name says :)

Georg


From victor.stinner at haypocalc.com  Wed May 18 14:21:55 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 18 May 2011 14:21:55 +0200
Subject: [Python-Dev] Don't set local variable in a list comprehension or
	generator
Message-ID: <1305721315.16682.10.camel@marge>

Hi,

''.join(c for c in 'abc') and ''.join([c for c in 'abc']) do create a
temporary c variable. In this case, the variable is useless and requires
two opcodes: STORE_FAST(c), LOAD_FAST(c). The variable is not available
outside the list comprehension/generator.
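
One way to see the store/load pair is to disassemble the expression with
the dis module. This is only a sketch: the helper opnames is a name
invented here, and exact opcode names differ between CPython versions
(newer releases may fuse or rename the FAST opcodes), so the filter
matches on substrings rather than exact names.

```python
import dis
import types


def opnames(code):
    """Yield opcode names from a code object and any code objects nested in it."""
    for instruction in dis.get_instructions(code):
        yield instruction.opname
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from opnames(const)


# The generator expression is compiled to its own nested code object.
expr = compile("''.join(c for c in 'abc')", "<demo>", "eval")
names = list(opnames(expr))

# Keep only the opcodes that store to or load from fast local variables.
fast_ops = [name for name in names if "STORE_FAST" in name or "LOAD_FAST" in name]
print(fast_ops)
```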

I would like to remove the variable in these cases to speed up
(micro-optimize!) Python.

Removing the variable breaks code using introspection like:

   list([locals()['x'] for x in range(3)])

We may detect the usage of introspection (I don't know how hard it is),
only optimize trivial cases (like "x for x in ..."), or only optimize
when Python is running in optimize mode (python -O or python -OO).

What do you think? Is it useless and/or stupid?

I heard about optimization in the AST tree instead of working on the
bytecode. What is the status of this project?

Victor


From brian.curtin at gmail.com  Wed May 18 14:47:27 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Wed, 18 May 2011 07:47:27 -0500
Subject: [Python-Dev] Inconsistent case in directory names for installed
 Python on Windows
Message-ID: <BANLkTik_U_k3+xAPhPjvYj9n=zC58SdkwQ@mail.gmail.com>

On May 18, 2011 7:03 AM, "anatoly techtonik" <techtonik at gmail.com> wrote:
>
> Greetings,
>
> While studying `virtualenv` code I've noticed that in Python directory
> tree `include`, `libs` and `tcl` are lowercased while other dirs are
> capitalized. It doesn't seem important (especially for developers
> here), but it still can leave an unpleasant image for people new to
> Python (and programming in general).

In theory there are probably a lot of things that might seem unpleasant but
are actually non-issues. I don't believe there have been any complaints
about actual unpleasantries with directory case.

>
> [Python27]
> ├─DLLs
> ├─Doc
> ├─include
> ├─Lib
> ├─libs
> ├─Scripts
> ├─tcl
> └─Tools
>
> How about making a consistent lowercased or uppercased scheme? Windows
> filesystems are case-insensitive, so the change shouldn't affect
> anybody.

Some Macs have case-sensitive file systems, and some people use
case-sensitive file systems on various flavors of UNIX. The change would
probably require a thorough look through the build chain.

> Another candidate for
> normalization is Tools/Scripts dir,
> which I'd lowercase FWIW:
>
> Tools
> ├─i18n
> ├─pynche
> ├─Scripts
> ├─versioncheck
> └─webchecker
>
>
> Lowercased dirs on the top level seem to contain files that are
> relevant to C developers only. However, I can not say for sure. It
> seems that there could be a better place for them like top level
> directory named Dev or C-API.
> --
> anatoly t.

Overall I think it boils down to a cosmetic change that I'm not sure we need
to make, which could unnecessarily break people's work. -1

From benjamin at python.org  Wed May 18 15:41:48 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Wed, 18 May 2011 08:41:48 -0500
Subject: [Python-Dev] Don't set local variable in a list comprehension
	or generator
In-Reply-To: <1305721315.16682.10.camel@marge>
References: <1305721315.16682.10.camel@marge>
Message-ID: <BANLkTingy7NHxP6w5j5OpghNwdOcO6Zw=w@mail.gmail.com>

2011/5/18 Victor Stinner <victor.stinner at haypocalc.com>:
> Hi,
>
> ''.join(c for c in 'abc') and ''.join([c for c in 'abc']) do create a
> temporary c variable. In this case, the variable is useless and requires
> two opcodes: STORE_FAST(c), LOAD_FAST(c). The variable is not available
> outside the list comprehension/generator.
>
> I would like to remove the variable in these cases to speed up
> (micro-optimize!) Python.
>
> Removing the variable breaks code using introspection like:
>
>    list([locals()['x'] for x in range(3)])
>
> We may detect the usage of introspection (I don't know how hard it is),
> only optimize trivial cases (like "x for x in ..."), or only optimize
> when Python is running in optimize mode (python -O or python -OO).
>
> What do you think? Is it useless and/or stupid?

Far more useful would be figuring out how to remove the call.



-- 
Regards,
Benjamin

From ncoghlan at gmail.com  Wed May 18 16:05:37 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 19 May 2011 00:05:37 +1000
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <1305718000.16682.0.camel@marge>
References: <4DD2C37D.7000008@python.org>
	<BANLkTimoo-8QUQqBx3C8R2zG4Mr1NQ6q8g@mail.gmail.com>
	<BANLkTik58_QM=uAd-suvjiSnOBzZBmajkg@mail.gmail.com>
	<4DD3685F.1040503@v.loewis.de> <iqvu5t$82s$1@dough.gmane.org>
	<BANLkTimZqGr0JqCrrgBaj1hx8pBELPxErw@mail.gmail.com>
	<ir08oe$5et$1@dough.gmane.org> <1305718000.16682.0.camel@marge>
Message-ID: <BANLkTiniDyExGSiJW6N=Tqs53MzMh=zjeQ@mail.gmail.com>

On Wed, May 18, 2011 at 9:26 PM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
>
> Oh, I like it. But the output should be reST to be able to include it
> directly in the Python documentation. Sphinx would generate a new table
> of contents with links to each release.

As Georg noted, Misc/NEWS is already ReST. My proposal was essentially
to add an extra step to the docs build process that invoked the same
commands that Georg used to generate the sample version (with
appropriate additions to Doc/tools as needed to make that work).

The generated NEWS.html file could easily live inside the whatsnew
directory alongside the actual What's New document.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From benjamin at python.org  Wed May 18 16:12:34 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Wed, 18 May 2011 09:12:34 -0500
Subject: [Python-Dev] 2.7.2 and 3.1.4
Message-ID: <BANLkTikhNOA1LpzJgNX2CEeav87LgowxHg@mail.gmail.com>

It's time to continue 2.7.* point releases with 2.7.2 and finish off
3.1.* with 3.1.4. I plan to do a RC for both on May 28th and a final
on June 11th.

-- 
Regards,
Benjamin

From ncoghlan at gmail.com  Wed May 18 16:17:28 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 19 May 2011 00:17:28 +1000
Subject: [Python-Dev] Don't set local variable in a list comprehension
	or generator
In-Reply-To: <1305721315.16682.10.camel@marge>
References: <1305721315.16682.10.camel@marge>
Message-ID: <BANLkTikroG0Vkx545Spy-jEiiR+1qeEkDA@mail.gmail.com>

On Wed, May 18, 2011 at 10:21 PM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> What do you think? Is it useless and/or stupid?

I wouldn't call it useless or stupid - merely "lost in the noise". In
small cases, I expect it would be swamped completely by the high fixed
overhead of entering the new scope. In all generator expressions, I
expect it would be swamped by the cost of resuming the generator on
each iteration. Even for comprehensions, any time spent on the
unneeded variable assignment is likely still going to be dominated by
the __next__() call overhead.

> I heard about optimization in the AST tree instead of working on the
> bytecode. What is the status of this project?

First step is getting back to Eugene Toder's AST cleanup patch and
working on getting that in. It's a big patch though, and I'd like to
see it broken up into a couple of distinct phases before we proceed.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From nadeem.vawda at gmail.com  Wed May 18 16:19:59 2011
From: nadeem.vawda at gmail.com (Nadeem Vawda)
Date: Wed, 18 May 2011 16:19:59 +0200
Subject: [Python-Dev] Don't set local variable in a list comprehension
	or generator
In-Reply-To: <1305721315.16682.10.camel@marge>
References: <1305721315.16682.10.camel@marge>
Message-ID: <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>

On Wed, May 18, 2011 at 2:21 PM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> ''.join(c for c in 'abc') and ''.join([c for c in 'abc']) do create a
> temporary c variable.

I'm not sure why you would encounter code like that in the first place.
Surely any code of the form:

    ''.join(c for c in my_string)

would just return my_string? Or am I missing something?

> I heard about optimization in the AST tree instead of working on the
> bytecode. What is the status of this project?

Are you referring to issue11549? There was some related discussion [1] on
python-dev about six weeks ago, but I haven't seen anything on the topic
since then.

Cheers,
Nadeem

[1] http://mail.python.org/pipermail/python-dev/2011-April/110399.html

From janssen at parc.com  Wed May 18 16:59:58 2011
From: janssen at parc.com (Bill Janssen)
Date: Wed, 18 May 2011 07:59:58 PDT
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <iqvp43$et3$1@dough.gmane.org>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz>
	<iqvp43$et3$1@dough.gmane.org>
Message-ID: <86793.1305730798@parc.com>

Georg Brandl <g.brandl at gmx.net> wrote:

> We do have
> 
>   bytes.fromhex('deadbeef')

Sort of reminds me of Java's Integer.parseInt(), and not in a good way.
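
For readers who have not used it, a small sketch of what the quoted call
does in Python 3 (the variable name raw is just for this example):

```python
# Each pair of hex digits in the string becomes one byte.
raw = bytes.fromhex('deadbeef')

print(raw)        # b'\xde\xad\xbe\xef'
print(list(raw))  # [222, 173, 190, 239]
```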

Bill

From ethan at stoneleaf.us  Wed May 18 17:57:46 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 18 May 2011 08:57:46 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD35B9C.3030702@canterbury.ac.nz>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>
	<4DD35B9C.3030702@canterbury.ac.nz>
Message-ID: <4DD3EC7A.8070801@stoneleaf.us>

Greg Ewing wrote:
> Ethan Furman wrote:
> 
>> On the one hand we have the 'bytes are ascii data' type interface, and 
>> on the other we have the 'bytes are a list of integers between 0 - 
>> 255' interface.
> 
> I think the weird part is that there exists a literal for
> writing a byte array as an ascii string, and furthermore
> that it's the *only* kind of literal available for bytes.

That is the point I was trying to make -- thank you for stating it more 
clearly than I managed to.  :)


> Personally I think that the default literal syntax for
> bytes, and also the form produced by repr(), should have
> been something more neutral, such as hex,

Agreed.  It is surprising to extract an element out of bytes, and not 
end up with bytes, but with an int -- if the repr used something besides 
the plain ascii representation, this would not be an expectation.  For 
comparison, when one extracts an element out of a str one gets a str -- 
not the int representing the unicode code point.
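
A minimal sketch of that difference, assuming a Python 3 interpreter
(the variable names are only for illustration):

```python
data = b'abc'
text = 'abc'

# Indexing bytes yields an int; indexing str yields a length-1 str.
print(type(data[1]), data[1])
print(type(text[1]), text[1])

# Slicing keeps the container type in both cases.
print(type(data[1:2]), data[1:2])
print(type(text[1:2]), text[1:2])
```

So the repr of bytes looks like text, while element access behaves like
a sequence of small integers.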

> with the ascii form available for use when it makes sense.
> 
> As for
> 
>> --> some_other_var[3] == b'd'
> 
> there ought to be a literal for specifying an integer
> using an ascii character, so you could say something like
> 
>   if some_other_var[3] == c'd':
> 
> which would be equivalent to
> 
>   if some_other_var[3] == ord(b'd')
> 
> but without the overhead of computing the value each time
> at run time.

Given that we can't change the behavior of b'abc'[1], that would be 
better than what we have.

+1

~Ethan~


From stephen at xemacs.org  Wed May 18 18:16:44 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 19 May 2011 01:16:44 +0900
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
Message-ID: <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>

Robert Collins writes:

 > It's probably too late to change, but please don't try to argue that
 > it's correct: the continued confusion of folk running into this is
 > evidence that confusion *is happening*. Treat that as evidence and
 > think about how to fix it going forward.

Sorry, Rob, but you're just wrong here, and Nick is right.  It's
possible to improve Python 3, but not to "fix" it in this respect.
The Python 3 solution is correct, the Python 2 approach is not.
There's no way to avoid discontinuity and confusion here.

Confusion is indeed happening, but it's real confusion in the way
people think about the problem space, not a language design cockup.
The problem can't be solved by embedding ASCII in Unicode, because
non-ASCII bytes don't have a canonical embedding in Unicode.  I.e., the
situation is inherently confusing.  You can't wish it away, you can
only choose to impose more or less of it on particular constituencies.
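
As a small illustration of "no canonical embedding" (a sketch only; the
three codecs are just examples picked from the standard library):

```python
raw = b'\xe9'          # a single non-ASCII byte

# The same byte decodes to different characters under different encodings.
as_latin1 = raw.decode('latin-1')   # U+00E9
as_cp1251 = raw.decode('cp1251')    # U+0439

# On its own it is not valid UTF-8 at all.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ASCII bytes, by contrast, decode identically under all three.
assert b'A'.decode('latin-1') == b'A'.decode('cp1251') == b'A'.decode('utf-8')
print(ascii(as_latin1), ascii(as_cp1251), utf8_ok)
```

Which of those readings is "right" depends entirely on information that
is not in the byte itself.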

Now, it's quite possible that there are other correct approaches that
allow straightforward manipulation of non-ASCII text, but I don't know
what they are, and I don't know anybody else who does.



From merwok at netwok.org  Wed May 18 18:47:49 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Wed, 18 May 2011 18:47:49 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Skip some more tests in
 the absence of threading.
In-Reply-To: <E1QMDZY-0002xi-1W@dinsdale.python.org>
References: <E1QMDZY-0002xi-1W@dinsdale.python.org>
Message-ID: <4DD3F835.6070609@netwok.org>

Hi,

> http://hg.python.org/cpython/rev/c83fb59b73ea
> user:        Vinay Sajip <vinay_sajip at yahoo.co.uk>
> date:        Tue May 17 07:15:53 2011 +0100
> summary:
>   Skip some more tests in the absence of threading

> diff --git a/Lib/test/test_logging.py b/Lib/test/test_logging.py
> --- a/Lib/test/test_logging.py
> +++ b/Lib/test/test_logging.py
>  try:
>      import threading
> +    # The following imports are needed only for tests which
> +    import asynchat
I guess "for tests which use threading"?

> +if threading:
> +    class TestSMTPChannel(smtpd.SMTPChannel):
I wonder if you could have saved yourself all this reindenting if your
import had fallen back to dummy_threading.

> +@unittest.skipUnless(threading, 'Threading required for this test.')
I'd have used lower-case threading, to make it a bit clearer that it's
the threading module that's required, not some abstract notion of
threading.  But they may be the same thing, I'm not sure.
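
For reference, the pattern under discussion looks roughly like this.
This is a sketch, not the code from the commit: ThreadedTest and its
test method are invented names, and the lower-case message is the
spelling suggested above.

```python
import unittest

try:
    import threading
except ImportError:        # interpreter built without thread support
    threading = None

# The whole class is skipped when the module could not be imported.
@unittest.skipUnless(threading, 'threading required for this test.')
class ThreadedTest(unittest.TestCase):
    def test_thread_runs(self):
        results = []
        worker = threading.Thread(target=results.append, args=(1,))
        worker.start()
        worker.join()
        self.assertEqual(results, [1])
```

Run under "python -m unittest", the class is reported as skipped rather
than failing when threading is unavailable.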

Regards

From merwok at netwok.org  Wed May 18 18:51:18 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Wed, 18 May 2011 18:51:18 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Skip some tests in the
 absence of multiprocessing.
In-Reply-To: <E1QMDyC-0004jq-1J@dinsdale.python.org>
References: <E1QMDyC-0004jq-1J@dinsdale.python.org>
Message-ID: <4DD3F906.2080100@netwok.org>

Hi again,

> http://hg.python.org/cpython/rev/4b7c29201c60
> user:        Vinay Sajip <vinay_sajip at yahoo.co.uk>
> summary:
>   Skip some tests in the absence of multiprocessing.

> +    @unittest.skipUnless(threading, 'Threading required for this test.')
Who wins, the commit message or the code? :)

> +        try:
> +            import multiprocessing as mp
> +            r = logging.makeLogRecord({})
> +            self.assertEqual(r.processName, mp.current_process().name)
> +        except ImportError:
> +            pass
Isn't support.import_module or somesuch useful for this kind of check?
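
Something along these lines, perhaps. The helper below is a hypothetical
stand-in written for illustration (import_or_skip is an invented name,
not the test.support API), assuming the goal is a reported skip instead
of a silent pass:

```python
import importlib
import logging
import unittest

def import_or_skip(name):
    """Return the named module, or raise unittest.SkipTest if it is missing.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise unittest.SkipTest(str(exc))

class ProcessNameTest(unittest.TestCase):
    def test_process_name(self):
        # Skips (visibly) instead of passing silently when the import fails.
        mp = import_or_skip('multiprocessing')
        r = logging.makeLogRecord({})
        self.assertEqual(r.processName, mp.current_process().name)
```

The difference from the try/except in the diff is that a missing module
shows up in the test report as a skip.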

Regards

From rdmurray at bitdance.com  Wed May 18 19:10:17 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Wed, 18 May 2011 13:10:17 -0400
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20110518171018.807E1250045@webabinitio.net>

On Thu, 19 May 2011 01:16:44 +0900, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> Robert Collins writes:
> 
>  > It's probably too late to change, but please don't try to argue that
>  > it's correct: the continued confusion of folk running into this is
>  > evidence that confusion *is happening*. Treat that as evidence and
>  > think about how to fix it going forward.
> 
> Sorry, Rob, but you're just wrong here, and Nick is right.  It's
> possible to improve Python 3, but not to "fix" it in this respect.
> The Python 3 solution is correct, the Python 2 approach is not.
> There's no way to avoid discontinuity and confusion here.
> 
> Confusion is indeed happening, but it's real confusion in the way
> people think about the problem space, not a language design cockup.

Note that the more common idiom (not that I can measure it, mind)
when dealing with byte strings is something analogous to

    if my_byte_string[i:i+1] == b'x':

rather than

    if my_byte_string[i] == 170:

and the former is a lot more readable than the latter, even though
you have to stare at the slice for a couple seconds the first time
you encounter it to realize what is going on.
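
A quick sketch of why the slice spelling is attractive (has_byte_at is
a made-up helper name, used only for this example):

```python
def has_byte_at(data, i, byte):
    """True if the one-byte slice data[i:i+1] equals `byte`."""
    return data[i:i+1] == byte

data = b'xyz'

first_is_x = has_byte_at(data, 0, b'x')     # bytes compared with bytes
past_the_end = has_byte_at(data, 3, b'x')   # slice is b'', no IndexError
index_vs_bytes = (data[0] == b'x')          # int compared with bytes

print(first_is_x, past_the_end, index_vs_bytes)
```

The slice also sidesteps the bounds check that plain indexing needs at
the end of the buffer.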

So *something* is wrong with Python3's approach.  Python2 was wronger,
though :)

--David

From merwok at netwok.org  Wed May 18 19:46:16 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Wed, 18 May 2011 19:46:16 +0200
Subject: [Python-Dev] "packaging" merge imminent
In-Reply-To: <4DD2A593.90203@cheimes.de>
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
	<4DD2A593.90203@cheimes.de>
Message-ID: <4DD405E8.1090401@netwok.org>

On 17/05/2011 18:42, Christian Heimes wrote:
> A good place for a local sysconfig.cfg could be the user's stdlib
> directory (e.g. ~/.local/lib/python3.2/sysconfig.cfg).

I don't think so.  See http://bugs.python.org/issue7175 and
http://mail.python.org/pipermail/python-dev/2010-August/103011.html

Regards

From merwok at netwok.org  Wed May 18 19:48:25 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Wed, 18 May 2011 19:48:25 +0200
Subject: [Python-Dev] "packaging" merge imminent
In-Reply-To: <1305664835.29701.2.camel@marge>
References: <BANLkTimcbu6_tSD=KrKoG=nvHrUWzS_McQ@mail.gmail.com>
	<1305664835.29701.2.camel@marge>
Message-ID: <4DD40669.7000904@netwok.org>

> I fixed recently some bugs in distutils. Should I also fix them in the
> packaging module, or are both modules already "synchronized"?

I ported some fixes, especially in sysconfig; for distutils, I have a
number of them marked for backport in the bug tracker (distutils2
component) or in personal bookmarks.  There are not very many.

Cheers

From ethan at stoneleaf.us  Wed May 18 20:51:54 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 18 May 2011 11:51:54 -0700
Subject: [Python-Dev] Equality testing
Message-ID: <4DD4154A.3080603@stoneleaf.us>

In Python 3 inequality comparisons became forbidden.

--> 123 < [1, 2, 3]
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: unorderable types: int() < list()

However, equality comparisons are still allowed

--> 123 == [1, 2, 3]
False

But you can't mix them (inequality wins)

--> 123 <= [1, 2, 3]
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: unorderable types: int() <= list()

I realize this is probably a Py4000 change if it happens at all, but 
does this make sense?  Shouldn't an attempt to compare two unlike objects 
be a TypeError, just like trying to order them is?

It bit me when I tried to compare a byte string element with a single 
character byte string (of course they should have matched, but since the 
element was an int, the match was no longer True).

~Ethan~

From hagen at zhuliguan.net  Wed May 18 20:39:58 2011
From: hagen at zhuliguan.net (=?ISO-8859-1?Q?Hagen_F=FCrstenau?=)
Date: Wed, 18 May 2011 20:39:58 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <4DD2C37D.7000008@python.org>
References: <4DD2C37D.7000008@python.org>
Message-ID: <4DD4127E.6050301@zhuliguan.net>

> On behalf of the Python development team, I am pleased to announce the
> first release candidate of Python 3.2.1.

Shouldn't there be a tag "v3.2.1rc1" in the hg repo?

Cheers,
Hagen


From martin at v.loewis.de  Wed May 18 20:52:17 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 May 2011 20:52:17 +0200
Subject: [Python-Dev] how do you find out what version of Python a PEP
 landed in?
In-Reply-To: <BANLkTiny++ziwc_+UsOfWfiUD-dxM7KZxA@mail.gmail.com>
References: <4DD35E6F.8030901@simplistix.co.uk>	<4DD3661D.30908@v.loewis.de>
	<BANLkTiny++ziwc_+UsOfWfiUD-dxM7KZxA@mail.gmail.com>
Message-ID: <4DD41561.6090305@v.loewis.de>

On 18.05.2011 08:38, Amaury Forgeot d'Arc wrote:
> 2011/5/18 "Martin v. Löwis" <martin at v.loewis.de>:
>>> How do I know which version of Python a PEP lands in?
>>
>> You should look at the Python-Version header of the PEP.
> 
> But some PEPs don't have it: 341, 342, 343, 353...

In these cases, the respective authors (or somebody else
who cares) should add it.

Regards,
Martin

From martin at v.loewis.de  Wed May 18 21:06:09 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 May 2011 21:06:09 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <20110518171018.807E1250045@webabinitio.net>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20110518171018.807E1250045@webabinitio.net>
Message-ID: <4DD418A1.6000508@v.loewis.de>

> Note that the more common idiom (not that I can measure it, mind)
> when dealing with byte strings is something analogous to
> 
>     if my_byte_string[i:i+1] == b'x':
> 
> rather than
> 
>     if my_byte_string[i] == 170:

FWIW, Another spelling of this is

      if my_byte_string[i] == ord(b'x')

From a readability point, it's in the same category as the first one,
but less twisted.
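
A sketch of that spelling with the value hoisted into a module-level
constant, so ord() is not called on every comparison (the name X is
only illustrative):

```python
X = ord(b'x')            # computed once; a plain int

data = b'axb'

matches_by_index = (data[1] == X)         # int compared with int
matches_by_slice = (data[1:2] == b'x')    # bytes compared with bytes

print(X, matches_by_index, matches_by_slice)
```

Both tests agree; the difference is only in which type ends up on each
side of the comparison.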

Regards,
Martin

From martin at v.loewis.de  Wed May 18 21:09:03 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 May 2011 21:09:03 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <4DD4127E.6050301@zhuliguan.net>
References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net>
Message-ID: <4DD4194F.9020009@v.loewis.de>

On 18.05.2011 20:39, Hagen Fürstenau wrote:
>> On behalf of the Python development team, I am pleased to announce the
>> first release candidate of Python 3.2.1.
> 
> Shouldn't there be a tag "v3.2.1rc1" in the hg repo?

http://hg.python.org/releasing/3.2.1/

Regards,
Martin

P.S. "Shouldn't" makes it sound as if there was a mistake.

From eric at trueblade.com  Wed May 18 21:10:15 2011
From: eric at trueblade.com (Eric Smith)
Date: Wed, 18 May 2011 15:10:15 -0400
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4DD41997.4060401@trueblade.com>

On 05/18/2011 12:16 PM, Stephen J. Turnbull wrote:
> Robert Collins writes:
> 
>  > It's probably too late to change, but please don't try to argue that
>  > it's correct: the continued confusion of folk running into this is
>  > evidence that confusion *is happening*. Treat that as evidence and
>  > think about how to fix it going forward.
> 
> Sorry, Rob, but you're just wrong here, and Nick is right.  It's
> possible to improve Python 3, but not to "fix" it in this respect.
> The Python 3 solution is correct, the Python 2 approach is not.
> There's no way to avoid discontinuity and confusion here.

I don't think there's any connection between the way 2.x confused text
strings and binary data (which certainly needed addressing) with the way
that 3.x returns a different type for byte_str[i] than it does for
byte_str[i:i+1]. I think it's the latter that's confusing to people.
There's no particular requirement for different types that's needed to
fix the byte/str problem.

And of course it's too late to make any change to this.

Eric.

From ethan at stoneleaf.us  Wed May 18 21:29:47 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 18 May 2011 12:29:47 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD3EC7A.8070801@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<4DD35B9C.3030702@canterbury.ac.nz>
	<4DD3EC7A.8070801@stoneleaf.us>
Message-ID: <4DD41E2B.7000404@stoneleaf.us>

Ethan Furman wrote:
> Greg Ewing wrote:
>> As for
>>
>>> --> some_other_var[3] == b'd'
>>
>> there ought to be a literal for specifying an integer
>> using an ascii character, so you could say something like
>>
>>   if some_other_var[3] == c'd':
>>
>> which would be equivalent to
>>
>>   if some_other_var[3] == ord(b'd')
>>
>> but without the overhead of computing the value each time
>> at run time.
> 
> Given that we can't change the behavior of b'abc'[1], that would be 
> better than what we have.
> 
> +1

Here's another thought, that perhaps is not backwards-incompatible...

some_var[3] == b'd'

At some point, the bytes class' __eq__ will be called -- is there a 
reason why we cannot have

1) a check to see if the bytes instance is length 1
2) a check to see if
    i) the other object is an int, and
    ii) 0 <= other_obj < 256
3) if 1 and 2, make the comparison instead of returning NotImplemented?

This makes sense to me -- after all, the bytes class is an array of ints 
in range(256);  it is a special case, but doesn't feel any more special 
than passing an int into bytes() giving a string of that many null 
bytes; and it would get rid of the, in my opinion ugly, idiom of

some_var[i:i+1] == b'd'

It would also not require a new literal syntax.
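
A rough sketch of the check described above, written as a bytes
subclass since the built-in type cannot be patched from Python.  The
class name LenientBytes is invented for illustration, and hashing is
deliberately left untouched here:

```python
class LenientBytes(bytes):
    """bytes subclass whose length-1 instances compare equal to an int."""

    def __eq__(self, other):
        # Steps 1 and 2: length-1 bytes, other is an int in range(256).
        if len(self) == 1 and isinstance(other, int) and 0 <= other < 256:
            # Step 3: compare the single element with the int.
            return self[0] == other
        return bytes.__eq__(self, other)

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    __hash__ = bytes.__hash__    # defining __eq__ would otherwise drop it

some_var = b'abcd'
d = LenientBytes(b'd')

print(some_var[3] == d)     # int on the left; falls through to d.__eq__
print(d == some_var[3])
print(d == b'd')            # ordinary bytes comparison still works
```

With the int on the left, int.__eq__ returns NotImplemented first and
Python then tries the reflected comparison on the bytes object, which is
why changing only the bytes side would be enough.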

~Ethan~

From benjamin at python.org  Wed May 18 21:22:18 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Wed, 18 May 2011 14:22:18 -0500
Subject: [Python-Dev] Equality testing
In-Reply-To: <4DD4154A.3080603@stoneleaf.us>
References: <4DD4154A.3080603@stoneleaf.us>
Message-ID: <BANLkTikyb7A_sjdwB_jNRVybOm5OnqTeqQ@mail.gmail.com>

2011/5/18 Ethan Furman <ethan at stoneleaf.us>:
> In Python 3 inequality comparisons became forbidden.
>
> --> 123 < [1, 2, 3]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: unorderable types: int() < list()
>
> However, equality comparisons are still allowed
>
> --> 123 == [1, 2, 3]
> False
>
> But you can't mix them (inequality wins)
>
> --> 123 <= [1, 2, 3]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: unorderable types: int() <= list()
>
> I realize this is probably a Py4000 change if it happens at all, but does
> this make sense?  Shouldn't an attempt to compare two unlike objects be a
> TypeError, just like trying to order them is?

No. Ordering for types which are completely different doesn't make any
sense, but equality testing is just fine because it has an obvious
answer: no.
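
The mechanism can be seen with a toy class (a sketch; Quiet is an
invented name) whose comparison methods always decline:

```python
class Quiet:
    """Declines every comparison by returning NotImplemented."""
    def __eq__(self, other):
        return NotImplemented
    def __lt__(self, other):
        return NotImplemented

a, b = Quiet(), Quiet()

# Both operands decline, so == falls back to an identity check.
distinct_equal = (a == b)
same_equal = (a == a)

# There is no such fallback for ordering.
try:
    a < b
    ordered = True
except TypeError:
    ordered = False

print(distinct_equal, same_equal, ordered)
```

Equality always has a default answer to fall back on; ordering does not,
hence the TypeError.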



-- 
Regards,
Benjamin

From hagen at zhuliguan.net  Wed May 18 21:37:29 2011
From: hagen at zhuliguan.net (=?ISO-8859-1?Q?Hagen_F=FCrstenau?=)
Date: Wed, 18 May 2011 21:37:29 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <4DD4194F.9020009@v.loewis.de>
References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net>
	<4DD4194F.9020009@v.loewis.de>
Message-ID: <4DD41FF9.9040704@zhuliguan.net>

> P.S. "Shouldn't" makes it sound as if there was a mistake.

Well, I thought there was. When do these tags get merged into "cpython"
then? "v3.2.1b1" is there, but "v3.2.1rc1" isn't:

http://hg.python.org/cpython/tags

Cheers,
Hagen


From g.brandl at gmx.net  Wed May 18 21:37:57 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 May 2011 21:37:57 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <4DD4194F.9020009@v.loewis.de>
References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net>
	<4DD4194F.9020009@v.loewis.de>
Message-ID: <ir176p$25a$1@dough.gmane.org>

On 18.05.2011 21:09, "Martin v. Löwis" wrote:
> On 18.05.2011 20:39, Hagen Fürstenau wrote:
>>> On behalf of the Python development team, I am pleased to announce the
>>> first release candidate of Python 3.2.1.
>> 
>> Shouldn't there be a tag "v3.2.1rc1" in the hg repo?
> 
> http://hg.python.org/releasing/3.2.1/
> 
> Regards,
> Martin
> 
> P.S. "Shouldn't" makes it sound as if there was a mistake.

To clarify: once the final is done, the repo Martin mentioned will be
merged back to main and then vanish.

Georg


From g.brandl at gmx.net  Wed May 18 21:47:43 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 18 May 2011 21:47:43 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD418A1.6000508@v.loewis.de>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20110518171018.807E1250045@webabinitio.net>
	<4DD418A1.6000508@v.loewis.de>
Message-ID: <ir17p3$60h$1@dough.gmane.org>

On 18.05.2011 21:06, "Martin v. Löwis" wrote:
>> Note that the more common idiom (not that I can measure it, mind)
>> when dealing with byte strings is something analogous to
>> 
>>     if my_byte_string[i:i+1] == b'x':
>> 
>> rather than
>> 
>>     if my_byte_string[i] == 170:
> 
> FWIW, Another spelling of this is
> 
>       if my_byte_string[i] == ord(b'x')
> 
> From a readability point, it's in the same category as the first one,
> but less twisted.

Probably more twisted:

if my_byte_string[i] == b'x'[0]:

:)

Georg


From ethan at stoneleaf.us  Wed May 18 22:10:11 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 18 May 2011 13:10:11 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD41E2B.7000404@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>
	<4DD41E2B.7000404@stoneleaf.us>
Message-ID: <4DD427A3.9080207@stoneleaf.us>

Ethan Furman wrote:

[...]

Also posted to Python-Ideas.

~Ethan~

From martin at v.loewis.de  Wed May 18 22:01:12 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 May 2011 22:01:12 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <4DD41FF9.9040704@zhuliguan.net>
References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net>
	<4DD4194F.9020009@v.loewis.de> <4DD41FF9.9040704@zhuliguan.net>
Message-ID: <4DD42588.4080202@v.loewis.de>

On 18.05.2011 21:37, Hagen Fürstenau wrote:
>> P.S. "Shouldn't" makes it sound as if there was a mistake.
> 
> Well, I thought there was. When do these tags get merged into "cpython"
> then? 

See PEP 101

Regards,
Martin

From martin at v.loewis.de  Wed May 18 22:06:26 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 May 2011 22:06:26 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD41E2B.7000404@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>
	<4DD41E2B.7000404@stoneleaf.us>
Message-ID: <4DD426C2.7060706@v.loewis.de>

> Here's another thought, that perhaps is not backwards-incompatible...
> 
> some_var[3] == b'd'
> 
> At some point, the bytes class' __eq__ will be called -- is there a
> reason why we cannot have
> 
> 1) a check to see if the bytes instance is length 1
> 2) a check to see if
>    i) the other object is an int, and
>    ii) 0 <= other_obj < 256
> 3) if 1 and 2, make the comparison instead of returning NotImplemented?

Immutable objects that compare equal should hash equal;
so we would also have to change the hashing of byte strings. Not sure
whether that, in turn, has undesirable consequences.

In addition, equality should be transitive, so b'A' == 65.0.

Regards,
Martin

From lac at openend.se  Wed May 18 22:30:28 2011
From: lac at openend.se (Laura Creighton)
Date: Wed, 18 May 2011 22:30:28 +0200
Subject: [Python-Dev] how do you find out what version of Python a PEP
	landed in?
In-Reply-To: Message from =?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=
	<martin@v.loewis.de>
	of "Wed, 18 May 2011 20:52:17 +0200." <4DD41561.6090305@v.loewis.de>
References: <4DD35E6F.8030901@simplistix.co.uk> <4DD3661D.30908@v.loewis.de>
	<BANLkTiny++ziwc_+UsOfWfiUD-dxM7KZxA@mail.gmail.com><4DD41561.6090305@v.loewis.de>
Message-ID: <201105182030.p4IKUSU9005831@theraft.openend.se>

Politely ask them to add it.
(just my suggestion).

Laura

From ethan at stoneleaf.us  Wed May 18 22:48:07 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 18 May 2011 13:48:07 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD426C2.7060706@v.loewis.de>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>
	<4DD41E2B.7000404@stoneleaf.us> <4DD426C2.7060706@v.loewis.de>
Message-ID: <4DD43087.6090602@stoneleaf.us>

Martin v. Löwis wrote:
>> Here's another thought, that perhaps is not backwards-incompatible...
>>
>> some_var[3] == b'd'
>>
>> At some point, the bytes class' __eq__ will be called -- is there a
>> reason why we cannot have
>>
>> 1) a check to see if the bytes instance is length 1
>> 2) a check to see if
>>    i) the other object is an int, and
>>    ii) 0 <= other_obj < 256
>> 3) if 1 and 2, make the comparison instead of returning NotImplemented?
> 
> Immutable objects that compare equal should hash equal;
> so we would also have to change the hashing of byte strings. Not sure
> whether that, in turn, has undesirable consequences.

I thought it was the other-way-round -- if they hash equal, they should 
compare equal?  Or is this just for immutables?

> In addition, equality should be transitive, so b'A' == 65.0.

I'm not sure what you're getting at...  we could certainly have step 2 
check for a number instead of an int, and then step 3 could extract the 
one element, giving an int, and then let that int compare itself with 
the other number, whether it be int, float, fraction, what-have-you.


~Ethan~

From tjreedy at udel.edu  Wed May 18 22:41:45 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 18 May 2011 16:41:45 -0400
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD427A3.9080207@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>	<4DD41E2B.7000404@stoneleaf.us>
	<4DD427A3.9080207@stoneleaf.us>
Message-ID: <ir1au8$o4s$1@dough.gmane.org>

On 5/18/2011 4:10 PM, Ethan Furman wrote:
> Ethan Furman wrote:
>
> [...]
>
> Also posted to Python-Ideas.

Good. That is where it should have gone in the first place, as this is 
about ideas not yet even in the PEP stage.

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Wed May 18 23:01:28 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 18 May 2011 17:01:28 -0400
Subject: [Python-Dev] Equality testing
In-Reply-To: <4DD4154A.3080603@stoneleaf.us>
References: <4DD4154A.3080603@stoneleaf.us>
Message-ID: <ir1c3a$rk$1@dough.gmane.org>

On 5/18/2011 2:51 PM, Ethan Furman wrote:
> In Python 3 inequality comparisons became forbidden.
>
> --> 123 < [1, 2, 3]
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> TypeError: unorderable types: int() < list()
>
> However, equality comparisons are still allowed
>
> --> 123 == [1, 2, 3]
> False
>
> But you can't mix them (inequality wins)
>
> --> 123 <= [1, 2, 3]
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> TypeError: unorderable types: int() <= list()
>
> I realize this is probably a Py4000 change if it happens at all, but
> does this make sense? Shouldn't an attempt to compare two unlike objects
> be a TypeError, just like trying to order them is?
>
> It bit me when I tried to compare a byte string element with a single
> character byte string (of course they should have matched, but since the
> element was an int, the match was no longer True).

Questions/comments like this that are not about developing the next 
versions of Python, as you acknowledge above, really belong elsewhere, 
like on the ideas list.

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Wed May 18 23:13:23 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 18 May 2011 17:13:23 -0400
Subject: [Python-Dev] Don't set local variable in a list comprehension
	or generator
In-Reply-To: <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
References: <1305721315.16682.10.camel@marge>
	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
Message-ID: <ir1cpj$52a$1@dough.gmane.org>

On 5/18/2011 10:19 AM, Nadeem Vawda wrote:

> I'm not sure why you would encounter code like that in the first place.
> Surely any code of the form:
>
>      ''.join(c for c in my_string)
>
> would just return my_string? Or am I missing something?

Good question. Anything useful like "'-'.join(c for c in 'abc')" is the 
same as "'-'.join('abc')". The same, as far as I can think of, for 
anything like list() or set() taking an iterable arg.
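
A quick check of those equivalences (a sketch; the string is arbitrary):

```python
s = 'abc'

# A pass-through generator expression adds nothing to the iterable.
joined_gen = '-'.join(c for c in s)
joined_plain = '-'.join(s)

assert joined_gen == joined_plain
assert list(c for c in s) == list(s)
assert set(c for c in s) == set(s)
assert ''.join(c for c in s) == s

print(joined_plain)
```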

-- 
Terry Jan Reedy


From ethan at stoneleaf.us  Wed May 18 23:42:37 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 18 May 2011 14:42:37 -0700
Subject: [Python-Dev] Equality testing
In-Reply-To: <ir1c3a$rk$1@dough.gmane.org>
References: <4DD4154A.3080603@stoneleaf.us> <ir1c3a$rk$1@dough.gmane.org>
Message-ID: <4DD43D4D.1080902@stoneleaf.us>

Terry Reedy wrote:
> On 5/18/2011 2:51 PM, Ethan Furman wrote:
>> In Python 3 inequality comparisons became forbidden.
>>
>> --> 123 < [1, 2, 3]
>> Traceback (most recent call last):
>> File "<stdin>", line 1, in <module>
>> TypeError: unorderable types: int() < list()
>>
>> However, equality comparisons are still allowed
>>
>> --> 123 == [1, 2, 3]
>> False
>>
>> But you can't mix them (inequality wins)
>>
>> --> 123 <= [1, 2, 3]
>> Traceback (most recent call last):
>> File "<stdin>", line 1, in <module>
>> TypeError: unorderable types: int() <= list()
>>
>> I realize this is probably a Py4000 change if it happens at all, but
>> does this make sense? Shouldn't an attempt to compare two unlike objects
>> be a TypeError, just like trying to order them is?
>>
>> It bit me when I tried to compare a byte string element with a single
>> character byte string (of course they should have matched, but since the
>> element was an int, the match was no longer True).
> 
> Questions/comments like this that are not about developing the next 
> versions of Python, as you acknowledge above, really belong elsewhere, 
> like on the ideas list.

My apologies.  I'll be more careful.

~Ethan~


From victor.stinner at haypocalc.com  Wed May 18 23:34:09 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 18 May 2011 23:34:09 +0200
Subject: [Python-Dev] Don't set local variable in a list comprehension
 or generator
In-Reply-To: <BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
References: <1305721315.16682.10.camel@marge>
	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
Message-ID: <1305754449.27389.30.camel@marge>

On Wednesday 18 May 2011 at 16:19 +0200, Nadeem Vawda wrote:
> I'm not sure why you would encounter code like that in the first place.

Well, I found the STORE_FAST/LOAD_FAST "issue" while trying to optimize
the "this" module, which reimplements rot13 using a dict in Python 3:

d = {}
for c in (65, 97):
    for i in range(26):
        d[chr(i+c)] = chr((i+13) % 26 + c)

I tried:

d = {chr(i+c): chr((i+13) % 26 + c)
     for i in range(26)
     for c in (65, 97)}

But it is slower, whereas I read somewhere that generators are faster
than loops. By the way, (c for c in ...) is slower than [c for c
in ...]. I suppose that a generator is slower because it exits and
re-enters PyEval_EvalFrameEx() at each step, whereas [c for c ...] uses
BUILD_LIST in a dummy (but fast) loop.
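
A minimal sketch for reproducing the comparison above, assuming Python 3
and the standard timeit module. The function names are illustrative and
not from the original message. Timings depend on the interpreter version
and the machine, so no expected numbers are given.

```python
import timeit


def build_with_loops():
    # The nested for loops quoted above.
    d = {}
    for c in (65, 97):
        for i in range(26):
            d[chr(i + c)] = chr((i + 13) % 26 + c)
    return d


def build_with_comprehension():
    # The dict comprehension variant quoted above.
    return {chr(i + c): chr((i + 13) % 26 + c)
            for i in range(26)
            for c in (65, 97)}


def time_builders(number=2000):
    # Returns seconds per `number` calls for each variant.
    return {
        fn.__name__: timeit.timeit(fn, number=number)
        for fn in (build_with_loops, build_with_comprehension)
    }


# Both variants must build the same rot13 table before timing means anything.
assert build_with_loops() == build_with_comprehension()
print(time_builders())
```

Checking that both variants produce the same table first avoids timing
two things that are not actually equivalent.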

(c for c in ...) and [c for c in ...] are stupid, but I used simplified
examples to explain the problem. A more realistic example would be:

   squares = (x*x for x in range(10000))

You don't really need the "x" variable, you just want the square.
Another example is the syntax using an if to filter the data set:

   (x for x in ... if condition(x))

> > I heard about optimization in the AST tree instead of working on the
> > bytecode. What is the status of this project?
> 
> Are you referring to issue11549? There was some related discussion [1] on
> python-dev about six weeks ago, but I haven't seen anything on the topic
> since then.

Ah yes, it looks to be this issue. I didn't know that there was an
issue.

Victor


From amauryfa at gmail.com  Wed May 18 23:37:30 2011
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Wed, 18 May 2011 23:37:30 +0200
Subject: [Python-Dev] Don't set local variable in a list comprehension
	or generator
In-Reply-To: <ir1cpj$52a$1@dough.gmane.org>
References: <1305721315.16682.10.camel@marge>
	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
	<ir1cpj$52a$1@dough.gmane.org>
Message-ID: <BANLkTinih_3ght8GSknCNrTjaD3t9i-Ayw@mail.gmail.com>

Hi,

2011/5/18 Terry Reedy <tjreedy at udel.edu>:
> On 5/18/2011 10:19 AM, Nadeem Vawda wrote:
>
>> I'm not sure why you would encounter code like that in the first place.
>> Surely any code of the form:
>>
>>     ''.join(c for c in my_string)
>>
>> would just return my_string? Or am I missing something?
>
> Good question. Anything useful like "'-'.join(c for c in 'abc')" is the same
> as "'-'.join('abc')". The same, as far as I can think of, for anything like
> list() or set() taking an iterable arg.

With a little imagination you can build something non trivial.
For example, a join_words function:

def join_words(words):
    return ', '.join(w.strip() for w in words)

Like Victor says, the code of the generator object contains a
STORE_FAST followed by LOAD_FAST.
This pair of opcodes could be removed, and the value left on the stack.

>>> dis.dis(join_words.func_code.co_consts[2])
  1           0 SETUP_LOOP              24 (to 27)
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                17 (to 26)
              9 STORE_FAST               1 (w)
             12 LOAD_FAST                1 (w)
             15 LOAD_ATTR                0 (strip)
             18 CALL_FUNCTION            0
             21 YIELD_VALUE
             22 POP_TOP
             23 JUMP_ABSOLUTE            6
        >>   26 POP_BLOCK
        >>   27 LOAD_CONST               0 (None)
             30 RETURN_VALUE

It's probably not easy to do though.
Think of expressions where the variable appears several times,
or even where the variable is not the first object, like str(ord(x)).
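
A hedged sketch of how the same inspection might be done on Python 3,
where func_code is spelled __code__ and the position of the generator's
code object in co_consts is not fixed. The helper names
(find_genexpr_code, adjacent_store_load_pairs) are made up for this
illustration. The opcodes emitted differ between CPython versions, so
the list of pairs may well be empty on newer interpreters.

```python
import dis
import types


def join_words(words):
    return ', '.join(w.strip() for w in words)


def find_genexpr_code(func):
    # Search the constants instead of hard-coding an index like co_consts[2].
    for const in func.__code__.co_consts:
        if isinstance(const, types.CodeType) and const.co_name == '<genexpr>':
            return const
    return None


def adjacent_store_load_pairs(code):
    # Names stored by STORE_FAST and immediately reloaded by LOAD_FAST.
    # Newer CPython versions may fuse or rename these opcodes, in which
    # case this simply finds nothing.
    instructions = list(dis.get_instructions(code))
    return [
        first.argval
        for first, second in zip(instructions, instructions[1:])
        if first.opname == 'STORE_FAST'
        and second.opname == 'LOAD_FAST'
        and first.argval == second.argval
    ]


genexpr_code = find_genexpr_code(join_words)
print(adjacent_store_load_pairs(genexpr_code))
```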

-- 
Amaury Forgeot d'Arc

From martin at v.loewis.de  Wed May 18 23:58:21 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 18 May 2011 23:58:21 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD43087.6090602@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>
	<4DD41E2B.7000404@stoneleaf.us> <4DD426C2.7060706@v.loewis.de>
	<4DD43087.6090602@stoneleaf.us>
Message-ID: <4DD440FD.7060208@v.loewis.de>

>> Immutable objects that compare equal should hash equal;
>> so we would also have to change the hashing of byte strings. Not sure
>> whether that, in turn, has undesirable consequences.
> 
> I thought it was the other-way-round -- if they hash equal, they should
> compare equal?

No no no. If they hash equal, it could just be a hash collision -
objects of a class could all hash to 42, if they wanted to.
Dictionaries require the property I mentioned. If they compare
equal, but hash differently, a dictionary lookup would fail to
find the key.
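
Both directions can be illustrated with a small sketch. The class names
are hypothetical and chosen only for this example: the first class has
every instance hash to 42 yet still works as a dict key, while the
second breaks the "compare equal implies hash equal" rule and loses
lookups.

```python
class AlwaysFortyTwo:
    """Equal hashes, unequal objects: legal, just slow in a dict."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, AlwaysFortyTwo) and self.name == other.name

    def __hash__(self):
        return 42


class BrokenKey:
    """Equal objects, unequal hashes: violates the dict requirement."""

    def __init__(self, name, fake_hash):
        self.name = name
        self.fake_hash = fake_hash

    def __eq__(self, other):
        return isinstance(other, BrokenKey) and self.name == other.name

    def __hash__(self):
        return self.fake_hash


x, y = AlwaysFortyTwo('x'), AlwaysFortyTwo('y')
collisions = {x: 1, y: 2}           # both keys survive despite the collision

a, b = BrokenKey('k', 1), BrokenKey('k', 2)
lookup_found = b in {a: 'value'}    # a == b, but the lookup misses
```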

>> In addition, equality should be transitive, so b'A' == 65.0.
> 
> I'm not sure what you're getting at...

That it is counter-intuitive to have a bytes object compare equal
to a floating-point number.

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Thu May 19 00:02:48 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 19 May 2011 10:02:48 +1200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <iqvp43$et3$1@dough.gmane.org>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz>
	<iqvp43$et3$1@dough.gmane.org>
Message-ID: <4DD44208.70101@canterbury.ac.nz>

Georg Brandl wrote:

> We do have
> 
>   bytes.fromhex('deadbeef')

But again, there is a run-time overhead to this.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Thu May 19 00:32:28 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 19 May 2011 10:32:28 +1200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD41997.4060401@trueblade.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4DD41997.4060401@trueblade.com>
Message-ID: <4DD448FC.9030301@canterbury.ac.nz>

Eric Smith wrote:

> And of course it's too late to make any change to this.

It's too late to change the meaning of b'...', but is it
really too late to introduce an x'...' literal and change
the repr() to produce it?

-- 
Greg

From greg.ewing at canterbury.ac.nz  Thu May 19 00:39:34 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 19 May 2011 10:39:34 +1200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD41E2B.7000404@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz>
	<4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us>
Message-ID: <4DD44AA6.9030600@canterbury.ac.nz>

Ethan Furman wrote:

> some_var[3] == b'd'
> 
> 1) a check to see if the bytes instance is length 1
> 2) a check to see if
>    i) the other object is an int, and
>    ii) 0 <= other_obj < 256
> 3) if 1 and 2, make the comparison instead of returning NotImplemented?

It might seem convenient, but I'd worry that it would lead to
even more confusion in other ways. If someone sees that

    some_var[3] == b'd'

is true, and that

    some_var[3] == 100

is also true, they might expect to be able to do things
like

    n = b'd' + 1

and get 101... or maybe b'e'...
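
For reference, a short sketch of what Python 3 does with these
expressions today, as far as I know. The names some_var, element and
one_byte and the try_add helper are only illustrative.

```python
some_var = b'abcd'

element = some_var[3]       # indexing bytes yields an int
one_byte = some_var[3:4]    # slicing yields a bytes object of length 1


def try_add(left, right):
    # Return the sum, or the TypeError raised while trying to compute it.
    try:
        return left + right
    except TypeError as exc:
        return exc


int_equals_bytes = (element == b'd')    # int vs. bytes: not equal
add_result = try_add(b'd', 1)           # bytes + int is rejected
next_byte = bytes([element + 1])        # the explicit spelling of "b'e'"
```

Getting either 101 or b'e' therefore takes an explicit conversion, which
is the ambiguity described above.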

-- 
Greg

From eric at trueblade.com  Thu May 19 00:46:01 2011
From: eric at trueblade.com (Eric Smith)
Date: Wed, 18 May 2011 18:46:01 -0400
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD448FC.9030301@canterbury.ac.nz>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>	<4DD41997.4060401@trueblade.com>
	<4DD448FC.9030301@canterbury.ac.nz>
Message-ID: <4DD44C29.6050008@trueblade.com>

On 5/18/2011 6:32 PM, Greg Ewing wrote:
> Eric Smith wrote:
> 
>> And of course it's too late to make any change to this.
> 
> It's too late to change the meaning of b'...', but is it
> really too late to introduce an x'...' literal and change
> the repr() to produce it?

My "this" was the different types returned by b[i] and b[i:i+1].

Eric.

From greg.ewing at canterbury.ac.nz  Thu May 19 00:47:09 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 19 May 2011 10:47:09 +1200
Subject: [Python-Dev] Don't set local variable in a list comprehension
 or generator
In-Reply-To: <1305754449.27389.30.camel@marge>
References: <1305721315.16682.10.camel@marge>
	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
	<1305754449.27389.30.camel@marge>
Message-ID: <4DD44C6D.8000808@canterbury.ac.nz>

Victor Stinner wrote:

>    squares = (x*x for x in range(10000))

What bytecode would you optimise that into?

-- 
Greg

From robertc at robertcollins.net  Thu May 19 01:39:19 2011
From: robertc at robertcollins.net (Robert Collins)
Date: Thu, 19 May 2011 11:39:19 +1200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <BANLkTikh6D1KgF7_k_guy4s4JETpz5cJpw@mail.gmail.com>

On Thu, May 19, 2011 at 4:16 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Robert Collins writes:
>
>  > It's probably too late to change, but please don't try to argue that
>  > it's correct: the continued confusion of folk running into this is
>  > evidence that confusion *is happening*. Treat that as evidence and
>  > think about how to fix it going forward.
>
> Sorry, Rob, but you're just wrong here, and Nick is right.  It's
> possible to improve Python 3, but not to "fix" it in this respect.
> The Python 3 solution is correct, the Python 2 approach is not.
> There's no way to avoid discontinuity and confusion here.

The top level description: 'bytes is a different type to text[unicode]
and casting between them must be explicit' is completely correct in
Python 3: I didn't (and have never AFAIK) quibbled about that.

That's separate to the implementation issues I have mentioned in this
thread and previous ones.

Arguing that implicit casting is a good idea isn't what I was doing,
nor what Nick was rebutting, AFAICT.

-Rob

From tjreedy at udel.edu  Thu May 19 03:44:24 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 18 May 2011 21:44:24 -0400
Subject: [Python-Dev] Don't set local variable in a list comprehension
	or generator
In-Reply-To: <1305754449.27389.30.camel@marge>
References: <1305721315.16682.10.camel@marge>	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
	<1305754449.27389.30.camel@marge>
Message-ID: <ir1slo$ibs$1@dough.gmane.org>

On 5/18/2011 5:34 PM, Victor Stinner wrote:

Your initial example gave me the impression that the issue has something 
to do with join in particular, or even comprehensions in particular. It 
is really about for loops.

>     squares = (x*x for x in range(10000))

 >>> dis('for x in range(3): y = x*x')
   1           0 SETUP_LOOP              30 (to 33)
               3 LOAD_NAME                0 (range)
               6 LOAD_CONST               0 (3)
               9 CALL_FUNCTION            1
              12 GET_ITER
         >>   13 FOR_ITER                16 (to 32)
              16 STORE_NAME               1 (x)
              19 LOAD_NAME                1 (x)
              22 LOAD_NAME                1 (x)
              25 BINARY_MULTIPLY
              26 STORE_NAME               2 (y)
              29 JUMP_ABSOLUTE           13
         >>   32 POP_BLOCK
         >>   33 LOAD_CONST               1 (None)
              36 RETURN_VALUE

> You don't really need the "x" variable, you just want the square.

It is nothing new that hand-crafted assembler (which mnemonic bytecode 
is) can sometimes beat a compiler. In this case, you want store, load, 
load before the multiply replaced with dup, and you cannot get that with 
Python code without a much smarter optimizer.


-- 
Terry Jan Reedy


From tjreedy at udel.edu  Thu May 19 03:59:47 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 18 May 2011 21:59:47 -0400
Subject: [Python-Dev] Don't set local variable in a list comprehension
	or generator
In-Reply-To: <BANLkTinih_3ght8GSknCNrTjaD3t9i-Ayw@mail.gmail.com>
References: <1305721315.16682.10.camel@marge>	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>	<ir1cpj$52a$1@dough.gmane.org>
	<BANLkTinih_3ght8GSknCNrTjaD3t9i-Ayw@mail.gmail.com>
Message-ID: <ir1til$lo6$1@dough.gmane.org>

On 5/18/2011 5:37 PM, Amaury Forgeot d'Arc wrote:
> Hi,
>
> 2011/5/18 Terry Reedy<tjreedy at udel.edu>:
>> On 5/18/2011 10:19 AM, Nadeem Vawda wrote:
>>
>>> I'm not sure why you would encounter code like that in the first place.
>>> Surely any code of the form:
>>>
>>>      ''.join(c for c in my_string)
>>>
>>> would just return my_string? Or am I missing something?
>>
>> Good question. Anything useful like "'-'.join(c for c in 'abc')" is the same
>> as "'-'.join('abc')". The same, as far as I can think of, for anything like
>> list() or set() taking an iterable arg.
>
> With a little imagination you can build something non trivial.
> For example, a join_words function:
>
> def join_words(words):
>      return ', '.join(w.strip() for w in words)
>
> Like Victor says, the code of the generator object contains a
> STORE_FAST followed by LOAD_FAST.
> This pair of opcodes could be removed, and the value left on the stack.
>
>>>> dis.dis(join_words.func_code.co_consts[2])
>    1           0 SETUP_LOOP              24 (to 27)
>                3 LOAD_FAST                0 (.0)
>          >>     6 FOR_ITER                17 (to 26)
>                9 STORE_FAST               1 (w)
>               12 LOAD_FAST                1 (w)
>               15 LOAD_ATTR                0 (strip)
>               18 CALL_FUNCTION            0
>               21 YIELD_VALUE
>               22 POP_TOP
>               23 JUMP_ABSOLUTE            6
>          >>    26 POP_BLOCK
>          >>    27 LOAD_CONST               0 (None)
>               30 RETURN_VALUE

As I pointed out in response to Victor, you get nearly the same 
bytecode with regular old for loops; in particular, the store x/load x pair.

> It's probably not easy to do though.
> Think of expressions where the variable appears several times,
> or even where the variable is not the first object, like str(ord(x)).

Where first means first in left-to-right order rather than in innermost 
to outermost order. (OT: I think Python is a bit unusual in this way.)

-- 
Terry Jan Reedy


From techtonik at gmail.com  Thu May 19 04:33:11 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Thu, 19 May 2011 05:33:11 +0300
Subject: [Python-Dev] Inconsistent case in directory names for installed
 Python on Windows
In-Reply-To: <BANLkTik_U_k3+xAPhPjvYj9n=zC58SdkwQ@mail.gmail.com>
References: <BANLkTik_U_k3+xAPhPjvYj9n=zC58SdkwQ@mail.gmail.com>
Message-ID: <BANLkTikXpJ3FuAxZLaYpkvnxU2MHcW4LBw@mail.gmail.com>

On Wed, May 18, 2011 at 3:47 PM, Brian Curtin <brian.curtin at gmail.com> wrote:
>
> On May 18, 2011 7:03 AM, "anatoly techtonik" <techtonik at gmail.com> wrote:
>>
>> Greetings,
>>
>> While studying `virtualenv` code I've noticed that in Python directory
>> tree `include`, `libs` and `tcl` are lowercased while other dirs are
>> capitalized. It doesn't seem important (especially for developers
>> here), but it still can leave an unpleasant image for people new to
>> Python (and programming in general).
>
> In theory there are probably a lot of things that might seem unpleasant but
> are actually non-issues. I don't believe there have been any complaints
> about actual unpleasantries with directory case.

Among web folks there are no people who care less about typography
than those who spend most of their time in text terminals. =) I think
that the probability of receiving such a complaint is very low even if
everybody notices that. "Why should I bother about consistency if
Python developers are not giving a damn about it?"

>>
>>  [Python27]
>>   |-- DLLs
>>   |-- Doc
>>   |-- include
>>   |-- Lib
>>   |-- libs
>>   |-- Scripts
>>   |-- tcl
>>   `-- Tools
>>
>> How about making a consistent lowercased or uppercased scheme? Windows
>> filesystems are case-insensitive, so the change shouldn't affect
>> anybody.
>
> Some Macs have case-sensitive file systems, and some people use
> case-sensitive file systems on various flavors of UNIX. The change would
> probably require a thorough look through the build chain.

But we are speaking only about Windows.

>> Another candidate for
>> normalization is Tools/Scripts dir,
>> which I'd lowercase FWIW:
>>
>> Tools
>>  |-- i18n
>>  |-- pynche
>>  |-- Scripts
>>  |-- versioncheck
>>  `-- webchecker
>>
>>
>> Lowercased dirs at the top level seem to contain files that are
>> relevant to C developers only. However, I can not say for sure. It
>> seems that there could be a better place for them like top level
>> directory named Dev or C-API.
>
> Overall I think it boils down to a cosmetic change that I'm not sure we need
> to make, which could unnecessarily break people's work. -1

That's right - I stated that without cosmetic changes the project
becomes ugly and starts to accumulate a lot of garbage. With due
attention to improving the image of Python from the perspective of project
layout organization, this change could be made in Python 3. It is
something to keep in mind for the future.
-- 
anatoly t.

From techtonik at gmail.com  Thu May 19 04:46:23 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Thu, 19 May 2011 05:46:23 +0300
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <ir176p$25a$1@dough.gmane.org>
References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net>
	<4DD4194F.9020009@v.loewis.de> <ir176p$25a$1@dough.gmane.org>
Message-ID: <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com>

On Wed, May 18, 2011 at 10:37 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> On 18.05.2011 21:09, "Martin v. Löwis" wrote:
>> Am 18.05.2011 20:39, schrieb Hagen Fürstenau:
>>>> On behalf of the Python development team, I am pleased to announce the
>>>> first release candidate of Python 3.2.1.
>>>
>>> Shouldn't there be a tag "v3.2.1rc1" in the hg repo?
>>
>> http://hg.python.org/releasing/3.2.1/
>>
>> Regards,
>> Martin
>>
>> P.S. "Shouldn't" makes it sound as if there was a mistake.
>
> To clarify: once the final is done, the repo Martin mentioned will be
> merged back to main and then vanish.

Can't this work be done in a branch of the main repo, so that everybody
can track the progress in place? Is there any picture of the process
similar to http://nvie.com/posts/a-successful-git-branching-model/ ?
--
anatoly t.

From brian.curtin at gmail.com  Thu May 19 04:48:03 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Wed, 18 May 2011 21:48:03 -0500
Subject: [Python-Dev] Inconsistent case in directory names for installed
 Python on Windows
In-Reply-To: <BANLkTikXpJ3FuAxZLaYpkvnxU2MHcW4LBw@mail.gmail.com>
References: <BANLkTik_U_k3+xAPhPjvYj9n=zC58SdkwQ@mail.gmail.com>
	<BANLkTikXpJ3FuAxZLaYpkvnxU2MHcW4LBw@mail.gmail.com>
Message-ID: <BANLkTikJxMkM+ByNRc4hV8L8Mf1k2VeQNw@mail.gmail.com>

On Wed, May 18, 2011 at 21:33, anatoly techtonik <techtonik at gmail.com> wrote:

> On Wed, May 18, 2011 at 3:47 PM, Brian Curtin <brian.curtin at gmail.com>
> wrote:
> >
> > On May 18, 2011 7:03 AM, "anatoly techtonik" <techtonik at gmail.com>
> wrote:
> >>
> >> Greetings,
> >>
> >> While studying `virtualenv` code I've noticed that in Python directory
> >> tree `include`, `libs` and `tcl` are lowercased while other dirs are
> >> capitalized. It doesn't seem important (especially for developers
> >> here), but it still can leave an unpleasant image for people new to
> >> Python (and programming in general).
> >
> > In theory there are probably a lot of things that might seem unpleasant
> but
> > are actually non-issues. I don't believe there have been any complaints
> > about actual unpleasantries with directory case.
>
> Among web folks there are no people who care less about typography
> than those who spend most of their time in text terminals. =) I think
> that probability of receiving such complaint is very low even if
> everybody notices that. "Why should I bother about consistency if
> Python developers are not giving damn about it?"
>
> >>
> >>  [Python27]
> >>   |-- DLLs
> >>   |-- Doc
> >>   |-- include
> >>   |-- Lib
> >>   |-- libs
> >>   |-- Scripts
> >>   |-- tcl
> >>   `-- Tools
> >>
> >> How about making a consistent lowercased or uppercased scheme? Windows
> >> filesystems are case-insensitive, so the change shouldn't affect
> >> anybody.
> >
> > Some Macs have case-sensitive file systems, and some people use
> > case-sensitive file systems on various flavors of UNIX. The change would
> > probably require a thorough look through the build chain.
>
> But we are speaking only about Windows.
>

Definitely -1 to change the folder names only on Windows.

From tjreedy at udel.edu  Thu May 19 05:20:25 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 18 May 2011 23:20:25 -0400
Subject: [Python-Dev] Inconsistent case in directory names for installed
 Python on Windows
In-Reply-To: <BANLkTikXpJ3FuAxZLaYpkvnxU2MHcW4LBw@mail.gmail.com>
References: <BANLkTik_U_k3+xAPhPjvYj9n=zC58SdkwQ@mail.gmail.com>
	<BANLkTikXpJ3FuAxZLaYpkvnxU2MHcW4LBw@mail.gmail.com>
Message-ID: <ir229p$9kq$1@dough.gmane.org>

On 5/18/2011 10:33 PM, anatoly techtonik wrote:

>>>  [Python27]
>>>   |-- DLLs
>>>   |-- Doc
>>>   |-- include
>>>   |-- Lib
>>>   |-- libs
>>>   |-- Scripts
>>>   |-- tcl
>>>   `-- Tools

Except for DLLs and tcl, these are the platform-independent names in the 
source tree. They are copied directly over to the installations, and I 
would not want it any other way. Since I suspect change on *nix is out, I 
would feel the same for winX. I actually like having 'Lib' uppercase 
versus 'libs' lowercase, to make it easier to pick out 'Lib'. Most users 
have little reason to look at this directory list very often.
Certainly, Doc, Lib, Scripts, and Tools are ones they might want to look 
in, while include, libs, and tcl have nothing to look at. Notice the 
pattern? Hmmm. By the same logic, DLLs should have been dlls, but I 
suspect someone wanted to distinguish the plural s from dll.

-- 
Terry Jan Reedy



From tjreedy at udel.edu  Thu May 19 05:24:38 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 18 May 2011 23:24:38 -0400
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>
	<4DD4127E.6050301@zhuliguan.net>	<4DD4194F.9020009@v.loewis.de>
	<ir176p$25a$1@dough.gmane.org>
	<BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com>
Message-ID: <ir22ho$amc$1@dough.gmane.org>

On 5/18/2011 10:46 PM, anatoly techtonik wrote:
> On Wed, May 18, 2011 at 10:37 PM, Georg Brandl<g.brandl at gmx.net>  wrote:
>> On 18.05.2011 21:09, "Martin v. Löwis" wrote:

>>> http://hg.python.org/releasing/3.2.1/

>> To clarify: once the final is done, the repo Martin mentioned will be
>> merged back to main and then vanish.
>
> Can't this work be done in the branch of main repo, so that everybody
> can track the progress in place?

As I understand it, this is a snapshot that Georg hopes will require no 
work between the candidate and final release and which will get only the 
minimum needed.

-- 
Terry Jan Reedy



From martin at v.loewis.de  Thu May 19 05:59:15 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 19 May 2011 05:59:15 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>
	<4DD4127E.6050301@zhuliguan.net>	<4DD4194F.9020009@v.loewis.de>
	<ir176p$25a$1@dough.gmane.org>
	<BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com>
Message-ID: <4DD49593.30605@v.loewis.de>

> Can't this work be done in the branch of main repo, so that everybody
> can track the progress in place? Is there any picture of the process
> similar to http://nvie.com/posts/a-successful-git-branching-model/ ?

It *is* a branch of the main repo, so everybody *can* track the progress
(not sure what "track in place" means).

If you are asking for a named branch: no, that shouldn't be done.

Regards,
Martin

From g.brandl at gmx.net  Thu May 19 07:28:36 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 19 May 2011 07:28:36 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD44AA6.9030600@canterbury.ac.nz>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz>
	<4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us>
	<4DD44AA6.9030600@canterbury.ac.nz>
Message-ID: <ir29q8$90n$1@dough.gmane.org>

On 19.05.2011 00:39, Greg Ewing wrote:
> Ethan Furman wrote:
> 
>> some_var[3] == b'd'
>> 
>> 1) a check to see if the bytes instance is length 1
>> 2) a check to see if
>>    i) the other object is an int, and
>>    ii) 0 <= other_obj < 256
>> 3) if 1 and 2, make the comparison instead of returning NotImplemented?
> 
> It might seem convenient, but I'd worry that it would lead to
> even more confusion in other ways. If someone sees that
> 
>     some_var[3] == b'd'
> 
> is true, and that
> 
>     some_var[3] == 100
> 
> is also true, they might expect to be able to do things
> like
> 
>     n = b'd' + 1
> 
> and get 101... or maybe b'e'...

Maybe they should :)

Georg



From g.brandl at gmx.net  Thu May 19 07:32:18 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 19 May 2011 07:32:18 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <4DD41FF9.9040704@zhuliguan.net>
References: <4DD2C37D.7000008@python.org> <4DD4127E.6050301@zhuliguan.net>
	<4DD4194F.9020009@v.loewis.de> <4DD41FF9.9040704@zhuliguan.net>
Message-ID: <ir2a16$90n$2@dough.gmane.org>

On 18.05.2011 21:37, Hagen Fürstenau wrote:
>> P.S. "Shouldn't" makes it sound as if there was a mistake.
> 
> Well, I thought there was. When do these tags get merged into "cpython"
> then? "v3.2.1b1" is there, but "v3.2.1rc1" isn't:
> 
> http://hg.python.org/cpython/tags

3.2.1b1 was already merged back.  (And 3.2.1rc1 will also be merged back
soon, since there will be a 3.2.1rc2.)

Georg


From stefan_ml at behnel.de  Thu May 19 08:11:20 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 19 May 2011 08:11:20 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD44208.70101@canterbury.ac.nz>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>
	<4DD35B9C.3030702@canterbury.ac.nz>	<iqvp43$et3$1@dough.gmane.org>
	<4DD44208.70101@canterbury.ac.nz>
Message-ID: <ir2ca8$h97$1@dough.gmane.org>

Greg Ewing, 19.05.2011 00:02:
> Georg Brandl wrote:
>
>> We do have
>>
>> bytes.fromhex('deadbeef')
>
> But again, there is a run-time overhead to this.

Well, yes, but it's negligible if you assign it to a suitable variable first.
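
A small sketch of that suggestion: pay the parsing cost once, at import
time, by binding the result to a module-level name. MAGIC and
starts_with_magic are hypothetical names used only for illustration.

```python
# Parsed once when the module is imported, not on every call.
MAGIC = bytes.fromhex('deadbeef')


def starts_with_magic(payload):
    # Only a name lookup happens here; fromhex() is not called again.
    return payload.startswith(MAGIC)
```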

Stefan


From hagen at zhuliguan.net  Thu May 19 09:01:01 2011
From: hagen at zhuliguan.net (=?ISO-8859-1?Q?Hagen_F=FCrstenau?=)
Date: Thu, 19 May 2011 09:01:01 +0200
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <ir2a16$90n$2@dough.gmane.org>
References: <4DD2C37D.7000008@python.org>
	<4DD4127E.6050301@zhuliguan.net>	<4DD4194F.9020009@v.loewis.de>
	<4DD41FF9.9040704@zhuliguan.net> <ir2a16$90n$2@dough.gmane.org>
Message-ID: <4DD4C02D.2030100@zhuliguan.net>

> 3.2.1b1 was already merged back.  (And 3.2.1rc1 will also be merged back
> soon, since there will be a 3.2.1rc2.)

Thanks for the clarification! :-)

Cheers,
Hagen


From python-dev at masklinn.net  Thu May 19 09:41:08 2011
From: python-dev at masklinn.net (Xavier Morel)
Date: Thu, 19 May 2011 09:41:08 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <ir29q8$90n$1@dough.gmane.org>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us> <4DD35B9C.3030702@canterbury.ac.nz>
	<4DD3EC7A.8070801@stoneleaf.us> <4DD41E2B.7000404@stoneleaf.us>
	<4DD44AA6.9030600@canterbury.ac.nz> <ir29q8$90n$1@dough.gmane.org>
Message-ID: <663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net>

On 2011-05-19, at 07:28 , Georg Brandl wrote:
> On 19.05.2011 00:39, Greg Ewing wrote:
>> Ethan Furman wrote:
>> 
>>> some_var[3] == b'd'
>>> 
>>> 1) a check to see if the bytes instance is length 1
>>> 2) a check to see if
>>>   i) the other object is an int, and
>>>   ii) 0 <= other_obj < 256
>>> 3) if 1 and 2, make the comparison instead of returning NotImplemented?
>> 
>> It might seem convenient, but I'd worry that it would lead to
>> even more confusion in other ways. If someone sees that
>> 
>>    some_var[3] == b'd'
>> 
>> is true, and that
>> 
>>    some_var[3] == 100
>> 
>> is also true, they might expect to be able to do things
>> like
>> 
>>    n = b'd' + 1
>> 
>> and get 101... or maybe b'e'...
> 
> Maybe they should :)

But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well?

From ncoghlan at gmail.com  Thu May 19 09:49:47 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 19 May 2011 17:49:47 +1000
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD41997.4060401@trueblade.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4DD41997.4060401@trueblade.com>
Message-ID: <BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com>

On Thu, May 19, 2011 at 5:10 AM, Eric Smith <eric at trueblade.com> wrote:
> On 05/18/2011 12:16 PM, Stephen J. Turnbull wrote:
>> Robert Collins writes:
>>
>>  > It's probably too late to change, but please don't try to argue that
>>  > it's correct: the continued confusion of folk running into this is
>>  > evidence that confusion *is happening*. Treat that as evidence and
>>  > think about how to fix it going forward.
>>
>> Sorry, Rob, but you're just wrong here, and Nick is right.  It's
>> possible to improve Python 3, but not to "fix" it in this respect.
>> The Python 3 solution is correct, the Python 2 approach is not.
>> There's no way to avoid discontinuity and confusion here.
>
> I don't think there's any connection between the way 2.x confused text
> strings and binary data (which certainly needed addressing) with the way
> that 3.x returns a different type for byte_str[i] than it does for
> byte_str[i:i+1]. I think it's the latter that's confusing to people.
> There's no particular requirement for different types that's needed to
> fix the byte/str problem.

It's a mental model problem. People try to think of bytes as
equivalent to 2.x str and that's just wrong, wrong, wrong. It's far
closer to array.array('c'). Strings are basically *unique* in
returning a length 1 instance of themselves for indexing operations.
For every other sequence type, including tuples, lists and arrays,
slicing returns a new instance of the same type, while indexing will
typically return something different.
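
A minimal sketch of the indexing/slicing difference described above, assuming Python 3. The variable names are illustrative, and array.array('B') stands in for the 'c' typecode, which Python 3 does not provide:

```python
from array import array

text = "abc"
data = b"abc"
items = [97, 98, 99]
arr = array("B", data)  # 'B' (unsigned byte) is an assumption, see above

# str: indexing and slicing both give a str
print(type(text[0]).__name__, type(text[0:1]).__name__)    # str str

# bytes, list, array: slicing keeps the type, indexing yields an element
print(type(data[0]).__name__, type(data[0:1]).__name__)    # int bytes
print(type(items[0]).__name__, type(items[0:1]).__name__)  # int list
print(type(arr[0]).__name__, type(arr[0:1]).__name__)      # int array
```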

Now, we definitely didn't *help* matters by keeping so many of the
default behaviours of bytes() and bytearray() coupled to ASCII-encoded
text, but that was a matter of practicality beating purity: there
really *are* a lot of wire protocols out there that are ASCII based.
In hindsight, perhaps we should have gone further in breaking things
to try to make the point about the mental model shift more forcefully.
(However, that idea carries with it its own problems).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at xemacs.org  Thu May 19 10:00:24 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 19 May 2011 17:00:24 +0900
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTikh6D1KgF7_k_guy4s4JETpz5cJpw@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<BANLkTikh6D1KgF7_k_guy4s4JETpz5cJpw@mail.gmail.com>
Message-ID: <87fwobazmv.fsf@uwakimon.sk.tsukuba.ac.jp>

Robert Collins writes:

 > Thats separate to the implementation issues I have mentioned in this
 > thread and previous.

Oops, sorry.

Nevertheless, I personally think that b'a'[0] == 97 is a good idea,
and consistent with everything else in Python.  It's Unicode (str)
that is weird.  str is surprising when first encountered by a C or
Lisp programmer, but not enough to cause a heart attack given
how weird natural language is.  But I don't see why that weirdness (an
element of LIST of TYPE is a LIST of TYPE, hey, young man, you're very
smart but *it's turtles all the way down!*) should be replicated
elsewhere.

If you want your bytes object to behave like a str, it's very easy to
get that (.decode('latin1')), and nobody has yet demonstrated that
this is too time-inefficient for real work, given the other overhead
imposed by Python.  The space inefficiency could be dealt with as Greg
points out (by internally having a Unicode representation using 1 byte
instead of 2 or 4).  But if you want your bytes object to *be* a
string, then you're confused.  It isn't (any more).  Even if it's just
a matter of flipping one bit in the type field, a
str-with-unibyte-representation is not equal to a bytes object with the
same bytes.
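
As a rough illustration of the .decode('latin1') route, assuming Python 3: every byte value maps to the code point with the same number, so the conversion is lossless in both directions, yet the resulting str still does not compare equal to the bytes it came from. The names `raw` and `text` are illustrative:

```python
raw = bytes(range(256))           # every possible byte value

text = raw.decode("latin-1")      # byte N becomes code point N
assert [ord(ch) for ch in text] == list(raw)
assert text.encode("latin-1") == raw   # lossless round trip

# A str and a bytes object are different things, whatever they contain.
print(b"abc".decode("latin-1") == "abc")   # True
print(b"abc" == "abc")                     # False
```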

For example, you write:

 > urlparse converting bytes to 'str' to operate on them is at best a
 > kludge - you're forcing 5 times the storage (the original bytes + 4
 > bytes-per-byte when its decoded into unicode) to work on something
 > which is defined as a BNF * that uses ascii *.

Indeed it (RFC 3986) does *use* ASCII.  But I think there is confusion
in your words.  This is what the RFC says about that use of ASCII:

   2.  Characters

   The URI syntax provides a method of encoding data, presumably for the
   sake of identifying a resource, as a sequence of characters.  [...]

   The ABNF notation defines its terminal values to be non-negative
   integers (codepoints) based on the US-ASCII coded character set
   [ASCII].  Because a URI is a sequence of characters, we must invert
   that relation in order to understand the URI syntax.  Therefore, the
   integer values used by the ABNF must be mapped back to their
   corresponding characters via US-ASCII in order to complete the syntax
   rules.

Ie, ASCII is *irrelevant* to (the modern definition of) URLs except as
it is a convenient and familiar way to refer to a certain familiar and
rather small set of *characters*.  There are reasons for this (that
I'm not going to rehash here), and they are the *same* reasons why
Python 3's behavior is "correct" IMHO (modulo the issue about the type
of a list element, which I discuss above).

It is true that one might like there to be a literal that expresses
`ord(bytes-object-of-length-one)', ie, something like o'a' == 97.
(This is different from Greg's x'6465616462656566' == b'deadbeef',
which I don't think helps solve the confusion problem although it
would definitely be convenient.)

From python-dev at masklinn.net  Thu May 19 10:05:04 2011
From: python-dev at masklinn.net (Xavier Morel)
Date: Thu, 19 May 2011 10:05:04 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4DD41997.4060401@trueblade.com>
	<BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com>
Message-ID: <78681B05-171D-40EF-BEFC-0ABE57FBE3ED@masklinn.net>

On 2011-05-19, at 09:49 , Nick Coghlan wrote:
> On Thu, May 19, 2011 at 5:10 AM, Eric Smith <eric at trueblade.com> wrote:
>> On 05/18/2011 12:16 PM, Stephen J. Turnbull wrote:
>>> Robert Collins writes:
>>> 
>>>  > Its probably too late to change, but please don't try to argue that
>>>  > its correct: the continued confusion of folk running into this is
>>>  > evidence that confusion *is happening*. Treat that as evidence and
>>>  > think about how to fix it going forward.
>>> 
>>> Sorry, Rob, but you're just wrong here, and Nick is right.  It's
>>> possible to improve Python 3, but not to "fix" it in this respect.
>>> The Python 3 solution is correct, the Python 2 approach is not.
>>> There's no way to avoid discontinuity and confusion here.
>> 
>> I don't think there's any connection between the way 2.x confused text
>> strings and binary data (which certainly needed addressing) with the way
>> that 3.x returns a different type for byte_str[i] than it does for
>> byte_str[i:i+1]. I think it's the latter that's confusing to people.
>> There's no particular requirement for different types that's needed to
>> fix the byte/str problem.
> 
> It's a mental model problem. People try to think of bytes as
> equivalent to 2.x str and that's just wrong, wrong, wrong. It's far
> closer to array.array('c'). Strings are basically *unique* in
> returning a length 1 instance of themselves for indexing operations.
> For every other sequence type, including tuples, lists and arrays,
> slicing returns a new instance of the same type, while indexing will
> typically return something different.
> 
> Now, we definitely didn't *help* matters by keeping so many of the
> default behaviours of bytes() and bytearray() coupled to ASCII-encoded
> text, but that was a matter of practicality beating purity: there
> really *are* a lot of wire protocols out there that are ASCII based.
> In hindsight, perhaps we should have gone further in breaking things
> to try to make the point about the mental model shift more forcefully.
> (However, that idea carries with it its own problems).

For what it's worth, Erlang's approach to the subject is, in my
opinion, excellent:
binaries (whose literals are called "bit syntax" there) are quite
distinct from strings in both syntax and API, but you can put
chunks of strings within binaries (the bit syntax acts as a container,
in which you can put a literal or non-literal string). This
simultaneously impresses upon the user that binaries are *not* strings
and that they can still easily create binaries from strings.


From stefan_ml at behnel.de  Thu May 19 10:37:03 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 19 May 2011 10:37:03 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>
	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>
	<4DD41E2B.7000404@stoneleaf.us>	<4DD44AA6.9030600@canterbury.ac.nz>
	<ir29q8$90n$1@dough.gmane.org>
	<663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net>
Message-ID: <ir2krg$20r$1@dough.gmane.org>

Xavier Morel, 19.05.2011 09:41:
> On 2011-05-19, at 07:28 , Georg Brandl wrote:
>> On 19.05.2011 00:39, Greg Ewing wrote:
>>> If someone sees that
>>>
>>>     some_var[3] == b'd'
>>>
>>> is true, and that
>>>
>>>     some_var[3] == 100
>>>
>>> is also true, they might expect to be able to do things
>>> like
>>>
>>>     n = b'd' + 1
>>>
>>> and get 101... or maybe b'e'...
>>
>> Maybe they should :)
>
> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well?

The result of this must obviously be b"de1".

Stefan


From ncoghlan at gmail.com  Thu May 19 10:43:54 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 19 May 2011 18:43:54 +1000
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD2F661.2050005@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
Message-ID: <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>

OK, summarising the thread so far from my point of view.

1. There are some aspects of the behavior of bytes() objects that
tempt people to think of them as string-like objects (primarily the
b'' literals and their use in repr(), along with the fact that they
fill roles that were filled by str in its "arbitrary binary data"
incarnation in Python 2.x). The mental model this creates in the
reader is incorrect, as bytes() are far closer to array.array('c') in
their underlying behaviour (and deliberately so - cf. PEP 358, 3112,
3137).

One proposal for addressing this is to add an x'deadbeef' literal and
use that in repr() rather than the bytestring. Another would be to
escape all characters, even printable ASCII, in the bytes()
representation. Both of these are undesirable, as they miss the
original purpose of this behaviour: making it easier to work with the
many ASCII based wire protocols that are in widespread use.

To be honest, I don't think there is a lot we can do here except to
further emphasise in the documentation and elsewhere that *bytes is
not a string type* (regardless of any API similarities retained to
ease transition from the 2.x series). For example, if we have any
lingering references to "byte strings" they should be replaced with
"byte sequences" or "bytes objects" (depending on context, as the
former phrasing also encompasses bytearray objects).

2. As a concrete usability issue, it is awkward to programmatically
check the value of a specific byte when working with an ASCII based
protocol:

  data[i] == b'a'      # Intuitive, but always False due to type mismatch
  data[i:i+1] == b'a'  # Works, but clumsy
  data[i] == b'a'[0]   # Ditto (but at least susceptible to compiler
                       # const-expression optimisation)
  data[i] == ord('a')  # Clumsy and slow
  data[i] == 97        # Hard to read
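
The five spellings can be checked directly. This is only a sketch, with `data` and `i` as hypothetical stand-ins for a real protocol buffer and offset:

```python
data = b"spam"   # hypothetical payload
i = 2            # offset of the byte b"a" in that payload

print(data[i] == b"a")        # False: an int never equals a bytes object
print(data[i:i+1] == b"a")    # True
print(data[i] == b"a"[0])     # True
print(data[i] == ord("a"))    # True
print(data[i] == 97)          # True
```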

Proposals to address this include:
- introduce a "character" literal to allow c'a' as an alternative to
  ord('a')
    - potentially workable, but leaves the intuitive answer above
      silently producing an unexpected answer
- allow 1-element byte sequences to compare equal to the corresponding
  integer values
    - would require reworking of bytes.__hash__ to use the hash of the
      contained element when the data length is exactly 1
    - transitivity of equality would recommend also supporting
      equivalences such as b'a' == 97.0
    - backwards compatibility concerns arise due to introduction of
      new key collisions in dictionaries and sets and other value based
      containers
    - yet more string-like behaviour in a type that is *not* a string
      (further reinforcing the mistaken impression from point 1)
    - one thing that *isn't* a concern from my point of view is the
      fact that we have ample precedent in decimal.Decimal for
      supporting implicit coercion in comparison operations while
      disallowing them in arithmetic operations (Decimal("1") == 1.0 is
      allowed, but Decimal("1") + 1.0 will raise TypeError)
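
A small sketch of the Decimal precedent and of the key-collision concern, assuming Python 3.2 or later. The dictionary contents are made up for illustration; the int/float/Decimal collision shown is existing behaviour, used here only as an analogy for what a b'a' == 97 rule would introduce:

```python
from decimal import Decimal

# Comparison coerces, arithmetic does not.
print(Decimal("1") == 1.0)            # True
try:
    Decimal("1") + 1.0
except TypeError as exc:
    print("TypeError:", exc)

# Values that compare equal must hash equally, so they share one key.
table = {1: "int"}
table[1.0] = "float"
table[Decimal("1")] = "Decimal"
print(table)                          # {1: 'Decimal'}

# Today an int key and a bytes key stay distinct.
print(len({97: "int", b"a": "bytes"}))   # 2
```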

For point 2, I'm personally +0 on the idea of having 1-element bytes
and bytearray objects delegate hashing and comparison operations to
the corresponding integer object. We have the power to make the
obvious code correct code, so let's do that. However, the implications
of the additional key collisions in value based containers may need to
be explored further.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu May 19 10:54:18 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 19 May 2011 18:54:18 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Skip some tests in the
 absence of multiprocessing.
In-Reply-To: <4DD3F906.2080100@netwok.org>
References: <E1QMDyC-0004jq-1J@dinsdale.python.org>
	<4DD3F906.2080100@netwok.org>
Message-ID: <BANLkTikwW-Eh33B4zBooH3qCpVCUakA8ig@mail.gmail.com>

On Thu, May 19, 2011 at 2:51 AM, Éric Araujo <merwok at netwok.org> wrote:
> Isn't support.import_module or somesuch useful for this kind of checks?

You have to restructure your tests into the appropriate files for that
to work, as support.import_module() throws SkipTest if the module
isn't available.
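
Roughly, such a helper behaves like the sketch below. `import_or_skip` is a hypothetical stand-in written for illustration, not the actual test.support code, and the missing module name is made up:

```python
import importlib
import unittest

def import_or_skip(name):
    """Hypothetical stand-in for a helper like support.import_module()."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        # Raised while a test file is being imported, this skips the
        # whole file, hence the need to group tests by file.
        raise unittest.SkipTest(str(exc))

json = import_or_skip("json")              # available: returns the module
try:
    import_or_skip("no_such_module_xyz")   # made-up missing module
except unittest.SkipTest as skipped:
    print("skipped:", skipped)
```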

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Thu May 19 11:03:10 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 19 May 2011 19:03:10 +1000
Subject: [Python-Dev] Don't set local variable in a list comprehension
	or generator
In-Reply-To: <1305754449.27389.30.camel@marge>
References: <1305721315.16682.10.camel@marge>
	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
	<1305754449.27389.30.camel@marge>
Message-ID: <BANLkTikxbGi7UPt4_KfBvS7C5fQCA24TGA@mail.gmail.com>

On Thu, May 19, 2011 at 7:34 AM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> But it is slower whereas I read somewhere that generators are faster
> than loops.

Are you sure it wasn't that generator expressions can be faster than
list comprehensions (if the memory savings are significant)?

Or that a reduction function with a generator expression can be faster
than a module-level explicit loop (due to the replacement of
dict-based variable assignment with fast locals in the generator and C
looping in the reduction function)?

In general, as long as both are using fast locals and looping in
Python, I would expect inline looping code to be faster than the
equivalent generator (but often harder to maintain due to lack of
reusability).
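
One way to measure the cases above is sketched here. The function names and the size N are illustrative assumptions, all three variants use fast locals, and the timings will differ between machines and Python versions, so no particular ordering is claimed:

```python
import timeit

N = 10_000  # illustrative problem size

def inline_loop():
    total = 0
    for x in range(N):
        total += x * x
    return total

def python_loop_over_generator():
    total = 0
    for square in (x * x for x in range(N)):
        total += square
    return total

def reduction_over_generator():
    # The looping happens in C inside sum().
    return sum(x * x for x in range(N))

variants = (inline_loop, python_loop_over_generator, reduction_over_generator)
assert len({fn() for fn in variants}) == 1   # all compute the same value

for fn in variants:
    best = min(timeit.repeat(fn, number=20, repeat=3))
    print(f"{fn.__name__}: {best:.4f}s for 20 calls")
```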

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From lukasz at langa.pl  Thu May 19 11:25:23 2011
From: lukasz at langa.pl (Łukasz Langa)
Date: Thu, 19 May 2011 11:25:23 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <ir2krg$20r$1@dough.gmane.org>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>
	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>
	<4DD41E2B.7000404@stoneleaf.us>	<4DD44AA6.9030600@canterbury.ac.nz>
	<ir29q8$90n$1@dough.gmane.org>
	<663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net>
	<ir2krg$20r$1@dough.gmane.org>
Message-ID: <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl>

Stefan Behnel wrote on 2011-05-19 at 10:37:

>> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well?
> 
> The result of this must obviously be b"de1".

I hope you're joking. At best, the result should be b"de\x01". But I don't think such construct should be allowed. Just like you can't do `[1, 2, 3] + 4`. I wouldn't ever expect that a single byte behaves like a sequence of bytes. In the case of bytes b'a' is obviously still a sequence of bytes, just happening to store a single one. Indexing should return a byte so I'm not surprised it returns a number. Slicing on the other hand returns a sub-sequence.
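
For reference, a short sketch of how current Python 3 treats both expressions, plus the two explicit spellings of the competing "obvious" results (variable names are illustrative):

```python
try:
    [1, 2, 3] + 4
except TypeError as exc:
    print("list + int:", exc)

try:
    b"de" + 1
except TypeError as exc:
    print("bytes + int:", exc)

# Stating the intent explicitly is unambiguous:
appended_byte = b"de" + bytes([1])   # append the byte with value 1
appended_text = b"de" + b"1"         # append the ASCII digit "1"
print(appended_byte, appended_text)  # b'de\x01' b'de1'
```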

However inconvenient, I find the current behaviour logical and predictable. A shortcut for b'a'[0] would obviously be nice but that's for python-ideas.

-- 
Best regards,
Łukasz Langa
Senior Systems Architecture Engineer

IT Infrastructure Department
Grupa Allegro Sp. z o.o.

From stefan_ml at behnel.de  Thu May 19 12:06:19 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 19 May 2011 12:06:19 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>	<4DD41E2B.7000404@stoneleaf.us>	<4DD44AA6.9030600@canterbury.ac.nz>	<ir29q8$90n$1@dough.gmane.org>	<663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net>	<ir2krg$20r$1@dough.gmane.org>
	<340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl>
Message-ID: <ir2q2r$ag$1@dough.gmane.org>

Łukasz Langa, 19.05.2011 11:25:
> Stefan Behnel wrote on 2011-05-19 at 10:37:
>
>>> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well?
>>
>> The result of this must obviously be b"de1".
>
> I hope you're joking.

I "obviously" was. My point is that expectations and "obvious behaviour" 
may not be obvious to everyone.

Nick summed it up very nicely IMHO.

Stefan


From catch-all at masklinn.net  Thu May 19 12:12:56 2011
From: catch-all at masklinn.net (Xavier Morel)
Date: Thu, 19 May 2011 12:12:56 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>
	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>
	<4DD41E2B.7000404@stoneleaf.us>	<4DD44AA6.9030600@canterbury.ac.nz>
	<ir29q8$90n$1@dough.gmane.org>
	<663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net>
	<ir2krg$20r$1@dough.gmane.org>
	<340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl>
Message-ID: <052FA5C5-F6F2-4702-9E8A-E78C8E6DD34F@masklinn.net>

On 2011-05-19, at 11:25 , Łukasz Langa wrote:
> Stefan Behnel wrote on 2011-05-19 at 10:37:
> 
>>> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well?
>> 
>> The result of this must obviously be b"de1".
> I hope you're joking. At best, the result should be b"de\x01".

Actually, if `b'd'+1` returns `b'e'` an equivalent behavior should be that `b'de'+1` returns `b'df'`.
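
That reading treats the whole bytes object as one big-endian integer. A sketch of it follows; `increment` is a hypothetical helper, not an existing or proposed API, and it relies on int.from_bytes() and int.to_bytes() from Python 3.2 or later:

```python
def increment(data: bytes) -> bytes:
    """Hypothetical helper: add 1 to bytes read as a big-endian integer."""
    value = int.from_bytes(data, "big") + 1
    # Raises OverflowError if the result needs an extra byte (e.g. b'\xff').
    return value.to_bytes(len(data), "big")

print(increment(b"d"))    # b'e'
print(increment(b"de"))   # b'df'
```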


From victor.stinner at haypocalc.com  Thu May 19 12:34:29 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 19 May 2011 12:34:29 +0200
Subject: [Python-Dev] Don't set local variable in a list comprehension
 or generator
In-Reply-To: <4DD44C6D.8000808@canterbury.ac.nz>
References: <1305721315.16682.10.camel@marge>
	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
	<1305754449.27389.30.camel@marge> <4DD44C6D.8000808@canterbury.ac.nz>
Message-ID: <1305801269.2380.4.camel@marge>

On Thursday 19 May 2011 at 10:47 +1200, Greg Ewing wrote:
> Victor Stinner wrote:
> 
> >    squares = (x*x for x in range(10000))
> 
> What bytecode would you optimise that into?

I suppose that you have the current value of range(10000) on the stack:
DUP_TOP; BINARY_MULTIPLY; gives you the square. You don't need the x
variable (LOAD_FAST/STORE_FAST).

Full example using a function (instead of loop, so I need to load x):
-----------
import dis, opcode, struct

def f(x): return x*x

def patch_bytecode(f, bytecode):
    fcode = f.__code__
    code_type = type(f.__code__)
    new_code = code_type(
        fcode.co_argcount,
        fcode.co_kwonlyargcount,
        fcode.co_nlocals,
        fcode.co_stacksize,
        fcode.co_flags,
        bytecode,
        fcode.co_consts,
        fcode.co_names,
        fcode.co_varnames,
        fcode.co_filename,
        fcode.co_name,
        fcode.co_firstlineno,
        fcode.co_lnotab,
    )
    f.__code__ = new_code

print("Original:")
print("f(4) = %s" % f(4))
dis.dis(f)
print()

LOAD_FAST = opcode.opmap['LOAD_FAST']
DUP_TOP = opcode.opmap['DUP_TOP']
BINARY_MULTIPLY = opcode.opmap['BINARY_MULTIPLY']
RETURN_VALUE = opcode.opmap['RETURN_VALUE']

bytecode = struct.pack(
    '=BHBBB',
    LOAD_FAST, 0,
    DUP_TOP,
    BINARY_MULTIPLY,
    RETURN_VALUE)

print("Patched:")
patch_bytecode(f, bytecode)
print("f(4) patched = %s" % f(4))
dis.dis(f)
-----------

Output:
-----------
$ python3 square.py 
Original:
f(4) = 16
  3           0 LOAD_FAST                0 (x) 
              3 LOAD_FAST                0 (x) 
              6 BINARY_MULTIPLY      
              7 RETURN_VALUE         

Patched:
f(4) patched = 16
  3           0 LOAD_FAST                0 (x) 
              3 DUP_TOP              
              4 BINARY_MULTIPLY      
              5 RETURN_VALUE     
-----------

Victor


From solipsis at pitrou.net  Thu May 19 12:37:27 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 19 May 2011 12:37:27 +0200
Subject: [Python-Dev] Python 3.x and bytes
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>
	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>
	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4DD41997.4060401@trueblade.com>
	<BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com>
Message-ID: <20110519123727.408b401f@pitrou.net>

On Thu, 19 May 2011 17:49:47 +1000
Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> It's a mental model problem. People try to think of bytes as
> equivalent to 2.x str and that's just wrong, wrong, wrong. It's far
> closer to array.array('c'). Strings are basically *unique* in
> returning a length 1 instance of themselves for indexing operations.
> For every other sequence type, including tuples, lists and arrays,
> slicing returns a new instance of the same type, while indexing will
> typically return something different.
> 
> Now, we definitely didn't *help* matters by keeping so many of the
> default behaviours of bytes() and bytearray() coupled to ASCII-encoded
> text, but that was a matter of practicality beating purity: there
> really *are* a lot of wire protocols out there that are ASCII based.

I think "practicality beating purity" should have been extended to
__getitem__ as well. I have almost never had a use for treating a
bytestring as a sequence of integers, while treating a bytestring as a
sequence of one-byte strings is *very* common.

(and, as you say, if you want a sequence of integers you can already
use array.array() which gives you more flexibility as to the width and
signedness of integers)
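
To make the two views concrete, here is a sketch in current Python 3, with an illustrative value for `data`:

```python
data = b"GET"

# What indexing and iteration give today: integers.
print(list(data))                                  # [71, 69, 84]

# The sequence-of-one-byte-strings view needs length-1 slices.
print([data[i:i+1] for i in range(len(data))])     # [b'G', b'E', b'T']
```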

Regards

Antoine.



From victor.stinner at haypocalc.com  Thu May 19 12:39:57 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 19 May 2011 12:39:57 +0200
Subject: [Python-Dev] Don't set local variable in a list comprehension
 or generator
In-Reply-To: <ir1slo$ibs$1@dough.gmane.org>
References: <1305721315.16682.10.camel@marge>
	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
	<1305754449.27389.30.camel@marge>  <ir1slo$ibs$1@dough.gmane.org>
Message-ID: <1305801597.2380.9.camel@marge>

On Wednesday 18 May 2011 at 21:44 -0400, Terry Reedy wrote:
> On 5/18/2011 5:34 PM, Victor Stinner wrote:
> 
> Your initial example gave me the impression that the issue has something 
> to do with join in particular, or even comprehensions in particular. It 
> is really about for loops.
> 
>  >>> dis('for x in range(3): y = x*x')
>    ...
>          >>   13 FOR_ITER                16 (to 32)
>               16 STORE_NAME               1 (x)
>               19 LOAD_NAME                1 (x)
>               22 LOAD_NAME                1 (x)
>               25 BINARY_MULTIPLY
>               26 STORE_NAME               2 (y)
>   ...

Yeah, "STORE_NAME; LOAD_NAME; LOAD_NAME" can be replaced by a single
opcode: DUP_TOP. But the user expects x to be defined outside the loop:

>>> for x in range(3): y = x*x
... 
>>> x
2

Well, it is possible to detect if x is used or not after the loop, but
it is a little more complex to optimize than list
comprehension/generator :-)
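
The difference in visibility can be seen in a few lines (Python 3 semantics; the names are illustrative):

```python
for x in range(3):
    y = x * x
print(x)      # 2: the loop variable outlives the for statement

squares = [elem * elem for elem in range(3)]
try:
    elem
except NameError:
    print("elem is not visible outside the comprehension")
```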

> .. you cannot get that with Python code without a much smarter optimizer.

Yes, I would like to write a smarter optimizer. But I first asked whether
it would be acceptable to avoid the temporary loop variable, because doing
so changes the Python language: the user can expect to see a loop variable
when using introspection or a debugger. That's why I suggested enabling the
optimization only if Python is running in optimized mode (python -O or
python -OO).

Victor


From ncoghlan at gmail.com  Thu May 19 13:02:12 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 19 May 2011 21:02:12 +1000
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
Message-ID: <BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com>

On Thu, May 19, 2011 at 6:43 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> For point 2, I'm personally +0 on the idea of having 1-element bytes
> and bytearray objects delegate hashing and comparison operations to
> the corresponding integer object. We have the power to make the
> obvious code correct code, so let's do that. However, the implications
> of the additional key collisions in value based containers may need to
> be explored further.

On further reflection, the key collision and semantics blurring
problems mean I am at best -0 on this particular solution to the
problem (and heading fairly rapidly in the direction of -1).

Best to just go with b'a'[0] and let the optimiser sort it out (PyPy
should handle it automatically, CPython would need work).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From fuzzyman at voidspace.org.uk  Thu May 19 13:29:07 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Thu, 19 May 2011 12:29:07 +0100
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>	<4DD41E2B.7000404@stoneleaf.us>	<4DD44AA6.9030600@canterbury.ac.nz>	<ir29q8$90n$1@dough.gmane.org>	<663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net>	<ir2krg$20r$1@dough.gmane.org>
	<340C7155-49FE-4EF7-963E-65EA8DB9DDEE@langa.pl>
Message-ID: <4DD4FF03.5070005@voidspace.org.uk>

On 19/05/2011 10:25, Łukasz Langa wrote:
> Stefan Behnel wrote on 2011-05-19 at 10:37:
>
>>> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If a 1-byte bytes is equivalent to an integer, why not an arbitrary one as well?
>> The result of this must obviously be b"de1".
> I hope you're joking. At best, the result should be b"de\x01".
The behaviour Stefan suggests is what some "weakly typed" languages like 
perl (and possibly php?) do, which masks errors and is rightly abhorred 
by Python programmers (although semantically not *so* different from 1 + 
1.0 == 2.0). I think it's safe to say that Stefan was joking.

Michael

>   But I don't think such construct should be allowed. Just like you can't do `[1, 2, 3] + 4`. I wouldn't ever expect that a single byte behaves like a sequence of bytes. In the case of bytes b'a' is obviously still a sequence of bytes, just happening to store a single one. Indexing should return a byte so I'm not surprised it returns a number. Slicing on the other hand returns a sub-sequence.
>
> However inconvenient, I find the current behaviour logical and predictable. A shortcut for b'a'[0] would obviously be nice but that's for python-ideas.
>


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From ziade.tarek at gmail.com  Thu May 19 13:35:39 2011
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Thu, 19 May 2011 13:35:39 +0200
Subject: [Python-Dev] packaging landed in stdlib
Message-ID: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com>

Hey

I've pushed packaging in stdlib. There are a few buildbots errors
we're fixing right now.

We will continue our work there directly from now on.

The next "big" commit will be for the documentation.

Cheers
Tarek
-- 
Tarek Ziadé | http://ziade.org

From greg.ewing at canterbury.ac.nz  Thu May 19 14:16:31 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 20 May 2011 00:16:31 +1200
Subject: [Python-Dev] Don't set local variable in a list comprehension
 or generator
In-Reply-To: <1305801269.2380.4.camel@marge>
References: <1305721315.16682.10.camel@marge>
	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
	<1305754449.27389.30.camel@marge> <4DD44C6D.8000808@canterbury.ac.nz>
	<1305801269.2380.4.camel@marge>
Message-ID: <4DD50A1F.3010008@canterbury.ac.nz>

Victor Stinner wrote:

> I suppose that you have the current value of range(10000) on the stack:
> DUP_TOP; BINARY_MULTIPLY; gives you the square. You don't need the x
> variable (LOAD_FAST/STORE_FAST).

That seems far too special-purpose to be worth it to me.

-- 
Greg

From doug.hellmann at gmail.com  Thu May 19 15:07:27 2011
From: doug.hellmann at gmail.com (Doug Hellmann)
Date: Thu, 19 May 2011 09:07:27 -0400
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
Message-ID: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>

Several of the PSF blogs hosted on Google's Blogger platform are experiencing issues as fallout from the recent maintenance problems they had. We have already had to recreate at least one of the translations for Python Insider in order to be able to publish to it, and now we can't edit posts on Python Insider itself.

Can anyone put me in contact with someone at Google from the Blogger team? I would at least like to know whether the "bX-qpvq7q" problem is being worked on, so I can decide whether to take a hiatus or start moving us to another platform. There are a lot of posts about the error on the support forums, but no obvious response from Google.

Thanks,
Doug

--
Doug Hellmann
Communications Director
Python Software Foundation
http://python.org/psf/


From tseaver at palladion.com  Thu May 19 19:05:36 2011
From: tseaver at palladion.com (Tres Seaver)
Date: Thu, 19 May 2011 13:05:36 -0400
Subject: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
In-Reply-To: <BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com>
References: <4DD2C37D.7000008@python.org>
	<4DD4127E.6050301@zhuliguan.net>	<4DD4194F.9020009@v.loewis.de>
	<ir176p$25a$1@dough.gmane.org>
	<BANLkTinC3+O88edv+HkN4zNTCz1uXS+z_w@mail.gmail.com>
Message-ID: <ir3il1$ulp$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 05/18/2011 10:46 PM, anatoly techtonik wrote:
> On Wed, May 18, 2011 at 10:37 PM, Georg Brandl <g.brandl at gmx.net> wrote:
>> On 18.05.2011 21:09, "Martin v. Löwis" wrote:
>>> On 18.05.2011 20:39, Hagen Fürstenau wrote:
>>>>> On behalf of the Python development team, I am pleased to announce the
>>>>> first release candidate of Python 3.2.1.
>>>>
>>>> Shouldn't there be a tag "v3.2.1rc1" in the hg repo?
>>>
>>> http://hg.python.org/releasing/3.2.1/
>>>
>>> Regards,
>>> Martin
>>>
>>> P.S. "Shouldn't" makes it sound as if there was a mistake.
>>
>> To clarify: once the final is done, the repo Martin mentioned will be
>> merged back to main and then vanish.
> 
> Can't this work be done in the branch of main repo, so that everybody
> can track the progress in place? Is there any picture of the process
> similar to http://nvie.com/posts/a-successful-git-branching-model/ ?

Note that in that writeup, 'release-*' (and 'hotfix-*') branches are not
shown as pushed to the 'origin' repository.


Tres.
- -- 
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk3VTeAACgkQ+gerLs4ltQ42kgCeMbIDH6zRU5uyd0Su28Nb9E5q
WAMAniWnrvzRReDa+b3mYtavbyaywGVJ
=Dr2p
-----END PGP SIGNATURE-----


From ziade.tarek at gmail.com  Thu May 19 19:12:14 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Thu, 19 May 2011 19:12:14 +0200
Subject: [Python-Dev] packaging landed in stdlib
In-Reply-To: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com>
References: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com>
Message-ID: <BANLkTinDyNY0E_NyECXeNGW4zgqHadCqBw@mail.gmail.com>

On Thu, May 19, 2011 at 1:35 PM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> Hey
>
> I've pushed packaging in stdlib. There are a few buildbots errors
> we're fixing right now.

FYI.

There are still some failures we're fixing. Thanks for your patience
and thanks to the folks that are helping me on this :)

I expect the bbots to be back on track later today.

Cheers
Tarek
-- 
Tarek Ziadé | http://ziade.org

From ethan at stoneleaf.us  Thu May 19 19:50:10 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 19 May 2011 10:50:10 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
Message-ID: <4DD55852.9070903@stoneleaf.us>

Nick Coghlan wrote:
> OK, summarising the thread so far from my point of view. 

[snip]

> To be honest, I don't think there is a lot we can do here except to
> further emphasise in the documentation and elsewhere that *bytes is
> not a string type* (regardless of any API similarities retained to
> ease transition from the 2.x series). For example, if we have any
> lingering references to "byte strings" they should be replaced with
> "byte sequences" or "bytes objects" (depending on context, as the
> former phrasing also encompasses bytearray objects).

I think this would be a big help.

> 2. As a concrete usability issue, it is awkward to programmatically
> check the value of a specific byte when working with an ASCII based
> protocol:
> 
>   data[i] == b'a' # Intuitive, but always False due to type mismatch
>   data[i:i+1] == b'a'  # Works, but clumsy
>   data[i] == b'a'[0]  # Ditto (but at least susceptible to compiler
> const-expression optimisation)
>   data[i] == ord('a') # Clumsy and slow
>   data[i] == 97 # Hard to read
> 
> Proposals to address this include:
> - introduce a "character" literal to allow c'a' as an alternative to ord('a')
>     Potentially workable, but leaves the intuitive answer above
>     silently producing an unexpected answer

[snip]

> For point 2, I'm personally +0 on the idea of having 1-element bytes
> and bytearray objects delegate hashing and comparison operations to
> the corresponding integer object. We have the power to make the
> obvious code correct code, so let's do that. However, the implications
> of the additional key collisions in value based containers may need to
> be explored further.

Nick Coghlan also wrote:
 > On further reflection, the key collision and semantics blurring
 > problems mean I am at best -0 on this particular solution to the
 > problem (and heading fairly rapidly in the direction of -1).

Last thought I have for a possible 'solution' -- when a bytes object is 
tested for equality against an int, raise TypeError.  The precedent is 
sum() raising a TypeError when passed a list of strings because 
performance is so poor.  The reason here is that the intuitive behavior 
will never work and will always produce silent bugs.

~Ethan~


From guido at python.org  Thu May 19 19:43:02 2011
From: guido at python.org (Guido van Rossum)
Date: Thu, 19 May 2011 10:43:02 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
Message-ID: <BANLkTik9oXD0Tont0MeyFF9im655946r2g@mail.gmail.com>

On Thu, May 19, 2011 at 1:43 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> OK, summarising the thread so far from my point of view.
>
> 1. There are some aspects of the behavior of bytes() objects that
> tempt people to think of them as string-like objects (primarily the
> b'' literals and their use in repr(), along with the fact that they
> fill roles that were filled by str in its "arbitrary binary data"
> incarnation in Python 2.x). The mental model this creates in the
> reader is incorrect, as bytes() are far closer to array.array('c') in
> their underlying behaviour (and deliberately so - cf. PEP 358, 3112,
> 3137).

I think most of this "wrong mental model" is actually due to people
not having completely internalized the Python 3 way.

> One proposal for addressing this is to add a x'deadbeef' literal and
> using that in repr() rather than the bytestring. Another would be to
> escape all characters, even printable ASCII, in the bytes()
> representation. Both of these are undesirable, as they miss the
> original purpose of this behaviour: making it easier to work with the
> many ASCII based wire protocols that are in widespread use.

Indeed, -1 on both.

> To be honest, I don't think there is a lot we can do here except to
> further emphasise in the documentation and elsewhere that *bytes is
> not a string type* (regardless of any API similarities retained to
> ease transition from the 2.x series). For example, if we have any
> lingering references to "byte strings" they should be replaced with
> "byte sequences" or "bytes objects" (depending on context, as the
> former phrasing also encompasses bytearray objects).

+1

> 2. As a concrete usability issue, it is awkward to programmatically
> check the value of a specific byte when working with an ASCII based
> protocol:
>
>   data[i] == b'a' # Intuitive, but always False due to type mismatch
>   data[i:i+1] == b'a'  # Works, but clumsy
>   data[i] == b'a'[0]  # Ditto (but at least susceptible to compiler
> const-expression optimisation)
>   data[i] == ord('a') # Clumsy and slow
>   data[i] == 97 # Hard to read
>
> Proposals to address this include:
> - introduce a "character" literal to allow c'a' as an alternative to ord('a')

-1; the result is not a *character* but an integer. I'm personally
favoring using b'a'[0] and possibly hiding this in a constant
definition.
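A minimal sketch of the comparisons listed above and of the constant-definition approach. The names data, i and LOWER_A are illustrative assumptions, not anything from an existing code base.

```python
data = b"abc"
i = 0

# Indexing a bytes object yields an int, so this is always False.
print(data[i] == b"a")        # False
# Slicing yields a bytes object of length 1, so this works.
print(data[i:i + 1] == b"a")  # True
# Comparing int with int works too.
print(data[i] == b"a"[0])     # True
print(data[i] == ord("a"))    # True
print(data[i] == 97)          # True

# Hiding the indexing in a named constant (LOWER_A is a made-up name).
LOWER_A = b"a"[0]
print(data[i] == LOWER_A)     # True
```

The constant keeps the mnemonic value of the literal while making the comparison int against int.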

>    Potentially workable, but leaves the intuitive answer above
> silently producing an unexpected answer

I'm not convinced that that problem is any worse than other
comparison-related problems. E.g. b'a' == 'a' also always returns
False (most likely it'll be disguised by at least one operand being a
variable of course.)

> - allow 1-element byte sequences to compare equal to the corresponding
> integer values.
>    - would require reworking of bytes.__hash__ to use the hash of the
> contained element when the data length is exactly 1
>    - transitivity of equality would recommend also supporting
> equivalences such as b'a' == 97.0
>    - backwards compatibility concerns arise due to introduction of
> new key collisions in dictionaries and sets and other value based
> containers
>    - yet more string-like behaviour in a type that is *not* a string
> (further reinforcing the mistaken impression from point 1)
>    - One thing that *isn't* a concern from my point of view is the
> fact that we have ample precedent in decimal.Decimal for supporting
> implicit coercion in comparison operations while disallowing them in
> arithmetic operations (Decimal("1") == 1.0 is allowed, but
> Decimal("1") + 1.0 will raise TypeError).
>
> For point 2, I'm personally +0 on the idea of having 1-element bytes
> and bytearray objects delegate hashing and comparison operations to
> the corresponding integer object. We have the power to make the
> obvious code correct code, so let's do that. However, the implications
> of the additional key collisions in value based containers may need to
> be explored further.

My gut feeling about this is that this will probably introduce some
confusing or unintended side effect elsewhere, and I am -1 on this
change.
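For reference, the decimal.Decimal precedent mentioned in the quoted summary can be checked directly. This is a sketch of the behaviour in Python 3.2 and later; the variable names are illustrative.

```python
from decimal import Decimal

one = Decimal("1")
mixed_error = None

# Comparison with a float is allowed.
print(one == 1.0)  # True

# Equal numeric values also hash equal, so they collide as dict keys.
print(hash(one) == hash(1.0) == hash(1))  # True

# Arithmetic mixing Decimal and float is refused.
try:
    one + 1.0
except TypeError as exc:
    mixed_error = exc
    print("TypeError:", exc)
```

The numeric types keep equality and hashing consistent with each other, which is the part that a bytes-equals-int rule would also have to get right.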

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Thu May 19 19:46:14 2011
From: guido at python.org (Guido van Rossum)
Date: Thu, 19 May 2011 10:46:14 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD55852.9070903@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
	<4DD55852.9070903@stoneleaf.us>
Message-ID: <BANLkTimYJc0s=WJjMzwWtHwwC+JKmY68Og@mail.gmail.com>

On Thu, May 19, 2011 at 10:50 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
> Last thought I have for a possible 'solution' -- when a bytes object is
> tested for equality against an int raise TypeError.  Precedent being sum()
> raising a TypeError when passed a list of strings because performance is so
> poor.  Reason here being that the intuitive behavior will never work and
> will always produce silent bugs.

Not the same thing at all. The == operator is special, and should not
raise exceptions; too many things would start randomly failing (e.g.
membership tests for a dict that has both ints and bytes as keys, or
for a list containing a variety of types).
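A small sketch of the failure mode described above. Touchy is a hypothetical class standing in for a type whose == raises on a type mismatch; it is not a proposal for how bytes would be implemented.

```python
class Touchy:
    """Hypothetical type whose == raises on a type mismatch."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Touchy):
            raise TypeError(
                "cannot compare Touchy with %s" % type(other).__name__)
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)


mixed = [1, "two", Touchy(3)]

# A list membership test compares the needle with each element in turn,
# so the search fails as soon as it reaches the Touchy element.
try:
    found = 4 in mixed
except TypeError as exc:
    found = None
    print("membership test failed:", exc)
```

An innocent-looking `4 in mixed` now raises instead of answering False, purely because of an unrelated element in the container.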

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Thu May 19 19:56:23 2011
From: guido at python.org (Guido van Rossum)
Date: Thu, 19 May 2011 10:56:23 -0700
Subject: [Python-Dev] Don't set local variable in a list comprehension
	or generator
In-Reply-To: <1305754449.27389.30.camel@marge>
References: <1305721315.16682.10.camel@marge>
	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
	<1305754449.27389.30.camel@marge>
Message-ID: <BANLkTik+av3-HSTRPGJYwiq07dSGtcV6zw@mail.gmail.com>

On Wed, May 18, 2011 at 2:34 PM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> On Wednesday, 18 May 2011 at 16:19 +0200, Nadeem Vawda wrote:
>> I'm not sure why you would encounter code like that in the first place.
>
> Well, I found the STORE_FAST/LOAD_FAST "issue" while trying to optimize
> the this module which reimplements rot13 using a dict in Python 3:
>
> d = {}
> for c in (65, 97):
>    for i in range(26):
>        d[chr(i+c)] = chr((i+13) % 26 + c)
>
> I tried:
>
> d = {chr(i+c): chr((i+13) % 26 + c)
>     for i in range(26)
>     for c in (65, 97)}
>
> But it is slower whereas I read somewhere that generators are faster
> than loops.

I'm curious where you read that. The explicit loop should be faster or
equally fast *except* when you can avoid a loop in bytecode by
applying map() to a built-in function. However map() with a lambda is
significantly slower. Maybe what you recall actually (correctly) said
that a comprehension is faster than map+lambda?
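The four variants being compared can be timed side by side with timeit. This is only a sketch under assumed inputs (a list of 1000 short strings); the function names are made up, and the numbers depend on the interpreter version and machine, so none are quoted here.

```python
import timeit

values = [str(n) for n in range(1000)]


def with_loop():
    out = []
    for v in values:
        out.append(len(v))
    return out


def with_comprehension():
    return [len(v) for v in values]


def with_map_builtin():
    # No Python-level loop body: map() calls the built-in directly.
    return list(map(len, values))


def with_map_lambda():
    # Each item costs an extra Python function call.
    return list(map(lambda v: len(v), values))


candidates = [with_loop, with_comprehension, with_map_builtin, with_map_lambda]
for func in candidates:
    seconds = timeit.timeit(func, number=200)
    print("%-20s %.4f s" % (func.__name__, seconds))
```

All four produce the same list; only the amount of bytecode executed per item differs.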

> By the way, (c for c in ...) is slower than [c for c
> in ...]. I suppose that a generator is slower because it exits/reenters
> into PyEval_EvalFrameEx() at each step, whereas [c for c ...] uses
> BUILD_LIST in a dummy (but fast) loop.

Did you test this in Python 2 or 3? In 2 the genexpr is definitely
slower than the comprehension; in 3 I'm not sure there's much
difference any more.

> (c for c in ...) and [c for c in ...] is stupid, but I used a simplified
> example to explain the problem. A more realistic example would be:
>
>   squares = (x*x for x in range(10000))
>
> You don't really need the "x" variable, you just want the square.
> Another example is the syntax using an if to filter the data set:
>
>   (x for x in ... if condition(x))
>
>> > I heard about optimization in the AST tree instead of working on the
>> > bytecode. What is the status of this project?
>>
>> Are you referring to issue11549? There was some related discussion [1] on
>> python-dev about six weeks ago, but I haven't seen anything on the topic
>> since then.
>
> Ah yes, it looks to be this issue. I didn't know that there was an
> issue.

Hm, probably.

-- 
--Guido van Rossum (python.org/~guido)

From glyph at twistedmatrix.com  Thu May 19 20:22:20 2011
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Thu, 19 May 2011 14:22:20 -0400
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTik9oXD0Tont0MeyFF9im655946r2g@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
	<BANLkTik9oXD0Tont0MeyFF9im655946r2g@mail.gmail.com>
Message-ID: <16FC9995-2C52-44C2-BDDE-7E7E4B54C9E3@twistedmatrix.com>


On May 19, 2011, at 1:43 PM, Guido van Rossum wrote:

> -1; the result is not a *character* but an integer.

Well, really the result ought to be an octet, but I suppose adding an 'octet' type is beyond the scope of even this sprawling discussion :).

> I'm personally favoring using b'a'[0] and possibly hiding this in a constant definition.

As someone who spends a frankly unfortunate amount of time handling protocols where things like this are necessary, I agree with this recommendation.  In protocols where one needs to compare network data with one-byte type identifiers or packet prefixes, more (documented) constants and less inscrutable junk like

if p == 'c':
   ...
elif p == 'j':
   ...
elif p == 'J': # for compatibility
   ...

would definitely be a good thing.  Of course, I realize that this sort of programmer will most likely replace those constants with 99, 106, 74 rather than take a moment to document what they mean, but at least they'll have to pause for a moment and realize that they have now lost _all_ mnemonics...

In fact, I feel like I would want to push in the opposite direction: don't treat one-byte bytes slices less like integers; I wish I could more easily treat n-byte sequences _more_ like integers! :).  More protocols have 2-byte or 4-byte network-endian packed integers embedded in them than have individual tag bytes that I want to examine.  For the typical ASCII-ish protocol where you want to look at command names and CRLF-separated messages, you'd never want to look at an individual octet; stringish operations like split() will give you what you want.
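A sketch of both ideas, documented tag constants and n-byte fields read as integers, using only the standard library. The packet layout and the TAG_* names are invented for illustration and do not describe any real protocol.

```python
import struct

# Made-up one-byte packet tags, documented once and compared as ints.
TAG_CONNECT = b"c"[0]
TAG_JOIN = b"j"[0]

# Hypothetical packet: 1-byte tag, 2-byte length, 4-byte sequence number,
# all in network (big-endian) byte order, followed by the payload.
packet = b"j" + b"\x00\x05" + b"\x00\x00\x01\x00" + b"hello"

tag = packet[0]
# struct handles fixed-size packed integers ("!" means network order).
(length,) = struct.unpack("!H", packet[1:3])
# int.from_bytes (Python 3.2+) handles a byte sequence of any width.
sequence = int.from_bytes(packet[3:7], "big")
payload = packet[7:7 + length]

print(tag == TAG_JOIN, length, sequence, payload)
```

Both struct.unpack and int.from_bytes go from bytes to int explicitly, so no implicit equivalence between the two types is needed.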


From g.brandl at gmx.net  Thu May 19 20:30:18 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 19 May 2011 20:30:18 +0200
Subject: [Python-Dev] packaging landed in stdlib
In-Reply-To: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com>
References: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com>
Message-ID: <ir3njj$u61$1@dough.gmane.org>

On 19.05.2011 13:35, Tarek Ziadé wrote:
> Hey
> 
> I've pushed packaging in stdlib. There are a few buildbots errors
> we're fixing right now.
> 
> We will continue our work in there directly from now on.

Rock on!

Georg



From g.brandl at gmx.net  Thu May 19 21:31:01 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 19 May 2011 21:31:01 +0200
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <ir2krg$20r$1@dough.gmane.org>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>
	<4DD35B9C.3030702@canterbury.ac.nz>	<4DD3EC7A.8070801@stoneleaf.us>
	<4DD41E2B.7000404@stoneleaf.us>	<4DD44AA6.9030600@canterbury.ac.nz>
	<ir29q8$90n$1@dough.gmane.org>
	<663D4696-0454-45A8-A6F7-AD18A07709FA@masklinn.net>
	<ir2krg$20r$1@dough.gmane.org>
Message-ID: <ir3r5d$ii1$1@dough.gmane.org>

On 19.05.2011 10:37, Stefan Behnel wrote:
> Xavier Morel, 19.05.2011 09:41:
>> On 2011-05-19, at 07:28 , Georg Brandl wrote:
>>> On 19.05.2011 00:39, Greg Ewing wrote:
>>>> If someone sees that
>>>> 
>>>> some_var[3] == b'd'
>>>> 
>>>> is true, and that
>>>> 
>>>> some_var[3] == 100
>>>> 
>>>> is also true, they might expect to be able to do things like
>>>> 
>>>> n = b'd' + 1
>>>> 
>>>> and get 101... or maybe b'e'...
>>> 
>>> Maybe they should :)
>> 
>> But why wouldn't "they" expect `b'de' + 1` to work as well in this case? If
>> a 1-byte bytes is equivalent to an integer, why not an arbitrary one as
>> well?
> 
> The result of this must obviously be b"de1".

To clarify my original one-liner: if bytes objects (but only one-char bytes
objects) equal integers, you should rightly expect to treat them as integers.

This is obviously *not* desirable from a strong-typing POV.
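For comparison, this is how the arithmetic in question behaves in Python 3 as it stands; a short sketch with illustrative variable names.

```python
byte = b"d"

# bytes + int is rejected: a bytes object is a sequence, not a number.
try:
    byte + 1
except TypeError as exc:
    print("TypeError:", exc)

# Arithmetic has to go through the int explicitly.
next_code = byte[0] + 1
print(next_code)           # 101
print(bytes([next_code]))  # b'e'
```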

Georg


From tjreedy at udel.edu  Thu May 19 22:36:42 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 19 May 2011 16:36:42 -0400
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<BANLkTimWotV=ZcVD1c_BFS6TyfdrgfLNDw@mail.gmail.com>	<BANLkTim+ygz=Y7ZPttjjWk+VEBpYVwt=gw@mail.gmail.com>	<87k4doasr7.fsf@uwakimon.sk.tsukuba.ac.jp>	<4DD41997.4060401@trueblade.com>
	<BANLkTimE7F68kWyrAO130pO2v9RZrSu1DA@mail.gmail.com>
Message-ID: <ir3v0p$g3d$1@dough.gmane.org>

On 5/19/2011 3:49 AM, Nick Coghlan wrote:

> It's a mental model problem. People try to think of bytes as
> equivalent to 2.x str and that's just wrong, wrong, wrong. It's far
> closer to array.array('c').

Or like C char arrays

> Strings are basically *unique* in
> returning a length 1 instance of themselves for indexing operations.

I still remember having to work that out and get used to it.
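The difference is easy to check at the prompt. A short sketch, with arbitrary sample values:

```python
text = "abc"
raw = b"abc"
items = [1, 2, 3]

print(type(text[0]) is type(text))    # True: indexing a str gives a str
print(type(raw[0]) is type(raw))      # False: indexing bytes gives an int
print(type(items[0]) is type(items))  # False: indexing a list gives an element

# A str can be indexed again and again; it never bottoms out.
print(text[0][0][0])  # a
```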

-- 
Terry Jan Reedy


From skip at pobox.com  Fri May 20 01:47:57 2011
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 19 May 2011 18:47:57 -0500
Subject: [Python-Dev] Don't set local variable in a list comprehension
 or generator
In-Reply-To: <ir1til$lo6$1@dough.gmane.org>
References: <1305721315.16682.10.camel@marge>
	<BANLkTi=k46QMDYjHcBVCrAs5mAMfJYUA9Q@mail.gmail.com>
	<ir1cpj$52a$1@dough.gmane.org>
	<BANLkTinih_3ght8GSknCNrTjaD3t9i-Ayw@mail.gmail.com>
	<ir1til$lo6$1@dough.gmane.org>
Message-ID: <19925.44077.710651.843807@montanaro.dyndns.org>

On 5/18/2011 10:19 AM, Nadeem Vawda wrote:

> I'm not sure why you would encounter code like that in the first place.
> Surely any code of the form:
> 
> ''.join(c for c in my_string)
> 
> would just return my_string? Or am I missing something?

You might more-or-less legitimately encounter it if the generator expression
originally contained a condition which got removed.
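A sketch of how that leftover can arise; the sample string and the isalpha() filter are assumptions chosen for illustration.

```python
my_string = "Hello, World"

# Original form, with a filter that keeps only letters.
letters_only = "".join(c for c in my_string if c.isalpha())

# Once the condition is removed, the expression is just an identity copy.
leftover = "".join(c for c in my_string)

print(letters_only)
print(leftover == my_string)
```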

Skip

From victor.stinner at haypocalc.com  Fri May 20 00:51:23 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 20 May 2011 00:51:23 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12120,
 Issue #12119: tests were missing a sys.dont_write_bytecode check
In-Reply-To: <E1QN7SR-0004Nt-A9@dinsdale.python.org>
References: <E1QN7SR-0004Nt-A9@dinsdale.python.org>
Message-ID: <1305845483.10075.5.camel@marge>

Python 3.3 is not supposed to create .pyc files in the same directory
as the .py files, so I don't understand the following code.
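For context, since PEP 3147 (Python 3.2) the import system caches bytecode under a __pycache__ directory rather than next to the source. The sketch below uses importlib.util.cache_from_source, which is available in later Python versions than the one under discussion in this thread; the path "pkg/__init__.py" is only an example and the file does not need to exist.

```python
import importlib.util
import sys

# Where the import system would cache bytecode for a given source file.
cached = importlib.util.cache_from_source("pkg/__init__.py")
print(cached)

# When this flag is true, no bytecode files are written on import.
print(sys.dont_write_bytecode)
```

On CPython with default settings the printed path sits inside a __pycache__ directory and carries an interpreter tag, so a test that looks for "__init__.pyc" beside the source file would not find it there.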

On Thursday, 19 May 2011 at 19:56 +0200, tarek.ziade wrote:
> http://hg.python.org/cpython/rev/9d1fb6a9104b
> changeset:   70207:9d1fb6a9104b
> user:        Tarek Ziade <tarek at ziade.org>
> date:        Thu May 19 19:56:12 2011 +0200
> summary:
>   Issue #12120, Issue #12119: tests were missing a sys.dont_write_bytecode check
> 
> files:
>   Lib/distutils/tests/test_build_py.py         |  3 ++-
>   Lib/packaging/tests/test_command_build_py.py |  3 ++-
>   Misc/NEWS                                    |  3 +++
>   3 files changed, 7 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/Lib/distutils/tests/test_build_py.py b/Lib/distutils/tests/test_build_py.py
> --- a/Lib/distutils/tests/test_build_py.py
> +++ b/Lib/distutils/tests/test_build_py.py
> @@ -58,7 +58,8 @@
>          pkgdest = os.path.join(destination, "pkg")
>          files = os.listdir(pkgdest)
>          self.assertTrue("__init__.py" in files)
> -        self.assertTrue("__init__.pyc" in files)
> +        if not sys.dont_write_bytecode:
> +            self.assertTrue("__init__.pyc" in files)
>          self.assertTrue("README.txt" in files)
>  
>      def test_empty_package_dir (self):
> diff --git a/Lib/packaging/tests/test_command_build_py.py b/Lib/packaging/tests/test_command_build_py.py
> --- a/Lib/packaging/tests/test_command_build_py.py
> +++ b/Lib/packaging/tests/test_command_build_py.py
> @@ -61,7 +61,8 @@
>          pkgdest = os.path.join(destination, "pkg")
>          files = os.listdir(pkgdest)
>          self.assertIn("__init__.py", files)
> -        self.assertIn("__init__.pyc", files)
> +        if not sys.dont_write_bytecode:
> +            self.assertIn("__init__.pyc", files)
>          self.assertIn("README.txt", files)
>  
>      def test_empty_package_dir(self):
> diff --git a/Misc/NEWS b/Misc/NEWS
> --- a/Misc/NEWS
> +++ b/Misc/NEWS
> @@ -153,6 +153,9 @@
>  Library
>  -------
>  
> +- Issue #12120, #12119: skip a test in packaging and distutils
> +  if sys.dont_write_bytecode is set to True.
> +
>  - Issue #12065: connect_ex() on an SSL socket now returns the original errno
>    when the socket's timeout expires (it used to return None).
>  
> 
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> http://mail.python.org/mailman/listinfo/python-checkins



From ethan at stoneleaf.us  Fri May 20 02:40:26 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 19 May 2011 17:40:26 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
	<BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com>
Message-ID: <4DD5B87A.9060902@stoneleaf.us>

Nick Coghlan wrote:
> On Thu, May 19, 2011 at 6:43 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> For point 2, I'm personally +0 on the idea of having 1-element bytes
>> and bytearray objects delegate hashing and comparison operations to
>> the corresponding integer object. We have the power to make the
>> obvious code correct code, so let's do that. However, the implications
>> of the additional key collisions in value based containers may need to
>> be explored further.

Several folk have said that objects that compare equal must hash equal...

Why?  It's an honest question.  Here's what I have tried:

--> class Wierd():
...     def __init__(self, value):
...         self.value = value
...     def __eq__(self, other):
...         return self.value == other
...     def __hash__(self):
...         return hash((self.value + 13) ** 3)
...
--> one = Wierd(1)
--> two = Wierd(2)
--> three = Wierd(3)
--> one
<Wierd object at 0x00BFE710>
--> one == 1
True
--> one == 2
False
--> two == 2
True
--> three == 3
True
--> d = dict()
--> d[one] = '1'
--> d[two] = '2'
--> d[three] = '3'
--> d
{<Wierd object at 0x00BFE710>: '1',
  <Wierd object at 0x00BFE870>: '3',
  <Wierd object at 0x00BFE830>: '2'}
--> d[1] = '1.0'
--> d[2] = '2.0'
--> d[3] = '3.0'
--> d
{<Wierd object at 0x00BFE870>: '3',
  1: '1.0',
  2: '2.0',
  3: '3.0',
  <Wierd object at 0x00BFE830>: '2',
  <Wierd object at 0x00BFE710>: '1'}
--> d[2]
'2.0'
--> d[two]
'2'

This behavior matches what I was imagining for having
b'a' == 97.  They compare equal, yet remain distinct objects
for all other purposes.

If anybody has a link to, or an explanation of, why equal values must 
have equal hashes, I'm all ears.  My apologies in advance if this is an 
incredibly naive question.

~Ethan~

From benjamin at python.org  Fri May 20 02:51:16 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 19 May 2011 19:51:16 -0500
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD5B87A.9060902@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
	<BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com>
	<4DD5B87A.9060902@stoneleaf.us>
Message-ID: <BANLkTimQ+yPB+2_PM_u5MMYUZ88XbUYebQ@mail.gmail.com>

2011/5/19 Ethan Furman <ethan at stoneleaf.us>:
> If anybody has a link to or an explanation why equal values must be equal
> hashes I'm all ears.  My apologies in advance if this is an incredibly naive
> question.

https://secure.wikimedia.org/wikipedia/en/wiki/Hash_table


-- 
Regards,
Benjamin

From raymond.hettinger at gmail.com  Fri May 20 05:10:44 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Thu, 19 May 2011 22:10:44 -0500
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD5B87A.9060902@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
	<BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com>
	<4DD5B87A.9060902@stoneleaf.us>
Message-ID: <8102E548-63BE-4674-902E-C458DC5FBA9F@gmail.com>


On May 19, 2011, at 7:40 PM, Ethan Furman wrote:

> Several folk have said that objects that compare equal must hash equal...

And so do the docs:  http://docs.python.org/dev/reference/datamodel.html#object.__hash__
, "the only required property is that objects which compare equal have the same hash value".
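A sketch of what goes wrong when that property is violated, reusing the Wierd class from the earlier message. The results noted in the comments are for CPython, where a dict lookup compares stored hashes before it ever calls __eq__.

```python
class Wierd:
    """The class from the earlier message: equal to its value, hashed differently."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other

    def __hash__(self):
        return hash((self.value + 13) ** 3)


d = {1: "int key"}
one = Wierd(1)

print(one == 1)        # True
print(one in d)        # False: the hashes differ, so == is never consulted
print(one in list(d))  # True: a linear scan relies on == alone
```

The same object is "in" the keys when scanned as a list but not when looked up in the dict, which is the inconsistency the rule exists to prevent.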


Raymond


From eliben at gmail.com  Fri May 20 09:02:03 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 20 May 2011 10:02:03 +0300
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
Message-ID: <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>

On Thu, May 19, 2011 at 16:07, Doug Hellmann <doug.hellmann at gmail.com> wrote:
> Several of the PSF blogs hosted on Google's Blogger platform are experiencing issues as fallout from the recent maintenance problems they had. We have already had to recreate at least one of the translations for Python Insider in order to be able to publish to it, and now we can't edit posts on Python Insider itself.
>
> Can anyone put me in contact with someone at Google from the Blogger team? I would at least like to know whether the "bX-qpvq7q" problem is being worked on, so I can decide whether to take a hiatus or start moving us to another platform. There are a lot of posts about the error on the support forums, but no obvious response from Google.
>

With respect to Google Blogger, I don't see a good reason to use it as
the platform for the blog. IMHO it would be much better to go for a
fewer-dependencies approach and just deploy a WordPress installation,
or possibly even something Python-based (if volunteers to maintain it
are found).

Eli

From ncoghlan at gmail.com  Fri May 20 10:40:09 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 20 May 2011 18:40:09 +1000
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
Message-ID: <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>

On Fri, May 20, 2011 at 5:02 PM, Eli Bendersky <eliben at gmail.com> wrote:
> On Thu, May 19, 2011 at 16:07, Doug Hellmann <doug.hellmann at gmail.com> wrote:
>> Several of the PSF blogs hosted on Google's Blogger platform are experiencing issues as fallout from the recent maintenance problems they had. We have already had to recreate at least one of the translations for Python Insider in order to be able to publish to it, and now we can't edit posts on Python Insider itself.
>>
>> Can anyone put me in contact with someone at Google from the Blogger team? I would at least like to know whether the "bX-qpvq7q" problem is being worked on, so I can decide whether to take a hiatus or start moving us to another platform. There are a lot of posts about the error on the support forums, but no obvious response from Google.
>>
>
> With respect to Google Blogger, I don't see a good reason to use it as
> the platform for the blog.

As with any infrastructure, there is a reasonably high cost in
changing, as people have become used to a certain way of doing things,
and porting the contents from the old system to the new one requires
additional effort.

Blogger has its problems, but it typically gets the job done well
enough (modulo cases like the one currently affecting Doug and his
team).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From Tim.Golden at cbsoutdoor.co.uk  Fri May 20 10:38:24 2011
From: Tim.Golden at cbsoutdoor.co.uk (Tim Golden)
Date: Fri, 20 May 2011 09:38:24 +0100
Subject: [Python-Dev] os.access on Windows
Message-ID: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local>

There's a thread on python-list at the moment:

  http://mail.python.org/pipermail/python-list/2011-May/1272505.html

which is discussing the validity of os.access results on
Windows. Now we've been here before: I raised issue2528
for a previous enquiry some years ago and proffered a patch
which uses the AccessCheck API to perform the equivalent check,
but didn't follow through.

Someone on the new thread is suggesting -- validly -- that the
docs should highlight the limitations of this call on Windows.
But the docs for that call are already fairly involved:

  http://docs.python.org/library/os.html#os.access

We seem to have a few options in increasing order of difficulty:

* Do nothing - inform the occasional enquirer of the situation and
  leave it at that.

* Update the docs to add something which describes what the function
  actually does on the Windows platform. (Whether or not we change any code).

* Apply the patch in issue2528 to 3.3 and maybe 2.7

* Leave os.access alone but offer alternative Windows-specific
  functionality in the os module or elsewhere, using essentially
  the code in the issue2528 patch.

As a side note, the pywin32 packages don't actually include AccessCheck
at the moment. (Which makes it slightly harder to explain to people
how they could do this check for themselves). It could probably be added
over there which might ease the burden over here.
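
For readers who want to see the gap under discussion, here is a minimal
sketch. It is not the issue2528 patch and does not use AccessCheck. It
only compares the advisory answer from os.access with the result of
actually attempting the operation. The helper name can_open_for_append
is hypothetical, and the scratch file is created just for the
demonstration. On a freshly created temporary file both answers should
agree; the thread's concern is the case (for example, a file protected
by Windows ACLs) where they may not.

```python
import os
import tempfile


def can_open_for_append(path):
    """Hypothetical helper: try the real operation instead of asking first."""
    try:
        with open(path, "a"):
            return True
    except OSError:
        return False


# Create a scratch file so the example is self-contained.
fd, scratch = tempfile.mkstemp()
os.close(fd)
try:
    advisory = os.access(scratch, os.W_OK)   # what os.access reports
    actual = can_open_for_append(scratch)    # what really happens
    print("os.access says:", advisory, "| open() says:", actual)
finally:
    os.remove(scratch)
```

Attempting the operation and handling the failure avoids relying on
os.access alone, whatever the platform.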

Opinions?

TJG
Tim Golden
Very Senior Analyst Programmer

CBS Outdoor UK
Camden Wharf
28 Jamestown Road
London
NW1 7BY
T: 020 7482 3000
F: 020 7267 4944


http://www.cbsoutdoor.co.uk/

From ncoghlan at gmail.com  Fri May 20 11:21:24 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 20 May 2011 19:21:24 +1000
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DD5B87A.9060902@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>
	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>
	<4DD2D89D.4000303@stoneleaf.us>
	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>
	<4DD2F661.2050005@stoneleaf.us>
	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>
	<BANLkTi=_GqrNntRU7pv7x=kj3gHurk-Gaw@mail.gmail.com>
	<4DD5B87A.9060902@stoneleaf.us>
Message-ID: <BANLkTi=smKJaW9vYM0v8isU5_n5UBUZs6g@mail.gmail.com>

On Fri, May 20, 2011 at 10:40 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
> This behavior matches what I was imagining for having
> b'a' == 97.  They compare equal, yet remain distinct objects
> for all other purposes.
>
> If anybody has a link to or an explanation why equal values must have equal
> hashes I'm all ears.  My apologies in advance if this is an incredibly naive
> question.

Because whether or not two objects can coexist in the same hash table
should *not* depend on their hash values - it should depend on whether
or not they compare equal to each other. The use of hashing should
just be an optimisation, not fundamentally change the nature of the
comparison operation. (i.e. "hash(a) == hash(b) and a == b" is meant
to be a fast alternative to "a == b", not a completely different
check).
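
A minimal sketch of the point above, using two hypothetical classes
(Byte and BadByte are illustrative names, not anything proposed in this
thread). Both compare equal to the integer 97, but only the one whose
hash agrees with its equality can be found in a dict keyed by 97:

```python
class Byte:
    """Hypothetical one-byte value that compares equal to its integer code."""

    def __init__(self, code):
        self.code = code

    def __eq__(self, other):
        if isinstance(other, Byte):
            return self.code == other.code
        if isinstance(other, int):
            return self.code == other
        return NotImplemented

    def __hash__(self):
        # Consistent with __eq__: equal objects share a hash.
        return hash(self.code)


class BadByte(Byte):
    def __hash__(self):
        # Deliberately inconsistent: equal to 97, but hashes differently.
        return hash(self.code) + 1


table = {97: "found"}
good_equal = Byte(97) == 97
good_lookup = Byte(97) in table
bad_equal = BadByte(97) == 97
bad_lookup = BadByte(97) in table
print(good_equal, good_lookup)
print(bad_equal, bad_lookup)
```

With the inconsistent hash, the dict never reaches the equality check,
so membership no longer agrees with "==".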

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From eliben at gmail.com  Fri May 20 11:39:22 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 20 May 2011 12:39:22 +0300
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>
Message-ID: <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>

>> With respect to Google Blogger, I don't see a good reason to use it as
>> the platform for the blog.
>
> As with any infrastructure, there is a reasonably high cost in
> changing, as people have become used to a certain way of doing things,
> and porting the contents from the old system to the new one requires
> additional effort.
>
> Blogger has its problems, but it typically gets the job done well
> enough (modulo cases like the one currently affecting Doug and his
> team).

Has the Python insider blog really accumulated enough history and
cruft to make this move problematic? It's a fairly new blog, with not
much content in it. From my blogging experience, Blogger has other
limitations which eventually bite you, and since it's not very
flexible you can either live with it or move to a more flexible
platform.

All of this completely IMHO, of course. Just friendly advice ;-)
Eli

From jnoller at gmail.com  Fri May 20 16:24:29 2011
From: jnoller at gmail.com (Jesse Noller)
Date: Fri, 20 May 2011 10:24:29 -0400
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>
	<BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>
Message-ID: <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>

On Fri, May 20, 2011 at 5:39 AM, Eli Bendersky <eliben at gmail.com> wrote:
>>> With respect to Google Blogger, I don't see a good reason to use it as
>>> the platform for the blog.
>>
>> As with any infrastructure, there is a reasonably high cost in
>> changing, as people have become used to a certain way of doing things,
>> and porting the contents from the old system to the new one requires
>> additional effort.
>>
>> Blogger has its problems, but it typically gets the job done well
>> enough (modulo cases like the one currently affecting Doug and his
>> team).
>
> Has the Python insider blog really accumulated enough history and
> cruft to make this move problematic? It's a fairly new blog, with not
> much content in it. From my blogging experience, Blogger has other
> limitations which eventually bite you, and since it's not very
> flexible you can either live with it or move to a more flexible
> platform.
>
> All of this completely IMHO, of course. Just friendly advice ;-)
> Eli

There is ongoing work for an RFP by the board to improve the
python.org publishing system/site to allow us to self-host these
things. Moving PSF properties off of it, and onto another "hosted by
someone else" site is probably not a good idea, but our hands may be
forced if google/blogger can not resolve the issues.

jesse

From brian.curtin at gmail.com  Fri May 20 17:21:02 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Fri, 20 May 2011 10:21:02 -0500
Subject: [Python-Dev] os.access on Windows
In-Reply-To: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local>
References: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local>
Message-ID: <BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com>

On Fri, May 20, 2011 at 03:38, Tim Golden <Tim.Golden at cbsoutdoor.co.uk> wrote:

> There's a thread on python-list at the moment:
>
>  http://mail.python.org/pipermail/python-list/2011-May/1272505.html
>
> which is discussing the validity of os.access results on
> Windows. Now we've been here before: I raised issue2528
> for a previous enquiry some years ago and proffered a patch
> which uses the AccessCheck API to perform the equivalent check,
> but didn't follow through.
>
> Someone on the new thread is suggesting -- validly -- that the
> docs should highlight the limitations of this call on Windows.
> But the docs for that call are already fairly involved:
>
>  http://docs.python.org/library/os.html#os.access
>
> We seem to have a few options in increasing order of difficulty:
>
> * Do nothing - inform the occasional enquirer of the situation and
>  leave it at that.
>
> * Update the docs to add something which describes what the function
>  actually does on the Windows platform. (Whether or not we change any
> code).
>

I think we should tread lightly in the documentation area. We already have
two note boxes, and adding a third probably scares everyone away. Maybe
there should be a bullet list of considerations to be made when using
os.access?

> * Apply the patch in issue2528 to 3.3 and maybe 2.7
>

I'd vote in favor of this. If we can be a bit smarter in determining
os.access results, let's do it.

I haven't reviewed the patch beyond a one-minute scan, but I'll put this on
my radar and try to get you a review.

From mail at timgolden.me.uk  Fri May 20 17:25:45 2011
From: mail at timgolden.me.uk (Tim Golden)
Date: Fri, 20 May 2011 16:25:45 +0100
Subject: [Python-Dev] os.access on Windows
In-Reply-To: <BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com>
References: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local>
	<BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com>
Message-ID: <4DD687F9.1040403@timgolden.me.uk>

On 20/05/2011 16:21, Brian Curtin wrote:

> On Fri, May 20, 2011 at 03:38, Tim Golden <Tim.Golden at cbsoutdoor.co.uk
(Sorry about that; I had no idea I'd sent that from my work account)

> I think we should tread lightly in the documentation area. We already
> have two note boxes, and adding a third probably scares everyone away.

I entirely agree. (That's what I meant by "involved" above)

> Maybe there should be a bullet list of considerations to be made when
> using os.access?
>
>     * Apply the patch in issue2528 to 3.3 and maybe 2.7
>
>
> I'd vote in favor of this. If we can be a bit smarter in determining
> os.access results, let's do it.
>
> I haven't reviewed the patch other than 1 minute scan, but I'll put this
> on my radar and try to get you a review.

Thanks. To be honest I wrote the patch 3 years ago; I haven't even
tried to apply it to either of the current posixmodule.c. Let's
see if I can dust it off and mould it into shape, or you'll be
left fighting patch errors instead of reviewing code :)

TJG

From ziade.tarek at gmail.com  Fri May 20 17:29:16 2011
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Fri, 20 May 2011 17:29:16 +0200
Subject: [Python-Dev] packaging landed in stdlib
In-Reply-To: <ir3njj$u61$1@dough.gmane.org>
References: <BANLkTikg_OswAKYE-+r6iLyoA2-yzikBfQ@mail.gmail.com>
	<ir3njj$u61$1@dough.gmane.org>
Message-ID: <BANLkTimC-QEFP5u+2QHSgsYag50DB1k2Pg@mail.gmail.com>

On Thu, May 19, 2011 at 8:30 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> On 19.05.2011 13:35, Tarek Ziadé wrote:
>> Hey
>>
>> I've pushed packaging in stdlib. There are a few buildbots errors
>> we're fixing right now.
>>
>> We will continue our work there directly from now on.
>
> Rock on!

Thanks :)

Still working on some issues under windows and solaris bbots today,
but we're getting there.

Sorry for the inconvenience

Tarek



-- 
Tarek Ziadé | http://ziade.org

From eliben at gmail.com  Fri May 20 17:35:56 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 20 May 2011 18:35:56 +0300
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>
	<BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>
	<BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>
Message-ID: <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>

> There is ongoing work for an RFP by the board to improve the
> python.org publishing system/site to allow us to self-host these
> things. Moving PSF properties off of it, and onto another "hosted by
> someone else" site is probably not a good idea, but our hands may be
> forced if google/blogger can not resolve the issues.
>
> jesse

The whole idea of a Wordpress-(or similar)-based solution is self
hosting, and less reliance on outside providers like blogger.
Wordpress is just a bunch of PHP code you place in a directory on your
server and you have a blog. You don't depend on anyone, except your
own hosting.

Eli

From ncoghlan at gmail.com  Fri May 20 17:37:35 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 21 May 2011 01:37:35 +1000
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>
	<BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>
Message-ID: <BANLkTik9v8O250C07Zz7U=GYkAkpW_JxzQ@mail.gmail.com>

On Fri, May 20, 2011 at 7:39 PM, Eli Bendersky <eliben at gmail.com> wrote:
> Has the Python insider blog really accumulated enough history and
> cruft to make this move problematic? It's a fairly new blog, with not
> much content in it. From my blogging experience, Blogger has other
> limitations which eventually bite you, and since it's not very
> flexible you can either live with it or move to a more flexible
> platform.

It's not just the Python Insider blog that is affected (and *any*
effort directed towards platform changes is effort that isn't going
towards writing new articles). Of course, if Blogger don't fix the
current problems, then that will be a moot point - moving will be
necessary to get *anything* done.

In general, though, infrastructure changes start from a position of
"not worth the hassle", just like code changes. It takes a pretty
compelling set of features to justify switching, and, while Blogger
isn't the best engine out there, it isn't terrible either (especially
once you replace their lousy comment system with something that is at
least half usable like DISQUS).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Fri May 20 17:44:39 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 21 May 2011 01:44:39 +1000
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>
	<BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>
	<BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>
	<BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
Message-ID: <BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com>

On Sat, May 21, 2011 at 1:35 AM, Eli Bendersky <eliben at gmail.com> wrote:
>> There is ongoing work for an RFP by the board to improve the
>> python.org publishing system/site to allow us to self-host these
>> things. Moving PSF properties off of it, and onto another "hosted by
>> someone else" site is probably not a good idea, but our hands may be
>> forced if google/blogger can not resolve the issues.
>>
>> jesse
>
> The whole idea of a Wordpress-(or similar)-based solution is self
> hosting, and less reliance on outside providers like blogger.
> Wordpress is just a bunch of PHP code you place in a directory on your
> server and you have a blog. You don't depend on anyone, except your
> own hosting.

As Jesse has said, there is an RFP in development to improve
python.org to the point where we can self-host blogs and the like and
deal with the associated user account administration appropriately.
But when it comes to collaborative blogs, it *isn't* just a matter of
dropping a blogging engine in and running with it.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tseaver at palladion.com  Fri May 20 18:00:20 2011
From: tseaver at palladion.com (Tres Seaver)
Date: Fri, 20 May 2011 12:00:20 -0400
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>	<BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>	<BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>
	<BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
Message-ID: <ir636k$8v1$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 05/20/2011 11:35 AM, Eli Bendersky wrote:
>> There is ongoing work for an RFP by the board to improve the
>> python.org publishing system/site to allow us to self-host these
>> things. Moving PSF properties off of it, and onto another "hosted by
>> someone else" site is probably not a good idea, but our hands may be
>> forced if google/blogger can not resolve the issues.
>>
>> jesse
> 
> The whole idea of a Wordpress-(or similar)-based solution is self
> hosting, and less reliance on outside providers like blogger.
> Wordpress is just a bunch of PHP code you place in a directory on your
> server and you have a blog. You don't depend on anyone, except your
> own hosting.

And your own sysadmins now have to chase fixes for remotely-exploitable
WP bugs:

 http://www.wordpressexploit.com/



Tres.
- -- 
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk3WkBQACgkQ+gerLs4ltQ72iwCeIhkCLXm26ujJJ3kqh9vKB4fr
dMYAn05qsoyiNxio02UAYJ7luLjVaSML
=OFdv
-----END PGP SIGNATURE-----


From status at bugs.python.org  Fri May 20 18:07:23 2011
From: status at bugs.python.org (Python tracker)
Date: Fri, 20 May 2011 18:07:23 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20110520160723.0E4011CE30@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (2011-05-13 - 2011-05-20)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2794 (+10)
  closed 21115 (+46)
  total  23909 (+56)

Open issues with patches: 1201 


Issues opened (37)
==================

#8796: Deprecate codecs.open()
http://bugs.python.org/issue8796  reopened by haypo

#11377: Deprecate platform.popen()
http://bugs.python.org/issue11377  reopened by eric.araujo

#12068: test_logging failure in test_rollover
http://bugs.python.org/issue12068  reopened by pitrou

#12073: regrtest: use faulthandler to dump the tracebacks on SIGUSR1
http://bugs.python.org/issue12073  opened by haypo

#12074: regrtest: display the current number of failures
http://bugs.python.org/issue12074  opened by haypo

#12075: python3.2 memory leak when setting integer key in dictionary
http://bugs.python.org/issue12075  opened by kaizhu

#12077: Harmonizing descriptor protocol documentation
http://bugs.python.org/issue12077  opened by davide.rizzo

#12079: decimal.py: TypeError precedence in fma()
http://bugs.python.org/issue12079  opened by skrah

#12080: decimal.py: performance in _power_exact
http://bugs.python.org/issue12080  opened by skrah

#12081: Remove distributed copy of libffi
http://bugs.python.org/issue12081  opened by benjamin.peterson

#12082: Python/import.c still references fstat even with DONT_HAVE_FST
http://bugs.python.org/issue12082  opened by joshtriplett

#12084: os.stat() on windows doesn't consider relative symlink
http://bugs.python.org/issue12084  opened by ocean-city

#12085: subprocess.Popen.__del__ raises AttributeError if __init__ was
http://bugs.python.org/issue12085  opened by chortos

#12086: Tutorial doesn't discourage name mangling
http://bugs.python.org/issue12086  opened by sheep

#12087: install_egg_info fails with UnicodeEncodeError depending on lo
http://bugs.python.org/issue12087  opened by hagen

#12089: regrtest.py doesn't check for unexpected output anymore?
http://bugs.python.org/issue12089  opened by haypo

#12090: 3.2: build --without-threads fails
http://bugs.python.org/issue12090  opened by skrah

#12091: multiprocessing: simplify ApplyResult and MapResult with threa
http://bugs.python.org/issue12091  opened by charles-francois.natali

#12097: python.exe crashes if it is unable to find its .dll
http://bugs.python.org/issue12097  opened by techtonik

#12098: Child process running as debug on Windows
http://bugs.python.org/issue12098  opened by thebits

#12100: Incremental encoders of CJK codecs reset the codec at each cal
http://bugs.python.org/issue12100  opened by haypo

#12101: PEPs should have consecutive revision numbers
http://bugs.python.org/issue12101  opened by techtonik

#12102: mmap requires file to be synced
http://bugs.python.org/issue12102  opened by rion4ik at gmail.com

#12103: Documentation of open() does not claim 'e' support in mode str
http://bugs.python.org/issue12103  opened by mmarkk

#12105: open() does not able to set flags, such as O_CLOEXEC
http://bugs.python.org/issue12105  opened by mmarkk

#12106: reflect syntatic sugar in with ast
http://bugs.python.org/issue12106  opened by benjamin.peterson

#12107: TCP listening sockets created without FD_CLOEXEC flag
http://bugs.python.org/issue12107  opened by Christophe.Devriese

#12112: The new packaging module should not use the locale encoding
http://bugs.python.org/issue12112  opened by haypo

#12113: test_packaging fails when run twice
http://bugs.python.org/issue12113  opened by pitrou

#12114: packaging.util._find_exe_version(): potential deadlock
http://bugs.python.org/issue12114  opened by haypo

#12115: some tests need to be skipped on threadless systems
http://bugs.python.org/issue12115  opened by tarek

#12117: Failures with PYTHONDONTWRITEBYTECODE: test_importlib, test_im
http://bugs.python.org/issue12117  opened by pitrou

#12121: test_packaging failure when ssl is not available
http://bugs.python.org/issue12121  opened by pitrou

#12124: python -m test test_packaging test_zipimport failure
http://bugs.python.org/issue12124  opened by haypo

#12125: test_sysconfig fails on OpenIndiana because of test_packaging
http://bugs.python.org/issue12125  opened by haypo

#12126: incorrect select documentation
http://bugs.python.org/issue12126  opened by exarkun

#12127: Inconsistent leading zero treatment
http://bugs.python.org/issue12127  opened by Peter.Wentworth



Most recent 15 issues with no replies (15)
==========================================

#12126: incorrect select documentation
http://bugs.python.org/issue12126

#12125: test_sysconfig fails on OpenIndiana because of test_packaging
http://bugs.python.org/issue12125

#12121: test_packaging failure when ssl is not available
http://bugs.python.org/issue12121

#12114: packaging.util._find_exe_version(): potential deadlock
http://bugs.python.org/issue12114

#12106: reflect syntatic sugar in with ast
http://bugs.python.org/issue12106

#12100: Incremental encoders of CJK codecs reset the codec at each cal
http://bugs.python.org/issue12100

#12091: multiprocessing: simplify ApplyResult and MapResult with threa
http://bugs.python.org/issue12091

#12085: subprocess.Popen.__del__ raises AttributeError if __init__ was
http://bugs.python.org/issue12085

#12066: Empty ('') xmlns attribute is not properly handled by xml.dom.
http://bugs.python.org/issue12066

#12063: tokenize module appears to treat unterminated single and doubl
http://bugs.python.org/issue12063

#12055: doctest not working on nested functions
http://bugs.python.org/issue12055

#12053: Add prefetch() for Buffered IO (experiment)
http://bugs.python.org/issue12053

#12037: test_email failures under Windows with the eol extension activ
http://bugs.python.org/issue12037

#12029: ABC registration of Exceptions
http://bugs.python.org/issue12029

#12019: Dead or buggy code in importlib.test.__main__
http://bugs.python.org/issue12019



Most recent 15 issues waiting for review (15)
=============================================

#12124: python -m test test_packaging test_zipimport failure
http://bugs.python.org/issue12124

#12114: packaging.util._find_exe_version(): potential deadlock
http://bugs.python.org/issue12114

#12112: The new packaging module should not use the locale encoding
http://bugs.python.org/issue12112

#12106: reflect syntatic sugar in with ast
http://bugs.python.org/issue12106

#12105: open() does not able to set flags, such as O_CLOEXEC
http://bugs.python.org/issue12105

#12102: mmap requires file to be synced
http://bugs.python.org/issue12102

#12100: Incremental encoders of CJK codecs reset the codec at each cal
http://bugs.python.org/issue12100

#12098: Child process running as debug on Windows
http://bugs.python.org/issue12098

#12091: multiprocessing: simplify ApplyResult and MapResult with threa
http://bugs.python.org/issue12091

#12085: subprocess.Popen.__del__ raises AttributeError if __init__ was
http://bugs.python.org/issue12085

#12084: os.stat() on windows doesn't consider relative symlink
http://bugs.python.org/issue12084

#12074: regrtest: display the current number of failures
http://bugs.python.org/issue12074

#12073: regrtest: use faulthandler to dump the tracebacks on SIGUSR1
http://bugs.python.org/issue12073

#12057: HZ codec has no test
http://bugs.python.org/issue12057

#12049: expose RAND_bytes() function of OpenSSL
http://bugs.python.org/issue12049



Top 10 most discussed issues (10)
=================================

#11610: Improving property to accept abstract methods
http://bugs.python.org/issue11610  12 msgs

#6721: Locks in python standard library should be sanitized on fork
http://bugs.python.org/issue6721   9 msgs

#12105: open() does not able to set flags, such as O_CLOEXEC
http://bugs.python.org/issue12105   9 msgs

#11877: Change os.fsync() to support physical backing store syncs
http://bugs.python.org/issue11877   8 msgs

#12086: Tutorial doesn't discourage name mangling
http://bugs.python.org/issue12086   8 msgs

#12112: The new packaging module should not use the locale encoding
http://bugs.python.org/issue12112   8 msgs

#12127: Inconsistent leading zero treatment
http://bugs.python.org/issue12127   8 msgs

#1615158: POSIX capabilities support
http://bugs.python.org/issue1615158   8 msgs

#6727: ImportError when package is symlinked on Windows
http://bugs.python.org/issue6727   7 msgs

#12097: python.exe crashes if it is unable to find its .dll
http://bugs.python.org/issue12097   7 msgs



Issues closed (49)
==================

#4621: zipfile returns string but expects binary
http://bugs.python.org/issue4621  closed by haypo

#5723: Incomplete json tests
http://bugs.python.org/issue5723  closed by ezio.melotti

#6059: ctypes/uuid-related segmentation fault
http://bugs.python.org/issue6059  closed by charles-francois.natali

#6498: Py_Main() does not return on SystemExit
http://bugs.python.org/issue6498  closed by python-dev

#7656: test_hashlib fails on some installations (specifically Neal's 
http://bugs.python.org/issue7656  closed by gregory.p.smith

#7960: test.support.captured_output has invalid docstring example
http://bugs.python.org/issue7960  closed by ezio.melotti

#8650: zlibmodule.c isn't 64-bit clean
http://bugs.python.org/issue8650  closed by nadeem.vawda

#8809: smtplib should support SSL contexts
http://bugs.python.org/issue8809  closed by pitrou

#9516: sysconfig: $MACOSX_DEPLOYMENT_TARGET mismatch: now "10.3" but 
http://bugs.python.org/issue9516  closed by ronaldoussoren

#9927: Leak around GetFinalPathNameByHandle (Windows)
http://bugs.python.org/issue9927  closed by ocean-city

#10090: python -m locale fails on OSX
http://bugs.python.org/issue10090  closed by ronaldoussoren

#10154: locale.normalize strips "-" from UTF-8, which fails on Mac
http://bugs.python.org/issue10154  closed by ronaldoussoren

#10239: multiprocessing signal defect
http://bugs.python.org/issue10239  closed by charles-francois.natali

#10756: Error in atexit._run_exitfuncs [...]  Exception expected for v
http://bugs.python.org/issue10756  closed by haypo

#11088: IDLE on OS X with Cocoa Tk 8.5 can hang waiting on input / raw
http://bugs.python.org/issue11088  closed by ronaldoussoren

#11614: import __hello__ is broken in Python 3
http://bugs.python.org/issue11614  closed by haypo

#11731: Simplify email API via 'policy' objects
http://bugs.python.org/issue11731  closed by r.david.murray

#11949: Make float('nan') unorderable
http://bugs.python.org/issue11949  closed by rhettinger

#11979: Minor improvements to the Sockets readme: typos, wording and s
http://bugs.python.org/issue11979  closed by ezio.melotti

#11996: libpython.py: nicer py-bt output
http://bugs.python.org/issue11996  closed by haypo

#12002: ftplib.FTP.abort fails with TypeError on Python 3.x
http://bugs.python.org/issue12002  closed by giampaolo.rodola

#12048: Python 3, ZipFile Bug In Chinese
http://bugs.python.org/issue12048  closed by haypo

#12050: unconsumed_tail of zlib.Decompress is not always cleared on de
http://bugs.python.org/issue12050  closed by nadeem.vawda

#12059: hashlib does not handle missing hash functions correctly
http://bugs.python.org/issue12059  closed by gregory.p.smith

#12060: Python doesn't support real time signals
http://bugs.python.org/issue12060  closed by gregory.p.smith

#12065: test_ssl failure when svn.python.org fails to resolve
http://bugs.python.org/issue12065  closed by pitrou

#12072: Missing parenthesis in c-api/buffer PyBuffer_FillContiguousStr
http://bugs.python.org/issue12072  closed by ezio.melotti

#12076: IDLE v.3.2 crashing randomly on MacOSX 10.6.7
http://bugs.python.org/issue12076  closed by amaury.forgeotdarc

#12083: Compile-time option to avoid writing files, including generate
http://bugs.python.org/issue12083  closed by loewis

#12088: tarfile.extractall fails to overwrite unresolved symlinks and 
http://bugs.python.org/issue12088  closed by orsenthil

#12092: Clarify sentence in tutorial
http://bugs.python.org/issue12092  closed by ezio.melotti

#12093: Typo in struct unpacking example
http://bugs.python.org/issue12093  closed by ezio.melotti

#12094: Cannot Launch IDLE
http://bugs.python.org/issue12094  closed by ezio.melotti

#12095: test failures due to missing module
http://bugs.python.org/issue12095  closed by haypo

#12096: test_threading.test_waitfor() timeout (1 hour) on x86 Gentoo 3
http://bugs.python.org/issue12096  closed by haypo

#12099: re pattern objects have no __class__
http://bugs.python.org/issue12099  closed by python-dev

#12104: os.path.join('/some/path', '') adds extra slash at end of resu
http://bugs.python.org/issue12104  closed by brian.curtin

#12108: test_packaging monkeypatches httplib
http://bugs.python.org/issue12108  closed by pitrou

#12109: test_packaging monkeypatches httplib
http://bugs.python.org/issue12109  closed by pitrou

#12110: test_packaging monkeypatches httplib
http://bugs.python.org/issue12110  closed by pitrou

#12111: email's use of __setitem__ is highly counterintuitive
http://bugs.python.org/issue12111  closed by r.david.murray

#12116: io.Buffer*.seek() doesn't seek if "seeking leaves us inside th
http://bugs.python.org/issue12116  closed by pitrou

#12118: test_imp failure
http://bugs.python.org/issue12118  closed by haypo

#12119: test_distutils failure
http://bugs.python.org/issue12119  closed by haypo

#12120: test_packaging failure
http://bugs.python.org/issue12120  closed by haypo

#12122: test_runpy failure
http://bugs.python.org/issue12122  closed by haypo

#12123: test_import failures
http://bugs.python.org/issue12123  closed by haypo

#1746656: IPv6 Interface naming/indexing functions
http://bugs.python.org/issue1746656  closed by gregory.p.smith

#12078: re.sub() replaces only several matches
http://bugs.python.org/issue12078  closed by ezio.melotti

From guido at python.org  Fri May 20 18:09:48 2011
From: guido at python.org (Guido van Rossum)
Date: Fri, 20 May 2011 09:09:48 -0700
Subject: [Python-Dev] os.access on Windows
In-Reply-To: <4DD687F9.1040403@timgolden.me.uk>
References: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local>
	<BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com>
	<4DD687F9.1040403@timgolden.me.uk>
Message-ID: <BANLkTinkCSZT=b1Bi+iSB57OjMV2VWHbOg@mail.gmail.com>

On May 20, 2011 8:30 AM, "Tim Golden" <mail at timgolden.me.uk> wrote:
> On 20/05/2011 16:21, Brian Curtin wrote:
>
>> On Fri, May 20, 2011 at 03:38, Tim Golden <Tim.Golden at cbsoutdoor.co.uk
> (Sorry about that; I had no idea I'd sent that from my work account)
>
>> I think we should tread lightly in the documentation area. We already
>> have two note boxes, and adding a third probably scares everyone away.
>
> I entirely agree. (That's what I meant by "involved" above)

TBH I think the less attractive we can make os.access() look the
better. It uses the real uid instead of the effective uid, it
encourages LBYL behavior, the outcome may be incorrect, it doesn't
work on Windows... The ONLY reason to ever use it is in a setuid()
program. But who writes those any more? (Esp. in Python!)

-- 
--Guido van Rossum (python.org/~guido)

From cf.natali at gmail.com  Fri May 20 19:01:26 2011
From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=)
Date: Fri, 20 May 2011 19:01:26 +0200
Subject: [Python-Dev] Hello!
Message-ID: <BANLkTimp_vvh_aUYbD5Q5p0D8UiSJyRx=Q@mail.gmail.com>

Hi,

My name is Charles-François Natali, I've been using Python for a
couple years, and I've recently been granted commit privilege.
I just wanted to say hi to everyone on this list, and let you know
that I'm really happy and proud of joining this great community.

Cheers,

cf

From stefan_ml at behnel.de  Fri May 20 21:04:49 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 20 May 2011 21:04:49 +0200
Subject: [Python-Dev] in latest Py3k site.py: configparser.NoSectionError:
 No section: 'posix_prefix'
Message-ID: <ir6e0i$hn6$1@dough.gmane.org>

Hi,

since May 19, I get the exception below in the latest py3k site.py when 
trying to run a distutils build with it (building Cython). The changelog 
since the previous (working) CPython build is here:

https://sage.math.washington.edu:8091/hudson/job/py3k-hg/374/

The failing build is here:

https://sage.math.washington.edu:8091/hudson/job/cython-devel-build-py3k/1313/console

This is on 64bit Linux. I tried with a clean checkout, no difference. Is 
this problem obvious to someone, is there anything that needs adaptation on 
our side (I hope not), or should I file a bug report?

Thanks,

Stefan


"""
$ python setup.py bdist --formats=gztar --cython-profile
Traceback (most recent call last):
   File "/.../python/lib/python3.3/configparser.py", line 842, in items
     d.update(self._sections[section])
KeyError: 'posix_prefix'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
   File "/.../python/lib/python3.3/site.py", line 537, in <module>
     main()
   File "/.../python/lib/python3.3/site.py", line 522, in main
     known_paths = addusersitepackages(known_paths)
   File "/.../python/lib/python3.3/site.py", line 263, in addusersitepackages
     user_site = getusersitepackages()
   File "/.../python/lib/python3.3/site.py", line 238, in getusersitepackages
     user_base = getuserbase() # this will also set USER_BASE
   File "/.../python/lib/python3.3/site.py", line 228, in getuserbase
     USER_BASE = get_config_var('userbase')
   File "/.../python/lib/python3.3/sysconfig.py", line 576, in get_config_var
     return get_config_vars().get(name)
   File "/.../python/lib/python3.3/sysconfig.py", line 472, in get_config_vars
     _init_posix(_CONFIG_VARS)
   File "/.../python/lib/python3.3/sysconfig.py", line 324, in _init_posix
     makefile = get_makefile_filename()
   File "/.../python/lib/python3.3/sysconfig.py", line 318, in get_makefile_filename
     return os.path.join(get_path('stdlib'), config_dir_name, 'Makefile')
   File "/.../python/lib/python3.3/sysconfig.py", line 436, in get_path
     return get_paths(scheme, vars, expand)[name]
   File "/.../python/lib/python3.3/sysconfig.py", line 426, in get_paths
     return _expand_vars(scheme, vars)
   File "/.../python/lib/python3.3/sysconfig.py", line 142, in _expand_vars
     for key, value in _SCHEMES.items(scheme):
   File "/.../python/lib/python3.3/configparser.py", line 845, in items
     raise NoSectionError(section)
configparser.NoSectionError: No section: 'posix_prefix'
"""
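
For readers unfamiliar with the final frames of the traceback, the same
exception can be reproduced in isolation. This is only a sketch: the
section contents below are made up for illustration and are not the real
sysconfig configuration; the point is just that items() raises
NoSectionError when the requested section was never loaded.

```python
import configparser

# Illustrative parser that knows one scheme but not 'posix_prefix'.
# The section body is an assumption, not the real sysconfig.cfg content.
parser = configparser.ConfigParser()
parser.read_string("[posix_home]\nstdlib = {base}/lib/python\n")

try:
    parser.items("posix_prefix")
except configparser.NoSectionError as exc:
    # Same error text as the last line of the traceback above.
    message = str(exc)
    print(message)
```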


From g.brandl at gmx.net  Fri May 20 22:30:19 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 20 May 2011 22:30:19 +0200
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>
	<BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>
	<BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>
	<BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
Message-ID: <ir6j0k$g6q$1@dough.gmane.org>

On 20.05.2011 17:35, Eli Bendersky wrote:
>> There is ongoing work for an RFP by the board to improve the
>> python.org publishing system/site to allow us to self-host these
>> things. Moving PSF properties off of it, and onto another "hosted by
>> someone else" site is probably not a good idea, but our hands may be
>> forced if google/blogger can not resolve the issues.
>>
>> jesse
> 
> The whole idea of a Wordpress-(or similar)-based solution is self
> hosting, and less reliance on outside providers like blogger.
> Wordpress is just a bunch of PHP code you place in a directory on your
> server

That's exactly the problem.

Georg


From martin at v.loewis.de  Fri May 20 23:47:35 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 20 May 2011 23:47:35 +0200
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>	<BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>	<BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>	<BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
	<BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com>
Message-ID: <4DD6E177.5020202@v.loewis.de>

> As Jesse has said, there is an RFP in development to improve
> python.org to the point where we can self-host blogs and the like and
> deal with the associated user account administration appropriately.

To run a blog on www.python.org, a PEP is not needed. If anybody would
volunteer to set this up, it could be done in no time.

Regards,
Martin

From martin at v.loewis.de  Fri May 20 23:56:01 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 20 May 2011 23:56:01 +0200
Subject: [Python-Dev] os.access on Windows
In-Reply-To: <BANLkTinkCSZT=b1Bi+iSB57OjMV2VWHbOg@mail.gmail.com>
References: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local>	<BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com>	<4DD687F9.1040403@timgolden.me.uk>
	<BANLkTinkCSZT=b1Bi+iSB57OjMV2VWHbOg@mail.gmail.com>
Message-ID: <4DD6E371.2020706@v.loewis.de>

> TBH I think the less attractive we can make os.access() look the
> better. It uses the real uid instead of the effective uid, it
> encourages LBYL behavior, the outcome may be incorrect, it doesn't
> work on Windows... The ONLY reason to ever use it is in a setuid()
> program. But who writes those any more? (Esp. in Python!)

+1. The best way to determine "could I access this file" is to try
to access it, and be prepared to get an exception. So we might
deprecate-then-delete it on Windows.

People who *really* need to know in advance should use the Windows
API for that on Windows (i.e. call AccessCheck).
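
A minimal sketch of the "just try it" approach described above, assuming
a hypothetical helper named read_if_accessible (not an existing API):
instead of asking os.access() first, attempt the operation and handle
the failure.

```python
import os
import tempfile


def read_if_accessible(path):
    """Hypothetical helper: try the access rather than checking first."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError:
        # Missing file, permission problem, etc. all end up here.
        return None


# Usage: one file that exists, then the same path after removal.
fd, path = tempfile.mkstemp()
os.write(fd, b"data")
os.close(fd)
present = read_if_accessible(path)
os.remove(path)
missing = read_if_accessible(path)
```

This avoids the window between the check and the actual access, which
is one reason a prior os.access() answer may turn out to be wrong.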

Regards,
Martin

From doug.hellmann at gmail.com  Sat May 21 00:36:49 2011
From: doug.hellmann at gmail.com (Doug Hellmann)
Date: Fri, 20 May 2011 18:36:49 -0400
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <4DD6E177.5020202@v.loewis.de>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>	<BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>	<BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>	<BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
	<BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com>
	<4DD6E177.5020202@v.loewis.de>
Message-ID: <5CC8D21C-156F-4D27-B490-9DF29CB1C5F5@gmail.com>


On May 20, 2011, at 5:47 PM, Martin v. Löwis wrote:

>> As Jesse has said, there is an RFP in development to improve
>> python.org to the point where we can self-host blogs and the like and
>> deal with the associated user account administration appropriately.
> 
> To run a blog on www.python.org, a PEP is not needed. If anybody would
> volunteer to set this up, it could be done in no time.

The blog is working again, so we can continue using the tool chain we have.

Thanks,
Doug

--
Doug Hellmann
Communications Director
Python Software Foundation
http://python.org/psf/


From barry at python.org  Sat May 21 02:53:14 2011
From: barry at python.org (Barry Warsaw)
Date: Fri, 20 May 2011 20:53:14 -0400
Subject: [Python-Dev] Python 2.6.7 release candidate 2 now available
Message-ID: <20110520205314.1be39eec@neurotica.wooz.org>

Hello to all you Pythoneers and Pythonistas,

I'm happy to announce the availability of Python 2.6.7 release candidate 2.
Release candidate 1 was not widely announced due to a mismatch between the
Mercurial and Subversion branches.  Barring any unforeseen issues, this will
be the last release candidate before 2.6.7 final, which is currently scheduled
for June 3, 2011.

As previously announced, Python 2.6 is in security-fix only mode.  This means
that general bug fix maintenance has ended, and only critical security fixes
are supported.  We will support Python 2.6 in security-fix only mode until
October 2013.  Also, this is a source-only release; no installers for Windows
or Mac OS X will be provided.

Please download and test this release candidate.

    http://www.python.org/download/releases/2.6.7/

The NEWS file contains a list of changes since 2.6.6.

    http://www.python.org/download/releases/2.6.7/NEWS.txt

Many thanks go out to the entire Python community for their contributions
great and small.

Enjoy,
-Barry
(on behalf of the Python development community)


From nad at acm.org  Sat May 21 05:18:51 2011
From: nad at acm.org (Ned Deily)
Date: Fri, 20 May 2011 20:18:51 -0700
Subject: [Python-Dev] in latest Py3k site.py:
	configparser.NoSectionError: No section: 'posix_prefix'
References: <ir6e0i$hn6$1@dough.gmane.org>
Message-ID: <nad-3CDB70.20185120052011@news.gmane.org>

In article <ir6e0i$hn6$1 at dough.gmane.org>,
 Stefan Behnel <stefan_ml at behnel.de> wrote:
> since May 19, I get the exception below in the latest py3k site.py when 
> trying to run a distutils build with it (building Cython). The changelog 
> since the previous (working) CPython build is here:
> 
> https://sage.math.washington.edu:8091/hudson/job/py3k-hg/374/
> 
> The failing build is here:
> 
> https://sage.math.washington.edu:8091/hudson/job/cython-devel-build-py3k/1313/console
> 
> This is on 64bit Linux. I tried with a clean checkout, no difference. Is 
> this problem obvious to someone, is there anything that needs adaptation on 
> our side (I hope not), or should I file a bug report?

It's a bug introduced by the packaging (Distutils2) feature.  Thanks for 
finding it first.

http://bugs.python.org/issue12131

-- 
 Ned Deily,
 nad at acm.org


From rosslagerwall at gmail.com  Sat May 21 06:42:43 2011
From: rosslagerwall at gmail.com (Ross Lagerwall)
Date: Sat, 21 May 2011 06:42:43 +0200
Subject: [Python-Dev] Hello!
In-Reply-To: <BANLkTimp_vvh_aUYbD5Q5p0D8UiSJyRx=Q@mail.gmail.com>
References: <BANLkTimp_vvh_aUYbD5Q5p0D8UiSJyRx=Q@mail.gmail.com>
Message-ID: <1305952963.1475.0.camel@hobo>

On Fri, 2011-05-20 at 19:01 +0200, Charles-François Natali wrote:
> Hi,
> 
> My name is Charles-François Natali, I've been using Python for a
> couple years, and I've recently been granted commit privilege.
> I just wanted to say hi to everyone on this list, and let you know
> that I'm really happy and proud of joining this great community.

Congratulations, welcome.

Ross


From solipsis at pitrou.net  Sat May 21 13:09:03 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 21 May 2011 13:09:03 +0200
Subject: [Python-Dev] cpython: Added SSL test for HTTPHandler.
References: <E1QNjTp-0002V3-Gf@dinsdale.python.org>
Message-ID: <20110521130903.2f7cf91f@pitrou.net>

On Sat, 21 May 2011 12:32:21 +0200
vinay.sajip <python-checkins at python.org> wrote:
> +            if secure:
> +                import ssl
> +                fd, fn = tempfile.mkstemp()
> +                os.close(fd)
> +                with open(fn, 'w') as f:
> +                    f.write(self.PEMFILE)
> +                sslctx = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
> +                sslctx.load_cert_chain(fn)

We already bundle a couple of cert files in Lib/test, so you shouldn't
have to use your own (see e.g. Lib/test/keycert.pem).

> +            self.h_hdlr = logging.handlers.HTTPHandler(host, '/frob', secure=secure)

If you want real security, HTTPHandler should configure its SSLContext
in CERT_REQUIRED mode (and be given the proper root certificate(s)).
Otherwise you are vulnerable to man-in-the-middle attacks.

See the "context" and "check_hostname" arguments to HTTPSConnection:
http://docs.python.org/dev/library/http.client.html#http.client.HTTPSConnection
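
As a rough illustration of such a configuration (a sketch, not the code
under review): the helper name make_verifying_context is hypothetical,
and ssl.PROTOCOL_TLS_CLIENT is assumed to be available, which is only
true on Python versions newer than the one discussed here.

```python
import http.client
import ssl


def make_verifying_context(cafile=None):
    """Hypothetical helper: build a context that verifies the server."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.verify_mode = ssl.CERT_REQUIRED   # reject unverified certificates
    ctx.check_hostname = True             # certificate must match the host
    if cafile is not None:
        ctx.load_verify_locations(cafile)  # explicit root certificate(s)
    else:
        ctx.load_default_certs()           # fall back to system roots
    return ctx


ctx = make_verifying_context()
# Creating the connection object does not open a socket yet.
conn = http.client.HTTPSConnection("localhost", context=ctx)
```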

Regards

Antoine.



From solipsis at pitrou.net  Sat May 21 13:59:25 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 21 May 2011 13:59:25 +0200
Subject: [Python-Dev] Hello!
References: <BANLkTimp_vvh_aUYbD5Q5p0D8UiSJyRx=Q@mail.gmail.com>
Message-ID: <20110521135925.33599a44@pitrou.net>

On Fri, 20 May 2011 19:01:26 +0200
Charles-François Natali <cf.natali at gmail.com> wrote:

> Hi,
> 
> My name is Charles-François Natali, I've been using Python for a
> couple years, and I've recently been granted commit privilege.
> I just wanted to say hi to everyone on this list, and let you know
> that I'm really happy and proud of joining this great community.

Welcome, and keep up the good work.

Regards

Antoine.



From solipsis at pitrou.net  Sat May 21 16:37:14 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 21 May 2011 16:37:14 +0200
Subject: [Python-Dev] Stable buildbots update
Message-ID: <20110521163714.68c5384f@pitrou.net>


Hello,

We recently got a couple of new stable buildbots:
- R. David Murray's "x86 Gentoo" machine, which builds in non-debug
  mode and therefore checks that release Pythons work fine
- Stefan Krah's "AMD64 FreeBSD 8.2" machine
- Bill Janssen's "AMD64 Snow Leopard" machine

Many stable buildbots on the default branch (*) are currently red
because of test_packaging issues.
(*) http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable

Regards

Antoine.



From solipsis at pitrou.net  Sat May 21 17:07:25 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 21 May 2011 17:07:25 +0200
Subject: [Python-Dev] The socket HOWTO
Message-ID: <20110521170725.51eab5f9@pitrou.net>


Hello,

I would like to suggest that we remove the socket HOWTO (currently at
http://docs.python.org/dev/howto/sockets.html)

My main issue with this document is that it doesn't seem to have
a well-defined destination:
- people who know sockets won't learn anything from it
- but people who don't know sockets will probably find it clear as mud
(for example, what's an "INET" or "STREAM" socket? what's "select"?)

I have other issues, such as the style/tone it's written in. I'm sure
the author had fun writing it but it doesn't fit well with the rest of
the documentation. Also, the author gives a lot of "advice" without
explaining or justifying it ("if somewhere in those input lists of
sockets is one which has died a nasty death, the select will fail" ->
is that really true? what is a "nasty death" and how is that supposed to
happen? couldn't the author have put a 3-line example to demonstrate
this supposed drawback and how it manifests?).

And, finally, many statements seem arbitrary ("There's no question that
the fastest sockets code uses non-blocking sockets and select to
multiplex them") or plain wrong ("threading support in Unixes varies
both in API and quality. So the normal Unix solution is to fork a
subprocess to deal with each connection"). I don't think giving
misleading advice to users is really a good idea. And suggesting that
beginners use non-blocking sockets without even *showing* how (or
pointing to asyncore or Twisted) is a very bad idea. select() is not
enough: you still have to be prepared to get EAGAIN or EWOULDBLOCK when
calling recv() or send() (i.e. select() can give false positives).
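
To make the last point concrete, here is a minimal sketch (not taken
from the HOWTO) of a read on a non-blocking socket that tolerates a
false positive from select(). The helper name recv_ready is
hypothetical, and it assumes a Python version where EAGAIN and
EWOULDBLOCK surface as BlockingIOError.

```python
import select
import socket


def recv_ready(sock, nbytes=4096, timeout=1.0):
    """Hypothetical helper: return data, or None if nothing can be read."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None          # select() timed out
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # select() said "readable" but recv() would still block
        # (EAGAIN / EWOULDBLOCK): treat it as "try again later".
        return None


# Usage with a connected pair of non-blocking sockets.
a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)
b.sendall(b"ping")
data = recv_ready(a)
nothing = recv_ready(a, timeout=0)   # no more data pending
a.close()
b.close()
```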

Oh and I think it's obsolete too, because the "class mysocket"
concatenates the output of recv() with a str rather than a bytes
object. Not to mention that features of the "class mysocket" can be had
using a buffered socket.makefile() instead of writing custom code.
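
A short sketch of the makefile() alternative, using a local socket pair
purely for demonstration; the variable names are illustrative:

```python
import socket

a, b = socket.socketpair()

# Buffered binary file objects layered on top of the sockets.
writer = a.makefile("wb")
reader = b.makefile("rb")

writer.write(b"hello\nworld\n")
writer.flush()               # push the buffered bytes to the socket

# readline() handles the buffering and message splitting that a
# hand-written receive loop would otherwise have to do.
lines = [reader.readline(), reader.readline()]

writer.close()
reader.close()
a.close()
b.close()
```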

(followed up from http://bugs.python.org/issue12126 at Eli's request)

Regards

Antoine.



From g.brandl at gmx.net  Sat May 21 17:37:05 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 21 May 2011 17:37:05 +0200
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <20110521170725.51eab5f9@pitrou.net>
References: <20110521170725.51eab5f9@pitrou.net>
Message-ID: <ir8m70$e7c$1@dough.gmane.org>

On 05/21/11 17:07, Antoine Pitrou wrote:
> 
> Hello,
> 
> I would like to suggest that we remove the socket HOWTO (currently at
> http://docs.python.org/dev/howto/sockets.html)

+1, or a big rewrite.

Georg


From eliben at gmail.com  Sat May 21 17:48:37 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Sat, 21 May 2011 18:48:37 +0300
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <20110521170725.51eab5f9@pitrou.net>
References: <20110521170725.51eab5f9@pitrou.net>
Message-ID: <BANLkTikbNd8=AAMeUvD345P4sichrQZmZQ@mail.gmail.com>

> I would like to suggest that we remove the socket HOWTO (currently at
> http://docs.python.org/dev/howto/sockets.html)
>
> My main issue with this document is that it doesn't seem to have
> a well-defined destination:
> - people who know sockets won't learn anything from it
> - but people who don't know sockets will probably find it clear as mud
> (for example, what's an "INET" or "STREAM" socket? what's "select"?)
>
<snip>

I definitely recall finding this document useful when I first learned
Python. I knew socket programming from other languages, and the
document helped to see how it maps to Python. That said, I must agree
that there is probably no place for such a tutorial in Python's
official documentation. Python is a general-purpose language,
and sockets programming is just one of a plethora of things it
supports, so a special treatment for sockets probably isn't warranted,
especially given that the `socket` module itself is a relatively thin
wrapper over the OS socket interface.

I don't think a rewrite will help either. To describe socket
programming in full, accurately and without missing anything, would
require no less than a small book (and in fact many such books already
exist).

Therefore, I'm +1 on removing it from the official docs. It can be
relegated to the Python wiki, where it can be improved if someone
wishes to contribute to that.

Eli

From rosslagerwall at gmail.com  Sat May 21 17:48:48 2011
From: rosslagerwall at gmail.com (Ross Lagerwall)
Date: Sat, 21 May 2011 17:48:48 +0200
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <20110521170725.51eab5f9@pitrou.net>
References: <20110521170725.51eab5f9@pitrou.net>
Message-ID: <1305992928.1475.10.camel@hobo>

On Sat, 2011-05-21 at 17:07 +0200, Antoine Pitrou wrote:
> Hello,
> 
> I would like to suggest that we remove the socket HOWTO (currently at
> http://docs.python.org/dev/howto/sockets.html)
> 
> My main issue with this document is that it doesn't seem to have
> a well-defined destination:
> - people who know sockets won't learn anything from it
> - but people who don't know sockets will probably find it clear as mud
> (for example, what's an "INET" or "STREAM" socket? what's "select"?)
> 
> I have other issues, such as the style/tone it's written in. I'm sure
> the author had fun writing it but it doesn't fit well with the rest of
> the documentation. Also, the author gives a lot of "advice" without
> explaining or justifying it ("if somewhere in those input lists of
> sockets is one which has died a nasty death, the select will fail" ->
> is that really true? what is a "nasty death" and how is that supposed to
> happen? couldn't the author have put a 3-line example to demonstrate
> this supposed drawback and how it manifests?).
> 
> And, finally, many statements seem arbitrary ("There's no question that
> the fastest sockets code uses non-blocking sockets and select to
> multiplex them") or plain wrong ("threading support in Unixes varies
> both in API and quality. So the normal Unix solution is to fork a
> subprocess to deal with each connection"). I don't think giving
> misleading advice to users is really a good idea. And suggesting that
> beginners use non-blocking sockets without even *showing* how (or
> pointing to asyncore or Twisted) is a very bad idea. select() is not
> enough: you still have to be prepared to get EAGAIN or EWOULDBLOCK when
> calling recv() or send() (i.e. select() can give false positives).
> 
> Oh and I think it's obsolete too, because the "class mysocket"
> concatenates the output of recv() with a str rather than a bytes
> object. Not to mention that features of the "class mysocket" can be had
> using a buffered socket.makefile() instead of writing custom code.
> 
> (followed up from http://bugs.python.org/issue12126 at Eli's request)

While I agree with most of what you said, I actually did find it very
useful when first learning sockets.
It's on the first page of Google results for "socket programming" or
"socket how to". Also, it hinted at some concepts that could then be
googled for more information, like select, nonblocking sockets, etc.

However, I would agree that this should be moved out of the
documentation and as suggested in the issue, into the wiki.


From orsenthil at gmail.com  Sat May 21 18:01:19 2011
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Sun, 22 May 2011 00:01:19 +0800
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <ir8m70$e7c$1@dough.gmane.org>
References: <20110521170725.51eab5f9@pitrou.net> <ir8m70$e7c$1@dough.gmane.org>
Message-ID: <20110521160118.GA22904@kevin>

On Sat, May 21, 2011 at 05:37:05PM +0200, Georg Brandl wrote:
> > 
> > I would like to suggest that we remove the socket HOWTO (currently at
> > http://docs.python.org/dev/howto/sockets.html)
> 
> +1, or a big rewrite.
> 

I favor a rewrite over removal. I have read it once or twice and have
never revisited it (probably because it was not helpful enough to
warrant a revisit), but it still gives some important pointers.

One document cannot cover it all; there are many pointers (examples at
effbot.org, Python MoTW docs) that all serve as good introductions to
sockets in Python.

So a rewrite with good pointers would be more appropriate.

-- 
Senthil


From g.brandl at gmx.net  Sat May 21 19:38:42 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 21 May 2011 19:38:42 +0200
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <20110521160118.GA22904@kevin>
References: <20110521170725.51eab5f9@pitrou.net>
	<ir8m70$e7c$1@dough.gmane.org> <20110521160118.GA22904@kevin>
Message-ID: <ir8tb1$k0o$1@dough.gmane.org>

On 05/21/11 18:01, Senthil Kumaran wrote:
> On Sat, May 21, 2011 at 05:37:05PM +0200, Georg Brandl wrote:
>> > 
>> > I would like to suggest that we remove the socket HOWTO (currently at
>> > http://docs.python.org/dev/howto/sockets.html)
>> 
>> +1, or a big rewrite.
>> 
> 
> I favor a rewrite over removal. I have read it once or twice and have
> never revisited it (probably because it was not helpful enough to
> warrant a revisit), but it still gives some important pointers.
> 
> One document cannot cover it all; there are many pointers (examples at
> effbot.org, Python MoTW docs) that all serve as good introductions to
> sockets in Python.
> 
> So a rewrite with good pointers would be more appropriate.

Even then, it's better off in the Wiki until the rewrite is complete.

Georg


From ziade.tarek at gmail.com  Sat May 21 20:17:40 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Sat, 21 May 2011 20:17:40 +0200
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <20110521163714.68c5384f@pitrou.net>
References: <20110521163714.68c5384f@pitrou.net>
Message-ID: <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>

On Sat, May 21, 2011 at 4:37 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>
> Hello,
>
> We recently got a couple of new stable buildbots:
> - R. David Murray's "x86 Gentoo" machine, which builds in non-debug
>   mode and therefore checks that release Pythons work fine
> - Stefan Krah's "AMD64 FreeBSD 8.2" machine
> - Bill Janssen's "AMD64 Snow Leopard" machine
>
> Many stable buildbots on the default branch (*) are currently red
> because of test_packaging issues.
> (*) http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable

Yes, I am aware of this. I fixed most of the remaining issues today, and
am fixing the final ones right now.


>
> Regards
>
> Antoine.
>
>



-- 
Tarek Ziadé | http://ziade.org

From artur.siekielski at gmail.com  Sun May 22 01:57:55 2011
From: artur.siekielski at gmail.com (Artur Siekielski)
Date: Sun, 22 May 2011 01:57:55 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
	outside of objects
Message-ID: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>

Hi.
The problem with reference counters is that they are very often
incremented/decremented, even for read-only algorithms (like traversal
of a list). This has two drawbacks:
1. CPU cache lines (64 bytes on X86) containing the beginning of a
PyObject are very often invalidated, resulting in losing many chances
to use the CPU caches
2. The copy-on-write after fork() optimization (Linux) is almost
useless in CPython, because even if you don't modify data directly,
refcounts are modified, and PyObjects with refcounts inside are spread
all over the process's memory (and one small refcount modification
causes the whole page - 4kB - to be copied into a child process).
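
The claim that merely referencing an object writes to it can be
observed from Python itself. This is a small CPython-specific sketch;
the absolute numbers reported by sys.getrefcount() vary between
versions, so only the difference is meaningful.

```python
import sys

obj = object()

before = sys.getrefcount(obj)
alias = obj                      # no data is modified, only a new reference
after = sys.getrefcount(obj)

# The counter stored inside the object changed, so the memory page
# holding the object header was written to.
delta = after - before
print(delta)
```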

So an idea I would like to try is to move reference counts outside of
PyObjects, to a contiguous block(s) of memory. PyObjects would have a
pointer to a reference count inside this block. Doing this I think
that
1. The beginning of PyObject structs could be CPU-cached for a much
longer time (small objects like ints could be fully cached). I don't
know whether having localized writes into the block with refcounts
would also help performance.
2. copy-on-write after fork() will work much better, only the block
with refcounts would be copied into a child process (for read-only
algorithms)

However, the drawback is that such a design introduces a new level of
indirection: a pointer inside a PyObject instead of a direct
value. Also, it seems that the "block" with refcounts would have to be
a non-trivial data structure.

I'm not a compiler/profiling expert, so the main question is whether
such a design can work, and whether someone has already thought about
something similar. Also, has CPython been profiled for CPU cache usage?

From solipsis at pitrou.net  Sun May 22 14:48:37 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 22 May 2011 14:48:37 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
Message-ID: <20110522144837.10e9b95d@pitrou.net>


Hello,

On Sun, 22 May 2011 01:57:55 +0200
Artur Siekielski <artur.siekielski at gmail.com> wrote:
> 1. CPU cache lines (64 bytes on X86) containing the beginning of a
> PyObject are very often invalidated, resulting in losing many chances
> to use the CPU caches

Mutating data doesn't invalidate a cache line. It just makes it
necessary to write it back to memory at some point.

> 2. The copy-on-write after fork() optimization (Linux) is almost
> useless in CPython, because even if you don't modify data directly,
> refcounts are modified, and PyObjects with refcounts inside are spread
> all over process' memory (and one small refcount modification causes
> the whole page - 4kB - to be copied into a child process).

Indeed.

> I'm not a compiler/profiling expert, so the main question is whether
> such a design can work, and whether someone has already thought about
> something similar. Also, has CPython been profiled for CPU cache usage?

This has already been proposed a couple of times. I guess what's needed
is for someone to experiment and post benchmark results.

Regards

Antoine.



From neologix at free.fr  Sun May 22 16:23:55 2011
From: neologix at free.fr (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=)
Date: Sun, 22 May 2011 16:23:55 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <20110522144837.10e9b95d@pitrou.net>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<20110522144837.10e9b95d@pitrou.net>
Message-ID: <BANLkTinjggE98g+6Mso5CAF7VH9qyhHRRQ@mail.gmail.com>

>> 1. CPU cache lines (64 bytes on X86) containing the beginning of a
>> PyObject are very often invalidated, resulting in losing many chances
>> to use the CPU caches
>
> Mutating data doesn't invalidate a cache line. It just makes it
> necessary to write it back to memory at some point.
>

I think he's referring to the multi-core case.
In MESI terminology, the cache line will become modified in the
current core's cache (current thread), but invalid in other cores'
caches. But given that object accesses are serialized by the GIL (which
will issue a memory barrier anyway), I'm not sure that the performance
impact will be noticeable. Furthermore, given that threads are
actually serialized, I suspect that the scheduler tends to bind them
naturally to the same CPU.

>> 2. The copy-on-write after fork() optimization (Linux) is almost
>> useless in CPython, because even if you don't modify data directly,
>> refcounts are modified, and PyObjects with refcounts inside are spread
>> all over process' memory (and one small refcount modification causes
>> the whole page - 4kB - to be copied into a child process).
>
> Indeed.
>

There was a bug report a couple of months ago from someone using large
datasets for some scientific application. He suggested adding
support for Linux's MADV_MERGEABLE, but the root cause is really the
reference count being incremented even when objects are treated as
read-only.
For the record, it's http://bugs.python.org/issue9942 (and this idea
was brought up here).
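
To make the root cause concrete, here is a minimal sketch (not taken from
the bug report) showing that merely taking another reference to an object
changes its reference count, even though the object is never modified. It
assumes CPython; the helper name is made up for illustration, and only the
difference between the two sys.getrefcount readings is meaningful, since
the absolute values include temporary references.

```python
import sys

def refcount_delta_on_read():
    """Return how much a refcount grows when an object is merely referenced."""
    payload = [1, 2, 3]                 # freshly allocated, never mutated below
    before = sys.getrefcount(payload)
    alias = payload                     # a pure "read": just one more reference
    after = sys.getrefcount(payload)
    del alias
    return after - before

delta = refcount_delta_on_read()
```

Each such write lands in the object header, which is why a forked child
that only reads shared data can still dirty the pages holding it.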

cf

From janssen at parc.com  Mon May 23 03:00:51 2011
From: janssen at parc.com (Bill Janssen)
Date: Sun, 22 May 2011 18:00:51 PDT
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
References: <20110521163714.68c5384f@pitrou.net>
	<BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
Message-ID: <58834.1306112451@parc.com>

Tarek Ziadé <ziade.tarek at gmail.com> wrote:

> Yes, I am aware of this. I fixed most of the remaining issues today, and
> am fixing the final ones right now.

Just FYI: the "AMD64 Snow Leopard" and "PPC Leopard" buildbots
are now green, but the "PPC Tiger" buildbot is still failing for all
branches because of packaging errors:

======================================================================
FAIL: test_user_site (packaging.tests.test_command_install_dist.InstallTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_dist.py", line 95, in test_user_site
    self._test_user_site()
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_dist.py", line 124, in _test_user_site
    self.assertTrue(os.path.exists(self.user_base))
AssertionError: False is not true

======================================================================
FAIL: test_get_outputs (packaging.tests.test_command_install_lib.InstallLibTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/packaging/tests/test_command_install_lib.py", line 71, in test_get_outputs
    self.assertEqual(len(cmd.get_outputs()), 4)
AssertionError: 2 != 4

Bill

From martin at v.loewis.de  Mon May 23 06:59:19 2011
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 23 May 2011 06:59:19 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
Message-ID: <4DD9E9A7.50807@v.loewis.de>

> I'm not a compiler/profiling expert so the main question is whether such
> a design can work, and whether someone has already thought about something
> similar.

My expectation is that your approach would likely make the issues
worse in a multi-CPU setting. If you put multiple reference counters
into a contiguous block of memory, unrelated reference counters will
live in the same cache line. Consequently, changing one reference
counter on one CPU will invalidate the cached reference counters of
that cache line on other CPUs, making your problem a) actually worse.
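
As a rough illustration of how many unrelated counters would end up
sharing a line, here is a small arithmetic sketch. The 64-byte line size
is the x86 figure quoted in this thread; the counter size is taken from
ctypes.c_ssize_t as a stand-in for Py_ssize_t, and the pool is assumed to
start on a cache-line boundary. The function name is made up for the
example.

```python
import ctypes

CACHE_LINE_BYTES = 64                              # assumption: x86 line size
COUNTER_BYTES = ctypes.sizeof(ctypes.c_ssize_t)    # stand-in for Py_ssize_t

def cache_line_of(counter_index):
    """Cache line holding the counter at this index in a packed pool."""
    return (counter_index * COUNTER_BYTES) // CACHE_LINE_BYTES

# How many unrelated objects' counters share one line in such a pool.
counters_per_line = CACHE_LINE_BYTES // COUNTER_BYTES
```

On a typical 64-bit build this gives eight counters per line, so a write
to any one of them invalidates the other seven on every other core.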

Regards,
Martin

From cesare.di.mauro at gmail.com  Mon May 23 07:35:31 2011
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Mon, 23 May 2011 07:35:31 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <4DD9E9A7.50807@v.loewis.de>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<4DD9E9A7.50807@v.loewis.de>
Message-ID: <BANLkTi=6Htr6Gt61GAYBEJ2VZLJ-CSYRCA@mail.gmail.com>

2011/5/23 "Martin v. Löwis" <martin at v.loewis.de>

> > I'm not a compiler/profiling expert so the main question is whether such
> > a design can work, and whether someone has already thought about something
> > similar.
>
> My expectation is that your approach would likely make the issues
> worse in a multi-CPU setting. If you put multiple reference counters
> into a contiguous block of memory, unrelated reference counters will
> live in the same cache line. Consequently, changing one reference
> counter on one CPU will invalidate the cached reference counters of
> that cache line on other CPUs, making your problem a) actually worse.
>
> Regards,
> Martin
>

I don't think that moving ob_refcnt to a proper memory pool will solve the
problem of cache pollution anyway.

ob_refcnt is obviously the most heavily used field in PyObject, but it's not the
only one. We also have ob_type, which is needed to model each object's (instance's)
"behavior" and is heavily accessed too, so its cache line will be loaded
anyway whenever the object is used.

Also, only a few simple objects have just ob_refcnt and ob_type. Most of
them have other fields too, and accessing them means a cache line load.

Regards,
Cesare

P.S. Memory allocation granularity can sometimes help, leaving some data
(ob_refcnt and/or ob_type) on one cache line, and the rest on the next one.

From ncoghlan at gmail.com  Mon May 23 08:15:35 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 23 May 2011 16:15:35 +1000
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <4DD6E177.5020202@v.loewis.de>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>
	<BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>
	<BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>
	<BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
	<BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com>
	<4DD6E177.5020202@v.loewis.de>
Message-ID: <BANLkTi=RyeTT-mnwYES90tCBaVoBeen9ow@mail.gmail.com>

On Sat, May 21, 2011 at 7:47 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> As Jesse has said, there is an RFP in development to improve
>> python.org to the point where we can self-host blogs and the like and
>> deal with the associated user account administration appropriately.
>
> To run a blog on www.python.org, a PEP is not needed. If anybody would
> volunteer to set this up, it could be done in no time.

If I understand correctly, the RFP is more about improving the entire
python.org toolchain to make it something that non-programmers can
easily provide content for (and even *programmers* don't particularly
like the current toolchain).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon May 23 08:17:27 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 23 May 2011 16:17:27 +1000
Subject: [Python-Dev] Hello!
In-Reply-To: <20110521135925.33599a44@pitrou.net>
References: <BANLkTimp_vvh_aUYbD5Q5p0D8UiSJyRx=Q@mail.gmail.com>
	<20110521135925.33599a44@pitrou.net>
Message-ID: <BANLkTi=UsrLqR6t_tJA96UxCgLGJ7E4kTw@mail.gmail.com>

On Sat, May 21, 2011 at 9:59 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Fri, 20 May 2011 19:01:26 +0200
> Charles-François Natali <cf.natali at gmail.com> wrote:
>
>> Hi,
>>
>> My name is Charles-François Natali, I've been using Python for a
>> couple of years, and I've recently been granted commit privileges.
>> I just wanted to say hi to everyone on this list, and let you know
>> that I'm really happy and proud of joining this great community.
>
> Welcome, and keep up the good work.

Indeed!

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon May 23 08:22:20 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 23 May 2011 16:22:20 +1000
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <ir8tb1$k0o$1@dough.gmane.org>
References: <20110521170725.51eab5f9@pitrou.net> <ir8m70$e7c$1@dough.gmane.org>
	<20110521160118.GA22904@kevin> <ir8tb1$k0o$1@dough.gmane.org>
Message-ID: <BANLkTin6WNTTkfQoJd=fthmEgf25fheHPA@mail.gmail.com>

On Sun, May 22, 2011 at 3:38 AM, Georg Brandl <g.brandl at gmx.net> wrote:
> On 05/21/11 18:01, Senthil Kumaran wrote:
>> So a rewrite with good pointers would be more appropriate.
>
> Even then, it's better off in the Wiki until the rewrite is complete.

Perhaps replacing it with a placeholder page that refers to the Wiki
would be appropriate? A simple summary saying that the HOWTO had not
aged well, and hence had been removed from the official documentation
until it had been updated on the Wiki would allow people looking for
it to better understand the situation, and also how to help improve
it.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ziade.tarek at gmail.com  Mon May 23 08:48:15 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 23 May 2011 08:48:15 +0200
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <58834.1306112451@parc.com>
References: <20110521163714.68c5384f@pitrou.net>
	<BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
	<58834.1306112451@parc.com>
Message-ID: <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com>

On Mon, May 23, 2011 at 3:00 AM, Bill Janssen <janssen at parc.com> wrote:
> Tarek Ziadé <ziade.tarek at gmail.com> wrote:
>
>> Yes, I am aware of this. I fixed most of the remaining issues today, and
>> am fixing the final ones right now.
>
> Just FYI: the "AMD64 Snow Leopard" and "PPC Leopard" buildbots
> are now green, but the "PPC Tiger" buildbot is still failing for all
> branches because of packaging errors:

All the Linux and Windows stable slaves are now green. I have a
few issues left to fix for all Solaris flavors, plus the two you
are showing, which are also failing under FreeBSD.

Thanks
Tarek

-- 
Tarek Ziadé | http://ziade.org

From mail at timgolden.me.uk  Mon May 23 10:42:38 2011
From: mail at timgolden.me.uk (Tim Golden)
Date: Mon, 23 May 2011 09:42:38 +0100
Subject: [Python-Dev] os.access on Windows
In-Reply-To: <4DD6E371.2020706@v.loewis.de>
References: <17E3183FF8D8EB47839A7E240AD39FA831772B209C@SVR-EXCH-VMBX.gb.vo.local>	<BANLkTi=o8g-z1Jbfh9qb+KFipTdq_Z=isw@mail.gmail.com>	<4DD687F9.1040403@timgolden.me.uk>
	<BANLkTinkCSZT=b1Bi+iSB57OjMV2VWHbOg@mail.gmail.com>
	<4DD6E371.2020706@v.loewis.de>
Message-ID: <4DDA1DFE.6070800@timgolden.me.uk>

On 20/05/2011 22:56, "Martin v. Löwis" wrote:
>> TBH I think the less attractive we can make os.access() look the
>> better. It uses the real uid instead of the effective uid, it
>> encourages LBYL behavior, the outcome may be incorrect, it doesn't
>> work on Windows... The ONLY reason to ever use it is in a setuid()
>> program. But who writes those any more? (Esp. in Python!)
>
> +1. The best way to determine "could I access this file" is to try
> to access it, and be prepared to get an exception.

FWIW the OP knew this but -- for some reason specific to his
use case -- wanted to avoid updating the mod dates of the containing
directory. Obviously that's his problem, not ours...
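
For readers following along, the "try it and be prepared for an
exception" advice quoted above could look roughly like this. It is only a
sketch: the helper name read_if_accessible is made up for the example, and
both IOError and OSError are caught so the same code works on older and
newer Python versions.

```python
import os
import tempfile

def read_if_accessible(path):
    """Return the file's bytes, or None if it cannot be opened for reading."""
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except (IOError, OSError):
        # Missing file, permission problem, directory, and so on.
        return None

# Usage: one file that exists, then the same path after deletion.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
existing = read_if_accessible(tmp.name)
os.remove(tmp.name)
missing = read_if_accessible(tmp.name)
```

Unlike an os.access() check followed by open(), there is no window
between the check and the use in which the answer can change.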

> So we might deprecate-then-delete it on Windows.

I'll rework that patch to be a DeprecationWarning in that case.
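
This is not the actual patch, just a sketch of the general shape such a
warning could take at the Python level. The wrapper name
access_with_warning and the message text are assumptions made for
illustration; the real change would live inside os.access itself.

```python
import os
import warnings

def access_with_warning(path, mode):
    """Hypothetical wrapper: emit a DeprecationWarning, then defer to os.access."""
    warnings.warn(
        "os.access() may give misleading results on this platform; "
        "try the operation and handle the exception instead",
        DeprecationWarning,
        stacklevel=2,   # point the warning at the caller, not this wrapper
    )
    return os.access(path, mode)

# Usage: record the warning instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = access_with_warning(os.curdir, os.F_OK)
```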

> People who *really* need to know in advance should use the Windows
> API for that on Windows (i.e. call AccessCheck).

And indeed this is what I've recommended to the OP. I'll follow this
up in that python-list thread. I see that Benjamin's updated the
os.access docs so I'll let this thread die and talk the OP through
the AccessCheck route (which is, unfortunately, more tricky because
it's not exposed by pywin32. Also not our problem...)

TJG

From ziade.tarek at gmail.com  Mon May 23 11:58:29 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 23 May 2011 11:58:29 +0200
Subject: [Python-Dev] the distutils2 repo and 3to2
Message-ID: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com>

Hey,

Now that packaging has landed, the distutils2 repo is going to be
reset and will be the Python 2.x / 3.1 / 3.2 backport of packaging.

In theory, we want to automate the extraction of packaging from the
stdlib and a few other modules, and run 3to2 at install time. Or
should I say 3.3tosomething.
I want to do this to avoid maintaining yet another code base. In
practice, I don't really know the current state of 3to2, so we'll see.

Any help or hints with this project would be appreciated.

Thanks
Tarek

-- 
Tarek Ziadé | http://ziade.org

From lukasz at langa.pl  Mon May 23 12:51:27 2011
From: lukasz at langa.pl (=?iso-8859-2?Q?=A3ukasz_Langa?=)
Date: Mon, 23 May 2011 12:51:27 +0200
Subject: [Python-Dev] the distutils2 repo and 3to2
In-Reply-To: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com>
References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com>
Message-ID: <84205284-4F06-408A-95B7-57B504849F59@langa.pl>


On 2011-05-23, at 11:58, Tarek Ziadé wrote:

> Hey,
> 
> Now that packaging has landed, the distutils2 repo is going to be
> reset and will be the Python 2.x / 3.1 / 3.2 backport of packaging.
> 
> In theory, we want to automate the extraction of packaging from the
> stdlib and a few other modules, and run 3to2 at install time. Or
> should I say 3.3tosomething.
> I want to do this to avoid maintaining yet another code base. In
> practice, I don't really know the current state of 3to2, so we'll see.
> 
> Any help or hints with this project would be appreciated.

I'm maintaining a configparser 3.2+ backport for 2.6-2.7 using 3to2. A fully automatic conversion is not really possible, partly because the 3to2 tool is not perfect, and partly because there are parts of the code (esp. in the tests) which no mechanical converter could have figured out on its own.

Anyway, the backport is available here:

http://pypi.python.org/pypi/configparser

There's some documentation there on the conversion process I came up with.

As for distutils2, I was already contacted by Éric Araujo and will help him improve 3to2. We have yet to contact its authors to see if they believe merging our changes upstream will be possible.

-- 
Best regards,
Łukasz Langa
Senior Systems Architecture Engineer

IT Infrastructure Department
Grupa Allegro Sp. z o.o.

From ziade.tarek at gmail.com  Mon May 23 12:58:22 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 23 May 2011 12:58:22 +0200
Subject: [Python-Dev] the distutils2 repo and 3to2
In-Reply-To: <84205284-4F06-408A-95B7-57B504849F59@langa.pl>
References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com>
	<84205284-4F06-408A-95B7-57B504849F59@langa.pl>
Message-ID: <BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com>

2011/5/23 Łukasz Langa <lukasz at langa.pl>:
..
> I'm maintaining a configparser 3.2+ backport for 2.6-2.7 using 3to2.

Do you backport to 3.1?

..
>
> There's some documentation there on the conversion process I came up with.

Awesome, will look up, thanks

>
> As for distutils2, I was already contacted by Éric Araujo and will help him improve 3to2. We have yet to contact its authors to see if they believe merging our changes upstream will be possible.

Great, has anything been started already? If so, we should sync up to see how
we can initiate the d2 repo.

Cheers
Tarek

-- 
Tarek Ziadé | http://ziade.org

From lukasz at langa.pl  Mon May 23 13:14:58 2011
From: lukasz at langa.pl (=?iso-8859-2?Q?=A3ukasz_Langa?=)
Date: Mon, 23 May 2011 13:14:58 +0200
Subject: [Python-Dev] the distutils2 repo and 3to2
In-Reply-To: <BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com>
References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com>
	<84205284-4F06-408A-95B7-57B504849F59@langa.pl>
	<BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com>
Message-ID: <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl>


On 2011-05-23, at 12:58, Tarek Ziadé wrote:

> 2011/5/23 Łukasz Langa <lukasz at langa.pl>:
> ..
>> I'm maintaining a configparser 3.2+ backport for 2.6-2.7 using 3to2.
> 
> Do you backport to 3.1 ?
> 

Not really. I personally think people already using py3k will migrate sooner (even if they have to do it on their own) than the folk on 2.x. The new Ubuntu already ships with Python 3.2.

As for Python 2.x, I've learnt that keeping compatibility with a Python version without decorators, the `io` library, abstract base classes, etc. would mean either diverging branches or reproducing and maintaining bits of the newer stdlib. This is something 3to2 won't help you with, as it's out of scope for that tool. For configparser I only support 2.6+, and nonetheless the backport has a helpers module with a couple of things copied over from 2.7 or 3.1. There's also an external dependency on ordereddict, etc. You see where this is going.
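
To give a flavour of what such compatibility glue tends to look like,
here is a minimal sketch of the usual import fallback for the ordereddict
case. It is not copied from the configparser backport; it assumes the
external `ordereddict` package from PyPI is installed on interpreters
whose `collections` module lacks OrderedDict.

```python
try:
    # Available in the stdlib from Python 2.7 and 3.1 onwards.
    from collections import OrderedDict
except ImportError:
    # Older interpreters: fall back to the external backport (assumed installed).
    from ordereddict import OrderedDict

# Usage: insertion order is preserved either way.
options = OrderedDict()
options["first"] = 1
options["second"] = 2
keys = list(options)
```

Every such shim is small on its own, but each one has to be carried and
tested for as long as the old versions stay supported.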

I've heard you're targeting 2.4 compatibility so be prepared that this is not going to be easy.

-- 
Best regards,
Łukasz Langa
Senior Systems Architecture Engineer

IT Infrastructure Department
Grupa Allegro Sp. z o.o.



From ziade.tarek at gmail.com  Mon May 23 13:23:32 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 23 May 2011 13:23:32 +0200
Subject: [Python-Dev] the distutils2 repo and 3to2
In-Reply-To: <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl>
References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com>
	<84205284-4F06-408A-95B7-57B504849F59@langa.pl>
	<BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com>
	<B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl>
Message-ID: <BANLkTi=RCRrQfnz+V8Ec-HKGRPmtNcGYzw@mail.gmail.com>

2011/5/23 Łukasz Langa <lukasz at langa.pl>:
...
>
> I've heard you're targeting 2.4 compatibility so be prepared that this is not going to be easy.

yeah well, we might raise the bar to 2.5 and use some __future__
statements. I am not sure that keeping 2.4 support is that useful
anymore.
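
For reference, the set of __future__ features a 2.5 baseline would unlock
can be read off the __future__ module itself. The sketch below is only an
illustration (the helper name features_available_by is made up); it lists
every feature whose optional release is at or before a given version.

```python
import __future__

def features_available_by(version):
    """Names of __future__ features importable on the given (major, minor)."""
    names = []
    for name in __future__.all_feature_names:
        feature = getattr(__future__, name)
        optional = feature.getOptionalRelease()   # e.g. (2, 5, 0, 'alpha', 1)
        if optional[:2] <= version:
            names.append(name)
    return names

available_on_25 = features_available_by((2, 5))
```

with_statement and absolute_import show up for (2, 5), while
print_function and unicode_literals only appear from 2.6.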

Cheers
Tarek
-- 
Tarek Ziadé | http://ziade.org

From ncoghlan at gmail.com  Mon May 23 14:14:50 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 23 May 2011 22:14:50 +1000
Subject: [Python-Dev] the distutils2 repo and 3to2
In-Reply-To: <BANLkTi=RCRrQfnz+V8Ec-HKGRPmtNcGYzw@mail.gmail.com>
References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com>
	<84205284-4F06-408A-95B7-57B504849F59@langa.pl>
	<BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com>
	<B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl>
	<BANLkTi=RCRrQfnz+V8Ec-HKGRPmtNcGYzw@mail.gmail.com>
Message-ID: <BANLkTinTxBn_H2YS=z_n5FacGY7BcgZZpQ@mail.gmail.com>

2011/5/23 Tarek Ziadé <ziade.tarek at gmail.com>:
> 2011/5/23 Łukasz Langa <lukasz at langa.pl>:
> ...
>>
>> I've heard you're targeting 2.4 compatibility so be prepared that this is not going to be easy.
>
> yeah well, we might raise the bar to 2.5 and use some __future__
> statements. I am not sure that keeping 2.4 support is that useful
> anymore.

Anyone still stuck with 2.4 at this point in time is probably going to
struggle to switch their packaging support library from distutils to
distutils2 anyway.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From fdrake at acm.org  Mon May 23 14:25:22 2011
From: fdrake at acm.org (Fred Drake)
Date: Mon, 23 May 2011 08:25:22 -0400
Subject: [Python-Dev] the distutils2 repo and 3to2
In-Reply-To: <B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl>
References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com>
	<84205284-4F06-408A-95B7-57B504849F59@langa.pl>
	<BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com>
	<B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl>
Message-ID: <BANLkTinL26Fd1GhN=OuM90QJt=ui5_fsSQ@mail.gmail.com>

2011/5/23 Łukasz Langa <lukasz at langa.pl>:
> The new Ubuntu already ships with Python 3.2.

Uptake on Ubuntu 11.04 will take longer than 10.10 uptake, given the
reliability issues and the reaction to the new user interface.

That's not to say it won't be significant, but the strength of the
indicator may be less significant than in the past.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at acm.org>
"Give me the luxuries of life and I will willingly do without the necessities."
   --Frank Lloyd Wright

From jnoller at gmail.com  Mon May 23 14:44:43 2011
From: jnoller at gmail.com (Jesse Noller)
Date: Mon, 23 May 2011 08:44:43 -0400
Subject: [Python-Dev] looking for a contact at Google on the Blogger team
In-Reply-To: <BANLkTi=RyeTT-mnwYES90tCBaVoBeen9ow@mail.gmail.com>
References: <5B09C555-1F4E-413E-9719-2ED1C9B68DF1@gmail.com>
	<BANLkTiksdt85-sV5w5bHgB3OUpnFW9qEVw@mail.gmail.com>
	<BANLkTin7cf6iZZQz+MkKWmTTzSH6yNgL2A@mail.gmail.com>
	<BANLkTi=WGTmCKT3rwjDbcU+j2KEkLzVjUg@mail.gmail.com>
	<BANLkTin8RWdoXFN-aEfLEiYqe6-dmqG7jw@mail.gmail.com>
	<BANLkTinV9XXvVH51LL74HVzTDaB-dzSWJA@mail.gmail.com>
	<BANLkTinaU6ypzk1+n4z-0FeOmM-fGic7KA@mail.gmail.com>
	<4DD6E177.5020202@v.loewis.de>
	<BANLkTi=RyeTT-mnwYES90tCBaVoBeen9ow@mail.gmail.com>
Message-ID: <BANLkTik7XVD13x2He=o8kQ+ZUG3V3hgJcQ@mail.gmail.com>

On Mon, May 23, 2011 at 2:15 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Sat, May 21, 2011 at 7:47 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>>> As Jesse has said, there is an RFP in development to improve
>>> python.org to the point where we can self-host blogs and the like and
>>> deal with the associated user account administration appropriately.
>>
>> To run a blog on www.python.org, a PEP is not needed. If anybody would
>> volunteer to set this up, it could be done in no time.
>
> If I understand correctly, the RFP is more about improving the entire
> python.org toolchain to make it something that non-programmers can
> easily provide content for (and even *programmers* don't particularly
> like the current toolchain).
>
> Cheers,
> Nick.

That is correct.

From barry at python.org  Mon May 23 16:40:30 2011
From: barry at python.org (Barry Warsaw)
Date: Mon, 23 May 2011 10:40:30 -0400
Subject: [Python-Dev] the distutils2 repo and 3to2
In-Reply-To: <BANLkTinL26Fd1GhN=OuM90QJt=ui5_fsSQ@mail.gmail.com>
References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com>
	<84205284-4F06-408A-95B7-57B504849F59@langa.pl>
	<BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com>
	<B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl>
	<BANLkTinL26Fd1GhN=OuM90QJt=ui5_fsSQ@mail.gmail.com>
Message-ID: <20110523104030.6a08801f@neurotica.wooz.org>

Okay, this reply is getting off-topic, so I won't belabor the point (please
email me directly if you want to discuss further).

On May 23, 2011, at 08:25 AM, Fred Drake wrote:

>2011/5/23 Łukasz Langa <lukasz at langa.pl>:
>> The new Ubuntu already ships with Python 3.2.
>
>Uptake on Ubuntu 11.04 will take longer than 10.10 uptake, given the
>reliability issues and the reaction to the new user interface.

You're not required to run the default desktop (Unity) of course.  There are
several options out of the box, including the classic desktop and Unity 2D,
and there are a wide range of supported derivatives of Ubuntu offering
additional desktops, such as KDE (Kubuntu) and Xfce (Xubuntu).

Cheers,
-Barry

From fdrake at acm.org  Mon May 23 16:53:07 2011
From: fdrake at acm.org (Fred Drake)
Date: Mon, 23 May 2011 10:53:07 -0400
Subject: [Python-Dev] the distutils2 repo and 3to2
In-Reply-To: <20110523104030.6a08801f@neurotica.wooz.org>
References: <BANLkTikX5__nyOn+RX2FncnLPtXm4=aPyw@mail.gmail.com>
	<84205284-4F06-408A-95B7-57B504849F59@langa.pl>
	<BANLkTinPS_F4hnxKWd1nd3XoOyWae5daXQ@mail.gmail.com>
	<B32E9FD3-94B9-4CCE-9257-EF12B0E57BAA@langa.pl>
	<BANLkTinL26Fd1GhN=OuM90QJt=ui5_fsSQ@mail.gmail.com>
	<20110523104030.6a08801f@neurotica.wooz.org>
Message-ID: <BANLkTim-fkz6qVCpH5JAUePVk2NB9=2uww@mail.gmail.com>

On Mon, May 23, 2011 at 10:40 AM, Barry Warsaw <barry at python.org> wrote:
> You're not required to run the default desktop (Unity) of course.  There are
> several options out of the box, including the classic desktop and Unity 2D,
> and there are a wide range of supported derivatives of Ubuntu offering
> additional desktops, such as KDE (Kubuntu) and Xfce (Xubuntu).

Of course, but I still think the default affects the rate of uptake.  I'm not
attacking Ubuntu, but I think the uptake rate is relevant to our current
discussion.

That said, the multi-monitor issues prevent my updating to 11.04.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at acm.org>
"Give me the luxuries of life and I will willingly do without the necessities."
   --Frank Lloyd Wright

From ethan at stoneleaf.us  Mon May 23 19:20:50 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 23 May 2011 10:20:50 -0700
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <16FC9995-2C52-44C2-BDDE-7E7E4B54C9E3@twistedmatrix.com>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>	<BANLkTik9oXD0Tont0MeyFF9im655946r2g@mail.gmail.com>
	<16FC9995-2C52-44C2-BDDE-7E7E4B54C9E3@twistedmatrix.com>
Message-ID: <4DDA9772.2060401@stoneleaf.us>

Glyph Lefkowitz wrote:
> In fact, I feel like I would want to push in the opposite direction: 
> don't treat one-byte bytes slices less like integers; I wish I could 
> more easily treat n-byte sequences _more_ like integers! :).  More 
> protocols have 2-byte or 4-byte network-endian packed integers embedded 
> in them than have individual tag bytes that I want to examine.

So are you thinking that bytes([01,56])[:2] == 120 ?  Or more along the 
lines of a .to_int() method?

~Ethan~

From ziade.tarek at gmail.com  Mon May 23 19:16:36 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 23 May 2011 19:16:36 +0200
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com>
References: <20110521163714.68c5384f@pitrou.net>
	<BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
	<58834.1306112451@parc.com>
	<BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com>
Message-ID: <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com>

On Mon, May 23, 2011 at 8:48 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> On Mon, May 23, 2011 at 3:00 AM, Bill Janssen <janssen at parc.com> wrote:
>> Tarek Ziadé <ziade.tarek at gmail.com> wrote:
>>
>>> Yes, I am aware of this. I fixed most of the remaining issues today, and
>>> am fixing the final ones right now.
>>
>> Just FYI: the "AMD64 Snow Leopard" and "PPC Leopard" buildbots
>> are now green, but the "PPC Tiger" buildbot is still failing for all
>> branches because of packaging errors:
>
> All the Linux and Windows stable slaves are now green. I have a
> few issues left to fix for all Solaris flavors, plus the two you
> are showing, which are also failing under FreeBSD.

I have now completed the cleanup and we're back in green-land for the
stable bots.

The red slaves should turn green when they catch up with the latest rev
(they are slow). If they don't, and they are failing in packaging or
sysconfig, let me know.

Sorry again that it has taken so long. Setting up Solaris and BSD VMs
took some time ;)


Cheers
Tarek
-- 
Tarek Ziadé | http://ziade.org

From sturla at molden.no  Mon May 23 18:39:07 2011
From: sturla at molden.no (Sturla Molden)
Date: Mon, 23 May 2011 18:39:07 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <4DD9E9A7.50807@v.loewis.de>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<4DD9E9A7.50807@v.loewis.de>
Message-ID: <4DDA8DAB.2060209@molden.no>

On 23.05.2011 06:59, "Martin v. Löwis" wrote:
>
> My expectation is that your approach would likely make the issues
> worse in a multi-CPU setting. If you put multiple reference counters
> into a contiguous block of memory, unrelated reference counters will
> live in the same cache line. Consequently, changing one reference
> counter on one CPU will invalidate the cached reference counters of
> that cache line on other CPUs, making your problem a) actually worse.

In a multi-threaded setting with concurrent threads accessing reference 
counts, this would certainly worsen the situation.

In a single-threaded setting, this will likely be an improvement.

CPython, however, has a GIL. Thus there is only one concurrently active 
thread with access to reference counts. On a thread switch in the 
interpreter, I think the performance result will depend on the nature of 
the Python code: If threads share a lot of objects, it could help to 
reduce the number of dirty cache lines. If threads mainly work on 
private objects, it would likely have the effect you predict. Which will 
dominate is hard to tell.

Instead, we could use multiple heaps:

Each Python thread could manage its own heap for malloc and free (cf. 
HeapAlloc and HeapFree in Windows). Objects local to one thread only 
reside in the locally managed heap.

When an object becomes shared by several Python threads, it is moved 
from a local heap to the global heap of the process. Some objects, such 
as modules, would be stored directly on the global heap.

This way, objects used by only one thread would never dirty cache 
lines used by other threads.

This would also be a way to reduce the CPython dependency on the GIL. 
Only the global heap would need to be protected by the GIL, whereas the 
local heaps would not need any global synchronization.
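
To make the idea a bit more tangible, here is a toy bookkeeping model of
the scheme. It is purely illustrative: nothing here allocates memory, the
class and method names are invented, thread ids are passed in explicitly
instead of using real threads, and an ordinary lock stands in for the GIL.

```python
import threading

GLOBAL = "global"   # marker for objects living on the shared heap

class ToyHeaps(object):
    """Bookkeeping-only model of per-thread heaps plus one global heap."""

    def __init__(self):
        self.global_lock = threading.Lock()   # stands in for the GIL
        self.location = {}                    # object name -> thread id or GLOBAL

    def allocate(self, name, thread_id, shared=False):
        # Modules and similar objects would go straight to the global heap.
        self.location[name] = GLOBAL if shared else thread_id

    def access(self, name, thread_id):
        """Return True if this access needed the global lock."""
        if self.location[name] == thread_id:
            return False                      # thread-local: no global sync
        with self.global_lock:
            # An access from a foreign thread promotes the object.
            self.location[name] = GLOBAL
        return True

heaps = ToyHeaps()
heaps.allocate("scratch", thread_id=1)
heaps.allocate("sys", thread_id=1, shared=True)
local_hit = heaps.access("scratch", 1)    # owner touching its own object
module_hit = heaps.access("sys", 1)       # already on the global heap
promotion = heaps.access("scratch", 2)    # second thread: object is promoted
after_promotion = heaps.access("scratch", 1)
```

The hard part that this model skips entirely is detecting the moment an
object becomes shared, and moving it without invalidating pointers.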


(I am setting follow-up to the Python Ideas list; this does not belong on 
python-dev.)

Sturla Molden

From tjreedy at udel.edu  Mon May 23 19:55:41 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 23 May 2011 13:55:41 -0400
Subject: [Python-Dev] Python 3.x and bytes
In-Reply-To: <4DDA9772.2060401@stoneleaf.us>
References: <4DD2C2A5.3080403@stoneleaf.us>	<BANLkTimvzZeN5dmm2xHP5xV8Kpw2Nb9kuQ@mail.gmail.com>	<4DD2D89D.4000303@stoneleaf.us>	<BANLkTintqgBLFtBx8+1b+R10nywuKdKHOw@mail.gmail.com>	<4DD2F661.2050005@stoneleaf.us>	<BANLkTikx8U4jWwLXXazpmtsL6MasDmyVyg@mail.gmail.com>	<BANLkTik9oXD0Tont0MeyFF9im655946r2g@mail.gmail.com>	<16FC9995-2C52-44C2-BDDE-7E7E4B54C9E3@twistedmatrix.com>
	<4DDA9772.2060401@stoneleaf.us>
Message-ID: <ire72p$s38$1@dough.gmane.org>

On 5/23/2011 1:20 PM, Ethan Furman wrote:
> Glyph Lefkowitz wrote:
>> In fact, I feel like I would want to push in the opposite direction:
>> don't treat one-byte bytes slices less like integers; I wish I could
>> more easily treat n-byte sequences _more_ like integers! :). More
>> protocols have 2-byte or 4-byte network-endian packed integers
>> embedded in them than have individual tag bytes that I want to examine.
>
> So are you thinking that bytes([01,56])[:2] == 120 ? Or more along the
> lines of a .to_int() method?

I believe that such things can be handled by the struct module.
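
A minimal sketch of what that could look like for the 2-byte and 4-byte
network-endian integers mentioned above. The packet layout and the helper
names (read_u16_be, read_u32_be) are made up for illustration;
struct.unpack_from and int.from_bytes are the standard calls.

```python
import struct

def read_u16_be(data, offset=0):
    """Decode a 2-byte big-endian (network order) unsigned integer."""
    return struct.unpack_from(">H", data, offset)[0]

def read_u32_be(data, offset=0):
    """Decode a 4-byte big-endian (network order) unsigned integer."""
    return struct.unpack_from(">I", data, offset)[0]

# Hypothetical packet: a 2-byte length followed by a 4-byte identifier.
packet = bytes([0x01, 0x38, 0x00, 0x00, 0x01, 0x00])

length = read_u16_be(packet)      # bytes 0..1
ident = read_u32_be(packet, 2)    # bytes 2..5

# int.from_bytes (Python 3.2+) handles arbitrary widths without a format code.
same_length = int.from_bytes(packet[:2], "big")
```

struct suits fixed layouts with several fields; int.from_bytes is handier
when the width is only known at run time.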

-- 
Terry Jan Reedy


From artur.siekielski at gmail.com  Mon May 23 22:55:21 2011
From: artur.siekielski at gmail.com (Artur Siekielski)
Date: Mon, 23 May 2011 22:55:21 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
	outside of objects
In-Reply-To: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
Message-ID: <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>

Ok, I managed to make a quick but working patch (sufficient to get a
working interpreter, though it segfaults for extension modules). It uses the
"ememoa" allocator (http://code.google.com/p/ememoa/) which seems a
reasonable pool allocator. The patch: http://dpaste.org/K8en/. The
main obstacle was that there isn't a single function/macro that can be
used to initialize all PyObjects, so I had to initialize static
PyObjects (mainly PyTypeObjects) by hand.

I used a naive quicksort algorithm as a benchmark:
http://dpaste.org/qquh/ . The result is that after patching it runs
50% SLOWER. I profiled it and allocator methods used 35% of the time. So
there is still 15% performance loss even if the allocator is poor.

Anyway, I'd like to have working copy-on-write in CPython - in the
presence of GIL I find it important to have multiprocess programs
optimized (and I think it's a common idiom that a parent process
prepares some big data structure, and child "worker" processes do some
read-only querying).

Artur

From guido at python.org  Mon May 23 23:08:48 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 23 May 2011 14:08:48 -0700
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>
Message-ID: <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com>

On Mon, May 23, 2011 at 1:55 PM, Artur Siekielski
<artur.siekielski at gmail.com> wrote:
> Ok, I managed to make a quick but working patch (sufficient to get
> working interpreter, it segfaults for extension modules). It uses the
> "ememoa" allocator (http://code.google.com/p/ememoa/) which seems a
> reasonable pool allocator. The patch: http://dpaste.org/K8en/. The
> main obstacle was that there isn't a single function/macro that can be
> used to initialize all PyObjects, so I had to initialize static
> PyObjects (mainly PyTypeObjects) by hand.
>
> I used a naive quicksort algorithm as a benchmark:
> http://dpaste.org/qquh/ . The result is that after patching it runs
> 50% SLOWER. I profiled it and allocator methods used 35% time. So
> there is still 15% performance loss even if the allocator is poor.
>
> Anyway, I'd like to have working copy-on-write in CPython - in the
> presence of GIL I find it important to have multiprocess programs
> optimized (and I think it's a common idiom that a parent process
> prepares some big data structure, and child "worker" processes do some
> read-only querying).

That is the question though -- *is* the idiom commonly used? It
doesn't seem to me that it would scale all that far, since it only
works as long as all forked copies live on the same machine and run on
the same symmetrical multi-core processor.

-- 
--Guido van Rossum (python.org/~guido)

From artur.siekielski at gmail.com  Tue May 24 00:07:27 2011
From: artur.siekielski at gmail.com (Artur Siekielski)
Date: Tue, 24 May 2011 00:07:27 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>
	<BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com>
Message-ID: <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com>

2011/5/23 Guido van Rossum <guido at python.org>:
>> Anyway, I'd like to have working copy-on-write in CPython - in the
>> presence of GIL I find it important to have multiprocess programs
>> optimized (and I think it's a common idiom that a parent process
>> prepares some big data structure, and child "worker" processes do some
>> read-only querying).
>
> That is the question though -- *is* the idiom commonly used?

In fact I came to the whole idea with this optimization because the
idiom didn't work for me. I had a big word index built by a parent
process, and then wanted the children to enable querying this index (I
wanted to use all cores on a server). The index consumed 50% of RAM
and after a few minutes the children consumed all RAM.
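
A small illustration of why read-only use can still dirty memory in
CPython: the reference count lives in the object header, so merely holding
another reference writes to the object. This is a sketch for a standard
(non-free-threaded) CPython build; the function name and the test string
are made up for the example.

```python
import sys

def refcount_delta():
    # Build the string at run time so it is an ordinary heap object.
    word = "-".join(["big", "index"])
    before = sys.getrefcount(word)
    holder = [word]   # "read-only" use: just keep one more reference
    after = sys.getrefcount(word)
    return before, after, holder

before, after, _holder = refcount_delta()
```

In a forked child, that header write lands on a page shared with the
parent, and the kernel then has to copy the page.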

I find it common in languages like Java to use thread pools; in
Python+Linux we have multiprocess pools if we want to use all cores,
and in this setting having a working copy-on-write is really valuable.

Oh, and using explicit shared memory or mmap is much harder, because
you have to map the whole object graph into bytes.

> It
> doesn't seem to me that it would scale all that far, since it only
> works as long as all forked copies live on the same machine and run on
> the same symmetrical multi-core processor.

? I don't know about multi-processor systems, but on single-processor
multi-core systems (which are common even on servers) and Linux it
works.


Artur

From sturla at molden.no  Tue May 24 00:33:43 2011
From: sturla at molden.no (Sturla Molden)
Date: Tue, 24 May 2011 00:33:43 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>	<BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>	<BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com>
	<BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com>
Message-ID: <4DDAE0C7.9040501@molden.no>

On 24.05.2011 00:07, Artur Siekielski wrote:
>
> Oh, and using explicit shared memory or mmap is much harder, because
> you have to map the whole object graph into bytes.

It sounds like you need PYRO, POSH or multiprocessing's proxy objects.

Sturla

From victor.stinner at haypocalc.com  Tue May 24 02:08:49 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 24 May 2011 02:08:49 +0200
Subject: [Python-Dev] Deprecate codecs.open() and StreamWriter/StreamReader
Message-ID: <1306195729.605.27.camel@marge>

Hi,

In Python 2, codecs.open() is the best way to read and/or write files
using Unicode. But in Python 3, open() is preferred with its fast io
module. I would like to deprecate codecs.open() because it can be
replaced by open() and io.TextIOWrapper. I would like your opinion and
that's why I'm writing this email.

--

codecs.open() and StreamReader, StreamWriter and StreamReaderWriter
classes of the codecs module don't support universal newlines, still
have some issues with stateful codecs (like UTF-16/32 BOMs), and each
codec has to implement a StreamReader and a StreamWriter class.

StreamReader and StreamWriter are stateless codecs (no reset() or
setstate() method), and so it's not possible to write a generic fix for
all child classes in the codecs module. Each stateful codec has to
handle special cases like seek() problems. For example, UTF-16 codec
duplicates some IncrementalEncoder/IncrementalDecoder code into its
StreamWriter/StreamReader class.

The io module is well tested, supports non-seekable streams, correctly
handles corner cases (like UTF-16/32 BOMs) and supports any kind of
newlines including a "universal newline" mode. TextIOWrapper reuses
incremental encoders and decoders, so BOM issues were fixed only once,
in TextIOWrapper.
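
A minimal sketch of the BOM point, using an in-memory io.BytesIO as a
stand-in for a real file: two separate writes through one TextIOWrapper
should produce a single BOM at the start of the stream.

```python
import io

raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding="utf-16")

# Two separate writes go through the same incremental encoder.
text.write("a")
text.write("b")
text.flush()

encoded = raw.getvalue()
# Reference value: the one-shot encoding of the whole string (BOM + data).
expected = "ab".encode("utf-16")
```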

It's trivial to replace a call to codecs.open() by a call to open(),
because the two APIs are very close. The main difference is that
codecs.open() doesn't support universal newline, so you have to use
open(..., newline='') to keep the same behaviour (keep newlines
unchanged). This task can be done by 2to3. But I suppose that most
people will be happy with the universal newline mode.
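
A sketch of the newline difference described above. It writes a small
temporary file and reads it back three ways; read_three_ways is a
hypothetical helper, and the warnings filter is only there in case a newer
Python emits a warning for codecs.open().

```python
import codecs
import os
import tempfile
import warnings

def read_three_ways(raw=b"one\r\ntwo\n"):
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            with codecs.open(path, "r", encoding="utf-8") as f:
                via_codecs = f.read()          # newlines left as-is
        with open(path, encoding="utf-8", newline="") as f:
            via_open_raw = f.read()            # newlines left as-is
        with open(path, encoding="utf-8") as f:
            via_open_universal = f.read()      # "\r\n" translated to "\n"
    finally:
        os.remove(path)
    return via_codecs, via_open_raw, via_open_universal

via_codecs, via_open_raw, via_open_universal = read_three_ways()
```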

I don't see which usecase is not covered by TextIOWrapper. But I know
some cases which are not supported by StreamReader/StreamWriter.

--

I opened an issue for this idea. Brett and Marc-Andre Lemburg don't
want to deprecate codecs.open() & friends because they want to be able
to write code working on Python 2 and on Python 3 without any change. I
don't think it's realistic: nontrivial programs require at least the six
module, and most likely the 2to3 program. The six module can have its
"codecs.open" function if codecs.open is removed from Python 3.4.

StreamReader, StreamWriter, StreamReaderEncoder and EncodedFile are not
used in the Python 3 standard library. I tried removing them: except for
the tests in test_codecs which test them directly, the full test suite passes.

Read the issue for more information: http://bugs.python.org/issue8796

Victor


From stephen at xemacs.org  Tue May 24 04:12:35 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 24 May 2011 11:12:35 +0900
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com>
References: <20110521163714.68c5384f@pitrou.net>
	<BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
	<58834.1306112451@parc.com>
	<BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com>
	<BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com>
Message-ID: <87zkmcalt8.fsf@uwakimon.sk.tsukuba.ac.jp>

Tarek Ziadé writes:

 > I have now completed the cleanup and we're back on green-land for the
 > stable bots.

Are you saying you expect Mac OS X 10.4 "Tiger" to go green once the
bots update?  If so, I'm impressed, and "thank you!" to all involved.
Apple and MacPorts have long since washed their hands of that release.


From rdmurray at bitdance.com  Tue May 24 04:50:54 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 23 May 2011 22:50:54 -0400
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <87zkmcalt8.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20110521163714.68c5384f@pitrou.net>
	<BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
	<58834.1306112451@parc.com>
	<BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com>
	<BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com>
	<87zkmcalt8.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20110524025055.4B7E1250042@webabinitio.net>

On Tue, 24 May 2011 11:12:35 +0900, "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> Tarek Ziadé writes:
> 
>  > I have now completed the cleanup and we're back on green-land for the
>  > stable bots.
> 
> Are you saying you expect Mac OS X 10.4 "Tiger" to go green once the
> bots update?  If so, I'm impressed, and "thank you!" to all involved.
> Apple and MacPorts have long since washed their hands of that release.

You will note that Tiger is *not* in the stable set :)

--
R. David Murray           http://www.bitdance.com

From nad at acm.org  Tue May 24 07:03:13 2011
From: nad at acm.org (Ned Deily)
Date: Mon, 23 May 2011 22:03:13 -0700
Subject: [Python-Dev] Stable buildbots update
References: <20110521163714.68c5384f@pitrou.net>
	<BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
	<58834.1306112451@parc.com>
	<BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com>
	<BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com>
	<87zkmcalt8.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <nad-0412B2.22031323052011@news.gmane.org>

In article <87zkmcalt8.fsf at uwakimon.sk.tsukuba.ac.jp>,
 "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> Are you saying you expect Mac OS X 10.4 "Tiger" to go green once the
> bots update?  If so, I'm impressed, and "thank you!" to all involved.
> Apple and MacPorts have long since washed their hands of that release.

OS X 10.4 does have quirks that make it challenging to get all of 
the tests to run without a few corner-case failures but, besides the 
buildbots, I still test regularly with 10.4 and occasionally build 
there, too.  And, FWIW, while top-of-trunk MacPorts may not officially 
support 10.4, many ports work there just fine including python2.6, 2.7, 
and 3.1.  (3.2 has a build issue that may get fixed in 3.2.1).

-- 
 Ned Deily,
 nad at acm.org


From ncoghlan at gmail.com  Tue May 24 07:07:03 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 24 May 2011 15:07:03 +1000
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <4DDAE0C7.9040501@molden.no>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>
	<BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com>
	<BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com>
	<4DDAE0C7.9040501@molden.no>
Message-ID: <BANLkTikwewqji-acMFY4HxzmxK9K3__z2g@mail.gmail.com>

On Tue, May 24, 2011 at 8:33 AM, Sturla Molden <sturla at molden.no> wrote:
> On 24.05.2011 00:07, Artur Siekielski wrote:
>>
>> Oh, and using explicit shared memory or mmap is much harder, because
>> you have to map the whole object graph into bytes.
>
> It sounds like you need PYRO, POSH or multiprocessing's proxy objects.

Indeed. Abstractions over mmap (local machine sharing) and
serialisation (remote sharing) are likely to be far more beneficial in
this area than trying to change the underlying memory model to support
copy-on-write.
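
A rough sketch of the mmap side of that idea, for the simple case of a
flat list of integers rather than a full object graph: the values are
packed into an anonymous mapping once, and readers decode items on demand
without touching any shared Python object headers. pack_ints and read_int
are hypothetical names, and the fixed 4-byte little-endian layout is an
assumption of the example.

```python
import mmap
import struct

ITEM = struct.Struct("<i")   # assumed layout: 4-byte little-endian ints

def pack_ints(values):
    """Copy the values into an anonymous memory mapping."""
    buf = mmap.mmap(-1, max(1, ITEM.size * len(values)))
    for i, value in enumerate(values):
        ITEM.pack_into(buf, i * ITEM.size, value)
    return buf

def read_int(buf, index):
    """Decode one item straight from the mapping."""
    return ITEM.unpack_from(buf, index * ITEM.size)[0]

shared = pack_ints([3, 1, 4, 1, 5])
third = read_int(shared, 2)
```

On Unix, a mapping created before os.fork() stays shared between parent and
children; anything more structured than flat records needs an explicit
layout, which is the harder part raised earlier in the thread.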

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue May 24 07:24:20 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 24 May 2011 15:24:20 +1000
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <1306195729.605.27.camel@marge>
References: <1306195729.605.27.camel@marge>
Message-ID: <BANLkTik64iVohjzvgLC+LvqA4nhVMNUR=g@mail.gmail.com>

On Tue, May 24, 2011 at 10:08 AM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> It's trivial to replace a call to codecs.open() by a call to open(),
> because the two APIs are very close. The main difference is that
> codecs.open() doesn't support universal newline, so you have to use
> open(..., newline='') to keep the same behaviour (keep newlines
> unchanged). This task can be done by 2to3. But I suppose that most
> people will be happy with the universal newline mode.

Is there any reason that codecs.open() can't become a thin wrapper
around builtin open in 3.3?

> I don't see which usecase is not covered by TextIOWrapper. But I know
> some cases which are not supported by StreamReader/StreamWriter.

How API compatible is TextIOWrapper with StreamReader/StreamWriter?
How hard would it be to change them to be adapters over the main IO
machinery rather than independent classes?

Rather than deprecating them, that seems like a more profitable
direction to take them.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From g.brandl at gmx.net  Tue May 24 08:38:19 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 24 May 2011 08:38:19 +0200
Subject: [Python-Dev] cpython: Issue #11377: platform.popen() emits a
	DeprecationWarning
In-Reply-To: <E1QOdQq-0002J4-RT@dinsdale.python.org>
References: <E1QOdQq-0002J4-RT@dinsdale.python.org>
Message-ID: <irfjoe$121$1@dough.gmane.org>

On 24.05.2011 00:17, victor.stinner wrote:
> http://hg.python.org/cpython/rev/e44b851d0a2b
> changeset:   70323:e44b851d0a2b
> parent:      70321:202d973e8bf5
> user:        Victor Stinner <victor.stinner at haypocalc.com>
> date:        Tue May 24 00:16:16 2011 +0200
> summary:
>   Issue #11377: platform.popen() emits a DeprecationWarning

Please see http://mail.python.org/pipermail/python-dev/2011-May/111303.html
about the style of your commit messages. 9a16fa0c9548 is another example.

Georg


From victor.stinner at haypocalc.com  Tue May 24 09:23:38 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 24 May 2011 09:23:38 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
 StreamWriter/StreamReader
In-Reply-To: <BANLkTik64iVohjzvgLC+LvqA4nhVMNUR=g@mail.gmail.com>
References: <1306195729.605.27.camel@marge>
	<BANLkTik64iVohjzvgLC+LvqA4nhVMNUR=g@mail.gmail.com>
Message-ID: <1306221818.2619.6.camel@marge>

On Tuesday 24 May 2011 at 15:24 +1000, Nick Coghlan wrote:
> On Tue, May 24, 2011 at 10:08 AM, Victor Stinner
> <victor.stinner at haypocalc.com> wrote:
> > It's trivial to replace a call to codecs.open() by a call to open(),
> > because the two APIs are very close. The main difference is that
> > codecs.open() doesn't support universal newline, so you have to use
> > open(..., newline='') to keep the same behaviour (keep newlines
> > unchanged). This task can be done by 2to3. But I suppose that most
> > people will be happy with the universal newline mode.
> 
> Is there any reason that codecs.open() can't become a thin wrapper
> around builtin open in 3.3?

Yes, it's trivial to implement codecs.open using:

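import builtins

def open(filename, mode='r', encoding=None, errors='strict',
         buffering=1):
    # Text mode is required here: builtins.open() rejects an encoding
    # in binary mode. newline='' leaves line endings untranslated.
    return builtins.open(filename, mode, buffering,
                         encoding, errors, newline='')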
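
A slightly fuller sketch of the same wrapper idea, assuming callers may
still pass codecs.open()-style mode strings such as 'rb' together with an
encoding. The name codecs_style_open and the round_trip helper are
hypothetical, and this is not the actual codecs implementation.

```python
import builtins
import os
import tempfile

def codecs_style_open(filename, mode="r", encoding=None,
                      errors="strict", buffering=-1):
    if encoding is None:
        # No codec requested: behave like the plain builtin.
        return builtins.open(filename, mode, buffering)
    # builtins.open() rejects an encoding in binary mode, so drop the "b".
    text_mode = mode.replace("b", "")
    return builtins.open(filename, text_mode, buffering,
                         encoding, errors, newline="")

def round_trip(text, encoding="utf-8"):
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with codecs_style_open(path, "wb", encoding=encoding) as f:
            f.write(text)
        with codecs_style_open(path, "rb", encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)

result = round_trip("caf\u00e9\r\nnext\n")
```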

But do we really need two ways to open a file? An extract from "import
this":
"There should be one-- and preferably only one --obvious way to do it."

Another example: Python 3.2 has subprocess.Popen, os.popen and
platform.popen to open a subprocess. platform.popen is now deprecated in
Python 3.3. Well, it's already better than Python 2.5 which has
os.popen(), os.popen2(), os.popen3(), os.popen4(), os.spawnl(),
os.spawnle(), os.spawnlp(), os.spawnlpe(), os.spawnv(), os.spawnve(),
os.spawnvp(), os.spawnvpe(), subprocess.Popen, platform.popen and maybe
others :-)

> How API compatible is TextIOWrapper with StreamReader/StreamWriter?

It's fully compatible.

> How hard would it be to change them to be adapters over the main IO
> machinery rather than independent classes?

I don't understand your proposal. We don't need StreamReader and
StreamWriter to open a stream as a text file, only incremental decoders
and encoders. Why do you want to keep them?

Victor


From mal at egenix.com  Tue May 24 10:03:22 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 24 May 2011 10:03:22 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <1306195729.605.27.camel@marge>
References: <1306195729.605.27.camel@marge>
Message-ID: <4DDB664A.7050705@egenix.com>

Victor Stinner wrote:
> Hi,
> 
> In Python 2, codecs.open() is the best way to read and/or write files
> using Unicode. But in Python 3, open() is preferred with its fast io
> module. I would like to deprecate codecs.open() because it can be
> replaced by open() and io.TextIOWrapper. I would like your opinion and
> that's why I'm writing this email.

I think you should have moved this part of your email
further up, since it explains the reason why this idea was
rejected for now:

> I opened an issue for this idea. Brett and Marc-Andre Lemburg don't
> want to deprecate codecs.open() & friends because they want to be able
> to write code working on Python 2 and on Python 3 without any change. I
> don't think it's realistic: nontrivial programs require at least the six
> module, and most likely the 2to3 program. The six module can have its
> "codecs.open" function if codecs.open is removed from Python 3.4.

And now for something completely different:

> codecs.open() and StreamReader, StreamWriter and StreamReaderWriter
> classes of the codecs module don't support universal newlines, still
> have some issues with stateful codecs (like UTF-16/32 BOMs), and each
> codec has to implement a StreamReader and a StreamWriter class.
> 
> StreamReader and StreamWriter are stateless codecs (no reset() or
> setstate() method), and so it's not possible to write a generic fix for
> all child classes in the codecs module. Each stateful codec has to
> handle special cases like seek() problems. For example, UTF-16 codec
> duplicates some IncrementalEncoder/IncrementalDecoder code into its
> StreamWriter/StreamReader class.

Please read PEP 100 regarding StreamReader and StreamWriter.
Those codecs parts were explicitly designed to be stateful,
unlike the stateless encoder/decoder methods.

Please read my reply on the ticket:

"""
StreamReader and StreamWriter classes provide the base codec
implementations for stateful interaction with streams. They
define the interface and provide a working implementation for
those codecs that choose not to implement their own variants.

Each codec can, however, implement variants which are optimized
for the specific encoding or intercept certain stream methods
to add functionality or improve the encoding/decoding
performance.

Both are essential parts of the codec interface.

TextIOWrapper and StreamReaderWriter are merely wrappers
around streams that make use of the codecs. They don't
provide any codec logic themselves. That's the conceptual
difference.
"""

> The io module is well tested, supports non-seekable streams, handles
> correctly corner-cases (like UTF-16/32 BOMs) and supports any kind of
> newlines including an "universal newline" mode. TextIOWrapper reuses
> incremental encoders and decoders, so BOM issues were fixed only once,
> in TextIOWrapper.
> 
> It's trivial to replace a call to codecs.open() by a call to open(),
> because the two API are very close. The main different is that
> codecs.open() doesn't support universal newline, so you have to use
> open(..., newline='') to keep the same behaviour (keep newlines
> unchanged). This task can be done by 2to3. But I suppose that most
> people will be happy with the universal newline mode.
> 
> I don't see which usecase is not covered by TextIOWrapper. But I know
> some cases which are not supported by StreamReader/StreamWriter.

This is a misunderstanding of the concepts behind the two.

StreamReader and StreamWriters are implemented by the codecs,
they are part of the API that each codec has to provide in order
to register in the Python codecs system. Their purpose is
to provide a stateful interface and work efficiently and
directly on streams rather than buffers.

Here's my reply from the ticket regarding using incremental
encoders/decoders for the StreamReader/Writer parts of the
codec set of APIs:

"""
The point about having them use incremental codecs for encoding and
decoding is a good one and would need to be investigated. If possible,
we could use incremental encoders/decoders for the standard
StreamReader/Writer base classes or add new
IncrementalStreamReader/Writer classes which then use the
IncrementalEncoder/Decoder by default.

Please open a new ticket for this.
"""

> StreamReader, StreamWriter, StreamReaderEncoder and EncodedFile are not
> used in the Python 3 standard library. I tried removed them: except
> tests of test_codecs which test them directly, the full test suite pass.
>
> Read the issue for more information: http://bugs.python.org/issue8796

-- 
Marc-Andre Lemburg
eGenix.com


From vinay_sajip at yahoo.co.uk  Tue May 24 10:16:03 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Tue, 24 May 2011 08:16:03 +0000 (UTC)
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
References: <1306195729.605.27.camel@marge>
Message-ID: <loom.20110524T095820-780@post.gmane.org>

Victor Stinner <victor.stinner <at> haypocalc.com> writes:

> I opened an issue for this idea. Brett and Marc-Andre Lemburg don't
> want to deprecate codecs.open() & friends because they want to be able
> to write code working on Python 2 and on Python 3 without any change. I
> don't think it's realistic: nontrivial programs require at least the six
> module, and most likely the 2to3 program. The six module can have its
> "codecs.open" function if codecs.open is removed from Python 3.4.

What's "non-trivial"? Both pip and virtualenv (widely used programs) were ported
to Python 3 using a single codebase for 2.x and 3.x, because it seemed to
involve the least ongoing maintenance burden. Though these particular programs
don't use codecs.open, I don't see much value in making it harder to write
programs which can run under both 2.x and 3.x; that's not going to speed
adoption of 3.x.

I find 2to3 very useful indeed for showing where changes may need to be made for
2.x/3.x portability, but do not use it as an automatic conversion tool. The six
module is very useful, too, but some projects won't necessarily want to add it
as an additional dependency, and will instead reimplement just the parts they
need from that bag of tricks.

So I would also want to keep codecs.open() and friends, at least for now -
though it seems to make sense to implement them as wrappers (as Nick
suggested).

Regards,

Vinay Sajip



From victor.stinner at haypocalc.com  Tue May 24 10:31:50 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 24 May 2011 10:31:50 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
 StreamWriter/StreamReader
In-Reply-To: <loom.20110524T095820-780@post.gmane.org>
References: <1306195729.605.27.camel@marge>
	<loom.20110524T095820-780@post.gmane.org>
Message-ID: <1306225910.2619.12.camel@marge>

On Tuesday 24 May 2011 at 08:16 +0000, Vinay Sajip wrote:
> So I would also want to keep codecs.open() and friends, at least for now

Well, I would agree to keep codecs.open() (if we patch it to reuse
TextIOWrapper and add a note to say that it is kept for backward
compatibility and open() should be preferred in Python 3), but deprecate
StreamReader, StreamWriter and EncodedFile.

As I wrote, codecs.open() is useful in Python 2. But I don't know any
program or library that uses StreamReader or StreamWriter directly.

I found some projects (ex: twisted-mail, feeds2imap, pyflag, pygsm, ...)
implementing their own Python codec (cool!) and each codec has its own
StreamReader and StreamWriter classes, but I don't think that these
classes are used.

Victor


From victor.stinner at haypocalc.com  Tue May 24 10:58:54 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 24 May 2011 10:58:54 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
 StreamWriter/StreamReader
In-Reply-To: <4DDB664A.7050705@egenix.com>
References: <1306195729.605.27.camel@marge>  <4DDB664A.7050705@egenix.com>
Message-ID: <1306227534.2619.34.camel@marge>

On Tuesday 24 May 2011 at 10:03 +0200, M.-A. Lemburg wrote:
> Please read PEP 100 regarding StreamReader and StreamWriter.
> Those codecs parts were explicitly designed to be stateful,
> unlike the stateless encoder/decoder methods.

Yes, it is possible to implement stateful StreamReader and StreamWriter
classes and we have such codecs (I gave the example of UTF-16), but the
state is not exposed (getstate / setstate), and so it's not possible to
write generic code to handle the codec state in the base StreamReader
and StreamWriter classes. io.TextIOWrapper requires encoder.setstate(0)
for example.
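
A sketch of what exposed state looks like on the incremental decoder side,
using the UTF-16 codec mentioned above. The input is cut in the middle of a
code unit, the state is copied to a fresh decoder with
getstate()/setstate(), and decoding resumes there. The variable names are
illustrative only.

```python
import codecs

data = "hi".encode("utf-16")   # BOM followed by two 2-byte code units

make_decoder = codecs.getincrementaldecoder("utf-16")

first = make_decoder()
# Feed the BOM plus half of the first code unit: nothing can be emitted yet.
out1 = first.decode(data[:3])
state = first.getstate()       # (pending bytes, codec-specific flag)

# A brand-new decoder can pick up exactly where the first one stopped.
second = make_decoder()
second.setstate(state)
out2 = second.decode(data[3:], final=True)

text = out1 + out2
```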

> Each codec can, however, implement variants which are optimized
> for the specific encoding or intercept certain stream methods
> to add functionality or improve the encoding/decoding
> performance.

Can you give me some examples?

> TextIOWrapper and StreamReaderWriter are merely wrappers
> around streams that make use of the codecs. They don't
> provide any codec logic themselves. That's the conceptual
> difference.
> ...
> StreamReader and StreamWriters ... work efficiently and
> directly on streams rather than buffers.

StreamReader, StreamWriter, TextIOWrapper and StreamReaderWriter all
have a file-like API: tell(), seek(), read(),  readline(), write(), etc.
The implementation is maybe different, but the API is just the same, and
so the usecases are just the same.

I don't see in which case I should use StreamReader or StreamWriter
instead of TextIOWrapper. I thought that TextIOWrapper was specific to
files on disk, but it is already used for other purposes, like sockets.

> Here's my reply from the ticket regarding using incremental
> encoders/decoders for the StreamReader/Writer parts of the
> codec set of APIs:
> 
> """
> The point about having them use incremental codecs for encoding and
> decoding is a good one and would need to be investigated. If possible,
> we could use incremental encoders/decoders for the standard
> StreamReader/Writer base classes or add new
> IncrementalStreamReader/Writer classes which then use the
> IncrementalEncoder/Decoder by default.

Why do you want to write a duplicate feature? TextIOWrapper is already
here, it's working and widely used.

I am working on codec issues (like CJK encodings, see #12100, #12057,
#12016) and I would like to remove StreamReader and StreamWriter to have
*less* code to maintain.

If you want to add more code, will you be available to maintain it? It
looks like you are busy; some people (not me ;-)) are still waiting for
.transform()/.untransform()!

Victor


From solipsis at pitrou.net  Tue May 24 11:56:55 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 24 May 2011 11:56:55 +0200
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com>
References: <20110521163714.68c5384f@pitrou.net>
	<BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
	<58834.1306112451@parc.com>
	<BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com>
	<BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com>
Message-ID: <20110524115655.65030e15@pitrou.net>

On Mon, 23 May 2011 19:16:36 +0200
Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> 
> I have now completed the cleanup and we're back on green-land for the
> stable bots.
> 
> The red slaves should get green when they catch up with the latest rev
> (they are slow). If they're not and they are failing in packaging or
> sysconfig let me know.
> 
> Sorry again if it has taken so long. Setting up Solaris and BSD VMs
> took some time ;)

Thank you very much! What a beautiful sight this is:
http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable

(until a sporadic failure comes up, that is)

Regards

Antoine.

From artur.siekielski at gmail.com  Tue May 24 11:55:32 2011
From: artur.siekielski at gmail.com (Artur Siekielski)
Date: Tue, 24 May 2011 11:55:32 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <4DDAE0C7.9040501@molden.no>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>
	<BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com>
	<BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com>
	<4DDAE0C7.9040501@molden.no>
Message-ID: <BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com>

2011/5/24 Sturla Molden <sturla at molden.no>:
>> Oh, and using explicit shared memory or mmap is much harder, because
>> you have to map the whole object graph into bytes.
>
> It sounds like you need PYRO, POSH or multiprocessing's proxy objects.

PYRO/multiprocessing proxies aren't a comparable solution because of
ORDERS OF MAGNITUDE worse performance. You are comparing direct memory
access with serialization/message passing through sockets/pipes.

POSH might be good, but the project has been dead for 8 years. And this
copy-on-write is nice because you don't need changes/restrictions to
your code, or a special garbage collector.


Artur

From solipsis at pitrou.net  Tue May 24 12:06:01 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 24 May 2011 12:06:01 +0200
Subject: [Python-Dev] "streams" vs "buffers"
References: <1306195729.605.27.camel@marge>
	<4DDB664A.7050705@egenix.com>
Message-ID: <20110524120601.32de673a@pitrou.net>

On Tue, 24 May 2011 10:03:22 +0200
"M.-A. Lemburg" <mal at egenix.com> wrote:
> 
> StreamReader and StreamWriters are implemented by the codecs,
> they are part of the API that each codec has to provide in order
> to register in the Python codecs system. Their purpose is
> to provide a stateful interface and work efficiently and
> directly on streams rather than buffers.

I think you are trying to make a conceptual distinction which doesn't
exist in practice. Your OS uses buffers to represent "streams" to you.

Also, how come StreamReader has internal members named "bytebuffer",
"charbuffer" and "linebuffer"?
There certainly seems to be some (non-trivial) amount of buffering
going on there, and probably quite slow and inefficient since it's pure
Python (TextIOWrapper is written in C).

Regards

Antoine.



From mal at egenix.com  Tue May 24 12:14:10 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 24 May 2011 12:14:10 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <1306227534.2619.34.camel@marge>
References: <1306195729.605.27.camel@marge> <4DDB664A.7050705@egenix.com>
	<1306227534.2619.34.camel@marge>
Message-ID: <4DDB84F2.40106@egenix.com>

Victor Stinner wrote:
> On Tuesday 24 May 2011 at 10:03 +0200, M.-A. Lemburg wrote:
>> Please read PEP 100 regarding StreamReader and StreamWriter.
>> Those codecs parts were explicitly designed to be stateful,
>> unlike the stateless encoder/decoder methods.
> 
> Yes, it is possible to implement stateful StreamReader and StreamWriter
> classes and we have such codecs (I gave the example of UTF-16), but the
> state is not exposed (getstate / setstate), and so it's not possible to
> write generic code to handle the codec state in the base StreamReader
> and StreamWriter classes. io.TextIOWrapper requires encoder.setstate(0)
> for example.

So instead of always suggesting to deprecate everything,
how about you come up with a proposal to add meaningful
new methods to those base classes ?

>> Each codec can, however, implement variants which are optimized
>> for the specific encoding or intercept certain stream methods
>> to add functionality or improve the encoding/decoding
>> performance.
> 
> Can you give me some examples?

See the UTF-16 codec in the stdlib for example. This uses
some of the available possibilities to interpret the BOM
and then switches the encoder/decoder methods accordingly.

A lot more could be done for other variable length encoding
codecs, e.g. UTF-8, since these often have problems near
the end of a read due to missing bytes.

The base class implementation provides a general purpose
implementation to cover the case, but it's not efficient,
since it doesn't know anything about the encoding
characteristics.

Such an implementation would have to be done per codec
and that's why we have per codec StreamReader/Writer
APIs.
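
To illustrate the "missing bytes near the end of a read" problem, here is
a minimal sketch. It uses the stdlib incremental decoder rather than a
per-codec StreamReader, and the chunk boundary is chosen artificially for
the example:

```python
import codecs

# "é" is two bytes in UTF-8; pretend a read() returned only the first one.
data = "é".encode("utf-8")
chunk1, chunk2 = data[:1], data[1:]

decoder = codecs.getincrementaldecoder("utf-8")()
first = decoder.decode(chunk1)    # incomplete sequence is buffered, nothing emitted
second = decoder.decode(chunk2)   # the sequence completes with the next chunk

# A stateless decode of the truncated chunk fails instead.
try:
    chunk1.decode("utf-8")
    stateless_ok = True
except UnicodeDecodeError:
    stateless_ok = False
```

The stateful decoder keeps the dangling byte between calls, which is the
behaviour any stream-oriented reader needs near the end of a read.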

>> TextIOWrapper and StreamReaderWriter are merely wrappers
>> around streams that make use of the codecs. They don't
>> provide any codec logic themselves. That's the conceptual
>> difference.
>> ...
>> StreamReader and StreamWriters ... work efficiently and
>> directly on streams rather than buffers.
> 
> StreamReader, StreamWriter, TextIOWrapper and StreamReaderWriter all
> have a file-like API: tell(), seek(), read(),  readline(), write(), etc.
> The implementation is maybe different, but the API is just the same, and
> so the usecases are just the same.
> 
> I don't see in which case I should use StreamReader or StreamWriter
> instead of TextIOWrapper. I thought that TextIOWrapper is specific to files
> on disk, but TextIOWrapper is already used for other usages like
> sockets.

I have no idea why TextIOWrapper was added to the stdlib
instead of making StreamReaderWriter more capable,
since StreamReaderWriter had already been available in Python
since Python 1.6 (and this is being used by codecs.open()).

Perhaps we should deprecate TextIOWrapper instead and
replace it with codecs.StreamReaderWriter ? ;-)

Seriously, I don't see use of TextIOWrapper as an argument
for removing StreamReader/Writer parts of the codecs API.

>> Here's my reply from the ticket regarding using incremental
>> encoders/decoders for the StreamReader/Writer parts of the
>> codec set of APIs:
>>
>> """
>> The point about having them use incremental codecs for encoding and
>> decoding is a good one and would
>> need to be investigated. If possible, we could use incremental
>> encoders/decoders for the standard
>> StreamReader/Writer base classes or add new
>> IncrementalStreamReader/Writer classes which then use
>> the IncrementalEncode/Decoder per default.
> 
> Why do you want to write a duplicate feature? TextIOWrapper is already
> here, it's working and widely used.

See above and please also try to understand why we have per-codec
implementations for streams. I'm tired of repeating myself.

I would much prefer to see the codec-specific functionality
in TextIOWrapper added back to the codecs where it
belongs.

> I am working on codec issues (like CJK encodings, see #12100, #12057,
> #12016) and I would like to remove StreamReader and StreamWriter to have
> *less* code to maintain.
>
> If you want to add more code, will you be available to maintain it? It looks
> like you are busy, some people (not me ;-)) are still
> waiting for .transform()/.untransform()!

I dropped the ball on the idea after the strong wave of
comments against those methods. People will simply have
to use codecs.encode() and codecs.decode().

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 24 2011)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2011-06-20: EuroPython 2011, Florence, Italy               27 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From ncoghlan at gmail.com  Tue May 24 12:25:11 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 24 May 2011 20:25:11 +1000
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <1306227534.2619.34.camel@marge>
References: <1306195729.605.27.camel@marge> <4DDB664A.7050705@egenix.com>
	<1306227534.2619.34.camel@marge>
Message-ID: <BANLkTi=pqLU9kXmr6Kj7o36x7LuUO=Y3Cg@mail.gmail.com>

On Tue, May 24, 2011 at 6:58 PM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> StreamReader, StreamWriter, TextIOWrapper and StreamReaderWriter all
> have a file-like API: tell(), seek(), read(), readline(), write(), etc.
> The implementation is maybe different, but the API is just the same, and
> so the usecases are just the same.
>
> I don't see in which case I should use StreamReader or StreamWriter
> instead of TextIOWrapper. I thought that TextIOWrapper is specific to files
> on disk, but TextIOWrapper is already used for other usages like
> sockets.

Back up a step here. It's important to remember that the codecs module
*long* predates the existence of the Python 3 I/O model and the io
module in particular.

Just as PEP 302 defines how module importers should be written, PEP
100 defines how text codecs should be written (i.e. in terms of
StreamReader and StreamWriter).

PEP 3116 then defines how such codecs can be used as part of the
overall I/O stack as redesigned for Python 3.

Now, there may be an opportunity here to rationalise things a bit and
re-use the *new* io module interfaces as the basis for an updated
codec API PEP, but we shouldn't be hasty in deprecating an old API
that is about "how to write codecs" just because it is similar to a
shiny new one that is about "how to process I/O data".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue May 24 12:27:35 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 24 May 2011 20:27:35 +1000
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <20110524115655.65030e15@pitrou.net>
References: <20110521163714.68c5384f@pitrou.net>
	<BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
	<58834.1306112451@parc.com>
	<BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com>
	<BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com>
	<20110524115655.65030e15@pitrou.net>
Message-ID: <BANLkTi=Vb7VkMAbDr-cfJyU5Vh56J6O+6A@mail.gmail.com>

On Tue, May 24, 2011 at 7:56 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> Thank you very much! What a beautiful sight this is:
> http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable
>
> (until a sporadic failure comes up, that is)

I could turn test_crashers back on if you like ;)

Great work to all involved in tidying things up post-merge!

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From walter at livinglogic.de  Tue May 24 12:16:49 2011
From: walter at livinglogic.de (Walter Dörwald)
Date: Tue, 24 May 2011 12:16:49 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <1306195729.605.27.camel@marge>
References: <1306195729.605.27.camel@marge>
Message-ID: <4DDB8591.2060308@livinglogic.de>

On 24.05.11 02:08, Victor Stinner wrote:

> [...]
> codecs.open() and StreamReader, StreamWriter and StreamReaderWriter
> classes of the codecs module don't support universal newlines, still
> have some issues with stateful codecs (like UTF-16/32 BOMs), and each
> codec has to implement a StreamReader and a StreamWriter class.
> 
> StreamReader and StreamWriter are stateless codecs (no reset() or
> setstate() method),

They *are* stateful, they just don't expose their state to the public.
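
A minimal sketch of that difference, assuming the stdlib UTF-16 codec:
the StreamReader tracks the BOM internally but offers no getstate(),
while the incremental decoder exposes the same kind of state.

```python
import codecs
import io

data = "hi".encode("utf-16")          # BOM followed by two code units

# StreamReader: consumes the BOM and remembers the byte order internally.
reader = codecs.getreader("utf-16")(io.BytesIO(data))
text = reader.read()
reader_has_getstate = hasattr(reader, "getstate")

# Incremental decoder: similar internal state, but with getstate()/setstate().
decoder = codecs.getincrementaldecoder("utf-16")()
decoder.decode(data[:2])              # feed only the BOM
decoder_has_getstate = hasattr(decoder, "getstate")
```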

> and so it's not possible to write a generic fix for
> all child classes in the codecs module. Each stateful codec has to
> handle special cases like seek() problems.

Yes, which in theory makes it possible to implement shortcuts for
certain codecs (e.g. the UTF-32-BE/LE codecs could simply multiply the
character position by 4 to get the byte position). However AFAICR none
of the readers/writers does that.
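
As a sketch of the shortcut meant here (the helper name is hypothetical,
nothing like it exists in the codecs module): in BOM-less UTF-32 every
character occupies exactly four bytes, so the byte position is a simple
multiplication.

```python
def utf32_byte_offset(char_pos):
    """Hypothetical helper: byte offset of a character in BOM-less UTF-32."""
    return 4 * char_pos

text = "héllo wörld"
raw = text.encode("utf-32-be")

# Slice out the second character directly by byte position.
start, end = utf32_byte_offset(1), utf32_byte_offset(2)
second_char = raw[start:end].decode("utf-32-be")
```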

> For example, UTF-16 codec
> duplicates some IncrementalEncoder/IncrementalDecoder code into its
> StreamWriter/StreamReader class.

Actually it's the other way round: When I implemented the incremental
codecs, I copied code from the StreamReader/StreamWriter classes.

> The io module is well tested, supports non-seekable streams, correctly
> handles corner cases (like UTF-16/32 BOMs) and supports any kind of
> newlines including a "universal newline" mode. TextIOWrapper reuses
> incremental encoders and decoders, so BOM issues were fixed only once,
> in TextIOWrapper.
> 
> It's trivial to replace a call to codecs.open() by a call to open(),
> because the two APIs are very close. The main difference is that
> codecs.open() doesn't support universal newlines, so you have to use
> open(..., newline='') to keep the same behaviour (keep newlines
> unchanged). This task can be done by 2to3. But I suppose that most
> people will be happy with the universal newline mode.
> 
> I don't see which usecase is not covered by TextIOWrapper. But I know
> some cases which are not supported by StreamReader/StreamWriter.

This could be partially fixed by implementing generic
StreamReader/StreamWriter classes that reuse the incremental codecs, but
I don't think that's worth it.

> [...] 

Servus,
   Walter

From solipsis at pitrou.net  Tue May 24 12:39:29 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 24 May 2011 12:39:29 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
References: <1306195729.605.27.camel@marge> <4DDB664A.7050705@egenix.com>
	<1306227534.2619.34.camel@marge>
	<BANLkTi=pqLU9kXmr6Kj7o36x7LuUO=Y3Cg@mail.gmail.com>
Message-ID: <20110524123929.42dd91ef@pitrou.net>

On Tue, 24 May 2011 20:25:11 +1000
Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> Just as PEP 302 defines how module importers should be written, PEP
> 100 defines how text codecs should be written (i.e. in terms of
> StreamReader and StreamWriter).
> 
> PEP 3116 then defines how such codecs can be used as part of the
> overall I/O stack as redesigned for Python 3.

The I/O stack doesn't use StreamReader and StreamWriter. That's the
whole point. Stream* have been made useless by the new I/O stack.

> Now, there may be an opportunity here to rationalise things a bit and
> re-use the *new* io module interfaces as the basis for an updated
> codec API PEP, but we shouldn't be hasty in deprecating an old API
> that is about "how to write codecs" just because it is similar to a
> shiny new one that is about "how to process I/O data".

OK, can you explain the difference to us, concretely?

Thanks

Antoine.



From lukasz at langa.pl  Tue May 24 12:42:44 2011
From: lukasz at langa.pl (Łukasz Langa)
Date: Tue, 24 May 2011 12:42:44 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <4DDB8591.2060308@livinglogic.de>
References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de>
Message-ID: <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl>


Message written by Walter Dörwald on 2011-05-24, at 12:16:

>> I don't see which usecase is not covered by TextIOWrapper. But I know
>> some cases which are not supported by StreamReader/StreamWriter.
> 
> This could be partially fixed by implementing generic
> StreamReader/StreamWriter classes that reuse the incremental codecs, but
> I don't think that's worth it.

Why not?

-- 
Best regards,
Łukasz Langa
Senior Systems Architecture Engineer

IT Infrastructure Department
Grupa Allegro Sp. z o.o.

From victor.stinner at haypocalc.com  Tue May 24 12:50:28 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 24 May 2011 12:50:28 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
 StreamWriter/StreamReader
In-Reply-To: <loom.20110524T095820-780@post.gmane.org>
References: <1306195729.605.27.camel@marge>
	<loom.20110524T095820-780@post.gmane.org>
Message-ID: <1306234228.2619.44.camel@marge>

On Tuesday 24 May 2011 at 08:16 +0000, Vinay Sajip wrote:
> > I opened an issue for this idea. Brett and Marc-Andre Lemburg don't
> > want to deprecate codecs.open() & friends because they want to be able
> > to write code working on Python 2 and on Python 3 without any change. I
> > don't think it's realistic: nontrivial programs require at least the six
> > module, and most likely the 2to3 program. The six module can have its
> > "codecs.open" function if codecs.open is removed from Python 3.4.
> 
> What's "non-trivial"? Both pip and virtualenv (widely used programs) were ported
> to Python 3 using a single codebase for 2.x and 3.x, because it seemed to
> involve the least ongoing maintenance burden. Though these particular programs
> don't use codecs.open, I don't see much value in making it harder to write
> programs which can run under both 2.x and 3.x; that's not going to speed
> adoption of 3.x.

pip has a pip.backwardcompat module which is very similar to six. If
codecs.open() is deprecated or removed, it will be trivial to add a
wrapper for codecs.open() or open() to six and pip.backwardcompat.
virtualenv.py also starts with a thin compatibility layer.

But yes, each program using a compatibility layer/module will have to be
updated.
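
A minimal sketch of such a wrapper, assuming io.open() is acceptable on
both lines (it exists since Python 2.6 and is the builtin open() on
Python 3). The name open_text is hypothetical; it is not taken from six
or pip.backwardcompat:

```python
import io
import os
import tempfile

def open_text(path, mode="r", encoding="utf-8"):
    """Hypothetical compatibility helper, not part of six or pip.

    newline="" keeps line endings untranslated, like codecs.open().
    """
    return io.open(path, mode, encoding=encoding, newline="")

# Round trip: the mixed line endings must come back unchanged.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open_text(path, "w") as f:
        f.write(u"a\r\nb\n")
    with open_text(path) as f:
        content = f.read()
finally:
    os.remove(path)
```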

Victor


From solipsis at pitrou.net  Tue May 24 12:54:53 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 24 May 2011 12:54:53 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de>
Message-ID: <20110524125453.4b20107b@pitrou.net>

On Tue, 24 May 2011 12:16:49 +0200
Walter Dörwald <walter at livinglogic.de> wrote:
> 
> > and so it's not possible to write a generic fix for
> > all child classes in the codecs module. Each stateful codec has to
> > handle special cases like seek() problems.
> 
> Yes, which in theory makes it possible to implement shortcuts for
> certain codecs (e.g. the UTF-32-BE/LE codecs could simply multiply the
> character position by 4 to get the byte position). However AFAICR none
> of the readers/writers does that.

And in practice, TextIOWrapper.tell() does a similar optimization in
a generic way. I'm linking to the Python implementation for readability:
http://hg.python.org/cpython/file/5c716437a83a/Lib/_pyio.py#l1741

TextIOWrapper.seek() is straightforward due to the structure of the
integer "cookie" returned by TextIOWrapper.tell().
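
A small sketch of the tell()/seek() round trip, using an in-memory UTF-16
stream as a stand-in for a real file. The cookie is treated as opaque
here; its internal layout is deliberately not relied upon:

```python
import io

raw = io.BytesIO("hello wörld".encode("utf-16"))
stream = io.TextIOWrapper(raw, encoding="utf-16")

head = stream.read(2)        # the decoder has consumed the BOM by now
cookie = stream.tell()       # opaque integer, not a character index
rest = stream.read()

stream.seek(cookie)          # restores byte position and decoder state
again = stream.read()
```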

In practice, TextIOWrapper gets much more love than
Stream{Reader,Writer} because it's an essential part of the new I/O
stack. As Victor said, problems which Stream* have had for years are
solved neatly in TextIOWrapper.

Therefore, leaving Stream{Reader,Writer} in is not a matter of "choice"
and "freedom given to users". It's giving people the misleading
possibility of using non-optimized, poorly debugged, less featureful
implementations of the same basic idea (a Unicode stream abstraction).

Regards

Antoine.



From victor.stinner at haypocalc.com  Tue May 24 12:58:01 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 24 May 2011 12:58:01 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
 StreamWriter/StreamReader
In-Reply-To: <A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl>
References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de>
	<A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl>
Message-ID: <1306234681.2619.45.camel@marge>

On Tuesday 24 May 2011 at 12:42 +0200, Łukasz Langa wrote:
> Message written by Walter Dörwald on 2011-05-24, at 12:16:
> 
> >> I don't see which usecase is not covered by TextIOWrapper. But I know
> >> some cases which are not supported by StreamReader/StreamWriter.
> > 
> > This could be partially fixed by implementing generic
> > StreamReader/StreamWriter classes that reuse the incremental codecs, but
> > I don't think that's worth it.
> 
> Why not?

We already have an implementation of this idea, it is called
io.TextIOWrapper.

Victor


From fijall at gmail.com  Tue May 24 13:31:38 2011
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Tue, 24 May 2011 13:31:38 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
Message-ID: <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>

On Sun, May 22, 2011 at 1:57 AM, Artur Siekielski
<artur.siekielski at gmail.com> wrote:
> Hi.
> The problem with reference counters is that they are very often
> incremented/decremented, even for read-only algorithms (like traversal
> of a list). It has two drawbacks:
> 1. CPU cache lines (64 bytes on X86) containing a beginning of a
> PyObject are very often invalidated, resulting in losing many chances
> to use the CPU caches

Not sure exactly what scenario you are discussing here, but storing
reference counts outside of objects has (at least on a single
processor) worse cache locality than inside objects.

>
> However the drawback is that such design introduces a new level of
> indirection which is a pointer inside a PyObject instead of a direct
> value. Also it seems that the "block" with refcounts would have to be
> a non-trivial data structure.

That would almost certainly be slower for most use cases, except for
the copy-on-write fork. I guess recycler papers might be an
interesting read:
http://www.research.ibm.com/people/d/dfb/recycler.html

This is the best reference-counting GC I'm aware of.

>
> I'm not a compiler/profiling expert so the main question is if such
> design can work, and maybe someone was thinking about something
> similar? And if CPython was profiled for CPU cache usage?

CPython was not designed for CPU cache usage as far as I'm aware.

In my (heavily biased) view, PyPy is a far better platform
to perform such experiments (and PyPy has been profiled for CPU cache
usage). The main advantage is that you can code your GC without the
need to modify the interpreter. On the other hand you obviously don't
get benefits on CPython, but maybe it's worth experimenting.

Cheers,
fijal

From walter at livinglogic.de  Tue May 24 14:01:43 2011
From: walter at livinglogic.de (Walter Dörwald)
Date: Tue, 24 May 2011 14:01:43 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <1306234681.2619.45.camel@marge>
References: <1306195729.605.27.camel@marge>
	<4DDB8591.2060308@livinglogic.de>	<A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl>
	<1306234681.2619.45.camel@marge>
Message-ID: <4DDB9E27.7040605@livinglogic.de>

On 24.05.11 12:58, Victor Stinner wrote:
> On Tuesday 24 May 2011 at 12:42 +0200, Łukasz Langa wrote:
>> Message written by Walter Dörwald on 2011-05-24, at 12:16:
>>
>>>> I don't see which usecase is not covered by TextIOWrapper. But I know
>>>> some cases which are not supported by StreamReader/StreamWriter.
>>>
>>> This could be partially fixed by implementing generic
>>> StreamReader/StreamWriter classes that reuse the incremental codecs, but
>>> I don't think that's worth it.
>>
>> Why not?
> 
> We already have an implementation of this idea, it is called
> io.TextIOWrapper.

Exactly.

In another post, Victor wrote:

> As I wrote, codecs.open() is useful in Python 2. But I don't know any
> program or library using directly StreamReader or StreamWriter.

So: implementing this is a lot of work, duplicates existing
functionality and is mostly unused.

Servus,
   Walter





From stefan_ml at behnel.de  Tue May 24 14:05:26 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 24 May 2011 14:05:26 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
	outside of objects
In-Reply-To: <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>
Message-ID: <irg6u7$cu$1@dough.gmane.org>

Maciej Fijalkowski, 24.05.2011 13:31:
> CPython was not designed for CPU cache usage as far as I'm aware.

That's a pretty bold statement to make on this list. Even if it wasn't 
originally "designed" for (efficient?) CPU cache usage, it's certainly been 
around for long enough to have received numerous performance tweaks in that 
regard.

I doubt that efficient CPU cache usage was a major design goal of PyPy 
right from the start. IMHO, the project has changed its objectives way too 
many times to claim something like that, especially at the low level where 
the CPU cache becomes relevant. I remember that not so long ago, PyPy was 
hugely memory hungry compared to CPython. Although, one could certainly 
call *that* "designed for CPU cache usage"... ;)

Stefan


From sturla at molden.no  Tue May 24 14:08:14 2011
From: sturla at molden.no (Sturla Molden)
Date: Tue, 24 May 2011 14:08:14 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>	<BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>	<BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com>	<BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com>	<4DDAE0C7.9040501@molden.no>
	<BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com>
Message-ID: <4DDB9FAE.5060205@molden.no>

On 24.05.2011 11:55, Artur Siekielski wrote:
>
> PYRO/multiprocessing proxies aren't a comparable solution because their
> performance is ORDERS OF MAGNITUDE worse. You are comparing direct memory
> access with serialization/message passing through sockets/pipes.

The bottleneck is likely the serialization, but only if you serialize 
large objects. IPC is always very fast, at least on localhost.

Just out of curiosity, have you considered using a database? Sqlite and 
BSD DB can even be put in shared memory if you want. It sounds like you 
are trying to solve a database problem using os.fork, something which is 
more or less doomed to fail (i.e. you have to replicate all effort put 
into scaling up databases). If a database is too slow, I am rather sure 
you need something else than Python as well.

Sturla

From sturla at molden.no  Tue May 24 14:25:59 2011
From: sturla at molden.no (Sturla Molden)
Date: Tue, 24 May 2011 14:25:59 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>
Message-ID: <4DDBA3D7.9060807@molden.no>

On 24.05.2011 13:31, Maciej Fijalkowski wrote:
>
> Not sure exactly what scenario you are discussing here, but storing
> reference counts outside of objects has (at least on a single
> processor) worse cache locality than inside objects.
>

Artur Siekielski is not talking about cache locality, but copy-on-write 
fork on Linux et al.

When reference counts are updated after forking, memory pages marked 
copy-on-write are copied if they store reference counts. And then he 
quickly runs out of memory. He wants to put reference counts and 
PyObjects in different pages, so only the pages with reference counts 
get copied.
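
A minimal, CPython-specific sketch of why even read-only code writes to
object memory: merely binding another name to an object changes the
reference count stored in the object header, and after fork() that write
is what dirties the copy-on-write page.

```python
import sys

obj = object()
before = sys.getrefcount(obj)
alias = obj                      # a purely "read-only" use: just another name
after = sys.getrefcount(obj)     # the count kept inside the object has changed
```

The absolute values are an implementation detail; only the difference
between the two readings matters here.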

I don't think he cares about cache locality at all, but the rest of us 
do :-)


Sturla






From sturla at molden.no  Tue May 24 14:31:47 2011
From: sturla at molden.no (Sturla Molden)
Date: Tue, 24 May 2011 14:31:47 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>	<BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>	<BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com>	<BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com>	<4DDAE0C7.9040501@molden.no>
	<BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com>
Message-ID: <4DDBA533.3070800@molden.no>

On 24.05.2011 11:55, Artur Siekielski wrote:
>
> POSH might be good, but the project has been dead for 8 years. And this
> copy-on-write is nice because you don't need changes/restrictions to
> your code, or a special garbage collector.

Then I have a solution for you, one that is cheaper than anything else 
you are trying to do (taking work hours into account):

BUY MORE RAM!

RAM is damn cheap. You just need more of it. And 64-bit Python :-)


Sturla


From solipsis at pitrou.net  Tue May 24 14:32:42 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 24 May 2011 14:32:42 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>
	<irg6u7$cu$1@dough.gmane.org>
Message-ID: <20110524143242.0774326c@pitrou.net>

On Tue, 24 May 2011 14:05:26 +0200
Stefan Behnel <stefan_ml at behnel.de> wrote:
> 
> I doubt that efficient CPU cache usage was a major design goal of PyPy 
> right from the start. IMHO, the project has changed its objectives way too 
> many times to claim something like that, especially at the low level where 
> the CPU cache becomes relevant. I remember that not so long ago, PyPy was 
> hugely memory hungry compared to CPython. Although, one could certainly 
> call *that* "designed for CPU cache usage"... ;)

Well, to be honest, "hugely memory hungry" doesn't necessarily mean
cache-averse. It depends on the locality of memory access patterns.

Regards

Antoine.



From stefan_ml at behnel.de  Tue May 24 15:01:49 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 24 May 2011 15:01:49 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
	outside of objects
In-Reply-To: <20110524143242.0774326c@pitrou.net>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>	<BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>	<irg6u7$cu$1@dough.gmane.org>
	<20110524143242.0774326c@pitrou.net>
Message-ID: <irga7u$n0o$1@dough.gmane.org>

Antoine Pitrou, 24.05.2011 14:32:
> On Tue, 24 May 2011 14:05:26 +0200, Stefan Behnel wrote:
>>
>> I doubt that efficient CPU cache usage was a major design goal of PyPy
>> right from the start. IMHO, the project has changed its objectives way too
>> many times to claim something like that, especially at the low level where
>> the CPU cache becomes relevant. I remember that not so long ago, PyPy was
>> hugely memory hungry compared to CPython. Although, one could certainly
>> call *that* "designed for CPU cache usage"... ;)
>
> Well, to be honest, "hugely memory hungry" doesn't necessarily mean
> cache-averse. It depends on the locality of memory access patterns.

Sure. AFAIR (and Maciej is certainly the right person to prove me wrong), 
the problem at the time was that the overall memory footprint of objects 
was too high. That, at least, speaks against efficient cache usage and
makes cache thrashing more likely.

In any case, we're talking about a historical problem they already fixed.

Stefan


From ncoghlan at gmail.com  Tue May 24 16:33:07 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 25 May 2011 00:33:07 +1000
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <irg6u7$cu$1@dough.gmane.org>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>
	<irg6u7$cu$1@dough.gmane.org>
Message-ID: <BANLkTi=VH=7k5NX67ekZsaEfACgV3x96Ew@mail.gmail.com>

On Tue, May 24, 2011 at 10:05 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Maciej Fijalkowski, 24.05.2011 13:31:
>>
>> CPython was not designed for CPU cache usage as far as I'm aware.
>
> That's a pretty bold statement to make on this list. Even if it wasn't
> originally "designed" for (efficient?) CPU cache usage, it's certainly been
> around for long enough to have received numerous performance tweaks in that
> regard.

As a statement of Guido's original intent, I'd side with Maciej (Guido
has made it pretty clear that he subscribes to the "first, make it
work, and only worry about making it faster if that first approach
isn't good enough" school of thought). Various *parts* of CPython, on
the other hand, have indeed been optimised over the years to be quite
aware of potential low level CPU and RAM effects (e.g. dicts, sorting,
the small object allocator).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From artur.siekielski at gmail.com  Tue May 24 17:39:06 2011
From: artur.siekielski at gmail.com (Artur Siekielski)
Date: Tue, 24 May 2011 17:39:06 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <4DDB9FAE.5060205@molden.no>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>
	<BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com>
	<BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com>
	<4DDAE0C7.9040501@molden.no>
	<BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com>
	<4DDB9FAE.5060205@molden.no>
Message-ID: <BANLkTimog4wHqXQ-O+tRKcKoxZDvj=n52A@mail.gmail.com>

2011/5/24 Sturla Molden <sturla at molden.no>:
> Den 24.05.2011 11:55, skrev Artur Siekielski:
>>
>> PYRO/multiprocessing proxies aren't a comparable solution because of
>> ORDERS OF MAGNITUDE worse performance. You are comparing direct memory
>> access with serialization/message passing through sockets/pipes.
>
> The bottleneck is likely the serialization, but only if you serialize large
> objects. IPC is always very fast, at least on localhost.

It cannot be "fast" compared to direct memory access. Here is a
benchmark: summing numbers in a small list in a child process using
multiprocessing "manager": http://dpaste.org/QzKr/ , and using
implicit copy of the structure after fork(): http://dpaste.org/q3eh/.
The first is 200 TIMES SLOWER. It means if the work finishes in 20
seconds using fork(), the same work will require more than one hour
using multiprocessing manager.
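
A minimal sketch of the kind of comparison described above (not the
code behind the dpaste links): a forked child sums a list it inherited
through fork(), and another child sums the same data through a
multiprocessing manager proxy, where every element access is a round
trip to the manager process. The names compare() and _timed_sum() and
the list size are assumptions made for illustration, and the "fork"
start method makes this Unix-only.

```python
import multiprocessing as mp
import time


def _timed_sum(seq, result_queue):
    """Sum seq element by element and report (total, elapsed seconds)."""
    start = time.perf_counter()
    total = sum(seq[i] for i in range(len(seq)))
    result_queue.put((total, time.perf_counter() - start))


def compare(n=2000):
    """Time the same sum over a fork-inherited list and a manager proxy."""
    ctx = mp.get_context("fork")  # Unix only: the child inherits `data` via fork()
    data = list(range(n))
    results = {}
    with ctx.Manager() as manager:
        shared = manager.list(data)  # proxy: each access talks to the manager process
        for label, seq in (("fork copy", data), ("manager proxy", shared)):
            queue = ctx.Queue()
            child = ctx.Process(target=_timed_sum, args=(seq, queue))
            child.start()
            results[label] = queue.get()  # read the result before joining
            child.join()
    return results


if __name__ == "__main__":
    for label, (total, seconds) in compare().items():
        print("%-14s total=%d time=%.6fs" % (label, total, seconds))
```

The measured ratio depends on the machine and the list size, so the
sketch prints timings rather than asserting any particular factor.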

> If a database is too slow, I am rather sure you need
> something else than Python as well.

Disk access is about 1000x slower than memory access in C, and Python
in a worst case is 50x slower than C, so there is still a huge win
(not to mention that in a common case Python is only a few times
slower).


Artur

From tjreedy at udel.edu  Tue May 24 17:44:39 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 24 May 2011 11:44:39 -0400
Subject: [Python-Dev] CPython optimization: storing reference counters
	outside of objects
In-Reply-To: <4DDBA3D7.9060807@molden.no>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>	<BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>
	<4DDBA3D7.9060807@molden.no>
Message-ID: <irgjp5$tv5$1@dough.gmane.org>

On 5/24/2011 8:25 AM, Sturla Molden wrote:

> Artur Siekielski is not talking about cache locality, but copy-on-write
> fork on Linux et al.
>
> When reference counts are updated after forking, memory pages marked
> copy-on-write are copied if they store reference counts. And then he
> quickly runs out of memory. He wants to put reference counts and
> PyObjects in different pages, so only the pages with reference counts
> get copied.
>
> I don't think he cares about cache locality at all, but the rest of us
> do :-)

It seems clear that separating reference counts from objects satisfies a 
specialized need and should be done in a special, patched version of 
CPython rather than the general distribution.
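
For readers following the copy-on-write argument quoted above, a small
sketch of why merely using an object after fork() dirties its memory
page in CPython: creating another reference updates the counter kept
with the object, even though the object's data is never modified. The
helper name refcount_delta() is hypothetical, and sys.getrefcount() is
CPython-specific.

```python
import sys


def refcount_delta():
    """Return how much an object's reference count grows when it is aliased."""
    payload = object()
    before = sys.getrefcount(payload)
    alias = payload  # the object's data is untouched; only a new reference exists
    after = sys.getrefcount(payload)
    del alias
    return after - before


print(refcount_delta())
```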

-- 
Terry Jan Reedy


From victor.stinner at haypocalc.com  Tue May 24 18:06:15 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 24 May 2011 18:06:15 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <4DDBCE7C.6090200@udel.edu>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu>
Message-ID: <1306253175.13660.18.camel@marge>

Le mardi 24 mai 2011 à 11:27 -0400, Terry Reedy a écrit :
> >
> > +.. function:: RAND_bytes(num)
> > +
> > +   Returns *num* cryptographically strong pseudo-random bytes.
> > +
> > +   .. versionadded:: 3.3
> > +
> > +.. function:: RAND_pseudo_bytes(num)
> > +
> > +   Returns (bytes, is_cryptographic): bytes are *num* pseudo-random bytes,
> > +   is_cryptographic is True if the bytes generated are cryptographically
> > +   strong.
> > +
> > +   .. versionadded:: 3.3
> 
> I am curious what 'cryptographically strong' means, what the real 
> difference is between the above two functions, and how these do not 
> duplicate what is in random.random.

An important feature of a CPRNG (cryptographic pseudo-random number
generator) is that even if you know all of its output, you cannot
rebuild its internal state to guess the next (or maybe the previous)
number. The CPRNG can, for example, hash its output using SHA-1: you
would have to "break" the SHA-1 hash (maybe using "salt").

Another important feature is that even if you know the internal state,
you will not be able to guess all previous and next numbers, because the
internal state is regularly updated using an external source of entropy.
Use RAND_add() to do that explicitly.

We may add a link to Wikipedia:
http://en.wikipedia.org/wiki/CPRNG

Read the "Requirements" section, it's maybe more correct than my
explanation:
http://en.wikipedia.org/wiki/CPRNG#Requirements

About the random module, it must not be used to generate passwords or
certificates, because it is easy to rebuild the internal state of a
Mersenne Twister generator if you know the previous 624 numbers. Once
you know the state, it's also easy to generate all the next numbers.
Seeding a Mersenne Twister PRNG doesn't help. See my Hasard project if
you would like to learn more about PRNGs ;-)
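
A minimal sketch of the second half of that argument, assuming the
attacker already holds the generator state: it does not perform the
624-output state recovery itself, it only shows that a copy of the
state reproduces every later output of the random module. The seed
value and the function name predict_from_state() are arbitrary choices
for illustration.

```python
import random


def predict_from_state(count=5):
    """Return (actual, predicted) outputs, where predicted comes from a state copy."""
    victim = random.Random(12345)
    leaked_state = victim.getstate()  # stands in for a reconstructed state
    actual = [victim.getrandbits(32) for _ in range(count)]

    attacker = random.Random()
    attacker.setstate(leaked_state)  # same state, so the same future outputs
    predicted = [attacker.getrandbits(32) for _ in range(count)]
    return actual, predicted


actual, predicted = predict_from_state()
print(actual == predicted)
```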

We may also add a link from random to ssl.RAND_bytes() and
ssl.RAND_pseudo_bytes().

https://bitbucket.org/haypo/hasard/

Victor


From tjreedy at udel.edu  Tue May 24 18:35:16 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 24 May 2011 12:35:16 -0400
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <4DDB84F2.40106@egenix.com>
References: <1306195729.605.27.camel@marge>
	<4DDB664A.7050705@egenix.com>	<1306227534.2619.34.camel@marge>
	<4DDB84F2.40106@egenix.com>
Message-ID: <irgmo0$hrr$1@dough.gmane.org>

On 5/24/2011 6:14 AM, M.-A. Lemburg wrote:

> I have no idea why TextIOWrapper was added to the stdlib
> instead of making StreamReaderWriter more capable,
> since StreamReaderWriter had already been available in Python
> since Python 1.6 (and this is being used by codecs.open()).

As I understand it, you (and others) wrote codecs long ago and recently 
other people wrote the new i/o stack, which sometimes uses codecs, and 
when they needed to add a few details, they 'naturally' added them to 
the module they were working on and understood (and planned to rewrite 
in C) rather than to the older module that they maybe did not completely 
understand and which is only in Python.

Then Victor comes along to do maintenance on some of the Asian codecs and 
discovers that he needs to make changes in two (or more?) places rather 
than one, which he naturally finds unsatisfactory.

> Perhaps we should deprecate TextIOWrapper instead and
> replace it with codecs.StreamReaderWriter ? ;-)

I think we should separate two issues: removing internal implementation 
duplication and removing external api duplication. I should think that 
the former should not be too controversial. The latter, I know, is more 
contentious. One problem is that stdlib changes that perhaps 'should' 
have been made in 3.0/1 could not be discovered until the moratorium and 
greater focus on the stdlib.

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Tue May 24 18:39:20 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 24 May 2011 12:39:20 -0400
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <BANLkTi=Vb7VkMAbDr-cfJyU5Vh56J6O+6A@mail.gmail.com>
References: <20110521163714.68c5384f@pitrou.net>	<BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>	<58834.1306112451@parc.com>	<BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com>	<BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com>	<20110524115655.65030e15@pitrou.net>
	<BANLkTi=Vb7VkMAbDr-cfJyU5Vh56J6O+6A@mail.gmail.com>
Message-ID: <irgmvj$hrr$2@dough.gmane.org>

On 5/24/2011 6:27 AM, Nick Coghlan wrote:
> On Tue, May 24, 2011 at 7:56 PM, Antoine Pitrou<solipsis at pitrou.net>  wrote:
>> Thank you very much! What a beautiful sight this is:
>> http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable
>>
>> (until a sporadic failure comes up, that is)
>
> I could turn test_crashers back on if you like ;)

No need. One xp (but not the other) and win7 turned red again.

-- 
Terry Jan Reedy


From debatem1 at gmail.com  Tue May 24 19:09:07 2011
From: debatem1 at gmail.com (geremy condra)
Date: Tue, 24 May 2011 10:09:07 -0700
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <irgjp5$tv5$1@dough.gmane.org>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>
	<4DDBA3D7.9060807@molden.no> <irgjp5$tv5$1@dough.gmane.org>
Message-ID: <BANLkTikYC_eCkq7hMrJjjo=+KojTEAOvTg@mail.gmail.com>

On Tue, May 24, 2011 at 8:44 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 5/24/2011 8:25 AM, Sturla Molden wrote:
>
>> Artur Siekielski is not talking about cache locality, but copy-on-write
>> fork on Linux et al.
>>
>> When reference counts are updated after forking, memory pages marked
>> copy-on-write are copied if they store reference counts. And then he
>> quickly runs out of memory. He wants to put reference counts and
>> PyObjects in different pages, so only the pages with reference counts
>> get copied.
>>
>> I don't think he cares about cache locality at all, but the rest of us
>> do :-)
>
> It seems clear that separating reference counts from objects satisfies a
> specialized need and should be done in a special, patched version of CPython
> rather than the general distribution.

I'm not sure I agree, especially given that the classical answer to
GIL woes has been to tell people to fork() themselves. There has to be
a lot of code out there that would benefit from this.

Geremy Condra

From g.brandl at gmx.net  Tue May 24 19:28:33 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 24 May 2011 19:28:33 +0200
Subject: [Python-Dev] cpython: move specialized dir implementations into
 __dir__ methods (closes #12166)
In-Reply-To: <E1QOuAE-0002zH-7a@dinsdale.python.org>
References: <E1QOuAE-0002zH-7a@dinsdale.python.org>
Message-ID: <irgprk$8sc$1@dough.gmane.org>

On 24.05.2011 18:08, benjamin.peterson wrote:
> http://hg.python.org/cpython/rev/8f403199f999
> changeset:   70331:8f403199f999
> user:        Benjamin Peterson <benjamin at python.org>
> date:        Tue May 24 11:09:06 2011 -0500
> summary:
>   move specialized dir implementations into __dir__ methods (closes #12166)

> +static PyMethodDef module_methods[] = {
> +    {"__dir__", module_dir, METH_NOARGS,
> +     PyDoc_STR("__dir__() -> specialized dir() implementation")},
> +    {0}
> +};

>  static PyMethodDef type_methods[] = {
>      {"mro", (PyCFunction)mro_external, METH_NOARGS,
>       PyDoc_STR("mro() -> list\nreturn a type's method resolution order")},
> @@ -2585,6 +2661,8 @@
>       PyDoc_STR("__instancecheck__() -> check if an object is an instance")},
>      {"__subclasscheck__", type___subclasscheck__, METH_O,
>       PyDoc_STR("__subclasscheck__() -> check if a class is a subclass")},
> +    {"__dir__", type_dir, METH_NOARGS,
> +     PyDoc_STR("__dir__() -> specialized __dir__ implementation for types")},

>  static PyMethodDef object_methods[] = {
>      {"__reduce_ex__", object_reduce_ex, METH_VARARGS,
>       PyDoc_STR("helper for pickle")},
> @@ -3449,6 +3574,8 @@
>       PyDoc_STR("default object formatter")},
>      {"__sizeof__", object_sizeof, METH_NOARGS,
>       PyDoc_STR("__sizeof__() -> size of object in memory, in bytes")},
> +    {"__dir__", object_dir, METH_NOARGS,
> +     PyDoc_STR("__dir__() -> default dir() implementation")},

This is interesting: I thought we use "->" to specify the return value (or
its type).  __instancecheck__ and __subclasscheck__ set a different
precedent, while __sizeof__ follows.

I didn't look at the files to check for other examples.

Georg


From benjamin at python.org  Tue May 24 19:39:57 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 24 May 2011 12:39:57 -0500
Subject: [Python-Dev] cpython: move specialized dir implementations into
 __dir__ methods (closes #12166)
In-Reply-To: <irgprk$8sc$1@dough.gmane.org>
References: <E1QOuAE-0002zH-7a@dinsdale.python.org>
	<irgprk$8sc$1@dough.gmane.org>
Message-ID: <BANLkTikjCq4R9PpLxzjiWx_QBE5ga+1Uuw@mail.gmail.com>

2011/5/24 Georg Brandl <g.brandl at gmx.net>:
> On 24.05.2011 18:08, benjamin.peterson wrote:
>> http://hg.python.org/cpython/rev/8f403199f999
>> changeset:   70331:8f403199f999
>> user:        Benjamin Peterson <benjamin at python.org>
>> date:        Tue May 24 11:09:06 2011 -0500
>> summary:
>>   move specialized dir implementations into __dir__ methods (closes #12166)
>
>> +static PyMethodDef module_methods[] = {
>> +    {"__dir__", module_dir, METH_NOARGS,
>> +     PyDoc_STR("__dir__() -> specialized dir() implementation")},
>> +    {0}
>> +};
>
>>  static PyMethodDef type_methods[] = {
>>      {"mro", (PyCFunction)mro_external, METH_NOARGS,
>>       PyDoc_STR("mro() -> list\nreturn a type's method resolution order")},
>> @@ -2585,6 +2661,8 @@
>>       PyDoc_STR("__instancecheck__() -> check if an object is an instance")},
>>      {"__subclasscheck__", type___subclasscheck__, METH_O,
>>       PyDoc_STR("__subclasscheck__() -> check if a class is a subclass")},
>> +    {"__dir__", type_dir, METH_NOARGS,
>> +     PyDoc_STR("__dir__() -> specialized __dir__ implementation for types")},
>
>>  static PyMethodDef object_methods[] = {
>>      {"__reduce_ex__", object_reduce_ex, METH_VARARGS,
>>       PyDoc_STR("helper for pickle")},
>> @@ -3449,6 +3574,8 @@
>>       PyDoc_STR("default object formatter")},
>>      {"__sizeof__", object_sizeof, METH_NOARGS,
>>       PyDoc_STR("__sizeof__() -> size of object in memory, in bytes")},
>> +    {"__dir__", object_dir, METH_NOARGS,
>> +     PyDoc_STR("__dir__() -> default dir() implementation")},
>
> This is interesting: I thought we use "->" to specify the return value (or
> its type).  __instancecheck__ and __subclasscheck__ set a different
> precedent, while __sizeof__ follows.

Yes, I was wondering about that, so I just picked one. :) "->" seems
to be better for return values, though, given the resemblance to
annotations.


-- 
Regards,
Benjamin

From cesare.di.mauro at gmail.com  Tue May 24 19:40:47 2011
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Tue, 24 May 2011 19:40:47 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <irg6u7$cu$1@dough.gmane.org>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>
	<BANLkTi=85VbhgRBk1XCiwGz25gNONZO=6Q@mail.gmail.com>
	<irg6u7$cu$1@dough.gmane.org>
Message-ID: <BANLkTimwQPCgDaOdNV0An4k59OrtQUFeuQ@mail.gmail.com>

2011/5/24 Stefan Behnel <stefan_ml at behnel.de>

> Maciej Fijalkowski, 24.05.2011 13:31:
>
>  CPython was not designed for CPU cache usage as far as I'm aware.
>>
>
>  That's a pretty bold statement to make on this list. Even if it wasn't
> originally "designed" for (efficient?) CPU cache usage, it's certainly been
> around for long enough to have received numerous performance tweaks in that
> regard.
>
> Stefan


Maybe a change in memory allocation granularity can help here.

Raising it to 16 and 32 bytes for 32- and 64-bit systems respectively
guarantees that an access to ob_refcnt and/or ob_type will also bring into
the cache line some other information for the same object, which is usually
needed anyway (except for very simple ones, such as Py_None, Py_Ellipsis, etc.).

Think about a long, a tuple, a list, a dictionary, etc.: all of them have
some critical data after these fields, which will most likely be accessed
after a Py_INCREF or a type check.

Regards,
Cesare

From janssen at parc.com  Tue May 24 19:43:06 2011
From: janssen at parc.com (Bill Janssen)
Date: Tue, 24 May 2011 10:43:06 PDT
Subject: [Python-Dev] Stable buildbots update
In-Reply-To: <nad-0412B2.22031323052011@news.gmane.org>
References: <20110521163714.68c5384f@pitrou.net>
	<BANLkTimpt1YR7PsjYh4H+rY9E9p8VopS_g@mail.gmail.com>
	<58834.1306112451@parc.com>
	<BANLkTi=ofyuCUA49wheByRPVcygPzrCUKg@mail.gmail.com>
	<BANLkTi=Twu6pcoQKM8KebLmxtYfL6HbCOQ@mail.gmail.com>
	<87zkmcalt8.fsf@uwakimon.sk.tsukuba.ac.jp>
	<nad-0412B2.22031323052011@news.gmane.org>
Message-ID: <87174.1306258986@parc.com>

Ned Deily <nad at acm.org> wrote:

> In article <87zkmcalt8.fsf at uwakimon.sk.tsukuba.ac.jp>,
>  "Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> > Are you saying you expect Mac OS X 10.4 "Tiger" to go green once the
> > bots update?  If so, I'm impressed, and "thank you!" to all involved.
> > Apple and MacPorts have long since washed their hands of that release.
> 
> OS X 10.4 does have its quirks that make it challenging to get all of 
> the tests to run without a few corner-case failures but, besides the 
> buildbots, I still test regularly with 10.4 and occasionally build 
> there, too.  And, FWIW, while top-of-trunk MacPorts may not officially 
> support 10.4, many ports work there just fine including python2.6, 2.7, 
> and 3.1.  (3.2 has a build issue that may get fixed in 3.2.1).

Perhaps more importantly, parc-leopard-1 and parc-tiger-1 are two of the
very few usually-connected buildbots we have running on big-endian
architectures, along with loewis-sun (I *think* Solaris-10 on SPARC is
still big-endian).

Bill

From tjreedy at udel.edu  Tue May 24 19:52:59 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 24 May 2011 13:52:59 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <1306253175.13660.18.camel@marge>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org>
	<4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge>
Message-ID: <irgr9o$k1l$1@dough.gmane.org>

On 5/24/2011 12:06 PM, Victor Stinner wrote:
> Le mardi 24 mai 2011 à 11:27 -0400, Terry Reedy a écrit :
>>>
>>> +.. function:: RAND_bytes(num)
>>> +
>>> +   Returns *num* cryptographically strong pseudo-random bytes.
>>> +
>>> +   .. versionadded:: 3.3
>>> +
>>> +.. function:: RAND_pseudo_bytes(num)
>>> +
>>> +   Returns (bytes, is_cryptographic): bytes are *num* pseudo-random bytes,
>>> +   is_cryptographic is True if the bytes generated are cryptographically
>>> +   strong.
>>> +
>>> +   .. versionadded:: 3.3
>>
>> I am curious what 'cryptographically strong' means, what the real
>> difference is between the above two functions, and how these do not
>> duplicate what is in random.random.
>
> An important feature of a CPRNG (cryptographic pseudo-random number
> generator) is that even if you know all of its output, you cannot
> rebuild its internal state to guess the next (or maybe the previous)
> number. The CPRNG can, for example, hash its output using SHA-1: you
> would have to "break" the SHA-1 hash (maybe using "salt").

So it is presumably slower. I still do not get RAND_pseudo_bytes, which 
somehow decides internally what to do.

> Another important feature is that even if you know the internal state,
> you will not be able to guess all previous and next numbers, because the
> internal state is regularly updated using an external source of entropy.
> Use RAND_add() to do that explicitly.
>
> We may add a link to Wikipedia:
> http://en.wikipedia.org/wiki/CPRNG

That would be helpful
>
> Read the "Requirements" section, it's maybe more correct than my
> explanation:
> http://en.wikipedia.org/wiki/CPRNG#Requirements
>
> About the random module, it must not be used to generate passwords or
> certificates, because it is easy to rebuild the internal state of a
> Mersenne Twister generator if you know the previous 624 numbers. Once
> you know the state, it's also easy to generate all the next numbers.
> Seeding a Mersenne Twister PRNG doesn't help. See my Hasard project if
> you would like to learn more about PRNGs ;-)
>
> We may also add a link from random to ssl.RAND_bytes() and
> ssl.RAND_pseudo_bytes().

-- 
Terry Jan Reedy



From sturla at molden.no  Tue May 24 20:31:27 2011
From: sturla at molden.no (Sturla Molden)
Date: Tue, 24 May 2011 20:31:27 +0200
Subject: [Python-Dev] CPython optimization: storing reference counters
 outside of objects
In-Reply-To: <BANLkTimog4wHqXQ-O+tRKcKoxZDvj=n52A@mail.gmail.com>
References: <BANLkTinvniS-_Nr7vYHZ=UdG5zGr3j_H4g@mail.gmail.com>	<BANLkTi=1eajrgLEtAbjmgpm9sENBYw3+aA@mail.gmail.com>	<BANLkTimG_iRMqTP_AqRsAJzaJvod36LSTQ@mail.gmail.com>	<BANLkTimgc=_gd5xzgYaEraoEABspe+Ddsw@mail.gmail.com>	<4DDAE0C7.9040501@molden.no>	<BANLkTink2knbSW+jX5y-quqmPThcW90TxA@mail.gmail.com>	<4DDB9FAE.5060205@molden.no>
	<BANLkTimog4wHqXQ-O+tRKcKoxZDvj=n52A@mail.gmail.com>
Message-ID: <4DDBF97F.8010005@molden.no>

Den 24.05.2011 17:39, skrev Artur Siekielski:
>
> Disk access is about 1000x slower than memory access in C, and Python
> in a worst case is 50x slower than C, so there is still a huge win
> (not to mention that in a common case Python is only a few times
> slower).

You can put databases in shared memory (e.g. SQLite and BSDDB have options
for this). On Linux you can also mount /dev/shm as a ramdisk. Also, why
do you not trust the database developers at Oracle et al. to do
sufficient optimizations?

Sturla



From gzlist at googlemail.com  Tue May 24 21:39:32 2011
From: gzlist at googlemail.com (Martin (gzlist))
Date: Tue, 24 May 2011 20:39:32 +0100
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <1306195729.605.27.camel@marge>
References: <1306195729.605.27.camel@marge>
Message-ID: <BANLkTine6bRSUemG_PBFy=_UNwmkd3C5bw@mail.gmail.com>

On 24/05/2011, Victor Stinner <victor.stinner at haypocalc.com> wrote:
>
> In Python 2, codecs.open() is the best way to read and/or write files
> using Unicode. But in Python 3, open() is preferred with its fast io
> module. I would like to deprecate codecs.open() because it can be
> replaced by open() and io.TextIOWrapper. I would like your opinion and
> that's why I'm writing this email.

There are some modules that try to stay compatible with Python 2 and 3
without a source translation step. Removing the codecs classes would
mean they'd have to add a few more compatibility hacks, but it could be
done.
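
A minimal sketch of the overlap under discussion, assuming a temporary
file and UTF-8 text (both chosen here purely for illustration): text
written through codecs.open() reads back unchanged through the built-in
open(), which is the sense in which one can replace the other on
Python 3. The function name roundtrip() is hypothetical.

```python
import codecs
import os
import tempfile


def roundtrip(text):
    """Write text with codecs.open() and read it back with the built-in open()."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with codecs.open(path, "w", encoding="utf-8") as writer:
            writer.write(text)
        with open(path, "r", encoding="utf-8") as reader:
            return reader.read()
    finally:
        os.remove(path)


sample = "na\u00efve caf\u00e9"
print(roundtrip(sample) == sample)
```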

As an aside, I'm still not sure how the io module should be used.
For example, a simple task I've used StreamWriter classes for is wrapping
stdout. If the stdout.encoding can't represent a character, using
"replace" means you can write any unicode string without throwing a
UnicodeEncodeError.

With the io module, it seems you need to construct a new TextIOWrapper
object, passing the attributes of the old one as parameters, and as
soon as someone passes something that's not a TextIOWrapper (say, a
StringIO object) your code breaks. Is the intention that code dealing
with streams needs to be covered in isinstance checks in Python 3?
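
A sketch of the rewrapping described above, written as one possible
approach rather than a recommended idiom: it probes for a .buffer
attribute instead of using isinstance checks, and returns the stream
unchanged when there is nothing to rewrap (as with StringIO). The
helper name with_replace_errors() is hypothetical, and a BytesIO-backed
ASCII stream stands in for stdout.

```python
import io


def with_replace_errors(stream):
    """Return a text stream that replaces unencodable characters if possible."""
    buffer = getattr(stream, "buffer", None)
    if buffer is None:
        # e.g. io.StringIO: no byte layer, so nothing can fail to encode
        return stream
    return io.TextIOWrapper(
        buffer,
        encoding=stream.encoding,
        errors="replace",
        line_buffering=stream.line_buffering,
    )


raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(raw, encoding="ascii")  # stand-in for stdout
safe = with_replace_errors(ascii_stream)
safe.write("caf\u00e9")
safe.flush()
print(raw.getvalue())
safe.detach()  # leave the shared buffer open when `safe` is discarded
```

The final detach() matters because a discarded TextIOWrapper closes
the buffer it wraps, which would also close the original stream.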

Martin

From srini605 at gmail.com  Tue May 24 23:09:47 2011
From: srini605 at gmail.com (srinivasan munisamy)
Date: Wed, 25 May 2011 02:39:47 +0530
Subject: [Python-Dev] [pyodbc] Setting values to SQL_* constants while
 creating a connection object
Message-ID: <BANLkTik_CifwV_jqACqDaZHmE2Td07f+aA@mail.gmail.com>

Hi,
I would like to know how to set values for the SQL_* constants while
creating a db connection through the pyodbc module.
For example, I am getting a connection object like below:

In [27]: dbh1 = pyodbc.connect("DSN=<dsn>;UID=<uid>;PWD=<pwd>;DATABASE=<database>;APP=<app_name>")

In [28]: dbh1.getinfo(pyodbc.SQL_DESCRIBE_PARAMETER)

Out[28]: True

I want to set SQL_DESCRIBE_PARAMETER to false for this connection
object. How could I do that?
Please help me figure it out.

Thanks,
Srini

From tjreedy at udel.edu  Wed May 25 00:06:00 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 24 May 2011 18:06:00 -0400
Subject: [Python-Dev] [pyodbc] Setting values to SQL_* constants while
 creating a connection object
In-Reply-To: <BANLkTik_CifwV_jqACqDaZHmE2Td07f+aA@mail.gmail.com>
References: <BANLkTik_CifwV_jqACqDaZHmE2Td07f+aA@mail.gmail.com>
Message-ID: <irha44$nd8$2@dough.gmane.org>

On 5/24/2011 5:09 PM, srinivasan munisamy wrote:
> Hi,
> I would like to know how to set values for the SQL_* constants

Please direct Python usage questions to python-list or other user 
discussion forums. Py-dev is for discussion of development of the next 
versions of Python.
-- 
Terry Jan Reedy


From ncoghlan at gmail.com  Wed May 25 07:09:40 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 25 May 2011 15:09:40 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <irgr9o$k1l$1@dough.gmane.org>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu>
	<1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org>
Message-ID: <BANLkTim1ta3mRHE2io_Rc_-U1bPpTyrvOQ@mail.gmail.com>

On Wed, May 25, 2011 at 3:52 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 5/24/2011 12:06 PM, Victor Stinner wrote:
>> An important feature of a CPRNG (cryptographic pseudo-random number
>> generator) is that even if you know all of its output, you cannot
>> rebuild its internal state to guess the next (or maybe the previous)
>> number. The CPRNG can, for example, hash its output using SHA-1: you
>> would have to "break" the SHA-1 hash (maybe using "salt").
>
> So it is presumably slower. I still do not get RAND_pseudo_bytes, which
> somehow decides internally what to do.

The more important feature here is that it is exposing *OpenSSL's*
random number generation, rather than our own. A CPRNG isn't
*necessarily* slower than a non-crypto one (particularly on systems
with dedicated crypto hardware), but they can definitely fail to
return data if there isn't enough entropy available in the pool (and
the system has to have a usable entropy source in the first place).

The RAND_bytes() documentation should probably make it clearer that
unlike the random module and RAND_pseudo_bytes(), RAND_bytes() can
*fail* (by raising SSLError) if it isn't in a position to provide the
requested random data.

The pseudo_bytes version just encapsulates a fallback technique that
may be suitable in some circumstances: if crypto quality random data
is not available, fall back on PRNG data instead of failing. It is
most suitable for tasks like prototyping an algorithm in Python for
later conversion to C, or similar tasks where it is desirable to use
the OpenSSL PRNG over the one in the random module.
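
A minimal sketch of the failure mode described above, assuming the
ssl.RAND_bytes() function added by this changeset: the caller has to be
prepared for SSLError instead of assuming data always comes back. The
helper name strong_random_bytes() and the choice to return None on
failure are assumptions made for illustration; real code may prefer to
let the exception propagate.

```python
import ssl


def strong_random_bytes(num):
    """Return num bytes from OpenSSL's generator, or None if it cannot provide them."""
    try:
        return ssl.RAND_bytes(num)
    except ssl.SSLError:
        # Raised when OpenSSL cannot supply cryptographically strong data.
        return None


token = strong_random_bytes(16)
print("unavailable" if token is None else len(token))
```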

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed May 25 07:13:44 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 25 May 2011 15:13:44 +1000
Subject: [Python-Dev] [Python-checkins] Daily reference leaks
	(234021dcad93): sum=61
In-Reply-To: <E1QP4Tu-0002qe-D5@ap.vmr.nerim.net>
References: <E1QP4Tu-0002qe-D5@ap.vmr.nerim.net>
Message-ID: <BANLkTikqWv62fc-t1yfJJkcOY1vDXnWksQ@mail.gmail.com>

On Wed, May 25, 2011 at 1:09 PM,  <solipsis at pitrou.net> wrote:
> results for 234021dcad93 on branch "default"
> --------------------------------------------
>
> test_packaging leaked [128, 128, 128] references, sum=384

Is there a new cache in packaging that regrtest needs to know about
and either ignore or clear when checking reference counts?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From petri at digip.org  Wed May 25 07:59:26 2011
From: petri at digip.org (Petri Lehtinen)
Date: Wed, 25 May 2011 08:59:26 +0300
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <irgr9o$k1l$1@dough.gmane.org>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu>
	<1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org>
Message-ID: <20110525055926.GA21500@colossus>

Terry Reedy wrote:
> On 5/24/2011 12:06 PM, Victor Stinner wrote:
> >Le mardi 24 mai 2011 à 11:27 -0400, Terry Reedy a écrit :
> >>>
> >>>+.. function:: RAND_bytes(num)
> >>>+
> >>>+   Returns *num* cryptographically strong pseudo-random bytes.
> >>>+
> >>>+   .. versionadded:: 3.3
> >>>+
> >>>+.. function:: RAND_pseudo_bytes(num)
> >>>+
> >>>+   Returns (bytes, is_cryptographic): bytes are *num* pseudo-random bytes,
> >>>+   is_cryptographic is True if the bytes generated are cryptographically
> >>>+   strong.
> >>>+
> >>>+   .. versionadded:: 3.3
> >>
> >>I am curious what 'cryptographically strong' means, what the real
> >>difference is between the above two functions, and how these do not
> >>duplicate what is in random.random.
> >
> >An important feature of a CPRNG (cryptographic pseudo-random number
> >generator) is that even if you know all of its output, you cannot
> >rebuild its internal state to guess the next (or maybe the previous)
> >number. The CPRNG can, for example, hash its output using SHA-1: you
> >would have to "break" the SHA-1 hash (maybe using "salt").
> 
> So it is presumably slower. I still do not get RAND_pseudo_bytes,
> which somehow decides internally what to do.

According to the RAND_bytes manual page from OpenSSL:

    RAND_bytes() puts num cryptographically strong pseudo-random
    bytes into buf. An error occurs if the PRNG has not been seeded
    with enough randomness to ensure an unpredictable byte
    sequence.

    RAND_pseudo_bytes() puts num pseudo-random bytes into buf.
    Pseudo-random byte sequences generated by RAND_pseudo_bytes() will
    be unique if they are of sufficient length, but are not
    necessarily unpredictable. They can be used for non-cryptographic
    purposes and for certain purposes in cryptographic protocols, but
    usually not for key generation etc.

And:

    RAND_bytes() returns 1 on success, 0 otherwise. The error code can
    be obtained by ERR_get_error(3). RAND_pseudo_bytes() returns 1 if
    the bytes generated are cryptographically strong, 0 otherwise.
    Both functions return -1 if they are not supported by the current
    RAND method.

So it seems to me that RAND_bytes() either returns cryptographically
strong data or fails (is it possible to detect the failure with the
Python function? Should this be documented?). RAND_pseudo_bytes()
always succeeds but does not necessarily generate cryptographically
strong data.

> 
> >Another important feature is that even if you know the internal state,
> >you will not be able to guess all previous and next numbers, because the
> >internal state is regularly updated using an external source of entropy.
> >Use RAND_add() to do that explicitly.
> >
> >We may add a link to Wikipedia:
> >http://en.wikipedia.org/wiki/CPRNG
> 
> That would be helpful
> >
> >Read the "Requirements" section, it's maybe more correct than my
> >explanation:
> >http://en.wikipedia.org/wiki/CPRNG#Requirements
> >
> >About the random module, it must not be used to generate passwords or
> >certificates, because it is easy to rebuild the internal state of a
> >Mersenne Twister generator if you know the previous 624 numbers. Once
> >you know the state, it's also easy to generate all the next numbers.
> >Seeding a Mersenne Twister PRNG doesn't help. See my Hasard project if
> >you would like to learn more about PRNGs ;-)
> >
> >We may also add a link from random to ssl.RAND_bytes() and
> >ssl.RAND_pseudo_bytes().

Obviously, the user needs to be familiar with the concept of
"cryptographically strong randomness" to use these functions.

Petri Lehtinen

From sandro.tosi at gmail.com  Wed May 25 10:24:23 2011
From: sandro.tosi at gmail.com (Sandro Tosi)
Date: Wed, 25 May 2011 10:24:23 +0200
Subject: [Python-Dev] Extending os.chown() to accept user/group names
Message-ID: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>

Hi all,
before opening an issue to track the request, I'd like to ask for advice
here about this: extend os.chown() to also accept user/group names
instead of just uid and gid.

On a Unix system, you can call the chown command passing either ids or
names, so it seems (to me at least) natural to expect os.chown() to
behave similarly; but that's not the case.

I can see that the os module wants to be a thin wrapper around OS syscalls
and chown(2) accepts only uid/gid as input, so what would be best: extend
os.chown() or provide a chown() function in the shutil module for this
purpose?

Thanks in advance,
-- 
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi

From victor.stinner at haypocalc.com  Wed May 25 11:10:54 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 11:10:54 +0200
Subject: [Python-Dev] [Python-checkins] Daily reference leaks
	(234021dcad93): sum=61
In-Reply-To: <BANLkTikqWv62fc-t1yfJJkcOY1vDXnWksQ@mail.gmail.com>
References: <E1QP4Tu-0002qe-D5@ap.vmr.nerim.net>
	<BANLkTikqWv62fc-t1yfJJkcOY1vDXnWksQ@mail.gmail.com>
Message-ID: <1306314654.6407.1.camel@marge>

On Wednesday, 25 May 2011 at 15:13 +1000, Nick Coghlan wrote:
> On Wed, May 25, 2011 at 1:09 PM,  <solipsis at pitrou.net> wrote:
> > results for 234021dcad93 on branch "default"
> > --------------------------------------------
> >
> > test_packaging leaked [128, 128, 128] references, sum=384
> 
> Is there a new cache in packaging that regrtest needs to know about
> and either ignore or clear when checking reference counts?

See the issue http://bugs.python.org/issue12167 : Antoine listed tests
leaking references, and I already fixed some of them.

Victor


From victor.stinner at haypocalc.com  Wed May 25 11:29:17 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 11:29:17 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <BANLkTim1ta3mRHE2io_Rc_-U1bPpTyrvOQ@mail.gmail.com>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org>
	<4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge>
	<irgr9o$k1l$1@dough.gmane.org>
	<BANLkTim1ta3mRHE2io_Rc_-U1bPpTyrvOQ@mail.gmail.com>
Message-ID: <1306315757.6407.5.camel@marge>

On Wednesday, 25 May 2011 at 15:09 +1000, Nick Coghlan wrote:
> The RAND_bytes() documentation should probably make it clearer that
> unlike the random module and RAND_pseudo_bytes(), RAND_bytes() can
> *fail* (by raising SSLError) if it isn't in a position to provide the
> requested random data.

According to the doc, both functions can fail, but it is more likely
that RAND_bytes() fails. I temporarily disabled the Linux random devices
to test the RAND_bytes() error path:

   mv /dev/random /dev/random.xxx
   mv /dev/urandom /dev/urandom.xxx

In this case, RAND_pseudo_bytes() generates non-cryptographic random
numbers: it returns (random_bytes, False). I don't know how to test
RAND_pseudo_bytes() error code.

--

I patched test_ssl to test that RAND_bytes() raises an SSLError if there
is not enough entropy, and I also improved the documentation to detail
the error cases.

Victor


From mal at egenix.com  Wed May 25 11:38:10 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 25 May 2011 11:38:10 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <4DDB9E27.7040605@livinglogic.de>
References: <1306195729.605.27.camel@marge>	<4DDB8591.2060308@livinglogic.de>	<A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl>	<1306234681.2619.45.camel@marge>
	<4DDB9E27.7040605@livinglogic.de>
Message-ID: <4DDCCE02.7060105@egenix.com>

Walter Dörwald wrote:
> On 24.05.11 12:58, Victor Stinner wrote:
>> On Tuesday, 24 May 2011 at 12:42 +0200, Łukasz Langa wrote:
>>> Message written by Walter Dörwald on 2011-05-24, at 12:16:
>>>
>>>>> I don't see which usecase is not covered by TextIOWrapper. But I know
>>>>> some cases which are not supported by StreamReader/StreamWriter.
>>>>
>>>> This could be partially fixed by implementing generic
>>>> StreamReader/StreamWriter classes that reuse the incremental codecs, but
>>>> I don't think that's worth it.
>>>
>>> Why not?
>>
>> We have already an implementation of this idea, it is called
>> io.TextIOWrapper.
> 
> Exactly.
> 
> From another post by Victor:
> 
>> As I wrote, codecs.open() is useful in Python 2. But I don't know any
>> program or library using directly StreamReader or StreamWriter.
> 
> So: implementing this is a lot of work, duplicates existing
> functionality and is mostly unused.

You are missing the point: we have StreamReader and StreamWriter APIs
on codecs to allow each codec to implement more efficient ways of
encoding and decoding streams.

Examples of such optimizations are reading the stream in
chunks that can be decoded in one piece, or writing to the stream
in a way that doesn't generate encoding state problems on the
receiving end by ending transmission half-way through a
shift block.

Of course, you won't find many direct uses of these APIs, since
most of the time, applications will simply use codecs.open() to
automatically benefit from these optimizations.

OTOH, TextIOWrapper doesn't know anything about specific encodings
and thus does not allow for such optimizations to be implemented
by codecs.

We don't have many such specialized implementations in the stdlib,
but this doesn't mean that there's no use for them. It
just means that developers and users are simply unaware of the
possibilities opened by these stateful stream APIs.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 25 2011)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2011-05-23: Released eGenix mx Base 3.2.0      http://python.egenix.com/
2011-05-25: Released mxODBC 3.1.1              http://python.egenix.com/
2011-06-20: EuroPython 2011, Florence, Italy               26 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From victor.stinner at haypocalc.com  Wed May 25 11:39:52 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 11:39:52 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <20110525055926.GA21500@colossus>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org>
	<4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge>
	<irgr9o$k1l$1@dough.gmane.org>  <20110525055926.GA21500@colossus>
Message-ID: <1306316392.6407.14.camel@marge>

On Wednesday, 25 May 2011 at 08:59 +0300, Petri Lehtinen wrote:
> So it seems to me that RAND_bytes() either returns cryptographically
> strong data or fails (is it possible to detect the failure with the
> Python function? Should this be documented?).

RAND_bytes() raises an SSLError on error. You can check if there is
enough entropy before calling RAND_bytes() using RAND_status(). I
documented both points.
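
As a minimal sketch of that pattern (the function name
try_random_bytes is made up for illustration, and returning None on
failure is just one possible policy), a caller could check
ssl.RAND_status() first and still guard the ssl.RAND_bytes() call:

```python
import ssl


def try_random_bytes(count):
    """Return `count` strong random bytes, or None if OpenSSL cannot provide them."""
    if not ssl.RAND_status():
        # The PRNG reports that it has not been seeded with enough data.
        return None
    try:
        return ssl.RAND_bytes(count)
    except ssl.SSLError:
        # RAND_bytes() failed even though the status check passed.
        return None


key = try_random_bytes(16)
if key is None:
    print("no cryptographically strong random data available")
else:
    print("got %d random bytes" % len(key))
```

Security-sensitive code would probably prefer to let the SSLError
propagate rather than silently fall back to a weaker generator.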

> RAND_pseudo_bytes() always succeeds...

No, it can fail if the RAND method was changed and the current RAND
method doesn't support this operation.

Example:
----
>>> import ctypes
>>> from ctypes import c_void_p
>>> libssl=ctypes.cdll.LoadLibrary('libssl.so')
>>> RAND_set_rand_method=libssl.RAND_set_rand_method
>>> class rand_meth_st(ctypes.Structure):
...     _fields_ = (('seed', c_void_p), ('bytes', c_void_p),
...                 ('cleanup', c_void_p), ('add', c_void_p),
...                 ('pseudorand', c_void_p), ('status', c_void_p))
... 
>>> not_supported = rand_meth_st()
>>> RAND_set_rand_method(ctypes.byref(not_supported))
>>> import ssl
>>> ssl.RAND_bytes(1)
...
ssl.SSLError: [Errno 0] None
>>> ssl.RAND_pseudo_bytes(1)
...
ssl.SSLError: [Errno 0] None
------

Cool, ssl.RAND_pseudo_bytes() also raises an error, as expected :-)

> ... but does not necessarily generate cryptographically
> strong data.

Yes, if the PRNG was not seeded with enough data, the RAND_pseudo_bytes()
Python function returns (random_bytes, False).

> > >We may also add a link from random to SSL.RAND_bytes() and
> > >SSL.RAND_pseudo_bytes().
> 
> Obviously, the user needs to be familiar with the concept of
> "cryptographically strong randomness" to use these functions.

I already patched the doc of the random module to add a security
warning. Well, you don't really need to know how a CSPRNG is
implemented, just that random cannot be used for security and that
ssl.RAND_bytes() raises an error if it was not seeded with enough data.

Tell me if my warning is not clear:

.. warning::

   The generators of the :mod:`random` module should not be used for
   security purposes, they are not cryptographic. Use ssl.RAND_bytes()
   if you require a cryptographically secure pseudorandom number
   generator.
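
To illustrate why the warning matters, here is a small sketch: two
random.Random instances created with the same seed produce identical
output, so anything derived from them is reproducible by whoever learns
or guesses the state. The seed value 1234 and the new_token() helper are
arbitrary choices made for this example only.

```python
import random
import ssl

# Two Mersenne Twister generators with the same seed are in the same state...
first = random.Random(1234)
second = random.Random(1234)

# ...so they produce exactly the same "random" numbers.
predictable_a = [first.getrandbits(32) for _ in range(5)]
predictable_b = [second.getrandbits(32) for _ in range(5)]
print(predictable_a == predictable_b)


def new_token(size=16):
    """Return a hex token built from cryptographically strong bytes."""
    return ssl.RAND_bytes(size).hex()


print(new_token())
```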

Victor


From petri at digip.org  Wed May 25 12:20:12 2011
From: petri at digip.org (Petri Lehtinen)
Date: Wed, 25 May 2011 13:20:12 +0300
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <1306316392.6407.14.camel@marge>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu>
	<1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org>
	<20110525055926.GA21500@colossus> <1306316392.6407.14.camel@marge>
Message-ID: <20110525102012.GD10448@colossus>

Victor Stinner wrote:
> I already patched the doc of the random module to add a security
> warning. Well, you don't really need to know how a CSPRNG is
> implemented, just that random cannot be used for security and that
> ssl.RAND_bytes() raises an error if it was not seeded with enough data.
> 
> Tell me if my warning is not clear:
> 
> .. warning::
> 
>    The generators of the :mod:`random` module should not be used for
>    security purposes, they are not cryptographic. Use ssl.RAND_bytes()
>    if you require a cryptographically secure pseudorandom number
>    generator.

Looks good to me. Regarding style, you should probably make a link,
like :func:`ssl.RAND_bytes()`.

Petri

From eric at trueblade.com  Wed May 25 12:54:22 2011
From: eric at trueblade.com (Eric Smith)
Date: Wed, 25 May 2011 06:54:22 -0400 (EDT)
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <20110525102012.GD10448@colossus>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org>
	<4DDBCE7C.6090200@udel.edu> <1306253175.13660.18.camel@marge>
	<irgr9o$k1l$1@dough.gmane.org> <20110525055926.GA21500@colossus>
	<1306316392.6407.14.camel@marge> <20110525102012.GD10448@colossus>
Message-ID: <6cb8bc01c5c8812a57662243fd39af1b.squirrel@mail.trueblade.com>

> Victor Stinner wrote:
>> I already patched the doc of the random module to add a security
>> warning. Well, you don't really need to know how a CSPRNG is
>> implemented, just that random cannot be used for security and that
>> ssl.RAND_bytes() raises an error if it was not seeded with enough data.
>>
>> Tell me if my warning is not clear:
>>
>> .. warning::
>>
>>    The generators of the :mod:`random` module should not be used for
>>    security purposes, they are not cryptographic. Use ssl.RAND_bytes()
>>    if you require a cryptographically secure pseudorandom number
>>    generator.
>
> Looks good to me. Regarding style, you should probably make a link,
> like :func:`ssl.RAND_bytes()`.

Does "are not cryptographic" have any meaning? (I'm not an expert, just
not sure). Should it not be "cryptographically secure", to match the next
sentence?

Eric.


From petri at digip.org  Wed May 25 12:58:52 2011
From: petri at digip.org (Petri Lehtinen)
Date: Wed, 25 May 2011 13:58:52 +0300
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <6cb8bc01c5c8812a57662243fd39af1b.squirrel@mail.trueblade.com>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org> <4DDBCE7C.6090200@udel.edu>
	<1306253175.13660.18.camel@marge> <irgr9o$k1l$1@dough.gmane.org>
	<20110525055926.GA21500@colossus> <1306316392.6407.14.camel@marge>
	<20110525102012.GD10448@colossus>
	<6cb8bc01c5c8812a57662243fd39af1b.squirrel@mail.trueblade.com>
Message-ID: <20110525105852.GE10448@colossus>

Eric Smith wrote:
> > Victor Stinner wrote:
> >> I already patched the doc of the random module to add a security
> >> warning. Well, you don't really need to know how a CSPRNG is
> >> implemented, just that random cannot be used for security and that
> >> ssl.RAND_bytes() raises an error if it was not seeded with enough data.
> >>
> >> Tell me if my warning is not clear:
> >>
> >> .. warning::
> >>
> >>    The generators of the :mod:`random` module should not be used for
> >>    security purposes, they are not cryptographic. Use ssl.RAND_bytes()
> >>    if you require a cryptographically secure pseudorandom number
> >>    generator.
> >
> > Looks good to me. Regarding style, you should probably make a link,
> > like :func:`ssl.RAND_bytes()`.
> 
> Does "are not cryptographic" have any meaning? (I'm not an expert, just
> not sure). Should it not be "cryptographically secure", to match the next
> sentence?

Or just remove ", they are not cryptographic" altogether?

From victor.stinner at haypocalc.com  Wed May 25 13:10:51 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 13:10:51 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
 StreamWriter/StreamReader
In-Reply-To: <4DDCCE02.7060105@egenix.com>
References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de>
	<A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl>
	<1306234681.2619.45.camel@marge> <4DDB9E27.7040605@livinglogic.de>
	<4DDCCE02.7060105@egenix.com>
Message-ID: <1306321851.6407.49.camel@marge>

On Wednesday, 25 May 2011 at 11:38 +0200, M.-A. Lemburg wrote:
> You are missing the point: we have StreamReader and StreamWriter APIs
> on codecs to allow each codec to implement more efficient ways of
> encoding and decoding streams.
> 
> Examples of such optimizations are reading the stream in
> chunks that can be decoded in one piece, or writing to the stream
> in a way that doesn't generate encoding state problems on the
> receiving end by ending transmission half-way through a
> shift block.
> 
> ...
> 
> We don't have many such specialized implementations in the stdlib,
> but this doesn't mean that there's no use for them. It
> just means that developers and users are simply unaware of the
> possibilities opened by these stateful stream APIs.

Does at least one codec implement such an optimization in its
StreamReader or StreamWriter class? And can't we implement such
optimizations in incremental encoders and decoders (or in TextIOWrapper)?

I checked all multibyte codecs (UTF and CJK codecs) and I don't see any
such optimization. UTF codecs handle the BOM, but don't have anything
that looks like an optimization. CJK codecs use multibytecodec,
MultibyteStreamReader and MultibyteStreamWriter, which don't appear to be
optimized. But maybe I missed something?

TextIOWrapper has an advanced buffering algorithm that prefetches (reads
ahead) some bytes at each read to speed up small reads. Such an algorithm
is difficult to implement, but it's done and it works.

--

OK, let's stop talking about theoretical optimizations, and let's do a
benchmark to compare the codecs and io modules on reading files!

I tested Python 3.3 (70370:178d367c9733) compiled in release mode (gcc
-O3) on a Pentium4 @ 3 GHz with 2 GB of memory. I manually tuned the
number of loops to ensure that the fastest test takes at least one
second. I only ran my benchmark once. See the attached bench.py file.


(1) Decode Objects/unicodeobject.c (317336 characters) from utf-8

test_io.readline(): 89.6 ms
test_codecs.readline(): 1272.8 ms
-> codecs 1320% slower than io

test_io.read(1): 1728.9 ms
test_codecs.read(1): 36395.0 ms
-> codecs 2005% slower than io

test_io.read(100): 460.7 ms
test_codecs.read(100): 3897.0 ms
-> codecs 746% slower than io

test_io.read(-1): 1911.7 ms
test_codecs.read(-1): 1740.7 ms
-> codecs 10% FASTER than io


(2) Decode README (6613 characters) from ascii

test_io.readline(): 109.9 ms
test_codecs.readline(): 1023.8 ms
-> codecs 832% slower than io

test_io.read(1): 1560.4 ms
test_codecs.read(1): 29402.6 ms
-> codecs 1784% slower than io

test_io.read(100): 866.9 ms
test_codecs.read(100): 3699.5 ms
-> codecs 327% slower than io

test_io.read(-1): 5140.2 ms
test_codecs.read(-1): 4817.9 ms
-> codecs 7% FASTER than io


(3) Decode Lib/test/cjkencodings/gb18030.txt (501 characters) from
gb18030

test_io.readline(): 1193.7 ms
test_codecs.readline(): 1474.3 ms
-> codecs 24% slower than io

test_io.read(1): 3847.7 ms
test_codecs.read(1): 27103.9 ms
-> codecs 604% slower than io

test_io.read(100): 12839.5 ms
test_codecs.read(100): 13444.2 ms
-> codecs 5% slower than io

test_io.read(-1): 2183.3 ms
test_codecs.read(-1): 1906.1 ms
-> codecs 15% FASTER than io


The readahead code really does help read(1): io is between 7 and 21
times faster than codecs. It also helps a more common use case,
readline(): io is between 1.2 and 14 times faster than codecs.

codecs is always faster (between 1.07 and 1.15 times faster than io) at
reading the whole content of a file using read(-1). Something should maybe
be optimized in TextIOWrapper.read() ;-) But the gain is minor if you
compare it to the gain on read(1) and readline()!

Please check my bench.py script and redo the benchmark on your own
computer!
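
For readers without access to the attachment, the following is a minimal
sketch of this kind of comparison. It is not the original bench.py: the
function names, the sample file and the loop counts are assumptions made
for illustration, and it only covers the readline() case.

```python
import codecs
import io
import os
import tempfile
import time


def make_sample_file(lines=1000):
    """Create a small UTF-8 text file and return its path (illustrative data)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for i in range(lines):
            f.write("line %d: caf\u00e9\n" % i)
    return path


def time_readline(open_func, filename, encoding, loops):
    """Return the seconds spent reading `filename` line by line, `loops` times."""
    start = time.perf_counter()
    for _ in range(loops):
        with open_func(filename, "r", encoding=encoding) as f:
            while f.readline():
                pass
    return time.perf_counter() - start


def percent_slower(slow, fast):
    """Express `slow` relative to `fast` as an 'N% slower' figure."""
    return (slow - fast) / fast * 100.0


def compare(filename, encoding="utf-8", loops=10):
    """Time io.open() against codecs.open() on the same file."""
    io_time = time_readline(io.open, filename, encoding, loops)
    codecs_time = time_readline(codecs.open, filename, encoding, loops)
    return io_time, codecs_time


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        io_time, codecs_time = compare(sample)
        print("io.readline(): %.1f ms" % (io_time * 1000))
        print("codecs.readline(): %.1f ms" % (codecs_time * 1000))
        print("-> codecs %.0f%% slower than io"
              % percent_slower(codecs_time, io_time))
    finally:
        os.remove(sample)
```

Note that "N% slower" corresponds to a ratio of N/100 + 1: with the
figures above, 1272.8 ms against 89.6 ms is about 1320% slower, which is
a factor of roughly 14.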

Victor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bench.py
Type: text/x-python
Size: 1867 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110525/dadd9dd4/attachment.py>

From ncoghlan at gmail.com  Wed May 25 14:44:15 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 25 May 2011 22:44:15 +1000
Subject: [Python-Dev] [Python-checkins] Daily reference leaks
 (234021dcad93): sum=61
In-Reply-To: <1306314654.6407.1.camel@marge>
References: <E1QP4Tu-0002qe-D5@ap.vmr.nerim.net>
	<BANLkTikqWv62fc-t1yfJJkcOY1vDXnWksQ@mail.gmail.com>
	<1306314654.6407.1.camel@marge>
Message-ID: <BANLkTintnRBGEyRErQ91-LYC3LOqdfvKBQ@mail.gmail.com>

On Wed, May 25, 2011 at 7:10 PM, Victor Stinner
<victor.stinner at haypocalc.com> wrote:
> On Wednesday, 25 May 2011 at 15:13 +1000, Nick Coghlan wrote:
>> On Wed, May 25, 2011 at 1:09 PM,  <solipsis at pitrou.net> wrote:
>> > results for 234021dcad93 on branch "default"
>> > --------------------------------------------
>> >
>> > test_packaging leaked [128, 128, 128] references, sum=384
>>
>> Is there a new cache in packaging that regrtest needs to know about
>> and either ignore or clear when checking reference counts?
>
> See the issue http://bugs.python.org/issue12167 : Antoine listed tests
> leaking references, and I already fixed some of them.

Thanks for the issue link. I'd seen a few of these reports go by, so
it's good to know that dealing with it is in progress.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From eric at trueblade.com  Wed May 25 15:08:30 2011
From: eric at trueblade.com (Eric Smith)
Date: Wed, 25 May 2011 09:08:30 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <20110525105852.GE10448@colossus>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org>
	<4DDBCE7C.6090200@udel.edu>	<1306253175.13660.18.camel@marge>
	<irgr9o$k1l$1@dough.gmane.org>	<20110525055926.GA21500@colossus>
	<1306316392.6407.14.camel@marge>	<20110525102012.GD10448@colossus>	<6cb8bc01c5c8812a57662243fd39af1b.squirrel@mail.trueblade.com>
	<20110525105852.GE10448@colossus>
Message-ID: <4DDCFF4E.6030809@trueblade.com>

On 05/25/2011 06:58 AM, Petri Lehtinen wrote:
> Eric Smith wrote:
>>> Victor Stinner wrote:
>>>> I already patched the doc of the random module to add a security
>>>> warning. Well, you don't really need to know how a CSPRNG is
>>>> implemented, just that random cannot be used for security and that
>>>> ssl.RAND_bytes() raises an error if it was not seeded with enough data.
>>>>
>>>> Tell me if my warning is not clear:
>>>>
>>>> .. warning::
>>>>
>>>>    The generators of the :mod:`random` module should not be used for
>>>>    security purposes, they are not cryptographic. Use ssl.RAND_bytes()
>>>>    if you require a cryptographically secure pseudorandom number
>>>>    generator.
>>>
>>> Looks good to me. Regarding style, you should probably make a link,
>>> like :func:`ssl.RAND_bytes()`.
>>
>> Does "are not cryptographic" have any meaning? (I'm not an expert, just
>> not sure). Should it not be "cryptographically secure", to match the next
>> sentence?
> 
> Or just remove ", they are not cryptographic" altogether?

Good call. That's a better change.

Eric.

From barry at python.org  Wed May 25 15:41:46 2011
From: barry at python.org (Barry Warsaw)
Date: Wed, 25 May 2011 09:41:46 -0400
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
Message-ID: <20110525094146.4941b681@neurotica.wooz.org>

On May 25, 2011, at 10:24 AM, Sandro Tosi wrote:

>before opening an issue to track the request, I'd like to ask advice
>here about this: extend os.chown() to accept even user/group names
>instead of just uid and gid.
>
>On a Unix system, you can call chown command passing either id or
>names, so it seems (to me at least) natural to expect os.chown() to
>behave similarly; but that's not the case.
>
>I can see os module wants to be a thin wrapper around OS syscalls and
>chown(2) accepts only uid/gid as input, so what would be best: extend
>os.chown() or provide a chown() function in shutil module for this
>purpose?

I think it would be a nice feature, and I can see the conflict.  OT1H you want
to keep os.chown() a thin wrapper, but OTOH you'd rather not have to add a
new, arguably more difficult to discover, function.  Given those two choices,
I still think I'd come down on adding a new function and shutil.chown() seems
an appropriate place for it.

Cheers,
-Barry

From mal at egenix.com  Wed May 25 15:43:55 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 25 May 2011 15:43:55 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <1306321851.6407.49.camel@marge>
References: <1306195729.605.27.camel@marge>
	<4DDB8591.2060308@livinglogic.de>	<A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl>	<1306234681.2619.45.camel@marge>
	<4DDB9E27.7040605@livinglogic.de>	<4DDCCE02.7060105@egenix.com>
	<1306321851.6407.49.camel@marge>
Message-ID: <4DDD079B.7090906@egenix.com>

Victor Stinner wrote:
> On Wednesday, 25 May 2011 at 11:38 +0200, M.-A. Lemburg wrote:
>> You are missing the point: we have StreamReader and StreamWriter APIs
>> on codecs to allow each codec to implement more efficient ways of
>> encoding and decoding streams.
>>
>> Examples of such optimizations are reading the stream in
>> chunks that can be decoded in one piece, or writing to the stream
>> in a way that doesn't generate encoding state problems on the
>> receiving end by ending transmission half-way through a
>> shift block.
>>
>> ...
>>
>> We don't have many such specialized implementations in the stdlib,
>> but this doesn't mean that there's no use for them. It
>> just means that developers and users are simply unaware of the
>> possibilities opened by these stateful stream APIs.
> 
> Does at least one codec implement such an optimization in its
> StreamReader or StreamWriter class? And can't we implement such
> optimizations in incremental encoders and decoders (or in TextIOWrapper)?

I don't see how, since you need control over the file API methods
in order to implement such optimizations. OTOH, adding lots of
special cases to TextIOWrapper isn't a good idea either, since these
optimizations would then only trigger for a small number of
codecs and completely leave out 3rd party codecs.

> I checked all multibyte codecs (UTF and CJK codecs) and I don't see any
> such optimization. UTF codecs handle the BOM, but don't have anything
> that looks like an optimization. CJK codecs use multibytecodec,
> MultibyteStreamReader and MultibyteStreamWriter, which don't appear to be
> optimized. But maybe I missed something?

No, you haven't missed such per-codec optimizations. The base classes
implement general purpose support for reading from streams in
chunks, but the support isn't optimized per codec.

For UTF-16 it would e.g. make sense to always read data in blocks
with even sizes, removing the trial-and-error decoding and extra
buffering currently done by the base classes. For UTF-32, the
blocks should have size % 4 == 0.

For UTF-8 (and other variable length encodings) it would make
sense looking at the end of the (bytes) data read from the
stream to see whether a complete code point was read or not,
rather than simply running the decoder on the complete data
set, only to find that a few bytes at the end are missing.
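
A rough sketch of what such a check could look like for UTF-8 follows.
The function name split_complete_utf8 is hypothetical; this only
illustrates the idea and is not code from the codecs module. It inspects
at most the last four bytes, finds the final lead byte, and holds back
the tail if that sequence is still incomplete.

```python
def split_complete_utf8(data):
    """Split `data` into (complete, remainder).

    `remainder` holds a trailing, still incomplete UTF-8 sequence (possibly
    empty) that should be kept until more bytes have been read.
    """
    for back in range(1, min(4, len(data)) + 1):
        byte = data[-back]
        if byte & 0xC0 == 0x80:
            continue  # continuation byte: keep looking for the lead byte
        if byte & 0x80 == 0x00:
            need = 1  # ASCII
        elif byte & 0xE0 == 0xC0:
            need = 2  # lead byte of a 2-byte sequence
        elif byte & 0xF0 == 0xE0:
            need = 3  # lead byte of a 3-byte sequence
        elif byte & 0xF8 == 0xF0:
            need = 4  # lead byte of a 4-byte sequence
        else:
            need = 1  # invalid lead byte: leave it for the decoder to report
        if need > back:
            return data[:-back], data[-back:]
        return data, b""
    # Empty input, or only continuation bytes in the tail.
    return data, b""


print(split_complete_utf8(b"caf\xc3"))      # the 2-byte sequence is cut off
print(split_complete_utf8(b"caf\xc3\xa9"))  # nothing is held back
```

For fixed-width encodings the equivalent step is simpler: only the first
len(data) - len(data) % unit bytes are decoded, with unit being 2 for
UTF-16 code units and 4 for UTF-32.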

For single character encodings, it would make sense to prefetch
data in big chunks and skip all the trial and error decoding
implemented by the base classes to address the above problem
with variable length encodings.

Finally, all this could be implemented in C, reducing the
Python call overhead dramatically.

> TextIOWrapper has an advanced buffering algorithm that prefetches (reads
> ahead) some bytes at each read to speed up small reads. Such an algorithm
> is difficult to implement, but it's done and it works.
> 
> --
> 
> OK, let's stop talking about theoretical optimizations, and let's do a
> benchmark to compare the codecs and io modules on reading files!

That's somewhat unfair: TextIOWrapper is implemented in C,
whereas the StreamReader/Writer subclasses used by the
codecs are written in Python.

A fair comparison would use the Python implementation of
TextIOWrapper.

-- 
Marc-Andre Lemburg
eGenix.com

From solipsis at pitrou.net  Wed May 25 15:58:57 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 25 May 2011 15:58:57 +0200
Subject: [Python-Dev] Extending os.chown() to accept user/group names
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
	<20110525094146.4941b681@neurotica.wooz.org>
Message-ID: <20110525155857.4c4e87b7@pitrou.net>

On Wed, 25 May 2011 09:41:46 -0400
Barry Warsaw <barry at python.org> wrote:

> On May 25, 2011, at 10:24 AM, Sandro Tosi wrote:
> 
> >before opening an issue to track the request, I'd like to ask advice
> >here about this: extend os.chown() to accept even user/group names
> >instead of just uid and gid.
> >
> >On a Unix system, you can call chown command passing either id or
> >names, so it seems (to me at least) natural to expect os.chown() to
> >behave similarly; but that's not the case.
> >
> >I can see os module wants to be a thin wrapper around OS syscalls and
> >chown(2) accepts only uid/gid as input, so what would be best: extend
> >os.chown() or provide a chown() function in shutil module for this
> >purpose?
> 
> I think it would be a nice feature, and I can see the conflict.  OT1H you want
> to keep os.chown() a thin wrapper, but OTOH you'd rather not have to add a
> new, arguably more difficult to discover, function.  Given those two choices,
> I still think I'd come down on adding a new function and shutil.chown() seems
> an appropriate place for it.

+1 for shutil.chown().

Regards

Antoine.



From dirkjan at ochtman.nl  Wed May 25 16:15:32 2011
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Wed, 25 May 2011 16:15:32 +0200
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <20110525094146.4941b681@neurotica.wooz.org>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
	<20110525094146.4941b681@neurotica.wooz.org>
Message-ID: <BANLkTinC6k=LH1Zs6b4dmYZZbt13XO--gw@mail.gmail.com>

On Wed, May 25, 2011 at 15:41, Barry Warsaw <barry at python.org> wrote:
> I think it would be a nice feature, and I can see the conflict.  OT1H you want
> to keep os.chown() a thin wrapper, but OTOH you'd rather not have to add a
> new, arguably more difficult to discover, function.  Given those two choices,
> I still think I'd come down on adding a new function and shutil.chown() seems
> an appropriate place for it.

Right. Please add a mention of shutil.chown() to the os.chown() docs, though.

Cheers,

Dirkjan

From barry at python.org  Wed May 25 16:18:39 2011
From: barry at python.org (Barry Warsaw)
Date: Wed, 25 May 2011 10:18:39 -0400
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <BANLkTinC6k=LH1Zs6b4dmYZZbt13XO--gw@mail.gmail.com>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
	<20110525094146.4941b681@neurotica.wooz.org>
	<BANLkTinC6k=LH1Zs6b4dmYZZbt13XO--gw@mail.gmail.com>
Message-ID: <20110525101839.6dd65d9c@neurotica.wooz.org>

On May 25, 2011, at 04:15 PM, Dirkjan Ochtman wrote:

>Right. Please add a mention of shutil.chown() to the os.chown() docs, though.

Brilliant!

-Barry

From victor.stinner at haypocalc.com  Wed May 25 17:48:11 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 17:48:11 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
 StreamWriter/StreamReader
In-Reply-To: <4DDD079B.7090906@egenix.com>
References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de>
	<A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl>
	<1306234681.2619.45.camel@marge> <4DDB9E27.7040605@livinglogic.de>
	<4DDCCE02.7060105@egenix.com> <1306321851.6407.49.camel@marge>
	<4DDD079B.7090906@egenix.com>
Message-ID: <1306338491.6407.74.camel@marge>

On Wednesday, 25 May 2011 at 15:43 +0200, M.-A. Lemburg wrote:
> For UTF-16 it would e.g. make sense to always read data in blocks
> with even sizes, removing the trial-and-error decoding and extra
> buffering currently done by the base classes. For UTF-32, the
> blocks should have size % 4 == 0.
>
> For UTF-8 (and other variable length encodings) it would make
> sense looking at the end of the (bytes) data read from the
> stream to see whether a complete code point was read or not,
> rather than simply running the decoder on the complete data
> set, only to find that a few bytes at the end are missing.

I think that the readahead algorithm is much faster than trying to
avoid partial input, and it's not a problem to have partial input if you
use an incremental decoder.
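
To illustrate the point about partial input, here is a minimal sketch (it is
not taken from bench.py; the helper name decode_in_chunks and the sample text
are made up for the illustration). It feeds UTF-8 bytes to an incremental
decoder in chunks that split multi-byte characters:

```python
import codecs

def decode_in_chunks(data, encoding, chunk_size):
    """Decode bytes chunk by chunk, as a stream reader might."""
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = []
    for start in range(0, len(data), chunk_size):
        # An incomplete trailing sequence is buffered inside the decoder
        # and completed by the next chunk.
        parts.append(decoder.decode(data[start:start + chunk_size]))
    # final=True raises an error if the input stops in the middle of a character.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)

text = "d\u00e9j\u00e0 vu \u20ac"
# chunk_size=1 splits every multi-byte character across chunks.
assert decode_in_chunks(text.encode("utf-8"), "utf-8", 1) == text
```

The caller never has to inspect the tail of a chunk to see whether a code
point is complete; the decoder keeps that state itself.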

> For single character encodings, it would make sense to prefetch
> data in big chunks and skip all the trial and error decoding
> implemented by the base classes to address the above problem
> with variable length encodings.

TextIOWrapper implements this optimization using its readahead
algorithm.

> That's somewhat unfair: TextIOWrapper is implemented in C,
> whereas the StreamReader/Writer subclasses used by the
> codecs are written in Python.
> 
> A fair comparison would use the Python implementation of
> TextIOWrapper.

Do you mean that you would like to reimplement codecs in C? It is not
relevant to compare codecs and _pyio, because codecs reuses
BufferedReader (of the io module, not of the _pyio module), and io is
the main I/O module of Python 3.

But, as you wish, here is a benchmark comparing:
    _pyio.TextIOWrapper(io.open(filename, 'rb'), encoding)
and
    codecs.open(filename, encoding)

The only change from my previous bench.py script is the test_io()
function:

def test_io(test_func, chunk_size):
    with open(FILENAME, 'rb') as buffered:
        f = _pyio.TextIOWrapper(buffered, ENCODING)
        test_file(f, test_func, chunk_size)
        f.close()


(1) Decode Objects/unicodeobject.c (317336 characters) from utf-8

test_io.readline(): 1193.4 ms
test_codecs.readline(): 1267.9 ms
-> codecs 6% slower than io

test_io.read(1): 21696.4 ms
test_codecs.read(1): 36027.2 ms
-> codecs 66% slower than io

test_io.read(100): 3080.7 ms
test_codecs.read(100): 3901.7 ms
-> codecs 27% slower than io

test_io.read(): 3991.0 ms
test_codecs.read(): 1736.9 ms
-> codecs 130% FASTER than io


(2) Decode README (6613 characters) from ascii

test_io.readline(): 678.1 ms
test_codecs.readline(): 760.5 ms
-> codecs 12% slower than io

test_io.read(1): 13533.2 ms
test_codecs.read(1): 21900.0 ms
-> codecs 62% slower than io

test_io.read(100): 2663.1 ms
test_codecs.read(100): 3270.1 ms
-> codecs 23% slower than io

test_io.read(): 6769.1 ms
test_codecs.read(): 3919.6 ms
-> codecs 73% FASTER than io


(3) Decode Lib/test/cjkencodings/gb18030.txt (501 characters) from
gb18030

test_io.readline(): 38.9 ms
test_codecs.readline(): 15.1 ms
-> codecs 157% FASTER than io

test_io.read(1): 369.8 ms
test_codecs.read(1): 302.2 ms
-> codecs 22% FASTER than io

test_io.read(100): 258.2 ms
test_codecs.read(100): 155.1 ms
-> codecs 67% FASTER than io

test_io.read(): 1803.2 ms
test_codecs.read(): 1002.9 ms
-> codecs 80% FASTER than io


_pyio.TextIOWrapper is faster than codecs.StreamReader for readline(),
read(1) and read(100), with ASCII and UTF-8. It is slower for gb18030.

As in the io vs codecs benchmark, codecs.StreamReader is always faster
than _pyio.TextIOWrapper for read().

Victor


From victor.stinner at haypocalc.com  Wed May 25 18:04:17 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 18:04:17 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
 StreamWriter/StreamReader
In-Reply-To: <1306321851.6407.49.camel@marge>
References: <1306195729.605.27.camel@marge> <4DDB8591.2060308@livinglogic.de>
	<A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl>
	<1306234681.2619.45.camel@marge> <4DDB9E27.7040605@livinglogic.de>
	<4DDCCE02.7060105@egenix.com>  <1306321851.6407.49.camel@marge>
Message-ID: <1306339457.20017.1.camel@marge>

On Wednesday, 25 May 2011 at 13:10 +0200, Victor Stinner wrote:
> codecs is always faster (between 1.07 and 1.15 times faster than io) to
> read the whole content of file using read(-1). Something should maybe be
> optimized in TextIOWrapper.read() ;-)

Oh, I understand now: maybe it's because the universal newline mode of
TextIOWrapper was enabled. If you disable it using open(..., newline='\n'),
io and codecs run at the same speed when reading the whole content of the
file (f.read()).
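
As a small sketch of what the newline argument changes (the helper read_all
is hypothetical and only wraps an in-memory buffer; it is not part of the
benchmark): with the default newline=None, TextIOWrapper translates '\r\n'
and '\r' to '\n' while reading, and with newline='\n' it passes the text
through untranslated.

```python
import io

def read_all(raw, newline):
    """Read all text from raw bytes through a TextIOWrapper."""
    with io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=newline) as f:
        return f.read()

raw = b"one\r\ntwo\rthree\n"
# Universal newlines (the default): every line ending becomes '\n'.
assert read_all(raw, None) == "one\ntwo\nthree\n"
# newline='\n': no translation is done on read.
assert read_all(raw, "\n") == "one\r\ntwo\rthree\n"
```

The translation step is extra work on every read, which is consistent with
the explanation above, though this sketch does not measure it.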

Victor


From tjreedy at udel.edu  Wed May 25 18:42:19 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 25 May 2011 12:42:19 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Issue #12049: Add
 RAND_bytes() and RAND_pseudo_bytes() functions to the ssl
In-Reply-To: <BANLkTim1ta3mRHE2io_Rc_-U1bPpTyrvOQ@mail.gmail.com>
References: <E1QOoUJ-0002QA-PE@dinsdale.python.org>
	<4DDBCE7C.6090200@udel.edu>	<1306253175.13660.18.camel@marge>
	<irgr9o$k1l$1@dough.gmane.org>
	<BANLkTim1ta3mRHE2io_Rc_-U1bPpTyrvOQ@mail.gmail.com>
Message-ID: <irjbhe$gov$1@dough.gmane.org>

On 5/25/2011 1:09 AM, Nick Coghlan wrote:

> The more important feature here is that it is exposing *OpenSSL's*
> random number generation, rather than our own.

I agree, though from a different stance, I think. The issue is whether
we should 'automatically' expose everything in a wrapped library, even
if it duplicates existing functions. I think not. But in this case, at
least one of the two functions is sufficiently different, and the newest
doc patches clarify the situation.

-- 
Terry Jan Reedy


From neologix at free.fr  Wed May 25 18:46:02 2011
From: neologix at free.fr (Charles-François Natali)
Date: Wed, 25 May 2011 18:46:02 +0200
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
Message-ID: <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>

While we're at it, adding a "recursive" argument to this shutil.chown
could also be useful.

From solipsis at pitrou.net  Wed May 25 18:55:13 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 25 May 2011 18:55:13 +0200
Subject: [Python-Dev] cpython: Fix closes issue #11109 -
 socketserver.ForkingMixIn leaves zombies, also fails
References: <E1QPGv0-0002dG-9Y@dinsdale.python.org>
Message-ID: <20110525185513.1cf2e252@pitrou.net>

On Wed, 25 May 2011 18:26:46 +0200
senthil.kumaran <python-checkins at python.org> wrote:
> 
> A new method called service_action is made available in BaseServer, called by
> serve_forever loop. This is useful in cases where Mixins can use it for cleanup
> action. ForkingMixin class uses service_action to collect the zombie child
> processes. Initial Patch by Justin Wark.

Is it reasonable, performance-wise, to do this at every iteration of
the loop (that is, at every incoming connection)?

Regards

Antoine.



From victor.stinner at haypocalc.com  Wed May 25 19:17:41 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 25 May 2011 19:17:41 +0200
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
	<BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>
Message-ID: <1306343861.20117.4.camel@marge>

On Wednesday, 25 May 2011 at 18:46 +0200, Charles-François Natali wrote:
> While we're at it, adding a "recursive" argument to this shutil.chown
> could also be useful.

I don't like the idea of a recursive flag. I would prefer a "map-like"
function to "apply" a function on all files of a directory. Something
like shutil.apply_recursive(shutil.chown)...

... maybe with options to choose between depth-first search and
breadth-first search, filter (filenames, file size, files only,
directories only, other attributes?), directory before files (may be
needed for chmod(0o000)), etc.

Victor


From eric at trueblade.com  Wed May 25 19:37:26 2011
From: eric at trueblade.com (Eric Smith)
Date: Wed, 25 May 2011 13:37:26 -0400
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <1306343861.20117.4.camel@marge>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>	<BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>
	<1306343861.20117.4.camel@marge>
Message-ID: <4DDD3E56.3050605@trueblade.com>

On 5/25/2011 1:17 PM, Victor Stinner wrote:
> On Wednesday, 25 May 2011 at 18:46 +0200, Charles-François Natali wrote:
>> While we're at it, adding a "recursive" argument to this shutil.chown
>> could also be useful.
> 
> I don't like the idea of a recursive flag. I would prefer a "map-like"
> function to "apply" a function on all files of a directory. Something
> like shutil.apply_recursive(shutil.chown)...
> 
> ... maybe with options to choose between depth-first search and
> breadth-first search, filter (filenames, file size, files only,
> directories only, other attributes?), directory before files (may be
> needed for chmod(0o000)), etc.

You can do all of this with an appropriate application of os.walk().
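
A minimal sketch of that idea, assuming a helper named apply_recursive as in
the message quoted above (the name and the topdown parameter are illustrative
only; no such function exists in shutil):

```python
import os

def apply_recursive(func, top, topdown=True):
    """Call func(path) on top and on everything below it."""
    if topdown:
        func(top)
    # os.walk() does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(top, topdown=topdown):
        for name in dirnames + filenames:
            func(os.path.join(dirpath, name))
    if not topdown:
        func(top)
```

With topdown=True a directory is passed to func before its contents; with
topdown=False the contents come first. Filtering can be done inside func, for
example apply_recursive(lambda p: os.path.isfile(p) and print(p), top).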

Eric.

From petri at digip.org  Wed May 25 20:03:39 2011
From: petri at digip.org (Petri Lehtinen)
Date: Wed, 25 May 2011 21:03:39 +0300
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <1306343861.20117.4.camel@marge>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
	<BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>
	<1306343861.20117.4.camel@marge>
Message-ID: <20110525180338.GA1718@ihaa>

Victor Stinner wrote:
> On Wednesday, 25 May 2011 at 18:46 +0200, Charles-François Natali wrote:
> > While we're at it, adding a "recursive" argument to this shutil.chown
> > could also be useful.
> 
> I don't like the idea of a recursive flag. I would prefer a "map-like"
> function to "apply" a function on all files of a directory. Something
> like shutil.apply_recursive(shutil.chown)...

FWIW, the chown program (in GNU coreutils at least) has a -R flag for
recursive operation, and I've found it *extremely* useful in many
situations.

Petri

From fuzzyman at voidspace.org.uk  Wed May 25 20:41:26 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 25 May 2011 19:41:26 +0100
Subject: [Python-Dev] Python 3.3 release schedule posted
In-Reply-To: <AANLkTinbgpQrPBU64OY_vD7QzmV5HGznr_uXNEEhMkcY@mail.gmail.com>
References: <imdj8n$dq0$1@dough.gmane.org>	<AANLkTi=9bedAp40CPHQG-fkPTHwqrkzJ6q9Dr6X7p_f7@mail.gmail.com>	<AANLkTikt+PUE41h576o5oo+foT5o61RW=p=oEBF5qkzC@mail.gmail.com>
	<AANLkTinbgpQrPBU64OY_vD7QzmV5HGznr_uXNEEhMkcY@mail.gmail.com>
Message-ID: <4DDD4D56.9020301@voidspace.org.uk>

On 26/03/2011 00:33, Laurens Van Houtven wrote:
> On Thu, Mar 24, 2011 at 12:18 AM, Thomas Wouters <thomas at python.org> wrote:
>
>     It ended up that Jim Fulton is actually writing the PEP (with
>     input from Twisted people and others.)
>
>     -- 
>     Thomas Wouters <thomas at python.org>
>
>     Hi! I'm a .signature virus! copy me into your .signature file to
>     help me spread!
>
>
> Well, if help is still needed I'll gladly chip in. It's not  that I'm 
> not interested in doing it -- it's just that I don't know who's 
> supposed to or who's working on it :)
>

Hey lvh,

It's worth following this up. If Jim Fulton hasn't had time to move this 
forward and you have the bandwidth to work on it then it would be great 
to see some action.

All the best,

Michael Foord

> -- 
> cheers
> lvh
>


-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From neologix at free.fr  Wed May 25 20:45:24 2011
From: neologix at free.fr (Charles-François Natali)
Date: Wed, 25 May 2011 20:45:24 +0200
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <1306343861.20117.4.camel@marge>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
	<BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>
	<1306343861.20117.4.camel@marge>
Message-ID: <BANLkTikHWFa+0K7kbRbk0suXDM=cELke-A@mail.gmail.com>

>> While we're at it, adding a "recursive" argument to this shutil.chown
>> could also be useful.
>
> I don't like the idea of a recursive flag. I would prefer a "map-like"
> function to "apply" a function on all files of a directory. Something
> like shutil.apply_recursive(shutil.chown)...
>

I was also thinking about this possibility.
The advantage is that we could factor out the recursive walk logic to
make it available for other functions (chown, chmod...).
It doesn't map well to the Unix command, though.

> You can do all of this with an appropriate application of os.walk().

Then, I wonder why shutil.copytree and shutil.rmtree are provided.
Recursive rm/copy/chown/chmod are extremely useful in system
administration scripts. Furthermore, because of symlinks, it's not as
simple as it seems: see http://bugs.python.org/issue4489 for an example.

From neologix at free.fr  Wed May 25 23:00:51 2011
From: neologix at free.fr (Charles-François Natali)
Date: Wed, 25 May 2011 23:00:51 +0200
Subject: [Python-Dev] cpython: Fix closes issue #11109 -
 socketserver.ForkingMixIn leaves zombies, also fails
In-Reply-To: <20110525185513.1cf2e252@pitrou.net>
References: <E1QPGv0-0002dG-9Y@dinsdale.python.org>
	<20110525185513.1cf2e252@pitrou.net>
Message-ID: <BANLkTimOV+gPXkoyxGUSSH=wW_CzrJgXDw@mail.gmail.com>

>> A new method called service_action is made available in BaseServer, called by
>> serve_forever loop. This is useful in cases where Mixins can use it for cleanup
>> action. ForkingMixin class uses service_action to collect the zombie child
>> processes. Initial Patch by Justin Wark.
>
> Is it reasonable, performance-wise, to do this at every iteration of
> the loop (that is, at every incoming connection)?
>

I haven't measured it, but it's O(N) where N is the number of children.
It should be possible to optimize this by putting all the children in
a process group (the other advantage is that we wouldn't wait()
children not spawned by socketserver).
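
A rough sketch of the process-group idea, assuming a POSIX system and that
each child has been placed in one group with os.setpgid() (the helper
reap_group is hypothetical and is not the socketserver code):

```python
import os

def reap_group(pgid):
    """Collect exited children in process group pgid without blocking.

    pgid must be greater than 1: os.waitpid(-1, ...) would mean "any child".
    """
    reaped = []
    while True:
        try:
            # A negative pid means "any child whose process group ID is pgid".
            pid, status = os.waitpid(-pgid, os.WNOHANG)
        except ChildProcessError:
            break  # no children (left) in that group
        if pid == 0:
            break  # children remain, but none has exited yet
        reaped.append(pid)
    return reaped
```

Each call costs one waitpid() per exited child plus one more, rather than one
per active child, and children belonging to other groups are never collected.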

cf

From lac at openend.se  Wed May 25 23:41:53 2011
From: lac at openend.se (Laura Creighton)
Date: Wed, 25 May 2011 23:41:53 +0200
Subject: [Python-Dev] multibytecodex
Message-ID: <201105252141.p4PLfr6T025372@theraft.openend.se>


This just in from pypy-dev.  I am reposting it here because I
am fairly certain that nobody on the pypy-dev mailing list
uses multibytecodec, but there has got to be at least one
person here who does.

Please reply to the pypy-dev article, not here, or mail to pypy-dev at python.org
if you are not on the pypy-dev mailing list (but have delivery turned off
as many of you do.)

Thank you,
Laura

------- Forwarded Message

From: Armin Rigo <arigo at tunes.org>
Date: Wed, 25 May 2011 21:39:35 +0200
To: pypy-dev at python.org
Subject: [pypy-dev] multibytecodec: missing features


Hi all,

Here are the missing features in multibytecodec:

* support for ``errors != "strict"''.

* classes MultibyteIncrementalEncoder, MultibyteIncrementalDecoder,
MultibyteStreamReader and MultibyteStreamWriter.

One reason I didn't implement the classes yet is that I couldn't
understand two points in how they are supposed to work.  But it seems
that there are really two bugs, as I've been pointed to:
http://bugs.python.org/issue12100 and
http://bugs.python.org/issue12171 .  So the question is if we should
be bug-compatible with Python 2.7 or if we should instead implement
some fixed version.

I suppose I'm rather for the fixed version, but I'd like to hear some
feedback from people that actually use multibytecodecs.  Also, I
wouldn't mind if someone would pick up the work and just do it, either
the classes or ``errors != "strict"'' :-)


See you soon,

Armin.

------- End of Forwarded Message


From victor.stinner at haypocalc.com  Thu May 26 00:13:42 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 26 May 2011 00:13:42 +0200
Subject: [Python-Dev] multibytecodex
In-Reply-To: <201105252141.p4PLfr6T025372@theraft.openend.se>
References: <201105252141.p4PLfr6T025372@theraft.openend.se>
Message-ID: <1306361622.24449.14.camel@marge>

On Wednesday, 25 May 2011 at 23:41 +0200, Laura Creighton wrote:
> One reason I didn't implement the classes yet is that I couldn't
> understand two points in how they are supposed to work.  But it seems
> that there are really two bugs, as I've been pointed to:
> http://bugs.python.org/issue12100 and
> http://bugs.python.org/issue12171 .  So the question is if we should
> be bug-compatible with Python 2.7 or if we should instead implement
> some fixed version.

I fixed #12100 in Python 2.7, 3.1, 3.2, 3.3 yesterday.

I also plan to fix #12171 in these four versions; it should be done in the
next few days.

> I suppose I'm rather for the fixed version, but I'd like to hear some
> feedback from people that actually use multibytecodecs.

Both bugs are related to encoders. I don't think that anyone is using
Python CJK codecs to encode text (because nobody noticed these bugs
before); they are more likely used to decode text.

Anyway, you should implement a codec without these *bugs*.

For your information, I added more tests to the CJK codecs (e.g. see
#12057), and I plan to add more tests in the coming weeks. I also plan to
fix issue #12016, yet another CJK codec bug. You may want to wait until all
of these bugs are fixed before working on your own implementation, or
directly implement a version without all of these bugs, and then upgrade
the test suite.

> Also, I wouldn't mind if someone would pick up the work and just do it,
> either the classes or ``errors != "strict"'' :-)

Support for error handlers other than strict is far from
perfect. Issue #12016 is the main problem, but there are other minor
issues.

In some cases, invalid byte sequences are ignored even with the replace
error handler (whereas I expected U+FFFD characters). CJK codecs are
special because they use escape sequences (especially the ISO 2022
family): what should be done if a byte sequence looks like an escape
sequence but is not valid? Replace each byte with U+FFFD, or ignore
these bytes?

I'm trying to write tests "describing" the current behaviour, and then I
will maybe try to improve how invalid byte sequences are handled.
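
For reference, here is a small sketch of the two handlers on a codec where
the outcome is not in question. UTF-8 is used on purpose, because how the CJK
codecs treat invalid input is exactly what is uncertain here; the helper name
decode_replace is made up for the illustration.

```python
def decode_replace(data, encoding):
    """Decode data, substituting U+FFFD for undecodable input."""
    return data.decode(encoding, errors="replace")

# 0xFF can never appear in UTF-8, so it is replaced rather than dropped.
assert decode_replace(b"ab\xffcd", "utf-8") == "ab\ufffdcd"
# With errors="ignore" the invalid byte disappears silently instead.
assert b"ab\xffcd".decode("utf-8", errors="ignore") == "abcd"
```

A test "describing" current behaviour for a CJK codec could take the same
shape, with the expected string recorded from what the codec returns today.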

Victor


From timothy.c.delaney at gmail.com  Thu May 26 00:10:29 2011
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Thu, 26 May 2011 08:10:29 +1000
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <1306343861.20117.4.camel@marge>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
	<BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>
	<1306343861.20117.4.camel@marge>
Message-ID: <BANLkTin_3C69T5OHP5+Ngti-NAJ3E85WjA@mail.gmail.com>

2011/5/26 Victor Stinner <victor.stinner at haypocalc.com>

> On Wednesday, 25 May 2011 at 18:46 +0200, Charles-François Natali wrote:
> > While we're at it, adding a "recursive" argument to this shutil.chown
> > could also be useful.
>
> I don't like the idea of a recursive flag. I would prefer a "map-like"
> function to "apply" a function on all files of a directory. Something
> like shutil.apply_recursive(shutil.chown)...
>
> ... maybe with options to choose between depth-first search and
> breadth-first search, filter (filenames, file size, files only,
> directories only, other attributes?), directory before files (may be
> needed for chmod(0o000)), etc.


Pass an iterable to shutil.chown()? Then you could call it like:

shutil.chown(os.walk(path))

Then of course you have the difficulty of wanting to pass either an iterator
or a single path - probably prefer two functions e.g.:

shutil.chown(path)
shutil.chown_many(iter)
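
A sketch of how the two-function variant might look, assuming a POSIX system
(chown_many and tree_paths are hypothetical names, and os.chown with numeric
IDs stands in for the proposed shutil.chown). Note that os.walk() yields
(dirpath, dirnames, filenames) triples rather than paths, so its output has
to be flattened before it can be passed as an iterable of paths:

```python
import os

def chown_many(paths, uid=-1, gid=-1):
    """Call os.chown() on each path in an iterable."""
    changed = []
    for path in paths:
        os.chown(path, uid, gid)  # an ID of -1 is left unchanged
        changed.append(path)
    return changed

def tree_paths(top):
    """Flatten os.walk()'s triples into top plus every path below it."""
    yield top
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            yield os.path.join(dirpath, name)
```

It could then be called as chown_many(tree_paths(path), uid, gid), while the
single-path function stays as simple as before.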

Tim Delaney

From senthil at uthcode.com  Thu May 26 07:10:55 2011
From: senthil at uthcode.com (Senthil Kumaran)
Date: Thu, 26 May 2011 13:10:55 +0800
Subject: [Python-Dev] cpython: Fix closes issue #11109 -
 socketserver.ForkingMixIn leaves zombies, also fails
In-Reply-To: <BANLkTimOV+gPXkoyxGUSSH=wW_CzrJgXDw@mail.gmail.com>
References: <E1QPGv0-0002dG-9Y@dinsdale.python.org>
	<20110525185513.1cf2e252@pitrou.net>
	<BANLkTimOV+gPXkoyxGUSSH=wW_CzrJgXDw@mail.gmail.com>
Message-ID: <20110526051055.GB2736@kevin>


Antoine Pitrou wrote:
> >> A new method called service_action is made available in BaseServer, called by
> >> serve_forever loop. This is useful in cases where Mixins can use it for cleanup
> >> action. ForkingMixin class uses service_action to collect the zombie child
> >> processes. Initial Patch by Justin Wark.
> >
> > Is it reasonable, performance-wise, to do this at every iteration of
> > the loop (that is, at every incoming connection)?

If not here, the call was being done at the process_request level when
creating a new child process and the wait would have been there. I am
not sure how much of a performance difference (lag) this aggressive
collection can bring.

Charles-François Natali wrote:

> I haven't measured it, but it's O(N) where N is the number of children.
> It should be possible to optimize this by putting all the children in
> a process group (the other advantage is that we wouldn't wait()
> children not spawned by socketserver).

+1. This is definitely a good idea. The change needs to be done in the
collect_children routine, which tries to wait for all children to
finish instead of just the ones forked by the socketserver.
I shall raise a ticket for this.

-- 
Senthil

Although the moon is smaller than the earth, it is farther away.

From ncoghlan at gmail.com  Thu May 26 07:58:16 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 26 May 2011 15:58:16 +1000
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <BANLkTikHWFa+0K7kbRbk0suXDM=cELke-A@mail.gmail.com>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
	<BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>
	<1306343861.20117.4.camel@marge>
	<BANLkTikHWFa+0K7kbRbk0suXDM=cELke-A@mail.gmail.com>
Message-ID: <BANLkTimTMVTgW6f1LnSgFq3a1ugqZywJhA@mail.gmail.com>

2011/5/26 Charles-François Natali <neologix at free.fr>:
> Then, I wonder why shutil.copytree and shutil.rmtree are provided.
> Recursive rm/copy/chown/chmod are extremely useful in system
> administration scripts. Furthermore, it's not as simple as it seems
> because of symlinks, see for example http://bugs.python.org/issue4489

Rather than a fixed binary flag, I would suggest following the
precedent of copytree and rmtree, and provide recursive functionality
as a separate shutil function (i.e. shutil.chmodtree,
shutil.chowntree).

As noted, while these *can* be written manually, it is convenient to
have the logic for handling symlinks dealt with for you, as well as
not having to look up the particular incantation for correctly linking
os.walk and the relevant operations.
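
To make the symlink point concrete, here is a minimal sketch of what such a
function might look like. The name chowntree comes from the suggestion above;
the signature is an assumption, and the follow_symlinks argument of os.chown
requires Python 3.3 or later on a platform that supports it.

```python
import os

def chowntree(top, uid=-1, gid=-1):
    """Recursively chown top without following symlinks (sketch only)."""
    # follow_symlinks=False changes the link itself, like lchown(), so a link
    # pointing outside the tree cannot redirect the ownership change.
    os.chown(top, uid, gid, follow_symlinks=False)
    # followlinks=False (the default) keeps os.walk() from descending into
    # directories that are reached through symlinks.
    for dirpath, dirnames, filenames in os.walk(top, followlinks=False):
        for name in dirnames + filenames:
            os.chown(os.path.join(dirpath, name), uid, gid,
                     follow_symlinks=False)
```

Whether links should be followed, skipped, or changed in place is a policy
decision; the sketch picks one option and a real API would probably expose it
as a parameter.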

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From petri at digip.org  Thu May 26 08:09:04 2011
From: petri at digip.org (Petri Lehtinen)
Date: Thu, 26 May 2011 09:09:04 +0300
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <BANLkTimTMVTgW6f1LnSgFq3a1ugqZywJhA@mail.gmail.com>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
	<BANLkTi=8K1=ZQfxAD0nK0wAaCy33FRmEEw@mail.gmail.com>
	<1306343861.20117.4.camel@marge>
	<BANLkTikHWFa+0K7kbRbk0suXDM=cELke-A@mail.gmail.com>
	<BANLkTimTMVTgW6f1LnSgFq3a1ugqZywJhA@mail.gmail.com>
Message-ID: <20110526060903.GB7580@colossus>

Nick Coghlan wrote:
> 2011/5/26 Charles-François Natali <neologix at free.fr>:
> > Then, I wonder why shutil.copytree and shutil.rmtree are provided.
> > Recursive rm/copy/chown/chmod are extremely useful in system
> > administration scripts. Furthermore, it's not as simple as it seems
> > because of symlinks, see for example http://bugs.python.org/issue4489
> 
> Rather than a fixed binary flag, I would suggest following the
> precedent of copytree and rmtree, and provide recursive functionality
> as a separate shutil function (i.e. shutil.chmodtree,
> shutil.chowntree).

+1

> As noted, while these *can* be written manually, it is convenient to
> have the logic for handling symlinks dealt with for you, as well as
> not having to look up the particular incantation for correctly linking
> os.walk and the relevant operations.

This is exactly what I meant when saying that the -R option to chown
and chmod shell commands is useful. I *could* do it without them, but
writing the same logic every time with error handling would be
cumbersome.

Petri

From victor.stinner at haypocalc.com  Thu May 26 14:32:51 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 26 May 2011 14:32:51 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at
 the end of functions
In-Reply-To: <4DDE43DC.6020103@trueblade.com>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>
	<4DDE43DC.6020103@trueblade.com>
Message-ID: <1306413171.14987.3.camel@marge>

On Thursday, 26 May 2011 at 08:13 -0400, Eric Smith wrote:
> If you're ever going to add code at the end of these functions, it's
> unlikely you'll remember that you need to add these increments back in.

You don't have to remember. Test the result of the function, it will not
give the expected output. I don't think that you need fuzzing or a
complex tool to detect that the new code doesn't behave correctly.

> It's a bug waiting to happen

What? It's not a bug. Adding new non-tested code is a bug :-)

> I don't see any harm leaving them in.
> Maybe we should add a comment about why they're done.

It makes Python faster (!) and silences the Clang Static Analyzer :-)

Victor



From eric at trueblade.com  Thu May 26 16:10:26 2011
From: eric at trueblade.com (Eric Smith)
Date: Thu, 26 May 2011 10:10:26 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at
 the end of functions
In-Reply-To: <1306413171.14987.3.camel@marge>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>	<4DDE43DC.6020103@trueblade.com>
	<1306413171.14987.3.camel@marge>
Message-ID: <4DDE5F52.7030303@trueblade.com>

On 5/26/2011 8:32 AM, Victor Stinner wrote:
> On Thursday, 26 May 2011 at 08:13 -0400, Eric Smith wrote:
>> If you're ever going to add code at the end of these functions, it's
>> unlikely you'll remember that you need to add these increments back in.
> 
> You don't have to remember. Test the result of the function, it will not
> give the expected output. I don't think that you need fuzzing or a
> complex tool to detect that the new code doesn't behave correctly.
> 
>> It's a bug waiting to happen
> 
> What? It's not a bug. Adding new non-tested code is a bug :-)

True. But assuming all code additions will have 100% branch coverage in
the C code is foolish.

>> I don't see any harm leaving them in.
>> Maybe we should add a comment about why they're done.
> 
> It makes Python faster (!) 

I doubt that.

> and make silent the Clang Static Analyzer :-)

I care less about that than maintainability and future-proofing.

From benjamin at python.org  Thu May 26 16:50:03 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 26 May 2011 09:50:03 -0500
Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at
 the end of functions
In-Reply-To: <1306413171.14987.3.camel@marge>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>
	<4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge>
Message-ID: <BANLkTikOB+oE8+F=EhUrYqd+PpWAyW9yXw@mail.gmail.com>

2011/5/26 Victor Stinner <victor.stinner at haypocalc.com>:
> On Thursday, 26 May 2011 at 08:13 -0400, Eric Smith wrote:
>> If you're ever going to add code at the end of these functions, it's
>> unlikely you'll remember that you need to add these increments back in.
>
> You don't have to remember. Test the result of the function, it will not
> give the expected output. I don't think that you need fuzzing or a
> complex tool to detect that the new code doesn't behave correctly.
>
>> It's a bug waiting to happen
>
> What? It's not a bug. Adding new non-tested code is a bug :-)
>
>> I don't see any harm leaving them in.
>> Maybe we should add a comment about why they're done.
>
> It makes Python faster (!) and make silent the Clang Static Analyzer :-)

Surely, GCC can optimize that out.


-- 
Regards,
Benjamin

From eric at trueblade.com  Thu May 26 17:26:20 2011
From: eric at trueblade.com (Eric Smith)
Date: Thu, 26 May 2011 11:26:20 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at
 the end of functions
In-Reply-To: <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>
	<4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge>
	<4DDE5F52.7030303@trueblade.com>
	<08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com>
Message-ID: <4DDE711C.1040209@trueblade.com>

On 5/26/2011 10:34 AM, Ronald Oussoren wrote:
> 
> On 26 May, 2011, at 16:10, Eric Smith wrote:
>>
>>
>>> and make silent the Clang Static Analyzer :-)
>>
>> I care less about that than maintainability and future-proofing.
> 
> Have you looked at the patch? The patch and resulting code look sane to me, and if anything, most of the updated segments look cleaner after the patch.

I have looked at it. I think the code was better before the patch. If I
were looking at this, I'd have to wonder why the pointer was incremented
everywhere else, but not here. This is especially true when the changed
code isn't particularly near the end of the function.

From ronaldoussoren at mac.com  Thu May 26 16:34:46 2011
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 26 May 2011 16:34:46 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at
 the end of functions
In-Reply-To: <4DDE5F52.7030303@trueblade.com>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>
	<4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge>
	<4DDE5F52.7030303@trueblade.com>
Message-ID: <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com>


On 26 May, 2011, at 16:10, Eric Smith wrote:
> 
> 
>> and make silent the Clang Static Analyzer :-)
> 
> I care less about that than maintainability and future-proofing.

Have you looked at the patch? The patch and resulting code look sane to me, and if anything, most of the updated segments look cleaner after the patch.

Ronald

From tjreedy at udel.edu  Thu May 26 19:59:51 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 26 May 2011 13:59:51 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at
	the end of functions
In-Reply-To: <08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>	<4DDE43DC.6020103@trueblade.com>
	<1306413171.14987.3.camel@marge>	<4DDE5F52.7030303@trueblade.com>
	<08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com>
Message-ID: <irm4eo$7id$1@dough.gmane.org>

On 5/26/2011 10:34 AM, Ronald Oussoren wrote:
>
> On 26 May, 2011, at 16:10, Eric Smith wrote:
>>
>>
>>> and make silent the Clang Static Analyzer :-)
>>
>> I care less about that than maintainability and future-proofing.


> Have you looked at the patch? The patch and resulting code look sane to me, and if anything, most of the updated segments look cleaner after the patch.

Let's assume that the function currently does what it is supposed to do,
as verified by tests. Then adding an unneeded increment in case the
function is redefined in the future so that it needs more code strikes
me as YAGNI. Certainly, reading it today with an unused increment
suggests to me that something is missing that would use the incremented
value. This strikes me as different from adding a comma at the end of a
Python sequence display.

-- 
Terry Jan Reedy


From guido at python.org  Thu May 26 20:08:06 2011
From: guido at python.org (Guido van Rossum)
Date: Thu, 26 May 2011 11:08:06 -0700
Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at
 the end of functions
In-Reply-To: <irm4eo$7id$1@dough.gmane.org>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>
	<4DDE43DC.6020103@trueblade.com>
	<1306413171.14987.3.camel@marge> <4DDE5F52.7030303@trueblade.com>
	<08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com>
	<irm4eo$7id$1@dough.gmane.org>
Message-ID: <BANLkTim8h0kL2zvMfa_AXQ6=ZTc61tmn9A@mail.gmail.com>

On Thu, May 26, 2011 at 10:59 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 5/26/2011 10:34 AM, Ronald Oussoren wrote:
>>
>> On 26 May, 2011, at 16:10, Eric Smith wrote:
>>>
>>>
>>>> and make silent the Clang Static Analyzer :-)
>>>
>>> I care less about that than maintainability and future-proofing.
>
>
>> Have you looked at the patch? The patch and resulting code look sane to me,
>> and if anything, most of the updated segments look cleaner after the
>> patch.
>
> Let's assume that the function currently does what it is supposed to do, as
> verified by tests. Then adding an unneeded increment in case the function is
> redefined in the future so that it needs more code strikes me as YAGNI.
> Certainly, reading it today with an unused increment suggests to me that
> something is missing that would use the incremented value. This strikes me as
> different from adding a comma at the end of a Python sequence display.

Sorry to butt in here, but I agree with Eric that it was better
before. There is a common idiom, *pointer++ = <something>, and
whenever you see that you know that you are appending something to an
output buffer. Perhaps the most important idea here is that this
maintains the *invariant* "pointer points just after the last thing in
the buffer". Always maintaining the invariant is better than trying to
micro-optimize things so as to avoid updating dead values. The
compiler is better at that.

-- 
--Guido van Rossum (python.org/~guido)

From alexander.belopolsky at gmail.com  Thu May 26 20:14:42 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 26 May 2011 14:14:42 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at
 the end of functions
In-Reply-To: <4DDE711C.1040209@trueblade.com>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>
	<4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge>
	<4DDE5F52.7030303@trueblade.com>
	<08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com>
	<4DDE711C.1040209@trueblade.com>
Message-ID: <BANLkTin7yLeSeyPxVOv+hBsXtsyT=JNH_w@mail.gmail.com>

On Thu, May 26, 2011 at 11:26 AM, Eric Smith <eric at trueblade.com> wrote:
..
>> Have you looked at the patch? The patch and resulting code look sane to me, and
>> if anything, most of the updated segments look cleaner after the patch.
>
> I have looked at it. I think the code was better before the patch. If I
> were looking at this, I'd have to wonder why the pointer was incremented
> everywhere else, but not here. This is especially true when the changed
> code isn't particularly near the end of the function.

+1

To me, *p++ = c is an idiomatic way to fill the buffer.  I prefer to
think of p as the state of the stream for which adding a character is
impossible without advancing the state.  Seeing *p = c will definitely
make me pause and think whether or not it is a bug.

From tjreedy at udel.edu  Thu May 26 20:22:11 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 26 May 2011 14:22:11 -0400
Subject: [Python-Dev] cpython: Avoid useless "++" at the end of functions
In-Reply-To: <BANLkTim8h0kL2zvMfa_AXQ6=ZTc61tmn9A@mail.gmail.com>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>
	<4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge>
	<4DDE5F52.7030303@trueblade.com>
	<08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com>
	<irm4eo$7id$1@dough.gmane.org>
	<BANLkTim8h0kL2zvMfa_AXQ6=ZTc61tmn9A@mail.gmail.com>
Message-ID: <4DDE9A53.9060203@udel.edu>

On 5/26/2011 2:08 PM, Guido van Rossum wrote:

> Sorry to butt in here, but I agree with Eric that it was better
> before. There is a common idiom, *pointer++ =<something>, and
> whenever you see that you know that you are appending something to an
> output buffer. Perhaps the most important idea here is that this
> maintains the *invariant* "pointer points just after the last thing in
> the buffer". Always maintaining the invariant is better than trying to
> micro-optimize things so as to avoid updating dead values. The
> compiler is better at that.

This explanation makes sense (more than Eric's version of perhaps the 
same thing ;-).

http://bugs.python.org/issue12188
"A condensed version of the above added to PEP 7 would help new 
developers see the usage as local idiom rather than style bug."

Terry J. Reedy

From benjamin at python.org  Thu May 26 20:26:43 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 26 May 2011 13:26:43 -0500
Subject: [Python-Dev] cpython: Avoid useless "++" at the end of functions
In-Reply-To: <4DDE9A53.9060203@udel.edu>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>
	<4DDE43DC.6020103@trueblade.com> <1306413171.14987.3.camel@marge>
	<4DDE5F52.7030303@trueblade.com>
	<08A5EAF7-2AAD-409D-B5F7-3AF068F9241B@mac.com>
	<irm4eo$7id$1@dough.gmane.org>
	<BANLkTim8h0kL2zvMfa_AXQ6=ZTc61tmn9A@mail.gmail.com>
	<4DDE9A53.9060203@udel.edu>
Message-ID: <BANLkTimB1u6BLhw-3Bb7kMxEVz+th96jig@mail.gmail.com>

2011/5/26 Terry Reedy <tjreedy at udel.edu>:
> On 5/26/2011 2:08 PM, Guido van Rossum wrote:
>
>> Sorry to butt in here, but I agree with Eric that it was better
>> before. There is a common idiom, *pointer++ =<something>, and
>> whenever you see that you know that you are appending something to an
>> output buffer. Perhaps the most important idea here is that this
>> maintains the *invariant* "pointer points just after the last thing in
>> the buffer". Always maintaining the invariant is better than trying to
>> micro-optimize things so as to avoid updating dead values. The
>> compiler is better at that.
>
> This explanation makes sense (more than Eric's version of perhaps the same
> thing ;-).
>
> http://bugs.python.org/issue12188
> "A condensed version of the above added to PEP 7 would help new developers
> see the usage as local idiom rather than style bug."

I think a more general formulation would be: "Idiomatic code is more
important than making static analyzers happy."



-- 
Regards,
Benjamin

From sandro.tosi at gmail.com  Thu May 26 22:15:57 2011
From: sandro.tosi at gmail.com (Sandro Tosi)
Date: Thu, 26 May 2011 22:15:57 +0200
Subject: [Python-Dev] Extending os.chown() to accept user/group names
In-Reply-To: <20110525155857.4c4e87b7@pitrou.net>
References: <BANLkTikXottB8xfXbe1y48P-GWVxhKn2=Q@mail.gmail.com>
	<20110525094146.4941b681@neurotica.wooz.org>
	<20110525155857.4c4e87b7@pitrou.net>
Message-ID: <BANLkTi=PPYp4H7Zf2dnh_mj4TFxo_XscbQ@mail.gmail.com>

On Wed, May 25, 2011 at 15:58, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Wed, 25 May 2011 09:41:46 -0400
> Barry Warsaw <barry at python.org> wrote:
>> I think it would be a nice feature, and I can see the conflict.  OT1H you want
>> to keep os.chown() a thin wrapper, but OTOH you'd rather not have to add a
>> new, arguably more difficult to discover, function.  Given those two choices,
>> I still think I'd come down on adding a new function and shutil.chown() seems
>> an appropriate place for it.
>
> +1 for shutil.chown().

and so shutil.chown() be it: http://bugs.python.org/issue12191

Currently, only the function for a single file is implemented; let's
look later at what to do for a recursive one.
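
To make the idea concrete, here is a rough sketch of what a name-aware
wrapper around os.chown() could look like. It is not the code attached to
the issue: resolve_owner() and chown_by_name() are hypothetical names, and
the sketch assumes a Unix system where the pwd and grp modules are
available.

```python
import grp
import os
import pwd


def resolve_owner(user=None, group=None):
    """Return a (uid, gid) pair suitable for os.chown().

    Integers pass through unchanged, strings are looked up in the
    password/group databases, and None becomes -1 ("leave unchanged").
    """
    if user is None:
        uid = -1
    elif isinstance(user, int):
        uid = user
    else:
        uid = pwd.getpwnam(user).pw_uid  # KeyError for an unknown user name

    if group is None:
        gid = -1
    elif isinstance(group, int):
        gid = group
    else:
        gid = grp.getgrnam(group).gr_gid  # KeyError for an unknown group name

    return uid, gid


def chown_by_name(path, user=None, group=None):
    """Hypothetical helper: like os.chown(), but also accepts names."""
    uid, gid = resolve_owner(user, group)
    os.chown(path, uid, gid)
```

Keeping the lookup separate from the os.chown() call leaves os.chown()
itself a thin wrapper, which is the trade-off discussed above.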

Cheers,
-- 
Sandro Tosi (aka morph, morpheus, matrixhasu)
My website: http://matrixhasu.altervista.org/
Me at Debian: http://wiki.debian.org/SandroTosi

From mal at egenix.com  Fri May 27 10:17:29 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 27 May 2011 10:17:29 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <1306338491.6407.74.camel@marge>
References: <1306195729.605.27.camel@marge>
	<4DDB8591.2060308@livinglogic.de>	<A5309F8C-375D-4ED9-A325-8172B9E852B3@langa.pl>	<1306234681.2619.45.camel@marge>
	<4DDB9E27.7040605@livinglogic.de>	<4DDCCE02.7060105@egenix.com>
	<1306321851.6407.49.camel@marge>	<4DDD079B.7090906@egenix.com>
	<1306338491.6407.74.camel@marge>
Message-ID: <4DDF5E19.3080701@egenix.com>

Victor Stinner wrote:
> On Wednesday 25 May 2011 at 15:43 +0200, M.-A. Lemburg wrote:
>> For UTF-16 it would e.g. make sense to always read data in blocks
>> with even sizes, removing the trial-and-error decoding and extra
>> buffering currently done by the base classes. For UTF-32, the
>> blocks should have size % 4 == 0.
>>
>> For UTF-8 (and other variable length encodings) it would make
>> sense looking at the end of the (bytes) data read from the
>> stream to see whether a complete code point was read or not,
>> rather than simply running the decoder on the complete data
>> set, only to find that a few bytes at the end are missing.
> 
> I think that the readahead algorithm is much faster than trying to
> avoid partial input, and it's not a problem to have partial input if you
> use an incremental decoder.

Depends on where you're coming from. For non-seekable streams
such as sockets or pipes, readahead is not going to work.

For seekable streams, I agree that readahead is a better strategy.

And of course, it also makes sense to use incremental decoders
for these encodings.
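
As an illustration of the incremental approach (a minimal sketch, not code
from the codecs module; decode_chunks() is a made-up name), an incremental
decoder keeps the trailing bytes of an incomplete code point and completes
it when the next chunk arrives:

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Decode an iterable of byte chunks, tolerating split code points."""
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        # Incomplete trailing bytes are buffered inside the decoder
        # instead of raising UnicodeDecodeError.
        text = decoder.decode(chunk)
        if text:
            yield text
    # final=True flushes the decoder and raises if bytes are still pending.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# "é" is two bytes in UTF-8 (0xC3 0xA9); here it is split across chunks.
pieces = list(decode_chunks([b"caf", b"\xc3", b"\xa9"]))
```

Here pieces ends up as ["caf", "é"]: the lone 0xC3 byte produces no output
until the byte that completes it has been read.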

>> For single character encodings, it would make sense to prefetch
>> data in big chunks and skip all the trial and error decoding
>> implemented by the base classes to address the above problem
>> with variable length encodings.
> 
> TextIOWrapper implements this optimization using its readahead
> algorithm.

It does yes, but the above was an optimization specific
to single character encodings, not all encodings and
TextIOWrapper doesn't know anything about specific characteristics
of the underlying encodings (except perhaps a few special
cases).

>> That's somewhat unfair: TextIOWrapper is implemented in C,
>> whereas the StreamReader/Writer subclasses used by the
>> codecs are written in Python.
>>
>> A fair comparison would use the Python implementation of
>> TextIOWrapper.
> 
> Do you mean that you would like to reimplement codecs in C? 

As use of Unicode codecs increases in Python applications,
this would certainly be an approach to consider, yes.

Looking at the current situation, it is better to use
TextIOWrapper as it provides better performance, but since
TextIOWrapper cannot (by design) provide per-codec optimizations,
this is likely to change with a rewrite in C of the codecs
that benefit a lot from such specific optimizations.

> It is not
> relevant to compare codecs and _pyio, because codecs reuses
> BufferedReader (of the io module, not of the _pyio module), and io is
> the main I/O module of Python 3.

They both use whatever stream you pass in as parameter,
so your TextIOWrapper benchmark will also use the BufferedReader
of the io module.

The point here is to compare Python to Python, not Python
to C.
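
A small sketch of that point, for illustration only: both pure-Python
readers can be layered over the same kind of binary stream. The helper
names and the sample data are made up, and the sketch assumes the _pyio
module can be imported directly.

```python
import _pyio
import codecs
import io

SAMPLE = "héllo\nwörld\n".encode("utf-8")


def read_with_pyio(raw, encoding="utf-8"):
    """Read everything through the pure-Python TextIOWrapper."""
    wrapper = _pyio.TextIOWrapper(raw, encoding=encoding, newline="\n")
    return wrapper.read()


def read_with_streamreader(raw, encoding="utf-8"):
    """Read everything through the codec's StreamReader."""
    reader = codecs.getreader(encoding)(raw)
    return reader.read()


# Each reader gets its own in-memory binary stream with the same bytes.
text_pyio = read_with_pyio(io.BytesIO(SAMPLE))
text_codecs = read_with_streamreader(io.BytesIO(SAMPLE))
```

Both calls return the same decoded text; only the layer doing the decoding
differs, which is what a Python-to-Python comparison measures.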

> But well, as you want, here is a benchmark comparing:
>    _pyio.TextIOWrapper(io.open(filename, 'rb'), encoding)
> and 
>     codecs.open(filename, encoding)
> 
> The only change with my previous bench.py script is the test_io()
> function :
> 
> def test_io(test_func, chunk_size):
>     with open(FILENAME, 'rb') as buffered:
>         f = _pyio.TextIOWrapper(buffered, ENCODING)
>         test_file(f, test_func, chunk_size)
>         f.close()

Thanks for running those tests.

> (1) Decode Objects/unicodeobject.c (317336 characters) from utf-8
> 
> test_io.readline(): 1193.4 ms
> test_codecs.readline(): 1267.9 ms
> -> codecs 6% slower than io
> 
> test_io.read(1): 21696.4 ms
> test_codecs.read(1): 36027.2 ms
> -> codecs 66% slower than io
> 
> test_io.read(100): 3080.7 ms
> test_codecs.read(100): 3901.7 ms
> -> codecs 27% slower than io

This shows that StreamReader/Writer could benefit quite
a bit from using incremental encoders/decoders.

> test_io.read(): 3991.0 ms
> test_codecs.read(): 1736.9 ms
> -> codecs 130% FASTER than io

No surprise here. It's also a very common use case
to read the whole file in one go and the bigger
the file, the more impact this has.

> (2) Decode README (6613 characters) from ascii
> 
> test_io.readline(): 678.1 ms
> test_codecs.readline(): 760.5 ms
> -> codecs 12% slower than io
> 
> test_io.read(1): 13533.2 ms
> test_codecs.read(1): 21900.0 ms
> -> codecs 62% slower than io
> 
> test_io.read(100): 2663.1 ms
> test_codecs.read(100): 3270.1 ms
> -> codecs 23% slower than io
> 
> test_io.read(): 6769.1 ms
> test_codecs.read(): 3919.6 ms
> -> codecs 73% FASTER than io

See above.

> (3) Decode Lib/test/cjkencodings/gb18030.txt (501 characters) from
> gb18030
> 
> test_io.readline(): 38.9 ms
> test_codecs.readline(): 15.1 ms
> -> codecs 157% FASTER than io
> 
> test_io.read(1): 369.8 ms
> test_codecs.read(1): 302.2 ms
> -> codecs 22% FASTER than io
> 
> test_io.read(100): 258.2 ms
> test_codecs.read(100): 155.1 ms
> -> codecs 67% FASTER than io
> 
> test_io.read(): 1803.2 ms
> test_codecs.read(): 1002.9 ms
> -> codecs 80% FASTER than io

These results are interesting since gb18030 is a shift
encoding which keeps state in the encoded data stream, so
the strategy chosen by TextIOWrapper doesn't work out that
well.

It hints at what I mentioned above: per codec optimizations
are going to be relevant once these codecs get a lot of use.

> _pyio.TextIOWrapper is faster than codecs.StreamReader for readline(),
> read(1) and read(100), with ASCII and UTF-8. It is slower for gb18030.
> 
> As in the io vs codecs benchmark, codecs.StreamReader is always faster
> than _pyio.TextIOWrapper for read().

Just to repeat here what I already mentioned on the ticket:

I am still -1 on deprecating the StreamReader/Writer parts of
the codec APIs. I've given numerous reasons on why these are
useful, what their intention is, why they were added to Python 1.6.

Since such a deprecation would change an important documented API,
please write a PEP outlining your reasoning, including my comments,
use cases and possibilities for optimizations.

Please back out your checkin:

"""
http://hg.python.org/cpython/rev/3555cf6f9c98
changeset:   70430:3555cf6f9c98
user:        Victor Stinner <victor.stinner at haypocalc.com>
date:        Fri May 27 01:51:18 2011 +0200
summary:
  Issue #8796: codecs.open() calls the builtin open() function instead of using
StreamReaderWriter. Deprecate StreamReader, StreamWriter, StreamReaderWriter,
StreamRecoder and EncodedFile() of the codec module. Use the builtin open()
function or io.TextIOWrapper instead.

files:
  Doc/library/codecs.rst  |   25 ++++
  Lib/codecs.py           |   25 ++--
  Lib/test/test_codecs.py |  152 +++++++++++++++++++--------
  Misc/NEWS               |    5 +
  4 files changed, 148 insertions(+), 59 deletions(-)
"""

I wasn't very happy to see that checkin on the checkins list...

We can discuss changing codecs.open() to use TextIOWrapper, but
your quest for deprecating APIs in Python has gone too far on
this one.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 27 2011)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2011-05-23: Released eGenix mx Base 3.2.0      http://python.egenix.com/
2011-05-25: Released mxODBC 3.1.1              http://python.egenix.com/
2011-06-20: EuroPython 2011, Florence, Italy               24 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From eric at trueblade.com  Fri May 27 14:14:55 2011
From: eric at trueblade.com (Eric Smith)
Date: Fri, 27 May 2011 08:14:55 -0400
Subject: [Python-Dev] [Python-checkins] cpython: Avoid useless "++" at
	the end of functions
In-Reply-To: <E1QPZM1-0001Ll-1A@dinsdale.python.org>
References: <E1QPZM1-0001Ll-1A@dinsdale.python.org>
Message-ID: <4DDF95BF.3070800@trueblade.com>

So, given the discussions about this change, can you please revert it,
Victor?

Eric.

On 05/26/2011 08:07 AM, victor.stinner wrote:
> http://hg.python.org/cpython/rev/7ba176c2f558
> changeset:   70397:7ba176c2f558
> user:        Victor Stinner <victor.stinner at haypocalc.com>
> date:        Thu May 26 13:53:47 2011 +0200
> summary:
>   Avoid useless "++" at the end of functions
> 
> Warnings found by the Clang Static Analyzer.
> 
> files:
>   Objects/setobject.c     |  4 ++--
>   Objects/unicodeobject.c |  2 +-
>   Python/compile.c        |  6 +++---
>   3 files changed, 6 insertions(+), 6 deletions(-)
> 
> 
> diff --git a/Objects/setobject.c b/Objects/setobject.c
> --- a/Objects/setobject.c
> +++ b/Objects/setobject.c
> @@ -612,9 +612,9 @@
>          *u++ = '{';
>          /* Omit the brackets from the listrepr */
>          Py_UNICODE_COPY(u, PyUnicode_AS_UNICODE(listrepr)+1,
> -                           PyUnicode_GET_SIZE(listrepr)-2);
> +                           newsize-2);
>          u += newsize-2;
> -        *u++ = '}';
> +        *u = '}';
>      }
>      Py_DECREF(listrepr);
>      if (Py_TYPE(so) != &PySet_Type) {
> diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
> --- a/Objects/unicodeobject.c
> +++ b/Objects/unicodeobject.c
> @@ -6474,7 +6474,7 @@
>          }
>      }
>      /* 0-terminate the output string */
> -    *output++ = '\0';
> +    *output = '\0';
>      Py_XDECREF(exc);
>      Py_XDECREF(errorHandler);
>      return 0;
> diff --git a/Python/compile.c b/Python/compile.c
> --- a/Python/compile.c
> +++ b/Python/compile.c
> @@ -3747,11 +3747,11 @@
>      a->a_lnotab_off += 2;
>      if (d_bytecode) {
>          *lnotab++ = d_bytecode;
> -        *lnotab++ = d_lineno;
> +        *lnotab = d_lineno;
>      }
>      else {      /* First line of a block; def stmt, etc. */
>          *lnotab++ = 0;
> -        *lnotab++ = d_lineno;
> +        *lnotab = d_lineno;
>      }
>      a->a_lineno = i->i_lineno;
>      a->a_lineno_off = a->a_offset;
> @@ -3796,7 +3796,7 @@
>      if (i->i_hasarg) {
>          assert(size == 3 || size == 6);
>          *code++ = arg & 0xff;
> -        *code++ = arg >> 8;
> +        *code = arg >> 8;
>      }
>      return 1;
>  }
> 
> 
> 
> 


From victor.stinner at haypocalc.com  Fri May 27 15:29:15 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 27 May 2011 15:29:15 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <4DDF5E19.3080701@egenix.com>
References: <1306195729.605.27.camel@marge> <1306338491.6407.74.camel@marge>
	<4DDF5E19.3080701@egenix.com>
Message-ID: <201105271529.15421.victor.stinner@haypocalc.com>

On Friday 27 May 2011 at 10:17:29, M.-A. Lemburg wrote:
> > I think that the readahead algorithm is much faster than trying to
> > avoid partial input, and it's not a problem to have partial input if you
> > use an incremental decoder.
> 
> Depends on where you're coming from. For non-seekable streams
> such as sockets or pipes, readahead is not going to work.

I don't see how StreamReader/StreamWriter can do a better job than 
TextIOWrapper for non-seekable streams.

> > TextIOWrapper implements this optimization using its readahead
> > algorithm.
> 
> It does yes, but the above was an optimization specific
> to single character encodings, not all encodings and
> TextIOWrapper doesn't know anything about specific characteristics
> of the underlying encodings (except perhaps a few special
> cases).

Please give me numbers: how fast are your suggested optimizations? Are they 
faster than readahead? All of your arguments are theoretical.

> > Do you mean that you would like to reimplement codecs in C?
> 
> As use of Unicode codecs increases in Python applications,
> this would certainly be an approach to consider, yes.

I am not sure that StreamReader is/can be faster than TextIOWrapper if it is 
reimplemented in C (see the updated benchmark below, codecs vs _pyio).

> > test_io.read(): 3991.0 ms
> > test_codecs.read(): 1736.9 ms
> > -> codecs 130% FASTER than io
> 
> No surprise here. It's also a very common use case
> to read the whole file in one go and the bigger
> the file, the more impact this has.

Oh, I understood why codecs is always faster than _pyio (or even io): it's 
because of IncrementalNewlineDecoder. To be fair, the read(-1) should be 
tested without IncrementalNewlineDecoder: e.g. with newline='\n'.

newline='' cannot be used for the read(-1) test, because even if newline='' 
indicates that we don't want to translate newlines, read(-1) uses the 
IncrementalNewlineDecoder (which is slower than not calling it at all). We may 
optimize this specific case in TextIOWrapper.
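
For readers unfamiliar with the newline parameter, the following sketch
shows the visible behaviour of the three settings mentioned here. It only
demonstrates what read() returns; the claim about which settings go through
IncrementalNewlineDecoder is the one made above and is not measured by this
snippet. RAW and read_all() are illustrative names.

```python
import io

RAW = b"one\r\ntwo\nthree\rfour"


def read_all(newline):
    """Decode RAW through TextIOWrapper with the given newline setting."""
    wrapper = io.TextIOWrapper(io.BytesIO(RAW), encoding="ascii",
                               newline=newline)
    return wrapper.read()


universal = read_all(None)     # "\r\n" and "\r" are translated to "\n"
untranslated = read_all("")    # line endings are returned unchanged
newline_only = read_all("\n")  # no translation, only "\n" ends a line
```

With newline=None the result is "one\ntwo\nthree\nfour"; with newline=''
and newline='\n' the text comes back with its original line endings.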

> > (3) Decode Lib/test/cjkencodings/gb18030.txt (501 characters) from
> > gb18030
> > 
> > test_io.readline(): 38.9 ms
> > test_codecs.readline(): 15.1 ms
> > -> codecs 157% FASTER than io
> > 
> > test_io.read(1): 369.8 ms
> > test_codecs.read(1): 302.2 ms
> > -> codecs 22% FASTER than io
> > 
> > test_io.read(100): 258.2 ms
> > test_codecs.read(100): 155.1 ms
> > -> codecs 67% FASTER than io
> > 
> > test_io.read(): 1803.2 ms
> > test_codecs.read(): 1002.9 ms
> > -> codecs 80% FASTER than io
> 
> These results are interesting since gb18030 is a shift
> encoding which keeps state in the encoded data stream, so
> the strategy chosen by TextIOWrapper doesn't work out that
> well.

In the 4 tests, TextIOWrapper only calls the decoder *once*, on the whole 
content of the file. The file size is 864 bytes, which is smaller than the 
TextIOWrapper chunk size (2048 bytes).

StreamReader of the gb18030 codec is implemented in C, not in Python (using 
multibytecodec.c). So to be fair, the test on this encoding should be done 
using io, not _pyio for this encoding.

Moreover, the multibytecodec module doesn't support universal newline! It does 
only support '\n' newlines. So to be more fair, the test should use '\n' 
newline.

It's one more reason to use TextIOWrapper instead of StreamReader: it has the same 
behaviour (universal newlines) for all encodings. Or is it yet another bug in 
StreamReader?

> I am still -1 on deprecating the StreamReader/Writer parts of
> the codec APIs. I've given numerous reasons on why these are
> useful, what their intention is, why they were added to Python 1.6.

codecs.open() now uses TextIOWrapper, so there is no good reason to keep 
StreamReader or StreamWriter. You did not give me any use case where 
StreamReader or StreamWriter should be used instead of TextIOWrapper. You only 
listed theoretical optimizations.

You have until the release of Python 3.3 to prove that StreamReader and/or 
StreamWriter can be faster than TextIOWrapper. If you can prove it using a 
patch and a benchmark, I will be ok to revert my commit.

> Since such a deprecation would change an important documented API,
> please write a PEP outlining your reasoning, including my comments,
> use cases and possibilities for optimizations.

Ok, I will write a PEP explaining why StreamReader and StreamWriter are 
deprecated.

-----------

I wrote a new benchmarking script which tries to compare more closely codecs 
to io/_pyio (change the newline value and use io for gb18030). It should be a 
little bit more reliable because each test now runs 5 times (taking the 
smallest time), but it's not really reliable... The script is attached to this 
mail.
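
The attached script is not reproduced here. As a rough idea of the
"run several times, keep the smallest time" approach, a sketch could look
like the following. Every name in it (read_with_io, read_with_codecs,
best_of, make_sample) is made up for this illustration, and it only covers
the read() case with newline='\n'.

```python
import codecs
import io
import os
import tempfile
import timeit


def read_with_io(path, encoding):
    """Read the whole file through io (TextIOWrapper)."""
    with io.open(path, "r", encoding=encoding, newline="\n") as f:
        return f.read()


def read_with_codecs(path, encoding):
    """Read the whole file through the codec's StreamReader."""
    with open(path, "rb") as raw:
        return codecs.getreader(encoding)(raw).read()


def best_of(func, *args, repeat=5, loops=10):
    """Run `loops` calls `repeat` times and keep the smallest total time."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=loops))


def make_sample(text, encoding):
    """Write text to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        f.write(text.encode(encoding))
    return path


if __name__ == "__main__":
    path = make_sample("spam and eggs\n" * 1000, "utf-8")
    try:
        for name, func in (("io", read_with_io), ("codecs", read_with_codecs)):
            seconds = best_of(func, path, "utf-8")
            print("%s.read(): %.1f ms" % (name, seconds * 1000))
    finally:
        os.remove(path)
```

Taking the minimum rather than the mean reduces the influence of other
activity on the machine, but as noted above the numbers remain indicative
only.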



(1) Decode Objects/unicodeobject.c (317334 characters) from utf-8

_pyio.readline(): 1078.4 ms (8 loops, newline: '')
codecs.readline(): 983.0 ms (8 loops, newline: '')
-> codecs 10% FASTER than _pyio

_pyio.read(1): 3503.5 ms (2 loops, newline: '')
codecs.read(1): 6626.7 ms (2 loops, newline: '')
-> codecs 89% slower than _pyio

_pyio.read(100): 2076.2 ms (80 loops, newline: '')
codecs.read(100): 2870.8 ms (80 loops, newline: '')
-> codecs 38% slower than _pyio

_pyio.read(): 1698.0 ms (800 loops, newline: '\n')
codecs.read(): 1686.4 ms (800 loops, newline: '\n')
-> codecs 1% FASTER than _pyio

(2) Decode Lib/test/cjkencodings/gb18030.txt (501 characters) from gb18030

io.readline(): 5.1 ms (80 loops, newline: '\n')
codecs.readline(): 6.8 ms (80 loops, newline: '\n')
-> codecs 34% slower than io

io.read(1): 5.6 ms (20 loops, newline: '\n')
codecs.read(1): 45.5 ms (20 loops, newline: '\n')
-> codecs 705% slower than io

io.read(100): 54.2 ms (800 loops, newline: '\n')
codecs.read(100): 56.7 ms (800 loops, newline: '\n')
-> codecs 5% slower than io

io.read(): 395.8 ms (8000 loops, newline: '\n')
codecs.read(): 309.2 ms (8000 loops, newline: '\n')
-> codecs 28% FASTER than io

(3) Decode README (6613 characters) from ascii

_pyio.readline(): 385.9 ms (160 loops, newline: '')
codecs.readline(): 384.5 ms (160 loops, newline: '')
-> codecs 0% FASTER than _pyio

_pyio.read(1): 1473.6 ms (40 loops, newline: '')
codecs.read(1): 1913.9 ms (40 loops, newline: '')
-> codecs 30% slower than _pyio

_pyio.read(100): 1081.0 ms (1600 loops, newline: '')
codecs.read(100): 1325.6 ms (1600 loops, newline: '')
-> codecs 23% slower than _pyio

_pyio.read(): 1570.9 ms (16000 loops, newline: '\n')
codecs.read(): 1518.8 ms (16000 loops, newline: '\n')
-> codecs 3% FASTER than _pyio

codecs is still faster in 4 cases:
* ascii, read(): 3% faster than _pyio
* utf-8, readline(): 10% faster than _pyio
* utf-8, read(): 1% faster than _pyio
* gb18030, read(): 28% faster than io (!)

The last one is interesting and should be analyzed.

----

Even if it's not fair, benchmark using io for ASCII and UTF-8 (GB18030 already 
used io for the reasons explained before):

(1) Decode Objects/unicodeobject.c (317334 characters) from utf-8

io.readline(): 52.0 ms (8 loops, newline: '')
codecs.readline(): 1001.0 ms (8 loops, newline: '')
-> codecs 1825% slower than io

io.read(1): 265.7 ms (2 loops, newline: '')
codecs.read(1): 6734.5 ms (2 loops, newline: '')
-> codecs 2434% slower than io

io.read(100): 269.4 ms (80 loops, newline: '')
codecs.read(100): 2881.6 ms (80 loops, newline: '')
-> codecs 970% slower than io

io.read(): 1628.9 ms (800 loops, newline: '\n')
codecs.read(): 1692.8 ms (800 loops, newline: '\n')
-> codecs 4% slower than io

(3) Decode README (6613 characters) from ascii

io.readline(): 25.7 ms (160 loops, newline: '')
codecs.readline(): 415.5 ms (160 loops, newline: '')
-> codecs 1516% slower than io

io.read(1): 153.3 ms (40 loops, newline: '')
codecs.read(1): 2243.6 ms (40 loops, newline: '')
-> codecs 1363% slower than io

io.read(100): 210.2 ms (1600 loops, newline: '')
codecs.read(100): 1521.9 ms (1600 loops, newline: '')
-> codecs 624% slower than io

io.read(): 1100.1 ms (16000 loops, newline: '\n')
codecs.read(): 1501.1 ms (16000 loops, newline: '\n')
-> codecs 36% slower than io

So if you compare codecs to io (and not _pyio), codecs is only faster (28%) in 
one case: reading the whole content of the file with the multibyte codecs.

Note that the codecs module is 2434% slower than io to read a file in UTF-8 
character by character (which is stupid, don't do that! :-)), and 1825% slower 
to read line by line.
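
The percentages in these tables appear to be the ratio of the slower time
to the faster time, minus one. This is inferred from the figures, not taken
from bench.py, and relative_speed() is a name made up for the example:

```python
def relative_speed(codecs_ms, io_ms):
    """Return (label, percent) describing codecs relative to io.

    Assumption: the percentage is (slower time / faster time - 1) * 100.
    """
    if codecs_ms >= io_ms:
        return "slower", (codecs_ms / io_ms - 1.0) * 100.0
    return "FASTER", (io_ms / codecs_ms - 1.0) * 100.0


# utf-8, read(1): io 265.7 ms vs codecs 6734.5 ms
label_utf8, pct_utf8 = relative_speed(6734.5, 265.7)

# gb18030, read(): io 395.8 ms vs codecs 309.2 ms
label_gb, pct_gb = relative_speed(309.2, 395.8)
```

With the figures above this gives roughly 2434% slower for the first pair
and roughly 28% faster for the second, which is consistent with the
reported results.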

Victor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bench.py
Type: text/x-python
Size: 3326 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20110527/8b9511f7/attachment.py>

From benjamin at python.org  Fri May 27 15:33:07 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 27 May 2011 08:33:07 -0500
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <201105271529.15421.victor.stinner@haypocalc.com>
References: <1306195729.605.27.camel@marge> <1306338491.6407.74.camel@marge>
	<4DDF5E19.3080701@egenix.com>
	<201105271529.15421.victor.stinner@haypocalc.com>
Message-ID: <BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com>

2011/5/27 Victor Stinner <victor.stinner at haypocalc.com>:
> You have until the release of Python 3.3 to prove that StreamReader and/or
> StreamWriter can be faster than TextIOWrapper. If you can prove it using a
> patch and a benchmark, I will be ok to revert my commit.

Please don't hold commits over someone's head.



-- 
Regards,
Benjamin

From mal at egenix.com  Fri May 27 15:42:10 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 27 May 2011 15:42:10 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <201105271529.15421.victor.stinner@haypocalc.com>
References: <1306195729.605.27.camel@marge>
	<1306338491.6407.74.camel@marge>	<4DDF5E19.3080701@egenix.com>
	<201105271529.15421.victor.stinner@haypocalc.com>
Message-ID: <4DDFAA32.5030209@egenix.com>

Victor Stinner wrote:
> On Friday 27 May 2011 at 10:17:29, M.-A. Lemburg wrote:
>> I am still -1 on deprecating the StreamReader/Writer parts of
>> the codec APIs. I've given numerous reasons on why these are
>> useful, what their intention is, why they were added to Python 1.6.
> 
> codecs.open() now uses TextIOWrapper, so there is no good reason to keep 
> StreamReader or StreamWriter. You did not give me any use case where 
> StreamReader or StreamWriter should be used instead of TextIOWrapper. You only 
> listed theoretical optimizations.
> 
> You have until the release of Python 3.3 to prove that StreamReader and/or 
> StreamWriter can be faster than TextIOWrapper. If you can prove it using a 
> patch and a benchmark, I will be ok to revert my commit.

Victor, please revert the change. It has *not* been approved !

If we'd go by your reasoning for deprecating and eventually
removing parts of the stdlib or Python's subsystems, we'll end
up with a barebone version of Python. That's not what we want
and it's not what our users want.

I have tried to explain the design decisions and reasons for
those codec APIs at great length. You've pretty much used up
my patience. If you are not going to revert the patch, I will.

>> Since such a deprecation would change an important documented API,
>> please write a PEP outlining your reasoning, including my comments,
>> use cases and possibilities for optimizations.
> 
> Ok, I will write a PEP explaining why StreamReader and StreamWriter are 
> deprecated.

Wrong order: first write a PEP, then discuss, then get approval,
then patch.

-- 
Marc-Andre Lemburg
eGenix.com


From ncoghlan at gmail.com  Fri May 27 16:01:14 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 28 May 2011 00:01:14 +1000
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <4DDFAA32.5030209@egenix.com>
References: <1306195729.605.27.camel@marge> <1306338491.6407.74.camel@marge>
	<4DDF5E19.3080701@egenix.com>
	<201105271529.15421.victor.stinner@haypocalc.com>
	<4DDFAA32.5030209@egenix.com>
Message-ID: <BANLkTimM_BmsX_uncd9foc6OaKGnA4-L5w@mail.gmail.com>

On Fri, May 27, 2011 at 11:42 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>
> Wrong order: first write a PEP, then discuss, then get approval,
> then patch.

Indeed.

If another committer says "please revert and better justify this
change" then we revert it. We don't get into commit wars.

Something does need to be done to resolve the duplication of
functionality between the io and codecs modules, but it is *far* from
clear that deprecating chunks of the longer standing API is the right
way to go about it. This is especially true given Guido's explicit
direction following the issues with the PyCObject removal in 3.2 that
we be *very* conservative about introducing additional
incompatibilities between Python 2 and Python 3.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at haypocalc.com  Fri May 27 17:08:06 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 27 May 2011 17:08:06 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com>
References: <1306195729.605.27.camel@marge>
	<201105271529.15421.victor.stinner@haypocalc.com>
	<BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com>
Message-ID: <201105271708.06211.victor.stinner@haypocalc.com>

On Friday 27 May 2011 15:33:07, Benjamin Peterson wrote:
> 2011/5/27 Victor Stinner <victor.stinner at haypocalc.com>:
> > You have until the release of Python 3.3 to prove that StreamReader
> > and/or StreamWriter can be faster than TextIOWrapper. If you can prove
> > it using a patch and a benchmark, I will be ok to revert my commit.
> 
> Please don't hold commits over someone's head.

Tell me if I am wrong, but only Marc-Andre is against deprecating StreamReader 
and StreamWriter. Walter and Antoine are in favor of using TextIOWrapper 
instead of StreamReader/StreamWriter.

Different people would like to be able to call codecs.open() in Python 2 and 3, 
so I kept the function with its API unchanged, and I documented that open() 
should be preferred (but I did not deprecate codecs.open()).

Victor

From benjamin at python.org  Fri May 27 17:34:28 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 27 May 2011 10:34:28 -0500
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <201105271708.06211.victor.stinner@haypocalc.com>
References: <1306195729.605.27.camel@marge>
	<201105271529.15421.victor.stinner@haypocalc.com>
	<BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com>
	<201105271708.06211.victor.stinner@haypocalc.com>
Message-ID: <BANLkTik=gUsHdJdBivDNMKA0SOod1JVnOw@mail.gmail.com>

2011/5/27 Victor Stinner <victor.stinner at haypocalc.com>:
> On Friday 27 May 2011 15:33:07, Benjamin Peterson wrote:
>> 2011/5/27 Victor Stinner <victor.stinner at haypocalc.com>:
>> > You have until the release of Python 3.3 to prove that StreamReader
>> > and/or StreamWriter can be faster than TextIOWrapper. If you can prove
>> > it using a patch and a benchmark, I will be ok to revert my commit.
>>
>> Please don't hold commits over someone's head.
>
> Tell me if I am wrong, but only Marc-Andre is against deprecating StreamReader
> and StreamWriter. Walter and Antoine are in favor of using TextIOWrapper
> instead of StreamReader/StreamWriter.

I am too. There does, however, seem to be significant disagreement,
and it shouldn't be a race to see who can commit first.

>
> Different people would like to be able to call codecs.open() in Python 2 and 3,
> so I kept the function with its API unchanged, and I documented that open()
> should be preferred (but I did not deprecate codecs.open()).




-- 
Regards,
Benjamin

From victor.stinner at haypocalc.com  Fri May 27 17:35:31 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 27 May 2011 17:35:31 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <BANLkTimM_BmsX_uncd9foc6OaKGnA4-L5w@mail.gmail.com>
References: <1306195729.605.27.camel@marge> <4DDFAA32.5030209@egenix.com>
	<BANLkTimM_BmsX_uncd9foc6OaKGnA4-L5w@mail.gmail.com>
Message-ID: <201105271735.31859.victor.stinner@haypocalc.com>

On Friday 27 May 2011 16:01:14, Nick Coghlan wrote:
> On Fri, May 27, 2011 at 11:42 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> > Wrong order: first write a PEP, then discuss, then get approval,
> > then patch.
> 
> Indeed.
> 
> If another committer says "please revert and better justify this
> change" then we revert it. We don't get into commit wars.

I reverted my controversial commit.

> Something does need to be done to resolve the duplication of
> functionality between the io and codecs modules, but it is *far* from
> clear that deprecating chunks of the longer standing API is the right
> way to go about it.

Yes, StreamReader & friends are present in Python since Python 2.0.

> This is especially true given Guido's explicit
> direction following the issues with the PyCObject removal in 3.2 that
> we be *very* conservative about introducing additional
> incompatibilities between Python 2 and Python 3.

I did search for usage of these classes on the Internet. Apart from projects 
that implement their own codecs (and so define their own 
StreamReader/StreamWriter classes, even if they don't use them), I found only 
one project using StreamReader directly: Pygments (*). I searched quickly, so 
don't trust these results :-) StreamReader & friends are used indirectly 
through codecs.open(). My patch changes codecs.open() to make it reuse open() 
(io.TextIOWrapper), so the deprecation of StreamReader would not be noticed by 
most users.

I think that there are many more users of PyCObject than users who use the 
StreamReader API directly (not through codecs.open()).

(*) I also found Sphinx, but I was wrong: it doesn't use StreamReader, it just 
has a full copy of the UTF-8-SIG codec which has a StreamReader class. I don't 
think that the class is used.
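
The idea described above (codecs.open() delegating to io.TextIOWrapper) could 
be sketched roughly as follows. This is only an illustration, not the reverted 
patch: the helper name open_via_textiowrapper and its default arguments are 
assumptions made for the example.

```python
import io
import os
import tempfile


def open_via_textiowrapper(filename, mode="r", encoding="utf-8", errors="strict"):
    """Hypothetical helper: a codecs.open()-like function built on the io module."""
    # Like codecs.open(), always open the underlying file in binary mode.
    binary_mode = mode.replace("t", "").replace("b", "") + "b"
    raw = io.open(filename, binary_mode)
    # newline="" disables newline translation, matching codecs.open(),
    # which works on a binary file and leaves line endings untouched.
    return io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline="")


# Round-trip a short non-ASCII string through a temporary file.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open_via_textiowrapper(path, "w") as f:
    f.write(u"caf\u00e9\r\n")
with open_via_textiowrapper(path, "r") as f:
    text = f.read()
```

Callers would see the same text in and out, with the encoding work done by 
TextIOWrapper rather than by a StreamReader or StreamWriter.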

Victor

From victor.stinner at haypocalc.com  Fri May 27 17:44:06 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 27 May 2011 17:44:06 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <4DDFAA32.5030209@egenix.com>
References: <1306195729.605.27.camel@marge>
	<201105271529.15421.victor.stinner@haypocalc.com>
	<4DDFAA32.5030209@egenix.com>
Message-ID: <201105271744.06307.victor.stinner@haypocalc.com>

On Friday 27 May 2011 15:42:10, M.-A. Lemburg wrote:
> If we'd go by your reasoning for deprecating and eventually
> removing parts of the stdlib or Python's subsystems, we'll end
> up with a bare-bones version of Python. That's not what we want
> and it's not what our users want.

I don't want to deprecate the whole stdlib, just duplicated old APIs, to follow 
the "import this" mantra:

"There should be one-- and preferably only one --obvious way to do it."

It's difficult for a user to choose between open() and codecs.open().

Victor

From status at bugs.python.org  Fri May 27 18:07:23 2011
From: status at bugs.python.org (Python tracker)
Date: Fri, 27 May 2011 18:07:23 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20110527160723.74C681D1DB@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (2011-05-20 - 2011-05-27)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2813 (+19)
  closed 21165 (+50)
  total  23978 (+69)

Open issues with patches: 1216 


Issues opened (47)
==================

#12128: Allow an abc.abstractproperty to be overridden by an instance 
http://bugs.python.org/issue12128  opened by cool-RR

#12129: Document Object Model API - validation
http://bugs.python.org/issue12129  opened by Kyle.Keating

#12133: ResourceWarning in urllib.request
http://bugs.python.org/issue12133  opened by ezio.melotti

#12134: json.dump much slower than dumps
http://bugs.python.org/issue12134  opened by poq

#12135: The spawn function should return stderr.
http://bugs.python.org/issue12135  opened by pitrou

#12137: EBADF in test_urllibnet
http://bugs.python.org/issue12137  opened by pitrou

#12139: Add CCC command support to ftplib
http://bugs.python.org/issue12139  opened by giampaolo.rodola

#12141: sysconfig.get_config_vars('srcdir') fails in specific cases
http://bugs.python.org/issue12141  opened by tarek

#12142: Reference cycle when importing ctypes
http://bugs.python.org/issue12142  opened by poq

#12144: cookielib.CookieJar.make_cookies fails for cookies with 'expir
http://bugs.python.org/issue12144  opened by Scott.Wimer

#12145: distutils2 should support README.rst
http://bugs.python.org/issue12145  opened by daniellindsley

#12147: smtplib.send_message does not implement corectly rfc 2822
http://bugs.python.org/issue12147  opened by Nicolas.Estibals

#12148: Clarify "or-ing together" doctest option flags
http://bugs.python.org/issue12148  opened by ekorn

#12149: Segfault in _PyObject_GenericGetAttrWithDict
http://bugs.python.org/issue12149  opened by ezio.melotti

#12151: test_logging fails sometimes
http://bugs.python.org/issue12151  opened by haypo

#12154: PyDoc Partial Functions
http://bugs.python.org/issue12154  opened by JJeffries

#12155: queue example doesn't stop worker threads
http://bugs.python.org/issue12155  opened by haypo

#12156: test_multiprocessing.test_notify_all() timeout (1 hour) on Fre
http://bugs.python.org/issue12156  opened by haypo

#12157: join method of multiprocessing Pool object hangs if iterable a
http://bugs.python.org/issue12157  opened by Gökçen.Eraslan

#12160: codecs doc: what is StreamCodec?
http://bugs.python.org/issue12160  opened by haypo

#12162: Documentation about re \number
http://bugs.python.org/issue12162  opened by Seth.Troisi

#12163: str.count
http://bugs.python.org/issue12163  opened by py.user

#12164: str.translate docstring doesn't mention that 'table' can be No
http://bugs.python.org/issue12164  opened by mark.dickinson

#12165: Nonlocal does not include global; clarify doc
http://bugs.python.org/issue12165  opened by Lukas.Petru

#12167: test_packaging reference leak
http://bugs.python.org/issue12167  opened by pitrou

#12168: SysLogHandler incorrectly appends \000 to messages
http://bugs.python.org/issue12168  opened by Carl.Crowder

#12169: Factor out common code for d2 commands register, upload and up
http://bugs.python.org/issue12169  opened by eric.araujo

#12170: Bytes.index() and bytes.count() should accept byte ints
http://bugs.python.org/issue12170  opened by max-alleged

#12171: Reset method of the incremental encoders of CJK codecs calls t
http://bugs.python.org/issue12171  opened by haypo

#12172: IDLE crashes when I use F5 to run
http://bugs.python.org/issue12172  opened by Kevin Ness

#12174: Multiprocessing logging levels unclear
http://bugs.python.org/issue12174  opened by JJeffries

#12175: FileIO.readall() read the file position and size at each read
http://bugs.python.org/issue12175  opened by haypo

#12177: re.match raises MemoryError
http://bugs.python.org/issue12177  opened by EungJun.Yi

#12178: csv writer doesn't escape escapechar
http://bugs.python.org/issue12178  opened by ebreck

#12179: Race condition using PyGILState_Ensure on a new thread
http://bugs.python.org/issue12179  opened by syeberman

#12181: SIGBUS error on OpenBSD (sparc64)
http://bugs.python.org/issue12181  opened by rpointel

#12183: Document behaviour of shutil.copy2 and copystat with symlinks
http://bugs.python.org/issue12183  opened by mmarkk

#12184: socketserver.ForkingMixin collect_children routine needs to co
http://bugs.python.org/issue12184  opened by orsenthil

#12186: readline.replace_history_item still leaks memory
http://bugs.python.org/issue12186  opened by stefanholek

#12187: subprocess.wait() with a timeout uses polling on POSIX
http://bugs.python.org/issue12187  opened by haypo

#12188: PEP 7, C style: add ++ policy and explanation
http://bugs.python.org/issue12188  opened by terry.reedy

#12190: intern filenames in bytecode
http://bugs.python.org/issue12190  opened by Mike.Solomon

#12191: Shutil - add chown() in order to allow to use user and group n
http://bugs.python.org/issue12191  opened by sandro.tosi

#12192: Doc that collection mutation methods return item or None
http://bugs.python.org/issue12192  opened by terry.reedy

#12195: Little documentation of annotations
http://bugs.python.org/issue12195  opened by JJeffries

#12196: add pipe2() to the os module
http://bugs.python.org/issue12196  opened by charles-francois.natali

#12185: Decimal documentation lists "first" and "second" arguments, sh
http://bugs.python.org/issue12185  opened by eric.smith



Most recent 15 issues with no replies (15)
==========================================

#12188: PEP 7, C style: add ++ policy and explanation
http://bugs.python.org/issue12188

#12186: readline.replace_history_item still leaks memory
http://bugs.python.org/issue12186

#12185: Decimal documentation lists "first" and "second" arguments, sh
http://bugs.python.org/issue12185

#12179: Race condition using PyGILState_Ensure on a new thread
http://bugs.python.org/issue12179

#12164: str.translate docstring doesn't mention that 'table' can be No
http://bugs.python.org/issue12164

#12157: join method of multiprocessing Pool object hangs if iterable a
http://bugs.python.org/issue12157

#12156: test_multiprocessing.test_notify_all() timeout (1 hour) on Fre
http://bugs.python.org/issue12156

#12142: Reference cycle when importing ctypes
http://bugs.python.org/issue12142

#12137: EBADF in test_urllibnet
http://bugs.python.org/issue12137

#12129: Document Object Model API - validation
http://bugs.python.org/issue12129

#12091: multiprocessing: simplify ApplyResult and MapResult with threa
http://bugs.python.org/issue12091

#12066: Empty ('') xmlns attribute is not properly handled by xml.dom.
http://bugs.python.org/issue12066

#12053: Add prefetch() for Buffered IO (experiment)
http://bugs.python.org/issue12053

#12037: test_email failures under Windows with the eol extension activ
http://bugs.python.org/issue12037

#11992: sys.settrace doesn't disable tracing if a local trace function
http://bugs.python.org/issue11992



Most recent 15 issues waiting for review (15)
=============================================

#12196: add pipe2() to the os module
http://bugs.python.org/issue12196

#12191: Shutil - add chown() in order to allow to use user and group n
http://bugs.python.org/issue12191

#12190: intern filenames in bytecode
http://bugs.python.org/issue12190

#12184: socketserver.ForkingMixin collect_children routine needs to co
http://bugs.python.org/issue12184

#12175: FileIO.readall() read the file position and size at each read
http://bugs.python.org/issue12175

#12174: Multiprocessing logging levels unclear
http://bugs.python.org/issue12174

#12171: Reset method of the incremental encoders of CJK codecs calls t
http://bugs.python.org/issue12171

#12165: Nonlocal does not include global; clarify doc
http://bugs.python.org/issue12165

#12164: str.translate docstring doesn't mention that 'table' can be No
http://bugs.python.org/issue12164

#12160: codecs doc: what is StreamCodec?
http://bugs.python.org/issue12160

#12154: PyDoc Partial Functions
http://bugs.python.org/issue12154

#12149: Segfault in _PyObject_GenericGetAttrWithDict
http://bugs.python.org/issue12149

#12147: smtplib.send_message does not implement corectly rfc 2822
http://bugs.python.org/issue12147

#12144: cookielib.CookieJar.make_cookies fails for cookies with 'expir
http://bugs.python.org/issue12144

#12139: Add CCC command support to ftplib
http://bugs.python.org/issue12139



Top 10 most discussed issues (10)
=================================

#8898: The email package should defer to the codecs module for all al
http://bugs.python.org/issue8898  30 msgs

#12006: strptime should implement %V or %u directive from libc
http://bugs.python.org/issue12006  23 msgs

#5715: listen socket close in SocketServer.ForkingMixIn.process_reque
http://bugs.python.org/issue5715  18 msgs

#12175: FileIO.readall() read the file position and size at each read
http://bugs.python.org/issue12175  16 msgs

#12181: SIGBUS error on OpenBSD (sparc64)
http://bugs.python.org/issue12181  14 msgs

#12085: subprocess.Popen.__del__ raises AttributeError if __init__ was
http://bugs.python.org/issue12085  11 msgs

#12168: SysLogHandler incorrectly appends \000 to messages
http://bugs.python.org/issue12168  10 msgs

#12042: What's New multiprocessing example error
http://bugs.python.org/issue12042   9 msgs

#12057: HZ codec has no test
http://bugs.python.org/issue12057   9 msgs

#12167: test_packaging reference leak
http://bugs.python.org/issue12167   9 msgs



Issues closed (44)
==================

#1625: bz2.BZ2File doesn't support multiple streams
http://bugs.python.org/issue1625  closed by nadeem.vawda

#9435: test_distutils fails without zlib
http://bugs.python.org/issue9435  closed by eric.araujo

#9942: Allow memory sections to be OS MERGEABLE
http://bugs.python.org/issue9942  closed by loewis

#10818: pydoc: Remove old server and tk panel
http://bugs.python.org/issue10818  closed by haypo

#10832: Add support of bytes objects in PyBytes_FromFormatV()
http://bugs.python.org/issue10832  closed by haypo

#11998: test_signal cannot test blocked signals if _tkinter is loaded;
http://bugs.python.org/issue11998  closed by haypo

#12003: documentation: alternate version of xrange seems to fail.
http://bugs.python.org/issue12003  closed by eli.bendersky

#12024: 2.6 svn and hg branches are out of sync
http://bugs.python.org/issue12024  closed by barry

#12045: external shell command executed twice in ctypes.util._get_sona
http://bugs.python.org/issue12045  closed by pitrou

#12049: expose RAND_bytes() function of OpenSSL
http://bugs.python.org/issue12049  closed by haypo

#12070: Unlimited loop in sysconfig._parse_makefile()
http://bugs.python.org/issue12070  closed by haypo

#12071: test_concurrent_futures.test_context_manager_shutdown() hangs 
http://bugs.python.org/issue12071  closed by haypo

#12074: regrtest: display the current number of failures
http://bugs.python.org/issue12074  closed by haypo

#12079: decimal.py: TypeError precedence in fma()
http://bugs.python.org/issue12079  closed by mark.dickinson

#12100: Incremental encoders of CJK codecs reset the codec at each cal
http://bugs.python.org/issue12100  closed by haypo

#12105: open() does not able to set flags, such as O_CLOEXEC
http://bugs.python.org/issue12105  closed by charles-francois.natali

#12113: test_packaging fails when run twice
http://bugs.python.org/issue12113  closed by haypo

#12114: packaging.util._find_exe_version(): potential deadlock
http://bugs.python.org/issue12114  closed by python-dev

#12121: test_packaging failure when ssl is not available
http://bugs.python.org/issue12121  closed by haypo

#12124: python -m test test_packaging test_zipimport failure
http://bugs.python.org/issue12124  closed by haypo

#12126: incorrect select documentation
http://bugs.python.org/issue12126  closed by eli.bendersky

#12130: regex 0.1.20110514 findall overlapped not working with 'start 
http://bugs.python.org/issue12130  closed by brian.curtin

#12131: python built with --prefix fails in site.py with no section 'p
http://bugs.python.org/issue12131  closed by ned.deily

#12132: test_packaging failures when run with -j
http://bugs.python.org/issue12132  closed by tarek

#12136: test_logging fails when no ssl available
http://bugs.python.org/issue12136  closed by vinay.sajip

#12138: buggy use of transient_internet() in test_urllibnet
http://bugs.python.org/issue12138  closed by pitrou

#12140: Crash upon start up
http://bugs.python.org/issue12140  closed by amaury.forgeotdarc

#12143: packaging extension gcc linking fails on Ubuntu Shared
http://bugs.python.org/issue12143  closed by eric.araujo

#12146: Possible bug in 're' documentation example
http://bugs.python.org/issue12146  closed by eli.bendersky

#12150: test_sysconfig fails on solaris
http://bugs.python.org/issue12150  closed by haypo

#12152: Parser/asdl_c.py relies on mercurial repository revision
http://bugs.python.org/issue12152  closed by doko

#12153: Modules/faulthandler.c exports `stack_overflow' symbol
http://bugs.python.org/issue12153  closed by python-dev

#12158: platform: add linux_version()
http://bugs.python.org/issue12158  closed by lemburg

#12159: Integer Overflow in __len__
http://bugs.python.org/issue12159  closed by benjamin.peterson

#12161: StringIO AttributeError instead of ValueError after close..
http://bugs.python.org/issue12161  closed by python-dev

#12166: object.__dir__
http://bugs.python.org/issue12166  closed by python-dev

#12173: PyImport_ImportModuleLevel doesn't have 'const' on its argumen
http://bugs.python.org/issue12173  closed by python-dev

#12176: Compiling Python 2.7.1 on Ubuntu 11.04 (Natty Narwhale)
http://bugs.python.org/issue12176  closed by skrah

#12180: test_packaging: failures --without-threads
http://bugs.python.org/issue12180  closed by tarek

#12182: pydoc.py integer division problem
http://bugs.python.org/issue12182  closed by python-dev

#12189: Python 2.6.6 fails to compile a source whereas pycompile 1.0 a
http://bugs.python.org/issue12189  closed by r.david.murray

#12193: Argparse does not work together with gettext and non-ASCII hel
http://bugs.python.org/issue12193  closed by thorsten

#12194: Fix LDFLAGS on Ubuntu 11.04+
http://bugs.python.org/issue12194  closed by barry

#1441530: socket read() can cause MemoryError in Windows
http://bugs.python.org/issue1441530  closed by charles-francois.natali

From mal at egenix.com  Fri May 27 20:26:45 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 27 May 2011 20:26:45 +0200
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <201105271744.06307.victor.stinner@haypocalc.com>
References: <1306195729.605.27.camel@marge>	<201105271529.15421.victor.stinner@haypocalc.com>	<4DDFAA32.5030209@egenix.com>
	<201105271744.06307.victor.stinner@haypocalc.com>
Message-ID: <4DDFECE5.50100@egenix.com>

Victor Stinner wrote:
> On Friday 27 May 2011 15:42:10, M.-A. Lemburg wrote:
>> If we'd go by your reasoning for deprecating and eventually
>> removing parts of the stdlib or Python's subsystems, we'll end
>> up with a bare-bones version of Python. That's not what we want
>> and it's not what our users want.
> 
> I don't want to deprecate the whole stdlib, just duplicated old APIs, to follow 
> the "import this" mantra:
> 
> "There should be one-- and preferably only one --obvious way to do it."

What people tend to miss in this mantra is the last part: "obvious".
It doesn't say: there should only be one way to do it. There can
be many ways, but there should preferably be only one *obvious* way.

Using codecs.open() is not obvious in Python3, since the standard
open() already provides a way to access an encoded stream. Using
a builtin is the obvious way to go.

It is obvious in Python2 where the standard open() doesn't provide a
way to define an encoding, so the user has to explicitly look for this
kind of API and then find it in the "obvious" (to some extent)
codecs module, since that's where encodings happen in Python2.
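
As a small illustration of the two spellings compared here, the sketch below 
reads the same UTF-8 file once with the builtin open() and once with 
codecs.open(). It assumes Python 3 (where the builtin accepts an encoding 
argument); the file name and sample text are made up for the example.

```python
import codecs
import io
import os
import tempfile

# Create a small UTF-8 file; newline="\n" keeps line endings identical
# on every platform so both readers see the same bytes.
path = os.path.join(tempfile.mkdtemp(), "greeting.txt")
with io.open(path, "w", encoding="utf-8", newline="\n") as f:
    f.write(u"gr\u00fc\u00df dich\n")

# Python 3: the builtin open() takes the encoding directly.
with open(path, encoding="utf-8") as f:
    via_builtin = f.read()

# Python 2 and 3: codecs.open() takes the encoding as well.
with codecs.open(path, "r", encoding="utf-8") as f:
    via_codecs = f.read()
```

Both reads yield the same text, which is why the choice between them comes 
down to which one is the obvious place to look in a given Python version.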

Having multiple ways to do things is the most natural thing
on earth and it's good that way.

Python does not and should not force people into doing things
in one dictated "right" way. It should, however, provide
natural choices and obvious hints to find a good solution.
And that's what the Zen mantra is all about.

> It's difficult for a user to choose between open() and codecs.open().

As I mentioned on the ticket and in my replies: I'm not against
changing codecs.open() to use a variant that is based on TextIOWrapper,
provided there are no user noticeable compatibility issues.

Thanks for reverting the patch.

Have a nice weekend,
-- 
Marc-Andre Lemburg
eGenix.com


From martin at v.loewis.de  Fri May 27 20:37:47 2011
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 27 May 2011 20:37:47 +0200
Subject: [Python-Dev] [ANN] Python 2.5.6 released
Message-ID: <4DDFEF7B.5020803@v.loewis.de>

On behalf of the Python development team and the Python community, I'm
happy to announce the release of Python 2.5.6. There were no changes
since the release candidate.

This is a source-only release that only includes security fixes. The
last full bug-fix release of Python 2.5 was Python 2.5.4. Users are
encouraged to upgrade to the latest release of Python 2.7 (which is
2.7.1 at this point). This release is most likely the final release of
Python 2.5; under the current release policy, no security issues in
Python 2.5 will be fixed after October, 2011.

This release fixes issues with the urllib, urllib2, SimpleHTTPServer,
and audioop modules. See the release notes at the website (also
available as Misc/NEWS in the source distribution) for details of bugs
fixed.

For more information on Python 2.5.6, including download links for
various platforms, release notes, and known issues, please see:

    http://www.python.org/2.5.6

Highlights of the previous major Python releases are available from
the Python 2.5 page, at

    http://www.python.org/2.5/highlights.html

Enjoy this release,
Martin

Martin v. Loewis
martin at v.loewis.de
Python Release Manager
(on behalf of the entire python-dev team)

From tjreedy at udel.edu  Fri May 27 22:30:31 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 27 May 2011 16:30:31 -0400
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <201105271708.06211.victor.stinner@haypocalc.com>
References: <1306195729.605.27.camel@marge>	<201105271529.15421.victor.stinner@haypocalc.com>	<BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com>
	<201105271708.06211.victor.stinner@haypocalc.com>
Message-ID: <irp1l8$9mr$1@dough.gmane.org>

On 5/27/2011 11:08 AM, Victor Stinner wrote:

> Tell me if I am wrong, but only Marc-Andre is against deprecating StreamReader

While I am, in general, in favor of removing some duplication, I was and 
am against doing this change precipitously. So I was for the reversion 
(noted), at least temporarily. Given the disagreement, I think there 
should be a PEP with pro and con arguments.

-- 
Terry Jan Reedy


From ncoghlan at gmail.com  Sat May 28 03:21:57 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 28 May 2011 11:21:57 +1000
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
In-Reply-To: <irp1l8$9mr$1@dough.gmane.org>
References: <1306195729.605.27.camel@marge>
	<201105271529.15421.victor.stinner@haypocalc.com>
	<BANLkTinrGOCVjObD4Gfz3U=kRMPzUFYBZA@mail.gmail.com>
	<201105271708.06211.victor.stinner@haypocalc.com>
	<irp1l8$9mr$1@dough.gmane.org>
Message-ID: <BANLkTimP3c33CuzxF9eazFGwkn_C8BjG4w@mail.gmail.com>

On Sat, May 28, 2011 at 6:30 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 5/27/2011 11:08 AM, Victor Stinner wrote:
>
>> Tell me if I am wrong, but only Marc-Andre is against deprecating
>> StreamReader
>
> While I am, in general, in favor of removing some duplication, I was and am
> against doing this change precipitously. So I was for the reversion (noted),
> at least temporarily. Given the disagreement, I think there should be a PEP
> with pro and con arguments.

Indeed.

I'm also against any deprecation in this area, since that just means
needless work for anyone who *does* use these APIs (even if those
people are few and far between). If we can refactor to remove the
duplication of functionality, that's a *much* better solution.

If we can carry optparse style argument parsing and 2.x style string
formatting, we can carry a couple of legacy codec interface
definitions.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From vinay_sajip at yahoo.co.uk  Sat May 28 16:57:15 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Sat, 28 May 2011 14:57:15 +0000 (UTC)
Subject: [Python-Dev] Deprecate codecs.open() and
	StreamWriter/StreamReader
References: <1306195729.605.27.camel@marge>
	<201105271529.15421.victor.stinner@haypocalc.com>
	<4DDFAA32.5030209@egenix.com>
	<201105271744.06307.victor.stinner@haypocalc.com>
Message-ID: <loom.20110528T164222-362@post.gmane.org>

Victor Stinner <victor.stinner <at> haypocalc.com> writes:

> It's difficult for a user to choose between open() and 
> codecs.open().

Is it? How about the following decision process?

If writing code for Python 3.x only, use open().

If writing code which has to work under both Python 2.x and 3.x, use
codecs.open().

BTW I have written code using StreamReader and StreamWriter in the past,
though it may not have been published on the Internet. Python is used a
lot by companies for internal systems. Such code is seldom published on the
Internet, so it seems that there's no real way of knowing how much
StreamReader/StreamWriter are used.
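
For readers who have not met these classes, direct use of StreamWriter and 
StreamReader typically looks something like the sketch below. It is purely 
illustrative (it is not the unpublished code mentioned above) and uses an 
in-memory io.BytesIO in place of a real file or socket.

```python
import codecs
import io

# codecs.getwriter() returns the codec's StreamWriter class; instantiating
# it wraps a byte stream so that text written to it is encoded on the fly.
buffer = io.BytesIO()
writer = codecs.getwriter("utf-8")(buffer)
writer.write(u"na\u00efve\n")
encoded = buffer.getvalue()

# codecs.getreader() does the reverse: its StreamReader decodes bytes
# pulled from the wrapped stream.
reader = codecs.getreader("utf-8")(io.BytesIO(encoded))
decoded = reader.read()
```

The same calls work on Python 2.x and 3.x, which is the portability point 
being made.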

When looking at porting projects to Python 3.x, I've always adopted a single
code-base approach for 2.x and 3.x, as I feel it's the path of least ongoing
maintenance and hence (in my experience) the path of least resistance to
providing 3.x support. Though of course I've no objection to implementing their
functionality in the most efficient way possible (which may well be
TextIOWrapper), IMO deprecating StreamReader/StreamWriter will make 2.x/3.x
portability harder to achieve, and so seems a step too far.

Regards,

Vinay Sajip


From greg at krypto.org  Sun May 29 11:29:15 2011
From: greg at krypto.org (Gregory P. Smith)
Date: Sun, 29 May 2011 02:29:15 -0700
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <BANLkTin6WNTTkfQoJd=fthmEgf25fheHPA@mail.gmail.com>
References: <20110521170725.51eab5f9@pitrou.net> <ir8m70$e7c$1@dough.gmane.org>
	<20110521160118.GA22904@kevin> <ir8tb1$k0o$1@dough.gmane.org>
	<BANLkTin6WNTTkfQoJd=fthmEgf25fheHPA@mail.gmail.com>
Message-ID: <BANLkTimcz+w+Op3bctonbV1dGaAiiV+SBA@mail.gmail.com>

On Sun, May 22, 2011 at 11:22 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Sun, May 22, 2011 at 3:38 AM, Georg Brandl <g.brandl at gmx.net> wrote:
> > On 05/21/11 18:01, Senthil Kumaran wrote:
> >> So a rewrite with good pointers would be more appropriate.
> >
> > Even then, it's better off in the Wiki until the rewrite is complete.
>
> Perhaps replacing it with a placeholder page that refers to the Wiki
> would be appropriate? A simple summary saying that the HOWTO had not
> aged well, and hence had been removed from the official documentation
> until it had been updated on the Wiki would allow people looking for
> it to better understand the situation, and also how to help improve
> it.
>

+1 on removal.

+0.8 on the pointer with a disclaimer (please also add the disclaimer at the
top of the socket howto as well).

there's a lot of editorial misinformation in that page even if some parts of
it are useful for the socket unaware...

-gps

From ncoghlan at gmail.com  Sun May 29 15:08:13 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 29 May 2011 23:08:13 +1000
Subject: [Python-Dev] [Python-checkins] cpython (3.2): Fix
 ProcessTestCasePOSIXPurePython to test the module from import when
In-Reply-To: <E1QQM9G-0005fv-JI@dinsdale.python.org>
References: <E1QQM9G-0005fv-JI@dinsdale.python.org>
Message-ID: <BANLkTi=7sbegV1wD50tDpeo2oESpCt=ABg@mail.gmail.com>

On Sun, May 29, 2011 at 2:13 AM, gregory.p.smith
<python-checkins at python.org> wrote:
> Ironically: I don't think any platform should ever actually _use_ the
> pure Python subprocess code on POSIX platforms anymore.  This at least
> tests it properly in this stable branch.  The pure python code for
> this is likely to be removed in 3.3.

Don't do that - keeping the pure Python equivalents around can help
reduce the level of effort for other implementations (especially
PyPy).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun May 29 15:09:07 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 29 May 2011 23:09:07 +1000
Subject: [Python-Dev] [Python-checkins] cpython (3.2): Fix
 ProcessTestCasePOSIXPurePython to test the module from import when
In-Reply-To: <BANLkTi=7sbegV1wD50tDpeo2oESpCt=ABg@mail.gmail.com>
References: <E1QQM9G-0005fv-JI@dinsdale.python.org>
	<BANLkTi=7sbegV1wD50tDpeo2oESpCt=ABg@mail.gmail.com>
Message-ID: <BANLkTindxfmLfPyoiVdMbZm-gcTiMA0kmw@mail.gmail.com>

On Sun, May 29, 2011 at 11:08 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Sun, May 29, 2011 at 2:13 AM, gregory.p.smith
> <python-checkins at python.org> wrote:
>> Ironically: I don't think any platform should ever actually _use_ the
>> pure Python subprocess code on POSIX platforms anymore.  This at least
>> tests it properly in this stable branch.  The pure python code for
>> this is likely to be removed in 3.3.
>
> Don't do that - keeping the pure Python equivalents around can help
> reduce the level of effort for other implementations (especially
> PyPy).

Never mind, you addressed that in a later checkin.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From martin at v.loewis.de  Sun May 29 17:20:29 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sun, 29 May 2011 17:20:29 +0200
Subject: [Python-Dev] The socket HOWTO
In-Reply-To: <20110521170725.51eab5f9@pitrou.net>
References: <20110521170725.51eab5f9@pitrou.net>
Message-ID: <4DE2643D.3070906@v.loewis.de>

> I would like to suggest that we remove the socket HOWTO (currently at
> http://docs.python.org/dev/howto/sockets.html)

-1. I think there should be a Python-oriented introduction to sockets.
You may have complaints about the specific wording of the text, but
please understand that these are probably irrelevant to most
first-time readers of this text. My observation is that people actually
don't read the text that much, but instead try to imitate the examples.
So if the examples are good (and I think they are, mostly), it's of
minor relevance whether the text makes complete sense the first time.

> - people who know sockets won't learn anything from it

True. People who know sockets just need to read the module
documentation. It is a beauty of the Python library design that it
exposes the API mostly as-is, so if you know Berkeley sockets,
you will be immediately familiar with Python sockets (unlike,
say, Java or .NET, where they decided to regroup the API into
classes).

> - but people who don't know sockets will probably find it clear as mud

See above - it doesn't really matter.

> (for example, what's an "INET" or "STREAM" socket?

You are probably referring to the sentence "I'm only going to talk about
INET sockets, but they account for at least 99% of the sockets in use.
And I'll only talk about STREAM sockets" here. It's not important
to first-time readers to actually understand that, and the wording
explicitly tells them that they don't need to understand. It says
"there is more stuff, and you won't need it, and the stuff you need
is called INET and STREAM".

It's easy to fix, though, and I fixed it in f70e26452621 (explaining
that this is all about TCPv4).

> what's "select"?)

It's well explained in the section Non-blocking Sockets, isn't it?

> I have other issues, such as the style/tone it's written in. I'm sure
> the author had fun writing it but it doesn't fit well with the rest of
> the documentation. Also, the author gives a lot of "advice" without
> explaining or justifying it

It's a HOWTO - of course it has advice without justification. It's
not reference documentation, which only tells you what it does, but
not what the best way of putting it together is.

> ("if somewhere in those input lists of
> sockets is one which has died a nasty death, the select will fail" ->
> is that really true?

I think it is:

py> import select
py> select.select([100],[],[],0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
select.error: (9, 'Bad file descriptor')

Of course, rather than "has died a nasty death", it could also say
"has been closed".
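
For current Python 3, where select.error is an alias of OSError, the
same failure can be sketched as below. This assumes a POSIX platform;
the helper name select_on_closed_fd and the use of a pipe (rather than
a socket) to obtain a descriptor that is then closed are choices made
for the example only.

```python
import errno
import os
import select


def select_on_closed_fd():
    """Return the errno raised by select() for a closed descriptor."""
    read_fd, write_fd = os.pipe()
    os.close(write_fd)
    os.close(read_fd)  # read_fd is now a stale descriptor number
    try:
        select.select([read_fd], [], [], 0)
    except OSError as exc:
        return exc.errno
    return None


closed_errno = select_on_closed_fd()
is_bad_fd = closed_errno == errno.EBADF
```

The whole select() call fails, so one stale descriptor in the list
hides the state of every other descriptor passed alongside it.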

> what is a "nasty death" and how is that supposed to
> happen? couldn't the author have put a 3-line example to demonstrate
> this supposed drawback and how it manifests?).

It may well be that the author didn't fully understand the problem
when writing the text, so I wouldn't mind removing this specific
paragraph.

> And, finally, many statements seem arbitrary ("There's no question that
> the fastest sockets code uses non-blocking sockets and select to
> multiplex them") or plain wrong ("threading support in Unixes varies
> both in API and quality. So the normal Unix solution is to fork a
> subprocess to deal with each connection").

I'd evaluate these two statements exactly vice versa. The first one
(non-blocking sockets are faster) is plain wrong, and the second one
("threading support in Unix varies") is arbitrary, but factually
correct :-)

I'd drop the entire "Performance" section - there is much more
to be said about socket performance than a few paragraphs of text,
and for the target audience, performance is probably no concern.

> Oh and I think it's obsolete too, because the "class mysocket"
> concatenates the output of recv() with a str rather than a bytes
> object.

That's easy to fix, too - c65e1a422bc3

> Not to mention that features of the "class mysocket" can be had
> using a buffered socket.makefile() instead of writing custom code.

I find it actually appropriate in the context. It illustrates a
number of important points about sockets, namely that you cannot
rely on send() and recv() to match in block size. Ultimately, people
that use the socket API *really* need to understand TCP, so it's
good to explain to them that there are issues to consider right
in the first tutorial.
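
The point about send() and recv() not matching in block size can be
sketched as follows. This is not the HOWTO's "class mysocket", only a
minimal illustration; the function name recv_exactly is hypothetical,
and socket.socketpair() is used just to get two connected stream
sockets without a server.

```python
import socket


def recv_exactly(sock, nbytes):
    """Keep calling recv() until exactly nbytes have arrived."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        # recv() may return fewer bytes than requested, and always
        # returns bytes (not str) on Python 3.
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError(
                "socket closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


a, b = socket.socketpair()
try:
    # Two sends on one side; the reader only cares about the byte count.
    a.sendall(b"hel")
    a.sendall(b"lo")
    message = recv_exactly(b, 5)
finally:
    a.close()
    b.close()
```

The reader never learns how the sender split the data, which is why a
protocol on top of TCP needs its own framing (fixed length, a length
prefix, or a delimiter).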

Regards,
Martin

From tiagoboldt at gmail.com  Sun May 29 15:41:52 2011
From: tiagoboldt at gmail.com (Tiago Boldt Sousa)
Date: Sun, 29 May 2011 14:41:52 +0100
Subject: [Python-Dev] PhD ideas
Message-ID: <BANLkTim_yE-kM45jWA8XZjXLL5kG86FGsw@mail.gmail.com>

Hi,

I'm currently finishing my MSc and am thinking about enrolling
in the PhD program. I was wondering if any of you would like to
suggest a research topic that could benefit the scientific
community and might also result in a potential improvement for
Python.

I love everything that's web related (Django here) and software
engineering but I don't yet have any idea for a research topic that
would be relevant for a PhD so I'm completely open to suggestions.

Please contact me directly.

Best regards

--
Tiago Boldt Sousa

From benjamin at python.org  Mon May 30 00:44:58 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 29 May 2011 17:44:58 -0500
Subject: [Python-Dev] [RELEASE] 3.1.4 release candidate 1
Message-ID: <BANLkTimsJhBjBH1CiqkSaw9pKmAE=_=+aw@mail.gmail.com>

On behalf of the Python development team, I'm happy as a swallow to announce a
release candidate for the fourth bugfix release for the Python 3.1 series,
Python 3.1.4.

3.1.4 will be the last bug fix release in the 3.1 series. After 3.1.4,
3.1 will be in security-only fix mode.

The Python 3.1 version series focuses on the stabilization and optimization of
the features and changes that Python 3.0 introduced.  For example, the new I/O
system has been rewritten in C for speed.  File system APIs that use unicode
strings now handle paths with undecodable bytes in them. Other features include
an ordered dictionary implementation, a condensed syntax for nested with
statements, and support for ttk Tile in Tkinter.  For a more extensive list of
changes in 3.1, see http://doc.python.org/3.1/whatsnew/3.1.html or Misc/NEWS in
the Python distribution.

This is a testing release. To download Python 3.1.4rc1 visit:

     http://www.python.org/download/releases/3.1.4/

A list of changes in 3.1.4 can be found here:

     http://hg.python.org/cpython/file/35419f276c60/Misc/NEWS

The 3.1 documentation can be found at:

     http://docs.python.org/3.1

Bugs can always be reported to:

     http://bugs.python.org


Enjoy!

--
Benjamin Peterson
Release Manager
benjamin at python.org
(on behalf of the entire python-dev team and 3.1.4's contributors)

From benjamin at python.org  Mon May 30 00:47:42 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 29 May 2011 17:47:42 -0500
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
Message-ID: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>

On behalf of the Python development team, I'm happy to announce the immediate
availability of Python 2.7.2 release candidate 1.

2.7.2 is the second bugfix release for the Python 2.7 series. 2.7 is the last
major version of the 2.x line and will be receiving bug fixes while new feature
development focuses on 3.x.

2.7 includes many features that were first released in Python 3.1. The faster io
module, the new nested with statement syntax, improved float repr, set literals,
dictionary views, and the memoryview object have been backported from 3.1. Other
features include an ordered dictionary implementation, unittest improvements, a
new sysconfig module, auto-numbering of fields in the str/unicode format method,
and support for ttk Tile in Tkinter.  For a more extensive list of changes in
2.7, see http://doc.python.org/dev/whatsnew/2.7.html or Misc/NEWS in the Python
distribution.

To download Python 2.7.2rc1 visit:

     http://www.python.org/download/releases/2.7.1/

The 2.7.2 changelog is at:

     http://hg.python.org/cpython/file/439396b06416/Misc/NEWS

2.7 documentation can be found at:

     http://docs.python.org/2.7/

This is a preview release. Assuming no major problems, 2.7.2 will be released in
two weeks. Please report any bugs you find to

     http://bugs.python.org/


Enjoy!

--
Benjamin Peterson
Release Manager
benjamin at python.org
(on behalf of the entire python-dev team and 2.7.2's contributors)

From jackdied at gmail.com  Mon May 30 01:11:02 2011
From: jackdied at gmail.com (Jack Diederich)
Date: Sun, 29 May 2011 19:11:02 -0400
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
Message-ID: <BANLkTinVkF14-doWJXSrqQVgf5XWMCZGJw@mail.gmail.com>

On Sun, May 29, 2011 at 6:47 PM, Benjamin Peterson <benjamin at python.org> wrote:
> 2.7.2 is the second bugfix release for the Python 2.7 series. 2.7 is the last
> major version of the 2.x line and will be receiving bug fixes while new feature
> development focuses on 3.x.
>
> 2.7 includes many features that were first released in Python 3.1.

It might not be clear to a casual reader that the features were
released in 2.7.0 and not 2.7.2.  We don't, but many projects do
release new features with bugfix version numbers - I'm looking at you,
Django.

-Jack

From benjamin at python.org  Mon May 30 01:13:03 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 29 May 2011 18:13:03 -0500
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <BANLkTinVkF14-doWJXSrqQVgf5XWMCZGJw@mail.gmail.com>
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
	<BANLkTinVkF14-doWJXSrqQVgf5XWMCZGJw@mail.gmail.com>
Message-ID: <BANLkTinPnsFWqCB2X1dN4cOOMsAPpGZr2w@mail.gmail.com>

2011/5/29 Jack Diederich <jackdied at gmail.com>:
> On Sun, May 29, 2011 at 6:47 PM, Benjamin Peterson <benjamin at python.org> wrote:
>> 2.7.2 is the second bugfix release for the Python 2.7 series. 2.7 is the last
>> major version of the 2.x line and will be receiving bug fixes while new feature
>> development focuses on 3.x.
>>
>> 2.7 includes many features that were first released in Python 3.1.
>
> It might not be clear to a casual reader that the features were
> released in 2.7.0 and not 2.7.2.  We don't, but many projects do
> release new features with bugfix version numbers - I'm looking at you,
> Django.

Okay. I suppose I can say "The 2.7 series" next time.



-- 
Regards,
Benjamin

From carl at oddbird.net  Mon May 30 02:49:55 2011
From: carl at oddbird.net (Carl Meyer)
Date: Sun, 29 May 2011 19:49:55 -0500
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <BANLkTinVkF14-doWJXSrqQVgf5XWMCZGJw@mail.gmail.com>
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
	<BANLkTinVkF14-doWJXSrqQVgf5XWMCZGJw@mail.gmail.com>
Message-ID: <4DE2E9B3.4050402@oddbird.net>



On 05/29/2011 06:11 PM, Jack Diederich wrote:
> We don't, but many projects do
> release new features with bugfix version numbers - I'm looking at you,
> Django.

Really? Do you have an example of a new Django feature that was released
in a bugfix version number? Just curious, since that's certainly not the
documented release policy. [1]

Carl


[1] https://docs.djangoproject.com/en/dev/internals/release-process/

From ralf at brainbot.com  Mon May 30 06:47:40 2011
From: ralf at brainbot.com (Ralf Schmitt)
Date: Mon, 30 May 2011 06:47:40 +0200
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com> (Benjamin
	Peterson's message of "Sun, 29 May 2011 17:47:42 -0500")
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
Message-ID: <878vtodcb7.fsf@muni.brainbot.com>

Benjamin Peterson <benjamin at python.org> writes:

> The 2.7.2 changelog is at:
>
>      http://hg.python.org/cpython/file/439396b06416/Misc/NEWS
>

The news file mentions that issue 1195 ("Problems on Linux with Ctrl-D
and Ctrl-C during raw_input") is fixed. That's not true, see:
http://bugs.python.org/msg135671

Does one need special roundup rights to reopen issues?

Cheers,
- Ralf

From victor.stinner at haypocalc.com  Mon May 30 10:26:44 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 30 May 2011 10:26:44 +0200
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <878vtodcb7.fsf@muni.brainbot.com>
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
	<878vtodcb7.fsf@muni.brainbot.com>
Message-ID: <201105301026.44557.victor.stinner@haypocalc.com>

Hi,

On Monday 30 May 2011 06:47:40, Ralf Schmitt wrote:
> Benjamin Peterson <benjamin at python.org> writes:
> > The 2.7.2 changelog is at:
> >      http://hg.python.org/cpython/file/439396b06416/Misc/NEWS
> 
> The news file mentions that issue 1195 ("Problems on Linux with Ctrl-D
> and Ctrl-C during raw_input") is fixed. That's not true, see:
> http://bugs.python.org/msg135671
> 
> Does one need special roundup rights to reopen issues?

Oh, I forgot that one. Please reopen the issue, I will apply your fix instead 
of mine.

Victor

From ralf at brainbot.com  Mon May 30 10:46:39 2011
From: ralf at brainbot.com (Ralf Schmitt)
Date: Mon, 30 May 2011 10:46:39 +0200
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <201105301026.44557.victor.stinner@haypocalc.com> (Victor
	Stinner's message of "Mon, 30 May 2011 10:26:44 +0200")
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
	<878vtodcb7.fsf@muni.brainbot.com>
	<201105301026.44557.victor.stinner@haypocalc.com>
Message-ID: <877h98mv80.fsf@muni.brainbot.com>

Victor Stinner <victor.stinner at haypocalc.com> writes:

>> Does one need special roundup rights to reopen issues?
>
> Oh, I forgot that one. Please reopen the issue, I will apply your fix instead 
> of mine.

I would love to do that, but as my above question implies I'm either too
stupid to do that or I'm missing the rights to do it :)

Cheers,
- Ralf

From victor.stinner at haypocalc.com  Mon May 30 10:55:38 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 30 May 2011 10:55:38 +0200
Subject: [Python-Dev] [RELEASE] Python 2.7.2 release candidate 1
In-Reply-To: <877h98mv80.fsf@muni.brainbot.com>
References: <BANLkTi=Hkdj4VuJsCUERDgEPLw4y2_Extw@mail.gmail.com>
	<201105301026.44557.victor.stinner@haypocalc.com>
	<877h98mv80.fsf@muni.brainbot.com>
Message-ID: <201105301055.39036.victor.stinner@haypocalc.com>

On Monday 30 May 2011 10:46:39, Ralf Schmitt wrote:
> Victor Stinner <victor.stinner at haypocalc.com> writes:
> >> Does one need special roundup rights to reopen issues?
> > 
> > Oh, I forgot that one. Please reopen the issue, I will apply your fix
> > instead of mine.
> 
> I would love to do that, but as my above question implies I'm either too
> stupid to do that or I'm missing the rights to do it :)

Oops, I am stupid. I reopened the issue.

Victor

From ziade.tarek at gmail.com  Mon May 30 18:44:43 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 30 May 2011 18:44:43 +0200
Subject: [Python-Dev] pysetup as a top script
Message-ID: <BANLkTimiPUBBPEzd04RpfDWAgTLL-HZnTQ@mail.gmail.com>

Hello

If no one objects, I'll promote Tools/scripts/pysetup3 to a top level
script that gets installed in scripts/ like 2to3, pydoc etc..

That way, people will be able to use it directly when installing,
removing projects, or studying what's installed

Cheers
Tarek
-- 
Tarek Ziadé | http://ziade.org

From merwok at netwok.org  Mon May 30 18:52:16 2011
From: merwok at netwok.org (=?UTF-8?Q?=C3=89ric_Araujo?=)
Date: Mon, 30 May 2011 18:52:16 +0200
Subject: [Python-Dev] Docs for the packaging module
Message-ID: <b35dbbc50253fd96327a62e906b8259f@netwok.org>

 Hi,

 The docs were not added alongside the code when packaging was merged
 back into CPython because they were not in a shape conforming with the
 rest of the docs.  I'd like your input on layout so that I can fix this
 ASAP and merge the docs.  (They would still require a lot of additions,
 fixes and improvements after that, but at least they'd be in the repo.)

 The easiest part is the library documentation, i.e. the docs for
 developers of packaging-related tools that want to use for example
 packaging.version or packaging.metadata to do their own stuff.  These
 documents should go into Doc/library/packaging.*; I think this is a
 no-brainer.  (Distutils has only a stub here; its API docs are mixed
 with its usage docs.)

 There is a guide for end-users, which contains an outdated copy of the
 old "Installing Python Modules" and a few documents about the new
 pysetup script (the successor to setup.py scripts), which are not
 integrated with the first document.  I think those should supersede the
 existing distutils-based Doc/install tree.  We want to advertise
 pysetup and packaging as the way of getting modules in 3.3.  A question
 remains: is it worthwhile to keep the old document somewhere?

 Last but not least, the doc for authors wanting to package and
 distribute their project ("Distributing Python Modules",
 Doc/distutils).  I think we should not overwrite this directory,
 because the directory name is tied to distutils and because there will
 be users needing that documentation (distutils is not removed).  So, is
 it okay to create a new Doc/packaging directory and change the link on
 the docs front page from distutils to packaging?
 (Technical question: I think I'll get a complaint from Sphinx that
 distutils is not included in any toctree; I'll try adding a toctree
 from library/distutils to distutils/index and see if that works.)

 Thanks for reading

From g.brandl at gmx.net  Mon May 30 19:04:51 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 30 May 2011 19:04:51 +0200
Subject: [Python-Dev] cpython: removed spurious output
In-Reply-To: <4DE3BDA6.1040100@udel.edu>
References: <E1QQzfO-0003da-B2@dinsdale.python.org> <4DE3BDA6.1040100@udel.edu>
Message-ID: <is0ind$m68$1@dough.gmane.org>

On 30.05.2011 17:54, Terry Reedy wrote:
> 
> 
> On 5/30/2011 6:25 AM, tarek.ziade wrote:
> 
> Should not old_out be sys.stderr, since that is what you over-write and 
> 'restore'?
> 
>> +        old_out = sys.stdout
>> +        sys.stderr = StringIO()
>> +        try:
>> +            dist = self.run_setup('install_dist', '--prefix=' + self.root_dir)
>> +        finally:
>> +            sys.sterr = old_out

And even more importantly, shouldn't this be "sys.stderr" instead of "sys.sterr"?

Really, what happened to testing before you push?

Georg


From ziade.tarek at gmail.com  Mon May 30 19:13:43 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Mon, 30 May 2011 19:13:43 +0200
Subject: [Python-Dev] cpython: removed spurious output
In-Reply-To: <is0ind$m68$1@dough.gmane.org>
References: <E1QQzfO-0003da-B2@dinsdale.python.org> <4DE3BDA6.1040100@udel.edu>
	<is0ind$m68$1@dough.gmane.org>
Message-ID: <BANLkTimVh-fh_tJZtfoT1hX_hg7pL=uNJA@mail.gmail.com>

On Mon, May 30, 2011 at 7:04 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> On 30.05.2011 17:54, Terry Reedy wrote:
>>
>>
>> On 5/30/2011 6:25 AM, tarek.ziade wrote:
>>
>> Should not old_out be sys.stderr, since that is what you over-write and
>> 'restore'?
>>
>>> +        old_out = sys.stdout
>>> +        sys.stderr = StringIO()
>>> +        try:
>>> +            dist = self.run_setup('install_dist', '--prefix=' + self.root_dir)
>>> +        finally:
>>> +            sys.sterr = old_out
>
> And even more importantly, shouldn't this be "sys.stderr" instead of "sys.sterr"?

Yes,

>
> Really, what happened to testing before you push?

I did test it, before and after my push, sir.

This was not to fix a test bug, but to avoid a spurious output in the tests.


> Georg
>



-- 
Tarek Ziadé | http://ziade.org

From g.brandl at gmx.net  Mon May 30 19:31:43 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 30 May 2011 19:31:43 +0200
Subject: [Python-Dev] cpython: removed spurious output
In-Reply-To: <BANLkTimVh-fh_tJZtfoT1hX_hg7pL=uNJA@mail.gmail.com>
References: <E1QQzfO-0003da-B2@dinsdale.python.org>
	<4DE3BDA6.1040100@udel.edu> <is0ind$m68$1@dough.gmane.org>
	<BANLkTimVh-fh_tJZtfoT1hX_hg7pL=uNJA@mail.gmail.com>
Message-ID: <is0k9o$2s8$1@dough.gmane.org>

On 30.05.2011 19:13, Tarek Ziadé wrote:
> On Mon, May 30, 2011 at 7:04 PM, Georg Brandl <g.brandl at gmx.net> wrote:
>> On 30.05.2011 17:54, Terry Reedy wrote:
>>>
>>>
>>> On 5/30/2011 6:25 AM, tarek.ziade wrote:
>>>
>>> Should not old_out be sys.stderr, since that is what you over-write and
>>> 'restore'?
>>>
>>>> +        old_out = sys.stdout
>>>> +        sys.stderr = StringIO()
>>>> +        try:
>>>> +            dist = self.run_setup('install_dist', '--prefix=' + self.root_dir)
>>>> +        finally:
>>>> +            sys.sterr = old_out
>>
>> And even more importantly, shouldn't this be "sys.stderr" instead of "sys.sterr"?
> 
> Yes,
> 
>>
>> Really, what happened to testing before you push?
> 
> I did test it, before and after my push, sir.
> 
> This was not to fix a test bug, but to avoid a spurious output in the tests.

Well, I assumed changing sys.stderr would be noticed as changing the execution
environment.

But as I've now found out, the test class itself cleans up sys.stderr, so you
couldn't have noticed the bug.  I apologize.

Georg


From ncoghlan at gmail.com  Tue May 31 07:13:06 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 May 2011 15:13:06 +1000
Subject: [Python-Dev] pysetup as a top script
In-Reply-To: <BANLkTimiPUBBPEzd04RpfDWAgTLL-HZnTQ@mail.gmail.com>
References: <BANLkTimiPUBBPEzd04RpfDWAgTLL-HZnTQ@mail.gmail.com>
Message-ID: <BANLkTinzDKL9Fh5gq5mqu_fR5jVGqzYxsw@mail.gmail.com>

On Tue, May 31, 2011 at 2:44 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
> Hello
>
> If no one objects, I'll promote Tools/scripts/pysetup3 to a top level
> script that gets installed in scripts/ like 2to3, pydoc etc..
>
> That way, people will be able to use it directly when installing,
> removing projects, or studying what's installed

Cool.

Now I'm trying to remember if it was a list discussion or the language
summit where you got the initial consensus on that approach...

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue May 31 07:18:28 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 May 2011 15:18:28 +1000
Subject: [Python-Dev] [Python-checkins] cpython: removed spurious output
In-Reply-To: <E1QQzfO-0003da-B2@dinsdale.python.org>
References: <E1QQzfO-0003da-B2@dinsdale.python.org>
Message-ID: <BANLkTikUyWRTq0xjucXJpXn105XAuoiqww@mail.gmail.com>

On Mon, May 30, 2011 at 8:25 PM, tarek.ziade <python-checkins at python.org> wrote:
> +        old_out = sys.stdout
> +        sys.stderr = StringIO()
> +        try:
> +            dist = self.run_setup('install_dist', '--prefix=' + self.root_dir)
> +        finally:
> +            sys.sterr = old_out

There's actually a helper for this in test.support:

  with support.captured_stderr():
    dist = self.run_setup('install_dist', '--prefix=' + self.root_dir)
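
For readers without the test package at hand, the save/replace/restore
pattern that such a helper wraps can be sketched like this. The name
capture_stderr is hypothetical and this is not the test.support
implementation, only a minimal equivalent that saves and restores the
same stream.

```python
import sys
from contextlib import contextmanager
from io import StringIO


@contextmanager
def capture_stderr():
    """Temporarily replace sys.stderr with an in-memory buffer."""
    saved = sys.stderr            # save the stream that gets replaced
    sys.stderr = buffer = StringIO()
    try:
        yield buffer
    finally:
        sys.stderr = saved        # restore the same attribute


before = sys.stderr
with capture_stderr() as captured:
    sys.stderr.write("noisy output\n")
restored = sys.stderr is before
```

Keeping the save and the restore in one place makes it hard to save
one stream and restore another, or to misspell the attribute on the
way out.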

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Tue May 31 07:44:17 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 May 2011 15:44:17 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Close #12028: Make
 threading._get_ident() public, rename it to
In-Reply-To: <E1QR9dN-0005rc-Tj@dinsdale.python.org>
References: <E1QR9dN-0005rc-Tj@dinsdale.python.org>
Message-ID: <BANLkTingX9zqUH+TWWbw_e+n=hdPhXxaBA@mail.gmail.com>

On Tue, May 31, 2011 at 7:04 AM, victor.stinner
<python-checkins at python.org> wrote:
> +.. function:: get_ident()
> +
> +   Return the 'thread identifier' of the current thread.  This is a nonzero
> +   integer.  Its value has no direct meaning; it is intended as a magic cookie
> +   to be used e.g. to index a dictionary of thread-specific data.  Thread
> +   identifiers may be recycled when a thread exits and another thread is
> +   created.

That's not quite true - the Thread id isn't relinquished until the
Thread object itself is destroyed, rather than when the underlying
thread finishes execution (i.e. the lifecycle of a_thread.ident is the
same as that of id(a_thread)).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ziade.tarek at gmail.com  Tue May 31 08:45:01 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Tue, 31 May 2011 08:45:01 +0200
Subject: [Python-Dev] pysetup as a top script
In-Reply-To: <BANLkTinzDKL9Fh5gq5mqu_fR5jVGqzYxsw@mail.gmail.com>
References: <BANLkTimiPUBBPEzd04RpfDWAgTLL-HZnTQ@mail.gmail.com>
	<BANLkTinzDKL9Fh5gq5mqu_fR5jVGqzYxsw@mail.gmail.com>
Message-ID: <BANLkTinHSg=YWQvDmY+QGxp5-K6rXJQK6A@mail.gmail.com>

On Tue, May 31, 2011 at 7:13 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Tue, May 31, 2011 at 2:44 AM, Tarek Ziadé <ziade.tarek at gmail.com> wrote:
>> Hello
>>
>> If no one objects, I'll promote Tools/scripts/pysetup3 to a top level
>> script that gets installed in scripts/ like 2to3, pydoc etc..
>>
>> That way, people will be able to use it directly when installing,
>> removing projects, or studying what's installed
>
> Cool.
>
> Now I'm trying to remember if it was a list discussion or the language
> summit where you got the initial consensus on that approach...

The thread starts here:
http://mail.python.org/pipermail/python-dev/2010-October/104535.html

The pysetup top-level script was mentioned here:
http://mail.python.org/pipermail/python-dev/2010-October/104581.html

Cheers
Tarek

>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
>



-- 
Tarek Ziadé | http://ziade.org

From neologix at free.fr  Tue May 31 09:17:23 2011
From: neologix at free.fr (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=)
Date: Tue, 31 May 2011 09:17:23 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Close #12028: Make
 threading._get_ident() public, rename it to
In-Reply-To: <BANLkTingX9zqUH+TWWbw_e+n=hdPhXxaBA@mail.gmail.com>
References: <E1QR9dN-0005rc-Tj@dinsdale.python.org>
	<BANLkTingX9zqUH+TWWbw_e+n=hdPhXxaBA@mail.gmail.com>
Message-ID: <BANLkTi=M0-CP922qZOEXWfCte-sTOkC55A@mail.gmail.com>

>> +.. function:: get_ident()
>> +
>> +   Return the 'thread identifier' of the current thread.  This is a nonzero
>> +   integer.  Its value has no direct meaning; it is intended as a magic cookie
>> +   to be used e.g. to index a dictionary of thread-specific data.  Thread
>> +   identifiers may be recycled when a thread exits and another thread is
>> +   created.
>
> That's not quite true - the Thread id isn't relinquished until the
> Thread object itself is destroyed, rather than when the underlying
> thread finishes execution (i.e. the lifecycle of a_thread.ident is the
> same as that of id(a_thread)).
>

I'm not sure I understand, Nick.
Since threads are started detached, their thread ID (e.g. returned by
pthread_self() on pthreads) can be reused as soon as the underlying OS
thread exits (i.e. returns from Modules/_threadmodule.c:t_bootstrap):

On a Linux kernel with NPTL:

$ cat /tmp/test.py
import threading

def print_ident():
    print(threading._get_ident())

t1  = threading.Thread(target=print_ident)
t2  = threading.Thread(target=print_ident)

t1.start()
t1.join()

t2.start()
t2.join()

print(id(t1), id(t2))
$ ./python /tmp/test.py
-1211954272
-1211954272
(3085561228L, 3083093028L)
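
A Python 3 variant of the same experiment, using the public
threading.get_ident() name from the checkin under discussion, might
look like the sketch below. The helper record_ident and the results
dictionary are assumptions for the example. Whether the two idents
come out equal depends on the platform's thread library, so the sketch
records them rather than assuming reuse.

```python
import threading


def record_ident(results, key):
    """Store the calling thread's identifier under key."""
    results[key] = threading.get_ident()


results = {}
t1 = threading.Thread(target=record_ident, args=(results, "t1"))
t2 = threading.Thread(target=record_ident, args=(results, "t2"))

# Run the threads strictly one after the other, as in the script above.
t1.start()
t1.join()
t2.start()
t2.join()

# The Thread objects are distinct even when the OS-level ids coincide.
distinct_objects = id(t1) != id(t2)
idents_match_attrs = (results["t1"] == t1.ident
                      and results["t2"] == t2.ident)
ident_was_reused = results["t1"] == results["t2"]  # platform dependent
```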


I'm just curious, maybe I missed something?

Thanks,

cf

From ncoghlan at gmail.com  Tue May 31 10:37:15 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 31 May 2011 18:37:15 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Close #12028: Make
 threading._get_ident() public, rename it to
In-Reply-To: <BANLkTi=M0-CP922qZOEXWfCte-sTOkC55A@mail.gmail.com>
References: <E1QR9dN-0005rc-Tj@dinsdale.python.org>
	<BANLkTingX9zqUH+TWWbw_e+n=hdPhXxaBA@mail.gmail.com>
	<BANLkTi=M0-CP922qZOEXWfCte-sTOkC55A@mail.gmail.com>
Message-ID: <BANLkTikX12cjFVdBVOB19SXMivZxhpFxuQ@mail.gmail.com>

2011/5/31 Charles-François Natali <neologix at free.fr>:
>>> +.. function:: get_ident()
>>> +
>>> +   Return the 'thread identifier' of the current thread.  This is a nonzero
>>> +   integer.  Its value has no direct meaning; it is intended as a magic cookie
>>> +   to be used e.g. to index a dictionary of thread-specific data.  Thread
>>> +   identifiers may be recycled when a thread exits and another thread is
>>> +   created.
>>
>> That's not quite true - the Thread id isn't relinquished until the
>> Thread object itself is destroyed, rather than when the underlying
>> thread finishes execution (i.e. the lifecycle of a_thread.ident is the
>> same as that of id(a_thread)).
>>
>
> I'm not sure I understand, Nick.

I was just wrong, but the wording is still confusing since it has been
copied from _thread.ident. "Thread" means something other than
"threading.Thread" in that module, while in the threading docs, it
typically refers to the actual objects. With the change of module,
there needs to be something to make it clearer that this is
information related to OS-level threads.
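
The distinction above can be sketched as follows. This is only an
illustration, assuming a Python version where the function is public as
threading.get_ident(); record_ident and results are hypothetical names
introduced for the example. It compares the OS-level identifier seen
inside each thread with the identity of the threading.Thread objects.

```python
import threading

def record_ident(results, key):
    # Runs inside the new thread: store the identifier seen there.
    results[key] = threading.get_ident()

results = {}

t1 = threading.Thread(target=record_ident, args=(results, "t1"))
t1.start()
t1.join()

t2 = threading.Thread(target=record_ident, args=(results, "t2"))
t2.start()
t2.join()

# Thread.ident keeps the value get_ident() returned inside each thread,
# even after the thread has exited.
print(t1.ident == results["t1"], t2.ident == results["t2"])

# Both Thread objects are still referenced, so they are distinct objects.
print(t1 is not t2)

# The OS may or may not hand the same identifier to the second thread,
# since the first one had already exited; nothing guarantees either way.
print("ident reused:", results["t1"] == results["t2"])
```

Whether the last line prints True depends on the platform and on timing,
which is why the documentation only says identifiers "may be recycled".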

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From victor.stinner at haypocalc.com  Tue May 31 10:51:45 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 31 May 2011 10:51:45 +0200
Subject: [Python-Dev] [Python-checkins] cpython: Close #12028: Make
	threading._get_ident() public, rename it to
In-Reply-To: <BANLkTikX12cjFVdBVOB19SXMivZxhpFxuQ@mail.gmail.com>
References: <E1QR9dN-0005rc-Tj@dinsdale.python.org>
	<BANLkTi=M0-CP922qZOEXWfCte-sTOkC55A@mail.gmail.com>
	<BANLkTikX12cjFVdBVOB19SXMivZxhpFxuQ@mail.gmail.com>
Message-ID: <201105311051.46234.victor.stinner@haypocalc.com>

On Tuesday 31 May 2011 10:37:15, Nick Coghlan wrote:
> I was just wrong, but the wording is still confusing since it has been
> copied from _thread.ident. "Thread" means something other than
> "threading.Thread" in that module, while in the threading docs, it
> typically refers to the actual objects. With the change of module,
> there needs to be something to make it clearer that this is
> information related to OS-level threads.

Yes, I copy-pasted the doc from Python 2.7, from thread.get_ident(). Feel free 
to edit the doc directly.

Victor

From fuzzyman at voidspace.org.uk  Tue May 31 15:19:27 2011
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 31 May 2011 14:19:27 +0100
Subject: [Python-Dev] Release pages malformed on python.org
Message-ID: <4DE4EADF.5080709@voidspace.org.uk>

Hello all,

I believe that the release manager is aware of this, but just in case... 
The web pages on python.org for the recent releases are malformatted:

     http://www.python.org/download/releases/3.1.4/
     http://www.python.org/download/releases/2.7.2/

These are the pages linked to from the news on the front page.

All the best,

Michael Foord

-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html


From benjamin at python.org  Tue May 31 16:01:15 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 31 May 2011 09:01:15 -0500
Subject: [Python-Dev] Release pages malformed on python.org
In-Reply-To: <4DE4EADF.5080709@voidspace.org.uk>
References: <4DE4EADF.5080709@voidspace.org.uk>
Message-ID: <BANLkTikqjfZvcPS_fyaiK-Ygdse9-iLATg@mail.gmail.com>

2011/5/31 Michael Foord <fuzzyman at voidspace.org.uk>:
> Hello all,
>
> I believe that the release manager is aware of this, but just in case... The
> web pages on python.org for the recent releases are malformatted:
>
>    http://www.python.org/download/releases/3.1.4/
>    http://www.python.org/download/releases/2.7.2/

Whoa. Martin, I think this is from your checkin?



-- 
Regards,
Benjamin

From techtonik at gmail.com  Tue May 31 19:04:10 2011
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 31 May 2011 20:04:10 +0300
Subject: [Python-Dev] Sniffing passwords from PyPI using insecure connection
Message-ID: <BANLkTikfTXNrvBQpJ-_kXHePon7ynAmOGw@mail.gmail.com>

Hi,

I'd like to escalate http://bugs.python.org/issue12226 ('use secured
channel for uploading packages to pypi') so that the fix is shipped with
the next Python 2.6+ releases.
This will prevent pydotorg passwords from being sniffed when packages are
submitted over public networks (such as hotels).

--
anatoly t.

From martin at v.loewis.de  Tue May 31 20:11:39 2011
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 31 May 2011 20:11:39 +0200
Subject: [Python-Dev] Release pages malformed on python.org
In-Reply-To: <BANLkTikqjfZvcPS_fyaiK-Ygdse9-iLATg@mail.gmail.com>
References: <4DE4EADF.5080709@voidspace.org.uk>
	<BANLkTikqjfZvcPS_fyaiK-Ygdse9-iLATg@mail.gmail.com>
Message-ID: <4DE52F5B.2050002@v.loewis.de>

On 31.05.2011 16:01, Benjamin Peterson wrote:
> 2011/5/31 Michael Foord <fuzzyman at voidspace.org.uk>:
>> Hello all,
>>
>> I believe that the release manager is aware of this, but just in case... The
>> web pages on python.org for the recent releases are malformatted:
>>
>>    http://www.python.org/download/releases/3.1.4/
>>    http://www.python.org/download/releases/2.7.2/
> 
> Whoa. Martin, I think this is from your checkin?

Indeed... I have now fixed it.

Regards,
Martin


From tjreedy at udel.edu  Tue May 31 21:05:29 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 31 May 2011 15:05:29 -0400
Subject: [Python-Dev] Sniffing passwords from PyPI using insecure
	connection
In-Reply-To: <BANLkTikfTXNrvBQpJ-_kXHePon7ynAmOGw@mail.gmail.com>
References: <BANLkTikfTXNrvBQpJ-_kXHePon7ynAmOGw@mail.gmail.com>
Message-ID: <is3e5o$qou$1@dough.gmane.org>

On 5/31/2011 1:04 PM, anatoly techtonik wrote:
> Hi,
>
> I'd like to escalate http://bugs.python.org/issue12226 ('use secured
> channel for uploading packages to pypi') so that the fix is shipped with
> the next Python 2.6+ releases.
> This will prevent pydotorg passwords from being sniffed when packages are
> submitted over public networks (such as hotels).

The requested one character change is
-    DEFAULT_REPOSITORY = 'http://pypi.python.org/pypi'
+    DEFAULT_REPOSITORY = 'https://pypi.python.org/pypi'

If Tarek (or perhaps Eric) agrees that it is appropriate and otherwise 
innocuous, then Martin and Barry can decide whether to include it in 2.5/2.6.
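
As a rough illustration of what the change amounts to, the sketch below
compares the two URLs from the diff. OLD_REPOSITORY, NEW_REPOSITORY and
uses_secure_channel are hypothetical names introduced here, not part of
distutils; only the two URL strings come from the requested change.

```python
from urllib.parse import urlparse

# The two values from the diff above.
OLD_REPOSITORY = 'http://pypi.python.org/pypi'
NEW_REPOSITORY = 'https://pypi.python.org/pypi'

def uses_secure_channel(url):
    """Return True if the URL would be fetched over HTTPS (hypothetical helper)."""
    return urlparse(url).scheme == 'https'

# Only the scheme differs; host and path stay the same.
old, new = urlparse(OLD_REPOSITORY), urlparse(NEW_REPOSITORY)
print(old.netloc == new.netloc, old.path == new.path)

# The new default is exactly one character longer than the old one.
print(len(NEW_REPOSITORY) - len(OLD_REPOSITORY))

print(uses_secure_channel(OLD_REPOSITORY), uses_secure_channel(NEW_REPOSITORY))
```

Note that this only checks the URL scheme; whether the client also
verifies the server certificate is a separate question that the one
character change does not address by itself.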

Terry Jan Reedy