From alon at  Fri Jun  1 12:07:28 2012
From: alon at (Alon Horev)
Date: Fri, 1 Jun 2012 13:07:28 +0300
Subject: [Python-ideas] setprofile and settrace inconsistency
Message-ID: <>


When setting a trace function with settrace, the trace function when called
with a new scope can return another trace function or None, indicating the
inner scope should not be traced.
I used settrace for some time but calling the trace function for every line
of code is a performance killer.
So I moved on to setprofile, which calls a trace function on every function
entry/exit. Now here's the problem: the return value from the trace
function is ignored (intentionally), denying the possibility to skip
tracing of 'hot' or 'not interesting' code.
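
A minimal sketch of the difference described above. Only sys.settrace and
sys.setprofile are real APIs here; the functions hot_loop and interesting are
made-up names used purely for illustration.

```python
import sys

def hot_loop():
    # Stand-in for CPU-heavy code we would like to skip.
    total = 0
    for i in range(100):
        total += i
    return total

def interesting():
    return hot_loop() + 1

trace_events = []

def tracer(frame, event, arg):
    name = frame.f_code.co_name
    if event == "call" and name == "hot_loop":
        return None  # settrace honours this: nothing inside hot_loop is traced
    trace_events.append((name, event))
    return tracer

profile_calls = []

def profiler(frame, event, arg):
    if event == "call":
        profile_calls.append(frame.f_code.co_name)
    return None  # setprofile ignores the return value, so nothing is skipped

sys.settrace(tracer)
try:
    interesting()
finally:
    sys.settrace(None)

sys.setprofile(profiler)
try:
    interesting()
finally:
    sys.setprofile(None)

print(trace_events)
print(profile_calls)
```

With settrace the scope of hot_loop produces no events at all, while the
profile function is still called for it.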

I would like to propose two alternatives:
1. setprofile will not ignore the return value and will mimic settrace's
behavior.
2. setprofile is just a wrapper around settrace that limits
its functionality. Let's make settrace more flexible so setprofile will be
redundant. Here's how: settrace will receive an argument called 'events',
and the trace function will fire only on events contained in that list. For
example: setprofile = partial(settrace, events=['call', 'return'])

I personally prefer the second.
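
The interface of the second alternative could be approximated today with a
pure-Python wrapper. This is only a sketch: settrace_events is a hypothetical
helper, not an existing function, and because the interpreter still invokes
the dispatcher for every event it shows the shape of the API rather than the
performance gain a built-in 'events' argument might bring.

```python
import sys
from functools import partial

def settrace_events(func, events):
    """Hypothetical helper: install func, forwarding only the chosen events."""
    wanted = frozenset(events)

    def dispatcher(frame, event, arg):
        if event in wanted:
            func(frame, event, arg)
        return dispatcher  # keep receiving events for this scope

    sys.settrace(dispatcher)

# The spelling from the proposal, built on the helper above.
setprofile_like = partial(settrace_events, events=("call", "return"))

seen = []

def log(frame, event, arg):
    seen.append((event, frame.f_code.co_name))

def add(a, b):
    total = a + b
    return total

setprofile_like(log)
try:
    add(1, 2)
finally:
    sys.settrace(None)

print(seen)
```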

Some context to this issue:
I'm building a python tracer - a logger that records each and every
function call. In order for it to run in production systems, the overhead
should be minimal. I would like to allow the user to say which
function/module/classes to trace or skip, for example: the user will skip
all math/cpu intensive operations. Another example: the user will want to
trace his django app code but not the django framework.

your thoughts?

                  Thanks, Alon Horev

From techtonik at  Fri Jun  1 17:08:21 2012
From: techtonik at (anatoly techtonik)
Date: Fri, 1 Jun 2012 18:08:21 +0300
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, May 29, 2012 at 9:02 AM, Nick Coghlan <ncoghlan at> wrote:
> Once again, you're completely ignoring all existing knowledge and
> expertise on open collaboration and trying to reinvent the world. It's
> *not going to happen*.

It's too boring to live in a world of existing knowledge and
expertise, and yes, I am not aware of any open collaboration stuff
expertise. Any reading recommendations with concentrated knowledge
that can fit my brain?

> The standard library is just the curated core, and *yes*, it's damn
> hard to get anything added to it (deliberately so). There's a place
> where anyone can post anything they want, and see if others find it
> useful: PyPI.

The major drawback of remote packages in general is that they bring
back project compilation from the old days. The biggest Python
advantage at all times was the "copy and run" ability.

The drawbacks of PyPI for this proposal are:
1. every function you need will require a separate upload to PyPI
2. you can't upload function with the same stdlib name, but slightly
different implementation as it is used in different projects
3. you can't find functions that people recommend to be included into stdlib
4. it is hard (impossible) to gather feedback on the quality of these proposals

> The standard library provides tools to upload to PyPI, and, as of 3.3,
> will even include tools to download and install from it.

I am glad 3.3 is giving us virtualenv and bootstrap stuff. It would
really rock if the new feature isn't set in stone right after
release and gets a few UX iterations with breakage allowed.

As for PyPI, the major drawback of it is security - DNS attack for a
couple of minutes, and one of your automatically deployed nodes is
trojan ready. I remember PyPI passwords are stored in clear text on the
developer's machine, but I don't remember if anyone turned off HTTP
basic authorization on PyPI to protect passwords travelling to PyPI
with every upload from interception. It would be an interesting
exercise to sniff PyPI passwords over WiFi during the next conference
(i.e. and match those to the
developer's accounts on * ;)

> If you don't like our ecosystem (it's hard to tell whether or not you
> do: everything you post is about how utterly awful and unusable
> everything is, yet you're still here years later).

You're absolutely right - I like the Python ecosystem, otherwise I
wouldn't stick there. It is like a vintage car - awesome, nice
looking, and there is even this new twisted pyusion engine inside,
but.. well - it's not for youngsters.

> If you think the PyPI UI is awful or inadequate, follow the example of
> or and *create your own*. There's far more
> to the Python universe than just core development, stop trying to
> shoehorn everything into a place where it doesn't belong.

I have absolutely no idea how the aforementioned post touches the PyPI UI.
Speaking about PyPI enhancements and the ecosystem, instead of reinventing
bicycles I'd rather patch the existing one. The only problem is that
patches are not accepted.

From cenkalti at  Fri Jun  1 23:10:03 2012
From: cenkalti at (Cenk Altı)
Date: Sat, 2 Jun 2012 00:10:03 +0300
Subject: [Python-ideas] Adding list.pluck()
Message-ID: <>

Hello All,

pluck() is a beautiful function from the underscore.js library.
It is described as "A convenient version of what is perhaps the most common
use-case for map: extracting a list of property values."

What about implementing it for Python lists? And maybe for other iterables?

From phd at  Fri Jun  1 23:16:30 2012
From: phd at (Oleg Broytman)
Date: Sat, 2 Jun 2012 01:16:30 +0400
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jun 02, 2012 at 12:10:03AM +0300, Cenk Altı <cenkalti at> wrote:
> pluck() is a beautiful function which is in underscore.js library.
> Described as "A convenient version of what is perhaps the most common
> use-case for map: extracting a list of property values."
> What about it implementing for python lists? And maybe for other iterables?

   Like operator.attrgetter?

     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

From mikegraham at  Fri Jun  1 23:18:45 2012
From: mikegraham at (Mike Graham)
Date: Fri, 1 Jun 2012 17:18:45 -0400
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 1, 2012 at 5:10 PM, Cenk Altı <cenkalti at> wrote:
> Hello All,
> pluck() is a beautiful function which is in underscore.js library.
> Described as "A convenient version of what is perhaps the most common
> use-case for map: extracting a list of property values."
> What about it implementing for python lists? And maybe for other iterables?

Using a generator expression or list comprehension to do this is so
easy and readable I don't see why we'd want something new in Python.


From alexandre.zani at  Fri Jun  1 23:25:24 2012
From: alexandre.zani at (Alexandre Zani)
Date: Fri, 1 Jun 2012 14:25:24 -0700
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

What if it's a list of objects instead of a list of dicts? List
comprehension already makes this easy:

[i['name'] for i in l]

I don't think this would add as much in python as it adds in javascript.


From cenkalti at  Fri Jun  1 23:48:04 2012
From: cenkalti at (Cenk Altı)
Date: Sat, 2 Jun 2012 00:48:04 +0300
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

l.pluck('name') is more readable IMO.


From alexandre.zani at  Sat Jun  2 00:53:48 2012
From: alexandre.zani at (Alexandre Zani)
Date: Fri, 1 Jun 2012 15:53:48 -0700
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

I must confess that I don't find "pluck" a very intuitive name for
this functionality. For me it was evocative of what pop currently
does. That's an N of 1 so maybe I'm just wrong on that one.

More importantly, this would make the use of a list method dependent
upon the type of the contained items. (works for dicts and nothing
else) That would be unprecedented for list methods and potentially
confusing. What would be the behavior if the list contains non-dicts?


From mwm at  Sat Jun  2 01:07:04 2012
From: mwm at (Mike Meyer)
Date: Fri, 1 Jun 2012 19:07:04 -0400
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, 2 Jun 2012 00:48:04 +0300
Cenk Altı <cenkalti at> wrote:

> l.pluck('name') is more readable IMO.

Only because you already associate "pluck" with that meaning.

As others said, "pluck" to me implies something like "pop". The list
comprehension spelling doesn't suffer from this problem, and provides
a lot more flexibility. If you don't like list comprehensions, use map
and the operator module.

Even if it is more readable, it's more semantic load. It's another
container operator (and one that's only useful in the special case of
a list of maps) people have to learn. Since it saves 0 lines of code
over either of existing mechanisms, the extra load comes for no


Mike Meyer <mwm at>
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail -

From malaclypse2 at  Sat Jun  2 01:16:37 2012
From: malaclypse2 at (Jerry Hill)
Date: Fri, 1 Jun 2012 19:16:37 -0400
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 1, 2012 at 5:48 PM, Cenk Altı <cenkalti at> wrote:
> l.pluck('name') is more readable IMO.

That's not how the library you linked to works, as far as I can tell.
Based on the sample usage at,  pluck is a
function taking an iterable of dictionaries and the key, so I think
the python equivalent is:

def pluck(iterable, key):
    return [item[key] for item in iterable]

stooges = [{'name' : 'moe', 'age' : 40},
           {'name' : 'larry', 'age' : 50},
           {'name' : 'curly', 'age' : 60}]

print (pluck(stooges, 'name'))

['moe', 'larry', 'curly']


From cenkalti at  Sat Jun  2 07:25:13 2012
From: cenkalti at (Cenk Altı)
Date: Sat, 2 Jun 2012 08:25:13 +0300
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

If the item has defined a "__getitem__" method it will be called;
otherwise "__getattribute__" is called.

I know list comprehensions do the same thing, but .pluck seems easier to
read and write, I think (no need to write a temporary variable in a list
comprehension, and also square brackets). Just an idea...
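
One possible reading of the lookup rule described above, written as a plain
function rather than a list method. This is a sketch only: the name pluck,
the hasattr check and the sample data are illustrative assumptions, not an
agreed design.

```python
from types import SimpleNamespace

def pluck(iterable, key):
    """Use item[key] when the item supports indexing, else getattr(item, key)."""
    values = []
    for item in iterable:
        if hasattr(type(item), "__getitem__"):
            values.append(item[key])           # dicts and other subscriptable items
        else:
            values.append(getattr(item, key))  # plain objects with attributes
    return values

dicts = [{"name": "moe"}, {"name": "larry"}]
objects = [SimpleNamespace(name="curly")]

print(pluck(dicts, "name"))
print(pluck(objects, "name"))
```

Note that any subscriptable item, such as a string, takes the first branch, so
a list of strings would raise TypeError for a key like "name".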


From masklinn at  Sat Jun  2 16:54:44 2012
From: masklinn at (Masklinn)
Date: Sat, 2 Jun 2012 16:54:44 +0200
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

On 2 June 2012, at 07:25, Cenk Altı <cenkalti at> wrote:

> If the item has defined a method "__getitem__" it will be called,
> else "__getattribute__" is called.

That is complete insanity and goes against pretty much all existing Python code. Some types make keys available as both items and attributes, but I do not know of any operation which does so.

> I know list comprehensions are same thing but .pluck seems easier to
> read and write I think (no need to write a temporary variable in list
> comprehension, and also square brackets). Just an idea...

Not a useful one, if you dislike the iteration variable of comprehensions you may use 'map' with itemgetter or attrgetter. Not to mention they are more flexible than pluck (they can extract multiple items or attributes, and attrgetter supports "deep" dotted paths)
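
For illustration, a short sketch of the extra flexibility mentioned above,
using the stooges data from elsewhere in this thread. The Person and Address
types and the city values are made up for the example.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

stooges = [{'name': 'moe', 'age': 40},
           {'name': 'larry', 'age': 50},
           {'name': 'curly', 'age': 60}]

# One key: the pluck() use case.
names = list(map(itemgetter('name'), stooges))

# Several keys at once: each result is a tuple.
pairs = list(map(itemgetter('name', 'age'), stooges))

# Dotted attribute paths with attrgetter (illustrative types and data).
Address = namedtuple('Address', 'city')
Person = namedtuple('Person', 'name address')
people = [Person('moe', Address('city-a')), Person('larry', Address('city-b'))]
cities = list(map(attrgetter('address.city'), people))

print(names, pairs, cities)
```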

From grosser.meister.morti at  Sat Jun  2 17:54:50 2012
From: grosser.meister.morti at (Mathias Panzenböck)
Date: Sat, 02 Jun 2012 17:54:50 +0200
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

There are already at least two easy ways to do this:

 >>> stooges=[{'name': 'moe', 'age': 40}, {'name': 'larry', 'age': 50}, {'name': 'curly', 'age': 60}]
 >>> [guy['name'] for guy in stooges]
['moe', 'larry', 'curly']
 >>> from operator import itemgetter
 >>> map(itemgetter('name'),stooges)
['moe', 'larry', 'curly']

Also I'm used to such functions being called "collect" (Ruby) or "map" (Python, jQuery) and 
accepting a function/block as an argument. In Ruby-on-Rails it can be &:name as a shorthand for 
{|item| item[:name]}, which is equivalent to itemgetter('name') in Python. So if you insist on 
making it shorter (but less readable) you could do:

 >>> from operator import itemgetter as G
 >>> map(G('name'),stooges)
['moe', 'larry', 'curly']


From guido at  Sat Jun  2 18:06:56 2012
From: guido at (Guido van Rossum)
Date: Sat, 2 Jun 2012 09:06:56 -0700
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

Forgive the out of context drive-by comments...

On Sat, Jun 2, 2012 at 8:54 AM, Mathias Panzenböck
<grosser.meister.morti at> wrote:
> There are already at least two easy ways to do this:
> >>> stooges=[{'name': 'moe', 'age': 40}, {'name': 'larry', 'age': 50}, {'name': 'curly', 'age': 60}]
> >>> [guy['name'] for guy in stooges]
> ['moe', 'larry', 'curly']

Bingo. Doesn't need improvements.

> >>> from operator import itemgetter
> >>> map(itemgetter('name'),stooges)
> ['moe', 'larry', 'curly']

If I saw this I would have to think a lot harder before I figured what
it meant. (Especially without the output example.)

Let's remember KISS.

--Guido van Rossum (

From ironfroggy at  Sat Jun  2 18:28:52 2012
From: ironfroggy at (Calvin Spealman)
Date: Sat, 2 Jun 2012 12:28:52 -0400
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 1, 2012 at 5:10 PM, Cenk Altı <cenkalti at> wrote:
> Hello All,
> pluck() is a beautiful function which is in underscore.js library.
> Described as "A convenient version of what is perhaps the most common
> use-case for map: extracting a list of property values."
> What about it implementing for python lists? And maybe for other iterables?

This is a case where a simple list comprehension or generator
expression would be a lot easier to understand than remembering what a
rarely used method name does. Also, it couples two distinct
interfaces, iterables and mappings, in a way that is generally frowned
upon in Python.


Read my blog! I depend on your acceptance of my opinion! I am interesting!
Follow me if you're into that sort of thing:

From ironfroggy at  Sat Jun  2 19:17:49 2012
From: ironfroggy at (Calvin Spealman)
Date: Sat, 2 Jun 2012 13:17:49 -0400
Subject: [Python-ideas] setprofile and settrace inconsistency
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 1, 2012 at 6:07 AM, Alon Horev <alon at> wrote:
> Hi,
> When setting a trace function with settrace, the trace function when called
> with a new scope can return another trace function or None, indicating the
> inner scope should not be traced.
> I used settrace for some time but calling the trace function for every line
> of code is a performance killer.
> So I moved on to setprofile, which calls a trace function every function
> entry/exit. now here's the problem: the return value from the trace function
> is ignored (intentionally), denying the possibility to skip tracing of 'hot'
> or 'not interesting' code.
> I would like to propose two alternatives:
> 1. setprofile will not ignore the return value and mimic settrace's
> behavior.
> 2. setprofile is just a wrapper around settrace that limits
> it's functionality, lets make settrace more flexible so setprofile will be
> redundant. here's how: settrace will recieve an argument called 'events',
> the trace function will fire only on events contained in that list. for
> example: setprofile = partial(settrace, events=['call', 'return'])

I particularly like the additional parameter for settrace().

> I personally prefer the second.
> Some context to this issue:
> I'm building a python tracer - a logger that records each and every function
> call. In order for it to run in production systems, the overhead should be
> minimal. I would like to allow the user to say which function/module/classes
> to trace or skip, for example: the user will skip all math/cpu intensive
> operations. another example: the user will want to trace his django app code
> but not the django framework.
> your thoughts?
>                   Thanks, Alon Horev

From ironfroggy at  Sat Jun  2 19:24:05 2012
From: ironfroggy at (Calvin Spealman)
Date: Sat, 2 Jun 2012 13:24:05 -0400
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 1, 2012 at 11:08 AM, anatoly techtonik <techtonik at> wrote:
> On Tue, May 29, 2012 at 9:02 AM, Nick Coghlan <ncoghlan at> wrote:
>> Once again, you're completely ignoring all existing knowledge and
>> expertise on open collaboration and trying to reinvent the world. It's
>> *not going to happen*.
> It's too boring to live in a world of existing knowledge and
> expertise,

Frankly, this one fragment is enough to stop me reading further. Who
wants to learn
from the vast and broad experience when you could simply randomize the rules of
reality through ignorance and stubbornness?

I sound fickle, because I am.


From storchaka at  Sat Jun  2 20:01:53 2012
From: storchaka at (Serhiy Storchaka)
Date: Sat, 02 Jun 2012 21:01:53 +0300
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <>
References: <>
Message-ID: <jqdken$e1e$>

On 02.06.12 19:06, Guido van Rossum wrote:
> On Sat, Jun 2, 2012 at 8:54 AM, Mathias Panzenböck
> <grosser.meister.morti at>  wrote:
>> >>> from operator import itemgetter
>> >>> map(itemgetter('name'),stooges)
>> ['moe', 'larry', 'curly']
> If I saw this I would have to think a lot harder before I figured what
> it meant. (Especially without the output example.)

And this is not true in Python 3.

<map object at 0xb747970c>

From anikom15 at  Sat Jun  2 20:50:21 2012
From: anikom15 at (Westley Martínez)
Date: Sat, 2 Jun 2012 11:50:21 -0700
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <jqdken$e1e$>
References: <>
Message-ID: <20120602185021.GA3249@kubrick>

On Sat, Jun 02, 2012 at 09:01:53PM +0300, Serhiy Storchaka wrote:
> On 02.06.12 19:06, Guido van Rossum wrote:
> >On Sat, Jun 2, 2012 at 8:54 AM, Mathias Panzenböck
> ><grosser.meister.morti at>  wrote:
> >> >>> from operator import itemgetter
> >> >>> map(itemgetter('name'),stooges)
> >> ['moe', 'larry', 'curly']
> >
> >If I saw this I would have to think a lot harder before I figured what
> >it meant. (Especially without the output example.)
> And this is not true in Python 3.
> <map object at 0xb747970c>

map returns a generator in Python 3.

From grosser.meister.morti at  Sat Jun  2 21:54:56 2012
From: grosser.meister.morti at (Mathias Panzenböck)
Date: Sat, 02 Jun 2012 21:54:56 +0200
Subject: [Python-ideas] Adding list.pluck()
In-Reply-To: <20120602185021.GA3249@kubrick>
References: <>
	<jqdken$e1e$> <20120602185021.GA3249@kubrick>
Message-ID: <>

On 06/02/2012 08:50 PM, Westley Martínez wrote:
> On Sat, Jun 02, 2012 at 09:01:53PM +0300, Serhiy Storchaka wrote:
>> On 02.06.12 19:06, Guido van Rossum wrote:
>>> On Sat, Jun 2, 2012 at 8:54 AM, Mathias Panzenböck
>>> <grosser.meister.morti at>   wrote:
>>>> >>> from operator import itemgetter
>>>> >>> map(itemgetter('name'),stooges)
>>>> ['moe', 'larry', 'curly']
>>> If I saw this I would have to think a lot harder before I figured what
>>> it meant. (Especially without the output example.)
>> And this is not true in Python 3.
>> <map object at 0xb747970c>
> map returns a generator in Python 3.

Yes, yes. I opened a Python 2 shell to write the example code. Python 2 is still the default in most 
(all?) Linux distributions. To get a list from that just wrap list() around it.

I consider this behaviour (that map returns a generator) in fact superior to what's available in 
other languages. You can then pass that to whatever constructor you like (e.g. set() or tuple()) or 
take only some of the values and then stop without calculating (and allocating) it all.
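
A small Python 3 sketch of the points above, reusing the stooges data from
earlier in the thread. Strictly speaking the map object is a lazy iterator
rather than a generator, but it can be consumed the same way.

```python
from itertools import islice
from operator import itemgetter

stooges = [{'name': 'moe', 'age': 40},
           {'name': 'larry', 'age': 50},
           {'name': 'curly', 'age': 60}]

# Wrap in list() to get the Python 2 style result.
as_list = list(map(itemgetter('name'), stooges))

# Or hand the lazy map object to another constructor.
as_set = set(map(itemgetter('age'), stooges))
as_tuple = tuple(map(itemgetter('name'), stooges))

# Or take only the first values; the rest are never computed unless asked for.
lazy_names = map(itemgetter('name'), stooges)
first_two = list(islice(lazy_names, 2))
remaining = list(lazy_names)

print(as_list, as_set, as_tuple, first_two, remaining)
```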


From g.rodola at  Mon Jun  4 01:09:31 2012
From: g.rodola at (Giampaolo Rodolà)
Date: Mon, 4 Jun 2012 01:09:31 +0200
Subject: [Python-ideas] Expose Linux-specific APIs in resource module
Message-ID: <>

From "man getrlimit" we have 5 Linux-specific constants which are
currently not exposed by the resource module:


Also, we have prlimit(), which is useful to get/set resources in a
per-process fashion based on process PID.
If desirable I can submit a patch for this.
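
For readers on later Python versions: resource.prlimit() is available on
Linux from Python 3.4, so per-PID usage can be sketched as below. The helper
name nofile_limits and the choice of RLIMIT_NOFILE are illustrative, and the
getattr guard is there because the function is missing on other platforms.

```python
import os
import resource

def nofile_limits(pid):
    """Return (soft, hard) open-file limits for pid, or None if unsupported."""
    prlimit = getattr(resource, "prlimit", None)
    if prlimit is None:
        return None  # platform or Python version without prlimit()
    # With no third argument prlimit() only reads the current limits.
    return prlimit(pid, resource.RLIMIT_NOFILE)

limits = nofile_limits(os.getpid())
print(limits)
```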

--- Giampaolo

From techtonik at  Mon Jun  4 11:47:48 2012
From: techtonik at (anatoly techtonik)
Date: Mon, 4 Jun 2012 12:47:48 +0300
Subject: [Python-ideas] (Was: shutil.runret and shutil.runout)
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, May 24, 2012 at 6:24 AM, geremy condra <debatem1 at> wrote:
> On Wed, May 23, 2012 at 7:00 PM, Steven D'Aprano <steve at>
> wrote:
>> anatoly techtonik wrote:
>>> I am all ears how to make more secure. Right now I must
>>> confess that I don't even serious is this problems, so if
>>> anyone can came up with a real-world example with explanation of
>>> security concern that could be copied "as-is" into documentation, it
>>> will surely be appreciated not only by me.
>> Start here:
>> Code injection attacks include two of the top three security
>> vulnerabilities, over even buffer overflows.
>> One sub-category of code injection:
>> OS Command Injection

Great links. Thanks. Are they still too generic to be placed in the docs?

> I talked about this in my pycon talk this year. It's easy to avoid and
> disastrous to get wrong. Please don't do it this way.

Sorry, I don't have much time to watch it right now. Any specific
slides, ideas or excerpts?
anatoly t.

From debatem1 at  Tue Jun  5 08:00:34 2012
From: debatem1 at (geremy condra)
Date: Mon, 4 Jun 2012 23:00:34 -0700
Subject: [Python-ideas] (Was: shutil.runret and shutil.runout)
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 4, 2012 at 2:47 AM, anatoly techtonik <techtonik at>wrote:

> On Thu, May 24, 2012 at 6:24 AM, geremy condra <debatem1 at> wrote:
> > On Wed, May 23, 2012 at 7:00 PM, Steven D'Aprano <steve at>
> > wrote:
> >>
> >> anatoly techtonik wrote:
> >>
> >>> I am all ears how to make more secure. Right now I must
> >>> confess that I don't even serious is this problems, so if
> >>> anyone can came up with a real-world example with explanation of
> >>> security concern that could be copied "as-is" into documentation, it
> >>> will surely be appreciated not only by me.
> >>
> >>
> >> Start here:
> >>
> >>
> >>
> >> Code injection attacks include two of the top three security
> >> vulnerabilities, over even buffer overflows.
> >>
> >> One sub-category of code injection:
> >>
> >> OS Command Injection
> >>
> Great links. Thanks. Do they still too generic to be placed in docs?
> >
> > I talked about this in my pycon talk this year. It's easy to avoid and
> > disastrous to get wrong. Please don't do it this way.
> Sorry, don't have too much time to watch it right now. Any specific
> slides, ideas or exceprts?

The main idea was just that by combining a bit of awareness of common
security anti-patterns (like this one) with a good test regimen and some
script kiddie tools you can protect yourself from a lot of common
vulnerabilities without being a security guru. I demonstrated how that
process works on something fairly similar to this, but if you're interested
in more details I'm happy to blather on or dredge up my slides.

Geremy Condra

> anatoly t.

From steve at  Tue Jun  5 08:14:52 2012
From: steve at (Steven D'Aprano)
Date: Tue, 5 Jun 2012 16:14:52 +1000
Subject: [Python-ideas] (Was: shutil.runret and shutil.runout)
In-Reply-To: <>
References: <>
Message-ID: <20120605061451.GA17873@ando>

On Mon, Jun 04, 2012 at 11:00:34PM -0700, geremy condra wrote:

> The main idea was just that by combining a bit of awareness of common
> security anti-patterns (like this one) with a good test regimen and some
> script kiddie tools you can protect yourself from a lot of common
> vulnerabilities without being a security guru. I demonstrated how that
> process works on something fairly similar to this, but if you're interested
> in more details I'm happy to blather on or dredge up my slides.

I am interested in more details. Would this make a good How (Not) To for 
the documentation?


From debatem1 at  Tue Jun  5 08:45:43 2012
From: debatem1 at (geremy condra)
Date: Mon, 4 Jun 2012 23:45:43 -0700
Subject: [Python-ideas] (Was: shutil.runret and shutil.runout)
In-Reply-To: <20120605061451.GA17873@ando>
References: <>
Message-ID: <>

On Mon, Jun 4, 2012 at 11:14 PM, Steven D'Aprano <steve at> wrote:

> On Mon, Jun 04, 2012 at 11:00:34PM -0700, geremy condra wrote:
> > The main idea was just that by combining a bit of awareness of common
> > security anti-patterns (like this one) with a good test regimen and some
> > script kiddie tools you can protect yourself from a lot of common
> > vulnerabilities without being a security guru. I demonstrated how that
> > process works on something fairly similar to this, but if you're interested
> > in more details I'm happy to blather on or dredge up my slides.
> I am interested in more details. Would this make a good How (Not) To for
> the documentation?

Combined with some other material I have on hand it might. The only problem
is that I don't really know my way around Sphinx. If there are any
doc wizards on hand to help with formatting, we could probably make pretty
quick work of it.

Geremy Condra

> --
> Steven

From ncoghlan at  Tue Jun  5 09:08:29 2012
From: ncoghlan at (Nick Coghlan)
Date: Tue, 5 Jun 2012 17:08:29 +1000
Subject: [Python-ideas] (Was: shutil.runret and shutil.runout)
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jun 5, 2012 at 4:45 PM, geremy condra <debatem1 at> wrote:
> Combined with some other material I have on hand it might. Only problem
> would be that I don't really know my way around Sphinx- if there are any doc
> wizards on hand to help with formatting we could probably make a pretty
> quick job of it.

Yep, if you can provide a plain text version, we can take it from there.

I suggest attaching it to (which is
about taking a more consistent and holistic approach to documenting
security considerations in the library reference without having
modules like subprocess stuck as a wall of red security warning


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rurpy at  Tue Jun  5 19:20:01 2012
From: rurpy at (Rurpy)
Date: Tue, 5 Jun 2012 10:20:01 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
Message-ID: <>

In my first foray into Python3 I've encountered this problem:
I work in a multi-language environment.  I've written a number 
of tools, mostly command-line, that generate output on stdout.
Because these tools and their output are used by various people
in varying environments, the tools all have an --encoding option
to provide output that meets the needs and preferences of the
output's ultimate consumers. 

In converting them to Python3, I found the best (if not very 
pleasant) way to do this in Python3 was to put something like 
this near the top of each tool[*1]:

  import codecs
  sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)

What I want to be able to put there instead is:

  sys.stdout.set_encoding (opts.encoding)

The former I found on the internet -- there is zero probability
I could have figured that out from the Python docs.  It is obscure
to anyone (who has like me generally only needed to deal with 
.encode() and .decode()) who hasn't encountered it before or 
dealt much with the codecs module.  It is excessively complex 
for what is conceptually a simple and straight-forward operation.  
It requires the import of the codecs module in programs that otherwise
don't need it [*2], and the reading of the codecs docs (not
a shining example of clarity themselves) to understand it.  In 
short it is butt ugly relative to what I generally get in Python.

Would it be feasible to provide something like .set_encoding()
on textio streams?  (Or make .encoding a writable property?  It
seems to be intentionally non-writable for some reason, but is that
reason really unavoidable?)  If doing this for textio in general is
too hard, then what about encapsulating the codecs stuff above in
a sys.set_encoding() function?

Needing to change the encoding of a sys.std* stream is not an 
uncommon need and a user should not have to go through the 
codecs dance above to do so IMO.

[*1] There are other ways to change stdout's encoding but they
 all have problems AFAICT.  PYTHONIOENCODING can't easily be
 changed dynamically within a program.  Reopening stdout as binary,
 or using the binary interface to text stdout, requires an explicit
 encode call at each write site.  Overloading print() is obscure
 because it requires the reader to notice print was overloaded.

[*2] I don't mean the actual import of the codecs module which
 occurs anyway; I mean the extra visual and cognitive noise 
 introduced by the presence of the import statement in the source.

From stephen at  Tue Jun  5 21:37:16 2012
From: stephen at (Stephen J. Turnbull)
Date: Wed, 06 Jun 2012 04:37:16 +0900
Subject: [Python-ideas]  changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

Rurpy writes:

 > It is excessively complex for what is conceptually a simple and
 > straight-forward operation.

The operation is not conceptually straightforward.  The problem is
that you can't just change the encoding of an open stream, because
encodings are generally stateful.  The straightforward way to deal with this
issue is to close the stream and reinitialize it.  Your proposed
.set_encoding() method implies something completely different about
what's going on.

I wouldn't object to a method with the semantics of reinitialization,
but it should have a name implying reinitialization.  It probably
should also error if the stream is open and has been written to.

 > Needing to change the encoding of a sys.std* stream is not an 
 > uncommon need and a user should not have to go through the 
 > codecs dance above to do so IMO.

I suspect needing to *change* the encoding of an open stream is
generally quite rare.  Needing to *initialize* the std* streams with
an appropriate codec is common.  That's why it doesn't so much matter
that PYTHONIOENCODING can't be changed within a program.

I agree that use of PYTHONIOENCODING is pretty awkward.

From amauryfa at  Tue Jun  5 23:22:27 2012
From: amauryfa at (Amaury Forgeot d'Arc)
Date: Tue, 5 Jun 2012 23:22:27 +0200
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

2012/6/5 Stephen J. Turnbull <stephen at>

> I wouldn't object to a method with the semantics of reinitialization,
> but it should have a name implying reinitialization.  It probably
> should also error if the stream is open and has been written to.

What do you think of the following method TextIOWrapper.reset_encoding?
(the assert statements should certainly be replaced by some IOError)

    def reset_encoding(self, encoding, errors='strict'):
        if self._decoder:
            # No decoded chars awaiting read
            assert self._decoded_chars_used == len(self._decoded_chars)
            # Nothing in the input buffer
            buf, flag = self._decoder.getstate()
            assert buf == b''
        if self._encoder:
            # Nothing in the output buffer
            buf = self._encoder.encode('', final=True)
            assert buf == b''
        # Reset the decoders
        self._decoder = None
        self._encoder = None
        # Now change the encoding
        self._encoding = encoding
        self._errors = errors
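
For comparison with the method sketched above: Python 3.7 and later provide
io.TextIOWrapper.reconfigure(), which did not exist when this was written.
A minimal sketch, assuming Python 3.7+ and using an in-memory io.BytesIO as
a stand-in for a real stream (the names raw and stream are illustrative
only):

```python
import io

# In-memory binary buffer standing in for sys.stdout.buffer (illustrative).
raw = io.BytesIO()
# newline="\n" disables newline translation so the bytes are the same on
# every platform.
stream = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

stream.write("caf\u00e9\n")   # this line is encoded as UTF-8

# reconfigure() flushes pending output, then switches the encoder.
stream.reconfigure(encoding="latin-1", errors="strict")

stream.write("caf\u00e9\n")   # this line is encoded as Latin-1
stream.flush()

# Both encodings now sit back to back in the same byte stream.
data = raw.getvalue()
```

Note that the result is a byte stream with mixed encodings and no marker
between them, which is exactly the situation the replies in this thread
warn about.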

Amaury Forgeot d'Arc

From victor.stinner at  Wed Jun  6 01:34:00 2012
From: victor.stinner at (Victor Stinner)
Date: Wed, 6 Jun 2012 01:34:00 +0200
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

2012/6/5 Rurpy <rurpy at>:
> In my first foray into Python3 I've encountered this problem:
> I work in a multi-language environment.  I've written a number
> of tools, mostly command-line, that generate output on stdout.
> Because these tools and their output are used by various people
> in varying environments, the tools all have an --encoding option
> to provide output that meets the needs and preferences of the
> output's ultimate consumers.

What happens if the specified encoding is different than the encoding
of the console? Mojibake?

If the output is used as the input of another program, does the
other program use the same encoding?

In my experience, using an encoding different than the locale encoding
for input/output (stdout, environment variables, command line
arguments, etc.) causes various issues. So I'm curious of your use

> In converting them to Python3, I found the best (if not very
> pleasant) way to do this in Python3 was to put something like
> this near the top of each tool[*1]:
>   import codecs
>   sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)

In Python 3, you should use io.TextIOWrapper instead of
codecs.StreamWriter. It's more efficient and has fewer bugs.

> What I want to be able to put there instead is:
>   sys.stdout.set_encoding (opts.encoding)

I don't think that your use case merits a new method on
io.TextIOWrapper: replacing sys.stdout does work and should be used
instead. TextIOWrapper is generic and your use case is specific to
sys.std* streams.

It would be surprising to change the encoding of an arbitrary file
after it is opened. At least, I don't see the use case.

For example, opens a Python source code file with the
right encoding. It starts by reading the file in binary mode to detect
the encoding, and then uses TextIOWrapper to get a text file without
having to reopen the file. It would be possible to start with a text
file and then change the encoding, but it would be less elegant.

>  sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)

You should also flush sys.stdout (and maybe also sys.stdout.buffer)
before replacing it.
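
A minimal sketch of the two points above (wrap the binary buffer in an
io.TextIOWrapper, and flush the old stream first). The helper name rewrap
is hypothetical, not an existing API, and the sketch assumes the stream
passed in exposes .buffer and .line_buffering the way sys.stdout normally
does:

```python
import io


def rewrap(stream, encoding, errors="strict"):
    """Return a new text wrapper over stream's binary buffer.

    Hypothetical helper for illustration; not part of the standard library.
    """
    # Flush first, so text written earlier reaches the buffer in the old
    # encoding before the new wrapper starts writing.
    stream.flush()
    # The caller must keep the old wrapper alive: if it is garbage
    # collected it closes the shared buffer.
    return io.TextIOWrapper(
        stream.buffer,
        encoding=encoding,
        errors=errors,
        line_buffering=stream.line_buffering,
    )
```

A tool could then rebind the stream once at startup, for example
sys.stdout = rewrap(sys.stdout, opts.encoding). In that case the original
wrapper remains reachable as sys.__stdout__, which should keep the shared
buffer open.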

> It requires the import of the codecs module in programs that other-
> wise don't need it [*2], and the reading of the codecs docs (not
> a shining example of clarity themselves) to understand it.

It's maybe difficult to change the encoding of sys.stdout at runtime
because it is NOT a good idea :-)

> Needing to change the encoding of a sys.std* stream is not an
> uncommon need and a user should not have to go through the
> codecs dance above to do so IMO.

Replacing sys.std* works but has issues: output written before the
replacement is encoded to a different encoding for example. The best
way is to change your locale encoding (using LC_ALL, LC_CTYPE or LANG
environment variable on UNIX), or simply to set PYTHONIOENCODING
environment variable.

> [*1] There are other ways to change stdout's encoding but they
>  all have problems AFAICT.  PYTHONIOENCODING can't easily be
>  changed dynamically within program.

Ah? Detect if PYTHONIOENCODING is present (or if sys.stdout.encoding
is the requested encoding), if not: restart the program with

>  Overloading print() is obscure
>  because it requires reader to notice print was overloaded.

Why not write the output into a file, instead of stdout?


From python at  Wed Jun  6 01:56:55 2012
From: python at (MRAB)
Date: Wed, 06 Jun 2012 00:56:55 +0100
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On 06/06/2012 00:34, Victor Stinner wrote:
> 2012/6/5 Rurpy<rurpy at>:
>>  In my first foray into Python3 I've encountered this problem:
>>  I work in a multi-language environment.  I've written a number
>>  of tools, mostly command-line, that generate output on stdout.
>>  Because these tools and their output are used by various people
>>  in varying environments, the tools all have an --encoding option
>>  to provide output that meets the needs and preferences of the
>>  output's ultimate consumers.
> What happens if the specified encoding is different than the encoding
> of the console? Mojibake?
> If the output is used as the input of another program, does the
> other program use the same encoding?
> In my experience, using an encoding different than the locale encoding
> for input/output (stdout, environment variables, command line
> arguments, etc.) causes various issues. So I'm curious of your use
> cases.
>>  In converting them to Python3, I found the best (if not very
>>  pleasant) way to do this in Python3 was to put something like
>>  this near the top of each tool[*1]:
>>    import codecs
>>    sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)
> In Python 3, you should use io.TextIOWrapper instead of
> codecs.StreamWriter. It's more efficient and has fewer bugs.
>>  What I want to be able to put there instead is:
>>    sys.stdout.set_encoding (opts.encoding)
> I don't think that your use case merits a new method on
> io.TextIOWrapper: replacing sys.stdout does work and should be used
> instead. TextIOWrapper is generic and your use case is specific to
> sys.std* streams.
> It would be surprising to change the encoding of an arbitrary file
> after it is opened. At least, I don't see the use case.

And if you _do_ want multiple encodings in a file, it's clearer to open
the file as binary and then explicitly encode to bytes and write _that_
to the file.
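
A minimal sketch of that approach, assuming a binary stream such as
sys.stdout.buffer or a file opened with mode "wb". The helper name
write_encoded and the sample text are illustrative only, and an in-memory
io.BytesIO stands in for the real stream:

```python
import io


def write_encoded(binary_stream, text, encoding):
    """Encode text explicitly and write the resulting bytes.

    Hypothetical helper for illustration.
    """
    binary_stream.write(text.encode(encoding))


# Stand-in for sys.stdout.buffer or a file opened with mode "wb".
out = io.BytesIO()

# Each line states its own encoding at the write site, so a reader of the
# code can see exactly where the encoding changes.
for enc in ("shift_jis", "euc-jp", "iso2022_jp"):
    write_encoded(out, "This is %s text: \u65e5\u672c\u8a9e\n" % enc, enc)
```

The trade-off is one explicit encode per write, in exchange for never
having a text stream whose encoding silently differs from what its
.encoding attribute reported earlier.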

From stephen at  Wed Jun  6 05:28:57 2012
From: stephen at (Stephen J. Turnbull)
Date: Wed, 06 Jun 2012 12:28:57 +0900
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

Amaury Forgeot d'Arc writes:
 > 2012/6/5 Stephen J. Turnbull <stephen at>
 > > I wouldn't object to a method with the semantics of reinitialization,
 > > but it should have a name implying reinitialization.  It probably
 > > should also error if the stream is open and has been written to.
 > >
 > What do you think of the following method TextIOWrapper.reset_encoding?
 > (the assert statements should certainly be replaced by some
 > IOError)

I think that it's an attractive nuisance because it doesn't close the
stream, and therefore permits changing the encoding without any
warning partway through the stream.  There are two reasonable (for a
very generous definition of "reasonable"<wink/>) ways to handle
multiple scripts in one stream: Unicode and ISO 2022.  Simply changing
encodings in the middle is a recipe for disaster in the absence of a
higher-level protocol for signaling this change (that's the role ISO
2022 fulfils, but it is detested by almost everybody...).  If you want
to do that kind of thing, the "import codecs; sys.stdout = ..." idiom
is available, but I don't see a need to make it convenient.

But the OP's request is pretty clearly not for a generic
.set_encoding(), it's for a more convenient way to initialize the
stream for users.

Aside to Victor: at least on Mac OS X, I find that Python 3.2 (current
MacPorts, I can investigate further if you need it) doesn't respect
the language environment as I would expect it to.  "LC_ALL=ja_JP.UTF8
python32" will give me an out-of-range Unicode error if I try to input
Japanese using "import sys; sys.stdin.readline()" -- I have to use
"PYTHONIOENCODING=UTF8" to get useful behavior.

There may also be cases where multiple users with different language
needs are working at the same workstation.

For both of these cases a command-line option to initialize the
encoding would be convenient.

From ncoghlan at  Wed Jun  6 07:49:16 2012
From: ncoghlan at (Nick Coghlan)
Date: Wed, 6 Jun 2012 15:49:16 +1000
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jun 6, 2012 at 1:28 PM, Stephen J. Turnbull <stephen at> wrote:
> For both of these cases a command-line option to initialize the
> encoding would be convenient.

Before adding yet-another-command-line-option, the cases where the
existing environment variable support can't be used from the command
line, but a new option could be, should be clearly enumerated.

$ python3
Python 3.2.1 (default, Jul 11 2011, 18:54:42)
[GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'UTF-8'
$ PYTHONIOENCODING=latin-1 python3
Python 3.2.1 (default, Jul 11 2011, 18:54:42)
[GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'latin-1'


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rurpy at  Wed Jun  6 08:05:35 2012
From: rurpy at (Rurpy)
Date: Tue, 5 Jun 2012 23:05:35 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
Message-ID: <>

On 06/05/2012 01:37 PM, Stephen J. Turnbull wrote:
> Rurpy writes:
>  > It is excessively complex for what is conceptually a simple and
>  > straight-forward operation.
> The operation is not conceptually straightforward.  The problem is
> that you can't just change the encoding of an open stream, encodings
> are generally stateful.  The straightforward way to deal with this
> issue is to close the stream and reinitialize it.  Your proposed
> .set_encoding() method implies something completely different about
> what's going on.

I'm not sure why stateful matters.  When you change encoding
you discard whatever state exists and start with the new encoder
in its initial state.  If there is a partially en/decoded
character, then wouldn't you do the same thing you'd do if the same
condition arose at EOF?

> I wouldn't object to a method with the semantics of reinitialization,
> but it should have a name implying reinitialization.  It probably
> should also error if the stream is open and has been written to.
>  > Needing to change the encoding of a sys.std* stream is not an 
>  > uncommon need and a user should not have to go through the 
>  > codecs dance above to do so IMO.
> I suspect needing to *change* the encoding of an open stream is
> generally quite rare.  Needing to *initialize* the std* streams with
> an appropriate codec is common.  That's why it doesn't so much matter
> that PYTHONIOENCODING can't be changed within a program.

You are correct that my current concern is reinitializing 
the encoding(s) of the sys.std* streams prior to doing any
operations with them.  I thought that changing the encoding
at any point would be a straight-forward generalization.
However, I have in the past encountered programs that output mixed
encodings in two contexts: generating test data (I think it was
for automatic detection and extraction of information), and
bundling multiple differently-encoded data sets in one package
that were pulled apart again downstream.

That both uses probably could have been designed better is irrelevant;
a hypothetical python programmer's job would have been to produce
a python program that would fit into the existing processes.

However I don't want to dwell on this because it is not my main
concern now, I thought I would just mention it for the record.

> I agree that use of PYTHONIOENCODING is pretty awkward.

From rurpy at  Wed Jun  6 08:14:26 2012
From: rurpy at (Rurpy)
Date: Tue, 5 Jun 2012 23:14:26 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
Message-ID: <>

On 06/05/2012 11:49 PM, Nick Coghlan wrote:
> On Wed, Jun 6, 2012 at 1:28 PM, Stephen J. Turnbull <stephen at> wrote:
>> For both of these cases a command-line option to initialize the
>> encoding would be convenient.

A Python interpreter command line option?
That would not particularly help my use case much.

> Before adding yet-another-command-line-option, the cases where the
> existing environment variable support can't be used from the command
> line, but a new option could be, should be clearly enumerated.
> $ python3
> Python 3.2.1 (default, Jul 11 2011, 18:54:42)
> [GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> sys.stdout.encoding
> 'UTF-8'
> $ PYTHONIOENCODING=latin-1 python3
> Python 3.2.1 (default, Jul 11 2011, 18:54:42)
> [GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> sys.stdout.encoding
> 'latin-1'

I don't think that works on Windows.

From rurpy at  Wed Jun  6 08:17:18 2012
From: rurpy at (Rurpy)
Date: Tue, 5 Jun 2012 23:17:18 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
Message-ID: <>

On 06/05/2012 05:34 PM, Victor Stinner wrote:
> 2012/6/5 Rurpy <rurpy at>:
>> In my first foray into Python3 I've encountered this problem:
>> I work in a multi-language environment.  I've written a number
>> of tools, mostly command-line, that generate output on stdout.
>> Because these tools and their output are used by various people
>> in varying environments, the tools all have an --encoding option
>> to provide output that meets the needs and preferences of the
>> output's ultimate consumers.
> What happens if the specified encoding is different than the encoding
> of the console? Mojibake?

When output is directed to the console, yes.  Would one
expect something else?

> If the output is used as the input of another program, does the
> other program use the same encoding?

Yes of course (when not misused).  That's why they have 
--encoding options.  (Obviously details vary depending on 
requirements of the various tools.)

> In my experience, using an encoding different than the locale encoding
> for input/output (stdout, environment variables, command line
> arguments, etc.) causes various issues. So I'm curious of your use
> cases.

I gave the use case in my original post:

  + I work in a multi-language environment.  I've written a number 
  + of tools, mostly command-line, that generate output on stdout.
  + Because these tools and their output are used by various people
  + in varying environments, the tools all have an --encoding option
  + to provide output that meets the needs and preferences of the
  + output's ultimate consumers. 

They are often used like:
  ./ --encoding=euc-jp dataset >somefile
  <send somefile to some user who uses euc-jp data> 

And of course some tools require something similar for 
stdin encodings.

>> In converting them to Python3, I found the best (if not very
>> pleasant) way to do this in Python3 was to put something like
>> this near the top of each tool[*1]:
>>  import codecs
>>  sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)
> In Python 3, you should use io.TextIOWrapper instead of
> codecs.StreamWriter. It's more efficient and has fewer bugs.

Thanks, I'll do that.

But surely this is a strong argument for encapsulating 
the ability to change (or reinitialize) the std* encodings.

I did a fair amount of searching on the internet (many orders
of magnitude more time than it would have taken to look up
sys.stdout.set_encoding() in the documentation) and *still*
ended up with a suboptimal solution.

>> What I want to be able to put there instead is:
>>  sys.stdout.set_encoding (opts.encoding)
> I don't think that your use case merits a new method on
> io.TextIOWrapper: replacing sys.stdout does work and should be used
> instead. TextIOWrapper is generic and your use case is specific to
> sys.std* streams.
> It would be surprising to change the encoding of an arbitrary file
> after it is opened. At least, I don't see the use case.

I gave a couple that I encountered in the past, in my response
to Stephen Turnbull.  However, now I am more concerned with
just resetting the encoding at the beginning of the program.

> For example, opens a Python source code file with the
> right encoding. It starts by reading the file in binary mode to detect
> the encoding, and then use TextIOWrapper to get a text file without
> having to reopen the file. It would be possible to start with a text
> file and then change the encoding, but it would be less elegant.

That's a rather different use case than mine, yes?

>>  sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)
> You should also flush sys.stdout (and maybe also sys.stdout.buffer)
> before replacing it.
>> It requires the import of the codecs module in programs that other-
>> wise don't need it [*2], and the reading of the codecs docs (not
>> a shining example of clarity themselves) to understand it.
> It's maybe difficult to change the encoding of sys.stdout at runtime
> because it is NOT a good idea :-)

Why would that be?  My tools already do that, they meet 
their usability requirements and I have noticed no ill
effects.  The code (except for the piece I am complaining
about) is about as simple and obvious as it is possible to 
get.  Am I missing something?
>> Needing to change the encoding of a sys.std* stream is not an
>> uncommon need and a user should not have to go through the
>> codecs dance above to do so IMO.
> Replacing sys.std* works but has issues: output written before the
> replacement is encoded to a different encoding for example. The best
> way is to change your locale encoding (using LC_ALL, LC_CTYPE or LANG
> environment variable on UNIX), or simply to set PYTHONIOENCODING
> environment variable.

Those solutions are not only NOT the best solution (IMO) -- 
they are completely unacceptable.

If I had to build my programs as shell scripts that manipulate 
environment variables before calling my Python program, I would 
dump Python for some other language. 
>> [*1] There are other ways to change stdout's encoding but they
>>  all have problems AFAICT.  PYTHONIOENCODING can't easily be
>>  changed dynamically within program.
> Ah? Detect if PYTHONIOENCODING is present (or if sys.stdout.encoding
> is the requested encoding), if not: restart the program with

For what I need to do (print() to sys.stdout with a different
encoding than what Python guessed I'd want), your proposal seems
absurdly convoluted to me.

sys.stdout is set to encoding A.  I want it to write using
encoding B.  The obvious, simplest, most desirable solution
(barring technical difficulties) is just to change the encoding.

>>  Overloading print() is obscure
>>  because it requires reader to notice print was overloaded.
> Why not writing the output into a file, instead of stdout?

Because the interface for these tools already exists and
the users of the tools are happy with them the way they are.

And even if that weren't the case, it is not the role of a 
general purpose programming language to say a standard convention 
such as file redirection should be relegated to second-class 
status simply because the programmer needs a different output 
encoding than the language designers thought he would.

From pyideas at  Wed Jun  6 08:32:24 2012
From: pyideas at (Chris Rebert)
Date: Tue, 5 Jun 2012 23:32:24 -0700
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jun 5, 2012 at 11:14 PM, Rurpy <rurpy at> wrote:
> On 06/05/2012 11:49 PM, Nick Coghlan wrote:
>> On Wed, Jun 6, 2012 at 1:28 PM, Stephen J. Turnbull <stephen at> wrote:
>> Before adding yet-another-command-line-option, the cases where the
>> existing environment variable support can't be used from the command
>> line, but a new option could be, should be clearly enumerated.
>> $ python3
>> Python 3.2.1 (default, Jul 11 2011, 18:54:42)
>> [GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import sys
>> >>> sys.stdout.encoding
>> 'UTF-8'
>> $ PYTHONIOENCODING=latin-1 python3
>> Python 3.2.1 (default, Jul 11 2011, 18:54:42)
>> [GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import sys
>> >>> sys.stdout.encoding
>> 'latin-1'
> I don't think that works on Windows.

You just need to use the "set" command/built-in
( ; or the PowerShell equivalent) to set
the environment variable. It's 1 extra line. Blame Windows for not
being POSIXy enough.


From rurpy at  Wed Jun  6 09:09:34 2012
From: rurpy at (Rurpy)
Date: Wed, 6 Jun 2012 00:09:34 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
Message-ID: <>

On 06/05/2012 05:56 PM, MRAB wrote:
> On 06/06/2012 00:34, Victor Stinner wrote:
>> 2012/6/5 Rurpy<rurpy at>:
>>>  In my first foray into Python3 I've encountered this problem:
>>>  I work in a multi-language environment.  I've written a number
>>>  of tools, mostly command-line, that generate output on stdout.
>>>  Because these tools and their output are used by various people
>>>  in varying environments, the tools all have an --encoding option
>>>  to provide output that meets the needs and preferences of the
>>>  output's ultimate consumers.
>> What happens if the specified encoding is different than the encoding
>> of the console? Mojibake?
>> If the output is used as the input of another program, does the
>> other program use the same encoding?
>> In my experience, using an encoding different than the locale encoding
>> for input/output (stdout, environment variables, command line
>> arguments, etc.) causes various issues. So I'm curious of your use
>> cases.
>>>  In converting them to Python3, I found the best (if not very
>>>  pleasant) way to do this in Python3 was to put something like
>>>  this near the top of each tool[*1]:
>>>    import codecs
>>>    sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)
>> In Python 3, you should use io.TextIOWrapper instead of
>> codecs.StreamWriter. It's more efficient and has fewer bugs.
>>>  What I want to be able to put there instead is:
>>>    sys.stdout.set_encoding (opts.encoding)
>> I don't think that your use case merits a new method on
>> io.TextIOWrapper: replacing sys.stdout does work and should be used
>> instead. TextIOWrapper is generic and your use case is specific to
>> sys.std* streams.
>> It would be surprising to change the encoding of an arbitrary file
>> after it is opened. At least, I don't see the use case.
> [snip]
> And if you _do_ want multiple encodings in a file, it's clearer to open
> the file as binary and then explicitly encode to bytes and write _that_
> to the file.

But is it really?

The following is very simple and the level of python
expertise required is minimal.  It would work fine
with redirection.  One could substitute any other ordinary
open (for write) text file for sys.stdout.

  [off the top of my head]
  text = 'This is %s text: ??????????'
  sys.stdout.set_encoding ('sjis')
  print (text % 'sjis')
  sys.stdout.set_encoding ('euc-jp')
  print (text % 'euc-jp')
  sys.stdout.set_encoding ('iso2022-jp')
  print (text % 'iso2022-jp')

As for your suggestion, how do I reopen sys.stdout in 
binary mode?  I don't need to do that often and don't 
know off the top of my head.  (And it's too late for 
me to look it up.)  And what happens to redirected output
when I close and reopen the stream?  I can open a regular
filename instead.  But remember to make the last two 
opens with "a" rather than "w".  And don't forget the
"\n" at the end of the text line.

Could you show me a code example of your suggestion
for comparison?

Disclaimer: As I said before, I am not particularly
advocating for a set_encoding() method -- my
primary suggestion is a programmatic way to change the
sys.std* encodings prior to first use.  Here I am just
questioning the claim that a set_encoding() method 
would not be clearer than existing alternatives.
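
For comparison, here is a minimal sketch of the "write bytes and encode
explicitly" approach. It assumes that sys.stdout.buffer (the same binary
layer used in the codecs.getwriter() line quoted above) is acceptable as
the binary stream, so nothing is closed or reopened and redirection is
unaffected. The helper names and the sample text are made up for
illustration; the original sample string did not survive in the archive.

```python
import io


def write_line(binary_stream, text, encoding):
    """Encode one line explicitly and write the bytes to a binary stream."""
    binary_stream.write((text + "\n").encode(encoding))


def demo(binary_stream):
    # Sample text is an assumption, chosen so all three codecs can encode it.
    text = "This is %s text: 日本語"
    for encoding in ("shift_jis", "euc_jp", "iso2022_jp"):
        write_line(binary_stream, text % encoding, encoding)
    binary_stream.flush()


# In a real tool this would be: demo(sys.stdout.buffer)
buffer = io.BytesIO()
demo(buffer)
```

Each line has to carry its own "\n", and each write names its encoding,
which is the extra bookkeeping the question above is about.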

From rurpy at  Wed Jun  6 09:36:39 2012
From: rurpy at (Rurpy)
Date: Wed, 6 Jun 2012 00:36:39 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
Message-ID: <>

On 06/06/2012 12:32 AM, Chris Rebert wrote:
> On Tue, Jun 5, 2012 at 11:14 PM, Rurpy <rurpy at> wrote:
>> On 06/05/2012 11:49 PM, Nick Coghlan wrote:
>>> On Wed, Jun 6, 2012 at 1:28 PM, Stephen J. Turnbull <stephen at> wrote:
>>> $ PYTHONIOENCODING=latin-1 python3
>> I don't think that works on Windows.
> You just need to use the "set" command/built-in
> (or the PowerShell equivalent) to set
> the environment variable. It's 1 extra line. Blame Windows for not
> being POSIXy enough.

There's a lot more than that I blame Windows for. :-)

There's another extra line to restore the environment to
its original setting too.  And when you forget to do that
remember to straighten out the output of the next python 
program you run.

Also, does not PYTHONIOENCODING affect all three streams?
That would rule it out of consideration in my use case.

But even if not, I'm sorry, compared with running a single 
command with an encoding option, I think messing with 
environment variables is not really a workable solution.  
About the closest I can see to doing this in practice would be
to wrap each python program up in a .bat script.

This is really a case of the Python tail wagging the
application dog.
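
On the question of which streams PYTHONIOENCODING touches, a quick check
is possible without modifying the calling environment at all: the
variable is passed only to one child process. This is just a sketch;
stream_encodings is a made-up helper name, and the child simply reports
the encodings it ended up with.

```python
import codecs
import os
import subprocess
import sys

# The child prints the encodings of its three standard streams.
CHILD = (
    "import sys; "
    "print(sys.stdin.encoding, sys.stdout.encoding, sys.stderr.encoding)"
)


def stream_encodings(io_encoding):
    """Run a child Python with PYTHONIOENCODING set for that child only."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        check=True,
    )
    # Normalize the reported names so aliases compare equal.
    return [codecs.lookup(name).name for name in result.stdout.split()]


encodings = stream_encodings("latin-1")
```

The parent's os.environ is never changed, so there is nothing to restore
afterwards; all three streams of the child report the requested encoding.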

From tarek at  Wed Jun  6 09:56:17 2012
From: tarek at (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Wed, 06 Jun 2012 09:56:17 +0200
Subject: [Python-ideas] Supporting already opened sockets in our
	socket-based server classes
Message-ID: <>


What about allowing all our socket servers -- from SocketServer to
WSGIServer -- to run with an existing socket?

The use case is to make it easier to write applications that use the 
pre-fork model to run several processes against the same socket.


- the main process creates a socket, binds it and listens on it
- the main process forks some subprocesses and passes them the socket fd
- each subprocess recreates a socket object using socket.fromfd(), so it
does not bind it
- each subprocess can accept() connections on the socket
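
The steps above can be sketched roughly as follows. To keep it runnable
anywhere, this sketch stays in a single process instead of forking; the
"worker" socket is rebuilt from the file descriptor only, the way a
subprocess would do it. make_listener and adopt_listener are made-up
names for illustration.

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=5):
    """Main process side: create, bind() and listen() once."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock


def adopt_listener(fd):
    """Worker side: rebuild a socket object from the fd, no bind()/listen().

    socket.fromfd() duplicates the descriptor, so the worker owns its copy.
    """
    return socket.fromfd(fd, socket.AF_INET, socket.SOCK_STREAM)


listener = make_listener()
worker = adopt_listener(listener.fileno())

# A client connects to the address the main process bound ...
client = socket.create_connection(listener.getsockname())
# ... and the worker accepts it on the socket rebuilt from the fd.
conn, _ = worker.accept()
client.sendall(b"ping")
received = conn.recv(4)

for s in (conn, client, worker, listener):
    s.close()
```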

I have a working prototype here :
(don't look at the code I made it quickly just as a proof of concept)

What I am proposing is the following syntax:

if the host passed to the class is of the form:

     fd://12

The class will try to create a socket object against the file descriptor 
12, and will not bind() it neither accept() it.

How does that sound? If people like the idea I can try to build a
patch for 3.x, and I can certainly release a
backport for 2.x


From stephen at  Wed Jun  6 10:39:22 2012
From: stephen at (Stephen J. Turnbull)
Date: Wed, 06 Jun 2012 17:39:22 +0900
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

Rurpy writes:

 > But even if not, I'm sorry, compared with running a single 
 > command with an encoding option, I think messing with 
 > environment variables is not really a workable solution.  

You have a workable 2-line solution, which you posted.  It's ugly and
hard to find, and it should be, to discourage people from thinking
it's something they might *want* to do.  But they shouldn't; people in
multilingual environments should be using UTF-8 externally unless they
have really really special needs (and even then they should probably
be using UTF-8 embedded in markup that serves those needs).

 > This is really case of the Python tail wagging the application dog.

If you need to do it often, just make a function out of it.  It
doesn't need to be a built-in.
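
A minimal sketch of such a function, using io.TextIOWrapper as suggested
earlier in the thread. The name with_encoding is made up, and the choice
to detach() the old wrapper (rather than wrap stream.buffer a second time)
is an assumption of this sketch, made so that two wrappers never share one
buffer.

```python
import io


def with_encoding(stream, encoding, errors="strict"):
    """Return a new text wrapper around stream's binary buffer.

    Pending text is flushed in the old encoding first; the old wrapper is
    unusable after detach().
    """
    line_buffering = stream.line_buffering
    stream.flush()
    raw = stream.detach()
    return io.TextIOWrapper(
        raw, encoding=encoding, errors=errors, line_buffering=line_buffering
    )


# Demonstration on an in-memory stream instead of sys.stdout.
raw = io.BytesIO()
out = io.TextIOWrapper(raw, encoding="utf-8")
out.write("é")
out = with_encoding(out, "latin-1")
out.write("é")
out.flush()
```

In a tool this would be used as sys.stdout = with_encoding(sys.stdout,
opts.encoding) near the top; note that sys.__stdout__ is the same old
wrapper object and would be unusable afterwards.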

From stephen at  Wed Jun  6 10:26:21 2012
From: stephen at (Stephen J. Turnbull)
Date: Wed, 06 Jun 2012 17:26:21 +0900
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

Rurpy writes:

 > I'm not sure why stateful matters.  When you change encoding
 > you discard whatever state exists

How do you know what *I* want to do?  Silently discarding buffer
contents would suck.

 > If there is a partially en/decoded character then wouldn't you do the
 > same thing you'd do if the same condition arose at EOF?

Again speaking for *myself*, almost certainly not.  On input, if it
happens *before* EOF it's incomplete input, and I should wait for it
to be completed.  If it happens on output, there's a bug somewhere,
and I probably want to do some kind of error recovery.

 > However I have in the past encountered mixed encoding outputting 
 > programs in two contexts; generating test data (I think it was
 > for automatic detection and extraction of information), and
 > bundling multiple differently-encoded data sets in one package 
 > that were pulled apart again downstream.
 > That both uses probably could have been designed better is irrelevant; 
 > a hypothetical python programmer's job would have been to produce
 > a python program that would fit into the existing processes.

No, it's not irrelevant that it's bad design.  Python should not go
out of its way to cater to bad design, if bad design can be worked
around with existing facilities.  Here there are at least two ways to
do it: the method of changing sys.std*'s text encoding that you
posted, and switching sys.std* to binary and doing explicit encoding
and decoding of strings to be input or output.

I have also encountered mixed encoding, in my students' filesystems
(it was not uncommon to see /home/j.r.exchangestudent/KOI8-R/SHIFT_JIS
and similar).  That doesn't mean it should be made easier to generate!

From solipsis at  Wed Jun  6 14:28:50 2012
From: solipsis at (Antoine Pitrou)
Date: Wed, 06 Jun 2012 14:28:50 +0200
Subject: [Python-ideas] Supporting already opened sockets in our
 socket-based server classes
In-Reply-To: <>
References: <>
Message-ID: <jqnimk$fch$>

On 06/06/2012 09:56, Tarek Ziadé wrote:
> What I am proposing is the following syntax:
> if the host passed to the class is of the form:
>      fd://12
> The class will try to create a socket object against the file descriptor
> 12, and will not bind() it neither accept() it.

Passing a pseudo-URL where a host name is expected sounds like a bad 
idea. Also, I don't understand the "neither accept() it" part. Surely 
you need to accept() incoming connections, so perhaps you mean "neither 
listen() it"?

(also, I'm not sure calling listen() another time is a problem)



From tarek at  Wed Jun  6 17:23:15 2012
From: tarek at (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Wed, 06 Jun 2012 17:23:15 +0200
Subject: [Python-ideas] Supporting already opened sockets in our
 socket-based server classes
In-Reply-To: <jqnimk$fch$>
References: <> <jqnimk$fch$>
Message-ID: <>

On 6/6/12 2:28 PM, Antoine Pitrou wrote:
> On 06/06/2012 09:56, Tarek Ziadé wrote:
>> What I am proposing is the following syntax:
>> if the host passed to the class is of the form:
>>      fd://12
>> The class will try to create a socket object against the file descriptor
>> 12, and will not bind() it neither accept() it.
> Passing a pseudo-URL where a host name is expected sounds like a bad idea.

Well, unix sockets are using this convention to point paths to unix sockets.

e.g.  unix:///some/path

in general, the URI scheme seems widely used out there.

What do you propose ? another option ?

> Also, I don't understand the "neither accept() it" part. Surely you 
> need to accept() incoming connections, so perhaps you mean "neither 
> listen() it"?
Yeah that was a typo -- I do listen() before I fork

> (also, I'm not sure calling listen() another time is a problem)

I don't think so, but the usual pattern I have seen is to call listen() 
before the forking

> Regards
> Antoine.

From solipsis at  Wed Jun  6 19:05:39 2012
From: solipsis at (Antoine Pitrou)
Date: Wed, 06 Jun 2012 19:05:39 +0200
Subject: [Python-ideas] Supporting already opened sockets in our
 socket-based server classes
In-Reply-To: <>
References: <> <jqnimk$fch$>
Message-ID: <jqo2tk$hdg$>

On 06/06/2012 17:23, Tarek Ziadé wrote:
> Well, unix sockets are using this convention to point paths to unix
> sockets.
> e.g.  unix:///some/path

Which unix sockets? In socketserver?

> in general, the URI scheme seems widely used out there,

My point is that if the parameter is currently a hostname, it isn't a 
URI (AFAIK). Starting to mix both concepts could quickly become confusing.

> What do you propose ? another option ?

I think that's better indeed.



From mwm at  Wed Jun  6 19:46:18 2012
From: mwm at (Mike Meyer)
Date: Wed, 6 Jun 2012 13:46:18 -0400
Subject: [Python-ideas] Supporting already opened sockets in our
 socket-based server classes
In-Reply-To: <>
References: <> <jqnimk$fch$>
Message-ID: <>

On Wed, 06 Jun 2012 17:23:15 +0200
Tarek Ziadé <tarek at> wrote:

> On 6/6/12 2:28 PM, Antoine Pitrou wrote:
> > On 06/06/2012 09:56, Tarek Ziadé wrote:
> >>
> >> What I am proposing is the following syntax:
> >>
> >> if the host passed to the class is of the form:
> >>
> >>      fd://12
> >>
> >> The class will try to create a socket object against the file descriptor
> >> 12, and will not bind() it neither accept() it.
> >
> > Passing a pseudo-URL where a host name is expected sounds like a bad idea.
> Well, unix sockets are using this convention to point paths to unix sockets.
> e.g.  unix:///some/path

I think what you're trying to achieve has merit, but you're doing it
in the wrong place. Using a URL-like string instead of a host name?
Really?

So how about a new subclass, "PreForkedTCPServer", that takes the file
descriptor instead of the host/port pair when created? You'd probably
want to tweak the class tree somewhat, but that seems like a more
palatable API for what you're trying to do.
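
A rough sketch of what such a subclass could look like on top of the
existing socketserver.TCPServer. The constructor signature (fd first, no
host/port) is an assumption for illustration, not an agreed API, and
EchoHandler exists only to exercise it.

```python
import socket
import socketserver


class PreForkedTCPServer(socketserver.TCPServer):
    """TCP server that adopts an already bound and listening socket by fd."""

    def __init__(self, fd, RequestHandlerClass):
        # bind_and_activate=False skips server_bind() and server_activate().
        super().__init__(("", 0), RequestHandlerClass, bind_and_activate=False)
        # Discard the unbound socket the base class created and adopt the fd.
        self.socket.close()
        self.socket = socket.fromfd(fd, self.address_family, self.socket_type)
        self.server_address = self.socket.getsockname()


class EchoHandler(socketserver.BaseRequestHandler):
    """Illustrative handler: send back whatever arrives first."""

    def handle(self):
        self.request.sendall(self.request.recv(1024))
```

The parent would bind() and listen() as usual and hand listener.fileno()
to each worker, which then calls serve_forever() or handle_request().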

Mike Meyer <mwm at>
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail -

From tarek at  Wed Jun  6 23:45:18 2012
From: tarek at (=?UTF-8?B?VGFyZWsgWmlhZMOp?=)
Date: Wed, 06 Jun 2012 23:45:18 +0200
Subject: [Python-ideas] Supporting already opened sockets in our
 socket-based server classes
In-Reply-To: <>
References: <> <jqnimk$fch$>
Message-ID: <>

On 6/6/12 7:46 PM, Mike Meyer wrote:
> On Wed, 06 Jun 2012 17:23:15 +0200
> Tarek Ziadé <tarek at> wrote:
>> On 6/6/12 2:28 PM, Antoine Pitrou wrote:
>>> On 06/06/2012 09:56, Tarek Ziadé wrote:
>>>> What I am proposing is the following syntax:
>>>> if the host passed to the class is of the form:
>>>>       fd://12
>>>> The class will try to create a socket object against the file descriptor
>>>> 12, and will not bind() it neither accept() it.
>>> Passing a pseudo-URL where a host name is expected sounds like a bad idea.
>> Well, unix sockets are using this convention to point paths to unix sockets.
>> e.g.  unix:///some/path
> I think what you're trying to achieve has merit, but you're doing it
> in the wrong place. Using a URL-like string instead of a host name?
> Really?
> So how about a new subclass, "PreForkedTCPServer", that takes the file
> descriptor instead of the host/port pair when created? You'd probably
> want to tweak the class tree somewhat, but that seems like a more
> palatable API for what you're trying to do.

Yeah that makes sense. will try this - thanks for the feedback

>        <mike

From alice at  Thu Jun  7 01:20:07 2012
From: alice at (=?utf-8?Q?Alice_Bevan=E2=80=93McGregor?=)
Date: Wed, 6 Jun 2012 19:20:07 -0400
Subject: [Python-ideas] for/else statements considered harmful
Message-ID: <jqooj7$s7o$>


Howdy!

Was teaching a new user to Python the ropes a short while ago and ran
into an interesting headspace problem: the for/else syntax fails the 
obviousness and consistency tests.  When used in an if/else block the 
conditional code is executed if the conditional passes, and the else 
block is executed if the conditional fails.  Compared to for loops 
where the for code is repeated and the else code executed if we 
"naturally fall off the loop".  (The new user's reaction was "why the 
hoek would I ever use for/else?")

I forked Python 3.3 to experiment with an alternate implementation that 
follows the logic of pass/fail implied by if/else: (and to refactor the 
stdlib, but that's a different issue ;)

    for x in range(20):
        if x > 10: break
    else:
        pass # we had no values to iterate
    finally:
        pass # we naturally fell off the loop

It abuses finally (to avoid tying up a potentially common word as a 
reserved word like "done") but makes possible an important distinction 
without having to perform potentially expensive length calculations 
(which may not even be possible!) on the value being iterated: that is, 
handling the case where there were no values in the collection or 
returned by the generator.
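
For reference, the same three-way distinction can be drawn in current
Python with the existing for/else plus a flag, still without any length
calculation. This is only a sketch of today's semantics, not of the
proposed syntax; classify_loop is a made-up name.

```python
def classify_loop(values, stop_above):
    """Report how a loop over values ended, using today's for/else."""
    saw_any = False
    for x in values:
        saw_any = True
        if x > stop_above:
            outcome = "broke out early"
            break
    else:
        # Reached only when the loop was not left via break.
        outcome = "completed" if saw_any else "empty"
    return outcome
```

It works for generators too, since nothing is consumed twice and len()
is never called.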

Templating engines generally implement this type of structure.  Of 
course this type of breaking change in semantics puts this idea firmly 
into Python 4 land.

I'll isolate the for/else/finally code from my fork and post a patch 
this week-end, hopefully.

	-- Alice.

From steve at  Thu Jun  7 01:45:36 2012
From: steve at (Steven D'Aprano)
Date: Thu, 07 Jun 2012 09:45:36 +1000
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqooj7$s7o$>
References: <jqooj7$s7o$>
Message-ID: <>

Alice Bevan–McGregor wrote:
> Howdy!
> Was teaching a new user to Python the ropes a short while ago and ran 
> into an interesting headspace problem: the for/else syntax fails the 
> obviousness and consistency tests.  When used in an if/else block the 
> conditional code is executed if the conditional passes, and the else 
> block is executed if the conditional fails.  Compared to for loops where 
> the for code is repeated and the else code executed if we "naturally 
> fall off the loop".  (The new user's reaction was "why the hoek would I 
> ever use for/else?")

Yes, I love for/else and while/else but regret the name. The else is 
conceptually unlike the else in if/else, and leads to the common confusion 
that the else suite runs if the iterable is empty.

> I forked Python 3.3 to experiment with an alternate implementation that 
> follows the logic of pass/fail implied by if/else: (and to refactor the 
> stdlib, but that's a different issue ;)
>    for x in range(20):
>        if x > 10: break
>    else:
>        pass # we had no values to iterate
>    finally:
>        pass # we naturally fell off the loop

+10000 :)

> It abuses finally (to avoid tying up a potentially common word as a 
> reserved word like "done") but makes possible an important distinction 
> without having to perform potentially expensive length calculations 
> (which may not even be possible!) on the value being iterated: that is, 
> handling the case where there were no values in the collection or 
> returned by the generator.
> Templating engines generally implement this type of structure.  Of 
> course this type of breaking change in semantics puts this idea firmly 
> into Python 4 land.

Sadly, yes. Where were you when Python 3.0 was still being planned? :)

> I'll isolate the for/else/finally code from my fork and post a patch 
> this week-end, hopefully.
>     -- Alice.

Many thanks.


From bruce at  Thu Jun  7 01:58:50 2012
From: bruce at (Bruce Leban)
Date: Wed, 6 Jun 2012 16:58:50 -0700
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqooj7$s7o$>
References: <jqooj7$s7o$>
Message-ID: <>

If we could go back in time I would completely agree. But since we can't,
flipping meaning of else would be too error inducing and therefore not at
all likely.

So at risk of bike shedding I would suggest

for ...
[ else not: ]
else [ finally ] :

If a context-sensitive keyword would work I'd go for something more like

for ...
[ else empty: ]
else [ no match ] :

This would not introduce any incompatibilities.

--- Bruce
(from my phone)
On Jun 6, 2012 4:31 PM, "Alice Bevan–McGregor" <alice at> wrote:

> Howdy!
> Was teaching a new user to Python the ropes a short while ago and ran into
> an interesting headspace problem: the for/else syntax fails the obviousness
> and consistency tests.  When used in an if/else block the conditional code
> is executed if the conditional passes, and the else block is executed if
> the conditional fails.  Compared to for loops where the for code is
> repeated and the else code executed if we "naturally fall off the loop".
>  (The new user's reaction was "why the hoek would I ever use for/else?")
> I forked Python 3.3 to experiment with an alternate implementation that
> follows the logic of pass/fail implied by if/else: (and to refactor the
> stdlib, but that's a different issue ;)
>   for x in range(20):
>       if x > 10: break
>   else:
>       pass # we had no values to iterate
>   finally:
>       pass # we naturally fell off the loop
> It abuses finally (to avoid tying up a potentially common word as a
> reserved word like "done") but makes possible an important distinction
> without having to perform potentially expensive length calculations (which
> may not even be possible!) on the value being iterated: that is, handling
> the case where there were no values in the collection or returned by the
> generator.
> Templating engines generally implement this type of structure.  Of course
> this type of breaking change in semantics puts this idea firmly into Python
> 4 land.
> I'll isolate the for/else/finally code from my fork and post a patch this
> week-end, hopefully.
>        -- Alice.

From python at  Thu Jun  7 02:15:18 2012
From: python at (MRAB)
Date: Thu, 07 Jun 2012 01:15:18 +0100
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqooj7$s7o$>
References: <jqooj7$s7o$>
Message-ID: <>

On 07/06/2012 00:20, Alice Bevan–McGregor wrote:
> Howdy!
> Was teaching a new user to Python the ropes a short while ago and ran
> into an interesting headspace problem: the for/else syntax fails the
> obviousness and consistency tests.  When used in an if/else block the
> conditional code is executed if the conditional passes, and the else
> block is executed if the conditional fails.  Compared to for loops
> where the for code is repeated and the else code executed if we
> "naturally fall off the loop".  (The new user's reaction was "why the
> hoek would I ever use for/else?")
I find the easiest way to think of it is imagine you're searching a
list. If you find what you're looking for you break, else you do
something else.
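
That search reading, as a small sketch (find_first_negative is a made-up
example function):

```python
def find_first_negative(numbers):
    """Search a sequence; break when found, else fall back to None."""
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # The search ran to the end without finding anything.
        result = None
    return result
```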

From cmjohnson.mailinglist at  Thu Jun  7 02:44:04 2012
From: cmjohnson.mailinglist at (Carl M. Johnson)
Date: Wed, 6 Jun 2012 14:44:04 -1000
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>
Message-ID: <>

On Jun 6, 2012, at 1:58 PM, Bruce Leban wrote:

> If a context-sensitive keyword would work I'd go for something more like
> for ...
> [ else empty: ]
> else [ no match ] :
> This would not introduce any incompatibilities.

Since None is now a keyword, you could say "else if None" but that might be confusing, since None is different than empty.

From ncoghlan at  Thu Jun  7 02:53:22 2012
From: ncoghlan at (Nick Coghlan)
Date: Thu, 7 Jun 2012 10:53:22 +1000
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>
Message-ID: <>

On Thu, Jun 7, 2012 at 9:58 AM, Bruce Leban <bruce at> wrote:
> If we could go back in time I would completely agree. But since we can't,
> flipping meaning of else would be too error inducing and therefore not at
> all likely.

The meaning of the "else:" clause on for and while loops is actually
much closer to the sense in "try/except/else" than it is to the
sense in "if/else".

Consider the following:

    for x in range(20):
        if x > 10:
            break
    else:
        # Reached the end of the loop
As an approximate short hand for:

    class BreakLoop(Exception): pass

    try:
        for x in range(20):
            if x > 10:
                raise BreakLoop
    except BreakLoop:
        pass
    else:
        # Reached the end of the loop

It's not implemented anything like that (and the analogy doesn't hold
in many other respects), but in terms of the semantics of the
respective else clauses it's an exact match.
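
The analogy can be checked with a runnable pair of functions. This is
just a sketch; the function names and return strings are made up, and the
limit parameter replaces the hard-coded 10 so both outcomes can be tried.

```python
class BreakLoop(Exception):
    pass


def with_loop_else(limit):
    for x in range(20):
        if x > limit:
            break
    else:
        return "reached the end"
    return "bailed out early"


def with_try_else(limit):
    try:
        for x in range(20):
            if x > limit:
                raise BreakLoop
    except BreakLoop:
        pass
    else:
        # Runs only if no BreakLoop was raised, like the loop's else.
        return "reached the end"
    return "bailed out early"
```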

Part of the problem is that the "else:" clause on while loops is often
explained as follows (and I've certainly been guilty of this), which I
now think exacerbates the confusion rather than reducing it:

The following code:
    x = 0
    while x < 10:
        x += 1
        if x == y:
            break
    else:
        # Made it to 10

Can be seen as equivalent to:

    x = 0
    while 1:
        if x < 10:
            pass
        else:
            # Made it to 10
            break
        x += 1
        if x == y:
            break

This actually ends up reinforcing the erroneous connection to if
statements, when we really need to be encouraging people to think of
this clause in terms of try statements, with "break" playing the role
of an exception being raised.
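
For anyone who wants to convince themselves that the two while forms
behave the same, here is a runnable sketch. The function names and return
strings are made up, and y is turned into a parameter.

```python
def while_else(y):
    x = 0
    while x < 10:
        x += 1
        if x == y:
            break
    else:
        return "made it to 10"
    return "stopped at %d" % x


def while_expanded(y):
    x = 0
    while True:
        if x < 10:
            pass
        else:
            # The loop condition failed: this is where the else suite goes.
            return "made it to 10"
        x += 1
        if x == y:
            break
    return "stopped at %d" % x
```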

So I think what we actually have is a documentation problem where we
need to be actively encouraging the "while/else", "for/else" ->
"try/except/else" link and discouraging any attempts to think of this
construct in terms of if statements (as that is a clear recipe for
confusion).

If anything were to change at the language level, my preference would
be to further reinforce the try/except/else connection by allowing an
"except break" clause:

    for x in range(20):
        if x > 10:
            break
    except break:
        # Bailed out early
    else:
        # Reached the end of the loop

To critique the *specific* proposal presented at the start of the
thread, there are three main problems with it:

1. It doesn't match the expected semantics of a "finally:" clause. In
try/finally the finally clause executes regardless of how the suite
execution is terminated (whether via an exception, reaching the end of
the suite, or leaving the suite early via a return, break or continue
control flow statement). That is explicitly not the case here (as a
loop's else clause only executes in the case of normal loop
termination - which precisely matches the semantics of the else clause
in try/except/else)
2. As Bruce pointed out, the meaning of the else: clause on loops
can't be changed as it would break backwards compatibility with
existing code
3. The post doesn't explain how the proposed change in semantics also
makes sense for while loops
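
Point 1 can be seen directly in a small sketch that puts a real
try/finally inside a loop with an else clause (trace is a made-up name;
limit selects the iteration that breaks):

```python
def trace(limit):
    """Record which finally/else suites run for a loop that may break."""
    events = []
    for x in range(3):
        try:
            if x == limit:
                break
        finally:
            # Runs on every iteration, including the one that breaks.
            events.append("finally %d" % x)
    else:
        # Runs only when the loop ends without break.
        events.append("else")
    return events
```

With a break the finally suite still runs but the else suite does not;
without one, both run.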


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From andre.roberge at  Thu Jun  7 03:02:32 2012
From: andre.roberge at (Andre Roberge)
Date: Wed, 6 Jun 2012 22:02:32 -0300
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>
Message-ID: <>

On Wed, Jun 6, 2012 at 9:53 PM, Nick Coghlan <ncoghlan at> wrote:

> On Thu, Jun 7, 2012 at 9:58 AM, Bruce Leban <bruce at> wrote:
> > If we could go back in time I would completely agree. But since we can't,
> > flipping meaning of else would be too error inducing and therefore not at
> > all likely.

> If anything were to change at the language level, my preference would
> be


My preference would be for a new keyword: nobreak

This would work well with for/else and while/else which would become
for/nobreak and while/nobreak

I think that anyone reading

while ...
    ....
nobreak:
    some statements

would (more) immediately understand that "some statements" are going to be
executed if no break occurred in the above block.

But I doubt that something like this will ever be considered even though it
could be introduced now without breaking any code (other than that which
uses "nobreak" as a variable ... which should be rare) by making it first a
duplicate of the for/else and while/else construction which would be slowly
deprecated.

Just my 0.02$ ...


From python at  Thu Jun  7 03:27:14 2012
From: python at (MRAB)
Date: Thu, 07 Jun 2012 02:27:14 +0100
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>
Message-ID: <>

On 07/06/2012 02:02, Andre Roberge wrote:
> On Wed, Jun 6, 2012 at 9:53 PM, Nick Coghlan <ncoghlan at
> <mailto:ncoghlan at>> wrote:
>     On Thu, Jun 7, 2012 at 9:58 AM, Bruce Leban <bruce at
>     <mailto:bruce at>> wrote:
>      > If we could go back in time I would completely agree. But since
>     we can't,
>      > flipping meaning of else would be too error inducing and
>     therefore not at
>      > all likely.
>     If anything were to change at the language level, my preference would
>     be
> My preference would be for a new keyword: nobreak
> This would work well with for/else and while/else which would become
> for/nobreak and while/nobreak
> I think that anyone reading
> while ...
>     ....
> nobreak:
>     some statements
> would (more) immediately understand that "some statements" are going to
> be executed if no break occurred in the above block.
> But I doubt that something like this will ever be considered even though
> it could be introduced now without breaking any code (other than that
> which uses "nobreak" as a variable ... which should be rare) by making
> it first a duplicate of the for/else and while/else construction which
> would be slowly deprecated.
How about "not break"? :-)

From donspauldingii at  Thu Jun  7 03:59:15 2012
From: donspauldingii at (Don Spaulding)
Date: Wed, 6 Jun 2012 20:59:15 -0500
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>
Message-ID: <>

On Wed, Jun 6, 2012 at 7:15 PM, MRAB <python at> wrote:

> On 07/06/2012 00:20, Alice Bevan–McGregor wrote:
>> Howdy!
>> Was teaching a new user to Python the ropes a short while ago and ran
>> into an interesting headspace problem: the for/else syntax fails the
>> obviousness and consistency tests.  When used in an if/else block the
>> conditional code is executed if the conditional passes, and the else
>> block is executed if the conditional fails.  Compared to for loops
>> where the for code is repeated and the else code executed if we
>> "naturally fall off the loop".  (The new user's reaction was "why the
>> hoek would I ever use for/else?")
> I find the easiest way to think of it is imagine you're searching a
> list. If you find what you're looking for you break, else you do
> something else.

I think the problem is that "break" doesn't sound like a positive, it
sounds like a negative, and indeed it means we effectively *ignore* the
rest of the list.  So when you get to the "else" it's like an English
double-negative, awkward to understand.  Perhaps even more because you're
effectively else-ing the break, not the for, so the indentation level even
seems off.

Backwards-compatibility issues aside, renaming "else" to "finally" sounds
like a really great idea.

From andre.roberge at  Thu Jun  7 04:01:36 2012
From: andre.roberge at (Andre Roberge)
Date: Wed, 6 Jun 2012 23:01:36 -0300
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>
Message-ID: <>

On Wed, Jun 6, 2012 at 10:59 PM, Don Spaulding <donspauldingii at> wrote:

> On Wed, Jun 6, 2012 at 7:15 PM, MRAB <python at> wrote:
>> On 07/06/2012 00:20, Alice Bevan–McGregor wrote:
>>> Howdy!
>>> Was teaching a new user to Python the ropes a short while ago and ran
>>> into an interesting headspace problem: the for/else syntax fails the
>>> obviousness and consistency tests.  When used in an if/else block the
>>> conditional code is executed if the conditional passes, and the else
>>> block is executed if the conditional fails.  Compared to for loops
>>> where the for code is repeated and the else code executed if we
>>> "naturally fall off the loop".  (The new user's reaction was "why the
>>> hoek would I ever use for/else?")
>> I find the easiest way to think of it is imagine you're searching a
>> list. If you find what you're looking for you break, else you do
>> something else.
> I think the problem is that "break" doesn't sound like a positive, it
> sounds like a negative, and indeed it means we effectively *ignore* the
> rest of the list.  So when you get to the "else" it's like an English
> double-negative, awkward to understand.  Perhaps even more because you're
> effectively else-ing the break, not the for, so the indentation level even
> seems off.
> Backwards-compatibility issues aside, renaming "else" to "finally" sounds
> like a really great idea.

No: "finally" implies that it is going to be done at the end of the block;
the "else" clause is *not* executed if a break occurs - hence it has a
different semantics.


From rurpy at  Thu Jun  7 04:34:05 2012
From: rurpy at (Rurpy)
Date: Wed, 6 Jun 2012 19:34:05 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
Message-ID: <>

On 06/06/2012 02:39 AM, Stephen J. Turnbull wrote:
> Rurpy writes:
>  > But even if not, I'm sorry, compared with running a single 
>  > command with an encoding option, I think messing with 
>  > environment variables is not really a workable solution.  
> You have a workable 2-line solution, which you posted.

Please don't misunderstand why I posted...  as you say,
my code now works fine and I understand how to handle
this problem when I encounter it in the future.

I took the time to post here because it took an inordinate
amount of effort to find a solution to a legitimate need 
(your opinion to the contrary notwithstanding) and the
resulting code which should have been trivially simple
and obvious, wasn't.

It is a minor issue but the end result of experiences 
like this, although infrequent, is often "WTF, why is 
this simple and reasonable thing so hard to do?".  And 
after a few times some programmers will start to wonder 
if maybe Python is not really an industrial-strength 
language -- one in which they can be effective all the time,
even when the problem falls outside the 95% demographic.
(And I am not talking about things totally out of 
python's scope like high performance computing or 
systems programming.) 

> It's ugly and
> hard to find, and it should be, to discourage people from thinking
> it's something they might *want* to do.  But they shouldn't; people in
> multilingual environments should be using UTF-8 externally unless they
> have really really special needs (and even then they should probably
> be using UTF-8 embedded in markup that serves those needs).

I wanted to do it because it was the correct design choice.  
The suggestion that redesigning an entire existing technical
and personnel infrastructure to use utf-8 is a better
choice is, well, never mind.

It is not the place of language designers to intentionally
make it hard to solve legitimate problems.  There *are*
other encodings in the world, there will be for some time
to come, and some programmers will sometimes have to deal 
with that.  Non-utf-8 encodings are not so evil (except in 
the minds of some zealots) that working with them conveniently 
should be made difficult.  (I am reminded of the Unix zealots 
of days past who refused to deal with Windows line endings.)
The way I chose to deal with the encoding requirements I 
had was the correct way.  It's unfortunate that Python 
makes it uglier than it should be.

The discussion seems to be going off topic for this list.  
I understand there is no support here for providing a non-
obscure, programmatic way of changing the encoding of the 
standard streams at program startup and that's fine, it
was a suggestion.

Thank you all for the feedback.

From anikom15 at  Thu Jun  7 04:45:14 2012
From: anikom15 at (Westley Martínez)
Date: Wed, 6 Jun 2012 19:45:14 -0700
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqooj7$s7o$>
References: <jqooj7$s7o$>
Message-ID: <20120607024514.GA13028@kubrick>

On Wed, Jun 06, 2012 at 07:20:07PM -0400, Alice Bevan–McGregor wrote:
>    for x in range(20):
>        if x > 10: break
>    else:
>        pass # we had no values to iterate
>    finally:
>        pass # we naturally fell off the loop
-1 for me.  The idea that finally is executed only when we naturally
fall off the loop is weird.  finally suggests that it will always be
executed, like in a try/finally clause.  I think the naming of else is
weird but can be understood.  If a change is a must I believe else
should keep its semantic and simply be renamed except, but I am +0 on

All in all the use cases would be extremely rare if existent.  I've
never actually seen a for/else or while/else block.

From nathan at  Thu Jun  7 06:29:46 2012
From: nathan at (Nathan Schneider)
Date: Wed, 6 Jun 2012 21:29:46 -0700
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>
Message-ID: <>

On Wed, Jun 6, 2012 at 5:53 PM, Nick Coghlan <ncoghlan at> wrote:
> On Thu, Jun 7, 2012 at 9:58 AM, Bruce Leban <bruce at> wrote:
>> If we could go back in time I would completely agree. But since we can't,
>> flipping meaning of else would be too error inducing and therefore not at
>> all likely.
> The meaning of the "else:" clause on for and while loops is actually
> much closer to the sense in "try/except/else" sense than it is to the
> sense in "if/else".
> Consider the following:
>    for x in range(20):
>        if x > 10:
>            break
>    else:
>        # Reached the end of the loop
> As an approximate short hand for:
>    class BreakLoop(Exception): pass
>    try:
>        for x in range(20):
>            if x > 10:
>                raise BreakLoop
>    except BreakLoop:
>        pass
>    else:
>        # Reached the end of the loop
> It's not implemented anything like that (and the analogy doesn't hold
> in many other respects), but in terms of the semantics of the
> respective else clauses it's an exact match.
> Part of the problem is that the "else:" clause on while loops is often
> explained as follows (and I've certainly been guilty of this), which I
> now think exacerbates the confusion rather than reducing it:
> The following code:
>    x = 0
>    while x < 10:
>        x += 1
>        if x == y:
>            break
>    else:
>        # Made it to 10
> Can be seen as equivalent to:
>    x = 0
>    while 1:
>        if x < 10:
>            pass
>        else:
>            # Made it to 10
>        x += 1
>        if x == y:
>            break
> This actually ends up reinforcing the erroneous connection to if
> statements, when we really need to be encouraging people to think of
> this clause in terms of try statements, with "break" playing the role
> of an exception being raised.
> So I think what we actually have is a documentation problem where we
> need to be actively encouraging the "while/else", "for/else" ->
> "try/except/else" link and discouraging any attempts to think of this
> construct in terms of if statements (as that is a clear recipe for
> confusion).
> If anything were to change at the language level, my preference would
> be to further reinforce the try/except/else connection by allowing an
> "except break" clause:
>    for x in range(20):
>        if x > 10:
>            break
>    except break:
>        # Bailed out early
>    else:
>        # Reached the end of the loop

I like this proposal, or perhaps

  while ...:
         ...
  with break:
         # Bailed out early
  else:
         # Reached the end of the loop

...which avoids any conceptual baggage associated with exception
handling, at some risk of making people think of context managers.

For what it's worth, I don't use the loop version of 'else' to avoid
confusing myself (or the reader of my code). But in my experience the
use case 'else' is intended to solve is probably less common than (a)
checking whether the loop was ever entered, and (b) checking from
within the loop body whether it is the first iteration.


> To critique the *specific* proposal presented at the start of the
> thread, there are three main problems with it:
> 1. It doesn't match the expected semantics of a "finally:" clause. In
> try/finally the finally clause executes regardless of how the suite
> execution is terminated (whether via an exception, reaching the end of
> the suite, or leaving the suite early via a return, break or continue
> control flow statement). That is explicitly not the case here (as a
> loop's else clause only executes in the case of normal loop
> termination - which precisely matches the semantics of the else clause
> in try/except/else)
> 2. As Bruce pointed out, the meaning of the else: clause on loops
> can't be changed as it would break backwards compatibility with
> existing code
> 3. The post doesn't explain how the proposed change in semantics also
> makes sense for while loops
> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
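
The try/except/else analogy quoted above can be checked with code that
runs today. This is only a sketch: BreakLoop comes from the quoted
example, while loop_else and try_else are names made up here for
illustration.

```python
class BreakLoop(Exception):
    """Stands in for ``break`` in the try/except/else spelling."""


def loop_else(limit):
    """Report how a for loop with an else clause finished."""
    for x in range(20):
        if x > limit:
            break
    else:
        # Only reached when the loop was not left via break.
        return "reached the end of the loop"
    return "bailed out early"


def try_else(limit):
    """Same control flow, spelled with an exception instead of break."""
    try:
        for x in range(20):
            if x > limit:
                raise BreakLoop
    except BreakLoop:
        return "bailed out early"
    else:
        # Only reached when no BreakLoop was raised.
        return "reached the end of the loop"
```

The two functions agree both for a limit of 10 (early exit) and for a
limit of 50 (the loop runs to completion).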

From tjreedy at  Thu Jun  7 06:31:30 2012
From: tjreedy at (Terry Reedy)
Date: Thu, 07 Jun 2012 00:31:30 -0400
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqooj7$s7o$>
References: <jqooj7$s7o$>
Message-ID: <jqparl$9ka$>

On 6/6/2012 7:20 PM, Alice Bevan–McGregor wrote:
> Howdy!
> Was teaching a new user to Python the ropes a short while ago and ran
> into an interesting headspace problem: the for/else syntax fails the
> obviousness and consistency tests.

I disagree. The else clause is executed when the condition (explicit in 
while loops, implicit in for loops) is false. Consider the following 
implementation of while loops in a lower-level pseudo-python:

label startloop
if condition:
   goto startloop

This is *exactly* equivalent to

while condition:

In fact, the absolute goto is how while is implemented in assembler 
languages, including CPython bytecode. If you convert a for-loop to a 
while-loop, you will see the same thing.

CPython bytecode for for-loops is a little more condensed, with a higher 
level FOR_ITER code. It tries to get the next item if there is one and 
catches the exception and jumps if not. (It also handles and hides the 
fact that there are two iterator protocols.) But still, an absolute 
'goto startloop' jump back up to FOR_ITER is added to the end of the 'if 
next' suite, just as with while-loops.
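
A rough Python rendering of the label/goto picture, offered as a sketch
rather than a description of what CPython actually emits. Python has no
goto, so "while True" stands in for the label and "continue" for the
jump back; both function names and the trace list are invented for the
illustration.

```python
def while_else(limit, stop_at=None):
    """Ordinary while/else: record which exit path was taken."""
    trace = []
    x = 0
    while x < limit:
        x += 1
        if x == stop_at:
            trace.append("break")
            break
    else:
        trace.append("else")
    return trace


def if_goto(limit, stop_at=None):
    """The same loop written as 'if condition: body; goto start; else: ...'."""
    trace = []
    x = 0
    while True:                # stands in for "label startloop"
        if x < limit:          # the loop condition, tested as a plain if
            x += 1
            if x == stop_at:
                trace.append("break")
                break
            continue           # stands in for "goto startloop"
        else:                  # runs exactly when the condition is false
            trace.append("else")
            break
    return trace
```

In this form the else suite is literally the else of the if that tests
the loop condition, which is the reading argued for above.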

Terry Jan Reedy

From p.f.moore at  Thu Jun  7 08:27:36 2012
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Jun 2012 07:27:36 +0100
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On 7 June 2012 03:34, Rurpy <rurpy at> wrote:
> It is a minor issue but the end result of experiences
> like this, although infrequent, is often "WTF, why is
> this simple and reasonable thing so hard to do?".  And
> after a few times some programmers will start to wonder
> if maybe Python is not really an industrial-strength
> language -- one that they can be effective all the time,
> even when the problem falls outside the 95% demographic.
> (And I am not talking about things totally out of
> python's scope like high performance computing or
> systems programming.)

One suggestion, which would probably shed some light on whether this
should be viewed as something "simple and reasonable", would be to do
some research on how the same task would be achieved in other
languages. I have no experience to contribute but my intuition says
that this could well be hard in other languages too. Would you be
willing to do some web searches to look for solutions in (say) Java,
or C#, or Ruby? In theory, it shouldn't take long (as otherwise you
can conclude that the solution is obscure to the same extent that it
is with Python).

Even better, if those other languages do have a simple solution, it
may suggest an approach that would be appropriate for Python.


From stephen at  Thu Jun  7 09:12:26 2012
From: stephen at (Stephen J. Turnbull)
Date: Thu, 07 Jun 2012 16:12:26 +0900
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

Rurpy writes:

 > I took the time to post here because it took an inordinate
 > amount of effort to find a solution to a legitimate need 
 > (your opinion to the contrary notwithstanding)

I don't think I said the need was illegitimate, if I did I apologize,
and I certainly don't believe it is (I'm an economist by trade -- de
gustibus non est disputandum).

I just don't think it's necessary for Python to try to address the
problem, because the problem is somebody else's bad design at root.
And I don't think it would be wise to try to do it in a very general
way, because it's very hard to do that at the general level of the

 > I understand there is no support here for providing a non-
 > obscure, programmatic way of changing the encoding of the 
 > standard streams at program startup 

You're wrong.  There is *some* support for that.

It just has to be done safely, and that means that a generic
.set_encoding() method that can be called after I/O has been performed
probably isn't going to happen.
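
For reference, one way such a small function can look in Python 3. This
is a sketch, not necessarily the solution posted earlier in the thread:
rewrap and demo are illustrative names, and an in-memory io.BytesIO
stands in for sys.stdout.buffer so the encoded bytes can be inspected.

```python
import io


def rewrap(binary_stream, encoding, errors="strict"):
    """Return a new text layer, with the given encoding, over a binary stream."""
    return io.TextIOWrapper(binary_stream, encoding=encoding,
                            errors=errors, line_buffering=True)


def demo(encoding):
    """Write one non-ASCII word through rewrap() and return the raw bytes."""
    raw = io.BytesIO()             # assumption: stands in for sys.stdout.buffer
    text = rewrap(raw, encoding)
    text.write("caf\u00e9")
    text.flush()                   # push the encoded bytes down to raw
    return raw.getvalue()
```

At program start the same helper would be applied as
sys.stdout = rewrap(sys.stdout.buffer, "latin-1"), before anything has
been written to the stream, which is the safe case described above.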

And it might not happen at the core level, since a 3-line function can
do the job, it might make just as much sense to put up a package on

From ubershmekel at  Thu Jun  7 09:23:05 2012
From: ubershmekel at (Yuval Greenfield)
Date: Thu, 7 Jun 2012 10:23:05 +0300
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqparl$9ka$>
References: <jqooj7$s7o$>
Message-ID: <>

We had quite a lengthy discussion on for/else in October 2009

Guido mentioned:

> I would not have the feature at all if I had to do it over. I would *not*
> choose another keyword. But I don't see the same level of danger in it that
> some here see.
> I am also against adding a syntax warning for this [[loops with else but
> without break]]. It belongs in pylint etc.

Personally I'd prefer "if not break:" over "else:" but as we're stuck where
we are today I'm just going to encourage people not to use the construct at
all.


From jeanpierreda at  Thu Jun  7 09:57:35 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Thu, 7 Jun 2012 03:57:35 -0400
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

On Thu, Jun 7, 2012 at 3:23 AM, Yuval Greenfield <ubershmekel at> wrote:
> Personally I'd prefer "if not break:" over "else:" but as we're stuck where
> we are today I'm just going to encourage people not to use the construct at
> all.

Why shouldn't people use for-else?

-- Devin

From ubershmekel at  Thu Jun  7 10:31:22 2012
From: ubershmekel at (Yuval Greenfield)
Date: Thu, 7 Jun 2012 11:31:22 +0300
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

On Thu, Jun 7, 2012 at 10:57 AM, Devin Jeanpierre <jeanpierreda at> wrote:

> On Thu, Jun 7, 2012 at 3:23 AM, Yuval Greenfield <ubershmekel at> wrote:
> > Personally I'd prefer "if not break:" over "else:" but as we're stuck where
> > we are today I'm just going to encourage people not to use the construct at
> > all.
> Why shouldn't people use for-else?
> -- Devin

For-else/while-else are confusing. During the previous discussion even the
construct's proponents fell victim to its misleading nature. The word
"else" alone just doesn't fit its role here no matter how intricate and
carefully constructed an example is given to explain its nature or

I believe using for/else will cause you and maintainers of your code to
make more mistakes.


From jeanpierreda at  Thu Jun  7 11:28:49 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Thu, 7 Jun 2012 05:28:49 -0400
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

On Thu, Jun 7, 2012 at 4:31 AM, Yuval Greenfield <ubershmekel at> wrote:
> On Thu, Jun 7, 2012 at 10:57 AM, Devin Jeanpierre <jeanpierreda at> wrote:
>> On Thu, Jun 7, 2012 at 3:23 AM, Yuval Greenfield <ubershmekel at> wrote:
>> > Personally I'd prefer "if not break:" over "else:" but as we're stuck where
>> > we are today I'm just going to encourage people not to use the construct at
>> > all.
>> Why shouldn't people use for-else?
>> -- Devin
> I believe using for/else will cause you and maintainers of your code to make
> more mistakes.

I don't follow. What mistakes would people make? Why would they make them?

Also, are you worried about people that read the documentation and
know what for-else does, or the people that don't or haven't read this
documentation? When reading source code in an unfamiliar language, it's
good practice to read up on things you haven't seen yet --
although sometimes context seems good enough. If you are afraid that
this is someplace that context _seems_ good enough, but actually
_isn't_, that would be something to worry about (although I don't feel
that way).

-- Devin

From stephen at  Thu Jun  7 12:50:14 2012
From: stephen at (Stephen J. Turnbull)
Date: Thu, 07 Jun 2012 19:50:14 +0900
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

Devin Jeanpierre writes:
 > On Thu, Jun 7, 2012 at 4:31 AM, Yuval Greenfield <ubershmekel at> wrote:

 > > I believe using for/else will cause you and maintainers of your
 > > code to make more mistakes.
 > I don't follow. What mistakes would people make? Why would they
 > make them?

There was a long thread about a year ago on this list, where a couple
of less experienced programmers and even a couple of people who have
long since proven themselves reliable, gave code examples that
obviously hadn't been tested.<wink/>  There's a summary at:

The reason they make such mistakes is that there's a strong
association of "else" with "if-then-else", and for many people that
seems to be somewhere between totally useless and actively misleading.

For me, there are a number of reasonable mnemonics, a couple given in
this thread, but IIRC the only idiom I found really plausible was

    def search_in_iterable(key, iter):
        for item in iter:
            if item == key:
                return some_function_of(item)
        else:
            return not_found_default

From ubershmekel at  Thu Jun  7 13:01:18 2012
From: ubershmekel at (Yuval Greenfield)
Date: Thu, 7 Jun 2012 14:01:18 +0300
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

On Thu, Jun 7, 2012 at 1:50 PM, Stephen J. Turnbull <stephen at> wrote:

>    def search_in_iterable(key, iter):
>        for item in iter:
>            if item == key:
>                return some_function_of(item)
>        else:
>            return not_found_default
You don't need the "else" there. An equivalent:

    def search_in_iterable(key, iter):
       for item in iter:
           if item == key:
               return some_function_of(item)
       return not_found_default

I'm not sure I understood what you meant but I'll assume that by
"plausible"/"reasonable" you meant that it's a good example as to how
for/else is misleading.
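
A quick check of the equivalence claimed here, plus the variant where
the else does matter. The three helpers are made-up names for
illustration: when the loop body leaves via return, the else is
redundant; when it leaves via break and more code follows the loop, the
else is what separates the two outcomes.

```python
def search_with_else(key, items, default=None):
    for item in items:
        if item == key:
            return item
    else:
        return default


def search_without_else(key, items, default=None):
    for item in items:
        if item == key:
            return item
    # Reached only when the loop finished without returning.
    return default


def describe(key, items):
    for item in items:
        if item == key:
            break
    else:
        # No break happened: the key was never seen.
        return "not found"
    # Only reached after a break, so item is the match.
    return "found %r" % (item,)
```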

Devin Jeanpierre wrote:

> Also, are you worried about people that read the documentation and
> know what for-else does, or the people that don't or haven't read this
> documentation?

On this issue I'm worried about all sentient programmers.


From jeanpierreda at  Thu Jun  7 14:01:52 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Thu, 7 Jun 2012 08:01:52 -0400
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

On Thu, Jun 7, 2012 at 6:50 AM, Stephen J. Turnbull <stephen at> wrote:
> There was a long thread about a year ago on this list, where a couple
> of less experienced programmers and even a couple of people who have
> long since proven themselves reliable, gave code examples that
> obviously hadn't been tested.<wink/>

This is disappointing. for-else is simple, even if it has an ambiguous name.

> The reason they make such mistakes is that there's a strong
> association of "else" with "if-then-else", and for many people that
> seems to be somewhere between totally useless and actively misleading.

I know it's really bad form to shift goalposts, but I can't help but
offer an alternative hypothesis: What if it isn't that else is
confusing, but that use of else is rare? People have lots of silly
beliefs about things they never use, or haven't used in a very long
time.
> For me, there are a number of reasonable mnemonics, a couple given in
> this thread, but IIRC the only idiom I found really plausible was

I think of "else" as a collective/delayed else to the if statement in
the body of the loop (which is almost always present). This only works
for for loops though.

Pretty much every single for-else has almost exactly the same form,
though, so... it's pretty easy to use specialized models like that. :)

-- Devin

From ncoghlan at  Thu Jun  7 15:04:29 2012
From: ncoghlan at (Nick Coghlan)
Date: Thu, 7 Jun 2012 23:04:29 +1000
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

On Thu, Jun 7, 2012 at 10:01 PM, Devin Jeanpierre
<jeanpierreda at> wrote:
> On Thu, Jun 7, 2012 at 6:50 AM, Stephen J. Turnbull <stephen at> wrote:
>> The reason they make such mistakes is that there's a strong
>> association of "else" with "if-then-else", and for many people that
>> seems to be somewhere between totally useless and actively misleading.
> I know it's really bad form to shift goalposts, but I can't help but
> offer an alternative hypothesis: What if it isn't that else is
> confusing, but that use of else is rare? People have lots of silly
> beliefs about things they never use, or haven't used in a very long
> time.

FWIW, I just added the following paragraph to the relevant section of
the Python tutorial in 2.7, 3.2 and 3.3:

When used with a loop, the ``else`` clause has more in common with the
``else`` clause of a :keyword:`try` statement than it does that of
:keyword:`if` statements: a :keyword:`try` statement's ``else`` clause runs
when no exception occurs, and a loop's ``else`` clause runs when no ``break``
occurs. For more on the :keyword:`try` statement and exceptions, see

The new text should appear in the respective online versions as part
of the next daily docs rebuild.

It may not help much, but it won't hurt, and the "exceptional else" is
a much better parallel than trying to make loop else clauses fit the
"conditional else" mental model.
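
A side-by-side sketch of that parallel, with two helper names invented
for the purpose: in the first function the else runs because no break
occurred, in the second because no exception occurred.

```python
def smallest_factor(n):
    """Return the smallest factor of n in 2..n-1, or None if there is none."""
    for candidate in range(2, n):
        if n % candidate == 0:
            break                  # plays the role of "raise"
    else:
        return None                # loop else: no break occurred
    return candidate


def parse_int(text):
    """Return int(text), or None if text is not a valid integer."""
    try:
        value = int(text)
    except ValueError:
        return None
    else:
        return value               # try else: no exception occurred
```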


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From alice at  Thu Jun  7 15:06:38 2012
From: alice at (Alice Bevan–McGregor)
Date: Thu, 7 Jun 2012 09:06:38 -0400
Subject: [Python-ideas] for/else statements considered harmful
References: <jqooj7$s7o$>
Message-ID: <jqq90t$e2d$>

So the subject of the thread seems to hold true.  Average developers 
are confused by the current semantic (a problem that needs more than 
abstract p-code to correct) to the point of actively avoiding use of 
the structure.

I agree, however, that breaking all existing code is probably bad.  ;)

On 2012-06-07 00:53:22 +0000, Nick Coghlan said:
>     for x in range(20):
>         if x > 10:
>             break
>     except break:
>         # Bailed out early
>     else:
>         # Reached the end of the loop

Seems a not insignificant number of readers got fixated on the 
alternate keyword for the current behaviour of else (finally in my 
example) and ignored or misinterpreted the -really important part- of 
being able to detect if the loop was skipped (no iterations performed; 
else in my example).

Being able to have a block executed if the loop is never entered is 
vitally important so you can avoid expensive or potentially impossible 
length checks on the iterator before the loop.  Take this example:

    sock = lsock.accept()
    for chunk in iter(partial(sock.recv, 4096), ''):
        pass # do something with the chunk
    else:
        pass # no data received before client hangup!

Using a temporary variable to simulate this is... unfortunate.

    sock = lsock.accept()
    has_data = False
    for chunk in iter(partial(sock.recv, 4096), ''):
        has_data = True
        pass # do something with the chunk

    if not has_data:
        pass # no data received before client hangup!

empty would be a good keyword to preserve the existing meaning of else, 
but I'm pretty sure that's a fairly common variable name.  :/
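
For comparison, this is what today's else reports; a small sketch with
an invented helper name. Because the clause runs whenever the loop ends
without break, it fires for an empty iterable and a non-empty one
alike, so it cannot by itself answer "was the loop skipped?".

```python
def loop_outcome(items):
    """Return (iterations, else_ran) for a loop that never breaks."""
    iterations = 0
    else_ran = False
    for _ in items:
        iterations += 1
    else:
        # Current semantics: runs on normal termination, empty or not.
        else_ran = True
    return iterations, else_ran
```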

	-- Alice.

From ethan at  Thu Jun  7 15:14:36 2012
From: ethan at (Ethan Furman)
Date: Thu, 07 Jun 2012 06:14:36 -0700
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>
	<jqparl$9ka$>	<>	<>	<>	<>	<>
Message-ID: <>

Devin Jeanpierre wrote:
> On Thu, Jun 7, 2012 at 6:50 AM, Stephen J. Turnbull wrote:
>> The reason they make such mistakes is that there's a strong
>> association of "else" with "if-then-else", and for many people that
>> seems to be somewhere between totally useless and actively misleading.
> I know it's really bad form to shift goalposts, but I can't help but
> offer an alternative hypothesis: What if it isn't that else is
> confusing, but that use of else is rare? People have lots of silly
> beliefs about things they never use, or haven't used in a very long
> time.

I use the for/else and while/else constructs, and still get them wrong 
-- the association with if/else is very strong for me, and my usage 
pattern is more along the lines of "if this iterable was empty at the 

I appreciate the correlation with except/else, and the failed search 
idea -- those should help me keep these straight even before my tests 
fail.  ;)


From ncoghlan at  Thu Jun  7 15:36:08 2012
From: ncoghlan at (Nick Coghlan)
Date: Thu, 7 Jun 2012 23:36:08 +1000
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqq90t$e2d$>
References: <jqooj7$s7o$>
Message-ID: <>

On Thu, Jun 7, 2012 at 11:06 PM, Alice Bevan–McGregor
<alice at> wrote:
> On 2012-06-07 00:53:22 +0000, Nick Coghlan said:
>>    for x in range(20):
>>        if x > 10:
>>            break
>>    except break:
>>        # Bailed out early
>>    else:
>>        # Reached the end of the loop
> Seems a not insignificant number of readers got fixated on the alternate
> keyword for the current behaviour of else (finally in my example) and
> ignored or misinterpreted the -really important part- of being able to
> detect if the loop was skipped (no iterations performed; else in my
> example).
> Being able to have a block executed if the loop is never entered is vitally
> important so you can avoid expensive or potentially impossible length checks
> on the iterator before the loop.  Take this example:
>   sock = lsock.accept()
>   for chunk in iter(partial(sock.recv, 4096), ''):
>       pass # do something with the chunk
>   else:
>       pass # no data received before client hangup!
> Using a temporary variable to simulate this is... unfortunate.
>   sock = lsock.accept()
>   has_data = False
>   for chunk in iter(partial(sock.recv, 4096), ''):
>       has_data = True
>       pass # do something with the chunk
>   if not has_data:
>       pass # no data received before client hangup!
> empty would be a good keyword to preserve the existing meaning of else, but
> I'm pretty sure that's a fairly common variable name.  :/

Yeah, it's usually fairly important on here to separate out "this is
the problem I see" from "this is a proposed solution". Getting
agreement on the former is usually easier than the latter, since there
are so many additional constraints that come into play when it comes
to considering solutions. And if we can't even reach agreement that a
problem needs to be solved, then talking about solution details isn't
especially productive (although it can be fun to speculate about the
possibilities anyway).

FWIW, I usually solve this particular problem with for loops by using
the iteration variable itself to hold a sentinel value:

   sock = lsock.accept()
   chunk = None
   for chunk in iter(partial(sock.recv, 4096), ''):
       pass # do something with the chunk

   if chunk is None:
       pass # no data received before client hangup!

If "None" is a possible value in the iterable, then I'll use a
dedicated sentinel value instead:

    var = sentinel = object()
    for var in iterable:
        ...
    if var is sentinel:
        ...
I've never found either of those constructs ugly enough to
particularly want dedicated syntax to replace it, and the availability
of this approach is what makes it especially difficult to push for
dedicated syntactic support (since all that can really be saved is the
assignment that sets up the sentinel value).
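
A runnable sketch of the first pattern, under a few stated assumptions:
an in-memory io.BytesIO replaces the socket, the iter() sentinel is b""
rather than '' because binary reads return bytes in Python 3, and
read_chunks is a made-up name.

```python
import io
from functools import partial


def read_chunks(stream, size=4):
    """Join all chunks read from stream, or return None if there were none."""
    chunk = None                   # sentinel held in the loop variable itself
    chunks = []
    # Two-argument iter(): call stream.read(size) until it returns b"".
    for chunk in iter(partial(stream.read, size), b""):
        chunks.append(chunk)
    if chunk is None:
        # The loop body never ran, so the variable was never rebound.
        return None
    return b"".join(chunks)
```

The only extra line compared with a plain loop is the assignment that
sets up the sentinel.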


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ethan at  Thu Jun  7 15:21:24 2012
From: ethan at (Ethan Furman)
Date: Thu, 07 Jun 2012 06:21:24 -0700
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqq90t$e2d$>
References: <jqooj7$s7o$> <jqq90t$e2d$>
Message-ID: <>

Alice Bevan–McGregor wrote:
> Being able to have a block executed if the loop is never entered is 
> vitally important so you can avoid expensive or potentially impossible 
> length checks on the iterator before the loop.  Take this example:
>    sock = lsock.accept()
>    for chunk in iter(partial(sock.recv, 4096), ''):
>        pass # do something with the chunk
>    else:
>        pass # no data received before client hangup!

This is, indeed, the usual way I try to use these constructs...

> Using a temporary variable to simulate this is... unfortunate.
>    sock = lsock.accept()
>    has_data = False
>    for chunk in iter(partial(sock.recv, 4096), ''):
>        has_data = True
>        pass # do something with the chunk
>    if not has_data:
>        pass # no data received before client hangup!

and this is how I usually work around it.  :(


From arnodel at  Thu Jun  7 16:00:39 2012
From: arnodel at (Arnaud Delobelle)
Date: Thu, 7 Jun 2012 15:00:39 +0100
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqooj7$s7o$>
References: <jqooj7$s7o$>
Message-ID: <>

On 7 June 2012 00:20, Alice Bevan–McGregor <alice at> wrote:
> Howdy!
> Was teaching a new user to Python the ropes a short while ago and ran into
> an interesting headspace problem: the for/else syntax fails the obviousness
> and consistency tests.  When used in an if/else block the conditional code
> is executed if the conditional passes, and the else block is executed if the
> conditional fails.  Compared to for loops where the for code is repeated and
> the else code executed if we "naturally fall off the loop".  (The new user's
> reaction was "why the hoek would I ever use for/else?")

My solution: don't talk about a for/else construct, but talk about a
for/break/else block instead. Then the semantics become obvious again.

> I forked Python 3.3 to experiment with an alternate implementation that
> follows the logic of pass/fail implied by if/else: (and to refactor the
> stdlib, but that's a different issue ;)
>   for x in range(20):
>       if x > 10: break
>   else:
>       pass # we had no values to iterate
>   finally:
>       pass # we naturally fell off the loop
> It abuses finally (to avoid tying up a potentially common word as a reserved
> word like "done") but makes possible an important distinction without having
> to perform potentially expensive length calculations (which may not even be
> possible!) on the value being iterated: that is, handling the case where
> there were no values in the collection or returned by the generator.

I think your use of finally is as unfortunate as the current use of
else:  usually, finally is *always* executed, irrespective of what
happened in the try block.  Your new use goes against that.



From ethan at  Thu Jun  7 15:52:49 2012
From: ethan at (Ethan Furman)
Date: Thu, 07 Jun 2012 06:52:49 -0700
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>	<jqq90t$e2d$>
Message-ID: <>

Nick Coghlan wrote:
> On Thu, Jun 7, 2012 at 11:06 PM, Alice Bevan–McGregor
> <alice at> wrote:
>> On 2012-06-07 00:53:22 +0000, Nick Coghlan said:
>>>    for x in range(20):
>>>        if x > 10:
>>>            break
>>>    except break:
>>>        # Bailed out early
>>>    else:
>>>        # Reached the end of the loop
>> Seems a not insignificant number of readers got fixated on the alternate
>> keyword for the current behaviour of else (finally in my example) and
>> ignored or misinterpreted the -really important part- of being able to
>> detect if the loop was skipped (no iterations performed; else in my
>> example).
>> Being able to have a block executed if the loop is never entered is vitally
>> important so you can avoid expensive or potentially impossible length checks
>> on the iterator before the loop.  Take this example:
>>   sock = lsock.accept()
>>   for chunk in iter(partial(sock.recv, 4096), ''):
>>       pass # do something with the chunk
>>   else:
>>       pass # no data received before client hangup!
>> Using a temporary variable to simulate this is... unfortunate.
>>   sock = lsock.accept()
>>   has_data = False
>>   for chunk in iter(partial(sock.recv, 4096), ''):
>>       has_data = True
>>       pass # do something with the chunk
>>   if not has_data:
>>       pass # no data received before client hangup!
>> empty would be a good keyword to preserve the existing meaning of else, but
>> I'm pretty sure that's a fairly common variable name.  :/
> Yeah, it's usually fairly important on here to separate out "this is
> the problem I see" from "this is a proposed solution". Getting
> agreement on the former is usually easier than the latter, since there
> are so many additional constraints that come into play when it comes
> to considering solutions. And if we can't even reach agreement that a
> problem needs to be solved, then talking about solution details isn't
> especially productive (although it can be fun to speculate about the
> possibilities anyway).
> FWIW, I usually solve this particular problem with for loops by using
> the iteration variable itself to hold a sentinel value:
>    sock = lsock.accept()
>    chunk = None
>    for chunk in iter(partial(sock.recv, 4096), ''):
>        pass # do something with the chunk
>    if chunk is None:
>        pass # no data received before client hangup!
> If "None" is a possible value in the iterable, then I'll use a
> dedicated sentinel value instead:
>     var = sentinel = object()
>     for var in iterable:
>         ...
>     if var is sentinel:
>         ...
> I've never found either of those constructs ugly enough to
> particularly want dedicated syntax to replace it, and the availability
> of this approach is what makes it especially difficult to push for
> dedicated syntactic support (since all that can really be saved is the
> assignment that sets up the sentinel value).

This seems like a good work-around (meaning: I'll definitely use it, 
thanks!), but it does not address the confusion issues.
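
For reference, a minimal runnable sketch of the sentinel work-around quoted above. The names `_MISSING` and `consume` are hypothetical and chosen only for illustration. The loop variable is pre-seeded with a unique object, so an empty iterable can be told apart from one that happens to yield None:

```python
# Hypothetical names, for illustration only.
_MISSING = object()  # unique sentinel that no iterable can yield


def consume(iterable):
    """Collect items and report whether the iterable was empty."""
    item = _MISSING
    seen = []
    for item in iterable:
        seen.append(item)
    # If the loop body never ran, `item` is still the sentinel.
    was_empty = item is _MISSING
    return seen, was_empty


print(consume([]))
print(consume([None, 1]))
```

Seeding with a dedicated object rather than None matters only when None is a legitimate value in the iterable, as the quoted message points out.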

I think the main problem with the current while/else, for/else is 
two-fold: 1) we have two failure states (empty from the start, and 
desired result not met), and 2) even though the else is more similar to 
the else in try/except/else, it is formatted *just like* the if/else.

Perhaps the solution is to enhance for and while with except?

    sock = lsock.accept()
    for chunk in iter(partial(sock.recv, 4096), ''):
        pass # do something with the chunk
    except:
        pass # no data received before client hangup!
    else:
        pass # wrap-up processing on chunks


From mwm at  Thu Jun  7 17:30:11 2012
From: mwm at (Mike Meyer)
Date: Thu, 7 Jun 2012 11:30:11 -0400
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqq90t$e2d$>
Message-ID: <>

On Thu, 07 Jun 2012 06:52:49 -0700
Ethan Furman <ethan at> wrote:
> I think the main problem with the current while/else, for/else is 
> two-fold: 1) we have two failure states (empty from the start, and 
> desired result not met), and 2) even though the else is more similar to 
> the else in try/except/else, it is formatted *just like* the if/else.

I'd say we have 1.5 failure states, because the desired result is not
met in both cases. In my experience, the general case (that else
handles) is more common than the special case of the iterator being
empty.

> Perhaps the solution is to enhance for and while with except?
>     sock = lsock.accept()
>     for chunk in iter(partial(sock.recv, 4096), ''):
>         pass # do something with the chunk
>     except:
>         pass # no data received before client hangup!
>     else:
>         pass # wrap-up processing on chunks

Calling it "wrap-up processing" seems likely to cause people to think
about it as meaning "finally". But if the else clause is not executed
if the except clause is (as done by try/except/else), then there's no
longer an easy way to describe it.

It seems like adding an except would change the conditions under which
the else clause is executed (unlike try/except/else), as otherwise
there's no easy way to capture the current behavior, where else is
executed whenever there are no chunks left to process. But that kind
of thing seems like a way to introduce bugs.
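
As a concrete reminder of the current behavior being discussed, here is a small sketch (the function name `classify` is hypothetical). The else clause runs whenever the loop finishes without break, which covers both "nothing matched" and "nothing to iterate over":

```python
def classify(iterable, target):
    """Report how a for/else loop over `iterable` finished."""
    for item in iterable:
        if item == target:
            result = "found (break, else skipped)"
            break
    else:
        # Runs whenever the loop ends without break, including
        # when the iterable was empty from the start.
        result = "not found (else ran)"
    return result


print(classify([1, 2, 3], 2))
print(classify([1, 3], 2))
print(classify([], 2))
```

The last two calls take the same else path, which is why the empty case cannot be distinguished by else alone.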

Mike Meyer <mwm at>
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail -

From alice at  Thu Jun  7 17:52:10 2012
From: alice at (Alice Bevan–McGregor)
Date: Thu, 7 Jun 2012 11:52:10 -0400
Subject: [Python-ideas] for/else statements considered harmful
References: <jqooj7$s7o$> <jqq90t$e2d$>
Message-ID: <jqqina$6u0$>

On 2012-06-07 15:30:11 +0000, Mike Meyer said:
> Calling it "wrap-up processing" seems likely to cause people to think
> about it as meaning "finally". But if the else clause is not executed
> if the except clause is (as done by try/except/else), then there's no
> longer an easy way to describe it.
> It seems like adding an except would change the conditions under which
> the else clause is executed (unlike try/except/else), as otherwise
> there's no easy way capture the current behavior, where else is
> executed whenever there are no chunks left to process. But that kind
> of things seems like a way to introduce bugs.

Well, how about:

    for <var> in <iterable>:
        pass # process each <var>
    except:  # no arguments!
        pass # nothing to process
    else:
        pass # fell through
    finally:
        pass # regardless of break/fallthrough/empty

Now for loops perfectly match try/except/else/finally!  >:D  (Like 
exception handling, finally would be called even with an inner return 
from any of the prior sections.)

	-- Alice.

From guido at  Thu Jun  7 18:26:06 2012
From: guido at (Guido van Rossum)
Date: Thu, 7 Jun 2012 09:26:06 -0700
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

On Thu, Jun 7, 2012 at 6:04 AM, Nick Coghlan <ncoghlan at> wrote:
> FWIW, I just added the following paragraph to the relevant section of
> the Python tutorial in 2.7, 3.2 and 3.3:
> =================
> When used with a loop, the ``else`` clause has more in common with the
> ``else`` clause of a :keyword:`try` statement than it does that of
> :keyword:`if` statements: a :keyword:`try` statement's ``else`` clause runs
> when no exception occurs, and a loop's ``else`` clause runs when no ``break``
> occurs. For more on the :keyword:`try` statement and exceptions, see
> :ref:`tut-handling`.
> =================

I like this. Let's not change the syntax.

--Guido van Rossum (

From mwm at  Thu Jun  7 18:29:01 2012
From: mwm at (Mike Meyer)
Date: Thu, 7 Jun 2012 12:29:01 -0400
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqqina$6u0$>
References: <jqooj7$s7o$> <jqq90t$e2d$>
Message-ID: <>

On Thu, 7 Jun 2012 11:52:10 -0400
Alice Bevan–McGregor <alice at> wrote:

> On 2012-06-07 15:30:11 +0000, Mike Meyer said:
> > Calling it "wrap-up processing" seems likely to cause people to think
> > about it as meaning "finally". But if the else clause is not executed
> > if the except clause is (as done by try/except/else), then there's no
> > longer an easy way to describe it.
> > 
> > It seems like adding an except would change the conditions under which
> > the else clause is executed (unlike try/except/else), as otherwise
> > there's no easy way capture the current behavior, where else is
> > executed whenever there are no chunks left to process. But that kind
> > of things seems like a way to introduce bugs.
> Well, how about:
>     for <var> in <iterable>:
>         pass # process each <var>
>     except:  # no arguments!
>         pass # nothing to process
>     else:
>         pass # fell through
>     finally:
>         pass # regardless of break/fallthrough/empty
> Now for loops perfectly match try/except/else/finally!  >:D  (Like 
> exception handling, finally would be called even with an inner return 
> from any of the prior sections.)

For for (and don't forget while) loops, finally is pointless. It's the
same as code after the loop. For try, finally runs even if there's an
exception, which isn't true of that code.
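
To make the distinction concrete, a minimal sketch with hypothetical function names: code placed after a loop is skipped by an early return (or an exception), whereas a try/finally wrapper runs its cleanup either way.

```python
def after_loop(items, log):
    """Cleanup written as plain code after the loop."""
    for item in items:
        if item < 0:
            return "early return"
    log.append("cleanup")  # skipped when the loop returns early
    return "completed"


def with_finally(items, log):
    """Cleanup written in a finally clause around the loop."""
    try:
        for item in items:
            if item < 0:
                return "early return"
        return "completed"
    finally:
        log.append("cleanup")  # runs on return and on exceptions


log_a, log_b = [], []
print(after_loop([1, -1], log_a), log_a)
print(with_finally([1, -1], log_b), log_b)
```

For a loop that exits only by exhaustion or break, the two forms behave the same.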


From python at  Thu Jun  7 18:32:45 2012
From: python at (MRAB)
Date: Thu, 07 Jun 2012 17:32:45 +0100
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqqina$6u0$>
References: <jqooj7$s7o$> <jqq90t$e2d$>
Message-ID: <>

On 07/06/2012 16:52, Alice Bevan–McGregor wrote:
> On 2012-06-07 15:30:11 +0000, Mike Meyer said:
>>  Calling it "wrap-up processing" seems likely to cause people to think
>>  about it as meaning "finally". But if the else clause is not executed
>>  if the except clause is (as done by try/except/else), then there's no
>>  longer an easy way to describe it.
>>  It seems like adding an except would change the conditions under which
>>  the else clause is executed (unlike try/except/else), as otherwise
>>  there's no easy way capture the current behavior, where else is
>>  executed whenever there are no chunks left to process. But that kind
>>  of things seems like a way to introduce bugs.
> Well, how about:
>      for <var> in <iterable>:
>          pass # process each <var>
>      except:  # no arguments!
>          pass # nothing to process
>      else:
>          pass # fell through
>      finally:
>          pass # regardless of break/fallthrough/empty
> Now for loops perfectly match try/except/else/finally!  >:D  (Like
> exception handling, finally would be called even with an inner return
> from any of the prior sections.)
Is the "finally" clause really necessary? Is it just the same as putting it
after the loop?

From python at  Thu Jun  7 18:45:22 2012
From: python at (MRAB)
Date: Thu, 07 Jun 2012 17:45:22 +0100
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqq90t$e2d$>
Message-ID: <>

On 07/06/2012 17:32, MRAB wrote:
> On 07/06/2012 16:52, Alice Bevan–McGregor wrote:
>>  On 2012-06-07 15:30:11 +0000, Mike Meyer said:
>>>   Calling it "wrap-up processing" seems likely to cause people to think
>>>   about it as meaning "finally". But if the else clause is not executed
>>>   if the except clause is (as done by try/except/else), then there's no
>>>   longer an easy way to describe it.
>>>   It seems like adding an except would change the conditions under which
>>>   the else clause is executed (unlike try/except/else), as otherwise
>>>   there's no easy way capture the current behavior, where else is
>>>   executed whenever there are no chunks left to process. But that kind
>>>   of things seems like a way to introduce bugs.
>>  Well, how about:
>>       for <var> in <iterable>:
>>           pass # process each <var>
>>       except:  # no arguments!
>>           pass # nothing to process
>>       else:
>>           pass # fell through
>>       finally:
>>           pass # regardless of break/fallthrough/empty
>>  Now for loops perfectly match try/except/else/finally!  >:D  (Like
>>  exception handling, finally would be called even with an inner return
>>  from any of the prior sections.)
> Is the "finally" clause really necessary? Is it just the same as putting it
> after the loop?
I've just noticed your remark about the finally clause being run even
if there's a return. I can't say I like that; that's the job of
try...finally.

From guido at  Thu Jun  7 18:57:00 2012
From: guido at (Guido van Rossum)
Date: Thu, 7 Jun 2012 09:57:00 -0700
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqq90t$e2d$>
	<jqqina$6u0$> <>
Message-ID: <>

On Thu, Jun 7, 2012 at 9:45 AM, MRAB <python at> wrote:
> On 07/06/2012 17:32, MRAB wrote:
>> On 07/06/2012 16:52, Alice Bevan–McGregor wrote:
>>> On 2012-06-07 15:30:11 +0000, Mike Meyer said:
>>>> Calling it "wrap-up processing" seems likely to cause people to think
>>>> about it as meaning "finally". But if the else clause is not executed
>>>> if the except clause is (as done by try/except/else), then there's no
>>>> longer an easy way to describe it.
>>>> It seems like adding an except would change the conditions under which
>>>> the else clause is executed (unlike try/except/else), as otherwise
>>>> there's no easy way capture the current behavior, where else is
>>>> executed whenever there are no chunks left to process. But that kind
>>>> of things seems like a way to introduce bugs.
>>> Well, how about:
>>>     for <var> in <iterable>:
>>>         pass # process each <var>
>>>     except:  # no arguments!
>>>         pass # nothing to process
>>>     else:
>>>         pass # fell through
>>>     finally:
>>>         pass # regardless of break/fallthrough/empty
>>> Now for loops perfectly match try/except/else/finally!  >:D  (Like
>>> exception handling, finally would be called even with an inner return
>>> from any of the prior sections.)
>> Is the "finally" clause really necessary? Is it just the same as putting
>> it after the loop?
> I've just noticed your remark about the finally clause being run even
> if there's a return. I can't say I like that; that's the job of
> try...finally.

You can stop right there. This design is not going anywhere.

--Guido van Rossum (

From alice at  Thu Jun  7 20:04:11 2012
From: alice at (Alice Bevan–McGregor)
Date: Thu, 7 Jun 2012 14:04:11 -0400
Subject: [Python-ideas] for/else statements considered harmful
References: <jqooj7$s7o$> <jqq90t$e2d$>
Message-ID: <jqqqer$das$>

On 2012-06-07 16:29:01 +0000, Mike Meyer said:

> On Thu, 7 Jun 2012 11:52:10 -0400
> Alice Bevan–McGregor <alice at> wrote:
>> Now for loops perfectly match try/except/else/finally!  >:D  (Like
>> exception handling, finally would be called even with an inner return
>> from any of the prior sections.)
> For for (and don't forget while) loops, finally is pointless. It's the
> same as code after the loop. For try, finally runs even if there's an
> exception, which isn't true of that code.

I really should use parentheses less, as obviously people don't read the 
content between them.  (Not just you, I'm afraid! ;^)  If it weren't a 
useful feature (for/empty) I'm unsure as to why so many template 
engines implement it even though in most of them you _can_ utilize a 
sentinel value; at least, in the ones that allow embedded Python code.

Alas, the BDFL has spoken, however.  (Getting shot down was not 
unexpected despite the occasional +1000. ;)

	-- Alice.

From python at  Thu Jun  7 20:18:12 2012
From: python at (MRAB)
Date: Thu, 07 Jun 2012 19:18:12 +0100
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqqqer$das$>
References: <jqooj7$s7o$> <jqq90t$e2d$>
Message-ID: <>

On 07/06/2012 19:04, Alice Bevan–McGregor wrote:
> On 2012-06-07 16:29:01 +0000, Mike Meyer said:
>>  On Thu, 7 Jun 2012 11:52:10 -0400
>>  Alice Bevan–McGregor <alice at> wrote:
>>>  Now for loops perfectly match try/except/else/finally!  >:D  (Like
>>>  exception handling, finally would be called even with an inner return
>>>  from any of the prior sections.)
>>  For for (and don't forget while) loops, finally is pointless. It's the
>>  same as code after the loop. For try, finally runs even if there's an
>>  exception, which isn't true of that code.
> I really should use parenthesis less as obviously people don't read the
> content between them.  (Not just you, I'm afraid! ;^)  If it weren't a
> useful feature (for/empty) I'm unsure as to why so many template
> engines implement it even though in most of them you _can_ utilize a
> sentinel value; at least, in the ones that allow embedded Python code.
> Alas, the BDFL has spoken, however.  (Getting shot down was not
> unexpected despite the occasional +1000. ;)
It was the comment about the "finally" clause always running which was the
problem, not about running a clause when there was an empty sequence.

From rurpy at  Thu Jun  7 22:48:24 2012
From: rurpy at (Rurpy)
Date: Thu, 7 Jun 2012 13:48:24 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
Message-ID: <>

On 06/07/2012 01:12 AM, Stephen J. Turnbull wrote:
> Rurpy writes:
>  > I took the time to post here because it took an inordinate
>  > amount of effort to find a solution to a legitimate need 
>  > (your opinion to the contrary notwithstanding)
> I don't think I said the need was illegitimate, if I did I apologize,
> and I certainly don't believe it is (I'm an economist by trade -- de
> gustibus non est disputandum).
> I just don't think it's necessary for Python to try to address the
> problem, because the problem is somebody else's bad design at root.

I don't understand that argument.  The world is full of 
bad design that Python has to address: daylight savings 
time, calendars, floating-point (according to some).  
Good/bad design is not even constant and changes with 
time.  There is still a telnetlib module in stdlib despite
the existence of ssh.  I suspect the vast majority of 
programmers are interested in a language that allows 
them to *effectively* get done what they need to, whether 
they are working on the latest agile TDD REST server, or 
modifying some legacy text files.  What I for one *don't*
need is to have my programming language enforcing its 
idea of CS political correctness on me.

Secondly, the disparity in ease of use of an alternate
encoding on sys.stdout is not really between utf8
and non-utf8, it is between a default encoding (which
may be non-utf8), and the encoding I wish to use.  So
one can't really attribute it to a desire to improve 
the world by making non-utf8 harder to use!

And even were I to accept your argument, Python is 
inconsistent: when I open a file explicitly there is 
only a slight penalty for opening a non-default-encoded 
file (the need to explicitly give an encoding):

  f = open ("myfile", "w")   # my default utf8 encoding
  print ("text string", file=f)
  f = open ("myfile", "w", encoding="sjis")  # non-utf8
  print ("text string", file=f)

But for sys.stdout, the penalty for using an alternate
encoding is to google around for a solution (which may 
not be optimal as Victor Stinner pointed out) and then 
read about codecs and the StreamWriter wrapper, textio 
wrappers and the .buffer attribute.  And the reading part 
is then repeated by all those (at the same level of python 
expertise) who read the program.
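
For readers following along, a minimal sketch of the two wrapping approaches alluded to above. The helper names are hypothetical, and an in-memory `io.BytesIO` stands in for `sys.stdout.buffer` so the result can be inspected. This only illustrates the existing `codecs.getwriter` and `io.TextIOWrapper` APIs; it is not a proposed library addition.

```python
import codecs
import io


def wrap_with_codecs(binary_stream, encoding):
    """Older recipe: a codecs StreamWriter around a binary stream."""
    return codecs.getwriter(encoding)(binary_stream)


def wrap_with_textio(binary_stream, encoding):
    """io-based recipe: a TextIOWrapper around a binary stream."""
    return io.TextIOWrapper(binary_stream, encoding=encoding,
                            line_buffering=True)


# BytesIO stands in for sys.stdout.buffer in this sketch.
raw = io.BytesIO()
out = wrap_with_textio(raw, "shift_jis")
print("\u3042", file=out)  # HIRAGANA LETTER A
out.flush()
print(raw.getvalue())

# In a real program the replacement would look like:
#   sys.stdout = wrap_with_textio(sys.stdout.buffer, "shift_jis")
```

Either wrapper has to be installed before anything is written to the stream, otherwise output in two encodings ends up interleaved.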

All I can do is repeat what I said before: non-utf8
codings exist and are widely used.  That's a simple
fact.  Sample some .jp web sites and look at the ratio
of shift-jis web pages to utf-8 web pages for example.

utf-8 is an encoding.  shift-jis is an encoding.  Sure,
I understand that utf-8 is preferable and I will use it
when possible.  The fact that I am writing shift-jis means
that utf-8 *isn't* possible in this case.

Since utf-8 and shift-jis are both encodings and are equivalent 
from a coding viewpoint (a simple choice of which codec to use) 
the discrepancy in ease of use between the two in the case of 
writing to the standard streams is not justifiable and should 
be corrected if possible. 

> And I don't think it would be wise to try to do it in a very general
> way, because it's very hard to do that at the general level of the
> language.

But is it?  Or are you referring to switching encoding
on-the-fly?  (see below).

>  > I understand there is no support here for providing a non-
>  > obscure, programmatic way of changing the encoding of the 
>  > standard streams at program startup 
> You're wrong.  There is *some* support for that.
> It just has to be done safely, and that means that a generic
> .set_encoding() method that can be called after I/O has been performed
> probably isn't going to happen.

There are two sub-threads in this discussion

 1) Providing a more convenient and discoverable way to
 programmatically change the encoding of std* streams
 before first use.

 2) Changing the encoding used on the std* stream or
 any textio stream on the fly as a generalization of (1).

I thought I made clear I was advocating for (1) and 
not (2) when I earlier wrote in reply to you:
  > You are correct that my current concern is reinitializing 
  > the encoding(s) of the sys.std* streams prior to doing any
  > operations with them.
and to MRAB:
  > Disclaimer: As I said before, I am not particularly 
  > advocating for a set_encoding() method -- my 
  > primary suggestion is a programmatic way to change the
  > sys.std* encodings prior to first use. 

As for (2), you have pointed out some potential issues with
switching encodings midstream.  I don't understand how codecs 
work in Python sufficiently yet to either agree or disagree 
with you.  I have however questioned some of the statements 
made regarding its difficulty (and am holding my opinion 
open until I understand the issues better), but I am not 
(as I've stated) advocating for it now.

Sorry if I failed to make the distinction clearer.  My use
of .set_encoding() as a placeholder for both ideas probably 
contributed to the confusion.

> And it might not happen at the core level, since a 3-line function can
> do the job, it might make just as much sense to put up a package on
> PyPI.

I wasn't suggesting a change to the core level (if by that 
you mean to the interpreter).  I was asking if some way could 
be provided (one that is easier and more reliable than googling 
around for a magic incantation) to change the encoding of one 
or more of the already-open-when-my-program-starts sys.std* 
streams.  I presume that would be a standard library change
(in either the io or sys modules) and offered a .set_encoding() 
method as a placeholder for discussion.

I hardly think it is worth the effort, for either the producer 
or consumers, of putting a 3-line function on PyPI.  Nor would 
such a solution address the discoverability and ease-of-use 
problems I am complaining about.

An inferior and bare minimum way to address this would be to 
at least add a note about how to change the encoding to the 
sys.std* documentation.  That encourages cargo-cult programming 
and doesn't address the WTF effect but it is at least better 
than the current state of affairs.

From mwm at  Thu Jun  7 23:00:47 2012
From: mwm at (Mike Meyer)
Date: Thu, 7 Jun 2012 17:00:47 -0400
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jun 7, 2012 at 4:48 PM, Rurpy <rurpy at> wrote:
> I suspect the vast majority of
> programmers are interested in a language that allows
> them to *effectively* get done what they need to, whether
> they are working of the latest agile TTD REST server, or
> modifying some legacy text files.

Others have raised the question this begs to have answered: how do
other programming languages deal with wanting to change the encoding
of the standard IO streams? Can you show us how they do things that's
so much easier than what Python does?

> And even were I to accept your argument, Python is
> inconsistent: when I open a file explicitly there is
> only a slight penalty for opening a non-default-encoded
> file (the need the explicitly give an encoding):

The proper encoding for the standard IO streams is generally a
property of the environment, and hence is set in the environment. You
have a use case where that's not the case. The argument is that your
use case isn't common enough to justify changing the standard library.
Can you provide evidence to the contrary? Other languages that make
setting the encoding on the standard streams easy, or applications
outside of those built for your system that have a "--encoding" type

> I wasn't suggesting a change to the core level (if by that
> you mean to the interpreter).  I was asking if some way could
> be provided that is easier and more reliable than googling
> around for a magic incantation) to change the encoding of one
> or more of the already-open-when-my-program-starts sys.std*
> streams.  I presume that would be a standard library change
> (in either the io or sys modules) and offered a .set_encoding()
> method as a placeholder for discussion.

Why presume that this needs a change in the library? The method is
straightforward, if somewhat ugly. Is there any reason it can't just
be documented, instead of added to the library? Changing the library
would require a similar documentation change.


From rurpy at  Thu Jun  7 23:02:13 2012
From: rurpy at (Rurpy)
Date: Thu, 7 Jun 2012 14:02:13 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
Message-ID: <>

On 06/07/2012 12:27 AM, Paul Moore wrote: 
> One suggestion, which would probably shed some light on whether this
> should be viewed as something "simple and reasonable", would be to do
> some research on how the same task would be achieved in other
> languages.

Yes, that is a good idea.  If I decide to reraise this
suggestion at some point, I will try to do as you suggest.

> I have no experience to contribute but my intuition says
> that this could well be hard on other languages too. 

Again, I have yet to be convinced this is hard.  I am
very sceptical it is hard in the case of streams before
they've been written or read.  Replacing sys.stdout 
with a wrapper that encodes with the alternate encoding
clearly works -- it just needs to be encapsulated so the 
user doesn't need to figure out all the details in order
to use it.

> Would you be
> willing to do some web searches to look for solutions in (say) Java,
> or C#, or Ruby? In theory, it shouldn't take long (as otherwise you
> can conclude that the solution is obscure to the same extent that it
> is with Python).
> Even better, if those other languages do have a simple solution, it
> may suggest an approach that would be appropriate for Python.

From ncoghlan at  Thu Jun  7 23:45:55 2012
From: ncoghlan at (Nick Coghlan)
Date: Fri, 8 Jun 2012 07:45:55 +1000
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

The interpreter uses the standard streams internally, and they're one of
the first things created during interpreter startup. User provided code
doesn't start running until well after they're initialised.

If user level code doesn't want those streams, it needs to replace them
with something else.


Sent from my phone, thus the relative brevity :)
On Jun 8, 2012 7:03 AM, "Rurpy" <rurpy at> wrote:

> On 06/07/2012 12:27 AM, Paul Moore wrote:
> > One suggestion, which would probably shed some light on whether this
> > should be viewed as something "simple and reasonable", would be to do
> > some research on how the same task would be achieved in other
> > languages.
> Yes, that is a good idea.  If I decide to reraise this
> suggestion at some point, I will try to do as you suggest.
> > I have no experience to contribute but my intuition says
> > that this could well be hard on other languages too.
> Again, I have yet to be convinced this is hard.  I am
> very sceptical it is hard in the case of streams before
> they've been written or read.  Replacing sys.stdout
> with a wrapper that encodes with the alternate encoding
> clearly works -- it just needs to be encapsulated so the
> user doesn't need to figure out all the details in order
> to use it.
> > Would you be
> > willing to do some web searches to look for solutions in (say) Java,
> > or C#, or Ruby? In theory, it shouldn't take long (as otherwise you
> > can conclude that the solution is obscure to the same extent that it
> > is with Python).
> >
> > Even better, if those other languages do have a simple solution, it
> > may suggest an approach that would be appropriate for Python.

From rurpy at  Fri Jun  8 02:14:29 2012
From: rurpy at (Rurpy)
Date: Thu, 7 Jun 2012 17:14:29 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
Message-ID: <>

On 06/07/2012 03:45 PM, Nick Coghlan wrote:
> The interpreter uses the standard streams internally, and
> they're one of the first things created during interpreter
> startup. User provided code doesn't start running until well
> after they're initialised.

In other words, the stream objects referenced by sys.std* 
are opened before the user code runs?

But if there are no operations on those streams until my
user code runs, they are still in the same state they were
after they were initialized, yes?  

So if one wanted to provide an "only before first use" 
set_encoding() function, why couldn't that function reexecute
the codecs part of the initialization code a second time?  
Of course there would need to be some sort of flag that it
could use to verify the stream was still in its initial state.

> If user level code doesn't want those streams, it needs to
> replace them with something else.

Yes, this is what the code I googled up does:
  import codecs
  import sys
  sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)
But that code is not obvious to someone who has been able to do
all his encoded IO (with the exception of sys.stdout) using just
the encoding parameter of open().  Hence my question if something
like a set_encoding() method/function that would work on
sys.stdout is feasible.  I don't see an answer to that in your
statement above.

From nathan at  Fri Jun  8 02:59:31 2012
From: nathan at (Nathan Schneider)
Date: Thu, 7 Jun 2012 17:59:31 -0700
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jun 7, 2012 at 5:14 PM, Rurpy <rurpy at> wrote:
> On 06/07/2012 03:45 PM, Nick Coghlan wrote:
>> The interpreter uses the standard streams internally, and
>> they're one of the first things created during interpreter
>> startup. User provided code doesn't start running until well
>> after they're initialised.
> In other words, the stream objects referenced by sys.std*
> are opened before the user code runs?
> But if there are no operations on those streams until my
> user code runs, they are still in the same state they were
> after they were initialized, yes?
> So if one wanted to provide an "only before first use"
> set_encoding() function, why couldn't that function reexecute
> the codecs part of the initialization code a second time?
> Of course there would need to be some sort of flag that it
> could use to verify the stream was still in its initial state.
>> If user level code doesn't want those streams, it needs to
>> replace them with something else.
> Yes, this is what the code I googled up does:
>   import codecs
>   sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)

What if codecs contained convenience methods for stdin and stdout?
I.e. the above could be written more simply as

  import codecs

This is much more memorable than the current option, and would also
make life easier when working with fileinput (whose openhook argument
can be set to control encoding of input *file* streams, but when it
falls back to stdin this preference is ignored).

> But that code is not obvious to someone who has been able to do
> all his encoded IO (with the exception of sys.stdout) using just
> the encoding parameter of open().  Hence my question if something
> like a set_encoding() method/function that would work on
> sys.stdout is feasible.  I don't see an answer to that in your
> statement above.

From ncoghlan at  Fri Jun  8 03:01:26 2012
From: ncoghlan at (Nick Coghlan)
Date: Fri, 8 Jun 2012 11:01:26 +1000
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 8, 2012 at 10:14 AM, Rurpy <rurpy at> wrote:
> On 06/07/2012 03:45 PM, Nick Coghlan wrote:
>> If user level code doesn't want those streams, it needs to
>> replace them with something else.
> Yes, this is what the code I googled up does:
>   import codecs
>   sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)
> But that code is not obvious to someone who has been able to do
> all his encoded IO (with the exception of sys.stdout) using just
> the encoding parameter of open().  Hence my question if something
> like a set_encoding() method/function that would work on
> sys.stdout is feasible.  I don't see an answer to that in your
> statement above.

Right, I was only trying to explain why the standard streams are a
special case - because they're also used by the interpreter, and it
makes the startup process much simpler if the interpreter retains
complete control over the way they're initialised (it's already
complicated by the fact we need to get something half-usable in place
as sys.stderr so that error reporting is possible while initialising
them properly). It then becomes an application level operation to
replace them if desired.

We can (and do) make the internal standard stream initialisation
configurable, but it then becomes a UI design problem to get something
that balances flexibility against complexity. PYTHONIOENCODING (in
association with OS utilities that make it possible to set an
environment variable for a specific process invocation, as well as
support in the subprocess module for passing a tailored environment to
subprocesses) is our current solution.
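
As an illustration of that approach, a minimal sketch of passing a tailored environment to a child interpreter. The helper name `run_child_with_encoding` is hypothetical, and `subprocess.run` assumes a Python newer than the versions discussed in this thread (3.5 or later).

```python
import os
import subprocess
import sys


def run_child_with_encoding(code, encoding):
    """Run `code` in a child interpreter whose standard streams use
    `encoding`, and return the raw bytes it wrote to stdout."""
    # Copy the current environment and override only PYTHONIOENCODING.
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


# The raw string keeps the \u escape intact for the child to interpret.
raw = run_child_with_encoding(r"print('\u3042')", "shift_jis")
print(raw)
```

The child program itself needs no encoding-related code; the choice is made entirely by whoever launches it.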

The interpreter design aims, first and foremost, to provide a simple
and straightforward experience in POSIX environments that use UTF-8
everywhere (since that's the most sane approach available for
migrating from a previously ASCII-based computing world). Windows is a
bit trickier (due to the internal use of UTF-16 APIs and the lack of
POSIX-style support for temporarily setting an environment variable
when invoking a process from the shell), but correctly supporting that
environment is also a very high priority. The fallback behaviours when
these situations do not apply are designed to work best on systems
that are, at least somewhat *locally* consistent.

The real world is complex. Eventually, our answer has to be "handle it
at the application level, there are too many variations for us to
support it directly at the interpreter level". Currently, any standard
stream encoding related problem that can't be handled with
PYTHONIOENCODING is just such a situation. We know it sucks for
multi-encoding environments, but those are a nightmare for a lot of
reasons and are the main drivers behind the industry-wide effort to
standardise on Unicode text handling, including universal encodings
like UTF-8.

So now we're down to the question of how much complexity we're willing
to tolerate in the interpreter specifically for the sake of
environments where:
1. The automatic standard stream encoding calculation gives the wrong answer
2. The PYTHONIOENCODING override is insufficient
3. The application being executed isn't already handling the problem
4. A -m executable helper module (or directly executable helper
script) can't be used to initialise the standard streams correctly
before continuing on to execute the requested application via the
runpy module
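
Point 4 could look roughly like the sketch below. The function name and its
parameters are hypothetical; a real helper would presumably pass
sys.stdout.buffer as the binary stream and take the module name and
encoding from its own command line.

```python
import io
import runpy
import sys

def run_module_with_stdout(module_name, binary_stream, encoding):
    """Hypothetical helper: run a module as __main__ with a re-encoded stdout."""
    original = sys.stdout
    sys.stdout = io.TextIOWrapper(binary_stream, encoding=encoding)
    try:
        # Same machinery that "python -m module_name" relies on.
        runpy.run_module(module_name, run_name="__main__")
    finally:
        sys.stdout.flush()
        sys.stdout.detach()   # leave the binary stream open for the caller
        sys.stdout = original
```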

And the answer is "not much". About the only likely way forward I can
see for streamlining this situation would be to treat this as another
use case for, which proposes the
ability to run snippets of Python code prior to execution of __main__.

I do agree that "create a new IO object that is like this old IO
object but with these settings changed" could probably do with a
better official API, but such an API needs to be designed with a
respect for the issues associated with changing encodings "on the fly"
and ask serious questions about whether or not we should be
encouraging that practice by making it easier than it is already. I
thought I had posted a tracker issue to that effect, but I can't find
it now.
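
For illustration only, here is roughly what the "new IO object like the old
one" operation looks like when done by hand today. The rewrap() name is
hypothetical, and the sketch deliberately ignores the hard part (text
already written or buffered under the old encoding).

```python
import io

def rewrap(stream, encoding, errors="strict"):
    """Hypothetical helper: new text wrapper over the same binary buffer."""
    stream.flush()                        # push out text encoded the old way
    line_buffering = stream.line_buffering
    binary = stream.detach()              # the old wrapper is unusable after this
    return io.TextIOWrapper(binary, encoding=encoding, errors=errors,
                            line_buffering=line_buffering)

# Demonstration on an in-memory buffer rather than the real sys.stdout.
raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(raw, encoding="ascii")
utf8_stream = rewrap(ascii_stream, "utf-8")
utf8_stream.write("\u00e9")
utf8_stream.flush()
```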


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From at  Fri Jun  8 04:57:29 2012
From: at (Yury Selivanov)
Date: Thu, 7 Jun 2012 22:57:29 -0400
Subject: [Python-ideas] functools.partial
Message-ID: <>


While I was working on adding support for 'functools.partial' in PEP 362, 
I discovered that it doesn't do any sanity check on passed arguments
upon creation.


    def foo(a):
        pass

    p = partial(foo, 1, 2, 3) # this line will execute

    p() # this line will fail

Is it a bug?  Or is it a feature, because we deliberately don't do any checks 
because of performance issues?  If the latter - I think it should be at least
documented.


From cmjohnson.mailinglist at  Fri Jun  8 05:14:42 2012
From: cmjohnson.mailinglist at (Carl M. Johnson)
Date: Thu, 7 Jun 2012 17:14:42 -1000
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqqina$6u0$>
References: <jqooj7$s7o$> <jqq90t$e2d$>
Message-ID: <>

On Jun 7, 2012, at 5:52 AM, Alice Bevan–McGregor wrote:

> Well, how about:
>   for <var> in <iterable>:
>       pass # process each <var>
>   except:  # no arguments!
>       pass # nothing to process
>   else:
>       pass # fell through
>   finally:
>       pass # regardless of break/fallthrough/empty

Finally is redundant, but what about an `except break:` as the opposite of `else`?

From ncoghlan at  Fri Jun  8 05:40:25 2012
From: ncoghlan at (Nick Coghlan)
Date: Fri, 8 Jun 2012 13:40:25 +1000
Subject: [Python-ideas] functools.partial
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 8, 2012 at 12:57 PM, Yury Selivanov < at> wrote:
> Hello,
> While I was working on adding support for 'functools.partial' in PEP 362,
> I discovered that it doesn't do any sanity check on passed arguments
> upon creation.
> Example:
>    def foo(a):
>        pass
>    p = partial(foo, 1, 2, 3) # this line will execute
>    p() # this line will fail
> Is it a bug?  Or is it a feature, because we deliberately don't do any checks
> because of performance issues?  If the latter - I think it should be at least
> documented.

Partly the latter, but also a matter of "this is hard to do, so we
don't even try". There are many other "lazy execution" APIs with the
same problem - they accept an arbitrary underlying callable, but you
don't find out until you try to call it that the arguments don't match
the parameters. This leads to errors being raised far away from the
code that actually introduced the error.

If you dig up some of the older PEP 362 discussions, you'll find that
allowing developers to reduce this problem over time is the main
reason the Signature.bind() method was added to the PEP. While I
wouldn't recommend it for the base partial type, I could easily see
someone using PEP 362 to create a "checked partial" that ensures
arguments are valid as they get passed in rather than leaving the
validation until the call is actually made.
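
A minimal sketch of such a "checked partial", assuming the PEP 362 API as it
is exposed through inspect.signature(); checked_partial is a hypothetical
name, not something in functools:

```python
import functools
import inspect

def checked_partial(func, *args, **kwargs):
    """Hypothetical helper, not part of functools: validate arguments eagerly."""
    # bind_partial() raises TypeError if these arguments can never fit
    # func's signature, so the error surfaces where the partial is created.
    inspect.signature(func).bind_partial(*args, **kwargs)
    return functools.partial(func, *args, **kwargs)

def foo(a):
    pass

p = checked_partial(foo, 1)          # accepted: one positional argument fits
try:
    checked_partial(foo, 1, 2, 3)    # rejected here rather than at call time
    creation_error = None
except TypeError as exc:
    creation_error = exc
```

The check costs one signature lookup per partial created, which is part of
why it is a poor fit for the base type.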


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From niki.spahiev at  Fri Jun  8 09:23:36 2012
From: niki.spahiev at (Niki Spahiev)
Date: Fri, 08 Jun 2012 10:23:36 +0300
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <jqs99l$h1f$>

On  8.06.2012 00:00, Mike Meyer wrote:
> The proper encoding for the standard IO streams is generally a
> property of the environment, and hence is set in the environment. You
> have a use case where that's not the case. The argument is that your
> use case isn't common enough to justify changing the standard library.
> Can you provide evidence to the contrary? Other languages that make
> setting the encoding on the standard streams easy, or applications
> outside of those built for your system that have a "--encoding" type
> flag?

Mercurial:
...
     --debug             enable debugging output
     --debugger          start debugger
     --encoding ENCODE   set the charset encoding (default: UTF-8)
     --encodingmode MODE set the charset encoding mode (default: strict)
     --traceback         always print a traceback on exception
...


From stephen at  Fri Jun  8 10:04:27 2012
From: stephen at (Stephen J. Turnbull)
Date: Fri, 08 Jun 2012 17:04:27 +0900
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

Yuval Greenfield writes:
 > On Thu, Jun 7, 2012 at 1:50 PM, Stephen J. Turnbull <stephen at>wrote:
 > >    def search_in_iterable(key, iter):
 > >        for item in iter:
 > >            if item == key:
 > >                return some_function_of(item)
 > >        else:
 > >            return not_found_default
 > >
 > >
 > You don't need the "else" there. An equivalent:

*You* don't need it.  *I* like it, because it expresses the fact that
returning a default is a necessary complement to the for loop.

While this is something of a TOOWTDI violation, there are cases where
else is needed to express the semantics, as well (eg, if the first
return statement is replaced by "process(item); break").
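
A small sketch of that second case, where the else clause carries real
meaning because the loop exits via break. The function name and the return
strings are made up for illustration.

```python
def process_first_match(key, items, process):
    """Hypothetical example: process the first match, then carry on."""
    for item in items:
        if item == key:
            process(item)
            break
    else:
        # Runs only when the loop finished without hitting break,
        # which includes the case of an empty iterable.
        return "not found"
    return "processed"

seen = []
```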

From techtonik at  Fri Jun  8 10:16:34 2012
From: techtonik at (anatoly techtonik)
Date: Fri, 8 Jun 2012 11:16:34 +0300
Subject: [Python-ideas] Isolated (?transactional) exec (?subinterpreter)
Message-ID: <>


Having a lot of ideas is a curse, because I can barely follow up on
them, but I try - I really read replies, just don't have enough energy
to answer immediately (as it usually requires some time for research).
Here is another one that has been ripening for so long that it risks
going rotten:

  Make exec(code[, globals[, locals]]) calls consistent, optionally
  isolated from parent environment and transactional.

  - it should not matter if the code is executed with command line
    interpreter or from exec(), code should not be modified to
    successfully run in exec if it successfully runs in interpreter
    session

Optionally isolated from parent environment:
  - a feature to execute user script in a snapshot of current
    environment and have a choice whether to merge its modifications
    back to environment or not

    real user story - read system configuration settings, where
    optional detection rules are written in Python (Blender/SCons
    build scripts) - autodetection probes can affect environment while
    detection takes place and it can lead to more problems later

    (think of virtualenv on Python process level with defined data
    exchange protocol through globals/locals variables)

  - well, if it is isolated - it is already transactional - an ability
    to discard results if an error or an exception inside exec()
    occurs - getting back to the state right before exec.
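
A very rough sketch of the "discard on error" part, limited to the namespace
dictionary handed to exec(). The transactional_exec name is hypothetical,
the namespace values are assumed to be deep-copyable, and nothing outside
that dictionary (imported modules, os.environ, files) is isolated.

```python
import copy

def transactional_exec(code, namespace):
    """Hypothetical helper: commit namespace changes only if code succeeds."""
    scratch = copy.deepcopy(namespace)
    exec(code, scratch)                 # may raise; namespace is still untouched
    scratch.pop("__builtins__", None)   # exec() adds this key; keep it out
    namespace.clear()
    namespace.update(scratch)

settings = {"paths": ["/usr"]}
transactional_exec("paths.append('/opt')", settings)   # succeeds, so merged back

try:
    transactional_exec("paths.append('/tmp'); raise RuntimeError('probe failed')",
                       settings)
except RuntimeError:
    pass                                # settings keeps its pre-exec state
```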

anatoly t.

From simon.sapin at  Fri Jun  8 09:42:50 2012
From: simon.sapin at (Simon Sapin)
Date: Fri, 08 Jun 2012 09:42:50 +0200
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <jqs99l$h1f$>
References: <>
Message-ID: <>

On 08/06/2012 09:23, Niki Spahiev wrote:
> Mercurial:
> ...
>       --debug             enable debugging output
>       --debugger          start debugger
>       --encoding ENCODE   set the charset encoding (default: UTF-8)
>       --encodingmode MODE set the charset encoding mode (default: strict)
>       --traceback         always print a traceback on exception
> ...

 From the man page:

>         This overrides the default locale setting detected by Mercurial.
>         This  setting  is  used  to  convert  data  including usernames,
>         changeset descriptions, tag names, and  branches.  This  setting
>         can be overridden with the --encoding command-line option.

I don't know if this affects standard IO.

Simon Sapin

From ncoghlan at  Fri Jun  8 11:04:55 2012
From: ncoghlan at (Nick Coghlan)
Date: Fri, 8 Jun 2012 19:04:55 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
	model for loop else clauses
Message-ID: <>

(context for python-ideas: my recently checked in changes to the
tutorial, that added the final paragraph to

On Fri, Jun 8, 2012 at 5:29 PM, Stephen J. Turnbull <stephen at> wrote:
> Note: reply-to set to python-ideas.
> Nick Coghlan writes:
>  > The inaccuracies in the analogy are why this is in the tutorial, not the
>  > language reference. All 3 else clauses are really their own thing.
> Nick, for the purpose of the tutorial, actually there are 4 else
> clauses: you need to distinguish *while* from *for*.  It was much
> easier for me to get confused about *for*.

The only thing I'm trying to do with the tutorial update is to
encourage beginners to start thinking in terms of try/except/else
when they first encounter for/break/else and while/break/else. That's
it.

Yes, ultimately once people fully understand how it works under the
hood (including the loop-and-a-half construct for infinite while
loops), they'll realise it's actually closely related to conditionals
as well, but anyone that places too much weight on the following
obvious parallel is going to be confused for a long time. After all:

  if iterable:
    ...
  else:
    ...

is *very* similar in appearance to:

  for x in iterable:
    ...
  else:
    ...

I believe that parallel is 99% of the reason why people get confused
about the meaning of the latter.

The point of the tutorial update is to give readers a slight nudge
towards thinking of the latter as:

  for x in iterable:
    ...
  except break:  # Implicit in the semantics of loops
    pass
  else:
    ...

Would it be worth adding the "except break:" clause to the language
just to make it crystal clear what is actually going on? I don't think
so, but it's still a handy way to explain the semantics while gently
steering people away from linking for/else and if/else too closely. I
actually agree all of the else clauses really *are* quite closely
related (hence the consistent use of the same keyword), but the
relationship is *not* the intuitively obvious one that comes to mind
when you just look at the similarity in the concrete syntax
specifically of for/else and if/else.
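
In valid Python, the same semantics can be spelled out with an explicit
flag. The two functions below are illustrative names only; they are meant to
behave identically, including for an empty iterable.

```python
def search_with_else(items, wanted):
    for x in items:
        if x == wanted:
            break
    else:
        return "else ran"        # no break happened
    return "else skipped"        # break happened

def search_with_flag(items, wanted):
    broke = False
    for x in items:
        if x == wanted:
            broke = True         # plays the role of the implicit "except break"
            break
    if not broke:
        return "else ran"
    return "else skipped"
```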


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From jeanpierreda at  Fri Jun  8 11:14:45 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Fri, 8 Jun 2012 05:14:45 -0400
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

On Fri, Jun 8, 2012 at 4:04 AM, Stephen J. Turnbull <stephen at> wrote:
> Yuval Greenfield writes:
>  > On Thu, Jun 7, 2012 at 1:50 PM, Stephen J. Turnbull <stephen at>wrote:
>  >
>  > >    def search_in_iterable(key, iter):
>  > >        for item in iter:
>  > >            if item == key:
>  > >                return some_function_of(item)
>  > >        else:
>  > >            return not_found_default
>  > >
>  > >
>  > You don't need the "else" there. An equivalent:
> *You* don't need it.  *I* like it, because it expresses the fact that
> returning a default is a necessary complement to the for loop.

I've never been sure of what is good style here. It's comparable to
these two things:

def foo():
    if bar():
        return baz
    return quux

def foo2():
    if bar():
        return baz
    else:
        return quux

Is there some well-accepted rule of which to use?

-- Devin

From rob.cliffe at  Fri Jun  8 11:44:55 2012
From: rob.cliffe at (Rob Cliffe)
Date: Fri, 08 Jun 2012 10:44:55 +0100
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

On 08/06/2012 10:04, Nick Coghlan wrote:
> (context for python-ideas: my recently checked in changes to the
> tutorial, that added the final paragraph to
> On Fri, Jun 8, 2012 at 5:29 PM, Stephen J. Turnbull<stephen at>  wrote:
>> Note: reply-to set to python-ideas.
>> Nick Coghlan writes:
>>   >  The inaccuracies in the analogy are why this is in the tutorial, not the
>>   >  language reference. All 3 else clauses are really their own thing.
>> Nick, for the purpose of the tutorial, actually there are 4 else
>> clauses: you need to distinguish *while* from *for*.  It was much
>> easier for me to get confused about *for*.
> The only thing I'm trying to do with the tutorial update is to
> encourage beginners to start thinking in terms of try/except/else
> when they first encounter for/break/else and while/break/else. That's
> it.
> Yes, ultimately once people fully understand how it works under the
> hood (including the loop-and-a-half construct for infinite while
> loops), they'll realise it's actually closely related to conditionals
> as well, but anyone that places too much weight on the following
> obvious parallel is going to be confused for a long time. After all:
>    if iterable:
>      ...
>    else:
>      ...
> is *very* similar in appearance to:
>    for x in iterable:
>      ...
>    else:
>      ...
> I believe that parallel is 99% of the reason why people get confused
> about the meaning of the latter.
> The point of the tutorial update is to give readers a slight nudge
> towards thinking of the latter as:
>    for x in iterable:
>      ...
>    except break:  # Implicit in the semantics of loops
>      pass
>    else:
>      ...
> Would it be worth adding the "except break:" clause to the language
> just to make it crystal clear what is actually going on? I don't think
> so, but it's still a handy way to explain the semantics while gently
> steering people away from linking for/else and if/else too closely. I
> actually agree all of the else clauses really *are* quite closely
> related (hence the consistent use of the same keyword), but the
> relationship is *not* the intuitively obvious one that comes to mind
> when you just look at the similarity in the concrete syntax
> specifically of for/else and if/else.
> Cheers,
> Nick.
I think a better scheme would be to have more meaningful keywords or 
keyword-combinations, e.g.

for x in iterable:
     # do stuff
ifempty:  #  or perhaps ifnoiter: (also applicable to while loops)
     # do stuff
#ifbreak:
     # do stuff
#ifnobreak:
     # do stuff

which would give all the flexibility while making it reasonably clear 
what was happening.
Rob Cliffe

From stephen at  Fri Jun  8 13:11:56 2012
From: stephen at (Stephen J. Turnbull)
Date: Fri, 08 Jun 2012 20:11:56 +0900
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

Rurpy writes:

 > Python is inconsistent:

Yup, and I said there is support for dealing with that inconsistency.
At least I'm +1 and Nick's +0.5.

So let's talk about what to do about it.  Nick has a pretty good
channel on the BDFL, and since he doesn't seem to like an addition to
the stdlib here, it may not go far.  But I don't see a reason to rule
out stdlib changes yet.

As far as I'm concerned, there are three reasonable proposals:

 > > [S]ince a 3-line function can do the job, it might make just as
 > > much sense to put up a package on PyPI.

 > I hardly think it is worth the effort, for either the producer 
 > or consumers, of putting a 3-line function on PyPI.  Nor would 
 > such a solution address the discoverability and ease-of-use 
 > problems I am complaining about.

Agreed that it's pretty weak, but it's not clear that other solutions
will be much better in practice.  Discoverability depends on
documentation, which can be written and improved.

I think "ease of use" is way off-target.

 > I presume that would be a standard library change (in either the io
 > or sys modules) and offered a .set_encoding() method as a
 > placeholder for discussion.

Changing the stdlib is not a panacea.  In particular, it can't be
applied to older Pythons.  I'm also not convinced (cf. Nick's post)
that there's enough value-added and a good name for the restricted
functionality we know we can provide.

 > An inferior and bare minimum way to address this would be to at
 > least add a note about how to change the encoding to the sys.std*
 > documentation.  That encourages cargo-cult programming and doesn't
 > address the WTF effect but it is at least better than the current
 > state of affairs.

IMO, this may be the best, but again I doubt it can be added to older

As for the "cargo cult" and "WTF" issues, I have little sympathy for
either.  The real WTF problem is that multi-encoding environments are
inherently complex and irregular (ie, a WTF waiting to happen), and
Python can't fix that.  It's very unlikely that typical programmers
will bother to understand what happens "under the hood" of a stdlib
function/method, so that is no better than cargo-cult programming (and
cargo-cult at least has the advantage that what is being done is
explicit, allowing programmers who understand textio but not encodings
to figure out what's happening).

From ncoghlan at  Fri Jun  8 13:53:03 2012
From: ncoghlan at (Nick Coghlan)
Date: Fri, 8 Jun 2012 21:53:03 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 8, 2012 at 7:44 PM, Rob Cliffe <rob.cliffe at> wrote:
> I think a better scheme would be to have more meaningful keywords or
> keyword-combinations, e.g.
> for x in iterable:
>    # do stuff
> ifempty:  #  or perhaps ifnoiter: (also applicable to while loops)
>    # do stuff
> #ifbreak:
>    # do stuff
> #ifnobreak:
>    # do stuff
> which would give all the flexibility while making it reasonably clear what
> was happening.

The way to be clear would actually be to drop the feature altogether
(as Guido has noted in the past). Then TOOWTDI becomes:

    x = _no_data = object()
    result = _not_found = object()
    for x in iterable:
        if acceptable(x):
            result = x
            break
    if x is _no_data:
        # No data!
    if result is _not_found:
        # Nothing interesting!
    # Found a result, process it

That's never going to happen in Python though, due to backwards
compatibility requirements.
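
For reference, a runnable rendering of that sentinel pattern. The classify
name and the returned strings are placeholders for the three comment-only
branches, which is an assumption about how the bodies would be filled in.

```python
_no_data = object()
_not_found = object()

def classify(iterable, acceptable):
    """Hypothetical wrapper around the sentinel-based loop."""
    x = _no_data
    result = _not_found
    for x in iterable:
        if acceptable(x):
            result = x
            break
    if x is _no_data:
        return "no data"               # the loop body never ran
    if result is _not_found:
        return "nothing interesting"   # ran to completion without a match
    return "found %r" % (result,)
```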

FWIW, I wrote an essay summarising some of the thoughts presented in
these threads:


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rob.cliffe at  Fri Jun  8 14:05:38 2012
From: rob.cliffe at (Rob Cliffe)
Date: Fri, 08 Jun 2012 13:05:38 +0100
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

On 08/06/2012 12:53, Nick Coghlan wrote:
> On Fri, Jun 8, 2012 at 7:44 PM, Rob Cliffe<rob.cliffe at>  wrote:
>> I think a better scheme would be to have more meaningful keywords or
>> keyword-combinations, e.g.
>> for x in iterable:
>>     # do stuff
>> ifempty:  #  or perhaps ifnoiter: (also applicable to while loops)
>>     # do stuff
>> #ifbreak:
>>     # do stuff
>> #ifnobreak:
>>     # do stuff
>> which would give all the flexibility while making it reasonably clear what
>> was happening.
> The way to be clear would actually be to drop the feature altogether
> (as Guido has noted in the past). Then TOOWTDI becomes:
>      x = _no_data = object()
>      result = _not_found = object()
>      for x in iterable:
>          if acceptable(x):
>              result = x
>              break
>      if x is _no_data:
>          # No data!
>      if result is _not_found:
>          # Nothing interesting!
>      # Found a result, process it
> That's never going to happen in Python though, due to backwards
> compatibility requirements.
> FWIW, I wrote an essay summarising some of the thoughts presented in
> these threads:
> Cheers,
> Nick.
Fair enough, but I think my more compact versions are more readable.

From ncoghlan at  Fri Jun  8 14:12:47 2012
From: ncoghlan at (Nick Coghlan)
Date: Fri, 8 Jun 2012 22:12:47 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 8, 2012 at 10:05 PM, Rob Cliffe <rob.cliffe at> wrote:
> Fair enough, but I think my more compact versions are more readable.

At the expense of adding 3 new keywords to the language for something
that can already be handled with ordinary variable assignments. They
would add no real expressive power to the language, so they just
become another special case for newcomers to learn. Not a good
trade-off.

With the benefit of hindsight, we can also see that supporting the
"else" clause on loops wasn't a good trade-off either (given the
confusion it can cause). However, since the mistake has already been
made, a lot of code out in the wild relies on it and those of us that
quite like the construct are used to having it available, it's not
worth the hassle of deprecating and removing it.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From stephen at  Fri Jun  8 14:53:18 2012
From: stephen at (Stephen J. Turnbull)
Date: Fri, 08 Jun 2012 21:53:18 +0900
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

Rob Cliffe writes:
 > On 08/06/2012 10:04, Nick Coghlan wrote:

 > > Would it be worth adding the "except break:" clause to the language
 > > just to make it crystal clear what is actually going on?

-1 I understood what "except break:" was supposed to mean when I read
it the first time, but now I don't any more.

 > > I don't think so, but it's still a handy way to explain the
 > > semantics while gently steering people away from linking for/else
 > > and if/else too closely.

My main point about documentation is that for/else and if/else should
not be linked directly, but rather via while/else.

 > I think a better scheme would be to have more meaningful keywords or 
 > keyword-combinations, e.g.
 > for x in iterable:
 >      # do stuff
 > ifempty:  #  or perhaps ifnoiter: (also applicable to while loops)
 >      # do stuff
 > #ifbreak:
 >      # do stuff
 > #ifnobreak:
 >      # do stuff
 > which would give all the flexibility while making it reasonably clear 
 > what was happening.

Sure, but that's way overboard for something that's only rarely

From amauryfa at  Fri Jun  8 15:00:00 2012
From: amauryfa at (Amaury Forgeot d'Arc)
Date: Fri, 8 Jun 2012 15:00:00 +0200
Subject: [Python-ideas] Isolated (?transactional) exec (?subinterpreter)
In-Reply-To: <>
References: <>
Message-ID: <>

2012/6/8 anatoly techtonik <techtonik at>

> Optionally isolated from parent environment:
>  - a feature to execute user script in a snapshot of current
>    environment and have a choice whether to merge its modifications
>    back to environment or not

It would be a really interesting feature, but seems very difficult to
Do you have the slightest idea how this would work?
What about global state, environment variables, threads, and all kinds of

Or are you thinking about a solution based on the multiprocessing module?

Amaury Forgeot d'Arc

From ned at  Fri Jun  8 15:13:56 2012
From: ned at (Ned Batchelder)
Date: Fri, 08 Jun 2012 09:13:56 -0400
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

Just to add another attempt at explaining the for/else confusion, the 
analogy that keeps it straight in my mind is that the "else" is really 
paired with the "if .. break" inside the loop:


On 6/8/2012 5:04 AM, Nick Coghlan wrote:
> (context for python-ideas: my recently checked in changes to the
> tutorial, that added the final paragraph to
> On Fri, Jun 8, 2012 at 5:29 PM, Stephen J. Turnbull<stephen at>  wrote:
>> Note: reply-to set to python-ideas.
>> Nick Coghlan writes:
>>   >  The inaccuracies in the analogy are why this is in the tutorial, not the
>>   >  language reference. All 3 else clauses are really their own thing.
>> Nick, for the purpose of the tutorial, actually there are 4 else
>> clauses: you need to distinguish *while* from *for*.  It was much
>> easier for me to get confused about *for*.
> The only thing I'm trying to do with the tutorial update is to
> encourage beginners to start thinking in terms of try/except/else
> when they first encounter for/break/else and while/break/else. That's
> it.
> Yes, ultimately once people fully understand how it works under the
> hood (including the loop-and-a-half construct for infinite while
> loops), they'll realise it's actually closely related to conditionals
> as well, but anyone that places too much weight on the following
> obvious parallel is going to be confused for a long time. After all:
>    if iterable:
>      ...
>    else:
>      ...
> is *very* similar in appearance to:
>    for x in iterable:
>      ...
>    else:
>      ...
> I believe that parallel is 99% of the reason why people get confused
> about the meaning of the latter.
> The point of the tutorial update is to give readers a slight nudge
> towards thinking of the latter as:
>    for x in iterable:
>      ...
>    except break:  # Implicit in the semantics of loops
>      pass
>    else:
>      ...
> Would it be worth adding the "except break:" clause to the language
> just to make it crystal clear what is actually going on? I don't think
> so, but it's still a handy way to explain the semantics while gently
> steering people away from linking for/else and if/else too closely. I
> actually agree all of the else clauses really *are* quite closely
> related (hence the consistent use of the same keyword), but the
> relationship is *not* the intuitively obvious one that comes to mind
> when you just look at the similarity in the concrete syntax
> specifically of for/else and if/else.
> Cheers,
> Nick.

From jacek.masiulaniec at  Fri Jun  8 15:21:59 2012
From: jacek.masiulaniec at (Jacek Masiulaniec)
Date: Fri, 8 Jun 2012 14:21:59 +0100
Subject: [Python-ideas] SysLogHandler: gratuitous data loss
Message-ID: <>


In logging.handlers, SysLogHandler defaults to localhost:514.

In practice, there are systems out there that offer local syslog
service via additional endpoints, for example:


Some systems even ship with UDP endpoint disabled by default, in
which case Python's default is to drop data despite the availability
of these other endpoints.

The /dev/log path in particular is so commonplace that many
system-level utils default to it.  Other languages' syslog libraries
provide support for it, too. [1] [2]

I propose a change to SysLogHandler's default behavior:
1) Try connect(2) against the socket files.
2) Use localhost:514 as a fallback.
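
The two steps could be sketched as follows. The helper name is hypothetical,
and /dev/log is the only socket path taken from this message; any other
candidate paths would have to be supplied by the caller.

```python
import os
import socket

def pick_syslog_address(socket_paths=("/dev/log",), fallback=("localhost", 514)):
    """Hypothetical helper: first connectable local socket, else the UDP default."""
    if hasattr(socket, "AF_UNIX"):          # not available on every platform
        for path in socket_paths:
            if not os.path.exists(path):
                continue
            for kind in (socket.SOCK_DGRAM, socket.SOCK_STREAM):
                probe = socket.socket(socket.AF_UNIX, kind)
                try:
                    probe.connect(path)
                except OSError:
                    continue                # wrong socket type or nobody listening
                finally:
                    probe.close()
                return path
    return fallback
```

The result can be passed straight to the existing constructor, as in
logging.handlers.SysLogHandler(address=pick_syslog_address()), since that
parameter already accepts either a (host, port) tuple or a socket path.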

I believe it's possible to change this interface while remaining




From solipsis at  Fri Jun  8 16:55:17 2012
From: solipsis at (Antoine Pitrou)
Date: Fri, 08 Jun 2012 16:55:17 +0200
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <jqt416$g53$>

On 08/06/2012 11:04, Nick Coghlan wrote:
> The only thing I'm trying to do with the tutorial update is to
> encourage beginners to start thinking in terms of try/except/else
> when they first encounter for/break/else and while/break/else. That's
> it.

I don't see why you're trying to draw that analogy, since a loop has 
nothing in common with a try block. For the record, when I was a Python 
beginner, I had zero problem understanding the for/else construct, and 
it even struck me as very useful ("oh, they've thought about a clean and 
easy way to write search-and-break loops").

I don't think it's useful to think of beginners as people having 
comprehension problems. Besides, if you don't understand something up 
front, there's always the possibility to come back later.



From steve at  Fri Jun  8 17:06:23 2012
From: steve at (Steven D'Aprano)
Date: Sat, 09 Jun 2012 01:06:23 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

Nick Coghlan wrote:

>   for x in iterable:
>     ...
>   except break:  # Implicit in the semantics of loops
>     pass
>   else:
>     ...
> Would it be worth adding the "except break:" clause to the language
> just to make it crystal clear what is actually going on? I don't think
> so, but it's still a handy way to explain the semantics while gently

I agree that it is *not* worthwhile. The main reason is that "except break" 
would add a new and different form of confusion (or at least complication): 
what happens when you return or raise from inside the loop rather than break?

If "except break" *only* executes after a break (like it says!) that opens the 
door to "except return" and "except raise". Bleh. I really don't think we need 
this level of complication in loops.

But if "except break" runs on *any* early exit from the loop (break, return or 
raise), then the name is misleading and confusing and we now have a new and 
exciting education problem to replace the old one. (Albeit probably a simpler
one.)

> steering people away from linking for/else and if/else too closely. I
> actually agree all of the else clauses really *are* quite closely
> related (hence the consistent use of the same keyword), but the

I'm not so sure that they are that close, except in the trivial sense of 
having two alternatives, "A happens, otherwise B happens". In the case of 
for/else, the A is implied (a break, return or raise), which makes it rather 
different from if/else where both alternatives are explicit.

Despite the similarity with try/else, I think it is quite a stretch to link 
the semantics of for/else with the word "else". It simply is not a good choice 
of keyword. If it were, we wouldn't be having this discussion.

Although it would have cost an additional keyword, I think that for/else and 
while/else should have been written as for/then and while/then, since that 
accurately describes what they do (unless you're Dutch *wink*).

for x in seq:
    ...
then:
    ...

There would be no implication that the "then" clause is executed *instead of* 
the loop part, instead the natural implication is that it is executed *after* 
the loop.

(And then we could have introduced an "else" clause to do what people expect 
the else clause to do, namely run if the loop doesn't run.)
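
For reference, a minimal sketch of the current behaviour under discussion
(the function name is made up for illustration): the else suite runs
whenever the loop finishes without a break, and that includes the case
where the loop body never runs at all.

```python
def find_first_negative(numbers):
    """Describe the outcome of a search loop written with for/else."""
    for n in numbers:
        if n < 0:
            outcome = "found %d" % n
            break
    else:
        # Reached only when the loop ended without hitting break,
        # which includes the case where numbers is empty.
        outcome = "none found"
    return outcome

print(find_first_negative([3, -1, 4]))  # break taken, else skipped
print(find_first_negative([3, 1, 4]))   # loop exhausted, else runs
print(find_first_negative([]))          # body never runs, else still runs
```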


From steve at  Fri Jun  8 17:08:33 2012
From: steve at (Steven D'Aprano)
Date: Sat, 09 Jun 2012 01:08:33 +1000
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>
	<jqparl$9ka$>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

Devin Jeanpierre wrote:

> I've never been sure of what is good style here. It's comparable to
> these two things:
> def foo():
>     if bar():
>         return baz
>     return quux
> def foo2():
>     if bar():
>         return baz
>     else:
>         return quux
> Is there some well-accepted rule of which to use?

Not in my opinion. Due to laziness (why write an extra line that isn't 
necessary?), I tend to prefer the first version, but have been known to also 
use the second version on some occasions.


From g.rodola at  Fri Jun  8 17:53:04 2012
From: g.rodola at (Giampaolo Rodolà)
Date: Fri, 8 Jun 2012 17:53:04 +0200
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <jqt416$g53$>
References: <>
Message-ID: <>

2012/6/8 Antoine Pitrou <solipsis at>:
> On 08/06/2012 11:04, Nick Coghlan wrote:
>> The only thing I'm trying to do with the tutorial update is to
>> encourage beginners to start thinking in terms of try/except/else
>> when they first encounter for/break/else and while/break/else. That's
>> it.
> I don't see why you're trying to draw that analogy, since a loop has nothing
> in common with a try block. For the record, when I was a Python beginner, I
> had zero problem understanding the for/else construct, and it even struck me
> as very useful ("oh, they've thought about a clean and easy way to write
> search-and-break loops").
> I don't think it's useful to think of beginners as people having
> comprehension problems. Besides, if you don't understand something up front,
> there's always the possibility to come back later.
> Regards
> Antoine.

I also didn't have problems while I was learning python, and always
found for/else very expressive as a statement.
for/else is not immediately clear, meaning it is mandatory to read the
doc in order to understand what it does and what to expect, but once
you do that then you're done.

--- Giampaolo

From simon.sapin at  Fri Jun  8 17:52:44 2012
From: simon.sapin at (Simon Sapin)
Date: Fri, 08 Jun 2012 17:52:44 +0200
Subject: [Python-ideas] Isolated (?transactional) exec (?subinterpreter)
In-Reply-To: <>
References: <>
Message-ID: <>

On 08/06/2012 15:00, Amaury Forgeot d'Arc wrote:
> It would be a really interesting feature, but seems very difficult to
> implement.
> Do you have the slightest idea how this would work?
> What about global state, environment variables, threads, and all kinds
> of side-effects?
> Or are you thinking about a solution based on the multiprocessing module?

Without any kind of guarantee about side-effects, one could start by
making shallow or deep copies of the namespaces passed to exec.
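
A rough sketch of that idea, assuming the namespace only holds values that
copy.deepcopy() can handle (modules, open files and similar objects cannot
be copied this way). The name exec_isolated is hypothetical, and nothing
here isolates side effects such as file writes or environment changes.

```python
import copy

def exec_isolated(source, namespace):
    """Run source against a deep copy of namespace and return the copy."""
    snapshot = copy.deepcopy(namespace)
    exec(source, snapshot)
    snapshot.pop("__builtins__", None)  # exec() adds this key to globals
    return snapshot

env = {"items": [1, 2], "count": 2}
after = exec_isolated("items.append(3); count = len(items)", env)

print(env)    # the original namespace is untouched
print(after)  # changes are visible only in the copy

# Merging the changes back is a separate, optional step for the caller.
merged = dict(env)
merged.update(after)
```

With a shallow copy (dict(namespace)) rebinding a name would stay isolated,
but the in-place items.append(3) would leak into the original list.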

Simon Sapin

From tjreedy at  Fri Jun  8 18:31:57 2012
From: tjreedy at (Terry Reedy)
Date: Fri, 08 Jun 2012 12:31:57 -0400
Subject: [Python-ideas] Isolated (?transactional) exec (?subinterpreter)
In-Reply-To: <>
References: <>
Message-ID: <jqt9ei$ce$>

On 6/8/2012 4:16 AM, anatoly techtonik wrote:

>    - it should not matter if the code is executed with command line
> interpreter or from exec(),
>      code should not be modified to successfully run in exec if it
> successfully runs in interpreter session

These 4 duplicate issues are all about misuse of exec. The code in each 
*does* run if exec is passed just one namespace instead of two. When 
people pass two separate namespaces, the code executes as if embedded in 
a class definition. Since 'docs at python' never applied my suggested doc 
patch on 13557, I will take a stab at it.
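
A small sketch of the difference (the source string is made up for
illustration). With two namespaces, top-level assignments land in the
locals mapping, while functions defined by the code look names up in the
globals mapping, just as methods cannot see class-body names directly.

```python
source = """
x = 1
def get_x():
    return x
result = get_x()
"""

# One namespace: x is a global for get_x, so the lookup succeeds.
single = {}
exec(source, single)
print(single["result"])

# Two namespaces: x is stored in locs, but get_x searches globs.
globs, locs = {}, {}
try:
    exec(source, globs, locs)
    failure = None
except NameError as exc:
    failure = exc
print(type(failure).__name__)
```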

> Optionally isolated from parent environment:
>    - a feature to execute user script in a snapshot of current
> environment and have
>      a choice whether to merge its modifications back to environment or not

You can do at least some of that now.

Terry Jan Reedy

From guido at  Fri Jun  8 19:37:17 2012
From: guido at (Guido van Rossum)
Date: Fri, 8 Jun 2012 10:37:17 -0700
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$> <jqparl$9ka$>
Message-ID: <>

On Fri, Jun 8, 2012 at 8:08 AM, Steven D'Aprano <steve at> wrote:
> Devin Jeanpierre wrote:
>> I've never been sure of what is good style here. It's comparable to
>> these two things:
>> def foo():
>>     if bar():
>>         return baz
>>     return quux
>> def foo2():
>>     if bar():
>>         return baz
>>     else:
>>         return quux
>> Is there some well-accepted rule of which to use?

> Not in my opinion. Due to laziness (why write an extra line that isn't
> necessary?), I tend to prefer the first version, but have been known to also
> use the second version on some occasions.

It's indeed a very subtle choice, and for simple examples it usually
doesn't much matter.

I tend to like #1 better if the "then" block is small (especially an error
exit or some other "early return" like a cache hit) and the "else"
block is more substantial -- it saves an indentation level. (I also
sometimes reverse the sense of the test just to get the smaller block
first, for this reason.)

When there are a bunch of elif clauses each ending with return (e.g.
emulating a switch) I think it makes more sense to use "else" for the
final clause, for symmetry.

So maybe my gut rule is that if the clauses are roughly symmetrical,
use the else, but if there is significant asymmetry, don't bother.
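
One way to picture that rule of thumb; both functions and the cache are
hypothetical examples, not code from this thread.

```python
_cache = {}

def lookup(key, compute):
    # Asymmetric: a short early exit (cache hit) followed by the
    # substantial part, so the else is dropped to save an indent level.
    if key in _cache:
        return _cache[key]
    value = compute(key)
    _cache[key] = value
    return value

def describe(n):
    # Roughly symmetric branches emulating a switch: keep the final else.
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"
```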

--Guido van Rossum

From mwm at  Fri Jun  8 21:53:05 2012
From: mwm at (Mike Meyer)
Date: Fri, 8 Jun 2012 15:53:05 -0400
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, 08 Jun 2012 21:53:18 +0900
"Stephen J. Turnbull" <stephen at> wrote:
> My main point about documentation is that for/else and if/else should
> not be linked directly, but rather via while/else.

Right. That was the most enlightening comment I saw in this
thread. Writing the if/else and while/else out as:

    if condition:
        # code to run if condition is true
    else:
        # code to run if condition is false

    while condition:
        # code to run while condition is true
    else:
        # code to run when condition is false

Seems obvious enough to me. For is a little bit harder, but still
straightforward if you think about it in terms of the while.

    for x in iterable:
        # code to run while there are objects left in iterable
    else:
        # code to run when there are no objects left in iterable.

Mike Meyer <mwm at>
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail -

From jeanpierreda at  Fri Jun  8 22:08:12 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Fri, 8 Jun 2012 16:08:12 -0400
Subject: [Python-ideas] functools.partial
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jun 7, 2012 at 11:40 PM, Nick Coghlan <ncoghlan at> wrote:
> If you dig up some of the older PEP 362 discussions, you'll find that
> allowing developers to reduce this problem over time is the main
> reason the Signature.bind() method was added to the PEP. While I
> wouldn't recommend it for the base partial type, ... <SNIP>

Why not? It seems like a good idea all around.

-- Devin

From ubershmekel at  Sat Jun  9 00:34:53 2012
From: ubershmekel at (Yuval Greenfield)
Date: Sat, 9 Jun 2012 01:34:53 +0300
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 8, 2012 at 12:04 PM, Nick Coghlan <ncoghlan at> wrote:

> (context for python-ideas: my recently checked in changes to the
> tutorial, that added the final paragraph to
> )
If we're on that subject then I think this

> Loop statements may have an else clause; it is executed when the loop
> terminates through exhaustion of the list (with for) or when the condition
> becomes false (with while), but not when the loop is terminated by a break

Doesn't hit the "break" nail on the head fast and hard enough in my
opinion. I'd replace it with something like:

> Loop statements may have an else clause; it is executed immediately after
> the loop but is skipped if the loop was terminated by a break statement.


From tjreedy at  Sat Jun  9 01:59:46 2012
From: tjreedy at (Terry Reedy)
Date: Fri, 08 Jun 2012 19:59:46 -0400
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <>
References: <jqooj7$s7o$>
	<jqparl$9ka$>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <jqu3m6$4us$>

> Devin Jeanpierre wrote:
>> I've never been sure of what is good style here. It's comparable to
>> these two things:
>> def foo():
>>     if bar():
>>         return baz
>>     return quux
>> def foo2():
>>     if bar():
>>         return baz
>>     else:
>>         return quux
>> Is there some well-accepted rule of which to use?

The rule I have adopted is to omit unneeded after-if else to separate 
preamble stuff -- argument checking -- from the core algorithm, but 
leave it when branching is an essential part of the algorithm. My idea 
is that if the top level structure of the algorithm is an alternation, 
then the code should say so without the reader having to examine the 
contents of the branch.

Example: floating-point square root

def fsqrt(x):
   if not isinstance(x, float):
     raise TypeError
   elif x < 0:
     raise ValueError
   # Now we are ready for the real algorithm
   if x > 1.0:
     return 1.0 / fsqrt(1.0 / x)
   else:
     # iterate
     return result
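
For anyone who wants to run the outline, here is one possible filling-in.
The Newton iteration and the handling of NaN, zero and infinity are
assumptions added for the sketch; only the if/else layout comes from the
outline above.

```python
def fsqrt(x):
    # Preamble: argument checking, with no else after the early exits.
    if not isinstance(x, float):
        raise TypeError("fsqrt() requires a float")
    if x < 0:
        raise ValueError("fsqrt() requires a non-negative value")
    if x != x or x in (0.0, float("inf")):
        return x  # NaN, zero and infinity map to themselves
    # Core algorithm: the alternation is essential, so the else is kept.
    if x > 1.0:
        return 1.0 / fsqrt(1.0 / x)
    else:
        # Newton's method starting from 1.0, which is >= sqrt(x) here,
        # so the guesses decrease until they stop improving.
        guess = 1.0
        while True:
            better = 0.5 * (guess + x / guess)
            if better >= guess:
                return guess
            guess = better

print(fsqrt(0.25), fsqrt(4.0))
```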

Omission of elses can definitely be taken too far. There is in the C 
codebase code roughly with this outline:

if expression:
   # about 15 lines with at least 2 nested ifs (with else omitted)
   # and at least 3 codepaths ending in return
calculation for else but with else omitted

It takes far longer for each reader to examine the if block to determine that
the following block is really an else block than it would have taken one
writer to just put in the "} else {".

Also, some editors allow collapsing of indented blocks, but one cannot do
that if else is omitted.

Of course, it is routine to omit unneeded else after loops.

Terry Jan Reedy

From tjreedy at  Sat Jun  9 02:15:27 2012
From: tjreedy at (Terry Reedy)
Date: Fri, 08 Jun 2012 20:15:27 -0400
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <jqu4jj$ajo$>

On 6/8/2012 6:34 PM, Yuval Greenfield wrote:

>  > Loop statements may have an else clause; it is executed immediately
> after the loop but is skipped if the loop was terminated by a break
> statement.

As I said in my reply on pydev, that is misleading. The else clause 
executes if and when the loop condition is false. Period. Simple rule.

It will not execute if the loop is exited by break OR if the loop is 
exited by return OR if the loop is exited by raise OR if the loop never 
exits. (OR if the loop is aborted by external factors.) As far as else
is concerned, there is nothing special about break exits compared to 
return or raise exits.
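
A compact way to check those cases; the helper scan() and its stop_with
argument are invented for this sketch.

```python
def scan(items, stop_with):
    """Leave the loop in the requested way and report what ran."""
    log = []
    try:
        for item in items:
            if item == "stop":
                if stop_with == "break":
                    break
                if stop_with == "return":
                    return log + ["returned"]
                if stop_with == "raise":
                    raise RuntimeError("stopped")
        else:
            log.append("else ran")
    except RuntimeError:
        log.append("raised")
    return log

for how in ("break", "return", "raise"):
    print(how, scan(["a", "stop"], how))   # "else ran" never appears
print("exhausted", scan(["a", "b"], "break"))
```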

But Nick's doc addition and your alternative imply otherwise. One could 
read Nick's statement and your paraphrase as suggesting that the else 
will be executed if the loop is exited by return (like the finally of
try) or raise (like the except of try). And that is wrong.

Terry Jan Reedy

From ncoghlan at  Sat Jun  9 02:17:24 2012
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Jun 2012 10:17:24 +1000
Subject: [Python-ideas] functools.partial
In-Reply-To: <>
References: <>
Message-ID: <>

On Jun 9, 2012 6:08 AM, "Devin Jeanpierre" <jeanpierreda at> wrote:
> On Thu, Jun 7, 2012 at 11:40 PM, Nick Coghlan <ncoghlan at> wrote:
> > If you dig up some of the older PEP 362 discussions, you'll find that
> > allowing developers to reduce this problem over time is the main
> > reason the Signature.bind() method was added to the PEP. While I
> > wouldn't recommend it for the base partial type, ... <SNIP>
> Why not? It seems like a good idea all around.

Speed, complexity and backwards compatibility. With a layered API, users
can choose whether they want to do early checks or not. If we build it in,
you can't avoid it when you prefer the delayed error to checking the
arguments twice.
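
A sketch of what such a layered helper might look like on Python versions
that ship inspect.signature() (3.3 and later). The name checked_partial is
hypothetical and not part of functools.

```python
import functools
import inspect

def checked_partial(func, *args, **kwargs):
    """Like functools.partial, but fail early if the arguments cannot fit."""
    # bind_partial() raises TypeError when the arguments do not match.
    inspect.signature(func).bind_partial(*args, **kwargs)
    return functools.partial(func, *args, **kwargs)

def greet(greeting, name):
    return "%s, %s!" % (greeting, name)

hello = checked_partial(greet, "Hello")
print(hello("world"))

try:
    checked_partial(greet, "Hello", "world", "extra")
except TypeError as exc:
    print("rejected early:", exc)

# The plain partial accepts the same arguments and only fails when called.
late = functools.partial(greet, "Hello", "world", "extra")
```

Callers who prefer the delayed error keep using functools.partial directly
and pay nothing for the extra check.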


Sent from my phone, thus the relative brevity :)
> -- Devin

From ncoghlan at  Sat Jun  9 02:24:30 2012
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Jun 2012 10:24:30 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <jqu4jj$ajo$>
References: <>
Message-ID: <>

On Jun 9, 2012 10:16 AM, "Terry Reedy" <tjreedy at> wrote:
> On 6/8/2012 6:34 PM, Yuval Greenfield wrote:
>>  > Loop statements may have an else clause; it is executed immediately
>> after the loop but is skipped if the loop was terminated by a break
>> statement.
> As I said in my reply on pydev, that is misleading. The else clause
> executes if and when the loop condition is false. Period. Simple rule.
> It will not execute if the loop is exited by break OR if the loop is
> exited by return OR if the loop is exited by raise OR if the loop never
> exits. (OR if the loop is aborted by external factors.) As far as else is
> concerned, there is nothing special about break exits compared to return or
> raise exits.
> But Nick's doc addition and your alternative imply otherwise. One could
> read Nick's statement and your paraphrase as suggesting that the else will
> be executed if the loop is exited by return (like the finally of try) or
> raise (like the except of try). And that is wrong.

An else clause on a try statement doesn't execute in any of those cases
either. I'm not assuming beginners are idiots, I'm assuming they're making
a perfectly logical connection that happens to be wrong.


Sent from my phone, thus the relative brevity :)
> --
> Terry Jan Reedy

From steve at  Sat Jun  9 04:31:27 2012
From: steve at (Steven D'Aprano)
Date: Sat, 09 Jun 2012 12:31:27 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <jqu4jj$ajo$>
References: <>	<>
Message-ID: <>

Terry Reedy wrote:
> On 6/8/2012 6:34 PM, Yuval Greenfield wrote:
>>  > Loop statements may have an else clause; it is executed immediately
>> after the loop but is skipped if the loop was terminated by a break
>> statement.
> As I said in my reply on pydev, that is misleading.

Why is it misleading? It is *incomplete* insofar as it assumes the reader 
understands that (in the absence of try...finally) a return or raise will 
immediately exit the current function regardless of where in the function that 
return/raise happens to be. I think that's a fair assumption to make.

Other than that, Yuval's description seems both correct and simple to me. It 
precisely matches the semantics of for/else and while/else without introducing 
any additional complexity. The only thing it doesn't do is rationalise why the 
keyword is called "else" instead of a less confusing name.

> The else clause 
> executes if and when the loop condition is false. Period. Simple rule.

What is "the loop condition" in a for-loop? If you mean "when the iterable is 
false (empty)", that's simply incorrect, and is precisely the common error 
that many people make.

If on the other hand you are talking about the reader mentally converting a 
for-loop to an imaginary while-loop in their head, I hardly call that 
"simple". It wouldn't be simple even if for/else loops actually were 
implemented internally as a while loop.
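
To make that mental conversion concrete, here is roughly what it involves.
This is only an emulation under stated assumptions: for_else, the sentinel
and the "break" return convention are invented for the sketch and say
nothing about how the interpreter implements for loops.

```python
def for_else(iterable, body, orelse):
    """Emulate 'for x in iterable: body(x)' plus an else suite."""
    iterator = iter(iterable)
    done = object()                  # sentinel marking exhaustion
    item = next(iterator, done)
    while item is not done:          # the implied "loop condition"
        if body(item) == "break":    # hypothetical way to request a break
            break
        item = next(iterator, done)
    else:
        orelse()                     # condition false: iterator exhausted

events = []
for_else([1, 2], events.append, lambda: events.append("else"))
for_else([], events.append, lambda: events.append("else on empty"))
for_else([1, 2], lambda x: "break", lambda: events.append("never"))
print(events)
```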

If you mean something else, I have no idea what that could possibly be.

> It will not execute if the loop is exited by break OR if the loop is 
> exited by return OR if the loop is exited by raise OR if the loop never 
> exits. (OR if the loop is aborted by external factors.) As far as else
> is concerned, there is nothing special about break exits compared to 
> return or raise exits.

Right. Do we really need to explicitly document all of that under for/else? 
Surely we are allowed to assume a certain basic level of understanding of 
Python semantics -- not every page of the docs has to cover the fundamentals.

This kind of reminds me of the scene in "Red Dwarf" where Holly the computer 
is explaining to Lister that he is the last surviving crew member of the ship
and that everyone else is dead. Paraphrasing:

     Holly: They're all dead. Everybody's dead, Dave.
     Lister: Peterson isn't, is he?
     Holly: Everybody's dead, Dave!
     Lister: Not Chen!
     Holly: Yes, Chen. Everyone. Everybody's dead, Dave!
     Lister: Rimmer?
     Holly: He's dead, Dave. Everybody is dead. Everybody is dead, Dave.
     Lister: Wait. Are you trying to tell me everybody's dead?

Yes Dave, a return will exit a for-loop without executing the code that follows.



From rurpy at  Sat Jun  9 05:39:34 2012
From: rurpy at (Rurpy)
Date: Fri, 8 Jun 2012 20:39:34 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
Message-ID: <>

On 06/07/2012 03:00 PM, Mike Meyer wrote:
> On Thu, Jun 7, 2012 at 4:48 PM, Rurpy <rurpy-/E1597aS9LQAvxtiuMwx3w at> wrote:
>> I suspect the vast majority of
>> programmers are interested in a language that allows
>> them to *effectively* get done what they need to, whether
>> they are working on the latest agile TDD REST server, or
>> modifying some legacy text files.
> Others have raised the question this begs to have answered: how do
> other programming languages deal with wanting to change the encoding
> of the standard IO streams? Can you show us how they do things that's
> so much easier than what Python does?

This is how it seems to be done in Perl: 

 binmode(STDOUT, ":encoding(sjis)");

which seems quite a bit simpler than Python.  I don't
know if it meets your "so much easier" criterion.
A quick trial showed that it works as advertised when
called before any output.  The description of binmode() 
in "man perlfunc" sounds like encoding can be changed 
on-the-fly but my attempt to do so had no effect, so I 
don't know if I'm misinterpreting the text or wrote bad 
Perl code (haven't used it in ages and not interested 
in relearning it right now.)

TCL appears to have on-the-fly encoding changes:

 | encoding system ?encoding?
 |  Set the system encoding to encoding. If encoding is omitted
 |  then the command returns the current system encoding. The system
 |  encoding is used whenever Tcl passes strings to system calls.

I'll see if I can find out about some other languages 
if there continues to be any interest.

>> And even were I to accept your argument, Python is
>> inconsistent: when I open a file explicitly there is
>> only a slight penalty for opening a non-default-encoded
>> file (the need to explicitly give an encoding):
> The proper encoding for the standard IO streams is generally a
> property of the environment, and hence is set in the environment.

"Proper encoding"?  If you said, "Proper default encoding" 
I'd agree with you.  And I'd buy your claim if no one had 
ever invented output redirection and if print output always 
went to a console with a (relatively) fixed encoding.  But 
that is not the case.

> You
> have a use case where that's not the case. The argument is that your
> use case isn't common enough to justify changing the standard library.
> Can you provide evidence to the contrary? 

How exactly do you suggest one accurately quantify 
"commonness"?  And what is the threshold for justification?
It seems to me the strongest argument is the credibility
one that I already made:

1) Programs that accept data input on stdin and write
 data on stdout have a long history and are widely used.
 I hope this is self evident.

2) Encodings other than utf-8 are widely used.  I pointed
 to the commonness of non-utf8 encoding in Japanese web
 pages.  Additionally, Google for "ftp readme ?"
 turns up lots of text files.  Once past the first few 
 pages of Google results (where the web pages are mostly
 utf8) hardly any utf8 files are to be found.

3) An effect of globalization means that many more
 programmers today are dealing with files that have 
 non-native encoding that come from or go to customers,
 vendors, partners and colleagues in other parts of the
 world.  The number of encodings in wide use even within
 a single country (again Japan: utf8, sjis, euc-jp,
 iso2022jp) implies pretty strongly that tools for use
 only in that region will often need multi-encoding

I think connecting the dots above leads to a pretty
high-probability conclusion.

> Other languages that make
> setting the encoding on the standard streams easy, or applications
> outside of those built for your system that have a "--encoding" type
> flag?

iconv, recode and their ilk are obvious examples of 

>> I wasn't suggesting a change to the core level (if by that
>> you mean to the interpreter).  I was asking if some way could
>> be provided (that is easier and more reliable than googling
>> around for a magic incantation) to change the encoding of one
>> or more of the already-open-when-my-program-starts sys.std*
>> streams.  I presume that would be a standard library change
>> (in either the io or sys modules) and offered a .set_encoding()
>> method as a placeholder for discussion.
> Why presume that this needs a change in the library? The method is
> straightforward, if somewhat ugly. Is there any reason it can't just
> be documented, instead of added to the library? Changing the library
> would require a similar documentation change.

Did you miss the paragraph right below the one you quote?
The one in which I said, 

  >> An inferior and bare minimum way to address this would be to 
  >> at least add a note about how to change the encoding to the 
  >> sys.std* documentation.  That encourages cargo-cult programming 
  >> and doesn't address the WTF effect but it is at least better 
  >> than the current state of affairs.

From rurpy at  Sat Jun  9 05:47:36 2012
From: rurpy at (Rurpy)
Date: Fri, 8 Jun 2012 20:47:36 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
Message-ID: <>

On 06/07/2012 06:59 PM, Nathan Schneider wrote:
> On Thu, Jun 7, 2012 at 5:14 PM, Rurpy <rurpy-/E1597aS9LQAvxtiuMwx3w at> wrote:
>> On 06/07/2012 03:45 PM, Nick Coghlan wrote:
>>> level code doesn't want those streams, it needs to
>>> replace them with something else.
>> Yes, this is what the code I googled up does:
>>  import codecs
>>  sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)
> What if codecs contained convenience methods for stdin and stdout?
> I.e. the above could be written more simply as
>   import codecs
>   codecs.encode_stdout(opts.encoding)
> This is much more memorable than the current option, and would also
> make life easier when working with fileinput (whose openhook argument
> can be set to control encoding of input *file* streams, but when it
> falls back to stdin this preference is ignored).

How ironic.  In Python2 I hated having to import codecs
and use (the only thing I ever used from 
the codecs module) rather than just having an encoding
parameter on open().  

But seems like might be a reasonable thing to do.
I'm sure there will be opinions. :-).

It's not just sys.stdout though, the same issue exists
with sys.stdin and sys.stderr so one might want either
three functions, or one function that takes the
stream as a parameter.
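
A sketch of the single-function variant, assuming the stream is an
io.TextIOWrapper that has not been written to or read from yet. The name
reencode_stream is hypothetical; the demonstration uses an in-memory
stand-in instead of the real sys.stdout.

```python
import io

def reencode_stream(stream, encoding, errors="strict"):
    """Return a new text stream over the same binary buffer as stream.

    The old wrapper must not be used afterwards: detach() leaves it in
    an unusable state.
    """
    stream.flush()
    line_buffering = stream.line_buffering
    raw = stream.detach()
    return io.TextIOWrapper(raw, encoding=encoding, errors=errors,
                            line_buffering=line_buffering)

# Demonstration on an in-memory stand-in for sys.stdout:
fake_stdout = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
fake_stdout = reencode_stream(fake_stdout, "shift_jis")
fake_stdout.write("\u65e5\u672c\u8a9e")
fake_stdout.flush()
encoded = fake_stdout.buffer.getvalue()
print(encoded)
```

In a real program the call would look like
sys.stdout = reencode_stream(sys.stdout, "shift_jis"), and the same
function would serve sys.stdin and sys.stderr. Python 3.7 and later also
provide io.TextIOWrapper.reconfigure(), which can change the encoding of
an existing stream in place.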

From rurpy at  Sat Jun  9 05:57:01 2012
From: rurpy at (Rurpy)
Date: Fri, 8 Jun 2012 20:57:01 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
Message-ID: <>

On 06/07/2012 07:01 PM, Nick Coghlan wrote:
> On Fri, Jun 8, 2012 at 10:14 AM, Rurpy <rurpy-/E1597aS9LQAvxtiuMwx3w at> wrote:
>> On 06/07/2012 03:45 PM, Nick Coghlan wrote:
>>> If user level code doesn't want those streams, it needs to
>>> replace them with something else.
>> Yes, this is what the code I googled up does:
>>  import codecs
>>  sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)
>> But that code is not obvious to someone who has been able to do
>> all his encoded IO (with the exception of sys.stdout) using just
>> the encoding parameter of open().  Hence my question if some-
>> thing like a set_encoding() method/function that would work on
>> sys.stdout is feasible.  I don't see an answer to that in your
>> statement above.

First, thanks for the detailed response.

> Right, I was only trying to explain why the standard streams are a
> special case - because they're also used by the interpreter, and it
> makes the startup process much simpler if the interpreter retains
> complete control over the way they're initialised (it's already
> complicated by the fact we need to get something half-usable in place
> as sys.stderr so that error reporting is possible while initialising
> them properly). It then becomes an application level operation to
> replace them if desired.

OK, I can see that as a use-case design principle.  I still
don't see any hard technical reason why the same streams could
not be kept and their encodings simply allowed to be reset if
they haven't been used yet.  In other words, does that
principle provide sufficient value to compensate for ruling
out several possible solutions based on modifying the current
stream rather than rewrapping it?

> We can (and do) make the internal standard stream initialisation
> configurable, but it then becomes a UI design problem to get something
> that balances flexibility against complexity. PYTHONIOENCODING (in
> association with OS utilities that make it possible to set an
> environment variable for a specific process invocation, as well as
> support in the subprocess module for passing a tailored environment to
> subprocesses) is our current solution.
> The interpreter design aims, first and foremost, to provide a simple
> and straightforward experience in POSIX environments that use UTF-8
> everywhere (since that's the most sane approach available for
> migrating from a previously ASCII-based computing world). Windows is a
> bit trickier (due to the internal use of UTF-16 APIs and the lack of
> POSIX-style support for temporarily setting an environment variable
> when invoking a process from the shell), but correctly supporting that
> environment is also a very high priority. The fallback behaviours when
> these situations do not apply are designed to work best on systems
> that are, at least somewhat *locally* consistent.

But networks, shared file systems, email, etc. have all
blurred the concept of localness.  Just because I am running
my program on a Unix machine does not mean I may not need
to write files with '\r\n' line endings.

Perhaps another way to view it is that Python is wrongly 
subsuming part of the problem space into the system space.  
The need to read or write disparate encodings is a function 
of the problem being addressed (which includes how problem
data is encoded just as much as whether it is formatted as 
CSV or as labeled name-value pairs); it's not really a 
function of my local system environment.

> The real world is complex. Eventually, our answer has to be "handle it
> at the application level, there are too many variations for us to
> support it directly at the interpreter level". Currently, any standard
> stream encoding related problem that can't be handled with
> PYTHONIOENCODING is just such a situation. We know it sucks for
> multi-encoding environments, but those are a nightmare for a lot of
> reasons and are the main drivers behind the industry-wide effort to
> standardise on Unicode text handling, including universal encodings
> like UTF-8.

I think "nightmare" is a little too strong.  PITA maybe, 
particularly before one's gotten tools and environment 
worked out.  Eventually one can get used to seeing Windows
path separators displayed as yen signs in cmd.exe windows. :-) 
I think of it as just another annoyance imposed by the 
real world -- like making sure backups run exactly once 
a night even across dst changes.

> So now we're down to the question of how much complexity we're willing
> to tolerate in the interpreter specifically for the sake of
> environments where:
> 1. The automatic standard stream encoding calculation gives the wrong answer
> 2. The PYTHONIOENCODING override is insufficient
> 3. The application being executed isn't already handling the problem
> 4. A -m executable helper module (or directly executable helper
> script) can't be used to initialise the standard streams correctly
> before continuing on to execute the requested application via the
> runpy module

In the options you give above, it seems to me that in all of
them (except 3, and maybe 4; I only use -m for pdb) there
is an implicit assumption that there is a single
encoding that needs to be determined.

But that is wrong.  There are three streams and each
of those streams may need a different encoding.  Python
gets this in the case of explicitly opened files... no
one would dream of having a sys.encoding setting replace
the open(encoding=...) parameter.  What Python is missing
is that the same applies to stdin, stdout and stderr.

PYTHONIOENCODING is fine for what it is; it is just not
meant for my particular issue.
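
For comparison, this is roughly what the PYTHONIOENCODING route looks like
when a parent process launches a Python child. The child code is a
throwaway example, and the variable applies to the child's stdin, stdout
and stderr alike, which is the limitation described above.

```python
import os
import subprocess
import sys

# The child prints three Japanese characters, written as escapes so the
# command line itself stays ASCII.
child_code = r"print('\u65e5\u672c\u8a9e')"

# Tailored environment for this one invocation only.
env = dict(os.environ, PYTHONIOENCODING="shift_jis")
raw = subprocess.check_output([sys.executable, "-c", child_code], env=env)

# The parent has to know the encoding to decode the bytes it receives.
text = raw.decode("shift_jis").strip()
print(text == "\u65e5\u672c\u8a9e")
```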

My proposal was simply to allow your option (3) to address 
this.  (Or more accurately, that it address this on a near 
equal footing to explicitly opened streams for reasons of 
both ease of use and python api consistency.)

> And the answer is "not much". About the only likely way forward I can
> see for streamlining this situation would be to treat this as another
> use case for, which proposes the
> ability to run snippets of Python code prior to execution of __main__.

That (IIUC) would not be workable for my problem.

  ./ -e sjis,sjis [other options...]

is acceptable.  Something like:

  python -C 'sys.stdin=...; sys.stdout=...' [other options...]

would not be.  And since you mentioned it above, nor would:

  python -m setstdin_sjis -m setstdout_sjis [other options...]

> I do agree that "create a new IO object that is like this old IO
> object but with these settings changed" could probably do with a
> better official API, but such an API needs to be designed with a
> respect for the issues associated with changing encodings "on the fly"
> and ask serious questions about whether or not we should be
> encouraging that practice by making it easier than it is already. I
> thought I had posted a tracker issue to that effect, but I can't find
> it now.

I think that being unable to easily change stream encoding 
before first use is orders of magnitude more important than
being unable to change them on-the-fly.  I mentioned the
latter only because I thought it might fall out naturally
from fixing the first problem, and might occasionally be 
useful.  (I mentioned a couple cases I've encountered but 
even I, who am very much in favor of generality, have to
admit I think the uses are rare.)

I acknowledge though that even a before-first-use api (which
I think could be implemented before an on-the-fly one) would
have to take the possible later existence of the latter into 

From rurpy at  Sat Jun  9 06:07:07 2012
From: rurpy at (Rurpy)
Date: Fri, 8 Jun 2012 21:07:07 -0700 (PDT)
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
Message-ID: <>

On 06/08/2012 05:11 AM, Stephen J. Turnbull wrote:
> Rurpy writes:
>  > Python is inconsistent:
> Yup, and I said there is support for dealing with that inconsistency.
> At least I'm +1 and Nick's +0.5.
> So let's talk about what to do about it.  Nick has a pretty good
> channel on the BDFL, and since he doesn't seem to like an addition to
> the stdlib here, it may not go far.  But I don't see a reason to rule
> out stdlib changes yet.
> As far as I'm concerned, there are three reasonable proposals:

Which were (summarizing, please correct if wrong)

1) A package on PyPI containing a function like
        import codecs
        def rewrap_stream_with_new_encoding(old_stream, encoding):
            new_stream = codecs.getwriter(encoding)(old_stream.buffer)
            return new_stream
 (or maybe three functions for each of the std* streams, 
 without the 'old_stream' parameter?)

2) Modify standard lib.  Add something like a 
 .reset_encoding() method to io.TextIOWrapper?  
 (Name and functionality to be bikeshedded to death.)

3) Modify the standard lib documentation (I assume 
 for sys.std* as described below)

Also 4?) Nathan Schneider suggested a hybrid (1) and
 (2): put the function in the codecs module.
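
For illustration, proposal (1) can be tried against an in-memory stream.
This is only a sketch: `fake_stdout` is a hypothetical stand-in for
sys.stdout, the flush() call is an addition, and codecs.getwriter returns
a StreamWriter rather than a full TextIOWrapper, so the result is probably
not a drop-in replacement in every respect.

```python
import codecs
import io

def rewrap_stream_with_new_encoding(old_stream, encoding):
    # Flush pending text first so it reaches the buffer in the old encoding.
    old_stream.flush()
    # Wrap the underlying binary buffer in a writer for the new encoding.
    return codecs.getwriter(encoding)(old_stream.buffer)

# Hypothetical stand-in for sys.stdout: a text wrapper over a bytes buffer.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding="utf-8")

sjis_out = rewrap_stream_with_new_encoding(fake_stdout, "shift_jis")
sjis_out.write("日本語")
sjis_out.flush()
encoded = raw.getvalue()  # bytes as they would reach the terminal or file
```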

>  > > [S]ince a 3-line function can do the job, it might make just as
>  > > much sense to put up a package on PyPI.
>  > I hardly think it is worth the effort, for either the producer 
>  > or consumers, of putting a 3-line function on PyPI.  Nor would 
>  > such a solution address the discoverability and ease-of-use 
>  > problems I am complaining about.
> Agreed that it's pretty weak, but it's not clear that other solutions
> will be much better in practice.

If (and when) I had the problem of figuring out how 
to change sys.stdout encoding PyPI would be (and was)
the last place I'd look.  It is just not the kind of 
problem one looks to a package to solve.  Rather like 
looking in PyPI if you want to capitalize a string.

Where I would look is where I did: 
* The Python docs io module.
* Then the sys module docs for std*.  They say how to change
 the buffering and how to change to binary.  They also say 
 how the default encoding is determined.  For this reason,
 this is where I would put any note about changing the encoding.
* Finally the internet.
* Had I not found an answer there I would have posted to 
 c.l.p.  I don't think I'd have looked on PyPI unless something 
 explicitly pointed me there.

> Discoverability depends on
> documentation, which can be written and improved.

Documentation where?

> I think "ease of use" is way off-target.

I would think ease of use would always be a consideration 
in any api change users were exposed to.  Or are you saying
some api's should be discouraged and making them hard to
use is better than a "not recommended" note in the documentation?
If so I suspect we'll just have to agree to disagree on that.

And in this case I don't even see any reason to disrecommend 
it -- writing to sys.stdout is the best answer in the circumstances
I've described.  

>  > I presume that would be a standard library change (in either the io
>  > or sys modules) and offered a .set_encoding() method as a
>  > placeholder for discussion.
> Changing the stdlib is not a panacea.  In particular, it can't be
> applied to older Pythons.  I'm also not convinced (cf. Nick's post)
> that there's enough value-added and a good name for the restricted
> functionality we know we can provide.

Nothing is ever a panacea.  It seems like it could be
the cleanest, nicest (long term) solution but clearly
the most difficult.

>  > An inferior and bare minimum way to address this would be to at
>  > least add a note about how to change the encoding to the sys.std*
>  > documentation.  That encourages cargo-cult programming and doesn't
>  > address the WTF effect but it is at least better than the current
>  > state of affairs.
> IMO, this may be the best, but again I doubt it can be added to older
> versions.

Does it need to be?  I'd have thought this would just
be a doc issue on the tracker (although perhaps getting 
agreement of the wording would be hard?)

> As for the "cargo cult" and "WTF" issues, I have little sympathy for
> either.  The real WTF problem is that multi-encoding environments are
> inherently complex and irregular (ie, a WTF waiting to happen), and
> Python can't fix that. 
But the WTF comes not from multi-encoding (in which 
case it would have occurred when the problem requirements 
were received) but from observing that doing the necessary 
output to a file is easy as pie, but doing the same to 
stdout (another file) isn't.  Python can avoid making a 
less than ideal situation (multi-encoding) worse by not 
making it harder than necessary to do what needs to be done.

> It's very unlikely that typical programmers
> will bother to understand what happens "under the hood" of a stdlib
> function/method, so that is no better than cargo-cult programming

The point though is that programmers don't need to look
under the hood -- the fact that something is in stdlib
means (at least ideally) it is documented as a black box.
What goes in, what comes out, the relationship between
the two and any side effects are all concisely, fully
and accurately described (again, in an ideal world). 
But with a code snippet and a comment that says, "use this
to change the encoding of sys.stdout", the programmer has 
to figure out everything himself. (Of course that's not 
totally bad -- I know a lot more about text IO streams 
than I did 3 days ago. :-)

Sure, you could document the code snippet as well as a
packaged function, but that's stretching our ideal world
well past the breaking point -- it doesn't happen. :-)

> (and
> cargo-cult at least has the advantage that what is being done is
> explicit, allowing programmers who understand textio but not encodings
> to figure out what's happening).

True it's a double edged sword but I prefer to use code
packaged in stdlib.  If I didn't I would cut and paste
from there and I don't :-)

Also, there are programmers who understand encoding but
not textio (I'm one) but I'll concede we are probably a 

From ncoghlan at  Sat Jun  9 08:58:21 2012
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Jun 2012 16:58:21 +1000
Subject: [Python-ideas] functools.partial
In-Reply-To: <>
References: <>
Message-ID: <>

(Added list back to recipients)

On Sat, Jun 9, 2012 at 10:58 AM, Devin Jeanpierre
<jeanpierreda at> wrote:
> On Fri, Jun 8, 2012 at 8:17 PM, Nick Coghlan <ncoghlan at> wrote:
>> Speed, complexity and backwards compatibility. With a layered API, users can
>> choose whether they want to do early checks or not. If we build it in, you
>> can't avoid it when you prefer the delayed error to checking the arguments
>> twice.
> Then maybe the layered API belongs in the stdlib. What's the use of
> base partial type, except as a micro-optimization?

functools.partial will still be used to change the signature of a
callable, the same as it has been ever since it was added. The layered
API runs afoul of "not every three line function needs to be in the
standard library". It's better to add the base API that is difficult
for third parties to provide (in this case, inspect.signature and
Signature objects) and let specific use cases emerge naturally over
time, rather than trying to guess the *exact* patterns in advance.

> On the other hand, I'm not so sure about your complexity argument.
> It's hard to argue against "this isn't worth our time", if that's what
> you're saying. But if you're saying it's too complicated now,
> shouldn't Signature.bind help with that?

I'm saying it makes *functools.partial* more complex, because we're
asking it to do more. We would also be making it impossible to use
without checking every signature twice. Current uses will be a mix of
cases where the lack of early checking is annoying (but tolerable) and
cases where it is undesirable. Adding the checking directly to the
base API means we're assuming that early checking is desirable for
*every* use case, and that's unlikely to be true.

Most importantly though, if we leave the status quo in place for now,
we can change our minds later if we still think it's a good idea. If
we charge ahead and add early checking everywhere immediately, then
we're quite likely to do more harm than good.

We're in this for the long haul, and 2014 really isn't that far away
in the context of programming language evolution.
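
To make the layering concrete, here is a rough sketch of the kind of
three-line helper a third party could build on top of inspect.signature.
The name `checked_partial` and the `scale` example are hypothetical; this
is not a proposed stdlib API.

```python
import functools
import inspect

def checked_partial(func, *args, **kwargs):
    """Like functools.partial, but fail early on arguments that cannot fit."""
    # bind_partial raises TypeError if the arguments can never match func.
    inspect.signature(func).bind_partial(*args, **kwargs)
    return functools.partial(func, *args, **kwargs)

def scale(value, factor):
    return value * factor

double = checked_partial(scale, factor=2)
result = double(21)

try:
    checked_partial(scale, 1, 2, 3)  # too many positional arguments
    early_error = False
except TypeError:
    early_error = True
```

Callers who prefer the delayed error keep using functools.partial directly
and never pay for the extra signature check.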


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From arnodel at  Sat Jun  9 09:46:25 2012
From: arnodel at (Arnaud Delobelle)
Date: Sat, 9 Jun 2012 08:46:25 +0100
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>


(sent from my phone)
On Jun 8, 2012 11:35 PM, "Yuval Greenfield" <ubershmekel at> wrote:
> On Fri, Jun 8, 2012 at 12:04 PM, Nick Coghlan <ncoghlan at> wrote:
>> (context for python-ideas: my recently checked in changes to the
>> tutorial, that added the final paragraph to
> If we're on that subject then I think this
> > Loop statements may have an else clause; it is executed when the loop
terminates through exhaustion of the list (with for) or when the condition
becomes false (with while), but not when the loop is terminated by a break
> Doesn't hit the "break" nail on the head fast and hard enough in my
opinion. I'd replace it with something like:
> > Loop statements may have an else clause; it is executed immediately
after the loop but is skipped if the loop was terminated by a break

Yes. This is why I've been suggesting for a while that we call these
constructs for/break/else and while/break/else.

As Terry says, this is not the whole truth but you'd have to have a warped
mind not to extrapolate the correct behaviour when there is a return or
raise in the loop body.


From ncoghlan at  Sat Jun  9 11:55:09 2012
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Jun 2012 19:55:09 +1000
Subject: [Python-ideas] Replacing the standard IO streams (was Re: changing
 sys.stdout encoding)
Message-ID: <>

So, after much digging, it appears the *right* way to replace a
standard stream in Python 3 after application start is to do the
following:

    sys.stdin = open(sys.stdin.fileno(), 'r', <new settings>)
    sys.stdout = open(sys.stdout.fileno(), 'w', <new settings>)
    sys.stderr = open(sys.stderr.fileno(), 'w', <new settings>)

Ditto for the other standard streams. It seems it already *is* as
simple as with any other file, we just collectively forgot about:

1. The fact open() accepts file descriptors directly in Python 3
2. The fact that text streams still report the underlying file
descriptor correctly

*That* is something we can happily advertise in the standard library
docs. If you could check to make sure it works properly for your use
case and then file a docs bug at to get it added to
the std streams documentation, that would be very helpful.
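
A minimal sketch of that pattern, run against a temporary file instead of
the real standard streams so it can be checked safely. The helper name
`reopen_text_stream`, the explicit flush() and closefd=False are
assumptions added for the example, not part of the recipe above.

```python
import os
import tempfile

def reopen_text_stream(stream, mode, **settings):
    # Push buffered text to the descriptor before a second wrapper shares it.
    stream.flush()
    # closefd=False leaves the descriptor open when the new wrapper is closed.
    return open(stream.fileno(), mode, closefd=False, **settings)

path = os.path.join(tempfile.mkdtemp(), "out.txt")
old = open(path, "w", encoding="ascii", newline="\n")
old.write("before\n")

new = reopen_text_stream(old, "w", encoding="utf-8", newline="\n")
new.write("caf\u00e9\n")
new.flush()

with open(path, "rb") as f:
    data = f.read()

new.close()
old.close()
```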


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From p.f.moore at  Sat Jun  9 13:00:37 2012
From: p.f.moore at (Paul Moore)
Date: Sat, 9 Jun 2012 12:00:37 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 June 2012 10:55, Nick Coghlan <ncoghlan at> wrote:
> So, after much digging, it appears the *right* way to replace a
> standard stream in Python 3 after application start is to do the
> following:
>     sys.stdin = open(sys.stdin.fileno(), 'r', <new settings>)
>     sys.stdout = open(sys.stdout.fileno(), 'w', <new settings>)
>     sys.stderr = open(sys.stderr.fileno(), 'w', <new settings>)
> Ditto for the other standard streams. It seems it already *is* as
> simple as with any other file, we just collectively forgot about:

One minor point - if sys.stdout is redirected, *and* you have already
written to sys.stdout, this resets the file pointer. With as

import sys
print("Hello!")
sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
print("Hello!")

>a gives one line in a, not two (tested on Windows, Unix may
be different). And changing to "a" doesn't resolve this...

Of course, the actual use case is to change the encoding before
anything is written - so maybe a small note saying "don't do this" is
enough. But it's worth mentioning before we get the bug report saying
"Python lost my data" :-)


From p.f.moore at  Sat Jun  9 15:00:03 2012
From: p.f.moore at (Paul Moore)
Date: Sat, 9 Jun 2012 14:00:03 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 June 2012 12:00, Paul Moore <p.f.moore at> wrote:
> On 9 June 2012 10:55, Nick Coghlan <ncoghlan at> wrote:
>> So, after much digging, it appears the *right* way to replace a
>> standard stream in Python 3 after application start is to do the
>> following:
>>     sys.stdin = open(sys.stdin.fileno(), 'r', <new settings>)
>>     sys.stdout = open(sys.stdout.fileno(), 'w', <new settings>)
>>     sys.stderr = open(sys.stderr.fileno(), 'w', <new settings>)
>> Ditto for the other standard streams. It seems it already *is* as
>> simple as with any other file, we just collectively forgot about:
> One minor point - if sys.stdout is redirected, *and* you have already
> written to sys.stdout, this resets the file pointer. With as
> import sys
> print("Hello!")
> sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
> print("Hello!")
> >a gives one line in a, not two (tested on Windows, Unix may
> be different). And changing to "a" doesn't resolve this...

Ignore me - you need to flush stdout before reopening it, is all. Dumb
mistake, sorry for the noise :-(


From jeanpierreda at  Sat Jun  9 16:01:50 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Sat, 9 Jun 2012 10:01:50 -0400
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

On Fri, Jun 8, 2012 at 10:31 PM, Steven D'Aprano <steve at> wrote:
> Why is it misleading? It is *incomplete* insofar as it assumes the reader
> understands that (in the absence of try...finally) a return or raise will
> immediately exit the current function regardless of where in the function
> that return/raise happens to be. I think that's a fair assumption to make.

How can the reader understand that, when the reader doesn't know that
return or raise exist yet? The assumption that the reader understands
basic Python is unreasonable. This is the tutorial.

As I understand the objection, it is misleading in that it puts the
focus on the wrong thing. It says "it's skipped by a break", as if
that were special. It's skipped by a lot of things that aren't
mentioned, the really interesting thing is when it _isn't_ skipped,
which is glossed over. It is implied that this happens whenever it is
exited by anything other than break, but of course that isn't true,
and you have to think "well, what about return and raise?"  However,
as mentioned above, no student will ever think about return and raise,
because those constructs have not been introduced yet. I wonder if
they will just internalize "except not when left by break"? That would
be awful!

Anyway, I'm not really an expert on writing technical documentation, but I
would expect that it's better to not force the reader to remember
information and think about implications, if we can say flat-out
exactly what happens. Even if they can do it successfully, surely it
is annoying?

If you want to mention break up-front, why not reverse the clause
order? Currently the phrasing is this:

    Loop statements may have an else clause; it is executed when the
loop terminates through exhaustion of the list (with for) or when the
condition becomes false (with while), but not when the loop is
terminated by a break statement.

It could also be (something like) this:

    Loop statements may have an else clause; it is not executed when
the loop is terminated by a break statement; it is only executed when
the loop terminates through exhaustion of the list (with for) or when
the condition becomes false (with while).

This reads backwards to me, because the clarifying information is
listed before the main fact. Also it's a terrible sentence (my fault,
the original was long but didn't have three independent clauses). But

Or you could split it up into two sentences:

    Loop statements may have an else clause, which is only executed
when the loop terminates through exhaustion of the list (with for) or
when the condition becomes false (with while). The else clause is
*not* executed when the loop is terminated by a ``break``, or any
other control flow construct you will see.

And so on. Lots of room to play around with how the information gets
across without sacrificing the core fact of how else works.

If that core fact is unworkable and does more harm than good, then I
guess it has to go, though. Lying-to-children is a well-worn and
useful didactic technique.

-- Devin

From zuo at  Sat Jun  9 18:01:13 2012
From: zuo at (Jan Kaliszewski)
Date: Sat, 9 Jun 2012 18:01:13 +0200
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

Nick Coghlan dixit (2012-06-08, 19:04):

>   for x in iterable:
>     ...
>   except break:  # Implicit in the semantics of loops
>     pass
>   else:
>     ...
> Would it be worth adding the "except break:" clause to the language
> just to make it crystal clear what is actually going on? I don't think
> so, but it's still a handy way to explain the semantics while gently
> steering people away from linking for/else and if/else too closely.

IMHO a better option would be a separate keyword, e.g. 'broken':

    for x in iterable:
        ...
    broken:
        ...
    else:
        ...

And not only to make the 'else' more understandable.  I found, in
a few situations, that such a 'broken' clause would be really useful,
making my code easier to read and maintain.  There were some relatively
complex, parsing-related, code structures...

    stopped = False
    for x in iterable:
        ...
        if condition1:
            stopped = True
            break
        ...
        if condition2:
            stopped = True
            break
        ...
        if condition3:
            stopped = True
            break
        ...
    if stopped:
        do_foo()
    else:
        do_bar()

It would have been nice to be able to do:

    for x in iterable:
        ...
        if condition1:
            break
        ...
        if condition2:
            break
        ...
        if condition3:
            break
        ...
    broken:
        do_foo()
    else:
        do_bar()


From python at  Sat Jun  9 18:42:53 2012
From: python at (MRAB)
Date: Sat, 09 Jun 2012 17:42:53 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
Message-ID: <>

On 09/06/2012 12:00, Paul Moore wrote:
> On 9 June 2012 10:55, Nick Coghlan<ncoghlan at>  wrote:
>>  So, after much digging, it appears the *right* way to replace a
>>  standard stream in Python 3 after application start is to do the
>>  following:
>>      sys.stdin = open(sys.stdin.fileno(), 'r',<new settings>)
>>      sys.stdout = open(sys.stdout.fileno(), 'w',<new settings>)
>>      sys.stderr = open(sys.stderr.fileno(), 'w',<new settings>)
>>  Ditto for the other standard streams. It seems it already *is* as
>>  simple as with any other file, we just collectively forgot about:
> One minor point - if sys.stdout is redirected, *and* you have already
> written to sys.stdout, this resets the file pointer. With as
> import sys
> print("Hello!")
> sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
> print("Hello!")
>>a gives one line in a, not two (tested on Windows, Unix may
> be different). And changing to "a" doesn't resolve this...
> Of course, the actual use case is to change the encoding before
> anything is written - so maybe a small note saying "don't do this" is
> enough. But it's worth mentioning before we get the bug report saying
> "Python lost my data" :-)
I find that this:

sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')

prints the string "Hello!\r\r\n", but this:

sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')

prints the string "Hello!\r\nHello!\r\r\n".

I had hoped that the flush would be enough, but apparently not.

From bruce at  Sat Jun  9 18:44:15 2012
From: bruce at (Bruce Leban)
Date: Sat, 9 Jun 2012 09:44:15 -0700
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 8, 2012 at 3:34 PM, Yuval Greenfield <ubershmekel at>
wrote:

> > Loop statements may have an else clause; it is executed when the loop
> terminates through exhaustion of the list (with for) or when the condition
> becomes false (with while), but not when the loop is terminated by a break
> statement.

I don't think talking about exhaustion of the list is the simplest way to
think about this. Isn't it the distinction whether the loop exits at the
bottom or in the middle?

On Sat, Jun 9, 2012 at 12:46 AM, Arnaud Delobelle <arnodel at> wrote:

> As Terry says, this is not the whole truth but you'd have to have a warped
> mind not to extrapolate the correct behaviour when there is a return or
> raise in the loop body.
If we can express this in a way that is the whole truth, that's better. And
the current wording leaves out a very common scenario like return in a loop
and a less common one like raise. Asking readers of technical documentation
to extrapolate
frequently leads to incorrect assumptions. Go read the docs on msdn if you
don't agree with that.

Here's my take:

Loop statements may have an else clause which is executed when the loop
exits normally (control flows off the bottom of the loop). If the loop
exits from the middle (through break, return, raise or something else),
then the else is not executed. It may help to think of the else as being
paired with an "if ... break" in the middle of the loop. If the break is
not executed then the else will be.
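
A small sketch of that wording in action. The function names are made up
for the example; the `events` list just records which branch ran.

```python
def search(items, target):
    events = []
    for item in items:
        if item == target:
            events.append("found")
            break  # exits from the middle: the else is skipped
    else:
        events.append("else ran")  # control flowed off the end of the loop
    return events

def search_with_return(items, target):
    events = []
    for item in items:
        if item == target:
            return events  # leaves the whole function: the else is skipped too
    else:
        events.append("else ran")
    return events

hit = search([1, 2, 3], 2)
miss = search([1, 2, 3], 9)
returned = search_with_return([1, 2, 3], 2)
```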

Likewise I would reword the comparison to try. In particular I would remove
the negative reference to if as I think that's misleading.

The else clause of a loop can also be thought of as similar to the else
clause of a try statement. A try statement's else clause runs when no
exception, break or return occurs and the try exits normally, and a loop's
else clause runs when no break or return occurs and the loop exits
normally. For more on the try statement and exceptions, see Handling

Note that this corrects the error in the current docs which says "a try
statement's else clause runs when no exception occurs" which is not true if
you exit the try via break or return.

--- Bruce

From python at  Sat Jun  9 18:49:48 2012
From: python at (MRAB)
Date: Sat, 09 Jun 2012 17:49:48 +0100
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
Message-ID: <>

On 09/06/2012 17:01, Jan Kaliszewski wrote:
> Nick Coghlan dixit (2012-06-08, 19:04):
>>    for x in iterable:
>>      ...
>>    except break:  # Implicit in the semantics of loops
>>      pass
>>    else:
>>      ...
>>  Would it be worth adding the "except break:" clause to the language
>>  just to make it crystal clear what is actually going on? I don't think
>>  so, but it's still a handy way to explain the semantics while gently
>>  steering people away from linking for/else and if/else too closely.
> IMHO a better option would be a separate keyword, e.g. 'broken':
>      for x in iterable:
>          ...
>      broken:
>          ...
>      else:
>          ...
> And not only to make the 'else' more understandable.  I found, in
> a few situations, that such a 'broken' clause would be really useful,
> making my code easier to read and maintain.  There were some relatively
> complex, parsing-related, code structures...
>      stopped = False
>      for x in iterable:
>          ...
>          if condition1:
>              stopped = True
>              break
>          ...
>          if condition2:
>              stopped = True
>              break
>          ...
>          if condition3:
>              stopped = True
>              break
>          ...
>      if stopped:
>          do_foo()
>      else:
>          do_bar()
That can be re-written as:

     stopped = True
     for x in iterable:
         ...
         if condition1:
             break
         ...
         if condition2:
             break
         ...
         if condition3:
             break
         ...
     else:
         stopped = False
     if stopped:
         do_foo()
     else:
         do_bar()

From steve at  Sat Jun  9 19:48:28 2012
From: steve at (Steven D'Aprano)
Date: Sun, 10 Jun 2012 03:48:28 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>	<>	<>
Message-ID: <>

Bruce Leban wrote:
> On Fri, Jun 8, 2012 at 3:34 PM, Yuval Greenfield <ubershmekel at>
>  wrote:
>>> Loop statements may have an else clause; it is executed when the loop
>> terminates through exhaustion of the list (with for) or when the condition
>> becomes false (with while), but not when the loop is terminated by a break
>> statement.
> I don't think talking about exhaustion of the list is the simplest way to
> think about this. Isn't it the distinction whether the loop exits at the
> bottom or in the middle?

[Aside: I believe that isn't Yuval's description above. As I understand it, he 
is quoting the current docs.]

Loops exit at the top, not the bottom. This is most obvious when you think 
about a while loop:

while condition:
    ...

Of course you have to be at the top of the loop for the while to check 
condition, not the bottom. For-loops are not quite so obvious, but execution 
has to return back to the top of the loop in order to check whether or not the 
sequence is exhausted.

Whether or not it is the *simplest* way to think about for/else, talking about 
exhaustion of the list (iterable) is correct.

> On Sat, Jun 9, 2012 at 12:46 AM, Arnaud Delobelle <arnodel at> wrote:
>> As Terry says, this is not the whole truth but you'd have to have a warped
>> mind not to extrapolate the correct behaviour when there is a return or
>> raise in the loop body.
> If we can express this in a way that is the whole truth that's better. And
> leaving out a very common scenario like return in a loop and a less common
> one like raise. Asking readers of technical documentation to extrapolate
> frequently leads to incorrect assumptions. Go read the docs on msdn if you
> don't agree with that.

Should we ask readers to extrapolate what happens when the for loop variable 
is a keyword (e.g. "for None in sequence"), or explicitly mention what happens?

Should we ask readers to extrapolate what happens when the for loop sequence 
doesn't actually exist, or explicitly tell them that they get a NameError and 
the loop doesn't run? Should we do this for every single function?

Frankly, you cannot avoid asking readers to extrapolate, because there is an 
infinite number of things that they could do, and you cannot possibly document 
them all.


From steve at  Sat Jun  9 19:49:43 2012
From: steve at (Steven D'Aprano)
Date: Sun, 10 Jun 2012 03:49:43 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

Devin Jeanpierre wrote:
> On Fri, Jun 8, 2012 at 10:31 PM, Steven D'Aprano <steve at> wrote:
>> Why is it misleading? It is *incomplete* insofar as it assumes the reader
>> understands that (in the absence of try...finally) a return or raise will
>> immediately exit the current function regardless of where in the function
>> that return/raise happens to be. I think that's a fair assumption to make.
> How can the reader understand that, when the reader doesn't know that
> return or raise exist yet? The assumption that the reader understands
> basic Python is unreasonable. This is the tutorial.

If the reader doesn't know that return or raise exist, they are hardly going 
to draw conclusions about the behaviour of for/else when a return or raise is 

If they do know about return and raise, they should understand that return and 
raise skip everything, not just for/else.

> As I understand the objection, it is misleading in that it puts the
> focus on the wrong thing. It says "it's skipped by a break", as if
> that were special. It's skipped by a lot of things that aren't
> mentioned, the really interesting thing is when it _isn't_ skipped,
> which is glossed over.

I don't think it is glossed over at all. You cut out my quote of Yuval's 
description, here it is again:

     Loop statements may have an else clause; it is executed immediately
     after the loop but is skipped if the loop was terminated by a break

The "really interesting thing" is the first thing about the else clause 
mentioned: it is executed immediately after the loop.

The break statement *really is special* and deserves to be singled out for 
mention. The break statement is the only way to exit the *entire* for-loop 
construct (including the else) without exiting the entire function, or halting 

> It is implied that this happens whenever it is
> exited by anything other than break, but of course that isn't true,

I strongly disagree that it implies anything of the sort. We should assume 
the readers are beginners, but not idiots.

Ignoring try/finally blocks, which are special, we can assume that the reader 
has (or will have once they actually learn about functions and exceptions) a 
correct understanding of the behaviour of return and raise.

- If the loop is exited by a return, *nothing* following the return is 
executed. That includes the else block.

- If execution is halted by an exception (including raise), *nothing* 
following the exception is executed. That includes the else block.

- If execution is halted by an external event that halts or interrupts the 
Python process, *nothing* following executes. That includes the else block.

- If the loop never completes, *nothing* following the loop executes. That 
includes the else block.

To continue the analogy with the "Red Dwarf" quote I made earlier:

"What about assignments after the loop?"
"No, they aren't executed. Nothing is executed."
"Well what about print statements?"
"No Dave, print statements aren't executed. Nothing is executed."
"How about the len() function?"
"No Dave, nothing is executed."
"What, not even the else clause?"

Unless we think that the average beginner to Python is as dumb as Dave Lister 
from Red Dwarf, I don't think we need worry that they will imagine that 
for/else blocks behave like try/finally.

Somehow we've gone from trying to fix an actual, real-life problem where 
people assume that the else block executes if the loop sequence is empty, to 
arguing how best to solve the entirely hypothetical problem that people might 
imagine that else blocks have the special behaviour of try/finally.
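
For reference, a short sketch of the real-life case: the else block runs
after an empty loop, but it runs just the same after a non-empty loop that
finishes without a break. The helper names are hypothetical.

```python
def loop_report(iterable):
    iterations = 0
    ran_else = False
    for _ in iterable:
        iterations += 1
    else:
        ran_else = True  # reached whenever the loop ends without break
    return iterations, ran_else

def countdown(n):
    ran_else = False
    while n > 0:
        n -= 1
    else:
        ran_else = True  # reached once the condition is found false
    return ran_else

empty = loop_report([])
full = loop_report([1, 2, 3])
```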


From zachary.ware+pyideas at  Sat Jun  9 20:16:03 2012
From: zachary.ware+pyideas at (Zachary Ware)
Date: Sat, 9 Jun 2012 13:16:03 -0500
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

I've had a thought on this topic; how would it be to completely leave else
out of the if, for, and while sections, then give else its own section
explaining exactly how it works in each situation where it is applicable?
I'd be happy to write up a sample later this evening if this thought isn't
completely shot down :)

As a side note, I didn't even know there was a while...else construct until
I saw this discussion. I'd heard of for...else, but not with while.

From storchaka at  Sat Jun  9 22:02:19 2012
From: storchaka at (Serhiy Storchaka)
Date: Sat, 09 Jun 2012 23:02:19 +0300
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
Message-ID: <>

On 09.06.12 12:55, Nick Coghlan wrote:
> So, after much digging, it appears the *right* way to replace a
> standard stream in Python 3 after application start is to do the
> following:
>      sys.stdin = open(sys.stdin.fileno(), 'r',<new settings>)
>      sys.stdout = open(sys.stdout.fileno(), 'w',<new settings>)
>      sys.stderr = open(sys.stderr.fileno(), 'w',<new settings>)

     sys.stdin = io.TextIOWrapper(sys.stdin.detach(), <new settings>)
     sys.stdout = io.TextIOWrapper(sys.stdout.detach(), <new settings>)

None of these methods are not guaranteed to work if the input or output 
have occurred before.

From zuo at  Sat Jun  9 22:07:03 2012
From: zuo at (Jan Kaliszewski)
Date: Sat, 9 Jun 2012 22:07:03 +0200
Subject: [Python-ideas] BindError as a built-in TypeError subclass (on the
 margin of PEP 362 discussion)
Message-ID: <>


I think that BindError proposed in PEP 362 could be a built-in TypeError
subclass, raised whenever given arguments do not match a given callable:

1. while using Signature().bind(...) [as proposed in PEP 362],

 and also

2. while using inspect.getcallargs(...)

 and *also*

3. while doing *any* call.


The present behaviour (ad 2. and 3.), i.e. raising TypeError, makes it
hard to differentiate call-argument-related errors from other TypeError
occurrences.

Raising BindError (or ArgumentError? the actual name is disputable of
course), being a TypeError instance, instead -- would make it easier to
implement test suites, RPC mechanisms etc.


From zuo at  Sat Jun  9 22:14:33 2012
From: zuo at (Jan Kaliszewski)
Date: Sat, 9 Jun 2012 22:14:33 +0200
Subject: [Python-ideas] BindError as a built-in TypeError subclass (on
 the margin of PEP 362 discussion)
In-Reply-To: <>
References: <>
Message-ID: <>

Jan Kaliszewski dixit (2012-06-09, 22:07):

> Raising BindError (or ArgumentError? the actual name is disputable of
> course), being a TypeError instance, instead -- would made easier
> implementing test suites, RPC mechanisms etc.

Erratum: s/TypeError instance/TypeError subclass/, sorry.


From breamoreboy at  Sat Jun  9 23:22:41 2012
From: breamoreboy at (Mark Lawrence)
Date: Sat, 09 Jun 2012 22:22:41 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
Message-ID: <jr0es1$ub6$>

On 09/06/2012 21:02, Serhiy Storchaka wrote:
> None of these methods are not guaranteed to work if the input or output
> have occurred before.

That's a double negative so I'm not sure what you meant to say.  Can you 
please rephrase it.  I assume that English is not your native language, 
so I'll let you off :)


Mark Lawrence.

From jeanpierreda at  Sun Jun 10 01:33:21 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Sat, 9 Jun 2012 19:33:21 -0400
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

On Sat, Jun 9, 2012 at 1:49 PM, Steven D'Aprano <steve at> wrote:
> - If the loop is exited by a return, *nothing* following the return is
> executed. That includes the else block.
> - If execution is halted by an exception (including raise), *nothing*
> following the exception is executed. That includes the else block.
> - If execution is halted by an external event that halts or interrupts the
> Python process, *nothing* following executes. That includes the else block.
> - If the loop never completes, *nothing* following the loop executes. That
> includes the else block.
> To continue the analogy with the "Red Dwarf" quote I made earlier:

Please stop mocking your own writing. I wrote nothing like the above.

I said that maybe we should be specific and correct with when else is
called. I didn't say that we should be exhaustive for when it is not.
In fact, the example explanations I gave were not exhaustive.

-- Devin

From steve at  Sun Jun 10 02:52:02 2012
From: steve at (Steven D'Aprano)
Date: Sun, 10 Jun 2012 10:52:02 +1000
Subject: [Python-ideas] BindError as a built-in TypeError subclass (on
 the margin of PEP 362 discussion)
In-Reply-To: <>
References: <>
Message-ID: <>

Jan Kaliszewski wrote:

> The present behaviour (ad 2. and 3.), i.e. raising TypeError, makes it
> hard to differentiate call-argument-related errors from other TypeError
> occurrences.
> Raising BindError (or ArgumentError? the actual name is disputable of
> course), being a TypeError instance, instead -- would made easier
> implementing test suites, RPC mechanisms etc.


Since this will be an error that beginners see (frequently), I suggest 
ArgumentError is more friendly than BindError.


From ncoghlan at  Sun Jun 10 04:26:17 2012
From: ncoghlan at (Nick Coghlan)
Date: Sun, 10 Jun 2012 12:26:17 +1000
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
Message-ID: <>

Calling detach() on the standard streams is a bad idea - the interpreter
uses the originals internally, and calling detach() breaks them.

Sent from my phone, thus the relative brevity :)
On Jun 10, 2012 6:03 AM, "Serhiy Storchaka" <storchaka at> wrote:

> On 09.06.12 12:55, Nick Coghlan wrote:
>> So, after much digging, it appears the *right* way to replace a
>> standard stream in Python 3 after application start is to do the
>> following:
>>     sys.stdin = open(sys.stdin.fileno(), 'r',<new settings>)
>>     sys.stdout = open(sys.stdout.fileno(), 'w',<new settings>)
>>     sys.stderr = open(sys.stderr.fileno(), 'w',<new settings>)
>    sys.stdin = io.TextIOWrapper(sys.stdin.detach(), <new settings>)
>    sys.stdout = io.TextIOWrapper(sys.stdout.detach(), <new settings>)
>    ...
> None of these methods are not guaranteed to work if the input or output
> have occurred before.

From steve at  Sun Jun 10 05:03:46 2012
From: steve at (Steven D'Aprano)
Date: Sun, 10 Jun 2012 13:03:46 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

Devin Jeanpierre wrote:
> Please stop mocking your own writing. I wrote nothing like the above.
> I said that maybe we should be specific and correct with when else is
> called. I didn't say that we should be exhaustive for when it is not.
> In fact, the example explanations I gave were not exhaustive.

You explicitly worried that users will conclude that the else block will run 
"except not when left by break", and stated that the description given earlier 
implies that for/else behaves like try/finally (i.e. that the else clause is 
*only* skipped on a break, but not return or raise).

There is no evidence that users somehow get the impression that for/else 
behaves like try/finally, and I find it completely implausible that they will 
do so in the future. If I'm wrong, the docs can be revised, but until then, in 
my opinion worrying about this is a documentation case of YAGNI.

The current documentation for for/else is already specific and correct. The 
real-life problem Nick is trying to solve is that many people think that the 
else clause implies that it behaves like if/else, and Nick is trying to nudge 
users to think of try/else instead. I think that's a worthy goal. Worrying 
about users reading the tutorial and concluding that for/else will run when 
you exit with a return, not so much.


From jeanpierreda at  Sun Jun 10 05:28:04 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Sat, 9 Jun 2012 23:28:04 -0400
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

On Sat, Jun 9, 2012 at 11:03 PM, Steven D'Aprano <steve at> wrote:
> There is no evidence that users somehow get the impression that for/else
> behaves like try/finally, and I find it completely implausible that they
> will do so in the future. If I'm wrong, the docs can be revised, but until
> then, in my opinion worrying about this is a documentation case of YAGNI.
> The current documentation for for/else is already specific and correct. The
> real-life problem Nick is trying to solve is that many people think that the
> else clause implies that it behaves like if/else, and Nick is trying to
> nudge users to think of try/else instead. I think that's a worthy goal.
> Worrying about users reading the tutorial and concluding that for/else will
> run when you exit with a return, not so much.

You are confused.

A) I was arguing in favor of the current documentation, written by
    Nick Coghlan. You were arguing in favor of Yuval's thing. You appear
    to have forgotten this, and are now agreeing with me.
B) Obviously there is no empirical evidence for anything, because
    Yuval's thing is unpublished, and the current documentation was added
    two days ago to the dev branch of the docs.

-- Devin

From rurpy at  Sun Jun 10 06:22:03 2012
From: rurpy at (Rurpy)
Date: Sat, 9 Jun 2012 21:22:03 -0700 (PDT)
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
	changing sys.stdout encoding)
Message-ID: <>

On 06/09/2012 08:26 PM, Nick Coghlan wrote:
> Calling detach() on the standard streams is a bad idea - the
> interpreter uses the originals internally, and calling detach()
> breaks them.

The documentation for sys.std* specifically describes
using detach() on the standard streams:

| To write or read binary data from/to the standard
| streams, use the underlying binary buffer.

and gives example code.

The only caveat mentioned is that detach() "can raise
AttributeError or io.UnsupportedOperation" if the stream
has been replaced with something that does not support

From at  Sun Jun 10 06:36:36 2012
From: at (Yury Selivanov)
Date: Sun, 10 Jun 2012 00:36:36 -0400
Subject: [Python-ideas] BindError as a built-in TypeError subclass (on
	the margin of PEP 362 discussion)
In-Reply-To: <>
References: <>
Message-ID: <>

On 2012-06-09, at 4:07 PM, Jan Kaliszewski wrote:
> Suggestion
> ==========
> I think that BindError proposed in PEP 362 could be a built-in TypeError
> subclass, raised whenever given arguments do not match a given callable:
> 1. while using Signature().bind(...) [as proposed in PEP 362],
> and also
> 2. while using inspect.getcallargs(...)
> and *also*
> 3. while doing *any* call.
> Rationale
> =========
> The present behaviour (ad 2. and 3.), i.e. raising TypeError, makes it
> hard to differentiate call-argument-related errors from other TypeError
> occurrences.
> Raising BindError (or ArgumentError? the actual name is disputable of
> course), being a TypeError instance, instead -- would made easier
> implementing test suites, RPC mechanisms etc.

That's how it is currently implemented - BindError(TypeError).

I'll mention this in the PEP.


From solipsis at  Sun Jun 10 09:17:02 2012
From: solipsis at (Antoine Pitrou)
Date: Sun, 10 Jun 2012 09:17:02 +0200
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
Message-ID: <jr1htu$pi$>

On 10/06/2012 04:26, Nick Coghlan wrote:
> Calling detach() on the standard streams is a bad idea - the interpreter
> uses the originals internally, and calling detach() breaks them.

Where does it do that? The interpreter certainly shouldn't hardwire the 
original objects internally.

Moreover, your snippet is wrong because if someone replaces the streams 
for a second time, garbage collecting the previous streams will close 
the file descriptors. You should use closefd=False.



From pyideas at  Sun Jun 10 09:32:41 2012
From: pyideas at (Chris Rebert)
Date: Sun, 10 Jun 2012 00:32:41 -0700
Subject: [Python-ideas] BindError as a built-in TypeError subclass (on
 the margin of PEP 362 discussion)
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jun 9, 2012 at 5:52 PM, Steven D'Aprano <steve at> wrote:
> Jan Kaliszewski wrote:
>> The present behaviour (ad 2. and 3.), i.e. raising TypeError, makes it
>> hard to differentiate call-argument-related errors from other TypeError
>> occurrences.
>> Raising BindError (or ArgumentError? the actual name is disputable of
>> course), being a TypeError instance, instead -- would made easier
>> implementing test suites, RPC mechanisms etc.
> +1
> Since this will be an error that beginners see (frequently), I suggest
> ArgumentError is more friendly than BindError.

I note that Ruby also has an ArgumentError, which it raises both for
calls with an incorrect number of arguments and in cases when Python
would raise ValueError.


From steve at  Sun Jun 10 14:00:30 2012
From: steve at (Steven D'Aprano)
Date: Sun, 10 Jun 2012 22:00:30 +1000
Subject: [Python-ideas] BindError as a built-in TypeError subclass (on
 the margin of PEP 362 discussion)
In-Reply-To: <>
References: <>	<>
Message-ID: <>

Chris Rebert wrote:

>> Since this will be an error that beginners see (frequently), I suggest
>> ArgumentError is more friendly than BindError.
> I note that Ruby also has an ArgumentError, which it raises both for
> calls with an incorrect number of arguments and in cases when Python
> would raise ValueError.

Even if I wanted to replace ValueError with ArgumentError (and I don't), we 
couldn't due to backward compatibility.

(Although I suppose ArgumentError could inherit from both TypeError and
ValueError.)

My concept is that errors due to the wrong argument count, duplicate or 
missing keyword arguments, etc. which currently raise TypeError could raise 
ArgumentError, a subclass, instead.

That will make it easier to distinguish "passed the wrong number of
arguments" from "passed the wrong type of argument".
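A minimal sketch of the distinction being proposed. The names ArgumentError
and call_checked are hypothetical and do not exist in Python; the sketch
assumes inspect.signature() (the PEP 362 API) is available, and relies on
Signature.bind() raising TypeError when the arguments do not fit.

```python
import inspect


class ArgumentError(TypeError):
    """Hypothetical subclass: the arguments could not be bound to the parameters."""


def call_checked(func, *args, **kwargs):
    """Call func, reporting binding failures as ArgumentError.

    Errors raised inside the function body (for example a wrong argument
    type) still surface as a plain TypeError.
    """
    try:
        bound = inspect.signature(func).bind(*args, **kwargs)
    except TypeError as exc:
        # Wrong argument count, duplicate or missing keyword arguments, etc.
        raise ArgumentError(str(exc)) from None
    return func(*bound.args, **bound.kwargs)


def add(a, b):
    return a + b
```

Because ArgumentError is a TypeError subclass, existing "except TypeError"
handlers would keep working, while a test suite or RPC layer could catch
ArgumentError alone.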


From steve at  Sun Jun 10 14:04:09 2012
From: steve at (Steven D'Aprano)
Date: Sun, 10 Jun 2012 22:04:09 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

Devin Jeanpierre wrote:
> On Sat, Jun 9, 2012 at 11:03 PM, Steven D'Aprano <steve at> wrote:
>> There is no evidence that users somehow get the impression that for/else
>> behaves like try/finally, and I find it completely implausible that they
>> will do so in the future. If I'm wrong, the docs can be revised, but until
>> then, in my opinion worrying about this is a documentation case of YAGNI.
>> The current documentation for for/else is already specific and correct. The
>> real-life problem Nick is trying to solve is that many people think that the
>> else clause implies that it behaves like if/else, and Nick is trying to
>> nudge users to think of try/else instead. I think that's a worthy goal.
>> Worrying about users reading the tutorial and concluding that for/else will
>> run when you exit with a return, not so much.
> You are confused.

Perhaps I am.

> A) I was arguing in favor of the current documentation, written by
>     Nick Coghlan. You were arguing in favor of Yuval's thing. You appear
>     to have forgotten this, and are now agreeing with me.

The context which has been lost is that Terry Reedy objected to Yuval's 
description of for/else. I replied to Terry's objection, disagreeing, and you 
replied to me, (apparently) disagreeing with my reply. Do you blame me for 
thinking you were agreeing with Terry?

I think that our positions are probably closer than our disagreements might 

> B) Obviously there is no empirical evidence for anything, because
>     Yuval's thing is unpublished, and the current documentation was added
>     two days ago to the dev branch of the docs.

We have anecdotal evidence that many people expect that for/else will execute 
the else clause when the for loop is empty.

We have no anecdotal evidence, or any other evidence, that anyone expects that 
the else clause runs if you return out of the loop.
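For reference, a small sketch of the behaviour under discussion. The function
names are made up for illustration. The else clause runs whenever the loop
finishes without a break, for an empty and a non-empty iterable alike, and is
skipped on break or return.

```python
def scan(numbers):
    """Report which branch ran: 'break' or 'else'."""
    for n in numbers:
        if n < 0:
            outcome = "break"
            break
    else:
        # Reached whenever the loop ends without break,
        # whether or not the iterable was empty.
        outcome = "else"
    return outcome


def first_item(items, log):
    """Return the first item; log when the else clause runs."""
    for item in items:
        return item  # leaves the function, so the else clause is skipped
    else:
        log.append("else ran")
    return None
```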


From masklinn at  Sun Jun 10 15:05:53 2012
From: masklinn at (Masklinn)
Date: Sun, 10 Jun 2012 15:05:53 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing functions?
Message-ID: <>

The standard library already provides for cryptographic hashes (hashlib)
and MACs (hmac).

One issue which exists, and has been repeatedly outlined after several
breaches of straight-hashed databases (salted and unsalted) last week,
is that many developers do not know:

1. straight hashes are not sufficient to store passwords securely in
   case of database breach
2. salted passwords, while mitigating rainbow table attacks, aren't
   enough to mitigate brute-force attacks.

(in case of database breach, the goal being to protect password
plaintexts from being found and matched to a user identity in case users
re-use passwords across services, as it would allow attackers to access
all services used by the user).

The best solution to these currently is *mandatory* salting (of
specified minimum strength) and adaptive workload which can be tuned
higher to keep up with Moore's law (especially as most hashing functions
tend to be very fast and embarrassingly parallelizable, two undesirable
properties in the face of brute-forcing of the plaintext).

Therefore, I would suggest either adding a new module (name tbd) or
adding new constructors to hashlib.

* All password-hashing functions listed below should recommend a strong
  salt (the PBKDF2 specification recommends 64 bits, we could go further)
  by erroring out (ValueError) if the conditions are not met unless a
  `weak_salt=True` parameter is provided. I think this would be sufficient
  to hint at the importance of salt to users, and to drive them to "the
  right thing".

  The salt should also be mandated non-empty, providing an empty salt
  should generate an error in all cases.

* All password-hashing functions should require a `workload` parameter
  with documentary recommendation. A default value might make sense in
  the short run (ensure the functions are used with an acceptably high
  workload), but those defaults would be set in stone for users *not*
  setting their own load factor.
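A rough sketch of how the two requirements above might look as an API. The
function name hash_password, the MIN_SALT_BYTES constant and the SHA-256
default are assumptions made for illustration only. The sketch leans on
hashlib.pbkdf2_hmac, which was added to the stdlib after this thread.

```python
import hashlib
import os

# 64 bits, the minimum the post attributes to the PBKDF2 specification.
MIN_SALT_BYTES = 8


def hash_password(password, salt, workload, *, weak_salt=False,
                  digest="sha256"):
    """Derive a key, refusing empty salts and (by default) short ones.

    workload is a required argument: there is deliberately no default.
    """
    if not salt:
        # An empty salt is an error in all cases, even with weak_salt=True.
        raise ValueError("salt must not be empty")
    if len(salt) < MIN_SALT_BYTES and not weak_salt:
        raise ValueError("salt shorter than %d bytes; pass weak_salt=True "
                         "to override" % MIN_SALT_BYTES)
    return hashlib.pbkdf2_hmac(digest, password, salt, workload)


def new_salt(size=16):
    """Generate a random salt from the operating system's CSPRNG."""
    return os.urandom(size)
```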

This module (or addition) should provide, if possible:

* PBKDF2, recommending a load factor of above 10000. The recommended
  load factor in RFC 2898 (PKCS #5) is 1000, but the specification
  is 12 years old. Extrapolating on that original load factor using
  Moore's law (the load factor has a linear relation to the amount 
  of computation in PBKDF2 as it's the number of hashing iterations),
  the stdlib could recommend a load factor of 64000 (6 doublings).

  As with hmac, it should be possible to configure the digest
  constructor (PKCS #5 specifies HMAC-SHA1 as the default PRF)

* bcrypt, the bcrypt C library is BSD-licensed and open-source so it
  could be added pretty directly, there is already a wrapper called
  "py-bcrypt" (under ISC/BSD licence)[0] 

* scrypt is younger and has been looked at less than the previous
  two[0], but from my readings (of articles on it, I am no cryptographer)
  it seems to have no overt issue and combines load-adaptive CPU-hardness
  with load-adaptive memory-hardness (PBKDF2 and bcrypt both work
  in constant space) making it significantly more resistant to
  massively parallel brute-forcing arrays (GPGPU or custom ASIC).

  It is available under a 2-clause BSD license as are the existing Python
  bindings I could find[2], but has a hard dependency on OpenSSL which may
  prevent its usage.
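The load-factor extrapolation in the PBKDF2 item can be checked with a few
lines. The two-year doubling period is an assumption implied by "6 doublings"
over 12 years; the function name is made up for illustration.

```python
def extrapolated_iterations(base_iterations, years_elapsed,
                            years_per_doubling=2):
    """Scale an iteration count by one doubling per doubling period."""
    doublings = years_elapsed // years_per_doubling
    return base_iterations * 2 ** doublings


# RFC 2898 baseline of 1000 iterations, 12 years later:
# 12 // 2 = 6 doublings, and 1000 * 2**6 = 64000.
recommended = extrapolated_iterations(1000, 12)
```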

I think these would make Python users safe by lowering the
cost of using these functions and by demonstrating ways to safely
store passwords up-front. They could be augmented with a note in
hashlib indicating that they are to be preferred for password hashing.

[0] especially PBKDF2, still the most conservatively safe choice

From ncoghlan at  Sun Jun 10 15:16:24 2012
From: ncoghlan at (Nick Coghlan)
Date: Sun, 10 Jun 2012 23:16:24 +1000
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <jr1htu$pi$>
References: <>
Message-ID: <>

On Sun, Jun 10, 2012 at 5:17 PM, Antoine Pitrou <solipsis at> wrote:
> On 10/06/2012 04:26, Nick Coghlan wrote:
>> Calling detach() on the standard streams is a bad idea - the interpreter
>> uses the originals internally, and calling detach() breaks them.
> Where does it do that? The interpreter certainly shouldn't hardwire the
> original objects internally.

At the very least, sys.__std(in/out/err)__.  Doing "sys.stderr =
io.TextIOWrapper(sys.stderr.detach(), line_buffering=True)" also seems
to suppress display of exception tracebacks at the interactive prompt
(perhaps the default except hook is using a cached reference?). I
believe Py_FatalError and other APIs that are used deep in the
interpreter won't respect the module level setting.

Basically, it's dangerous to use detach() on a stream where you don't
hold the sole reference, and the safest approach with the standard
streams is to assume that other code is holding references to them.
Detaching the standard streams is just as likely to cause problems as
closing them.

> Moreover, your snippet is wrong because if someone replaces the streams for
> a second time, garbage collecting the previous streams will close the file
> descriptors. You should use closefd=False.

True, although that nicety is all the more reason to encapsulate this
idiom in a new IOBase.reopen() method:

    def reopen(self, mode=None, buffering=-1, encoding=None,
               errors=None, newline=None, closefd=False):
        if mode is None:
            mode = getattr(self, 'mode', 'r')
        return open(self.fileno(), mode, buffering, encoding, errors,
                    newline, closefd)
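Since IOBase.reopen() is only a proposal, here is a free-function sketch of
the same idiom that can be tried on an ordinary file. The demo() helper and
the temporary file are illustrative assumptions, not part of the proposal.

```python
import os
import tempfile


def reopen(stream, mode=None, buffering=-1, encoding=None,
           errors=None, newline=None, closefd=False):
    """Open a new file object on the descriptor used by stream."""
    if mode is None:
        mode = getattr(stream, "mode", "r")
    return open(stream.fileno(), mode, buffering, encoding, errors,
                newline, closefd)


def demo():
    """Write one character through each wrapper; return the file's bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        original = open(path, "w", encoding="utf-8")
        latin = reopen(original, encoding="latin-1")
        latin.write("\u00e9")
        latin.close()  # flushes, but closefd=False leaves the descriptor open
        original.write("\u00e9")  # still usable: same descriptor, new offset
        original.close()
        with open(path, "rb") as check:
            return check.read()
    finally:
        os.remove(path)
```

With closefd=False, closing or garbage collecting the replacement leaves the
underlying descriptor open, which is the point Antoine raised.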


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Jun 10 15:22:52 2012
From: ncoghlan at (Nick Coghlan)
Date: Sun, 10 Jun 2012 23:22:52 +1000
Subject: [Python-ideas] BindError as a built-in TypeError subclass (on
 the margin of PEP 362 discussion)
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jun 10, 2012 at 10:00 PM, Steven D'Aprano <steve at> wrote:
> My concept is that errors due to the wrong argument count, duplicate or
> missing keyword arguments, etc. which currently raise TypeError could raise
> ArgumentError, a subclass, instead.
> That will make distinguishing between "passed the wrong number of arguments"
> from "passed the wrong type of argument" easier.

This is actually why I prefer "BindError" to the name "ArgumentError".

The former is explicit about what has gone wrong: the supplied
arguments could not be bound to the parameters expected by the
supplied callable.

"ArgumentError", on the other hand, could easily refer to any of:
- failing to bind the supplied arguments to the expected parameters
(currently TypeError, will be BindError when using PEP 362)
- one or more of the arguments is of the wrong type (currently TypeError)
- one or more of the arguments has an unacceptable value (currently ValueError)

While I don't think the PEP should be held up over it, the idea of
making BindError a builtin exception and also raising it in the
interpreter's internal parameter binding code is certainly an
interesting idea to explore in the future.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Jun 10 15:36:43 2012
From: ncoghlan at (Nick Coghlan)
Date: Sun, 10 Jun 2012 23:36:43 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

On Sun, Jun 10, 2012 at 10:04 PM, Steven D'Aprano <steve at> wrote:
> We have anecdotal evidence that many people expect that for/else will
> execute the else clause when the for loop is empty.
> We have no anecdotal evidence, or any other evidence, that anyone expects
> that the else clause runs if you return out of the loop.

Right. We also need to remember that this entire discussion started
with a complaint regarding an apparent internal inconsistency in the
language, because the else clauses on if statements and loops don't
mean exactly the same thing. When you read the tutorial, it introduces
the first two forms together, but the third form (try/except/else)
doesn't show up until a later chapter on exception handling. This was
quite possibly one of the factors leading people to make a perfectly
reasonable intuitive leap that happens to be wrong.

All my docs addition is designed to do is discourage readers from
making that incorrect intuitive leap. They will still need to learn
how the else clauses interact with other constructs, like exceptions
and early returns, but those details aren't relevant to building a
fence across the tempting-but-wrong path from "if <iterable>/else" to
"for x in <iterable>/else".

It's a tricky educational problem to be sure, and if it wasn't for
backwards compatibility requirements, there would be a strong
temptation to just drop the else clause from loops entirely. The
versions that use sentinel values instead aren't *that* complicated,
and have the virtue of being explicit. However, that's not going to
happen (it would break too much code without a sufficiently compelling
justification), so making small tweaks to the relevant tutorial docs
(that will hopefully be picked up by Python instructors and other
learning and teaching resources) is a reasonable way forward.
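To illustrate the comparison with sentinel values, here is the same search
written both ways. The function names are made up for illustration.

```python
def lookup_with_else(items, target):
    """Search using the loop's else clause."""
    for item in items:
        if item == target:
            outcome = "found"
            break
    else:
        outcome = "missing"
    return outcome


def lookup_with_sentinel(items, target):
    """Same search, with an explicit flag instead of else."""
    found = False  # sentinel: flipped only when the loop breaks
    for item in items:
        if item == target:
            found = True
            break
    if found:
        outcome = "found"
    else:
        outcome = "missing"
    return outcome
```

The sentinel version is a few lines longer, but the condition under which
"missing" is reported is spelled out rather than implied by the absence of a
break.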


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From simon.sapin at  Sun Jun 10 16:17:22 2012
From: simon.sapin at (Simon Sapin)
Date: Sun, 10 Jun 2012 16:17:22 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On 10/06/2012 15:05, Masklinn wrote:
> The standard library already provides for cryptographic hashes (hashlib)
> and MACs (hmac).
> [snip]
> Therefore, I would suggest either adding a new module (name tbd) or
> adding new constructors to hashlib.

PBKDF2 can be implemented in 15 lines of code based on the hmac and 
hashlib modules:

Although the code is short, it is easy to get wrong. So I think it would 
be nice to have in the stdlib, tested once and for all.
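To give a sense of scale, here is a minimal sketch of PBKDF2 built on hmac
and hashlib. It is not the implementation referred to above, only an
illustration of the algorithm as described in RFC 2898; the function name and
the SHA-1 default are choices made for this sketch.

```python
import hashlib
import hmac
import struct


def pbkdf2(password, salt, iterations, dklen, digest=hashlib.sha1):
    """Derive dklen bytes from password and salt (both bytes)."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    hlen = digest().digest_size
    block_count = -(-dklen // hlen)  # ceiling division
    blocks = []
    for index in range(1, block_count + 1):
        # U_1 = PRF(password, salt || INT_32_BE(index))
        u = hmac.new(password, salt + struct.pack(">I", index), digest).digest()
        t = int.from_bytes(u, "big")
        for _ in range(iterations - 1):
            # U_n = PRF(password, U_{n-1}); T = U_1 xor U_2 xor ... xor U_c
            u = hmac.new(password, u, digest).digest()
            t ^= int.from_bytes(u, "big")
        blocks.append(t.to_bytes(hlen, "big"))
    return b"".join(blocks)[:dklen]
```

On Python versions that ship hashlib.pbkdf2_hmac, the two can be compared
directly as a sanity check.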

Also, PBKDF2 is a well-defined spec that will not change (or it will be 
called PBKDF3 or something) which I think makes it a good fit for the 
stdlib.

I would suggest to have Armin's implementation (linked above) included 
as-is, but it's probably too late for 3.3.

Simon Sapin

From storchaka at  Sun Jun 10 16:34:08 2012
From: storchaka at (Serhiy Storchaka)
Date: Sun, 10 Jun 2012 17:34:08 +0300
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <jr0es1$ub6$>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <jr2b88$1p0$>

On 10.06.12 00:22, Mark Lawrence wrote:
> On 09/06/2012 21:02, Serhiy Storchaka wrote:
>> None of these methods are not guaranteed to work if the input or output
>> have occurred before.
> That's a double negative so I'm not sure what you meant to say. Can you
> please rephrase it. I assume that English is not your native language,
> so I'll let you off :)

open(sys.stdin.fileno()) is not guaranteed to work if input or output 
has already occurred, and neither is io.TextIOWrapper(sys.stdin.detach()). 
The internal buffer of sys.stdin can contain characters that were read but 
not yet used. The internal buffer of sys.stdin.buffer can contain bytes 
that were read but not yet used. With a multibyte encoding, the internal 
buffer of sys.stdin.decoder can contain an incomplete multibyte character.
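A small sketch of the buffering problem, using a pipe in place of the real
stdin so it can be run safely. The function name and sample data are
illustrative assumptions.

```python
import os


def demo_lost_input():
    """Show that input buffered by the old wrapper is invisible to a new one."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"first\nsecond\n")
    os.close(write_fd)

    original = open(read_fd, "r", encoding="ascii")
    # readline() returns one line, but the buffered layers have already
    # pulled everything available out of the descriptor.
    first = original.readline()

    replacement = open(original.fileno(), "r", encoding="ascii",
                       closefd=False)
    rest = replacement.read()    # the descriptor itself has nothing left
    leftover = original.read()   # the second line is still in the old buffers

    replacement.close()
    original.close()
    return first, rest, leftover
```

Here the second line is not lost for good, since the old wrapper is still
around to read it, but code that only holds the replacement stream never sees
it.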

From storchaka at  Sun Jun 10 16:45:02 2012
From: storchaka at (Serhiy Storchaka)
Date: Sun, 10 Jun 2012 17:45:02 +0300
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
Message-ID: <jr2bsn$5rn$>

On 10.06.12 05:26, Nick Coghlan wrote:
> Calling detach() on the standard streams is a bad idea - the interpreter
> uses the originals internally, and calling detach() breaks them.

If the interpreter uses the standard streams, it uses the raw C streams 
(FILE *) stdin/stdout/etc. Calling open(sys.stdin.fileno()) bypasses the 
internal buffering in sys.stdin, sys.stdin.buffer, sys.stdin.decoder and the 
raw C stdin (if it is used at a lower level), and can lose or break 
multibyte characters.

From ncoghlan at  Sun Jun 10 17:28:20 2012
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Jun 2012 01:28:20 +1000
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 11, 2012 at 12:17 AM, Simon Sapin <simon.sapin at> wrote:
> On 10/06/2012 15:05, Masklinn wrote:
>> The standard library already provides for cryptographic hashes (hashlib)
>> and MACs (hmac).
>> [snip]
>> Therefore, I would suggest either adding a new module (name tbd) or
>> adding new constructors to hashlib.
> PBKDF2 can be implemented in 15 lines of code based on the hmac and hashlib
> modules:
> Although the code is short, it is easy to get wrong. So I think it would be
> nice to have in the stdlib, tested once and for all.
> Also, PBKDF2 is a well-defined spec that will not change (or it will be
> called PBKDF3 or something) which I think makes it a good fit for the
> stdlib.
> I would suggest to have Armin's implementation (linked above) included
> as-is, but it's probably too late for 3.3.

It's cutting it very fine relative to the beta feature freeze (which
is in a couple of weeks), but it could still make it in as a very
reasonable addition to the standard library.

The hmac module has already been enhanced with a "secure_compare"
function for 3.3 to perform string and byte sequence comparisons that
don't leak as much information about the expected result under timing
attacks (it still leaks the expected length, but beyond that the
running time of the comparison should be constant for a given digest

Since the PBKDF2 key derivation requires hmac, and hmac depends on
hashlib (to provide the default hash algorithm for hmac.HMAC), I
believe the best way to expedite this would be to:

1. Create an issue on proposing just the binary
version of pbkdf2 as an enhancement to hmac
2. Attach a patch that updates Lib/, Lib/test/ and
Doc/library/hmac.rst accordingly (this will likely require changes to
work with bytes rather than 2.x strings)
3. Add a "min_salt_len" parameter to discourage short salt values
(rather than the "weak_salt" boolean flag suggested by Masklinn)
4. Post to python-dev proposing the addition of that function for Python 3

Having needed a key derivation function myself not that long ago, and
with the recent high profile password database breaches Masklinn
noted, this seems like a very reasonable addition to me.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From stephen at  Sun Jun 10 17:36:25 2012
From: stephen at (Stephen J. Turnbull)
Date: Mon, 11 Jun 2012 00:36:25 +0900
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

Rurpy writes:

 > Or are you saying some api's should be discouraged and making them
 > hard to use is better than a "not recommended" note in the
 > documentation?

No, I'm saying "explicit is better than implicit".  It's not hard to
use the explicit idiom, and it makes it clear that there are two
*different* kinds of problem that could occur, which would be
concealed by an API.

From ncoghlan at  Sun Jun 10 17:44:08 2012
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Jun 2012 01:44:08 +1000
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <jr2b88$1p0$>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

On Mon, Jun 11, 2012 at 12:34 AM, Serhiy Storchaka <storchaka at> wrote:
> On 10.06.12 00:22, Mark Lawrence wrote:
>> On 09/06/2012 21:02, Serhiy Storchaka wrote:
>>> None of these methods are not guaranteed to work if the input or output
>>> have occurred before.
>> That's a double negative so I'm not sure what you meant to say. Can you
>> please rephrase it. I assume that English is not your native language,
>> so I'll let you off :)
> open(sys.stdin.fileno()) is not guaranteed to work if the input or output
> have occurred before. And io.TextIOWrapper(sys.stdin.detach()) is not
> guaranteed to work if the input or output have occurred before. sys.stdin
> internal buffer can contain read but not yet used characters. sys.stdin.buffer
> internal buffer can contain read but not yet used bytes. With multibyte encoding
> sys.stdin.decoder internal buffer can contain an incomplete multibyte
> character.

Right, but the point of this discussion is to document the cleanest
available way for an application to change these settings at
*application start* (e.g. to support an "--encoding" parameter). Yes,
there are potential issues if you use any of these mechanisms while
there is data in the buffers, but that's a much harder problem and not
one we're trying to solve here.

Regardless, the advantage of the "open + fileno" idiom is that it
works for *any* level of change. If you want to force your streams to
unbuffered binary IO rather than merely changing the encoding:

    sys.stdin = open(sys.stdin.fileno(), 'rb', buffering=0, closefd=False)
    sys.stdout = open(sys.stdout.fileno(), 'wb', buffering=0, closefd=False)
    sys.stderr = open(sys.stderr.fileno(), 'wb', buffering=0, closefd=False)

Keep them as text, but force them to permissive utf-8, no matter how
the interpreter originally created them?:

    sys.stdin = open(sys.stdin.fileno(), 'r', encoding="utf-8",
                     errors="surrogateescape", closefd=False)
    sys.stdout = open(sys.stdout.fileno(), 'w', encoding="utf-8",
                      errors="surrogateescape", closefd=False)
    sys.stderr = open(sys.stderr.fileno(), 'w', encoding="utf-8",
                      errors="surrogateescape", closefd=False)

This approach also has the advantage of leaving
sys.__std(in/out/err)__ in a somewhat usable state.
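
A minimal sketch of the "change it at application start" case, wrapping
the idiom above in a helper. The names reopen_as_text and demo are made
up for illustration; only open(), fileno() and closefd come from the
examples above. The demo uses a pipe rather than the real stdout so the
effect can be observed as bytes.

```python
import os


def reopen_as_text(stream, mode, encoding="utf-8", errors="surrogateescape"):
    """Hypothetical helper: new text wrapper over the same file descriptor."""
    # closefd=False keeps the descriptor open if the new wrapper is closed or
    # garbage collected, so the original stream object stays usable.
    return open(stream.fileno(), mode, encoding=encoding, errors=errors,
                closefd=False)


def demo():
    """Re-wrap a latin-1 stream as utf-8 and return the bytes written."""
    read_fd, write_fd = os.pipe()
    original = open(write_fd, "w", encoding="latin-1")
    rewrapped = reopen_as_text(original, "w")  # same fd, now utf-8
    rewrapped.write("\xe9")                    # e with acute accent
    rewrapped.flush()
    data = os.read(read_fd, 16)
    rewrapped.close()   # does not close the descriptor
    original.close()    # closes write_fd
    os.close(read_fd)
    return data
```

At startup this would be used as
sys.stdout = reopen_as_text(sys.stdout, "w", encoding=chosen_encoding),
before anything has been written to the stream.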


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ubershmekel at  Sun Jun 10 17:50:36 2012
From: ubershmekel at (Yuval Greenfield)
Date: Sun, 10 Jun 2012 18:50:36 +0300
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

I hope this isn't too off-topic, but is the tutorial supposed to
exhaustively explain the python language?

Because if not, then the for-else/while-else clause may be a good thing to
move to an appendix.


From masklinn at  Sun Jun 10 17:52:44 2012
From: masklinn at (Masklinn)
Date: Sun, 10 Jun 2012 17:52:44 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On 2012-06-10, at 17:28 , Nick Coghlan wrote:

> On Mon, Jun 11, 2012 at 12:17 AM, Simon Sapin <simon.sapin at> wrote:
>> On 10/06/2012 15:05, Masklinn wrote:
>>> The standard library already provides for cryptographic hashes (hashlib)
>>> and MACs (hmac).
>>> [snip]
>>> Therefore, I would suggest either adding a new module (name tbd) or
>>> adding new constructors to hashlib.
>> PBKDF2 can be implemented in 15 lines of code based on the hmac and hashlib
>> modules:
>> Although the code is short, it is easy to get wrong. So I think it would be
>> nice to have in the stdlib, tested once and for all.
>> Also, PBKDF2 is a well-defined spec that will not change (or it will be
>> called PBKDF3 or something) which I think makes it a good fit for the
>> stdlib.
>> I would suggest to have Armin's implementation (linked above) included
>> as-is, but it's probably too late for 3.3.
> It's cutting it very fine relative to the beta feature freeze (which
> is in a couple of weeks), but it could still make it in as a very
> reasonable addition to the standard library.
> The hmac module has already been enhanced with a "secure_compare"
> function for 3.3 to perform string and byte sequence comparisons that
> don't leak as much information about the expected result under timing
> attacks (it still leaks the expected length, but beyond that the
> running time of the comparison should be constant for a given digest
> length).
> Since the PBKDF2 key derivation requires hmac, and hmac depends on
> hashlib (to provide the default hash algorithm for hmac.HMAC), I
> believe the best way to expedite this would be to:
> 1. Create an issue on proposing just the binary
> version of pbkdf2 as an enhancement to hmac

Although it makes sense from a dependency POV, I'm not sure it's the
best place to put it, as people in need of knowing about PBKDF2 would
be more likely to be browsing hashlib, and -- more importantly -- PBKDF2
isn't a MAC, the usage of hmac underlying it being mostly incidental.

If PBKDF2 alone is added, I think putting it in its own module
(parallel to hmac) would be cleaner; *that* can be deprecated if
more cryptographic hashes of that style (e.g. bcrypt, scrypt) are
added later on, in the style of md5 -> hashlib.

> 2. Attach a patch that updates Lib/, Lib/test/ and
> Doc/library/hmac.rst accordingly (this will likely require changes to
> work with bytes rather than 2.x strings)
> 3. Adds a "min_salt_len" parameter to discourage short salt values
> (rather than the "weak_salt" boolean flag suggested by Masklinn)
> 4. Post to python-dev proposing the addition of that function for Python 3
> Having needed a key derivation function myself not that long ago, and
> with the recent high profile password database breaches Masklinn
> noted, this seems like a very reasonable addition to me.
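
For reference, the "15 lines based on hmac and hashlib" claim quoted
above can be sketched roughly as follows. This is an illustrative
reading of the PBKDF2 definition (RFC 2898), not Armin's code; the
name pbkdf2_sketch and its signature are assumptions, and the result
is only checked against hashlib.pbkdf2_hmac, which did not exist yet
at the time of this thread (it arrived in Python 3.4).

```python
import hashlib
import hmac
import struct


def pbkdf2_sketch(password, salt, iterations, dklen, digest=hashlib.sha256):
    """Illustrative PBKDF2-HMAC over bytes; not a vetted implementation."""
    def prf(data):
        # The pseudo-random function is HMAC keyed with the password.
        return hmac.new(password, data, digest).digest()

    blocks = []
    produced = 0
    index = 1
    while produced < dklen:
        # U1 = PRF(salt || INT(index)); each later U is PRF of the previous.
        u = prf(salt + struct.pack(">I", index))
        acc = int.from_bytes(u, "big")
        for _ in range(iterations - 1):
            u = prf(u)
            acc ^= int.from_bytes(u, "big")  # block = U1 xor U2 xor ... xor Uc
        block = acc.to_bytes(len(u), "big")
        blocks.append(block)
        produced += len(block)
        index += 1
    return b"".join(blocks)[:dklen]
```

The "easy to get wrong" remark applies here too: the block index
encoding, the xor accumulation and the final truncation are all places
where a small slip gives a plausible-looking but wrong key.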

From python at  Sun Jun 10 18:04:17 2012
From: python at (MRAB)
Date: Sun, 10 Jun 2012 17:04:17 +0100
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

On 10/06/2012 16:50, Yuval Greenfield wrote:
> I hope this isn't too off-topic, but is the tutorial supposed to
> exhaustively explain the python language?
> Because if not, then the for-else/while-else clause may be a good thing
> to move to an appendix.
The for-else/while-else clause is part of the core language, so it
should be explained.

From ncoghlan at  Sun Jun 10 18:04:50 2012
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Jun 2012 02:04:50 +1000
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

On Mon, Jun 11, 2012 at 1:50 AM, Yuval Greenfield <ubershmekel at> wrote:
> I hope this isn't too off-topic, but is the tutorial supposed to
> exhaustively explain the python language?
> Because if not, then the for-else/while-else clause may be a good thing to
> move to an appendix.

It's supposed to arm people well enough to cope with at least
*reading* most code they're likely to encounter. Since for/else is the
idiomatic way to write a search loop, even beginners really should
learn how to read it. For more esoteric stuff like metaclasses where
the philosophy of "If you're wondering whether or not you need it, you
don't need it" applies, then the tutorial can safely skip it.
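
A small sketch of the search-loop idiom referred to above. The function
name and data are invented for the example; the point is only that the
else block runs when the loop finishes without hitting break.

```python
def first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n < 0:
            found = n
            break          # leaving via break skips the else block
    else:
        found = None       # loop ran to exhaustion: nothing matched
    return found
```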


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Jun 10 18:11:04 2012
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Jun 2012 02:11:04 +1000
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 11, 2012 at 1:52 AM, Masklinn <masklinn at> wrote:
> On 2012-06-10, at 17:28 , Nick Coghlan wrote:
>> 1. Create an issue on proposing just the binary
>> version of pbkdf2 as an enhancement to hmac
> Although it makes sense from a dependency POV, I'm not sure it's the
> best place to put it as people in need of knowing about PBKDF2 would
> be more likely to be browsing hashlib, and -- more importantly -- PBKDF2
> isn't a MAC, the usage of hmac underlying it being mostly incidental.
> If PBKDF2 alone is added, I think putting it in its own module
> (parallel to hmac) would be cleaner, *that* can be deprecated if
> more cryptographic hashes of that style (e.g. bcrypt, scrypt) are
> added later on in the style of md5 -> hashlib.

Yeah, you're probably right. Either a new module, or else in "getpass"
(either way, with a cross-reference from hashlib).

Wherever it ends up, it should also reference hmac.secure_compare for
a comparison function that doesn't allow timing attacks to
progressively discover the expected hash.
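
A minimal sketch of such a comparison. As far as I know, the function
discussed here as hmac.secure_compare was released in Python 3.3 under
the name hmac.compare_digest, so that is the call used below; the
wrapper name digests_match is made up for the example.

```python
import hmac


def digests_match(expected, candidate):
    """Compare two digests without short-circuiting on the first mismatch."""
    # A plain == may return as soon as one byte differs, which can leak how
    # long the matching prefix is. compare_digest is designed to avoid that
    # for inputs of equal length.
    return hmac.compare_digest(expected, candidate)
```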


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ubershmekel at  Sun Jun 10 18:13:38 2012
From: ubershmekel at (Yuval Greenfield)
Date: Sun, 10 Jun 2012 19:13:38 +0300
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

On Sun, Jun 10, 2012 at 7:04 PM, MRAB <python at> wrote:

> On 10/06/2012 16:50, Yuval Greenfield wrote:
>> I hope this isn't too off-topic, but is the tutorial supposed to
>> exhaustively explain the python language?
>> Because if not, then the for-else/while-else clause may be a good thing
>> to move to an appendix.
> The for-else/while-else clause is part of the core language, so it
> should be explained.

If we want a ghost of a chance to deprecate for-else/while-else in python 6,
circa 2031, then we should at least move it to the back of the tutorial.
I'm not suggesting to completely delete the text, just to nudge it to the
end. This clause is most definitely not a common pattern in python.
Personally I've never seen it in the wild and most pythonistas I've spoken
with have never heard of the construct.


From bruce at  Sun Jun 10 18:23:13 2012
From: bruce at (Bruce Leban)
Date: Sun, 10 Jun 2012 09:23:13 -0700
Subject: [Python-ideas] Nudging beginners towards a more accurate mental
 model for loop else clauses
In-Reply-To: <>
References: <>
	<jqu4jj$ajo$> <>
Message-ID: <>

On Sat, Jun 9, 2012 at 10:48 AM, Steven D'Aprano <steve at> wrote:

> Loops exit at the top, not the bottom. This is most obvious when you think
> about a while loop:
> while condition:
>    ...
> Of course you have to be at the top of the loop for the while to check
> condition, not the bottom. For-loops are not quite so obvious, but
> execution has to return back to the top of the loop in order to check
> whether or not the sequence is exhausted.
> Whether or not it is the *simplest* way to think about for/else, talking
> about exhaustion of the list (iterable) is correct.

If you want to talk about exhaustion of the list then you need to talk
differently about the while loop. Documentation is usually written for
non-experts. When I taught intro to programming, the mental model that most
students had was nowhere near as strong as most people on this list. The
concept 'loop exits normally' would be much easier for them to understand.

On Sat, Jun 9, 2012 at 10:49 AM, Steven D'Aprano <steve at> wrote:

> Ignoring try/finally blocks, which are special, we can assume that the
> reader has (or will have once they actually learn about functions and
> exceptions) a correct understanding of the behaviour of return and raise.
> - If the loop is exited by a return, *nothing* following the return is
> executed. That includes the else block.
> - If execution is halted by an exception (including raise), *nothing*
> following the exception is executed. That includes the else block.
> - If execution is halted by an external event that halts or interrupts the
> Python process, *nothing* following executes. That includes the else block.
> - If the loop never completes, *nothing* following the loop executes. That
> includes the else block.

You've written four different ways of saying 'loop does not exit normally'
vs. saying once 'loop exits normally'. When you emphasize *nothing* above,
it strongly suggests they all mean the same thing. If you *don't* ignore
try/finally, then they don't. I don't think documentation needs to cover
every case, but if you're going to write stuff in bold letters (or italic
or whatever), then readers expect you're covering all the bases and not
ignoring special cases. That may not be your intent but that's the way
people read things. Again, docs are written for non-experts.
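
To make the try/finally caveat concrete, here is a small sketch. The
function scan and its log argument are invented for illustration: the
else block is skipped on break and on return, but the finally block
still runs in every case, so "nothing following executes" is not
literally true once try/finally is involved.

```python
def scan(items, log):
    """Record which of the else/finally blocks run for each exit path."""
    try:
        for item in items:
            if item is None:
                return "returned"   # skips else, but finally still runs
            if item < 0:
                break               # skips else
        else:
            log.append("else")      # only when the loop exits normally
            return "exhausted"
        return "broke"
    finally:
        log.append("finally")       # runs on every exit path
```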

   Holly: He's dead, Dave. Everybody is dead. Everybody is dead, Dave.
   Lister: Wait. Are you trying to tell me everybody's dead?
   Holly: Yup. Well, except for Dracula who was executing a try/finally.
He's undead and probably going to kill you too. But I didn't want to bother
you with that minor detail.


--- Bruce

From stephen at  Sun Jun 10 18:41:14 2012
From: stephen at (Stephen J. Turnbull)
Date: Mon, 11 Jun 2012 01:41:14 +0900
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

Nick Coghlan writes:
 > On Mon, Jun 11, 2012 at 12:34 AM, Serhiy Storchaka <storchaka at> wrote:

 > > open(sys.stdin.fileno()) is not guaranteed to work if the input or output
 > > have occurred before.

 > Right, but the point of this discussion is to document the cleanest
 > available way for an application to change these settings at
 > *application start* (e.g. to support an "--encoding" parameter). Yes,
 > there are potential issues if you use any of these mechanisms while
 > there is data in the buffers,

+1

The OP's problem is a real one.  His use case (the "--encoding"
parameter) seems to be the most likely one in production use, so the
loss of buffered data issue should rarely come up.  Changing encodings
on the fly offers plenty of ways to lose data besides incomplete
buffers, anyway.

I am a little concerned with MRAB's report that

    import sys
    print("hello")
    sys.stdout.flush()
    sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
    print("hello")

doesn't work as expected, though.  (It does work for me on Mac OS X,
both as above -- of course there are no '\r's in the output -- and
with 'print("hello", end="\r\n")'.)

From storchaka at  Sun Jun 10 18:43:51 2012
From: storchaka at (Serhiy Storchaka)
Date: Sun, 10 Jun 2012 19:43:51 +0300
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <jr2irh$msa$>

On 10.06.12 18:44, Nick Coghlan wrote:
> This approach also has the advantage of leaving
> sys.__std(in/out/err)__ in a somewhat usable state.

And then sys.std* and sys.__std*__ have their own inconsistent buffers.
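
A sketch of what "their own inconsistent buffers" can mean in practice,
using a pipe in place of the real stdout. The function name is invented
for the example. Two text wrappers share one descriptor, and the bytes
reach it in flush order rather than in write order.

```python
import os


def interleaving_demo():
    """Write through two wrappers on one fd; return what the fd received."""
    read_fd, write_fd = os.pipe()
    first = open(write_fd, "w", encoding="utf-8", closefd=False)
    second = open(write_fd, "w", encoding="utf-8", closefd=False)

    first.write("first ")     # stays in first's private buffer
    second.write("second ")   # stays in second's private buffer
    second.flush()            # "second " reaches the descriptor first
    first.flush()

    first.close()
    second.close()
    os.close(write_fd)
    data = os.read(read_fd, 100)
    os.close(read_fd)
    return data
```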

From greg at  Sun Jun 10 19:56:46 2012
From: greg at (Gregory P. Smith)
Date: Sun, 10 Jun 2012 10:56:46 -0700
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jun 10, 2012 at 9:11 AM, Nick Coghlan <ncoghlan at> wrote:

> On Mon, Jun 11, 2012 at 1:52 AM, Masklinn <masklinn at> wrote:
> > On 2012-06-10, at 17:28 , Nick Coghlan wrote:
> >> 1. Create an issue on proposing just the binary
> >> version of pbkdf2 as an enhancement to hmac
> >
> > Although it makes sense from a dependency POV, I'm not sure it's the
> > best place to put it as people in need of knowing about PBKDF2 would
> > be more likely to be browsing hashlib, and -- more importantly -- PBKDF2
> > isn't a MAC, the usage of hmac underlying it being mostly incidental.
> >
> > If PBKDF2 alone is added, I think putting it in its own module
> > (parallel to hmac) would be cleaner, *that* can be deprecated if
> > more cryptographic hashes of that style (e.g. bcrypt, scrypt) are
> > added later on in the style of md5 -> hashlib.
> Yeah, you're probably right. Either a new module, or else in "getpass"
> (either way, with a cross-reference from hashlib).
> Wherever it ends up, it should also reference hmac.secure_compare for
> a comparison function that doesn't allow timing attacks to
> progressively discover the expected hash.
I'd just stick it in hmac myself but getpass was also a good suggestion.
 Cross reference to it from the docs of all three as the real goal of
adding pbkdf2 is to advertise it to users so that they might use it rather
than something more naive.

hashlib itself should be kept pure as is for standard low level hash
algorithms.  It can't have a dependency on anything else.

Even if this doesn't make it into the stdlib in time for 3.3, feel free to
update the getpass, hmac and/or hashlib docs to point to the pbkdf2 module
externally as a suggestion for passphrase/secret hashing.


From simon.sapin at  Sun Jun 10 20:04:17 2012
From: simon.sapin at (Simon Sapin)
Date: Sun, 10 Jun 2012 20:04:17 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On 10/06/2012 19:56, Gregory P. Smith wrote:
>>  Yeah, you're probably right. Either a new module, or else in "getpass"
>>  (either way, with a cross-reference from hashlib).
> I'd just stick it in hmac myself but getpass was also a good suggestion.

I disagree. The getpass module is about terminal control, it has nothing 
to do with hashing. PBKDF2 or other adaptive hashes do not belong there.

Simon Sapin

From masklinn at  Sun Jun 10 20:11:15 2012
From: masklinn at (Masklinn)
Date: Sun, 10 Jun 2012 20:11:15 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On 2012-06-10, at 20:04 , Simon Sapin wrote:

> On 10/06/2012 19:56, Gregory P. Smith wrote:
>>> Yeah, you're probably right. Either a new module, or else in "getpass"
>>> (either way, with a cross-reference from hashlib).
>> I'd just stick it in hmac myself but getpass was also a good suggestion.
> I disagree. The getpass module is about terminal control, it has nothing to do with hashing. PBKDF2 or other adaptive hashes do not belong there.

It seems there are as many opinions on the subject as there are people
(which was to be expected when there's no code yet). I'll try to get
something done first (unless somebody else wants to), and its exact
location in the stdlib can be bikeshedded on -dev if and when that
point/paint is reached.

From python at  Sun Jun 10 20:12:55 2012
From: python at (MRAB)
Date: Sun, 10 Jun 2012 19:12:55 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

On 10/06/2012 17:41, Stephen J. Turnbull wrote:
> Nick Coghlan writes:
>   >  On Mon, Jun 11, 2012 at 12:34 AM, Serhiy Storchaka<storchaka at>  wrote:
>   >  >  open(sys.stdin.fileno()) is not guaranteed to work if the input or output
>   >  >  have occurred before.
> [...]
>   >  Right, but the point of this discussion is to document the cleanest
>   >  available way for an application to change these settings at
>   >  *application start* (e.g. to support an "--encoding" parameter). Yes,
>   >  there are potential issues if you use any of these mechanisms while
>   >  there is data in the buffers,
> +1
> The OP's problem is a real one.  His use case (the "--encoding"
> parameter) seems to be the most likely one in production use, so the
> loss of buffered data issue should rarely come up.  Changing encodings
> on the fly offers plenty of ways to lose data besides incomplete
> buffers, anyway.
> I am a little concerned with MRAB's report that
>      import sys
>      print("hello")
>      sys.stdout.flush()
>      sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
>      print("hello")
> doesn't work as expected, though.  (It does work for me on Mac OS X,
> both as above -- of course there are no '\r's in the output -- and
> with 'print("hello", end="\r\n")'.)
That's actually Python 3.1. From Python 3.2 it's slightly different,
but still not quite right:

Python 3.1:     "hello\r\nhello\r\r\n"
Python 3.2:     "hello\nhello\r\n"
Python 3.3.0a4: "hello\nhello\r\n"

All on Windows.

From simon.sapin at  Sun Jun 10 20:24:43 2012
From: simon.sapin at (Simon Sapin)
Date: Sun, 10 Jun 2012 20:24:43 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On 10/06/2012 20:11, Masklinn wrote:
> [...] when there's no code yet
> I'll try to get something done first

There is code, with tests. Here is the link I posted earlier in this thread:

Simon Sapin

From p.f.moore at  Sun Jun 10 20:34:04 2012
From: p.f.moore at (Paul Moore)
Date: Sun, 10 Jun 2012 19:34:04 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

On 10 June 2012 19:12, MRAB <python at> wrote:
> On 10/06/2012 17:41, Stephen J. Turnbull wrote:
>> I am a little concerned with MRAB's report that
>>     import sys
>>     print("hello")
>>     sys.stdout.flush()
>>     sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
>>     print("hello")
>> doesn't work as expected, though.  (It does work for me on Mac OS X,
>> both as above -- of course there are no '\r's in the output -- and
>> with 'print("hello", end="\r\n")'.)
> That's actually Python 3.1. From Python 3.2 it's slightly different,
> but still not quite right:
> Python 3.1:     "hello\r\nhello\r\r\n"
> Python 3.2:     "hello\nhello\r\n"
> Python 3.3.0a4: "hello\nhello\r\n"
> All on Windows.

Not here (Win 7 32-bit):

PS D:\Data> type
import sys
print("Hello!")
sys.stdout.flush()
sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
print("Hello!")
PS D:\Data> py -3.2 | od -c
0000000   H   e   l   l   o   !  \r  \n   H   e   l   l   o   !  \r  \n
0000020

From masklinn at  Sun Jun 10 20:35:35 2012
From: masklinn at (Masklinn)
Date: Sun, 10 Jun 2012 20:35:35 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On 2012-06-10, at 20:24 , Simon Sapin wrote:

> On 10/06/2012 20:11, Masklinn wrote:
>> [...] when there's no code yet
>> I'll try to get something done first
> There is code, with tests. Here is the link I posted earlier in this thread:

Yes, I've seen it, but

1. I'll need to talk to Armin about using that code (which is why I CC'd
   him to the list when I responded to Nick's response to your comment),
   or have him do it. I don't think anybody is going to take his code
   without even asking for consent and try to push it into the stdlib.

2. The interface is simple, but painful. Just look at the comment at the top:

        3.  Store ``algorithm$salt:costfactor$hash`` in the database so that
        you can upgrade later easily to a different algorithm if you need
        one.  For instance ``PBKDF2-256$thesalt:10000$deadbeef...``.

   If we know what's supposed to be done, how about just doing it and
   returning *that*? If it goes into the stdlib, I'd like to have
   something non-cryptographers can use easily, correctly and without
   making mistakes. Then there's the issue of implementing the equality
   test and extracting stuff from that storage string on subsequent auths
   to test for matches. It should be possible to do all that in a single
   user-facing operation, with no munging about in the user's code.

3. The test suite needs to be converted to the stdlib's format

4. The documentation needs to be written
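
A rough sketch of the "single user-facing operation" idea from point 2,
producing and checking the ``algorithm$salt:costfactor$hash`` string
directly. Everything here is hypothetical: the names hash_password and
verify_password, the default cost and the hex encoding are assumptions
for illustration, and the sketch leans on hashlib.pbkdf2_hmac and
hmac.compare_digest, which only exist in later Python versions (3.4 and
3.3 respectively) than the one discussed in this thread.

```python
import binascii
import hashlib
import hmac
import os

ALGORITHM = "PBKDF2-256"  # label taken from the comment quoted above


def _derive_hex(password, salt, iterations):
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                  salt.encode("ascii"), iterations)
    return binascii.hexlify(derived).decode("ascii")


def hash_password(password, iterations=10000, salt_len=16):
    """Return a storage string of the form algorithm$salt:cost$hash."""
    salt = binascii.hexlify(os.urandom(salt_len)).decode("ascii")
    digest = _derive_hex(password, salt, iterations)
    return "{}${}:{}${}".format(ALGORITHM, salt, iterations, digest)


def verify_password(password, stored):
    """Re-derive with the stored salt and cost, then compare safely."""
    algorithm, salt_and_cost, expected = stored.split("$")
    if algorithm != ALGORITHM:
        raise ValueError("unsupported algorithm: " + algorithm)
    salt, cost = salt_and_cost.split(":")
    candidate = _derive_hex(password, salt, int(cost))
    return hmac.compare_digest(candidate, expected)
```

With something of this shape, user code would only ever store the
string returned by hash_password and pass it back to verify_password,
without parsing the salt or cost itself.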

From p.f.moore at  Sun Jun 10 20:36:03 2012
From: p.f.moore at (Paul Moore)
Date: Sun, 10 Jun 2012 19:36:03 +0100
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On 10 June 2012 19:24, Simon Sapin <simon.sapin at> wrote:
> On 10/06/2012 20:11, Masklinn wrote:
>> [...] when there's no code yet
>> I'll try to get something done first
> There is code, with tests. Here is the link I posted earlier in this thread:

To use that would need Armin's approval and support. So far he's not
commented here.


From python at  Sun Jun 10 21:01:21 2012
From: python at (MRAB)
Date: Sun, 10 Jun 2012 20:01:21 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

On 10/06/2012 19:34, Paul Moore wrote:
> On 10 June 2012 19:12, MRAB<python at>  wrote:
>>  On 10/06/2012 17:41, Stephen J. Turnbull wrote:
>>>  I am a little concerned with MRAB's report that
>>>       import sys
>>>       print("hello")
>>>       sys.stdout.flush()
>>>       sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
>>>       print("hello")
>>>  doesn't work as expected, though.  (It does work for me on Mac OS X,
>>>  both as above -- of course there are no '\r's in the output -- and
>>>  with 'print("hello", end="\r\n")'.)
>>  That's actually Python 3.1. From Python 3.2 it's slightly different,
>>  but still not quite right:
>>  Python 3.1:     "hello\r\nhello\r\r\n"
>>  Python 3.2:     "hello\nhello\r\n"
>>  Python 3.3.0a4: "hello\nhello\r\n"
>>  All on Windows.
> Not here (Win 7 32-bit):
> PS D:\Data>  type
> import sys
> print("Hello!")
> sys.stdout.flush()
> sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
> print("Hello!")
> PS D:\Data>  py -3.2 | od -c
> 0000000   H   e   l   l   o   !  \r  \n   H   e   l   l   o   !  \r  \n
> 0000020
I'm using Windows XP Pro (32-bit), initially sys.stdout.encoding ==
"cp1252".

From p.f.moore at  Sun Jun 10 22:07:00 2012
From: p.f.moore at (Paul Moore)
Date: Sun, 10 Jun 2012 21:07:00 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

On 10 June 2012 20:01, MRAB <python at> wrote:
> On 10/06/2012 19:34, Paul Moore wrote:
>> On 10 June 2012 19:12, MRAB<python at>  wrote:
>>>  On 10/06/2012 17:41, Stephen J. Turnbull wrote:
>>>>  I am a little concerned with MRAB's report that
>>>>       import sys
>>>>       print("hello")
>>>>       sys.stdout.flush()
>>>>       sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
>>>>       print("hello")
>>>>  doesn't work as expected, though.  (It does work for me on Mac OS X,
>>>>  both as above -- of course there are no '\r's in the output -- and
>>>>  with 'print("hello", end="\r\n")'.)
>>>  That's actually Python 3.1. From Python 3.2 it's slightly different,
>>>  but still not quite right:
>>>  Python 3.1:     "hello\r\nhello\r\r\n"
>>>  Python 3.2:     "hello\nhello\r\n"
>>>  Python 3.3.0a4: "hello\nhello\r\n"
>>>  All on Windows.
>> Not here (Win 7 32-bit):
>> PS D:\Data>  type
>> import sys
>> print("Hello!")
>> sys.stdout.flush()
>> sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
>> print("Hello!")
>> PS D:\Data>  py -3.2 | od -c
>> 0000000   H   e   l   l   o   !  \r  \n   H   e   l   l   o   !  \r  \n
>> 0000020
> I'm using Windows XP Pro (32-bit), initially sys.stdout.encoding ==
> "cp1252".

PS D:\Data> py -3 -c "import sys; print(sys.stdout.encoding)"
cp850

This is at the console (Powershell) - are you running from within
something like idle, or a GUI environment?


From jeanpierreda at  Sun Jun 10 22:16:41 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Sun, 10 Jun 2012 16:16:41 -0400
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jun 10, 2012 at 2:36 PM, Paul Moore <p.f.moore at> wrote:
> To use that would need Armin's approval and support. So far he's not
> commented here.

Only if you want a different license than 3-clause BSD.

P.S. I love this thread. Great suggestion. :)

-- Devin

From python at  Sun Jun 10 22:28:14 2012
From: python at (MRAB)
Date: Sun, 10 Jun 2012 21:28:14 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

On 10/06/2012 21:07, Paul Moore wrote:
> On 10 June 2012 20:01, MRAB<python at>  wrote:
>>  On 10/06/2012 19:34, Paul Moore wrote:
>>>  On 10 June 2012 19:12, MRAB<python at>    wrote:
>>>>    On 10/06/2012 17:41, Stephen J. Turnbull wrote:
>>>>>    I am a little concerned with MRAB's report that
>>>>>        import sys
>>>>>        print("hello")
>>>>>        sys.stdout.flush()
>>>>>        sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
>>>>>        print("hello")
>>>>>    doesn't work as expected, though.  (It does work for me on Mac OS X,
>>>>>    both as above -- of course there are no '\r's in the output -- and
>>>>>    with 'print("hello", end="\r\n")'.)
>>>>    That's actually Python 3.1. From Python 3.2 it's slightly different,
>>>>    but still not quite right:
>>>>    Python 3.1:     "hello\r\nhello\r\r\n"
>>>>    Python 3.2:     "hello\nhello\r\n"
>>>>    Python 3.3.0a4: "hello\nhello\r\n"
>>>>    All on Windows.
>>>  Not here (Win 7 32-bit):
>>>  PS D:\Data>    type
>>>  import sys
>>>  print("Hello!")
>>>  sys.stdout.flush()
>>>  sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
>>>  print("Hello!")
>>>  PS D:\Data>    py -3.2 | od -c
>>>  0000000   H   e   l   l   o   !  \r  \n   H   e   l   l   o   !  \r  \n
>>>  0000020
>>  I'm using Windows XP Pro (32-bit), initially sys.stdout.encoding ==
>>  "cp1252".
> PS D:\Data>  py -3 -c "import sys; print(sys.stdout.encoding)"
> cp850
> This is at the console (Powershell) - are you running from within
> something like idle, or a GUI environment?
It's at the system command prompt. When I redirect the script's stdout
to a file (on the command line using ">output.txt") I get those 15 bytes
from Python 3.2.

Your output appears to be 32 bytes (the second line starts with
"0000020").

From p.f.moore at  Sun Jun 10 22:38:14 2012
From: p.f.moore at (Paul Moore)
Date: Sun, 10 Jun 2012 21:38:14 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

On 10 June 2012 21:28, MRAB <python at> wrote:
> On 10/06/2012 21:07, Paul Moore wrote:
>> On 10 June 2012 20:01, MRAB<python at>  wrote:
>>>  On 10/06/2012 19:34, Paul Moore wrote:
>>>>  On 10 June 2012 19:12, MRAB<python at>  wrote:
>>>>>   On 10/06/2012 17:41, Stephen J. Turnbull wrote:
>>>>>>   I am a little concerned with MRAB's report that
>>>>>>       import sys
>>>>>>       print("hello")
>>>>>>       sys.stdout.flush()
>>>>>>       sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
>>>>>>       print("hello")
>>>>>>   doesn't work as expected, though.  (It does work for me on Mac OS X,
>>>>>>   both as above -- of course there are no '\r's in the output -- and
>>>>>>   with 'print("hello", end="\r\n")'.)
>>>>>   That's actually Python 3.1. From Python 3.2 it's slightly different,
>>>>>   but still not quite right:
>>>>>   Python 3.1:     "hello\r\nhello\r\r\n"
>>>>>   Python 3.2:     "hello\nhello\r\n"
>>>>>   Python 3.3.0a4: "hello\nhello\r\n"
>>>>>   All on Windows.
>>>>  Not here (Win 7 32-bit):
>>>>  PS D:\Data>  type
>>>>  import sys
>>>>  print("Hello!")
>>>>  sys.stdout.flush()
>>>>  sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8')
>>>>  print("Hello!")
>>>>  PS D:\Data>  py -3.2 | od -c
>>>>  0000000   H   e   l   l   o   !  \r  \n   H   e   l   l   o   !  \r  \n
>>>>  0000020
>>>  I'm using Windows XP Pro (32-bit), initially sys.stdout.encoding ==
>>>  "cp1252".
>> PS D:\Data>  py -3 -c "import sys; print(sys.stdout.encoding)"
>> cp850
>> This is at the console (Powershell) - are you running from within
>> something like idle, or a GUI environment?
> It's at the system command prompt. When I redirect the script's stdout to a
> file
> (on the command line using ">output.txt") I get those 15 bytes from Python
> 3.2.
> Your output appears to be 32 bytes (the second line starts with
> "0000020").

Well spotted - PowerShell does funny things with Unicode in pipes, I'd
forgotten. Indeed, I get the same output as you from cmd.


From ben+python at  Mon Jun 11 00:12:29 2012
From: ben+python at (Ben Finney)
Date: Mon, 11 Jun 2012 08:12:29 +1000
Subject: [Python-ideas] BindError as a built-in TypeError subclass (on
	the margin of PEP 362 discussion)
References: <>
Message-ID: <>

Nick Coghlan <ncoghlan at> writes:

> On Sun, Jun 10, 2012 at 10:00 PM, Steven D'Aprano <steve at> wrote:
> > My concept is that errors due to the wrong argument count, duplicate
> > or missing keyword arguments, etc. which currently raise TypeError
> > could raise ArgumentError, a subclass, instead.
> >
> > That will make distinguishing between "passed the wrong number of
> > arguments" from "passed the wrong type of argument" easier.
> This is actually why I prefer "BindError" to the name "ArgumentError".
> The former is explicit about what has gone wrong: the supplied
> arguments could not be bound to the parameters expected by the
> supplied callable.

"ArgumentBindError", then?

 \     "Airports are ugly. Some are very ugly. Some attain a degree of |
  `\        ugliness that can only be the result of a special effort." |
_o__)           -- Douglas Adams, _The Long Dark Tea-Time Of The Soul_ |
Ben Finney

From acarter at  Mon Jun 11 00:42:53 2012
From: acarter at (Andrew Carter)
Date: Sun, 10 Jun 2012 15:42:53 -0700
Subject: [Python-ideas] Saving state in list/generator comprehension
Message-ID: <>

Forgive me for any problems in this e-mail as I'm new to this mailing list.

I thought it might be nice to be able to somehow save state in
list/generator comprehensions. A side effect of this (although not the
intended goal) is that it would make reduce feasible in a clean manner,
as the final result would just be the state.

One mechanism I can think of is to overload the with/as keyword for use
inside of list/generator comprehensions, and using the previous result as
I believe the change to the grammar in python3k would be

comp_iter : comp_for | comp_if | comp_with
comp_with: 'with' testlist 'as' testlist

So something in the form of
  [expr for i in iterable with initializer as accumulator]
would resolve to something like
  result = []
  accumulator = initializer
  for i in iterable:
    accumulator = expr
    result.append(accumulator)
  return result

For instance reduce could be defined as (assuming all 3 arguments are
  reduce = lambda function, iterable, initializer : ([initializer] +
[function(accumulator, i) for i in iterable with initializer as
accumulator])[-1]
Breaking this down, the "with initializer as accumulator" statement means
that when the list comprehension begins accumulator=initializer,
then after each iteration, accumulator = function(accumulator, i), so with
the function f, list [i1,i2,i3,...], and initial value i0, the resulting
list of
"[function(accumulator, i) for i in iterable with initializer as
accumulator]" would be [f(i0,i1), f(f(i0,i1),i2),
f(f(f(i0,i1),i2),i3),...], or in left associative infix form with
the f = "+" operator, [i0+i1,i0+i1+i2,i0+i1+i2+i3,...].
Consing (effectively) initializer to the beginning of the list ensures
clean behavior for empty lists, and indexing [-1] gets the last element
which is really the only
element that matters.
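
For comparison, here is a rough sketch of how the behaviour described above
can be spelled in current Python without any new syntax. The helper names
scan and reduce_via_scan are hypothetical, made up for this illustration;
they are not part of the proposal or of the standard library.

```python
import operator


def scan(function, iterable, initializer):
    """Return the list the proposed 'with initializer as accumulator'
    comprehension is described as producing: every intermediate state."""
    accumulator = initializer
    result = []
    for i in iterable:
        accumulator = function(accumulator, i)
        result.append(accumulator)
    return result


def reduce_via_scan(function, iterable, initializer):
    # Prepending the initializer keeps the [-1] lookup safe for empty input.
    return ([initializer] + scan(function, iterable, initializer))[-1]


running_sums = scan(operator.add, [1, 2, 3], 0)        # [1, 3, 6]
total = reduce_via_scan(operator.add, [1, 2, 3], 0)    # 6
empty_total = reduce_via_scan(operator.add, [], 0)     # 0
```

The list of intermediate states is what "mapping with state" would return;
the reduce case only looks at its last element.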

Consider a slightly more complex example of a Fibonacci generator, one
might define it as follows,
  def fibs():
    a, b = 1, 0
    while True:
      a, b = b, a + b
      yield b

Using the with statement, it would require two generator comprehensions
  fibs = lambda : (b for a,b in (b, a+b for i in itertools.cycle((None,))
with a,b = 0,1))
The inner generator comprehension
  (b, a+b for i in itertools.repeat(None) with a,b = 0,1)
creates an infinite generator of tuples which are
consecutive Fibonacci numbers; the outer generator comprehension strips off
the unneeded "state".

One advantage of doing it this way is that, because with/as are already
keywords in Python, backwards compatibility shouldn't be an issue. One
drawback is that if one is just mapping with state, an extra list/generator
comprehension block is needed to strip the state from the intermediate list.

I apologize if similar ideas have already been discussed.
-Andrew Carter

p.s. Is there a built-in way to get the last element from a generator
(perhaps even with a default)? A quick Google search did not reveal one.

From zuo at  Mon Jun 11 01:16:29 2012
From: zuo at (Jan Kaliszewski)
Date: Mon, 11 Jun 2012 01:16:29 +0200
Subject: [Python-ideas] Weak-referencing/weak-proxying of (bound) methods
Message-ID: <>


Today, I encountered a surprising bug in my code which creates
some weakref.proxies to instance methods... The actual Python
behaviour related to the issue can be illustrated with the
following example:

    >>> import weakref
    >>> class A:
    ...     def method(self): print(self)
    >>> A.method
    <function method at 0xb732926c>
    >>> a = A()
    >>> a.method
    <bound method A.method of <__main__.A object at 0xb7326bec>>
    >>> r = weakref.ref(a.method)  # creating a weak reference
    >>> r                          # ...but it appears to be dead
    <weakref at 0xb7327d9c; dead>
    >>> w = weakref.proxy(a.method)  # the same with a weak proxy
    >>> w
    <weakproxy at 0xb7327d74 to NoneType at 0x829f7d0>
    >>> w()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ReferenceError: weakly-referenced object no longer exists

This behaviour is perfectly correct -- but still surprising,
especially for people who know little about method creation
machinery, descriptors etc.
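
To spell out the mechanism, here is a small sketch (the class A is the same
toy class as above; the variable names are only for illustration). It
relies on CPython's reference counting to free the temporary bound method
immediately, so the "dead" result is an assumption about CPython rather
than a language guarantee.

```python
import weakref


class A:
    def method(self):
        return self


a = A()

# Each attribute access builds a brand-new bound method object.
first = a.method
second = a.method
distinct_objects = first is not second   # two separate objects...
equal_methods = first == second          # ...that compare equal

# Nothing else holds the temporary bound method, so CPython frees it
# as soon as weakref.ref() returns and the reference is dead at once.
dead_ref = weakref.ref(a.method)
is_dead = dead_ref() is None

# Keeping a strong reference to the bound method keeps the weakref alive.
kept = a.method
live_ref = weakref.ref(kept)
is_alive = live_ref() is kept
```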

I think it would be nice to make this 'trap' less painful --
for example, by doing one or both of the following:

1. Describe and explain this behaviour in the weakref
module documentation.

2. Provide (in functools?) a type-and-decorator that does the
same thing as func_descr_get() (transforms a function into
a method) *plus* caches the created method (e.g. on the
instance object).

A prototype implementation:

    import types

    class InstanceCachedMethod(object):

        def __init__(self, func):
            self.func = func
            # name of the instance attribute that caches the bound method
            self.instance_attr_name = (
                '__{0}_method_ref'.format(func.__name__))

        def __get__(self, instance, owner):
            if instance is None:
                return self.func
            try:
                return getattr(instance, self.instance_attr_name)
            except AttributeError:
                method = types.MethodType(self.func, instance)
                setattr(instance, self.instance_attr_name, method)
                return method

A simplified version that reuses the func.__name__ (works well
as long as func.__name__ is the actual instance attribute name...):

    class InstanceCachedMethod(object):

        def __init__(self, func):
            self.func = func

        def __get__(self, instance, owner):
            if instance is None:
                return self.func
            method = types.MethodType(self.func, instance)
            setattr(instance, self.func.__name__, method)
            return method 

Both versions work well with weakref.proxy()/ref() objects:

    >>> class B:
    ...     @InstanceCachedMethod
    ...     def method(self): print(self)
    >>> B.method
    <function method at 0xb7329d6c>
    >>> b = B()
    >>> b.method
    <bound method B.method of <__main__.B object at 0xb7206ccc>>
    >>> r = weakref.ref(b.method)
    >>> r
    <weakref at 0xb72c611c; to 'method' at 0xb736c40c (method)>
    >>> w = weakref.proxy(b.method)
    >>> w
    <weakproxy at 0xb7327e14 to method at 0xb736c40c>
    >>> w()
    <__main__.B object at 0xb7206ccc>
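
For comparison only: Python 3.4 and later ship weakref.WeakMethod, which
takes a different route from the caching descriptor above. It holds weak
references to the instance and the function separately and rebuilds the
bound method on demand. A minimal sketch, assuming CPython's immediate
reference counting for the "after deletion" result:

```python
import weakref


class B:
    def method(self):
        return self


def demo():
    b = B()
    ref = weakref.WeakMethod(b.method)
    # While the instance is alive, calling the reference yields a fresh
    # bound method that can be called as usual.
    while_alive = ref()() is b
    del b
    # Once the instance is gone, the reference returns None.
    after_delete = ref() is None
    return while_alive, after_delete


while_alive, after_delete = demo()
```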

What do you think about it?


From pyideas at  Mon Jun 11 03:12:36 2012
From: pyideas at (Chris Rebert)
Date: Sun, 10 Jun 2012 18:12:36 -0700
Subject: [Python-ideas] BindError as a built-in TypeError subclass (on
 the margin of PEP 362 discussion)
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jun 10, 2012 at 5:00 AM, Steven D'Aprano <steve at> wrote:
> Chris Rebert wrote:
>>> Since this will be an error that beginners see (frequently), I suggest
>>> ArgumentError is more friendly than BindError.
>> I note that Ruby also has an ArgumentError, which it raises both for
>> calls with an incorrect number of arguments and in cases when Python
>> would raise ValueError.
> Even if I wanted to replace ValueError with ArgumentError (and I don't), we
> couldn't due to backward compatibility.
> (Although I suppose ArgumentError could inherit from both TypeError and
> ValueError.)
> My concept is that errors due to the wrong argument count, duplicate or
> missing keyword arguments, etc. which currently raise TypeError could raise
> ArgumentError, a subclass, instead.
> That will make distinguishing between "passed the wrong number of arguments"
> from "passed the wrong type of argument" easier.

You seem to have misinterpreted the intent behind my post. I'm in no
way arguing that ValueError and "ArgumentBindingError" should be
conflated. I'm pointing out that another very similar language (Ruby)
has an error of the same name with a very similar purpose (which it
also distinguishes from its TypeError), thus providing further
validation of the use case for the proposed ArgumentBindingError.


From steve at  Mon Jun 11 04:01:38 2012
From: steve at (Steven D'Aprano)
Date: Mon, 11 Jun 2012 12:01:38 +1000
Subject: [Python-ideas] Saving state in list/generator comprehension
In-Reply-To: <>
References: <>
Message-ID: <>

Andrew Carter wrote:
> Forgive me for any problems in this e-mail as I'm new to this mailing list.
> I thought it might be nice to be able to somehow save a state in
> list/generator comprehensions,
> a side effect of this (although not the intended goal) is it would make
> reduce feasible in a clean manner as the final result would just be the
> state.

reduce already exists; in Python 2, it is a built-in available at all times, 
in Python 3 it has been banished to the functools module.

What is your use-case for this? "Saving state" is a means to an end. The 
beauty of list comprehensions and generator expressions is that they are 
intentionally quite simple and limited. If you need something more complex, 
write a function or generator. Not everything has to be a (very-long and
unreadable) one-liner.

reduce already exists, but if it didn't, you could write it quite easily.
Here's a version with optional starting value which yields the intermediate
results:

import itertools
_MISSING = object()  # sentinel value

def foldl(func, iterable, start=_MISSING):
     # foldr is left as an exercise :-)
     if start is _MISSING:
         it = iter(iterable)
     else:
         it = itertools.chain([start], iterable)
     a = next(it)  # raises if iterable is empty and start not given
     try:
         b = next(it)
     except StopIteration:
         yield a
         return
     a = func(a, b)
     yield a
     for b in it:
         a = func(a, b)
         yield a

Modifying this to return just the last value is easy, and in fact is simpler 
than the above:

def foldl(func, iterable, start=_MISSING):
     if start is _MISSING:
         it = iter(iterable)
     else:
         it = itertools.chain([start], iterable)
     a = next(it)
     for b in it:
         a = func(a, b)
     return a
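
As a point of reference, the standard library already has close relatives
of both versions: functools.reduce for the final value, and
itertools.accumulate (with a func argument from Python 3.3 on) for the
running values. Note that accumulate also yields the first item itself,
which the generator above does not, so treat this as a sketch of the
nearest equivalents rather than drop-in replacements.

```python
import functools
import itertools
import operator

values = [1, 2, 3, 4]

# Running results, starting with the first item.
running = list(itertools.accumulate(values, operator.add))

# Only the final result, with an explicit starting value.
total = functools.reduce(operator.add, values, 10)
```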

> Some of the pros of doing it this way is that because with/as are already
> keywords in python backwards compatibility shouldn't be an issue,

That's not an argument in favour of your request. That's merely the lack of 
one specific argument against it. There are an infinite number of things which 
could be done that won't break backwards compatibility, but that doesn't mean 
we should do them all. What positive arguments in favour of your proposal do 
you have? What does your proposal allow us to do that we can't already do, or 
at least do better?

> p.s. Is there a built-in way to get the last element from a generator
> (perhaps even with a default) a quick google search did not reveal one?

The same as you would get the last element from any iterator, not just 
generators: iterate over it as quickly as possible, keeping only the last 
value seen. Because generator values are generated lazily as needed, there's 
no direct way to skip to the last value, or get random access to them.

In pure Python:

for x in iterator:
    pass

This may be faster:

collections.deque(iterator, maxlen=1)[0]

Of course, both examples assume that the iterator or generator yields at least 
one value, and is not infinite.
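
If a default is wanted, the deque trick can be wrapped in a small helper.
This is only a sketch; the name last and its default parameter are made up
here and are not an existing built-in.

```python
import collections

_MISSING = object()  # sentinel value


def last(iterable, default=_MISSING):
    """Return the final item of a finite iterable, or default if empty."""
    tail = collections.deque(iterable, maxlen=1)
    if tail:
        return tail[0]
    if default is _MISSING:
        raise ValueError("last() of an empty iterable with no default")
    return default


final_square = last(n * n for n in range(5))   # 16
fallback = last(iter([]), default=None)        # None
```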


From acarter at  Mon Jun 11 05:09:12 2012
From: acarter at (Andrew Carter)
Date: Sun, 10 Jun 2012 20:09:12 -0700
Subject: [Python-ideas] Saving state in list/generator comprehension
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jun 10, 2012 at 7:01 PM, Steven D'Aprano <steve at>wrote:

> Andrew Carter wrote:
>> Forgive me for any problems in this e-mail as I'm new to this mailing
>> list.
>> I thought it might be nice to be able to somehow save a state in
>> list/generator comprehensions,
>> a side effect of this (although not the intended goal) is it would make
>> reduce feasible in a clean manner as the final result would just be the
>> state.
> reduce already exists; in Python 2, it is a built-in available at all
> times, in Python 3 it has been banished to the functools module.

I think that it was banished for good reason. I have found myself
sometimes writing code that needs a reduce, but I feel that using the
reduce function provided isn't very clear, especially if you see it and
aren't familiar with the function. Admittedly writing a function that is a
few short lines is possible, and is what I end up doing; it just seems like
there should be a more elegant way to do it than having a bunch of
specialized functions.

> What is your use-case for this? "Saving state" is a means to an end. The
> beauty of list comprehensions and generator expressions is that they are
> intentionally quite simple and limited. If you need something more complex,
> write a function or generator. Not everything has to be a (very-long and
> unreadable) one-liner.
> reduce already exists, but if it didn't, you could write it quite easily.
> Here's a version with optional starting value which yields the intermediate
> results:

As I have mentioned above, occasionally I want to turn a list into a single
value by some repeated operation, but actually I think it's more common that
I want to map some operation over a list with the results of previous
operations passed through. I think my most common use case is that I have a
function that operates on a single value and also has some state.
Leaving my example purposely vague: I was iterating over a list, and had
what was effectively an environment variable (more of a state), initially
as a dynamic environment, so it was updated each time the function was
called across the list comprehension. I then, for other reasons, wanted to
use the environment type as a dictionary key (which is a problem if it's
mutable), but that meant that the original list comprehension (which I felt
was rather simple) no longer worked. Admittedly it didn't take me any time
at all to write a simple function that did the work of the list
comprehension, but it still felt like a simple enough problem that it could
be elegantly solved without resorting to a helper function.

> import itertools
> _MISSING = object()  # sentinel value
> def foldl(func, iterable, start=_MISSING):
>    # foldr is left as an exercise :-)
>    if start is _MISSING:
>        it = iter(iterable)
>    else:
>        it = itertools.chain([start], iterable)
>    a = next(it)  # raises if iterable is empty and start not given
>    try:
>        b = next(it)
>    except StopIteration:
>        yield a
>        return
>    a = func(a, b)
>    yield a
>    for b in it:
>        a = func(a, b)
>        yield a
> Modifying this to return just the last value is easy, and in fact is
> simpler than the above:
> def foldl(func, iterable, start=_MISSING):
>    if start is _MISSING:
>        it = iter(iterable)
>    else:
>        it = itertools.chain([start], iterable)
>    a = next(it)
>    for b in it:
>        a = func(a, b)
>    return a
> [...]
>> Some of the pros of doing it this way is that because with/as are already
>> keywords in python backwards compatibility shouldn't be an issue,
> That's not an argument in favour of your request. That's merely the lack
> of one specific argument against it. There are an infinite number of things
> which could be done that won't break backwards compatibility, but that
> doesn't mean we should do them all. What positive arguments in favour of
> your proposal do you have? What does your proposal allow us to do that we
> can't already do, or at least do better?

I feel, from personal experience, that there is a need for mapping with
state in a short, concise way. However, it is quite possible that it is just
me, and I need to think about the problem differently, or perhaps live with
4 line functions that are only used once. As for the backwards compatibility,
I think I was getting ahead of myself. I feel the with/as solution is quite
clunky, but I couldn't come up with a more elegant solution that operated
in a similar vein to how Python feels as a language.

>> p.s. Is there a built-in way to get the last element from a generator
>> (perhaps even with a default) a quick google search did not reveal one?
> The same as you would get the last element from any iterator, not just
> generators: iterate over it as quickly as possible, keeping only the last
> value seen. Because generator values are generated lazily as needed,
> there's no direct way to skip to the last value, or get random access to
> them.
> In pure Python:
> for x in iterator:
>    pass
> This may be faster:
> collections.deque(iterator, maxlen=1)[0]

That's a neat solution, a little bit confusing at first glance, but still
very neat, thanks!

> Of course, both examples assume that the iterator or generator yields at
> least one value, and is not infinite.
> --
> Steven

From ncoghlan at  Mon Jun 11 08:09:11 2012
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Jun 2012 16:09:11 +1000
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 11, 2012 at 4:35 AM, Masklinn <masklinn at> wrote:
> On 2012-06-10, at 20:24 , Simon Sapin wrote:
>> On 10/06/2012 20:11, Masklinn wrote:
>>> [...] when there's no code yet
>>> I'll try to get something done first
>> There is code, with tests. Here is the link I posted earlier in this thread:
> Yes, I've seen it, but
> 1. I'll need to talk to Armin about using that code (which is why I CC'd
>    him to the list when I responded to Nick's response to your comment),
>    or have him do it, I don't think anybody is going to take his code
>    without even asking for consent and try to push it into the stdlib
> 2. The interface is simple, but painful. Just look at the comment at the top:
>        3.  Store ``algorithm$salt:costfactor$hash`` in the database so that
>        you can upgrade later easily to a different algorithm if you need
>        one.  For instance ``PBKDF2-256$thesalt:10000$deadbeef...``.
>    if we know what's supposed to be done, how about just doing it and
>    returning *that*? If it goes into the stdlib, I'd like to have
>    something non-cryptographers can use easily, correctly and without
>    making mistakes. Then there's the issue of implementing the equality
>    test, extracting stuff from that storage string on subsequent auths to
>    test for matches. It should be possible to do all that in a single
>    user-facing operation, no munging about in user's code.
> 3. The test suite needs to be converted to the stdlib's format
> 4. The documentation needs to be written

Right. Given the time frames involved, it's probably best to target
this at 3.4 as a simple way to do
rainbow-table-and-brute-force-resistant password hashing and
comparisons, defaulting to PBKDF2, but accepting alternative key
derivation functions so people can plug in bcrypt, scrypt, etc
(similar to the way hmac defaults to md5, but lets you specify any
hash function with the appropriate API).
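
To make the "single user-facing operation" idea concrete, here is a rough
sketch of what such an API could look like. It is not Armin's code and not
a proposed stdlib interface: the function names hash_password and
verify_password are hypothetical, "PBKDF2-256" is assumed to mean PBKDF2
with HMAC-SHA256, and it leans on hashlib.pbkdf2_hmac and
hmac.compare_digest, which only exist in later Python versions (3.4 and
3.3 respectively).

```python
import binascii
import hashlib
import hmac
import os

ALGORITHM = "PBKDF2-256"  # label borrowed from the quoted comment


def hash_password(password, iterations=10000, salt=None):
    """Return 'algorithm$salt:costfactor$hash' for storage."""
    if salt is None:
        salt = binascii.hexlify(os.urandom(16)).decode("ascii")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("ascii"), iterations)
    hexdigest = binascii.hexlify(digest).decode("ascii")
    return "{0}${1}:{2}${3}".format(ALGORITHM, salt, iterations, hexdigest)


def verify_password(password, stored):
    """Re-derive the hash from the stored parameters and compare."""
    algorithm, salt_and_cost, _ = stored.split("$")
    if algorithm != ALGORITHM:
        raise ValueError("unsupported algorithm: " + algorithm)
    salt, cost = salt_and_cost.split(":")
    candidate = hash_password(password, int(cost), salt)
    # Constant-time comparison to avoid leaking how many characters match.
    return hmac.compare_digest(candidate, stored)


stored = hash_password("correct horse")
```

The caller only ever handles the opaque storage string; salt generation,
parsing and the equality test all stay inside the two functions.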

I think Armin's already created a good foundation for that, but
there'll be quite a bit of work in getting a PEP written, etc.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Mon Jun 11 08:12:45 2012
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Jun 2012 16:12:45 +1000
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <jr2irh$msa$>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

On Mon, Jun 11, 2012 at 2:43 AM, Serhiy Storchaka <storchaka at> wrote:
> On 10.06.12 18:44, Nick Coghlan wrote:
>> This approach also has the advantage of leaving
>> sys.__std(in/out/err)__ in a somewhat usable state.
> And then sys.std* and sys.__std*__ have their own inconsistent buffers.

Correct, but using detach() leaves sys.__std*__ completely broken
(either throwing exceptions or silently failing to emit output).
Creating two independent streams that share the underlying file handle
is much closer to the 2.x behaviour when replacing sys.std*.
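
A small sketch of what "two independent streams that share the underlying
file handle" means in practice. It uses a temporary file rather than the
real sys.stdout so it can be run anywhere; second_stream and demo are
hypothetical names for this illustration. Each wrapper has its own buffer,
so the output order is only predictable if each one is flushed before the
other writes, which is the inconsistency Serhiy points out.

```python
import os
import tempfile


def second_stream(stream, encoding="utf-8"):
    """Open a second text stream on the same file descriptor."""
    stream.flush()
    # closefd=False: closing the new wrapper leaves the descriptor open.
    return open(stream.fileno(), "w", encoding=encoding, closefd=False)


def demo():
    fd, path = tempfile.mkstemp()
    try:
        with open(fd, "w", encoding="ascii") as original:
            original.write("first ")
            replacement = second_stream(original)  # flushes "first "
            replacement.write("second ")
            replacement.flush()
            original.write("third")
            original.flush()
            replacement.close()
        with open(path, encoding="utf-8") as result:
            return result.read()
    finally:
        os.remove(path)


combined = demo()
```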


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From stephen at  Mon Jun 11 08:16:07 2012
From: stephen at (Stephen J. Turnbull)
Date: Mon, 11 Jun 2012 15:16:07 +0900
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

MRAB writes:

 > That's actually Python 3.1. From Python 3.2 it's slightly different,
 > but still not quite right:
 > Python 3.1:     "hello\r\nhello\r\r\n"
 > Python 3.2:     "hello\nhello\r\n"
 > Python 3.3.0a4: "hello\nhello\r\n"
 > All on Windows.

<stifle o="self"/>

Hm.  Maybe it's that port's implementation of universal newlines or
something like that?  What happens if you use an explicit "end="
argument?  (I don't have a Python 3 to check on Windows easily
available.)

From ncoghlan at  Mon Jun 11 08:45:46 2012
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Jun 2012 16:45:46 +1000
Subject: [Python-ideas] Saving state in list/generator comprehension
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 11, 2012 at 1:09 PM, Andrew Carter <acarter at> wrote:
> I feel like there is a need from personal experience of mapping with state
> in a short concise way. However it is quite possible that it is just me, and
> I need to think about the problem differently, or perhaps live with 4 line
> functions that are only used once. As for the backwards compatibility I
> think was getting ahead of myself, I feel the with/as solution is quite
> clunky, but I couldn't come up with a more elegant solution that operated in
> a similar vein to how python feels as a language.

Part of how Python feels as a language is due to the fact that
stateful operations cannot, in general, be expressed cleanly as
expressions - you have to step up to a multi-statement procedural
algorithm if your state can't be expressed cleanly through simple

I and others have put forward various proposals to change this over
the years, but it's a complex problem that touches on the heart of the
statement/expression dichotomy that Guido deliberately introduced when
creating the language.

The mechanism I personally consider most promising is one that makes
it easier to be explicit that a particular function is only used in
the current statement (see PEP 403). It still feels like Python (i.e.
no embedded assignments), but also clearly expresses when a function
exists solely for code structure purposes, and has nothing to do with
splitting out a component that will be used from multiple locations.

The current design proposal in PEP 403 is still quite flawed, though,
and needs a substantial amount of work to be brought up to a standard
where it makes a compelling case for a change to Python.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From p.f.moore at  Mon Jun 11 10:06:42 2012
From: p.f.moore at (Paul Moore)
Date: Mon, 11 Jun 2012 09:06:42 +0100
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

On 11 June 2012 07:16, Stephen J. Turnbull <stephen at> wrote:
> MRAB writes:
>  > That's actually Python 3.1. From Python 3.2 it's slightly different,
>  > but still not quite right:
>  >
>  > Python 3.1:     "hello\r\nhello\r\r\n"
>  > Python 3.2:     "hello\nhello\r\n"
>  > Python 3.3.0a4: "hello\nhello\r\n"
>  >
>  > All on Windows.
> <stifle o="self"/>
> Hm.  Maybe it's that port's implementation of universal newlines or
> something like that?  What happens if you use an explicit "end="
> argument?  (I don't have a Python 3 to check on Windows easily
> available.)

Explicit end= makes no difference to the behaviour. In fact, a minimal
test suggests that universal newline mode is not enabled on Windows in
Python 3. That's a regression from 2.x. See below.

D:\Data>py -3 -c "print('x')" | od -c
0000000   x  \n
0000002

D:\Data>py -2 -c "print('x')" | od -c
0000000   x  \r  \n
0000003

D:\Data>py -3 -V
Python 3.2.2

D:\Data>py -2 -V
Python 2.7.2
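
For anyone who wants to poke at the text layer in isolation, here is a
sketch of how the newline argument of io.TextIOWrapper affects what gets
written. It writes into an in-memory buffer, so it says nothing about what
the Windows console or the C runtime does underneath; written_bytes is a
hypothetical helper name.

```python
import io


def written_bytes(text, newline):
    """Write text through a TextIOWrapper and return the raw bytes."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    return raw.getvalue()


untranslated = written_bytes("hello\n", "")        # '\n' left alone
translated = written_bytes("hello\n", "\r\n")      # '\n' becomes '\r\n'
# Text that already ends in '\r\n' still has its '\n' translated.
doubled = written_bytes("hello\r\n", "\r\n")
```

The last case shows one way a "\r\r\n" sequence can arise when a
translating layer is fed text that already contains "\r\n".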


From amauryfa at  Mon Jun 11 10:11:34 2012
From: amauryfa at (Amaury Forgeot d'Arc)
Date: Mon, 11 Jun 2012 10:11:34 +0200
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
	<> <jr0es1$ub6$>
Message-ID: <>

2012/6/11 Paul Moore <p.f.moore at>

> Explicit end= makes no difference to the behaviour. In fact, a minimal
> test suggests that universal newline mode is not enabled on Windows in
> Python 3. That's a regression from 2.x. See below.
> D:\Data>py -3 -c "print('x')" | od -c
> 0000000   x  \n
> 0000002
> D:\Data>py -2 -c "print('x')" | od -c
> 0000000   x  \r  \n
> 0000003

This is certainly related to

Amaury Forgeot d'Arc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

From lists at  Mon Jun 11 10:42:59 2012
From: lists at (Christian Heimes)
Date: Mon, 11 Jun 2012 10:42:59 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <jr4b2j$ojd$>

On 11.06.2012 08:09, Nick Coghlan wrote:
> Right. Given the time frames involved, it's probably best to target
> this at 3.4 as a simple way to do
> rainbow-table-and-brute-force-resistant password hashing and
> comparisons, defaulting to PBKDF2, but accepting alternative key
> derivation functions so people can plug in bcrypt, scrypt, etc
> (similar to the way hmac defaults to md5, but lets you specify any
> hash function with the appropriate API).
> I think Armin's already created a good foundation for that, but
> there'll be quite a bit of work in getting a PEP written, etc.

Python already has an excellent library for password hashing: passlib
[1]. It's well written and documented, contains more than 30 password
hashing algorithms and schemas used by major platforms and applications
like Unix, LDAP and databases. The library even contains a policy
framework for handling, recognizing and migrating passwords as well as
counteractive measures against side channel attacks.

IMHO it's not enough to just provide the basic algorithm for PBKDF2 and
friends. There is still too much space for error. Passlib hides the
complex parts and has a user friendly API, for example



From ncoghlan at  Mon Jun 11 12:03:35 2012
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Jun 2012 20:03:35 +1000
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <jr4b2j$ojd$>
References: <>
Message-ID: <>

On Mon, Jun 11, 2012 at 6:42 PM, Christian Heimes <lists at> wrote:
> On 11.06.2012 08:09, Nick Coghlan wrote:
>> Right. Given the time frames involved, it's probably best to target
>> this at 3.4 as a simple way to do
>> rainbow-table-and-brute-force-resistant password hashing and
>> comparisons, defaulting to PBKDF2, but accepting alternative key
>> derivation functions so people can plug in bcrypt, scrypt, etc
>> (similar to the way hmac defaults to md5, but lets you specify any
>> hash function with the appropriate API).
>> I think Armin's already created a good foundation for that, but
>> there'll be quite a bit of work in getting a PEP written, etc.
> Python already has an excellent library for password hashing: passlib
> [1]. It's well written and documented, contains more than 30 password
> hashing algorithms and schemas used by major platforms and applications
> like Unix, LDAP and databases. The library even contains a policy
> framework for handling, recognizing and migrating passwords as well as
> counteractive measures against side channel attacks.
> IMHO it's not enough to just provide the basic algorithm for PBKDF2 and
> friends. There is still too much space for error. Passlib hides the
> complex parts and has a user friendly API, for example

Thanks for the link Christian, it does appear this particular wheel
has already been thoroughly invented. I'll be recommending passlib for
use by others in the future and look into adopting it for my own
projects.

However, password hashing is an important and common enough problem
that it would be good to have some basic level of support in the
standard library, with a clear migration path to a more feature
complete approach like passlib.

It would be good if someone was willing to do the work of raising this
discussion with the passlib authors, and looking to see if a suitably
stable core could be extracted that is API compatible with passlib,
and could be proposed as a standard library addition for 3.4.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From techtonik at  Mon Jun 11 12:31:50 2012
From: techtonik at (anatoly techtonik)
Date: Mon, 11 Jun 2012 13:31:50 +0300
Subject: [Python-ideas] Isolated (?transactional) exec (?subinterpreter)
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 8, 2012 at 4:00 PM, Amaury Forgeot d'Arc <amauryfa at> wrote:
> 2012/6/8 anatoly techtonik <techtonik at>
>> Optionally isolated from parent environment:
>>  - a feature to execute user script in a snapshot of current
>> environment and have
>>    a choice whenever to merge its modifications back to environment or not
> It would be a really interesting feature, but seems very difficult to
> implement.
> Do you have the slightest idea how this would work?
> What about global state, environment variables, threads, and all kinds of
> side-effects?
> Or are you thinking about a solution based on the multiprocessing module?

For my original user story both approaches will suffice. I've never
used multiprocessing (mostly because it is only available from 2.6 on),
and it looks like it is capable of doing what I want with some tweaks. But
the first approach, with fine-grained environment control (object space,
state, memory), will be more beneficial for Python, as it can bring a nice
research methodology for interpreter improvements (and hopefully some
pictures). Two things are required:

 1. Execution rollback
 2. Scope control

Execution rollback (or transaction) can be either "save the state and
restore" or "keep track of changes and discard". At the lowest possible
level it is something like using memory copy-on-write while Python
bytecode modifies it, and discarding the copied stuff at the end. As
in the OSI model for networking, this low-level memory and code
abstraction is the 1st layer.

But you're absolutely right about global state, environment variables,
threads and other stuff - when we jump to a higher layer, rolling
back the execution pointer to a saved checkpoint and discarding memory
will not be enough. We need to ensure that the reverted operation did not
affect the state of the system outside the execution scope. "Scope
control" means that every pathway by which execution can alter global
state outside needs to be carefully recorded and classified. It will
then be possible to detect "escaped" transactions automatically and
determine whether the operation is safe to revert or not.
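
A very crude sketch of the "save the state and restore" flavour, just to
show the shape of the idea. It only covers the first layer: the script runs
against a deep copy of a plain namespace dict, and nothing outside that
dict (files, environment variables, threads, modules) is tracked or rolled
back. run_isolated is a hypothetical name, and copy.deepcopy will refuse
many real-world objects, so this is an illustration rather than a design.

```python
import copy


def run_isolated(source, environment, merge=False):
    """Execute source in a snapshot of environment.

    Returns the modified snapshot; the original dict is only touched
    when merge is true.
    """
    snapshot = copy.deepcopy(environment)
    exec(source, snapshot)
    snapshot.pop("__builtins__", None)  # added by exec(), not user state
    if merge:
        environment.clear()
        environment.update(snapshot)
    return snapshot


env = {"items": [1, 2]}
discarded = run_isolated("items.append(3); count = len(items)", env)
untouched = dict(env)                      # still {"items": [1, 2]}
run_isolated("items.append(3)", env, merge=True)
```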

From lists at  Mon Jun 11 15:41:18 2012
From: lists at (Christian Heimes)
Date: Mon, 11 Jun 2012 15:41:18 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

Am 11.06.2012 12:03, schrieb Nick Coghlan:
> Thanks for the link Christian, it does appear this particular wheel
> has already been thoroughly invented. I'll be recommending passlib for
> use by others in the future and look into adopting it for my own
> projects.

You are welcome! I've been using passlib for about two years and really
like its API. PyPI surprises now and then with its hidden gems. I wish we
had a way to draw more attention to good solutions, something like
"officially endorsed projects" or so.

> However, password hashing is an important and common enough problem
> that it would be good to have some basic level of support in the
> standard library, with a clear migration path to a more feature
> complete approach like passlib.
> It would be good if someone was willing to do the work of raising this
> discussion with the passlib authors, and looking to see if a suitably
> stable core could be extracted that is API compatible with passlib,
> and could be proposed as a standard library addition for 3.4.

That's a nice idea, Nick! I've added one of the two core developers of
passlib to the CC list. The other one doesn't have his/her email address
exposed on Google Code.

A stripped down and API compatible version of passlib would make a good
addition for Python's standard library. IMHO the complete passlib
package is too big for the core. The context API and handlers for
bcrypt, pbkdf2 and sha*_crypt are sufficient. Developers can still
install passlib if they need all features.

We need to come up with a different name (passhash ?) for the stdlib


From jimjjewett at  Mon Jun 11 16:21:29 2012
From: jimjjewett at (Jim Jewett)
Date: Mon, 11 Jun 2012 10:21:29 -0400
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jun 7, 2012 at 5:00 PM, Mike Meyer <mwm at> wrote:
> On Thu, Jun 7, 2012 at 4:48 PM, Rurpy <rurpy at> wrote:
>> I suspect the vast majority of
>> programmers are interested in a language that allows
>> them to *effectively* get done what they need to,


The problem is that your use case gets hit by several special cases at once.

Usually, you don't need to worry about encodings at all; the default
is sufficient.  Obviously not the case for you.

Usually, the answer is just to open a file (or stream) the way you
want to.  sys.stdout is special because you don't open it.

If you do want to change sys.stdout, usually the answer is to replace
it with a different object.  Apparently (though I missed the reason
why) that doesn't work for you, and you need to keep using the same
underlying stream.

So at that point, replacing it with a wrapped version of itself
probably *is* the simplest solution.

The remaining problem is how to find the least bad way of doing that.
Your solution does work.  Adding it as an example to the docs would
probably be reasonable, but someone seems to have worked pretty hard
at keeping the sys module documentation short.  I could personally
support a wrap function on the sys.std* streams that took care of
flushing before wrapping, but ... there is a cost, in that the API
gets longer, and therefore harder to learn.
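
A rough sketch of what such a wrap helper could look like, using only
existing io calls (flush, detach, TextIOWrapper). The name rewrap is
hypothetical; nothing like it exists in the sys module.

```python
import io

def rewrap(stream, encoding, errors="strict"):
    """Return a new text wrapper over the same underlying binary stream."""
    stream.flush()                          # push out text already written
    line_buffering = stream.line_buffering
    binary = stream.detach()                # the old wrapper is unusable after this
    return io.TextIOWrapper(binary, encoding=encoding, errors=errors,
                            line_buffering=line_buffering)

# Demo on an in-memory stream instead of the real sys.stdout.
demo = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
demo.write("abc")
demo = rewrap(demo, "utf-8")
demo.write("\u00e9")
demo.flush()
```

For the real stream the call would be sys.stdout = rewrap(sys.stdout, "utf-8").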

> or applications
> outside of those built for your system that have a "--encoding" type
> flag?

There are plenty of applications with an encoding flag; I'm not sure
how often it applies to sys.std*, as opposed to named files.


From rurpy at  Mon Jun 11 16:42:46 2012
From: rurpy at (Rurpy)
Date: Mon, 11 Jun 2012 07:42:46 -0700 (PDT)
Subject: [Python-ideas] TextIOWrapper callable encoding parameter
Message-ID: <>

Here is another issue that came up in my ongoing
adventure porting to Python3...

Executive summary:
==================

There is no good way to read a text file when the 
encoding has to be determined by reading the start
of the file.  A long-winded version of that follows.
Scroll down to the "Proposal" section to skip it.


Problem:
========

When one opens a text file for reading, one must specify
(explicitly or by default) an encoding which Python will
use to convert the raw bytes read into Python strings.  
This means one must know the encoding of a file before 
opening it, which is usually the case, but not always.

Plain text files have no meta-data giving their encoding 
so sometimes it may not be known and some of the file must 
be read and a guess made.  Other data like html pages, xml 
files or python source code have encoding information inside 
them, but that too requires reading the start of the file 
without knowing the encoding in advance.

I see three ways in general in Python 3 currently to attack 
this problem, but each has some severe drawbacks:

1.  The most straightforward way to handle this is to open
the file twice, first in binary mode or with latin1 encoding
and again in text mode after the encoding has been determined.
This of course has a performance cost since the data is read
twice.  Further, it can't be used if the data source is
a pipe, socket or other non-rewindable source.  This
includes sys.stdin when it comes from a pipe.

2.  Alternatively, with a little more expertise, one can rewrap 
the open binary stream in a TextIOWrapper to avoid a second
OS file open.  The standard library's 
function does this:

    def open(filename):
        buffer =, 'rb')
        encoding, lines = detect_encoding(buffer.readline)
        text = TextIOWrapper(buffer, encoding, line_buffering=True)
        text.mode = 'r'
        return text

This too seems to read the data twice and of course the 
seek(0) prevents this method also from being usable with
pipes, sockets and other non-seekable sources.
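
A runnable sketch of this rewrap-and-seek pattern, assuming the builtin
open() for the binary open and tokenize.detect_encoding() for detection.
The name open_detected is hypothetical; this is not a copy of the library
function referred to above.

```python
import io
import os
import tempfile
import tokenize

def open_detected(filename):
    """Open a Python source file as text using its declared encoding."""
    buffer = open(filename, "rb")
    try:
        # detect_encoding() consumes up to two lines from the stream.
        encoding, _lines = tokenize.detect_encoding(buffer.readline)
        buffer.seek(0)   # rewind; this is the step that fails on pipes
        return io.TextIOWrapper(buffer, encoding, line_buffering=True)
    except BaseException:
        buffer.close()
        raise

# Demo on a throwaway latin-1 encoded source file.
fd, path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as f:
    f.write("# -*- coding: latin-1 -*-\nname = 'caf\u00e9'\n".encode("latin-1"))
with open_detected(path) as text:
    content = text.read()
os.remove(path)
```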

3.  Another method is to simply leave the file open in 
binary mode, read bytes data, and manually decode it to 
text.  This seems to be the only option when reading from 
non-rewindable sources like pipes and sockets, etc.
But then one loses all the advantages of having
a text stream even though one wants to be reading text!
And if one tries to hide this, one ends up reimplementing
a good part of TextIOWrapper!
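
To make the cost of this third approach concrete, here is a small sketch
of manual decoding with an incremental decoder from the codecs module.
The helper name decode_chunks and the chunk size are assumptions for
illustration. Note what is missing compared with a text stream: no
readline(), no newline translation, no tell().

```python
import codecs
import io

def decode_chunks(binary, encoding, chunk_size=4096):
    """Yield text decoded incrementally from a binary stream."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    while True:
        chunk = binary.read(chunk_size)
        if not chunk:
            break
        text = decoder.decode(chunk)      # may hold back a partial character
        if text:
            yield text
    tail = decoder.decode(b"", final=True)  # raises if input ended mid-character
    if tail:
        yield tail

source = io.BytesIO("\u65e5\u672c\u8a9e\u306e\u30c6\u30ad\u30b9\u30c8\n".encode("utf-8"))
# A tiny chunk size forces multi-byte characters to be split across reads.
decoded = "".join(decode_chunks(source, "utf-8", chunk_size=4))
```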

I believe these problems could be addressed with a fairly 
simple and clean modification of the io.TextIOWrapper

The following is a logical description; I don't mean to 
imply that the code must follow this outline exactly.
It is based on looking at _pyio;  I hope the C code is
equivalent.

1. Allow io.TextIOWrapper's encoding parameter to be a
 callable object in addition to a string or None.

2. In __init__(), if the encoding parameter was callable, 
 record it as an encoding hook and leave encoding set to
 None.

3. The places in io.TextIOWrapper that currently read
 undecoded data from the internal buffer object and decode
 it (only methods read() and read_chunk() I think) would
 be modified to do so in this way:

4. Read data from the buffer object as is done now.

5. If the encoding has been set, get a decoder if necessary
 and continue on as usual.

6. If the encoding is None, call the encoding callable
 with the data just read and the buffer object.

7. The callable will examine the data, possibly using the
 buffer object's peek method to look further ahead in the
 file.  It returns the name of an encoding.

8. io.TextIOWrapper will get the encoding and record it,
 and setup the decoder the same way as if the encoding name
 had been received as a parameter, decode the read data and
 continue on as usual.

9. In other non-read paths where encoding needs to be known,
 raise an error if it is still None.
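
The hook protocol in steps 6 to 8 can be approximated today without
modifying TextIOWrapper, by peeking at a BufferedReader before wrapping
it. This is an eager stand-in for the lazy behaviour proposed above, not
the proposed change itself. The names wrap_with_hook and coding_hook are
hypothetical, and the regular expression is an assumed pattern modelled
on PEP 263.

```python
import io
import re

# Assumed pattern, modelled on PEP 263 coding declarations.
CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")

def coding_hook(data, buffer):
    """Encoding hook: find a coding declaration in the first two lines."""
    for line in data.splitlines()[:2]:
        match = CODING_RE.search(line)
        if match:
            return match.group(1).decode("ascii")
    raise LookupError("unable to determine encoding")

def wrap_with_hook(buffer, hook, **kwargs):
    """Call `hook` with look-ahead data, then wrap `buffer` as text."""
    data = buffer.peek(1)             # look ahead without consuming bytes
    encoding = hook(data, buffer)
    return io.TextIOWrapper(buffer, encoding=encoding, **kwargs)

payload = "# -*- coding: shift_jis -*-\nprint('\u65e5\u672c\u8a9e')\n".encode("shift_jis")
stream = wrap_with_hook(io.BufferedReader(io.BytesIO(payload)), coding_hook)
text = stream.read()
```

Because peek() does not advance the stream, no seek is needed, so the
same approach should also work on a pipe wrapped in a BufferedReader.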

Were io.TextIOWrapper modified this way, it would offer:

* Better performance since there is no need to reread data

* Read data is decoded after being examined so the stream
 is usable with serial datasources like pipes, sockets, etc.

* User code is simplified and clearer; there is better
 separation of concerns.  For example, the code in the 
 "Problem" section could be written:

    stream = open(filename, encoding=detect_encoding)
    ...
    def detect_encoding (data, buffer):
        # This is still basically the same function as
        # in the code in the "Problem" section.
        ... look for Python coding declaration in
            first two lines of the 'data' bytes object.
        if not found_encoding:
           raise Error ("unable to determine encoding")
        return found_encoding

I have modified a copy of the _pyio module as described and
the changes required seemed unsurprising and relatively
few, though I am sure there are subtleties and other
considerations I am missing.  Hence this post seeking
feedback...

From rurpy at  Mon Jun 11 17:06:18 2012
From: rurpy at (Rurpy)
Date: Mon, 11 Jun 2012 08:06:18 -0700 (PDT)
Subject: [Python-ideas] TextIOWrapper callable encoding parameter
Message-ID: <>

As a followup, here are some timing data that seem to confirm
a modest increase in speed as a result of implementing the
callable encoding parameter I proposed (although that would 
not be the main reason for wanting to do it.)  These are just
for illustration.  (Among many other reasons, _pyio benchmarks
are not very useful.)

I read four short test files using four methods for determining 
the test file's encoding.  The test files are a simplified model 
of a python coding declaration (always on first line in our case 
with no BOM present [*1]) followed by mixed English and Japanese

Method 0 (reopen0): 
Use the encoding callable I am proposing.

    def reopen0 (fname):
        def hook (data,buf):
            return get_encoding (data)
        t = (fname, encoding=hook)

Method 1 (reopen1):
Open in binary to determine encoding, then rewrap in a 
TextIOWrapper with the correct encoding.

    def reopen1 (fname):
        b = (fname, 'rb')
        line = b.readline()
        enc = get_encoding (line) (0)
        t = io.TextIOWrapper (b, enc, line_buffering=True)
        t.mode = 'r'

Method 2 (reopen2):
Open in binary to determine encoding, then reopen in text mode
with correct encoding.

    def reopen2 (fname):
        b = (fname, 'rb')
        line = b.readline()
        enc = get_encoding (line)
        t = (fname, encoding=enc)

Method 3 (reopen3):
Open in text mode (latin1) to determine encoding, then reopen
in text mode with correct encoding.

    def reopen3 (fname):
        f = (fname, encoding='latin1')
        line = f.readline()
        enc = get_encoding (line)
        t = (fname, encoding=enc)

The same get_encoding() function is used in all methods [*1].

The input test data are all small files (because we want
to measure encoding detection, not how fast read() runs.)
Each has a python/emacs coding declaration in the first line.

test.utf8 -- Tiny python program with coding declaration 
  and single print statement in main() function that prints
  a short word (literal) in Japanese.  Encoding is utf-8
  (122 bytes).
test.sjis -- Identical to test.utf8 but sjis encoding
  (111 bytes).
test2.utf8 -- A python coding declaration followed by 
  approximately 50 long lines with mixed English and
 Japanese (4274 bytes).
test2.sjis -- Identical to test2.utf8 but sjis encoding
 (3401 bytes).

$ python3 test.utf8
test.utf8 / reopen0: total time (10000 reps) was 1.188323
test.utf8 / reopen1: total time (10000 reps) was 1.490757
test.utf8 / reopen2: total time (10000 reps) was 1.766081
test.utf8 / reopen3: total time (10000 reps) was 2.141996
$ python3 test.sjis
test.sjis / reopen0: total time (10000 reps) was 1.175914
test.sjis / reopen1: total time (10000 reps) was 1.471780
test.sjis / reopen2: total time (10000 reps) was 1.764444
test.sjis / reopen3: total time (10000 reps) was 2.122550
$ python3 test2.utf8
test2.utf8 / reopen0: total time (10000 reps) was 1.690255
test2.utf8 / reopen1: total time (10000 reps) was 1.996235
test2.utf8 / reopen2: total time (10000 reps) was 2.278798
test2.utf8 / reopen3: total time (10000 reps) was 2.727867
$ python3 test2.sjis
test2.sjis / reopen0: total time (10000 reps) was 1.841388
test2.sjis / reopen1: total time (10000 reps) was 2.147142
test2.sjis / reopen2: total time (10000 reps) was 2.426701
test2.sjis / reopen3: total time (10000 reps) was 2.873278
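
For a quick sense of scale, the test.utf8 figures above can be
normalised against reopen0. The numbers are copied from that run; the
variable names are only for illustration.

```python
# Seconds per 10000 reps, copied from the test.utf8 run above.
timings = {
    "reopen0": 1.188323,
    "reopen1": 1.490757,
    "reopen2": 1.766081,
    "reopen3": 2.141996,
}

baseline = timings["reopen0"]
relative = {name: seconds / baseline for name, seconds in timings.items()}

for name in sorted(relative):
    print("%s: %.2fx the time of reopen0" % (name, relative[name]))
```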

Here is what happens when a test data file is piped
into a program using the four methods above:

  $ cat test.utf8 | python3 reopen0
  read 102 characters

  $ cat test.utf8 | python3 reopen1
  got exception: [Errno 29] Illegal seek

  $ cat test.utf8 | python3 reopen2
  read 0 characters

  $ cat test.utf8 | python3 reopen3
  read 0 characters

[*1] Here is the get_encoding function used above.  It is 
a toy simplified python source encoding line reader.  Toy,
in that it looks at only one line, doesn't consider a BOM,
etc.  Its purpose was to allow me to sanity check the benefits
of having a callable encoding parameter.

    def get_encoding (line):
        if isinstance (line, bytes):
            nlpos = line.index(b'\n')
            mo = (line, 0, nlpos)
            if not mo: return None
            enc = ('latin1')
            nlpos = line.index('\n')
            mo = (line, 0, nlpos)
            if not mo: return None
            enc =
        return enc
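
Since parts of the listing above were lost in the archive, here is a
self-contained sketch of a function with the same shape. The regular
expression is an assumption modelled on PEP 263, not the pattern from
the original post, and a missing newline is tolerated instead of raising.

```python
import re

# Assumed pattern, modelled on PEP 263; the original pattern is not known.
_CODING = r"[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)"
_CODING_BYTES = re.compile(_CODING.encode("ascii"))
_CODING_STR = re.compile(_CODING)

def get_encoding(line):
    """Return the encoding named in a first-line coding declaration, or None."""
    if isinstance(line, bytes):
        end = line.find(b"\n")
        mo = _CODING_BYTES.match(line, 0, len(line) if end < 0 else end)
        return mo.group(1).decode("latin1") if mo else None
    end = line.find("\n")
    mo = _CODING_STR.match(line, 0, len(line) if end < 0 else end)
    return mo.group(1) if mo else None
```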

From ncoghlan at  Mon Jun 11 17:10:47 2012
From: ncoghlan at (Nick Coghlan)
Date: Tue, 12 Jun 2012 01:10:47 +1000
Subject: [Python-ideas] TextIOWrapper callable encoding parameter
In-Reply-To: <>
References: <>
Message-ID: <>

Immediate thought: it seems like it would be easier to offer a way to
inject data back into a buffered IO object's internal buffer.

Sent from my phone, thus the relative brevity :)
On Jun 12, 2012 12:43 AM, "Rurpy" <rurpy at> wrote:

> Here is another issue that came up in my ongoing
> adventure porting to Python3...
> Executive summary:
> ==================
> There is no good way to read a text file when the
> encoding has to be determined by reading the start
> of the file.  A long-winded version of that follows.
> Scroll down to the "Proposal" section to skip it.
> Problem:
> ========
> When one opens a text file for reading, one must specify
> (explicitly or by default) an encoding which Python will
> use to convert the raw bytes read into Python strings.
> This means one must know the encoding of a file before
> opening it, which is usually the case, but not always.
> Plain text files have no meta-data giving their encoding
> so sometimes it may not be known and some of the file must
> be read and a guess made.  Other data like html pages, xml
> files or python source code have encoding information inside
> them, but that too requires reading the start of the file
> without knowing the encoding in advance.
> I see three ways in general in Python 3 currently to attack
> this problem, but each has some severe drawbacks:
> 1.  The most straightforward way to handle this is to open
> the file twice, first in binary mode or with latin1 encoding
> and again in text mode after the encoding has been determined.
> This of course has a performance cost since the data is read
> twice.  Further, it can't be used if the data source is
> a pipe, socket or other non-rewindable source.  This
> includes sys.stdin when it comes from a pipe.
> 2.  Alternatively, with a little more expertise, one can rewrap
> the open binary stream in a TextIOWrapper to avoid a second
> OS file open.  The standard library's
> function does this:
>    def open(filename):
>        buffer =, 'rb')
>        encoding, lines = detect_encoding(buffer.readline)
>        text = TextIOWrapper(buffer, encoding, line_buffering=True)
>        text.mode = 'r'
>        return text
> This too seems to read the data twice and of course the
> seek(0) prevents this method also from being usable with
> pipes, sockets and other non-seekable sources.
> 3.  Another method is to simply leave the file open in
> binary mode, read bytes data, and manually decode it to
> text.  This seems to be the only option when reading from
> non-rewindable sources like pipes and sockets, etc.
> But then one loses all the advantages of having
> a text stream even though one wants to be reading text!
> And if one tries to hide this, one ends up reimplementing
> a good part of TextIOWrapper!
> I believe these problems could be addressed with a fairly
> simple and clean modification of the io.TextIOWrapper
> class...
> Proposal
> ========
> The following is a logical description; I don't mean to
> imply that the code must follow this outline exactly.
> It is based on looking at _pyio;  I hope the C code is
> equivalent.
> 1. Allow io.TextIOWrapper's encoding parameter to be a
>  callable object in addition to a string or None.
> 2. In __init__(), if the encoding parameter was callable,
>  record it as an encoding hook and leave encoding set to
>  None.
> 3. The places in io.TextIOWrapper that currently read
>  undecoded data from the internal buffer object and decode
>  it (only methods read() and read_chunk() I think) would
>  be modified to do so in this way:
> 4. Read data from the buffer object as is done now.
> 5. If the encoding has been set, get a decoder if necessary
>  and continue on as usual.
> 6. If the encoding is None, call the encoding callable
>  with the data just read and the buffer object.
> 7. The callable will examine the data, possibly using the
>  buffer object's peek method to look further ahead in the
>  file.  It returns the name of an encoding.
> 8. io.TextIOWrapper will get the encoding and record it,
>  and setup the decoder the same way as if the encoding name
>  had been received as a parameter, decode the read data and
>  continue on as usual.
> 9. In other non-read paths where encoding needs to be known,
>  raise an error if it is still None.
> Were io.TextIOWrapper modified this way, it would offer:
> * Better performance since there is no need to reread data
> * Read data is decoded after being examined so the stream
>  is usable with serial datasources like pipes, sockets, etc.
> * User code is simplified and clearer; there is better
>  separation of concerns.  For example, the code in the
>  "Problem" section could be written:
>    stream = open(filename, encoding=detect_encoding)
>    ...
>    def detect_encoding (data, buffer):
>        # This is still basically the same function as
>        # in the code in the "Problem" section.
>        ... look for Python coding declaration in
>            first two lines of the 'data' bytes object.
>        if not found_encoding:
>           raise Error ("unable to determine encoding")
>        return found_encoding
> I have modified a copy of the _pyio module as described and
> the changes required seemed unsurprising and relatively
> few, though I am sure there are subtleties and other
> considerations I am missing.  Hence this post seeking
> feedback...

From ericsnowcurrently at  Mon Jun 11 17:11:50 2012
From: ericsnowcurrently at (Eric Snow)
Date: Mon, 11 Jun 2012 09:11:50 -0600
Subject: [Python-ideas] TextIOWrapper callable encoding parameter
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 11, 2012 at 8:42 AM, Rurpy <rurpy at> wrote:
> Here is another issue that came up in my ongoing
> adventure porting to Python3...
> Executive summary:
> ==================
> There is no good way to read a text file when the
> encoding has to be determined by reading the start
> of the file.  A long-winded version of that follows.
> Scroll down to the "Proposal" section to skip it.

FWIW, the import system does an encoding check on Python source files
that is somewhat related.  See


From stephen at  Mon Jun 11 18:24:20 2012
From: stephen at (Stephen J. Turnbull)
Date: Tue, 12 Jun 2012 01:24:20 +0900
Subject: [Python-ideas] TextIOWrapper callable encoding parameter
In-Reply-To: <>
References: <>
Message-ID: <>

Nick Coghlan writes:

 > Immediate thought: it seems like it would be easier to offer a way to
 > inject data back into a buffered IO object's internal buffer.


If you're only interested in the top of the file (see below), I would
suggest allowing only one bufferful, and then simply rewinding the
buffer pointer once you're done.  This is one strategy used by Emacsen
for encoding detection (for the reason pointed out by Rurpy: not all
streams are rewindable).

But is that really "easier"?  It might be more general, but you still
need to reinitialize the encoding (ie, from the trivial "binary" to
whatever is detected), with all the hair that comes with that.
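
As an illustration of the "inject data back" idea without touching the
buffered object's internals, the already-read bytes can be replayed by a
small raw stream placed in front of the real source. PushbackReader is a
hypothetical name; the io module has no such class. The encoding still
has to be chosen before the text wrapper is built, which is the
reinitialisation cost mentioned above.

```python
import io

class PushbackReader(io.RawIOBase):
    """Raw stream that replays already-read bytes before the real source."""

    def __init__(self, raw, pushed=b""):
        super().__init__()
        self._raw = raw
        self._pushed = bytearray(pushed)

    def readable(self):
        return True

    def readinto(self, b):
        if self._pushed:
            # Serve the pushed-back bytes first.
            n = min(len(b), len(self._pushed))
            b[:n] = self._pushed[:n]
            del self._pushed[:n]
            return n
        return self._raw.readinto(b)

source = io.BytesIO("# coding: shift_jis\n\u3053\u3093\u306b\u3061\u306f\n".encode("shift_jis"))
head = source.read(19)            # bytes consumed while sniffing the encoding
replayed = io.BufferedReader(PushbackReader(source, head))
text = io.TextIOWrapper(replayed, encoding="shift_jis").read()
```

No seek is involved, so the same wrapper would work when the source is a
pipe or socket.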

 > > Executive summary:
 > > ==================
 > >
 > > There is no good way to read a text file when the
 > > encoding has to be determined by reading the start
 > > of the file.  A long-winded version of that follows.
 > > Scroll down the the "Proposal" section to skip it.

This may be insufficiently general.  Specifically, both Emacsen and vi
allow specification of editor configuration variables at the bottom of
the file as well as the top.  I don't know whether vi allows encoding
specs at the bottom, but Emacsen do (but only for files).

I wouldn't recommend paying much attention to what Emacsen actually
*do* when initializing a stream (it's, uh, "baroque").

From guido at  Mon Jun 11 18:49:45 2012
From: guido at (Guido van Rossum)
Date: Mon, 11 Jun 2012 09:49:45 -0700
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 11, 2012 at 3:03 AM, Nick Coghlan <ncoghlan at> wrote:
> However, password hashing is an important and common enough problem
> that it would be good to have some basic level of support in the
> standard library, with a clear migration path to a more feature
> complete approach like passlib.

I usually like this approach, but here I am hesitant, because of the
cost if the basic approach is found inadequate. The stdlib support
should either be state-of-the art or so poor that people are naturally
driven to a state-of-the art alternative on PyPI that is maintained
regularly. In this case I think our only option is the latter. I do
think it is another example of a situation where the stdlib docs ought
to contain some hints about where to go instead for this
functionality.

--Guido van Rossum (

From masklinn at  Mon Jun 11 22:08:11 2012
From: masklinn at (Masklinn)
Date: Mon, 11 Jun 2012 22:08:11 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On 2012-06-11, at 18:49 , Guido van Rossum wrote:

> On Mon, Jun 11, 2012 at 3:03 AM, Nick Coghlan <ncoghlan at> wrote:
>> However, password hashing is an important and common enough problem
>> that it would be good to have some basic level of support in the
>> standard library, with a clear migration path to a more feature
>> complete approach like passlib.
> I usually like this approach, but here I am hesitant, because of the
> cost if the basic approach is found inadequate. The stdlib support
> should either be state-of-the art

Well depends what you mean by "state of the art", PBKDF2 is still the
"tried and true" trusted password-hashing algorithm (it's the one used
in TrueCrypt, 1Password, WPA2, DPAPI and many others). bcrypt is the
"old newness", working on the same principle as PBKDF2 (do lots of work)
but a different underlying algorithm, and scrypt is the "new newness" as
it includes being memory-hard on top of being processing-hard, but is
significantly less trusted as it's only a few years old.

So as far as I know, PBKDF2 is indeed "state of the art", scrypt is
"bleeding edge" and bcrypt is somewhere in-between[0] (but if PBKDF2 is
found to be insufficient, bcrypt will fall for similar reasons: it's
only binding on CPU power and is easy to parallelize). Ulrich Drepper
also built an MD5crypt-inspired crypt based on SHA2 (and fixed a few
weak ideas of MD5crypt[1]) a few years ago.

As a matter of fact, passlib notes PBKDF2/SHA512 as one of its three
recommendations (alongside bcrypt and sha512_crypt) and notes it is the
most portable of three roughly equivalent choices[2] (and that
sha512_crypt is somewhat baroque and harder to analyze for flaws than
the alternatives).
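
For readers who want to see the "do lots of work" principle in code:
hashlib.pbkdf2_hmac, which was added to the standard library later (in
Python 3.4), exposes PBKDF2 directly. A minimal sketch follows; the
iteration count is an illustrative assumption, not a recommendation, and
the function names are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100000  # illustrative work factor (an assumption); tune per deployment

def hash_password(password, salt=None, iterations=ITERATIONS):
    """Derive a PBKDF2-HMAC-SHA512 digest; returns (salt, iterations, digest)."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"),
                                 salt, iterations)
    return salt, iterations, digest

def verify_password(password, salt, iterations, expected):
    """Recompute the digest and compare in constant time."""
    _, _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

Raising the iteration count makes every guess proportionally more
expensive for an attacker, which is what separates a password hash from
a plain cryptographic hash.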

> or so poor that people are naturally
> driven to a state-of-the art alternative on PyPI that is maintained
> regularly. In this case I think our only option is the latter. I do
> think it is another example of a situation where the stdlib docs ought
> to contain some hints about where to go instead for this
> functionality.

The issue with this idea is that people are *not* driven to
state-of-the-art alternatives because they don't understand or know the
issue. And as a result, as we've seen last week, they'll use
cryptographic hashes (with or without salts) even though those are
insufficient, because that's available and they read on the internet
that it was what they needed.

And how are you going to make people understand there's a difference
between a cryptographic hash and a password hash by doing nothing,
giving them cryptographic hashes and leaving them to their own devices?

[0] and beyond the bleeding edge lies ubiquitous 2-factor auth,
    probably.
[1] MD5crypt can not use adaptive load factors and injects constant
    data at some points, it also allows longer salts.

From guido at  Mon Jun 11 22:21:07 2012
From: guido at (Guido van Rossum)
Date: Mon, 11 Jun 2012 13:21:07 -0700
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 11, 2012 at 1:08 PM, Masklinn <masklinn at> wrote:
> On 2012-06-11, at 18:49 , Guido van Rossum wrote:
>> On Mon, Jun 11, 2012 at 3:03 AM, Nick Coghlan <ncoghlan at> wrote:
>>> However, password hashing is an important and common enough problem
>>> that it would be good to have some basic level of support in the
>>> standard library, with a clear migration path to a more feature
>>> complete approach like passlib.
>> I usually like this approach, but here I am hesitant, because of the
>> cost if the basic approach is found inadequate. The stdlib support
>> should either be state-of-the art
> Well depends what you mean by "state of the art", PBKDF2 is still the
> "tried and true" trusted password-hashing algorithm (it's the one used
> in TrueCrypt, 1Password, WPA2, DPAPI and many others). bcrypt is the
> "old newness", working on the same principle as PBKDF2 (do lots of work)
> but a different underlying algorithm, and scrypt is the "new newness" as
> it includes being memory-hard on top of being processing-hard, but is
> significantly less trusted as it's only a few years old.
> So as far as I know, PBKDF2 is indeed "state of the art", scrypt is
> "bleeding edge" and bcrypt is somewhere in-between[0] (but if PBKDF2 is
> found to be insufficient, bcrypt will fall for similar reasons: it's
> only binding on CPU power and is easy to parallelize). Ulrich Drepper
> also built an MD5crypt-inspired crypt based on SHA2 (and fixed a few
> weak ideas of MD5crypt[1]) a few years ago.
> As a matter of fact, passlib notes PBKDF2/SHA512 as one of its three
> recommendations (alongside bcrypt and sha512_crypt) and notes it is the
> most portable of three roughly equivalent choices[2] (and that
> sha512_crypt is somewhat baroque and harder to analyze for flaws than
> the alternatives).
>> or so poor that people are naturally
>> driven to a state-of-the art alternative on PyPI that is maintained
>> regularly. In this case I think our only option is the latter. I do
>> think it is another example of a situation where the stdlib docs ought
>> to contain some hints about where to go instead for this
>> functionality.
> The issue with this idea is that people are *not* driven to
> state-of-the-art alternatives because they don't understand or know the
> issue. And as a result, as we've seen last week, they'll use
> cryptographic hashes (with or without salts) even though those are
> insufficient, because that's available and they read on the internet
> that it was what they needed.

Is there any indication that Python was involved in last week's
incidents? (I'm only aware of the Linkedin one -- were there others?)

> And how are you going to make people understand there's a difference
> between a cryptographic hash and a password hash by doing nothing,
> giving them cryptographic hashes and leaving them to their own devices?

Do you really think that including some API in the stdlib is going to
make a difference in education? And what would we do if in 2 years
time the stdlib's "basic functionality" were somehow compromised (not
due to a bug in Python's implementation but simply through some
advance in the crypto world) -- how would we get everyone who relied
on the stdlib to switch to a different algorithm? I really think that
the right approach here is to get *everyone* who needs this to use a
3rd party library. Diversity is very good here!

> [0] and beyond the bleeding edge lies ubiquitous 2-factor auth,
>     probably.
> [1] MD5crypt can not use adaptive load factors and injects constant
>     data at some points, it also allows longer salts.
> [2]

TBH it's possible that I'm not sufficiently familiar with the issue to
have a valid opinion here -- I would never dream of taking on the
responsibility of password security for anything, since I don't have
the right crypto hacker mindset. But I do worry about having
attractive suboptimal solutions to common security problems in the

--Guido van Rossum (

From lists at  Mon Jun 11 22:39:32 2012
From: lists at (Christian Heimes)
Date: Mon, 11 Jun 2012 22:39:32 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <jr5l24$i2l$>

Am 11.06.2012 22:21, schrieb Guido van Rossum:
> Is there any indication that Python was involved in last week's
> incidents? (I'm only aware of the Linkedin one -- were there others?)

No, zero Pythons were harmed. The other victims were and
eHarmony. Surprisingly, Sony wasn't hacked last week! *scnr*

> Do you really think that including some API in the stdlib is going to
> make a difference in education? And what would we do if in 2 years
> time the stdlib's "basic functionality" were somehow compromised (not
> due to a bug in Python's implementation but simply through some
> advance in the crypto world) -- how would we get everyone who relied
> on the stdlib to switch to a different algorithm? I really think that
> the right approach here is to get *everyone* who needs this to use a
> 3rd party library. Diversity is very good here!


I'm against adding just the password hashing algorithms. Developers can
easily screw up the right algorithm with an erroneous approach. It's the
beauty of passlib: The framework hides all the complex and
easy-to-get-wrong stuff behind a minimal API.


From ncoghlan at  Mon Jun 11 22:54:43 2012
From: ncoghlan at (Nick Coghlan)
Date: Tue, 12 Jun 2012 06:54:43 +1000
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jun 12, 2012 at 6:21 AM, Guido van Rossum <guido at> wrote:
> On Mon, Jun 11, 2012 at 1:08 PM, Masklinn <masklinn at> wrote:
>> The issue with this idea is that people are *not* driven to
>> state-of-the-art alternatives because they don't understand or know the
>> issue. And as a result, as we've seen last week, they'll use
>> cryptographic hashes (with or without salts) even though those are
>> insufficient, because that's available and they read on the internet
>> that it was what they needed.
> Is there any indication that Python was involved in last week's
> incidents? (I'm only aware of the Linkedin one -- were there others?)

eHarmony and were the other two prominent sites I saw mentioned.

We're not aware of any specific Python connection, it just prompted
the current discussion of whether or not there was anything CPython
could do to nudge developers in the right direction.

Even a native PBKDF2 would be an awful lot better than nothing.

>> And how are you going to make people understand there's a difference
>> between a cryptographic hash and a password hash by doing nothing,
>> giving them cryptographic hashes and leaving them to their own devices?
> Do you really think that including some API in the stdlib is going to
> make a difference in education? And what would we do if in 2 years
> time the stdlib's "basic functionality" were somehow compromised (not
> due to a bug in Python's implementation but simply through some
> advance in the crypto world) -- how would we get everyone who relied
> on the stdlib to switch to a different algorithm? I really think that
> the right approach here is to get *everyone* who needs this to use a
> 3rd party library. Diversity is very good here!

I think it's similar to the situation with hmac: for backwards
compatibility reasons, the default hash in hmac is still MD5. That
doesn't mean hmac is useless, and using MD5 is still better than doing
nothing. It's all about raising the bar for attackers, and the fact
that attackers are continually inventing better ladders and grappling
hooks doesn't mean the older walls become completely useless.
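
A minimal sketch of the hmac point above, assuming SHA-256 is an acceptable digest for the application: naming the digest explicitly means the code does not rely on a legacy default. The helper names sign() and verify() are made up for illustration.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    # Name the digest explicitly rather than relying on any default.
    return hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, signature: str) -> bool:
    # compare_digest avoids leaking where the first mismatch occurs.
    return hmac.compare_digest(sign(key, message), signature)
```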

However, I also think, with the right API design, we could allow for
the key derivation algorithms to be retuned in security releases,
*because* the state of the art evolves (and because computers get
faster). The passlib core APIs and hash formats are designed with
precisely that problem in mind.
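
A rough sketch of what "retunable" could look like, assuming a Python that has hashlib.pbkdf2_hmac (added in 3.4, after this thread). The storage format, the CURRENT_ROUNDS value and the function names are hypothetical and are not passlib's API or hash format; the point is only that the stored string records its own cost, so the cost can be raised later without invalidating old hashes.

```python
import base64
import hashlib
import hmac
import os

CURRENT_ROUNDS = 200_000  # illustrative cost; retune as hardware gets faster

def hash_password(password: str, rounds: int = CURRENT_ROUNDS) -> str:
    salt = os.urandom(16)
    dk = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)
    # Hypothetical storage format: scheme$rounds$salt$digest
    return "$".join([
        "pbkdf2-sha256",
        str(rounds),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(dk).decode("ascii"),
    ])

def verify_password(password: str, stored: str) -> bool:
    scheme, rounds, salt_b64, dk_b64 = stored.split("$")
    if scheme != "pbkdf2-sha256":
        raise ValueError("unknown scheme: " + scheme)
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(dk_b64)
    actual = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, int(rounds))
    return hmac.compare_digest(actual, expected)

def needs_rehash(stored: str) -> bool:
    # True when the stored hash was made with a weaker cost than today's.
    return int(stored.split("$")[1]) < CURRENT_ROUNDS
```

On a successful login, an application could check needs_rehash() and store a fresh hash at the current cost.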

>> [0] and beyond the bleeding edge lies ubiquitous 2-factor auth,
>>     probably.
>> [1] MD5crypt can not use adaptive load factors and injects constant
>>     data at some points, it also allows longer salts.
>> [2]
> TBH it's possible that I'm not sufficiently familiar with the issue to
> have a valid opinion here -- I would never dream of taking on the
> responsibility of password security for anything, since I don't have
> the right crypto hacker mindset. But I do worry about having
> attractive suboptimal solutions to common security problems in the
> stdlib.

The trick is that even a suboptimal solution is a whole lot better
than the next-to-nothing that many people do currently. At the moment,
the available approaches are:

1. store plaintext passwords (eek)
2. store hashed unsalted passwords (vulnerable to rainbow tables)
3. store hashed salted passwords (vulnerable to massively parallel
brute force attacks)
4. store tunable cost hashed salted passwords (reduces vulnerability
to brute force, currently requires a third party library)

Option 4 *is* the state of the art, it's just a matter of tinkering
with the key derivation algorithm in response to advances in crypto
improvements, as well as ramping up the tuning parameters over time to
account for Moore's law.
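
To make the difference between options 2, 3 and 4 concrete, here is a small sketch. It assumes hashlib.pbkdf2_hmac is available (Python 3.4 and later); the password and the round count are illustrative values only.

```python
import hashlib
import os

password = b"correct horse"

# Option 2: unsalted. Identical passwords give identical digests,
# which is what makes precomputed tables effective.
unsalted_a = hashlib.sha256(password).hexdigest()
unsalted_b = hashlib.sha256(password).hexdigest()

# Option 3: salted. Per-user salts defeat precomputation, but each
# guess still costs only a single fast hash.
salt_a, salt_b = os.urandom(16), os.urandom(16)
salted_a = hashlib.sha256(salt_a + password).hexdigest()
salted_b = hashlib.sha256(salt_b + password).hexdigest()

# Option 4: salted with a tunable cost. Each guess costs `rounds`
# HMAC iterations, and `rounds` can be raised over time.
rounds = 100_000  # illustrative value
stretched = hashlib.pbkdf2_hmac("sha256", password, salt_a, rounds)
```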

By making it as easy as possible for people to use Option 4 instead of
one of the first 3, we increase the odds of people doing the right
thing. A third party library like passlib can then focus on more
dynamic things like:

1. Providing API compatible interfaces to 3rd party key derivation
algorithms (e.g. bcrypt, or the accelerated PBKDF2 implementation in
M2crypto), as well as to newer ones like scrypt
2. Providing convenient interfaces for reading and writing 3rd party
hash storage formats (e.g. LDAP)


Nick Coghlan | ncoghlan at | Brisbane, Australia

From ncoghlan at  Mon Jun 11 23:00:27 2012
From: ncoghlan at (Nick Coghlan)
Date: Tue, 12 Jun 2012 07:00:27 +1000
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <jr5l24$i2l$>
References: <>
Message-ID: <>

On Tue, Jun 12, 2012 at 6:39 AM, Christian Heimes <lists at> wrote:
> Am 11.06.2012 22:21, schrieb Guido van Rossum:
>> Do you really think that including some API in the stdlib is going to
>> make a difference in education? And what would we do if in 2 years
>> time the stdlib's "basic functionality" were somehow compromised (not
>> due to a bug in Python's implementation but simply through some
>> advance in the crypto world) -- how would we get everyone who relied
>> on the stdlib to switch to a different algorithm? I really think that
>> the right approach here is to get *everyone* who needs this to use a
>> 3rd party library. Diversity is very good here!
> +1
> I'm against adding just the password hashing algorithms. Developers can
> easily screw up the right algorithm with an erroneous approach. It's the
> beauty of passlib: The framework hides all the complex and
> easy-to-get-wrong stuff behind a minimal API.

Right, when I suggested looking for an "API compatible stable core"
that could be added for 3.4, I was specifically thinking of:

1. The core CryptContext API
2. The PBKDF2 and sha512_crypt derivation functions

Based on a brief look at the module documentation, those parts seem
like they're sufficiently mature to be suitable for the stdlib,
whereas the rest of passlib is more suited to development as a 3rd
party library with its own release schedule.

However, I could be completely wrong, thus the suggestion that it be
looked into, rather than "we should definitely do this". At the very
least, we should be directing people towards passlib for password
storage and comparison purposes.


Nick Coghlan | ncoghlan at | Brisbane, Australia

From techtonik at  Tue Jun 12 11:04:58 2012
From: techtonik at (anatoly techtonik)
Date: Tue, 12 Jun 2012 12:04:58 +0300
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jun 2, 2012 at 8:24 PM, Calvin Spealman <ironfroggy at> wrote:
> On Fri, Jun 1, 2012 at 11:08 AM, anatoly techtonik <techtonik at> wrote:
>> On Tue, May 29, 2012 at 9:02 AM, Nick Coghlan <ncoghlan at> wrote:
>>> Once again, you're completely ignoring all existing knowledge and
>>> expertise on open collaboration and trying to reinvent the world. It's
>>> *not going to happen*.
>> It's too boring to live in a world of existing knowledge and
>> expertise,
> Frankly, this one fragment is enough to stop me reading further. Who
> wants to learn
> from the vast and broad experience when you could simply randomize the rules of
> reality through ignorance and stubbornness?

If everybody would think like this, the world will never learn about
anti-patterns, and the software craftsmanship collapsed in astonishing
agony some years ago. If it doesn't make it clear - it is not
randomizing - it is putting beliefs to the test asking for the current
status.

> I sound fickle, because I am.

It doesn't matter how do you sound, what matters is that you spoiled
the fun to discuss the technical part no matter how long ago it was
invented. If you have a lot of people who ask the same question -
create a FAQ.  That's not a vast and broad experience - that's just a
time proven practice from usenet times.

Come on, guys, what's wrong with you? It is just an idea, not a proposal
or scientific paper. And I am not a scientist - I just want to discuss
the idea, and I am not sending mails to python-dev anymore, because
you asked me to. I've spent some time trying to make the idea
interesting. It is fine if you know a scientific paper about the
matter, can explain it in a few words and send a link for more
details. But replies like "you're stubborn and ignorant, and
nobody should help you" don't make you a better person. I am
criticizing, because I lack time, motivation and fantasy to write
stuff about good and bright sides in my life that I just don't see. I
write because I see bad things that can be better, and I am still open
to discuss if it is real or not.
anatoly t.

From fetchinson at  Tue Jun 12 12:16:06 2012
From: fetchinson at (Daniel Fetchinson)
Date: Tue, 12 Jun 2012 12:16:06 +0200
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To: <>
References: <>
Message-ID: <>

> I just want to discuss the idea,

Great! You even got a perfectly good answer already: *not going to happen*!
Hey, this seems to be working: you raise a point, the community
discusses it and after careful deliberation comes up with an answer!
This is how things are supposed to be working, aren't they?

> But the replies like "you're stubborn and ignorant, and
> nobody should help you" doesn't make you a better person.

Hey, hey, hey, you are overlooking the other answers you got! You
think they came from thin air? People read your post, thought about
it, considered the pros and cons and then put in the time to answer
it, write an email and hit the Send button.

Now, move on, nothing to be seen here, chop, chop, carry on!


Psss, psss, put it down! -

From oscar.j.benjamin at  Tue Jun 12 16:25:32 2012
From: oscar.j.benjamin at (Oscar Benjamin)
Date: Tue, 12 Jun 2012 15:25:32 +0100
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On 11 June 2012 15:21, Jim Jewett <jimjjewett at> wrote:

> On Thu, Jun 7, 2012 at 5:00 PM, Mike Meyer <mwm at> wrote:
> > On Thu, Jun 7, 2012 at 4:48 PM, Rurpy <rurpy at> wrote:
> >> I suspect the vast majority of
> >> programmers are interested in a language that allows
> >> them to *effectively* get done what they need to,
> Agreed.
> The problem is that your use case gets hit by several special cases at
> once.
> Usually, you don't need to worry about encodings at all; the default
> is sufficient.  Obviously not the case for you.
> Usually, the answer is just to open a file (or stream) the way you
> want to.  sys.stdout is special because you don't open it.
> If you do want to change sys.stdout, usually the answer is to replace
> it with a different object.  Apparently (though I missed the reason
> why) that doesn't work for you, and you need to keep using the same
> underlying stream.

I also think I missed something in this thread. At the beginning of the
original thread it seemed that everyone was agreed that

  writer = codecs.getwriter(desired_encoding)
  sys.stdout = writer(sys.stdout.buffer)

was a reasonable solution (with the caveat that it should happen before any
output is written). Is there some reason why this is not a good approach?

The only problem I know of is that under Python 2.x it becomes an error to
print _already_ encoded strings (they get decoded as ascii before being
encoded) but that's probably not a problem for an application that takes a
disciplined approach to unicode.
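
For reference, a small sketch of that wrapping approach under Python 3. To keep it free of side effects, an io.BytesIO stands in for sys.stdout.buffer; the helper name wrap_binary_stream is invented for the example.

```python
import codecs
import io

def wrap_binary_stream(buffer, encoding, errors="strict"):
    """Return a text writer that encodes to `encoding` and writes to `buffer`."""
    return codecs.getwriter(encoding)(buffer, errors=errors)

# Stand-in for sys.stdout.buffer so the example has no side effects.
fake_stdout_buffer = io.BytesIO()
out = wrap_binary_stream(fake_stdout_buffer, "shift_jis")
out.write("日本語\n")
encoded = fake_stdout_buffer.getvalue()
```

In a real program the equivalent would be sys.stdout = wrap_binary_stream(sys.stdout.buffer, encoding), done after flushing sys.stdout and before any output is written.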

> So at that point, replacing it with a wrapped version of itself
> probably *is* the simplest solution.
> The remaining problem is how to find the least bad way of doing that.
> Your solution does work.  Adding it as an example to the docs would
> probably be reasonable, but someone seems to have worked pretty hard
> at keeping the sys module documentation short.  I could personally
> support a wrap function on the sys.std* streams that took care of
> flushing before wrapping, but ... there is a cost, in that the API
> gets longer, and therefore harder to learn.
> > or applications
> > outside of those built for your system that have a "--encoding" type
> > flag?
> There are plenty of applications with an encoding flag; I'm not sure
> how often it applies to sys.std*, as opposed to named files.
> -jJ

From jimjjewett at  Tue Jun 12 17:15:11 2012
From: jimjjewett at (Jim Jewett)
Date: Tue, 12 Jun 2012 11:15:11 -0400
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 8, 2012 at 11:39 PM, Rurpy <rurpy at> wrote:
> On 06/07/2012 03:00 PM, Mike Meyer wrote:
>> On Thu, Jun 7, 2012 at 4:48 PM, Rurpy <rurpy-/E1597aS9LQAvxtiuMwx3w at> wrote:

>> how do other programming languages deal with wanting to
>> change the encoding of the standard IO streams?

> This is how it seems to be done in Perl:

>  binmode(STDOUT, ":encoding(sjis)");

> which seems quite a bit simpler than Python.

Agreed, in isolation.  But in my limited experience, and from reading ... I think you
probably need to hold at least as many concepts in your head
simultaneously to get it to work.

> ... The description of binmode()
> in "man perlfunc" sounds like encoding can be changed
> on-the-fly but my attempt to do so had no effect

which sort of belies simple

> TCL appears to have on-the-fly encoding changes:

>  | encoding system ?encoding?
>  | The system encoding is used whenever Tcl passes strings
>  | to system calls.
>

So if you call rename, the system encoding is used for the filename,
but does that mean it is used for sysout?


From jimjjewett at  Tue Jun 12 17:33:27 2012
From: jimjjewett at (Jim Jewett)
Date: Tue, 12 Jun 2012 11:33:27 -0400
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 8, 2012 at 11:57 PM, Rurpy <rurpy at> wrote:
> On 06/07/2012 07:01 PM, Nick Coghlan wrote:
>> On Fri, Jun 8, 2012 at 10:14 AM, Rurpy <rurpy-/E1597aS9LQAvxtiuMwx3w at> wrote:

>>>  sys.stdout = codecs.getwriter(opts.encoding)(sys.stdout.buffer)
>>> But that code is not obvious to someone who has been able to do
>>> all his encoded IO (with the exception of sys.stdout) using just
>>> the encoding parameter of open().

Well, you could do it with sys.stdout too, if you did as part of open.
 Unfortunately, by the time your code comes along, it is already open
-- and may well have already been written to.

> OK, I can see that as a use-case design principle.  I still
> don't see any hard technical reason why the same streams could
> not be kept and simply allow their encodings to be reset if
> they haven't been used yet.

Unfortunately, that leads to very fragile code, that will break
unexpectedly because something totally unrelated decided to write a
license message to stdout.

> But networks, shared files systems, email, etc have all
> blurred the concept of localness.  Just because I am running
> my program on a Unix machine does not mean I may not need
> to write files with '\n\r' line endings.

So write a file, instead of stdout...

stdin/stdout is more convenient for pipes, but most such programs do
have -i and -o flags for cases like yours.

> seems to be an implicit assumption that there is a single
> encoding that needs to be determined.

Which is reasonable; they aren't the only input/output, they are the
*standard* input and output.  If they have different encodings, they
aren't really standard.  (I have some sympathy for a more lenient
encoding on stderr.)

> That (IIUC) would not be workable for my problem.
>  ./ -e sjis,sjis [other options...]
> is acceptable.  Something like:
>  python -C 'sys.stdin=...; sys.stdout=...' [other options...]
> would not be.

Tastes differ; I actually prefer the second, as more explicit.

> I think that being unable to easily change stream encoding
> before first use is orders of magnitude more important than
> being unable to change them on-the-fly.

Yes, but since we're talking specifically about streams you don't
start, that just makes for fragile code that breaks in the field.


From ubershmekel at  Tue Jun 12 18:11:11 2012
From: ubershmekel at (Yuval Greenfield)
Date: Tue, 12 Jun 2012 19:11:11 +0300
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, May 29, 2012 at 8:05 AM, anatoly techtonik <techtonik at> wrote:

> The problem with stdlib - it is all damn subjective. There is no
> process to add functions and modules if you're not well-behaved and
> skilled in public debates and don't have really a lot of time to be a
> champion of your module/function. In other words - it is hard (if not
> impossible for 80% of Python Earth population). So, many people and
> projects decide to opt-out. Take a look at Twisted - a lot of useful
> stuff, but not in Python stdlib. So..
> Provide a way for people to opt-out from core stuff, but still allow
> to share the changes and update code if necessary.
> This will require:
> - a local stdlib Python path convention
> - snippet normalization function and AST hash dumper
> - web site with stats
> - source code crawler
> How it works:
> 1. Every project maintains its own stdlib directory with functions
> that they feel are good to have in standard library
> 2. Functions are placed so that they are imported as if from standard
> library, but this time with stdlib prefix
> 3. The license for this directory is public domain to remove all legal
> barriers (credits are welcome, but optional)
> 4. Crawler (probably PyPI) scans this stdlib dir, finds functions,
> normalizes them, calculates hash and submits to web site
>  4.1 Normalization is required to find the shared function
> copy/pasted across different projects with different
>        indentation level, docstrings, parameters/variable names etc.
>  4.2 Hash is calculated upon AST. There are at least three hashes for
> each entry:
>       4.2.1 Full hash - all docstrings and variable names are
> preserved, whitespace normalized
>       4.2.2 Stripped hash - docstrings are stripped, variable names
> are normalized
>       4.2.3 Signature hash - a mark placed in a comment above
> function name, either calculated from function
>                signature or generated randomly, used for manual
> tracking of copy/paste e.g. pd:ac546df6b8340a92
> 5. Web site maintains usage and popularity staff, accepts votes on
> inclusion of snippets
> User stories:
> 1. "I want to find if there is a better/updated version of my function
> available"
>   1.1  I enter hash into web site search form
>   1.2  Site gives me a link to my snippet
>   1.3  I can see what people proposed to replace this function with
>   1.4  I can choose the function with most votes
>   1.5  I can flag the functions I may find irrelevant or
>   1.5  I can tag the functions that divert in different direction
> than I need to filter them
> 2. "I want to reuse code snippets without additional dependencies on
> 3rd party projects"
>   1.1  Just place them into my own stdlib directory
> 3. "I want to update code snippets when there is an update for them"
>   1.1  I run scanner, it extracts signature hashes, stripped hashes
> and looks if web-site version of signature matches normalized hash
> 4. "I want to see what people want to include in the next Python version"
>   1.1  A call for proposals is made
>   1.2  People place wannabe's into their stdlib dirs
>   1.3  Crawl generates new functions on a web site
>   1.4  Functions are categorized
>   1.5  Optionally included / declined with a short one-liner reason - why
>   1.6  Optionally provided with more detailed info why
> --- feature creep cut ---
> 5. "I want to see what functions are popular in other languages"
>   1.1  A separate crawler for Ruby, PHP etc. stdlib converts their
> AST into compatible format where possible
>   1.2  Submit to site stats
> 6. "I want to download the function in Ruby format"
>   1.1  AST converter tries to do the job automatically where possible
>   1.2  If it fails - you are encouraged to fix the converter rules or
> write the replacement for this signature manually
> Just an idea.
> --
> anatoly t.

I think having a separate site "anatoly's std-lib" which somehow
implemented an easy install of the top 10-100 most useful/popular/selected
packages on pypi could be nice. I considered making such a bundle myself a
while ago.

I don't think it really needs to be sanctioned.


PS I like how candid the replies you got were, and indeed getting a reply
is better than the sound of crickets. Though some of these replies carried
the scent of excrement poredom - the author's need to import niceness.

From mwm at  Tue Jun 12 18:51:47 2012
From: mwm at (Mike Meyer)
Date: Tue, 12 Jun 2012 12:51:47 -0400
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, 12 Jun 2012 12:04:58 +0300
anatoly techtonik <techtonik at> wrote:
> On Sat, Jun 2, 2012 at 8:24 PM, Calvin Spealman <ironfroggy at> wrote:
> > On Fri, Jun 1, 2012 at 11:08 AM, anatoly techtonik <techtonik at> wrote:
> >> On Tue, May 29, 2012 at 9:02 AM, Nick Coghlan <ncoghlan at> wrote:
> >>> Once again, you're completely ignoring all existing knowledge and
> >>> expertise on open collaboration and trying to reinvent the world. It's
> >>> *not going to happen*.
> >> It's too boring to live in a world of existing knowledge and
> >> expertise,
> > Frankly, this one fragment is enough to stop me reading further. Who
> > wants to learn
> > from the vast and broad experience when you could simply randomize the rules of
> > reality through ignorance and stubbornness?
> If everybody would think like this, the world will never learn about
> anti-patterns, and the software craftmanship collapsed in astonishing
> agony some years ago. If it doesn't make it clear - it is not
> randomizing - it is putting beliefs to the test asking for the current
> status.

Ah, I think I see Anatoly's problem here. It's an impedance
mismatch. He wants to discuss language/platform/environment
ideas. This is valuable work, and he does have some interesting
ideas. It definitely has a place in the world.

It's just that this isn't that place. Python has a set of objectives
for the language that have been around long enough to qualify as
"traditions". As such, it's not a good place to experiment with
arbitrary changes to things, because you keep running afoul of the
traditions.

> Come on, guys, what's wrong with you? It is just an idea, not a proposal
> or scientific paper.

Yes, but it's an idea that ignores the traditions of the environment
you're proposing it for. If you're serious about discussing ideas
about changing Python, you need to do the groundwork of understanding
those traditions, and try and make sure your ideas don't collide with
them. It doesn't matter whether or not they're good ideas, if they
clash with the traditions, they aren't going to happen. You need to
figure that out yourself, and not ask us to do it for you.

If, on the other hand, you want to talk about
language/platform/environment design ideas without that restriction,
then you need a different forum. Just because you happen to be working
in Python doesn't mean that a Python forum is appropriate for them,
any more than discussing (say) drone control programs would be
appropriate in a Python forum just because I happen to be writing it
in Python.

If you're somewhere in between the two, maybe a PyPy forum would be
more appropriate? I dunno. I'm sorry I can't really recommend a good
forum for you. The last time I was seriously interested in such
things, Python hadn't been released yet.

Mike Meyer <mwm at>
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail -

From solipsis at  Tue Jun 12 23:34:40 2012
From: solipsis at (Antoine Pitrou)
Date: Tue, 12 Jun 2012 23:34:40 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
References: <>
Message-ID: <>

On Sun, 10 Jun 2012 10:56:46 -0700
"Gregory P. Smith" <greg at> wrote:
> I'd just stick it in hmac myself but getpass was also a good suggestion.
>  Cross reference to it from the docs of all three as the real goal of
> adding pbkdf2 is to advertise it to users so that they might use it rather
> than something more naive.
> hashlib itself should be kept pure as is for standard low level hash
> algorithms.  It can't have a dependency on anything else.

I don't really understand this requirement. Can you elaborate?



From victor.stinner at  Tue Jun 12 23:44:00 2012
From: victor.stinner at (Victor Stinner)
Date: Tue, 12 Jun 2012 23:44:00 +0200
Subject: [Python-ideas] Replacing the standard IO streams (was Re:
 changing sys.stdout encoding)
In-Reply-To: <>
References: <>
Message-ID: <>

>>     sys.stdin = open(sys.stdin.fileno(), 'r',<new settings>)
>>     sys.stdout = open(sys.stdout.fileno(), 'w',<new settings>)
>>     sys.stderr = open(sys.stderr.fileno(), 'w',<new settings>)
>    sys.stdin = io.TextIOWrapper(sys.stdin.detach(), <new settings>)
>    sys.stdout = io.TextIOWrapper(sys.stdout.detach(), <new settings>)
>    ...
> None of these methods is guaranteed to work if input or output has
> occurred before.

You should set the newline option for sys.std* files. Python 3 does
something like this:

if sys.platform == "win32":
    # translate "\r\n" to "\n" for sys.stdin on Windows
    newline = None
else:
    newline = "\n"
sys.stdin = io.TextIOWrapper(sys.stdin.detach(), newline=newline, <new settings>)
sys.stdout = io.TextIOWrapper(sys.stdout.detach(), newline="\n", <new settings>)
sys.stderr = io.TextIOWrapper(sys.stderr.detach(), newline="\n", <new settings>)
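
A runnable sketch of the detach-and-rewrap idea, shown on an in-memory stream rather than the real sys.stdout so it has no side effects. The helper name rewrap and its defaults are assumptions made for the example.

```python
import io

def rewrap(stream, encoding, errors="strict", newline="\n"):
    """Return a new TextIOWrapper over the same underlying binary buffer."""
    stream.flush()         # push out anything already buffered as text
    raw = stream.detach()  # the old wrapper is unusable after this
    return io.TextIOWrapper(raw, encoding=encoding, errors=errors,
                            newline=newline, line_buffering=True)

# Demonstrated on an in-memory stream rather than the real sys.stdout.
backing = io.BytesIO()
text = io.TextIOWrapper(backing, encoding="ascii", newline="\n")
text = rewrap(text, "utf-8")
text.write("caf\u00e9\n")
text.flush()
```

Because the new wrapper reuses the detached buffer, no second buffered writer is created on the same file descriptor.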


Lib/test/ uses the following code which is not exactly
correct (it creates a new buffered writer instead of reusing
sys.stdout buffered writer):

def replace_stdout():
    """Set stdout encoder error handler to backslashreplace (as stderr error
    handler) to avoid UnicodeEncodeError when printing a traceback"""
    import atexit

    stdout = sys.stdout
    sys.stdout = open(stdout.fileno(), 'w',
        encoding=stdout.encoding,
        errors="backslashreplace",
        closefd=False,
        newline='\n')

    def restore_stdout():
        sys.stdout.close()
        sys.stdout = stdout
    atexit.register(restore_stdout)


From victor.stinner at  Tue Jun 12 23:48:08 2012
From: victor.stinner at (Victor Stinner)
Date: Tue, 12 Jun 2012 23:48:08 +0200
Subject: [Python-ideas] TextIOWrapper callable encoding parameter
In-Reply-To: <>
References: <>
Message-ID: <>

> 1.  The most straight-forward way to handle this is to open
> the file twice, first in binary mode or with latin1 encoding
> and again in text mode after the encoding has been determined.
> This of course has a performance cost since the data is read
> twice.  Further, it can't be used if the data source is
> a pipe, socket or other non-rewindable source.  This
> includes sys.stdin when it comes from a pipe.

Some months ago, I proposed to automatically detect whether a file contains
a BOM and use it to set the encoding. Various methods were proposed
but there was no real consensus. One proposal was to use a codec
(e.g. "bom") which uses the BOM if it is present, so the file does not
need to be read twice.
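
For illustration, BOM detection on the first few bytes of a stream can be sketched as follows. The function name detect_bom and the UTF-8 default are assumptions; the BOM constants come from the codecs module.

```python
import codecs

# Longest BOMs first: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def detect_bom(head: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on the BOM at the start of `head`."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default
```

The returned codec names ("utf-8-sig", "utf-16", "utf-32") consume the BOM themselves when decoding, so the caller does not need to skip it.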

For the pipe issue: it depends where the encoding specification is. If
the encoding is written at the end of your "file" (stream), you have
to store the whole stream content (few MB or maybe much more?) into
memory. If it is in the first lines, you have to store these lines in
a buffer. It's not easy to decide for the threshold.

I don't like the codec approach because the codec is disconnected from
the stream. For example, the codec doesn't know the current position
in stream nor can read a few more bytes forward or backward. If you
open the file in "append" mode, you are not writing at the beginning
but at the end of the file. You may also seek at an arbitrary position
before the first read...

There are also some special cases. For example, when a text file is
opened in write mode, the file is seekable and the file position is
not zero, TextIOWrapper calls encoder.setstate(0) to not write the BOM
in the middle of the file. (See also Lib/test/ for related

> 2.  Alternatively, with a little more expertise, one can rewrap
> the open binary stream in a TextIOWrapper to avoid a second
> OS file open.

That's my favorite method because you have the full control on the
stream. (I wrote But yes, it does not work on
non-seekable streams (e.g. pipes).

> This too seems to read the data twice and of course the
> seek(0) prevents this method also from being usable with
> pipes, sockets and other non-seekable sources.

Does it really matter? You usually need to read few bytes to get the encoding.

> 9. In other non-read paths where encoding needs to be known,
>  raise an error if it is still None.

Why not read data until the encoding is known instead?

> I have modified a copy of the _pyio module as described and
> the changes required seemed unsurprising and relatively
> few, though I am sure there are subtleties and other
> considerations I am missing.  Hence this post seeking
> feedback...

Can you post the modified somewhere so I can play with it?


From greg at  Tue Jun 12 23:49:35 2012
From: greg at (Gregory P. Smith)
Date: Tue, 12 Jun 2012 14:49:35 -0700
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jun 12, 2012 at 2:34 PM, Antoine Pitrou <solipsis at> wrote:

> On Sun, 10 Jun 2012 10:56:46 -0700
> "Gregory P. Smith" <greg at> wrote:
> > I'd just stick it in hmac myself but getpass was also a good suggestion.
> >  Cross reference to it from the docs of all three as the real goal of
> > adding pbkdf2 is to advertise it to users so that they might use it
> rather
> > than something more naive.
> >
> > hashlib itself should be kept pure as is for standard low level hash
> > algorithms.  It can't have a dependency on anything else.
> I don't really understand this requirement. Can you elaborate?

I wrote that quickly.  I don't want a circular dependency or things that
aren't well established standards in hashlib.  I see hashlib as being for
low level algorithms only (FIPS standards, etc) where fast implementations
are available in most VM runtimes.  hmac depends on hashlib therefore
nothing in hashlib should ever depend on hmac.  That doesn't prevent
someone from deciding hmac shouldn't be a module of its own and moving it
to live within hashlib some day but that would seem like needless API churn
outside of a major language version change.


From victor.stinner at  Wed Jun 13 00:13:50 2012
From: victor.stinner at (Victor Stinner)
Date: Wed, 13 Jun 2012 00:13:50 +0200
Subject: [Python-ideas] TextIOWrapper callable encoding parameter
In-Reply-To: <>
References: <>
Message-ID: <>

2012/6/11 Nick Coghlan <ncoghlan at>:
> Immediate thought: it seems like it would be easier to offer a way to inject
> data back into a buffered IO object's internal buffer.

BufferedReader already has a useful peek() method to read data
without changing the position.

It's not perfect ("The number of bytes returned may be less or more
than requested.") but better than nothing.
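
A sketch of how peek() could be used to choose an encoding without consuming any input, which also works when the source cannot seek. The function name is invented for the example, and it only looks at BOMs. On a real pipe, peek() may return fewer than four bytes if less data has arrived, which this sketch does not handle.

```python
import codecs
import io

def open_text_sniffing_bom(binary, default="utf-8"):
    """Wrap a buffered binary reader as text, choosing the codec from its BOM."""
    if not isinstance(binary, io.BufferedReader):
        binary = io.BufferedReader(binary)
    head = binary.peek(4)[:4]  # peek may return fewer or more bytes than asked
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        encoding = "utf-32"
    elif head.startswith(codecs.BOM_UTF8):
        encoding = "utf-8-sig"
    elif head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        encoding = "utf-16"
    else:
        encoding = default
    # Nothing was consumed, so the decoder still sees (and strips) the BOM.
    return io.TextIOWrapper(binary, encoding=encoding)

stream = io.BufferedReader(
    io.BytesIO(codecs.BOM_UTF8 + "h\u00e9llo".encode("utf-8")))
reader = open_text_sniffing_bom(stream)
content = reader.read()
```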


From stephen at  Wed Jun 13 06:58:28 2012
From: stephen at (Stephen J. Turnbull)
Date: Wed, 13 Jun 2012 13:58:28 +0900
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

Oscar Benjamin writes:

 > I also think I missed something in this thread. At the beginning of the
 > original thread it seemed that everyone was agreed that
 >   writer = codecs.getwriter(desired_encoding)
 >   sys.stdout = writer(sys.stdout.buffer)
 > was a reasonable solution (with the caveat that it should happen before any
 > output is written). Is there some reason why this is not a good
 > approach?

It's undocumented and unobvious, but it's needed for standard stream
filtering in some environments -- where a lot of coding is done by
people who otherwise never need to understand streams at anything but
a superficial level -- and the analogous case of a newly opened file,
pipe, or socket is documented and obvious, and usable by novices.

It's a damn shame that we can't say the same about the stdin, stdout,
and stderr streams (even if I too have been at pains to explain why
that's hard to fix).

From stephen at  Wed Jun 13 07:09:04 2012
From: stephen at (Stephen J. Turnbull)
Date: Wed, 13 Jun 2012 14:09:04 +0900
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To: <>
References: <>
Message-ID: <>

 > I write because I see bad things that can be better, and I am still
 > open to discuss if it is real or not.

There's nothing wrong with that in its place.  But python-ideas is a
place for ideas where the poster is pretty sure it's real *and* has a
concrete proposal (the "idea" in python-ideas) to make it better *and*
has the will to follow up themselves if nobody else grabs the ball.

There's some room for blue-sky ideas (lacking concrete proposals or
personal commitment), but if all you ever offer is blue-sky ideas that
get no uptake, you're just wasting time, yours as well as everybody
else's.

From guido at  Wed Jun 13 07:21:45 2012
From: guido at (Guido van Rossum)
Date: Tue, 12 Jun 2012 22:21:45 -0700
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jun 12, 2012 at 9:58 PM, Stephen J. Turnbull <stephen at> wrote:
> Oscar Benjamin writes:
> > I also think I missed something in this thread. At the beginning of the
> > original thread it seemed that everyone was agreed that
> >
> >   writer = codecs.getwriter(desired_encoding)
> >   sys.stdout = writer(sys.stdout.buffer)
> >
> > was a reasonable solution (with the caveat that it should happen before any
> > output is written). Is there some reason why this is not a good
> > approach?
> It's undocumented and unobvious, but it's needed for standard stream
> filtering in some environments -- where a lot of coding is done by
> people who otherwise never need to understand streams at anything but
> a superficial level -- and the analogous case of a newly opened file,
> pipe, or socket is documented and obvious, and usable by novices.
> It's a damn shame that we can't say the same about the stdin, stdout,
> and stderr streams (even if I too have been at pains to explain why
> that's hard to fix).

I'm probably missing something, but in all my naivete I have what
feels like a simple solution, and I can't seem to see what's wrong
with it.

In C there used to be a function to set the buffer size on an open
stream that could only be called when the stream hadn't been used yet.
ISTM the OP's use case would be covered by a similar function on an
open TextIOWrapper to set the encoding that can only be used when it
hasn't been used to write (or read) anything yet? When called under
any other circumstances it should raise an error. The TextIOWrapper
should maintain a "used" flag so that it can raise this exception.

This ought to work for stdin and stdout when used at the start of the
program, assuming nothing is written by code run before main starts.
(This should normally be fine, otherwise you couldn't use a Python
program as a filter at all.) It won't work for stderr if connected to
a tty-ish device (since the version stuff is written there) but that
should be okay, and it should still be okay with stderr if it's not a
tty, since then it starts silent. (But I don't think the use case is
very strong for stderr anyway.)

I'm not sure about a name, but it might well be called set_encoding().
The error message when misused should clarify to people who
misunderstand the name that it can only be called when the stream
hasn't been used yet; I don't think it's necessary to encode that
information in the name. (C's setbuf() wasn't called
set_buffer_on_virgin_stream() either. :-)

I don't care about the integrity of the underlying binary stream. It's
a binary stream, you can write whatever bytes you want to it. But if a
TextIOWrapper is used properly, it won't write a mixture of encodings
to the underlying binary stream, since you can only set the encoding
before reading/writing a single byte. (And the TextIOWrapper is
careful not to use the binary stream before the first actual read() or
write() call -- it just tries to call tell(), if it's seekable, which
should be safe.)
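
A minimal sketch of the "used" flag idea, to make the intended
behaviour concrete. The class OnceEncodableWriter is hypothetical and
is not the real TextIOWrapper; it only handles writing, and the choice
of ValueError for the error is an assumption.

```python
import io


class OnceEncodableWriter:
    """Hypothetical text layer whose encoding can be set only before first use."""

    def __init__(self, buffer, encoding="utf-8"):
        self.buffer = buffer
        self.encoding = encoding
        self._used = False  # the "used" flag from the proposal

    def set_encoding(self, encoding):
        if self._used:
            raise ValueError(
                "set_encoding() can only be called before the stream is used")
        self.encoding = encoding

    def write(self, text):
        self._used = True
        self.buffer.write(text.encode(self.encoding))
        return len(text)


raw = io.BytesIO()              # stand-in for the binary stream under stdout
out = OnceEncodableWriter(raw)
out.set_encoding("latin-1")     # fine: nothing has been written yet
out.write("caf\u00e9")
try:
    out.set_encoding("utf-8")   # too late: the stream has been used
    late_change_error = None
except ValueError as exc:
    late_change_error = exc
```

Because the encoding can no longer change after the first write, the
underlying binary stream never receives a mixture of encodings.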

--Guido van Rossum

From ncoghlan at  Wed Jun 13 07:42:24 2012
From: ncoghlan at (Nick Coghlan)
Date: Wed, 13 Jun 2012 15:42:24 +1000
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jun 13, 2012 at 3:21 PM, Guido van Rossum <guido at> wrote:
> On Tue, Jun 12, 2012 at 9:58 PM, Stephen J. Turnbull <stephen at> wrote:
>> Oscar Benjamin writes:
>> > I also think I missed something in this thread. At the beginning of the
>> > original thread it seemed that everyone was agreed that
>> >
>> >   writer = codecs.getwriter(desired_encoding)
>> >   sys.stdout = writer(sys.stdout.buffer)
>> >
>> > was a reasonable solution (with the caveat that it should happen before any
>> > output is written). Is there some reason why this is not a good
>> > approach?
>> It's undocumented and unobvious, but it's needed for standard stream
>> filtering in some environments -- where a lot of coding is done by
>> people who otherwise never need to understand streams at anything but
>> a superficial level -- and the analogous case of a newly opened file,
>> pipe, or socket is documented and obvious, and usable by novices.
>> It's a damn shame that we can't say the same about the stdin, stdout,
>> and stderr streams (even if I too have been at pains to explain why
>> that's hard to fix).
> I'm probably missing something, but in all my naivete I have what
> feels like a simple solution, and I can't seem to see what's wrong
> with it.

I think you're right, and such a method in combination with
stream.buffer.peek() should actually handle a lot of encoding
detection cases, too.

The alternative approaches (calling TextIOWrapper on stream.detach(),
or open on stream.fileno()) either break any references to the old
stream or else create two independent IO stacks on top of a single
underlying file descriptor, which may create some odd behaviour.

Being able to set the encoding on a previously unused stream would
also interact better with the existing subprocess PIPE API.
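
As a rough illustration of the peek()-based detection mentioned above,
the sketch below looks for a byte order mark without consuming any
input and only then builds the text layer. The helper name wrap_by_bom
and the BOM table are assumptions made for this example; note that
peek() may return fewer bytes than requested, in which case the default
is used.

```python
import codecs
import io

# Checked in order; UTF-32 first because its LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def wrap_by_bom(buffered, default="utf-8"):
    """Pick an encoding from a BOM seen via peek(), then wrap the stream."""
    head = buffered.peek(4)[:4]   # does not advance the read position
    for bom, name in _BOMS:
        if head.startswith(bom):
            return io.TextIOWrapper(buffered, encoding=name)
    return io.TextIOWrapper(buffered, encoding=default)


data = codecs.BOM_UTF8 + "caf\u00e9\n".encode("utf-8")
reader = wrap_by_bom(io.BufferedReader(io.BytesIO(data)))
first_line = reader.readline()

plain = wrap_by_bom(io.BufferedReader(io.BytesIO(b"abc")))
plain_text = plain.read()
```

This creates the text layer only once, so it avoids the two independent
IO stacks described above, but it still cannot change the encoding of a
wrapper that already exists, which is what the proposed method adds.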


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From solipsis at  Wed Jun 13 10:25:21 2012
From: solipsis at (Antoine Pitrou)
Date: Wed, 13 Jun 2012 10:25:21 +0200
Subject: [Python-ideas] TextIOWrapper callable encoding parameter
References: <>
Message-ID: <>

On Tue, 12 Jun 2012 01:10:47 +1000
Nick Coghlan <ncoghlan at> wrote:
> Immediate thought: it seems like it would be easier to offer a way to
> inject data back into a buffered IO object's internal buffer.

Except that it would be limited by buffer size, which is not
necessarily something you have control over.



From stephen at  Wed Jun 13 10:35:59 2012
From: stephen at (Stephen J. Turnbull)
Date: Wed, 13 Jun 2012 17:35:59 +0900
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum writes:

 > I'm not sure about a name, but it might well be called set_encoding().

I would still prefer "initialize_encoding" or something like that, but
the main thing I was worried about was a "consenting adults" function
that shouldn't be called after I/O, but *could* be.

From rurpy at  Wed Jun 13 17:46:01 2012
From: rurpy at (Rurpy)
Date: Wed, 13 Jun 2012 08:46:01 -0700 (PDT)
Subject: [Python-ideas] TextIOWrapper callable encoding parameter
Message-ID: <>

On 06/11/2012 10:24 AM, Stephen J. Turnbull wrote:
> > Nick Coghlan writes:
> > 
> > > Immediate thought: it seems like it would be easier to offer a way to
> > > inject data back into a buffered IO object's internal buffer.
> > 
> > ungetch()?

What would be the TextIOWrapper api for that? 

> > If you're only interested in the top of the file (see below), I would
> > suggest allowing only one bufferfull, and then simply rewinding the
> > buffer pointer once you're done.  This is one strategy used by Emacsen
> > for encoding detection (for the reason pointed out by Rurpy: not all
> > streams are rewindable).
> >
> > But is that really "easier"?  It might be more general, but you still
> > need to reinitialize the encoding (ie, from the trivial "binary" to
> > whatever is detected), with all the hair that comes with that.

I don't think there is any hair involved.  In at least 
the _pyio version of TextIOWrapper, initializing the 
encoding (in the read path) consists of calling
self._get_decoder().  One needs to move the few places
where that is called now to nearby places that are 
after the raw buffer has been read but before it is
decoded.  There may be need for some consideration 
given to raising errors at the old locations in the 
case the callable encoding hook is not being used (to 
maintain complete backwards compatibility; not sure 
that is necessary), but I wouldn't call that hairy. 
Of course there may be other factors I am missing...

> >  > > Executive summary:
> >  > > ==================
> >  > >
> >  > > There is no good way to read a text file when the
> >  > > encoding has to be determined by reading the start
> >  > > of the file.  A long-winded version of that follows.
> >  > > Scroll down to the "Proposal" section to skip it.
> > 
> > This may be insufficiently general.  Specifically, both Emacsen and vi
> > allow specification of editor configuration variables at the bottom of
> > the file as well as the top.  I don't know whether vi allows encoding
> > specs at the bottom, but Emacsen do (but only for files).
> > 
> > I wouldn't recommend paying much attention to what Emacsen actually
> > *do* when initializing a stream (it's, uh, "baroque").

Looking only at the beginning of an input stream is 
general enough for a large class of problems including 
tokenizing python source code.

From rurpy at  Wed Jun 13 17:56:02 2012
From: rurpy at (Rurpy)
Date: Wed, 13 Jun 2012 08:56:02 -0700 (PDT)
Subject: [Python-ideas] TextIOWrapper callable encoding parameter
Message-ID: <>

On 06/12/2012 03:48 PM, Victor Stinner wrote:
>> >> 1.  The most straight-forward way to handle this is to open
>> >> the file twice, first in binary mode or with latin1 encoding
>> >> and again in text mode after the encoding has been determined.
>> >> This of course has a performance cost since the data is read
>> >> twice.  Further, it can't be used if the data source is
>> >> from a pipe, socket or other non-rewindable source.  This
>> >> includes sys.stdin when it comes from a pipe.
> > 
> > Some months ago, I proposed to automatically detect if a file contains
> > a BOM and use it to set the encoding. Various methods were proposed
> > but there was no real consensus. One proposition was to use a codec
> > (e.g. "bom") which uses the BOM if it is present, and so doesn't need to
> > read the file twice.
> > 
> > For the pipe issue: it depends where the encoding specification is. If
> > the encoding is written at the end of your "file" (stream), you have
> > to store the whole stream content (few MB or maybe much more?) into
> > memory. If it is in the first lines, you have to store these lines in
> > a buffer. It's not easy to decide for the threshold.

That's always a problem.  When trying to determine a
character encoding one may have to read the entire file 
because it could consist of all ascii characters except 
the very last one.  (And of course there is no guarantee 
one can determine *the* encoding at all).

Nevertheless, I think there is a very large class of
problems that can be usefully handled by looking at a 
limited amount of data at the start of a file (or stream).

The Python coding declaration is one example (obviously 
picked hoping it would have some resonance here.)
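
For the coding declaration specifically, the standard library's
tokenize.detect_encoding() already works from the first two lines only.
The sketch below uses it for a single pass over a stream that is never
rewound; io.BytesIO stands in for a pipe, and the sample source is made
up for the example.

```python
import io
import tokenize

source = b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"
stream = io.BytesIO(source)       # stand-in for a pipe; never seek()ed

# detect_encoding() reads at most two lines and returns them with the encoding.
encoding, consumed = tokenize.detect_encoding(stream.readline)

# Decode the lines already consumed, then the rest, in a single pass.
text = b"".join(consumed).decode(encoding) + stream.read().decode(encoding)
```

The catch is the last line: the caller has to hold on to the consumed
lines and decode everything by hand, which is the part a text layer
would otherwise do.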

The buffer object used by TextIOWrapper already reads the 
start of the stream and buffers the first few lines, so
why not take advantage of that rather than repeating the 

One of the things I am not sure about is if there are 
cases when the buffered read returns, say, only one
line, as might happen with tty input.

> > I don't like the codec approach because the codec is disconnected from
> > the stream. For example, the codec doesn't know the current position
> > in stream nor can read a few more bytes forward or backward. If you
> > open the file in "append" mode, you are not writing at the beginning
> > but at the end of the file. You may also seek at an arbitrary position
> > before the first read...
> > 
> > There are also some special cases. For example, when a text file is
> > opened in write mode, the file is seekable and the file position is
> > not zero, TextIOWrapper calls encoder.setstate(0) to not write the BOM
> > in the middle of the file. (See also Lib/test/ for related
> > tests.)

A callable encoding parameter would not be terribly useful 
with a file opened in write or append mode, but its behavior
would be predictable: a write would result in an error
because the encoding hadn't been set.  A read in the middle
of the file would work the same way as at the beginning.
This is probably not very useful, but is consistent.
Of course one could choose to implement a callable encoding
parameter such that some or all of these paths are detected
at open and declared illegal then.  One could prohibit the 
encoding call after a seek though I'm not sure there is any
point to that.

>> >> 2.  Alternatively, with a little more expertise, one can rewrap
>> >> the open binary stream in a TextIOWrapper to avoid a second
>> >> OS file open.
> > 
> > That's my favorite method because you have the full control on the
> > stream. (I wrote But yes, it does not work on
> > non-seekable streams (e.g. pipes).
> > 
>> >> This too seems to read the data twice and of course the
>> >> seek(0) prevents this method also from being usable with
>> >> pipes, sockets and other non-seekable sources.
> > 
> > Does it really matter? You usually need to read few bytes to get the encoding.

It certainly matters if input is from a pipe.  Quoting from
my other message:

  $ cat test.utf8 | python3 reopen1
  got exception: [Errno 29] Illegal seek

The whole point of my suggestion was that you've already
read those few bytes -- but by the time you have access
to them, you've already been forced to choose an encoding.
My suggestion simply defers that encoding setting until
after you've had a chance to look at the bytes.
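
To show the shape of the idea without touching TextIOWrapper itself,
here is an approximation built from existing pieces: peek at the bytes
the buffered reader already holds, hand them to a callable, and only
then create the text layer. The names wrap_deferred and choose are
hypothetical, and the detection rule is a toy. Unlike the proposal,
this needs the binary stream before any text wrapper exists.

```python
import io
import os


def wrap_deferred(buffered, choose_encoding):
    """Hypothetical helper: choose_encoding(head_bytes) returns a codec name."""
    head = buffered.peek(1)       # fills the buffer; read position is unchanged
    return io.TextIOWrapper(buffered, encoding=choose_encoding(head))


def choose(head):
    # Toy rule for illustration only.
    return "latin-1" if b"coding: latin-1" in head else "utf-8"


read_fd, write_fd = os.pipe()
os.write(write_fd, "# coding: latin-1\ncaf\u00e9\n".encode("latin-1"))
os.close(write_fd)

binary = os.fdopen(read_fd, "rb")
seekable = binary.seekable()      # False for a pipe
with wrap_deferred(binary, choose) as text:
    content = text.read()
```

No seek is needed, so this works on the pipe where the reopen and
rewrap approaches fail.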

>> >> 9. In other non-read paths where encoding needs to be known,
>> >>  raise an error if it is still None.
> > 
> > Why not read data until the encoding is known instead?

That's how I do it now -- open file in binary mode
and read it, buffer it, determine encoding, and henceforth
decode the bytes data "by hand" to text.

But that's an awful lot like what TextIOWrapper does, yes?
Why can't I use TextIOWrapper instead of rewriting it myself?
(Yes, I know I can reopen or rewrap the binary stream but 
as I said, that loses the one-pass processing which breaks

>> >> I have modified a copy of the _pyio module as described and
>> >> the changes required seemed unsurprising and relatively
>> >> few, though I am sure there are subtleties and other
>> >> considerations I am missing.  Hence this post seeking
>> >> feedback...
> > 
> > Can you post the modified somewhere so I can play with it?

I put a diff against the Python-3.2.3 file at:

Much of the diff is just moving existing stuff around.
The note at the bottom says:

| It is in no way supposed to be a serious patch.
| It was the minimal changes I could make in order to 
| see if my suggestion to allow a callable encoding parameter
| in TextIOWrapper was feasible, and allow some timing tests.
| I am quite sure it will not pass Python's tests.
| It does, I hope, give some idea of the nature and scale of the
| code changes needed to implement a callable encoding parameter.

From barry at  Wed Jun 13 22:38:19 2012
From: barry at (Barry Warsaw)
Date: Wed, 13 Jun 2012 16:38:19 -0400
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
References: <>
Message-ID: <>

I'd love to have a PBKDF2 implementation in the stdlib.  My flufl.password
module has an implementation donated by security expert Bob Fleck.  Any
insecure implementation bugs are solely blamed on me though. ;)

The API is a little odd because it fits into the larger API for
flufl.password, but if it's useful, I'd happily clean up and donate the code
for the stdlib.  OTOH, I'd be just as happy (maybe more) to get rid of it in
favor of a stdlib implementation.


From lists at  Thu Jun 14 00:33:51 2012
From: lists at (Christian Heimes)
Date: Thu, 14 Jun 2012 00:33:51 +0200
Subject: [Python-ideas] Add adaptive-load salt-mandatory hashing
In-Reply-To: <>
References: <>
Message-ID: <jrb4gf$viu$>

Am 13.06.2012 22:38, schrieb Barry Warsaw:
> I'd love to have a PBKDF2 implementation in the stdlib.  My flufl.password
> module has an implementation donated by security expert Bob Fleck.  Any
> insecure implementation bugs are solely blamed on me though. ;)
> The API is a little odd because it fits into the larger API for
> flufl.password, but if it's useful, I'd happily clean up and donate the code
> for the stdlib.  OTOH, I'd be just as happy (maybe more) to get rid of it in
> favor of a stdlib implementation.

At first glance your implementation is vulnerable to side channel
attacks because you aren't using a constant time equality function. Also
you are using the least secure variant of PBKDF2 (SHA-1 instead of
SHA-256 or SHA-512). At least you are using os.urandom() as source for
the salt, which is usually fine.

Passlib supports the LDAP variants, too. [1] Outside of LDAP the
established notation is $pbkdf2-digest$rounds$salt$checksum.
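
A short sketch of the three points above (a stronger digest, a random
salt from os.urandom(), and a constant-time comparison). It uses
hashlib.pbkdf2_hmac() and hmac.compare_digest(), which are in the
standard library of later Python releases than the one discussed in
this thread. The function names and the round count are illustrative
assumptions, not recommendations.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, rounds=100_000):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)
    return salt, digest


def verify_password(password, salt, expected, rounds=100_000):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt, rounds)
    return hmac.compare_digest(digest, expected)


# A low round count keeps the example fast; real use would want far more.
salt, stored = hash_password("correct horse", rounds=1000)
ok = verify_password("correct horse", salt, stored, rounds=1000)
bad = verify_password("wrong guess", salt, stored, rounds=1000)
```

Comparing with == instead of compare_digest() would still give the
right answer, but its running time can depend on how many leading bytes
match, which is the side channel mentioned above.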


From gatesda at  Fri Jun 15 10:49:18 2012
From: gatesda at (David Gates)
Date: Fri, 15 Jun 2012 02:49:18 -0600
Subject: [Python-ideas] Multi-line comment blocks.
Message-ID: <>

Multi-line strings as comments don't nest, don't play well with docstrings,
and are counter-intuitive when there's special language support for
single-line comments. Python should only have one obvious way to do things,
and Python has two ways to comment, only one of which is obvious. My
suggestion is to add language support for comment blocks, using Python's
existing comment delimiter:

# Single-line comment
#:
    Multi-line comment
    #:
        Nested multi-line comments work perfectly
        Of course they do, they're just nested blocks
    def foo():
        """Docstrings work perfectly. Why wouldn't they?"""
        pass
# No need for an end-delimiter like """ or */

From robert.kern at  Fri Jun 15 11:50:40 2012
From: robert.kern at (Robert Kern)
Date: Fri, 15 Jun 2012 10:50:40 +0100
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <jrf0hg$sj4$>

On 6/15/12 9:49 AM, David Gates wrote:
> Multi-line strings as comments don't nest, don't play well with docstrings, and
> are counter-intuitive when there's special language support for single-line
> comments. Python should only have one obvious way to do things, and Python has
> two ways to comment, only one of which is obvious.

Multi-line string literals aren't comments. They are multi-line string literals. 
Unlike a comment, which does not show up in the compiled bytecode, the Python 
interpreter actually does something with those string literals. Sometimes people 
abuse them as ways to poorly emulate block comments, but this is an abuse, not a 
feature of the language.

> My suggestion is to add
> language support for comment blocks, using Python's existing comment delimiter:
> # Single-line comment
> #:
>      Multi-line comment
>      #:
>          Nested multi-line comments work perfectly
>          Of course they do, they're just nested blocks
>      def foo():
>          """Docstrings work perfectly. Why wouldn't they?"""
>          pass
> # No need for an end-delimiter like """ or */

The main problem is that #: currently has a meaning as a line comment. This 
could break existing code.

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

From solipsis at  Fri Jun 15 12:33:54 2012
From: solipsis at (Antoine Pitrou)
Date: Fri, 15 Jun 2012 12:33:54 +0200
Subject: [Python-ideas] Multi-line comment blocks.
References: <>
Message-ID: <>

On Fri, 15 Jun 2012 02:49:18 -0600
David Gates <gatesda at> wrote:
> Multi-line strings as comments don't nest, don't play well with docstrings,
> and are counter-intuitive when there's special language support for
> single-line comments. Python should only have one obvious way to do things,
> and Python has two ways to comment, only one of which is obvious. My
> suggestion is to add language support for comment blocks, using Python's
> existing comment delimiter:

Any decent text editor has a way to comment and uncomment whole blocks
of text (in Kate, it is Ctrl+D IIRC).



From sven at  Fri Jun 15 12:49:47 2012
From: sven at (Sven Marnach)
Date: Fri, 15 Jun 2012 11:49:47 +0100
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <jrf0hg$sj4$>
References: <>
Message-ID: <20120615104947.GM4256@bagheera>

Robert Kern schrieb am Fri, 15. Jun 2012, um 10:50:40 +0100:
> Multi-line string literals aren't comments. They are multi-line
> string literals. Unlike a comment, which does not show up in the
> compiled bytecode, the Python interpreter actually does something
> with those string literals. Sometimes people abuse them as ways to
> poorly emulate block comments, but this is an abuse, not a feature
> of the language.

Multi-line string literals do not generate code in CPython, and their
use as comments has BDFL approval:

(I don't use them as comments either, and rather rely on my editor for
commenting blocks.)
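
The claim about CPython is easy to check. In the sketch below, which is
an illustration of CPython behaviour rather than a language guarantee,
a bare string statement that is not a docstring leaves no trace in the
function's constants:

```python
def with_block_string():
    value = 1
    """
    This bare string is an expression statement, not a docstring,
    because it is not the first statement in the function.
    """
    return value


marker = "bare string is an expression statement"
string_consts = [c for c in with_block_string.__code__.co_consts
                 if isinstance(c, str)]
kept = any(marker in c for c in string_consts)
```

Here kept is False and the function has no __doc__, so the compiler has
dropped the string entirely.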


From gatesda at  Fri Jun 15 13:47:57 2012
From: gatesda at (David Gates)
Date: Fri, 15 Jun 2012 05:47:57 -0600
Subject: [Python-ideas] Python-ideas Digest, Vol 67, Issue 51
In-Reply-To: <>
References: <>
Message-ID: <>

@Robert Kern: "Multi-line string literals aren't comments. They are
multi-line string literals.  Unlike a comment, which does not show up in
the compiled bytecode, the Python interpreter actually does something with
those string literals."

They have Guido's stamp of approval, and apparently the interpreter ignores
them. They feel like an ugly hack to me too, though.

@Robert Kern: "The main problem is that #: currently has a meaning as a
line comment. This could break existing code."

It could, but the only case I can see is when the comment isn't following
indentation convention:

#: Valid either way; next line's not indented,
#: so it's not counted as part of the block.

# Causes an IndentationError in existing code.

def foo():
#: This one would break.


From taleinat at  Fri Jun 15 14:41:10 2012
From: taleinat at (Tal Einat)
Date: Fri, 15 Jun 2012 15:41:10 +0300
Subject: [Python-ideas] Weak-referencing/weak-proxying of (bound) methods
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 11, 2012 at 2:16 AM, Jan Kaliszewski <zuo at> wrote:

> Hello,
> Today, I encountered a surprising bug in my code which creates
> some weakref.proxies to instance methods... The actual Python
> behaviour related to the issue can be illustrated with the
> following example:
>    >>> import weakref
>    >>> class A:
>    ...     def method(self): print(self)
>    ...
>    >>> A.method
>    <function method at 0xb732926c>
>    >>> a = A()
>    >>> a.method
>    <bound method A.method of <__main__.A object at 0xb7326bec>>
>    >>> r = weakref.ref(a.method)  # creating a weak reference
>    >>> r                          # ...but it appears to be dead
>    <weakref at 0xb7327d9c; dead>
>    >>> w = weakref.proxy(a.method)  # the same with a weak proxy
>    >>> w
>    <weakproxy at 0xb7327d74 to NoneType at 0x829f7d0>
>    >>> w()
>    Traceback (most recent call last):
>      File "<stdin>", line 1, in <module>
>    ReferenceError: weakly-referenced object no longer exists
> This behaviour is perfectly correct -- but still surprising,
> especially for people who know little about method creation
> machinery, descriptors etc.
> I think it would be nice to make this 'trap' less painful --
> for example, by doing one or both of the following:
> 1. Describe and explain this behaviour in the weakref
> module documentation.
> 2. Provide (in functools?) a type-and-decorator that does the
> same thing as func_descr_get() (transforms a function into
> a method) *plus* caches the created method (e.g. at the
> instance object).
> A prototype implementation:
>    class InstanceCachedMethod(object):
>        def __init__(self, func):
>            self.func = func
>            (self.instance_attr_name
>            ) = '__{0}_method_ref'.format(func.__name__)
>        def __get__(self, instance, owner):
>            if instance is None:
>                return self.func
>            try:
>                return getattr(instance, self.instance_attr_name)
>            except AttributeError:
>                method = types.MethodType(self.func, instance)
>                setattr(instance, self.instance_attr_name, method)
>                return method
> A simplified version that reuses the func.__name__ (works well
> as long as func.__name__ is the actual instance attribute name...):
>    class InstanceCachedMethod(object):
>        def __init__(self, func):
>            self.func = func
>        def __get__(self, instance, owner):
>            if instance is None:
>                return self.func
>            method = types.MethodType(self.func, instance)
>            setattr(instance, self.func.__name__, method)
>            return method
> Both versions work well with weakref.proxy()/ref() objects:
>    >>> class B:
>    ...     @InstanceCachedMethod
>    ...     def method(self): print(self)
>    ...
>    >>> B.method
>    <function method at 0xb7329d6c>
>    >>> b = B()
>    >>> b.method
>    <bound method B.method of <__main__.B object at 0xb7206ccc>>
>    >>> r = weakref.ref(b.method)
>    >>> r
>    <weakref at 0xb72c611c; to 'method' at 0xb736c40c (method)>
>    >>> w = weakref.proxy(b.method)
>    >>> w
>    <weakproxy at 0xb7327e14 to method at 0xb736c40c>
>    >>> w()
>    <__main__.B object at 0xb7206ccc>
> What do you think about it?

I was bitten by this issue a while ago as well. It made working with
weakref proxies much more involved than I expected it would be.

Wouldn't it be better to approach the issue from the opposite end, and
improve/wrap/replace weakref.proxy with something that can handle bound
methods?
- Tal

From shibturn at  Fri Jun 15 15:01:23 2012
From: shibturn at (shibturn)
Date: Fri, 15 Jun 2012 14:01:23 +0100
Subject: [Python-ideas] Weak-referencing/weak-proxying of (bound) methods
In-Reply-To: <>
References: <>
Message-ID: <jrfbn5$mbj$>

On 15/06/2012 1:41pm, Tal Einat wrote:
> I was bitten by this issue a while ago as well. It made working with
> weakref proxies much more involved than I expected it would be.
> Wouldn't it be better to approach the issue from the opposite end, and
> improve/wrap/replace weakref.proxy with something that can handle bound
> methods?

Maybe just add something like the following to weakref:

def weakboundmethod(m):
     return m.__func__.__get__(weakref.proxy(m.__self__),
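
The helper above is cut off after its first argument in this archive.
A runnable reading of it follows, on the assumption that the missing
second argument is the owner type, type(m.__self__); the class A is
made up for the demonstration.

```python
import gc
import weakref


def weakboundmethod(m):
    # Second argument is an assumption; the original line is truncated.
    return m.__func__.__get__(weakref.proxy(m.__self__), type(m.__self__))


class A:
    def __init__(self):
        self.value = 42

    def method(self):
        return self.value


a = A()
wm = weakboundmethod(a.method)
value_while_alive = wm()

del a
gc.collect()
try:
    wm()
    error_after_del = None
except ReferenceError as exc:
    error_after_del = exc
```

The result is an ordinary bound method whose __self__ is a weak proxy,
so it does not keep the instance alive and raises ReferenceError when
called after the instance is gone.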



From robert.kern at  Fri Jun 15 16:04:30 2012
From: robert.kern at (Robert Kern)
Date: Fri, 15 Jun 2012 15:04:30 +0100
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <20120615104947.GM4256@bagheera>
References: <>
	<jrf0hg$sj4$> <20120615104947.GM4256@bagheera>
Message-ID: <jrffde$j5v$>

On 6/15/12 11:49 AM, Sven Marnach wrote:
> Robert Kern schrieb am Fri, 15. Jun 2012, um 10:50:40 +0100:
>> Multi-line string literals aren't comments. They are multi-line
>> string literals. Unlike a comment, which does not show up in the
>> compiled bytecode, the Python interpreter actually does something
>> with those string literals. Sometimes people abuse them as ways to
>> poorly emulate block comments, but this is an abuse, not a feature
>> of the language.
> Multi-line string literals do not generate code in CPython, and their
> use as comments has BDFL approval:

Well fancy that.

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

From gatesda at  Fri Jun 15 18:23:39 2012
From: gatesda at (David Gates)
Date: Fri, 15 Jun 2012 10:23:39 -0600
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <jrffde$j5v$>
References: <>
	<jrf0hg$sj4$> <20120615104947.GM4256@bagheera>
Message-ID: <>

I agree that using multi-line strings as comments comes across as an ugly
hack, even if it is BDFL-approved.

Your other point is valid, though as far as I can tell it's only an issue
when the comment is indented less than it ought to be (and starts with "#:",
of course):

#: Valid either way. The next line has the
#: same level of indentation, so it's not
#: counted as part of the block.
print('a')

# Causes an IndentationError in existing code.
#:
    print('b')

def foo():
#: This one would break.
    print('c')


From guido at  Fri Jun 15 18:43:35 2012
From: guido at (Guido van Rossum)
Date: Fri, 15 Jun 2012 09:43:35 -0700
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
	<jrf0hg$sj4$> <20120615104947.GM4256@bagheera>
Message-ID: <>

Let's not try to design a syntax for multi-line comments. There are
already enough ways to emulate them. Designing a new syntax based on #
plus some special character is doomed for backwards compatibility
(never mind the clever tricks proposed).
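
To make the backwards-compatibility concern concrete, here is a small
sketch showing that today's Python accepts a "#:" comment at any
indentation, so giving that prefix block semantics could change the
meaning of existing files. The source string and the name foo are only
illustrative.

```python
# Existing code may already contain an under-indented "#:" comment.
source = """\
def foo():
#: an under-indented comment, legal in today's Python
    return 'c'
"""

namespace = {}
# Comment-only lines are ignored when indentation is computed,
# so this compiles and runs without an IndentationError.
exec(compile(source, "<example>", "exec"), namespace)
print(namespace["foo"]())
```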


On Fri, Jun 15, 2012 at 9:23 AM, David Gates <gatesda at> wrote:
> I agree that using multi-line strings as comments comes across as an ugly
> hack, even if it is BDFL-approved.
> Your other point is valid, though as far as I can tell it's only an issue
> when the comment is indented less than it ought to be (and starts with "#:",
> of course):
> #: Valid either way. The next line has the
> #: same level of indentation, so it's not
> #: counted as part of the block.
> print('a')
> # Causes an IndentationError in existing code.
> #:
>     print('b')
> def foo():
> #: This one would break.
>     print('c')

--Guido van Rossum (

From elic at  Fri Jun 15 21:07:00 2012
From: elic at (Eli Collins)
Date: Fri, 15 Jun 2012 19:07:00 +0000 (UTC)
Subject: [Python-ideas]
References: <>
Message-ID: <>

Christian Heimes <lists at ...> writes:
> Am 11.06.2012 22:21, schrieb Guido van Rossum:
> > Do you really think that including some API in the stdlib is going to
> > make a difference in education? And what would we do if in 2 years
> > time the stdlib's "basic functionality" were somehow compromised (not
> > due to a bug in Python's implementation but simply through some
> > advance in the crypto world) -- how would we get everyone who relied
> > on the stdlib to switch to a different algorithm? I really think that
> > the right approach here is to get *everyone* who needs this to use a
> > 3rd party library. Diversity is very good here!
> +1
> I'm against adding just the password hashing algorithms. Developers can
> easily screw up the right algorithm with an erroneous approach. It's the
> beauty of passlib: The framework hides all the complex and
> easy-to-get-wrong stuff behind a minimal API.
> Christian

I know I'm a little late to this thread, but as the primary Passlib author, I
wanted to throw in my two cents.

I wholeheartedly agree with the idea of not having a high-level password hashing
library in stdlib. I'd be honored and happy to help in extracting a subset of
passlib for inclusion in the standard library. However, for all the reasons GvR
pointed out, I'm scared at the thought of how slowly end deployments would get
needed security updates (for one thing, I update the adaptive cost of the hashes
in passlib about once a year just as a matter of course). I'm reminded of how
the Debian project has had to create a "security" repository to supplement the
"stable" repository, just so the slow-moving "stable" release gets timely
security updates.

All that said, I wouldn't mind seeing a pbkdf2() primitive added to stdlib,
along the lines of M2Crypto's pbkdf2 function [1]. I agree such a function might
mislead developers to roll their own password hashing routines, but a word of
warning and redirection in the documentation might help with that. The reason I
see a need for such a function is that all existing password hashing libraries
(passlib, cryptacular, flufl.password, django.contrib.auth.hashers, etc) have
had to roll their own pure-python pbkdf2 implementations, to varying degrees of
speed. And speed is paramount for pbkdf2 usage, since security depends on
squeezing as many rounds / second out of the implementation as possible. 

Having a single C-accelerated primitive would be great for all of the above
libraries, and all the other uses pbkdf2 has. Furthermore, it wouldn't need
frequent security updates, since the hash storage format, default cost, default
digest, etc, would all be handled by the higher-level libraries. Not that I'm
advocating such a thing is *needed*, but that's what I'd love to see, were
anything to be added in this direction. 
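
For illustration, here is roughly what a pure-Python PBKDF2 of the kind
those libraries carry looks like. This is a sketch, not any library's
actual code: pbkdf2_pure and its parameter names are made up for the
example. Python 3.4 and later (released after this thread) ship a
C-accelerated hashlib.pbkdf2_hmac, which is used here only as a
cross-check.

```python
import hashlib
import hmac
import struct


def pbkdf2_pure(digest_name, password, salt, rounds, dklen):
    """Illustrative pure-Python PBKDF2-HMAC (hypothetical helper)."""
    hlen = hashlib.new(digest_name).digest_size
    block_count = -(-dklen // hlen)  # ceiling division
    output = b""
    for index in range(1, block_count + 1):
        # U_1 = PRF(password, salt || INT_32_BE(index))
        u = hmac.new(password, salt + struct.pack(">I", index), digest_name).digest()
        block = int.from_bytes(u, "big")
        for _ in range(rounds - 1):
            # U_n = PRF(password, U_{n-1}); the block is the XOR of all U_n
            u = hmac.new(password, u, digest_name).digest()
            block ^= int.from_bytes(u, "big")
        output += block.to_bytes(hlen, "big")
    return output[:dklen]


derived = pbkdf2_pure("sha256", b"correct horse", b"per-user-salt", 1000, 32)
# Cross-check against the stdlib primitive available in Python 3.4+.
reference = hashlib.pbkdf2_hmac("sha256", b"correct horse", b"per-user-salt", 1000, 32)
print(derived == reference)
```

The inner loop is where the rounds-per-second cost lives, which is why a
C implementation of just this primitive would help every higher-level
library without fixing a storage format or default cost.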

Hope all that helps in your decision making.



- Eli Collins

From steve at  Sat Jun 16 00:12:51 2012
From: steve at (Steven D'Aprano)
Date: Sat, 16 Jun 2012 08:12:51 +1000
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

David Gates wrote:
> Multi-line strings as comments don't nest, don't play well with docstrings,
> and are counter-intuitive when there's special language support for
> single-line comments. Python should only have one obvious way to do things,

That's not what the Zen says. The zen says:

There should be one-- and preferably only one --obvious way to do it.

which is a positive statement that there should be an obvious way to solve 
problems, NOT a negative statement that there shouldn't be non-obvious ways.

> and Python has two ways to comment, only one of which is obvious. My
> suggestion is to add language support for comment blocks, using Python's
> existing comment delimiter:

There is already support for nested multi-line comments: the humble # symbol 
can be nested arbitrarily deep. All you need is a modern editor that 
understands Python syntax, and with a single command you can comment or 
uncomment a block:

# This is a commented line.

# def fun(a, b, c):
#     """Docstrings are fine when commented"""
#     pass
#     # This is a nested comment.
# And no need for an end-delimiter either.

If your editor is old or too basic, you can do it by hand, which is a pain, 
but doable.
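
As a rough sketch of what such an editor command does (the helper names
comment_block and uncomment_block are invented for this example),
commenting is just a per-line prefix, which is why it nests without any
end delimiter:

```python
def comment_block(lines):
    """Prefix every line with '# ', as an editor's comment command might."""
    return ["# " + line for line in lines]


def uncomment_block(lines):
    """Strip one level of '# ' from lines that have it."""
    return [line[2:] if line.startswith("# ") else line for line in lines]


code = [
    "def fun(a, b, c):",
    '    """Docstrings are fine when commented"""',
    "    # This is a nested comment.",
    "    pass",
]

once = comment_block(code)
twice = comment_block(once)  # comment out an already commented block
print("\n".join(twice))
# Undoing both levels gives back the original lines.
print(uncomment_block(uncomment_block(twice)) == code)
```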

Python doesn't need dedicated syntax to make up for the limitations of your 
editor. Don't complicate the language for the sake of those poor fools stuck 
using Notepad.


From gatesda at  Sat Jun 16 00:47:12 2012
From: gatesda at (David Gates)
Date: Fri, 15 Jun 2012 16:47:12 -0600
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

My proposal wasn't for people who hand-code the single-line comment syntax
but for those that use multi-line string comments.  Since the multi-line
string hack's BDFL-approved, people will use it and other people will have
to deal with it.

The best alternative would be official discouragement of multi-line string
comments.  It's fine if Python doesn't have an officially-sanctioned
multi-line comment syntax, but if it's going to have one, it should have
one that makes sense.


From guido at  Sat Jun 16 00:51:09 2012
From: guido at (Guido van Rossum)
Date: Fri, 15 Jun 2012 15:51:09 -0700
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 15, 2012 at 3:47 PM, David Gates <gatesda at> wrote:
> My proposal wasn't for people who hand-code the single-line comment syntax
> but for those that use multi-line string comments.  Since the multi-line
> string hack's BDFL-approved, people will use it and other people will have
> to deal with it.

What's wrong with it?

> The best alternative would be official discouragement of multi-line string
> comments.  It's fine if Python doesn't have an officially-sanctioned
> multi-line comment syntax, but if it's going to have one, it should have one
> that makes sense.

What doesn't make sense about it?



--Guido van Rossum (

From gatesda at  Sat Jun 16 01:33:15 2012
From: gatesda at (David Gates)
Date: Fri, 15 Jun 2012 17:33:15 -0600
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

In discussions I've seen, including in this very thread ( ),
there are inevitably people that think the multi-line string comment syntax
is non-Pythonic, confusing, and/or a bad practice.  While they can adapt to
it, the initial impression is often that it's an overly-clever hack.

String literals *work* as comments in other languages, but the idiomatic
usage is always the dedicated comment syntax (even if there isn't a
multi-line syntax) because people assume that uncommented code is active
and significant.  The same goes for other tricks that use dead code as
comments, such as the "if false:" block I've seen suggested as an
alternative.  Comments do more than just delimit non-code: they signal
developer intent.


From guido at  Sat Jun 16 01:37:05 2012
From: guido at (Guido van Rossum)
Date: Fri, 15 Jun 2012 16:37:05 -0700
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

You can never get agreement on what is Pythonic or not, that's why we
have a BDFL. Feel free not to use strings as comments; as noted the
multi-line # form is fine. Do note that Python has docstrings, so
strings used as comments aren't completely alien like they would be in
most languages.


--Guido van Rossum (

From carl at  Sat Jun 16 01:07:55 2012
From: carl at (Carl Meyer)
Date: Fri, 15 Jun 2012 17:07:55 -0600
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On 06/15/2012 04:51 PM, Guido van Rossum wrote:
> On Fri, Jun 15, 2012 at 3:47 PM, David Gates <gatesda at> wrote:
>> My proposal wasn't for people who hand-code the single-line comment syntax
>> but for those that use multi-line string comments.  Since the multi-line
>> string hack's BDFL-approved, people will use it and other people will have
>> to deal with it.
> What's wrong with it?

The reason I discourage using multi-line strings as comments is that
they don't nest (which I think David mentioned earlier). If you've got a
short multi-line-string-as-comment in the middle of a function, and then
you try to use multi-line-string technique to comment out that entire
function, you don't get what you want, you get a syntax error as your
short comment is now parsed as code.
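
A small sketch of that failure, using compile() so nothing is actually
executed. The function source is made up for the example; the point is
only that the inner triple quote closes the outer "comment" early.

```python
function_source = '''\
def fun():
    """
    A short string used as a comment.
    """
    return 1
'''

# On its own, the function is valid Python.
compile(function_source, "<ok>", "exec")

# Now try to "comment out" the whole function with a multi-line string.
wrapped = '"""\n' + function_source + '"""\n'

wrapping_failed = False
try:
    compile(wrapped, "<wrapped>", "exec")
except SyntaxError as exc:
    # The outer string ends at the inner opening quotes, so the
    # former comment text is parsed as code.
    wrapping_failed = True
    print("wrapping failed:", type(exc).__name__)
```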

(FWIW, I don't think this means Python needs a dedicated syntax for
multi-line comments, I think multiple lines beginning with # works just
fine.)

From guido at  Sat Jun 16 01:47:59 2012
From: guido at (Guido van Rossum)
Date: Fri, 15 Jun 2012 16:47:59 -0700
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 15, 2012 at 4:07 PM, Carl Meyer <carl at> wrote:
> On 06/15/2012 04:51 PM, Guido van Rossum wrote:
>> On Fri, Jun 15, 2012 at 3:47 PM, David Gates <gatesda at> wrote:
>>> My proposal wasn't for people who hand-code the single-line comment syntax
>>> but for those that use multi-line string comments.  Since the multi-line
>>> string hack's BDFL-approved, people will use it and other people will have
>>> to deal with it.
>> What's wrong with it?
> The reason I discourage using multi-line strings as comments is that
> they don't nest (which I think David mentioned earlier). If you've got a
> short multi-line-string-as-comment in the middle of a function, and then
> you try to use multi-line-string technique to comment out that entire
> function, you don't get what you want, you get a syntax error as your
> short comment is now parsed as code.
> (FWIW, I don't think this means Python needs a dedicated syntax for
> multi-line comments, I think multiple lines beginning with # works just
> fine.)

In which languages do multi-line comments nest? AFAIK not in the
Java/C/C++/JavaScript family.

--Guido van Rossum (

From zuo at  Sat Jun 16 01:41:50 2012
From: zuo at (Jan Kaliszewski)
Date: Sat, 16 Jun 2012 01:41:50 +0200
Subject: [Python-ideas] Weak-referencing/weak-proxying of (bound) methods
In-Reply-To: <>
References: <>
Message-ID: <>

Tal Einat dixit (2012-06-15, 15:41):

> On Mon, Jun 11, 2012 at 2:16 AM, Jan Kaliszewski <zuo at> wrote:
> >    >>> import weakref
> >    >>> class A:
> >    ...     def method(self): print(self)
> >    ...
> >    >>> A.method
> >    <function method at 0xb732926c>
> >    >>> a = A()
> >    >>> a.method
> >    <bound method A.method of <__main__.A object at 0xb7326bec>>
> >    >>> r = weakref.ref(a.method)  # creating a weak reference
> >    >>> r                          # ...but it appears to be dead
> >    <weakref at 0xb7327d9c; dead>
> >    >>> w = weakref.proxy(a.method)  # the same with a weak proxy
> >    >>> w
> >    <weakproxy at 0xb7327d74 to NoneType at 0x829f7d0>
> >    >>> w()
> >    Traceback (most recent call last):
> >      File "<stdin>", line 1, in <module>
> >    ReferenceError: weakly-referenced object no longer exists
> >
> > This behaviour is perfectly correct -- but still surprising,
> > especially for people who know little about method creation
> > machinery, descriptors etc.
> >
> > I think it would be nice to make this 'trap' less painful --
> > A prototype implementation:
> >
> >    class InstanceCachedMethod(object):
> >
> >        def __init__(self, func):
> >            self.func = func
> >            (self.instance_attr_name
> >            ) = '__{0}_method_ref'.format(func.__name__)
> >
> >        def __get__(self, instance, owner):
> >            if instance is None:
> >                return self.func
> >            try:
> >                return getattr(instance, self.instance_attr_name)
> >            except AttributeError:
> >                method = types.MethodType(self.func, instance)
> >                setattr(instance, self.instance_attr_name, method)
> >                return method
> I was bitten by this issue a while ago as well. It made working with
> weakref proxies much more involved than I expected it would be.
> Wouldn't it be better to approach the issue from the opposite end, and
> improve/wrap/replace weakref.proxy with something that can handle bound
> methods?

Indeed, it could probably be done by wrapping weakref.ref()/proxy()
with something like the following:

    # here `obj` is the object that is being weak-referenced...
    if isinstance(obj, types.MethodType):
        try:
            cache = obj.__self__.__method_cache__
        except AttributeError:
            cache = obj.__self__.__method_cache__ = WeakKeyDictionary()
        cache.setdefault(obj.__func__, set()).add(obj)

(Using WeakKeyDictionary with corresponding function objects as weak
keys -- to provide automagic cleanup when a function is deleted, e.g.
replaced with another one.  In other words: the actual weak ref/proxy
to a method lives as long as the corresponding function does).
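
A runnable sketch of that idea, with some liberties: the wrapper name
method_aware_ref is invented, and instead of a set it stores one bound
method per function so that repeated calls hand back a reference to the
same cached object. Note that each cached method holds a strong
reference to its function, so in this sketch the weak key is kept alive
by its own value and the automatic cleanup is not demonstrated.

```python
import types
import weakref
from weakref import WeakKeyDictionary


def method_aware_ref(obj):
    """Hypothetical wrapper around weakref.ref() for bound methods."""
    if isinstance(obj, types.MethodType):
        try:
            cache = obj.__self__.__method_cache__
        except AttributeError:
            cache = obj.__self__.__method_cache__ = WeakKeyDictionary()
        # Reuse the method cached on the instance, or cache this one,
        # so the referent is kept alive by the instance itself.
        obj = cache.setdefault(obj.__func__, obj)
    return weakref.ref(obj)


class A:
    def method(self):
        return "called"


a = A()
plain = weakref.ref(a.method)
cached = method_aware_ref(a.method)
print(plain())     # on CPython the temporary bound method is already gone
print(cached()())  # the cached bound method is still callable
```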

Any thoughts?


From gatesda at  Sat Jun 16 02:28:39 2012
From: gatesda at (David Gates)
Date: Fri, 15 Jun 2012 18:28:39 -0600
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

Perl, Ruby, Lisps, OCaml, F#, Haskell, and I believe Pascal.  There are
probably others.  None of these use significant indentation; they're just
smart enough to not ignore beginning delimiters within multi-line comments.


From python at  Sat Jun 16 02:47:09 2012
From: python at (MRAB)
Date: Sat, 16 Jun 2012 01:47:09 +0100
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On 16/06/2012 01:28, David Gates wrote:
> Perl, Ruby, Lisps, OCaml, F#, Haskell, and I believe Pascal.  There are
> probably others.  None of these use significant indentation; they're
> just smart enough to not ignore beginning delimiters within multi-line
> comments.
In Pascal, comments start with "{" or "(*" and end with "}" or "*)".

How do you write a nested comment in Perl? As far as I'm aware, it
doesn't have nested comments either.

From gatesda at  Sat Jun 16 03:00:10 2012
From: gatesda at (David Gates)
Date: Fri, 15 Jun 2012 19:00:10 -0600
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

A Perl nested comment:

    Nested comment


From jeanpierreda at  Sat Jun 16 04:27:17 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Fri, 15 Jun 2012 22:27:17 -0400
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 15, 2012 at 6:51 PM, Guido van Rossum <guido at> wrote:
> On Fri, Jun 15, 2012 at 3:47 PM, David Gates <gatesda at> wrote:
>> My proposal wasn't for people who hand-code the single-line comment syntax
>> but for those that use multi-line string comments.  Since the multi-line
>> string hack's BDFL-approved, people will use it and other people will have
>> to deal with it.
> What's wrong with it?

It behaves "badly" in a lot of circumstances. If you put it in an
expression, it's treated as a string (not as an invisible thing) (IMHO
this is its worst failing, the rest is just fluff). And to add on some
more reasons, if you put it at the top of a file, class statement, or
def statement, it's treated as a docstring, which may accidentally be
included in autogenerated docs. In fact, for some common tools
(epydoc), even if you put it in some other places it may be grabbed as
a docstring (e.g. if you put it after a variable definition inside a
class statement), and be included in the documentation. Basically the
Python tool world seems to think that strings that aren't inside an
expression are "docstrings", not comments, and you have to be careful
to avoid being misinterpreted by your tools, which is unfortunate.
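
Two of those cases, sketched with made-up names (area, colors): a string
meant as a comment at the top of a def becomes the docstring, and one
placed inside an expression is concatenated with its neighbour.

```python
def area(width, height):
    """
    TODO: remove this once the new geometry module lands.
    """
    return width * height


# The "comment" became the function's docstring, visible to help()
# and to documentation tools.
print(repr(area.__doc__))

colors = [
    "red",
    """ this was meant as a comment """
    "green",
]
# Adjacent string literals are concatenated, so the "comment" is glued
# onto the next item instead of being ignored.
print(colors)
```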

In contrast, the reason that multiline comments are so great is that
they can go virtually anywhere without too much concern. For example:

    def foo(a, b(*=None*)):
        ...

In this hypothetical code, I commented out the =None in order to run
the test suite and see if any of my code omitted that argument, maybe
to judge how reasonable it is to remove the default. Here, neither "#"
comments nor docstrings really make this easy. The closest equivalent
is:

    def foo(a, b): #=None):

And that has to be done entirely by hand, and might be especially
painful (involving copy-paste) if it isn't the last argument that's
being changed.

I have done this sort of thing (commenting out stuff inside def
statements) many times, I don't even remember why. It crops up.

Of course, multiline comments go anywhere, not just in def statements.
And they span multiple lines! In practice, most of the time that's
just as easy with the editor key that inserts "#", I just wanted to
point out a case where no existing solution makes it so easy.

-- Devin

From bruce at  Sat Jun 16 05:26:33 2012
From: bruce at (Bruce Leban)
Date: Fri, 15 Jun 2012 20:26:33 -0700
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On Jun 15, 2012 4:46 PM, "Carl Meyer" <carl at> wrote:

> The reason I discourage using multi-line strings as comments is that
> they don't nest (which I think David mentioned earlier). If you've got a
> short multi-line-string-as-comment in the middle of a function, and then
> you try to use multi-line-string technique to comment out that entire
> function, you don't get what you want, you get a syntax error as your
> short comment is now parsed as code.

I think "commenting out" code and writing true comments are different

I would not advocate multi line strings for commenting out but they work
very well for long text comments.

Nested #s and a modern editor work well enough for commenting out IMHO and
are much harder to not notice.

--- Bruce
(from my phone)

On Jun 15, 2012 7:28 PM, "Devin Jeanpierre" <jeanpierreda at> wrote:

> the reason that multiline comments are so great is that
> they can go virtually anywhere without too much concern. For example:
>    def foo(a, b(*=None*)):
>        ...
> In this hypothetical code, I commented out the =None in order to run
> the test suite and see if any of my code omitted that argument, maybe
> to judge how reasonable it is to remove the default. Here, neither "#"
> comments nor docstrings really make this easy. The closest equivalent
> is:
>    def foo(a, b): #=None):
>        ...
> And that has to be done entirely by hand, and might be especially
> painful (involving copy-paste) if it isn't the last argument that's
> being changed.

For commenting out part of a line I think best practice is duplicating the
entire line as a comment and editing it directly. That handles scenarios
that inline comments don't and more importantly ensures reverting is error
free.

    # def foo(a, b=None):
    def foo(a, b=[]):

> Python tool world seems to think that strings that aren't inside an
> expression are "docstrings", not comments, and you have to be careful
> to avoid being misinterpreted by your tools, which is unfortunate.

Agreed. But even if multiline/inline comments were added you'd still have
that problem, right?

--- Bruce

From jeanpierreda at  Sat Jun 16 05:48:29 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Fri, 15 Jun 2012 23:48:29 -0400
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 15, 2012 at 11:26 PM, Bruce Leban <bruce at> wrote:
> For commenting out part of a line I think best practice is duplicating the
> entire line as a comment and editing it directly. That handles scenarios
> that inline comments don't and more importantly ensures reverting is error
> free.

I suppose so. So far I've done pretty much exactly what I wrote, and
used the undo buffer for safety.

There are also things like commenting out values inside lists and
such, but these are much less common for me. Like, definitely inline
comments are more flexible than EOL comments, but finding compelling
use-cases is kinda hard. There's only a bunch of minor special cases
and annoyances, as far as I can see.

>> Python tool world seems to think that strings that aren't inside an
>> expression are "docstrings", not comments, and you have to be careful
>> to avoid being misinterpreted by your tools, which is unfortunate.
> Agreed. But even if multiline/inline comments were added you'd still have
> that problem, right?

I don't see why this problem would exist for comments. Comments do not
have a (common) culture or behaviour of meaning anything else other
than comments, whereas triple-quoted strings have three purposes:

- Actual string objects
- Docstrings
- Comments

Multiline comments would need a lot of time to accumulate that many
orthogonal uses, and one would hope that they never do.

Aside from that, most of these sorts of tools manipulate code either
after parsing or after executing, and by then all comments have been
discarded. They wouldn't even see multiline comments.
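
A small sketch of that point using the standard ast module (the sample
source is invented for illustration, and ast.Constant assumes Python 3.8 or
later): after parsing, the "#" comment is gone, while the bare string is
still in the tree and is even reported as the docstring.

```python
import ast

source = '''
def foo(a, b=None):
    # a real comment
    "a string used as a comment"
    return a
'''

tree = ast.parse(source)
func = tree.body[0]

# Collect every bare string statement that survived parsing.
string_statements = [
    node.value.value
    for node in func.body
    if isinstance(node, ast.Expr)
    and isinstance(node.value, ast.Constant)
    and isinstance(node.value.value, str)
]

# The "#" comment is not represented anywhere in the tree.
comment_survives = "a real comment" in ast.dump(tree)

# Being the first statement, the string is what tools see as the docstring.
docstring = ast.get_docstring(func)
```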

Although, it's worth mentioning that doctest is an interesting
exception to all this: it uses regexps to parse out comments, which
are used as directives for the test runner. However, doctest only
touches code that is explicitly meant to be touched by doctest, and
that code generally doesn't need comments at all.

-- Devin

From bruce at  Sat Jun 16 06:28:33 2012
From: bruce at (Bruce Leban)
Date: Fri, 15 Jun 2012 21:28:33 -0700
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jun 15, 2012 at 8:48 PM, Devin Jeanpierre <jeanpierreda at> wrote:

> On Fri, Jun 15, 2012 at 11:26 PM, Bruce Leban <bruce at> wrote:
> > For commenting out part of a line I think best practice is duplicating the
> > entire line as a comment and editing it directly. That handles scenarios
> > that inline comments don't and more importantly ensures reverting is error
> > free.
> I suppose so. So far I've done pretty much exactly what I wrote, and
> used the undo buffer for safety.

Undo is dangerous because in most editors it will undo other intervening
changes to other parts of the program. You make a change like this to find
a bug, then find and fix the bug. Undo will remove the fix.

> > Agreed. But even if multiline/inline comments were added you'd still have
> > that problem, right?
> I don't see why this problem would exist for comments. Comments do not
> have a (common) culture or behaviour of meaning anything else other
> than comments, whereas triple-quoted strings have three purposes:

I meant that even if new comment syntax were added, string-style comments
wouldn't be going away anytime soon. There's a high bar for adding features
and an even higher bar for removing them. So tools will need to handle the
current string comments for quite a while as well as being modified to
parse any new comment syntax.

--- Bruce

From stephen at  Sat Jun 16 08:34:58 2012
From: stephen at (Stephen J. Turnbull)
Date: Sat, 16 Jun 2012 15:34:58 +0900
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

Carl Meyer writes:

 > The reason I discourage using multi-line strings as comments is that
 > they don't nest (which I think David mentioned earlier).

I don't see that as a problem.

While I don't use multiline strings as comments myself, I wouldn't
object to others using them for commentary, especially given the
syntactic analogy to docstrings.  But for commenting out code, a nice
heavy line in the left margin is an appropriate marker, and I would
certainly "discuss" the matter with colleagues who disabled large
chunks of code with paired delimiters, whether primarily string or
comment delimiters.  I'm a big non-fan of preprocessor conditional
compilation directives, for that matter.  (Sure, your editor can mark
or hide them, and it's not like you can avoid them in languages like
C, but that doesn't mean I have to *like* them.)

So IMO the current syntax encourages good style.

From stephen at  Sat Jun 16 08:45:23 2012
From: stephen at (Stephen J. Turnbull)
Date: Sat, 16 Jun 2012 15:45:23 +0900
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

Devin Jeanpierre writes:

 > [T]riple-quoted strings have three purposes:
 > - Actual string objects
 > - Docstrings

Docstrings are a subset of "actual string objects," of course.
They just have a special syntax, and their primary use is "meta" (eg,

 > - Comments

And so are strings-as-comments.  Strings could be used as comments in
C:

int foo ()
{
    "This comment would be optimized away, most likely.";
    "Not to mention compilers may bitch about lack of effect.";
    return 42;
}

It's just a side effect of expression statements.  You don't have to
like it, of course.

From steve at  Sat Jun 16 08:53:10 2012
From: steve at (Steven D'Aprano)
Date: Sat, 16 Jun 2012 16:53:10 +1000
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>
Message-ID: <>

Devin Jeanpierre wrote:

> Although, it's worth mentioning that doctest is an interesting
> exception to all this: it uses regexps to parse out comments, which
> are used as directives for the test runner. However, doctest only
> touches code that is explicitly meant to be touched by doctest, 

The normal way of running doctest is to use implicit test discovery: you point 
doctest at a module, and it will discover your doctests without you needing to 
explicitly list them.

That's why there is a doctest directive to *disable* tests, but no directive 
to enable them: you only need to explicitly turn tests off, not turn them on.
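
A hedged sketch of that behaviour (the function is invented for
illustration): nothing registers the examples explicitly, the finder picks
them up from the docstring, and the only directive needed is the one that
switches an example off.

```python
import doctest


def double(x):
    """Return twice x.

    >>> double(2)
    4
    >>> double("ab")  # doctest: +SKIP
    'deliberately wrong, but this example is skipped'
    """
    return 2 * x


# DocTestFinder discovers examples from the docstring; no test is listed
# by hand anywhere.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(double, "double", globs={"double": double}):
    runner.run(test)

results = runner.summarize(verbose=False)
```

In everyday use the same discovery happens through doctest.testmod() or
"python -m doctest module.py".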

> and that code generally doesn't need comments at all.

I write many functions or classes that include both documentation in the 
docstring, including doctests, and implementation comments in the body of the 
function. Docstrings and comments in the body of a function have very 
different purposes, just because a function has one doesn't mean that it won't 
have the other.


From steve at  Sat Jun 16 08:55:51 2012
From: steve at (Steven D'Aprano)
Date: Sat, 16 Jun 2012 16:55:51 +1000
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

Carl Meyer wrote:

> The reason I discourage using multi-line strings as comments is that
> they don't nest (which I think David mentioned earlier). If you've got a
> short multi-line-string-as-comment in the middle of a function, and then
> you try to use multi-line-string technique to comment out that entire
> function, you don't get what you want, you get a syntax error as your
> short comment is now parsed as code.

You can nest two such string-comments, by using different string delimiters:

'''Outermost comment

def func(x, y):
     """Innermost comment or docstring
     goes here
     """
'''


If you regularly need to do this, you're doing it wrong.
You should be deleting unused code, not commenting it
out. Nested comments as change tracking is *worse* than
no change tracking, in my opinion.

> (FWIW, I don't think this means Python needs a dedicated syntax for
> multi-line comments, I think multiple lines beginning with # works just
> fine.)



From steve at  Sat Jun 16 09:17:35 2012
From: steve at (Steven D'Aprano)
Date: Sat, 16 Jun 2012 17:17:35 +1000
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

Devin Jeanpierre wrote:

> In contrast, the reason that multiline comments are so great is that
> they can go virtually anywhere without too much concern. For example:
>     def foo(a, b(*=None*)):
>         ...

So now you're changing the semantics from *multiline* to *embedded* comments. 
Being able to embed a comment within an expression is a very different thing 
from just having comments extend across multiple lines.

> In this hypothetical code, I commented out the =None in order to run
> the test suite and see if any of my code omitted that argument, maybe
> to judge how reasonable it is to remove the default. Here, neither "#"
> comments nor docstrings really make this easy. The closest equivalent
> is:
>     def foo(a, b): #=None):
>         ...

The simplest change here would be to just delete the "=None", run your tests, 
then put it back if the tests fail.

Of course, alternatives are the comment above, or perhaps even better:

     #def foo(a, b=None):
     def foo(a, b):

which avoids the risk of forgetting what change needs to be undone.

In any case, all these alternatives are so trivial that they are hardly an 
argument for adding new comment syntax.

> And that has to be done entirely by hand, and might be especially
> painful (involving copy-paste) if it isn't the last argument that's
> being changed.

"Especially painful"? I fear you exaggerate somewhat.


From zuo at  Sat Jun 16 10:42:57 2012
From: zuo at (Jan Kaliszewski)
Date: Sat, 16 Jun 2012 10:42:57 +0200
Subject: [Python-ideas] Weak-referencing/weak-proxying of (bound) methods
In-Reply-To: <>
References: <>
Message-ID: <>

Jan Kaliszewski dixit (2012-06-16, 01:41):

> Tal Einat dixit (2012-06-15, 15:41):
> > On Mon, Jun 11, 2012 at 2:16 AM, Jan Kaliszewski <zuo at> wrote:
> [snip]
> > >    >>> import weakref
> > >    >>> class A:
> > >    ...     def method(self): print(self)
> > >    ...
> > >    >>> A.method
> > >    <function method at 0xb732926c>
> > >    >>> a = A()
> > >    >>> a.method
> > >    <bound method A.method of <__main__.A object at 0xb7326bec>>
> > >    >>> r = weakref.ref(a.method)  # creating a weak reference
> > >    >>> r                          # ...but it appears to be dead
> > >    <weakref at 0xb7327d9c; dead>
> > >    >>> w = weakref.proxy(a.method)  # the same with a weak proxy
> > >    >>> w
> > >    <weakproxy at 0xb7327d74 to NoneType at 0x829f7d0>
> > >    >>> w()
> > >    Traceback (most recent call last):
> > >      File "<stdin>", line 1, in <module>
> > >    ReferenceError: weakly-referenced object no longer exists
> > >
> > > This behaviour is perfectly correct -- but still surprising,
> > > especially for people who know little about method creation
> > > machinery, descriptors etc.
> > >
> > > I think it would be nice to make this 'trap' less painful --
> [snip]
> > > A prototype implementation:
> > >
> > >    class InstanceCachedMethod(object):
> > >
> > >        def __init__(self, func):
> > >            self.func = func
> > >            (self.instance_attr_name
> > >            ) = '__{0}_method_ref'.format(func.__name__)
> > >
> > >        def __get__(self, instance, owner):
> > >            if instance is None:
> > >                return self.func
> > >            try:
> > >                return getattr(instance, self.instance_attr_name)
> > >            except AttributeError:
> > >                method = types.MethodType(self.func, instance)
> > >                setattr(instance, self.instance_attr_name, method)
> > >                return method
> [snip]
> > I was bitten by this issue a while ago as well. It made working with
> > weakref proxies much more involved than I expected it would be.
> > 
> > Wouldn't it be better to approach the issue from the opposite end, and
> > improve/wrap/replace weakref.proxy with something that can handle bound
> > methods?
> Indeed, it could probably be done by wrapping weakref.ref()/proxy()
> with something like the following:
>     # here `obj` is the object that is being weak-referenced...
>     if isinstance(obj, types.MethodType):
>         try:
>             cache = obj.__self__.__method_cache__
>         except AttributeError:
>             cache = obj.__self__.__method_cache__ = WeakKeyDictionary()
>         cache.setdefault(obj.__func__, set()).add(obj)
> (Using WeakKeyDictionary with corresponding function objects as weak
> keys -- to provide automagic cleanup when a function is deleted, e.g.
> replaced with another one.  In other words: the actual weak ref/proxy
> to a method lives as long as the corresponding function does).

On second thought -- no, it shouldn't be done on the side of
weakref.ref()/proxy().

Why?  My last idea described just above has such a bug: each time
you create a new weak reference to the method another method object
is cached (added to __method_cache__[func] set).

You could think that caching only one object (just in
__method_cache__[func]) would be a better idea, but it wouldn't:
such a behaviour would be strange and unstable: after creating
a new weakref to the method, the old weakref would become invalid...

And yes, we can prevent it by ensuring that each time you take
the method from a class instance you get the same object (per class
instance) -- but then we come back to my previous idea of a
descriptor-decorator.  And IMHO such a decorator should not be
applied on the class dictionary implicitly by weakref.ref()/proxy()
but explicitly in the class body with the decorator syntax
(applying such a decorator, i.e. replacing a function with a
caching descriptor in a class dict, is too invasive an operation to
be done silently).

So I renew (and update) my previous descriptor-decorator that
could be added to functools (or to weakref as a helper?) and
applied explicitly by programmers, when needed:

    import types
    import weakref
    from weakref import WeakKeyDictionary

    class CachedMethod(object):

        def __init__(self, func):
            self.func = func

        def __get__(self, instance, owner):
            if instance is None:
                return self.func
            try:
                cache = instance.__method_cache__
            except AttributeError:
                # not thread-safe :-(
                cache = instance.__method_cache__ = WeakKeyDictionary()
            # one cached bound method per function, per instance
            return cache.setdefault(
                  self.func,
                  types.MethodType(self.func, instance))


    class MyClass(object):

        @CachedMethod
        def my_method(self):
            pass

    instance = MyClass()
    method_weak_proxy = weakref.proxy(instance.my_method)
    method_weak_proxy()  # works!

It should be noted that caching a reference to a method in an
instance causes circular referencing (class <-> instance).
However, often it is not a problem and can help avoiding
circular references involving other objects which we want to
have circular-ref-free (typical use case: passing a bound
method as a callback).
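
For comparison, a short sketch of the trap itself and of
weakref.WeakMethod, which was added to the standard library later (Python
3.4) and is not part of the proposal above. The class here is invented for
illustration, and the "dead immediately" result relies on CPython's
reference counting.

```python
import gc
import weakref


class A:
    def method(self):
        return "called"


a = A()

# Each attribute access builds a fresh bound-method object, so a plain
# weak reference to it has nothing keeping its target alive.
plain_ref = weakref.ref(a.method)
gc.collect()  # not needed on CPython, harmless elsewhere
plain_target = plain_ref()

# WeakMethod re-creates the bound method on demand for as long as the
# instance itself is alive.
method_ref = weakref.WeakMethod(a.method)
result = method_ref()()

# Once the instance is gone, the WeakMethod is dead as well.
del a
gc.collect()
after_delete = method_ref()
```

Unlike the caching descriptor, this keeps no reference cycle on the
instance, at the cost of callers having to opt in at the point where the
weak reference is created.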


From zuo at  Sat Jun 16 10:46:48 2012
From: zuo at (Jan Kaliszewski)
Date: Sat, 16 Jun 2012 10:46:48 +0200
Subject: [Python-ideas] Weak-referencing/weak-proxying of (bound)
 methods [erratum, sorry]
In-Reply-To: <>
References: <>
Message-ID: <>

Jan Kaliszewski dixit (2012-06-16, 10:42):

> instance causes circular referencing (class <-> instance).


From p.f.moore at  Sat Jun 16 10:56:49 2012
From: p.f.moore at (Paul Moore)
Date: Sat, 16 Jun 2012 09:56:49 +0100
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On 16 June 2012 02:00, David Gates <gatesda at> wrote:
> A Perl nested comment:
>
> =for
>   Comment
>   =for
>     Nested comment
>   =cut
> =cut

And the irony is that, as far as I recall, this is a form of Perl's
embedded documentation syntax (and hence very similar in spirit to
using multiline strings as comments). See (and note that perl 6 does
*not*, apparently, include multiline comments).


From jeanpierreda at  Sat Jun 16 10:57:30 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Sat, 16 Jun 2012 04:57:30 -0400
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jun 16, 2012 at 2:53 AM, Steven D'Aprano <steve at> wrote:

Steven, the code I was talking about was the code inside the doctests,
not the code surrounding the doctests. So, for example, whether or not
the body of the function has comments doesn't matter. They could never
be confused with doctest directives.

-- Devin

From jeanpierreda at  Sat Jun 16 11:09:51 2012
From: jeanpierreda at (Devin Jeanpierre)
Date: Sat, 16 Jun 2012 05:09:51 -0400
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jun 16, 2012 at 12:28 AM, Bruce Leban <bruce at> wrote:
> I meant that even if new comment syntax were added, string-style comments
> wouldn't be going away anytime soon. There's a high bar for adding features
> and an even higher bar for removing them. So tools will need to handle the
> current string comments for quite a while as well as being modified to parse
> any new comment syntax.

Sorry, I didn't understand your point at first. That's a concern.
Although I'm not sure it pans out -- do any tools handle string

I only know of tools ignoring them or mistreating them, not handling
them specially.

-- Devin

From greg.ewing at  Sat Jun 16 09:59:21 2012
From: greg.ewing at (Greg Ewing)
Date: Sat, 16 Jun 2012 19:59:21 +1200
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum wrote:

> In which languages do multi-line comments nest? AFAIK not in the
> Java/C/C++/JavaScript family.



From gatesda at  Sat Jun 16 15:37:10 2012
From: gatesda at (David Gates)
Date: Sat, 16 Jun 2012 07:37:10 -0600
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

I was throwing together a quick language list, so I pulled some of them
from hyperpolyglot, including Perl.  So, guess it's not a dedicated comment
syntax, but it does nest (the document you linked says it doesn't, but it's

Found out that Lua also uses dead-code strings as comments.  It supports
nested strings, but the delimiters in each layer must be distinct.  Trying
to nest them otherwise is a syntax error, so you can't accidentally end a
string early like you can with quote delimiters.

On Sat, Jun 16, 2012 at 2:56 AM, Paul Moore <p.f.moore at> wrote:

> On 16 June 2012 02:00, David Gates <gatesda at> wrote:
> > A Perl nested comment:
> >
> > =for
> >   Comment
> >   =for
> >     Nested comment
> >   =cut
> > =cut
> And the irony is that, as far as I recall, this is a form of Perl's
> embedded documentation syntax (and hence very similar in spirit to
> using multiline strings as comments). See
> (and note that perl 6 does
> *not*, apparently, include multiline comments).
> Paul.

From guido at  Sat Jun 16 16:56:27 2012
From: guido at (Guido van Rossum)
Date: Sat, 16 Jun 2012 07:56:27 -0700
Subject: [Python-ideas] Multi-line comment blocks.
In-Reply-To: <>
References: <>
Message-ID: <>

Please stop this discussion. Python is not going to change this.

--Guido van Rossum

From ironfroggy at  Sat Jun 16 19:05:48 2012
From: ironfroggy at (Calvin Spealman)
Date: Sat, 16 Jun 2012 13:05:48 -0400
Subject: [Python-ideas] for/else statements considered harmful
In-Reply-To: <jqooj7$s7o$>
References: <jqooj7$s7o$>
Message-ID: <>

On Wed, Jun 6, 2012 at 7:20 PM, Alice Bevan–McGregor
<alice at> wrote:
> Howdy!
> Was teaching a new user to Python the ropes a short while ago and ran into
> an interesting headspace problem: the for/else syntax fails the obviousness
> and consistency tests. When used in an if/else block the conditional code
> is executed if the conditional passes, and the else block is executed if the
> conditional fails. Compared to for loops where the for code is repeated and
> the else code executed if we "naturally fall off the loop". (The new user's
> reaction was "why the hoek would I ever use for/else?")

I read it not as for/else and while/else, but break/else and this has been a
much more natural framing for myself and those I've used the framing to
explain the behavior to.
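
A minimal sketch of the break/else reading (the function name is made up
for illustration): the else suite belongs to the break, and runs only when
the loop ended without one.

```python
def find_first_negative(values):
    for value in values:
        if value < 0:
            found = value
            break
    else:
        # Runs only when the loop finished without hitting break,
        # which includes the case where values is empty.
        found = None
    return found
```

Read this way, "else" means "no break happened", which is why it also fires
for an empty iterable.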

> I forked Python 3.3 to experiment with an alternate implementation that
> follows the logic of pass/fail implied by if/else: (and to refactor the
> stdlib, but that's a different issue ;)
>   for x in range(20):
>       if x > 10: break
>   else:
>       pass # we had no values to iterate
>   finally:
>       pass # we naturally fell off the loop
> It abuses finally (to avoid tying up a potentially common word as a reserved
> word like "done") but makes possible an important distinction without having
> to perform potentially expensive length calculations (which may not even be
> possible!) on the value being iterated: that is, handling the case where
> there were no values in the collection or returned by the generator.
> Templating engines generally implement this type of structure. Of course
> this type of breaking change in semantics puts this idea firmly into Python
> 4 land.
> I'll isolate the for/else/finally code from my fork and post a patch this
> week-end, hopefully.
>         -- Alice.

From lclarkmichalek at  Sun Jun 17 00:28:50 2012
From: lclarkmichalek at (Laurie Clark-Michalek)
Date: Sat, 16 Jun 2012 23:28:50 +0100
Subject: [Python-ideas] Make dict customisation easier
Message-ID: <>


A few weeks ago, a guy was on #python, looking to customise a dictionary to
be case insensitive (he was assuming string keys). His naive implementation
looked something like this:

    class CaseInsensitiveDict(dict):
        def __getitem__(self, key):
            return dict.__getitem__(self, key.lower())

        def __setitem__(self, key, item):
            dict.__setitem__(self, key.lower(), item)

However he was dismayed to find that this didn't work with other methods
that dict uses:

>>> d = CaseInsensitiveDict()
>>> d['a'] = 3
>>> d
{'a': 3}
>>> d['A']
3
>>> d.get('A', "No key found")
'No key found'

Eventually he was directed to dir(dict), and he seemed to accept that he
would have to wrap most of the methods of the dict builtin. This seemed
like the worse solution to me, and I couldn't see any real reason why
python couldn't either defer to user implemented __getitem__ and
__setitem__, or provide an alternative dict implementation that did allow
easy customisation.

I realise that python dicts are fairly high performance structures, and
that checking for a custom implementation might have an unacceptable impact
for a solution to what might be seen as a minor problem. Still, I think it
is worth the effort to clean up what seems to me to be a slight wart on a
very fundamental type in python.



From simon.sapin at  Sun Jun 17 00:41:27 2012
From: simon.sapin at (Simon Sapin)
Date: Sun, 17 Jun 2012 00:41:27 +0200
Subject: [Python-ideas] Make dict customisation easier
In-Reply-To: <>
References: <>
Message-ID: <>

On 17/06/2012 00:28, Laurie Clark-Michalek wrote:
> Eventually he was directed to dir(dict), and he seemed to accept that he
> would have to wrap most of the methods of the dict builtin. This seemed
> like the worse solution to me, and I couldn't see any real reason why
> python couldn't either defer to user implemented __getitem__ and
> __setitem__, or provide an alternative dict implementation that did
> allow easy customisation.
> I realise that python dicts are fairly high performance structures, and
> that checking for a custom implementation might have an unacceptable
> impact for a solution to what might be seen as a minor problem. Still, I
> think it is worth the effort to clean up what seems to me to be a slight
> wart on a very fundamental type in python.


The MutableMapping class in the collections module has default 
implementations for many methods, based on a few basic methods. I think 
that inheriting from it and adding __len__, __iter__, __getitem__, 
__setitem__ and __delitem__ should be enough.

Then you can override more methods for performance, but the defaults 
should be correct and consistent.
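
A sketch of what that could look like for the case-insensitive dictionary
from the original post. It assumes string keys, stores the data in a private
dict named _data (an illustrative choice), and imports from collections.abc,
which is where MutableMapping lives in current Python versions.

```python
from collections.abc import MutableMapping


class CaseInsensitiveDict(MutableMapping):
    """Mapping with string keys that are compared case-insensitively."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        # update() is inherited and goes through __setitem__.
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


d = CaseInsensitiveDict()
d["a"] = 3
```

Because get(), setdefault(), update(), "in" and the rest are all built on
the five methods above, they pick up the case folding without being
overridden one by one.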

Simon Sapin

From lists at  Sun Jun 17 02:13:49 2012
From: lists at (Christian Heimes)
Date: Sun, 17 Jun 2012 02:13:49 +0200
Subject: [Python-ideas] Context helper for new os.*at functions
Message-ID: <jrj7ft$7fe$>


Python 3.3 has got new wrappers for the 'at' variants of low level
functions, for example os.openat(). The 'at' variants work like their
brothers and sisters with one exception. The first argument must be a
file descriptor of a directory. The fd is used to calculate the absolute
path instead of the current working directory.

File descriptors are harder to manage than files because a fd isn't
automatically closed when it gets out of scope. I've written a small
wrapper that takes care of the details. It also ensures that only
directories are opened.


Example:

with atcontext("/etc") as at:
    print(at.open)
    # functools.partial(<built-in function openat>, 3)
    f = at.open("fstab", os.O_RDONLY)
    print(os.read(f, 50))
    os.close(f)


The code calculates the name and creates a dynamic wrapper with
functools.partial. This may not be desired if the wrapper is added to the
os module. I could add explicit methods and generate the doc strings
from the methods' doc strings.

def docfix(func):
    name = func.__name__
    nameat = name + "at"
    doc = getattr(os, nameat).__doc__
    func.__doc__ = doc.replace("{}(dirfd, ".format(nameat),
                               "{}(".format(name))
    return func

class atcontext:
    ...
    @docfix
    def open(self, *args):
        return self.openat(self.dirfd, *args)

How do you like my proposal?


From guido at  Sun Jun 17 02:46:15 2012
From: guido at (Guido van Rossum)
Date: Sat, 16 Jun 2012 17:46:15 -0700
Subject: [Python-ideas] Context helper for new os.*at functions
In-Reply-To: <jrj7ft$7fe$>
References: <jrj7ft$7fe$>
Message-ID: <>

Hmm... Isn't Larry Hastings working on replacing the separate
functions with an api where you pass an 'fd=...' argument to the
non-at function?

On Sat, Jun 16, 2012 at 5:13 PM, Christian Heimes <lists at> wrote:
> Hello,
> Python 3.3 has got new wrappers for the 'at' variants of low level
> functions, for example os.openat(). The 'at' variants work like their
> brothers and sisters with one exception. The first argument must be a
> file descriptor of a directory. The fd is used to calculate the absolute
> path instead of the current working directory.
> File descriptors are harder to manage than files because a fd isn't
> automatically closed when it gets out of scope. I've written a small
> wrapper that takes care of the details. It also ensures that only
> directories are opened.
> Example:
> with atcontext("/etc") as at:
>    print(at.open)
>    # functools.partial(<built-in function openat>, 3)
>    f = at.open("fstab", os.O_RDONLY)
>    print(os.read(f, 50))
>    os.close(f)
> Code:
> The code calculates the name and creates a dynamic wrapper with
> functools.partial. This may not be desired if the wrapper is added to the
> os module. I could add explicit methods and generate the doc strings
> from the methods' doc strings.
> def docfix(func):
>    name = func.__name__
>    nameat = name + "at"
>    doc = getattr(os, nameat).__doc__
>    func.__doc__ = doc.replace("{}(dirfd, ".format(nameat),
>                               "{}(".format(name))
>    return func
> class atcontext:
>    ...
>    @docfix
>    def open(self, *args):
>        return self.openat(self.dirfd, *args)
> How do you like my proposal?
> Christian

--Guido van Rossum

From lists at  Sun Jun 17 03:34:13 2012
From: lists at (Christian Heimes)
Date: Sun, 17 Jun 2012 03:34:13 +0200
Subject: [Python-ideas] Context helper for new os.*at functions
In-Reply-To: <>
References: <jrj7ft$7fe$>
Message-ID: <>

On 17.06.2012 02:46, Guido van Rossum wrote:
> Hmm... Isn't Larry Hastings working on replacing the separate
> functions with an api where you pass an 'fd=...' argument to the
> non-at function?

Oh, is he? I didn't know that. Indeed, it sounds like a good approach.

Users must still handle the fd correctly and make sure they open a
directory. Linux's open(2) man page warns about the possibility of
denial-of-service attempts for wrong fds. Linux has O_DIRECTORY for this
purpose. On other OSes users should do a stat() call in front, which is
open for race conditions but still better than getting stuck in a FIFO.

I could modify the wrapper a bit to make the wrapper useful for the new API:

class atcontext:
   def fileno(self):
       # for PyObject_AsFileDescriptor()
       return self.dirfd

with atcontext("/etc") as at:
    os.open("fstab", os.O_RDONLY, fd=at)
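
A minimal sketch of the same idea against the API as it later shipped,
assuming Python 3.3 or newer on a platform where os.open() accepts the
keyword-only dir_fd argument (the thread above calls it 'fd'). The names
DirFD and read_relative are hypothetical helpers, not part of the os module.

```python
import os


class DirFD:
    """Hypothetical helper: open a directory and close its fd on exit."""

    def __init__(self, path):
        self.path = path
        self.dirfd = None

    def __enter__(self):
        # O_DIRECTORY (where the platform has it) makes open() fail on
        # anything that is not a directory, e.g. a FIFO.
        flags = os.O_RDONLY | getattr(os, "O_DIRECTORY", 0)
        self.dirfd = os.open(self.path, flags)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        os.close(self.dirfd)
        self.dirfd = None

    def fileno(self):
        return self.dirfd


def read_relative(directory, name, size=50):
    """Read up to `size` bytes from `name`, resolved relative to `directory`."""
    with DirFD(directory) as at:
        fd = os.open(name, os.O_RDONLY, dir_fd=at.fileno())
        try:
            return os.read(fd, size)
        finally:
            os.close(fd)
```

Calling read_relative("/etc", "fstab") would mirror the example above; the
directory fd is closed when the with block ends, even if the read fails.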


From techtonik at  Mon Jun 18 17:26:59 2012
From: techtonik at (anatoly techtonik)
Date: Mon, 18 Jun 2012 18:26:59 +0300
Subject: [Python-ideas] Just __main__
Message-ID: <>

How about global __main__ as a boolean?

__name__ == '__main__' as a mark of entrypoint module is coherent and
logical, but awkward to type and requires explicit explanation for
newcomers even with prior background in other languages.

From matt at  Mon Jun 18 18:07:04 2012
From: matt at (Matt Chaput)
Date: Mon, 18 Jun 2012 12:07:04 -0400
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
Message-ID: <>

On 18/06/2012 11:26 AM, anatoly techtonik wrote:
> How about global __main__ as a boolean?

Love it.

From jkbbwr at  Mon Jun 18 18:09:11 2012
From: jkbbwr at (Jakob Bowyer)
Date: Mon, 18 Jun 2012 17:09:11 +0100
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
Message-ID: <>


On Mon, Jun 18, 2012 at 5:07 PM, Matt Chaput <matt at> wrote:
> On 18/06/2012 11:26 AM, anatoly techtonik wrote:
>> How about global __main__ as a boolean?
> Love it.

From ethan at  Mon Jun 18 18:17:05 2012
From: ethan at (Ethan Furman)
Date: Mon, 18 Jun 2012 09:17:05 -0700
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
Message-ID: <>

anatoly techtonik wrote:
> How about global __main__ as a boolean?
> __name__ == '__main__' as a mark of entrypoint module is coherent and
> logical, but awkward to type and requires explicit explanation for
> newcomers even with prior background in other languages.

So instead of:

   if __name__ == '__main__':
       ...

you would have:

   if __main__:
       ...

?

~Ethan~

From ubershmekel at  Mon Jun 18 18:49:41 2012
From: ubershmekel at (Yuval Greenfield)
Date: Mon, 18 Jun 2012 19:49:41 +0300
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 18, 2012 at 7:17 PM, Ethan Furman <ethan at> wrote:

> anatoly techtonik wrote:
>> How about global __main__ as a boolean?
>> __name__ == '__main__' as a mark of entrypoint module is coherent and
>> logical, but awkward to type and requires explicit explanation for
>> newcomers even with prior background in other languages.
> So instead of:
>  if __name__ == '__main__':
>    ...
> you would have:
>  if __main__:
>    ...
> ?
> ~Ethan~

Makes sense....

if __main__:

From stephen at  Mon Jun 18 18:57:15 2012
From: stephen at (Stephen J. Turnbull)
Date: Tue, 19 Jun 2012 01:57:15 +0900
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
Message-ID: <>

Ethan Furman writes:
 > anatoly techtonik wrote:
 > > How about global __main__ as a boolean?

-1

Saves typing, yes, but otherwise there's no point.  It would need just
as much explanation, for one thing.

 > you would have:
 >    if __main__:
 >      ...

Would it be writable?

__main__ = False

if __main__:
    print("Oh, I didn't want to run these tests anyway...")

From storchaka at  Mon Jun 18 19:12:39 2012
From: storchaka at (Serhiy Storchaka)
Date: Mon, 18 Jun 2012 20:12:39 +0300
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
Message-ID: <jrnnhb$n5a$>

On 18.06.12 19:17, Ethan Furman wrote:
> anatoly techtonik wrote:
>> How about global __main__ as a boolean?
>> __name__ == '__main__' as a mark of entrypoint module is coherent and
>> logical, but awkward to type and requires explicit explanation for
>> newcomers even with prior background in other languages.
> So instead of:
> if __name__ == '__main__':
> ...
> you would have:
> if __main__:
> ...
> ?

No, it is much easier.

   import sys
   if __main__ if sys.version_info >= (3, 9) else __name__ == '__main__':
       ...

or

   try:
       __main__
   except NameError:
       __main__ = __name__ == '__main__'
   if __main__:
       ...

From mikegraham at  Mon Jun 18 19:25:11 2012
From: mikegraham at (Mike Graham)
Date: Mon, 18 Jun 2012 13:25:11 -0400
Subject: [Python-ideas] Just __main__
In-Reply-To: <jrnnhb$n5a$>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

On Mon, Jun 18, 2012 at 1:12 PM, Serhiy Storchaka <storchaka at> wrote:
> No, it is much easier.
>  import sys
>  if __main__ if sys.version_info >= (3, 9) else __name__ == '__main__':
>      ...
> or
>  try:
>      __main__
>  except NameError:
>      __main__ = __name__ == '__main__'
>  if __main__:
>      ...

That's nonsense. If you wanted to support old Python versions, you'd
write `if __name__ == '__main__'` (there's no reason __name__ would
change its behavior). If the oldest version you wanted to support had
this feature, you'd write `if __main__`. This is the way every other
new feature works. (It even has the advantage of failing loudly if you
try to do it on an older version of Python.)

That being said, I'm -0 on the feature. I don't think it's really much
easier to explain or worth any effort.


From amcnabb at  Mon Jun 18 20:05:26 2012
From: amcnabb at (Andrew McNabb)
Date: Mon, 18 Jun 2012 12:05:26 -0600
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

On Mon, Jun 18, 2012 at 01:25:11PM -0400, Mike Graham wrote:
> That being said, I'm -0 on the feature. I don't think it's really much
> easier to explain or worth any effort.

I agree that having a boolean called "__main__" wouldn't add much value,
but I believe that recognizing a function called "__main__" could
potentially add a bit more value.

After executing the body of a script, the interpreter would
automatically call the "__main__" function if it exists, and exit with
its return value.  Thus:

def __main__():
    return 42

would be roughly equivalent to:

if __name__ == '__main__':

It might make sense to have "python -i" not call the "__main__"
function, making it easier to interact with a script after the time that
its methods and global variables are all defined but before the time
that it enters __main__.

I'm not sure if a "__main__" function would add enough value, but I
think it would add more value than a "__main__" boolean.
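
The proposed behavior can be approximated today from the outside. This is
only a sketch under stated assumptions: run_with_main_function is a
hypothetical helper name, and it relies on runpy.run_path() from the
standard library returning the script's globals after the body has run.

```python
import runpy


def run_with_main_function(path):
    """Run the script at `path`, then call its __main__() if it defines one."""
    # run_path executes the file top to bottom and returns its globals.
    namespace = runpy.run_path(path, run_name="__main__")
    entry = namespace.get("__main__")
    if callable(entry):
        # The proposal would hand this value to the interpreter as exit status.
        return entry()
    return None
```

A launcher could then finish with sys.exit(run_with_main_function(path)),
while an interactive session could skip the call and inspect the globals.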

Andrew McNabb
PGP Fingerprint: 8A17 B57C 6879 1863 DE55  8012 AB4D 6098 8826 6868

From jeremiah.dodds at  Mon Jun 18 20:59:27 2012
From: jeremiah.dodds at (Jeremiah Dodds)
Date: Mon, 18 Jun 2012 14:59:27 -0400
Subject: [Python-ideas] Just __main__
In-Reply-To: <> (Andrew McNabb's message of
	"Mon, 18 Jun 2012 12:05:26 -0600")
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <87txy8cupc.fsf@destructor.i-did-not-set--mail-host-address--so-tickle-me>

Andrew McNabb <amcnabb at> writes:

> I'm not sure if a "__main__" function would add enough value, but I
> think it would add more value than a "__main__" boolean.

+1 . 

From bruce at  Mon Jun 18 21:39:05 2012
From: bruce at (Bruce Leban)
Date: Mon, 18 Jun 2012 12:39:05 -0700
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

On Mon, Jun 18, 2012 at 11:05 AM, Andrew McNabb <amcnabb at> wrote:

> I agree that having a boolean called "__main__" wouldn't add much value,
> but I believe that recognizing a function called "__main__" could
> potentially add a bit more value.
> After executing the body of a script, the interpreter would
> automatically call the "__main__" function if it exists, and exit with
> its return value.

The special value of __name__ and the proposed __main__() function are both
a bit magic. However, when I write if __name__ == '__main__' it's at least
clear that that if statement *will* be executed. It's just a question of
when the condition is true and if I don't know I can find out fairly
easily. (As I did the first time I saw it and probably other people on this
list did too.) On the other hand, it's not at all obvious that a function
named __main__ will be executed automagically.

This will increase the python learning curve, because people will need to
learn both the old method and the new method, especially since code that is
compatible with multiple python versions will need to continue to use the
old method. It saves one or two lines:

    if __name__ == '__main__': main()

A __main__ boolean, that saves even less typing, and does not seem worth
adding either.

-1 for both

--- Bruce

From massimo.dipierro at  Mon Jun 18 21:58:38 2012
From: massimo.dipierro at (Massimo DiPierro)
Date: Mon, 18 Jun 2012 14:58:38 -0500
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

How about a decorator that has the same effect as calling the function if __name__ == '__main__'?

On Jun 18, 2012, at 2:39 PM, Bruce Leban wrote:

> On Mon, Jun 18, 2012 at 11:05 AM, Andrew McNabb <amcnabb at> wrote:
> I agree that having a boolean called "__main__" wouldn't add much value,
> but I believe that recognizing a function called "__main__" could
> potentially add a bit more value.
> After executing the body of a script, the interpreter would
> automatically call the "__main__" function if it exists, and exit with
> its return value.
> The special value of __name__ and the proposed __main__() function are both a bit magic. However, when I write if __name__ == '__main__' it's at least clear that that if statement *will* be executed. It's just a question of when the condition is true and if I don't know I can find out fairly easily. (As I did the first time I saw it and probably other people on this list did too.) On the other hand, it's not at all obvious that a function named __main__ will be executed automagically.
> This will increase the python learning curve, because people will need to learn both the old method and the new method, especially since code that is compatible with multiple python versions will need to continue to use the old method. It saves one or two lines:
>     if __name__ == '__main__': main()
> A __main__ boolean, that saves even less typing, and does not seem worth adding either.
> -1 for both
> --- Bruce

From amcnabb at  Mon Jun 18 22:09:33 2012
From: amcnabb at (Andrew McNabb)
Date: Mon, 18 Jun 2012 14:09:33 -0600
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

On Mon, Jun 18, 2012 at 12:39:05PM -0700, Bruce Leban wrote:
> The special value of __name__ and the proposed __main__() function are both
> a bit magic. However, when I write if __name__ == '__main__' it's at least
> clear that that if statement *will* be executed. It's just a question of
> when the condition is true and if I don't know I can find out fairly
> easily. (As I did the first time I saw it and probably other people on this
> list did too.) On the other hand, it's not at all obvious that a function
> named __main__ will be executed automagically.

Given that C, Java, and numerous other languages automagically execute a
function called "main", I would argue that a "__main__" function would
actually be _less_ surprising than "if __name__ == '__main__'" for most
new Python users.

> This will increase the python learning curve, because people will need to
> learn both the old method and the new method, especially since code that is
> compatible with multiple python versions will need to continue to use the
> old method. It saves one or two lines:
>     if __name__ == '__main__': main()

If the only difference is saving a few lines, I agree that it probably
isn't worth it.  However, it also allows for a richer interactive mode
as I mentioned previously, so the benefit may not be limited to the
negligible number of lines saved.

Andrew McNabb
PGP Fingerprint: 8A17 B57C 6879 1863 DE55  8012 AB4D 6098 8826 6868

From storchaka at  Mon Jun 18 22:14:04 2012
From: storchaka at (Serhiy Storchaka)
Date: Mon, 18 Jun 2012 23:14:04 +0300
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <jro25j$f82$>

On 18.06.12 20:25, Mike Graham wrote:
> That's nonsense.

Of course. This is a reductio ad absurdum.

From solipsis at  Mon Jun 18 22:13:50 2012
From: solipsis at (Antoine Pitrou)
Date: Mon, 18 Jun 2012 22:13:50 +0200
Subject: [Python-ideas] Just __main__
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

On Mon, 18 Jun 2012 14:09:33 -0600
Andrew McNabb <amcnabb at> wrote:
> On Mon, Jun 18, 2012 at 12:39:05PM -0700, Bruce Leban wrote:
> > 
> > The special value of __name__ and the proposed __main__() function are both
> > a bit magic. However, when I write if __name__ == '__main__' it's at least
> > clear that that if statement *will* be executed. It's just a question of
> > when the condition is true and if I don't know I can find out fairly
> > easily. (As I did the first time I saw it and probably other people on this
> > list did too.) On the other hand, it's not at all obvious that a function
> > named __main__ will be executed automagically.
> Given that C, Java, and numerous other languages automagically execute a
> function called "main", I would argue that a "__main__" function would
> actually be _less_ surprising than "if __name__ == '__main__'" for most
> new Python users.

Yes, a __main__ function would be reasonable, especially now that we
have files in packages.

Massimo's suggestion of a decorator, OTOH, sounds useless: how would it
help in any way?



From ethan at  Mon Jun 18 22:24:28 2012
From: ethan at (Ethan Furman)
Date: Mon, 18 Jun 2012 13:24:28 -0700
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>	<>
	<jrnnhb$n5a$>	<>	<>	<>
Message-ID: <>

Massimo DiPierro wrote:
> how about a decorator that has the same effect as calling the function
> if __name__=='__main__'?

I believe several have been written... something like (untested):

def main(automagically_run):
   # Only fires when this decorator is defined in the script being run.
   if __name__ == '__main__':
       automagically_run()
   return automagically_run  # assuming SystemExit wasn't raised ;)
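
A variant that can live in a separate helper module, offered as a sketch
rather than an established recipe: instead of the decorator's own __name__
it checks the decorated function's __module__. The name run_if_main is
hypothetical.

```python
def run_if_main(func):
    """Call `func` immediately if it was defined in the script being run."""
    # A function records the __name__ of the module that defined it, so this
    # is true only for functions defined in the entry-point module.
    if func.__module__ == "__main__":
        func()
    return func
```

One consequence of this design is that the function runs at decoration
time, so anything defined further down in the script does not exist yet
when it is called.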


From ethan at  Mon Jun 18 22:38:31 2012
From: ethan at (Ethan Furman)
Date: Mon, 18 Jun 2012 13:38:31 -0700
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>	<>
	<jrnnhb$n5a$>	<>	<>	<>	<>
Message-ID: <>

Antoine Pitrou wrote:
> On Mon, 18 Jun 2012 14:09:33 -0600
> Andrew McNabb <amcnabb at> wrote:
>> On Mon, Jun 18, 2012 at 12:39:05PM -0700, Bruce Leban wrote:
>>> The special value of __name__ and the proposed __main__() function are both
>>> a bit magic. However, when I write if __name__ == '__main__' it's at least
>>> clear that that if statement *will* be executed. It's just a question of
>>> when the condition is true and if I don't know I can find out fairly
>>> easily. (As I did the first time I saw it and probably other people on this
>>> list did too.) On the other hand, it's not at all obvious that a function
>>> named __main__ will be executed automagically.
>> Given that C, Java, and numerous other languages automagically execute a
>> function called "main", I would argue that a "__main__" function would
>> actually be _less_ surprising than "if __name__ == '__main__'" for most
>> new Python users.
> Yes, a __main__ function would be reasonable, especially now that we
> have files in packages.
> Massimo's suggestion of a decorator, OTOH, sounds useless: how would it
> help in any way?

I've actually tried the @main decorator approach, and found it not worth 
the trouble -- I went back to 'if __name__ == "__main__"'.


From steve at  Mon Jun 18 23:26:16 2012
From: steve at (Steven D'Aprano)
Date: Tue, 19 Jun 2012 07:26:16 +1000
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>	<>
	<jrnnhb$n5a$>	<>	<>	<>
Message-ID: <>

Andrew McNabb wrote:
> On Mon, Jun 18, 2012 at 12:39:05PM -0700, Bruce Leban wrote:
>> The special value of __name__ and the proposed __main__() function are both
>> a bit magic. However, when I write if __name__ == '__main__' it's at least
>> clear that that if statement *will* be executed. It's just a question of
>> when the condition is true and if I don't know I can find out fairly
>> easily. (As I did the first time I saw it and probably other people on this
>> list did too.) On the other hand, it's not at all obvious that a function
>> named __main__ will be executed automagically.
> Given that C, Java, and numerous other languages automagically execute a
> function called "main", I would argue that a "__main__" function would
> actually be _less_ surprising than "if __name__ == '__main__'" for most
> new Python users.

What makes you think that "most" new users will be experienced in C or Java?

I think it is more likely that the majority of new users will have no 
experience in programming at all, or that their primary experience will be in 
PHP or Javascript.

But we're all just guessing really. I don't think any of us know what 
languages most current Python users came from, let alone what future ones will 
come from.

But as a matter of principle, I would prefer to assume that new users come in 
with as few preconceived ideas as possible, rather than assuming that they 
expect Python to be just like <insert language of choice here>.


From ned at  Mon Jun 18 23:35:03 2012
From: ned at (Ned Batchelder)
Date: Mon, 18 Jun 2012 17:35:03 -0400
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

On 6/18/2012 4:09 PM, Andrew McNabb wrote:
> On Mon, Jun 18, 2012 at 12:39:05PM -0700, Bruce Leban wrote:
>> The special value of __name__ and the proposed __main__() function are both
>> a bit magic. However, when I write if __name__ == '__main__' it's at least
>> clear that that if statement *will* be executed. It's just a question of
>> when the condition is true and if I don't know I can find out fairly
>> easily. (As I did the first time I saw it and probably other people on this
>> list did too.) On the other hand, it's not at all obvious that a function
>> named __main__ will be executed automagically.
> Given that C, Java, and numerous other languages automagically execute a
> function called "main", I would argue that a "__main__" function would
> actually be _less_ surprising than "if __name__ == '__main__'" for most
> new Python users.
But a __main__ function misses the whole point: that a module can be 
importable and runnable, and the if statement detects the difference.  
If you simply want a function that is always invoked as the main, then 
just invoke it:

    def main():
        blah blah

    main()

No need for special names at all.


From matt at  Tue Jun 19 00:39:14 2012
From: matt at (Matt Chaput)
Date: Mon, 18 Jun 2012 18:39:14 -0400
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

> But a __main__ function misses the whole point: that a module can be
> importable and runnable, and the if statement detects the difference. If
> you simply want a function that is always invoked as the main, then just
> invoke it:
>     def main():
>     blah blah
>     main()
> No need for special names at all.

I'm afraid you're the one who's missed the point... the interpreter 
would only call __main__() if __name__ == "__main__"

Some people will cry "magic", but to me this is about what makes sense 
when you explain it to someone, and I think __main__() makes more sense 
(especially to someone with experience in other languages) than "if 
__name__ == "__main__""


From ned at  Tue Jun 19 02:37:56 2012
From: ned at (Ned Batchelder)
Date: Mon, 18 Jun 2012 20:37:56 -0400
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
	<> <>
Message-ID: <>

On 6/18/2012 6:39 PM, Matt Chaput wrote:
>> But a __main__ function misses the whole point: that a module can be
>> importable and runnable, and the if statement detects the difference. If
>> you simply want a function that is always invoked as the main, then just
>> invoke it:
>>     def main():
>>     blah blah
>>     main()
>> No need for special names at all.
> I'm afraid you're the one who's missed the point... the interpreter 
> would only call __main__() if __name__ == "__main__"
> Some people will cry "magic", but to me this is about what makes sense 
> when you explain it to someone, and I think __main__() makes more 
> sense (especially to someone with experience in other languages) than 
> "if __name__ == "__main__""
I understand the proposal now, and yes, it is "magic".  Explicit is 
better than implicit.  I like this explanation: "When you run a Python 
program, all the statements are run, from top to bottom." better than, 
"When you run a Python program, all the statements are run, from top to 
bottom, and then if there is a __main__ function (which there need not 
be), then it is invoked."

Python is full of constructs that are simpler than other languages, 
which when used in conventional ways, act similar to other languages.  
No need to complicate things to make it easier for C programmers to 
understand.  There's a lot they need to get used to in Python, and "if 
__name__ == '__main__':" is not difficult.


> Matt

From techtonik at  Tue Jun 19 09:01:25 2012
From: techtonik at (anatoly techtonik)
Date: Tue, 19 Jun 2012 10:01:25 +0300
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 18, 2012 at 7:57 PM, Stephen J. Turnbull <stephen at> wrote:
> Ethan Furman writes:
>  > anatoly techtonik wrote:
>  > > How about global __main__ as a boolean?
> -1
> Saves typing, yes, but otherwise there's no point.  It would need just
> as much explanation, for one thing.

It would be more convincing to have a solid counter argument for -1,
or else I am inclined to count 'no point' arguments as -0.

>  > you would have:
>  >
>  >    if __main__:
>  >      ...
> Would it be writable?

The same way as __name__. Yes.

From techtonik at  Tue Jun 19 09:52:36 2012
From: techtonik at (anatoly techtonik)
Date: Tue, 19 Jun 2012 10:52:36 +0300
Subject: [Python-ideas] Just __main__
In-Reply-To: <87txy8cupc.fsf@destructor.i-did-not-set--mail-host-address--so-tickle-me>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

On Mon, Jun 18, 2012 at 9:59 PM, Jeremiah Dodds
<jeremiah.dodds at> wrote:
> Andrew McNabb <amcnabb at> writes:
>> I'm not sure if a "__main__" function would add enough value, but I
>> think it would add more value than a "__main__" boolean.
> +1 .

My first thought is that __main__() as a function is bad for Python,
and here is why.

In C, Java and other compiled languages the so-called main() function is
the primary execution entrypoint. With no main() there was no way to
instruct the compiler what should be run first, so logically it would just
start with the first function at the top (and if you remember early
compilers - you could only call functions that were already defined, that
means written above yours). Code execution always started with main().
It was the first application byte to start with when the program was
loaded into memory by the OS.

In Python, execution of program code starts before the  if __name__
== '__main__'   is encountered (and it's an awesome feature of a
scripting language to start execution immediately). With an automagical
__main__() function it will also start before. That's why __main__()
will never be a substitute for the classical C-style entrypoint.

In Python entrypoint is a module (entrypoint namespace). __name__ is
equal to '__main__' not only in a script, but also in console. And
__main__ as a flag in this namespace correctly reflects this semantic
- "Is this a main namespace? True". A value of __name__ in console

So, a __main__() function is not equivalent to the C/Java entrypoint.
However, a function like this may play an important role in marking the
end of the "import phase" or "initialization phase". This is a high-level
concept that is extremely useful for web
applications/servers/frameworks, which need to know when application
processes can be forked more effectively.

Here is one more problem - when module is executed as a script, it
loses its __name__, which becomes equal to '__main__'. I don't know if
it ever caused any problems with imports or consistency in object
space, or with static imports - it will be interesting to know any
outcomes. What if module __name__ always meant module name? But that's
another thread. As for __main__ - in this case instead of boolean it
could be the name of the entrypoint module, and the check would be  if
__name__ == __main__   without quotes.
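
The renaming described above can be observed with the standard library
alone. The following is a sketch, not part of any proposal: the helper name
names_seen_by_module and the file name demo_module.py are made up for the
illustration, while runpy.run_path() and importlib.util are real stdlib APIs.

```python
import importlib.util
import os
import runpy
import tempfile


def names_seen_by_module():
    """Return the __name__ one file sees when run as a script and when imported."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_module.py")
        with open(path, "w") as fh:
            fh.write("seen = __name__\n")

        # Executed the way "python demo_module.py" would execute it.
        as_script = runpy.run_path(path, run_name="__main__")["seen"]

        # Loaded under its real name, as an import would do.
        spec = importlib.util.spec_from_file_location("demo_module", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        as_import = module.seen

    return as_script, as_import
```

The same source file reports '__main__' in the first case and 'demo_module'
in the second, which is why a script that is also imported elsewhere ends up
loaded twice under two names.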

From stephen at  Tue Jun 19 09:58:30 2012
From: stephen at (Stephen J. Turnbull)
Date: Tue, 19 Jun 2012 16:58:30 +0900
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
Message-ID: <>

anatoly techtonik writes:

 > It would be more convincing to have a solid counter argument for
 > -1,

"Not every three-line function needs to be a builtin."
"Explicit is better than implicit."
"Simple is better than complex."
"There should be one (and preferably only one) obvious way to do it."

 > or else I am inclined to count 'no point' arguments as -0.

Feel free; it doesn't matter to me, and I don't much matter to the
decision, either.  Not to mention that you don't do the counting.

The people who will actually make a decision on this don't need it
spelled out, though, and your ideas would get better reception from
Those Whose Opinions Really Count if you would filter them through the
Zen before posting.

From techtonik at  Tue Jun 19 09:56:30 2012
From: techtonik at (anatoly techtonik)
Date: Tue, 19 Jun 2012 10:56:30 +0300
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

On Tue, Jun 19, 2012 at 10:52 AM, anatoly techtonik <techtonik at> wrote:
> I don't know if it ever caused any problems with imports or consistency in object
> space, or with static imports - it will be interesting to know any
> outcomes.

s/static imports/static analysis/

From storchaka at  Tue Jun 19 10:14:47 2012
From: storchaka at (Serhiy Storchaka)
Date: Tue, 19 Jun 2012 11:14:47 +0300
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <jrpcct$7l0$>

On 18.06.12 21:05, Andrew McNabb wrote:
> It might make sense to have "python -i" not call the "__main__"
> function, making it easier to interact with a script after the time that
> its methods and global variables are all defined but before the time
> that it enters __main__.

   python -i -c "from SCRIPT import *"

From simon.sapin at  Tue Jun 19 11:41:03 2012
From: simon.sapin at (Simon Sapin)
Date: Tue, 19 Jun 2012 11:41:03 +0200
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

On 19/06/2012 09:52, anatoly techtonik wrote:
> Here is one more problem - when module is executed as a script, it
> loses its __name__, which becomes equal to '__main__'.

PEP 395 "Qualified Names for Modules" tries to address this.

Simon Sapin

From ubershmekel at  Tue Jun 19 13:27:59 2012
From: ubershmekel at (Yuval Greenfield)
Date: Tue, 19 Jun 2012 14:27:59 +0300
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

On Tue, Jun 19, 2012 at 12:41 PM, Simon Sapin <simon.sapin at> wrote:

> On 19/06/2012 09:52, anatoly techtonik wrote:
>> Here is one more problem - when module is executed as a script, it
>> loses its __name__, which becomes equal to '__main__'.
> PEP 395 "Qualified Names for Modules" tries to address this.

I agree that python does not need any magic __main__ function. The __main__
boolean is streets ahead in readability though.

Yuval


From alexandre.zani at  Tue Jun 19 17:02:37 2012
From: alexandre.zani at (Alexandre Zani)
Date: Tue, 19 Jun 2012 08:02:37 -0700
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
	<> <jrnnhb$n5a$>
Message-ID: <>

-1 on a __main__ function. Seems like unnecessarily confusing magic.

-0 on a __main__ boolean. It just doesn't seem to add much value on
top of __name__ == '__main__'.

On Tue, Jun 19, 2012 at 4:27 AM, Yuval Greenfield <ubershmekel at> wrote:
> On Tue, Jun 19, 2012 at 12:41 PM, Simon Sapin <simon.sapin at> wrote:
>> On 19/06/2012 09:52, anatoly techtonik wrote:
>>> Here is one more problem - when module is executed as a script, it
>>> loses its __name__, which becomes equal to '__main__'.
>> PEP 395 "Qualified Names for Modules" tries to address this.
> I agree that python does not need any magic __main__ function. The __main__
> boolean is streets ahead in readability though.
> Yuval

From barry at  Wed Jun 20 21:06:08 2012
From: barry at (Barry Warsaw)
Date: Wed, 20 Jun 2012 15:06:08 -0400
Subject: [Python-ideas] Add adaptive-load salt-mandatory
	hashing	functions?
References: <>
Message-ID: <>

On Jun 15, 2012, at 07:07 PM, Eli Collins wrote:

>The reason I see a need for such a function is that all existing password
>hashing libraries (passlib, cryptacular, flufl.password,
>django.contrib.auth.hashers, etc) have had to roll their own pure-python
>pbkdf2 implementations, to varying degrees of speed. And speed is paramount
>for pbkdf2 usage, since security depends on squeezing as many rounds / second
>out of the implementation as possible.

To be honest, if I'd known about passlib I probably would never have written
flufl.password.  Extra +1 goodness for passlib's Python 3 support!

I'm going to migrate my own applications to passlib and if that goes well,
I'll start the process of deprecating flufl.password.


From sven at  Thu Jun 21 19:04:37 2012
From: sven at (Sven Marnach)
Date: Thu, 21 Jun 2012 18:04:37 +0100
Subject: [Python-ideas] Just __main__
In-Reply-To: <>
References: <>
Message-ID: <20120621170437.GB4153@bagheera>

anatoly techtonik wrote on Mon, 18 Jun 2012, at 18:26:59 +0300:
> How about global __main__ as a boolean?

Currently, __main__ is the name of a module.  You can do

    import __main__

to import this module.  After this import, __main__ evaluates to True
as a Boolean expression.
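
A short check of the point above, as a sketch: describe_main is a
hypothetical helper name, and the only facts it relies on are that
__main__ is an ordinary entry in sys.modules and that module objects are
truthy.

```python
import sys
import __main__  # the module object for the entry-point namespace


def describe_main():
    """Summarize what the name __main__ refers to after importing it."""
    return {
        "type": type(__main__).__name__,
        "truthy": bool(__main__),
        "same_as_sys_modules": sys.modules["__main__"] is __main__,
    }
```

Because the imported name is truthy in every module that imports it, an
`if __main__:` test would already pass there today, regardless of whether
that module is the entry point.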

I don't think it's a good idea to overload the meaning of
__main__.

From steve at  Fri Jun 22 03:20:16 2012
From: steve at (Steven D'Aprano)
Date: Fri, 22 Jun 2012 11:20:16 +1000
Subject: [Python-ideas] Just __main__
In-Reply-To: <20120621170437.GB4153@bagheera>
References: <>
Message-ID: <>

Sven Marnach wrote:
> anatoly techtonik wrote on Mon, 18 Jun 2012, at 18:26:59 +0300:
>> How about global __main__ as a boolean?
> Currently, __main__ is the name of a module.  You can do
>     import __main__
> to import this module.  After this import, __main__ evaluates to True
> as a Boolean expression.
> I don't think it's a good idea to overload the meaning of the
> __main__.

Well caught! I think that kills this proposal dead.


From kim at  Mon Jun 25 14:17:29 2012
From: kim at (Kim Gräsman)
Date: Mon, 25 Jun 2012 14:17:29 +0200
Subject: [Python-ideas] BackupFile
Message-ID: <>


I'm new here, so forgive me if this has been discussed before or is off-topic.

I came up with a mechanism that I thought might be useful in the
Python standard library -- a scope-bound self-restoring backup file. I
came up with this naïve implementation:

import os
import shutil
import tempfile

class BackupError(Exception):
   pass

class Backup:
   def __init__(self, path):
       if not os.path.exists(path) or os.path.isdir(path):
           raise BackupError("%s must be a valid file path" % path)

       self.path = path
       self.backup_path = None

   def __enter__(self):
       # Body missing from the archive; assumed to take the backup on entry.
       self.backup()
       return self

   def __exit__(self, type, value, traceback):
       # Body missing from the archive; assumed to restore on exit.
       self.restore()

   def _generate_backup_path(self):
       # Keep the original file name, but inside a fresh temp directory.
       tempdir = tempfile.mkdtemp()
       basename = os.path.basename(self.path)
       return os.path.join(tempdir, basename)

   def backup(self):
       backup_path = self._generate_backup_path()
       shutil.copy(self.path, backup_path)
       self.backup_path = backup_path

   def restore(self):
       if self.backup_path:
           # Write backup back onto original
           shutil.copy(self.backup_path, self.path)
           self.backup_path = None

Backups are intended to be scope-bound like so:

 with Backup(settings_file):
    rewrite_settings(settings_file)
    do_something_else()

I even managed to use it with the @contextmanager decorator, to allow this:

 with rewrite_settings(settings_file):
    do_something_else()
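
The message does not show how rewrite_settings was written, so the following
is only a rough sketch of how such a helper could be built on
contextlib.contextmanager. The new_text parameter and the way the file is
rewritten are assumptions made for illustration, not the original code.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def rewrite_settings(path, new_text):
    """Temporarily replace the contents of path, restoring them on exit.

    new_text is a hypothetical parameter; the real helper may differ.
    """
    tempdir = tempfile.mkdtemp()
    backup_path = os.path.join(tempdir, os.path.basename(path))
    shutil.copy(path, backup_path)          # back up the original
    try:
        with open(path, "w") as f:          # illustrative rewrite step
            f.write(new_text)
        yield path
    finally:
        shutil.copy(backup_path, path)      # restore unconditionally
        shutil.rmtree(tempdir)
```

The try/finally plays the role of __exit__ in the class version: the restore
runs whether the body finishes normally or raises.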

So, open questions;

- Would something like this be useful outside of my office?
- Any suggestions for better names?
- This feels like it belongs in the tempfile module, would you agree?
- What's lacking in the implementation? Have I done something
decidedly non-Pythonic?

- Kim

From masklinn at  Mon Jun 25 14:33:36 2012
From: masklinn at (Masklinn)
Date: Mon, 25 Jun 2012 14:33:36 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <>

On 2012-06-25, at 14:17 , Kim Gräsman wrote:
> - Would something like this be useful outside of my office?

I'm not sure I correctly understand the purpose of this, and if I do it
seems to be kind-of a hack for "fixing" kind-of crummy code: is it
correct that the goal is to temporarily edit a file (and restore it
later) to change the behavior of *other* pieces of code reading the same
file?

So essentially dynamically scoping the content of a file?

I find the idea rather troublesome/problematic, as it's completely
blind to (and unsafe under) concurrent access, and will be tricky to
handle cleanly wrt filesystem caches and commits.

The initial mail hinted at atomic file replacement *or* backing up a file
and restoring the backup on error, something along the lines of:

    with Backup(settings_file):
        alter_file()
        alter_file_2()
        alter_file_3()
    # altered file

    with Backup(settings_file):
        alter_file()
        alter_file_2()
        raise Exception("boom")
        alter_file_3()
    # old file is back

in the same way e.g. Emacs will keep "~" files around during editing. That
could have been a ~+1 for me, but the behavior as I understood it
(understanding which may be incorrect, again) I'd be -1 on, it seems too
dangerous and too tied to other issues in the code.
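
For comparison, the "keep the changes on success, restore on error" behavior
described above could look roughly like this. It is a sketch only;
restore_on_error is a hypothetical name, not something from the original
proposal.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def restore_on_error(path):
    """Back up path; put the backup back only if the body raises."""
    tempdir = tempfile.mkdtemp()
    backup_path = os.path.join(tempdir, os.path.basename(path))
    shutil.copy(path, backup_path)
    try:
        yield
    except BaseException:
        shutil.copy(backup_path, path)   # old file is back
        raise
    finally:
        shutil.rmtree(tempdir)           # the backup is discarded either way
```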

From mikegraham at  Mon Jun 25 15:42:22 2012
From: mikegraham at (Mike Graham)
Date: Mon, 25 Jun 2012 09:42:22 -0400
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jun 25, 2012 at 8:17 AM, Kim Gräsman <kim at> wrote:
> Hello,
> I'm new here, so forgive me if this has been discussed before or is off-topic.
> I came up with a mechanism that I thought might be useful in the
> Python standard library -- a scope-bound self-restoring backup file. I
> came to this naïve implementation;
> --
> class BackupError(Exception):
>   pass
> class Backup:
>   def __init__(self, path):
>       if not os.path.exists(path) or os.path.isdir(path):
>           raise BackupError("%s must be a valid file path" % path)
>       self.path = path
>       self.backup_path = None
>   def __enter__(self):
>       self.backup()
>   def __exit__(self, type, value, traceback):
>       self.restore()
>   def _generate_backup_path(self):
>       tempdir = tempfile.mkdtemp()
>       basename = os.path.basename(self.path)
>       return os.path.join(tempdir, basename)
>   def backup(self):
>       backup_path = self._generate_backup_path()
>       shutil.copy(self.path, backup_path)
>       self.backup_path = backup_path
>   def restore(self):
>       if self.backup_path:
>           # Write backup back onto original
>           shutil.copy(self.backup_path, self.path)
>           shutil.rmtree(os.path.dirname(self.backup_path))
>           self.backup_path = None
> --
> Backups are intended to be scope-bound like so:
>  with Backup(settings_file):
>    rewrite_settings(settings_file)
>    do_something_else()
> I even managed to use it with the @contextmanager decorator, to allow this:
>  with rewrite_settings(settings_file):
>    do_something_else()
> So, open questions;
> - Would something like this be useful outside of my office?
> - Any suggestions for better names?
> - This feels like it belongs in the tempfile module, would you agree?
> - What's lacking in the implementation? Have I done something
> decidedly non-Pythonic?
> Thanks,
> - Kim

I like the basic idea, but if we do something like this, it would be
useful to have read access to the old version of the file while you
are writing out the new version that might become permanent.

If I was to implement something like this, I'd use a "write a
temporary file then copy it overwriting the old one when I'm done"
approach rather than a "back up the file" approach so that if the
process dies for a reason Python can't clean up after (like due to
SIGKILL), the half-written file doesn't remain.

I don't really like the name Backup but I can't think of a better name
at the moment.


From lists at  Mon Jun 25 15:59:40 2012
From: lists at (Christian Heimes)
Date: Mon, 25 Jun 2012 15:59:40 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <js9qsc$k21$>

On 25.06.2012 14:17, Kim Gräsman wrote:
> Hello,
> I'm new here, so forgive me if this has been discussed before or is off-topic.
> I came up with a mechanism that I thought might be useful in the
> Python standard library -- a scope-bound self-restoring backup file. I
> came to this naïve implementation;

Are you aiming for atomic file rollover backed by a temporary file?
That's the common way to safely overwrite an existing file. It works
differently than your code.

* Create a temporary file with O_CREAT | O_EXCL in the same directory as
the file you like to replace

* Write data to new file

* Call sync() on the file as well as fdatasync() and fsync() on the file

* close the file

* use atomic rename to replace the old file with the new file (IIRC
won't work atomically on Windows)

I've some code lying around somewhere that implements a RolloverFile
similar to tempfile.NamedTemporaryFile.
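
The RolloverFile code itself is not shown in this thread. The steps listed
above can be sketched roughly as follows; atomic_write is a hypothetical
name, and the sketch assumes os.replace (new in Python 3.3) is available for
the final rename. tempfile.mkstemp opens its file with O_CREAT | O_EXCL.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of path with data via a temporary file and a rename."""
    dirname = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory as the target
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()               # push Python's buffer to the OS
            os.fsync(f.fileno())    # ask the OS to push it to disk
        os.replace(tmp_path, path)  # rename over the old file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Keeping the temporary file in the target's directory matters because a
rename is only atomic within one filesystem.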


From masklinn at  Mon Jun 25 16:23:00 2012
From: masklinn at (Masklinn)
Date: Mon, 25 Jun 2012 16:23:00 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <js9qsc$k21$>
References: <>
Message-ID: <>

On 2012-06-25, at 15:59 , Christian Heimes wrote:

> On 25.06.2012 14:17, Kim Gräsman wrote:
>> Hello,
>> I'm new here, so forgive me if this has been discussed before or is off-topic.
>> I came up with a mechanism that I thought might be useful in the
>> Python standard library -- a scope-bound self-restoring backup file. I
>> came to this naïve implementation;
> Are you aiming for atomic file rollover backed by a temporary file?

No, see my mail and his confirmation, it's a shim to dynamically (scope-wise)
rewrite sections of a configuration file (and undo the rewrites thereafter)
because that's the sole way to configure a third-party library.

From kim at  Mon Jun 25 16:39:33 2012
From: kim at (=?ISO-8859-1?Q?Kim_Gr=E4sman?=)
Date: Mon, 25 Jun 2012 16:39:33 +0200
Subject: [Python-ideas]  BackupFile
In-Reply-To: <>
References: <>
Message-ID: <>


---------- Forwarded message ----------
From: Kim Gräsman <kim at>
Date: Mon, Jun 25, 2012 at 3:21 PM
Subject: Re: [Python-ideas] BackupFile
To: Masklinn <masklinn at>

Hi Masklinn,

Thanks for your response!

On Mon, Jun 25, 2012 at 2:33 PM, Masklinn <masklinn at> wrote:
> On 2012-06-25, at 14:17 , Kim Gräsman wrote:
>> - Would something like this be useful outside of my office?
> I'm not sure I correctly understand the purpose of this, and if I do it
> seems to be kind-of a hack for "fixing" kind-of crummy code: is it
> correct that the goal is to temporarily edit a file (and restore it
> later) to change the behavior of *other* pieces of code reading the same
> file?
> So essentially dynamically scoping the content of a file?

Yes, that's it. I use it to adapt the behavior of third-party code I
can only affect through configuration files.

> I find the idea rather troublesome/problematic, as it's completely
> blind to (and unsafe under) concurrent access, and will be tricky to
> handle cleanly wrt filesystem caches and commits.

Good point. I use this in a controlled environment, where I know
nobody else is using the file. Multiple concurrent users would break
this completely...

> The initial mail hinted at atomic file replacement *or* backing up a file
> and restoring the backup on error, something along the lines of:
>    with Backup(settings_file):
>        alter_file()
>        alter_file_2()
>        alter_file_3()
>    # altered file

Nope, not this.

>    with Backup(settings_file):
>        alter_file()
>        alter_file_2()
>        raise Exception("boom")
>        alter_file_3()
>    # old file is back

This is what I was aiming for, except old file would be
unconditionally restored.

> in the same way e.g. Emacs will keep "~" files around during editing. That
> could have been a ~+1 for me, but the behavior as I understood it
> (understanding which may be incorrect, again) I'd be -1 on, it seems too
> dangerous and too tied to other issues in the code.

Yeah, I think the concurrency aspect of it makes it easy to misuse, so
it's probably not a good fit for the standard library.

- Kim

From kim at  Mon Jun 25 17:03:12 2012
From: kim at (=?ISO-8859-1?Q?Kim_Gr=E4sman?=)
Date: Mon, 25 Jun 2012 17:03:12 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <>

Hi Mike,

On Mon, Jun 25, 2012 at 3:41 PM, Mike Graham <mikegraham at> wrote:
> I like the basic idea, but if we do something like this, it would be
> useful to have read access to the old version of the file while you
> are writing out the new version that might become permanent.

Thanks, though this sounds like another mechanism than the one I'm
aiming for :-)

I want to replace an existing file temporarily, and then restore it no
matter what.

> If I was to implement something like this, I'd use a "write a
> temporary file then copy it overwriting the old one when I'm done"
> approach rather than a "back up the file" approach so that if the
> process dies for a reason Python can't clean up after (like due to
> SIGKILL), the half-written file doesn't remain.

This is a very valid concern -- if the process dies unexpectedly I'd
leave the file replaced and the original in some temporary directory.
Not sure if there's a way around that, probably not.

> I don't really like the name Backup but I can't think of a better name
> at the moment.

Me neither.

- Kim

From kim at  Mon Jun 25 17:04:36 2012
From: kim at (=?ISO-8859-1?Q?Kim_Gr=E4sman?=)
Date: Mon, 25 Jun 2012 17:04:36 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <js9qsc$k21$>
References: <>
Message-ID: <>

Hi Christian,

On Mon, Jun 25, 2012 at 3:59 PM, Christian Heimes <lists at> wrote:
> Are you aiming for atomic file rollover backed by a temporary file?
> That's the common way to safely overwrite an existing file. It works
> differently than your code.

Oops, I need to be clearer. This is not what I wanted to do. See other


- Kim

From lists at  Mon Jun 25 17:21:57 2012
From: lists at (Christian Heimes)
Date: Mon, 25 Jun 2012 17:21:57 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <js9vml$ji$>

On 25.06.2012 17:03, Kim Gräsman wrote:
> This is a very valid concern -- if the process dies unexpectedly I'd
> leave the file replaced and the original in some temporary directory.
> Not sure if there's a way around that, probably not.

Your algorithm doesn't take SIGKILL, SIGSEGV or server crash into
account. I don't see a chance to compensate for these problems. How
about you fix the 3rd party code instead?

-1 for addition of broken code.

Sorry ;)

From ethan at  Mon Jun 25 17:21:11 2012
From: ethan at (Ethan Furman)
Date: Mon, 25 Jun 2012 08:21:11 -0700
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>	<>
Message-ID: <>

Kim Gräsman wrote:
> On Mon, Jun 25, 2012 at 3:41 PM, Mike Graham wrote:
>> I don't really like the name Backup but I can't think of a better name
>> at the moment.
> Me neither.

How about FileRollback, ModifyThenRestore, NowYouSeeItNowYouDont, or 
StupidThirdPartyProgramThatOnlyAllowsConfigThroughFiles ?

Tongue-partly-in-cheek'ly yours,


From kim at  Mon Jun 25 20:46:57 2012
From: kim at (=?ISO-8859-1?Q?Kim_Gr=E4sman?=)
Date: Mon, 25 Jun 2012 20:46:57 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <js9vml$ji$>
References: <>
Message-ID: <>

On Mon, Jun 25, 2012 at 5:21 PM, Christian Heimes <lists at> wrote:
> On 25.06.2012 17:03, Kim Gräsman wrote:
>> This is a very valid concern -- if the process dies unexpectedly I'd
>> leave the file replaced and the original in some temporary directory.
>> Not sure if there's a way around that, probably not.
> Your algorithm doesn't take SIGKILL, SIGSEGV or server crash into
> account. I don't see a chance to compensate for these problems. How
> about you fix the 3rd party code instead?
> -1 for addition of broken code.

Duly noted :-)

It's simple enough and works well in my narrow context, so I'll just
keep it to myself.

- Kim

From lists at  Mon Jun 25 20:50:08 2012
From: lists at (Christian Heimes)
Date: Mon, 25 Jun 2012 20:50:08 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <>

On 25.06.2012 20:46, Kim Gräsman wrote:
> Duly noted :-)
> It's simple enough and works well in my narrow context, so I'll just
> keep it to myself.

I'd use a similar approach in your place. Practicality beats purity. Or
beat the author of the broken lib with a big stick. :)


From tjreedy at  Mon Jun 25 22:33:07 2012
From: tjreedy at (Terry Reedy)
Date: Mon, 25 Jun 2012 16:33:07 -0400
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <jsahu5$pgu$>

On 6/25/2012 8:17 AM, Kim Gräsman wrote:
> Hello,
> I'm new here, so forgive me if this has been discussed before or is off-topic.
> I came up with a mechanism that I thought might be useful in the
> Python standard library -- a scope-bound self-restoring backup file. I
> came to this naïve implementation;
> --
> class BackupError(Exception):
>     pass
> class Backup:
>     def __init__(self, path):
>         if not os.path.exists(path) or os.path.isdir(path):
>             raise BackupError("%s must be a valid file path" % path)
>         self.path = path
>         self.backup_path = None
>     def __enter__(self):
>         self.backup()
>     def __exit__(self, type, value, traceback):
>         self.restore()
>     def _generate_backup_path(self):
>         tempdir = tempfile.mkdtemp()
>         basename = os.path.basename(self.path)
>         return os.path.join(tempdir, basename)
>     def backup(self):
>         backup_path = self._generate_backup_path()
>         shutil.copy(self.path, backup_path)
>         self.backup_path = backup_path
>     def restore(self):
>         if self.backup_path:
>             # Write backup back onto original
>             shutil.copy(self.backup_path, self.path)
>             shutil.rmtree(os.path.dirname(self.backup_path))
>             self.backup_path = None
> --
> Backups are intended to be scope-bound like so:
>   with Backup(settings_file):
>      rewrite_settings(settings_file)
>      do_something_else()
> I even managed to use it with the @contextmanager decorator, to allow this:
>   with rewrite_settings(settings_file):
>      do_something_else()
> So, open questions;
> - Would something like this be useful outside of my office?
> - Any suggestions for better names?
> - This feels like it belongs in the tempfile module, would you agree?
> - What's lacking in the implementation? Have I done something
> decidedly non-Pythonic?

It seems to me that what you actually *want* to do, given your other 
responses, is to make a temporary altered copy of the settings file and 
get the programs to use the *copy*. That way, other users would see the 
original undisturbed and a crash would at worst leave the copy 
undeleted. (Whether you want to copy alterations back is a different 
matter.) I presume the problem is that the program has the name of the 
settings file hard-coded. One possibility might be to run the program in 
a virtual environment with its temporary copy. (But I have 0 experience 
with that. I only know that venv has been added to 3.3.)
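
A rough sketch of the "work on a temporary altered copy" idea, leaving aside
the harder question of how to make the program read the copy. altered_copy
and its alter callback are hypothetical names used only for illustration.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def altered_copy(path, alter):
    """Yield the path of a temporary, altered copy; the original is untouched."""
    tempdir = tempfile.mkdtemp()
    copy_path = os.path.join(tempdir, os.path.basename(path))
    try:
        shutil.copy(path, copy_path)
        alter(copy_path)          # hypothetical callback that edits the copy
        yield copy_path
    finally:
        shutil.rmtree(tempdir)    # a crash leaves at worst this directory behind
```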

Terry Jan Reedy

From christopherreay at  Tue Jun 26 01:19:35 2012
From: christopherreay at (Christopher Reay)
Date: Tue, 26 Jun 2012 01:19:35 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <>

It seems to me it would be easier to patch the 3rd party library code and
submit the patch to them, than to do this.

There are other ways to manipulate the file system to achieve what you are
attempting.. but somewhere along the line you would have to interact with
another program. If you taught the shell to clean up after your act, then
this could be achieved in the event of a power failure. You could even
write a wrapper shell for Python. I think perhaps the case is too niche for
that kind of solution


Be prepared to have your predictions come true

From ethan at  Tue Jun 26 01:40:46 2012
From: ethan at (Ethan Furman)
Date: Mon, 25 Jun 2012 16:40:46 -0700
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>	<>	<>	<js9vml$ji$>	<>
Message-ID: <>

Christopher Reay wrote:
> It seems to me it would be easier to patch the 3rd party library code 
> and submit the patch to them, than to do this.

Not every third-party library is patchable.


From kim at  Tue Jun 26 07:21:01 2012
From: kim at (=?ISO-8859-1?Q?Kim_Gr=E4sman?=)
Date: Tue, 26 Jun 2012 07:21:01 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <jsahu5$pgu$>
References: <>
Message-ID: <>

Hi Terry, and all,

On Mon, Jun 25, 2012 at 10:33 PM, Terry Reedy <tjreedy at> wrote:
> It seems to me that what you actually *want* to do, given your other
> responses, is to make a temporary altered copy of the settings file and get
> the programs to use the *copy*. That way, other users would see the original
> undisturbed and a crash would at worst leave the copy undeleted. (Whether
> you want to copy alterations back is a different matter.) I presume the
> problem is that the program has the name of the settings file hard-coded.
> One possibility might be to run the program in a virtual environment with
> its temporary copy. (But I have 0 experience with that. I only know that
> venv has been added to 3.3.)

Thanks for all your alternative strategies! In this case, the third
party is a combination of Python, shell script, and executable
binaries in at least three different processes, and I'm pretty happy
with the modify-do work-restore model for this batch script.

I appreciate the input on the suggested idea, it gave me some new
error modes to worry about, even if most of them don't apply for this
specific case.

- Kim

From techtonik at  Tue Jun 26 10:03:06 2012
From: techtonik at (anatoly techtonik)
Date: Tue, 26 Jun 2012 11:03:06 +0300
Subject: [Python-ideas] itertools.chunks(iterable, size, fill=None)
Message-ID: <>

Now that Python 3 is all about iterators (which is a user killer
feature for Python according to StackOverflow) - would it be nice to
introduce more first-class functions to work with them? One function,
to be exact: splitting a string into chunks.

    itertools.chunks(iterable, size, fill=None)

Which is the 33rd most voted Python question on SO -

P.S. CC'ing to python-dev@ to notify about the thread in python-ideas.

From g.brandl at  Tue Jun 26 10:39:04 2012
From: g.brandl at (Georg Brandl)
Date: Tue, 26 Jun 2012 10:39:04 +0200
Subject: [Python-ideas] itertools.chunks(iterable, size, fill=None)
In-Reply-To: <>
References: <>
Message-ID: <jsbse3$jl6$>

On 26.06.2012 10:03, anatoly techtonik wrote:
> Now that Python 3 is all about iterators (which is a user killer
> feature for Python according to StackOverflow -
> would it be nice to
> introduce more first class functions to work with them? One function
> to be exact to split string into chunks.
>      itertools.chunks(iterable, size, fill=None)
> Which is the 33th most voted Python question on SO -

+1.  This is already a recipe in the itertools docs
(see grouper() on,
but it is so often requested (and used) that it is a very good
candidate for a stdlib function.
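
For reference, the grouper() recipe is built on zip_longest. A sketch of the
proposed function written the same way follows; the chunks name and its
signature come from the proposal above and are not an existing itertools
function.

```python
from itertools import zip_longest


def chunks(iterable, size, fill=None):
    """Return an iterator over size-length tuples, padding the last with fill."""
    # The same iterator repeated size times, so each tuple consumes size items
    args = [iter(iterable)] * size
    return zip_longest(*args, fillvalue=fill)


result = list(chunks("ABCDEFG", 3, fill="x"))
print(result)  # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
```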


From taleinat at  Tue Jun 26 12:34:54 2012
From: taleinat at (Tal Einat)
Date: Tue, 26 Jun 2012 13:34:54 +0300
Subject: [Python-ideas] itertools.chunks(iterable, size, fill=None)
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jun 26, 2012 at 11:03 AM, anatoly techtonik <techtonik at> wrote:
> Now that Python 3 is all about iterators (which is a user killer
> feature for Python according to StackOverflow -
> would it be nice to
> introduce more first class functions to work with them? One function
> to be exact to split string into chunks.
>    itertools.chunks(iterable, size, fill=None)
> Which is the 33th most voted Python question on SO -


When working with iterators I have needed this often, and have
implemented a similar utility function in many projects. As an
example, this is a basic building block in my RunningCalcs[1] module.

- Tal Einat


From jsbueno at  Tue Jun 26 14:42:01 2012
From: jsbueno at (Joao S. O. Bueno)
Date: Tue, 26 Jun 2012 09:42:01 -0300
Subject: [Python-ideas] itertools.chunks(iterable, size, fill=None)
In-Reply-To: <>
References: <>
Message-ID: <>

On 26 June 2012 07:34, Tal Einat <taleinat at> wrote:
> On Tue, Jun 26, 2012 at 11:03 AM, anatoly techtonik <techtonik at> wrote:

>>    itertools.chunks(iterable, size, fill=None)
What about

itertools.chunks(iterable, size=None, separator=None, fill=None)

Requiring at least one of size or separator to be set?

This would also work for the "for x in text.split('\n')" case.


From simon.sapin at  Tue Jun 26 14:58:35 2012
From: simon.sapin at (Simon Sapin)
Date: Tue, 26 Jun 2012 14:58:35 +0200
Subject: [Python-ideas] itertools.chunks(iterable, size, fill=None)
In-Reply-To: <>
References: <>
Message-ID: <>

On 26/06/2012 14:42, Joao S. O. Bueno wrote:
> itertools.chunks(iterable, size=None, separator=None, fill=None)
> Requiring at least one of size or separator to be set?
> This would also work for "for x in text.split('\n')"  case.

I think that splitting an iterable on some separators or on a chunk
size are two completely different functions. Having the same function do
either is a bit confusing and I don't see the benefit.

Or is there a use case in passing both parameters? What would it do
then, end the chunk after `size` elements or at `separator`, whichever
comes first?

Simon Sapin

From jsbueno at  Fri Jun 29 13:29:05 2012
From: jsbueno at (Joao S. O. Bueno)
Date: Fri, 29 Jun 2012 08:29:05 -0300
Subject: [Python-ideas] itertools.chunks(iterable, size, fill=None)
In-Reply-To: <>
References: <>
Message-ID: <>

On 29 June 2012 05:55, Michele Lacchia <michelelacchia at> wrote:
> + 1 for the original proposal! I don't think splitting belongs to potential
> itertools.chunks
> On Tuesday, 26 June 2012 14:58:35 UTC+2, Simon Sapin wrote:
>> On 26/06/2012 14:42, Joao S. O. Bueno wrote:
>> > itertools.chunks(iterable, size=None, separator=None, fill=None)
>> >
>> > Requiring at least one of size or separator to be set?
>> >
>> > This would also work for "for x in text.split('\n')" case.
>> I think that splitting an iterable on some separators or on a chunk
>> size are two completely different functions. Having the same function do
>> either is a bit confusing and I don't see the benefit.
>> Or is there a use case in passing both parameters? What would it do
>> then, end the chunk after `size` elements or at `separator`, whichever
>> comes first?

Indeed - these are orthogonal features - but I think the ability to split on
a separator as an iterator, if not as important as chunks, is missing as well.

Maybe add both?
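
A lazy split on a separator could be sketched like this. split_on is a
hypothetical name, and the sketch follows str.split in yielding an empty
group after a trailing separator.

```python
def split_on(iterable, separator):
    """Lazily yield lists of the items found between occurrences of separator."""
    group = []
    for item in iterable:
        if item == separator:
            yield group
            group = []
        else:
            group.append(item)
    yield group  # whatever follows the last separator, possibly empty


lines = ["".join(g) for g in split_on("ab\ncd\n", "\n")]
print(lines)  # ['ab', 'cd', '']
```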


>> Regards,
>> --
>> Simon Sapin
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at

From sturla at  Fri Jun 29 18:59:23 2012
From: sturla at (Sturla Molden)
Date: Fri, 29 Jun 2012 18:59:23 +0200
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <>

On 25.06.2012 14:17, Kim Gräsman wrote:

 > I came up with a mechanism that I thought might be useful
 > in the Python standard library -- a scope-bound self-restoring
 > backup file.

>  with Backup(settings_file):
>      rewrite_settings(settings_file)
>      do_something_else()

Are you reinventing the transactional database?

If you need atomic commit and rollback, I am sure you can find a 
database that will take care of that (even Sqlite if you look in 
Python's standard library).
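
As a minimal illustration with the standard library's sqlite3 module: using
the connection as a context manager commits on success and rolls back if the
block raises. The table and values here are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO settings VALUES ('mode', 'normal')")
conn.commit()

try:
    with conn:  # commits on success, rolls back if the block raises
        conn.execute("UPDATE settings SET value = 'debug' WHERE key = 'mode'")
        raise RuntimeError("boom")
except RuntimeError:
    pass

value = conn.execute(
    "SELECT value FROM settings WHERE key = 'mode'").fetchone()[0]
print(value)  # normal
```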


From g.brandl at  Fri Jun 29 22:32:49 2012
From: g.brandl at (Georg Brandl)
Date: Fri, 29 Jun 2012 22:32:49 +0200
Subject: [Python-ideas] itertools.chunks(iterable, size, fill=None)
In-Reply-To: <>
References: <>
Message-ID: <jsl3cb$knn$>

On 26.06.2012 10:03, anatoly techtonik wrote:
> Now that Python 3 is all about iterators (which is a user killer
> feature for Python according to StackOverflow -
> would it be nice to
> introduce more first class functions to work with them? One function
> to be exact to split string into chunks.
>      itertools.chunks(iterable, size, fill=None)
> Which is the 33th most voted Python question on SO -
> P.S. CC'ing to python-dev@ to notify about the thread in python-ideas.

Anatoly, so far there were no negative votes -- would you care to go
another step and propose a patch?


From christopherreay at  Fri Jun 29 22:56:06 2012
From: christopherreay at (Christopher Reay)
Date: Fri, 29 Jun 2012 21:56:06 +0100
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <>

zope -> webdavfs ftw


Be prepared to have your predictions come true

From christopherreay at  Fri Jun 29 23:01:31 2012
From: christopherreay at (Christopher Reay)
Date: Fri, 29 Jun 2012 22:01:31 +0100
Subject: [Python-ideas] BackupFile
In-Reply-To: <>
References: <>
Message-ID: <>

or ftpfs ftm

On 29 June 2012 21:56, Christopher Reay <christopherreay at> wrote:

> zope -> webdavfs ftw
> --
> Be prepared to have your predictions come true


Be prepared to have your predictions come true

From mikegraham at  Fri Jun 29 23:36:48 2012
From: mikegraham at (Mike Graham)
Date: Fri, 29 Jun 2012 17:36:48 -0400
Subject: [Python-ideas] itertools.chunks(iterable, size, fill=None)
In-Reply-To: <jsl3cb$knn$>
References: <>
Message-ID: <>

On Fri, Jun 29, 2012 at 4:32 PM, Georg Brandl <g.brandl at> wrote:
> so far there were no negative votes

As far as I know, Raymond Hettinger is the itertools maintainer and he
has repeatedly objected to this idea in the past (e.g. ). Hopefully we can get his input


From fiatjaf at  Sat Jun 30 15:59:54 2012
From: fiatjaf at (fiatjaf at
Date: Sat, 30 Jun 2012 10:59:54 -0300
Subject: [Python-ideas] the optional "as" statement inside "if" statements
Message-ID: <>

the idea is to make a variable assignment at the same time that the
existence of that variable -- which is being returned by a function -- is
tested.

suppose we are getting a variable from the 'get' method of the
'request' object and then doing some stuff with it, but that stuff we will
only do if it exists; if not, we'll just pass. instead of writing:

variable = self.request.get('variable')
if variable:
   print variable

we could write

if self.request.get('variable') as variable:
   print variable

seems stupid (or not?), but with lots of variables to process, this
pre-assignment could be very unpleasant -- especially if, as in the
example case, very little use will be made of the tested variable.

also, the "as" expression already exists and is very pythonic.
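
For readers on Python 3.8 or later: assignment expressions (PEP 572) cover
the same use case with a different spelling, ":=" rather than "as". A small
sketch, where the request dict is only a stand-in for self.request:

```python
request = {"variable": "hello"}   # hypothetical stand-in for self.request

# Assign and test in one step; the name stays bound after the if statement
if (variable := request.get("variable")):
    print(variable)
```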

From ironfroggy at  Sat Jun 30 16:16:47 2012
From: ironfroggy at (Calvin Spealman)
Date: Sat, 30 Jun 2012 10:16:47 -0400
Subject: [Python-ideas] the optional "as" statement inside "if"
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jun 30, 2012 at 9:59 AM,  <fiatjaf at> wrote:
> the idea is to make an variable assignment at the same time that the
> existence of that variable -- which is being returned by a function -- is
> made.
> suppose we are returning a variable from the method 'get' from the 'request'
> object and them making some stuff with it, but that stuff we will only do if
> it exists, if not, we'll just pass, instead of writing:
> variable = self.request.get('variable')
> if variable:
>    print variable
> we could write
> if self.request.get('variable') as variable:
>    print variable
> seems stupid (or not?), but with lots of variables to process, this
> pre-assignment could be very unpleasant -- especially if, as the in the
> example case, very little use will be made of the tested variable.
> also, the "as" expression already exists and is very pythonic.

This is probably the best solution to the problem that would fit in
the language,
but I'm not convinced doing it at all fits very well.



Read my blog! I depend on your acceptance of my opinion! I am interesting!
Follow me if you're into that sort of thing:

From zachary.ware+pyideas at  Sat Jun 30 16:20:14 2012
From: zachary.ware+pyideas at (Zachary Ware)
Date: Sat, 30 Jun 2012 09:20:14 -0500
Subject: [Python-ideas] the optional "as" statement inside "if"
In-Reply-To: <>
References: <>
Message-ID: <>

On Jun 30, 2012 9:00 AM, <fiatjaf at> wrote:
> the idea is to make an variable assignment at the same time that the
existence of that variable -- which is being returned by a function -- is
> suppose we are returning a variable from the method 'get' from the
'request' object and them making some stuff with it, but that stuff we will
only do if it exists, if not, we'll just pass, instead of writing:
> variable = self.request.get('variable')
> if variable:
>    print variable
> we could write
> if self.request.get('variable') as variable:
>    print variable
> seems stupid (or not?), but with lots of variables to process, this
pre-assignment could be very unpleasant -- especially if, as the in the
example case, very little use will be made of the tested variable.
> also, the "as" expression already exists and is very pythonic.

I like it! I've found myself annoyed by writing an if statement using the
result of a function call, then realizing "oh wait, I need a reference to
that value" and have to go back and rewrite. This would eliminate that and,
to my mind, it flows very nicely.

+1 from me.

From alexander.belopolsky at  Sat Jun 30 16:23:45 2012
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sat, 30 Jun 2012 10:23:45 -0400
Subject: [Python-ideas] Happy leap second
Message-ID: <>

Even though many have hoped that the authorities would stop fiddling
with our clocks, today a leap second will be inserted in UTC.
Systems using Olson/IANA timezone database have a way to deal with
this without adjusting their clocks, but few systems are configured
that way:

$ TZ=right/UTC date -d @1341100824
Sat Jun 30 23:59:60 UTC 2012

(1341100824 is the number of seconds since epoch including the leap seconds.)

Python's time module works fine with the "right" timezones:

>>> import time
>>> print(time.strftime('%T', time.localtime(1341100824)))
23:59:60

but the datetime module clips the leap second down to the previous second:

>>> from datetime import datetime
>>> print(datetime.fromtimestamp(1341100824).strftime('%T'))
23:59:59
>>> print(datetime.fromtimestamp(1341100823).strftime('%T'))
23:59:59

BDFL has been resisting adding support for leap seconds to the
datetime module [1], but as the clocks become more accurate and
synchronization requirements become stricter, we may want to revisit
this issue.


From guido at  Sat Jun 30 16:57:50 2012
From: guido at (Guido van Rossum)
Date: Sat, 30 Jun 2012 07:57:50 -0700
Subject: [Python-ideas] Happy leap second
In-Reply-To: <>
References: <>
Message-ID: <>

POSIX timestamps don't have leap seconds. Convince POSIX to change
that and Python will follow suit.

On Sat, Jun 30, 2012 at 7:23 AM, Alexander Belopolsky
<alexander.belopolsky at> wrote:
> Even though many have hoped that the authorities would stop fiddling
> with our clocks, today a leap second will be inserted in UTC.
> Systems using Olson/IANA timezone database have a way to deal with
> this without adjusting their clocks, but few systems are configured
> that way:
> $ TZ=right/UTC date -d @1341100824
> Sat Jun 30 23:59:60 UTC 2012
> (1341100824 is the number of seconds since epoch including the leap seconds.)
> Python's time module works fine with the "right" timezones:
> >>> import time
> >>> print(time.strftime('%T', time.localtime(1341100824)))
> 23:59:60
> but the datetime module clips the leap second down to the previous second:
> >>> from datetime import datetime
> >>> print(datetime.fromtimestamp(1341100824).strftime('%T'))
> 23:59:59
> >>> print datetime.fromtimestamp(1341100823).strftime('%T')
> 23:59:59
> BDFL has been resisting adding support for leap seconds to the
> datetime module [1], but as the clocks become more accurate and
> synchronization requirements become stricter, we may want to revisit
> this issue.
> [1]

--Guido van Rossum (

From ncoghlan at  Sat Jun 30 17:06:39 2012
From: ncoghlan at (Nick Coghlan)
Date: Sun, 1 Jul 2012 01:06:39 +1000
Subject: [Python-ideas] the optional "as" statement inside "if"
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jun 30, 2012 at 11:59 PM,  <fiatjaf at> wrote:
> the idea is to make an variable assignment at the same time that the
> existence of that variable -- which is being returned by a function -- is
> made.
> suppose we are returning a variable from the method 'get' from the 'request'
> object and them making some stuff with it, but that stuff we will only do if
> it exists, if not, we'll just pass, instead of writing:
> variable = self.request.get('variable')
> if variable:
>    print variable
> we could write
> if self.request.get('variable') as variable:
>    print variable
> seems stupid (or not?), but with lots of variables to process, this
> pre-assignment could be very unpleasant -- especially if, as the in the
> example case, very little use will be made of the tested variable.
> also, the "as" expression already exists and is very pythonic.

This proposal has been considered and rejected many times. It's not
general enough - it *only* works for those cases where the value to be
retained *and* the interesting condition are the same.

Consider the simple case of a value that may be either None (not
interesting) or a number (interesting). Since the interesting values
include "0", which evaluates as False along with None, this limited
form of embedded assignment syntax would not help.
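
The None-versus-0 case described above can be illustrated with a minimal
sketch. The helper names and the sample values are assumptions made up for
this illustration; the truthiness filter stands in for what an
"if f() as x:" test would check.

```python
# Hypothetical helpers: neither name comes from the thread.

def kept_by_truthiness(values):
    # What a bare "if value:" test keeps.
    return [v for v in values if v]

def kept_by_none_check(values):
    # What the caller actually wants: every number, including 0.
    return [v for v in values if v is not None]

samples = [None, 0, 7]
print(kept_by_truthiness(samples))   # 0 is dropped along with None
print(kept_by_none_check(samples))   # 0 is kept
```

The two results differ exactly on the value 0, which is why a form that
binds the tested value and tests its truthiness in one step does not cover
this case.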

Embedded assignment in C isn't that limited, but nobody has yet
volunteered to take the radical step of proposing "(X as Y)" as a
general embedded assignment syntax. I suggest anyone considering such an
idea do a *lot* of research in the python-ideas archives first, though
(as the idea has seen plenty of discussion). It is not as obviously
flawed as the if-and-while statement only variant, but it would still
involve being rather persuasive to make such a significant change to
the language.

You're also unlikely to get much in the way of core developer feedback
until after the 3.3 release in August.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From alexander.belopolsky at  Sat Jun 30 17:18:07 2012
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sat, 30 Jun 2012 11:18:07 -0400
Subject: [Python-ideas] Happy leap second
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jun 30, 2012 at 10:57 AM, Guido van Rossum <guido at> wrote:
> POSIX timestamps don't have leap seconds. Convince POSIX to change
> that and Python will follow suit.

POSIX (time_t) timestamps are mostly irrelevant for the users of the
datetime module.  POSIX type that is closest to datetime.datetime is
struct tm and it does have leap seconds:

"""
The <time.h> header shall declare the structure tm, which shall
include at least the following members:

int    tm_sec   Seconds [0,60].
...
"""  -

Note that POSIX does require that a round-trip through time_t
(localtime(mktime(x))) converts hh:59:60 to (hh+1):00:00, but
datetime.timestamp() can still do the same if we make second=60 valid.
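
A small sketch of the contrast drawn above, using only standard library
calls. The variable names are assumptions for illustration, and
calendar.timegm() stands in here for the mktime() normalization because it
does not depend on the local timezone.

```python
import calendar
import time
from datetime import datetime

# The struct tm style of broken-down time accepts a seconds field of 60.
parsed = time.strptime("2012-06-30 23:59:60", "%Y-%m-%d %H:%M:%S")
leap_sec = parsed.tm_sec

# datetime.datetime currently rejects second=60.
try:
    datetime(2012, 6, 30, 23, 59, 60)
    datetime_accepts_60 = True
except ValueError:
    datetime_accepts_60 = False

# Converting 23:59:60 to a POSIX timestamp lands on the next midnight,
# which mirrors the hh:59:60 -> (hh+1):00:00 round-trip rule.
rolled = calendar.timegm((2012, 6, 30, 23, 59, 60))
next_midnight = calendar.timegm((2012, 7, 1, 0, 0, 0))

print(leap_sec, datetime_accepts_60, rolled == next_midnight)
```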

From fiatjaf at  Sat Jun 30 17:46:08 2012
From: fiatjaf at (fiatjaf at
Date: Sat, 30 Jun 2012 12:46:08 -0300
Subject: [Python-ideas] the optional "as" statement inside "if"
In-Reply-To: <>
References: <>
Message-ID: <>

thank you two for the responses.

I'm a newbie here and I didn't find the archives (yes, I'm stupid, and I
didn't search well).
I have no hope of being persuasive, I only thought that if I introduced the
idea other people would like it instantaneously, but if the idea is good,
obviously someone had already thought of it, so it makes me happy. I'll
look for the archives and keep watching the development and see what I get
from this.

On Sat, Jun 30, 2012 at 12:06 PM, Nick Coghlan <ncoghlan at> wrote:

> On Sat, Jun 30, 2012 at 11:59 PM,  <fiatjaf at> wrote:
> > the idea is to make an variable assignment at the same time that the
> > existence of that variable -- which is being returned by a function -- is
> > made.
> >
> > suppose we are returning a variable from the method 'get' from the 'request'
> > object and them making some stuff with it, but that stuff we will only do if
> > it exists, if not, we'll just pass, instead of writing:
> >
> > variable = self.request.get('variable')
> > if variable:
> >    print variable
> >
> > we could write
> >
> > if self.request.get('variable') as variable:
> >    print variable
> >
> > seems stupid (or not?), but with lots of variables to process, this
> > pre-assignment could be very unpleasant -- especially if, as the in the
> > example case, very little use will be made of the tested variable.
> >
> > also, the "as" expression already exists and is very pythonic.
> This proposal has been considered and rejected many times. It's not
> general enough - it *only* works for those cases where the value to be
> retained *and* the interesting condition are the same.
> Consider the simple case of a value that may be either None (not
> interesting) or a number (interesting). Since the interesting values
> include "0", which evaluates as False along with None, this limited
> form of embedded assignment syntax would not help.
> Embedded assignment in C isn't that limited., but nobody has yet
> volunteered to take the radical step of proposing "(X as Y)" as a
> general embedded assignment syntax. I suggest anyone consider such an
> idea do a *lot* of research in the python-ideas archives first, though
> (as the idea has seen plenty of discussion). It is not as obviously
> flawed as the if-and-while statement only variant, but it would still
> involve being rather persuasive to make such a significant change to
> the language.
> You're also unlikely to get much in the way of core developer feedback
> until after the 3.3 release in August.
> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From christopherreay at  Sat Jun 30 17:54:55 2012
From: christopherreay at (Christopher Reay)
Date: Sat, 30 Jun 2012 17:54:55 +0200
Subject: [Python-ideas] the optional "as" statement inside "if"
In-Reply-To: <>
References: <>
Message-ID: <>

The only hope for a large archive like this one is to wait long enough to
make sure you don't rehash the really regular ideas.

... ponders... Do I have time to read the archives? Do people mind
administering the repetitive ideas?


Be prepared to have your predictions come true

From storchaka at  Sat Jun 30 18:03:10 2012
From: storchaka at (Serhiy Storchaka)
Date: Sat, 30 Jun 2012 19:03:10 +0300
Subject: [Python-ideas] isascii()/islatin1()/isbmp()
Message-ID: <jsn7um$2j1$>

As shown in issue #15016 [1], there are use cases where it is useful to
determine whether a string can be encoded in ASCII or Latin1. When working
with Tk or Windows console applications, it can be useful to determine
whether a string can be encoded in UCS2. The C API provides an interface
for this, but it is not available at the Python level.

I propose adding new methods to the string class: isascii(), islatin1() and
isbmp() (in addition to such methods as isalpha() or isdigit()). The
implementation will be trivial.

Pro: The current trick of trying to encode has O(n) complexity and the
overhead of raising and catching an exception.

Contra: In most cases, after determining the character range we still need
to encode the string with the appropriate encoding. New methods will
complicate the already overloaded string class.
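
For reference, a pure-Python sketch of the two approaches being compared.
All function names here are hypothetical stand-ins, not the proposed
methods themselves, and both versions scan the whole string, so neither
shows the speed a C-level implementation might offer.

```python
def encodable(text, encoding):
    # The "current trick": try to encode and catch the failure.
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

def max_code_point(text):
    # Largest code point in the string (0 for an empty string).
    return max(map(ord, text), default=0)

def is_ascii(text):
    return max_code_point(text) < 0x80

def is_latin1(text):
    return max_code_point(text) < 0x100

def is_bmp(text):
    return max_code_point(text) < 0x10000

for sample in ("abc", "caf\u00e9", "\u20ac", "\U0001F600"):
    print(ascii(sample), is_ascii(sample), is_latin1(sample), is_bmp(sample))
```

The range checks agree with the encode trick for ASCII and Latin1 on
these samples, for example is_latin1(s) == encodable(s, "latin-1").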



From ncoghlan at  Sat Jun 30 18:05:26 2012
From: ncoghlan at (Nick Coghlan)
Date: Sun, 1 Jul 2012 02:05:26 +1000
Subject: [Python-ideas] the optional "as" statement inside "if"
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jul 1, 2012 at 1:54 AM, Christopher Reay
<christopherreay at> wrote:
> The only hope for a large archive like this one is to wait long enough to
> make sure you dont re hash the really regular ideas.
> ... ponders... Do I have time to read the archives? Do people mind adminiing
> the repetitive ideas?

It's more a matter of working out how to point Google (or the search
engine of your choice) at the archives in a useful way. In this case:


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Jun 30 18:14:23 2012
From: ncoghlan at (Nick Coghlan)
Date: Sun, 1 Jul 2012 02:14:23 +1000
Subject: [Python-ideas] isascii()/islatin1()/isbmp()
In-Reply-To: <jsn7um$2j1$>
References: <jsn7um$2j1$>
Message-ID: <>

On Sun, Jul 1, 2012 at 2:03 AM, Serhiy Storchaka <storchaka at> wrote:
> As shown in issue #15016 [1], there is a use cases when it is useful to
> determine that string can be encoded in ASCII or Latin1. In working with Tk
> or Windows console applications can be useful to determine that string can
> be encoded in UCS2. C API provides interface for this, but at Python level
> it is not available.
> I propose to add to strings class new methods: isascii(), islatin1() and
> isbmp() (in addition to such methods as isalpha() or isdigit()). The
> implementation will be trivial.

Why not just expose max_code_point directly instead of adding three new methods?


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From guido at  Sat Jun 30 18:29:58 2012
From: guido at (Guido van Rossum)
Date: Sat, 30 Jun 2012 09:29:58 -0700
Subject: [Python-ideas] Happy leap second
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jun 30, 2012 at 8:18 AM, Alexander Belopolsky
<alexander.belopolsky at> wrote:
> On Sat, Jun 30, 2012 at 10:57 AM, Guido van Rossum <guido at> wrote:
>> POSIX timestamps don't have leap seconds. Convince POSIX to change
>> that and Python will follow suit.
> POSIX (time_t) timestamps are mostly irrelevant for the users of the
> datetime module.  POSIX type that is closest to datetime.datetime is
> struct tm and it does have leap seconds:
> """
> The <time.h> header shall declare the structure tm, which shall
> include at least the following members:
> int    tm_sec   Seconds [0,60].
> ...
> """  -
> Note that that POSIX does require that a round-trip through time_t
> (localtime(mktime(x))) converts hh:59:60 to (hh+1):00:00, but
> datetime.timestamp() can still do the same if we make second=60 valid.

The roundtrip requirement is telling though -- they have no way to
actually represent a leap second in the underlying clock (which is a
POSIX timestamp).

--Guido van Rossum (

From matt at  Sat Jun 30 18:34:02 2012
From: matt at (Matt Chaput)
Date: Sat, 30 Jun 2012 12:34:02 -0400
Subject: [Python-ideas] isascii()/islatin1()/isbmp()
In-Reply-To: <>
References: <jsn7um$2j1$>
Message-ID: <>

> Why not just expose max_code_point directly instead of adding three new methods?


I accidentally sent my reply directly to Serhiy, but basically I said that I could really use this in my search library when I'm trying to write efficient compressed indexes, but all I need is to know the maximum char code (or the number of bytes per char).

I've been meaning to ask about this for a while.


From storchaka at  Sat Jun 30 18:41:59 2012
From: storchaka at (Serhiy Storchaka)
Date: Sat, 30 Jun 2012 19:41:59 +0300
Subject: [Python-ideas] isascii()/islatin1()/isbmp()
In-Reply-To: <>
References: <jsn7um$2j1$>
Message-ID: <jsna7j$hub$>

On 30.06.12 19:14, Nick Coghlan wrote:
> Why not just expose max_code_point directly instead of adding three new methods?

I think it will be easier to use. You do not have to remember that the 
maximum ASCII code is 127. This is similar to the old is*() methods.

From solipsis at  Sat Jun 30 18:43:16 2012
From: solipsis at (Antoine Pitrou)
Date: Sat, 30 Jun 2012 18:43:16 +0200
Subject: [Python-ideas] isascii()/islatin1()/isbmp()
References: <jsn7um$2j1$>
Message-ID: <>

On Sun, 1 Jul 2012 02:14:23 +1000
Nick Coghlan <ncoghlan at> wrote:
> On Sun, Jul 1, 2012 at 2:03 AM, Serhiy Storchaka <storchaka at> wrote:
> > As shown in issue #15016 [1], there is a use cases when it is useful to
> > determine that string can be encoded in ASCII or Latin1. In working with Tk
> > or Windows console applications can be useful to determine that string can
> > be encoded in UCS2. C API provides interface for this, but at Python level
> > it is not available.
> >
> > I propose to add to strings class new methods: isascii(), islatin1() and
> > isbmp() (in addition to such methods as isalpha() or isdigit()). The
> > implementation will be trivial.
> Why not just expose max_code_point directly instead of adding three new methods?

Because it's really an implementation detail. We don't want to carry
around such a legacy.
Besides, we don't know the max code point for sure, only an upper bound
of it (and, implicitly, also a lower bound).

So while I'm -0 on the methods (calling encode() is as simple), I'm -1
on max_code_point.



From christopherreay at  Sat Jun 30 18:44:28 2012
From: christopherreay at (Christopher Reay)
Date: Sat, 30 Jun 2012 18:44:28 +0200
Subject: [Python-ideas] isascii()/islatin1()/isbmp()
In-Reply-To: <jsna7j$hub$>
References: <jsn7um$2j1$>
Message-ID: <>

Well, there would be constants.

What about both the methods and the max_code_point, and use it as an excuse
to explain again that encodings exist, and point to the encodings docs?


Be prepared to have your predictions come true

From storchaka at  Sat Jun 30 19:02:47 2012
From: storchaka at (Serhiy Storchaka)
Date: Sat, 30 Jun 2012 20:02:47 +0300
Subject: [Python-ideas] isascii()/islatin1()/isbmp()
In-Reply-To: <>
References: <jsn7um$2j1$>
Message-ID: <jsnbel$pua$>

On 30.06.12 19:43, Antoine Pitrou wrote:
> Because it's really an implementation detail. We don't want to carry
> around such a legacy.
> Besides, we don't know the max code point for sure, only an upper bound
> of it (and, implicitly, also a lower bound).
> So while I'm -0 on the methods (calling encode() is as simple), I'm -1
> on max_code_point.

Thanks, Antoine. This objection also just occurred to me. We cannot
guarantee that isascii() will always be O(1). Several enhancements have
already been rejected for this reason. If an extension author wants to
take advantage of CPython, he should use CPython's C API.

From alexander.belopolsky at  Sat Jun 30 19:17:38 2012
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Sat, 30 Jun 2012 13:17:38 -0400
Subject: [Python-ideas] Happy leap second
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jun 30, 2012 at 12:29 PM, Guido van Rossum <guido at> wrote:
> The roundtrip requirement is telling though -- they have no way to
> actually represent a leap second in the underlying clock (which is a
> POSIX timestamp).

This is correct: POSIX gettimeofday() cannot produce accurate UTC time
during the leap second, but this does not mean that a Python program
should not be able to keep UTC time as accurately as the underlying
hardware allows.  Systems synchronized with official time using NTP
get notifications about leap seconds up to a day in advance and can
prepare for a second during which NTP time stops.  (As far as I
understand, few systems actually stop their clocks or roll them back
on a leap second - most slow the clocks down in various incompatible
ways.)  For example, during the leap second a software clock can use
the clock_gettime() (or Python's new time.monotonic()) function to get
the actual time.

For better or worse, legal time throughout the world is based on UTC and
once every couple of years there is a second that has to be
communicated as hh:mm:60.  Today we are fortunate that it is inserted
during the time when most of the world markets are closed, but next
time we may see a lot of lawsuits between traders arguing over whose
orders should have been filled first.  While few systems report
accurate UTC time during a leap second, there is no technological
limitation that would prevent most systems from implementing it.  One
can even implement such a UTC clock in Python, but valid times produced
by such a clock cannot be stored in datetime objects.

From benjamin at  Sat Jun 30 19:20:21 2012
From: benjamin at (Benjamin Peterson)
Date: Sat, 30 Jun 2012 17:20:21 +0000 (UTC)
Subject: [Python-ideas] isascii()/islatin1()/isbmp()
References: <jsn7um$2j1$>
Message-ID: <>

Nick Coghlan <ncoghlan at ...> writes:
> Why not just expose max_code_point directly instead of adding
> three new methods?

All of these proposals rely on the *current* implementation of CPython unicode
(at least for their efficiency). Let's not pollute the language with features
that will be bad on other implementations or even ours in the future.


From guido at  Sat Jun 30 20:04:51 2012
From: guido at (Guido van Rossum)
Date: Sat, 30 Jun 2012 11:04:51 -0700
Subject: [Python-ideas] Happy leap second
In-Reply-To: <>
References: <>
Message-ID: <>

There's no reason why you need to use datetime objects for such extreme
use cases.

--Guido van Rossum (sent from Android phone)
On Jun 30, 2012 10:17 AM, "Alexander Belopolsky" <
alexander.belopolsky at> wrote:

> On Sat, Jun 30, 2012 at 12:29 PM, Guido van Rossum <guido at>
> wrote:
> ..
> > The roundtrip requirement is telling though -- they have no way to
> > actually represent a leap second in the underlying clock (which is a
> > POSIX timestamp).
> This correct: POSIX gettimeofday() cannot produce accurate UTC time
> during the leap second, but this does not mean that a python program
> should not be able to keep UTC time as accurately as the underlying
> hardware allows.  Systems synchronized with official time using NTP,
> get notifications about leap seconds up to a day in advance and can
> prepare for a second during which NTP time stops.  (As far as I
> understand, few systems actually stop their clocks or roll them back
> on a leap seconds - most slow the clocks down in various incompatible
> ways.)  For example, during the leap second a software clock can use
> clock_gettime() (or Python's new time.monotonic()) function to get
> actual time.
> For better worse, legal time throughout the world is based on UTC and
> once every couple of years there is a second that has to be
> communicated as hh:mm:60.  Today we are fortunate that it is inserted
> during the time when most of the world markets are closed, but next
> time we may see a lot of lawsuits between traders arguing over whose
> orders should have been filled first.  While few systems report
> accurate UTC time during a leap second, there is no technological
> limitation that would prevent most systems from implementing it.  One
> can even implement such UTC clock in python, but valid times produced
> by such clock cannot be stored in datetime objects.

From christopherreay at  Sat Jun 30 20:38:17 2012
From: christopherreay at (Christopher Reay)
Date: Sat, 30 Jun 2012 19:38:17 +0100
Subject: [Python-ideas] the optional "as" statement inside "if"
In-Reply-To: <>
References: <>
Message-ID: <>

How many times have you told people that?

Be prepared to have your predictions come true

From tjreedy at  Sat Jun 30 23:09:36 2012
From: tjreedy at (Terry Reedy)
Date: Sat, 30 Jun 2012 17:09:36 -0400
Subject: [Python-ideas] itertools.chunks(iterable, size, fill=None)
In-Reply-To: <jsl3cb$knn$>
References: <>
Message-ID: <>

On 6/29/2012 4:32 PM, Georg Brandl wrote:
> On 26.06.2012 10:03, anatoly techtonik wrote:
>> Now that Python 3 is all about iterators (which is a user killer
>> feature for Python according to StackOverflow -
>> would it be nice to
>> introduce more first class functions to work with them? One function
>> to be exact to split string into chunks.

Nothing special about strings.

>>      itertools.chunks(iterable, size, fill=None)

This is a renaming of the grouper recipe in 9.1.2. Itertools Recipes. You
should have mentioned this. I think of 'blocks' rather than 'chunks',
but I notice several SO questions with 'chunk(s)' in the title.

>> Which is the 33th most voted Python question on SO -

I am curious how you get that number. I do note that there are about 15 
other Python SO questions that seem to be variations on the theme. There 
might be more if 'blocks' and 'groups' were searched for.

> Anatoly, so far there were no negative votes -- would you care to go
> another step and propose a patch?

That is because Raymond H. is not reading either list right now ;-)
Hence the Cc:. Also because I did not yet respond to a vague, very 
incomplete idea.

 From Raymond's first message on , add 

"This has been rejected before.

* It is not a fundamental itertool primitive.  The recipes section in
the docs shows a clean, fast implementation derived from zip_longest().

* There is some debate on a correct API for odd lengths.  Some people
want an exception, some want fill-in values, some want truncation, and
some want a partially filled-in tuple.  The alone is reason enough not
to set one behavior in stone.

* There is an issue with having too many itertools.  The module taken as
a whole becomes more difficult to use as new tools are added."

This is not to say that the question should not be re-considered. Given 
the StackOverflow experience in addition to that of the tracker and 
python-list (and maybe python-ideas), a special exception might be made 
in relation to points 1 and 3.

In regard to point 2: many 'proposals', including Anatoly's, neglect
this detail. But the function has to do *something* when seqlen % 
grouplen != 0. So an 'idea' is not really a concrete programmable 
proposal until 'something' is specified.

Exception -- not possible for an itertool until the end of the iteration 
(see below). To raise immediately for sequences, one could wrap grouper.

def exactgrouper(sequence, k):  # untested
   if len(sequence) % k:
     raise ValueError('Sequence length {} must be a multiple of group '
                      'length {}'.format(len(sequence), k))
   # grouper is the itertools recipe: grouper(n, iterable, fillvalue=None)
   return grouper(k, sequence)

Of course, sequences can also be directly sequentially sliced (but 
should the result be an iterable or sequence of blocks?). But we do not 
have a seqtools module and I do not think there should be another method 
added to the seq protocol.

Fill -- grouper always does this, with a default of None.

Truncate, Remainder -- grouper (zip_longest) cannot directly do this and 
no recipes are given in the itertools docs. (More could be, see below.)

Discussions on python-list give various implementations either for
sequences or iterables. For the latter, one approach is "it =
iter(iterable)" followed by repeated islice of the first n items.
Another is to use a sentinel for the 'fill' to detect a final incomplete
block (tuple for grouper).

def grouper_x(n, iterable):  # untested
   sentinel = object()
   for g in grouper(n, iterable, sentinel):
     if g[-1] is not sentinel:
       yield g
     else:
       pass  # to truncate
       # yield g[:g.index(sentinel)]  # for remainder
       # raise ValueError  # for delayed exception
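
The other approach mentioned above, repeated islice on a single iterator,
might be sketched as follows. The name islice_chunks is an assumption for
illustration, and the choice to yield a short final tuple (the 'remainder'
behavior) is just one of the options under discussion.

```python
from itertools import islice

def islice_chunks(iterable, n):
    # Hypothetical helper: yield tuples of up to n items each.
    it = iter(iterable)
    while True:
        block = tuple(islice(it, n))
        if not block:
            return  # the iterator is exhausted
        yield block  # the final block may hold fewer than n items

print(list(islice_chunks('ABCDEFG', 3)))
print(list(islice_chunks(range(6), 2)))
```

Truncation would drop a block whose length is less than n, and a delayed
exception would raise at that point instead of yielding.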

The above discussion of point 2 touches on point 4, which Raymond 
neglected in the particular message above but which has come up before: 
What are the allowed input and output types? An idea is not a 
programmable proposal until the domain, range, and mapping are specified.

Possible inputs are a specific sequence (string, for instance), any
sequence, any iterable. Possible outputs are a sequence or iterator of
sequence or iterator. The various python-list and stackoverflow posts
and questions ask for various combinations. zip_longest and hence grouper
takes any iterable and returns an iterator of tuples. (An iterator of
maps might be more useful as a building block.) This is not what one
usually wants with string input, for instance, nor with range input. To
illustrate:

import itertools as it

def grouper(n, iterable, fillvalue=None):
     "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
     args = [iter(iterable)] * n
     return it.zip_longest(*args, fillvalue=fillvalue)

print(*(grouper(3, 'ABCDEFG', 'x')))  # probably not wanted
# ('A', 'B', 'C') ('D', 'E', 'F') ('G', 'x', 'x')
print(*(''.join(g) for g in grouper(3, 'ABCDEFG', 'x')))
# ABC DEF Gxx

What to do? One could easily write 20 different functions. So more 
thought is needed before adding anything. -1 on the idea as is.

For the doc, I think it would be helpful here and in most module 
subchapters if there were a subchapter table of contents at the top 
(under 9.1 in this case). Even though just 2 lines here (currently, but 
see below), it would let people know that there *is* a recipes section. 
After the appropriate tables, mention that there are example uses in the 
recipe section. Possibly add similar tables in the recipe section.

Another addition could be a new subsection on grouping (chunking) that 
would discuss post-processing of grouper (as discussed above), as well 
as other recipes, including ones specific to strings and sequences. It 
would essentially be a short how-to. Call it 9.1.3 "Grouping, Blocking, 
or Chunking Sequences and Iterables". The synonyms will help external 
searching. A toc would let people who have found this doc know to look 
for this at the bottom.

Terry Jan Reedy