On 3/31/2016 8:43 PM, Steven D'Aprano wrote:
> This matter boils down to a question of taste. You apparently don't like 
> the look of "def spam()", I do. I think my experience supports the 
> current requirement, you think that it hurts readability, I don't. 
> Unless you can give some objective evidence that it hurts readability, 
> you aren't going to convince me.

In addition to all of Steven's points, it's just way, way too late for
this change.

I don't see us ever changing Python to allow (or require!) you to omit
the parens on a function definition with no parameters. For backward
compatibility, you'd have to allow the parens. And then what's the point
in having two ways to do the same thing? You'd just be creating confusion.


From stefan at  Fri Apr  1 04:01:56 2016
From: stefan at (Stefan Krah)
Date: Fri, 1 Apr 2016 08:01:56 +0000 (UTC)
Subject: [Python-ideas] Make parenthesis optional in parameterless
 functions definitions
References: <>
 <ndjovk$at5$> <>
 <> <>
Message-ID: <>

Mike Miller <python-ideas at ...> writes:
> The fact that the class definition doesn't require empty parentheses works 
> against a few of these arguments.  I like Mahan's idea and don't find it 
> confusing, though I agree it isn't a big pain point either way.

There is a difference. Classes can be used like C-structs:

>>> class the_struct:
...     x = 10
...     y = 20
...
>>> the_struct.x
10
>>> the_struct.y
20

In order to use equational reasoning, functions need the empty
argument list in the definition:

>>> def f(): return 10 * 20
...
>>> f() == 10 * 20
True
>>> f == 10 * 20
False

Read: We define f, when applied to the empty argument list, to equal
10 * 20.

f by itself does not equal 10 * 20. It is a function symbol.

Stefan Krah

From srkunze at  Fri Apr  1 05:25:24 2016
From: srkunze at (Sven R. Kunze)
Date: Fri, 1 Apr 2016 11:25:24 +0200
Subject: [Python-ideas] Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <ndjovk$at5$> <>
Message-ID: <>

On 01.04.2016 02:43, Steven D'Aprano wrote:
> (where the parameter-list might have one parameter, or ten, or zero).
> With your proposal the reader will *nearly always* see the consistent
> pattern:
> def name ( parameter-list ) :
> but very occasionally, maybe one time in a hundred functions, or a
> thousand, see:
> def name :
> and be surprised. Even if it is just for a millisecond, that doesn't
> help readability, it hurts it.

I disagree.

I remember this "issue" with decorators as well.

@property
def foo....

@lru_cache()    # really?
def bar...

All decorators created with context_manager have this behavior. The 
proposal is about removing () for function definitions, so this differs.

However, decorators have a very declarative style as do function 
definitions. So what do those special characters serve?

IMHO It's just visual clutter.


From srkunze at  Fri Apr  1 05:32:00 2016
From: srkunze at (Sven R. Kunze)
Date: Fri, 1 Apr 2016 11:32:00 +0200
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <ndjovk$at5$> <>
Message-ID: <>

On 01.04.2016 02:27, Steven D'Aprano wrote:
> On Thu, Mar 31, 2016 at 10:49:47PM +0200, Sven R. Kunze wrote:
>> On 31.03.2016 20:06, Terry Reedy wrote:
>>>> def greet: # note the missing parenthesis
>>>>      print('hello')
>>> -1  This will lead people to think even more often than they do now
>>> that they can omit () in the call.
>> Interesting that you mentioned it. Doesn't Ruby handle it this way?
>> Let's see how this would look like in Python:
>> def distance of point1, point2:
>>      # Pythagoras
>> point1 = (3, 1)
>> point2 = (1, 4)
>> print distance of point1, point2
> I don't know that Ruby allows function calls like that. It doesn't work
> in Ruby 1.8, which is the most recent version I have installed:
> steve at orac:~$ irb
> irb(main):001:0> def foo(x)
> irb(main):002:1> return x+1
> irb(main):003:1> end
> => nil
> irb(main):004:0> foo(7)
> => 8
> irb(main):005:0> foo of 7
> NoMethodError: undefined method `of' for main:Object
>          from (irb):5
>          from :0
> irb(main):006:0>

The 'of' was just off the top of my head. You shouldn't take that too
seriously.

It works without the 'of'.

> However, Hypertalk, and other similar "XTalk" languages, do. Function
> calls in Hypertalk generally have a long form and a short form. The long
> form will be something like:
>    total = the sum of field "expenses"
> while the short form is the more familiar:
>    total = sum(field "expenses")
> Although Hypercard only allowed the long form if there was exactly one
> argument.
>> Hmmm. Although I like the lightness of this, it's somewhat confusing, isn't?
> In Hypertalk, it worked very well. But I wouldn't think it would be a
> good fit to Python.

Interesting. I think I will have a look at Hypertalk. :)

Why do you think it would not fit into Python?

> In a previous email, Sven also wrote:
>> I think the keystrokes are less important than the visual clutter
>> one have with those parentheses.
> I don't think they introduce "visual clutter" to the function. I think
> they make it explicit and clear that the function has no arguments.
> That's not clutter.
> When I was learning Python, I found it hard to remember when I needed parens
> and when I didn't, because the inconsistency between class and def
> confused me. I would write `class X()` or `def spam` and get a syntax
> error (this was back in Python 1.5). My own experience tells me strongly
> that the inconsistency between:
> class X:  # must not use parens in Python 1.5
> class Y(X):  # must use parens
> and the inconsistency between class and def was harmful. If I could, I'd
> make parens mandatory for both. So I think that making parens optional
> for functions doesn't simplify the syntax so much as make the syntax
> *more complicated* and therefore harder to learn.
> And I do not agree that the empty parens are "clutter" or make the
> function definition harder to read.

Interesting view. It never occurred to me that this might be a problem.

But I started with a later Python version. So it was just not a problem,
because both are possible now.
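
For what it's worth, the current state of affairs is easy to check from
Python itself. A minimal sketch, assuming a Python 3 interpreter; the
helper name `parses` is made up for this illustration:

```python
def parses(source):
    """Return True if `source` compiles as Python, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Both class spellings are accepted.
class_without_parens = parses("class X:\n    pass\n")
class_with_parens = parses("class X():\n    pass\n")

# A def always needs the parameter list, even an empty one.
def_with_parens = parses("def spam():\n    pass\n")
def_without_parens = parses("def spam:\n    pass\n")

print(class_without_parens, class_with_parens)
print(def_with_parens, def_without_parens)
```

Only the last form is rejected, which is the asymmetry between class and
def that this thread is about.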


From srkunze at  Fri Apr  1 05:35:55 2016
From: srkunze at (Sven R. Kunze)
Date: Fri, 1 Apr 2016 11:35:55 +0200
Subject: [Python-ideas] Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <ndjovk$at5$> <>
 <> <>
Message-ID: <>

On 01.04.2016 07:46, Eric V. Smith wrote:
> And then what's the point in having two ways to do the same thing? You'd just be creating confusion.

Class definitions allow this, function definitions don't? I'd call this
confusing.


From tjreedy at  Fri Apr  1 05:52:31 2016
From: tjreedy at (Terry Reedy)
Date: Fri, 1 Apr 2016 05:52:31 -0400
Subject: [Python-ideas] Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <ndjovk$at5$> <>
 <> <>
Message-ID: <ndlgd5$6sc$>

On 4/1/2016 5:25 AM, Sven R. Kunze wrote:

> I remember this "issue" with decorators as well.
> @property
> def foo....
> @lru_cache()    # really?
> def bar...

There is a semantic difference between the two examples. The ()s are not 
optional. 'property' is a decorator that is applied to foo after it is 
defined.  'lru_cache()' is a function call that returns a decorator. 
Quite different.
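
The distinction can be spelled out in code. This is a small sketch using
only the standard library; the names `Example`, `bar`, `baz` and
`make_cached` are invented for the illustration:

```python
import functools


class Example:
    @property                # `property` itself is the decorator
    def foo(self):
        return 42


@functools.lru_cache()       # the call returns the decorator that gets applied
def bar(n):
    return n * 2


# The lru_cache case written out without the @ syntax:
def baz(n):
    return n * 2

make_cached = functools.lru_cache()   # step 1: call the factory
baz = make_cached(baz)                # step 2: apply the returned decorator

print(Example().foo, bar(3), baz(3))
```

In the first case the name after @ is applied directly to the function; in
the second the parentheses perform a call first, and the result of that
call is what gets applied. (Python 3.8 and later also accept a bare
`@lru_cache`, but that was added after this thread.)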

Terry Jan Reedy

From srkunze at  Fri Apr  1 06:10:52 2016
From: srkunze at (Sven R. Kunze)
Date: Fri, 1 Apr 2016 12:10:52 +0200
Subject: [Python-ideas] Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <ndlgd5$6sc$>
References: <>
 <ndjovk$at5$> <>
 <> <>
Message-ID: <>

On 01.04.2016 11:52, Terry Reedy wrote:
> On 4/1/2016 5:25 AM, Sven R. Kunze wrote:
>> I remember this "issue" with decorators as well.
>> @property
>> def foo....
>> @lru_cache()    # really?
>> def bar...
> There is a semantic difference between the two examples. The ()s are 
> not optional. 'property' is a decorator that is applied to foo after 
> it is defined.  'lru_cache()' is a function call that returns a 
> decorator. Quite different.

Thanks for snipping away the relevant explanation and not addressing my 
point. >.<

I never said they were optional. Decorators and function definitions have a 
declarative style in common. The point was that these () don't serve 
any purpose in declarations.


From steve at  Fri Apr  1 07:05:53 2016
From: steve at (Steven D'Aprano)
Date: Fri, 1 Apr 2016 22:05:53 +1100
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <ndjovk$at5$> <>
 <> <>
Message-ID: <>

On Fri, Apr 01, 2016 at 11:32:00AM +0200, Sven R. Kunze wrote:

> >However, Hypertalk, and other similar "XTalk" languages, do. Function
> >calls in Hypertalk generally have a long form and a short form. The long
> >form will be something like:
> >
> >   total = the sum of field "expenses"
> >
> >while the short form is the more familiar:
> >
> >   total = sum(field "expenses")
> >
> >
> >Although Hypercard only allowed the long form if there was exactly one
> >argument.
> >In Hypertalk, it worked very well. But I wouldn't think it would be a
> >good fit to Python.
> Interesting. I think I will have a look at Hypertalk. :)

Unfortunately, Hypertalk is long dead. Apple never quite understood why 
it was popular, or what to do with it. But it influenced the design of 
the WWW and Javascript, and it lives on in a couple of languages such as 
OpenXion and LiveCode:

(LiveCode has a booming user community, OpenXion is all but dead, but it 
works and lets you experiment with the language.)

If you have an old Classic Mac capable of running System 6 through 9 
(pre OS X), or an emulator for the same, then you might be able to run 
Hypercard, which was a sort of combined software development kit, 
Rolodex application, and IDE. Hypercard was the GUI to the Hypertalk 
language, and Hypertalk was the scripting language that controlled the 
Hypercard GUI.

> Why do you think it would not fit into Python?

Hypertalk's execution model, data model and syntax are all very 
different from Python's. Hypertalk was also linked very heavily to the 
GUI, which makes it a relatively weak fit with less specialised 
languages like Python.

But mostly, Python already has a standard syntax for calling functions:

value = function(arg)

There's no need to add a more verbose "the function of arg" syntax.


From contrebasse at  Fri Apr  1 07:42:08 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Fri, 1 Apr 2016 11:42:08 +0000 (UTC)
Subject: [Python-ideas] Working with Path objects: p-strings?
References: <>
 <> <>
 <> <-6058951228755932973@unknownmsgid>
Message-ID: <>

Paul Moore <p.f.moore at ...> writes:

> > People want paths to be a strings so that they will work with all the code
> > that already works with strings.
> Correct. That's the prime motivation. But you then say
> > But whatever happened to duck typing? Paths don't need to BE strings.
> > Rather, everything that needs a path needs to accept anything that
> > acts like a path.
> But "all the code that already works with strings" doesn't do that. If
> we're allowed to change that code then a simple
>     patharg = getattr(patharg, 'path', patharg)
> is sufficient to work with path objects or strings.
I think you have it backwards here: the problem is not updating code that
you can change, it's using any external library that uses paths as strings.

When I use a library from pypi, I won't add a new line in every function
using a string representing a path. When if I use, I can directly
pass paths to any library without changing its code.
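
For reference, the one-line idiom quoted above can be sketched like this.
It assumes a path object that exposes its string form as a `path`
attribute; `FakePath` and `as_path_string` are hypothetical names used only
for illustration, not part of any real library:

```python
class FakePath:
    """Hypothetical stand-in for a path object with a `path` attribute."""

    def __init__(self, path):
        self.path = path


def as_path_string(patharg):
    # Plain strings have no `path` attribute, so they fall through unchanged.
    return getattr(patharg, 'path', patharg)


print(as_path_string("data/input.txt"))
print(as_path_string(FakePath("data/input.txt")))
```

The line itself is trivial; the objection here is that it would have to be
added to every function of every library that takes a path.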

From eric at  Fri Apr  1 07:48:01 2016
From: eric at (Eric V. Smith)
Date: Fri, 1 Apr 2016 07:48:01 -0400
Subject: [Python-ideas] Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <ndjovk$at5$> <>
 <> <>
Message-ID: <>

On 4/1/2016 5:35 AM, Sven R. Kunze wrote:
> On 01.04.2016 07:46, Eric V. Smith wrote:
>> And then what's the point in having two ways to do the same thing?
>> You'd just be creating confusion.
> Class definitions allows this, function definitions don't? I'd call this
> confusing.

Even if there was agreement on that, it remains a bad idea to add more
confusion.


From desmoulinmichel at  Fri Apr  1 07:49:03 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Fri, 1 Apr 2016 13:49:03 +0200
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/04/2016 04:37, Joao S. O. Bueno wrote:
> On 31 March 2016 at 14:57, Steven D'Aprano <steve at> wrote:
>> On Thu, Mar 31, 2016 at 09:29:36PM +0500, Mahan Marwat wrote:
>>> I have an idea of making parenthesis optional for functions having no
>>> parameters. i.e
>>> def greet: # note the missing parenthesis
>>>     print('hello')
>> -1
>> I don't think that the benefit (two fewer characters to type) is worth
>> the effort of learning the special case. Right now, the rule is simple:
>> the def keyword ALWAYS needs parentheses after the name of the function,
>> regardless of whether there is one argument, two arguments, twenty
>> arguments, or zero arguments. Why treat zero as special?
> Because class definitions already do so?

Yes, and because after more than a decade of Python, I still forget to
type the parentheses sometimes, then go back and realize that it's
silly that I have to, since I don't with classes.

That, and:

- it's a common student error;
- really it wouldn't hurt anyone.

Is there really a stronger case against it than just "it's not pure"?
I've seen a lot of this argument on the list lately and I find it
counter productive.

There are dozens of good ways to oppose an idea; just saying "we got a
moral stand to not do it" is not convincing. Especially in a language
with so many compromises, like len(foo) instead of foo.len, functional
paradigm and Poo and immutability and mutability, etc.

Python has a history of making things get out of the way:

- no {} for indentation;
- optional parentheses for tuples;
- optional parenthesis for classes;

If this change does not hurt readability or the ability to debug, and doesn't
make your code/program any worse than it was, but does help even a
little, why not?

> So, if it is possible to omit parentheses when inheriting from the
> default object when declaring a class, not needing parenthesis for
> default parameterless functions would not be an exception - it would
> be generalizing the idea of "Python admits less clutter".
> For that, I'd think of this a good idea - but I don't like changing
> the idea syntax in such a fundamental way - so I am +0 on this thing.
> I think this can be an interesting discussion - but I dislike people
> taking a ride on this to suggest omitting parentheses on function
> calls as well - that is totally broken. :-)

+1. This is for another thread (that I hope will die).
>   js
>  -><-
From p.f.moore at  Fri Apr  1 08:37:14 2016
From: p.f.moore at (Paul Moore)
Date: Fri, 1 Apr 2016 13:37:14 +0100
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
Message-ID: <>

On 1 April 2016 at 12:49, Michel Desmoulin <desmoulinmichel at> wrote:
> Is there really a stronger case against it than just "it's not pure"?
> I've seen a lot of this argument on the list lately and I find it
> counter productive.
> There are dozens of good ways to oppose an idea; just saying "we got a
> moral stand to not do it" is not convincing. Especially in a language
> with so many compromises, like len(foo) instead of foo.len, functional
> paradigm and Poo and immutability and mutability, etc.

You seem to have missed the argument that's been made a couple of
times, which is that you would then have two ways of doing the same
thing (empty parentheses would need to still be allowed for backward
compatibility) and Python has a strong tradition of "there should only
be one way of doing things". (Yes, it's not always followed 100%, as
with any real life situation not everything is perfect). You may
prefer to allow people to choose their own style and have options. But
as a maintainer, I can confirm that I personally prefer code that I
have to maintain to have a consistent style, and Python's lack of
multiple ways to say the same thing is a benefit in ensuring that
happens. So for me, your proposal would result in extra work, and no
benefit (I would continue adding "()" as I find that style more
readable and consistent).

This is not a "moral stand". The "one way to do things" principle is a
highly practical decision based on experience of different programming
environments - Perl in particular allowed many ways of doing the same
thing ("there's more than one way to do it" was a catch phrase in the
Perl community) and it's an acknowledged fact that Perl code is hard
to maintain as a consequence (unless strict style guidelines are
imposed).

As someone proposing a change to the Python language, the onus is on
you to argue the benefits of your change, not on others to argue
against it. If no-one does anything, Python won't change so you have
to convince people. At the moment your argument is little more than
"it looks neater", and opinions on that are clearly divided. As
there's a *huge* amount of material that would need changing as a
result of the change (documentation, training courses, style guides,
IDEs, ...) you need a much better argument - and complaining that the
people pointing out that your argument isn't strong enough are being
"counter productive" is not helpful.

If we're discussing "I've seen a lot of it on this list lately", then
I would argue that there has been an awful lot of ideas proposed here
recently which don't take into account the significant and genuine
costs involved in *any* change to Python, and as a result don't even
try to offer a justification for the proposal that is in proportion to
the impact of the change. It's perfectly possible to propose a change
here, have it supported, and get it implemented - but it needs *work*,
and a lot of the discussions here are pointless because no-one is
willing to do any work (not even the amount of work needed to convince
the core developers that their proposal is worthwhile). It's very easy
to accuse people of not being willing to listen to your proposals -
but I challenge you to try to get a change like this included into C++
or Java, if you think that about Python!


From ncoghlan at  Fri Apr  1 09:07:07 2016
From: ncoghlan at (Nick Coghlan)
Date: Fri, 1 Apr 2016 23:07:07 +1000
Subject: [Python-ideas] Package reputation system
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

On 30 March 2016 at 06:51, Michael Selik <mike at> wrote:
> In days of yore, before package managers, people used to download source code and read it. Maybe not all of it, but enough to feel not terribly scared when running that code. In modern times, with centralized package repositories and convenient installer tools, we want a better way to know "what's the right package to use for this task?" It's a "first-world problem" in a sense. There are too many products in my supermarket aisle! On a personal note, I have on occasion spent twenty minutes choosing a toothpaste at Target.
> If I care enough, I'll take a moment to look at how many downloads have been counted recently, how many issues there are (usually on GitHub), how many contributors, etc. I'll read the docs. I might even poke around in the source. I'll also check Google rankings to see if people are chatting about the module and linking to it.
> I'm not sure if there's a good centralized solution to this problem, but it's a question many people are asking: How do I know which non-stdlib module to use?
> Back at Georgia Tech, my professor [0] once told me that the way to get rich is to invent an index. He was referring to Richard Florida's "Creative Class" book and the subsequent "Creativity Index" consulting that Florida provided to various municipalities. People who score high on the index pay you to speak. People who score low on the index pay you to consult.
> There are a few companies who sell a Python package reputation service, along with some distribution tools. Continuum's Anaconda, Enthought's Canopy, and ActiveState's ActivePython come to mind. There's clearly value in helping people answer this question.

And so do Linux distributions.

There's a reason there are so many of the latter: "good" choices
depend a great deal on what you're doing, which means there's no point
in trying to come up with the "one true reputation system" for
software components. does a relatively good job for
the Django ecosystem, but the simple fact of using Django puts you far
enough down the path towards solving a particular kind of problem (web
service design with a rich user permission system backed by a
relational database) that a community driven rating system can work.

By contrast, I think Python itself covers too many domains for a
common rating system to be feasible - "good for education" is not the
same as "good for sysadmin tasks" is not the same as "good for data
analysis" is not the same as "good for network service development",


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From desmoulinmichel at  Fri Apr  1 09:43:14 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Fri, 1 Apr 2016 15:43:14 +0200
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/04/2016 14:37, Paul Moore wrote:
> On 1 April 2016 at 12:49, Michel Desmoulin <desmoulinmichel at> wrote:
>> Is there really a stronger case against it than just "it's not pure"?
>> I've seen a lot of this argument on the list lately and I find it
>> counter productive.
>> There are dozens of good ways to oppose an idea; just saying "we got a
>> moral stand to not do it" is not convincing. Especially in a language
>> with so many compromises, like len(foo) instead of foo.len, functional
>> paradigm and Poo and immutability and mutability, etc.
> You seem to have missed the argument that's been made a couple of
> times, which is that you would then have two ways of doing the same
> thing (empty parentheses would need to still be allowed for backward
> compatibility) and Python has a strong tradition of "there should only
> be one way of doing things". (Yes, it's not always followed 100%, as
> with any real life situation not everything is perfect). You may
> prefer to allow people to choose their own style and have options. But
> as a maintainer, I can confirm that I personally prefer code that I
> have to maintain to have a consistent style, and Python's lack of
> multiple ways to say the same thing is a benefit in ensuring that
> happens. So for me, your proposal would result in extra work, and no
> benefit (I would continue adding "()" as I find that style more
> readable and consistent).
> This is not a "moral stand". The "one way to do things" principle is a
> highly practical decision based on experience of different programming
> environments - Perl in particular allowed many ways of doing the same
> thing ("there's more than one way to do it" was a catch phrase in the
> Perl community) and it's an acknowledged fact that Perl code is hard
> to maintain as a consequence (unless strict style guidelines are
> imposed).
> As someone proposing a change to the Python language, the onus is on
> you to argue the benefits of your change, not on others to argue
> against it. If no-one does anything, Python won't change so you have
> to convince people. At the moment your argument is little more than
> "it looks neater", and opinions on that are clearly divided. As
> there's a *huge* amount of material that would need changing as a
> result of the change (documentation, training courses, style guides,
> IDEs, ...) you need a much better argument - and complaining that the
> people pointing out that your argument isn't strong enough are being
> "counter productive" is not helpful.
> If we're discussing "I've seen a lot of it on this list lately", then
> I would argue that there has been an awful lot of ideas proposed here
> recently which don't take into account the significant and genuine
> costs involved in *any* change to Python, and as a result don't even
> try to offer a justification for the proposal that is in proportion to
> the impact of the change. It's perfectly possible to propose a change
> here, have it supported, and get it implemented - but it needs *work*,
> and a lot of the discussions here are pointless because no-one is
> willing to do any work (not even the amount of work needed to convince
> the core developers that their proposal is worthwhile). It's very easy
> to accuse people of not being willing to listen to your proposals -
> but I challenge you to try to get a change like this included into C++
> or Java, if you think that about Python!
> Paul

This is actually a very convincing one :)

From ncoghlan at  Fri Apr  1 09:44:37 2016
From: ncoghlan at (Nick Coghlan)
Date: Fri, 1 Apr 2016 23:44:37 +1000
Subject: [Python-ideas] `to_file()` method for strings
In-Reply-To: <>
References: <>
 <-7249072436132249471@unknownmsgid> <>
Message-ID: <>

On 29 March 2016 at 01:44, Michel Desmoulin <desmoulinmichel at> wrote:
> On 28/03/2016 17:30, Chris Barker - NOAA Federal wrote:
>>> On Mar 24, 2016, at 7:22 PM, Nick Coghlan wrote:
>>> what if we had a JSON-based save builtin that wrote
>>> UTF-8 encoded files based on json.dump()?
>> I've been thinking about this for a while, but would rather have a
>> "pyson" format -- I.e. Python literals, rather than JSON. This would
>> preserve the tuple vs list and integer vs float distinction, and allow
>> more options for dictionary keys.(and sets?).
>> Granted, you'd lose the interoperability, but for the quick saving and
>> loading of data, it'd be pretty nice.
>> There is also JSON pickle:
>> Though as I understand it, it has the same security issues as pickle.
>> But could we make a not-quite-as-complete pickle-like protocol that
>> could save and load arbitrary objects, without ever running arbitrary
>> code?
> If it's for quick data saving, security is not an issue since the
> data will never come from an attacker if you do a quick script.

"These files will never be supplied or altered by an attacker" is the
kind of assumption that has graced the world with such things as MS
Office macro viruses. That means that as Python makes more inroads
into the traditional territory of MS Excel and other spreadsheets,
ensuring we encourage a clear distinction between code (which is
always dangerous to trust) and data (which *should* be safe to read,
aside from processing capacity limits) becomes increasingly important.

If we ever did something like this, then Chris's suggestion of a
Python-specific format that can be loaded from a string via
ast.literal_eval() rather than using JSON likely makes sense [1], but
it would also be appropriate to revisit that idea first as a project
outside the standard library for ad hoc data persistence, before
proposing it for standard library inclusion.
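
As a rough illustration of the literal-based approach, here is a minimal
sketch that uses only repr() and ast.literal_eval() from the standard
library. The variable names and sample data are invented for the example,
and this is not a proposed API:

```python
import ast

# Sample data: tuple vs list, int vs float and a non-string key all survive.
data = {"point": (3, 1), "count": 2, "ratio": 0.5, ("a", 1): [1, 2, 3]}

text = repr(data)                    # "save": the data as a Python literal
restored = ast.literal_eval(text)    # "load": parses literals, runs no code

# Anything that is not a plain literal, such as a function call, is refused.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True
else:
    rejected = False

print(restored == data, rejected)
```

The text could just as well be written to and read back from a UTF-8 file;
the point is that loading never executes anything from the file.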


[1] is a project from several
years ago aimed at that task.

Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From steve at  Fri Apr  1 09:54:46 2016
From: steve at (Steven D'Aprano)
Date: Sat, 2 Apr 2016 00:54:46 +1100
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 01, 2016 at 01:49:03PM +0200, Michel Desmoulin wrote:

> There are dozens of good ways to oppose an idea; just saying "we got a
> moral stand to not do it" is not convincing.

Nobody has made that argument.

> Especially in a language
> with so many compromises, like len(foo) instead of foo.len,

len(foo) isn't a compromise, it is an intentional feature.

> functional paradigm and Poo and immutability and mutability, etc.


You might not be aware that "poo" is an English euphemism for excrement, 
normally used for and by children. So I'm completely confused by what 
you mean by "and Poo".

> Python has a history of making things get out of the way:

There are many people who would say that Python's case sensitivity and 
significant indentation "get in the way".

> - no {} for indentation;
> - optional parentheses for tuples;

No. Parentheses have nothing to do with tuples (except the empty tuple). 
Parentheses are used for *grouping*. Parens don't make tuples, and they 
aren't "optional" any more than parens are "optional" in addition 
because you can write `result = (a+b)`. The parens here have nothing to 
do with addition, and it would be misleading to say "optional 
parentheses for addition".

Writing (1, 2, 3) is similar to writing ([1, 2, 3]) or ("abc") or (123). 
Apart from nested tuples, it's almost never needed.
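
A short sketch of the point about commas versus parentheses (the variable
names are made up for the example):

```python
a = 1, 2, 3            # the commas build the tuple
b = (1, 2, 3)          # same value; the parens only group
single = 1,            # a one-element tuple, no parens involved
not_a_tuple = (1)      # just the int 1 inside grouping parens
empty = ()             # the empty tuple: the one case where parens are the syntax
nested = (1, 2), (3, 4)  # parens group the inner tuples

print(a == b, type(single).__name__, type(not_a_tuple).__name__)
```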

> - optional parenthesis for classes;

Needed for backwards compatibility. Let's not copy that misfeature into 
future misfeatures.

> If this change does not hurt readability or the ability to debug, and doesn't
> make your code/program any worse than it was, but does help even a
> little, why not?

Who says that it doesn't hurt readability? My personal experience tells 
me that it DOES hurt readability, at least a little, and adds confusion 
to the rules of what needs parens when and what doesn't.

You might not agree with my personal experience, but you shouldn't just 
dismiss it or misrepresent it as a "moral stand". My argument cuts right 
to the core of the argument that making parens optional helps -- my 
experience is that it *doesn't help*, it actually HURTS.


From boekewurm at  Fri Apr  1 10:18:14 2016
From: boekewurm at (Matthias welp)
Date: Fri, 1 Apr 2016 16:18:14 +0200
Subject: [Python-ideas] Decorators for variables
Message-ID: <>

tldr: Using three method declarations or chaining method calls is ugly, why
not allow variables and attributes to be decorated too?

Currently the way to create variables with custom get/set/deleters is to
use the @property decorator or use property(get, set, del, doc?), and this
must be repeated per variable. If I were able to decorate multiple
properties with decorators like @not_none or something similar, it would
take away a lot of currently used confusing code.

Feedback is appreciated.


The current ways to define the getter, setter and deleter methods for a
variable or attribute are the following:

    @property
    def name(self):
        """ docstring """
        ... code

    @name.setter
    def name(self, value):
        ... code

    @name.deleter
    def name(self):
        ... code

and

    var = property(getter, setter, deleter, docstring)
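
For readers who want to run the two existing spellings, here is a small
self-contained sketch. The class names and the `_name` storage attribute
are invented for the example:

```python
class WithDecorators:
    def __init__(self):
        self._name = None

    @property
    def name(self):
        """The name, stored in _name."""
        return self._name

    @name.setter
    def name(self, value):
        self._name = value

    @name.deleter
    def name(self):
        del self._name


class WithPropertyCall:
    def __init__(self):
        self._name = None

    def _get(self):
        return self._name

    def _set(self, value):
        self._name = value

    def _del(self):
        del self._name

    # Same behaviour, built with one explicit property() call.
    name = property(_get, _set, _del, "The name, stored in _name.")
```

Both classes behave the same from the outside: `obj.name`, `obj.name = x`
and `del obj.name` go through the three functions.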

These two methods are workable when you only need to change the access
behaviour of a single variable or property, but the more variables you want
to change access to, the longer and more bloated the code gets. Adding
multiple restrictions on a variable will, for instance, look like this:

    var = decorator_a(decorator_b(property(value)))


    def var(self):
        return decorator_a.getter(decorator_b.getter(self._value))
        ... etc

or even this

    def var(self):

I propose the following syntax, essentially equal to the syntax of function
decorators:

    @decorator
    var = some_value

which would be the same as

    var = decorator(some_value)

and can be chained as well:

    @decorator
    @decorator_2
    var = some_value

which would be

    var = decorator(decorator_2(some_value))

or similarly

    var = decorator(decorator_2())
    var = some_value

The main idea behind the proposal is that you can use the decorator as a
standardized way to create variables that have the same behaviour, instead
of having to do that using methods. I think that a lot can be gained by
specifying a decorator that can decorate variables or properties.
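
For comparison, reusable per-attribute behaviour can already be written
once and applied to several attributes with a descriptor, without new
syntax. The following is only a sketch of that existing mechanism, not of
the proposal itself; `NotNone` and `Rectangle` are hypothetical names
echoing the @not_none example above:

```python
class NotNone:
    """Data descriptor that rejects None (hypothetical, for illustration)."""

    def __init__(self, name):
        # The value is stored on the instance under a private name.
        self._storage = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self          # accessed on the class, not an instance
        return getattr(instance, self._storage)

    def __set__(self, instance, value):
        if value is None:
            raise ValueError("value must not be None")
        setattr(instance, self._storage, value)


class Rectangle:
    # The same restriction applied to two attributes, written only once.
    width = NotNone("width")
    height = NotNone("height")

    def __init__(self, width, height):
        self.width = width
        self.height = height
```

Assigning None to `width` or `height` raises ValueError, while ordinary
values are stored and read back as usual.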

Note that many arguments will be the same as for function decorators (PEP
0318), but then applied to variable/property/attribute declaration.
From rymg19 at  Fri Apr  1 10:19:04 2016
From: rymg19 at (Ryan Gonzalez)
Date: Fri, 1 Apr 2016 09:19:04 -0500
Subject: [Python-ideas] The future of Python: fixing broken error handling
 in Python 8
Message-ID: <>

Python's exception handling system is currently badly brokeTypeError:
unsupported operand type(s) for +: 'NoneType' and 'NoneType'n. Therefore,
with the recent news of the joyous release of Python 8 (, I
have decided to propose a revolutionary idea: safe mock objects.

A "safe" mock object (qualified name
Java-style naming was adopted for readability purposes; comments are now no
longer necessary) is a magic object that supports everything and returns
itself. Since examples speak more words than are in the Python source code,
here are some (examples, not words in the Python source code):

a = 1
b = None
c = a + b # Returns a
print(c) # Prints the empty string.
d = c+1 # All operations on
return a new one.
e =, 2, 3) # `e` is now a
def f():
    assert 0 # Causes the function to return a
    raise 123 # Does the same thing.
print(L) # L is undefined, so it becomes a

Safe mock objects are obviously the Next Error Handling Revolution™. Unicode
errors now simply disappear and return more
`_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`s.

As for `try` and `catch` (protest the naming of `except`!!) statements,
they will
be completely ignored. The `try`, `except`, and `finally` bodies will all be
executed in sequence, except that printing and returning values with an
statement does nothing:

    xyz = None.a # `xyz` becomes a
    print(123) # Does nothing.
    return None # Does nothing.
    return xyz # Returns a

Aggressive error handling (as shown in PanicSort [])
that does destructive actions (such as `rm -rf /`) will always execute the
destructive code, encouraging more honest development.

In addition, due to errors simply being ignored, nothing can ever quite go
All discussions about a safe navigation operator can now be immediately
since any undefined attributes will simply return a

Although I have not yet destroy--I mean, improved CPython to allow for this
amazing idea, I have created a primitive implementation of the
`_frozensafemockobjectimplementation` module:

I hope you will all realize that this new idea is a drastic improvement
over current technologies and therefore support it, because we can Make
Python Great Again™.

[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong.
From rosuav at  Fri Apr  1 10:27:48 2016
From: rosuav at (Chris Angelico)
Date: Sat, 2 Apr 2016 01:27:48 +1100
Subject: [Python-ideas] The future of Python: fixing broken error
 handling in Python 8
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 2, 2016 at 1:19 AM, Ryan Gonzalez <rymg19 at> wrote:
> Safe mock objects are obviously the Next Error Handling Revolution ?.
> Unicode
> errors now simply disappear and return more
> `_frozensafemockobjectimplementation.SafeMockObjectThatIsIncludedWithPython8`s.

Finally. It's about time. For some reason, technology deals with the
hard problems but not the easy ones. I mean, we put a man on the moon,
but we can't cure the common cold OR cancer. We make aeroplanes that
fly us around the world any time we like, but can't get through
airport security in less than an hour. And we design programming
languages that eliminate memory allocation problems, but can't make it
so bytes and text magically work together.

At last, a language that lets me express myself in code without having
to think about any languages other than my own parochial subset of


From tritium-list at  Fri Apr  1 10:36:27 2016
From: tritium-list at (Alexander Walters)
Date: Fri, 1 Apr 2016 10:36:27 -0400
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 3/31/2016 19:22, Ben Finney wrote:
> No mention of keystrokes. I'm endlessly disappointed that discussions of
> "too much noise in the code" are mis-interpreted as *only* about
> writing, not about reading.
Because, presumably, the people making those comments read a lot of code 
and don't see a problem with the two characters?

 >>> def foo():
...   pass

There are four pieces of information in the first line of that 
definition.  It starts with the keyword at the beginning of the line.  
Readers of languages that are left to right will instantly know they are 
in a function definition.  The next piece of information is the 
identifier that the function will be assigned to, and that is clearly 
defined right there.  The proposal would not change this.  The third is 
the argument list, in this case an empty one.  The fourth is the 'block 
delimiter' for lack of anything better for me to call it.

As this example sits, that line should be read as "define a function 
named foo that explicitly takes no arguments with the following code."  
Omitting the parens changes the way it reads to something along the 
lines of "define a function named foo with the following code."  The 
proposal has removed vital visual information.

I said earlier that if this suggestion was made in 1991 it should have 
been accepted to make functions more consistent with classes. I have 
changed my mind; in 1991 classes should have been corrected to always 
require parens.
From ian.g.kelly at  Fri Apr  1 10:37:32 2016
From: ian.g.kelly at (Ian Kelly)
Date: Fri, 1 Apr 2016 08:37:32 -0600
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 1, 2016 at 8:18 AM, Matthias welp <boekewurm at> wrote:
> Currently the way to create variables with custom get/set/deleters is to use
> the @property decorator or use property(get, set, del, doc?), and this must
> be repeated per variable. If I were able to decorate multiple properties
> with decorators like @not_none or something similar, it would take away a
> lot of currently used confusing code.

I don't see the relationship between this paragraph and the rest of
your proposal. How does a decorator in place of an explicit function
call prevent repetition?

> I propose the following syntax, essentially equal to the syntax of function
> decorators:
>     @decorator
>     var = some_value
> which would be the same as
>     var = decorator(some_value)
> and can be chained as well:
>     @decorator
>     @decorator_2
>     var = some_value
> which would be
>    var = decorator(decorator_2(some_value))
> or similarly
>     var = decorator(decorator_2())
>     var = some_value

What about augmented assignment? Should this work?

var += 20

And would that be equivalent to:

var = var + 20


var = var + float(20)

Also, what about attributes and items?

x.attr = value

d['foo'] = value

> The main idea behind the proposal is that you can use the decorator as a
> standardized way to create variables that have the same behaviour, instead
> of having to do that using methods. I think that a lot can be gained by
> specifying a decorator that can decorate variables or properties.

By "methods" you mean "function composition", right? Otherwise I don't
understand what methods have got to do with this.

From ethan at  Fri Apr  1 10:47:02 2016
From: ethan at (Ethan Furman)
Date: Fri, 01 Apr 2016 07:47:02 -0700
Subject: [Python-ideas] Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <ndjovk$at5$> <>
 <> <>
 <ndlgd5$6sc$> <>
Message-ID: <>

On 04/01/2016 03:10 AM, Sven R. Kunze wrote:
> On 01.04.2016 11:52, Terry Reedy wrote:
>> On 4/1/2016 5:25 AM, Sven R. Kunze wrote:
>>> I remember this "issue" with decorators as well.
>>> @property
>>> def foo....
>>> @lru_cache()    # really?
>>> def bar...
>> There is a semantic difference between the two examples. The ()s are
>> not optional. 'property' is a decorator that is applied to foo after
>> it is defined.  'lru_cache()' is a function call that returns a
>> decorator. Quite different.
> @Terry
> Thanks for snipping away the relevant explanation and not addressing my
> point. >.<

Your "point" appears to be that ()s are optional in the case of 
decorators, and they are not -- decorators don't need them, and can't 
have them*, while functions that return a decorator do need them and 
must have them.

If you meant something else please offer a better explanation -- no need 
to be snide with Terry.


* Okay, it is possible to write a decorator that works either with or 
without parens, but it's a pain, not general-purpose, and can be 
confusing to use.
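
A minimal sketch of the footnote above, showing one way a decorator can be
written to work both with and without parens. The name `logged` and its
`prefix` parameter are made up for illustration; the extra `func is None`
branch is the "pain" being referred to.

```python
import functools


def logged(func=None, *, prefix="call"):
    """Hypothetical decorator usable as @logged or @logged(prefix=...)."""
    if func is None:
        # Used with parens: return the real decorator, with prefix bound.
        return functools.partial(logged, prefix=prefix)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Record each call so the effect of the decorator is observable.
        wrapper.calls.append((prefix, func.__name__))
        return func(*args, **kwargs)

    wrapper.calls = []
    return wrapper


@logged
def foo():
    return 1


@logged(prefix="cached")
def bar():
    return 2
```

Both spellings end up producing the same kind of wrapper; the cost is that
`logged` has to guess from its arguments which way it was invoked.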

From tritium-list at  Fri Apr  1 10:42:30 2016
From: tritium-list at (Alexander Walters)
Date: Fri, 1 Apr 2016 10:42:30 -0400
Subject: [Python-ideas] Package reputation system
In-Reply-To: <>
References: <>
 <> <>
 <> <>
 <> <>
Message-ID: <>

On 4/1/2016 09:07, Nick Coghlan wrote:
> And so do Linux distributions.
> There's a reason there are so many of the latter: "good" choices
> depend a great deal on what you're doing, which means there's no point
> in trying to come up with the "one true reputation system" for
> software components. does a relatively good job for
> the Django ecosystem, but the simple fact of using Django puts you far
> enough down the path towards solving a particular kind of problem (web
> service design with a rich user permission system backed by a
> relational database) that a community driven rating system can work.
> By contrast, I think Python itself covers too many domains for a
> common rating system to be feasible - "good for education" is not the
> same as "good for sysadmin tasks" is not the same as "good for data
> analysis" is not the same as "good for network service development",
> etc.
> Cheers,
> Nick.
Not that this was the original proposal, but there can be such a thing 
as a universal 'bad' package, though.  So about the only thing that a 
universal package rating system can do effectively is shame developers.  
I don't think we want that.

From ericfahlgren at  Fri Apr  1 11:07:21 2016
From: ericfahlgren at (Eric Fahlgren)
Date: Fri, 1 Apr 2016 08:07:21 -0700
Subject: [Python-ideas] The next major Python version will be Python 8
In-Reply-To: <>
References: <>
Message-ID: <010a01d18c28$31d0e810$9572b830$>

Victor Stinner  wrote:
> Sent: Thursday, March 31, 2016 15:27
> To: python-ideas
> The PSF is happy to announce that the new Python release will be Python 8!


Excellent work!   Good to see people thinking about the future.  I've been working on some similar enhancements myself.

My principal development platform is the Apple ][, which sports a highly ergonomic 40-column display (with supremely aesthetic shades of grey in lieu of useless colors).  This leads me to propose an enhancement to PEP 8, which I'm calling PEP 8.1 (since as all Windows users know, 8.1 is much better than 8).

One of the major changes in PEP 8.1 is the reduction of the allowed line length from 79 to 39, with a suggested maximum of 36 characters.  It would be great if Python 8.1 would implement this, and disallow the use of any source code with characters beyond the magic boundary.  This clearly makes sense since, as everyone knows, if the 79-column limit is good in this age of ubiquitous 1080, nay 4k, monitors, then 39 is surely better!


From victor.stinner at  Fri Apr  1 11:34:14 2016
From: victor.stinner at (Victor Stinner)
Date: Fri, 1 Apr 2016 17:34:14 +0200
Subject: [Python-ideas] The next major Python version will be Python 8
In-Reply-To: <010a01d18c28$31d0e810$9572b830$>
References: <>
Message-ID: <>

2016-04-01 17:07 GMT+02:00 Eric Fahlgren <ericfahlgren at>:
> Excellent work!   Good to see people thinking about the future.  I've been working on some similar enhancements myself.
> (...)
> One of the major changes in PEP 8.1 is the reduction of the allowed line length from 79 to 39, with a suggested maximum of 36 characters.  It would be great if Python 8.1 would implement this, and disallow the use of any source code with characters beyond the magic boundary.  This clearly makes sense, since as everyone knows that if the 79-column limit is good in this age of ubiquitous 1080, nay 4k, monitors, then 39 is surely better!

The rules of the pep8 module *must* change at each Python release to
ensure that each release is backward-incompatible!


From boekewurm at  Fri Apr  1 11:46:26 2016
From: boekewurm at (Matthias welp)
Date: Fri, 1 Apr 2016 17:46:26 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

> > Currently the way to create variables with custom get/set/deleters is to
> > use the @property decorator or use property(get, set, del, doc?), and
> > this must be repeated per variable. If I were able to decorate multiple
> > properties with decorators like @not_none or something similar, it
> > would take away a lot of currently used confusing code.
> How does this prevent repetition?

If you were to define a variable you currently could use the @property for
each variable, which could take up to 3 declarations of the same name per
use of the pattern. Using a decorator might take that down to only 1 extra
line.

> What about augmented assignment? Should this work?

The steps it would go through were these:

1. the value of the statement is calculated. e.g. val + 20 in the first
case given.
2. the decorator is applied on that value
3. the return value from the decorator is then assigned to the variable.

This is, again, very similar to the way function decorators work, and a
short-handed method to make property access more transparent to the

> Also, what about attributes and items?

I have not yet thought about that, as they are not direct scope variables.
The idea was to decorate the attribute or variable at the moment it
would get defined, not per se after definition.

> > instead of having to do that using methods
> By "methods" you mean "function composition", right?

Sorry for my terminology, I meant function calls, but function composition
is indeed what would happen effectively.
From ian.g.kelly at  Fri Apr  1 11:55:02 2016
From: ian.g.kelly at (Ian Kelly)
Date: Fri, 1 Apr 2016 09:55:02 -0600
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 1, 2016 at 9:46 AM, Matthias welp <boekewurm at> wrote:
>> > Currently the way to create variables with custom get/set/deleters is to
>> > use the @property decorator or use property(get, set, del, doc?), and
>> > this must be repeated per variable. If I were able to decorate multiple
>> > properties with decorators like @not_none or something similar, it
>> > would take away a lot of currently used confusing code.
>> How does this prevent repetition?
> If you were to define a variable you currently could use the @property for
> each variable, which could take up to 3 declarations of the same name per
> use of the pattern. Using a decorator might take that down to only 1 extra
> line.

An example of the transformation that you intend would help here. If
you're intending this as a @property replacement then as far as I can
see you still need to write up to three functions that define the
property's behavior.

From rosuav at  Fri Apr  1 12:45:18 2016
From: rosuav at (Chris Angelico)
Date: Sat, 2 Apr 2016 03:45:18 +1100
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 2, 2016 at 1:18 AM, Matthias welp <boekewurm at> wrote:
> I propose the following syntax, essentially equal to the syntax of function
> decorators:
>     @decorator
>     var = some_value
> which would be the same as
>     var = decorator(some_value)

The whole point of decorator syntax for classes and functions is that
their definitions take many lines, and the decoration belongs as part
of the function signature or class definition. At the top of a
function block is a line which specifies the function name and any
arguments, and then you have the docstring. Similarly with classes -
name, superclasses, metaclass, docstring. All up the top. Placing the
decorator above that allows for an extremely convenient declarative
syntax that keeps all that information together. Also, the decorator
syntax replaces the redundant names:

def functionname():
    ...
functionname = decorator(functionname)

where the function first gets defined using its name, and then gets
rebound (which involves looking up the name and then assigning the
result back) - three separate uses of the name. In contrast, you're
asking for syntax to help you modify an expression. Expressions
already don't need the decorator syntax, because we can replace this:

var = some_value
var = decorator(var)

with this:

var = decorator(some_value)

as in your example. Decorator syntax buys us nothing above this.
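
To make the comparison above concrete, here is a small sketch. The decorator
`shout` and the function names are invented for the example; nothing here is
specific to the proposal.

```python
def shout(func):
    """Hypothetical decorator: upper-cases the wrapped function's result."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


# Decorator syntax: the function name is written once.
@shout
def greet():
    return "hello"


# Manual rebinding: the function name is written three times.
def greet_manual():
    return "hello"
greet_manual = shout(greet_manual)


# For a plain expression, an ordinary call already says everything.
text = str.upper("hello")
```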


From boekewurm at  Fri Apr  1 13:14:41 2016
From: boekewurm at (Matthias welp)
Date: Fri, 1 Apr 2016 19:14:41 +0200
Subject: [Python-ideas] Decorators for variables
Message-ID: <>

> An example of the transformation would help here

An example, that detects cycles in a graph, and doesn't do an update if
the graph has cycles.

class prevent_cycles(property):
    """
        This uses nodes that point to only one other node: if there is a
        cycle in the current subgraph, it will detect that within O(n)
    """
    def __init__(self, value):
        self._value = None

    def getter(self):
        return self._value

    def setter(self, value):
        if not self.turtle_and_hare(value):
            self._value = value
        else:
            raise Exception("cycle detected, shutting down")

    def deleter(self):
        del self._value

    def turtle_and_hare(self, other):
        """
            Generic turtle and hare implementation.
            Returns True if cyclic from this point, False if it is not.
        """
        turtle = other
        hare = other
        fieldname = self.__name__
        # this is assuming that properties have access to their name, but that
        # would also be the same as a function.

        while True:
            if hare is None:
                return False

            hare = getattr(hare, fieldname)

            if hare is None:
                return False
            if hare is turtle:
                return True

            hare = getattr(hare, fieldname)
            turtle = getattr(turtle, fieldname)

            if hare is turtle:
                return True

class A(object):
    def __init__(self, parent):
        @prevent_cycles
        self.parent = parent

This would prevent cycles from being created in this object A, and would make
some highly reusable code. The same can be done for @not_none, etc, to prevent
some states which may be unwanted.
From mike at  Fri Apr  1 13:32:57 2016
From: mike at (Michael Selik)
Date: Fri, 01 Apr 2016 17:32:57 +0000
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 1, 2016 at 10:18 AM Matthias welp <boekewurm at> wrote:

> tldr: Using three method declarations or chaining method calls is ugly,
> why not allow variables and attributes to be decorated too?

Focusing on attributes, not variables in general. Check out how Django
dealt with this ( And
SQLAlchemy (

Do their solutions satisfy?
From ian.g.kelly at  Fri Apr  1 13:33:15 2016
From: ian.g.kelly at (Ian Kelly)
Date: Fri, 1 Apr 2016 11:33:15 -0600
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

(Resending to correct list. Sorry about that.)

On Fri, Apr 1, 2016 at 11:25 AM, Ian Kelly <ian.g.kelly at> wrote:
> On Fri, Apr 1, 2016 at 11:14 AM, Matthias welp <boekewurm at> wrote:
>>> An example of the transformation would help here
>> An example, that detects cycles in a graph, and doesn't do an update if
>> the graph has cycles.
> Thanks.
>> class A(object):
>>     def __init__(self, parent):
>>         @prevent_cycles
>>         self.parent = parent
> I think you'll find that this doesn't work. Properties are members of
> the class, not of instances of the class.
>> This would prevent cycles from being created in this object A, and would
>> make
>> some highly reusable code. The same can be done for @not_none, etc, to
>> prevent
>> some states which may be unwanted.
> But you could accomplish the same thing with "self.parent =
> prevent_cycles(parent)". So I'm still not seeing how the use of the
> decorator syntax eliminates repetition.
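
A minimal sketch of the point about properties being members of the class,
not of instances. The class `Checked` and its attribute names are
hypothetical and exist only to show where a property takes effect.

```python
class Checked:
    """Hypothetical class used only to show where properties take effect."""

    def _get(self):
        return self._value

    def _set(self, value):
        if value is None:
            raise ValueError("value must not be None")
        self._value = value

    # Stored on the class: attribute access is routed through _get/_set.
    value = property(_get, _set)

    def __init__(self, value):
        # Goes through the property setter defined on the class.
        self.value = value
        # Stored on the instance: just an ordinary object, never invoked.
        self.inert = property(self._get)


c = Checked(1)
```

Reading `c.value` calls the getter, while reading `c.inert` simply hands back
the property object itself, which is why decorating `self.parent` inside
`__init__` would not behave like a class-level property.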

From desmoulinmichel at  Fri Apr  1 13:41:13 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Fri, 1 Apr 2016 19:41:13 +0200
Subject: [Python-ideas]  Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

It's dangerous to talk about a new feature the 1st of April, so I'll
start by saying this one is not a joke.

I recently read a proposal to allow md5 hashing by running python -m hashlib
md5 filename.

I think it's a great idea, and that we can extend this kind of API to other
parts of Python such as:

python -m random randint 0 10 => print(random.randint(0, 10))
python -m random urandom 10 => print(os.urandom(10))
python -m uuid uuid4 => print(uuid.uuid4().hex)
python -m uuid uuid3 => print(uuid.uuid3().hex)
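
As a rough sketch of what such a command-line front end could look like,
assuming an argparse-based dispatcher: the function `main`, the program name
`randtool` and the subcommand layout are all invented here and do not
describe any interface discussed in this thread beyond the commands listed
above.

```python
import argparse
import os
import random
import uuid


def main(argv):
    """Hypothetical command-line front end for the commands listed above."""
    parser = argparse.ArgumentParser(prog="randtool")
    sub = parser.add_subparsers(dest="command", required=True)

    p = sub.add_parser("randint")
    p.add_argument("low", type=int)
    p.add_argument("high", type=int)

    p = sub.add_parser("urandom")
    p.add_argument("size", type=int)

    sub.add_parser("uuid4")

    args = parser.parse_args(argv)
    if args.command == "randint":
        return str(random.randint(args.low, args.high))
    if args.command == "urandom":
        # Hex is used so that the bytes print cleanly on a terminal.
        return os.urandom(args.size).hex()
    return uuid.uuid4().hex


print(main(["randint", "0", "10"]))
```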

From desmoulinmichel at  Fri Apr  1 13:44:21 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Fri, 1 Apr 2016 19:44:21 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

Le 01/04/2016 19:33, Ian Kelly a écrit :
> (Resending to correct list. Sorry about that.)
> On Fri, Apr 1, 2016 at 11:25 AM, Ian Kelly <ian.g.kelly at> wrote:
>> On Fri, Apr 1, 2016 at 11:14 AM, Matthias welp <boekewurm at> wrote:
>>>> An example of the transformation would help here
>>> An example, that detects cycles in a graph, and doesn't do an update if
>>> the graph has cycles.
>> Thanks.
>>> class A(object):
>>>     def __init__(self, parent):
>>>         @prevent_cycles
>>>         self.parent = parent
>> I think you'll find that this doesn't work. Properties are members of
>> the class, not of instances of the class.
>>> This would prevent cycles from being created in this object A, and would
>>> make
>>> some highly reusable code. The same can be done for @not_none, etc, to
>>> prevent
>>> some states which may be unwanted.
>> But you could accomplish the same thing with "self.parent =
>> prevent_cycles(parent)". So I'm still not seeing how the use of the
>> decorator syntax eliminates repetition.

Not saying I like the proposal, but you can argue against regular
decorators the same way:

@foo
def bar():
    ...

Is just:

bar = foo(bar)

But I think the benefit of @decorator on functions is mainly that a
function body is big, and this way we can read the decorator next to the
function signature, while on a variable this just adds another way to
call a function on a variable.

From desmoulinmichel at  Fri Apr  1 13:47:42 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Fri, 1 Apr 2016 19:47:42 +0200
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

Le 01/04/2016 15:54, Steven D'Aprano a écrit :
> On Fri, Apr 01, 2016 at 01:49:03PM +0200, Michel Desmoulin wrote:
>> There are dozen of good way to oppose an idea, just saying "we got a
>> moral stand to not do it" is not convincing.
> Nobody has made that argument.
>> Espacially in a language
>> with so many compromised like len(foo) instead of foo.len,
> len(foo) isn't a compromise, it is an intentional feature.
>> functional paradigme and Poo and immutability and mutability, etc.
> "Paradigm". 
> You might not be aware that "poo" is an English euphemism for excrement, 
> normally used for and by children. So I'm completely confused by what 
> you mean by "and Poo".

Those are both French mistakes.

"Paradigme" has an "e" in French, while OOP is POO. Wrote the email too fast.

>> Python has an history of making things to get out of the way:
> There are many people who would say that Python's case sensitivity and 
> significant indentation "get in the way".
>> - no {} for indentation;
>> - optional parentheses for tuples;
> No. Parentheses have nothing to do with tuples (except the empty tuple). 
> Parentheses are used for *grouping*. Parens don't make tuples, and they 
> aren't "optional" any more than parens are "optional" in addition 
> because you can write `result = (a+b)`. The parens here have nothing to 
> do with addition, and it would be misleading to say "optional 
> parentheses for addition".
> Writing (1, 2, 3) is similar to writing ([1, 2, 3]) or ("abc") or (123). 
> Apart from nested tuples, it's almost never needed.
>> - optional parenthesis for classes;
> Needed for backwards compatibility. Let's not copy that misfeature into 
> future misfeatures.
>> If this changes does not hurt readability, ability to debug and doesn't
>> make your code/program any worst than it was but does't help even a
>> little, why not ?
> Who says that it doesn't hurt readability? My personal experience tells 
> me that it DOES hurt readability, at least a little, and adds confusion 
> to the rules of what needs parens when and what doesn't.
> You might not agree with my personal experience, but you shouldn't just 
> dismiss it or misrepresent it as a "moral stand". My argument cuts right 
> to the core of the argument that making parens optional helps -- my 
> experience is that it *doesn't help*, it actually HURTS.

But as I said earlier, I now agree with you.

From boekewurm at  Fri Apr  1 13:49:41 2016
From: boekewurm at (Matthias welp)
Date: Fri, 1 Apr 2016 19:49:41 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

> > tldr
> Check out how Django dealt with this. And SQLAlchemy
> Do their solutions satisfy

The thing I'm missing in those solutions is chainability. If I
wanted something that uses access logging (like what Django and
SQLAlchemy are doing), plus some 'mixin' for that variable to provide cycle
protection, then that would be hard. The only other way is using function
composition, and that can lead to statements that are too long to read.
From boekewurm at  Fri Apr  1 13:53:32 2016
From: boekewurm at (Matthias welp)
Date: Fri, 1 Apr 2016 19:53:32 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

> But, I think the benefit for @decorator on functions is mainly because a
> function body is big, and this way we can read the decorator next to the
> function signature while on a variable, this just add another way to
> call a function on a variable.

Yes, it is, but I proposed it to give a visual indicator that it does not
change the value that is assigned to the identifier, but rather changes the
behaviour of the identifier, just like what most function decorators do.
From rosuav at  Fri Apr  1 13:57:22 2016
From: rosuav at (Chris Angelico)
Date: Sat, 2 Apr 2016 04:57:22 +1100
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 2, 2016 at 4:53 AM, Matthias welp <boekewurm at> wrote:
>> But, I think the benefit for @decorator on functions is mainly because a
>> function body is big, and this way we can read the decorator next to the
>> function signature while on a variable, this just add another way to
>> call a function on a variable.
> Yes, it is, but I proposed it to give a visual indicator that it does not
> change the value that is assigned to the identifier, but rather changes the
> behaviour of the identifier, just like what most function decorators do.

Wait, what?

Function decorators are simply higher-order functions: they take a
function as an argument, and return a function [1]. They can't change
the behaviour of the name, only the value it's bound to.


[1] Usually. Nothing's stopping them from returning non-callables,
except that it'd confuse the living daylights out of people.
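
A tiny sketch of the footnote, using a made-up decorator `call_now` that
returns a non-callable. It shows that a decorator only decides which value
the name ends up bound to.

```python
def call_now(func):
    """Hypothetical decorator: returns the function's result, not a function."""
    return func()


@call_now
def answer():
    return 42

# `answer` is now bound to the int 42. Only the bound value changed;
# the name itself behaves like any other name.
```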

From ethan at  Fri Apr  1 13:58:09 2016
From: ethan at (Ethan Furman)
Date: Fri, 01 Apr 2016 10:58:09 -0700
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/01/2016 10:53 AM, Matthias welp wrote:
>  > But, I think the benefit for @decorator on functions is mainly because a
>  > function body is big, and this way we can read the decorator next to the
>  > function signature while on a variable, this just add another way to
>  > call a function on a variable.
> Yes, it is, but I proposed it to give a visual indicator that it does
> not change the value that is assigned to the identifier, but rather
> changes the behaviour of the identifier, just like what most function
> decorators do.

So instead of

   a = Char(length=10, value='empty')

you want

   @Char(length=10)
   a = 'empty'

?

From desmoulinmichel at  Fri Apr  1 14:02:10 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Fri, 1 Apr 2016 20:02:10 +0200
Subject: [Python-ideas]  Provide __main__ for datetime and time
In-Reply-To: <>
References: <>
 <-7249072436132249471@unknownmsgid> <>
Message-ID: <>

As with my previous email about __main__ for random, uuid and os, I wish
to suggest a similar __main__ for datetime. The target audience may not
be the same, so I'm making it a different proposal.


python -m datetime now => print(str(
python -m datetime utcnow => print(str(datetime.datetime.utcnow()))
python -m time epoch => print(time.time())

python -m datetime now "%d/%m/%Y" =>
python -m datetime utcnow "%d/%m/%Y" =>
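
One way the format-string variant could behave, as a sketch only: the helper
`render_now` and its parameters are hypothetical, and it relies on nothing
beyond `datetime.datetime.now` and `strftime`.

```python
import datetime


def render_now(fmt=None, utc=False):
    """Hypothetical helper: format the current time, optionally in UTC."""
    tz = datetime.timezone.utc if utc else None
    now = datetime.datetime.now(tz)
    # Fall back to the default str() form when no format is given.
    return now.strftime(fmt) if fmt else str(now)


print(render_now("%d/%m/%Y"))
```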

From boekewurm at  Fri Apr  1 14:08:54 2016
From: boekewurm at (Matthias welp)
Date: Fri, 1 Apr 2016 20:08:54 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

> Function decorators

There are decorators that return a callable that not only calls the function
that was given as an argument, but also does some other things, and therefore
changes the behaviour of that function.

> So instead of
>   a = Char(length=10, value='empty')
> you want
>   @Char(length=10)
>   a = 'empty'
> ?

If possible, yes. So that there is a standardized way to access changing
variables, or to put limits on the content of the variable, similar to the
@accepts and @produces decorators that are seen here (
From ethan at  Fri Apr  1 14:33:03 2016
From: ethan at (Ethan Furman)
Date: Fri, 01 Apr 2016 11:33:03 -0700
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/01/2016 11:08 AM, Matthias welp wrote:
 > Even earlier, Ethan Furman wrote:

>> So instead of
>>   a = Char(length=10, value='empty')
>> you want
>>   @Char(length=10)
>>   a = 'empty'
>> ?
> If possible, yes. So that there is a standardized way to access changing
> variables, or to put limits on the content of the variable, similar to
> the @accepts and @produces decorators that are seen here
> (

I don't see it happening.  Making that change would be a lot of work, 
and the advantages (if any) of the second method over the first do not 
warrant it.


From mike at  Fri Apr  1 14:33:34 2016
From: mike at (Michael Selik)
Date: Fri, 01 Apr 2016 18:33:34 +0000
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 1, 2016 at 2:09 PM Matthias welp <boekewurm at> wrote:

> > Function decorators
> There are decorators that return a callable that not calls the function
> that was given as an argument, but also do some other things, and therefore
> change the behaviour of that function.
> > So instead of
> >
> >   a = Char(length=10, value='empty')
> >
> > you want
> >
> >   @Char(length=10)
> >   a = 'empty'
> >
> > ?
> If possible, yes. So that there is a standardized way to access changing
> variables, or to put limits on the content of the variable, similar to the
> @accepts and @produces decorators that are seen here (
> )

There is a standardized way. You can extend ``property`` or mimic its
implementation. Then instead of

class Foo:
    a = property(getter, no_cycles)

You can write

class Foo:
    a = NoCycles()

I haven't linked to a how-to for doing this, because I think it's
unnecessary for most small projects. Every so often someone asks for a more
pleasant syntax for specifying a property, getter and setter. Guido seems
to consistently reply that he thinks our current situation is good enough.
I'd dig up a link to the email archive for you, but Google wasn't being
very kind to me.
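
As a rough illustration of mimicking ``property``, here is a minimal sketch
of a descriptor. The ``Char`` class, its ``length`` and ``value`` parameters
and the ``Record`` class are hypothetical names borrowed from the example
earlier in this thread; this is not an existing library API, just one way
such a class could be written.

```python
import weakref


class Char:
    """Hypothetical descriptor: a str attribute with a maximum length."""

    def __init__(self, length, value=''):
        self.length = length
        self.default = self._check(value)
        # Per-instance values, keyed weakly so instances can be collected.
        self._values = weakref.WeakKeyDictionary()

    def _check(self, value):
        if not isinstance(value, str):
            raise TypeError('expected a str')
        if len(value) > self.length:
            raise ValueError('longer than %d characters' % self.length)
        return value

    def __get__(self, instance, owner=None):
        if instance is None:
            return self              # accessed on the class itself
        return self._values.get(instance, self.default)

    def __set__(self, instance, value):
        self._values[instance] = self._check(value)


class Record:
    a = Char(length=10, value='empty')


r = Record()
first = r.a          # the default value
r.a = 'short'        # accepted
try:
    r.a = 'x' * 11   # rejected by the descriptor
    rejected = False
except ValueError:
    rejected = True
```

The validation runs on every assignment to the attribute, which is the
part a plain name binding cannot give you.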
From mike at  Fri Apr  1 14:46:11 2016
From: mike at (Michael Selik)
Date: Fri, 01 Apr 2016 18:46:11 +0000
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

There's a large category of these features that is solved by ``python -c``.
What's special about these particular tasks?

On Fri, Apr 1, 2016 at 1:41 PM Michel Desmoulin <desmoulinmichel at>

> It's dangerous to talk about a new feature the 1st of April, so I'll
> start by saying this one is not a joke.
> I read recently a proposal to allow md5 hashing doing python -m hashlib
> md5 filename.
> I think it's a great idea, and that can extend this kind of API to other
> parts of Python such as:
> python -m random randint 0 10 => print(random.randint(0, 10))
> python -m random urandom 10 => print(os.urandom(10))
> python -m uuid uuid4 => print(uuid.uuid4().hex)
> python -m uuid uuid3 => print(uuid.uuid3().hex)
From random832 at  Fri Apr  1 15:04:40 2016
From: random832 at (Random832)
Date: Fri, 01 Apr 2016 15:04:40 -0400
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 1, 2016, at 13:41, Michel Desmoulin wrote:
> It's dangerous to talk about a new feature the 1st of April, so I'll
> start by saying this one is not a joke.
> I read recently a proposal to allow md5 hashing doing python -m hashlib
> md5 filename.
> I think it's a great idea, and that can extend this kind of API to other
> parts of Python such as:
> python -m random randint 0 10 => print(random.randint(0, 10))
> python -m random urandom 10 => print(os.urandom(10))
> python -m uuid uuid4 => print(uuid.uuid4().hex)
> python -m uuid uuid3 => print(uuid.uuid3().hex)

Bikeshedding a bit, but how about just python -m uuid for

From victor.stinner at  Fri Apr  1 15:07:41 2016
From: victor.stinner at (Victor Stinner)
Date: Fri, 1 Apr 2016 21:07:41 +0200
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>


2016-04-01 19:41 GMT+02:00 Michel Desmoulin <desmoulinmichel at>:
> I read recently a proposal to allow md5 hashing doing python -m hashlib
> md5 filename.
> I think it's a great idea, and that can extend this kind of API to other
> parts of Python such as:
> python -m random randint 0 10 => print(random.randint(0, 10))

How many times did you need this feature recently? I don't recall
having to generate a random number in a range.

> python -m random urandom 10 => print(os.urandom(10))

I don't understand the use case. Can you elaborate?

> python -m uuid uuid4 => print(uuid.uuid4().hex)

FYI On Linux, you can use "cat /proc/sys/kernel/random/uuid" ;-)

I agree that -c is enough here:
./python -c 'import uuid; print(uuid.uuid4())'

> python -m uuid uuid3 => print(uuid.uuid3().hex)

I don't know UUID3. It looks like you need more parameters.

IMHO these use cases are not popular enough to justify a CLI.

hashlib CLI is inspired by existing tools: md5sum, sha1sum, etc. Same
rationale for Python tarfile CLI. What are the existing commands which
inspired your CLI?


From tjreedy at  Fri Apr  1 15:31:59 2016
From: tjreedy at (Terry Reedy)
Date: Fri, 1 Apr 2016 15:31:59 -0400
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
Message-ID: <ndmibl$sk0$>

On 4/1/2016 7:49 AM, Michel Desmoulin wrote:

> Yes, and because after more than a decade of Python, I still forget to
> type out the parenthesis some time,

Only in function definitions, or also in function calls?

> then go back and realize that it's
> silly that I have to since I don't with classes.

The parallel is invalid.  To repeat what I said in my initial 
response and what others have said: the header in a def statement shows 
how to call the function (the signature).  The header in a class 
statement does no such thing.  The () in a def statement represent the 
call operator.  The () in a class statement do not.  They serve a 
visual grouping and subordination purpose.  "class mystring(str)" does 
not say anything about how to use mystring as a callable.
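
A small sketch of that distinction, using the standard inspect module.
The names spam and mystring are only illustrative.

```python
import inspect


def spam():
    return 'eggs'


class mystring(str):
    pass


# The def header mirrors the call: spam is called with an empty argument list.
sig = str(inspect.signature(spam))

# The parentheses in the class header list base classes, not call arguments.
bases = mystring.__bases__

# How to call mystring is decided by str's constructor, not by the header.
shout = mystring('hello').upper()
```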

> Is there really a strong case against it

Yes, as has been presented before, but ignored by proponents.  For one: 
I believe that omitting () in def will encourage people to even more 
often omit () in calls, when needed, and that is BAD.  I consider this a 
killer argument against it.

 > than just "it's not pure" ?

I have seen this too often.  Practical arguments against a proposal are 
either ignored or wrongly dismissed as 'purity arguments'.  To me, this 
makes the discussion useless.

Terry Jan Reedy

From tjreedy at  Fri Apr  1 15:55:38 2016
From: tjreedy at (Terry Reedy)
Date: Fri, 1 Apr 2016 15:55:38 -0400
Subject: [Python-ideas] Provide __main__ for datetime and time
In-Reply-To: <>
References: <>
 <-7249072436132249471@unknownmsgid> <>
Message-ID: <ndmjnv$io1$>

Note: new subjects should be posted as new threads.  This was posted as 
a response in the unrelated "`to_file()` method for strings" thread and 
will not be seen by anyone who has killed that thread or who has threads 

On 4/1/2016 2:02 PM, Michel Desmoulin wrote:
> As with my previous email about __main__ for random, uuid and os, I wish
> to suggest a similar __main__ for datetime.

File __main__ is for packages.  I believe none of the modules you 
mention are packages, so I presume you mean adding within the module 
something like

def main(args): ...

if __name__ == '__main__':
     from sys import argv
     main(argv[1:])

or the equivalent in C.
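
For illustration only, a minimal sketch of what such a module-level main
could look like for the datetime case. The subcommand names now and utcnow
come from the proposal; the usage text, the return-a-string design and
everything else here are assumptions, not how the stdlib datetime module
behaves.

```python
import datetime
import sys


def main(args):
    """Hypothetical command handler: returns the text to print."""
    usage = 'usage: now|utcnow [strftime-format]'
    if not args or args[0] not in ('now', 'utcnow'):
        return usage
    if args[0] == 'now':
        moment = datetime.datetime.now()
    else:
        moment = datetime.datetime.now(datetime.timezone.utc)
    if len(args) > 1:
        # An optional second argument is treated as a strftime format.
        return moment.strftime(args[1])
    return str(moment)


if __name__ == '__main__':
    print(main(sys.argv[1:]))
```

Returning the string instead of printing inside main keeps the function
easy to call from tests as well as from the command line.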

 > The target audience may not
> be the same, so I'm making it a different proposal.
> E.G:
> python -m datetime now => print(str(
> python -m datetime utcnow => print(str(datetime.datetime.utcnow())
> python -m time epoch => print(time.time())
> python -m datetime now "%d/%m/%Y" =>
> print(str("%d/%m/%Y"))
> python -m datetime utcnow "%d/%m/%Y" =>
> print(str(datetime.datetime.utcnow().strftime("%d/%m/%Y"))

What is the particular motivation for this package?  Should we add a 
command line interface to math?  So that

python -m math func arg => print(func(arg))
I am wondering what is or should be the general policy on the subject. 
How easy is this for C-coded modules?

Terry Jan Reedy

From tjreedy at  Fri Apr  1 15:57:49 2016
From: tjreedy at (Terry Reedy)
Date: Fri, 1 Apr 2016 15:57:49 -0400
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <ndmjs2$io1$>

On 4/1/2016 1:41 PM, Michel Desmoulin wrote:
> It's dangerous to talk about a new feature the 1st of April, so I'll
> start by saying this one is not a joke.

Then you should not have posted this as a followup on the joke thread '-).

See my response to your separate similar proposal.

Terry Jan Reedy

From jsbueno at  Fri Apr  1 16:35:25 2016
From: jsbueno at (Joao S. O. Bueno)
Date: Fri, 1 Apr 2016 17:35:25 -0300
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On 1 April 2016 at 16:07, Victor Stinner <victor.stinner at> wrote:
> Hi,
> 2016-04-01 19:41 GMT+02:00 Michel Desmoulin <desmoulinmichel at>:
>> I read recently a proposal to allow md5 hashing doing python -m hashlib
>> md5 filename.
>> I think it's a great idea, and that can extend this kind of API to other
>> parts of Python such as:
>> python -m random randint 0 10 => print(random.randint(0, 10))
> How many times did you need this feature recently? I don't recall
> having to generate a random number in a range.

Well...actually, really recently (Tuesday), I did
python3 -c "__import__('calendar').calendar(2016)"

And just when this thread started here, I recalled it was possible
to do just
python -m calendar

That is an anecdote related with the features in question, and I
myself can't decide if
it is a point for or against them  :-)
I suppose they'd be "nice to haves" .

But maybe, instead of sprinkling 4 - 5 lines of code in "__main__"
files everywhere,
it would make sense to put all of it in a single place?

Maybe shutils itself?
So that python -m shutils.uuid  , python -m shutils.randint, and so on
would each do their thing?


From ethan at  Fri Apr  1 16:42:35 2016
From: ethan at (Ethan Furman)
Date: Fri, 01 Apr 2016 13:42:35 -0700
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/01/2016 01:35 PM, Joao S. O. Bueno wrote:

> Well...actually, really recently (Tuesday), I did
> python3 -c "__import__('calendar').calendar(2016)"

Wouldn't have

   python3 -c "import calendar; calendar(2016)"

been clearer?  and easier to type?  :)


From breamoreboy at  Fri Apr  1 18:44:33 2016
From: breamoreboy at (Mark Lawrence)
Date: Fri, 1 Apr 2016 23:44:33 +0100
Subject: [Python-ideas] Provide __main__ for datetime and time
In-Reply-To: <>
References: <>
 <-7249072436132249471@unknownmsgid> <>
Message-ID: <ndmtkk$9cj$>

On 01/04/2016 19:02, Michel Desmoulin wrote:
> As with my previous email about __main__ for random, uuid and os, I wish
> to suggest a similar __main__ for datetime. The target audience may not
> be the same, so I'm making it a different proposal.
> E.G:
> python -m datetime now => print(str(
> python -m datetime utcnow => print(str(datetime.datetime.utcnow())
> python -m time epoch => print(time.time())
> python -m datetime now "%d/%m/%Y" =>
> print(str("%d/%m/%Y"))
> python -m datetime utcnow "%d/%m/%Y" =>
> print(str(datetime.datetime.utcnow().strftime("%d/%m/%Y"))

Not very funny for 1st April.  Would you care to have another go?

My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

From breamoreboy at  Fri Apr  1 18:47:44 2016
From: breamoreboy at (Mark Lawrence)
Date: Fri, 1 Apr 2016 23:47:44 +0100
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <ndmtqj$bvn$>

On 01/04/2016 21:42, Ethan Furman wrote:
> On 04/01/2016 01:35 PM, Joao S. O. Bueno wrote:
>> Well...actually, really recently (Tuesday), I did
>> python3 -c "__import__('calendar').calendar(2016)"
> Wouldn't have
>    python3 -c "import calendar; calendar(2016)"
> been clearer?  and easier to type?  :)
> --
> ~Ethan~

My understanding is that this kind of construction does not make you as 
much money when you are a consultant, as opposed to a mere programmer or 

My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

From jsbueno at  Fri Apr  1 19:10:51 2016
From: jsbueno at (Joao S. O. Bueno)
Date: Fri, 1 Apr 2016 20:10:51 -0300
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <ndmtqj$bvn$>
References: <>
 <> <ndmtqj$bvn$>
Message-ID: <>

On 1 April 2016 at 19:47, Mark Lawrence via Python-ideas
<python-ideas at> wrote:
> On 01/04/2016 21:42, Ethan Furman wrote:
>> On 04/01/2016 01:35 PM, Joao S. O. Bueno wrote:
>>> Well...actually, really recently (Tuesday), I did
>>> python3 -c "__import__('calendar').calendar(2016)"
>> Wouldn't have
>>    python3 -c "import calendar; calendar(2016)"
>> been clearer?  and easier to type?  :)
>> --
>> ~Ethan~

It would, but   python3 -c "import calendar; calendar.calendar(2016)"
is not so much easier to type.
The reason, however is that I usually think of  "I have the right to
use one single expression when using python -c" and not "a single

> My understanding is that this kind of construction does not make you as much
> money when you are a consultant, as opposed to a mere programmer or similar.

Or that   ^   :-)

Anyway - back to the thread - it does not seem a bad idea to me at all.

From greg.ewing at  Fri Apr  1 20:27:48 2016
From: greg.ewing at (Greg Ewing)
Date: Sat, 02 Apr 2016 13:27:48 +1300
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

Chris Angelico wrote:
> var = some_value
> var = decorator(var)
> with this:
> var = decorator(some_value)
> as in your example. Decorator syntax buys us nothing above this.

It would if it made the name of the variable available
to the decorator:

@decorator
var = value

becomes

var = decorator("var", value)
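
As a side note, for class attributes later Python versions (3.6 and up)
hand the attribute name to the object through the __set_name__ hook. The
sketch below contrasts that with the explicit spelling above; the Named
and Config classes and this toy decorator function are hypothetical.

```python
class Named:
    """Hypothetical helper that records the name it is bound to in a class."""

    def __init__(self, value):
        self.value = value
        self.name = None

    def __set_name__(self, owner, name):
        # Invoked automatically when the enclosing class is created.
        self.name = name


class Config:
    var = Named(42)


def decorator(name, value):
    """Explicit form for plain variables: the name has to be repeated."""
    return (name, value)


var = decorator("var", 42)
```

No comparable hook exists for module-level or local names, so there the
name still has to be passed by hand.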


From alexander.belopolsky at  Fri Apr  1 20:49:25 2016
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Fri, 1 Apr 2016 20:49:25 -0400
Subject: [Python-ideas] Provide __main__ for datetime and time
In-Reply-To: <>
References: <>
 <-7249072436132249471@unknownmsgid> <>
Message-ID: <>

On Fri, Apr 1, 2016 at 2:02 PM, Michel Desmoulin <desmoulinmichel at>

> As with my previous email about __main__ for random, uuid and os, I wish
> to suggest a similar __main__ for datetime. The target audience may not
> be the same, so I'm making it a different proposal.
> E.G:
> python -m datetime now => print(str(
> python -m datetime utcnow => print(str(datetime.datetime.utcnow())


In fact, I would not mind seeing a fairly complete GNU date utility [1]
reimplementation as datetime.__main__.

From stephen at  Sat Apr  2 01:09:30 2016
From: stephen at (Stephen J. Turnbull)
Date: Sat, 2 Apr 2016 14:09:30 +0900
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
 <> <ndmtqj$bvn$>
Message-ID: <>

Joao S. O. Bueno writes:

 > Anyway - back to the thread - it does not seem a bad idea to me at all.

I think it's a waste of time: an invitation to endless bikeshedding
and a slippery slope to very half-baked attempts at full-featured
utilities in the module's "__main__" section.  It won't be consistent,
as many modules already have test code in there, although I suppose
you could move that functionality to a "test" command (at the expense
of breaking existing "install and test" scripts).  Finally to the
extent that the feature itself is pretty consistently present, people
are going to demand that their favorite options be one-linable, and
that pretty much every module grow one-line-ability.

Whatever happened to "not every 3-liner need be builtin"?

The suggested scripts also seem, uh, of incredibly marginal
usefulness.  Except for calendar, but there, the desirable set of
options is huge (did you know that many Japanese like their weeks to
start with Monday, for example? and that some people care about the
Mayan calendar? or of more relevance, different base years -- in Japan
it's Year 28 of the Heisei Emperor).  Alexander was right as to the
ultimate goal (full implementations of POSIX date or GNU calendar plus
whatever extensions might seem useful).  But if you do more than one
or two of those, you're going to end up with a confusing mishmash of
CLI syntaxes, with a long script invocation: "python -m calendar ...".

That said, your earlier idea taken one step farther seems like the
best way to go: create a module oneline[1], put the functionality
in there, put it on PyPI, and you can start lobbying for stdlib
inclusion in 2018.

Or better yet, create a pysh script[2] so you can omit the "-m".  An
advantage to this is that you can create an API, a module-level
variable such as "one_line_subcommands", which would contain a dict of
name-function pairs that pysh could look for and run.  You could also
have a global directory in pysh so that pysh can override those
stubborn module maintainers who won't add your favorite one-liner to

The best idea of all, perhaps, is

# echo >> /etc/shells /usr/bin/python
# chsh -s /usr/bin/python <your login>

(Sorry, Windows users, I don't know the equivalent there.)  Or make
that ipython for even more shellshocking convenience.

[1]  Too bad it can't be called "1line".  "oneln" is a little shorter.

[2]  ISTR that name is already taken but can't think of a better one.
"py1line" maybe?

From random832 at  Sat Apr  2 01:17:29 2016
From: random832 at (Random832)
Date: Sat, 02 Apr 2016 01:17:29 -0400
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
 <> <ndmtqj$bvn$>
Message-ID: <>

On Sat, Apr 2, 2016, at 01:09, Stephen J. Turnbull wrote:
> [1]  Too bad it can't be called "1line".  "oneln" is a little shorter.

It's inconvenient to import a module starting with a digit, but not
impossible (and it works fine with -m)
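
A small sketch of what "inconvenient but not impossible" can look like,
assuming a throwaway module file named 1line.py created in a temporary
directory purely for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway module whose file name starts with a digit.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, '1line.py'), 'w') as fh:
    fh.write('ANSWER = 42\n')

sys.path.insert(0, tmpdir)

# "import 1line" is a SyntaxError, but the import machinery itself
# accepts any string as a module name.
module = importlib.import_module('1line')
answer = module.ANSWER
```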

From ncoghlan at  Sat Apr  2 01:29:19 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 2 Apr 2016 15:29:19 +1000
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 2 April 2016 at 00:36, Alexander Walters <tritium-list at> wrote:
> I said earlier that if this suggestion was made in 1991 it should have been
> accepted to make functions more consistent with classes.  I have changed my
> mind; in 1991 classes should have been corrected to always require parens.

In fact, the one change made in this area since then was to *permit*
empty parens on class definitions back in Python 2.5:

Prior to that, the empty parens were mandatory for functions and
explicitly disallowed for classes. Making them mandatory for classes
would break too much code for not enough benefit, while making them
optional for functions would introduce an additional stylistic choice
with no demonstrable benefit to code maintainability, and a clear
disadvantage in creating a new opportunity for style inconsistencies.
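
A quick sketch of the current rules. The class and function names are
placeholders.

```python
class WithParens():      # empty parens permitted on classes
    pass


class WithoutParens:     # the traditional spelling, also fine
    pass


def no_params():         # for functions the parens are mandatory
    return 'ok'


same_bases = WithParens.__bases__ == WithoutParens.__bases__

# Omitting the parens in a def is rejected by the compiler.
try:
    compile('def f: pass', '<demo>', 'exec')
    def_without_parens_ok = True
except SyntaxError:
    def_without_parens_ok = False
```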


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From k7hoven at  Sat Apr  2 02:37:39 2016
From: k7hoven at (Koos Zevenhoven)
Date: Sat, 2 Apr 2016 09:37:39 +0300
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 1, 2016 at 8:41 PM, Michel Desmoulin <desmoulinmichel at>

> It's dangerous to talk about a new feature the 1st of April, so I'll
> start by saying this one is not a joke.
> I read recently a proposal to allow md5 hashing doing python -m hashlib
> md5 filename.
> I think it's a great idea, and that can extend this kind of API to other
> parts of Python such as:
> python -m random randint 0 10 => print(random.randint(0, 10))
> python -m random urandom 10 => print(os.urandom(10))
> python -m uuid uuid4 => print(uuid.uuid4().hex)
> python -m uuid uuid3 => print(uuid.uuid3().hex)
This brings back memories :). Back when I wasn't using Python yet, I ended
up needing some stuff in a shell script that was available in Python. Then,
as I basically did not know any Python, I had to (1) figure out how to do that
in Python, and (2) how the heck to get the result cleanly out of there.
Especially step 2 may have been a waste of time, even if I did eventually
find out that I could do python -c and that I could do stuff like "import
foo; print bar(foo)". I ended up writing a tiny bash script that did the
"import foo; print " part for me so I only had to do the `bar(foo)` part.
If I had had to install a package, I probably would have just found another

How about something like

 python --me module expression_in_module_namespace

 python --me random "randint(0,10)"


 python --me module expression_assuming_module_is_imported

 python --me random "random.randint(0,10)"

Or even

 python -e "random.randint(0,10)"

which would automatically import stdlib if their names appear in the
expression.

- Koos
From random832 at  Sat Apr  2 04:22:15 2016
From: random832 at (Random832)
Date: Sat, 02 Apr 2016 04:22:15 -0400
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 2, 2016, at 02:37, Koos Zevenhoven wrote:
>  python -e "random.randint(0,10)"

#!/usr/bin/env python3
import sys
class magicdict(dict):
    def __getitem__(self, x):
        try:
            return super().__getitem__(x)
        except KeyError:
            # Unknown name: try to import a module of that name.
            try:
                mod = __import__(x)
                self[x] = mod
                return mod
            except ImportError:
                raise KeyError

g = magicdict()
for arg in sys.argv[1:]:
    # Evaluate expressions and print them; fall back to exec for statements.
    try:
        p, obj = True, eval(arg, g)
    except SyntaxError:
        p = False
        exec(arg, g)
    if p:
        print(obj)
Handling modules inside packages is left as an exercise for the reader. 

From pavol.lisy at  Sat Apr  2 05:16:08 2016
From: pavol.lisy at (Pavol Lisy)
Date: Sat, 2 Apr 2016 11:16:08 +0200
Subject: [Python-ideas] Package reputation system
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

2016-03-29 22:51 GMT+02:00, Michael Selik <mike at>:
> Back at Georgia Tech, my professor [0] once told me that the way to get rich
> is to invent an index.

2016-04-01 16:42 GMT+02:00, Alexander Walters <tritium-list at>:
> On 4/1/2016 09:07, Nick Coghlan wrote:
>> By contrast, I think Python itself covers too many domains for a
>> common rating system to be feasible - "good for education" is not the
>> same as "good for sysadmin tasks" is not the same as "good for data
>> analysis" is not the same as "good for network service development",
>> etc.
>> Cheers,
>> Nick.
> Not that this was the original proposal, but there can be such a thing
> as a universal 'bad' package, though.  So about the only thing that a
> universal package rating system can do effectively is shame developers.
> I don't think we want that.

1. Good-bad is not the only dimension.

beauty-ugly, simple-complex, flat-nested (and others from PEP 20)

trusted-untrusted is another one.

2. Art critics are not there to shame artists. Constructive criticism could
help authors too. It does not benefit only the "customers". Someone who
wants to invent an index to get rich could probably invent this job (Python
critic) too.

3. PyPI has download counts per day, week and month. If there is a
public API to the data, one could make graphs, trends, etc. and publish
them. If they are nice, they could probably be included on the
PyPI site too.

From rosuav at  Sat Apr  2 08:28:20 2016
From: rosuav at (Chris Angelico)
Date: Sat, 2 Apr 2016 23:28:20 +1100
Subject: [Python-ideas] Package reputation system
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

On Sat, Apr 2, 2016 at 8:16 PM, Pavol Lisy <pavol.lisy at> wrote:
> 2. Art critics are not there to shame artists. Constructive criticism could
> help authors too. It does not benefit only the "customers". Someone who
> wants to invent an index to get rich could probably invent this job (Python
> critic) too.

A reputation system is simply a score, though. If you are low on the
rankings, you don't automatically know why - and you can't look at the
high ranked results to figure out what to do better. That's why the
inventor of the index gets a lot of money - the consultancy fees to
help people get higher on the index will be gladly paid, because
there's no other way to learn.

> 3. PyPI has download counts per day, week and month. If there is a
> public API to the data, one could make graphs, trends, etc. and publish
> them. If they are nice, they could probably be included on the
> PyPI site too.

That's popularity, which is different again from reputation. It
suffers from several brutal limitations:

1) By definition, it cannot count *usage*. If I download something
only to find that it completely fails me, it's still a download; if I
download something and use it every day for years, it's still only one

2) It can only count downloads from the ultimate origin. If a Python
package gets included in a downstream collection such as the Debian
repositories, people who apt-get the package won't ting the stats.

3) Installations cascade into their dependencies. Does that count?
What if you 'pip freeze' and now explicitly list all those
dependencies in your requirements.txt - should they NOW count?

4) Even if you solve all of those problems, popularity begets
popularity in many ways. Two competing packages may fight for a while,
but one of them will "win", and the other languish - and it's
virtually impossible for a new competitor to get to the point where it
can be properly respected.

Choosing which thing to use based on how many other people use it is
sometimes important (the network effect - using Python rather than
Bob's Obscure Scripting Language means it's a lot easier to find
collaborators, no matter how technically superior BOSL might be), but
it's almost completely orthogonal to quality.


From tjreedy at  Sat Apr  2 15:13:24 2016
From: tjreedy at (Terry Reedy)
Date: Sat, 2 Apr 2016 15:13:24 -0400
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <ndp5kq$ogn$>

On 4/2/2016 2:37 AM, Koos Zevenhoven wrote:

>   python -e "random.randint(0,10)"
> which would automatically import stdlib if their names appear in the
> expression.

I like this much better than cluttering an arbitrary selection of modules 
with main functions with an arbitrary selection of functions whose values 
are printed.

Terry Jan Reedy

From steve at  Sat Apr  2 23:09:21 2016
From: steve at (Steven D'Aprano)
Date: Sun, 3 Apr 2016 13:09:21 +1000
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 01, 2016 at 07:44:21PM +0200, Michel Desmoulin wrote:

> Not saying I like the proposal, but you can argue against regular
> decorators the same way:
> @foo
> def bar():
>    pass
> Is just:
> bar = foo(bar)

Not quite. It's actually:

def bar():
    pass
bar = foo(bar)

which is *three* uses of the same name, two of which may be a long way 
from the first. As you say:

> But, I think the benefit for @decorator on functions is mainly because a
> function body is big, 

That is certainly part of the justification for @ syntax.

It might help to remember that decorators themselves have been possible 
in Python going all the way back to Python 1.5, at least, if not 
earlier, and the first three decorators in the standard library 
(property, classmethod and staticmethod) were added a few releases 
before decorator syntax using @. (If I remember correctly, property etc 
were added in 2.2 but @ syntax wasn't added until 2.4.)

So the decorator concept had many years to prove itself before being 
given special syntax. The two reasons for adding extra syntax were:

(1) to keep the decoration near the function declaration; and

(2) to avoid having to repeat the function name three times.
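
A minimal sketch of the two spellings side by side. The decorator foo
here is a hypothetical one that just tags the function it receives.

```python
def foo(func):
    """Hypothetical decorator: tags the function and returns it unchanged."""
    func.decorated = True
    return func


# With @ syntax the decoration sits next to the header and the
# function name is written once.
@foo
def bar():
    return 'bar'


# The older spelling: the name appears three times, and the decoration
# comes after the whole function body.
def baz():
    return 'baz'
baz = foo(baz)
```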

> and this way we can read the decorator next to the
> function signature while on a variable, this just add another way to
> call a function on a variable.

I'm afraid I don't understand what you mean by this. What does "while on 
a variable" mean here?


From steve at  Sun Apr  3 00:02:15 2016
From: steve at (Steven D'Aprano)
Date: Sun, 3 Apr 2016 14:02:15 +1000
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 01, 2016 at 08:08:54PM +0200, Matthias welp wrote:

> > So instead of
> >
> >   a = Char(length=10, value='empty')
> >
> > you want
> >
> >   @Char(length=10)
> >   a = 'empty'
> >
> > ?
> If possible, yes. So that there is a standardized way to access changing
> variables, or to put limits on the content of the variable, 

But decorating a *name binding* isn't going to do that. All it will do 
is limit the *initial value*, exactly as the function call does.

After calling 

a = Char(length=10, value='empty')

the name "a" is bound to the result of Char(...), whatever that happens 
to be. But that doesn't change the behaviour of the *name* "a", it 
only sets the value it is bound to. Nothing stops anyone from saying:

a = 42

and re-binding the name to another value which is no longer a Char.

Python has no default support for running custom code when binding 
arbitrary names to a value. To get this sort of thing to work, you are 
limited to attributes, using the descriptor protocol. I'm not going 
to explain descriptors now, you can google them, but property is 
a descriptor.

So let's imagine that Python allows the @ syntax as you request, and go 
through the cases to see what that would imply.

For local variables inside functions, or global top-level module 
variables, it doesn't give you any interesting power at all. All you 
have is an alternative syntax:

@spam
@eggs
@cheese
x = 999

is exactly the same as 

x = spam(eggs(cheese(999)))

right now. You don't even save any characters: 28 (including newlines) 
for both. But this doesn't give us anything new and exciting, since x is 
now just a regular variable that can be replaced with some other value.

So in this scenario, this sounds boring -- it gives you nothing you 
don't already have.

Inside a class, we have the power of descriptors available to us, so we 
can use them for computed attributes. (That's how property works, among 
many others.) So suppose we have:

class MyClass(object):
    @decorate
    x = 999

instance = MyClass()

Now instance.x can be a descriptor, which means that it can enforce 
type and value validation rules, or logging, or whatever amazing 
functionality you want to add. This is good.

But you can already do this. You just have to write:

class MyClass(object):
    x = decorate(999)

instead. If decorate() returns a descriptor, it returns a descriptor 
whatever syntax you use.
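
To make that concrete, here is a minimal sketch in which decorate()
returns a descriptor that logs every access. The Logged class and the
behaviour of decorate are assumptions made up for this illustration.

```python
class Logged:
    """Hypothetical descriptor that records every read and write."""

    def __init__(self, value):
        self.value = value
        self.log = []

    def __get__(self, instance, owner=None):
        if instance is None:
            return self          # class-level access returns the descriptor
        self.log.append('get')
        return self.value

    def __set__(self, instance, value):
        self.log.append('set %r' % (value,))
        self.value = value       # shared by all instances, for brevity


def decorate(value):
    return Logged(value)


class MyClass(object):
    x = decorate(999)


instance = MyClass()
before = instance.x
instance.x = 1000
after = instance.x
log = MyClass.x.log
```

All of this works with the ordinary assignment in the class body; no new
syntax is involved.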

What benefit do you gain? Well, in the function and class decorator 
case, you gain the benefit that the decoration is close to the class or 
function header, and you don't have to write the name three times:

class MyClass(object):
    def method(self):
        ...
    method = decorate(method)

But this isn't an advantage in the case of the variable:

class MyClass(object):
    x = decorate(value)

You only have to write the name once, and the decoration is not going to 
be far away. So just like the global variable case, there's no advantage 
to @decorate syntax, regardless of whether you are in a class or not. 


From steve at  Sun Apr  3 00:19:22 2016
From: steve at (Steven D'Aprano)
Date: Sun, 3 Apr 2016 14:19:22 +1000
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 01, 2016 at 07:49:41PM +0200, Matthias welp wrote:
> > > tldr
> > Check out how Django dealt with this. And SQLAlchemy
> > Do their solutions satisfy
> The thing I'm missing in those solutions is how it isn't chainable. If I
> would want something that uses access logging (like what Django and
> SQLAlchemy are doing), and some 'mixin' for that variable to prevent cycle
> protection, then that would be hard. The only other way is using function
> composition, and that can lead to statements that are too long to read
> comfortably.

I'm not sure how function composition is harder to read than decorator 

x = 999


x = logging(

is not that different. Sure, you have a few extra brackets, but you have 
fewer @ symbols. And if you're going to do that sort of thing a lot, you 
just need a couple of helper functions:

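def compose(f, g):
    # Return a new function which applies f(g(args)).
    def composed(*args, **kw):
        return f(g(*args, **kw))
    return composed

# logging, require, ensure and validate below stand for the
# (hypothetical) decorators under discussion in this thread.
def helper(use_logging=False, requires=None, ensures=None, 
           use_validate=False, use_spam=False, use_eggs=False):
    funcs = []
    if use_logging:
        funcs.append(logging)
    if requires:
        funcs.append(require(requires))
    if ensures:
        funcs.append(ensure(ensures))
    if use_validate:
        funcs.append(validate)
    # likewise for spam and eggs
    if not funcs:
        # Identity function returns whatever it is given.
        return lambda arg: arg
    else:
        f = funcs[0]
        for g in funcs[1:]:
            # Keep earlier entries outermost, as with stacked decorators.
            f = compose(f, g)
        return f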

all_validation = helper(True, int, str, True, True, True)
x = all_validation(999)
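
A self-contained sketch of the same composition idea, using
functools.reduce. The validators require_int and double are hypothetical
stand-ins, chosen only so the example runs on its own.

```python
from functools import reduce


def compose(*funcs):
    """compose(f, g, h)(x) == f(g(h(x))); with no functions, the identity."""
    def composed(arg):
        # Apply the functions right to left, innermost first.
        return reduce(lambda acc, func: func(acc), reversed(funcs), arg)
    return composed


def require_int(value):
    # Stand-in validator: reject anything that is not an int.
    if not isinstance(value, int):
        raise TypeError('int required')
    return value


def double(value):
    return value * 2


all_validation = compose(str, double, require_int)
x = all_validation(999)
```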


From steve at  Sun Apr  3 00:22:25 2016
From: steve at (Steven D'Aprano)
Date: Sun, 3 Apr 2016 14:22:25 +1000
Subject: [Python-ideas] Fwd: Make parenthesis optional in parameterless
 functions definitions
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Fri, Apr 01, 2016 at 07:47:42PM +0200, Michel Desmoulin wrote:

> >> functional paradigme and Poo and immutability and mutability, etc.
> > 
> > "Paradigm". 
> > 
> > You might not be aware that "poo" is an English euphemism for excrement, 
> > normally used for and by children. So I'm completely confused by what 
> > you mean by "and Poo".
> > 
> Those are both french mistake.
> paradigme has a "e" in french while OOP is POO. Wrote the email too fast.

Today I learned something new! Thank you.

> > You might not agree with my personal experience, but you shouldn't just 
> > dismiss it or misrepresent it as a "moral stand". My argument cuts right 
> > to the core of the argument that making parens optional helps -- my 
> > experience is that it *doesn't help*, it actually HURTS.
> But as I said earlier, I now agree with you.

I had missed that post. Thanks for the comments.


From ethan at  Sun Apr  3 01:47:26 2016
From: ethan at (Ethan Furman)
Date: Sat, 02 Apr 2016 22:47:26 -0700
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/02/2016 09:02 PM, Steven D'Aprano wrote:
> On Fri, Apr 01, 2016 at 08:08:54PM +0200, Matthias welp wrote:
>>> So instead of
>>>    a = Char(length=10, value='empty')
>>> you want
>>>    @Char(length=10)
>>>    a = 'empty'
>>> ?
>> If possible, yes. So that there is a standardized way to access changing
>> variables, or to put limits on the content of the variable,

Steven, the above was a short cut of a name-binding inside a class 

Sorry for the confusion.


From boekewurm at  Sun Apr  3 02:05:36 2016
From: boekewurm at (Matthias welp)
Date: Sun, 3 Apr 2016 08:05:36 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

> But that doesn't change the behaviour of the *name* "a", it
> only sets the value it is bound to. Nothing stops anyone from saying:
> a = 42
> and re-binding the name to another value which is no longer a Char.

> Python has no default support for running custom code when binding
> arbitrary names to a value. To get this sort of thing to work, you are
> limited to attributes, using the descriptor protocol.

This is exactly my point. When they implemented descriptors (2.2) they did
that to add to the new-style class system. I think they did a great job,
but namespaces were overlooked in my opinion. Why should we be limited to
class attributes when using descriptors? Any namespace which contains a
'.__dict__' content mapping should be able to hold descriptors in my
opinion. If this has been discussed before, then please link me to the
relevant discussion, I'd love to read the points made.
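
A minimal sketch of the limitation being described, assuming a made-up descriptor class called Logged: the descriptor protocol is triggered only when the object lives in a class __dict__, not in an instance __dict__ or as a plain module-level name.

```python
class Logged:
    """Minimal data descriptor that counts every read of the attribute."""
    def __init__(self, value):
        self.value = value
        self.reads = 0

    def __get__(self, instance, owner=None):
        self.reads += 1
        return self.value

    def __set__(self, instance, value):
        self.value = value


class Config:
    # Stored in the class __dict__: attribute lookup invokes __get__.
    limit = Logged(10)


cfg = Config()
via_class = cfg.limit              # goes through Logged.__get__

# Stored in an instance __dict__: no descriptor call happens.
cfg.__dict__["other"] = Logged(20)
via_instance = cfg.other           # the Logged object itself

# A plain module-level name behaves the same way: reading it is just a lookup.
plain = Logged(30)
```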

> But this isn't an advantage in the case of the variable

Assigning variables from function results is fairly common. When chaining
function calls to get the desired behaviour of the variable, it will get
confusing which part is the 'value' part, and which part is the 'behaviour'
part, apart from namings:

var = logging(

would get more clear if you wrote it this way:

var = factorize(4)

Something I just thought about: currently you can use the property() call
to make an attribute have descriptor properties. This may be somewhat
controversial, but maybe limit this to decorators only? While keeping the
'.__dict__' override method open, the way things work currently won't
change that much, but it will make the assignment of attributes or
variables with descriptor properties a lot more intuitive, as you do not
set a *name* to a variable and then can undo that just a while later:

var = property(setter, getter, deleter, docs)
var = 20

currently changes behaviour depending on what kind of scope it is located
in (class description, any other scope), while decorators (for functions at
least) work in every scope I can think of. I think that is strange, and
that it should just be the same everywhere. Using decorators here could be
a very nice and interesting choice.
From rosuav at  Sun Apr  3 02:32:56 2016
From: rosuav at (Chris Angelico)
Date: Sun, 3 Apr 2016 16:32:56 +1000
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Apr 3, 2016 at 4:05 PM, Matthias welp <boekewurm at> wrote:
> While keeping the '.__dict__' override method open, the way things work
> currently won't change that much, but it will make the assignment of
> attributes or variables with descriptor properties a lot more intuitive, as
> you do not set a *name* to a variable and then can undo that just a while
> later:
> var = property(setter, getter, deleter, docs)
> var = 20
> currently changes behaviour depending on what kind of scope it is located in
> (class description, any other scope), while decorators (for functions at
> least) work in every scope I can think of. I think that is strange, and that
> it should just be the same everywhere.

Can you explain - or, preferably, demonstrate - the difference you're
talking about here?


From boekewurm at  Sun Apr  3 02:53:10 2016
From: boekewurm at (Matthias welp)
Date: Sun, 3 Apr 2016 08:53:10 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

> > var = property(setter, getter, deleter, docs)
> > var = 20
> Can you explain - or, preferably, demonstrate - the difference you're
> talking about here?

Sorry, that was untested code. My expectations of class definitions were
wrong, as it does not actually change behaviour inside its own scope. I
thought that when you are defining a class, that when you assign a property
value to an attribute, that the attribute 'name value' will directly change
its behaviour to include the descriptor properties of the property object
assigned. My mistake.

On 3 April 2016 at 08:32, Chris Angelico <rosuav at> wrote:

> On Sun, Apr 3, 2016 at 4:05 PM, Matthias welp <boekewurm at> wrote:
> > While keeping the '.__dict__' override method open, the way things work
> > currently won't change that much, but it will make the assignment of
> > attributes or variables with descriptor properties a lot more intuitive,
> as
> > you do not set a *name* to a variable and then can undo that just a while
> > later:
> >
> > var = property(setter, getter, deleter, docs)
> > var = 20
> >
> > currently changes behaviour depending on what kind of scope it is
> located in
> > (class description, any other scope), while decorators (for functions at
> > least) work in every scope I can think of. I think that is strange, and
> that
> > it should just be the same everywhere.
> Can you explain - or, preferably, demonstrate - the difference you're
> talking about here?
From rosuav at  Sun Apr  3 03:20:13 2016
From: rosuav at (Chris Angelico)
Date: Sun, 3 Apr 2016 17:20:13 +1000
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Apr 3, 2016 at 4:53 PM, Matthias welp <boekewurm at> wrote:
>> > var = property(setter, getter, deleter, docs)
>> > var = 20
>> Can you explain - or, preferably, demonstrate - the difference you're
>> talking about here?
> Sorry, that was untested code. My expectations of class definitions were
> wrong, as it does not actually change behaviour inside its own scope. I
> thought that when you are defining a class, that when you assign a property
> value to an attribute, that the attribute 'name value' will directly change
> its behaviour to include the descriptor properties of the property object
> assigned. My mistake.

Ah. There is a significant difference between assignment within a
class definition and assignment from a function _inside_ that class
definition, but in any given scope, double assignment always does the
same thing: last one wins. Which is a good thing, when it comes to the
@property decorator:

class LifeAndUniverse:
    @property
    def answer(self):
        return 42
    @answer.setter
    def answer(self, value):
        print("No fair changing the answer!")
    @answer.deleter
    def answer(self):
        print("You just deleted.... everything.")

Each function definition overwrites the previous "answer" with a new
one, which (thanks to the way setter and deleter are implemented)
incorporates the previous code, but nothing in Python mandates that.
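
To make the "incorporates the previous code" part concrete, here is a small sketch without decorator syntax. The function names are made up for the illustration; the point is that setter() leaves the original property alone and returns a new one.

```python
def get_answer(self):
    return 42

def set_answer(self, value):
    raise AttributeError("No fair changing the answer!")

read_only = property(get_answer)
# setter() does not modify read_only; it returns a new property object
# that carries over the getter and adds the setter.
read_write = read_only.setter(set_answer)

class LifeAndUniverse:
    # Last assignment wins: this is the object the class ends up with.
    answer = read_write
```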

So is there anything left of the assignment-decorator proposal, or is
it completely withdrawn? (I always like to read over even the bad
proposals - there's often something good in them, Martin Farquhar
Tupper's "Proverbial Philosophy" aside.)


From boekewurm at  Sun Apr  3 04:02:55 2016
From: boekewurm at (Matthias welp)
Date: Sun, 3 Apr 2016 10:02:55 +0200
Subject: [Python-ideas] Decorators for variables
In-Reply-To: <>
References: <>
Message-ID: <>

> So is there anything left of the assignment-decorator proposal, or is
> it completely withdrawn?

I think this sums the current open ends up:

- Namespace variables decoration was dismissed by one of Steven's posts,
but is actually something that might be wanted (via being able to put
descriptors into namespaces that have a __dict__ accessor (e.g. modules))
- Variable decoration can be more clear about descriptor/value difference
at assignment
- Giving property objects access to their variable's name (e.g. via
__name__) like functions have would open up quite a bit of possibilities,
and would mean decorators would get quite a bit more power than what they

Something that I had said earlier, but what went on a sidepath
- Decorators may directly *deep* set the behaviour of the variable, and
with it set the further behaviour of the variable (in the same scope). Such

    var = 20

    var = 40

    will not reset var to 40, but the var = 40 goes through the descriptor
(if applied).
From stephen at  Sun Apr  3 05:24:22 2016
From: stephen at (Stephen J. Turnbull)
Date: Sun, 3 Apr 2016 18:24:22 +0900
Subject: [Python-ideas] Decorators for variables
Steven D'Aprano writes:

 > > and this way we can read the decorator next to the function
 > > signature while on a variable, this just add another way to call
 > > a function on a variable.
 > I'm afraid I don't understand what you mean by this. What does
 > "while on a variable" mean here?

Missing punctuation (see substring in square brackets):

   and this way we can read the decorator next to the function
   signature[.  W]hile on a variable, this just add another way to
   call a function on a variable.

So, it appears that he's proposing pure syntactic sugar with no
semantic change at all.  The syntactic effects are that decorator
syntax separates function calls from the values they act on (-1 on
that), and makes it appear that in Python a function can change the
value of its formal argument (-1 on that, too).

The latter characteristic may need some expansion.  Of course Python
has mutable objects which can be changed "inside" a function and that
effect propagates to "outside" the function.  But in the containing
scope, while the object has mutated, the variable (more precisely, the
name-object binding) has not.
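
A short sketch of that distinction, using two throwaway functions: mutation inside a function is visible to the caller, while rebinding the formal argument is not.

```python
def mutate(items):
    items.append(4)        # changes the object the caller also sees

def rebind(items):
    items = [0]            # rebinds only the local name
    return items

data = [1, 2, 3]
original_id = id(data)
mutate(data)
rebind(data)
# data is still the same object; only its contents changed.
```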

It seems to me that everything he's asked for can be accomplished by
defining a descriptor class and assigning an instance of that class.
I don't yet understand why that isn't good enough, unless it's just a
matter of taste about the sweetness of syntax.

On Sun, Apr 3, 2016 at 6:02 PM, Matthias welp <boekewurm at> wrote:
>> So is there anything left of the assignment-decorator proposal, or is
>> it completely withdrawn?
> I think this sums the current open ends up:
> - Namespace variables decoration was dismissed by one of Steven's posts, but
> is actually something that might be wanted (via being able to put
> descriptors into namespaces that have a __dict__ accessor (e.g. modules))
> - Variable decoration can be more clear about descriptor/value difference at
> assignment
> - Giving property objects access to their variable's name (e.g. via
> __name__) like functions have would open up quite a bit of possibilities,
> and would mean decorators would get quite a bit more power than what they
> have.
> Something that I had said earlier, but what went on a sidepath
> - Decorators may directly *deep* set the behaviour of the variable, and with
> it set the further behaviour of the variable (in the same scope). Such that
>     @decorator
>     var = 20
>     var = 40
>     will not reset var to 40, but the var = 40 goes through the descriptor
> (if applied).

None of this is possible with decorators *per se*, as they simply
mutate an object as it goes past. The magic you're seeing (and yearning
for) comes from two places: functions have names (so the bit in "def
SOMETHING(...)" actually affects the object constructed, not just the
name it's bound to), and attribute access and the descriptor protocol.
The latter is what makes @property work, but you don't need decorators
to use descriptor protocol - in fact, it's crucial to the way bound
methods are created.
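
As a rough illustration of that last point (the Greeter class is invented for the example), a bound method is what a plain function's __get__ returns:

```python
class Greeter:
    def hello(self):
        return "hello from %s" % type(self).__name__

g = Greeter()
plain_function = Greeter.__dict__["hello"]      # the raw function object
bound = plain_function.__get__(g, Greeter)      # what g.hello does for you
```

No decorator is involved anywhere; functions simply implement the descriptor protocol.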

So, let's take your proposals one by one.

> - Namespace variables decoration was dismissed by one of Steven's posts, but
> is actually something that might be wanted (via being able to put
> descriptors into namespaces that have a __dict__ accessor (e.g. modules))

You'll need to elaborate on exactly what this can and can't do.

> - Variable decoration can be more clear about descriptor/value difference at
> assignment

All Python names are in namespaces, so I'm not sure how this is
different from the above.

> - Giving property objects access to their variable's name (e.g. via
> __name__) like functions have would open up quite a bit of possibilities,
> and would mean decorators would get quite a bit more power than what they
> have.

This is fundamentally not possible in the general case; however, you
could easily make a special form of the @property decorator which
*requires* that the attribute name be the same as the function name
that implements it, and then it would work (because function names are
attributes of function objects, while "the name this is bound to" is
not an attribute of anything). It's already accessible, as
ClassName.property_name.fget.__name__; you could make this more
accessible or more discoverable by subclassing property. In fact, you
can even go meta:

>>> class property(property):
...     @property
...     def name(self): return self.fget.__name__
>>> class X:
...     @property
...     def foo(self): return 42

However, this is vulnerable to the same thing that any other "what is
my name" system has: you can rebind anything.

>>> =
>>> = 42
>>> X().foo
>>> X().bar
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'name'

So there's fundamentally no way to say "what name is bound to me"; but
you can quite effectively say "what is my canonical name", depending
on a function to provide that name.
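
A self-contained sketch of the "canonical name" idea, using a hypothetical subclass called NamedProperty rather than shadowing the builtin. Binding the same property to a second name does not change what it reports:

```python
class NamedProperty(property):
    """Hypothetical property subclass exposing the getter's name."""
    @property
    def name(self):
        return self.fget.__name__

class X:
    @NamedProperty
    def foo(self):
        return 42

# A second name for the same property object.
X.bar = X.foo
```

X.bar.name still reports the name of the getter function, because that is the only name the object itself knows about.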


On Sun, Apr 03, 2016 at 10:02:55AM +0200, Matthias welp wrote:
> > So is there anything left of the assignment-decorator proposal, or is
> > it completely withdrawn?
> I think this sums the current open ends up:
> - Namespace variables decoration was dismissed by one of Steven's posts,
> but is actually something that might be wanted (via being able to put
> descriptors into namespaces that have a __dict__ accessor (e.g. modules))

It's not that I dismissed the idea, but that such a thing is not 
possible today.

Maybe you could hack up something tricky by replacing the module 
__dict__ with a subclass, or replacing the entire module object with 
some custom class instance, but that's messy and fragile, and I'm not 
sure that it would work for (say) globals inside functions.
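
A sketch of the "custom module class" hack, under the assumption that a types.ModuleType subclass is acceptable as the module object. The class and attribute names are invented for the illustration:

```python
import types

class ModuleWithProperties(types.ModuleType):
    """Hypothetical module subclass; the property lives on the class."""
    @property
    def answer(self):
        return self.__dict__["_answer"] * 2

mod = ModuleWithProperties("demo")
mod._answer = 21
via_attribute = mod.answer      # attribute access runs the property

# Code executed with the module's __dict__ as its globals looks names up
# in that dict directly, so it never sees the property.
exec("seen = 'answer' in globals()", mod.__dict__)
```

Attribute access from outside works, but global name lookup inside the module bypasses the class entirely, which is the fragility mentioned above.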

I think the idea of "namespace descriptors" is promising. But it 
probably needs a PEP and a significant amount of work to determine what 
actually is possible now and what needs support from the compiler.

> - Variable decoration can be more clear about descriptor/value difference
> at assignment

I'm not sure what you mean by that.

> - Giving property objects access to their variable's name (e.g. via
> __name__) like functions have would open up quite a bit of possibilities,

Such as what?

> and would mean decorators would get quite a bit more power than what they
> have.

What extra power are you thinking of?

> Something that I had said earlier, but what went on a sidepath
> - Decorators may directly *deep* set the behaviour of the variable, and
> with it set the further behaviour of the variable (in the same scope). Such
> that
>     @decorator
>     var = 20
>     var = 40
>     will not reset var to 40, but the var = 40 goes through the descriptor
> (if applied).

This functionality has nothing to do with decorators. If it were 
possible, if namespaces (other than classes) supported descriptors, then 
the decorator syntax doesn't add anything to this. You could write:

var = magic(20)

I'm not calling it "decorator" because the whole @ decorator syntax is 
irrelevant to this.

With respect Matthias, I think you are conflating and mixing up the 
power of descriptors as applied to classes, and the separate and 
distinct powers of decorators. You don't need one to have the other.


On Saturday, April 02, 2016 23:33, Chris Angelico wrote:
> On Sun, Apr 3, 2016 at 4:05 PM, Matthias welp <boekewurm at> wrote:
> > currently changes behaviour depending on what kind of scope it is 
> > located in (class description, any other scope), while decorators (for 
> > functions at
> > least) work in every scope I can think of. I think that is strange, 
> > and that it should just be the same everywhere.
> Can you explain - or, preferably, demonstrate - the difference you're talking about here?

I can sort of see it, like this?

class C():
    def __init__(self):
        self.x = 99
    def f(self):
        return self.x

x = 101
def f(namespace):
    return namespace.x

c = C()


Here's the inconsistency, should implicitly bind to a "global namespace object".
<property object at 0x00000000023FB6D8>

In other words, something like this:

class GNS:
    def __getattr__(self, name):
        prop = globals().get(name)
        if isinstance(prop, property):
            return prop.fget(self)
        return prop
global_namespace = GNS()

print(x is global_namespace.x, "should be 'True'")
True should be 'True'


> [1] is a project from several
> years ago aimed at that task.

Thanks for the link -- I'll check it out.

- Chris

> FYI On Linux, you can use "cat /proc/sys/kernel/random/uuid" ;-)

uuidgen -r

On Sat, Apr 2, 2016 at 1:49 AM Alexander Belopolsky <
alexander.belopolsky at> wrote:

> On Fri, Apr 1, 2016 at 2:02 PM, Michel Desmoulin <
> desmoulinmichel at> wrote:
>> As with my previous email about __main__ for random, uuid and os, I wish
>> to suggest a similar __main__ for datetime. The target audience may not
>> be the same, so I'm making it a different proposal.
>> E.G:
>> python -m datetime now => print(str(datetime.datetime.now()))
>> python -m datetime utcnow => print(str(datetime.datetime.utcnow()))
> +1
> In fact, I would not mind seeing a fairly complete GNU date utility [1]
> reimplementation as datetime.__main__.
> [1]

Seems like Microsoft is solving this problem for you.
Let's start with a quick quiz: what is the result of each of the following
(on Python3.5.x)?

    {}.keys() | set()  # 1
    set() | []  # 2
    {}.keys() | []  # 3
    set().union(set())  # 4
    set().union([])  # 5
    {}.keys().union(set())  # 6

If your answer was set([]), TypeError, set([]), set([]), set([]),
AttributeError, then you were correct. That, to me, is incredibly
unintuitive.

Next up:

    {}.keys() == {}.keys()  # 7
    {}.items() == {}.items()  # 8
    {}.values() == {}.values()  # 9
    d = {}; d.values() == d.values()  # 10

True, True, False, False.

Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up
for debate.[1]
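
For anyone who wants to check the quiz rather than trust it, here is a small script that collects all ten results. It is a sketch that assumes CPython 3.5 or later; the helper name outcome is made up.

```python
def outcome(thunk):
    """Return the result of thunk(), or the exception class name if it raises."""
    try:
        return thunk()
    except Exception as exc:
        return type(exc).__name__

d = {}
results = [
    outcome(lambda: {}.keys() | set()),           # 1
    outcome(lambda: set() | []),                  # 2
    outcome(lambda: {}.keys() | []),              # 3
    outcome(lambda: set().union(set())),          # 4
    outcome(lambda: set().union([])),             # 5
    outcome(lambda: {}.keys().union(set())),      # 6
    outcome(lambda: {}.keys() == {}.keys()),      # 7
    outcome(lambda: {}.items() == {}.items()),    # 8
    outcome(lambda: {}.values() == {}.values()),  # 9
    outcome(lambda: d.values() == d.values()),    # 10
]
```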

First things first: the behavior exhibited by #3 is a bug (or at least it
probably should be), and one I'd be happy to fix. However, before doing that
I felt it would be good to pose some questions and suggestions for how
that might be done.

There are, as far as I can tell, two reasons to use a MappingView, memory
efficiency or auto-updating (a view will continue to mirror changes in the
underlying object). I'll focus on the first because the second can
conceivably be solved in other ways.

Currently, if I want to union a dictionary's keys with a non-set iterable,
I can do `{}.keys() | []`. This is a bug[2]: the set `|` operator should
only work on another set. That said, fixing this bug removes the ability to
efficiently union keys with another iterable, leaving
`set({}.keys()).update([])` for efficiency, or `set({}.keys()).union([])`
for clarity.

Fixing this is simply a matter of adding a `.union` method to the KeysView
(and possibly to the Set ABC as well), although that may not be something
that is wanted. The issue, as far as I can tell, is whether we want people
converting from MappingViews to "primitives" as soon as possible, or if we
want to encourage people to use the views for as long as possible.

There are arguments on both sides: encouraging people to use the views
causes these objects to become more complex, introducing more surface area
for bugs, more code to be maintained, etc. Further, there is
currently one obvious way to do things, convert to a primitive, if you're
doing any complex action. On the other hand, making MappingViews more like
the primitives they represent has positives for performance, simplifies
user code, and would probably make testing these objects easier, since many
tests could be stolen from

My opinion is that the operators on MappingViews should be no more
permissive than their primitive counterparts. A KeysView is in various ways
more restrictive than a set, so having it be also occasionally less
restrictive is surprising and in my opinion bad. This causes the loss of an
efficient way to union a dict's keys with a list (among other methods). I'd
then add .union, .intersection, etc. to remedy this.
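
To give a feel for what a `.union` on a view could look like, here is a rough sketch built on collections.abc.KeysView. The subclass name UnionKeysView is hypothetical, and this is not how the builtin dict view is implemented:

```python
from collections.abc import KeysView

class UnionKeysView(KeysView):
    """Hypothetical view adding a set-style union() that accepts any iterable."""
    def union(self, *others):
        result = set(self)          # one pass over the keys
        result.update(*others)      # then fold in each iterable
        return result

d = {"a": 1, "b": 2}
view = UnionKeysView(d)
merged = view.union(["b", "c"], ("d",))

# The view still mirrors later changes to the dict.
d["z"] = 0
```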

This solution would bring the existing objects more in line with their
primitive counterparts, while still allowing efficient actions on large

In short:

 - Is #3 intended behavior?
 - Should it (and the others) be?
  - As a related aside, should .union and other frozen ops be added to the
Set interface?
 - If so, should the fix solely be a bugfix, should I do what I proposed,
or something else entirely?
 - More generally, should there be a guiding principle when it comes to
MappingViews and similar special case objects?

[1]: There's some good conversation in this prior thread on this issue
The consensus seemed to be that making ValuesViews comparable by value is
technically infeasible (O(n^2) worst case), while making it comparable
based on the underlying dictionary is a possibility. This would be for
OrderedDict, although many of the same arguments apply for a normal

[2]: Well, it probably should be a bug, it's explicitly tested for (,
whereas sets are explicitly tested for the opposite functionality (

Thanks, I'm looking forward to the feedback,
On Wed, Apr 6, 2016, at 05:38, Joshua Morton wrote:
>     {}.keys() == {}.keys()  # 7
>     {}.items() == {}.items()  # 8
>     {}.values() == {}.values()  # 9
>     d = {}; d.values() == d.values()  # 10
> True, True, False, False.
> Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up
> for debate.[1]

Last time this came up, the conclusion was that making values views
comparable was intractable due to the fact that they're unordered but
the values themselves aren't hashable. Then the discussion got
sidetracked into a discussion of whether the justification for not
having them be hashable (Java does just fine with everything being
hashable and content-based hashes for mutable objects) makes sense in a
"consenting-adults" world.
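
To illustrate why this is considered expensive, here is a sketch of comparing two collections of unhashable values without relying on order. The function name values_equal is made up; each remove() is a linear scan, so the whole comparison is quadratic in the worst case.

```python
def values_equal(a, b):
    """Compare two collections of possibly unhashable values as multisets."""
    remaining = list(b)
    for item in a:
        try:
            remaining.remove(item)    # linear scan, so O(n^2) overall
        except ValueError:
            return False
    return not remaining

same = values_equal({"x": [1], "y": [2]}.values(),
                    {"p": [2], "q": [1]}.values())
```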

Something has changed then, because in Python 2.7.10 I get
     ##1,3     TypeError
     ##9,10   True
I can't see anything wrong with the actual behaviour in any of the examples.
Rob Cliffe

On 06/04/2016 10:38, Joshua Morton wrote:
> Let's start with a quick quiz: what is the result of each of the 
> following (on Python3.5.x)?
>     {}.keys() | set()  # 1
>     set() | []  # 2
>     {}.keys() | []  # 3
>     set().union(set())  # 4
>     set().union([])  # 5
>     {}.keys().union(set())  # 6
> If your answer was set([]), TypeError, set([]), set([]), set([]), 
> AttributeError, then you were correct. That, to me, is incredibly 
> unintuitive.
> Next up:
>     {}.keys() == {}.keys()  # 7
>     {}.items() == {}.items()  # 8
>     {}.values() == {}.values()  # 9
>     d = {}; d.values() == d.values()  # 10
> True, True, False, False.
> Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is 
> up for debate.[1]

On Wed, Apr 6, 2016 at 11:03 PM, Rob Cliffe <rob.cliffe at> wrote:
> Something has changed then, because in Python 2.7.10 I get
>     ##1,3     TypeError
>     ##9,10   True
> I can't see anything wrong with the actual behaviour in any of the examples.

Python 2.7 has very different handling of .keys() and .values() on
dictionaries - they return lists instead of views. You can ignore 2.7
for the purposes of this discussion - it's not "set-like" in the way
that's being considered here.
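
A quick sketch of the difference in Python 3 terms: the view tracks the dictionary and supports set operations, while a list copy (which is roughly what 2.7's .keys() gives you) does neither.

```python
d = {"a": 1}
keys_view = d.keys()       # live, set-like view in Python 3
snapshot = list(d)         # independent copy, like Python 2's d.keys()
d["b"] = 2

combined = keys_view | {"c"}   # set operations work on the view
```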


On 24.03.2016 at 18:44, Sven R. Kunze wrote:
> On 24.03.2016 14:54, Thomas Güttler wrote:
>> for item in my_iterator:
>>     # do per item
>> on empty:
>>    # this code gets executed if iterator was empty
>> on break:
>>    # this code gets executed if the iteration was left by a "break"
>> on notempty:
>>    # ...
> Hmm, interesting. "on" would indeed be a distinguishing keyword to "except". So, "except" handles exceptions and "on"
> handles internal control flow (the iter protocol). Nice idea actually.

I thank you, because you had the idea to extend the loop syntax :-)

But I guess it is impossible to get "on empty" implemented.

   Thomas Güttler

Thomas Guettler

From desmoulinmichel at  Wed Apr  6 10:49:05 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Wed, 6 Apr 2016 16:49:05 +0200
Subject: [Python-ideas] Control Flow - Never Executed Loop Body
In-Reply-To: <>
References: <> <>
 <> <>
Message-ID: <>

What about:

>>> for item in my_iterable as iterator:
>>>     # do per item
>>> if iterator.empty: # empty
>>>    # this code gets executed if iterator was empty
>>> else: # not empty

The empty marker would be set to True by the for loop and set to False
once the loop body runs. The perf cost of this is only added if you
explicitly require it with the "as" keyword.

You don't handle break, which is handled by else anyway.
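
For reference, the "was it empty?" check can already be written today without new syntax. A minimal sketch, with a made-up helper name and callback parameters:

```python
_MISSING = object()

def loop_or_empty(iterable, body, on_empty):
    """Call body(item) for each item, or on_empty() if there were none."""
    it = iter(iterable)
    first = next(it, _MISSING)      # peek at the first item, if any
    if first is _MISSING:
        on_empty()
        return
    body(first)
    for item in it:
        body(item)

seen = []
loop_or_empty(iter([]), seen.append, lambda: seen.append("empty"))
```

This works for one-shot iterators as well as containers, which is the case where a plain "if not my_iterable" test fails.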

And maybe having access to the iterator could open the door to more

On 06/04/2016 at 16:16, Thomas Güttler wrote:
> On 24.03.2016 at 18:44, Sven R. Kunze wrote:
>> On 24.03.2016 14:54, Thomas Güttler wrote:
>>> for item in my_iterator:
>>>     # do per item
>>> on empty:
>>>    # this code gets executed if iterator was empty
>>> on break:
>>>    # this code gets executed if the iteration was left by a "break"
>>> on notempty:
>>>    # ...
>> Hmm, interesting. "on" would indeed be a distinguishing keyword to
>> "except". So, "except" handles exceptions and "on"
>> handles internal control flow (the iter protocol). Nice idea actually.
> I thank you, because you had the idea to extend the loop syntax :-)
> But I guess it is impossible to get "on empty" implemented.
> Regards,
>   Thomas Güttler

Hi all,

Recently, I have spent quite a lot of time thinking about what should
be done to improve the situation of pathlib. Last week, after all
kinds of wild ideas and trying hard every night to prove to myself
that duck-typing-like compatibility with other path-representing
objects is sufficient, I noticed that I had failed to prove that and
that I had gone around a full circle back to where I started: "why
can't paths subclass ``str``?"

In the "Working with Path objects: p-strings?" thread, I said I was
working on a proposal. Since it's been several days already, I think I
should post it here and get some feedback before going any further.
Maybe I should have done that even earlier. Anyway, there are some
rough edges, and I will need to add links to references etc.

So, do not hesitate to give feedback or criticism, which is especially
appreciated if you take the time to read through the whole thing first.

You can read a github-rendered version here:

And the raw rst is here below:

Proposal: Making path objects inherit from ``str``.


This proposal addresses issues that limit the usability of the
object-oriented filesystem path objects, most notably those of
``pathlib``, introduced in the standard library in Python 3.4 via PEP
0428. One issue being that the path objects are not directly compatible
with nearly any libraries, including the standard library. A further
goal of this proposal is to provide a smooth transition into a Python
with better Path handling, while keeping backwards compatibility concerns
to a minimum. The approach involves making the path classes in
``pathlib`` (and optionally also DirEntry) subclasses of ``str``, but
takes further measures to avoid problems and unnecessary additions.


Filesystem paths are strings that give instructions for traversing a
directory tree. In Python, they have traditionally been represented as
byte strings, and more recently, Unicode strings. However, Python now has
``pathlib`` in the standard library, which is an object-oriented library
for dealing with objects specialized in representing a path and working
with it. In this proposal, such objects are generally referred to as
*path objects*, or sometimes, in the specific context of instances of
the ``pathlib`` path classes, they are explicitly referred to as
``pathlib`` objects.

In ``pathlib`` there is a hierarchy of path classes with a common base
class ``PurePath``. It has a subclass ``Path`` which essentially assumes
the path is intended to represent a path on the current system. However,
both of these classes, when called, instantiate a subclass of the
``Windows`` or ``Posix`` flavor, which have slightly different behavior.
In total, there are thus six public classes: ``PurePath``,
``PurePosixPath``, ``PureWindowsPath``, ``Path``, ``PosixPath`` and
``WindowsPath``.

Since Python 3.5 and the introduction of ``os.scandir``, the family of
path classes has a new member, ``DirEntry``, which is a
performance-oriented path object with significant duck-typing
compatibility with ``pathlib`` objects.
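
As a small illustration of the hierarchy described above (a sketch, not part of the proposal itself), instantiating the generic classes yields the flavor-specific subclass for the current platform, and none of them is currently a ``str``:

```python
import pathlib

p = pathlib.Path("some") / "file.txt"
pure = pathlib.PurePath("some/file.txt")

concrete_flavors = (pathlib.PosixPath, pathlib.WindowsPath)
pure_flavors = (pathlib.PurePosixPath, pathlib.PureWindowsPath)

# Explicit conversion is needed wherever a plain string is expected.
as_text = str(p)
```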

The adoption of the different types of path objects is still quite low,
which is perhaps unsurprising, because they were only introduced very
recently. However, it can also be inconvenient to work with these
objects, because they usually need to be explicitly converted into
strings before passing them to functions, and path strings returned by
functions need to be explicitly converted into path objects. Especially
the latter issue is difficult in terms of backwards compatibility of
APIs. While many things were recently discussed on Python ideas
regarding the future of path-like objects, this proposal has a much more
limited scope, to provide first steps in the right direction. However,
the last part of this proposal considers possible future directions that
this may optionally lead to.


Filesystem paths (or comparable things like URIs) are strings of
characters that represent information needed to access a file or
directory (or other resource). In other words, they form a subset of
strings, involving specialized functionality such as joining absolute
and relative paths together, accessing different parts of the path or
file name, and even accessing the resources the path points to. In
Python terms, for a path ``path``, one would have
``isinstance(path, str)``. It is also clear that not all strings are

On the one hand, this would make an ideal case for making all
path-representing objects inherit from ``str``; while Python tries not
to over-emphasize object-oriented programming and inheritance, it should
not try to avoid class hierarchies when they are appropriate in terms of
both purity and practicality. Regarding practicality, making specialized
*path objects* also instances of ``str`` would make almost any stdlib or
third-party function accept path objects as path arguments, assuming
that they accept any instance of ``str``. Furthermore, functions now
returning instances of ``str`` to represent paths could in future
versions return path objects, with only minor backwards-incompatibility
worries.

On the other hand, strings are a very general concept, and the Python
``str`` class provides a large variety of methods to manipulate and work
with them, including ``.split()``, ``.find()``, ``.isnumeric()`` and
``.join()``. These operations may be defined just as well for a string
that represents a path as for any other string. In fact, this is the
status quo in Python, as the adoption of ``pathlib`` is still quite
limited and paths are in most cases represented as strings (sometimes
byte strings). But while the string operations are *defined* on
path-representing strings, the results of these operations may not be of
any use in most cases, even if in some cases, they may be.

While it is not the responsibility of the programming language to
prevent doing things that are not useful, it may be practical in some
cases. For instance, the string method ``.find()`` could be mistaken for
a way to find files on the file system, whereas it in fact searches for a
substring. String concatenation, in turn, can be a perfectly reasonable
thing to do: ``show_msg("Data saved to file: " + file_path)``. The
result of the concatenation of a string and a path is not a path, but a
general string. Directly concatenating two path objects together as
strings, however, likely has no sensible use cases.

There is prior art in subclassing the Python ``str`` type to build a
path object type. Packages on PyPI (TODO: list more?) that do this
include ```` and ``antipathy``. The latter also supports
``bytes``-based paths by instantiating a different class, a subclass of
``bytes``. Since these libraries have existed for several years,
experience from them is available for evaluating the potential benefits
and weaknesses of this proposal (as well as other aspects regarding
``pathlib``). However, this proposal goes a step further to avoid
potential problems and to provide a smooth transition plan that, if
desired, can be followed to painlessly move towards a Python with a
clear distinction between strings and paths. An optional long-term goal
that this proposal facilitates may be to gradually move away from using
strings (or even their subclasses) as paths.

Specification of standard library changes

Making ``pathlib`` classes subclasses of ``str``

Assuming the present class hierarchy in ``pathlib``, inheritance from
``str`` will be introduced by making the base class ``pathlib.PurePath``
a subclass of ``str``. Methods will further be overridden in
``PurePath`` as described in the following.

Overriding all ``str``-specific methods

Since most of the ``str`` methods are not of any use on paths and can be
confusing, leading to undesired behavior, *most* ``str`` methods
(including magic methods, but excluding methods listed below) are
overridden in ``PurePath`` with methods that by default raise
``TypeError("str method '<name>' is not available for paths.")``. This
will help programmers to immediately notice when they are using the
wrong method. The perhaps unusual practice of disabling most base-class
methods can be regarded as being conservative in adding ``str``
functionality to path objects.

All methods, including double-underscore methods, are overridden, except
for the following, which are *not* overridden:

-  Methods of the ``str`` or ``object`` types that are already
   overridden by ``PurePath``
-  Methods of the ``object`` type that are not overridden by ``str``
-  ``__getattribute__``
-  ``__len__`` (this could be debated, but not having it might be weird
   for a str instance)
-  ``encode``
-  ``startswith`` and ``endswith`` (TODO: override these with
   case-insensitive behavior on the Windows flavor)
-  ``__add__`` will be overridden separately, as described in a later
   section

This will allow ``open(...)`` as well as most ``os`` and ``os.path``
functionality to work immediately, although there are cases that need
special handling.

Later, if shown to be desirable, some additional string methods may be
enabled on paths.
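
As a rough illustration of the disabling described above, here is a minimal sketch. It assumes a hypothetical class ``StrPath`` standing in for a ``str``-subclassing ``PurePath``, and it only overrides *public* ``str`` methods (the proposal also covers magic methods, which this sketch leaves alone). The names ``KEPT`` and ``_disabled`` are made up for the example.

```python
# Public str methods that the proposal keeps available on paths.
KEPT = {"encode", "startswith", "endswith"}


def _disabled(name):
    """Build a replacement method that refuses to run."""
    def method(self, *args, **kwargs):
        raise TypeError(
            "str method '{}' is not available for paths.".format(name))
    method.__name__ = name
    return method


class StrPath(str):
    """Hypothetical stand-in for a str-subclassing PurePath."""


# Override every public str method except the ones in KEPT.
for _name in dir(str):
    if not _name.startswith("_") and _name not in KEPT:
        setattr(StrPath, _name, _disabled(_name))

p = StrPath("/tmp/data.txt")
print(isinstance(p, str), len(p), p.startswith("/tmp"))
try:
    p.upper()
except TypeError as exc:
    print(exc)
```

The object still passes ``isinstance(p, str)`` and supports ``len()``, while a disabled method such as ``.upper()`` fails immediately with ``TypeError``.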

Overriding ``.__add__`` to disable adding two path objects together

Overloading of the ``+`` operator in ``str`` will be overridden with a
version which disables concatenation of two path objects together while
allowing other type combinations (TODO: consider also fully disabling

.. code:: python

    def __add__(self, other):
        if isinstance(other, PurePath):
            raise TypeError("Operator + for two paths is not defined; "
                            "use / for joining paths.")
        return str.__add__(self, other)

Optional enabling of string methods

Since many APIs currently have functions or methods that return paths as
strings, existing code may expect to have all string functionality
available on the returned objects. While most users are unlikely to use
much of the ``str`` functionality, a library function may want to
explicitly allow these operations on a path object that it returns.
Therefore, the overridden ``str`` methods can be enabled by setting a
``._enable_str_functionality`` attribute on a path object as follows:

-  ``pathobj._enable_str_functionality = True    #`` -- Enable ``str``
   methods
-  ``pathobj._enable_str_functionality = 'warn'  #`` -- Enable ``str``
   methods, but emit a ``FutureWarning`` with the message
   ``"str method '<name>' may be disabled on paths in future versions."``

The warning will help the API users notice that the return value is no
longer a plain string.

.. code:: python

    def <name>(self, *args, **kwargs):
        """Method of str, not for use with pathlib path objects."""
        # '<name>' is a template placeholder for the overridden method name.
        try:
            enable = self._enable_str_functionality
        except AttributeError:
            raise TypeError("str method '{}' is not available for paths."
                            .format('<name>')) from None
        if enable == 'warn':
            warnings.warn("str method '{}' may be disabled on paths in "
                          "future versions."
                          .format('<name>'), FutureWarning, stacklevel=2)
        elif enable is not True:
            raise ValueError("_enable_str_functionality can be True or 'warn'")
        return getattr(str, '<name>')(self, *args, **kwargs)

New APIs, however, do not need to enable ``str`` functionality and may
return default path objects.
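
The template above could be turned into something runnable along the following lines. This is only a sketch under assumptions: ``StrPath`` and ``_guarded`` are hypothetical names, only three methods are guarded to keep it short, and a missing attribute is detected with a ``getattr`` default rather than ``try``/``except``.

```python
import warnings


def _guarded(name):
    """Wrap str.<name> so that it only runs when the path object opts in."""
    original = getattr(str, name)

    def method(self, *args, **kwargs):
        enable = getattr(self, "_enable_str_functionality", None)
        if enable is None:
            raise TypeError(
                "str method '{}' is not available for paths.".format(name))
        if enable == "warn":
            warnings.warn(
                "str method '{}' may be disabled on paths in future "
                "versions.".format(name), FutureWarning, stacklevel=2)
        elif enable is not True:
            raise ValueError(
                "_enable_str_functionality can be True or 'warn'")
        return original(self, *args, **kwargs)

    method.__name__ = name
    return method


class StrPath(str):
    """Hypothetical stand-in for a str-subclassing path class."""


# Only a few methods are guarded here, to keep the sketch short.
for _name in ("upper", "split", "find"):
    setattr(StrPath, _name, _guarded(_name))

legacy = StrPath("data/input.csv")
legacy._enable_str_functionality = "warn"   # what a legacy API might set

# Record the warning instead of printing it, so the effect can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    parts = legacy.split("/")
```

A library returning ``legacy`` would keep old callers working, while the recorded ``FutureWarning`` tells them that the return value is no longer a plain ``str``.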

Helping interactive python tools and IDEs

Interactive Python tools such as Jupyter are growing in popularity. When
they use ``dir(...)`` to give suggestions for code completion, it is
harmful to have all the disabled ``str`` methods show up in the list,
even if they typically would raise exceptions. Therefore, the
``__dir__`` method should be overridden on ``PurePath`` to only show the
methods that are meaningful for paths. Some tools used for code
completion, such as ``rope`` and ``jedi`` may need some changes for
optimal code completion. This in fact includes also the standard Python
REPL, which currently does not respect ``__dir__`` in tab completion.
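
A possible shape for such a ``__dir__`` override is sketched below. ``StrPath``, ``_KEPT`` and the toy ``suffix`` property are assumptions made for the example and are not the real ``pathlib`` implementation.

```python
class StrPath(str):
    """Hypothetical str-subclassing path; not the real pathlib API."""

    # str methods that stay visible because the proposal keeps them.
    _KEPT = {"encode", "startswith", "endswith"}

    @property
    def suffix(self):
        # Toy path-specific attribute, only here so it shows up in dir().
        text = str(self)
        return text[text.rfind("."):] if "." in text else ""

    def __dir__(self):
        # Hide the public str methods that would be disabled on paths.
        hidden = {n for n in dir(str) if not n.startswith("_")} - self._KEPT
        return [n for n in super().__dir__() if n not in hidden]


names = dir(StrPath("archive.tar.gz"))
```

With this, ``names`` contains ``suffix`` and ``startswith`` but not ``upper`` or ``split``, which is what a completion tool relying on ``dir(...)`` would offer.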

Changes needed to other stdlib modules

In stdlib modules other than ``pathlib``, mainly ``os``, ``ntpath`` and
``posixpath``, the functions that use the methods/functionality listed
below on path or file names will be modified to explicitly convert the
name ``name`` to a plain string first, e.g., using
``getattr(name, 'path', name)``, which also works for ``DirEntry`` but
may return ``bytes``:

-  ``split``
-  ``find``
-  ``rfind``
-  ``partition``
-  ``__iter__``
-  ``__getitem__``

(However, if ``DirEntry`` is not made to subclass ``str``, the idiom
``getattr(name, 'path', name)`` which is already supported in the
development version, should be implemented in stdlib functions to accept
not only ``str`` and path objects, but also DirEntry.)
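
To make the ``getattr(name, 'path', name)`` idiom concrete, here is a small sketch using a real ``DirEntry`` from ``os.scandir``. The helper name ``plain_name`` is hypothetical; the idiom itself is the one quoted above.

```python
import os
import tempfile


def plain_name(name):
    """Return name.path if the object has it, otherwise name itself."""
    return getattr(name, "path", name)


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so that scandir has something to yield.
    open(os.path.join(tmp, "a.txt"), "w").close()
    with os.scandir(tmp) as entries:
        entry = next(entries)
        converted = plain_name(entry)   # DirEntry -> its .path string

unchanged = plain_name("some/file.txt")  # plain str has no .path attribute
```

Plain ``str`` and ``bytes`` objects pass through untouched because they have no ``.path`` attribute, so the idiom is cheap to apply at the top of a function.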

Guidelines for third-party package maintainers

Libraries that take paths as arguments or return them

Since all of the standard library will accept path objects as path
arguments, most third-party libraries will automatically do so. However,
those that directly manipulate or examine the path name using ``str``
methods may not work. Those libraries will not immediately be

To achieve full ``pathlib``-compatibility, the libraries are advised to:

1. Make sure they do not explicitly check the ``type(...)`` of
   arguments, but use ``isinstance(...)`` instead, if needed.
2. See if their functions use disabled ``str``/``bytes`` methods on
   paths that they take as arguments. If so, they should either:

   -  change their code to achieve the same using ``os.path`` functions
      (*this is the preferred option*), or
   -  convert the argument first using
      ``name = getattr(name, 'path', name)``, which does not require
      importing pathlib

3. Consider, when returning a path or file name, to convert it to a
   path object first if a ``str``-subclassing ``pathlib`` is available.
   During a transition period, the attribute
   ``._enable_str_functionality = 'warn'`` should be set before
   returning the object. For an even softer transition period, it is
   also possible to set ``._enable_str_functionality = True``, which
   enables ``str`` methods with no warnings.
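
The first recommendation can be illustrated with a short sketch. ``StrPath`` is a hypothetical ``str`` subclass standing in for a path object, and the two checker functions are made-up names.

```python
class StrPath(str):
    """Hypothetical str-subclassing path object."""


def accepts_exact_type(name):
    # Fragile: rejects every subclass of str, including path objects.
    return type(name) is str


def accepts_subclasses(name):
    # Preferred: any str instance, including subclasses, is accepted.
    return isinstance(name, str)


p = StrPath("data/input.csv")
print(accepts_exact_type(p), accepts_subclasses(p))
```

A library that uses the exact-type check would turn away the path object even though it is a ``str`` instance.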

Pathlib-compatible or near-compatible libraries

To have the best level of compatibility, all path-like objects should
preferably behave similarly to pathlib objects regarding subclassing
``str``. However, for the best level of *compatibility*, the safest
option is to subclass ``str`` and *not* disable ``str`` functionality
(which is already done by some known libraries). However, they may want
to further disable methods of ``str`` to achieve the additional clarity
that ``pathlib`` has regarding

-  Having a ``.path`` attribute/property which gives a ``str`` (or
   ``bytes``) instance representing the path

Older Python versions

The ``pathlib2`` module, which provides ``pathlib`` to pre-3.4 Python
versions, can also subclass ``str``, but it should by default have
``._enable_str_functionality = 'warn'`` or
``._enable_str_functionality = True``, because the stdlib in the older
Python-versions is not compatible with paths that have ``str``
functionality disabled.

Transition plans and future possibilities for long-term consideration


``DirEntry`` should also undergo a similar transition, which was, at
first, part of this proposal, but it was removed to limit the scope (It
could be added back, of course, if desired). Since ``DirEntry`` focuses
on performance, it is important not to cause any significant performance

It would, however, simplify things if ``DirEntry`` did the same as
``pathlib`` regarding subclassing and disabling methods. A slight
complication, however, arises from the fact that ``DirEntry`` may
represent a path using ``bytes``, making the ``.path`` attribute also an
instance of ``bytes`` instead of ``str``. This issue could be solved by
at least two different approaches:

1. Make ``bytes``-kind DirEntry instances, interpreted as ``str``
   instances, equivalent to ``os.fsdecode(direntry.path)``.
2. Instantiate a different ``DirEntry`` class for ``bytes`` paths,
   perhaps in a way similar to how the ``antipathy`` path library
   instantiates ``bPath`` when the ``bytes`` type is used.
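
For the first approach, the conversion itself is already available as ``os.fsdecode``. The sketch below only shows what a ``bytes``-kind ``DirEntry`` looks like today and how its ``.path`` decodes; it does not implement the proposed subclassing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "report.txt"), "w").close()
    # Passing a bytes path to scandir yields bytes-kind DirEntry objects.
    with os.scandir(os.fsencode(tmp)) as entries:
        entry = next(entries)
        raw = entry.path            # bytes
        text = os.fsdecode(raw)     # str, using the filesystem encoding
```

Under approach 1, a ``bytes``-kind entry interpreted as a ``str`` would compare equal to ``text``.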

The future of plain string paths and ``os.path``?

It is possible to imagine having both ``os.path`` and ``pathlib``
coexist, as long as they co-operate well. Potentially, things like
``open("filename.txt")`` with a plain string argument will always be
accepted. However, if regardless of what people use Python for, they
slowly adopt path objects as the way to represent a path, the support
for plain string paths may be deprecated and eventually dropped.

On the one hand, to support the former situation, ``os.path`` functions
can choose their return type to match the type of the arguments; with
multiple different types in the arguments, ``pathlib`` might 'win'
because it is already imported. On the other hand, to support the
latter, all path-returning functions in the stdlib can begin to return
pathlib objects, at first with ``str`` methods enabled with or without
warning, and eventually, with ``str`` methods disabled.

Literal syntax for paths: p-strings?

Should Python choose the *path* towards not allowing plain strings as
paths, a convenient way to instantiate a path is desperately needed. As
discussed in the recent python-ideas thread "Working with path objects:
p-strings?", one possibility would be a new syntax like
``p"/path/to/file.ext"``, which would instantiate a path object.

Another way of turning a string into a path could be to have a ``.path``
property on ``str`` objects that instantiates and returns a path object.
It can be debated whether this is 'Pythonic' or not. See also the next
section.

The ``.path`` attribute on path-like objects

``DirEntry`` already had the ``.path`` attribute when it was introduced
to the standard library in Python 3.5. It represents the absolute or
relative path as a whole as a ``str`` or ``bytes`` instance. However,
several people have raised the concern that the word ``path`` not
referring to an actual path object may be misleading. If path objects
are instances of ``str``, however, ``.path`` may in the future shift to
mean the path object. In the case of ``pathlib`` paths, it could be
implemented as a property that returns ``self``, or during a
transition phase, a path object with ``str`` functions enabled:

.. code:: python

        @property
        def path(self):
            # Return a copy with str methods enabled (with a warning).
            path = type(self)(self)
            path._enable_str_functionality = 'warn'
            return path

``DirEntry`` objects, on the other hand, could be converted to pathlib
objects using the ``.path`` property. Similarly, ``str`` objects could
have a similar property for conversion into a pathlib object (see
previous section).

Possibilities for making ``pathlib`` more lightweight

If path objects were to become the norm in handling paths and file
names, there may be a need for optimizations in terms of the speed and
memory usage of path objects as well as the import time and memory
footprint. Dependencies that are not always used by pathlib objects
could also be imported lazily.

Another base class for path-like objects

Python already has multiple types that can represent path-like objects.
There could be one common base class for all of them, which would (at
least at first) inherit from ``str``. ``DirEntry`` and ``PurePath``
would both be subclasses of this class.

One would, however, need to answer the questions of what this class
would be called, what it would look like, and what module it would be in
(if not builtin). For now, let us call it ``PyRL`` for Python
(Pyniversal?-) Resource Locator. This could also be a base class for

Generalized Resource Locator addresses: a-strings? l-strings?

... not to mention g-strings!

A generalized concept may be valuable in the future, because the
distinction between local and remote is getting more and more vague. As
discussed in the python-ideas thread "URLs/URIs + pathlib.Path + literal
syntax = ?", it is possible to quite reliably distinguish common types
of URLs from filesystem paths. If this became the norm, many
Python-written programs could 'magically' accept URLs as input file
paths by simply calling ``PyRL(...)``, which could be equivalent to
some literal syntax for use in a scripting, testing or interactive
setting, or when loading config files from fixed locations.

On 04/06/2016 09:04 AM, Koos Zevenhoven wrote:

> Recently, I have spent quite a lot of time thinking about what should
> be done to improve the situation of pathlib. Last week, after all
> kinds of wild ideas and trying hard every night to prove to myself
> that duck-typing-like compatibility with other path-representing
> objects is sufficient, I noticed that I had failed to prove that and
> that I had gone around a full circle back to where I started: "why
> can't paths subclass ``str``?"
> In the "Working with Path objects: p-strings?" thread, I said I was
> working on a proposal. Since it's been several days already, I think i
> should post it here and get some feedback before going any further.
> Maybe I should have done that even earlier. Anyway, there are some
> rough edges, and I will need to add links to references etc.
> So, do not hesitate to give feedback or criticism, which is especially
> appreciated it you take the time to read through the whole thing first
> :).

Fair enough, and done!  :)

Overall: excellent research and interesting ideas.  However, before
spending too much more time on this, realize that the fate of pathlib in
the stdlib is being discussed on Python-Dev, so you may want to move 
your efforts over there.

> Proposal: Making path objects inherit from ``str``.
> ===================================================

> Rationale
> ---------

> There is prior art in subclassing the Python ``str`` type to build a
> path object type. Packages on PyPI (TODO: list more?) that do this
> include ```` and ``antipathy``.

An honorable mention for antipathy!  :)

> Specification of standard library changes
> -----------------------------------------

> Overriding all ``str``-specific methods
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Have to be careful here -- the os module uses some of those string 
methods to work with string-paths.

> Overriding ``.__add__`` to disable adding two path objects together
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See above.

> Optional enabling of string methods
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Since many APIs currently have functions or methods that return paths as
> strings, existing code may expect to have all string functionality
> available on the returned objects. While most users are unlikely to use
> much of the ``str`` functionality, a library function may want to
> explicitly allow these operations on a path object that it returns.
> Therefore, the overridden ``str`` methods can be enabled by setting a
> ``._enable_str_functionality`` method on a path object as follows:
> -  ``pathobj._enable_str_functionality = True    #`` -- Enable ``str``
>     methods
> -  ``pathobj._enable_str_functionality = 'warn'  #`` -- Enable ``str``
>     methods, but emit a ``FutureWarning`` with the message
>     ``"str method '<name>' may be disabled on paths in future versions."``
> The warning will help the API users notice that the return value is no
> longer a plain path.

This would be good for the pathlib backport, which I think should 
inherit from str/unicode.

> Helping interactive python tools and IDEs
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Interactive Python tools such as Jupyter are growing in popularity. When
> they use ``dir(...)`` to give suggestions for code completion, it is
> harmful to have all the disabled ``str`` methods show up in the list,
> even if they typically would raise exceptions. Therefore, the
> ``__dir__`` method should be overridden on ``PurePath`` to only show the
> methods that are meaningful for paths.

Good idea.

> Older Python versions
> ~~~~~~~~~~~~~~~~~~~~~
> The ``pathlib2`` module, which provides ``pathlib`` to pre-3.4 python
> versions, can also subclass ``str``, but it should by default have
> ``._enable_str_functionality = 'warn'`` or
> ``.enable_str_functionality = True``, because the stdlib in the older
> Python-versions is not compatible with paths that have ``str``
> functionality disabled.

The 'warn' setting would be preferable.

> The future of plain string paths and ``os.path``?
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> It is possible to imagine having both ``os.path`` and ``pathlib``
> coexist, as long as they co-operate well. Potentially, things like
> ``open("filename.txt")`` with a plain string argument will always be
> accepted. However, if regardless of what people use Python for, they
> slowly adopt path objects as the way to represent a path, the support
> for plain string paths may be deprecated and eventually dropped.

I don't see the stdlib ever /not/ accepting string paths.

> Literal syntax for paths: p-strings?
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I don't see this as necessary.

> Another way of turning a string into a path could be to have a ``.path``
> property on ``str`` objects that instantiates and returns a path object.

And definitely not this.

> Generalized Resource Locator addresses: a-strings? l-strings?
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> A generalized concept may be valuable in the future, because the
> distinction between local and remote is getting more and more vague. As
> discussed in the python-ideas thread "URLs/URIs + pathlib.Path + literal
> syntax = ?", it is possible to quite reliably distinguish common types
> of URLs from filesystem paths.

This was disproved.  Just about anything can be a posix-path.


On Wed, Apr 6, 2016 at 6:49 AM, Random832 <random832 at> wrote:
> Last time this came up, the conclusion was that making values views
> comparable was intractable due to the fact that they're unordered but
> the values themselves aren't hashable. Then the discussion got
> sidetracked into a discussion of whether the justification for not
> having them be hashable (Java does just fine with everything being
> hashable and content-based hashes for mutable objects) makes sense in a
> "consenting-adults" world.

At risk of repeating the derailment: while that's true, I'm also under the
impression that interest in deeply immutable objects is growing in the
Java community.

Indeed, I noted that (in a not-obvious endnote). Perhaps I shouldn't have
mentioned the issue at all as it's relatively secondary, but on the other
hand while my main points have to do specifically with KeysView, it would
be worth bundling all of the inconsistencies in all MappingViews at once,
if only for pragmatic reasons.

In the prior discussion, Guido also seemed open to making values equivalent
on dictionary identity equality (#10), which I think makes more sense than
current behavior and doesn't suffer performance issues. In any case, I
would consider that a secondary concern.

On Wed, Apr 6, 2016 at 8:50 AM Random832 <random832 at> wrote:

> On Wed, Apr 6, 2016, at 05:38, Joshua Morton wrote:
> >     {}.keys() == {}.keys()  # 7
> >     {}.items() == {}.items()  # 8
> >     {}.values() == {}.values()  # 9
> >     d = {}; d.values() == d.values()  # 10
> >
> > True, True, False, False.
> >
> > Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up
> > for debate.[1]
> Last time this came up, the conclusion was that making values views
> comparable was intractable due to the fact that they're unordered but
> the values themselves aren't hashable. Then the discussion got
> sidetracked into a discussion of whether the justification for not
> having them be hashable (Java does just fine with everything being
> hashable and content-based hashes for mutable objects) makes sense in a
> "consenting-adults" world.
On Thu, Apr 7, 2016 at 2:04 AM, Koos Zevenhoven <k7hoven at> wrote:
> Rationale
> ---------
> Furthermore, functions now
> returning instances of ``str`` to represent paths could in future
> versions return path objects, with only minor backwards-incompatibility
> worries.
> Making ``pathlib`` classes subclasses of ``str``
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Overriding all ``str``-specific methods
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> The perhaps unusual practice of disabling most base-class
> methods can be regarded as being conservative in adding ``str``
> functionality to path objects.

I'm not sure I entirely understand what's going on here. Your
rationale is "it should be possible to use a Path as a str", and
that's supported by your proposal to subclass str; but then you want
to override a bunch of methods to force users to be aware that a Path
is *not* a str. Why subclass only to force people to distinguish?

If you want to make a Path act "just a little bit" like a str, I'd
expect to go the other way: don't subclass str, and add in a specific
set of methods to provide str-like functionality. Or am I missing
something here?


On Wed, Apr 6, 2016 at 1:28 PM Chris Angelico <rosuav at> wrote:

> I'm not sure I entirely understand what's going on here. Your
> rationale is "it should be possible to use a Path as a str", and
> that's supported by your proposal to subclass str; but then you want
> to override a bunch of methods to force users to be aware that a Path
> is *not* a str. Why subclass only to force people to distinguish?

> If you want to make a Path act "just a little bit" like a str, I'd
> expect to go the other way: don't subclass str, and add in a specific
> set of methods to provide str-like functionality. Or am I missing
> something here?

Indeed this was my thought as well. Originally (this was discussed on the
python subreddit to some extent as well), my objection was that a simple
program like say, would break depending on the implementation.
This addresses that issue, but in my opinion creates even deeper problems.

It's not hard to conceive some formatting or pretty printing library that at
some point contains the code

    import functools
    from typing import List

    def concat(user_list: List[str]) -> str:
        return functools.reduce(lambda x, y: x + y, user_list)

where user_list is given from some sort of user input. This change would
lead to some nasty bugs, where this action will throw an error when any two
adjacent args are Paths. What's worse is that type annotations actually
exacerbate this issue since the objects disobey the typechecker.
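
To make the failure concrete, here is a sketch with a hypothetical ``StrPath`` class that carries only the proposed ``__add__`` override (the real ``PurePath`` is not a ``str`` subclass today, so this is a stand-in).

```python
import functools
from typing import List


class StrPath(str):
    """Hypothetical str subclass with the proposed __add__ override."""

    def __add__(self, other):
        if isinstance(other, StrPath):
            raise TypeError("Operator + for two paths is not defined; "
                            "use / for joining paths.")
        return str.__add__(self, other)


def concat(user_list: List[str]) -> str:
    return functools.reduce(lambda x, y: x + y, user_list)


mixed = concat([StrPath("a"), "-", StrPath("b")])   # plain str result

try:
    concat([StrPath("a"), StrPath("b")])            # annotation is satisfied
    failed = False
except TypeError:
    failed = True
```

In this sketch the second call raises even though every element is a ``str`` instance. Once the first ``+`` has produced a plain ``str`` accumulator, later paths concatenate without error, so whether the bug shows up depends on the order of the input, which makes it harder to track down.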

Java subclasses aren't allowed to just unimplement a parent's methods, they
shouldn't in Python either. This argument seems to boil down to "well a string
is one valid representation of a path, so a Path is a string" but by that
definition a Path is also a node in a tree, and so we should create a Tree
class and have Path subclass TreeNode, so that we could find out its
children and depth from the root.

There are so many surprising results of a change like this that I can't see
a reason to do this.

On 6 April 2016 at 17:04, Koos Zevenhoven <k7hoven at> wrote:
> In the "Working with Path objects: p-strings?" thread, I said I was
> working on a proposal. Since it's been several days already, I think i
> should post it here and get some feedback before going any further.
> Maybe I should have done that even earlier. Anyway, there are some
> rough edges, and I will need to add links to references etc.

Thanks for putting this together. I don't agree with much of it, but
it's good to have the proposal stated so clearly.

> So, do not hesitate to give feedback or criticism, which is especially
> appreciated it you take the time to read through the whole thing first
> :).

While I've read the whole proposal, there's a lot to digest, and
honestly I don't have the time to spend on this right now - so my
apologies if I missed anything relevant. Hopefully my comments will
make sense anyway :-)

> Filesystem paths are strings that give instructions for traversing a
> directory tree. In Python, they have traditionally been represented as
> byte strings, and more recently, unicode string. However, Python now has
> ``pathlib`` in the standard library, which is an object-oriented library
> for dealing with objects specialized in representing a path and working
> with it. In this proposal, such objects are generally referred to as
> *path objects*, or sometimes, in the specific context of instances of
> the ``pathlib`` path classes, they are explicitly referred to as
> ``pathlib`` objects.

I'm not sure I agree with this. To me, "filesystem paths" are things
which define the location of a file in a filesystem. They are not
strings, even though they can be represented by strings (actually,
they can't, technically - POSIX allows nearly arbitrary bytestrings
for paths, whereas Python strings are Unicode). Saying a path is a
string is no more true than saying that integers are strings that
represent whole numbers.

Traditionally, people haven't thought of paths as objects because not
many languages provide *any* sort of abstraction of paths - doing so
in a cross-platform way is *hard* and most languages duck the issue.
Python is exceptional in providing good path manipulation functions
(even os.path is streets ahead of what many other languages offer).

> Filesystem paths (or comparable things like URIs) are strings of
> characters that represent information needed to access a file or
> directory (or other resource). In other words, they form a subset of
> strings, involving specialized functionality such as joining absolute
> and relative paths together, accessing different parts of the path or
> file name, and even accessing the resources the path points to. In
> Python terms, for a path ``path``, one would have
> ``isinstance(path, str)``. It is also clear that not all strings are
> paths.

As noted above, this makes no sense to me. By this argument "integers
are strings of characters that represent numbers". The string
representation of an object is *not* the object.

> On the one hand, this would make an ideal case for making all
> path-representing objects inherit from ``str``; while Python tries not
> to over-emphasize object-oriented programming and inheritance, it should
> not try to avoid class hierarchies when they are appropriate in terms of
> both purity and practicality. Regarding practicality, making specialized
> *path objects* also instances of ``str`` would make almost any stdlib or
> third-party function accept path objects as path arguments, assuming
> that they accept any instance of ``str``. Furthermore, functions now
> returning instances of ``str`` to represent paths could in future
> versions return path objects, with only minor backwards-incompatibility
> worries.

You mention both practicality and purity here but only offer
"practical" arguments. The practical arguments are fair, and as far as
I can see are the crux of any proposal to make Path objects subclass
str. You should focus on this, and not try to argue that subclassing
str is "right" in any purity sense.

> On the other hand, strings are a very general concept, and the Python
> ``str`` class provides a large variety of methods to manipulate and work
> with them, including ``.split()``, ``.find()``, ``.isnumeric()`` and
> ``.join()``. These operations may be defined just as well for a string
> that represents a path than for any other string. In fact, this is the
> status quo in Python, as the adoption of ``pathlib`` is still quite
> limited and paths are in most cases represented as strings (sometimes
> byte strings). But while the string operations are *defined* on
> path-representing strings, the results of these operations may not be of
> any use in most cases, even if in some cases, they may be.

This seems to me to be a key point - if (many) of the operations that
are part of the interface of a string don't make sense for a
filesystem path, doesn't that very clearly make the point that
filesystem paths are *not* strings?

> There is prior art in subclassing the Python ``str`` type to build a
> path object type. Packages on PyPI (TODO: list more?) that do this

pylib's path.local object (used in pytest in particular) is another.

> include ```` and ``antipathy``. The latter also supports
> ``bytes``-based paths by instantiating a different class, a subclass of
> ``bytes``. Since these libraries have existed for several years,
> experience from them is available for evaluating the potential benefits
> and weaknesses of this proposal (as well as other aspects regarding
> ``pathlib``).

I don't think there's been any attempt made to collect or quantify
that experience, though. All I've ever seen is hearsay "I've not heard
of anyone reporting problems" evidence. While anecdotal evidence is a
lot better than nothing, it's of limited value. Apart from anything
else, there's a self-selection issue - people who *did* have problems
may simply have stopped using the libraries.

> Overriding all ``str``-specific methods
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Since most of the ``str`` methods are not of any use on paths and can be
> confusing, leading to undesired behavior, *most* ``str`` methods
> (including magic methods, but excluding methods listed below) are
> overridden in ``PurePath`` with methods that by default raise
> ``TypeError("str method '<name>' is not available for paths.")``. This
> will help programmers to immediately notice when they are using the
> wrong method. The perhaps unusual practice of disabling most base-class
> methods can be regarded as being conservative in adding ``str``
> functionality to path objects.

This seems to me to be the biggest issue. You're proposing that Path
objects will subclass strings, but code written to expect a string may
fail if passed a Path object. Presumably though that code works if
passed str(the_path_object) - as it works correctly right now. Maybe
it's doing "string-like" things, but equally, it's presumably intended
to. Consider a "make path uppercase" function that simply does
.upper() on its argument.

You are proposing a class that is a subclass of str, but calling str()
on an instance gives an object that behaves differently. That's
bizarre at best, and realistically I'd describe it as fundamentally
broken. I don't want to argue type-theory here, but I'm pretty sure
that violates most people's intuition of what inheritance means.
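
To make the objection concrete, here is a minimal sketch. StrPath is a
hypothetical stand-in for the proposed class, not the proposal's actual
implementation, and only upper() is disabled to keep it short:

```python
class StrPath(str):
    """Hypothetical str subclass that disables one str method (illustration only)."""

    def upper(self):
        raise TypeError("str method 'upper' is not available for paths.")


def make_path_uppercase(p):
    # Existing code written with plain strings in mind.
    return p.upper()


p = StrPath("/tmp/spam")

assert isinstance(p, str)                # the object claims to be a str...
plain_result = make_path_uppercase(str(p))
print(plain_result)                      # ...and works once converted with str()

try:
    make_path_uppercase(p)               # ...but fails when passed as itself
except TypeError as exc:
    print(exc)
```

The same function gives different outcomes for p and str(p), even though
isinstance(p, str) is true.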

> Optional enabling of string methods
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Since many APIs currently have functions or methods that return paths as
> strings, existing code may expect to have all string functionality
> available on the returned objects. While most users are unlikely to use
> much of the ``str`` functionality, a library function may want to
> explicitly allow these operations on a path object that it returns.
> Therefore, the overridden ``str`` methods can be enabled by setting a
> ``._enable_str_functionality`` method on a path object as follows:
> -  ``pathobj._enable_str_functionality = True    #`` -- Enable ``str``
>    methods
> -  ``pathobj._enable_str_functionality = 'warn'  #`` -- Enable ``str``
>    methods, but emit a ``FutureWarning`` with the message
>    ``"str method '<name>' may be disabled on paths in future versions."``

This is a huge chunk of extra complexity, both in terms of
implementation, and even more so in terms of understanding.

If someone wants a "real" string, they can just cast using str() or use
the .path attribute.

This whole section of the proposal says to me that you haven't
actually solved the problem you're trying to solve - you still expect
people to have problems passing Path objects to functions that aren't
expecting them, and you've had to consider how to work round that. The
fact that you came up with (in effect) a "configuration flag" on an
immutable object like a Path rather than just using the existing "give
me a real string" options on Path, implies that your proposal is not
well thought through in this area.

Here are some questions for you (but IMO this section is unfixable - no
matter what answers you give, I still consider this whole mechanism as
a non-starter).

* Are Path objects hashable, given they now have a mutable attribute?
* If you change the _enable_str_functionality flag, does the object's
hash change?
* If it doesn't, what happens when you add 2 identical paths with
different _enable_str_functionality flags to a set?
* If you enable str methods do they return str or Path objects? If the
latter, what is the flag set to on these objects?

Basically, you broke a fundamental property of both Path and string
objects - they are immutable.
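
A small sketch of the set question above, assuming the flag is stored as
an ordinary instance attribute and that equality and hashing are simply
inherited from str. FlaggedPath is a hypothetical name used only for
illustration:

```python
class FlaggedPath(str):
    """Hypothetical str subclass carrying a mutable per-instance flag."""

    def __new__(cls, value, enable_str=False):
        self = super().__new__(cls, value)
        self._enable_str_functionality = enable_str
        return self


a = FlaggedPath("/tmp/spam", enable_str=True)
b = FlaggedPath("/tmp/spam", enable_str=False)

# Equality and hashing come from str and ignore the flag entirely.
assert a == b and hash(a) == hash(b)

paths = set()
paths.add(a)
paths.add(b)          # considered a duplicate of a, so it is not stored

survivor = next(iter(paths))
print(len(paths), survivor._enable_str_functionality)
```

Under these assumptions the set silently keeps only one of the two
objects, and which flag value survives depends on insertion order.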

> Changes needed to other stdlib modules
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> In stdlib modules other than ``pathlib``, mainly ``os``, ``ntpath`` and
> ``posixpath``, the functions that use the
> methods/functionality listed below on path or file names will be
> modified to explicitly convert the name ``name`` to a plain string
> first, e.g., using ``getattr(name, 'path', name)``, which also works for
> ``DirEntry`` but may return ``bytes``:
> -  ``split``
> -  ``find``
> -  ``rfind``
> -  ``partition``
> -  ``__iter__``
> -  ``__getitem__``

This can be done with the current Path objects (and should). It is
unrelated to this proposal. And it doesn't need to be restricted to
"if overridden string functions are used". Just do it regardless, and
all existing functions work immediately.

The only issue is functions that *return* paths. And they are no
harder under current Pathlib than under your proposal - a decision on
what type to return has to be made either way.

> Guidelines for third-party package maintainers
> ----------------------------------------------
> Libraries that take paths as arguments or return them
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Since all of the standard library will accept path objects as path
> arguments, most third-party libraries will automatically do so. However,
> those that directly manipulate or examine the path name using ``str``
> methods may not work. Those libraries will not immediately be
> ``pathlib``-compatible.

Overcomplicated. If you accept paths, just do getattr(patharg, 'path',
patharg) and you're fine. If you return paths, do nothing (or if you
prefer, think about your API and make a more considered decision).
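
As a sketch of the "just do getattr" approach for functions that accept
paths: FakePath below is a hypothetical object exposing its string form
as a .path attribute, and basename_of is an illustrative helper, not an
existing API.

```python
import posixpath


class FakePath:
    """Hypothetical path object exposing its string form as `.path`."""

    def __init__(self, path):
        self.path = path


def basename_of(patharg):
    # Accept either a plain string or any object with a `.path` attribute.
    name = getattr(patharg, "path", patharg)
    return posixpath.basename(name)


print(basename_of("/tmp/spam.txt"))
print(basename_of(FakePath("/tmp/spam.txt")))
```

Plain strings pass through untouched, so existing callers are unaffected.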

Your proposal means that library authors have to actually consider
whether the new path objects will cause subtle failures, because the
string-like objects will not fail quickly, leading to bugs propagating
into unrelated code.

Overall, I'm a strong -1. If we subclass str, we should just do it and
not over-complicate like this. I'm still not convinced we should do
so, but your proposal *has* convinced me that any attempt to
compromise is going to end up being worse than either option.

Sorry I can't be more positive - but again, thanks for the thorough write-up.

From random832 at  Wed Apr  6 14:50:54 2016
From: random832 at (Random832)
Date: Wed, 06 Apr 2016 14:50:54 -0400
Subject: [Python-ideas] Dictionary views are not entirely 'set like'
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Apr 6, 2016, at 13:25, Joshua Morton wrote:
> In the prior discussion, guido also seemed open to making values
> equivalent
> on dictionary identity equality (#10), which I think makes more sense
> than
> current behavior and doesn't suffer performance issues. In any case, I
> would consider that a secondary concern.

You could probably handle a lot of common cases by:

- Eliminating any items which are present in both collections by
- Attempting to sort the lists of remaining items.

But, yeah, the way to do it in O(N) requires an "ephemeral hash"
operation which Python doesn't have and can't grow _now_, no matter what
the justification for not always having had it. That he won't change it
even after he builds the time machine doesn't really mean anything when
it comes down to it.

From wes.turner at  Wed Apr  6 16:31:01 2016
Message-ID: <>

On Apr 6, 2016 7:50 AM, "Random832" <random832 at> wrote:
> On Wed, Apr 6, 2016, at 05:38, Joshua Morton wrote:
> >     {}.keys() == {}.keys()  # 7
> >     {}.items() == {}.items()  # 8
> >     {}.values() == {}.values()  # 9
> >     d = {}; d.values() == d.values()  # 10
> >
> > True, True, False, False.
> >
> > Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up
> > for debate.[1]
> Last time this came up, the conclusion was that making values views
> comparable was intractable due to the fact that they're unordered but
> the values themselves aren't hashable. Then the discussion got
> sidetracked into a discussion of whether the justification for not
> having them be hashable (Java does just fine with everything being
> hashable and content-based hashes for mutable objects) makes sense in a
> "consenting-adults" world.

here's a related discussion:

[Python-ideas] Fwd: Why do equality tests between OrderedDict keys/values
views behave not as expected?
On Apr 6, 2016 4:39 AM, "Joshua Morton" <joshua.morton13 at> wrote:
> Let's start with a quick quiz: what is the result of each of the
following (on Python3.5.x)?
>     {}.keys() | set()  # 1
>     set() | []  # 2
>     {}.keys() | []  # 3
>     set().union(set())  # 4
>     set().union([])  # 5
>     {}.keys().union(set())  # 6
> If your answer was set([]), TypeError, set([]), set([]), set([]),
AttributeError, then you were correct. That, to me, is incredibly
> Next up:
>     {}.keys() == {}.keys()  # 7
>     {}.items() == {}.items()  # 8
>     {}.values() == {}.values()  # 9
>     d = {}; d.values() == d.values()  # 10
> True, True, False, False.
> Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up
for debate.[1]
> First thing first, the behavior exhibited by #3 is a bug (or at least it
probably should be), and one I'd be happy to fix. However, before doing that
I felt it would be good to propose some questions and suggestions for how
that might be done.
> There are, as far as I can tell, two reasons to use a MappingView, memory
efficiency or auto-updating (a view will continue to mirror changes in the
underlying object).

a third reason:
Mapping and MutableMapping do not assume that all of the data is buffered
into RAM.
* e.g. on top of a DB

> I'll focus on the first because the second can conceivably be solved in
other ways.
> Currently, if I want to union a dictionary's keys with a non-set
iterable, I can do `{}.keys() | []`. This is a bug[2], the set or operator
should only work on another set. That said, fixing this bug removes the
ability to efficiently or keys with another iterable,
`set({}.keys()).update([])` for efficiency, or `set({}.keys()).union([])`
for clarity.
> Fixing this is simply a matter of adding a `.union` method to the
KeysView (and possibly to the Set abc as well). Although that may not be
something that is wanted. The issue, as far as I can tell, is whether we
want people converting from MappingViews to "primitives" as soon as
possible, or if we want to encourage people to use the views for as
long as possible.
> There are arguments on both sides: encouraging people to use the views
causes these objects to become more complex, introducing more surface area
for bugs, more code to be maintained, etc. Further, there is
currently one obvious way to do things, convert to a primitive, if you're
doing any complex action. On the other hand, making MappingViews more like
the primitives they represent has positives for performance, simplifies
user code, and would probably make testing these objects easier, since many
tests could be stolen from
> My opinion is that the operators on MappingViews should be no more
permissive than their primitive counterparts. A KeysView is in various ways
more restrictive than a set, so having it be also occasionally less
restrictive is surprising and in my opinion bad. This causes the loss of an
efficient way to union a dict's keys with a list (among other methods). I'd
then add .union, .intersection, etc. to remedy this.
> This solution would bring the existing objects more in line with their
primitive counterparts, while still allowing efficient actions on large
> In short:
>  - Is #3 intended behavior?
>  - Should it (and the others) be?
>   - As a related aside, should .union and other frozen ops be added to
the Set interface?
>  - If so, should the fix solely be a bugfix, should I do what I proposed,
or something else entirely?
>  - More generally, should there be a guiding principle when it comes to
MappingViews and similar special case objects?
> [1]: There's some good conversation in this prior thread on this issue
The consensus seemed to be that making ValuesViews comparable by value is
technically infeasible (O(n^2) worst case), while making it comparable
based on the underlying dictionary is a possibility. This would be for
OrderedDict, although many of the same arguments apply for a normal
> [2]: Well, it probably should be a bug, it's explicitly tested for (,
whereas sets are explicitly tested for the opposite functionality (
> Thanks, I'm looking forward to the feedback,
> Josh
On Apr 6, 2016 7:50 AM, "Random832" <random832 at> wrote:
> On Wed, Apr 6, 2016, at 05:38, Joshua Morton wrote:
> >     {}.keys() == {}.keys()  # 7
> >     {}.items() == {}.items()  # 8
> >     {}.values() == {}.values()  # 9
> >     d = {}; d.values() == d.values()  # 10
> >
> > True, True, False, False.
> >
> > Numbers 1, 2, 4, 5 are expected behavior. 3 and 6 are not, and 7-10 is up
> > for debate.[1]
> Last time this came up, the conclusion was that making values views
> comparable was intractable due to the fact that they're unordered but
> the values themselves aren't hashable. Then the discussion got
> sidetracked into a discussion of whether the justification for not
> having them be hashable (Java does just fine with everything being
> hashable and content-based hashes for mutable objects) makes sense in a
> "consenting-adults" world.

With e.g. OrderedDict, one practical solution is to subclass and wrap
.keys() in a set and .values() in a list.
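
A minimal sketch of the same idea without a subclass, converting the
views at the point of comparison. The dictionaries d and e are made-up
sample data:

```python
from collections import OrderedDict

d = OrderedDict([("a", 1), ("b", 2)])
e = OrderedDict([("a", 1), ("b", 2)])

assert d.keys() == e.keys()        # keys views compare like sets
assert d.items() == e.items()      # items views compare like sets
assert d.values() != e.values()    # values views fall back to identity
assert d.values() != d.values()    # each call returns a new view object

# Converting to concrete containers gives comparison by value.
same_keys = set(d.keys()) == set(e.keys())
same_values = list(d.values()) == list(e.values())   # order-sensitive
print(same_keys, same_values)
```

Note that the list comparison depends on ordering, which is why it fits
OrderedDict better than an unordered mapping.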

I was trying to unpack an array or tuple in index expression (python
3.5.1) but this failed.
I think for consistency of language use it would be desirable to allow
that as well.  Below an example of my attempt indexing a numpy array,

>>> import numpy as np
>>> x = np.arange(12).reshape(3,-1)
>>> x[*[0,1]]
SyntaxError: invalid syntax

>>> x[*(0,1)]
SyntaxError: invalid syntax

for the case of numpy, the latter works as intended w/o unpacking

>>> x[(0,1)]

which is equivalent to the desired behaviour of unpacking

>>> x[0,1]

but for the list case the behaviour is different (as intended by design)

>>> x[[0,1]]
array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

My proposal hence is to allow unpacking in index expressions as well.
I do not have a use case for dictionary / keyword unpacking; this
would look awkward anyway.


On 07.04.16 01:26, Alexander Heger wrote:
> I was trying to unpack an array or tuple in index expression (python
> 3.5.1) but this failed.
> I think for consistency of language use it would be desirable to allow
> that as well.  Below an example of my attempt indexing a numpy array,
> x:
>>>> import numpy as np
>>>> x = np.arange(12).reshape(3,-1)
>>>> x[*[0,1]]
>      x[*[0,1]]
>        ^
> SyntaxError: invalid syntax

x[0, 1] is syntactic sugar for x[(0, 1)]. Thus you just need to convert
your list index to a tuple index: x[tuple([0, 1])].

Unpacking arguments for a function call is a different thing.
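
The equivalence can be checked without numpy. ShowKey is a hypothetical
class that simply returns whatever key __getitem__ receives:

```python
class ShowKey:
    """Hypothetical container that returns the key it is indexed with."""

    def __getitem__(self, key):
        return key


x = ShowKey()

assert x[0, 1] == (0, 1)              # the comma builds a tuple
assert x[(0, 1)] == x[0, 1]           # explicit tuple: same key
assert x[tuple([0, 1])] == x[0, 1]    # list converted to a tuple: same key
assert x[[0, 1]] == [0, 1]            # a list stays a list, a different key

key_from_list = x[tuple([0, 1])]
print(key_from_list)
```

So the object being indexed sees a single tuple either way; what it
does with a list key instead is up to the type, as numpy shows.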

Booleans currently have reasonable overrides for the bitwise binary
operators:

>>> True | False
True
>>> True & False
False
>>> True ^ False
True

However, the same cannot be said of bitwise unary complement, which
returns rather useless integer values:

>>> ~False
-1
>>> ~True
-2

Numpy's boolean type does the more useful (and more expected) thing:

>>> ~np.bool_(True)
False

How about changing the behaviour of bool.__invert__ to make it in line
with the Numpy boolean?
(i.e. bool.__invert__ == operator.not_)



On Thu, Apr 7, 2016 at 8:46 AM, Antoine Pitrou <solipsis at> wrote:
> Booleans currently have reasonable overrides for the bitwise binary
> operators:
>>>> True | False
> True
>>>> True & False
> False
>>>> True ^ False
> True

All those are consistent with bool being a subclass of int, in the
sense that (for example) `int(True | False)` is identical to
`int(True) | int(False)`.

Redefining ~True to be False wouldn't preserve that: int(~True) ==
~int(True) would become invalid. In short, the proposal would break
the Liskov Substitution Principle. (The obvious fix is of course to
make True have value -1 rather than 1. Then everything's consistent.
No, I'm not seriously suggesting this - the amount of breakage would
be insane.)

NumPy has the luxury that numpy.bool_ is *not* a subclass of any integer type.
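
The consistency argument can be spelled out as a short check. This is
only a sketch of the property described above; newer Python versions may
emit a DeprecationWarning for ~ on bool, so warnings are silenced here:

```python
import warnings

with warnings.catch_warnings():
    # Newer Python versions may warn about ~ applied to a bool.
    warnings.simplefilter("ignore")

    # Binary operators: converting before or after gives the same integer.
    for a in (False, True):
        for b in (False, True):
            assert int(a | b) == int(a) | int(b)
            assert int(a & b) == int(a) & int(b)
            assert int(a ^ b) == int(a) ^ int(b)

    # Unary complement currently has the same property...
    inverted = [~False, ~True]
    assert all(int(~b) == ~int(b) for b in (False, True))

    # ...but defining ~b as `not b` would lose it.
    assert all(int(not b) != ~int(b) for b in (False, True))

print(inverted)
```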


On Thu, 7 Apr 2016 09:00:36 +0100
Mark Dickinson <dickinsm at> wrote:
> On Thu, Apr 7, 2016 at 8:46 AM, Antoine Pitrou <solipsis at> wrote:
> > Booleans currently have reasonable overrides for the bitwise binary
> > operators:
> >
> >>>> True | False
> > True
> >>>> True & False
> > False
> >>>> True ^ False
> > True
> All those are consistent with bool being a subclass of int, in the
> sense that (for example) `int(True | False)` is identical to
> `int(True) | int(False)`.

However, the return type (bool in one place, int in another one) is
inconsistent, and the user-visible semantics are confusing...

Apparently someone went to the trouble of overriding __and__, __or__
and __xor__ for booleans, which is why it looks unexpected to leave
__invert__ alone.



On Thu, Apr 07, 2016 at 09:46:18AM +0200, Antoine Pitrou wrote:
> Hello,
> Booleans currently have reasonable overrides for the bitwise binary
> operators:
> >>> True | False
> True
> >>> True & False
> False
> >>> True ^ False
> True

Substitute 1 for True and 0 for False, and these results are exactly the 
same as the bitwise operations on ints. And that works since True is 
defined to equal 1 and False to equal 0.

> However, the same cannot be said of bitwise unary complement, which
> returns rather useless integer values:
> >>> ~False
> -1
> >>> ~True
> -2

Substitute 0 for False and 1 for True, and you get exactly the same 
results. What else did you expect from bitwise-not?

> Numpy's boolean type does the more useful (and more expected) thing:
> >>> ~np.bool_(True)
> False

Expected by whom? I wouldn't expect bitwise-not to be the same as binary 
not. If I want binary not, I'll spell it `not`.

> How about changing the behaviour of bool.__invert__ to make it in line
> with the Numpy boolean?
> (i.e. bool.__invert__ == operator.not_)

Why? What problem does this solve? We already have a perfectly good way 
of spelling binary not, why break backwards compatibility to get a 
second way to spell it?

It also breaks a fundamental property of most mathematical relations:

if a == b, then f(a) == f(b) 

(assuming f(a) and f(b) are defined for the type of both a and b).

That is currently true for bools:

py> (True == 1) and (~True == ~1)
True
py> (False==0) and (~False == ~0)
True

You want ~b to return `not b`:

py> (True == 1) and (False == ~1)
False
py> (False==0) and (True == ~0)
False

I see no upside and a serious downside to this proposal.


On 7 April 2016 at 08:46, Antoine Pitrou <solipsis at> wrote:
> Hello,
> Booleans currently have reasonable overrides for the bitwise binary
> operators:
>>>> True | False
> True
>>>> True & False
> False
>>>> True ^ False
> True
> However, the same cannot be said of bitwise unary complement, which
> returns rather useless integer values:
>>>> ~False
> -1
>>>> ~True
> -2

This is all just a consequence of the (unfortunate IMO) design that
treats True and False as being equivalent to 1 and 0.

> Numpy's boolean type does the more useful (and more expected) thing:
>>>> ~np.bool_(True)
> False

This is a consequence of another unfortunate design by numpy. The
reason for this is that numpy uses Python's bitwise operators to do
element-wise logical operations. This is because it is not possible to
overload Python's 'and', 'or', and 'not' operators. So if I write:

>>> import numpy as np
>>> a = np.array([1, 2, 3, 4, 5, 6])
>>> a
array([1, 2, 3, 4, 5, 6])
>>> 1 < a
array([False,  True,  True,  True,  True,  True], dtype=bool)
>>> a < 4
array([ True,  True,  True, False, False, False], dtype=bool)
>>> (1 < a) & (a < 4)
array([False,  True,  True, False, False, False], dtype=bool)
>>> (1 < a) and (a < 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is
ambiguous. Use a.any() or a.all()

The problem is that Python tries to shortcut this expression and so it
calls bool(1 < a):

>>> bool(1 < a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is
ambiguous. Use a.any() or a.all()

Returning a non-bool from __bool__ is prohibited which implicitly
prevents numpy from overloading an expression like `not a`:

>>> class boolish:
...     def __bool__(self):
...         return "true"
>>> bool(boolish())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __bool__ should return bool, returned str
>>> not boolish()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __bool__ should return bool, returned str

Because of this numpy uses &,|,~ in place of and,or,not for numpy
arrays and by extension this occurs for numpy scalars as you showed.

I think the numpy project really wanted to use the normal Python
operators but no mechanism was available to do it. It would have been
nice to use &,|,~ for genuine bitwise operations (on e.g. unsigned int
arrays). It also means that chained relations don't work e.g.:

>>> 1 < a < 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is
ambiguous. Use a.any() or a.all()

The recommended way is (1 < a) & (a < 4) which is not as nice and
requires factoring a out if a is actually an expression like sin(x)**2
or something.
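
The mechanism can be reproduced without numpy. Vec below is a tiny
hypothetical stand-in for an array type, written only to show why &
works where `and` and chained comparisons cannot:

```python
class Vec:
    """Tiny hypothetical stand-in for an array type (illustration only)."""

    def __init__(self, items):
        self.items = list(items)

    def __lt__(self, other):      # Vec < scalar
        return Vec(x < other for x in self.items)

    def __gt__(self, other):      # Vec > scalar, also used for scalar < Vec
        return Vec(x > other for x in self.items)

    def __and__(self, other):     # element-wise combination via &
        return Vec(x and y for x, y in zip(self.items, other.items))

    def __bool__(self):           # `and`, `not` and chaining all end up here
        raise ValueError("truth value of a Vec is ambiguous")


a = Vec([1, 2, 3, 4, 5, 6])

both = (1 < a) & (a < 4)
print(both.items)

try:
    1 < a < 4                     # evaluated as (1 < a) and (a < 4)
except ValueError as exc:
    print(exc)
```

Since __bool__ must return a real bool, there is no hook that would let
`and` or a chained comparison produce an element-wise result.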


This is a spin-off from the __fspath__ discussion on python-dev, in
which a few people said that a more general approach would be better.

Proposal: Objects should be allowed to declare that they are
"string-like" by creating a dunder method (analogously to __index__
for integers) which implies a loss-less conversion to str.

This could supersede the __fspath__ "give me the string for this path"
protocol, or could stand in parallel with it.

Obviously str will have this dunder method, returning self. Most other
core types (notably 'object') will not define it. Absence of this
method implies that the object cannot be treated as a string.

String methods will be defined as accepting string-like objects. For
instance, "hello"+foo will succeed if foo is string-like.

Downside: Two string-like objects may behave unexpectedly - foo+bar
will concatenate strings if either is an actual string, but if both
are other string-like objects, depends on the implementation of those
objects.

Bikeshedding:

1) What should the dunder method be named? __str_coerce__? __stringlike__?

2) Which standard types are sufficiently string-like to be used thus?

3) Should there be a bytes equivalent?

4) Should there be a format string "coerce to str"? "{}".format(x) is
equivalent to str(x), but it might be nice to be able to assert that
something's stringish already.

Open season!
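
A rough sketch of how such a protocol might be consumed, by analogy
with operator.index(). The name __stringlike__ is just one of the
candidates floated above, and as_string and Label are hypothetical
helpers invented for this illustration:

```python
def as_string(obj):
    """Return obj as a str, or raise TypeError if it is not string-like.

    `__stringlike__` is a hypothetical dunder name, not an existing protocol.
    """
    if isinstance(obj, str):
        return obj
    method = getattr(type(obj), "__stringlike__", None)
    if method is None:
        raise TypeError(
            "{!r} object is not string-like".format(type(obj).__name__))
    result = method(obj)
    if not isinstance(result, str):
        raise TypeError("__stringlike__ returned a non-str")
    return result


class Label:
    """Hypothetical string-like type."""

    def __init__(self, text):
        self.text = text

    def __stringlike__(self):
        return self.text


print("hello " + as_string(Label("world")))

try:
    as_string(42)     # str(42) works, but int does not opt in
except TypeError as exc:
    print(exc)
```

The point of the sketch is the opt-in: anything can be passed to str(),
but only types that define the dunder are accepted here.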

On 2016-04-07 13:19, Oscar Benjamin wrote:
> On 7 April 2016 at 08:46, Antoine Pitrou <solipsis at> wrote:

>> Numpy's boolean type does the more useful (and more expected) thing:
>>>>> ~np.bool_(True)
>> False
> This is a consequence of another unfortunate design by numpy. The
> reason for this is that numpy uses Python's bitwise operators to do
> element-wise logical operations.

This is not correct. & | ^ ~ are all genuine bitwise operations on numpy arrays.

 >>> i = np.arange(5)
 >>> ~i
array([-1, -2, -3, -4, -5])

What you are seeing with bool arrays is that bool arrays are *not* just uint8 
arrays. Each element happens to take up a single 8-bit byte, but only one of 
those bits contributes to its value; the other 7 bits are mere padding. The 
bitwise & | ^ ~ operators all work on that single bit correctly. They do not 
operate on the padding bits as they are not part of the bool's value.

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco

Steven D'Aprano <steve at ...> writes:
> > Numpy's boolean type does the more useful (and more expected) thing:
> > 
> > >>> ~np.bool_(True)
> > False
> Expected by whom?

By anyone who takes booleans at face value (that is, takes booleans as
representing a truth value and expects operations on booleans to reflect
the semantics of useful operations on truth values, not some arbitrary
side-effect of the internal representation of a boolean...).

But I'm not surprised by such armchair commenting and pointless controversy
on python-ideas, since that's what the list is for....



From phd at  Thu Apr  7 09:28:45 2016
On Thu, Apr 07, 2016 at 11:04:56PM +1000, Chris Angelico <rosuav at> wrote:
> 1) What should the dunder method be named? __str_coerce__? __stringlike__?


     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

On 7 April 2016 at 14:08, Robert Kern <robert.kern at> wrote:
>> This is a consequence of another unfortunate design by numpy. The
>> reason for this is that numpy uses Python's bitwise operators to do
>> element-wise logical operations.
> This is not correct. & | ^ ~ are all genuine bitwise operations on numpy
> arrays.
>>>> i = np.arange(5)
>>>> ~i
> array([-1, -2, -3, -4, -5])
> What you are seeing with bool arrays is that bool arrays are *not* just
> uint8 arrays. Each element happens to take up a single 8-bit byte, but only
> one of those bits contributes to its value; the other 7 bits are mere
> padding. The bitwise & | ^ ~ operators all work on that single bit
> correctly. They do not operate on the padding bits as they are not part of
> the bool's value.

I stand corrected.


On 7 April 2016 at 14:04, Chris Angelico <rosuav at> wrote:
> Proposal: Objects should be allowed to declare that they are
> "string-like" by creating a dunder method (analogously to __index__
> for integers) which implies a loss-less conversion to str.

What would this be used for? Other than Path, what types would be
viable candidates for being "string like"? I can't think of a use case
for this feature.

For that matter, what constitutes a "lossless conversion to str"? If
you mean "doesn't lose information", then integers could easily be
said to have such a conversion - but that doesn't seem right (we don't
want things to start auto-converting numbers to strings).

I'm not sure I understand the point.


From vgr255 at  Thu Apr  7 09:46:52 2016
+1 for me.

On Thu, Apr 07, 2016 at 11:04:56PM +1000, Chris Angelico <rosuav at>
> 1) What should the dunder method be named? __str_coerce__? __stringlike__?

The equivalent for ints is __index__, which is one subset of the vast
possibilities of integers, yet one that will mostly be served by int-like
By that reasoning, the str-like method should correspond to a large - yet
wide - use case. The first use case for strings that comes to mind is dict
(namespace) keys. I'd rule out __keys__ as not explicit enough. I was
thinking __item__, but then again probably not explicit enough. I don't have
"the obvious choice" in mind, but I'd like it to be one word only.

> 2) Which standard types are sufficiently string-like to be used thus?

No built-in types, but I'd like the feature for a third-party app, for
example, that pretends to be a string.

> 3) Should there be a bytes equivalent?

This would probably fall in the "nice to have, but no use case backing it
up" category. +0 for me.

> 4) Should there be a format string "coerce to str"? "{}".format(x) is
equivalent to str(x), but it might be nice to be able to assert that
something's stringish already.

__index__ is meant to return the same thing as __int__, and I think the same
restrictions should apply here - we have an object pretending to be a
string, so there should *not* be a difference between use_as_string(obj) and
use_as_string(str(obj)) - in the same sense that there should not be a
difference between use_as_int(obj) and use_as_int(int(obj)).


On Thu, Apr 7, 2016 at 11:46 PM, Émanuel Barry <vgr255 at> wrote:
>> 4) Should there be a format string "coerce to str"? "{}".format(x) is
> equivalent to str(x), but it might be nice to be able to assert that
> something's stringish already.
> __index__ is meant to return the same thing as __int__, and I think the same
> restrictions should apply here - we have an object pretending to be a
> string, so there should *not* be a difference between use_as_string(obj) and
> use_as_string(str(obj)) - in the same sense that there should not be a
> difference between use_as_int(obj) and use_as_int(int(obj)).

Absolutely agreed; however, I was thinking of the same restriction:
"{!must_be_str}".format(x) would raise an exception if use_as_str(x)
raises. But if it succeeds, yes, it's the same as simple str()


On 07.04.2016 15:04, Chris Angelico wrote:
> This is a spin-off from the __fspath__ discussion on python-dev, in
> which a few people said that a more general approach would be better.
> Proposal: Objects should be allowed to declare that they are
> "string-like" by creating a dunder method (analogously to __index__
> for integers) which implies a loss-less conversion to str.

I must be missing something... we already have a method for this:

.__str__()

Marc-Andre Lemburg

On 2016-04-07 14:04, Chris Angelico wrote:
> This is a spin-off from the __fspath__ discussion on python-dev, in
> which a few people said that a more general approach would be better.
> Proposal: Objects should be allowed to declare that they are
> "string-like" by creating a dunder method (analogously to __index__
> for integers) which implies a loss-less conversion to str.
> This could supersede the __fspath__ "give me the string for this path"
> protocol, or could stand in parallel with it.
> Obviously str will have this dunder method, returning self. Most other
> core types (notably 'object') will not define it. Absence of this
> method implies that the object cannot be treated as a string.
> String methods will be defined as accepting string-like objects. For
> instance, "hello"+foo will succeed if foo is string-like.
> Downside: Two string-like objects may behave unexpectedly - foo+bar
> will concatenate strings if either is an actual string, but if both
> are other string-like objects, depends on the implementation of those
> objects.
> Bikeshedding:
> 1) What should the dunder method be named? __str_coerce__? __stringlike__?
> 2) Which standard types are sufficiently string-like to be used thus?
> 3) Should there be a bytes equivalent?
> 4) Should there be a format string "coerce to str"? "{}".format(x) is
> equivalent to str(x), but it might be nice to be able to assert that
> something's stringish already.
> Open season!

On Thu, Apr 7, 2016, at 09:41, Paul Moore wrote:
> On 7 April 2016 at 14:04, Chris Angelico <rosuav at> wrote:
> > Proposal: Objects should be allowed to declare that they are
> > "string-like" by creating a dunder method (analogously to __index__
> > for integers) which implies a loss-less conversion to str.
> What would this be used for? Other than Path, what types would be
> viable candidates for being "string like"? I can't think of a use case
> for this feature.

How about ASCII bytes strings? ;)

> For that matter, what constitutes a "lossless conversion to str"?

What's __index__ for?

On Thu, Apr 7, 2016, at 10:10, M.-A. Lemburg wrote:
> I must be missing something... we already have a method for this:
> .__str__()

The point is to have a method that objects that "shouldn't" be used as
strings in some contexts *won't* have, as floats (let alone strings)
don't have __index__, even though they do have __int__.
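
The existing integer analogue behaves like this; a quick check using
only the standard library:

```python
import operator

assert int(3.7) == 3                # __int__: lossy conversion is allowed
assert operator.index(3) == 3       # __index__: only genuine integers

try:
    operator.index(3.0)             # float has __int__ but no __index__
    float_accepted = True
except TypeError as exc:
    float_accepted = False
    print(exc)

assert hasattr(float, "__int__") and not hasattr(float, "__index__")
```

A string-like dunder would play the role of __index__ here, with
__str__ remaining the "anything goes" conversion like __int__.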

I can see this going horribly wrong when it comes to things like
concatenation. I've used JS far too many times to be excited for things
like this.

[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong.
On Apr 7, 2016 8:05 AM, "Chris Angelico" <rosuav at> wrote:

> This is a spin-off from the __fspath__ discussion on python-dev, in
> which a few people said that a more general approach would be better.
> Proposal: Objects should be allowed to declare that they are
> "string-like" by creating a dunder method (analogously to __index__
> for integers) which implies a loss-less conversion to str.
> This could supersede the __fspath__ "give me the string for this path"
> protocol, or could stand in parallel with it.
> Obviously str will have this dunder method, returning self. Most other
> core types (notably 'object') will not define it. Absence of this
> method implies that the object cannot be treated as a string.
> String methods will be defined as accepting string-like objects. For
> instance, "hello"+foo will succeed if foo is string-like.
> Downside: Two string-like objects may behave unexpectedly - foo+bar
> will concatenate strings if either is an actual string, but if both
> are other string-like objects, depends on the implementation of those
> objects.
> Bikeshedding:
> 1) What should the dunder method be named? __str_coerce__? __stringlike__?
> 2) Which standard types are sufficiently string-like to be used thus?
> 3) Should there be a bytes equivalent?
> 4) Should there be a format string "coerce to str"? "{}".format(x) is
> equivalent to str(x), but it might be nice to be able to assert that
> something's stringish already.
> Open season!
Why strings and not, say, floats or bools or lists or bytes? Is it because
strings have a uniquely compelling use case, or...?

(For context from the numpy side of things, since IIUC numpy's fixed-width
integer types were the impetus for __index__: numpy actually has a shadow
type system that has versions of all of those types above except list. It
also has a general conversion/casting system with a concept of different
levels of safety, so for us __index__ is like a "safe" cast to int, and
__int__ is like an "unsafe" one, and we have versions of this for all pairs
of internal types. If you care about this in general, then having a
single parametrized concept of safe casting might make more sense than
adding new methods one by one.

I'm not aware of any particular pain points triggered by numpy's strings
not being real strings, though, at least in numpy's current design.)

On Apr 7, 2016 6:05 AM, "Chris Angelico" <rosuav at> wrote:

This is a spin-off from the __fspath__ discussion on python-dev, in
which a few people said that a more general approach would be better.

Proposal: Objects should be allowed to declare that they are
"string-like" by creating a dunder method (analogously to __index__
for integers) which implies a loss-less conversion to str.

This could supersede the __fspath__ "give me the string for this path"
protocol, or could stand in parallel with it.

Obviously str will have this dunder method, returning self. Most other
core types (notably 'object') will not define it. Absence of this
method implies that the object cannot be treated as a string.

String methods will be defined as accepting string-like objects. For
instance, "hello"+foo will succeed if foo is string-like.

Downside: Two string-like objects may behave unexpectedly - foo+bar
will concatenate strings if either is an actual string, but if both
are other string-like objects, depends on the implementation of those
objects.

Bikeshedding:

1) What should the dunder method be named? __str_coerce__? __stringlike__?

2) Which standard types are sufficiently string-like to be used thus?

3) Should there be a bytes equivalent?

4) Should there be a format string "coerce to str"? "{}".format(x) is
equivalent to str(x), but it might be nice to be able to assert that
something's stringish already.

On 2016-04-07 15:10, M.-A. Lemburg wrote:
> On 07.04.2016 15:04, Chris Angelico wrote:
>> This is a spin-off from the __fspath__ discussion on python-dev, in
>> which a few people said that a more general approach would be better.
>> Proposal: Objects should be allowed to declare that they are
>> "string-like" by creating a dunder method (analogously to __index__
>> for integers) which implies a loss-less conversion to str.
> I must be missing something... we already have a method for this:
> .__str__()
It's for making string-like objects into strings.

__str__ isn't suitable because, ints, for example, have a __str__ 
method, but they aren't string-like.

On 04/07/2016 06:46 AM, Émanuel Barry wrote:

> __index__ is meant to return the same thing as __int__, and I think the same
> restrictions should apply here

Do you mean "the same thing" as in both return 'int', or "the same 
thing" as in __int__(3.4) == __index__(3.4) ?  Because that last one is 
False.

> - we have an object pretending to be a
> string, so there should *not* be a difference between use_as_string(obj) and
> use_as_string(str(obj))

Hmmm -- so maybe you are saying that if the results of __str__ and 
__tostring__ are the same we're fine, just like if the results of 
__int__ and __index__ are the same we're fine?  Otherwise an error is 
raised.

Sounds reasonable.


On 7 April 2016 at 15:07, Random832 <random832 at> wrote:
> On Thu, Apr 7, 2016, at 09:41, Paul Moore wrote:
>> On 7 April 2016 at 14:04, Chris Angelico <rosuav at> wrote:
>> > Proposal: Objects should be allowed to declare that they are
>> > "string-like" by creating a dunder method (analogously to __index__
>> > for integers) which implies a loss-less conversion to str.
>> What would this be used for? Other than Path, what types would be
>> viable candidates for being "string like"? I can't think of a use case
>> for this feature.
> How about ASCII bytes strings? ;)

OK, I guess. I can't see why it's a significant enough use case to
justify a new protocol, though.

>> For that matter, what constitutes a "lossless conversion to str"?
> What's __index__ for?

I don't follow. It's for indexing, which requires an integer. How does
that relate to the question of what constitutes a lossless conversion
to str?

On 04/07/2016 07:07 AM, Random832 wrote:

> What's __index__ for?

__index__ is a way to get an int from an int-like object without losing 
information; so it fails with values like 3.4, but should succeed with 
values like Fraction(4, 2).

__int__ is a way to convert the value to an int, so 3.4 becomes 3 (and 
the 4/10's is lost).
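
The difference can be checked directly in the interpreter. The following is
a minimal sketch, assuming Python 3; the class name Two is made up for
illustration and is not part of any library.

```python
import operator


class Two:
    """Hypothetical int-like type that represents exactly the integer 2."""

    def __index__(self):
        return 2


# int() converts by truncation, silently dropping the fractional part.
truncated = int(3.4)

# operator.index() only accepts objects that define __index__.
exact = operator.index(Two())

# A float has __int__ behaviour but no __index__, so it is rejected here.
try:
    operator.index(3.4)
except TypeError:
    float_rejected = True
else:
    float_rejected = False

# __index__ is what lets an object be used as a sequence index.
item = ["a", "b", "c"][Two()]
```

The same check is what indexing and slicing perform on their arguments,
which is why a float cannot be used as a list index even though int()
accepts it.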


> From: Ethan Furman
> Sent: Thursday, April 07, 2016 11:01 AM
> To: python-ideas at
> Subject: Re: [Python-ideas] Dunder method to make object str-like
> On 04/07/2016 06:46 AM, Émanuel Barry wrote:
> > __index__ is meant to return the same thing as __int__, and I think the
> same
> > restrictions should apply here
> Do you mean "the same thing" as in both return 'int', or "the same
> thing" as in __int__(3.4) == __index__(3.4) ?  Because that last one is
> False.

I badly phrased that, let me try again. In the cases that __index__ is
defined (and does not raise), it should return the same thing as __int__.

> Hmmm -- so maybe you are saying that if the results of __str__ and
> __tostring__ are the same we're fine, just like if the results of
> __int__ and __index__ are the same we're fine?  Otherwise an error is
> raised.

Pretty much; as above, if __tostring__ (in absence of a better name) is
defined and does not raise, it should return the same as __str__.


From ethan at  Thu Apr  7 11:15:59 2016
> Booleans currently have reasonable overrides for the bitwise binary
> operators:
> --> True | False
> True
> --> True & False
> False
> --> True ^ False
> True
> However, the same cannot be said of bitwise unary complement, which
> returns rather useless integer values:
> --> ~False
> -1
>  --> ~True
> -2
> Numpy's boolean type does the more useful (and more expected) thing:
>>>> ~np.bool_(True)
> False
> How about changing the behaviour of bool.__invert__ to make it in line
> with the Numpy boolean?
> (i.e. bool.__invert__ == operator.not_)

No.  bool is a subclass of int, and changing that now would be a serious 
breach of backward-compatibility, not to mention breaking existing code 
for no good reason.

Anyone who wants to can create their own Boolean class that doesn't 
subclass int and then declare the behaviour they want.

If bool had been its own thing from the start this wouldn't have been a 
problem, but it is far too late to change that now.  You would be better 
off suggesting a new Logical type instead (it could even support unknown 

> Apparently someone went to the trouble of overriding __and__, __or__
> and __xor__ for booleans, which is why it looks unexpected to leave
> __invert__ alone.

__and__, __or__, and __xor__'s results are within bool's domain (right 
word?) so keeping them in the bool subtype makes sense;  the result of 
__invert__ is not.
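
A short sketch of the behaviour being described, assuming Python 3. The
variable names are only for illustration; the warnings filter is there
because newer CPython releases emit a DeprecationWarning for ~ on a bool.

```python
import operator
import warnings

t, f = True, False

# The binary bitwise operators stay inside bool when both operands are bool.
and_result = t & f
or_result = t | f
xor_result = t ^ f

# Unary ~ falls through to int.__invert__, where ~x == -x - 1,
# so the result is a plain int outside the {False, True} domain.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    inverted_true = ~t
    inverted_false = ~f

# Logical negation is spelled `not` (operator.not_) and returns a bool.
negated = operator.not_(t)
```

Mixing a bool with a non-bool int, as in True & 3, goes back to ordinary
int arithmetic and returns an int.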

> But I'm not surprised by such armchair commenting and pointless controversy
> on python-ideas, since that's what the list is for....

If you aren't going to be civil, don't bother coming back.


On 7 April 2016 at 16:05, Émanuel Barry <vgr255 at> wrote:
>> Hmmm -- so maybe you are saying that if the results of __str__ and
>> __tostring__ are the same we're fine, just like if the results of
>> __int__ and __index__ are the same we're fine?  Otherwise an error is
>> raised.
> Pretty much; as above, if __tostring__ (in absence of a better name) is
> defined and does not raise, it should return the same as __str__.

OK, that makes sense as a definition of "what __tostring__ does". But
I'm struggling to think of an example of code that would legitimately
want to *use* it (unlike __index__ where the obvious motivating case
was indexing a sequence). And even with a motivating example of use, I
don't know that I can think of an example of something other than a
string that might *provide* the method.

The problem with the "ASCII byte string" example is that if a type
provides __tostring__ I'd expect it to work for all values of that
type. I'm not clear if "ASCII byte string" is intended to mean "a
value of type bytes that only contains ASCII characters", or "a type
that subclasses bytes to refuse to allow non-ASCII". The former would
imply __tostring__ working sometimes, but not always. The latter seems
like a contrived example (although I'm open to someone explaining a
real-world use case where it's important).


Message-ID: <>

On 07.04.2016 16:58, MRAB wrote:
> On 2016-04-07 15:10, M.-A. Lemburg wrote:
>> On 07.04.2016 15:04, Chris Angelico wrote:
>>> This is a spin-off from the __fspath__ discussion on python-dev, in
>>> which a few people said that a more general approach would be better.
>>> Proposal: Objects should be allowed to declare that they are
>>> "string-like" by creating a dunder method (analogously to __index__
>>> for integers) which implies a loss-less conversion to str.
>> I must be missing something... we already have a method for this:
>> .__str__()
> It's for making string-like objects into strings.
> __str__ isn't suitable because, ints, for example, have a __str__
> method, but they aren't string-like.

Depends on what you define as "string-like", I guess :-)

We have abstract base classes for such tests, but there's nothing
which would define "string-like" as ABC. Before trying to define
a test via a special method, I think it's better to define what
exactly you mean by "string-like".

Something along the lines of numbers.Number, but for strings.

To make an object string-like, you then test for the ABC and
then call .__str__() to get the string representation as string.
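
As a rough sketch of that ABC-based approach, assuming Python 3: the names
StringLike, Username and as_string below are hypothetical and do not exist
in the standard library.

```python
from abc import ABC


class StringLike(ABC):
    """Hypothetical marker ABC: str(obj) is a lossless rendering of obj."""


# str itself is trivially string-like.
StringLike.register(str)


class Username:
    """Hypothetical example type that is fully described by its text."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


# Opting in is explicit: the type is registered as a virtual subclass.
StringLike.register(Username)


def as_string(obj):
    """Return str(obj), but only for types registered as string-like."""
    if isinstance(obj, StringLike):
        return str(obj)
    raise TypeError(
        "%r object cannot be interpreted as a string" % type(obj).__name__
    )
```

With this, as_string(Username("guido")) gives "guido", while as_string(1234)
raises TypeError even though int has a __str__ method.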

Marc-Andre Lemburg

On Fri, Apr 8, 2016 at 12:56 AM, Nathaniel Smith <njs at> wrote:
> Why strings and not, say, floats or bools or lists or bytes? Is it because
> strings have a uniquely compelling use case, or...?
> (For context from the numpy side of things, since IIUC numpy's fixed-width
> integer types were the impetus for __index__: numpy actually has a shadow
> type system that has versions of all of those types above except list. It
> also has a general conversion/casting system with a concept of different
> levels of safety, so for us __index__ is like a "safe" cast to int, and
> __int__ is like an "unsafe", and we have versions of this for all pairs of
> internal types, which if you care about this in general then having a single
> parametrized concept of safe casting might make more sense than adding new
> methods one by one.
> I'm not aware of any particular pain points triggered by numpy's strings not
> being real strings, though, at least in numpy's current design.)

I'm not against there being more such "treat as" dunder methods, but
I'm not sure which types need them. We have __index__/__int__, and I'm
proposing adding __something__/__str__; bytes was mentioned in the
"okay folks, start bikeshedding" section.

With bool, I'm not sure there needs to be anything. What object would
need to say "I'm functionally a boolean!" in a way that's stronger
than truthiness/falsiness? In what context would it be appropriate to
use True and numpy.bool8(1) identically, but *not* use 42 or "spam" or
object() to also mean true? If there's a use-case for that, then sure,
that can be added to the proposal, and maybe something generic should
be added. +0.

Similar with float. I don't know of any use-cases myself, but I'd
guess numpy is the best place to look. Remember, though, this would
convert the other object into a float, not the float into a numpy
object; if you want that, stick with the existing dunder methods and
implement them yourself - numpy.float64(1.0)+2.0 is an instance of
numpy.float64, NOT float. +0.

With lists, though, I'm -1. The normal way to say "I function as a
list" is to implement sequence protocol. If you want to be able to
accept a list or a "thing like a list", what you usually want is a

So are you suggesting that we should instead have a single "safe cast"
dunder method?

class object:
    def __interpret_as__(self, t):
        """Return the same value as self, as an instance of t.

        If self cannot losslessly be interpreted as t, raise TypeError.
        """
        if self.__class__ is t: return self
        if t is int and hasattr(self, "__index__"):
            return self.__index__()
        raise TypeError("'%s' object cannot be interpreted as %s"
            % (self.__class__.__name__, t.__name__))

class Path:
    def __interpret_as__(self, t):
        if t is str: return str(self)
        return super().__interpret_as__(t)
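
Since the builtin object cannot be patched from Python code, the idea above
can be tried out with an ordinary base class instead. This is only a sketch
under that assumption; Interpretable is a made-up name, and the Path class
here is a toy stand-in, not pathlib.Path.

```python
class Interpretable:
    """Hypothetical base class standing in for the proposed object method."""

    def __interpret_as__(self, t):
        # An object can always be interpreted as its own exact type.
        if type(self) is t:
            return self
        # Reuse the existing lossless-int protocol where it is available.
        if t is int and hasattr(self, "__index__"):
            return self.__index__()
        raise TypeError("'%s' object cannot be interpreted as %s"
                        % (type(self).__name__, t.__name__))


class Path(Interpretable):
    """Toy path type that wraps the textual form of a path."""

    def __init__(self, raw):
        self.raw = raw

    def __str__(self):
        return self.raw

    def __interpret_as__(self, t):
        # A path declares that it can losslessly stand in for a str.
        if t is str:
            return str(self)
        return super().__interpret_as__(t)


p = Path("/tmp/spam")
as_text = p.__interpret_as__(str)
```

Asking the same object for an int, p.__interpret_as__(int), raises TypeError
because the toy Path defines no __index__.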

Message-ID: <>

On Fri, Apr 8, 2016 at 1:26 AM, Paul Moore <p.f.moore at> wrote:
> On 7 April 2016 at 16:05, Émanuel Barry <vgr255 at> wrote:
>>> Hmmm -- so maybe you are saying that if the results of __str__ and
>>> __tostring__ are the same we're fine, just like if the results of
>>> __int__ and __index__ are the same we're fine?  Otherwise an error is
>>> raised.
>> Pretty much; as above, if __tostring__ (in absence of a better name) is
>> defined and does not raise, it should return the same as __str__.
> OK, that makes sense as a definition of "what __tostring__ does". But
> I'm struggling to think of an example of code that would legitimately
> want to *use* it (unlike __index__ where the obvious motivating case
> was indexing a sequence). And even with a motivating example of use, I
> don't know that I can think of an example of something other than a
> string that might *provide* the method.
> The problem with the "ASCII byte string" example is that if a type
> provides __tostring__ I'd expect it to work for all values of that
> type. I'm not clear if "ASCII byte string" is intended to mean "a
> value of type bytes that only contains ASCII characters", or "a type
> that subclasses bytes to refuse to allow non-ASCII". The former would
> imply __tostring__ working sometimes, but not always. The latter seems
> like a contrived example (although I'm open to someone explaining a
> real-world use case where it's important).

The original use-case was Path objects, which stringify as the
human-readable representation of the file system path. If you need to
pass a string-or-Path to something that requires a string, you need to
convert the Path to a string, but *not* convert other arbitrary
objects to strings; calling str(x) would happily convert the integer
1234 into the string "1234", and then you'd go trying to create that
file in the current directory. Python does not let you append
non-strings to strings unless you write __radd__ manually; this
proposal would allow objects to declare to the interpreter "hey, I'm
basically a string here", and allow Paths and strings to interoperate.
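
To make the str(x) pitfall concrete, here is a minimal sketch of what a
consumer of the proposed protocol might look like. The dunder name
__tostring__ is the placeholder used in this thread, and to_string and
FakePath are hypothetical names invented for the example.

```python
def to_string(obj):
    """Return obj as a str, but only if it is (or claims to be) string-like."""
    if isinstance(obj, str):
        return obj
    # Look the method up on the type, as the interpreter does for dunders.
    method = getattr(type(obj), "__tostring__", None)
    if method is None:
        raise TypeError(
            "%r object cannot be interpreted as a string" % type(obj).__name__
        )
    result = method(obj)
    if not isinstance(result, str):
        raise TypeError("__tostring__ returned non-str")
    return result


class FakePath:
    """Toy path type; not pathlib.Path."""

    def __init__(self, *parts):
        self.parts = parts

    def __str__(self):
        return "/".join(self.parts)

    # Opt in: the string form is a faithful representation of the object.
    __tostring__ = __str__


# str() accepts anything, which is exactly the problem described above.
accidental_name = str(1234)
path_text = to_string(FakePath("usr", "lib"))
```

Here to_string(1234) raises TypeError instead of quietly producing the file
name "1234".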


On Fri, Apr 8, 2016 at 1:26 AM, M.-A. Lemburg <mal at> wrote:
> On 07.04.2016 16:58, MRAB wrote:
>> On 2016-04-07 15:10, M.-A. Lemburg wrote:
>>> On 07.04.2016 15:04, Chris Angelico wrote:
>>>> This is a spin-off from the __fspath__ discussion on python-dev, in
>>>> which a few people said that a more general approach would be better.
>>>> Proposal: Objects should be allowed to declare that they are
>>>> "string-like" by creating a dunder method (analogously to __index__
>>>> for integers) which implies a loss-less conversion to str.
>>> I must be missing something... we already have a method for this:
>>> .__str__()
>> It's for making string-like objects into strings.
>> __str__ isn't suitable because, ints, for example, have a __str__
>> method, but they aren't string-like.
> Depends on what you define as "string-like", I guess :-)
> We have abstract base classes for such tests, but there's nothing
> which would define "string-like" as ABC. Before trying to define
> a test via a special method, I think it's better to define what
> exactly you mean by "string-like".
> Something along the lines of numbers.Number, but for strings.
> To make an object string-like, you then test for the ABC and
> then call .__str__() to get the string representation as string.

The trouble with the ABC is that it implies a large number of methods;
to declare that your object is string-like, you have to define a
boatload of interactions with other types, and then decide whether
str+Path yields a Path or a string. By defining __tostring__ (or
whatever it gets called), you effectively say "hey, go ahead and cast
me to str any time you need a str". Instead of defining all those
methods, you simply define one thing, and then you can basically just
treat it as a string thereafter.

While you could do this by simply creating a whole lot of methods and
making them all return strings, you'd still have the fundamental
problem that you can't be sure that this is "safe". How do you
distinguish between "object that can be treated as a string" and
"object that knows how to add itself to a string"? Hence, this.


From ethan at  Thu Apr  7 11:43:31 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 08:43:31 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/07/2016 08:37 AM, Chris Angelico wrote:
> On Fri, Apr 8, 2016 at 1:26 AM, Paul Moore wrote:

>> OK, that makes sense as a definition of "what __tostring__ does". But
>> I'm struggling to think of an example of code that would legitimately
>> want to *use* it (unlike __index__ where the obvious motivating case
>> was indexing a sequence). And even with a motivating example of use, I
>> don't know that I can think of an example of something other than a
>> string that might *provide* the method.
> The original use-case was Path objects, which stringify as the
> human-readable representation of the file system path. If you need to
> pass a string-or-Path to something that requires a string, you need to
> convert the Path to a string, but *not* convert other arbitrary
> objects to strings; calling str(x) would happily convert the integer
> 1234 into the string "1234", and then you'd go trying to create that
> file in the current directory. Python does not let you append
> non-strings to strings unless you write __radd__ manually; this
> proposal would allow objects to declare to the interpreter "hey, I'm
> basically a string here", and allow Paths and strings to interoperate.

The problem with that is that Paths are conceptually not strings, they 
just serialize to strings, and we have a lot of infrastructure in place 
to deal with that serialized form.

Is there anything, anywhere in the Python ecosystem, that would also 
benefit from something like an __as_str__ method/attribute?


From tritium-list at  Thu Apr  7 11:46:14 2016
From: tritium-list at (Alexander Walters)
Date: Thu, 7 Apr 2016 11:46:14 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 4/7/2016 11:43, Ethan Furman wrote:
> The problem with that is that Paths are conceptually not strings, they 
> just serialize to strings, and we have a lot of infrastructure in 
> place to deal with that serialized form. 

Does any OS expose access to paths as anything other than the serialized 
form?

From p.f.moore at  Thu Apr  7 11:47:57 2016
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Apr 2016 16:47:57 +0100
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 7 April 2016 at 16:37, Chris Angelico <rosuav at> wrote:
> On Fri, Apr 8, 2016 at 1:26 AM, Paul Moore <p.f.moore at> wrote:
>> On 7 April 2016 at 16:05, Émanuel Barry <vgr255 at> wrote:
>>>> Hmmm -- so maybe you are saying that if the results of __str__ and
>>>> __tostring__ are the same we're fine, just like if the results of
>>>> __int__ and __index__ are the same we're fine?  Otherwise an error is
>>>> raised.
>>> Pretty much; as above, if __tostring__ (in absence of a better name) is
>>> defined and does not raise, it should return the same as __str__.
>> OK, that makes sense as a definition of "what __tostring__ does". But
>> I'm struggling to think of an example of code that would legitimately
>> want to *use* it (unlike __index__ where the obvious motivating case
>> was indexing a sequence). And even with a motivating example of use, I
>> don't know that I can think of an example of something other than a
>> string that might *provide* the method.
>> The problem with the "ASCII byte string" example is that if a type
>> provides __tostring__ I'd expect it to work for all values of that
>> type. I'm not clear if "ASCII byte string" is intended to mean "a
>> value of type bytes that only contains ASCII characters", or "a type
>> that subclasses bytes to refuse to allow non-ASCII". The former would
>> imply __tostring__ working sometimes, but not always. The latter seems
>> like a contrived example (although I'm open to someone explaining a
>> real-world use case where it's important).
> The original use-case was Path objects, which stringify as the
> human-readable representation of the file system path. If you need to
> pass a string-or-Path to something that requires a string, you need to
> convert the Path to a string, but *not* convert other arbitrary
> objects to strings; calling str(x) would happily convert the integer
> 1234 into the string "1234", and then you'd go trying to create that
> file in the current directory. Python does not let you append
> non-strings to strings unless you write __radd__ manually; this
> proposal would allow objects to declare to the interpreter "hey, I'm
> basically a string here", and allow Paths and strings to interoperate.

But the proposal for paths is to have a *specific* method that says
"give me a string representing a filesystem path from this object". An
"interpret this object as a string" wouldn't be appropriate for the
cases where I'd want to do "give me a string representing a filesystem
path". And that's where I get stuck, as I can't think of an example
where I *would* want the more general option. For a number of reasons:

1. I can't think of a real-world example of when I'd *use* such a facility
2. I can't think of a real-world example of a type that might
*provide* such a facility
3. I can't see how something so general would be of benefit.

The __index__ and __fspath__ special methods have clear, focused
reasons for existing. The proposed __tostring__ protocol seems too
general to have a purpose.


From rosuav at  Thu Apr  7 11:49:15 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Apr 2016 01:49:15 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 8, 2016 at 1:43 AM, Ethan Furman <ethan at> wrote:
> The problem with that is that Paths are conceptually not strings, they just
> serialize to strings, and we have a lot of infrastructure in place to deal
> with that serialized form.
> Is there anything, anywhere in the Python ecosystem, that would also benefit
> from something like an __as_str__ method/attribute?

I'd like to see this used in 2/3 compatibility code. You can make an
Ascii class which subclasses bytes, but can be treated as
str-compatible in both versions. By restricting its contents to
[0,128) it can easily be "safe" in both byte and text contexts, and
it'll cover 99%+ of use cases. So if you try to getattr(x,
Ascii("foo")), it'll work fine in Py2 (because Ascii is a subclass of
str) and in Py3 (because Ascii.__tostring__ returns a valid string),
and there's a guarantee that it'll behave the same way (because the
string must be ASCII-only).
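
A sketch of what such a class could look like on Python 3. This is an
assumption-laden illustration: __tostring__ is still only the placeholder
name from this thread, so nothing in the interpreter (getattr included)
would actually consult it today.

```python
class Ascii(bytes):
    """Hypothetical bytes subclass whose contents are restricted to ASCII."""

    def __new__(cls, text):
        # Accept either text or raw bytes; str.encode raises
        # UnicodeEncodeError (a ValueError subclass) for non-ASCII text.
        data = text.encode("ascii") if isinstance(text, str) else bytes(text)
        # Iterating over bytes yields ints on Python 3.
        if any(byte >= 128 for byte in data):
            raise ValueError("Ascii objects may only contain bytes in [0, 128)")
        return super().__new__(cls, data)

    def __tostring__(self):
        # Decoding can never fail or lose data, by construction.
        return self.decode("ascii")


name = Ascii("foo")
as_text = name.__tostring__()
```

Because every value of the type passes the range check at construction time,
the conversion works for all instances rather than only for some of them.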


From contrebasse at  Thu Apr  7 12:00:20 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Thu, 7 Apr 2016 16:00:20 +0000 (UTC)
Subject: [Python-ideas] 
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

Ethan Furman <ethan at ...> writes:

> > --> ~False
> > -1
> >  --> ~True
> > -2
> >
> No.  bool is a subclass of int, and changing that now would be a serious 
> breach of backward-compatibility, not to mention breaking existing code 
> for no good reason.

I get that "technically" it would be a backward incompatible change, but any
code that relies on `~True == -2` has other problems than backward
compatibility.

From p.f.moore at  Thu Apr  7 12:00:40 2016
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Apr 2016 17:00:40 +0100
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 7 April 2016 at 16:46, Alexander Walters <tritium-list at> wrote:
> On 4/7/2016 11:43, Ethan Furman wrote:
>> The problem with that is that Paths are conceptually not strings, they
>> just serialize to strings, and we have a lot of infrastructure in place to
>> deal with that serialized form.
> Does any OS expose access to paths as anything other than the serialized
> form?

Does any OS have an object oriented API? At the OS level, you're lucky
to get any sort of structured access, so I don't think that "what the
OS provides" is the level to look at this at. Languages are what
provide abstractions. Lisp provides a path object, I believe. Perl
does (although it may well be a 3rd party CPAN library as with a lot
of things in Perl). A path abstraction isn't commonly provided, but
that doesn't mean it's not useful.

But this is of course offtopic for this thread.

From contrebasse at  Thu Apr  7 12:02:37 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Thu, 7 Apr 2016 16:02:37 +0000 (UTC)
Subject: [Python-ideas] 
References: <20160407094618.5c01847d@fsol>
Message-ID: <>

Steven D'Aprano <steve at ...> writes:

> It also breaks a fundamental property of most mathematical relations:
> if a == b, then f(a) == f(b) 
> (assuming f(a) and f(b) are defined for the type of both a and b).
> That is currently true for bools:
> py> (True == 1) and (~True == ~1)
> True
> py> (False==0) and (~False == ~0)
> True
> You want ~b to return `not b`:
> py> (True == 1) and (False == ~1)
> False
> py> (False==0) and (True == ~0)
> False

How about the str function ?
str(True) != str(1)
str(False) != str(0)

True == 1 is not a mathematical relation, it is python's history.

From p.f.moore at  Thu Apr  7 12:11:25 2016
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Apr 2016 17:11:25 +0100
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 7 April 2016 at 16:49, Chris Angelico <rosuav at> wrote:
> On Fri, Apr 8, 2016 at 1:43 AM, Ethan Furman <ethan at> wrote:
>> The problem with that is that Paths are conceptually not strings, they just
>> serialize to strings, and we have a lot of infrastructure in place to deal
>> with that serialized form.
>> Is there anything, anywhere in the Python ecosystem, that would also benefit
>> from something like an __as_str__ method/attribute?
> I'd like to see this used in 2/3 compatibility code. You can make an
> Ascii class which subclasses bytes, but can be treated as
> str-compatible in both versions. By restricting its contents to
> [0,128) it can easily be "safe" in both byte and text contexts, and
> it'll cover 99%+ of use cases. So if you try to getattr(x,
> Ascii("foo")), it'll work fine in Py2 (because Ascii is a subclass of
> str) and in Py3 (because Ascii.__tostring__ returns a valid string),
> and there's a guarantee that it'll behave the same way (because the
> string must be ASCII-only).

OK, that makes sense. But when you say "it'll work fine in Python 3"
how will that happen? What code needs to call __fromstring__ to make
this happen? You mention getattr. Would you expect every builtin and
stdlib function that takes a string to be modified to try
__fromstring__? That sounds like a pretty big performance hit, as
strings are very critical to the interpreter.

Even worse, what should open() do? It takes a string as an argument.
To support pathlib, it needs to also call __fspath__. I presume you'd
also want it to call __fromstring__ so that your Ascii class could be
used as an argument to open as well. This is starting to seem
incredibly messy to solve a problem that's basically about extending
support for Python 2, which is explicitly not something the Python 3
core should be doing...


From random832 at  Thu Apr  7 12:16:31 2016
From: random832 at (Random832)
Date: Thu, 07 Apr 2016 12:16:31 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016, at 11:01, Paul Moore wrote:
> >> For that matter, what constitutes a "lossless conversion to str"?
> >
> > What's __index__ for?
> I don't follow. It's for indexing, which requires an integer.

Sure, but why isn't int() good enough? For the same reason you only want
the kinds of objects that implement __index__ (and not, say, a float or
a string that happens to be numeric) for indexing, you only want the
kinds of objects that implement this method for certain purposes.

From random832 at  Thu Apr  7 12:23:42 2016
From: random832 at (Random832)
Date: Thu, 07 Apr 2016 12:23:42 -0400
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

On Thu, Apr 7, 2016, at 12:00, Joseph Martinot-Lagarde wrote:
> Ethan Furman <ethan at ...> writes:
> > > --> ~False
> > > -1
> > >  --> ~True
> > > -2
> > >
> > No.  bool is a subclass of int, and changing that now would be a serious 
> > breach of backward-compatibility, not to mention breaking existing code 
> > for no good reason.
> I get that "technically" it would be a backward incompatible change, but
> any
> code that relies on `~True == -2` has other problems than backward
> compatibility.

It's more a combination of other things that code may rely on:

True == 1

False == 0

False + 1, False + True # truthy

True + True == 2 # imagine implementing count_if_true() as sum(map(bool,

From ethan at  Thu Apr  7 12:25:52 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 09:25:52 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

On 04/07/2016 09:00 AM, Joseph Martinot-Lagarde wrote:
> Ethan Furman writes:
>>> --> ~False
>>> -1
>>>   --> ~True
>>> -2
>> No.  bool is a subclass of int, and changing that now would be a serious
>> breach of backward-compatibility, not to mention breaking existing code
>> for no good reason.
> I get that "technically" it would be a backward incompatible change, but any
> code that relies on `~True == -2` has other problems than backward
> compatibility.

Really?  Code that relies on correct behavior (~1 == -2) has problems? 
Well, it may, but that's not it.


From mal at  Thu Apr  7 12:27:30 2016
From: mal at (M.-A. Lemburg)
Date: Thu, 7 Apr 2016 18:27:30 +0200
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 07.04.2016 17:42, Chris Angelico wrote:
> On Fri, Apr 8, 2016 at 1:26 AM, M.-A. Lemburg <mal at> wrote:
>> On 07.04.2016 16:58, MRAB wrote:
>>> On 2016-04-07 15:10, M.-A. Lemburg wrote:
>>>> On 07.04.2016 15:04, Chris Angelico wrote:
>>>>> This is a spin-off from the __fspath__ discussion on python-dev, in
>>>>> which a few people said that a more general approach would be better.
>>>>> Proposal: Objects should be allowed to declare that they are
>>>>> "string-like" by creating a dunder method (analogously to __index__
>>>>> for integers) which implies a loss-less conversion to str.
>>>> I must be missing something... we already have a method for this:
>>>> .__str__()
>>> It's for making string-like objects into strings.
>>> __str__ isn't suitable because, ints, for example, have a __str__
>>> method, but they aren't string-like.
>> Depends on what you define as "string-like", I guess :-)
>> We have abstract base classes for such tests, but there's nothing
>> which would define "string-like" as ABC. Before trying to define
>> a test via a special method, I think it's better to define what
>> exactly you mean by "string-like".
>> Something along the lines of numbers.Number, but for strings.
>> To make an object string-like, you then test for the ABC and
>> then call .__str__() to get the string representation as string.
> The trouble with the ABC is that it implies a large number of methods;
> to declare that your object is string-like, you have to define a
> boatload of interactions with other types, and then decide whether
> str+Path yields a Path or a string.

Not necessarily. In fact, a string.String ABC could have only
a single method: .__str__() defined.

> By defining __tostring__ (or
> whatever it gets called), you effectively say "hey, go ahead and cast
> me to str any time you need a str". Instead of defining all those
> methods, you simply define one thing, and then you can basically just
> treat it as a string thereafter.

As I said: it's all a matter of defining what a "string-like"
object is supposed to mean. With the above definition of string.String,
you'd have exactly what you want.

> While you could do this by simply creating a whole lot of methods and
> making them all return strings, you'd still have the fundamental
> problem that you can't be sure that this is "safe". How do you
> distinguish between "object that can be treated as a string" and
> "object that knows how to add itself to a string"? Hence, this.

I suppose the .__add__() method of Path objects would implement
the necessary check isinstance(other_obj, string.String) to find
out whether it's safe to assume that other_obj.__str__() returns
the str version of the "string-like" object other_obj.

Marc-Andre Lemburg
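
A rough sketch of the isinstance() check described above. No
string.String ABC exists in the standard library; the String class,
the ToyPath class, and the choice to use explicit registration (rather
than an abstract __str__ method) are all assumptions made for
illustration.

```python
from abc import ABC


class String(ABC):
    """Hypothetical marker ABC: str(obj) is a faithful string form of obj."""


# Types opt in explicitly; nothing is inferred from merely having __str__.
String.register(str)


class ToyPath:
    """Hypothetical stand-in for a path type; not pathlib."""

    def __init__(self, text):
        self._text = text

    def __str__(self):
        return self._text

    def __add__(self, other):
        # Only concatenate with objects that declared themselves string-like.
        if not isinstance(other, String):
            return NotImplemented
        return ToyPath(self._text + str(other))


String.register(ToyPath)

joined = ToyPath("/tmp") + "/spam"        # str is registered, so this works
chained = joined + ToyPath(".txt")        # ToyPath is registered too
```

With this arrangement an int is rejected even though it has a __str__
method, because int was never registered.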


From random832 at  Thu Apr  7 12:31:23 2016
From: random832 at (Random832)
Date: Thu, 07 Apr 2016 12:31:23 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016, at 12:11, Paul Moore wrote:
> OK, that makes sense. But when you say "it'll work fine in Python 3"
> how will that happen? What code needs to call __fromstring__ to make
> this happen? You mention getattr. Would you expect every builtin and
> stdlib function that takes a string to be modified to try
> __fromstring__? That sounds like a pretty big performance hit, as
> strings are very critical to the interpreter.

Isn't it only a performance hit on something that's an exception now?
Like, if PyString_Check fails, then call it.

I wonder how much could be done in a "blanket" way without having to
change individual methods. Like, put the necessary machinery in
PyArg_ParseTuple. Or does that borrow a reference?


Taking a step back, can someone explain to me in plain English why io
and os shouldn't directly support pathlib? All this "well maybe make it
a subclass, well maybe make a special protocol stuff can implement"
stuff is dancing around that.

From ethan at  Thu Apr  7 12:32:42 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 09:32:42 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/07/2016 08:49 AM, Chris Angelico wrote:
> On Fri, Apr 8, 2016 at 1:43 AM, Ethan Furman wrote:

>> The problem with that is that Paths are conceptually not strings, they just
>> serialize to strings, and we have a lot of infrastructure in place to deal
>> with that serialized form.
>> Is there anything, anywhere in the Python ecosystem, that would also benefit
>> from something like an __as_str__ method/attribute?
> I'd like to see this used in 2/3 compatibility code. You can make an
> Ascii class which subclasses bytes, but can be treated as
> str-compatible in both versions. By restricting its contents to
> [0,128) it can easily be "safe" in both byte and text contexts, and
> it'll cover 99%+ of use cases. So if you try to getattr(x,
> Ascii("foo")), it'll work fine in Py2 (because Ascii is a subclass of
> str) and in Py3 (because Ascii.__tostring__ returns a valid string),
> and there's a guarantee that it'll behave the same way (because the
> string must be ASCII-only).

Make the Ascii class subclass bytes in 2.x and str in 3.x; the 2.x 
version's __str__ returns bytes, while its __unicode__ returns unicode; 
in 3.x __str__ returns str, and __unicode__ doesn't exist.

Problem solved, no other special methods needed.*  ;)


* crosses fingers hoping to not have missed something obvious ;)
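
One way the suggestion above might look, as an untested-on-2.x sketch.
The validation rule (code points below 128) comes from the earlier
message; the _Base alias and the exact method bodies are assumptions.

```python
import sys

# Pick the native "str" type of the running interpreter:
# bytes on Python 2, str (text) on Python 3.
_Base = str if sys.version_info[0] >= 3 else bytes


class Ascii(_Base):
    """Hypothetical ASCII-only string that is a real str on both versions."""

    def __new__(cls, value):
        self = super(Ascii, cls).__new__(cls, value)
        # Reject anything outside [0, 128) so bytes and text forms agree.
        if any(ord(ch) > 127 for ch in self):
            raise ValueError("Ascii accepts only code points below 128")
        return self

    if sys.version_info[0] < 3:
        def __unicode__(self):
            # Only defined on 2.x; decoding cannot fail after validation.
            return self.decode("ascii")


# Because Ascii is a genuine str subclass, getattr() accepts it as-is.
method_result = getattr("abc", Ascii("upper"))()
```

Since the object already is a str on each version, no new conversion
hook is involved.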

From guido at  Thu Apr  7 12:38:25 2016
From: guido at (Guido van Rossum)
Date: Thu, 7 Apr 2016 09:38:25 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

Honestly I think that the OP has a point, and I don't think we have to
bend over backwards to preserve int compatibility. After all str(True)
!= str(1), and surely there are other examples.

--Guido van Rossum

From ethan at  Thu Apr  7 12:43:28 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 09:43:28 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol>
Message-ID: <>

On 04/07/2016 09:02 AM, Joseph Martinot-Lagarde wrote:

> How about the str function ?
> str(True) != str(1)
> str(False) != str(0)

bools are a subclass of int, and the string representation of an int is 
not integral to its value:

class fraction(int):

fraction('4/2') == 2
str(fraction('4/2')) != str(2)
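
The class body is missing from the snippet above. A minimal runnable
sketch that behaves as described could look like this; the parsing of
"n/d" text and the restriction to whole-number ratios are assumptions,
not the original author's code.

```python
class fraction(int):
    """Toy int subclass whose str() differs from that of its int value."""

    def __new__(cls, text):
        numerator, denominator = (int(part) for part in text.split("/"))
        if numerator % denominator:
            raise ValueError("this toy class only handles whole-number ratios")
        self = super().__new__(cls, numerator // denominator)
        self._text = text      # remember the original spelling for str()
        return self

    def __str__(self):
        return self._text


value = fraction("4/2")
same_value = value == 2                  # compares as the int 2
same_text = str(value) == str(2)         # but prints as "4/2", not "2"
```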


From rosuav at  Thu Apr  7 12:44:17 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Apr 2016 02:44:17 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 8, 2016 at 2:11 AM, Paul Moore <p.f.moore at> wrote:
> Even worse, what should open() do? It takes a string as an argument.
> To support pathlib, it needs to also call __fspath__. I presume you'd
> also want it to call __fromstring__ so that your Ascii class could be
> used as an argument to open as well. This is starting to seem
> incredibly messy to solve a problem that's basically about extending
> support for Python 2, which is explicitly not something the Python 3
> core should be doing...

This would replace __fspath__. There'd be no need for a Path-specific
dunder if there's a generic "this can be treated as a string" dunder.


From rosuav at  Thu Apr  7 12:46:32 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Apr 2016 02:46:32 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 8, 2016 at 2:27 AM, M.-A. Lemburg <mal at> wrote:
> Not necessarily. In fact, a string.String ABC could have only
> a single method: .__str__() defined.

That wouldn't be very useful; object.__str__ exists and is functional,
so EVERY object would count as a string.

The point of "string-like" is that it can be treated as a string, not
just that it can be converted to one. This is exactly parallel to the
difference between __index__ and __int__; floats can be converted to
int (and will truncate), but cannot be *treated* as ints.
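
To illustrate the first point: an ABC that only asks "is there a
__str__ method?" matches everything. HasStr is a hypothetical name,
and using __subclasshook__ for the structural check is an assumption
about how such an ABC might be written.

```python
from abc import ABC


class HasStr(ABC):
    """Hypothetical ABC whose only requirement is a __str__ method."""

    @classmethod
    def __subclasshook__(cls, subclass):
        if cls is HasStr:
            # Look for __str__ anywhere in the class hierarchy.
            return any("__str__" in vars(base) for base in subclass.__mro__)
        return NotImplemented


samples = ["text", 42, 3.5, None, object()]
checks = [isinstance(value, HasStr) for value in samples]
```

Every entry in checks is True, because object itself defines __str__.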


From rosuav at  Thu Apr  7 13:02:30 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Apr 2016 03:02:30 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 8, 2016 at 2:31 AM, Random832 <random832 at> wrote:
> On Thu, Apr 7, 2016, at 12:11, Paul Moore wrote:
>> OK, that makes sense. But when you say "it'll work fine in Python 3"
>> how will that happen? What code needs to call __fromstring__ to make
>> this happen? You mention getattr. Would you expect every builtin and
>> stdlib function that takes a string to be modified to try
>> __fromstring__? That sounds like a pretty big performance hit, as
>> strings are very critical to the interpreter.
> Isn't it only a performance hit on something that's an exception now?
> Like, if PyString_Check fails, then call it.

I like this idea; it can be supported by a very small semantic change,
namely that this method is not guaranteed to be called on strings or
subclasses of strings. For cross-implementation compatibility, I would
require that str.__name_needed__() return self, and that well-behaved
subclasses not override this; that way, there's no semantic
difference. Consider this like the "x is y implies x == y"
optimization that crops up here and there; while it's legal to have
x.__eq__(x) return False (eg NaN), there are some places where it's
assumed to be found anyway (or, to be technically correct, where
something looks for "x is y or x == y" rather than simply "x == y").
So this would be "self if isinstance(self, str) else

Deep inside CPython, it would simply be failure-mode handling: instead
of directly raising TypeError, attempt a treat-as-string conversion,
which will raise TypeError if it's not possible. A cursory inspection
of the code suggests that the only place that needs to be changed is
PyUnicode_FromObject in unicodeobject.c. It currently checks if the
object is exactly an instance of PyUnicode (aka 'str'), then for a
subclass of same, then raises TypeError "Can't convert '%.100s' object
to str implicitly". Even the wording of that message leaves it open to
more types being implicitly convertible; the only change here is that
you can make your type convertible without actually subclassing str.
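
A Python-level sketch of that failure-mode handling, purely for
illustration. The real change would live in C; the dunder name
__tostring__ is only one of the names floated in this thread, and
treat_as_str and Label are hypothetical.

```python
def treat_as_str(obj):
    """Return obj as a str, consulting the hypothetical dunder only on failure."""
    # Fast path: real strings (and subclasses) never trigger the dunder.
    if isinstance(obj, str):
        return obj
    # Special methods are looked up on the type, not the instance.
    method = getattr(type(obj), "__tostring__", None)
    if method is None:
        raise TypeError(
            "Can't convert '%.100s' object to str implicitly"
            % type(obj).__name__
        )
    result = method(obj)
    if not isinstance(result, str):
        raise TypeError("__tostring__ returned non-str")
    return result


class Label:
    """Hypothetical non-str type that declares itself string-like."""

    def __init__(self, text):
        self._text = text

    def __tostring__(self):
        return self._text


plain = treat_as_str("spam")
converted = treat_as_str(Label("eggs"))
```

Objects without the dunder, such as an int, still get the TypeError
they get today.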



From desmoulinmichel at  Thu Apr  7 13:03:35 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Thu, 7 Apr 2016 19:03:35 +0200
Subject: [Python-ideas] Boundaries for unpacking
Message-ID: <>

Python is a lot about iteration, and I often have to get values from an
iterable. For this, unpacking is fantastic:

a, b, c = iterable

One problem that arises is that you don't know whether the iterable will
contain 3 items.

In that case, this beautiful code becomes:

iterator = iter(iterable)
a = next(iterator, "default value")
b = next(iterator, "default value")
c = next(iterator, "default value")

More often than not, I wish there were a way to specify a default value
for unpacking.

This would also come in handy to get the first or last item of an
iterable that can be empty. Indeed:

a, *b = iterable
*a, b = iterable

Would fail if the iterable's iterator cannot yield at least one item.

I don't have a clean syntax in mind, so I hope some people will get
inspired by it and will make suggestions.

I had a couple of ideas:

a, b, c with "default value" = iterable
a, b, c except "default value" = iterable
a, b, c | "default value" = iterable

But none of them are great.

Another problem is that you can't use unpacking on a generator yielding
1000000 values:

a, b = iterable # error
a, b, *c = iterable # consumes the whole thing, and loads it in memory

So if I want the first 2 items, I have to go back to:

iterator = iter(iterable)
a = next(iterator)
b = next(iterator)

Now, I made a suggestion to allow slicing on any iterable (or at least
on any iterator), but I'm not going to hold my breath on it:

a, b = iterable[:2] # allow that on generators


a, b = iter(iterable)[:2] # allow that

Some have suggested to add a new syntax:

a, b = iterable(:2)

But even just a way to specify that you want only the first 2
items without unpacking the whole iterable would work for me.

Ideally though, the 2 options should be able to be used together.
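
For reference, a sketch of what already works today without new
syntax: itertools.islice gives the bounded unpacking and the two-
argument form of next() gives the default. The numbers() generator is
a made-up example.

```python
from itertools import islice


def numbers():
    """Endless generator, standing in for one that yields many values."""
    n = 0
    while True:
        yield n
        n += 1


gen = numbers()
a, b = islice(gen, 2)      # takes exactly two items, no more
third = next(gen)          # the generator resumes where islice stopped

# First item of a possibly empty iterable, with a fallback value.
first = next(iter([]), "default value")
```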

From p.f.moore at  Thu Apr  7 13:16:56 2016
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Apr 2016 18:16:56 +0100
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 7 April 2016 at 17:16, Random832 <random832 at> wrote:
> On Thu, Apr 7, 2016, at 11:01, Paul Moore wrote:
>> >> For that matter, what constitutes a "lossless conversion to str"?
>> >
>> > What's __index__ for?
>> I don't follow. It's for indexing, which requires an integer.
> Sure, but why isn't int() good enough? For the same reason you only want
> the kinds of objects that implement __index__ (and not, say, a float or
> a string that happens to be numeric) for indexing, you only want the
> kinds of objects that implement this method for certain purposes.

You're making my point here. A "lossless conversion to str" isn't a
good enough definition of which things should implement the new
protocol. I'm trying to get someone to tell me what criteria I should
use to decide if my type should implement the new protocol. At the
moment, I have no idea.


From ethan at  Thu Apr  7 13:17:57 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 10:17:57 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

On 04/07/2016 09:38 AM, Guido van Rossum wrote:

> Honestly I think that the OP has a point, and I don't think we have to
> bend over backwards to preserve int compatibility. After all str(True)
> != str(1), and surely there are other examples.

I think the str() of a value, while possibly being the most interesting 
piece of information (IntEnum, anyone?), is hardly the most intrinsic.

If we do make this change, besides needing a couple major versions to 
make it happen, will anything else be different?

- no longer subclass int?
- add an "unknown" value?
   - how will indexing work?
   - or any of the other operations?
- don't bother with any of the other mathematical operations?
   - counting True's is not the same as adding True's

I'm not firmly opposed, I just don't see a major issue here -- I've 
needed an Unknown value far more often than I've needed ~True to be False.


From p.f.moore at  Thu Apr  7 13:18:54 2016
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Apr 2016 18:18:54 +0100
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 7 April 2016 at 17:44, Chris Angelico <rosuav at> wrote:
> On Fri, Apr 8, 2016 at 2:11 AM, Paul Moore <p.f.moore at> wrote:
>> Even worse, what should open() do? It takes a string as an argument.
>> To support pathlib, it needs to also call __fspath__. I presume you'd
>> also want it to call __fromstring__ so that your Ascii class could be
>> used as an argument to open as well. This is starting to seem
>> incredibly messy to solve a problem that's basically about extending
>> support for Python 2, which is explicitly not something the Python 3
>> core should be doing...
> This would replace __fspath__. There'd be no need for a Path-specific
> dunder if there's a generic "this can be treated as a string" dunder.

So the only things that should implement the new protocol would be
paths. Otherwise, they could be passed to things that expect a path.

Once again, I'm confused.

Can someone please explain to me how to decide whether my type should
provide the new protocol. And whether my code should check the new
protocol. At the moment, I can't answer those questions with the
information given in this thread.


From ian.g.kelly at  Thu Apr  7 13:19:10 2016
From: ian.g.kelly at (Ian Kelly)
Date: Thu, 7 Apr 2016 11:19:10 -0600
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

On Thu, Apr 7, 2016 at 10:38 AM, Guido van Rossum <guido at> wrote:
> Honestly I think that the OP has a point, and I don't think we have to
> bend over backwards to preserve int compatibility. After all str(True)
> != str(1), and surely there are other examples.

I can see it going either way: if we treat the domain of bool as that
of the integers, then ~True == ~1 == -2.

If on the other hand we treat it as the integers modulo 2, then it
makes sense that ~True == ~1 == 0. But this would also imply that True
+ True == False, which would definitely break existing code.

I note that if you add an explicit modulo division by 2, then it works out:

py> ~True % 2
0
py> ~False % 2
1

The salient point to me is that there's no strong justification for
making the change. As has been pointed out elsewhere in the thread, if
you want binary not, just use not.

From guido at  Thu Apr  7 13:23:59 2016
From: guido at (Guido van Rossum)
Date: Thu, 7 Apr 2016 10:23:59 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

Nothing else is on the table. Seriously. Stop hijacking the thread.

--Guido (mobile)
On Apr 7, 2016 10:18 AM, "Ethan Furman" <ethan at> wrote:

> On 04/07/2016 09:38 AM, Guido van Rossum wrote:
>> Honestly I think that the OP has a point, and I don't think we have to
>> bend over backwards to preserve int compatibility. After all str(True)
>> != str(1), and surely there are other examples.
> I think the str() of a value, while possibly being the most interesting
> piece of information (IntEnum, anyone?), is hardly the most intrinsic.
> If we do make this change, besides needing a couple major versions to make
> it happen, will anything else be different?
> - no longer subclass int?
> - add an "unknown" value?
>   - how will indexing work?
>   - or any of the other operations?
> - don't bother with any of the other mathematical operations?
>   - counting True's is not the same as adding True's
> I'm not firmly opposed, I just don't see a major issue here -- I've needed
> an Unknown value far more often than I've needed ~True to be False.
> --
> ~Ethan~

From ethan at  Thu Apr  7 13:30:43 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 10:30:43 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/07/2016 09:31 AM, Random832 wrote:

> Taking a step back, can someone explain to me in plain English why io
> and os shouldn't directly support pathlib? All this "well maybe make it
> a subclass, well maybe make a special protocol stuff can implement"
> stuff is dancing around that.

The protocol is "the how" of supporting pathlib, and, interestingly 
enough, the easiest way to do so (avoids circular imports, etc., etc.).


From brett at  Thu Apr  7 13:31:19 2016
From: brett at (Brett Cannon)
Date: Thu, 07 Apr 2016 17:31:19 +0000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, 7 Apr 2016 at 08:58 Chris Angelico <rosuav at> wrote:

> On Fri, Apr 8, 2016 at 1:43 AM, Ethan Furman <ethan at> wrote:
> > The problem with that is that Paths are conceptually not strings, they
> > just serialize to strings, and we have a lot of infrastructure in place
> > to deal with that serialized form.
> >
> > Is there anything, anywhere in the Python ecosystem, that would also
> > benefit from something like an __as_str__ method/attribute?
> I'd like to see this used in 2/3 compatibility code. You can make an
> Ascii class which subclasses bytes, but can be treated as
> str-compatible in both versions. By restricting its contents to
> [0,128) it can easily be "safe" in both byte and text contexts, and
> it'll cover 99%+ of use cases. So if you try to getattr(x,
> Ascii("foo")), it'll work fine in Py2 (because Ascii is a subclass of
> str) and in Py3 (because Ascii.__tostring__ returns a valid string),
> and there's a guarantee that it'll behave the same way (because the
> string must be ASCII-only).

But couldn't you also just define a str subclass that checks its
argument(s) are only valid ASCII values? What you're proposing is to
potentially change all places that operate with a string to now check for a
special method which would be a costly change potentially to performance as
well as propagating this concept everywhere a string-like object is

I think it's important to realize that the main reason we are considering
this special method concept is to make it easier to introduce in
third-party code which doesn't have pathlib or for people who don't want to
import pathlib just to convert a pathlib.PurePath object to a string.
Otherwise we would simply have pathlib.fspath() or something that checked
if its argument was a subclass of pathlib.PurePath and if so call str() on
it, else double-check that argument was an instance of str and keep it
simple. But instead we are trying to do the practical thing and come up
with a common method name that people can be sure will exist on
pathlib.PurePath and any other third-party path library. I think trying to
generalize to "string-like but not __str__()" is a premature optimization
that has not exposed enough use-cases to warrant such a deep change to the
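
The "keep it simple" helper described above could be sketched roughly
as follows. The function name fspath and the error message are
illustrative assumptions; this is not an existing pathlib function.

```python
import pathlib


def fspath(path):
    """Return path as a str if it is a PurePath or already a str."""
    if isinstance(path, pathlib.PurePath):
        return str(path)
    if isinstance(path, str):
        return path
    raise TypeError(
        "expected str or pathlib.PurePath, got %s" % type(path).__name__
    )


from_path = fspath(pathlib.PurePosixPath("/tmp/spam.txt"))
from_str = fspath("eggs.txt")
```

The drawback noted above is visible in the first line: a caller has to
import pathlib for this to work, which a shared method name avoids.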

From mike at  Thu Apr  7 13:39:56 2016
From: mike at (Michael Selik)
Date: Thu, 7 Apr 2016 18:39:56 +0100
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

> On Apr 7, 2016, at 6:23 PM, Guido van Rossum <guido at> wrote:
> Nothing else is on the table. Seriously. Stop hijacking the thread.

To clarify, the proposal is: ``~True == False`` but every other operation on ``True`` remains the same, including ``True * 42 == 42``. Correct?

From brett at  Thu Apr  7 13:40:23 2016
From: brett at (Brett Cannon)
Date: Thu, 07 Apr 2016 17:40:23 +0000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, 7 Apr 2016 at 10:19 Paul Moore <p.f.moore at> wrote:

> On 7 April 2016 at 17:44, Chris Angelico <rosuav at> wrote:
> > On Fri, Apr 8, 2016 at 2:11 AM, Paul Moore <p.f.moore at> wrote:
> >> Even worse, what should open() do? It takes a string as an argument.
> >> To support pathlib, it needs to also call __fspath__. I presume you'd
> >> also want it to call __fromstring__ so that your Ascii class could be
> >> used as an argument to open as well. This is starting to seem
> >> incredibly messy to solve a problem that's basically about extending
> >> support for Python 2, which is explicitly not something the Python 3
> >> core should be doing...
> >
> > This would replace __fspath__. There'd be no need for a Path-specific
> > dunder if there's a generic "this can be treated as a string" dunder.
> So the only things that should implement the new protocol would be
> paths. Otherwise, they could be passed to things that expect a path.
> Once again, I'm confused.
> Can someone please explain to me how to decide whether my type should
> provide the new protocol. And whether my code should check the new
> protocol. At the moment, I can't answer those questions with the
> information given in this thread.

I've reached the same conclusion/point myself. The attempt to come up with
a pure solution is wreaking havoc with the practicality side of my brain
that doesn't understand the rules it's supposed to follow in order to make
this work.

From toddrjen at  Thu Apr  7 13:44:50 2016
From: toddrjen at (Todd)
Date: Thu, 7 Apr 2016 13:44:50 -0400
Subject: [Python-ideas] Boundaries for unpacking
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016 at 1:03 PM, Michel Desmoulin <desmoulinmichel at> wrote:

> Python is a lot about iteration, and I often have to get values from an
> iterable. For this, unpacking is fantastic:
> a, b, c = iterable
> One problem that arises is that you don't know whether the iterable will
> contain 3 items.
> In that case, this beautiful code becomes:
> iterator = iter(iterable)
> a = next(iterator, "default value")
> b = next(iterator, "default value")
> c = next(iterator, "default value")

I think rather than adding a new syntax, it would be better to just make
one of these work:

a, b, c, *d = itertools.chain(iterable, itertools.repeat('default value'))

a, b, c = itertools.chain(iterable, itertools.repeat('default value'))

a, b, c = itertools.chain.from_iterable(iterable, default='default value')

From mike at  Thu Apr  7 13:53:40 2016
From: mike at (Michael Selik)
Date: Thu, 7 Apr 2016 18:53:40 +0100
Subject: [Python-ideas] Dictionary views are not entirely 'set like'
In-Reply-To: <>
References: <>
Message-ID: <>

> On Apr 6, 2016, at 10:38 AM, Joshua Morton <joshua.morton13 at> wrote:
>     set() | []  # 2
>     {}.keys() | []  # 3

Looks like this should be standardized. Either both raise TypeError, or both return a set. My preference would be TypeError, but that might be worse for backwards-compatibility.

>     {}.keys().union(set())  # 6

Seems to me that the pipe operator is staying on MappingView, so it's reasonable to add a corresponding ``.union`` to mimic sets. And intersection, etc.

>     {}.values() == {}.values()  # 9
>     d = {}; d.values() == d.values()  # 10

It's weird, but float('nan') != float('nan'). I'm not particularly bothered by this.
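
The inconsistencies being discussed can be reproduced with a short
sketch like this one; try_union is a hypothetical helper that just
captures the TypeError instead of letting it propagate.

```python
def try_union(left, right):
    """Return left | right, or the string 'TypeError' if that is refused."""
    try:
        return left | right
    except TypeError:
        return "TypeError"


set_result = try_union(set(), [])         # a set refuses a list operand
keys_result = try_union({}.keys(), [])    # a keys view accepts any iterable

d = {}
values_equal = d.values() == d.values()   # two views of the same dict
nan_equal = float("nan") == float("nan")  # the comparison mentioned above
```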

From ethan at  Thu Apr  7 13:56:04 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 10:56:04 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol>	<>	<>	<>	<>	<>
Message-ID: <>

On 04/07/2016 10:23 AM, Guido van Rossum wrote:

> Nothing else is on the table.

Okay.


> Seriously.

Whew, I thought you were joking.

> Stop hijacking the thread.

Bite me.

> --Guido (mobile)

--Ethan (pissed)

From guido at  Thu Apr  7 14:08:38 2016
From: guido at (Guido van Rossum)
Date: Thu, 7 Apr 2016 11:08:38 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

On Thu, Apr 7, 2016 at 10:39 AM, Michael Selik <mike at> wrote:
>> On Apr 7, 2016, at 6:23 PM, Guido van Rossum <guido at> wrote:
>> Nothing else is on the table. Seriously. Stop hijacking the thread.
> To clarify, the proposal is: ``~True == False`` but every other operation on ``True`` remains the same, including ``True * 42 == 42``. Correct?

Yes. To be more precise, there are some "arithmetic" operations (+, -,
*, /, **) and they all treat bools as ints and always return ints;
there are also some "bitwise" operations (&, |, ^, ~) and they should
all treat bools as bools and return a bool. Currently the only
exception to this idea is that ~ returns an int, so the proposal is to
fix that. (There are also some "boolean" operations (and, or, not) and
they are also unchanged.)
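
A small sketch of the behaviour described above as it stood when this
was written: arithmetic on bools yields ints, the binary bitwise
operators yield bools, and ~ is the odd one out. operator.invert is
used instead of a literal ~True, and DeprecationWarning is silenced,
on the assumption that newer interpreters may warn about this
operation.

```python
import operator
import warnings

with warnings.catch_warnings():
    # Newer Python versions may emit a DeprecationWarning for ~ on a bool.
    warnings.simplefilter("ignore", DeprecationWarning)
    inverted = operator.invert(True)      # same operation as ~True

results = {
    "True + True": True + True,
    "True * 42": True * 42,
    "True & False": True & False,
    "True | False": True | False,
    "True ^ True": True ^ True,
    "~True": inverted,
}

# Record the result type of each expression: 'int' or 'bool'.
kinds = {expr: type(value).__name__ for expr, value in results.items()}
```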

--Guido van Rossum

From brett at  Thu Apr  7 14:10:32 2016
From: brett at (Brett Cannon)
Date: Thu, 07 Apr 2016 18:10:32 +0000
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

On Thu, 7 Apr 2016 at 10:55 Ethan Furman <ethan at> wrote:

> On 04/07/2016 10:23 AM, Guido van Rossum wrote:
> > Nothing else is on the table.
> Okay.
> > Seriously.
> Whew, I thought you were joking.
> > Stop hijacking the thread.
> Bite me.
> > --Guido (mobile)
> --Ethan (pissed)

OK, that's enough. You already snapped at Antoine over his negative dig at
python-ideas -- which I don't condone either -- and now this. Please
consider stepping away from the keyboard until you've calmed down.

From desmoulinmichel at  Thu Apr  7 14:13:16 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Thu, 7 Apr 2016 20:13:16 +0200
Subject: [Python-ideas] Boundaries for unpacking
In-Reply-To: <>
References: <>
Message-ID: <>

Ignoring the fact that it's inelegant and verbose (not even counting the
import), the biggest problem is that there is no way I will ever
remember it. Which means every time I want to do it (probably twice a
week), I will have to look it up in the docs, just like I do with
itertools.groupby or the recipe to iterate over a window of values.

On 07/04/2016 19:44, Todd wrote:
> a, b, c, *d = itertools.chain(iterable, itertools.repeat('default value'))

From mike at  Thu Apr  7 14:15:26 2016
From: mike at (Michael Selik)
Date: Thu, 7 Apr 2016 19:15:26 +0100
Subject: [Python-ideas] Boundaries for unpacking
In-Reply-To: <>
References: <>
Message-ID: <>

> On Apr 7, 2016, at 6:44 PM, Todd <toddrjen at> wrote:
> On Thu, Apr 7, 2016 at 1:03 PM, Michel Desmoulin <desmoulinmichel at> wrote:
> Python is a lot about iteration, and I often have to get values from an
> iterable. For this, unpacking is fantastic:
> a, b, c = iterable
> One problem that arises is that you don't know whether the iterable will
> contain 3 items.
> In that case, this beautiful code becomes:
> iterator = iter(iterable)
> a = next(iterator, "default value")
> b = next(iterator, "default value")
> c = next(iterator, "default value")

I actually don't think that's so ugly. Looks fairly clear to me. What about these alternatives?

>>> from itertools import repeat, islice, chain
>>> it = range(2)
>>> a, b, c = islice(chain(it, repeat('default')), 3)
>>> a, b, c
(0, 1, 'default')

If you want to stick with builtins:

>>> it = iter(range(2))
>>> a, b, c = [next(it, 'default') for i in range(3)]
>>> a, b, c
(0, 1, 'default')

If your iterable is sliceable and sizeable:

>>> it = list(range(2))
>>> a, b, c = it[:3] + ['default'] * (3 - len(it))
>>> a, b, c
(0, 1, 'default')

Perhaps add a recipe to itertools, or change the "take" recipe?

def unpack(n, iterable, default=None):
    "Slice the first n items of the iterable, padding with a default"
    padding = repeat(default)
    return islice(chain(iterable, padding), n)
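
A minimal, self-contained sketch of how such a recipe might be used. The name `unpack` and its signature follow the recipe above; the sample inputs are made up for illustration. Since it relies only on iteration, it should also accept generators:

```python
from itertools import chain, islice, repeat

def unpack(n, iterable, default=None):
    "Slice the first n items of the iterable, padding with a default"
    padding = repeat(default)
    return islice(chain(iterable, padding), n)

# Shorter than n: the missing slots are filled with the default.
a, b, c = unpack(3, range(2), default='default')

# Longer than n: the extra items are simply never consumed.
x, y, z = unpack(3, (i * i for i in range(10)))
```

Because the result always has exactly n items, the unpacking on the left-hand side cannot fail with a length mismatch.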

From desmoulinmichel at  Thu Apr  7 14:24:23 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Thu, 7 Apr 2016 20:24:23 +0200
Subject: [Python-ideas] Boundaries for unpacking
In-Reply-To: <>
References: <>
Message-ID: <>

On 07/04/2016 20:15, Michael Selik wrote:
>> On Apr 7, 2016, at 6:44 PM, Todd <toddrjen at> wrote:
>> On Thu, Apr 7, 2016 at 1:03 PM, Michel Desmoulin <desmoulinmichel at> wrote:
>> Python is a lot about iteration, and I often have to get values from an
>> iterable. For this, unpacking is fantastic:
>> a, b, c = iterable
>> One problem that arises is that you don't know whether iterable will
>> contain 3 items.
>> In that case, this beautiful code becomes:
>> iterator = iter(iterable)
>> a = next(iterator, "default value")
>> b = next(iterator, "default value")
>> c = next(iterator, "default value")
> I actually don't think that's so ugly. Looks fairly clear to me. What about these alternatives?

Well, there is nothing wrong with:

a = mylist[0]
b = mylist[1]
c = mylist[2]

But we still prefer unpacking. And this is way more verbose.

> >>> from itertools import repeat, islice, chain
> >>> it = range(2)
> >>> a, b, c = islice(chain(it, repeat('default')), 3)
> >>> a, b, c
> (0, 1, 'default')
> If you want to stick with builtins:
> >>> it = iter(range(2))
> >>> a, b, c = [next(it, 'default') for i in range(3)]
> >>> a, b, c
> (0, 1, 'default')

They all work, but they are impossible to remember, plus you will need a
comment every time you use them outside of the shell.

> If your iterable is sliceable and sizeable:
> >>> it = list(range(2))
> >>> a, b, c = it[:3] + ['default'] * (3 - len(it))
> >>> a, b, c
> (0, 1, 'default')

Same problem, and as you said, you can forget about generators.

> Perhaps add a recipe to itertools, or change the "take" recipe?
> def unpack(n, iterable, default=None):
>     "Slice the first n items of the iterable, padding with a default"
>     padding = repeat(default)
>     return islice(chain(iterable, padding), n)

Not a bad idea. Built in would be better, but I can live with itertools.
I import it so often I'm wondering if itertools shouldn't be in
__builtins__ :)


From rosuav at  Thu Apr  7 14:28:35 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Apr 2016 04:28:35 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 8, 2016 at 3:31 AM, Brett Cannon <brett at> wrote:
> But couldn't you also just define a str subclass that checks its argument(s)
> are only valid ASCII values? What you're proposing is to potentially change
> all places that operate with a string to now check for a special method
> which would be a costly change potentially to performance as well as
> propagating this concept everywhere a string-like object is expected.

Fair enough. That was a spur-of-the-moment thought. To be honest, this
proposal is a massive generalization from, ultimately, a single
use-case.

> I think it's important to realize that the main reason we are considering
> this special method concept is to make it easier to introduce in third-party
> code which doesn't have pathlib or for people who don't want to import
> pathlib just to convert a pathlib.PurePath object to a string.

And this should make it easier for third-party code to be functional
without even being aware of the Path object. There are two basic
things that code will be doing with paths: passing them unchanged to
standard library functions (eg open()), and combining them with
strings. The first will work by definition; the second will if paths
can implicitly upcast to strings.

In contrast, a function or method to convert a path to a string
requires conscious effort on the part of any function that needs to do
such manipulation, which correspondingly means version compatibility
checks. Libraries will need to release a new version that's
path-compatible, even though they don't actually gain any
functionality.


From mal at  Thu Apr  7 14:42:46 2016
From: mal at (M.-A. Lemburg)
Date: Thu, 7 Apr 2016 20:42:46 +0200
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 07.04.2016 18:46, Chris Angelico wrote:
> On Fri, Apr 8, 2016 at 2:27 AM, M.-A. Lemburg <mal at> wrote:
>> Not necessarily. In fact, a string.String ABC could have only
>> a single method: .__str__() defined.
> That wouldn't be very useful; object.__str__ exists and is functional,
> so EVERY object would count as a string.

No, only those objects that register with the ABC would be considered
string-like, not all objects implementing the .__str__() method.
In regular Python, only str() objects would register with
strings.String.

Path objects could also register to be treated as "string-like"
objects.

Just like numeric types are only considered part of the ABCs
under numbers, if they register with these.

Perhaps the name strings.String sounds confusing, so
perhaps strings.StringLike or strings.FineToConvertToAStringIfNeeded
would be better :-)

> The point of "string-like" is that it can be treated as a string, not
> just that it can be converted to one. This is exactly parallel to the
> difference between __index__ and __int__; floats can be converted to
> int (and will truncate), but cannot be *treated* as ints.

Right, and that's what you can define via an ABC. Those abstract
base classes are not to be confused with introspecting methods
on objects - they help optimize this kind of test, but most
importantly help express the combination of providing an interface
by exposing methods with the semantics of how those
methods are expected to be used.

Marc-Andre Lemburg


From random832 at  Thu Apr  7 14:43:40 2016
From: random832 at (Random832)
Date: Thu, 07 Apr 2016 14:43:40 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016, at 13:30, Ethan Furman wrote:
> The protocol is "the how" of supporting pathlib, and, interestingly 
> enough, the easiest way to do so (avoids circular imports, etc., etc.).

If the problem is importing pathlib, what about a "pathlib lite" that
can check if an object is a path without importing pathlib? This could
be a recipe, a separate module, or part of os.

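import sys

def is_Path(x):
    # The archived line is cut off after "isinstance(x,"; the class
    # checked here is an assumed completion.
    return 'pathlib' in sys.modules and isinstance(
        x, sys.modules['pathlib'].PurePath)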

def path_str(x):
    if isinstance(x, str): return x
    if is_Path(x): return x.path
    raise TypeError

# convenience methods for modules that want to support returning a Path
# but don't import it
def to_Path(x):
    import pathlib
    return pathlib.Path(x)

# most common case, return a path iff an input argument is a path.
def to_Path_maybe(value, *args):
    if any(is_Path(arg) for arg in args):
        return to_Path(value)
    return value

All this other stuff seems to have an ambition of making these things
fully general, such that some other library can be dropped in instead of
pathlib without subclassing either str or Path, and I'm not sure what
the use case for that is. Pathlib is the battery that's included.

From chris.barker at  Thu Apr  7 14:45:36 2016
From: chris.barker at (Chris Barker)
Date: Thu, 7 Apr 2016 11:45:36 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016 at 11:28 AM, Chris Angelico <rosuav at> wrote:

> Fair enough. That was a spur-of-the-moment thought. To be honest, this
> proposal is a massive generalization from, ultimately, a single
> use-case.

yes, but now that you say:


> And this should make it easier for third-party code to be functional
> without even being aware of the Path object. There are two basic
> things that code will be doing with paths: passing them unchanged to
> standard library functions (eg open()), and combining them with
> strings. The first will work by definition; the second will if paths
> can implicitly upcast to strings.

so this is really a way to make the whole path!=string thing easier -- we
don't want to simply call str() on anything that might be a path, because
that will work on anything, whether it's the least bit pathlike at all, but
this would introduce a new kind of __str__, so that only things for which
it makes sense would "Just work":

str + something

Would only concatenate to a new string if something was an object that
could "losslessly" be considered a string?

but if we use the __index__ example, that is specifically "an integer that
can be used as an index", not "something that can be losslessly converted to
an integer".

so why not the __fspath__ protocol anyway?

Unless there are all sorts of other use cases you have in mind?



Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at

From tjreedy at  Thu Apr  7 14:48:55 2016
From: tjreedy at (Terry Reedy)
Date: Thu, 7 Apr 2016 14:48:55 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <ne6a2v$fg$>

On 4/7/2016 12:16 PM, Random832 wrote:
> On Thu, Apr 7, 2016, at 11:01, Paul Moore wrote:
>> Someone (lost in quotes)
>>> What's __index__ for?
>> I don't follow. It's for indexing, which requires an integer.
> Sure, but why isn't int() good enough?

Good question. The reason to add __index__ was to allow indexing with 
some things other than ints while not allowing just anything that can be 
converted to int (with or without loss).  The reason for the restriction 
is to prevent subtle bugs.  Expanding the domain of indexing with 
__index__ instead of int() was a judgment call, not a logical necessity.

> For the same reason you only want
> the kinds of objects that implement __index__ (and not, say, a float or
> a string that happens to be numeric) for indexing, you only want the
> kinds of objects that implement this method for certain purposes.

I understand this, but I am going to challenge the analogy.  An index 
really is a int -- a count of items of the sequence from either end 
(where the right end can be thought of as an invisible End_ofSequence 
item).  A path, on the other hand, is not really a string.  A path is a 
sequence of nodes in the filesystem graph.  Converting structured data, 
in this case paths, to strings, is just a least-common-denominator way 
of communicating structured data between different languages.

To me, the default proposal to expand the domain of open and other path 
functions is to call str on the path arg, either always or as needed. 
We should then ask "why isn't str() good enough"?  Most bad args for 
open will immediately result in a file-not-found exception.  But the 
os.path functions would not.  Do the possible bugs and violation of 
python philosophy outweigh the simplicity of the proposal?

Terry Jan Reedy

From brett at  Thu Apr  7 14:51:31 2016
From: brett at (Brett Cannon)
Date: Thu, 07 Apr 2016 18:51:31 +0000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, 7 Apr 2016 at 11:29 Chris Angelico <rosuav at> wrote:

> On Fri, Apr 8, 2016 at 3:31 AM, Brett Cannon <brett at> wrote:
> > But couldn't you also just define a str subclass that checks its
> > argument(s) are only valid ASCII values? What you're proposing is to
> > potentially change all places that operate with a string to now check
> > for a special method which would be a costly change potentially to
> > performance as well as propagating this concept everywhere a
> > string-like object is expected.
> Fair enough. That was a spur-of-the-moment thought. To be honest, this
> proposal is a massive generalization from, ultimately, a single
> use-case.
> > I think it's important to realize that the main reason we are considering
> > this special method concept is to make it easier to introduce in
> > third-party code which doesn't have pathlib or for people who don't want
> > to import pathlib just to convert a pathlib.PurePath object to a string.
> And this should make it easier for third-party code to be functional
> without even being aware of the Path object. There are two basic
> things that code will be doing with paths: passing them unchanged to
> standard library functions (eg open()), and combining them with
> strings. The first will work by definition; the second will if paths
> can implicitly upcast to strings.

> In contrast, a function or method to convert a path to a string
> requires conscious effort on the part of any function that needs to do
> such manipulation, which correspondingly means version compatibility
> checks.

> Libraries will need to release a new version that's
> path-compatible, even though they don't actually gain any
> functionality.

But they will with yours as well. At best you could get this into Python
3.6 and then propagate this implicit string-like conversion functionality
throughout the language and stdlib, but any implicitness won't be
backported either as it's implicit. So while you're saying you don't like
the explicitness required to backport this, your solution doesn't help with
that either as you will still need to do the exact same method lookup for
any code that wants to work pre-3.6. And if using path objects takes off
and new APIs come up that don't take a string for paths, then the explicit
conversion can be left out -- and perhaps removed in converted APIs --
while your implicit conversion will forever be baked into Python itself.

From rosuav at  Thu Apr  7 14:53:34 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Apr 2016 04:53:34 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 8, 2016 at 4:42 AM, M.-A. Lemburg <mal at> wrote:
> On 07.04.2016 18:46, Chris Angelico wrote:
>> On Fri, Apr 8, 2016 at 2:27 AM, M.-A. Lemburg <mal at> wrote:
>>> Not necessarily. In fact, a string.String ABC could have only
>>> a single method: .__str__() defined.
>> That wouldn't be very useful; object.__str__ exists and is functional,
>> so EVERY object would count as a string.
> No, only those objects that register with the ABC would be considered
> string-like, not all objects implementing the .__str__() method.
> In regular Python, only str() objects would register with
> strings.String.

Oh, gotcha.

In that case, it would still be pretty much the same as the __fspath__
proposal, except that instead of creating a dunder method (to
implement a protocol), you register with an ABC.
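
A minimal sketch of the distinction being drawn here, assuming a hypothetical `StringLike` ABC (no such class exists in the standard library, and `as_string` is an illustrative helper). Every object has `__str__`, so checking for the method tells you nothing, whereas an ABC without a `__subclasshook__` only recognises classes that were explicitly registered:

```python
import abc
import pathlib

class StringLike(abc.ABC):
    """Hypothetical marker ABC; not part of the standard library."""

    @abc.abstractmethod
    def __str__(self):
        """Return the string representation of something."""

# str and PurePath opt in explicitly; nothing else is affected.
StringLike.register(str)
StringLike.register(pathlib.PurePath)

def as_string(obj):
    """Hypothetical helper: accept only registered string-like objects."""
    if isinstance(obj, StringLike):
        return str(obj)
    raise TypeError(f"{type(obj).__name__} is not string-like")

# hasattr() cannot tell the two apart, isinstance() against the ABC can.
every_object_has_str = hasattr(3.4, '__str__')
float_is_string_like = isinstance(3.4, StringLike)
path_is_string_like = isinstance(pathlib.PurePosixPath('a/b'), StringLike)
```

Registration is inherited by subclasses of a registered class, which is why a `PurePosixPath` instance passes the check even though only `PurePath` was registered.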


From ethan at  Thu Apr  7 14:59:41 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 11:59:41 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/07/2016 11:45 AM, Chris Barker wrote:
> On Thu, Apr 7, 2016 at 11:28 AM, Chris Angelico wrote:

>> And this should make it easier for third-party code to be functional
>> without even being aware of the Path object. There are two basic
>> things that code will be doing with paths: passing them unchanged to
>> standard library functions (eg open()), and combining them with
>> strings. The first will work by definition; the second will if paths
>> can implicitly upcast to strings.
> so this is really a way to make the whole path!=string thing easier --
> we don't want to simply call str() on anything that might be a path,
> because that will work on anything, whether it's the least bit pathlike
> at all, but this would introduce a new kind of __str__, so that only
> things for which it makes sense would "Just work":
> str + something
> Would only concatenate to a new string if something was an object that
> could "losslessly" be considered a string?

Which is mildly attractive.

> but if we use the __index__ example, that is specifically "an integer
> that can be used as an index", not "something that can be losslessly
> converted to an integer".

__index__ was originally created to support indexing, but has morphed 
over time to mean "something that can be losslessly converted to an 
integer".


From brett at  Thu Apr  7 14:59:20 2016
From: brett at (Brett Cannon)
Date: Thu, 07 Apr 2016 18:59:20 +0000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Thu, 7 Apr 2016 at 11:43 M.-A. Lemburg <mal at> wrote:

> On 07.04.2016 18:46, Chris Angelico wrote:
> > On Fri, Apr 8, 2016 at 2:27 AM, M.-A. Lemburg <mal at> wrote:
> >> Not necessarily. In fact, a string.String ABC could have only
> >> a single method: .__str__() defined.
> >
> > That wouldn't be very useful; object.__str__ exists and is functional,
> > so EVERY object would count as a string.
> No, only those objects that register with the ABC would be considered
> string-like, not all objects implementing the .__str__() method.
> In regular Python, only str() objects would register with
> strings.String.
> Path objects could also register to be treated as "string-like"
> object.
> Just like numeric types are only considered part of the ABCs
> under numbers, if they register with these.
> Perhaps the name strings.String sounds confusing, so
> perhaps strings.StringLike or strings.FineToConvertToAStringIfNeeded
> would be better :-)
> > The point of "string-like" is that it can be treated as a string, not
> > just that it can be converted to one. This is exactly parallel to the
> > difference between __index__ and __int__; floats can be converted to
> > int (and will truncate), but cannot be *treated* as ints.
> Right, and that's what you can define via an ABC. Those abstract
> base classes are not to be confused with introspecting methods
> on objects - they help optimize this kind of test, but most
> importantly help express the combination of providing an interface
> by exposing methods with the semantics of how those
> methods are expected to be used.

To make MAL's proposal concrete:

  class StringLike(abc.ABC):

    @abstractmethod
    def __str__(self):
        """Return the string representation of something."""

  StringLike.register(pathlib.PurePath)  # Any 3rd-party library can do the same.

You could also call the class StringablePath or something and get the exact
same concept across where you are using the registration abilities of ABCs
to semantically delineate when a class's __str__() returns a usable file
path.

The drawback is that this isn't easily backported like `path.__ospath__()
if hasattr(path, '__ospath__') else path` for libraries that don't
necessarily have access to pathlib but want to be compatible with accepting
path objects.
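
The backport pattern mentioned above can be sketched as a tiny helper that needs no import of pathlib at all. `__ospath__` is the method name floated in this thread rather than an existing protocol, and `ospath` and `DemoPath` are hypothetical names used only for illustration:

```python
def ospath(path):
    """Return a string path, using the proposed __ospath__ hook if present.

    Objects without the hook (for example plain strings) pass through
    unchanged, so the helper works on interpreters that lack pathlib.
    """
    return path.__ospath__() if hasattr(path, '__ospath__') else path


class DemoPath:
    """Hypothetical path-like class, used only for this illustration."""

    def __init__(self, *parts):
        self._parts = parts

    def __ospath__(self):
        return '/'.join(self._parts)


from_object = ospath(DemoPath('usr', 'lib'))
from_string = ospath('usr/lib')
```

An ABC-based check, by contrast, requires the library to import the module that defines the ABC before it can test anything, which is the backporting cost being pointed out.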

From chris.barker at  Thu Apr  7 14:59:09 2016
From: chris.barker at (Chris Barker)
Date: Thu, 7 Apr 2016 11:59:09 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <ne6a2v$fg$>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016 at 11:48 AM, Terry Reedy <tjreedy at> wrote:

> To me, the default proposal to expand the domain of open and other path
> functions is to call str on the path arg, either always or as needed. We
> should then ask "why isn't str() good enough"?  Most bad args for open will
> immediately result in a file-not-found exception.  But the os.path
> functions would not.  Do the possible bugs and violation of python
> philosophy outweigh the simplicity of the proposal?

Thanks -- this is exactly the question at hand. it's gotten a bit caught up
in all the discussion of what to call the magic method, should there be a
top-level function, etc.  but this is the only question on the table that
isn't just bikeshedding.

personally, I think not -- but I'm very close to the fence.

Though no, most calls to open() would not fail -- most calls to open(path,
'r') would but most calls to open(path, 'w') would succeed and produce some
really weird filenames -- but so what?

that's the question -- are these subtle and hard to find bugs we want to
prevent?




From chris.barker at  Thu Apr  7 15:00:53 2016
From: chris.barker at (Chris Barker)
Date: Thu, 7 Apr 2016 12:00:53 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016 at 11:59 AM, Ethan Furman <ethan at> wrote:

> __index__ was originally created to support indexing, but has morphed over
> time to mean "something that can be losslessly converted to an integer".

Ahh! very good precedent, then -- where else is this used?




From chris.barker at  Thu Apr  7 15:02:16 2016
From: chris.barker at (Chris Barker)
Date: Thu, 7 Apr 2016 12:02:16 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Thu, Apr 7, 2016 at 11:59 AM, Brett Cannon <brett at> wrote:

>   class StringLike(abc.ABC):
>     @abstractmethod
>     def __str__(self):
>         """Return the string representation of something."""
>   StringLike.register(pathlib.PurePath)  # Any 3rd-party library can do
> the same.
> You could also call the class StringablePath or something and get the
> exact same concept across where you are using the registration abilities of
> ABCs to semantically delineate when a class's __str__() returns a usable
> file path.
> The drawback is that this isn't easily backported like `path.__ospath__()
> if hasattr(path, '__ospath__') else path` for libraries that don't
> necessarily have access to pathlib but want to be compatible with accepting
> path objects.

and a plus is that it's compatible with type hinting -- is that the future
of Python???




From ethan at  Thu Apr  7 15:06:40 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 12:06:40 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/07/2016 11:59 AM, Chris Barker wrote:

> Though no, most calls to open() would not fail -- most calls to
> open(path, 'r') would but most calls to open(path, 'w') would succeed
> and produce some really weird filenames -- but so what?
> that's the question -- are these subtle and hard to find bugs we want to
> prevent?

If we are trying to fix issues, why would we leave the door open to 
other, more subtle, bugs?
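
To illustrate the kind of subtle bug under discussion, here is a small sketch of what blanket str() coercion lets through. `join_with_str` is a hypothetical helper written for this example, not an existing or proposed API:

```python
import os.path

def join_with_str(base, name):
    """Naive approach: coerce whatever we are given with str()."""
    return os.path.join(str(base), name)

# A list of parts passed by mistake is silently accepted ...
accidental = join_with_str(['tmp', 'data'], 'out.txt')

# ... and so is None, which yields a path under a directory named "None".
missing = join_with_str(None, 'out.txt')
```

Neither call raises, so the mistake only shows up later as an oddly named file or directory, far from the line that caused it.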


From rosuav at  Thu Apr  7 15:06:31 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Apr 2016 05:06:31 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 8, 2016 at 4:45 AM, Chris Barker <chris.barker at> wrote:
> On Thu, Apr 7, 2016 at 11:28 AM, Chris Angelico <rosuav at> wrote:
>> And this should make it easier for third-party code to be functional
>> without even being aware of the Path object. There are two basic
>> things that code will be doing with paths: passing them unchanged to
>> standard library functions (eg open()), and combining them with
>> strings. The first will work by definition; the second will if paths
>> can implicitly upcast to strings.
> so this is really a way to make the whole path!=string thing easier -- we
> don't want to simply call str() on anything that might be a path, because
> that will work on anything, whether it's the least bit pathlike at all, but
> this would introduce a new kind of __str__, so that only things for which it
> makes sense would "Just work":
> str + something
> Would only concatenate to a new string if something was an object that
> could "losslessly" be considered a string?
> but if we use the __index__ example, that is specifically "an integer that
> can be used as an index", not "something that can be losslessly converted to
> an integer".

That's two ways of describing the same thing. When you call __index__,
you either get back an integer with the exact same meaning as the
original object, or you get an exception. In contrast, calling __int__
may result in an integer which is similar, but not equal, to the
original - for instance, int(3.2) is 3. That's a lossy conversion,
which __index__ will refuse to do.

Similarly, str() can be quite lossy and non-strict on arbitrary
objects. With this conversion, it will always either return an
exactly-equivalent string, or raise.
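
A short sketch of the __index__ / __int__ difference described above. The class `Two` is a made-up example of an int-like object:

```python
import operator

class Two:
    """Hypothetical int-like class that opts in to exact conversion."""

    def __index__(self):
        return 2

items = ['a', 'b', 'c']

picked = items[Two()]     # indexing honours __index__
truncated = int(3.2)      # __int__ is lossy: the fractional part is dropped

try:
    operator.index(3.2)   # float defines no __index__, so this raises
    refused = False
except TypeError:
    refused = True
```

The string-like conversion proposed in this thread would play the role of `operator.index` here: return an exact equivalent or raise, never an approximation.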

> so why not the __fspath__ protocol anyway?
> Unless there are all sorts of other use cases you have in mind?

Not actually in mind, but there have been multiple people raising
concerns that __fspath__ is too specific for what it achieves.


From tjreedy at  Thu Apr  7 15:06:47 2016
From: tjreedy at (Terry Reedy)
Date: Thu, 7 Apr 2016 15:06:47 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <ne6b4f$idq$>

On 4/7/2016 11:03 AM, Ethan Furman wrote:
> On 04/07/2016 07:07 AM, Random832 wrote:
>> What's __index__ for?
> __index__ is a way to get an int from an int-like object without losing
> information; so it fails with values like 3.4, but should succeed with
> values like Fraction(4, 2).
> __int__ is a way to convert the value to an int, so 3.4 becomes 3 (and
> the 4/10's is lost).

Why is that a problem?  Loss is not the issue.  And as someone else 
pointed out, int('4') does not lose information, but seq['1'] is 
prohibited.  Why is passing integer strings as indexes bad?

The answer to the latter, I believe, is that Guido is against treating 
numbers and string representations of numbers as interchangeable.  And 
the answer to both, I believe, is that the downside of flexibility is 
ease of creating buggy code, especially code where bugs do not 
immediately raise something, or ease of creating confusing code that is 
hard to maintain.

Terry Jan Reedy

From ethan at  Thu Apr  7 15:10:22 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 12:10:22 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 04/07/2016 11:59 AM, Brett Cannon wrote:

> To make MAL's proposal concrete:
>    class StringLike(abc.ABC):
>      @abstractmethod
>      def __str__(self):
>          """Return the string representation of something."""
>    StringLike.register(pathlib.PurePath)  # Any 3rd-party library can do
> the same.
> You could also call the class StringablePath or something and get the
> exact same concept across where you are using the registration abilities
> of ABCs to semantically delineate when a class's __str__() returns a
> usable file path.

I think I might like this better than a new magic method.

> The drawback is that this isn't easily backported like
> `path.__ospath__() if hasattr(path, '__ospath__') else path` for
> libraries that don't necessarily have access to pathlib but want to be
> compatible with accepting path objects.

I don't understand.


From brett at  Thu Apr  7 15:10:00 2016
From: brett at (Brett Cannon)
Date: Thu, 07 Apr 2016 19:10:00 +0000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Thu, 7 Apr 2016 at 12:03 Chris Barker <chris.barker at> wrote:

> On Thu, Apr 7, 2016 at 11:59 AM, Brett Cannon <brett at> wrote:
>>   class StringLike(abc.ABC):
>>     @abstractmethod
>>     def __str__(self):
>>         """Return the string representation of something."""
>>   StringLike.register(pathlib.PurePath)  # Any 3rd-party library can do
>> the same.
>> You could also call the class StringablePath or something and get the
>> exact same concept across where you are using the registration abilities of
>> ABCs to semantically delineate when a class's __str__() returns a usable
>> file path.
>> The drawback is that this isn't easily backported like `path.__ospath__()
>> if hasattr(path, '__ospath__') else path` for libraries that don't
>> necessarily have access to pathlib but want to be compatible with accepting
>> path objects.
> and a plus is that it's compatible with type hinting -- is that the future
> of Python???

So the other way to do this is to combine the proposals:

  class BasePath(abc.ABC):

      @abstractmethod
      def __ospath__(self):
          """Return the file system path, serialized as a string."""

Then pathlib.PurePath can inherit from this ABC and anyone else can as well
or be registered as doing so. Then you can have:

  class Path(extra=pathlib.BasePath):
     __slots__ = ()

  PathLike = Union[str, Path]

Then any third-party library can register with the ABC and get the typing
correctly (assuming I didn't botch the type specification).

Guido also had some protocol proposal a while back that I think he floated
here, but I don't think the discussion really went anywhere as it was an
early idea.
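
A runnable sketch of the combined idea, under stated assumptions: `BasePath` here is a local stand-in for the proposed ABC (pathlib has no such class), `__ospath__` is the method name used in this thread, and `InheritedPath`, `ThirdPartyPath` and `to_str` are hypothetical names for illustration:

```python
import abc
from typing import Union

class BasePath(abc.ABC):
    """Stand-in for the proposed ABC; pathlib does not define it."""

    @abc.abstractmethod
    def __ospath__(self):
        """Return the file system path, serialized as a string."""

class InheritedPath(BasePath):
    """Hypothetical class that subclasses the ABC directly."""

    def __init__(self, raw):
        self._raw = raw

    def __ospath__(self):
        return self._raw

class ThirdPartyPath:
    """Hypothetical third-party class that registers instead of inheriting."""

    def __init__(self, raw):
        self._raw = raw

    def __ospath__(self):
        return self._raw

BasePath.register(ThirdPartyPath)

PathLike = Union[str, BasePath]

def to_str(path: PathLike) -> str:
    """Accept either a plain string or anything recognised as a BasePath."""
    if isinstance(path, str):
        return path
    if isinstance(path, BasePath):
        return path.__ospath__()
    raise TypeError(f"expected str or BasePath, got {type(path).__name__}")
```

Both the inheriting class and the registered one pass the same isinstance check, so a single annotation and a single runtime test cover them.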

From rosuav at  Thu Apr  7 15:10:33 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Apr 2016 05:10:33 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <ne6b4f$idq$>
References: <>
 <> <ne6b4f$idq$>
Message-ID: <>

On Fri, Apr 8, 2016 at 5:06 AM, Terry Reedy <tjreedy at> wrote:
> On 4/7/2016 11:03 AM, Ethan Furman wrote:
>> On 04/07/2016 07:07 AM, Random832 wrote:
>>> What's __index__ for?
>> __index__ is a way to get an int from an int-like object without losing
>> information; so it fails with values like 3.4, but should succeed with
>> values like Fraction(4, 2).
>> __int__ is a way to convert the value to an int, so 3.4 becomes 3 (and
>> the 4/10's is lost).
> Why is that a problem?  Loss is not the issue.  And as someone else pointed
> out, int('4') does not lose information, but seq['1'] is prohibited.  Why is
> passing integer strings as indexes bad?
> The answer to the latter, I believe, is that Guido is against treating
> numbers and string representations of numbers as interchangeable.  And the
> answer to both, I believe, is that the downside of flexibility is ease of
> creating buggy code, especially code where bugs do not immediately raise
> something, or ease of creating confusing code that is hard to maintain.

Hence my wording of "string-like". Anything can be converted to a
string, but only certain objects are sufficiently string-like to be
implicitly treated as strings. But ultimately it's all the same


From chris.barker at  Thu Apr  7 15:10:24 2016
From: chris.barker at (Chris Barker)
Date: Thu, 7 Apr 2016 12:10:24 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016 at 12:06 PM, Ethan Furman <ethan at> wrote:

>> that's the question -- are these subtle and hard to find bugs we want to
>> prevent?
> If we are trying to fix issues, why would we leave the door open to other,
> more subtle, bugs?

we're not trying to fix issues -- we're trying to make Path compatible with
the stdlib and other libs that use strings as path.

Or do you mean that not having Path subclass str is trying to fix issues,
in which case, yes, I suppose, but you need to stop somewhere, or we'll
have a statically typed language...



Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at

From ethan at  Thu Apr  7 15:12:18 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 12:12:18 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <ne6b4f$idq$>
References: <>
 <> <ne6b4f$idq$>
Message-ID: <>

On 04/07/2016 12:06 PM, Terry Reedy wrote:
> On 4/7/2016 11:03 AM, Ethan Furman wrote:
>> On 04/07/2016 07:07 AM, Random832 wrote:

>>> What's __index__ for?
>> __index__ is a way to get an int from an int-like object without losing
>> information; so it fails with values like 3.4, but should succeed with
>> values like Fraction(4, 2).
>> __int__ is a way to convert the value to an int, so 3.4 becomes 3 (and
>> the 4/10's is lost).
> Why is that a problem?

It isn't.  I was explaining the difference between __int__ and __index__.
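
A minimal sketch of that difference, using only the standard library. The
`Slot` class is a hypothetical example made up for illustration, not
something proposed in this thread:

```python
import operator

# int() truncates: the fractional part is silently dropped.
truncated = int(3.4)

# operator.index() only accepts objects that define __index__, so a
# float is rejected instead of being truncated.
try:
    operator.index(3.4)
except TypeError:
    float_rejected = True
else:
    float_rejected = False


class Slot:
    """Hypothetical int-like object that opts in via __index__."""

    def __init__(self, n):
        self.n = n

    def __index__(self):
        return self.n


items = ["a", "b", "c"]
picked = items[Slot(1)]  # list indexing calls __index__ on the Slot
```

The point of the sketch is only that __index__ is an opt-in for lossless
conversion, while __int__ (via int()) is allowed to throw information away.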


From joshua.morton13 at  Thu Apr  7 15:11:49 2016
From: joshua.morton13 at (Joshua Morton)
Date: Thu, 07 Apr 2016 19:11:49 +0000
Subject: [Python-ideas] Boundaries for unpacking
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

If I had to make a syntax suggestion, I think it would be

    a, b, c = iterable or defaults

although making or both pseudo None-coalescing and pseudo error-coalescing
might be a bit too much sugar. However I think it clearly expresses the

That said, I have to ask what the usecase is for dealing with fixed length
unpacking where you might not have the fixed length of items. That feels
smelly to me. Why are you trying to name values from a variable length list?


On Thu, Apr 7, 2016 at 2:24 PM Michel Desmoulin <desmoulinmichel at>

> Le 07/04/2016 20:15, Michael Selik a écrit :
> >
> >> On Apr 7, 2016, at 6:44 PM, Todd <toddrjen at> wrote:
> >>
> >> On Thu, Apr 7, 2016 at 1:03 PM, Michel Desmoulin <
> desmoulinmichel at> wrote:
> >> Python is a lot about iteration, and I often have to get values from an
> >> iterable. For this, unpacking is fantastic:
> >>
> >> a, b, c = iterable
> >>
> >> One problem that arises is that you don't know whether iterable will
> >> contain 3 items.
> >>
> >> In that case, this beautiful code becomes:
> >>
> >> iterator = iter(iterable)
> >> a = next(iterator, "default value")
> >> b = next(iterator, "default value")
> >> c = next(iterator, "default value")
> >
> > I actually don't think that's so ugly. Looks fairly clear to me. What
> about these alternatives?
> Well, there is nothing wrong with:
> a = mylist[0]
> b = mylist[1]
> c = mylist[2]
> But we still prefer unpacking. And this is way more verbose.
> >
> >>>> from itertools import repeat, islice, chain
> >>>> it = range(2)
> >>>> a, b, c = islice(chain(it, repeat('default')), 3)
> >>>> a, b, c
> > (0, 1, 'default')
> >
> >
> > If you want to stick with builtins:
> >
> >>>> it = iter(range(2))
> >>>> a, b, c = [next(it, 'default') for i in range(3)]
> >>>> a, b, c
> > (0, 1, 'default')
> They all work, but they are impossible to remember, plus you will need a
> comment every time you use them outside of the shell.
> >
> >
> > If your iterable is sliceable and sizeable:
> >
> >>>> it = list(range(2))
> >>>> a, b, c = it[:3] + ['default'] * (3 - len(it))
> >>>> a, b, c
> > (0, 1, 'default')
> Same problem, and as you said, you can forget about generators.
> >
> >
> >
> > Perhaps add a recipe to itertools, or change the "take" recipe?
> >
> > def unpack(n, iterable, default=None):
> >     "Slice the first n items of the iterable, padding with a default"
> >     padding = repeat(default)
> >     return islice(chain(iterable, padding), n)
> Not a bad idea. Built in would be better, but I can live with itertools.
> I import it so often I'm wondering if itertools shouldn't be in
> __builtins__ :)

From ethan at  Thu Apr  7 15:15:54 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 12:15:54 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/07/2016 12:10 PM, Chris Barker wrote:
> On Thu, Apr 7, 2016 at 12:06 PM, Ethan Furman wrote:

>> If we are trying to fix issues, why would we leave the door open to
>> other, more subtle, bugs?
> we're not trying to fix issues -- we're trying to make Path compatible
> with the stdlib and other libs that use strings as path.

Yeah, that's the issue.  ;)


From brett at  Thu Apr  7 15:17:26 2016
From: brett at (Brett Cannon)
Date: Thu, 07 Apr 2016 19:17:26 +0000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Thu, 7 Apr 2016 at 12:11 Ethan Furman <ethan at> wrote:

> On 04/07/2016 11:59 AM, Brett Cannon wrote:
> > To make MAL's proposal concrete:
> >
> >    class StringLike(abc.ABC):
> >
> >      @abstractmethod
> >      def __str__(self):
> >          """Return the string representation of something."""
> >
> >    StringLike.register(pathlib.PurePath)  # Any 3rd-party library can do
> > the same.
> >
> > You could also call the class StringablePath or something and get the
> > exact same concept across where you are using the registration abilities
> > of ABCs to semantically delineate when a class's __str__() returns a
> > usable file path.
> I think I might like this better than a new magic method.
> > The drawback is that this isn't easily backported like
> > `path.__ospath__() if hasattr(path, '__ospath__') else path` for
> > libraries that don't necessarily have access to pathlib but want to be
> > compatible with accepting path objects.
> I don't understand.

How do you make Python 3.3 code work with this when the ABC will simply not
be available to you unless you're running Python 3.4.3, 3.5.2, or 3.6.0
(under the assumption that the ABC is put in pathlib and backported thanks
to its provisional status)? The ternary operator one-liner is
backwards-compatible while the ABC is only forward-compatible.

From tjreedy at  Thu Apr  7 15:18:59 2016
From: tjreedy at (Terry Reedy)
Date: Thu, 7 Apr 2016 15:18:59 -0400
Subject: [Python-ideas] Boundaries for unpacking
In-Reply-To: <>
References: <>
Message-ID: <ne6brb$ubp$>

On 4/7/2016 1:03 PM, Michel Desmoulin wrote:
> Python is a lot about iteration, and I often have to get values from an
> iterable. For this, unpacking is fantastic:
> a, b, c = iterable
> One problem that arises is that you don't know whether iterable will
> contain 3 items.

You need to augment if needed to get at least 3 items.
You need to chop if needed to get at most 3 items.  The following does 
exactly this.

import itertools as it
a, b, c = it.islice(it.chain(iterable, it.repeat(default, 3)), 3)

You can wrap this if you want with def exactly(iterable, n, default). 
This might already be a recipe in the itertools doc recipe section.
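
A minimal sketch of that wrapper, assuming the name `exactly` and the
argument order suggested above. The `default=None` fallback is an
assumption added for illustration:

```python
import itertools as it


def exactly(iterable, n, default=None):
    """Return an iterator of exactly n items: pad if short, chop if long."""
    padded = it.chain(iterable, it.repeat(default, n))
    return it.islice(padded, n)


a, b, c = exactly(range(2), 3, "default")  # too short: padded
x, y, z = exactly(iter("abcdef"), 3)       # too long: chopped
```

Because it only uses chain, repeat and islice, this works on generators
and other one-shot iterators as well as on lists.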

Terry Jan Reedy

From random832 at  Thu Apr  7 15:35:57 2016
From: random832 at (Random832)
Date: Thu, 07 Apr 2016 15:35:57 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Thu, Apr 7, 2016, at 15:17, Brett Cannon wrote:
> How do you make Python 3.3 code work with this when the ABC will simply
> not
> be available to you unless you're running Python 3.4.3, 3.5.2, or 3.6.0
> (under the assumption that the ABC is put in pathlib and backported
> thanks
> to its provisional status)? The ternary operator one-liner is
> backwards-compatible while the ABC is only forward-compatible.

If it's not available to you, then it's not available to anyone to
register either, so you obviously act as if no objects are StringLike if
you get an ImportError when trying to use it. Isn't this just the
standard dance for using *any* function that's new to a new version of
Python?

try:
    # if it's in pathlib why is it called StringLike?
    from pathlib import StringLike
    def is_StringLike(x): return isinstance(x, StringLike)
except ImportError:
    def is_StringLike(x): return False

From donald at  Thu Apr  7 15:37:03 2016
From: donald at (Donald Stufft)
Date: Thu, 7 Apr 2016 15:37:03 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

> On Apr 7, 2016, at 3:17 PM, Brett Cannon <brett at> wrote:
> The ternary operator one-liner is backwards-compatible while the ABC is only forward-compatible.

I like the idea of doing both. Make a __fspath__ method and pathlib.fspath that uses it, and then have
an ABC that checks for the existence of __fspath__.
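
A rough sketch of how the two pieces could fit together. Everything here
is hypothetical: the names `PathLike`, `fspath` and `MyPath` are
placeholders for illustration, and the ABC uses __subclasshook__ so that
classes never have to register or even import it:

```python
import abc


class PathLike(abc.ABC):
    """Hypothetical ABC: anything defining __fspath__ counts as path-like."""

    @abc.abstractmethod
    def __fspath__(self):
        raise NotImplementedError

    @classmethod
    def __subclasshook__(cls, subclass):
        if cls is PathLike:
            # Structural check: look for __fspath__ anywhere in the MRO.
            if any("__fspath__" in vars(base) for base in subclass.__mro__):
                return True
        return NotImplemented


def fspath(path):
    """Hypothetical helper: return a str for a str or a __fspath__ object."""
    if isinstance(path, str):
        return path
    if isinstance(path, PathLike):
        return path.__fspath__()
    raise TypeError(
        "expected str or path-like object, got %s" % type(path).__name__
    )


class MyPath:
    """Stand-in for a third-party path class; it never touches the ABC."""

    def __init__(self, text):
        self._text = text

    def __fspath__(self):
        return self._text
```

With this shape, code that can import the ABC gets isinstance() checks,
and code that cannot still has the hasattr() one-liner to fall back on.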

Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


From ethan at  Thu Apr  7 15:43:27 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 12:43:27 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 04/07/2016 12:17 PM, Brett Cannon wrote:

> How do you make Python 3.3 code work with this when the ABC will simply
> not be available to you unless you're running Python 3.4.3, 3.5.2, or
> 3.6.0 (under the assumption that the ABC is put in pathlib and
> backported thanks to its provisional status)? The ternary operator
> one-liner is backwards-compatible while the ABC is only forward-compatible.

__os_path__ (or whatever it's called) also won't be available on those 
earlier versions -- so I'm not seeing the ABC route as any worse.

Am I missing something?


From njs at  Thu Apr  7 15:47:14 2016
From: njs at (Nathaniel Smith)
Date: Thu, 7 Apr 2016 12:47:14 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 7, 2016 8:57 AM, "Paul Moore" <p.f.moore at> wrote:
> But the proposal for paths is to have a *specific* method that says
> "give me a string representing a filesystem path from this object". An
> "interpret this object as a string" wouldn't be appropriate for the
> cases where I'd want to do "give me a string representing a filesystem
> path". And that's where I get stuck, as I can't think of an example
> where I *would* want the more general option. For a number of reasons:
> 1. I can't think of a real-world example of when I'd *use* such a facility
> 2. I can't think of a real-world example of a type that might
> *provide* such a facility
> 3. I can't see how something so general would be of benefit.

Numpy and friends would implement this if it existed, for things like
converting numpy strings to python strings.

But I can't think of the cases where this would be useful either.

The reason that it's useful to have __index__, and that it would be useful
to have __fspath__, is that in both cases there are a bunch of interfaces
defined as part of the core language that need to implement the consumer
side of the protocol, so a core language protocol is the only real way to
go. I'm not thinking of a lot of analogous APIs in core python that take
specifically-strings? Most of the important ones are str methods, so
regular self-dispatch already handles it. Or I guess str.__add__(someobj),
but that's already handled by binop dispatch.


From random832 at  Thu Apr  7 16:04:50 2016
From: random832 at (Random832)
Date: Thu, 07 Apr 2016 16:04:50 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016, at 15:47, Nathaniel Smith wrote:
> The reason that it's useful to have __index__, and that it would be
> useful
> to have __fspath__, is that in both cases there are a bunch of interfaces
> defined as part of the core language that need to implement the consumer
> side of the protocol, so a core language protocol is the only real way to
> go. I'm not thinking of a lot of analogous APIs in core python that take
> specifically-strings?

Well, there's getattr and the like. I'm not sure why you'd want to pass
a not-really-a-string (though, an ASCII bytes is the one thing I _can_
think of*, and didn't the inability to pass unicode strings to a bunch
of APIs cause no end of trouble for Python 2 users?), but then I'm not
sure why you'd want to pass anything but an int (or a python 2 int/long)
to list indexing.

*Maybe a RUE-string that subclasses bytes and uses UTF-8.

From tjreedy at  Thu Apr  7 16:05:19 2016
From: tjreedy at (Terry Reedy)
Date: Thu, 7 Apr 2016 16:05:19 -0400
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <ne6ei7$cbg$>

On 4/7/2016 2:08 PM, Guido van Rossum wrote:
> On Thu, Apr 7, 2016 at 10:39 AM, Michael Selik <mike at> wrote:

>> To clarify, the proposal is: ``~True == False`` but every other operation on ``True`` remains the same, including ``True * 42 == 42``. Correct?
> Yes. To be more precise, there are some "arithmetic" operations (+, -,
> *, /, **) and they all treat bools as ints and always return ints;
> there are also some "bitwise" operations (&, |, ^, ~) and they should
> all treat bools as bools and return a bool. Currently the only
> exception to this idea is that ~ returns an int, so the proposal is to
> fix that. (There are also some "boolean" operations (and, or, not) and
> they are also unchanged.)

When the proposal is expressed as "Make bools consistently follow this 
simple rule -- Logical and 'bitwise' operations on bools return the 
expected bool, while arithmetic operations treat bools as 0 or 1 and 
return ints.", it makes sense to me.

Given that ~bool hardly makes any sense currently, I would not expect it 
to be in much use now.  Hence not much to break.
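
A small sketch of the rule as stated above. `proposed_invert` is a
hypothetical stand-in for what ~ would do under the proposal, and the
current behaviour of inversion is shown on an explicit int rather than on
a bool directly:

```python
# Bitwise operators on two bools already return bools.
and_result = True & False
or_result = True | False
xor_result = True ^ True

# Arithmetic operators treat bools as the ints 0 and 1 and return ints.
sum_result = True + True
product_result = True * 42

# Inversion is the odd one out: True behaves like the int 1 here, and
# ~1 is -2. The int conversion is explicit to show the arithmetic.
inverted_as_int = ~int(True)


def proposed_invert(flag):
    """Hypothetical stand-in for ~flag under the proposal."""
    return not flag
```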

Terry Jan Reedy

From random832 at  Thu Apr  7 16:08:37 2016
From: random832 at (Random832)
Date: Thu, 07 Apr 2016 16:08:37 -0400
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <ne6ei7$cbg$>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

On Thu, Apr 7, 2016, at 16:05, Terry Reedy wrote:
> Given that ~bool hardly makes any sense currently, I would not expect it 
> to be in much use now.  Hence not much to break.

I suspect the fear is of one being passed into a place that expects an
int, and staying alive as a bool (i.e. not being converted to an int by
an arithmetic operation) long enough to confuse code that is trying to
do ~int.

From p.f.moore at  Thu Apr  7 16:21:28 2016
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Apr 2016 21:21:28 +0100
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 7 April 2016 at 20:43, Ethan Furman <ethan at> wrote:
> On 04/07/2016 12:17 PM, Brett Cannon wrote:
>> How do you make Python 3.3 code work with this when the ABC will simply
>> not be available to you unless you're running Python 3.4.3, 3.5.2, or
>> 3.6.0 (under the assumption that the ABC is put in pathlib and
>> backported thanks to its provisional status)? The ternary operator
>> one-liner is backwards-compatible while the ABC is only
>> forward-compatible.
> __os_path__ (or whatever it's called) also won't be available on those
> earlier versions -- so I'm not seeing the ABC route as any worse.
> Am I missing something?

You can check for __os_path__ even if it doesn't exist (hasattr takes
the name as a string), but you can't check for the ABC if it doesn't
exist (isinstance takes the actual ABC as the argument).
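
A minimal sketch of that asymmetry. The classes and both helper functions
are hypothetical, and `StringPath` is deliberately left undefined to stand
in for an ABC that an older Python does not ship:

```python
class LegacyPath:
    """Hypothetical path class written before any protocol existed."""

    def __init__(self, text):
        self.text = text


class NewPath(LegacyPath):
    """Hypothetical path class that implements the protocol method."""

    def __os_path__(self):
        return self.text


def ospath(path):
    # The method name is only a string here, so this line runs on any
    # Python version whether or not anything defines __os_path__.
    return path.__os_path__() if hasattr(path, "__os_path__") else path


def ospath_via_abc(path):
    # StringPath is never defined in this sketch, so evaluating the name
    # fails before isinstance() is even called.
    return str(path) if isinstance(path, StringPath) else path  # noqa: F821
```

Calling ospath() on a plain string or a LegacyPath simply returns the
argument unchanged, while ospath_via_abc() raises NameError for every
argument until the ABC can actually be imported.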


From p.f.moore at  Thu Apr  7 16:26:38 2016
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Apr 2016 21:26:38 +0100
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

On 7 April 2016 at 21:08, Random832 <random832 at> wrote:
> On Thu, Apr 7, 2016, at 16:05, Terry Reedy wrote:
>> Given that ~bool hardly makes any sense currently, I would not expect it
>> to be in much use now.  Hence not much to break.
> I suspect the fear is of one being passed into a place that expects an
> int, and staying alive as a bool (i.e. not being converted to an int by
> an arithmetic operation) long enough to confuse code that is trying to
> do ~int.

That is indeed the only place likely to hit problems. But I'd be
surprised if it was sufficiently common to be a major problem. I don't
think the backward compatibility constraints on a minor release would
preclude a change like this.

Personally, I'm +0 on the proposal. It seems like a more useful
behaviour, but it's one I'm never likely to need personally.


From ethan at  Thu Apr  7 16:28:57 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 13:28:57 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>	<>
Message-ID: <>

On 04/07/2016 01:21 PM, Paul Moore wrote:
> On 7 April 2016 at 20:43, Ethan Furman wrote:
>> On 04/07/2016 12:17 PM, Brett Cannon wrote:
>>> How do you make Python 3.3 code work with this when the ABC will simply
>>> not be available to you unless you're running Python 3.4.3, 3.5.2, or
>>> 3.6.0 (under the assumption that the ABC is put in pathlib and
>>> backported thanks to its provisional status)? The ternary operator
>>> one-liner is backwards-compatible while the ABC is only
>>> forward-compatible.
>> __os_path__ (or whatever it's called) also won't be available on those
>> earlier versions -- so I'm not seeing the ABC route as any worse.
>> Am I missing something?
> You can check for __os_path__ even if it doesn't exist (hasattr takes
> the name as a string), but you can't check for the ABC if it doesn't
> exist (isinstance takes the actual ABC as the argument).

True, but:

- Python 3.3 isn't going to check for __os_path__

- Python 3.3 isn't going to check for an ABC

In other words, Python 3.3* simply isn't going to work with pathlib, so 
I continue to be confused with why Brett brought it up.


* In case there's any confusion: by "not work" I mean the stdlib is not 
going to correctly interpret a pathlib.Path in 3.3 and earlier.

From p.f.moore at  Thu Apr  7 16:46:11 2016
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Apr 2016 21:46:11 +0100
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 7 April 2016 at 21:28, Ethan Furman <ethan at> wrote:
> In case there's any confusion: by "not work" I mean the stdlib is not going
> to correctly interpret a pathlib.Path in 3.3 and earlier.

I think the confusion is over *who* will be checking the protocol or
the ABC. You're correct that the stdlib will not do so in earlier
versions. I've been assuming that is obvious (and I suspect Brett has
too). Talk about using the check for older versions is basically
around the possibility that 3rd party libraries might do so. I think
it's unlikely that they'll bother (but I'm commenting on the
suggestion because I'd like it to be easy to do if anyone does want to
bother, against my expectations). I have the feeling Brett might think
that it's somewhat more likely.

If all we're thinking about is a way for the stdlib to work with
strings and pathlib objects seamlessly, while also allowing 3rd party
path classes to register to be treated the same way, then yes, there's
no real difference between an ABC and a protocol (well, to register
with the ABC, 3rd party code would need to conditionally import the
ABC and make sure not to fail if the ABC can't be found, but that's
just some boilerplate).

Frankly, if it wasn't for the fact that you have stated that you'll
add support for the protocol to your path library, I'd be surprised if
*any* 3rd party code changed as a result of this discussion. There's
been no comment from the authors of or pylib (the only other 2
path objects I know of). And the only comments I've heard from authors
of libraries that consume paths is "I don't see any reason why I'd
bother". So as long as you're happy with the final form of the
proposal, I see little reason to worry about how other 3rd party code
might use it.


From brett at  Thu Apr  7 17:26:52 2016
From: brett at (Brett Cannon)
Date: Thu, 07 Apr 2016 21:26:52 +0000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Thu, 7 Apr 2016 at 13:46 Paul Moore <p.f.moore at> wrote:

> On 7 April 2016 at 21:28, Ethan Furman <ethan at> wrote:
> > In case there's any confusion: by "not work" I mean the stdlib is not
> going
> > to correctly interpret a pathlib.Path in 3.3 and earlier.
> I think the confusion is over *who* will be checking the protocol or
> the ABC. You're correct that the stdlib will not do so in earlier
> versions. I've been assuming that is obvious (and I suspect Brett has
> too).

Yep, you're right. I was never worried about making the stdlib work since
that's a fully controlled environment that we can update at once. What I'm
worried about is any third-party library that has an API that takes a path
as an argument that may need to be updated to support pathlib -- or any
other path library -- and doesn't want to directly rely on Python 3.6 for
support.

> Talk about using the check for older versions is basically
> around the possibility that 3rd party libraries might do so. I think
> it's unlikely that they'll bother (but I'm commenting on the
> suggestion because I'd like it to be easy to do if anyone does want to
> bother, against my expectations). I have the feeling Brett might think
> that it's somewhat more likely.

Yes, that's my hope as third-party libraries are the ones that have older
Python version compatibility to care about (the stdlib is obviously always

> If all we're thinking about is a way for the stdlib to work with
> strings and pathlib objects seamlessly, while also allowing 3rd party
> path classes to register to be treated the same way, then yes, there's
> no real difference between an ABC and a protocol (well, to register
> with the ABC, 3rd party code would need to conditionally import the
> ABC and make sure not to fail if the ABC can't be found, but that's
> just some boilerplate).

My point is the boilerplate is minimized for third-party libraries in the
instance of the magic method vs the ABC, but otherwise they accomplish the
same thing.

> Frankly, if it wasn't for the fact that you have stated that you'll
> add support for the protocol to your path library, I'd be surprised if
> *any* 3rd party code changed as a result of this discussion. There's
> been no comment from the authors of or pylib (the only other 2
> path objects I know of). And the only comments I've heard from authors
> of libraries that consume paths is "I don't see any reason why I'd
> bother". So as long as you're happy with the final form of the
> proposal, I see little reason to worry about how other 3rd party code
> might use it.

I'm trying to be a bit more optimistic on the uptake. :)

From njs at  Thu Apr  7 17:50:30 2016
From: njs at (Nathaniel Smith)
Date: Thu, 7 Apr 2016 14:50:30 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 7, 2016 2:32 PM, "Brett Cannon" <brett at> wrote:
>> Frankly, if it wasn't for the fact that you have stated that you'll
>> add support for the protocol to your path library, I'd be surprised if
>> *any* 3rd party code changed as a result of this discussion. There's
>> been no comment from the authors of or pylib (the only other 2
>> path objects I know of). And the only comments I've heard from authors
>> of libraries that consume paths is "I don't see any reason why I'd
>> bother". So as long as you're happy with the final form of the
>> proposal, I see little reason to worry about how other 3rd party code
>> might use it.
> I'm trying to be a bit more optimistic on the uptake. :)

If data points are useful, numpy just merged a PR for supporting pathlib

It's a bit awkward given the current api, but someone did care enough to
take the trouble.


From brett at  Thu Apr  7 17:50:53 2016
From: brett at (Brett Cannon)
Date: Thu, 07 Apr 2016 21:50:53 +0000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Thu, 7 Apr 2016 at 12:36 Random832 <random832 at> wrote:

> On Thu, Apr 7, 2016, at 15:17, Brett Cannon wrote:
> > How do you make Python 3.3 code work with this when the ABC will simply
> > not
> > be available to you unless you're running Python 3.4.3, 3.5.2, or 3.6.0
> > (under the assumption that the ABC is put in pathlib and backported
> > thanks
> > to its provisional status)? The ternary operator one-liner is
> > backwards-compatible while the ABC is only forward-compatible.
> If it's not available to you, then it's not available to anyone to
> register either, so you obviously act as if no objects are StringLike if
> you get an ImportError when trying to use it. Isn't this just the
> standard dance for using *any* function that's new to a new version of
> Python?

Yes, but the lack of a magic method is not as severe as a lack of an ABC
you will be using in an isinstance() check.

> try:
>     from pathlib import StringLike # if it's in pathlib why is it called
>     StringLike?

I called it StringLike because I was replying to Chris' proposal of a
generic string-like protocol (which wouldn't live in pathlib).

>     def is_StringLike(x): return isinstance(x, StringLike)
> except ImportError:
>     def is_StringLike(x): return False

That would cut out all third-party libraries no matter what their Python
version support was.

My point is that if I wanted this to work in Python 3.3.x, Python 3.4.2, or
Python 3.5.1 then the ABC solution is out as the ABC won't exist. The magic
method, though, would still work with the one-liner all the way back to
Python 2.5 when conditional expressions were added to the language.

For instance, let's say I'm the author of a library that uses file paths
that wants to support Python 3.3 and newer. How do I add support for using
pathlib.Path and Ethan's path library? With the magic method solution I can

  def ospath(path):
      return path.__ospath__() if hasattr(path, '__ospath__') else path

If I really wanted to I could just embed that wherever I want to work with

Now how about the ABC?

  # In pathlib.
  class StringPath(abc.ABC):
      def __str__(self): ...

  StringPath.register(pathlib.PurePath)
  StringPath.register(str)  # Maybe not cover str?

  # In my library trying to support Python 3.3 and newer, pathlib and
  # Ethan's path library.
  try:
      from pathlib import StringPath
  except ImportError:
      StringPath = None

  def ospath(path):
      if StringPath is not None:
          if isinstance(path, StringPath):
              return str(path)
      # What am I supposed to do here?

Now you could set `StringPath = object`, but that starts to negate the
point of not subclassing strings as you're now accepting anything that
defines __str__() in that case unless you're on a version of Python "new"
enough to have pathlib w/ the ABC defined. And if you go some other route,
what would you want to do if StringPath wasn't available?

So the ABC vs magic method discussion comes down to whether we think
third-party libraries will use whatever approach is decided upon and
whether they care about how good the support is for Python 3.3.x, 3.4.2,
and 3.5.1.

From ethan at  Thu Apr  7 18:09:01 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 15:09:01 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 04/07/2016 02:26 PM, Brett Cannon wrote:

> Yep, you're right. I was never worried about making the stdlib work
> since that's a fully controlled environment that we can update at once.
> What I'm worried about is any third-party library that has an API that
> takes a path as an argument that may need to be updated to support
> pathlib -- or any other path library -- and doesn't want to directly
> rely on Python 3.6 for support.

Ah, I understand (finally!).

Okay, let's go with the protocol then.  If it makes sense to also have 
an ABC I'm fine with that.


From eric at  Thu Apr  7 20:09:12 2016
From: eric at (Eric V. Smith)
Date: Thu, 7 Apr 2016 20:09:12 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 4/7/2016 3:00 PM, Chris Barker wrote:
> On Thu, Apr 7, 2016 at 11:59 AM, Ethan Furman <ethan at
> <mailto:ethan at>> wrote:
>     __index__ was originally created to support indexing, but has
>     morphed over time to mean "something that can be losslessly
>     converted to an integer".
> Ahh! very good precedent, then -- where else is this used?

It's used by hex, oct, and bin, at least:

>>> class Foo:
...   def __index__(self): return 42
...
>>> hex(Foo())
'0x2a'
>>> oct(Foo())
'0o52'
>>> bin(Foo())
'0b101010'

From k7hoven at  Thu Apr  7 20:27:42 2016
From: k7hoven at (Koos Zevenhoven)
Date: Fri, 8 Apr 2016 03:27:42 +0300
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

I like this double-underscore path attribute better than my suggestion
almost two weeks ago in the other thread, but which did not get any

On Sun, Mar 27, 2016 at 5:40 PM, Koos Zevenhoven <k7hoven at> wrote:
> On Sun, Mar 27, 2016 at 1:23 AM, Koos Zevenhoven <k7hoven at> wrote:
>> I assume you meant to type pathlib.Path.path, so that Path("...").path ==
>> str(Path("...")). That's a good start, and I'm looking forward to Serhiy's
>> patch for making the stdlib accept Paths. But if Path will not subclass str,
>> we also need new stdlib functions that *return* Paths.
> Actually, now that .path is not out yet, would it make sense to call it
> Path.str or Path.strpath instead, and introduce the same thing on DirEntry
> and guarantee a str (instead of str or bytes as DirEntry.path now does)?
> Maybe that would lead to fewer broken implementations in third-party
> libraries too?

But maybe it could be called `__pathname__`, since 'names' are
commonly thought of as strings. This, implemented across the stdlib,
would obviously be better than the status quo.

Anyway, what you are now discussing reminds me a lot of my thoughts
last week. I hope everyone realizes that, while this would still
require effort from the maintainers of every library that deals with
paths, this would not allow libraries to start returning pathlib
objects from functions without either breaking backwards compatibility
of their APIs or adding duplicate functions. We will end up with a
mess of some libraries accepting path objects and some not, and some
that screw up their DirEntry compatibility regarding bytestring paths
or that break their existing pure bytestring compatibility that they
didn't know of. All this could take a while and give people an
impression of inconsistency in Python.


From brenbarn at  Thu Apr  7 21:32:58 2016
From: brenbarn at (Brendan Barnwell)
Date: Thu, 07 Apr 2016 18:32:58 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

On 2016-04-07 08:15, Ethan Furman wrote:
> On 04/07/2016 12:46 AM, Antoine Pitrou wrote:
>> How about changing the behaviour of bool.__invert__ to make it in line
>> with the Numpy boolean?
>> (i.e. bool.__invert__ == operator.not_)
> No.  bool is a subclass of int, and changing that now would be a serious
> breach of backward-compatibility, not to mention breaking existing code
> for no good reason.

	Let's not forget that subclasses don't have to exactly duplicate all 
the behavior of their superclasses.  That's why there's such a thing as 
overriding.  Bool could remain a subclass of int, and still change its 
__invert__ behavior by overriding __invert__.

	It's true that this would be a backwards incompatible change, but 
behavior like ~True==-2 doesn't seem like something a lot of people are 
relying on.  It would be worth looking into how much code actually does 
rely on it.
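
As a rough illustration of the overriding argument: bool itself cannot be subclassed, so the sketch below uses a hypothetical int subclass named LogicalInt (not part of any proposal) to show that a subclass can stay an int everywhere else while redefining ~. The inherited behaviour is spelled int.__invert__(True), which is what ~True evaluated to at the time of this thread.

```python
class LogicalInt(int):
    """Hypothetical int subclass whose ~ means logical negation."""

    def __invert__(self):
        # "not self" uses ordinary int truthiness and yields a bool,
        # which is then wrapped back into the subclass.
        return LogicalInt(not self)


# Inherited integer behaviour: bitwise inversion of 1 is -2.
inherited = int.__invert__(True)

yes, no = LogicalInt(1), LogicalInt(0)

print(inherited)                 # -2
print(int(~yes), int(~no))       # 0 1
print(isinstance(yes, int))      # True: still an int subclass
print(yes + 41)                  # 42: arithmetic is untouched
```

Whether enough existing code depends on the inherited result to make such a change harmful is exactly the open question; the sketch only shows that subclassing does not rule it out.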

Brendan Barnwell
"Do not follow where the path may lead.  Go, instead, where there is no 
path, and leave a trail."
    --author unknown

From steve at  Thu Apr  7 21:40:30 2016
From: steve at (Steven D'Aprano)
Date: Fri, 8 Apr 2016 11:40:30 +1000
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

On Thu, Apr 07, 2016 at 11:08:38AM -0700, Guido van Rossum wrote:

> To be more precise, there are some "arithmetic" operations (+, -,
> *, /, **) and they all treat bools as ints and always return ints;
> there are also some "bitwise" operations (&, |, ^, ~) and they should
> all treat bools as bools and return a bool.

You missed two: >> and <<. What are we to do with (True << 1)?

Honestly, I cannot even imagine what it means to say "shift a truth 
value N bits". I think the idea that bitwise operations on bools are 
actually boolean operations in disguise is not a well-formed idea. 
Sometimes it happens to work out (& | ^), and sometimes it doesn't (<< 
and ~).

And I'm not sure what to make of >> as an operation on bools. It doesn't 
*mean* anything, you can't shift a truth value, the very concept is 
meaningless, but if it did mean something it would surely return False. 
So >> could go into either category.
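
To make the split concrete, here is a small sketch (plain built-ins only) that records what each operator returns for bool operands. The inversion is spelled int.__invert__(True), the method that ~True inherits from int in the versions under discussion; the dictionary name is illustrative only.

```python
# Each entry pairs an expression on bools with the value it produces.
results = {
    "True & False": True & False,
    "True | False": True | False,
    "True ^ True": True ^ True,
    "True << 1": True << 1,
    "True >> 1": True >> 1,
    "int.__invert__(True)": int.__invert__(True),
}

for expr, value in results.items():
    # Show both the value and whether it is still a bool.
    print(f"{expr:22} -> {value!r:6} ({type(value).__name__})")
```

The first three stay within bool; the shifts and the inversion come back as plain ints, and only >> happens to land on a value that equals False.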

But ultimately, ~ has meant bitwise-not for 25 years, and it's never 
caused a problem before, not back in the days when people used to write 

TRUE, FALSE = 1, 0

and not now. If you want to perform a boolean "not" on a truth 
value, you use `not`.

Nobody cared enough to "fix" this (if it is a problem that needs fixing, 
which I doubt) when bools were first introduced, and nobody cared when 
Python 3 came out. So why are we talking about rushing a backwards-
incompatible semantic change into a point release?

Even if we "fix" this, surely we should go through the usual deprecation 
process? This isn't a critical security bug that needs fixing, it's a 
semantic change to something that has worked this way for 25 years, and 
it's going to break something somewhere. There are just far too many 
people that expect that bools are ints. After all, notwithstanding 
their fancy string representation, they behave like ints and actually 
are ints.


From steve at  Thu Apr  7 21:53:44 2016
From: steve at (Steven D'Aprano)
Date: Fri, 8 Apr 2016 11:53:44 +1000
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol>
Message-ID: <>

On Thu, Apr 07, 2016 at 01:17:57PM +0000, Antoine Pitrou wrote:
> Steven D'Aprano <steve at ...> writes:
> > 
> > > Numpy's boolean type does the more useful (and more expected) thing:
> > > 
> > > >>> ~np.bool_(True)
> > > False
> > 
> > Expected by whom?
> By anyone who takes booleans at face value (that is, takes booleans as
> representing a truth value and expects operations on booleans to reflect
> the semantics of useful operations on truth values, not some arbitrary
> side-effect of the internal representation of a boolean...).

Bools in Python have *always* been integers, so who are these people 
taking booleans at face value? Beginners? If so, say so. That's a motive 
I can understand.

But I think that people who expect bools to be real truth values, like 
in Pascal, probably won't expect BITWISE operations to operate on them 
at all and will use the BOOLEAN operators and, or, not. Bitwise 
operators operate on a sequence of bits, not a single truth value. What 
do these naive "bools are truth values" people think:

True << 3 

should return?

I don't think we need a second way to spell "not bool". Bools have 
always been ints in Python, and apart from their fancy string 
representation they behave like ints. I don't think it helps to make ~ a 
special case where they don't.

> But I'm not surprised by such armchair commenting and pointless controversy
> on python-ideas, since that's what the list is for....

Thanks for your feedback, I'll give it the due consideration it 


From ethan at  Thu Apr  7 21:56:23 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 18:56:23 -0700
Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of
Message-ID: <>

In the recent thread about changing the meaning of __invert__ on bools I 
was accused of an attempted hijack.

Can anybody please enlighten me as to what, exactly, I did wrong?

Original post follows:

 > I think the str() of a value, while possibly being the most
 > interesting piece of information (IntEnum, anyone?), is hardly the
 > most intrinsic.
 > If we do make this change, besides needing a couple major versions to
 > make it happen, will anything else be different?
 > - no longer subclass int?
 > - add an "unknown" value?
 >    - how will indexing work?
 >    - or any of the other operations?
 > - don't bother with any of the other mathematical operations?
 >    - counting True's is not the same as adding True's
 > I'm not firmly opposed, I just don't see a major issue here -- I've
 > needed an Unknown value far more often than I've needed ~True to be
 > False.


From guido at  Thu Apr  7 22:13:29 2016
From: guido at (Guido van Rossum)
Date: Thu, 7 Apr 2016 19:13:29 -0700
Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of
In-Reply-To: <>
References: <>
Message-ID: <>

You know full well that none of the things you brought up are up for discussion.

Honestly I don't care any more and I am going to mute this thread and
any others in the same vein.

On Thu, Apr 7, 2016 at 6:56 PM, Ethan Furman <ethan at> wrote:
> In the recent thread about changing the meaning of __invert__ on bools I was
> accused of an attempted hijack.
> Can anybody please enlighten me as to what, exactly, I did wrong?
> Original post follows:
>> I think the str() of a value, while possibly being the most
>> interesting piece of information (IntEnum, anyone?), is hardly the
>> most intrinsic.
>> If we do make this change, besides needing a couple major versions to
>> make it happen, will anything else be different?
>> - no longer subclass int?
>> - add an "unknown" value?
>>    - how will indexing work?
>>    - or any of the other operations?
>> - don't bother with any of the other mathematical operations?
>>    - counting True's is not the same as adding True's
>> I'm not firmly opposed, I just don't see a major issue here -- I've
>> needed an Unknown value far more often than I've needed ~True to be
>> False.
> --
> ~Ethan~
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

--Guido van Rossum

From alexander.belopolsky at  Thu Apr  7 22:17:16 2016
From: alexander.belopolsky at (Alexander Belopolsky)
Date: Thu, 7 Apr 2016 22:17:16 -0400
Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016 at 9:56 PM, Ethan Furman <ethan at> wrote:

> In the recent thread about changing the meaning of __invert__ on bools I
> was accused of an attempted hijack.
> Can anybody please enlighten me as to what, exactly, I did wrong?

You greatly expanded the scope of the discussion.  The original thread was
focused on a single feature, and a not so widely used one.  You asked 6-7
(rhetorical?) questions below that have nothing to do with the original
topic.

> Original post follows:
> > I think the str() of a value, while possibly being the most
> > interesting piece of information (IntEnum, anyone?), is hardly the
> > most intrinsic.
> >
> > If we do make this change, besides needing a couple major versions to
> > make it happen, will anything else be different?
> >
> > - no longer subclass int?
> > - add an "unknown" value?
> >    - how will indexing work?
> >    - or any of the other operations?
> > - don't bother with any of the other mathematical operations?
> >    - counting True's is not the same as adding True's
> >
> > I'm not firmly opposed, I just don't see a major issue here -- I've
> > needed an Unknown value far more often than I've needed ~True to be
> > False.
> --
> ~Ethan~

From rian at  Thu Apr  7 22:17:33 2016
From: rian at (Rian Hunter)
Date: Thu, 7 Apr 2016 19:17:33 -0700 (PDT)
Subject: [Python-ideas] Consistent programming error handling idiom
Message-ID: <alpine.OSX.2.20.1604071911020.16133@ioko>

I like that in Python all errors are exceptions. This enables elegant
code and provides a default error idiom for all Python code.

An important distinction between exceptions arises when handling an
exception in a top-level exception handler that doesn't have the
context to properly handle it. In these situations some exceptions can
be ignored (and hopefully logged) and some exceptions should
terminate/reset the program (not necessarily literally).

I'd call exceptions that should terminate the program "programming
errors" or "bugs." It's practically impossible to gracefully recover
from a programming error when encountered (you can potentially hot
reload a bug fix but I digress). Examples of current exceptions that
definitely represent programming errors include AssertionError,
NameError and SyntaxError. In the case of AssertionError, it's very
important the program terminates or the state is reset lest you
run the risk of corrupting persistent data.

I'd argue that other exceptions like AttributeError, ValueError, and
TypeError can also represent programming errors depending on where
they are caught. If one of these exceptions is caught near the top of
the call stack where nothing useful can be done it's very likely a
programming error. If it's caught locally where it can be predictably
handled, no hard reset is necessary, e.g.:

     try:
         foo =
     except AttributeError:
         foo = 0

Contrast with the following code that never makes sense (and is why I
said that NameError definitely signifies a programming error):

     try:
         foo = bar
     except NameError:
         foo = 0

Toy examples aside, this problem arises in real programs, like an
extensible HTTP server:

     try:
         response = client_request_handler(global_state, connection_state, request)
     except Exception:
         response = create_500_response()
         # TODO: should the server be reset?
         #       is global_state invalid? is connection_state invalid?

In this case I think it's polite to always send a "500" error (the
HTTP status code for "internal server error") but the question remains
as to whether or not the server should reset its global state or close
the connection.

Something smells here and I don't think Python currently has a
solution to this problem. I don't think it should be ambiguous by the
time an exception is caught whether or not the program has a bug in it
or whether it has simply run into an external error.

I don't know what the solution to this should be. I've seen proposals
like a new exception hierarchy for "bad code" but I don't think a good
solution necessarily requires changes to the language.

I think the solution could be as simple as a universally accepted PEP
(like PEP 8) that discusses some of these issues, and related ones,
and presents correct / pythonic ways of dealing with them. Maybe the
answer is recommending that top-level exception handlers should only
be used with extreme care and, unless you know what you're doing, it's
best to let your program die (or affected state reset) and bias
towards more fine-grained exception handling.

Thoughts? Thanks for reading.


From rosuav at  Thu Apr  7 23:28:20 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Apr 2016 13:28:20 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <alpine.OSX.2.20.1604071911020.16133@ioko>
References: <alpine.OSX.2.20.1604071911020.16133@ioko>
Message-ID: <>

On Fri, Apr 8, 2016 at 12:17 PM, Rian Hunter <rian at> wrote:
> Contrast with the following code that never makes sense (and is why I
> said that NameError definitely signifies a programming error):
>     try:
>         foo = bar
>     except NameError:
>         foo = 0

This is exactly the idiom used to cope with builtins that may or may
not exist. If you want to support Python 2 as well as 3, you might use
something like this:

try:
    input = raw_input
except NameError:
    raw_input = input

After this, you're guaranteed that both names exist and refer to the
non-evaluating input function. So it doesn't *definitely* signify a
bug; like every other exception, you can declare that it's a known and
expected situation by try/excepting. An *uncaught* NameError is
legitimately a bug - but then, so would most uncaught exceptions.
StopIteration gets raised and caught all the time when you iterate
over things, but if one leaks out, it's a bug somewhere.

> Toy examples aside, this problem arises in real programs, like an
> extensible HTTP server:
>     try:
>         response = client_request_handler(global_state, connection_state,
> request)
>     except Exception:
>         response = create_500_response()
>         # TODO: should the server be reset?
>         #       is global_state invalid? is connection_state invalid?

This is what I'd call a boundary location. You have "outer" code and
"inner" code. Any uncaught exception in the inner code should get
logged rather than aborting the outer code. I'd spell it like this:

try:
    response = ...
except BaseException as e:
    response = create_500_response()

Notably, the exception should be *logged* in some way that the author
of the inner code can find it. (The outer and inner code needn't be
the same 'thing', and needn't have the same author, although they

At this kind of boundary, you basically catch-and-log *all*
exceptions, handling them the same way. Doesn't matter whether it's
ValueError, SyntaxError, NameError, RuntimeError, GeneratorExit, or
SystemExit - they all get logged, the client gets a 500, and you go
back and look for another request.
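
A minimal sketch of such a boundary, under stated assumptions: serve_one, good_handler, buggy_handler and the logger name are hypothetical, and create_500_response is only a stand-in for the helper named earlier in the thread. It catches BaseException purely to mirror the description above; a real server may well prefer to let KeyboardInterrupt and friends through.

```python
import logging

log = logging.getLogger("boundary_demo")  # hypothetical logger name


def create_500_response():
    # Stand-in for the helper named in the thread.
    return (500, "Internal Server Error")


def serve_one(handler, request):
    """Run inner code; log whatever escapes and answer with a 500."""
    try:
        return handler(request)
    except BaseException:
        # logging.exception records the traceback so that the author
        # of the inner code can find it later.
        log.exception("unhandled error while handling %r", request)
        return create_500_response()


def good_handler(request):
    return (200, request.upper())


def buggy_handler(request):
    return (200, undefined_name)  # raises NameError when called


logging.basicConfig(level=logging.ERROR)
print(serve_one(good_handler, "ping"))
print(serve_one(buggy_handler, "ping"))
```

The outer loop treats every failure identically; the distinction between a bug and an external error is left to whoever reads the log.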

As to resetting stuff: I wouldn't bother; your functions should
already not mess with global state. The only potential mess you should
consider dealing with is a database rollback; and actually, my
personal recommendation is to do that with a context manager inside
the inner code, rather than a reset in the exception handler in the
outer code.
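
One way to sketch the context-manager approach is with the standard sqlite3 module, whose connection objects commit on a clean exit from a with block and roll back when an exception escapes. The table, the inner_code function and the "/boom" trigger are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (path TEXT)")
conn.commit()


def inner_code(conn, path):
    # "with conn" commits on success and rolls back on an exception.
    with conn:
        conn.execute("INSERT INTO hits VALUES (?)", (path,))
        if path == "/boom":
            raise ValueError("simulated failure after the insert")


inner_code(conn, "/ok")
try:
    inner_code(conn, "/boom")
except ValueError:
    pass  # the outer boundary would log this and send a 500

rows = conn.execute("SELECT path FROM hits").fetchall()
print(rows)
```

Only the successful insert survives, and the outer handler never needs to know that a database was involved.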

I don't think there's anything to be codified here; all you have is
the basic principle that uncaught exceptions are bugs, modified by
boundary locations.

> Maybe the
> answer is recommending that top-level exception handlers should only
> be used with extreme care and, unless you know what you're doing, it's
> best to let your program die (or affected state reset) and bias
> towards more fine-grained exception handling.

This should already be the recommendation. The only time you should
ever catch an exception is when you can actually do something useful
with it; at a boundary location, you log all exceptions and return to
some sort of main loop, and everywhere else, you catch stuff because
you can usefully cope with it. This is exactly how structured
exception handling should normally be used; most programs have no
boundaries in them, so you simply catch what you can handle and let
the rest hit the console.


From stephen at  Fri Apr  8 01:51:08 2016
From: stephen at (Stephen J. Turnbull)
Date: Fri, 8 Apr 2016 14:51:08 +0900
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

Random832 writes:

 > How about ASCII bytes strings [as a use case]? ;)

Very good point, and not at all funny.

In fact, this is pretty well useless except to try to propagate the
Python 2 model of str as maybe-encoded-maybe-not into Python 3 "when
it's safe".  But that's way less than "99%+" of use cases.  Anything
where you combine it with bytes has to result in bytes, or prepare for
the combining operation to raise if the bytes contain any non-ASCII.
Ditto str.  Either way, you've not really gained much safety for the
programmer, although maybe the user would prefer a crashed program to
corrupt data or mojibake on screen.  But they'd rather the program
worked as intended, so the programmer has to do pretty much the same
amount of thinking and coding around the semantic differences between
Python 2 and Python 3 AFAICS.

The only place I can see this StringableASCIIBytes *frequently* being
really "safe" is scripting where you pass a string literal
representing a path to an os function, or print a string literal.  But
guess what, that already works if you just use a str literal!

I'm also not really sure, given the flexibility and ubiquity of
string representations, that there really are a lot of use cases whose
uses of __lossless_i_promise_str__ are mutually compatible.  Eg, if we
use this for pathlib.Path and for urllib_tng.URL, are those really
compatible in the sense that we're happy handing a urllib_tng.URL to
open() and have it try to operate on that?  If not, we need separate
dunders for this purpose.  And ditto for all those hypothetical future
uses (which are unlikely to be as compatible as filesystem paths and
URIs, or URI paths).

From stephen at  Fri Apr  8 01:52:45 2016
From: stephen at (Stephen J. Turnbull)
Date: Fri, 8 Apr 2016 14:52:45 +0900
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

Steven D'Aprano writes:
 > On Thu, Apr 07, 2016 at 11:08:38AM -0700, Guido van Rossum wrote:
 > > To be more precise, there are some "arithmetic" operations (+, -,
 > > *, /, **) and they all treat bools as ints and always return ints;
 > > there are also some "bitwise" operations (&, |, ^, ~) and they should
 > > all treat bools as bools and return a bool.

I'm with Steven on this.  Knuth would call these operations
"seminumerical".  I would put the emphasis on "numerical", *expecting*
True and False to be one-bit representations of the (mathematical)
integers 1 and 0.  If numerical operations widen bools to int and then
operate, I would *expect* seminumerical operations to do so as well.

In fact, I was startled by Antoine's post.  I even have a couple of
lines of code using "^" as a *logical* operator on known bools,
carefully labeled "# Hack! works only on true bools."

That said, I'm not Dutch, and if treating bool as "not actually int"
here is the right thing to do, then I would think the easiest thing to
do would be to interpret the bitwise operations as performed on
(mythical) C "itty-bitty ints".[1]  Then ~ does the right thing and

True << 1 == True >> 1 == False << 1 == False >> 1 == 0

giving us four new ways to spell 0 as a bonus!

 > After all, not withstanding their fancy string representation,

I guess "fancy string representation" was the original motivation for
the overrides.  If the intent was really to make operator versions of
logical operators (but only for true bools!), they would have fixed ~

 > they behave like ints and actually are ints.

I can't fellow-travel all the way to "actually are", though.  bools
are what we decide to make them.  I just don't see why the current
behaviors of &|^ are particularly useful, since you'll have to guard
all bitwise expressions against non-bool truthies and falsies.

[1]  "itty-bitty" almost reads "1 bit" in Japanese!

From rosuav at  Fri Apr  8 02:03:42 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Apr 2016 16:03:42 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 8, 2016 at 3:51 PM, Stephen J. Turnbull <stephen at> wrote:
> Anything
> where you combine it with bytes has to result in bytes, or prepare for
> the combining operation to raise if the bytes contain any non-ASCII.

More helpfully, the Ascii object would raise *before* that - it would
raise on construction if it had any non-ASCII in it. Combining an
Ascii with a bytes results in a perfectly well-formed bytes; combining
an Ascii with a unicode results in a perfectly well-formed unicode.
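
A rough sketch of what such an object might look like. The Ascii class here is entirely hypothetical (nothing like it exists in the stdlib), and it only covers concatenation; it is meant to show the "validate at construction, then combine with either type" idea, not a finished design.

```python
class Ascii:
    """Hypothetical ASCII-only text that combines with bytes or str."""

    def __init__(self, data):
        if isinstance(data, bytes):
            # Raises UnicodeDecodeError if any byte is non-ASCII.
            data = data.decode("ascii")
        else:
            # Raises UnicodeEncodeError if any character is non-ASCII.
            data.encode("ascii")
        self._text = data

    def __add__(self, other):
        if isinstance(other, bytes):
            return self._text.encode("ascii") + other
        if isinstance(other, str):
            return self._text + other
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, bytes):
            return other + self._text.encode("ascii")
        if isinstance(other, str):
            return other + self._text
        return NotImplemented


print(Ascii("abc") + b"def")   # bytes result
print("x" + Ascii(b"y"))       # str result

try:
    Ascii("caf\u00e9")
except UnicodeError as exc:
    print("rejected at construction:", type(exc).__name__)
```

Because the check happens once in __init__, every later combination is guaranteed to succeed, which is the point being made above.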

But I'm not sure how (a) useful and (b) feasible this is. The only
real use-case I know of is Path objects, and there may be similar
tasks around numpy or pandas, but I don't know for sure.


From ethan at  Fri Apr  8 02:08:46 2016
From: ethan at (Ethan Furman)
Date: Thu, 07 Apr 2016 23:08:46 -0700
Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/07/2016 07:17 PM, Alexander Belopolsky wrote:

> You greatly expanded the scope of the discussion.  The original thread
> was focused on a single feature, and a not so widely used one.  You
> asked 6-7 (rhetorical?) questions below that have nothing to do with the
> original topic.

Thank you, Alexander.  I appreciate the feedback.

It's entirely possible I'm talking to myself, but in case anyone is 

Guido, it felt to me that you were being a jerk.  I was definitely an 
ass in response.  For that I apologize.

As for broadening the scope of the discussion I will say this:

So far I have authored or helped with three successful PEPs:  the first 
was to amend an incomplete feature (raise from None), and the third was 
to add back a feature that had been ripped out (%-interpolation with bytes).

Why were those two even necessary?  I suspect because the folks involved 
were only thinking about their own needs and/or didn't have the relevant 
experience as to why those features were useful.

Perhaps I am only flattering myself, but I think I try hard to see all 
sides of every issue, and the only way I can do that is by asking 
questions of those with more or different experience than I have.

I think the current pathlib discussions are a fair indicator:  I don't 
particularly care for it, and might never use it -- but I hate to see it 
tossed and Antoine's work and effort lost; so I'm working to find a 
reasonable way to keep it, and not just in asking questions and offering 
ideas -- I volunteered my time to write the code.

I'm starting to ramble so let me close with this:  I'm not sorry for 
asking questions and trying to look at the broader issues, and I'm not 
going to stop doing that -- but I will stop pursuing any particular 
issue when asked to do so... but please don't be insulting about it.


From greg.ewing at  Fri Apr  8 02:26:34 2016
From: greg.ewing at (Greg Ewing)
Date: Fri, 08 Apr 2016 18:26:34 +1200
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

Michael Selik wrote:
> To clarify, the proposal is: ``~True == False`` but every other operation on
> ``True`` remains the same, including ``True * 42 == 42``. Correct?

Seems to me things are fine as they are.

The justification for & and | on bools returning bools
is that the result remains within the domain of bools,
even when they are interpreted as int operations.

But ~ on a bool-interpreted-as-an-int doesn't have
that property, so ~True is more in the realm of
True * 42 in that regard.


From greg.ewing at  Fri Apr  8 02:54:33 2016
From: greg.ewing at (Greg Ewing)
Date: Fri, 08 Apr 2016 18:54:33 +1200
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <ne6ei7$cbg$>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

Terry Reedy wrote:
> Given that ~bool hardly makes any sense currently, I would not expect it 
> to be in much use now.  Hence not much to break.

But conversely, any code that *is* using ~bool instead of
"not bool" is probably doing it precisely because it
*does* want the integer interpretation.

Why break that code, when "not bool" is available as the
obvious way of getting a logically negated bool?


From greg.ewing at  Fri Apr  8 03:06:33 2016
From: greg.ewing at (Greg Ewing)
Date: Fri, 08 Apr 2016 19:06:33 +1200
Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of
In-Reply-To: <>
References: <>
Message-ID: <>

Ethan Furman wrote:
> Can anybody please enlighten me as to what, exactly, I did wrong?

I think Guido was objecting to talk about things such as
making bool no longer subclass int, which is not only
going a long way beyond the original proposal, but is
almost certainly never going to happen, so any discussion
of it could be seen as wasting people's time.


From njs at  Fri Apr  8 03:34:06 2016
From: njs at (Nathaniel Smith)
Date: Fri, 8 Apr 2016 00:34:06 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <ne6ei7$cbg$> <>
Message-ID: <>

On Thu, Apr 7, 2016 at 11:54 PM, Greg Ewing <greg.ewing at> wrote:
> Terry Reedy wrote:
>> Given that ~bool hardly makes any sense currently, I would not expect it to
>> be in much use now.  Hence not much to break.
> But conversely, any code that *is* using ~bool instead of
> "not bool" is probably doing it precisely because it
> *does* want the integer interpretation.

It would be interesting actually if anyone has any idea of how to get
some empirical data on this -- it isn't actually clear to me whether
this is true or not.

The reason I'm uncertain is that in numpy code, using operations like
~ on booleans is *very* common, because the whole idea of numpy is
that it gives you a way to write code that works the same on either a
single value or on an array of values, and when you're working with
booleans then this means you have to use '~': '~' works on arrays and
'not' doesn't.

And, for numpy bools or arrays of bools, ~ does logical negation:

In [1]: ~np.bool_(True)
Out[1]: False

So you can write code like:

# Contrived function that doubles 'value' if do_double is true
# and otherwise halves it
def double_or_halve(value, do_double):
    value = np.asarray(value)
    value[do_double] *= 2
    value[~do_double] *= 0.5
    return value

and then this works correctly if 'do_double' is a numpy bool or array of bools:

In [16]: double_or_halve(np.arange(3, dtype=float), np.array([True,
False, True]))
Out[16]: array([ 0. ,  0.5,  4. ])

In [21]: double_or_halve(5.0, np.bool_(False))
Out[21]: array(2.5)

But if you pass in a regular Python bool then the attempt to index by
~do_double turns into negative integer indexing and blows up:

In [23]: double_or_halve(5.0, False)
IndexError: too many indices for array

Of course this is a totally contrived function, and anyway it has a
bug -- the user should have said 'do_double = np.asarray(do_double)'
at the top of the function, and that would fix the problem. This is
definitely not some massive problem afflicting numerical users, and I
don't have any strong opinion on Antoine's proposal.

But, it is the only case where I can imagine someone intentionally
writing ~bool, so it actually strikes me as plausible that the
majority of existing code that writes ~bool is like this: doing it by
mistake and expecting it to be the same as 'not'.



Nathaniel J. Smith

From niki.spahiev at  Fri Apr  8 03:46:36 2016
From: niki.spahiev at (Niki Spahiev)
Date: Fri, 8 Apr 2016 10:46:36 +0300
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <ne7njm$5t6$>

On  8.04.2016 09:26, Greg Ewing wrote:
> Michael Selik wrote:
>> To clarify, the proposal is: ``~True == False`` but every other
>> operation on
>> ``True`` remains the same, including ``True * 42 == 42``. Correct?
> Seems to me things are fine as they are.
> The justification for & and | on bools returning bools
> is that the result remains within the domain of bools,
> even when they are interpreted as int operations.
> But ~ on a bool-interpreted-as-an-int doesn't have
> that property, so ~True is more in the realm of
> True * 42 in that regard.

What should be the result of +True? 1 or True?


From k7hoven at  Fri Apr  8 08:28:08 2016
From: k7hoven at (Koos Zevenhoven)
Date: Fri, 8 Apr 2016 15:28:08 +0300
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

On Fri, Apr 8, 2016 at 4:40 AM, Steven D'Aprano <steve at> wrote:
> On Thu, Apr 07, 2016 at 11:08:38AM -0700, Guido van Rossum wrote:
>> To be more precise, there are some "arithmetic" operations (+, -,
>> *, /, **) and they all treat bools as ints and always return ints;
>> there are also some "bitwise" operations (&, |, ^, ~) and they should
>> all treat bools as bools and return a bool.
> You missed two: >> and <<. What are we to do with (True << 1)?

Not everyone considers bit shifts 'bitwise', as they don't act at the
level of individual bit positions.

> Honestly, I cannot even imagine what it means to say "shift a truth
> value N bits". I think the idea that bitwise operations on bools are
> actually boolean operations in disguise is not a well-formed idea.
> Sometimes it happens to work out (& | ^), and sometimes it doesn't (<<
> and ~).

One point of view is that bitwise operations should stay within bool,
while shifts return ints, the left-shift operations actually being
much more useful than "left-pad" ;). The main point of >> can be seen
as consistency, although perhaps useless.

That said, I don't really have an opinion on the OP's suggestion.


From songofacandy at  Fri Apr  8 09:25:50 2016
From: songofacandy at (INADA Naoki)
Date: Fri, 8 Apr 2016 22:25:50 +0900
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

> We have abstract base classes for such tests, but there's nothing
> which would define "string-like" as ABC. Before trying to define
> a test via a special method, I think it's better to define what
> exactly you mean by "string-like".
> Something along the lines of numbers.Number, but for strings.
> To make an object string-like, you then test for the ABC and
> then call .__str__() to get the string representation as string.
Is an ABC easy to use from C?
There should be an easy way to (1) define a "string-like" class in C and (2) use
"string-like" objects from C.

From desmoulinmichel at  Fri Apr  8 10:15:10 2016
From: desmoulinmichel at (Michel Desmoulin)
Date: Fri, 8 Apr 2016 16:15:10 +0200
Subject: [Python-ideas] Boundaries for unpacking
In-Reply-To: <ne6brb$ubp$>
References: <> <ne6brb$ubp$>
Message-ID: <>

On 07/04/2016 21:18, Terry Reedy wrote:
> On 4/7/2016 1:03 PM, Michel Desmoulin wrote:
>> Python is a lot about iteration, and I often have to get values from an
>> iterable. For this, unpacking is fantastic:
>> a, b, c = iterable
>> One problem that arises is that you don't know whether iterable will
>> contain 3 items.
> You need to augment if needed to get at least 3 items.
> You need to chop if needed to get at most 3 items.  The following does
> exactly this.
> import itertools as it
> a, b, c = it.islice(it.chain(iterable, it.repeat(default, 3)), 3)
> You can wrap this if you want with def exactly(iterable, n, default).
> This might already be a recipe in the itertools doc recipe section.

Yes and you can also do that for regular slicing on list. But you don't,
because you have regular slicing, which is cleaner, and easier to read
and remember.

This is not only hard to remember, but needs a comment. And an import.
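
For reference, the wrapper Terry mentions could look roughly like the
sketch below. The name exactly() and its signature come from his
message; the default=None fallback is an assumption added here.

```python
import itertools as it


def exactly(iterable, n, default=None):
    """Yield exactly n items: pad with `default` if short, truncate if long."""
    # chain() appends n copies of the default, islice() cuts at n items.
    return it.islice(it.chain(iterable, it.repeat(default, n)), n)


a, b, c = exactly([1, 2], 3, default=0)
print(a, b, c)  # 1 2 0
```

It still needs the import and a helper definition, which is the cost
being weighed against dedicated syntax.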

From p.f.moore at  Fri Apr  8 10:33:16 2016
From: p.f.moore at (Paul Moore)
Date: Fri, 8 Apr 2016 15:33:16 +0100
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 7 April 2016 at 16:26, M.-A. Lemburg <mal at> wrote:
> We have abstract base classes for such tests, but there's nothing
> which would define "string-like" as ABC. Before trying to define
> a test via a special method, I think it's better to define what
> exactly you mean by "string-like".

As a more general point here, if the point is simply to group a set of
types together for use by "user" code (whether application code or 3rd
party libraries) then that's precisely what the ABC machinery is for.

It doesn't require a core change or a PEP or anything other than
agreement between the users of the facility to define an ABC for
*anything*, even something as vague as "string-like" - if all parties
concerned agree that something is "string-like" when it is an instance
of the StringLike ABC, then that's a done deal. You can add a
"to_string()" method to the ABC, which can be as simple as

def to_string(obj):
    return str(obj)

and let types for which that's not appropriate override it.
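
A minimal sketch of that arrangement, assuming a hypothetical
StringLike ABC agreed on between libraries (neither StringLike nor the
example classes Label, Token and as_text exist in the stdlib):

```python
from abc import ABC


class StringLike(ABC):
    """Hypothetical ABC: membership means 'safe to treat as a string'."""

    def to_string(self):
        # Default conversion; subclasses may override it.
        return str(self)


class Label(StringLike):
    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


class Token(StringLike):
    def __init__(self, raw):
        self.raw = raw

    def to_string(self):
        # A type for which plain str() is not the appropriate conversion.
        return self.raw.strip()


def as_text(obj):
    """Consumer code: accept only objects that opted in to StringLike."""
    if not isinstance(obj, StringLike):
        raise TypeError("expected a StringLike object")
    return obj.to_string()


print(as_text(Label("spam")))     # spam
print(as_text(Token("  eggs ")))  # eggs
```

Types registered with StringLike.register() would pass the isinstance
check without inheriting to_string(), so a real design would need to
decide how to handle those.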

The only real reason for a new "protocol" (a dunder method) is if the
core interpreter or low-level parts of the stdlib need to participate.
In that case, the ABC mechanisms may not be available/appropriate or
may introduce unacceptable overhead (e.g., in something like dict

But in general, we have a mechanism for things like this, why are
people inventing home-grown solutions rather than using them? (In the
case of pathlib and __fspath__, it's because of the low-level
implications of adding support to open() and places like importlib -
but that's *also* why the solution should remain focused on the
specific problem, and not become over-generalised).


From guido at  Fri Apr  8 10:54:16 2016
From: guido at (Guido van Rossum)
Date: Fri, 8 Apr 2016 07:54:16 -0700
Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016 at 11:08 PM, Ethan Furman <ethan at> wrote:
> On 04/07/2016 07:17 PM, Alexander Belopolsky wrote:
>> You greatly expanded the scope of the discussion.  The original thread
>> was focused on a single feature, and a not so widely used one.  You
>> asked 6-7 (rhetorical?) questions below that have nothing to do with the
>> original topic.
> Thank you, Alexander.  I appreciate the feedback.
> It's entirely possible I'm talking to myself, but in case anyone is
> listening:
> Guido, it felt to me that you were being a jerk.  I was definitely an ass in
> response.  For that I apologize.

Thanks for that. I didn't mean to be a jerk, just to prevent the
thread from derailing way out of scope. And I was on my cell phone,
where I can't be as eloquent as when I have a real keyboard. So I
apologize for sounding like a jerk.

> As for broadening the scope of the discussion I will say this:
> So far I have authored or helped with three successful PEPs;  The first was
> to amend an incomplete feature (raise from None), and the third was to add
> back a feature that had been ripped out (%-interpolation with bytes).
> Why were those two even necessary?  I suspect because the folks involved
> were only thinking about their own needs and/or didn't have the relevant
> experience as to why those features were useful.
> Perhaps I am only flattering myself, but I think I try hard to see all sides
> of every issue, and the only way I can do that is by asking questions of
> those with more or different experience than I have.
> I think the current pathlib discussions are a fair indicator:  I don't
> particularly care for it, and might never use it -- but I hate to see it
> tossed and Antoine's work and effort lost; so I'm working to find a
> reasonable way to keep it, and not just in asking questions and offering
> ideas -- I volunteered my time to write the code.
> I'm starting to ramble so let me close with this:  I'm not sorry for asking
> questions and trying to look at the broader issues, and I'm not going to
> stop doing that -- but I will stop pursuing any particular issue when asked
> to do so... but please don't be insulting about it.

Starting a new thread with a broader (or different) scope is always
totally fine. Bringing up a whole bunch of contentious things that
distract from a relatively simple issue is not. Thanks for asking for

--Guido van Rossum (

From guido at  Fri Apr  8 11:00:27 2016
From: guido at (Guido van Rossum)
Date: Fri, 8 Apr 2016 08:00:27 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

On Thu, Apr 7, 2016 at 6:40 PM, Steven D'Aprano <steve at> wrote:

> You missed two: >> and <<. What are we to do with (True << 1)?

Those are indeed ambiguous -- they are defined as multiplication or
floor division with a power of two, e.g. x<<n is x*2**n and x>>n is
x//2**n (for integral x and nonnegative n). The point of this thread
seems to be to see whether some operations can be made more useful by
staying in the bool domain -- I don't think making both of these
return 0 if n != 0 would achieve that, so let's keep them unchanged.
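
The current behaviour can be checked directly. This is only an
illustration of the identities above as they hold today; the helper
name shift_identities_hold is made up for the example.

```python
def shift_identities_hold(x, n):
    """Check x<<n == x*2**n and x>>n == x//2**n for integral x, n >= 0."""
    return (x << n) == x * 2 ** n and (x >> n) == x // 2 ** n


# bools take part as the ints 1 and 0, and the results are plain ints.
assert all(shift_identities_hold(x, n)
           for x in (True, False, 5, -7)
           for n in range(5))

print(True << 1, type(True << 1).__name__)  # 2 int
print(True >> 1)                            # 0
```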

> Even if we "fix" this, surely we should go through the usual deprecation
> process? This isn't a critical security bug that needs fixing, it's a
> semantic change to something that has worked this way for 25 years, and
> it's going to break something somewhere. There are just far too many
> people that expect that bools are ints. After all, notwithstanding
> their fancy string representation, they behave like ints and actually
> are ints.

The thing here is, this change is too small to warrant a __future__
import. So we're either going to introduce it in 3.6 and tell people
about it in case their code might break, or we're never going to do
it.  I'm honestly on the fence, but I feel this is a rarely used
operator so changing its meaning is not likely to break a lot of code.

--Guido van Rossum (

From random832 at  Fri Apr  8 11:27:13 2016
From: random832 at (Random832)
Date: Fri, 08 Apr 2016 11:27:13 -0400
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

On Fri, Apr 8, 2016, at 11:00, Guido van Rossum wrote:
> The thing here is, this change is too small to warrant a __future__
> import. So we're either going to introduce it in 3.6 and tell people
> about it in case their code might break, or we're never going to do
> it.  I'm honestly on the fence, but I feel this is a rarely used
> operator so changing its meaning is not likely to break a lot of code.

What about just having a DeprecationWarning, but no __future__ import?

From guido at  Fri Apr  8 11:42:52 2016
From: guido at (Guido van Rossum)
Date: Fri, 8 Apr 2016 08:42:52 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

DeprecationWarning every time you use ~ on a bool? That would still be
too big a burden on using it the new way.

On Fri, Apr 8, 2016 at 8:27 AM, Random832 <random832 at> wrote:
> On Fri, Apr 8, 2016, at 11:00, Guido van Rossum wrote:
>> The thing here is, this change is too small to warrant a __future__
>> import. So we're either going to introduce it in 3.6 and tell people
>> about it in case their code might break, or we're never going to do
>> it.  I'm honestly on the fence, but I feel this is a rarely used
>> operator so changing its meaning is not likely to break a lot of code.
> What about just having a DeprecationWarning, but no __future__ import?
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

--Guido van Rossum (

From rian at  Fri Apr  8 12:40:15 2016
From: rian at (Rian Hunter)
Date: Fri, 8 Apr 2016 09:40:15 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

> From: Chris Angelico <rosuav at>
>> On Fri, Apr 8, 2016 at 12:17 PM, Rian Hunter <rian at> wrote:
>> Contrast with the following code that never makes sense (and is why I
>> said that NameError definitely signifies a programming error):
>>   try:
>>       foo = bar
>>   except NameError:
>>       foo = 0
> This is exactly the idiom used to cope with builtins that may or may
> not exist. If you want to support Python 2 as well as 3, you might use
> something like this:
> try:
>   input = raw_input
> except NameError:
>   raw_input = input

Oops, good call. That was a bad example. Maybe this one is a bit better:

   try:
       assert no_bugs()
   except AssertionError:
       # bugs are okay
       pass

> This is what I'd call a boundary location. You have "outer" code and
> "inner" code. Any uncaught exception in the inner code should get
> logged rather than aborting the outer code. I'd spell it like this:
> try:
>   response = ...
> except BaseException as e:
>   logging.exception(...)
>   response = create_500_response()
> [snip]
> At this kind of boundary, you basically catch-and-log *all*
> exceptions, handling them the same way. Doesn't matter whether it's
> ValueError, SyntaxError, NameError, RuntimeError, GeneratorStop, or
> SystemExit - they all get logged, the client gets a 500, and you go
> back and look for another request.

I'm very familiar with this pattern in Python and I've used it myself countless times. Unfortunately I've seen instances where it can lead to disastrous behavior, I think we all have. It seems to have limited usefulness in production and more usefulness in development. The point of my original email was precisely to put the legitimacy and usefulness of code like that into question.

> As to resetting stuff: I wouldn't bother; your functions should
> already not mess with global state. The only potential mess you should
> consider dealing with is a database rollback; and actually, my
> personal recommendation is to do that with a context manager inside
> the inner code, rather than a reset in the exception handler in the
> outer code.

So I agree this pattern works if you assume all code is exception-safe (context managers will clean up intermediate state) and there are no programming errors. There are lots of things that good code should do but as we all well know good code doesn't always do those things. My point is that except-log loops are dangerous and careless in the face of programming errors.

Programming errors are unavoidable. In a large system they are a daily fact of life. When there is a bug in production it's very dangerous to re-enter a code block that has demonstrated itself to be buggy, else you risk corrupting data. For example:

   def random_code(state):
       assert is_valid(

   def user_code_a(state):
       = "bad data"
       # the following throws
       random_code(state)

   def user_code_b(state):
       state.db.write(

   def main_loop():
       state = State()
       loop = [user_code_a,
               user_code_b]
       for fn in loop:
           try:
               fn()
           except Exception:
               log_exception()
This code allows user_code_b() to execute and corrupt data even though random_code() was lucky enough to be called and detect bad state early on.
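
A self-contained version of the scenario, for anyone who wants to run
it. The attribute name state.data, the FakeDB class and the body of
is_valid() are assumptions made for illustration (the attribute name is
elided in the archived listing), and fn(state) is passed explicitly so
the loop actually runs.

```python
import logging


class FakeDB:
    """Stand-in for a database: records every value written to it."""

    def __init__(self):
        self.rows = []

    def write(self, value):
        self.rows.append(value)


class State:
    def __init__(self):
        self.data = "good data"
        self.db = FakeDB()


def is_valid(data):
    return data != "bad data"


def random_code(state):
    if not is_valid(state.data):
        # Equivalent to a failed assert, but not disabled by python -O.
        raise AssertionError("invalid state")


def user_code_a(state):
    state.data = "bad data"
    random_code(state)  # raises AssertionError


def user_code_b(state):
    state.db.write(state.data)


def main_loop():
    state = State()
    for fn in [user_code_a, user_code_b]:
        try:
            fn(state)
        except Exception:
            logging.exception("callback failed")
    return state


print(main_loop().db.rows)  # ['bad data']
```

The AssertionError is logged, the loop carries on, and the bad value
still reaches the fake database.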

You may say the fix is to assert correct data before writing to the database and, yes, that would fix the problem for future executions in this instance. That's not the point, the point is that incorrect buggy code is running in production today and it's imperative to have multiple safeguards to limit its damage. For example, you wouldn't have an except-log loop in an airplane control system.

Sometimes an error is just an error but sometimes an error signifies the running system itself is in a bad state. It would be nice to distinguish between the two in a consistent way across all Python event loops. Halting on any escaped exception is inconvenient, but continuing after any escaped exception is dangerous.


From rosuav at  Fri Apr  8 12:53:21 2016
From: rosuav at (Chris Angelico)
Date: Sat, 9 Apr 2016 02:53:21 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 9, 2016 at 2:40 AM, Rian Hunter <rian at> wrote:
>> From: Chris Angelico <rosuav at>
>> This is exactly the idiom used to cope with builtins that may or may
>> not exist. If you want to support Python 2 as well as 3, you might use
>> something like this:
>> try:
>>   input = raw_input
>> except NameError:
>>   raw_input = input
> Oops, good call. That was a bad example. Maybe this one is a bit better:
>    try:
>        assert no_bugs()
>    except AssertionError:
>        # bugs are okay
>        pass

Not sure I understand this example. It's obviously a toy, because
you'd never put an 'assert' just to catch and discard the exception -
the net result is that you call no_bugs() if you're in debug mode and
don't if you're not, which is more cleanly spelled "if __debug__:
no_bugs()".

>> This is what I'd call a boundary location. You have "outer" code and
>> "inner" code. Any uncaught exception in the inner code should get
>> logged rather than aborting the outer code.
> I'm very familiar with this pattern in Python and I've used it myself countless times. Unfortunately I've seen instances where it can lead to disastrous behavior, I think we all have. It seems to have limited usefulness in production and more usefulness in development. The point of my original email was precisely to put the legitimacy and usefulness of code like that into question.

When does this lead to disaster? Is it because someone creates a
boundary that shouldn't exist, or because the code maintains global
state in bad ways? The latter is a problem even without exceptions;
imagine a programming error that doesn't cause an exception, but just
omits some crucial "un-modify global state" call. You have no way of
detecting that in the outer code, and your state is messed up.

>> As to resetting stuff: I wouldn't bother; your functions should
>> already not mess with global state. The only potential mess you should
>> consider dealing with is a database rollback; and actually, my
>> personal recommendation is to do that with a context manager inside
>> the inner code, rather than a reset in the exception handler in the
>> outer code.
> So I agree this pattern works if you assume all code is exception-safe (context managers will clean up intermediate state) and there are no programming errors. There are lots of things that good code should do but as we all well know good code doesn't always do those things. My point is that except-log loops are dangerous and careless in the face of programming errors.

I don't think the except-log loop is the problem here. The problem is
the code that can go in part way, come out again, and leave itself in
a mess.

> Programming errors are unavoidable. In a large system they are a daily fact of life. When there is a bug in production it's very dangerous to re-enter a code block that has demonstrated itself to be buggy, else you risk corrupting data. For example:
>    def random_code(state):
>        assert is_valid(
>    def user_code_a(state):
> = "bad data"
>        # the following throws
>        random_code(state)
>    def user_code_b(state):
>        state.db.write(
>    def main_loop():
>        state = State()
>        loop = [user_code_a,
>                user_code_b]
>        for fn in loop:
>            try:
>                fn()
>            except Exception:
>                log_exception()
> This code allows user_code_b() to execute and corrupt data even though random_code() was lucky enough to be called and detect bad state early on.

Can you give a non-toy example that has this kind of mutable state at
top level? I suspect it's bad design. If it's truly necessary, use a
context manager to guarantee the reset:

def user_code_a(state):
    with state.set_data("bad data"):
        random_code(state)
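
One possible shape for such a set_data() method, as a sketch only: the
State class and its data attribute here are hypothetical, and
contextlib.contextmanager is just one way to write it.

```python
from contextlib import contextmanager


class State:
    def __init__(self, data):
        self.data = data

    @contextmanager
    def set_data(self, value):
        """Temporarily set self.data, restoring the old value on exit."""
        previous = self.data
        self.data = value
        try:
            yield self
        finally:
            # Runs on normal exit and when an exception propagates.
            self.data = previous


state = State("good data")
try:
    with state.set_data("bad data"):
        raise AssertionError("invalid state detected")
except AssertionError:
    pass

print(state.data)  # good data
```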

> You may say the fix is to assert correct data before writing to the database and, yes, that would fix the problem for future executions in this instance. That's not the point, the point is that incorrect buggy code is running in production today and it's imperative to have multiple safeguards to limit its damage. For example, you wouldn't have an except-log loop in an airplane control system.

Actually, yes I would. The alternative that you're suggesting is to
have any error immediately shut down the whole system. Is that really
better? To have the entire control system disabled?

> Sometimes an error is just an error but sometimes an error signifies the running system itself is in a bad state. It would be nice to distinguish between the two in a consistent way across all Python event loops. Halting on any escaped exception is inconvenient, but continuing after any escaped exception is dangerous.

There's no way for Python to be able to fix this for you. The tools
exist - most notably context managers - so the solution is to use

I don't think we're in python-ideas territory here. What I see here is
a perfect subject for a blog post or other scholarly article on
"Python exception handling best practice", plus possibly an internal
style guide for your company/organization. The blog post I would
definitely read with interest; the style guide ought to be stating the
obvious (as most style guides should).


From brett at  Fri Apr  8 13:39:21 2016
From: brett at (Brett Cannon)
Date: Fri, 08 Apr 2016 17:39:21 +0000
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

On Fri, 8 Apr 2016 at 08:44 Guido van Rossum <guido at> wrote:

> DeprecationWarning every time you use ~ on a bool? That would still be
> too big a burden on using it the new way.

I think the proposal would be a DeprecationWarning to flush out/remove all
current uses of ~bool with Python 3.6, and then in Python 3.7 introduce the
new semantics.


> On Fri, Apr 8, 2016 at 8:27 AM, Random832 <random832 at> wrote:
> >
> >
> > On Fri, Apr 8, 2016, at 11:00, Guido van Rossum wrote:
> >> The thing here is, this change is too small to warrant a __future__
> >> import. So we're either going to introduce it in 3.6 and tell people
> >> about it in case their code might break, or we're never going to do
> >> it.  I'm honestly on the fence, but I feel this is a rarely used
> >> operator so changing its meaning is not likely to break a lot of code.
> >
> > What about just having a DeprecationWarning, but no __future__ import?

From rosuav at  Fri Apr  8 13:42:24 2016
From: rosuav at (Chris Angelico)
Date: Sat, 9 Apr 2016 03:42:24 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 7, 2016 at 11:04 PM, Chris Angelico <rosuav at> wrote:
> This is a spin-off from the __fspath__ discussion on python-dev, in
> which a few people said that a more general approach would be better.
> Proposal: Objects should be allowed to declare that they are
> "string-like" by creating a dunder method (analogously to __index__
> for integers) which implies a loss-less conversion to str.
> This could supersede the __fspath__ "give me the string for this path"
> protocol, or could stand in parallel with it.

In the light of all the arguments put forward in this thread and
elsewhere, I'm withdrawing this proposal in favour of __fspath__. If
additional use-cases are found for a "string-like object", this can be
revived, but otherwise, it's too general and insufficiently useful.


From rian at  Fri Apr  8 14:24:15 2016
From: rian at (rian at
Date: Fri, 08 Apr 2016 11:24:15 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

>> Oops, good call. That was a bad example. Maybe this one is a bit 
>> better:
>>    try:
>>        assert no_bugs()
>>    except AssertionError:
>>        # bugs are okay
>>        pass
> Not sure I understand this example. It's obviously a toy, because
> you'd never put an 'assert' just to catch and discard the exception -
> the net result is that you call no_bugs() if you're in debug mode and
> don't if you're not, which is more cleanly spelled "if __debug__:
> no_bugs()".

The example is removed from its original context. The point was to show 
that there is no reason to locally catch AssertionError (contrast with 
AttributeError, ValueError, TypeError, etc.). AssertionError is *always* 
indicative of a bug (unlike AttributeError which only may be indicative 
of a bug).

>>> This is what I'd call a boundary location. You have "outer" code and
>>> "inner" code. Any uncaught exception in the inner code should get
>>> logged rather than aborting the outer code.
>> I'm very familiar with this pattern in Python and I've used it myself 
>> countless times. Unfortunately I've seen instances where it can lead 
>> to disastrous behavior, I think we all have. It seems to have limited 
>> usefulness in production and more usefulness in development. The point 
>> of my original email was precisely to put the legitimacy and 
>> usefulness of code like that into question.
> When does this lead to disaster? Is it because someone creates a
> boundary that shouldn't exist, or because the code maintains global
> state in bad ways? The latter is a problem even without exceptions;
> imagine a programming error that doesn't cause an exception, but just
> omits some crucial "un-modify global state" call. You have no way of
> detecting that in the outer code, and your state is messed up.

This leads to disaster in the case of buggy code. This leads to a 
greater disaster when the boundary allows the code to continue running. 
It's not a black and white thing; there is no 100% reliable way to detect 
buggy code, but my argument is that you shouldn't ignore an assert when 
you're lucky enough to get one.

>>> As to resetting stuff: I wouldn't bother; your functions should
>>> already not mess with global state. The only potential mess you 
>>> should
>>> consider dealing with is a database rollback; and actually, my
>>> personal recommendation is to do that with a context manager inside
>>> the inner code, rather than a reset in the exception handler in the
>>> outer code.
>> So I agree this pattern works if you assume all code is exception-safe 
>> (context managers will clean up intermediate state) and there are no 
>> programming errors. There are lots of things that good code should do 
>> but as we all well know good code doesn't always do those things. My 
>> point is that except-log loops are dangerous and careless in the face 
>> of programming errors.
> I don't think the except-log loop is the problem here. The problem is
> the code that can go in part way, come out again, and leave itself in
> a mess.

I'm not saying except-log is to blame for buggy code. I'm saying that 
when an exception occurs, there's no way to tell whether it was caused 
by an internal bug or an external error. Except-log should normally be 
limited to exceptions caused by external errors in production code.

>> Programming errors are unavoidable. In a large system they are a daily 
>> fact of life. When there is a bug in production it's very dangerous to 
>> re-enter a code block that has demonstrated itself to be buggy, else 
>> you risk corrupting data. For example:
>>    def random_code(state):
>>        assert is_valid(
>>    def user_code_a(state):
>> = "bad data"
>>        # the following throws
>>        random_code(state)
>>    def user_code_b(state):
>>        state.db.write(
>>    def main_loop():
>>        state = State()
>>        loop = [user_code_a,
>>                user_code_b]
>>        for fn in loop:
>>            try:
>>                fn()
>>            except Exception:
>>                log_exception()
>> This code allows user_code_b() to execute and corrupt data even though 
>> random_code() was lucky enough to be called and detect bad state early 
>> on.
> Can you give a non-toy example that has this kind of mutable state at
> top level? I suspect it's bad design. If it's truly necessary, use a
> context manager to guarantee the reset:
> def user_code_a(state):
>     with state.set_data("bad data"):
>         random_code(state)

Yes it may be bad design or the code may have bugs but that's precisely 
the point. By the time the exception hits your catch-all, there's no 
universal way of determining whether or not the exception was due to an 
internal bug (or bad design or whatever) or an external error.

>> You may say the fix is to assert correct data before writing to the 
>> database and, yes, that would fix the problem for future executions in 
>> this instance. That's not the point, the point is that incorrect buggy 
>> code is running in production today and it's imperative to have 
>> multiple safeguards to limit its damage. For example, you wouldn't 
>> have an except-log loop in an airplane control system.
> Actually, yes I would. The alternative that you're suggesting is to
> have any error immediately shut down the whole system. Is that really
> better? To have the entire control system disabled?

The alternative I'm suggesting is to reset that flight computer and 
switch to the backup one (potentially written by a different team).

>> Sometimes an error is just an error but sometimes an error signifies 
>> the running system itself is in a bad state. It would be nice to 
>> distinguish between the two in a consistent way across all Python 
>> event loops. Halting on any escaped exception is inconvenient, but 
>> continuing after any escaped exception is dangerous.
> There's no way for Python to be able to fix this for you. The tools
> exist - most notably context managers - so the solution is to use
> them.

Context managers don't allow me to determine whether an exception is 
caused by an internal bug or external error, so it's not a solution to 
this problem. I want to live in a world where I can do this:

     while cbs:
         cb = cbs.pop()
         try:
             cb()
         except Exception as e:
             logging.exception("In main loop")
             if is_a_bug(e):
                 raise SystemExit() from e

Python may be able to do something here. One possible thing is a new 
exception hierarchy, there may be other solutions. This may be 
sufficient but I doubt it:

     def is_a_bug(e):
         return isinstance(e, AssertionError)
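
Put together as a runnable sketch, with run_callbacks() as a made-up
wrapper name and the deliberately naive AssertionError-only predicate
from above:

```python
import logging


def is_a_bug(exc):
    # Naive predicate: treats only AssertionError as a programming error.
    return isinstance(exc, AssertionError)


def run_callbacks(cbs):
    """Run callbacks last-in-first-out; log failures, halt on a 'bug'."""
    cbs = list(cbs)
    while cbs:
        cb = cbs.pop()
        try:
            cb()
        except Exception as e:
            logging.exception("In main loop")
            if is_a_bug(e):
                raise SystemExit(1) from e
```

An OSError raised by a callback is logged and the loop continues; an
AssertionError is logged and then escalated to SystemExit with the
original exception attached as __cause__.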

The reason I am polling python-ideas is that I can't be the only one who 
has ever encountered this deficiency in Python.


From rosuav at  Fri Apr  8 14:28:47 2016
From: rosuav at (Chris Angelico)
Date: Sat, 9 Apr 2016 04:28:47 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 9, 2016 at 4:24 AM,  <rian at> wrote:
> Python may be able to do something here. One possible thing is a new
> exception hierarchy, there may be other solutions. This may be sufficient
> but I doubt it:
>     def is_a_bug(e):
>         return isinstance(e, AssertionError)
> The reason I am polling python-ideas is that I can't be the only one who has
> ever encountered this deficiency in Python.

How about this:

def is_a_bug(e):
    return True

ANY uncaught exception is a bug. Any caught exception is not a bug.


From steve at  Fri Apr  8 15:01:19 2016
From: steve at (Steven D'Aprano)
Date: Sat, 9 Apr 2016 05:01:19 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 09, 2016 at 04:28:47AM +1000, Chris Angelico wrote:

> ANY uncaught exception is a bug.

What, even SystemExit and KeyboardInterrupt?

> Any caught exception is not a bug.


Even:

try:
    ...
except:
    # Look, bug free programming!
    pass

I am sure that any attempt to make universal rules about what is or 
isn't a bug will be doomed to failure. Remember that people can write 
code like this:

# Writing a BASIC interpreter in Python.
if not isinstance(command[0], int):
    raise SyntaxError('expected line number')

So, no, SyntaxError does not necessarily mean a bug in your code. *ALL* 
exceptions have to be understood in context, you can't just make a 
sweeping generalisation that exception A is always a bug, exception B is 
always recoverable. It depends on the context.


From chris.barker at  Fri Apr  8 15:02:16 2016
From: chris.barker at (Chris Barker)
Date: Fri, 8 Apr 2016 12:02:16 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 8, 2016 at 11:24 AM, <rian at> wrote:

> AssertionError is *always* indicative of a bug
I don't think there is any way to know that -- it all depends on how it's
used. For my part, assertions are only for testing -- and are actually
turned off when Python runs with -O (that is, outside debug mode).

And I think we can say the same thing about ALL Exceptions --even a syntax
error may not be a "stop everything" error in a system that runs
user-written scripts.

I agree with Chris A's point:

Any unhandled Exception is a bug. Simple as that.

Any other interpretation would be a style issue, decided for your

And if you are going to go there, I would do:

if not_a_bug() instead.

Consistent with the principle of good Python code -- only handle the
Exceptions you know to handle.



Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at

From rosuav at  Fri Apr  8 15:08:08 2016
From: rosuav at (Chris Angelico)
Date: Sat, 9 Apr 2016 05:08:08 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 9, 2016 at 5:01 AM, Steven D'Aprano <steve at> wrote:
> On Sat, Apr 09, 2016 at 04:28:47AM +1000, Chris Angelico wrote:
>> ANY uncaught exception is a bug.
> What, even SystemExit and KeyboardInterrupt?

In terms of a boundary location? Yes, probably. If you're writing a
web server that allows web applications to be plugged into it, you
probably want to let the console halt processing of one request, then
go back and handle another. Although it is contextual; you might
choose the other way, in which case it's "any uncaught subclass of
Exception is a bug", or "anything other than KeyboardInterrupt is a
bug", or some other definition. I'm on the fence about SystemExit; in
any context where there's this kind of boundary, sys.exit() simply
wouldn't be used. So it could viably be called a bug, or it could
validly be used to terminate the entire server.

>> Any caught exception is not a bug.
> Even:
> try:
>     ...
> except:
>     # Look, bug free programming!
>     pass

As far as the boundary location's concerned, yes. The inner code can
be as ridiculous as it likes, but the boundary will never do the "log
and resume" if all exceptions are caught. (It won't even get an
opportunity to.)
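
A minimal sketch of the kind of boundary being described, assuming a
hypothetical run_all() loop and logger name (neither comes from a real
server). Whatever escapes a handler is logged and the loop resumes; a
handler that swallows everything never gives the boundary a chance to
notice:

```python
import logging

logger = logging.getLogger("boundary")  # hypothetical logger name


def run_all(handlers):
    """Call each handler; log whatever escapes one and move on to the next."""
    escaped = []
    for handler in handlers:
        try:
            handler()
        except Exception as exc:
            # Reaching the boundary is what makes this a bug report.
            logger.exception("unhandled exception in %r", handler)
            escaped.append(exc)
    return escaped


def swallows_everything():
    try:
        1 / 0
    except:  # deliberately bare, as in the quoted example
        pass


def leaks_an_error():
    return {}["missing"]  # KeyError escapes to the boundary


escaped = run_all([swallows_everything, leaks_an_error])
```

Only the KeyError from leaks_an_error() ends up in `escaped`; the
ZeroDivisionError never reaches the boundary at all.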

> I am sure that any attempt to make universal rules about what is or
> isn't a bug will be doomed to failure. Remember that people can write
> code like this:
> # Writing a BASIC interpreter in Python.
> if not isinstance(command[0], int):
>     raise SyntaxError('expected line number')
> So, no, SyntaxError does not necessarily mean a bug in your code. *ALL*
> exceptions have to be understood in context, you can't just make a
> sweeping generalisation that exception A is always a bug, exception B is
> always recoverable. It depends on the context.

If one escapes to a boundary, then yes it does. That's the context
we're talking about here.


From steve at  Fri Apr  8 15:30:13 2016
From: steve at (Steven D'Aprano)
Date: Sat, 9 Apr 2016 05:30:13 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 08, 2016 at 11:24:15AM -0700, rian at wrote:

> I want to live in a world where I can do this:
>     while cbs:
>         cb = cbs.pop()
>         try:
>             cb()
>         except Exception as e:
>             logging.exception("In main loop")
>             if is_a_bug(e):
>                 raise SystemExit() from e

And I want a pony :-)

So long as Python allows people to write code like this:

# A deliberately silly example
if len(seq) == 0:
    raise ImportError("assertion fails")

you cannot expect to automatically know what an exception means without 
any context of where it came from and why it happened. The above example 
is silly, but it doesn't take much effort to come up with more serious 

- an interpreter written in Python may raise SyntaxError, which 
  is not a bug in the interpreter;

- a test framework may raise AssertionError for a failed test,
  which is not a bug in the framework;

- a function may raise MemoryError if the call *would* run out of
  memory, but without actually running out of memory; consequently
  it is not a fatal error, while another function may raise the 
  same MemoryError because it actually did fatally run out of memory.

Effectively, you want the compiler to Do What I Mean when it comes to 
exceptions. DWIM may, occasionally, be a good idea in applications, but 
I maintain it is never a good idea in a programming language.

I'm afraid that there's no hope for it: you're going to have to actually 
understand where an exception came from, and why it happened, before 
deciding whether or not it can be recovered from. The interpreter can't 
do that for you.
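
To make the first bullet concrete, here is a rough sketch of a toy BASIC
line checker. The names parse_basic_line() and check_program() are made up
for illustration. The SyntaxError is raised on purpose and describes the
user's input, so the caller that knows this context reports it rather than
treating it as a bug in the checker:

```python
def parse_basic_line(line):
    """Split a toy BASIC line into (line_number, statement)."""
    number, _, statement = line.strip().partition(" ")
    if not number.isdigit():
        # The *user's* program is malformed; the checker works as designed.
        raise SyntaxError("expected line number")
    return int(number), statement


def check_program(lines):
    """Return messages for bad lines instead of treating them as bugs."""
    problems = []
    for index, line in enumerate(lines, start=1):
        try:
            parse_basic_line(line)
        except SyntaxError as exc:
            problems.append("input line %d: %s" % (index, exc))
    return problems


problems = check_program(["10 PRINT 1", "PRINT 2"])
```

The same exception type escaping from, say, a failed import of the
checker's own source would mean something entirely different.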


From rian at  Fri Apr  8 15:47:13 2016
From: rian at (Rian Hunter)
Date: Fri, 8 Apr 2016 12:47:13 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

> On Apr 8, 2016, at 12:02 PM, Chris Barker <chris.barker at> wrote:
> I agree with Chris A's point:
> Any unhandled Exception is a bug. Simple as that.

I'm happy with that interpretation. If that was codified in a style document accessible to newbies I think that would help achieve a more consistent approach to exceptions.

Yet something dark and hideous inside me tells me except-log loops will continue to be pervasive and bugs will continue to be ignored in a large number of Python programs.

From ethan at  Fri Apr  8 16:06:24 2016
From: ethan at (Ethan Furman)
Date: Fri, 08 Apr 2016 13:06:24 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/08/2016 12:47 PM, Rian Hunter wrote:
>> On Apr 8, 2016, at 12:02 PM, Chris Barker wrote:

>> I agree with Chris A's point:
>> Any unhandled Exception is a bug. Simple as that.
> I'm happy with that interpretation. If that was codified in a style
> document accessible to newbies I think that would help achieve a
> more consistent approach to exceptions.
> Yet something dark and hideous inside me tells me except-log loops
> will continue to be pervasive and bugs will continue to be ignored
> in a large number of Python programs.

Sadly, the language cannot force someone to investigate problems instead 
of ignoring them.  :(


From chris.barker at  Fri Apr  8 16:47:34 2016
From: chris.barker at (Chris Barker)
Date: Fri, 8 Apr 2016 13:47:34 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 8, 2016 at 12:47 PM, Rian Hunter <rian at> wrote:

> > On Apr 8, 2016, at 12:02 PM, Chris Barker <chris.barker at> wrote:
> > I agree with Chris A's point:
> >
> > Any unhandled Exception is a bug. Simple as that.
> I'm happy with that interpretation. If that was codified in a style
> document accessible to newbies I think that would help achieve a more
> consistent approach to exceptions.

it already is in advice all over the place: "don't use bare except"

Doesn't mean folks don't do it anyway....




From rian at  Fri Apr  8 16:50:14 2016
From: rian at (Rian Hunter)
Date: Fri, 8 Apr 2016 13:50:14 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

> On Apr 8, 2016, at 1:47 PM, Chris Barker <chris.barker at> wrote:
>> On Fri, Apr 8, 2016 at 12:47 PM, Rian Hunter <rian at> wrote:
>> > On Apr 8, 2016, at 12:02 PM, Chris Barker <chris.barker at> wrote:
>> > I agree with Chris A's point:
>> >
>> > Any unhandled Exception is a bug. Simple as that.
>> I'm happy with that interpretation. If that was codified in a style document accessible to newbies I think that would help achieve a more consistent approach to exceptions.
> it already is in advice all over the place: "don't use bare except"
> Doesn't mean folks don't do it anyway....

I think bare except is different from "except Exception" which is common and not discouraged. "except Exception" still masks programming errors.
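
A small sketch of the difference, using a helper name invented for this
example. "except Exception" lets KeyboardInterrupt and SystemExit through
(a bare except would not), but it still catches AssertionError and
TypeError, which are the usual signs of a programming error:

```python
def caught_by_except_exception(exc_type):
    """Return True if `except Exception` catches an instance of exc_type."""
    try:
        raise exc_type()
    except Exception:
        return True
    except BaseException:
        # Only reached for exceptions outside the Exception hierarchy.
        return False


results = {
    exc_type.__name__: caught_by_except_exception(exc_type)
    for exc_type in (AssertionError, TypeError, KeyboardInterrupt, SystemExit)
}
```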

From ethan at  Fri Apr  8 17:00:13 2016
From: ethan at (Ethan Furman)
Date: Fri, 08 Apr 2016 14:00:13 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/08/2016 01:50 PM, Rian Hunter wrote:

> I think bare except is different from "except Exception" which is common
> and not discouraged. "except Exception" still masks programming errors.

And comes with the advice to "log it, research it".


From joejev at  Fri Apr  8 17:03:05 2016
From: joejev at (Joseph Jevnik)
Date: Fri, 8 Apr 2016 17:03:05 -0400
Subject: [Python-ideas] New scope for exception handlers
Message-ID: <>

I would like to propose a change to exception handlers to make it harder to
accidentally leak names defined only in the exception handler blocks. This
change follows from the decision to delete the name of an exception at the
end of a handler. The goal of this change is to prevent people from relying
on names that are defined only in a handler.

As an example, let's look at a function with a try except:

def f():
    try:
        ...
    except:
        a = 1
    return a

This function will only work if the body raises some exception, otherwise
we will get an UnboundLocalError. I propose that we should make `a` fall
out of scope when we leave the except handler regardless to prevent people
from depending on this behavior. This will make it easier to catch bugs
like this in testing. There is one case where I think the name should not
fall out of scope, and that is when the name is already defined outside of
the handler. For example:

def g():
    a = 1
    try:
        ...
    except:
        a = 2
    return a

I think this code is well behaved and should continue to work as it already
does. There are a couple of ways to implement this new behavior but I think
the simplest way to do this would be to treat the handler as a closure
where all the free variables are defined as nonlocal.

This would need to be a small change to the compiler but may have some
performance implications for code that does not hit the except handler. If
the handler is longer than the bytecode needed to create the inner closure
then it may be faster to run the function when the except handler is not
hit.

This changes our definition of f from:

  2           0 SETUP_EXCEPT             8 (to 11)

  3           3 LOAD_CONST               1 (Ellipsis)
              6 POP_TOP
              7 POP_BLOCK
              8 JUMP_FORWARD            14 (to 25)

  4     >>   11 POP_TOP
             12 POP_TOP
             13 POP_TOP

  5          14 LOAD_CONST               2 (1)
             17 STORE_FAST               0 (a)
             20 POP_EXCEPT
             21 JUMP_FORWARD             1 (to 25)
             24 END_FINALLY

  6     >>   25 LOAD_FAST                0 (a)
             28 RETURN_VALUE

to something more like:

f
-

  3           0 SETUP_EXCEPT             8 (to 11)

  4           3 LOAD_CONST               0 (Ellipsis)
              6 POP_TOP
              7 POP_BLOCK
              8 JUMP_FORWARD            20 (to 31)

  5     >>   11 POP_TOP
             12 POP_TOP
             13 POP_TOP
             14 LOAD_CONST               1 (<code object <excepthandler> at
0x7febcd6e2300, file "<code>", line 1>)
             17 LOAD_CONST               2 ('<excepthandler>')
             20 MAKE_FUNCTION            0
             23 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
             26 POP_EXCEPT
             27 JUMP_FORWARD             1 (to 31)
             30 END_FINALLY

  7     >>   31 LOAD_FAST                0 (a)
             34 RETURN_VALUE

f.<excepthandler>
-----------------

  1           0 LOAD_CONST               0 (1)
              3 STORE_FAST               0 (a)
              6 LOAD_CONST               1 (None)
              9 RETURN_VALUE

This new code properly raises the unbound locals exception when executed.
For g we could use a MAKE_CLOSURE instead of MAKE_FUNCTION.
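
The proposed behavior can be approximated by hand in today's Python. This
is only a sketch: the `fail` flag and the `_current` / `_emulated` names
are invented for the example, and the hand-written version raises a plain
NameError rather than UnboundLocalError because `a` is no longer a local
of the outer function.

```python
def f_current(fail):
    try:
        if fail:
            raise ValueError("boom")
    except ValueError:
        a = 1
    return a  # only bound if the handler ran


def f_emulated(fail):
    try:
        if fail:
            raise ValueError("boom")
    except ValueError:
        def handler():
            a = 1  # local to handler(), gone once it returns
        handler()
    return a  # NameError on every call (assuming no global named `a`)


def g_emulated(fail):
    a = 1
    try:
        if fail:
            raise ValueError("boom")
    except ValueError:
        def handler():
            nonlocal a  # the name already exists outside the handler
            a = 2
        handler()
    return a
```

Here f_current(True) returns 1 and f_current(False) raises
UnboundLocalError, while f_emulated() fails either way, so the mistake
shows up even in tests that always hit the handler.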

From keithcu at  Fri Apr  8 17:15:52 2016
From: keithcu at (Keith Curtis)
Date: Fri, 8 Apr 2016 17:15:52 -0400
Subject: [Python-ideas] A tuple of various Python suggestions
Message-ID: <>

Hi all,

I just discovered this alias and I thought I'd post a few ideas. I
wouldn't call myself a Python master yet, but it's an amazing language
and my biggest wish is that it was more widely used in the industry.

Here are a few suggestions:

1. Decrease the bug count. I recently noticed that there are about
5,400 active bugs in That surprised me
because I almost never see anyone complain about bugs in Python
(compared to the number who complain about bugs in LibreOffice,
graphics drivers, Gnome / KDE, etc.)

There are a lot of people on this list, and if the brainpower can be
applied to practical, known, existing problems, it is a great way to
improve Python while also considering more exotic ideas. I can also
suggest making a pretty graph of the bug count and putting it on the
front page of for greater visibility.

2. Python is somewhat popular on servers, and there is a lot of
potential for more. WordPress is easy to use and powerful, but lots of
people don't want to program in PHP. Or Javascript, Java, Ruby, etc.

Codebases like Whoosh full-text search
( are important, but
have minimal dev resources as most people are using Lucene /
ElasticSearch. The common choice is between 1.3M lines of Java:, containing 1100 todos and
1000 references to "deprecated", or 41K lines of Python written mostly
by one person.

Hadoop is another big Java project (1.9M lines), and there is even an
ecosystem around it. Python interoperates with Hadoop, but it should
be possible to build a radically simpler framework that provides the
same functionality using Python-native functionality and without all
the baggage. Hadoop has several interesting sister projects: a
distributed database, scalable machine learning, a high-level data
flow language, a coordination service, etc. I'm sure you'd build
something smaller, cleaner, faster in many cases, more reliable, etc.

3. It was a sad mistake that Google picked Java over Python for
Android. However, there is now a great program called Kivy which
allows people to write apps for iOS or Android with one codebase, but
it could also use more resources, as for example it doesn't fully
support Python 3.x yet.

There are 10s of thousands of bugs in the popular Python libraries and
I would fix those before proposing more language changes.

4. I enjoy reading about the Python performance improvements, but it
is mostly a perception problem with all the existing workarounds.
Gnome wrote version 3 of their shell in Javascript because they didn't
think Python would be fast enough. Lots of people write Node because
it's compiled and "fast". I suggest taking some of the effort working
on performance, and spend it on evangelizing to other programmers that
Python / Cython / PyPy, etc. are already good enough! There are a lot
of programmers out there who would be happier if they could work in
Python.

5. It would be great to get Python in the web browsers as an
alternative to Javascript. There are a number of projects which
convert Python to Javascript, but this would be more direct.
LibreOffice ships with a Python interpreter, why can't Firefox and
Webkit? ;-) Obviously there are interoperability issues, but it would
be great to just side-step all the complexity of Javascript (Here is a
server-oriented article, but it gives a flavor: This might sound like a crazy
idea, but the engineering problems aren't that hard.

6. In a few cases, there are too many codebases providing the same
functionality, and none of them are really doing the job. For example,
the de-facto MySQL Python interop library
( only supports Python 2.x
and appears to be abandoned. There are several other libraries out
there with different features, performance, compatibility, etc. and
it's kind of a minefield for what should be a basic scenario. It takes
leadership to jump in and figure out how to fix it and make one
codebase good enough that the others can switch over.

7. Focus more on evolving the libraries rather than the language. I've
recently discovered Toolz, which has a more complete set of functional
language methods. I think some of them should be included in the
official versions. A lot of people don't think Python is good enough
for functional programming and this would help. These new routines add
complexity, but a newbie doesn't need to write in a functional way, so
it obeys the "only pay for what you use" rule.

There are a number of under-staffed libraries and frameworks. I see
people complain about the YAML parsing library being unmaintained, the
default HTTP functionality being difficult and limiting, poor SOAP
support, etc. There are a million ways to improve the Python ecosystem
without making any language changes. You don't have a big rich company
who can pay for thousands of full-time developers working on
libraries, but the bug reports are a great way to prioritize.

8. I've yet to find a nice simple free IDE with debugging for Python.
I use Atom, but it has primitive debugging. I tried PyCharm but it's
very complicated (and not free, and Java). I use Jupyter sometimes
also but I'd prefer a rich client app with watch windows, etc.

9. It would be interesting to re-imagine the spreadsheet with a more
native Python interface. Pandas and matplotlib are great, but it would
be cool to have it in LibreOffice Calc that supports drag and drop,
copy and paste, can read and write ODS, etc. (Also, LibreOffice Base
is basically unmaintained. I think if 10 Python programmers passionate
about databases and GUIs showed up, it could re-invigorate this dead
codebase.)

10. Being simple to learn and powerful is very hard. Fortunately, you
can break compatibility every 10 years. My only suggestion is to get
rid of the __self__ somehow ;-)

Regards,
-Keith



From breamoreboy at  Fri Apr  8 17:18:41 2016
From: breamoreboy at (Mark Lawrence)
Date: Fri, 8 Apr 2016 22:18:41 +0100
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <ne977p$a7h$>

On 08/04/2016 21:50, Rian Hunter wrote:
> I think bare except is different from "except Exception" which is common
> and not discouraged. "except Exception" still masks programming errors.

"except Exception" is a programming error as far as I'm concerned, but 
I'm not expecting my own code to keep running 24/7/365.  Horses for courses.

My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

From rian at  Fri Apr  8 17:25:47 2016
From: rian at (Rian Hunter)
Date: Fri, 8 Apr 2016 14:25:47 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

>> On Fri, Apr 08, 2016 at 11:24:15AM -0700, rian at wrote:
>> I want to live in a world where I can do this:
>>   while cbs:
>>       cb = cbs.pop()
>>       try:
>>           cb()
>>       except Exception as e:
>>           logging.exception("In main loop")
>>           if is_a_bug(e):
>>               raise SystemExit() from e
> And I want a pony :-)
> So long as Python allows people to write code like this:
> # A deliberately silly example
> if len(seq) == 0:
>   raise ImportError("assertion fails")
> you cannot expect to automatically know what an exception means without 
> any context of where it came from and why it happened. The above example 
> is silly, but it doesn't take much effort to come up with more serious 
> ones:
> - an interpreter written in Python may raise SyntaxError, which 
> is not a bug in the interpreter;
> - a test framework may raise AssertionError for a failed test,
> which is not a bug in the framework;
> - a function may raise MemoryError if the call *would* run out of
> memory, but without actually running out of memory; consequently
> it is not a fatal error, while another function may raise the 
> same MemoryError because it actually did fatally run out of memory.
> Effectively, you want the compiler to Do What I Mean when it comes to 
> exceptions. DWIM may, occasionally, be a good idea in applications, but 
> I maintain it is never a good idea in a programming language. 

I think you're misinterpreting me. I don't want a pony and I don't want a sufficiently smart compiler.

I want a consistent opt-in idiom with community consensus. I want a clear way to express that an exception is an error and not an internal bug. It doesn't have to catch 100% of cases, the idiom just needs to approach consistency across all Python libraries that I may import.

If the programmer doesn't pay attention to the idiom, then is_a_bug() will never return true (or not True if it's is_not_a_bug()). AssertionError is already unambiguous, I'm sure there are other candidates as well.
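
Roughly what I have in mind, as a sketch only. BugError and BUG_TYPES are
hypothetical names, not an existing API, and which built-in exceptions
belong in the tuple is exactly the part that would need consensus:

```python
import logging


class BugError(Exception):
    """Hypothetical opt-in marker: raise or subclass to flag an internal bug."""


# Hypothetical starting set; only exceptions that opt in are treated as bugs.
BUG_TYPES = (AssertionError, BugError)


def is_a_bug(exc):
    return isinstance(exc, BUG_TYPES)


def run_callbacks(cbs):
    while cbs:
        cb = cbs.pop()
        try:
            cb()
        except Exception as e:
            logging.exception("In main loop")
            if is_a_bug(e):
                raise SystemExit(1) from e
```

A callback that raises ValueError is logged and the loop carries on; one
that raises AssertionError stops the program with the original exception
attached as __cause__.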

I'm not the first or only one to want something like this

But I can see the pile on has begun and my point is lost. Maybe this will get added in ten or twenty years.

From ethan at  Fri Apr  8 17:33:18 2016
From: ethan at (Ethan Furman)
Date: Fri, 08 Apr 2016 14:33:18 -0700
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/08/2016 02:25 PM, Rian Hunter wrote:

> I want a consistent opt-in idiom with community consensus. I want a
> clear way to express that an exception is an error and not an internal
> bug. It doesn't have to catch 100% of cases, the idiom just needs to
> approach consistency across all Python libraries that I may import.

> But I can see the pile on has begun and my point is lost. Maybe this
> will get added in ten or twenty years.

Then come up with one, test it out, and come back with a "Here's a cool 
new idiom, and this is why it's better" post.


From pavol.lisy at  Fri Apr  8 17:55:34 2016
From: pavol.lisy at (Pavol Lisy)
Date: Fri, 8 Apr 2016 23:55:34 +0200
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

2016-04-08 17:00 GMT+02:00, Guido van Rossum <guido at>:
> On Thu, Apr 7, 2016 at 6:40 PM, Steven D'Aprano <steve at>
> wrote:
>> You missed two: >> and <<. What are we to do with (True << 1)?
> [...]
> Those are indeed ambiguous -- they are defined as multiplication or
> floor division with a power of two, e.g. x<<n is x*2**n and x>>n is
> x//2**n (for integral x and nonnegative n). The point of this thread
> seems to be to see whether some operations can be made more useful by
> staying in the bool domain -- I don't think making both of these
> return 0 if n != 0 is useful, so let's keep them unchanged.

It is also compatible with numpy:

  np.bool_(True) <<2

but there is also minus operator:
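
For the built-in bool (this sketch does not cover NumPy), the shift
identities quoted above and the unary minus can be checked directly; all
three results are plain ints:

```python
shifted_left = True << 1    # same as True * 2**1
shifted_right = True >> 1   # same as True // 2**1
negated = -True             # unary minus also leaves the bool domain

# The quoted identities hold for both bool values and small shifts.
identities_hold = all(
    (x << n) == x * 2**n and (x >> n) == x // 2**n
    for x in (True, False)
    for n in range(4)
)
```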



From brett at  Fri Apr  8 18:14:24 2016
From: brett at (Brett Cannon)
Date: Fri, 08 Apr 2016 22:14:24 +0000
Subject: [Python-ideas] New scope for exception handlers
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, 8 Apr 2016 at 14:03 Joseph Jevnik <joejev at> wrote:

> I would like to propose a change to exception handlers to make it harder
> to accidentally leak names defined only in the exception handler blocks.
> This change follows from the decision to delete the name of an exception at
> the end of a handler.

So that change was to prevent memory leaks for the caught exception due to
tracebacks being attached to exceptions, not with anything to do with

> The goal of this change is to prevent people from relying on names that
> are defined only in a handler.

So that will break code and would make an `except` clause even more special
in terms of how it works compared to other blocks. Right now the Python
scoping rules are pretty straight-forward (local, non-local/free, global,
built-in). Adding this would shove in a "unless you're in an `except`
clause" rule that makes things more complicated for little benefit in the
face of backwards-compatibility.
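
A quick sketch of what happens today (the function name is made up for the
example): the name bound by `as` is deleted when the handler ends, while
any other name assigned inside the handler is an ordinary local of the
enclosing function and survives.

```python
def handler_names():
    try:
        raise ValueError("boom")
    except ValueError as err:
        message = str(err)  # ordinary local: survives the handler
    # `err` was implicitly deleted at the end of the except block.
    return message, "err" in locals(), "message" in locals()


message, err_still_bound, message_still_bound = handler_names()
```

So the `as` target is a deletion at the end of the block, not a separate
scope.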


> As an example, let's look at a function with a try except:
> def f():
>     try:
>         ...
>     except:
>         a = 1
>     return a
> This function will only work if the body raises some exception, otherwise
> we will get an UnboundLocalError. I propose that we should make `a` fall
> out of scope when we leave the except handler regardless to prevent people
> from depending on this behavior. This will make it easier to catch bugs
> like this in testing. There is one case where I think the name should not
> fall out of scope, and that is when the name is already defined outside of
> the handler. For example:
> def g():
>     a = 1
>     try:
>         ...
>     except:
>         a = 2
>     return a
> I think this code is well behaved and should continue to work as it
> already does. There are a couple of ways to implement this new behavior but
> I think the simplest way to do this would be to treat the handler as a
> closure where all the free variables are defined as nonlocal.
> This would need to be a small change to the compiler but may have some
> performance implications for code that does not hit the except handler. If
> the handler is longer than the bytecode needed to create the inner closure
> then it may be faster to run the function when the except handler is not
> hit.
> This changes our definition of f from:
>   2           0 SETUP_EXCEPT             8 (to 11)
>   3           3 LOAD_CONST               1 (Ellipsis)
>               6 POP_TOP
>               7 POP_BLOCK
>               8 JUMP_FORWARD            14 (to 25)
>   4     >>   11 POP_TOP
>              12 POP_TOP
>              13 POP_TOP
>   5          14 LOAD_CONST               2 (1)
>              17 STORE_FAST               0 (a)
>              20 POP_EXCEPT
>              21 JUMP_FORWARD             1 (to 25)
>              24 END_FINALLY
>   6     >>   25 LOAD_FAST                0 (a)
>              28 RETURN_VALUE
> to something more like:
> f
> -
>   3           0 SETUP_EXCEPT             8 (to 11)
>   4           3 LOAD_CONST               0 (Ellipsis)
>               6 POP_TOP
>               7 POP_BLOCK
>               8 JUMP_FORWARD            20 (to 31)
>   5     >>   11 POP_TOP
>              12 POP_TOP
>              13 POP_TOP
>              14 LOAD_CONST               1 (<code object <excepthandler>
> at 0x7febcd6e2300, file "<code>", line 1>)
>              17 LOAD_CONST               2 ('<excepthandler>')
>              20 MAKE_FUNCTION            0
>              23 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
>              26 POP_EXCEPT
>              27 JUMP_FORWARD             1 (to 31)
>              30 END_FINALLY
>   7     >>   31 LOAD_FAST                0 (a)
>              34 RETURN_VALUE
> f.<excepthandler>
> -----------------
>   1           0 LOAD_CONST               0 (1)
>               3 STORE_FAST               0 (a)
>               6 LOAD_CONST               1 (None)
>               9 RETURN_VALUE
> This new code properly raises the unbound locals exception when executed.
> For g we could use a MAKE_CLOSURE instead of MAKE_FUNCTION.

From brett at  Fri Apr  8 18:20:17 2016
From: brett at (Brett Cannon)
Date: Fri, 08 Apr 2016 22:20:17 +0000
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, 8 Apr 2016 at 14:16 Keith Curtis <keithcu at> wrote:

> Hi all,
> I just discovered this alias and I thought I'd post a few ideas. I
> wouldn't call myself a Python master yet, but it's an amazing language
> and my biggest wish is that it was more widely used in the industry.
> Here are a few suggestions:
> 1. Decrease the bug count. I recently noticed that there are about
> 5,400 active bugs in That surprised me
> because I almost never see anyone complain about bugs in Python
> (compared to the number who complain about bugs in LibreOffice,
> graphics drivers, Gnome / KDE, etc.)
> There are a lot of people on this list, and if the brainpower can be
> applied to practical, known, existing problems, it is a great way to
> improve Python while also considering more exotic ideas. I can also
> suggest making a pretty graph of the bug count and putting it on the
> front page of for greater visibility.

That bug count is misleading because it includes not only legitimate bugs
but also enhancement proposals.

> 2. Python is somewhat popular on servers, and there is a lot of
> potential for more. WordPress is easy to use and powerful, but lots of
> people don't want to program in PHP. Or Javascript, Java, Ruby, etc.
> Codebases like Whoosh full-text search
> ( are important, but
> have minimal dev resources as most people are using Lucene /
> ElasticSearch. The common choice is between 1.3M lines of Java:
>, containing 1100 todos and
> 1000 references to "deprecated", or 41K lines of Python written mostly
> by one person.
> Hadoop is another big Java project (1.9M lines), and there is even an
> ecosystem around it. Python interoperates with Hadoop, but it should
> be possible to build a radically simpler framework that provides the
> same functionality using Python-native functionality and without all
> the baggage. Hadoop has several interesting sister projects: a
> distributed database, scalable machine learning, a high-level data
> flow language, a coordination service, etc. I'm sure you'd build
> something smaller, cleaner, faster in many cases, more reliable, etc.
> 3. It was a sad mistake that Google picked Java over Python for
> Android. However, there is now a great program called Kivy which
> allows people to write apps for iOS or Android with one codebase, but
> it could also use more resources, as for example it doesn't fully
> support Python 3.x yet.
> There are 10s of thousands of bugs in the popular Python libraries and
> I would fix those before proposing more language changes.
> 4. I enjoy reading about the Python performance improvements, but it
> is mostly a perception problem with all the existing workarounds.
> Gnome wrote version 3 of their shell in Javascript because they didn't
> think Python would be fast enough. Lots of people write Node because
> it's compiled and "fast". I suggest taking some of the effort working
> on performance, and spend it on evangelizing to other programmers that
> Python / Cython / PyPy, etc. are already good enough! There are a lot
> of programmers out there who would be happier if they could work in
> Python.

Two things. One, there is actually a good amount of performance work
potentially landing in Python 3.6. Two, we have been saying Python is fast
enough for most things for decades at this point, so this is an old issue
that's never going away.

> 5. It would be great to get Python in the web browsers as an
> alternative to Javascript. There are a number of projects which
> convert Python to Javascript, but this would be more direct.
> LibreOffice ships with a Python interpreter, why can't Firefox and
> Webkit? ;-) Obviously there are interoperability issues, but it would
> be great to just side-step all the complexity of Javascript (Here is a
> server-oriented article, but it gives a flavor:
> This might sound like a crazy
> idea, but the engineering problems aren't that hard.

Once lands in the browser then it won't be
as much of an issue. There are also various transpilers of Python to
JavaScript already. I also tried back in 2006 but Mozilla rejected the idea
so this has been tried before. :)


> 6. In a few cases, there are too many codebases providing the same
> functionality, and none of them are really doing the job. For example,
> the de-facto MySQL Python interop library
> ( only supports Python 2.x
> and appears to be abandoned. There are several other libraries out
> there with different features, performance, compatibility, etc. and
> it's kind of a minefield for what should be a basic scenario. It takes
> leadership to jump in and figure out how to fix it and make one
> codebase good enough that the others can switch over.
> 7. Focus more on evolving the libraries rather than the language. I've
> recently discovered Toolz, which has a more complete set of functional
> language methods. I think some of them should be included in the
> official versions. A lot of people don't think Python is good enough
> for functional programming and this would help. These new routines add
> complexity, but a newbie doesn't need to write in a functional way, so
> it obeys the "only pay for what you use" rule.
> There are a number of under-staffed libraries and frameworks. I see
> people complain about the YAML parsing library being unmaintained, the
> default HTTP functionality being difficult and limiting, poor SOAP
> support, etc. There are a million ways to improve the Python ecosystem
> without making any language changes. You don't have a big rich company
> who can pay for thousands of full-time developers working on
> libraries, but the bug reports are a great way to prioritize.
> 8. I've yet to find a nice simple free IDE with debugging for Python.
> I use Atom, but it has primitive debugging. I tried PyCharm but it's
> very complicated (and not free, and Java). I use Jupyter sometimes
> also but I'd prefer a rich client app with watch windows, etc.
> 9. It would be interesting to re-imagine the spreadsheet with a more
> native Python interface. Pandas and matplotlib are great, but it would
> be cool to have it in LibreOffice Calc that supports drag and drop,
> copy and paste, can read and write ODS, etc. (Also, LibreOffice Base
> is basically unmaintained. I think if 10 Python programmers passionate
> about databases and GUIs showed up, it could re-invigorate this dead
> codebase.)
> 10. Being simple to learn and powerful is very hard. Fortunately, you
> can break compatibility every 10 years. My only suggestion is to get
> rid of the __self__ somehow ;-)
> Regards,
> -Keith
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From leewangzhong+python at  Fri Apr  8 18:50:17 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Fri, 8 Apr 2016 18:50:17 -0400
Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of
In-Reply-To: <>
References: <> <>
Message-ID: <>

On Apr 8, 2016 3:09 AM, "Greg Ewing" <greg.ewing at> wrote:
> Ethan Furman wrote:
>> Can anybody please enlighten me as to what, exactly, I did wrong?
> I think Guido was objecting to talk about things such as
> making bool no longer subclass int, which is not only
> going a long way beyond the original proposal, but is
> almost certainly never going to happen, so any discussion
> of it could be seen as wasting people's time.

While reading the thread, I was honestly wondering about whether it was
Pythonic to have bool be a subclass of int. (It's definitely convenient.)
So that, at least, might be a justified "Should we also...".

From ericsnowcurrently at  Fri Apr  8 19:08:04 2016
From: ericsnowcurrently at (Eric Snow)
Date: Fri, 8 Apr 2016 17:08:04 -0600
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <ne6ei7$cbg$> <>
Message-ID: <>

On Fri, Apr 8, 2016 at 1:34 AM, Nathaniel Smith <njs at> wrote:
> The reason I'm uncertain is that in numpy code, using operations like
> ~ on booleans is *very* common, because the whole idea of numpy is
> that it gives you a way to write code that works the same on either a
> single value or on an array of values, and when you're working with
> booleans then this means you have to use '~': '~' works on arrays and
> 'not' doesn't.

Would it be a different story if the logical operators (and, or, not)
had a protocol, e.g. __not__?  My guess is that ~ was made to work in
the absence of __not__.


From njs at  Fri Apr  8 21:25:41 2016
From: njs at (Nathaniel Smith)
Date: Fri, 8 Apr 2016 18:25:41 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <ne6ei7$cbg$> <>
Message-ID: <>

On Fri, Apr 8, 2016 at 4:08 PM, Eric Snow <ericsnowcurrently at> wrote:
> On Fri, Apr 8, 2016 at 1:34 AM, Nathaniel Smith <njs at> wrote:
>> The reason I'm uncertain is that in numpy code, using operations like
>> ~ on booleans is *very* common, because the whole idea of numpy is
>> that it gives you a way to write code that works the same on either a
>> single value or on an array of values, and when you're working with
>> booleans then this means you have to use '~': '~' works on arrays and
>> 'not' doesn't.
> Would it be a different story if the logical operators (and, or, not)
> had a protocol, e.g. __not__?  My guess is that ~ was made to work in
> the absence of __not__.

It would, but you can't have a regular protocol for and/or because
they're actually not operators, they're control-flow syntax:

In [1]: a = True

In [2]: a or b
Out[2]: True

In [3]: a and b
NameError: name 'b' is not defined

You could define a protocol for __or__/__ror__/__and__/__rand__
despite this, but it would have weird issues, like 'True and
array([True, False])' would call ndarray.__rand__ and return
array([True, False]), but 'True or array([True, False])' would return
True (because it short-circuits and returns before it can even check
for the presence of ndarray.__ror__). Given this, it's not clear
whether it even makes sense to try.
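
A minimal sketch of the short-circuit point, using a toy stand-in class rather than numpy itself. The Mask class, its method bodies, and right_operand() are illustrative assumptions, not numpy's implementation:

```python
class Mask:
    """Toy element-wise boolean container standing in for an array."""

    def __init__(self, values):
        self.values = list(values)

    def __invert__(self):
        # '~mask' dispatches here, so it can work element-wise.
        return Mask(not v for v in self.values)

    def __and__(self, other):
        # 'mask & other' dispatches here as well.
        return Mask(a and b for a, b in zip(self.values, other.values))

    def __bool__(self):
        # 'not mask', 'mask and x' and 'mask or x' can only ask for a
        # single truth value, which is ambiguous for a container.
        raise ValueError("truth value of a Mask is ambiguous")


calls = []


def right_operand():
    # Records that it ran, so we can see whether it was ever evaluated.
    calls.append("evaluated")
    return Mask([True, False])


first = True or right_operand()    # right side is never evaluated
calls_after_or = list(calls)
second = True and right_operand()  # right side is evaluated and returned
inverted = ~Mask([True, False])
```

Here 'or' returns before right_operand() is even called, so no method on the right-hand object could take part, while '~' and '&' always reach the object's own methods.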

There was a discussion about this on python-ideas a few months ago,
and Guido asked whether it would still be useful despite these weird
issues, but I dropped the ball and my email to numpy-discussion
soliciting feedback on that is still sitting in my drafts folder...

And I guess you could have a protocol just for 'not', but there might
be some performance concerns (e.g. right now the peephole optimizer
actually knows how to optimize 'if not' into a single opcode), and
overriding 'not' without overriding 'and' + 'or' is probably more
confusing than useful.


Nathaniel J. Smith --

From ncoghlan at  Sat Apr  9 01:00:07 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Apr 2016 15:00:07 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <ne6a2v$fg$>
References: <>
Message-ID: <>

On 8 April 2016 at 04:48, Terry Reedy <tjreedy at> wrote:
> To me, the default proposal to expand the domain of open and other path
> functions is to call str on the path arg, either always or as needed. We
> should then ask "why isn't str() good enough"?  Most bad args for open will
> immediately result in a file-not-found exception.

Not when you're *creating* files and directories.

Using "str(path)" as the protocol means these all become valid operations:

  open(1.0, "w")
  open(object, "w")
  open(object(), "w")
  open(str, "w")
  open(input, "w")

Everything implements __str__ or __repr__, so *everything* becomes
acceptable as an argument to filesystem mutating operations, instead
of those operations bailing out immediately complaining they've been
asked to do something that doesn't make any sense.
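
To make that concrete without touching the filesystem, here is a small sketch of what str() hands back for some of those arguments. The variable names are mine; the point is only that none of these calls fail:

```python
# str() succeeds on all of these, so a "just call str(path)" protocol
# would turn each result into a file name instead of raising an error.
candidates = [1.0, object, str, object()]
as_paths = [str(c) for c in candidates]

float_name = as_paths[0]      # '1.0'
class_name = as_paths[1]      # "<class 'object'>"
instance_name = as_paths[3]   # something like '<object object at 0x...>'
```

Every one of those strings is a legal file name on most filesystems, which is exactly why creating files this way would go wrong silently.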

I strongly encourage folks interested in the fspath protocol design
debate to read the __index__ PEP:

Start from the title: "Allowing Any Object to be Used for Slicing"

The protocol wasn't designed in the abstract: it had a concrete goal
of allowing objects other than builtins to be usable in the "x:y:z"
slicing syntax.

Those objects weren't hypothetical either: the rationale spells out

"In NumPy, for example, there are 8 different integer scalars
corresponding to unsigned and signed integers of 8, 16, 32, and 64
bits.  These type-objects could reasonably be used as integers in many
places where Python expects true integers but cannot inherit from the
Python integer type because of incompatible memory layouts.  There
should be some way to be able to tell Python that an object can behave
like an integer."

The PEP also spells out what's wrong with the "just use int(obj)" alternative:

"It is not possible to use the nb_int (and __int__ special method) for
this purpose because that method is used to *coerce* objects to
integers.  It would be inappropriate to allow every object that can be
coerced to an integer to be used as an integer everywhere Python
expects a true integer.  For example, if __int__ were used to convert
an object to an integer in slicing, then float objects would be
allowed in slicing and x[3.2:5.8] would not raise an error as it
should."

Extending the use of the protocol to other contexts (such as sequence
repetition and optimised lookups on range objects) was then taken up
on a case by case basis, but the protocol semantics themselves were
defined by that original use case of "allow NumPy integers to be used
when slicing sequences".
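
A short sketch of the distinction, assuming a made-up SmallInt class that stands in for something like a NumPy integer scalar:

```python
import operator


class SmallInt:
    """Hypothetical integer-like type that does not inherit from int."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Declares "I really am an integer", with no lossy coercion.
        return self.value


data = list(range(10))

# Objects implementing __index__ are accepted by the slicing syntax.
sliced = data[SmallInt(2):SmallInt(5)]
as_index = operator.index(SmallInt(7))

# float implements __int__ but not __index__, so slicing rejects it.
try:
    data[3.2:5.8]
except TypeError:
    float_rejected = True
else:
    float_rejected = False
```

int(3.2) happily truncates to 3, which is the coercion behaviour the PEP wanted to keep out of slicing.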

The equivalent motivating use case here is "allow pathlib objects to
be used with the open() builtin, os module functions, and os.path
module functions".

- The open() builtin handles str paths, integers (file descriptors), and
bytes-like objects (pre-encoded paths)
- The os and os.path functions handle some combination of those 3
depending on the specific function

Working directly with file descriptors is relatively rare, so we can
leave that as a special case.
Similarly, working directly with bytes-like objects introduces
cross-platform portability problems, and also changes the output type
of many operations, so we'll keep that as a special case, too.

That leaves the text representation, and the question of defining
equivalents to "operator.index" and its underlying __index__ protocol.

My suggestion of os.fspath as the conversion function is based on:

- "path" being too generic (we have sys.path, os.path, and the PATH
envvar as potential sources of confusion)
- "fspath" being similar to "os.fsencode" and "os.fsdecode", which are
the operations for converting a filesystem path in text form to and
from its bytes-like object form
- os and os.path being two of the main consumers of the proposed protocol
- os being a builtin module that underpins most filesystem operations
anyway, so folks shouldn't be averse to importing it in code that
wants to consume the new protocol
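
As a rough sketch only, a conversion function along these lines might look like the following. The __fspath__ method name, the error messages, and the ConfigFile class are assumptions made for illustration; nothing above fixes those details:

```python
def fspath(path):
    """Hypothetical sketch of the proposed conversion function."""
    if isinstance(path, str):
        return path
    # Look the method up on the type, as is done for other dunder protocols.
    converter = getattr(type(path), "__fspath__", None)  # assumed name
    if converter is None:
        raise TypeError("expected str or path-like object, not "
                        + type(path).__name__)
    result = converter(path)
    if not isinstance(result, str):
        raise TypeError("__fspath__ must return str")
    return result


class ConfigFile:
    """Hypothetical path-like object opting in to the protocol."""

    def __init__(self, name):
        self.name = name

    def __fspath__(self):
        return "/etc/" + self.name
```

With this shape, fspath(ConfigFile("app.conf")) gives a str, while fspath(1.0) and fspath(object) raise TypeError straight away instead of being stringified.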


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rosuav at  Sat Apr  9 01:26:26 2016
From: rosuav at (Chris Angelico)
Date: Sat, 9 Apr 2016 15:26:26 +1000
Subject: [Python-ideas] New scope for exception handlers
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 9, 2016 at 7:03 AM, Joseph Jevnik <joejev at> wrote:
> def g():
>     a = 1
>     try:
>         ...
>     except:
>         a = 2
>     return a
> I think this code is well behaved and should continue to work as it already
> does. There are a couple of ways to implement this new behavior but I think
> the simplest way to do this would be to treat the handler as a closure where
> all the free variables are defined as nonlocal.

I'm not sure what the point there is; when do you need this kind of
thing, and why only in the 'except' clause?

Also, what if the name is assigned to in the 'try' and the 'except'
but nowhere else?

If you're curious, I actually put together a "just for the fun of it"
patch that adds a form of sub-function scoping to Python. It'd be a
great way to figure out just how far this flies in the face of
Python's design.


From rosuav at  Sat Apr  9 01:32:12 2016
From: rosuav at (Chris Angelico)
Date: Sat, 9 Apr 2016 15:32:12 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 9, 2016 at 7:25 AM, Rian Hunter <rian at> wrote:
> I want a consistent opt-in idiom with community consensus. I want a clear
> way to express that an exception is an error and not an internal bug. It
> doesn't have to catch 100% of cases, the idiom just needs to approach
> consistency across all Python libraries that I may import.
> If the programmer doesn't pay attention to the idiom, then is_a_bug() will
> never return true (or not True if it's is_not_a_bug()). AssertionError is
> already unambiguous, I'm sure there are other candidates as well.
> I'm not the first or only one to want something like this

The problem with posts like that is that it assumes there's some kind
of defined category of "stuff you always want to catch" vs "stuff you
never want to catch". This simply isn't the case. You ONLY catch the
exceptions you truly understand. Everything else, you leave. There's
seldom a need to catch a broad-but-specific subset of exceptions, and
those needs aren't entirely consistent, so they fundamentally cannot
be codified into the hierarchy.
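
A tiny sketch of what "only catch what you understand" looks like in practice. The lookup_port function, the config dict, and the 8080 default are made up for the example:

```python
def lookup_port(config):
    """Return the configured port, falling back to a default."""
    try:
        return int(config["port"])
    except KeyError:
        # Understood case: the setting is simply absent.
        return 8080
    # A ValueError from int("not a number") is deliberately NOT caught:
    # this function has no sensible way to handle it, so it propagates.
```

Whether ValueError counts as "a bug" or "an error" here depends entirely on the caller, which is why a global classification is hard to pin down.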


From vito.detullio at  Sat Apr  9 01:58:07 2016
From: vito.detullio at (Vito De Tullio)
Date: Sat, 09 Apr 2016 07:58:07 +0200
Subject: [Python-ideas] Dunder method to make object str-like
References: <>
Message-ID: <nea5li$3v2$>

Chris Angelico wrote:

> Proposal: Objects should be allowed to declare that they are
> "string-like" by creating a dunder method (analogously to __index__
> for integers) which implies a loss-less conversion to str.

> Obviously str will have this dunder method, returning self. Most other
> core types (notably 'object') will not define it. Absence of this
> method implies that the object cannot be treated as a string.

do you expect int to define it?

By ZeD

From rosuav at  Sat Apr  9 02:00:14 2016
From: rosuav at (Chris Angelico)
Date: Sat, 9 Apr 2016 16:00:14 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <nea5li$3v2$>
References: <>
Message-ID: <>

On Sat, Apr 9, 2016 at 3:58 PM, Vito De Tullio <vito.detullio at> wrote:
> Chris Angelico wrote:
>> Proposal: Objects should be allowed to declare that they are
>> "string-like" by creating a dunder method (analogously to __index__
>> for integers) which implies a loss-less conversion to str.
>> Obviously str will have this dunder method, returning self. Most other
>> core types (notably 'object') will not define it. Absence of this
>> method implies that the object cannot be treated as a string.
> do you expect int to define it?

Nope! An integer cannot be treated as a string.


From ncoghlan at  Sat Apr  9 02:03:18 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Apr 2016 16:03:18 +1000
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 April 2016 at 07:15, Keith Curtis <keithcu at> wrote:
> Hi all,
> I just discovered this alias and I thought I'd post a few ideas. I
> wouldn't call myself a Python master yet, but it's an amazing language
> and my biggest wish is that it was more widely used in the industry.
> Here are a few suggestions:
> 1. Decrease the bug count. I recently noticed that there are about
> 5,400 active bugs in That surprised me
> because I almost never see anyone complain about bugs in Python
> (compared to the number who complain about bugs in LibreOffice,
> graphics drivers, Gnome / KDE, etc.)
> There are a lot of people on this list, and if the brainpower can be
> applied to practical, known, existing problems, it is a great way to
> improve Python while also considering more exotic ideas. I can also
> suggest making a pretty graph of the bug count and putting it on the
> front page of for greater visibility.

Folks in institutional environments can most readily contribute to
this by obtaining Python from commercial redistributors (rather than
using the free community versions and then expecting free ongoing
support from volunteers), and then funneling bug reports through their
vendor rather than filing them directly with upstream (since donating
bug fixes to commercial organisations for free isn't something most
people consider a fun hobby).

> 2. Python is somewhat popular on servers, and there is a lot of
> potential for more. WordPress is easy to use and powerful, but lots of
> people don't want to program in PHP. Or Javascript, Java, Ruby, etc.
> Codebases like Whoosh full-text search
> ( are important, but
> have minimal dev resources as most people are using Lucene /
> ElasticSearch. The common choice is between 1.3M lines of Java:
>, containing 1100 todos and
> 1000 references to "deprecated", or 41K lines of Python written mostly
> by one person.
> Hadoop is another big Java project (1.9M lines), and there is even an
> ecosystem around it. Python interoperates with Hadoop, but it should
> be possible to build a radically simpler framework that provides the
> same functionality using Python-native functionality and without all
> the baggage. Hadoop has several interesting sister projects: a
> distributed database, scalable machine learning, a high-level data
> flow language, a coordination service, etc. I'm sure you'd build
> something smaller, cleaner, faster in many cases, more reliable, etc.

This is about business dynamics and institutional supply chain
management, not software, so wishing won't make it so. However, given
the business challenges facing the vendors behind the Java projects
you mention here, Python-based alternatives are also going to be a
tough sell to potential investors.

(That said, you may also be interested in the Apache Spark project,
where Python, Java, Scala and R are the top tier analytics development
languages)

> 3. It was a sad mistake that Google picked Java over Python for
> Android. However, there is now a great program called Kivy which
> allows people to write apps for IOS or Android with one codebase, but
> it could also use more resources, as for example it doesn't fully
> support Python 3.x yet.
> There are 10s of thousands of bugs in the popular Python libraries and
> I would fix those before proposing more language changes.

People work on things in their own time because they find them
enjoyable (or otherwise inherently rewarding), not because they're
interested in facilitating increased corporate adoption.

That said, yes, while it's high profile, contributing to CPython is
one of the *least* effective ways of helping to improve the overall
Python ecosystem in the near term, as it can literally take years to
roll out major changes (as Python 2.6 is still the de facto baseline
version, and there are many situations where folks are still using
Python 2.4 and earlier). The best cases are those where we can define
new APIs, protocols and idioms in ways that can also be adopted in
earlier versions of the language by way of third party libraries and
standard library backports.

> 4. I enjoy reading about the Python performance improvements, but it
> is mostly a perception problem with all the existing workarounds.
> Gnome wrote version 3 of their shell in Javascript because they didn't
> think Python would be fast enough. Lots of people write Node because
> it's compiled and "fast". I suggest taking some of the effort working
> on performance, and spend it on evangelizing to other programmers that
> Python / Cython / PyPy, etc. are already good enough! There are a lot
> of programmers out there who would be happier if they could work in
> Python.

There's nothing stopping anyone interested in this area from working
on any kind of evangelisation they want to. However, what harm does it
cause us personally if people decide to use other programming
languages? Python tautologically fits the brains of Pythonistas, but
that's nowhere near being the same thing as it being the right
language for everyone for every purpose.

It's also the case that any developer with only one language currently
in their toolbox (even when that language is Python) is a developer
with lots of learning opportunities ahead of them:

> 5. It would be great to get Python in the web browsers as an
> alternative to Javascript. There are a number of projects which
> convert Python to Javascript, but this would be more direct.
> LibreOffice ships with a Python interpreter, why can't Firefox and
> Webkit? ;-) Obviously there are interoperability issues, but it would
> be great to just side-step all the complexity of Javascript (Here is a
> server-oriented article, but it gives a flavor:
> This might sound like a crazy
> idea, but the engineering problems aren't that hard.

This misunderstands the nature of the relationship between JavaScript,
CSS and the HTML Document Object Model: these are technologies that have
co-evolved for describing and dynamically updating user interfaces
that interact with remote services over a network, and they're
*really* good at it. Replicating that ecosystem in other programming
languages would technically be possible, but there's little incentive
to do so given the work on WebAssembly and ongoing improvements in

> 6. In a few cases, there are too many codebases providing the same
> functionality, and none of them are really doing the job. For example,
> the de-facto MySQL Python interop library
> ( only supports Python 2.x
> and appears to be abandoned. There are several other libraries out
> there with different features, performance, compatibility, etc. and
> it's kind of a minefield for what should be a basic scenario. It takes
> leadership to jump in and figure out how to fix it and make one
> codebase good enough that the others can switch over.

This is why people pay open source redistributors to ensure they have
access to commercially supported components, rather than trusting that
whatever they happened to find on the internet will continue to be
maintained by anonymous benefactors.

> 7. Focus more on evolving the libraries rather than the language. I've
> recently discovered Toolz, which has a more complete set of functional
> language methods. I think some of them should be included in the
> official versions. A lot of people don't think Python is good enough
> for functional programming and this would help. These new routines add
> complexity, but a newbie doesn't need to write in a functional way, so
> it obeys the "only pay for what you use" rule.

"Some of them should be included" is not an actionable proposal.
Anyone is free to propose specific additions to functools, and make
the case for why those particular ones should be included in the
standard library rather than continuing to be accessed via version
independent 3rd party libraries like Toolz. (Advance warning: "I want
a purely functional solution to a problem that can already be readily
solved with procedural code" generally isn't accepted as a compelling

A simpler possibility might be to review the Functional Programming
HOWTO at and consider
ways that that might be updated to reference third party libraries, as
well as potentially made more discoverable via the functools and
itertools reference documentation.

> There are a number of under-staffed libraries and frameworks. I see
> people complain about the YAML parsing library being unmaintained, the
> default HTTP functionality being difficult and limiting, poor SOAP
> support, etc. There are a million ways to improve the Python ecosystem
> without making any language changes. You don't have a big rich company
> who can pay for thousands of full-time developers working on
> libraries, but the bug reports are a great way to prioritize.

What makes you think anyone here has the authority to tell anyone else
how to spend their time?

Folks work on community open source projects based on their personal
interests and their commercial interests. While you're right that lots
of things could stand to be improved, the best people to complain to
about underinvestment (or misdirected investment) are commercial
redistributors selling supported versions of CPython and other
community projects in the Python ecosystem, rather than the already
overcommitted volunteers contributing their own time for their own

> 8. I've yet to find a nice simple free IDE with debugging for Python.
> I use Atom, but it has primitive debugging. I tried PyCharm but it's
> very complicated (and not free, and Java). I use Jupyter sometimes
> also but I'd prefer a rich client app with watch windows, etc.

Not the right list for that question.

> 9. It would be interesting to re-imagine the spreadsheet with a more
> native Python interface. Pandas and matplotlib are great, but it would
> be cool to have it in LibreOffice Calc that supports drag and drop,
> copy and paste, can read and write ODS, etc. (Also, LibreOffice Base
> is basically unmaintained. I think if 10 Python programmers passionate
> about databases and GUIs showed up, it could re-invigorate this dead
> codebase.)

Also not the right list.

> 10. Being simple to learn and powerful is very hard. Fortunately, you
> can break compatibility every 10 years. My only suggestion is to get
> rid of the __self__ somehow ;-)

Methods are just functions, so this is never going to happen. (When
folks understand why, they've generally made a decent step forward in
appreciating the differences between a procedural-first usage model
for a language, vs an objects-first one)
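
A small sketch of what "methods are just functions" means in practice (Greeter and shout are made-up names for the example):

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name


def shout(self):
    # An ordinary module-level function; 'self' is just its first parameter.
    return self.name.upper() + "!"


g = Greeter("world")

# Calling through the instance and calling the plain function on the
# class with the instance passed explicitly are the same operation.
via_instance = g.greet()
via_class = Greeter.greet(g)

# A plain function attached to the class afterwards works as a method too.
Greeter.shout = shout
shouted = g.shout()
```

Since any function can become a method this way, the explicit first parameter is the only place the instance could come from.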


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From steve at  Sat Apr  9 03:44:44 2016
From: steve at (Steven D'Aprano)
Date: Sat, 9 Apr 2016 17:44:44 +1000
Subject: [Python-ideas] New scope for exception handlers
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 08, 2016 at 05:03:05PM -0400, Joseph Jevnik wrote:

> I would like to propose a change to exception handlers to make it harder to
> accidentally leak names defined only in the exception handler blocks. This
> change follows from the decision to delete the name of an exception at the
> end of a handler. The goal of this change is to prevent people from relying
> on names that are defined only in a handler.

An interesting proposal, but you're missing one critical point: why is 
it harmful to create names inside an except block?

There is a concrete reason why Python 3, and not Python 2, deletes the 
"except Exception as err" name when the except block leaves: because 
exceptions now hold on to a lot more call info, which can prevent 
objects from being garbage-collected. But the same doesn't apply to 
arbitrary names.
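
A quick sketch of the current Python 3 behaviour being described (the function and variable names are mine):

```python
def handler_names():
    try:
        raise ValueError("boom")
    except ValueError as err:
        message = str(err)   # an ordinary name bound inside the handler

    try:
        err                  # the 'as' name was deleted on leaving the handler
    except NameError:        # UnboundLocalError is a subclass of NameError
        err_still_bound = False
    else:
        err_still_bound = True

    # 'message' is still available; only 'err' was removed.
    return message, err_still_bound
```

Only the exception name gets this special treatment; message outlives the handler like any other local.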

At the moment, only a few block statements create a new scope: def and 
class mostly. In particular, no flow control statement does: if, elif, 
else, for, while, try, except all use the existing scope. This is a nice 
clean design, and in my opinion much better than the rule that any 
indented block is a new scope. I would certainly object to making 
"except" the only exception (pun intended) and I would object even more 
to making *all* the block statements create a new scope.
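
For anyone who wants to see the rule in action, a minimal sketch (all names here are invented for the example):

```python
def scopes():
    for i in range(3):
        if i == 2:
            found = i        # bound inside both a 'for' and an 'if' block

    def inner():
        hidden = 99          # 'def' does create a new scope
        return hidden

    inner()
    # 'found' is visible here; 'hidden' never leaked out of inner().
    return found, "hidden" in locals()
```

The flow control blocks share the function's scope, so found is usable after the loop, while the nested def keeps hidden to itself.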

Here is an example of how your proposal would bite people. Nearly all my 
code is hybrid 2+3 code, so I often have a construct like this at the 
start of modules:

try:
    import builtins  # Python 3.x
except ImportError:
    # Python 2.x
    import __builtin__ as builtins

Nice and clean. But what if try and except introduced a new scope? I 
would have to write:

builtins = None
try:
    global builtins
    import builtins
except ImportError:
    global builtins
    import __builtin__ as builtins
assert builtins is not None

Since try and except are different scopes, I need a separate global 
declaration in each. If you think this second version is an improvement 
over the first, then our ideas of what makes good looking code are so 
far apart that I don't think it's worth discussing this further :-)

If only except is a different scope, then I have this shorter version:

try:  # global scope
    import builtins
except ImportError:  # local scope
    global builtins
    import __builtin__ as builtins

> As an example, let's look at a function with a try except:
> def f():
>     try:
>         ...
>     except:
>         a = 1
>     return a
> This function will only work if the body raises some exception, otherwise
> we will get an UnboundLocalError.

Not necessarily. It depends on what is hidden by the ... dots. For 
example:

def f():
    try:
        a = sequence.pop()
    except AttributeError:
        a = -1
    return a

It might not be the most Pythonic code around, but it works, and your 
proposal will break it.

Bottom line is, there's nothing fundamentally wrong with except blocks 
*not* starting a new scope. I'm not sure if there's any real benefit to 
the proposal, but even if there is, I doubt it's worth the cost of 
breaking existing working code.

So if you still want to champion your proposal, it's not enough to 
demonstrate that it could be done. You're going to have to demonstrate 
not only a benefit from the change, but that the benefit is worth 
breaking other people's code.


From tjreedy at  Sat Apr  9 04:07:04 2016
From: tjreedy at (Terry Reedy)
Date: Sat, 9 Apr 2016 04:07:04 -0400
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <nead7h$rcc$>

On 4/8/2016 11:42 AM, Guido van Rossum wrote:
> DeprecationWarning every time you use ~ on a bool?

A DeprecationWarning should only be in the initial version of 
bool.__invert__, which initially would return int.__invert__ after 
issuing the warning that we plan to change the meaning.
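
Since bool cannot be subclassed, the following is only a sketch of that first stage written as a free function. The name transitional_invert and the warning text are invented for the example:

```python
import warnings


def transitional_invert(flag):
    """Hypothetical stand-in for a first-stage bool.__invert__."""
    warnings.warn("~ on bool is planned to change meaning",
                  DeprecationWarning, stacklevel=2)
    # Keep today's result for now: the integer inversion.
    return int.__invert__(flag)
```

During such a transition ~True would still evaluate to -2, just with a warning attached, and only a later version would change the returned value.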

Terry Jan Reedy

From joejev at  Sat Apr  9 04:09:50 2016
From: joejev at (Joseph Jevnik)
Date: Sat, 9 Apr 2016 04:09:50 -0400
Subject: [Python-ideas] New scope for exception handlers
In-Reply-To: <>
References: <>
Message-ID: <>

Thank you for the responses. I did not realize that the delete fast was
added because the traceback is on the exception, that makes a lot of sense.
Regarding the case of:

try:
    import a
except ImportError:
    import b as a

my proposal would still have this work as intended because the first import
would appear as an assignment to the name `a` outside the scope of the
handler which would cause the inner scope to emit a store deref after the
import instead of a store fast. The only thing this change would block is
introducing a new variable which is only defined inside the except handler.

> So if you still want to champion your proposal, it's not enough to
> demonstrate that it could be done. You're going to have to demonstrate
> not only a benefit from the change, but that the benefit is worth
> breaking other people's code.

To be honest, this was mainly proposed because I thought about _how_ to
implement it and less about _should_ we implement this. I implemented a
small version of this to generate the dis outputs that I put in the first
email. After reading the responses I agree that this should probably not be
added; I do not think that we need to discuss this further unless someone
else has strong feelings about this.

On Sat, Apr 9, 2016 at 3:44 AM, Steven D'Aprano <steve at> wrote:

> On Fri, Apr 08, 2016 at 05:03:05PM -0400, Joseph Jevnik wrote:
> > I would like to propose a change to exception handlers to make it harder
> > to accidentally leak names defined only in the exception handler blocks.
> > This change follows from the decision to delete the name of an exception
> > at the end of a handler. The goal of this change is to prevent people
> > from relying on names that are defined only in a handler.
> An interesting proposal, but you're missing one critical point: why is
> it harmful to create names inside an except block?
> There is a concrete reason why Python 3, and not Python 2, deletes the
> "except Exception as err" name when the except block leaves: because
> exceptions now hold on to a lot more call info, which can prevent
> objects from being garbage-collected. But the same doesn't apply to
> arbitrary names.
> At the moment, only a few block statements create a new scope: def and
> class mostly. In particular, no flow control statement does: if, elif,
> else, for, while, try, except all use the existing scope. This is a nice
> clean design, and in my opinion much better than the rule that any
> indented block is a new scope. I would certainly object to making
> "except" the only exception (pun intended) and I would object even more
> to making *all* the block statements create a new scope.
> Here is an example of how your proposal would bite people. Nearly all my
> code is hybrid 2+3 code, so I often have a construct like this at the
> start of modules:
> try:
>     import builtins  # Python 3.x
> except ImportError:
>     # Python 2.x
>     import __builtin__ as builtins
> Nice and clean. But what if try and except introduced a new scope? I
> would have to write:
> builtins = None
> try:
>     global builtins
>     import builtins
> except ImportError:
>     global builtins
>     import __builtin__ as builtins
> assert builtins is not None
> Since try and except are different scopes, I need a separate global
> declaration in each. If you think this second version is an improvement
> over the first, then our ideas of what makes good looking code are so
> far apart that I don't think it's worth discussing this further :-)
> If only except is a different scope, then I have this shorter version:
> try:  # global scope
>     import builtins
> except ImportError:  # local scope
>     global builtins
>     import __builtin__ as builtins
> > As an example, let's look at a function with a try except:
> >
> >
> > def f():
> >     try:
> >         ...
> >     except:
> >         a = 1
> >     return a
> >
> >
> > This function will only work if the body raises some exception, otherwise
> > we will get an UnboundLocalError.
> Not necessarily. It depends on what is hidden by the ... dots. For
> example:
> def f():
>     try:
>         a = sequence.pop()
>     except AttributeError:
>         a = -1
>     return a
> It might not be the most Pythonic code around, but it works, and your
> proposal will break it.
> Bottom line is, there's nothing fundamentally wrong with except blocks
> *not* starting a new scope. I'm not sure if there's any real benefit to
> the proposal, but even if there is, I doubt it's worth the cost of
> breaking existing working code.
> So if you still want to champion your proposal, it's not enough to
> demonstrate that it could be done. You're going to have to demonstrate
> not only a benefit from the change, but that the benefit is worth
> breaking other people's code.
> --
> Steve

From ncoghlan at  Sat Apr  9 04:14:50 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Apr 2016 18:14:50 +1000
Subject: [Python-ideas] New scope for exception handlers
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 April 2016 at 17:44, Steven D'Aprano <steve at> wrote:
> So if you still want to champion your proposal, it's not enough to
> demonstrate that it could be done. You're going to have to demonstrate
> not only a benefit from the change, but that the benefit is worth
> breaking other people's code.

Not just any code, but "try it and see if it works" name binding
idioms recommended in the reference documentation:

It's also worth noting that when it comes to detecting this kind of
structural error, tools like pylint already do a good job of tracing
possible control flow problems:

$ cat >
    a = 1

$ pylint -E --enable=invalid-name
No config file found, using default configuration
************* Module conditional_name_binding
C:  4, 4: Invalid constant name "a" (invalid-name)

More easily finding this kind of problem is one of the major
advantages of using static analysis tools in addition to dynamic


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From k7hoven at  Sat Apr  9 07:16:47 2016
From: k7hoven at (Koos Zevenhoven)
Date: Sat, 9 Apr 2016 14:16:47 +0300
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <nead7h$rcc$>
References: <20160407094618.5c01847d@fsol> <>
Message-ID: <>

On Sat, Apr 9, 2016 at 11:07 AM, Terry Reedy <tjreedy at> wrote:
> On 4/8/2016 11:42 AM, Guido van Rossum wrote:
>> DeprecationWarning every time you use ~ on a bool?
> A DeprecationWarning should only be in the initial version of
> bool.__invert__, which initially would return int.__invert__ after issuing
> the warning that we plan to change the meaning.

Maybe the right warning type would be FutureWarning.


From ian.g.kelly at  Sat Apr  9 10:28:10 2016
From: ian.g.kelly at (Ian Kelly)
Date: Sat, 9 Apr 2016 08:28:10 -0600
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <nead7h$rcc$>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <>

On Sat, Apr 9, 2016 at 2:07 AM, Terry Reedy <tjreedy at> wrote:
> On 4/8/2016 11:42 AM, Guido van Rossum wrote:
>> DeprecationWarning every time you use ~ on a bool?
> A DeprecationWarning should only be in the initial version of
> bool.__invert__, which initially would return int.__invert__ after issuing
> the warning that we plan to change the meaning.

It seems unusual to deprecate something without also providing a means
of using the new thing in the same release. "Don't use this feature
because we're going to change what it does in the future. Oh, you want
to use the new version? Psych! We haven't actually done anything yet.
Use not instead." It creates a weird void in Python 3.6 where the
operator still exists but absolutely nobody has a legitimate reason to
be using it.

What happens if somebody is using ~ for its current semantics, skips
the 3.6 release in their upgrade path, and doesn't read the release
notes carefully enough? They'll never see the warning and will just
experience a silent and difficult-to-diagnose breakage.

From steve at  Sat Apr  9 11:25:14 2016
From: steve at (Steven D'Aprano)
Date: Sun, 10 Apr 2016 01:25:14 +1000
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 09, 2016 at 08:28:10AM -0600, Ian Kelly wrote:

> It seems unusual to deprecate something without also providing a means
> of using the new thing in the same release. "Don't use this feature
> because we're going to change what it does in the future. Oh, you want
> to use the new version? Psych! We haven't actually done anything yet.
> Use not instead." It creates a weird void in Python 3.6 where the
> operator still exists but absolutely nobody has a legitimate reason to
> be using it.

Not really. This is quite similar to what happened in Python 2.3 during 
int/long unification. The behaviour of certain integer operations 
changed, including the meaning of some literals, and warnings were
displayed.
I don't have 2.3 available to demonstrate but I can show you the change 
in behaviour:

[steve at ando ~]$ python1.5 -c "print 0xffffffff"
[steve at ando ~]$ python2.4 -c "print 0xffffffff"

By memory, 0xffffffff in python2.3 would print a warning that the result 
will change in the next release, and return -1.


> What happens if somebody is using ~ for its current semantics, skips
> the 3.6 release in their upgrade path, and doesn't read the release
> notes carefully enough? They'll never see the warning and will just
> experience a silent and difficult-to-diagnose breakage.

Then they'll be in the same position as everybody if there's no
deprecation at all.


From steve at  Sat Apr  9 11:43:05 2016
From: steve at (Steven D'Aprano)
Date: Sun, 10 Apr 2016 01:43:05 +1000
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Fri, Apr 08, 2016 at 02:52:45PM +0900, Stephen J. Turnbull wrote:

>  > After all, notwithstanding their fancy string representation,
> I guess "fancy string representation" was the original motivation for
> the overrides.  If the intent was really to make operator versions of
> logical operators (but only for true bools!), they would have fixed ~
> too.

No need to guess. There's a PEP:

>  > they behave like ints and actually are ints.
> I can't fellow-travel all the way to "actually are", though.  bools
> are what we decide to make them. 

I'm not talking about bools in other languages, or bools in Python in 
some alternate universe. But in the Python we have right now, bools 
*are* ints, no ifs, buts or maybes:

py> isinstance(True, int)

This isn't an accident of the implementation, it was an explicit 
BDFL pronouncement in PEP 285:

    6) Should bool inherit from int?

    => Yes.

Now I'll certainly admit that bools-are-ints is an accident of history. 
Had Guido been more influenced by Pascal, say, and less by C, he might 
have chosen to include a dedicated Boolean type right from the
beginning. But he wasn't, and so he didn't, and consequently bools are 
now ints.
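
To make the bools-are-ints point concrete, here is a minimal sketch,
assuming Python 3. The variable names are illustrative only, and the
inversion is done on int(True) rather than on the bool itself so that
the sketch does not depend on how any particular release treats ~ on
bools:

```python
# bool is a subclass of int, so True and False are integer values.
assert issubclass(bool, int)
assert isinstance(True, int)

# Because of that, bools take part in ordinary integer arithmetic.
true_plus_true = True + True             # integer addition of 1 + 1
count_true = sum([True, False, True])    # counting set flags

# Bitwise inversion of the integer 1 is not logical negation.
inverted_as_int = ~int(True)   # two's-complement inversion of 1
negated = not True             # logical negation
```

Under those assumptions the inversion yields -2 while "not" yields
False, which is the mismatch this thread is arguing about.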

> I just don't see why the current
> behaviors of &|^ are particularly useful, since you'll have to guard
> all bitwise expressions against non-bool truthies and falsies.

flag ^ flag is useful since we don't have a boolean-xor operator and 
bitwise-xor does the right thing for bools. And I suppose some people 
might prefer & and | over boolean-and and boolean-or because they're 
shorter and require less typing. I don't think that's a particularly 
good reason for using them, and as you say, you do have to guard
against non-bools slipping in, but Consenting Adults applies.


From steve at  Sat Apr  9 11:47:42 2016
From: steve at (Steven D'Aprano)
Date: Sun, 10 Apr 2016 01:47:42 +1000
Subject: [Python-ideas] Hijacking threads [was: Changing the meaning of
In-Reply-To: <>
References: <> <>
Message-ID: <>

On Fri, Apr 08, 2016 at 06:50:17PM -0400, Franklin? Lee wrote:

> While reading the thread, I was honestly wondering about whether it was
> Pythonic to have bool be a subclass of int. (It's definitely convenient.)
> So that, at least, might be a justified "Should we also...".

and see my comments to Stephen Turnbull sent a few minutes ago.


From steve at  Sat Apr  9 11:49:12 2016
From: steve at (Steven D'Aprano)
Date: Sun, 10 Apr 2016 01:49:12 +1000
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 09, 2016 at 02:16:47PM +0300, Koos Zevenhoven wrote:

> Maybe the right warning type would be FutureWarning.

If we accept this proposal -- and I hope we don't -- I think that 
FutureWarning is the right one to use. It is what was used in 2.3 when 
the behaviour of ints changed as part of int/long unification.
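
As a rough sketch of what such a transition could look like (this is
not the actual proposal's code): transitional_invert below is a
hypothetical stand-in for bool.__invert__, and the message text is
invented. It keeps the current int behaviour while warning that the
meaning may change:

```python
import warnings


def transitional_invert(flag):
    """Hypothetical stand-in for bool.__invert__ in a transition release."""
    warnings.warn(
        "~ on a bool currently returns an int; this may change",
        FutureWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return ~int(flag)  # keep the existing int behaviour for now


# Record warnings instead of printing them, so the result can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = transitional_invert(True)
```

As far as I know, the default warning filters show FutureWarning to end
users while hiding most DeprecationWarnings, which is the practical
difference between the two categories here.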


From ian.g.kelly at  Sat Apr  9 12:07:16 2016
From: ian.g.kelly at (Ian Kelly)
Date: Sat, 9 Apr 2016 10:07:16 -0600
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 9, 2016 at 9:25 AM, Steven D'Aprano <steve at> wrote:
> On Sat, Apr 09, 2016 at 08:28:10AM -0600, Ian Kelly wrote:
>> It seems unusual to deprecate something without also providing a means
>> of using the new thing in the same release. "Don't use this feature
>> because we're going to change what it does in the future. Oh, you want
>> to use the new version? Psych! We haven't actually done anything yet.
>> Use not instead." It creates a weird void in Python 3.6 where the
>> operator still exists but absolutely nobody has a legitimate reason to
>> be using it.
> Not really. This is quite similar to what happened in Python 2.3 during
> int/long unification. The behaviour of certain integer operations
> changed, including the meaning of some literals, and warnings were
> displayed.

Pointing out that this has been done once before, 11 minor releases
prior, does not dissuade me from continuing to characterize it as
"unusual". The int/long unification was also a much more visible
change overall.

>> What happens if somebody is using ~ for its current semantics, skips
>> the 3.6 release in their upgrade path, and doesn't read the release
>> notes carefully enough? They'll never see the warning and will just
>> experience a silent and difficult-to-diagnose breakage.
> Then they'll be in the same position as everybody if there's no
> deprecation at all.

I'm not suggesting there should be no deprecation. I'm just
questioning whether the proposed deprecation is sufficient.

From random832 at  Sat Apr  9 12:23:55 2016
From: random832 at (Random832)
Date: Sat, 09 Apr 2016 12:23:55 -0400
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 9, 2016, at 11:43, Steven D'Aprano wrote:
> flag ^ flag is useful since we don't have a boolean-xor operator and 
> bitwise-xor does the right thing for bools.

If you have bools, so does !=. If you don't, ^ is no better.
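
A small sketch of that claim, with illustrative names (logical_xor is a
hypothetical helper, not an existing builtin):

```python
def logical_xor(a, b):
    """Exclusive-or on truthiness: coerce both operands to bool first."""
    return bool(a) != bool(b)


# For genuine bools, ^ and != agree on all four input combinations.
bool_cases = [(False, False), (False, True), (True, False), (True, True)]
agree_on_bools = all((a ^ b) == (a != b) for a, b in bool_cases)

# For non-bool truthy values, neither operator gives a logical xor.
bitwise = 2 ^ 1              # bitwise xor of two truthy ints
compare = 2 != 1             # plain inequality of two truthy ints
coerced = logical_xor(2, 1)  # both operands truthy, so xor is false
```

With the ints 2 and 1, ^ gives 3 and != gives True, both truthy, while
a logical xor of two truthy values should be false; only the explicit
bool() coercion gets that right.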

From guido at  Sat Apr  9 12:24:54 2016
From: guido at (Guido van Rossum)
Date: Sat, 9 Apr 2016 09:24:54 -0700
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
Message-ID: <>

Let me pronounce something here. This change is not worth the amount
of effort and pain a deprecation would cause everyone. Either we
change this quietly in 3.6 (adding it to What's New etc. of course) or
we don't do it at all.

--Guido van Rossum (

From ian.g.kelly at  Sat Apr  9 12:25:50 2016
From: ian.g.kelly at (Ian Kelly)
Date: Sat, 9 Apr 2016 10:25:50 -0600
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 9, 2016 at 10:07 AM, Ian Kelly <ian.g.kelly at> wrote:
> On Sat, Apr 9, 2016 at 9:25 AM, Steven D'Aprano <steve at> wrote:
>> On Sat, Apr 09, 2016 at 08:28:10AM -0600, Ian Kelly wrote:
>>> It seems unusual to deprecate something without also providing a means
>>> of using the new thing in the same release. "Don't use this feature
>>> because we're going to change what it does in the future. Oh, you want
>>> to use the new version? Psych! We haven't actually done anything yet.
>>> Use not instead." It creates a weird void in Python 3.6 where the
>>> operator still exists but absolutely nobody has a legitimate reason to
>>> be using it.
>> Not really. This is quite similar to what happened in Python 2.3 during
>> int/long unification. The behaviour of certain integer operations
>> changed, including the meaning of some literals, and warnings were
>> displayed.
> Pointing out that this has been done once before, 11 minor releases
> prior, does not dissuade me from continuing to characterize it as
> "unusual". The int/long unification was also a much more visible
> change overall.

Also, in that case there was a way to start using long literals
immediately: 0xffffffffL

From storchaka at  Sat Apr  9 14:27:57 2016
From: storchaka at (Serhiy Storchaka)
Date: Sat, 9 Apr 2016 21:27:57 +0300
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <nead7h$rcc$>
References: <20160407094618.5c01847d@fsol> <>
 <> <>
Message-ID: <nebhjd$jpd$>

On 09.04.16 11:07, Terry Reedy wrote:
> On 4/8/2016 11:42 AM, Guido van Rossum wrote:
>> DeprecationWarning every time you use ~ on a bool?
> A DeprecationWarning should only be in the initial version of
> bool.__invert__, which initially would return int.__invert__ after
> issuing the warning that we plan to change the meaning.

For such cases there is FutureWarning.

From keithcu at  Sat Apr  9 20:28:52 2016
From: keithcu at (Keith Curtis)
Date: Sat, 9 Apr 2016 20:28:52 -0400
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

Hi again all,

Thanks for the replies.

I don't have a studied opinion of what new methods should be added to
functools, I mostly brought it up because it won't create an
incompatible change and things like that should be much easier to do.
Functools is one of the few remaining places in the core runtime where
it seems what is provided is meant to be a sample or flavor, and not
used on a daily basis without group-by, merge-sort, frequencies, pipe,
take / drop, etc.

I agree that more companies paying for Python support would be a great
thing, but most often companies buy support at a different level of
granularity: for example, RHEL where they get it for an entire OS, or
hiring Django people to build and maintain a website. I wouldn't count
on commercial redistributors of Python as ever being a big source of
resources to fix the thousands of random bugs. The vast majority of
people use the standard runtime.

I realize that volunteers work on their own time and choice of tasks,
but you've got quite a large community, and by periodically reporting
on the bug count and having a goal to resolve old ones eventually, you
will even find problem areas no one is talking about. Sometimes the
issue isn't resources, but grit, focus, pride, etc.

If you never weighed yourself, or looked at yourself in the mirror,
you'd get heavier than you realized. You are mostly volunteers, but
you can try to produce software as good as what paid professionals can
do.
Perhaps the bug list should be broken up into two categories: future
optional or incompatible features, and code that people can easily
agree are worth doing now. That way important holes don't get ignored
for years. People should be able to trust the standard runtime.

It didn't seem like WebAssembly would enable access to Numpy. It seems
doubtful it will be much adopted if it can't access the rich Python
runtime that might be installed.

Python is much cleaner than some of the alternatives, so you should
consider adoption as a humanitarian effort!



From steve at  Sat Apr  9 21:14:41 2016
From: steve at (Steven D'Aprano)
Date: Sun, 10 Apr 2016 11:14:41 +1000
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

Hi Keith,

On Sat, Apr 09, 2016 at 08:28:52PM -0400, Keith Curtis wrote:

> Python is much cleaner than some of the alternatives, so you should
> consider adoption as a humanitarian effort!

I keep seeing you telling us that "you should do this, you should do 
that, you you you..." but I don't see *you* volunteering to work on any 
of the bugs.

Or do you see yourself more in a supervisory role?


From ncoghlan at  Sun Apr 10 00:32:22 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 10 Apr 2016 14:32:22 +1000
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

On 10 April 2016 at 10:28, Keith Curtis <keithcu at> wrote:
> I agree that more companies paying for Python support would be a great
> thing, but most often companies buy support at a different level of
> granularity: for example, RHEL where they get it for an entire OS, or
> hiring Django people to build and maintain a website.

For context, I work for Red Hat's developer experience team and
occasionally put the graph at in front of our Python
maintenance team and ask "This is a trend that worries me, what are we
currently doing about it?".

I'd encourage anyone using Python in a commercial or large
organisational context who is getting their Python runtimes via a
commercially supported Linux distro, Platform-as-a-Service provider,
or one of the cross-platform CPython redistributors to take that
graph, show it to their vendor and say: "These upstream CPython
metrics worry us, what are you doing about them on our behalf?". (If
you don't know how to ask your vendor that kind of question since
another team handles the supplier relationship, then find that team
and ask *them* what they're doing to address your concerns)

For folks in those kinds of contexts running directly off the binaries
or source code published by the Python Software Foundation, then my
recommendation is a bit different:

* option 1 is to advocate for switching to a commercial redistributor,
and asking prospective vendors how they're supporting ongoing CPython
maintenance as part of the vendor evaluation
* option 2 is to advocate for becoming a PSF Sponsor Member and then
advocate for a technical fellowship program aimed at general
maintenance and project health monitoring for CPython (keeping in mind
that "How do we manage it fairly and sustainably?" would actually be a
bigger challenge for such a fellowship than funding it)

Whether advocating for option 1 or option 2 makes more sense will vary
by organisation, as it depends greatly on whether or not the
organisation would prefer to work with a traditional commercial
supplier, or engage directly with a charitable foundation that
operates in the wider public interest.

There's also option 3, which is to hire existing core developers and
give them time to work on general upstream CPython maintenance, but
there are actually relatively few organisations for which that
strategy makes commercial sense (and they're largely going to be
operating system companies, public cloud infrastructure companies, or
commercial Python redistributors).

> I wouldn't count
> on commercial redistributors of Python as ever being a big source of
> resources to fix the thousands of random bugs. The vast majority of
> people use the standard runtime.

If that were an accurate assessment, I wouldn't need to write articles
and the manylinux cross-distro Linux ABI wouldn't be based on the 9
year old CentOS 5 userspace. (We also wouldn't be needing to pursue
redistributor-friendly proposals like PEP 493 to help propagate the
network security enhancements in the 2.7.x series)

Direct downloads from are certainly a major factor on
Windows (especially in educational contexts), but even there, many of
the largest commercial adopters are using ActiveState's distribution
(see ) or one of the analytical
distributions from Enthought or Continuum Analytics.

> I realize that volunteers work on their own time and choice of tasks,
> but you've got quite a large community, and by periodically reporting
> on the bug count and having a goal to resolve old ones eventually, you
> will even find problem areas no one is talking about. Sometimes the
> issue isn't resources, but grit, focus, pride, etc.

So, you're suggesting we deliberately set out to make our volunteers
feel bad in an effort to guilt them into donating more free work to
corporations and other large organisations? When I phrase it in that
deliberately negative way, I hope it becomes obvious why we *don't* do

We do send out the weekly tracker metrics emails, and offer the
statistics graphs on the tracker itself if people want to use them as
parts of their own business cases for increased investment (whether
direct or indirect) in upstream sustaining engineering for CPython.

> If you never weighed yourself, or looked at yourself in the mirror,
> you'd get heavier than you realized. You are mostly volunteers, but
> you can try to produce software as good as what paid professionals can
> do.

You seem to be operating under some significant misapprehensions here.
As far as I am aware, most of the core developers *are* paid
professionals (even the folks that didn't start out that way) - we're
just typically not paid specifically to work on CPython. Instead, we
work on CPython to make sure it's fit for *our* purposes, or because
we find it to be an enjoyable way to spend our free time.

Engaging effectively with that environment means pursuing a
"co-contributor" experience - we each make CPython better for our own
purposes, and by doing so, end up making it better for everyone.

The experience you seem to be seeking in this thread is that of a pure
software consumer, and that's the role downstream redistributors
fulfil in an open source ecosystem: they provide (usually for a fee) a
traditional customer experience, with the redistributor taking care of
the community engagement side of things.

> Perhaps the bug list should be broken up into two categories: future
> optional or incompatible features, and code that people can easily
> agree are worth doing now. That way important holes don't get ignored
> for years. People should be able to trust the standard runtime.

Important holes don't get ignored for years - they get addressed,
since people are motivated to invest the time to ensure they're fixed.
That said, we do experience the traditional open source "community
gap" where problems that are only important to folks that aren't yet
personally adept at effective community engagement can languish
indefinitely (hence my emphasis above on organisations considering the
idea of outsourcing their community engagement efforts, rather than
assuming that it's easy, or that someone else will take care of it for
them without any particular personal motivation).

> It didn't seem like WebAssembly would enable access to Numpy. It seems
> doubtful it will be much adopted if it can't access the rich Python
> runtime that might be installed.

Project Jupyter already addressed that aspect by separating out the
platform independent UI service for code entry from the language
kernel backend for code evaluation. There's certainly a lot that could
be done in terms of making Jupyter language kernels installable as web
browser add-ons running native code, but that would be a question for
the Project Jupyter folks, rather than here.

> Python is much cleaner than some of the alternatives, so you should
> consider adoption as a humanitarian effort!

As tempting as it is to believe that "things that matter to software
developers" dictate where the world moves, it's important to keep in
mind that it's estimated we make up barely 0.25% of the world's
population [1] - this means that for everyone that would consider
themselves a "software developer", there's around 399 people who

Accordingly, if your motivation is "How do we help empower people to
control the technology in their lives, rather than having it control
them?" (which is a genuinely humanitarian effort), then I'd recommend
getting involved with education-focused organisations like the
Raspberry Pi Foundation, Software Carpentry and Django Girls, or
otherwise helping to build bridges between the software development
community and professional educators (see [2] for an example of the

When it comes to technology choices *within* that 0.25%, it's worth
remembering that one of the key design principles of Python is
*trusting developers to make their own decisions* (hence things like
allowing monkeypatching as being incredibly useful in cases like
testing, even while actively advising against it as a regular
programming practice). That trust extends to trusting them to choose
languages that are appropriate for them and their use cases. If that
leads to them choosing Python, cool. If it leads to them choosing
something else, that's cool, too - there are so many decent open
source programming languages and runtimes out there these days that
software developers are truly spoiled for choice, so providing a solid
foundation for exploring those alternatives is at least as important a
goal as reducing the incentives for people that already favour Python
to need to explore them.



Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From greg.ewing at  Sun Apr 10 01:38:41 2016
From: greg.ewing at (Greg Ewing)
Date: Sun, 10 Apr 2016 17:38:41 +1200
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

Rian Hunter wrote:
> I want a consistent opt-in idiom with community consensus. I want a 
> clear way to express that an exception is an error and not an internal 
> bug.

There's already a convention for this. Exceptions derived
from EnvironmentError (OSError in python 3) usually
result from something the user did wrong. Anything else that
escapes lower-level exception handling is probably a bug.

So my top-level exception handlers usually look like this:

    except EnvironmentError as e:
       # Display an appropriate error message to the user
       # If it's something interactive, go back for another
       # request, otherwise exit

Anything else is left to propagate and generate a traceback.
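
Fleshed out into something runnable, that pattern might look roughly
like the following sketch. It assumes Python 3, where EnvironmentError
is an alias of OSError; read_text and main are hypothetical names, and
the error message format is invented:

```python
import sys


def read_text(path):
    """Read a file; may raise OSError (EnvironmentError) on failure."""
    with open(path) as handle:
        return handle.read()


def main(argv):
    try:
        text = read_text(argv[0])
    except EnvironmentError as e:  # alias of OSError on Python 3
        # Probably something the user did: report it and exit cleanly.
        print("error: %s" % e, file=sys.stderr)
        return 1
    print(len(text))
    return 0
```

A missing file produces a one-line message and a non-zero return
value, while something like an IndexError from an empty argv is not
caught and would surface as a traceback.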

It's not perfect, but it works well enough most of the time.
If I find a case where some non-bug exception gets raised that
doesn't derive from EnvironmentError, I fix things so that
it gets caught somewhere lower down and re-raised as an
EnvironmentError.
So the rule you seem to be after is probably "If it's likely to
be a user error, derive it from EnvironmentError, otherwise don't."

I think that's about the best that can be done, considering
that library code can't always know whether a given exception
is a user error or a bug, because it depends on what the
calling code is trying to accomplish.

For example,


will probably produce an OSError (file not found) even though
it's the programmer's fault for misspelling "config".


From greg.ewing at  Sun Apr 10 02:46:17 2016
From: greg.ewing at (Greg Ewing)
Date: Sun, 10 Apr 2016 18:46:17 +1200
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum wrote:
> Let me pronounce something here. This change is not worth the amount
> of effort and pain a deprecation would cause everyone. Either we
> change this quietly in 3.6 (adding it to What's New etc. of course) or
> we don't do it at all.

I'm having trouble seeing why it should be done at all.
What actual problem would it be solving? Does anyone
desperately want to be able to spell boolean negation
as ~b instead of not b?


From p.f.moore at  Sun Apr 10 06:15:28 2016
From: p.f.moore at (Paul Moore)
Date: Sun, 10 Apr 2016 11:15:28 +0100
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
Message-ID: <>

On 10 April 2016 at 07:46, Greg Ewing <greg.ewing at> wrote:
> Guido van Rossum wrote:
>> Let me pronounce something here. This change is not worth the amount
>> of effort and pain a deprecation would cause everyone. Either we
>> change this quietly in 3.6 (adding it to What's New etc. of course) or
>> we don't do it at all.
> I'm having trouble seeing why it should be done at all.
> What actual problem would it be solving? Does anyone
> desperately want to be able to spell boolean negation
> as ~b instead of not b?

I have no axe to grind either way, but my impression from this thread
is that some people would prefer bool to be consistent with
user-defined types (such as numpy's) in this regard - specifically
because user-defined types *have* to use ~ as the negation operator
because "not" is not overridable in the way they require.


From stephen at  Sun Apr 10 12:33:53 2016
From: stephen at (Stephen J. Turnbull)
Date: Mon, 11 Apr 2016 01:33:53 +0900
Subject: [Python-ideas] Boundaries for unpacking
In-Reply-To: <>
References: <> <ne6brb$ubp$>
Message-ID: <>

Michel Desmoulin writes:

 > Yes and you can also do that for regular slicing on list. But you don't,
 > because you have regular slicing, which is cleaner, and easier to read
 > and remember.

It's clean because it's well-defined.  Slices on general iterables
don't have an OWTDI.  For example, "a = somelist[:]" is an idiom for
copying somelist to a so that destructive manipulations of a don't
change the original.  Should "a = someiterable[:]" reproduce those
semantics?  After "head, tail = someiterable" should tail contain a
list or someiterable itself or something else?
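
A minimal sketch of the difference, using itertools.islice as one
possible stand-in for slicing an iterable (an assumption on my part,
since no spelling has been agreed in this thread):

```python
from itertools import islice

# List slicing: a[:] is an independent copy, so mutation is safe.
somelist = [1, 2, 3]
a = somelist[:]
a.append(4)

# An iterator has no such copy semantics: taking the head consumes it.
someiterable = iter([1, 2, 3])
head = list(islice(someiterable, 1))  # consumes the first item
tail = list(someiterable)             # whatever is left over
leftover = list(someiterable)         # the iterator is now exhausted
```

After this, somelist is untouched, but someiterable cannot be read
again; any definition of iterable[:] would have to decide which of
these behaviours it means.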

"WIBNI iterable[] worked" has already been posted to this thread about
5 times, and nobody disagrees that IWBN.  But slicing and unpacking of
iterables are fraught with such issues.  It's time the wishful
thinkers got down to edge cases and wrote a PEP.

From steve at  Sun Apr 10 12:39:24 2016
From: steve at (Steven D'Aprano)
Date: Mon, 11 Apr 2016 02:39:24 +1000
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Apr 10, 2016 at 05:38:41PM +1200, Greg Ewing wrote:
> Rian Hunter wrote:
> >I want a consistent opt-in idiom with community consensus. I want a 
> >clear way to express that an exception is an error and not an internal 
> >bug.
> There's already a convention for this. Exceptions derived
> from EnvironmentError (OSError in python 3) usually
> result from something the user did wrong. Anything else that
> escapes lower-level exception handling is probably a bug.

I don't think this is a valid description of EnvironmentError. I don't 
think you can legitimately say it "usually" comes from user error.

Let's say I have an application which, on startup, looks for config 
files in various places. If they exist and are readable, the application 
uses them. If they don't exist or aren't readable, an EnvironmentError 
will be generated (say, IOError). This isn't a user error, and my app 
can and should just ignore such missing config files.

Then the application goes to open a user-specified data file. If the 
file doesn't exist, or can't be read, that will generate an 
EnvironmentError, but it isn't one that should be logged. (Who wants to 
log every time the user misspells a file name, or tries to open a file 
they don't have permission for?)

In an interactive application, the app should display an error message 
and then wait for more commands. In a batch or command-line app, the 
application should exit. So treatment of the error depends on *what sort 
of application* you have, not just the error itself.

Then the application phones home, looking for updates to download, 
uploading the popularity statistics of the most commonly 
used commands, bug reports, etc. What if it can't contact the home 
server? That's most likely an EnvironmentError too, but it's not a 

Oops, I spelled the URL of my server "" instead of 
.au. So that specific EnvironmentError is a programming bug. (No wonder 
I haven't had any popularity stats uploaded...)

So EnvironmentError can represent any of:

- situation normal, not actually an error at all;
- a non-fatal user-error;
- a fatal user-error;
- transient network errors that will go away on their own;
- programming bugs.
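
As a rough illustration of the first three cases, here is a minimal sketch
in which the same exception class is handled differently depending on the
call site and the kind of application. The function names and the
`interactive` flag are made up for this example and are not taken from any
real program.

```python
def load_optional_config(paths):
    """Return the text of the first readable config file, or None."""
    for path in paths:
        try:
            with open(path) as f:
                return f.read()
        except OSError:
            # Situation normal: a missing or unreadable config file is ignored.
            continue
    return None


def open_user_file(path, interactive):
    """Read a user-specified file; the reaction to failure depends on the app."""
    try:
        with open(path) as f:
            return f.read()
    except OSError as exc:
        message = "Cannot open {}: {}".format(path, exc.strerror)
        if interactive:
            # Non-fatal user error: report it and wait for the next command.
            print(message)
            return None
        # Fatal user error in a batch run: exit with the message.
        raise SystemExit(message)
```

In both functions the exception is an OSError; only the surrounding code
knows whether it is ignorable, recoverable, or fatal.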

> It's not perfect, but it works well enough most of the time.
> If I find a case where some non-bug exception gets raised that
> doesn't derive from EnvironmentError, I fix things so that
> it gets caught somewhere lower down and re-raised as an
> EnvironmentError.

That's ... rather strange. As in:

EnvironmentError("substring not found")

for an unsuccessful search?


From cory at  Sun Apr 10 14:41:44 2016
From: cory at (Cory Benfield)
Date: Sun, 10 Apr 2016 19:41:44 +0100
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

> On 10 Apr 2016, at 01:28, Keith Curtis <keithcu at> wrote:
> It didn't seem like WebAssembly would enable access to Numpy. It seems
> doubtful it will be much adopted if it can't access the rich Python
> runtime that might be installed.

I feel like this demonstrates a confusion about how the web ecosystem works.

Code that executes inside a web browser's Javascript runtime does not have arbitrary access to the system. That's not an arbitrary limitation, it's vital, because code shipped over the web is inherently untrusted: it can be intercepted, manipulated, edited, and attacked. Each time the web application developers of the world want a new interface to system hardware they have to specify a whole set of APIs for the browser Javascript to obtain that access in a way that is controlled and enabled by the user.

Browser developers will allow code executing in the browser to access binaries on the host system *over their dead bodies*. Allowing that represents a terrifying security vulnerability. At no point will any sane browser developer allow that.

As it turns out, of course, WebAssembly *would* enable access to NumPy because NumPy could simply be compiled to WebAssembly as well, and distributed along with the Python interpreter and all the other code you'd have to ship.

The TL;DR here is: no web browser will ever allow access to a Python runtime (or any other runtime) that is installed. However, WebAssembly does not have the limitation you've suggested.


From Nikolaus at  Sun Apr 10 17:24:00 2016
From: Nikolaus at (Nikolaus Rath)
Date: Sun, 10 Apr 2016 14:24:00 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
 (Nick Coghlan's message of "Sat, 9 Apr 2016 15:00:07 +1000")
References: <>
Message-ID: <>

On Apr 09 2016, Nick Coghlan <ncoghlan-Re5JQEeQqe8AvxtiuMwx3w at> wrote:
> The equivalent motivating use case here is "allow pathlib objects to
> be used with the open() builtin, os module functions, and os.path
> module functions".

Why isn't this use case perfectly solved by:

import sys

# real_open stands for the existing builtin implementation of open().
def open(obj, *a):
   # If pathlib isn't imported, we can't possibly receive
   # a Path object.
   pathlib = sys.modules.get('pathlib', None)
   if pathlib is not None and isinstance(obj, pathlib.Path):
       obj = str(obj)
   return real_open(obj, *a)

> That leaves the text representation, and the question of defining
> equivalents to "operator.index" and its underlying __index__ protocol.

well, or the above (as far as I can see) :-).


GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             "Time flies like an arrow, fruit flies like a Banana."

From keithcu at  Sun Apr 10 17:24:59 2016
From: keithcu at (Keith Curtis)
Date: Sun, 10 Apr 2016 17:24:59 -0400
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

Hi again,

I mean this merely as food for thought. Thank you for reading.

I personally find Python very stable, but the bug counts have been
heading in the wrong direction:

I recently discovered those charts, and they have valuable data that
can be turned into action.

My distributor of Python is Arch. I presume it is very close to what
you guys release. That's just my example, but from what I've seen, few
pay for a Python runtime. Given your license, I wouldn't expect a lot
to. If someone waits for the wrong train, they won't get there.

I don't suggest guilt-tripping volunteers into fixing bugs. If you've
got to dig a ditch, you can try to enjoy the sun. Getting the bug
count under control is an admirable goal. The list is an opportunity
to focus on the known problems real people care about most, and to try
to deal with old issues before taking on new ones.

I don't recommend people fix bugs to help "corporations and large
organizations". You won't be very motivated by that anti-capitalist
mindset. People should fix bugs because it helps real Python users,
and gets rid of barriers. If people steadily remove roadblocks, things
will flourish.

I was teasing about you guys not being paid professionals, however, it
appears that very few of you have the goal of getting the official bug
count down. That is a big difference between amateurs and
professionals. If few people feel ownership of the official Python, it
can be bad. Maybe the PSF could hire more with that mindset. I don't
know the solutions, but I am grateful to be able to send you a few
words.

As for WebAssembly, I don't know if Numpy can "simply" be re-compiled
for it. It seems like the sort of workitem that could take 2-3 years,
and never be actually fully compatible or as fast.

Python can run in sandboxes as well: People who care about
code being intercepted and manipulated should use SSL or sign it.
People who write their own code to run on their own machine would
likely prefer to be able to just directly reference the Python runtime
they've already set up.

I wonder whether the sandbox can be used outside of Web Assembly so
that code distribution and security are not so intermingled. I did see
someone write that WebAssembly is the "dawn of a new era", but I
wonder whether it is mostly a bunch of Javascript people trying to
solve their own problems rather than those who care about making
Python work well on it.



From ethan at  Sun Apr 10 17:31:15 2016
From: ethan at (Ethan Furman)
Date: Sun, 10 Apr 2016 14:31:15 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/10/2016 02:24 PM, Nikolaus Rath wrote:
> On Apr 09 2016, Nick Coghlan wrote:

>> The equivalent motivating use case here is "allow pathlib objects to
>> be used with the open() builtin, os module functions, and os.path
>> module functions".
> Why isn't this use case perfectly solved by:
> def open(obj, *a):
>     # If pathlib isn't imported, we can't possibly receive
>     # a Path object.
>     pathlib = sys.modules.get('pathlib', None)
>     if pathlib is not None and isinstance(obj, pathlib.Path):
>         obj = str(obj)
>     return real_open(obj, *a)

pathlib is the primary motivator, but there's no reason to shut everyone 
else out.

By having a well-defined protocol and helper function we not only make 
our own lives easier but we also make the lives of third-party libraries 
and experimenters easier.
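
A minimal sketch of what such a protocol plus helper function could look
like. The dunder name `__path_string__`, the helper `path_string`, and the
class `ThirdPartyPath` are all invented for illustration; nothing here is a
settled name or an existing stdlib API.

```python
def path_string(obj):
    """Return obj as a str path, using a hypothetical __path_string__ hook."""
    if isinstance(obj, str):
        return obj
    # Look the hook up on the type, as is usual for special methods.
    hook = getattr(type(obj), "__path_string__", None)
    if hook is None:
        raise TypeError("expected str or path-like object, got "
                        + type(obj).__name__)
    return hook(obj)


class ThirdPartyPath:
    """Stand-in for a third-party path class that opts in to the protocol."""

    def __init__(self, *parts):
        self._parts = parts

    def __path_string__(self):
        return "/".join(self._parts)
```

With something along these lines, open() and the os functions would call
the helper once on their argument instead of knowing about each path class.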


From greg.ewing at  Sun Apr 10 20:36:27 2016
From: greg.ewing at (Greg Ewing)
Date: Mon, 11 Apr 2016 12:36:27 +1200
Subject: [Python-ideas] Consistent programming error handling idiom
In-Reply-To: <>
References: <>
Message-ID: <>

Steven D'Aprano wrote:
> Let's say I have an application which, on startup, looks for config 
> files in various places. If they exist and are readable, the application 
> uses them. If they don't exist or aren't readable, an EnvironmentError 
> will be generated (say, IOError). This isn't a user error, and my app 
> can and should just ignore such missing config files.

Of course you can't assume that an EnvironmentError
*anywhere* in the program represents a user error.
I was just suggesting a heuristic for the *top level*
exception handler to use.

Making that heuristic work well requires cooperation
from the rest of the code. In this case, it means
wrapping the code that reads config files to catch
file-not-found errors. You're going to have to do
that anyway if you want to carry on with the rest of
the processing.

> That's ... rather strange. As in:
> EnvironmentError("substring not found")
> for an unsuccessful search?

I might put a bit more information in there to help
the user.

The idea is that whatever I put in the EnvironmentError
will end up getting displayed to the user as an error
message by my top-level exception handler.

I might also use a subclass of EnvironmentError if I
want to be able to catch it specifically, but that's
not strictly necessary. I don't show the user the
exception class name, only its argument.
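
Roughly, the heuristic could be sketched like this. The wrapper name `run`
and the exit codes are assumptions made for the example, not part of any
existing convention.

```python
import sys
import traceback


def run(main):
    """Hypothetical top-level wrapper; returns a process exit status."""
    try:
        main()
    except EnvironmentError as exc:
        # Treated as a user-facing error: show the argument, not the class name.
        print("Error: {}".format(exc), file=sys.stderr)
        return 1
    except Exception:
        # Anything else that escapes is presumed to be a bug.
        traceback.print_exc()
        return 2
    return 0
```

Lower-level code then has to cooperate by catching its own expected
failures and re-raising the user-relevant ones as EnvironmentError.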


From cory at  Mon Apr 11 04:17:31 2016
From: cory at (Cory Benfield)
Date: Mon, 11 Apr 2016 09:17:31 +0100
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

> On 10 Apr 2016, at 22:24, Keith Curtis <keithcu at> wrote:
> Python can run in sandboxes as well:
> People who care about
> code being intercepted and manipulated should use SSL or sign it.

I don't think you've understood the problem here: you seem to be saying that we can solve the lack of trust issue by "rubbing some crypto on it". But that doesn't solve the problem at all.

Let's take this apart.

"People who care about code being intercepted and manipulated": if that code runs directly on the user's machine then *everyone* should care. Put another way: it doesn't matter what the *author* of the code cares about, it matters what the user cares about, and all users care about executing safe code! Otherwise, I can insert whatever Python code I like and run arbitrary code with the permissions of the browser. This opens the entire machine up to attack: an attacker can consume their CPU resources, transition into FS access, and basically do all kinds of wacky things.

But wrapping this in SSL or signing the code doesn't solve the problem at all. I opened a new browser window, turned my ad blocker off, and then loaded The following domains loaded and executed Javascript code (this isn't a complete list either, I gave up and got bored):


The point is that I, the user, did not consent to *any* of those. This is the way the web platform works: users don't get asked who gets to execute code on their machine. Most users would not download a random binary to their machine from one of those domains and execute it. Wrapping that code in SSL or signing it prevents man-in-the-middle attacks on the code, but doesn't in any sense prevent those actors listed there from doing terrible stuff!

It is well known that Sony wrote an actual rootkit that they used as DRM and distributed it via CDs. If you believe that one of the above sites wouldn't do something equally malicious with full access to my machine, you're living in a fantasy world. This means that any code that those domains are allowed to execute needs to run in an absurdly restricted context: because neither I nor any other user trusts arbitrary domains to run arbitrary code!

A Python sandbox that allows access to any code not distributed via the web browser is not restrictive enough. That Python tells you too much about the machine on which it is running. This is doubly bad if that Python is capable of calling into native code extensions distributed outside the browser (such as NumPy), because sandboxing code like that requires running a complete virtual machine. Either distributing that code would be a nightmare, or you'd be forcing users to run a complete x86 virtual machine in order to keep them safe from the arbitrary code that these actors are delivering.

> People who write their own code to run on their own machine would
> likely prefer to be able to just directly reference the Python runtime
> they've already setup.

People who write their own code to run on their own machine can write Python directly. They don't need a web browser. Hell, they can bundle Flask and provide a localhost website that runs Python code. Those users are currently served just fine.

> I wonder whether the sandbox can be used outside of Web Assembly so
> that code distribution and security are not so intermingled. I did see
> someone write that WebAssembly is the "dawn of a new era", but I
> wonder whether it is mostly a bunch of Javascript people trying to
> solve their own problems rather than those who care about making
> Python work well on it.

Python will work just fine on it if you don't add the bizarre requirement to be able to access the user's machine from inside the browser sandbox.

Python on the web is a laudable goal, and should be pursued, but the idea of a Python-on-the-web as powerful as Python-on-the-machine is not. No-one wants their website Javascript to be able to call into a Node.js installed on the user's machine, because they know that's absurd. We shouldn't want Python to be able to do it either.

Anyway, at this point I think we're about as off-topic as we can get for this list, so I'm stepping back out of this conversation now. Feel free to follow-up off-list.



From ncoghlan at  Mon Apr 11 04:51:05 2016
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Apr 2016 18:51:05 +1000
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

On 11 April 2016 at 07:24, Keith Curtis <keithcu at> wrote:
> I personally find Python very stable, but the bug counts have been
> heading in the wrong direction:
> I recently discovered those charts, and they have valuable data that
> can be turned into action.

How? Are you offering to pay someone to bring them down? Are you offering
to work on it yourself? Are you offering to help with the current
workflow improvement efforts that are designed to make more efficient
use of the limited pool of available contributor time?

It's very easy to say "Hey, these metrics are going in the wrong
direction". It's very hard to do something productive about it, rather
than just complaining on the internet.

> My distributor of Python is Arch. I presume it is very close to what
> you guys release. That's just my example, but from what I've seen, few
> pay for a Python runtime.

Then you'd be incorrect (they may pay for it as part of something
else, like a Red Hat subscription, or an
Azure/AWS/GCE/Heroku/OpenShift account, but they're paying for it).

It's true that *open source and free software communities* are very
bad at ensuring the development processes for the software we use are
sustainable, but that seems to be because we're bad at supply chain
management in general and even more confused than most by the
difference between "duplicating an already existing piece of software
can readily be made zero cost" and "ensuring a piece of software
remains useful as the world around it changes requires ongoing
investments of time and money".

> I don't suggest guilt-tripping volunteers into fixing bugs. If you've
> got to dig a ditch, you can try to enjoy the sun. Getting the bug
> count under control is an admirable goal. The list is an opportunity
> to focus on the known problems real people care about most, and to try
> to deal with old issues before taking on new ones.

But is that a fun way for volunteers to prioritise their time, as
compared to working on things that interest them personally, or that
they otherwise find inherently rewarding?

Many of the oldest issues remain open because they're rare, easily
worked around, hard to fix, only arguably a bug, or some combination
of the above. Even reviewing them to see if they're still valid can be
time consuming.

A lot of projects deal with that problem by automatically closing old
issues that haven't been touched in a while, which seems to be a
textbook case of gaming the system - once you treat "number of open
issues" as an important metric, you've created an incentive to
"default to closing", even if the underlying problem hasn't been

> I don't recommend people fix bugs to help "corporations and large
> organizations". You won't be very motivated by that anti-capitalist
> mindset.

What anti-capitalist mindset? "If someone wants me to fix bugs for
their reasons rather than mine, they can pay me" is a
quintessentially capitalist attitude. I only add the "corporations
and large organisations" qualifier because they're more likely to be
able to afford to pay someone, and less likely to have volunteers
willing to work on their problems for altruistic reasons.

> People should fix bugs because it helps real Python users,
> and gets rid of barriers.

This is still guilt tripping volunteers - neither "you're not
volunteering enough" nor "you're volunteering for the wrong things"
are acceptable messages to send to folks that are already contributing
(and folks that send that message regardless of its inappropriateness
are a major factor in open source contributors burning out and
quitting community contributions entirely, so it has the exact
opposite effect of the intended one).

By contrast, it's perfectly reasonable to let folks that have *spare*
time they're willing to give to the community know that there are
plenty of opportunities to contribute and provide info regarding some
of the many areas where assistance would be helpful, as long as it's
accompanied by the reminder that a sensible and sustainable personal
priority order is "health, relationships, paid work, volunteer work".

Even putting that question of healthy priorities aside, it's also the
case that the vast majority of Pythonistas *aren't* affected by the
edge cases that aren't handled correctly, or they're using third party
libraries that already work around those issues. The latter is
especially true for folks working across multiple Python versions.

> I was teasing about you guys not being paid professionals, however, it
> appears that very few of you have the goal of getting the official bug
> count down. That is a big difference between amateurs and
> professionals.

As noted above, commercial projects tend to be far more ruthless about
culling old "We're not going to invest resources in addressing this"
issues. We prefer to leave them open in case they catch someone's
interest (and because politely explaining to someone why you closed
their issue report can itself be quite time consuming).

However, I'll also reiterate my original point: if you want a customer
experience, pay someone.

> If few people feel ownership of the official Python, it
> can be bad. Maybe the PSF could hire more with that mindset.

There's a big difference between "not offering a customer experience
for free" and not feeling ownership. Now, you may *want* a customer
experience for free, but that doesn't make it a reasonable

> I don't
> know the solutions, but I am grateful to be able to send you a few
> words.

This isn't a case where articulating the problem is at all helpful -
we're well aware of the problem already (hence the metrics).

The PSF is also aware of the concern, but it's a longer term challenge
compared to other areas (such as the Python Package Index), so it's
not something we're interested in driving from the Board level at this
point in time.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rosuav at  Mon Apr 11 04:56:01 2016
From: rosuav at (Chris Angelico)
Date: Mon, 11 Apr 2016 18:56:01 +1000
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Apr 11, 2016 at 6:51 PM, Nick Coghlan <ncoghlan at> wrote:
> Many of the oldest issues remain open because they're rare, easily
> worked around, hard to fix, only arguably a bug, or some combination
> of the above. Even reviewing them to see if they're still valid can be
> time consuming.

Maybe this is where someone like Keith can contribute? Go through a
lot of old issues and inevitably there'll be some that you can
reproduce with the version of Python that was current then, but can't
repro with today's Python, and they can be closed as fixed. Doesn't
take any knowledge of C, and maybe not even of Python (if there's a
good enough test case there).


From ncoghlan at  Mon Apr 11 05:59:54 2016
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Apr 2016 19:59:54 +1000
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

On 11 April 2016 at 18:56, Chris Angelico <rosuav at> wrote:
> On Mon, Apr 11, 2016 at 6:51 PM, Nick Coghlan <ncoghlan at> wrote:
>> Many of the oldest issues remain open because they're rare, easily
>> worked around, hard to fix, only arguably a bug, or some combination
>> of the above. Even reviewing them to see if they're still valid can be
>> time consuming.
> Maybe this is where someone like Keith can contribute? Go through a
> lot of old issues and inevitably there'll be some that you can
> reproduce with the version of Python that was current then, but can't
> repro with today's Python, and they can be closed as fixed. Doesn't
> take any knowledge of C, and maybe not even of Python (if there's a
> good enough test case there).


As David Murray has pointed out on occasion, browsing through the
tracker "oldest first" can also be interesting in terms of reading the
discussions and seeing what leads to issues remaining open for a long
time.

Separating out "Open Enhancements" as a subcategory of "Open Issues"
is also a longstanding "nice-to-have" for the metrics collection - the
script for the weekly data collection is at
, while
pulls the time series created by those notifications and turns it into
a chart.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From berker.peksag at  Mon Apr 11 06:38:02 2016
From: berker.peksag at (=?UTF-8?Q?Berker_Peksa=C4=9F?=)
Date: Mon, 11 Apr 2016 13:38:02 +0300
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Apr 11, 2016 at 12:24 AM, Keith Curtis <keithcu at> wrote:

> I was teasing about you guys not being paid professionals, however, it
> appears that very few of you have the goal of getting the official bug
> count down.

While I understand your point, it's more complicated than that. I
spent some of my weekends to close/commit old issues on and unfortunately I can say that issue triaging is one
of the most unrewarding (you will probably get a "oh so it took three
years to commit a two lines patch?" or "wow it took only two years to
notice my problem" response) and time consuming (you will spend at
least 30 minutes to understand what the issue is about) tasks in open
source development.


From stephen at  Mon Apr 11 08:01:28 2016
From: stephen at (Stephen J. Turnbull)
Date: Mon, 11 Apr 2016 21:01:28 +0900
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

Executive summary: Keep your shirts on, Python-Dev is scaling.

Nick Coghlan writes:
 > On 11 April 2016 at 07:24, Keith Curtis <keithcu at> wrote:
 > > I personally find Python very stable, but the bug counts have been
 > > heading in the wrong direction:
 > >
 > >
 > > I recently discovered those charts, and they have valuable data that
 > > can be turned into action.
 > How?

Wrong question, in my opinion.  "Why do you think so?" is what I'd
ask.  To me, only the last graph, which shows the closed and total
growing at the same rate (more or less), and therefore suggests that
Python development is on a stable path vis-a-vis bugginess, is
particularly interesting.

Growth in number of open issues in that situation is arguably a *good*
thing.  Why might the open issues be growing, even though the fraction
of open issues (= 1 - the fraction of closed issues) is constant?

(1) Users are reporting lots of issues.  How would that happen?
    Python is getting lots of users, and they aren't going away
    because Python is "too buggy".  Hard to see that as a bad thing.

(2) Python has a growing amount of code to be buggy, and users are
    exercising the new code and reporting issues they encounter, and
    aren't going away because it's too buggy.  Hard to see that as a
    bad thing.

(3) The reported issues are duplicates reported because nobody's
    fixing an important subset.  That's bad, but is it real?  Well,
    can't disprove that just looking at the numbers, but (a) we'd
    notice the dupes (people are looking at the issue tracker, because
    issues are getting closed) and (b) the users would go away, but
    (1) and (2) say they aren't.

The same "back of the envelope" analysis applies to the "issues with
patches," I think.
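
To make the arithmetic concrete, here is a small sketch with made-up
numbers (the totals and the 85% closed fraction are purely illustrative,
not taken from the tracker). It shows the open count growing in step with
the total while the closed fraction stays fixed.

```python
def open_issues(total, closed_fraction):
    """Open count when a fixed fraction of all reported issues is closed."""
    closed = round(total * closed_fraction)
    return total - closed


# Made-up totals; the closed fraction is held at 85% throughout.
for total in (10000, 20000, 40000):
    print(total, open_issues(total, 0.85))
```

Doubling the total doubles the open count even though the project is
closing issues at exactly the same rate as before.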

Not to be Pollyanna about it; there are problems with Python's issue
management (and Nick knows them better than most).  And perhaps there
is an opportunity to leverage that stability, and improve the open to
total ratio while maintaining user and feature growth.  But AFAICS,
those graphs don't really tell us anything we can act on (except to
reassure us that the rumors that Python is about to be consumed by
termites are unfounded).

From ncoghlan at  Mon Apr 11 10:06:04 2016
From: ncoghlan at (Nick Coghlan)
Date: Tue, 12 Apr 2016 00:06:04 +1000
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <>

On 11 April 2016 at 22:01, Stephen J. Turnbull <stephen at> wrote:
> Executive summary: Keep your shirts on, Python-Dev is scaling.
> Nick Coghlan writes:
>  > On 11 April 2016 at 07:24, Keith Curtis <keithcu at> wrote:
>  > > I personally find Python very stable, but the bug counts have been
>  > > heading in the wrong direction:
>  > >
>  > >
>  > > I recently discovered those charts, and they have valuable data that
>  > > can be turned into action.
>  >
>  > How?
> Not to be Pollyanna about it; there are problems with Python's issue
> management (and Nick knows them better than most).

Yeah, I have a biased perspective here as I have a vested interest in
folks taking greater advantage of any support contracts they already
have (since Red Hat's subscriptions are what ultimately pay my
salary), and my pre-Red-Hat employment was with a large scale system
integrator, so I'm now inclined to look for supply chain based
solutions to capacity problems whenever there are credible commercial
incentives to be found.

Since keeping high touch customers out of community channels also
helps reduce the overall support burden for volunteers, I'll generally
advocate for that approach whenever the questions of increased
development capacity or changes in focus come up :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From Nikolaus at  Mon Apr 11 10:43:49 2016
From: Nikolaus at (Nikolaus Rath)
Date: Mon, 11 Apr 2016 16:43:49 +0200
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <> (Ethan Furman's message of "Sun, 
 10 Apr 2016 14:31:15 -0700")
References: <>
 <> <>
Message-ID: <>

On Apr 10 2016, Ethan Furman <ethan-gcWI5d7PMXnvaiG9KC9N7Q at> wrote:
> On 04/10/2016 02:24 PM, Nikolaus Rath wrote:
>> On Apr 09 2016, Nick Coghlan wrote:
>>> The equivalent motivating use case here is "allow pathlib objects to
>>> be used with the open() builtin, os module functions, and os.path
>>> module functions".
>> Why isn't this use case perfectly solved by:
>> def open(obj, *a):
>>     # If pathlib isn't imported, we can't possibly receive
>>     # a Path object.
>>     pathlib = sys.modules.get('pathlib', None)
>>     if pathlib is not None and isinstance(obj, pathlib.Path):
>>         obj = str(obj)
>>     return real_open(obj, *a)
> pathlib is the primary motivator, but there's no reason to shut
> everyone else out.
> By having a well-defined protocol and helper function we not only make
> our own lives easier but we also make the lives of third-party
> libraries and experimenters easier.

To me this sounds like catering to a hypothetical audience that may want
to do hypothetical things. If you start with the above, and people
complain that their favorite non-pathlib path library is not supported
by the stdlib, you can still add a protocol. But if you add a protocol
right away, you're stuck with the complexity for a very long time even
if almost no one actually uses it.


GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             "Time flies like an arrow, fruit flies like a Banana."

From random832 at  Mon Apr 11 10:57:47 2016
From: random832 at (Random832)
Date: Mon, 11 Apr 2016 10:57:47 -0400
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Mon, Apr 11, 2016, at 10:43, Nikolaus Rath wrote:
> To me this sounds like catering to a hypothetical audience that may want
> to do hypothetical things. If you start with the above, and people
> complain that their favorite non-pathlib path library is not supported
> by the stdlib, you can still add a protocol. But if you add a protocol
> right away, you're stuck with the complexity for a very long time even
> if almost no one actually uses it.

And there's nothing stopping people from subclassing from Path (should
it be PurePath?), or monkey-patching pathlib. In this case,
"isinstance(foo, Path) returns true" is the protocol.

Also, where is the .path attribute? It's documented, but...

>>> pathlib.Path(".").path
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'PosixPath' object has no attribute 'path'

From ethan at  Mon Apr 11 11:10:24 2016
From: ethan at (Ethan Furman)
Date: Mon, 11 Apr 2016 08:10:24 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 04/11/2016 07:57 AM, Random832 wrote:

> Also, where is the .path attribute? It's documented
> <>,
> but...
>>>> pathlib.Path(".").path
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> AttributeError: 'PosixPath' object has no attribute 'path'

pathlib is provisional -- `.path` has been committed and will be 
available with the next releases (assuming we don't change it out for 
the protocol version).


From ethan at  Mon Apr 11 11:13:32 2016
From: ethan at (Ethan Furman)
Date: Mon, 11 Apr 2016 08:13:32 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 04/11/2016 07:43 AM, Nikolaus Rath wrote:
> On Apr 10 2016, Ethan Furman wrote:

>> By having a well-defined protocol and helper function we not only make
>> our own lives easier but we also make the lives of third-party
>> libraries and experimenters easier.
> To me this sounds like catering to a hypothetical audience that may want
> to do hypothetical things. If you start with the above, and people
> complain that their favorite non-pathlib path library is not supported
> by the stdlib, you can still add a protocol. But if you add a protocol
> right away, you're stuck with the complexity for a very long time even
> if almost no one actually uses it.

The part of "make our own lives easier" is not a hypothetical audience.

Making my own library (antipathy) work seamlessly with pathlib and 
DirEntry is not hypothetical.


From Nikolaus at  Mon Apr 11 17:09:01 2016
From: Nikolaus at (Nikolaus Rath)
Date: Mon, 11 Apr 2016 14:09:01 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <> (Ethan Furman's message of "Mon, 
 11 Apr 2016 08:13:32 -0700")
References: <>
 <> <>
 <> <>
Message-ID: <>

On Apr 11 2016, Ethan Furman <ethan-gcWI5d7PMXnvaiG9KC9N7Q at> wrote:
> On 04/11/2016 07:43 AM, Nikolaus Rath wrote:
>> On Apr 10 2016, Ethan Furman wrote:
>>> By having a well-defined protocol and helper function we not only make
>>> our own lives easier but we also make the lives of third-party
>>> libraries and experimenters easier.
>> To me this sounds like catering to a hypothetical audience that may want
>> to do hypothetical things. If you start with the above, and people
>> complain that their favorite non-pathlib path library is not supported
>> by the stdlib, you can still add a protocol. But if you add a protocol
>> right away, you're stuck with the complexity for a very long time even
>> if almost no one actually uses it.
> The part of "make our own lives easier" is not a hypothetical
> audience.

My assumption was that "own" refers to core developers here, while...

> Making my own library (antipathy) work seamlessly with pathlib and
> DirEntry is not hypothetical.

...you seem to be wearing your third-party maintainer hat :-).

As far as I can see, implementing a protocol instead of adding a few
isinstance checks is more likely to make the life of a CPython developer
harder than easier.


GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             “Time flies like an arrow, fruit flies like a Banana.”

From ethan at  Mon Apr 11 17:19:39 2016
From: ethan at (Ethan Furman)
Date: Mon, 11 Apr 2016 14:19:39 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

On 04/11/2016 02:09 PM, Nikolaus Rath wrote:
> On Apr 11 2016, Ethan Furman <ethan-gcWI5d7PMXnvaiG9KC9N7Q at> wrote:
>> On 04/11/2016 07:43 AM, Nikolaus Rath wrote:
>>> On Apr 10 2016, Ethan Furman wrote:
>>>> By having a well-defined protocol and helper function we not only make
>>>> our own lives easier but we also make the lives of third-party
>>>> libraries and experimenters easier.
>>> To me this sounds like catering to a hypothetical audience that may want
>>> to do hypothetical things. If you start with the above, and people
>>> complain that their favorite non-pathlib path library is not supported
>>> by the stdlib, you can still add a protocol. But if you add a protocol
>>> right away, you're stuck with the complexity for a very long time even
>>> if almost no one actually uses it.
>> The part of "make our own lives easier" is not a hypothetical
>> audience.
> My assumption was that "own" refers to core developers here, while..

It does.

>> Making my own library (antipathy) work seamlessly with pathlib and
>> DirEntry is not hypothetical.
> you seem to be wearing your third-party maintainer hat :-).

I was.  :)

> As far as I can see, implementing a protocol instead of adding a few
> isinstance checks is more likely to make the life of a CPython developer
> harder than easier.

I disagree.  And the protocol idea was not mine, so apparently other 
core-devs also disagree (or think it's worth it, regardless).  Even 
without the protocol I would think we'd still make a separate function 
to check the input.


From d3matt at  Mon Apr 11 17:22:25 2016
From: d3matt at (Matthew Stoltenberg)
Date: Mon, 11 Apr 2016 15:22:25 -0600
Subject: [Python-ideas] simpler weakref.finalize
Message-ID: <>

Currently, if I want to have weakref.finalize call an object's cleanup method,
I have to do something like:

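import weakref

class foo:

    _finalizer = None

    def __init__(self):
        # Pass a WeakMethod so the finalizer holds no strong reference to self.
        self._finalizer = weakref.finalize(self, foo._cleanup,
                                           weakref.WeakMethod(self.cleanup))

    def __del__(self):
        self.cleanup()

    @classmethod
    def _cleanup(cls, func):
        # Dereference the WeakMethod; it is None once the object is gone.
        method = func()
        if method is not None:
            method()

    def cleanup(self):
        if self._finalizer is not None:
            self._finalizer.detach()
        print('cleaning up')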

It wouldn't be difficult to have weakref.finalize automatically handle the
conversion to WeakMethod and automatically attempt to dereference then call
the function passed in.
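
A rough sketch of the behaviour being asked for, written as a standalone helper rather than as a change to weakref.finalize itself. The name `finalize_method` is hypothetical; only `weakref.finalize` and `weakref.WeakMethod` are real stdlib APIs here.

```python
import weakref


def finalize_method(obj, method):
    """Register a bound method of obj as a finalizer without keeping obj alive.

    Hypothetical helper; weakref.finalize has no such behaviour built in.
    """
    weak_method = weakref.WeakMethod(method)

    def _call():
        bound = weak_method()      # None once obj has been collected
        if bound is not None:
            return bound()

    return weakref.finalize(obj, _call)


class Resource:
    def __init__(self):
        self.cleaned = False
        self._finalizer = finalize_method(self, self.cleanup)

    def cleanup(self):
        self.cleaned = True


r = Resource()
r._finalizer()    # an explicit call while r is alive runs cleanup()
```

Note the limitation: once the object has actually been collected the WeakMethod is dead, so the callback does nothing. Only explicit calls and the at-exit pass over still-living objects ever reach cleanup().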

From keithcu at  Mon Apr 11 17:36:27 2016
From: keithcu at (Keith Curtis)
Date: Mon, 11 Apr 2016 17:36:27 -0400
Subject: [Python-ideas] (no subject)
Message-ID: <>

Hello again,

Many FOSS communities struggle with sustainability, but Python is very
rich so you've clearly figured it out.

It is easy to say that bug metrics are going in the wrong direction.
It is also easy to do nothing, or to criticize those who are concerned
about a systemic issue. Data can be turned into action by leadership
and plans. It can be helpful to have people who think 5,000 is a big
number. (Automatically closing old issues doesn't take grit and isn't
a valid solution.) It seems you could use at least 10 more dedicated
devs who focus on the official CPython bug count.

As for WebAssembly, if their security is done mostly via their virtual
machine, then they won't be able to separate it. However, if Python in
the browser can't enable optional access to NumPy, etc., then it will be
missing the best reasons to use it. Sometimes security can destroy
utility. Just because there are some untrustworthy websites doesn't
mean all must be.

Warm regards,


From ethan at  Mon Apr 11 17:46:23 2016
From: ethan at (Ethan Furman)
Date: Mon, 11 Apr 2016 14:46:23 -0700
Subject: [Python-ideas] (no subject)
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/11/2016 02:36 PM, Keith Curtis wrote:

> It seems you could use at least 10 more dedicated
> devs who focus on the official CPython bug count.

At least.  So create a Python Fund and let us know where to apply.  :)


From ian.g.kelly at  Mon Apr 11 18:07:42 2016
From: ian.g.kelly at (Ian Kelly)
Date: Mon, 11 Apr 2016 16:07:42 -0600
Subject: [Python-ideas] (no subject)
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Apr 11, 2016 at 3:36 PM, Keith Curtis <keithcu at> wrote:
> As for WebAssembly, if their security is done mostly via their virtual
> machine, then they won't be able to separate it. However, if Python in
> the browser can't enable optional access to NumPy, etc., then it will be
> missing the best reasons to use it. Sometimes security can destroy
> utility. Just because there are some untrustworthy websites doesn't
> mean all must be.

How do you propose to ascertain whether the website the user is
visiting is trustworthy? Ask the user whether they trust the author?
This sounds like it has great potential to be the security disaster of
signed Java applets all over again.

From tjreedy at  Mon Apr 11 18:45:38 2016
From: tjreedy at (Terry Reedy)
Date: Mon, 11 Apr 2016 18:45:38 -0400
Subject: [Python-ideas] A tuple of various Python suggestions
In-Reply-To: <>
References: <>
Message-ID: <neh9fb$ktr$>

On 4/11/2016 4:56 AM, Chris Angelico wrote:
> On Mon, Apr 11, 2016 at 6:51 PM, Nick Coghlan <ncoghlan at> wrote:
>> Many of the oldest issues remain open because they're rare, easily
>> worked around, hard to fix, only arguably a bug, or some combination
>> of the above. Even reviewing them to see if they're still valid can be
>> time consuming.
> Maybe this is where someone like Keith can contribute? Go through a
> lot of old issues and inevitably there'll be some that you can
> reproduce with the version of Python that was current then, but can't
> repro with today's Python, and they can be closed as fixed. Doesn't
> take any knowledge of C, and maybe not even of Python (if there's a
> good enough test case there).

Or: look through old enhancement requests.  There are many that were
posted before python-ideas existed.  Today, they would likely be posted
here first and, if they found no support (as happens more often than
not), never posted to the tracker.  Or, if posted to the tracker first,
the poster would be told to go to python-ideas for discussion.  Anyone
could repost old ideas to python-list now and post the result of the
discussion back to the tracker.  If the discussion suggests rejection,
then a coredev can close the issue.  (I certainly would.)  Closure is
not erasure, so the idea is there permanently anyway.

Terry Jan Reedy

From wes.turner at  Mon Apr 11 18:47:07 2016
From: wes.turner at (Wes Turner)
Date: Mon, 11 Apr 2016 17:47:07 -0500
Subject: [Python-ideas] (no subject)
In-Reply-To: <>
References: <>
Message-ID: <>

Python Software Foundation (PSF)

| Web:
| Twitter:

PSF accepts donations, yeah.

* *

Other ways to fund (additions to, fixes for, idle talk about) open source

* Crowdfunding campaign (specific)
* Bounties (specific / open)
* "Hire a developer"

On Apr 11, 2016 4:48 PM, "Ethan Furman" <ethan at> wrote:

> On 04/11/2016 02:36 PM, Keith Curtis wrote:
>> It seems you could use at least 10 more dedicated
>> devs who focus on the official CPython bug count.
> At least.  So create a Python Fund and let us know where to apply.  :)
> --
> ~Ethan~
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From ncoghlan at  Mon Apr 11 22:58:05 2016
From: ncoghlan at (Nick Coghlan)
Date: Tue, 12 Apr 2016 12:58:05 +1000
Subject: [Python-ideas] (no subject)
In-Reply-To: <>
References: <>
Message-ID: <>

On 12 April 2016 at 07:36, Keith Curtis <keithcu at> wrote:
> Hello again,
> Many FOSS communities struggle with sustainability, but Python is very
> rich so you've clearly figured it out.
> It is easy to say that bug metrics are going in the wrong direction.
> It is also easy to do nothing, or to criticize those who are concerned
> about a systemic issue. Data can be turned into action by leadership
> and plans. It can be helpful to have people who think 5,000 is a big
> number. (Automatically closing old issues doesn't take grit and isn't
> a valid solution.) It seems you could use at least 10 more dedicated
> devs who focus on the official CPython bug count.

Keith, I get it. You're worried about the issue tracker stats, and
apparently believe if you just yell long enough and hard enough here
we'll suddenly go "You know, you're right, we never thought of that,
and we should drop everything else immediately in favour of seeking
funding for full-time core development work".

However, CPython core development is only *one* of the activities the
PSF helps to support (see [1] for a partial list of others), and it's
one where commercial entities can most readily contribute people's
time and energy directly rather than indirectly through the Python
Software Foundation.

As core developers, we're individually free to add our details to the
Motivations & Affiliations page at [2] and negotiate with our current
and future employers for dedicated time to devote to general CPython
maintenance, rather than focusing solely on specific items relevant to
our work.

Folks that aren't core developers yet, but are fortunate enough to
work for organisations with a good career planning process and a
vested interest in Python's continued success are free to negotiate
with their managers to add "become a CPython core developer and spend
some of my working hours on general CPython maintenance" to their
individual career goals.

Any core developer that chooses to do so is also already free to
submit a development grant proposal to the PSF to dedicate some of
their time to issue tracker grooming, and it's a fair bet (although
not a guarantee) that any such grant proposal would be approved as
long as the hourly rate and total amount requested were reasonable,
and the activities to be pursued and the desired outcome were defined.

However, whether or not anyone chooses to do any of those things is a
decision that takes place in the context of that "health,
relationships, paid work, volunteer work" priority order I mentioned
earlier. Not everyone is going to want to turn a volunteer activity
into a paid one, and not everyone is going to want to prioritise
CPython core development over their other activities. Telling people
"your priorities should be different because I say they should be
different" is an approach that has never worked in volunteer
management, and never *will* work in volunteer management, as
overcoming those differences in intrinsic motivation is the key
rationale for paid employment.



Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ben+python at  Mon Apr 11 23:08:45 2016
From: ben+python at (Ben Finney)
Date: Tue, 12 Apr 2016 13:08:45 +1000
Subject: [Python-ideas] (no subject)
References: <>
Message-ID: <>

Keith Curtis <keithcu at> writes:

> It is easy to say that bug metrics are going in the wrong direction.
> It is also easy to do nothing, or to criticize those who are concerned
> about a systemic issue.

This is astounding hubris. You started several threads unprompted and
made many posts this week, which can all be fairly characterised as
criticism without actionable solutions.

How is that usefully distinct from “do nothing”, except for occupying
time in apparently fruitless nagging?

 \       “Corporation, n. An ingenious device for obtaining individual |
  `\      profit without individual responsibility.” --Ambrose Bierce, |
_o__)                                   _The Devil's Dictionary_, 1906 |
Ben Finney

From nicholas.chammas at  Tue Apr 12 10:39:28 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Tue, 12 Apr 2016 14:39:28 +0000
Subject: [Python-ideas] (no subject)
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Apr 11, 2016 at 10:58 PM Nick Coghlan <ncoghlan at> wrote:

> However, whether or not anyone chooses to do any of those things is a
> decision that takes place in the context of that "health,
> relationships, paid work, volunteer work" priority order I mentioned
> earlier. Not everyone is going to want to turn a volunteer activity
> into a paid one, and not everyone is going to want to prioritise
> CPython core development over their other activities. Telling people
> "your priorities should be different because I say they should be
> different" is an approach that has never worked in volunteer
> management, and never *will* work in volunteer management, as
> overcoming those differences in intrinsic motivation is the key
> rationale for paid employment.

I think this gets to the core of how work gets done in a volunteer
organization, and addresses a common misunderstanding people have.

Does anyone know of a blog post that expands on this point? I think it
would be a good resource to point people to in situations like this.

It's easy to think that Python is somehow developed by an organization that
decides top-down what's going to get done and who's going to do what,
allocating resources as needed. After all, that's how a typical company
works.

In the Python community (and perhaps in most open-source communities not
dominated by a backing company) I gather the rules are very different.


From stefan at  Tue Apr 12 13:07:02 2016
From: stefan at (Stefan Krah)
Date: Tue, 12 Apr 2016 17:07:02 +0000 (UTC)
Subject: [Python-ideas] (no subject)
References: <>
Message-ID: <>

Nicholas Chammas <nicholas.chammas at ...> writes:
> It's easy to think that Python is somehow developed by an organization
> that decides top-down what's going to get done and who's going to do what,
> allocating resources as needed. After all, that's how a typical company works.
> In the Python community (and perhaps in most open-source communities not
> dominated by a backing company) I gather the rules are very different.

This is an excellent point.  I think that one of the problems is
that the Python website is entirely dominated by the PSF.

Perhaps it is time to put a little more emphasis on development

Stefan Krah

From Nikolaus at  Tue Apr 12 13:17:52 2016
From: Nikolaus at (Nikolaus Rath)
Date: Tue, 12 Apr 2016 10:17:52 -0700
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <> (Ethan Furman's message of "Mon, 
 11 Apr 2016 14:19:39 -0700")
References: <>
 <> <>
 <> <>
 <> <>
Message-ID: <>

On Apr 11 2016, Ethan Furman <ethan-gcWI5d7PMXnvaiG9KC9N7Q at> wrote:
>> As far as I can see, implementing a protocol instead of adding a few
>> isinstance checks is more likely to make the life of a CPython developer
>> harder than easier.
> I disagree.  And the protocol idea was not mine, so apparently other
> core-devs also disagree (or think it's worth it, regardless).

I haven't found any email explaining why a protocol would make things
easier than the isinstance() approach (and I read most of the threads
both here and on -dev), so I was assuming that the core-devs in question
don't disagree but haven't considered the second approach.

But this will be my last mail advocating it. If there's still no
interest, then there probably is a reason for it :-).


GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             “Time flies like an arrow, fruit flies like a Banana.”

From brett at  Tue Apr 12 13:33:38 2016
From: brett at (Brett Cannon)
Date: Tue, 12 Apr 2016 17:33:38 +0000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
 <> <>
 <> <>
Message-ID: <>

On Tue, 12 Apr 2016 at 10:18 Nikolaus Rath <Nikolaus at> wrote:

> On Apr 11 2016, Ethan Furman <
> ethan-gcWI5d7PMXnvaiG9KC9N7Q at> wrote:
> >> As far as I can see, implementing a protocol instead of adding a few
> >> isinstance checks is more likely to make the life of a CPython developer
> >> harder than easier.
> >
> > I disagree.  And the protocol idea was not mine, so apparently other
> > core-devs also disagree (or think it's worth it, regardless).
> I haven't found any email explaining why a protocol would make things
> easier than the isinstance() approach (and I read most of the threads
> both here and on -dev), so I was assuming that the core-devs in question
> don't disagree but haven't considered the second approach.

I disagree with the idea. :) Type checking tends to be too strict as it
prevents duck typing. And locking ourselves to only what's in the stdlib is
way too restrictive when there are alternative path libraries out there
already. Plus it's short-sighted to assume no one will ever come up with a
better path library, and so tying us down to only pathlib.PurePath
explicitly would be a mistake long-term when specifying a single method
keeps things flexible (this is the same reason __index__() exists and we
simply don't say indexing has to be with a subclass of int or something).
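
To make the __index__() comparison concrete, here is a small sketch. The first half uses only real behaviour (`__index__` and `operator.index`). The second half shows how a single-method path protocol could look; `path_string`, `__path_string__` and `ThirdPartyPath` are invented names for illustration, not the method or helper actually under discussion.

```python
import operator


class Offset:
    """Not an int subclass, yet accepted wherever an index is needed."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


letters = ['a', 'b', 'c']
picked = letters[Offset(1)]            # list indexing consults __index__
as_int = operator.index(Offset(2))


def path_string(obj):
    """Hypothetical helper: str passes through, others use a made-up dunder."""
    if isinstance(obj, str):
        return obj
    try:
        method = type(obj).__path_string__   # invented name, illustration only
    except AttributeError:
        raise TypeError('not a path-like object: %r' % (obj,))
    return method(obj)


class ThirdPartyPath:
    """Stand-in for a path class that lives outside the stdlib."""

    def __init__(self, raw):
        self._raw = raw

    def __path_string__(self):
        return self._raw
```

The helper never names a concrete path class, so a library that does not inherit from pathlib.PurePath can still take part by defining the one method.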


From bzvi7919 at  Tue Apr 12 17:40:33 2016
From: bzvi7919 at (Bar Harel)
Date: Tue, 12 Apr 2016 21:40:33 +0000
Subject: [Python-ideas] Modifying yield from's return value
Message-ID: <>

I asked a question on Stack Overflow
<> regarding a way of modifying
yield from's return value.
There doesn't seem to be any way of modifying the data yielded by the yield
from expression without breaking the yield from "pipe". By "pipe" I mean
the fact that .send() and .throw() pass to the inner generator. Useful
cases are parsers as I've demonstrated in the question, transcoders,
encoding and decoding the yielded values without interrupting .send() or
.throw() and generally coroutines.

I believe it would be beneficial to many aspects and libraries in Python,
most notably asyncio.

I couldn't think of a good syntax though, other than creating a wrapping
function, let's say in itertools, that creates a class overriding send,
throw and __next__ and receives the generator and associated modification
functions (like suggested in one of the answers).
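
A minimal sketch of such a wrapping class, assuming a single function applied to every yielded value. The name `mapped_gen` and the demo generators are made up for this example; nothing like it exists in itertools.

```python
class mapped_gen:
    """Apply func to each value a generator yields, forwarding send/throw/close."""

    def __init__(self, func, gen):
        self._func = func
        self._gen = gen

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the inner generator propagates unchanged,
        # so its return value still reaches `yield from`.
        return self._func(next(self._gen))

    def send(self, value):
        return self._func(self._gen.send(value))

    def throw(self, *exc_info):
        return self._func(self._gen.throw(*exc_info))

    def close(self):
        self._gen.close()


def inner():
    received = yield 1
    yield received * 2
    return 'done'


def outer():
    # Yielded values are transformed; sent values reach inner() untouched.
    result = yield from mapped_gen(lambda v: v + 100, inner())
    yield result
```

Because `yield from` looks up send, throw and close on whatever iterator it is given, the "pipe" stays intact and only the outgoing values change.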

What do you think? Is it useful? Any suggested syntax?

From rob.cliffe at  Tue Apr 12 17:40:32 2016
From: rob.cliffe at (Rob Cliffe)
Date: Tue, 12 Apr 2016 22:40:32 +0100
Subject: [Python-ideas] random.choice on non-sequence
Message-ID: <>

It surprised me a bit the first time I realised that random.choice did 
not work on a set.  (One expects "everything" to "just work" in Python! 
:-) )
There is a simple workaround (convert the set to a tuple or list before 
passing it) but ...
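
For reference, a short sketch of the failure and the workaround; the set contents are arbitrary example data.

```python
import random

colours = {'red', 'green', 'blue'}

try:
    random.choice(colours)
except TypeError:
    # a set has a length but cannot be indexed
    failed = True
else:
    failed = False

# Workaround: copy the set into a sequence first.
picked = random.choice(tuple(colours))
```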

On 2011-06-23, Sven Marnach posted 'A few suggestions for the random 
module' on Python-Ideas, one of which was to allow random.choice to work 
on an arbitrary iterable.
Raymond Hettinger commented
     "It could be generalized to arbitrary iterables (Bentley provides 
an example of how to do this) but it is fragile (i.e. falls apart badly 
with weak random number generators) and doesn't correspond well with 
real use cases."
Sven replied
     "Again, there definitely are real world use cases ..."

I would like to revive this suggestion.  I don't understand Raymond's 
comment.  It seems to me that a reasonable implementation is better and 
more friendly than none, particularly for newbies.
This is the source code for random.choice in Python 3.2 (as far as I 
know it hasn't changed since):

     def choice(self, seq):
         """Choose a random element from a non-empty sequence."""
         try:
             i = self._randbelow(len(seq))
         except ValueError:
             raise IndexError('Cannot choose from an empty sequence')
         return seq[i]

And here is my proposed (tested) version:

     def choice(self, it):
         """Choose a random element from a non-empty iterable."""
         try:
             it[0]
         except IndexError:
             raise IndexError('Cannot choose from an empty sequence')
         except TypeError:
             it = tuple(it)
         try:
             i = self._randbelow(len(it))
         except ValueError:
             raise IndexError('Cannot choose from an empty iterable')
         return it[i]

This works on (e.g.) a list/tuple/string, a set, a dictionary view or a
generator.
Obviously the generator has to be 'fresh' (i.e. any previously consumed 
values will be excluded from the choice) and 'throw-away' (random.choice 
will exhaust it).
But it means you can write code like
     random.choice(x for x in xrange(10) if x%3)    # this feels quite 
"Pythonic" to me!

(I experimented with versions that, when passed an object that supported 
len() but not indexing, viz. a set, iterated through the object as far 
as necessary instead of converting it to a list.  But I found that this 
was slower in practice as well as more complicated.  There might be a 
memory issue with huge iterables, but we are no worse off using existing 
workarounds for those.)

Downsides of proposed version:
     (1) Backward compatibility.  Could mask an error if a "wrong" object,
         but one that happens to be an iterable, is passed to random.choice.
         There is also slightly different behaviour if a dictionary (D) is
         passed (an unusual thing to do):
             The current version will get a random integer in the range
             [0, len(D)), try to use that as a key of D, and raise KeyError
             if that fails.
             The proposed version behaves similarly except that it will
             always raise "KeyError: 0" if 0 is not a key of D.
             One could argue that this is an improvement, if it matters at
             all (error reproducibility).
             And code that deliberately exploits the existing behaviour,
             provided that it uses a dictionary whose keys are consecutive
             integers from 0 upwards, e.g.
                 D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 : 'three', 4 : 'four' }
                 d = random.choice(D)    # equivalent, in this example, to
                                         # "d = random.choice(list(D.values()))"
                 if d == 'zero':
                     ...
             will continue to work (although one might argue that it
             deserves not to).
     (2) Speed - the proposed version is almost certainly slightly slower
         when called with a sequence.
         For what it's worth, I tried to measure the difference on my
         several-years-old Windows PC, but it was hard to measure accurately,
         even over millions of iterations.
         All I can say is that it appeared to be a small fraction of a
         millisecond, possibly in the region of 50 nanoseconds, per call.

Best wishes
Rob Cliffe

From guido at  Tue Apr 12 17:53:17 2016
From: guido at (Guido van Rossum)
Date: Tue, 12 Apr 2016 14:53:17 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

The problem with your version is that copying the input is slow if it is large.

Raymond was referencing the top answer here:
It's also slow though (draws N random values if the input is of length
N).
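
A sketch of the single-pass selection that needs one random draw per element, assuming the technique meant is reservoir sampling with a reservoir of size one. The function name `choice_from_iterable` is made up for this example.

```python
import random


def choice_from_iterable(iterable, rng=random):
    """Pick one item uniformly at random in a single pass, without copying."""
    chosen = None
    count = 0
    for count, item in enumerate(iterable, start=1):
        # Replace the current pick with probability 1/count.
        if rng.randrange(count) == 0:
            chosen = item
    if count == 0:
        raise IndexError('Cannot choose from an empty iterable')
    return chosen
```

It uses constant memory, but it always consumes the whole input and makes N calls to the random number generator, which is the cost being pointed out.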

On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe <rob.cliffe at> wrote:
> It surprised me a bit the first time I realised that random.choice did not
> work on a set.  (One expects "everything" to "just work" in Python! :-) )
> There is a simple workaround (convert the set to a tuple or list before
> passing it) but ...
> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random module'
> on Python-Ideas, one of which was to allow random.choice to work on an
> arbitrary iterable.
> Raymond Hettinger commented
>     "It could be generalized to arbitrary iterables (Bentley provides an
> example of how to do this) but it is fragile (i.e. falls apart badly with
> weak random number generators) and doesn't correspond well with real use
> cases."
> Sven replied
>     "Again, there definitely are real world use cases ..."
> I would like to revive this suggestion.  I don't understand Raymond's
> comment.  It seems to me that a reasonable implementation is better and more
> friendly than none, particularly for newbies.
> This is the source code for random.choice in Python 3.2 (as far as I know it
> hasn't changed since):
>     def choice(self, seq):
>         """Choose a random element from a non-empty sequence."""
>         try:
>             i = self._randbelow(len(seq))
>         except ValueError:
>             raise IndexError('Cannot choose from an empty sequence')
>         return seq[i]
> And here is my proposed (tested) version:
>     def choice(self, it):
>         """Choose a random element from a non-empty iterable."""
>         try:
>             it[0]
>         except IndexError:
>             raise IndexError('Cannot choose from an empty sequence')
>         except TypeError:
>             it = tuple(it)
>         try:
>             i = self._randbelow(len(it))
>         except ValueError:
>             raise IndexError('Cannot choose from an empty iterable')
>         return it[i]
> This works on (e.g.) a list/tuple/string, a set, a dictionary view or a
> generator.
> Obviously the generator has to be 'fresh' (i.e. any previously consumed
> values will be excluded from the choice) and 'throw-away' (random.choice
> will exhaust it).
> But it means you can write code like
>     random.choice(x for x in xrange(10) if x%3)    # this feels quite
> "Pythonic" to me!
> (I experimented with versions that, when passed an object that supported
> len() but not indexing, viz. a set, iterated through the object as far as
> necessary instead of converting it to a list.  But I found that this was
> slower in practice as well as more complicated.  There might be a memory
> issue with huge iterables, but we are no worse off using existing
> workarounds for those.)
> Downsides of proposed version:
>     (1) Backward compatibility.  Could mask an error if a "wrong" object,
> but one that happens to be an iterable, is passed to random.choice.
>           There is also slightly different behaviour if a dictionary (D) is
> passed (an unusual thing to do):
>                 The current version will get a random integer in the range
> [0, len(D)), try to use that as a key of D,
>                    and raise KeyError if that fails.
>                 The proposed version behaves similarly except that it will
> always raise "KeyError: 0" if 0 is not a key of D.
>                 One could argue that this is an improvement, if it matters
> at all (error reproducibility).
>                 And code that deliberately exploits the existing behaviour,
> provided that it uses a dictionary whose keys are consecutive integers from
> 0 upwards, e.g.
>                     D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 : 'three', 4 :
> 'four' }
>                     d = random.choice(D)    # equivalent, in this example,
> to "d = random.choice(list(D.values()))"
>                     if d == 'zero':
>                         ...
>                 will continue to work (although one might argue that it
> deserves not to).
>     (2) Speed - the proposed version is almost certainly slightly slower
> when called with a sequence.
>          For what it's worth, I tried to measure the difference on my
> several-years-old Windows PC, but is was hard to measure accurately, even
> over millions of iterations.
>          All I can say is that it appeared to be a small fraction of a
> millisecond, possibly in the region of 50 nanoseconds, per call.
> Best wishes
> Rob Cliffe

--Guido van Rossum (

From joejev at  Tue Apr 12 17:55:17 2016
From: joejev at (Joseph Jevnik)
Date: Tue, 12 Apr 2016 17:55:17 -0400
Subject: [Python-ideas] Modifying yield from's return value
In-Reply-To: <>
References: <>
Message-ID: <>

I have written a library `cotoolz` that provides the primitive pieces
needed for this. It implements a `comap` type which is like `map` but
properly forwards `send`, `throw`, and `close` to the underlying coroutine.
There is no special syntax needed, just write:

yield from comap(f, inner_coroutine())

This also supports `cozip` which lets you zip together multiple coroutines,
for example:

yield from cozip(inner_coroutine_a(), inner_coroutine_b(),
inner_coroutine_c(), ...)

This will fan out the sends to all of the coroutines and collect the
results into a single tuple to yield. Just like `zip`, this will be
exhausted when the first coroutine is exhausted.

The library is available as free software on PyPI or here:

On Tue, Apr 12, 2016 at 5:40 PM, Bar Harel <bzvi7919 at> wrote:

> I asked a question in stackoverflow
> <> regarding a way of
> modifying yield from's return value.
> There doesn't seem to be any way of modifying the data yielded by the
> yield from expression without breaking the yield from "pipe". By "pipe" I
> mean the fact that .send() and .throw() pass to the inner generator. Useful
> cases are parsers as I've demonstrated in the question, transcoders,
> encoding and decoding the yielded values without interrupting .send() or
> .throw() and generally coroutines.
> I believe it would be beneficial to many aspects and libraries in Python,
> most notably asyncio.
> I couldn't think of a good syntax though, other than creating a wrapping
> function, lets say in itertools, that creates a class overriding send,
> throw and __next__ and receives the generator and associated modification
> functions (like suggested in one of the answers).
> What do you think? Is it useful? Any suggested syntax?

From bzvi7919 at  Tue Apr 12 18:02:40 2016
From: bzvi7919 at (Bar Harel)
Date: Tue, 12 Apr 2016 22:02:40 +0000
Subject: [Python-ideas] Modifying yield from's return value
In-Reply-To: <>
References: <>
Message-ID: <>

Perhaps something like this should go in the standard library? Or change
the builtin map to forward everything? Changing the builtin map in this
case will be backwards compatible and will overall be beneficial. If
someone has already written a workaround, it would still work, but the
builtins would now do the forwarding themselves.

On Wed, Apr 13, 2016 at 12:55 AM Joseph Jevnik <joejev at> wrote:

> I have written a library `cotoolz` that provides the primitive pieces
> needed for this. It implements a `comap` type which is like `map` but
> properly forwards `send`, `throw`, and `close` to the underlying coroutine.
> There is no special syntax needed, just write:
> ```
> yield from comap(f, inner_coroutine())
> ```
> This also supports `cozip` which lets you zip together multiple
> coroutines, for example:
> ```
> yield from cozip(inner_coroutine_a(), inner_coroutine_b(),
> inner_coroutine_c(), ...)
> ```
> This will fan out the sends to all of the coroutines and collect the
> results into a single tuple to yield. Just like `zip`, this will be
> exhausted when the first coroutine is exhausted.
> The library is available as free software on pypi or on here:

From cognetta.marco at  Tue Apr 12 18:09:40 2016
From: cognetta.marco at (Marco Cognetta)
Date: Tue, 12 Apr 2016 18:09:40 -0400
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

Maybe I am misunderstanding, but for a generator, couldn't you avoid
storing it in memory and just use the following online algorithm to
save space?

(I hope this message goes through correctly, first time participating
in a discussion on this mailing list).


On Tue, Apr 12, 2016 at 5:53 PM, Guido van Rossum <guido at> wrote:
> The problem with your version is that copying the input is slow if it is large.
> Raymond was referencing the top answer here:
> It's also slow though (draws N random values if the input is of length
> N).
> On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe <rob.cliffe at> wrote:
>> It surprised me a bit the first time I realised that random.choice did not
>> work on a set.  (One expects "everything" to "just work" in Python! :-) )
>> There is a simple workaround (convert the set to a tuple or list before
>> passing it) but ...
>> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random module'
>> on Python-Ideas, one of which was to allow random.choice to work on an
>> arbitrary iterable.
>> Raymond Hettinger commented
>>     "It could be generalized to arbitrary iterables (Bentley provides an
>> example of how to do this) but it is fragile (i.e. falls apart badly with
>> weak random number generators) and doesn't correspond well with real use
>> cases."
>> Sven replied
>>     "Again, there definitely are real world use cases ..."
>> I would like to revive this suggestion.  I don't understand Raymond's
>> comment.  It seems to me that a reasonable implementation is better and more
>> friendly than none, particularly for newbies.
>> This is the source code for random.choice in Python 3.2 (as far as I know it
>> hasn't changed since):
>>     def choice(self, seq):
>>         """Choose a random element from a non-empty sequence."""
>>         try:
>>             i = self._randbelow(len(seq))
>>         except ValueError:
>>             raise IndexError('Cannot choose from an empty sequence')
>>         return seq[i]
>> And here is my proposed (tested) version:
>>     def choice(self, it):
>>         """Choose a random element from a non-empty iterable."""
>>         try:
>>             it[0]
>>         except IndexError:
>>             raise IndexError('Cannot choose from an empty sequence')
>>         except TypeError:
>>             it = tuple(it)
>>         try:
>>             i = self._randbelow(len(it))
>>         except ValueError:
>>             raise IndexError('Cannot choose from an empty iterable')
>>         return it[i]
>> This works on (e.g.) a list/tuple/string, a set, a dictionary view or a
>> generator.
>> Obviously the generator has to be 'fresh' (i.e. any previously consumed
>> values will be excluded from the choice) and 'throw-away' (random.choice
>> will exhaust it).
>> But it means you can write code like
>>     random.choice(x for x in xrange(10) if x%3)    # this feels quite
>> "Pythonic" to me!
>> (I experimented with versions that, when passed an object that supported
>> len() but not indexing, viz. a set, iterated through the object as far as
>> necessary instead of converting it to a list.  But I found that this was
>> slower in practice as well as more complicated.  There might be a memory
>> issue with huge iterables, but we are no worse off using existing
>> workarounds for those.)
>> Downsides of proposed version:
>>     (1) Backward compatibility.  Could mask an error if a "wrong" object,
>> but one that happens to be an iterable, is passed to random.choice.
>>           There is also slightly different behaviour if a dictionary (D) is
>> passed (an unusual thing to do):
>>                 The current version will get a random integer in the range
>> [0, len(D)), try to use that as a key of D,
>>                    and raise KeyError if that fails.
>>                 The proposed version behaves similarly except that it will
>> always raise "KeyError: 0" if 0 is not a key of D.
>>                 One could argue that this is an improvement, if it matters
>> at all (error reproducibility).
>>                 And code that deliberately exploits the existing behaviour,
>> provided that it uses a dictionary whose keys are consecutive integers from
>> 0 upwards, e.g.
>>                     D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 : 'three', 4 :
>> 'four' }
>>                     d = random.choice(D)    # equivalent, in this example,
>> to "d = random.choice(list(D.values()))"
>>                     if d == 'zero':
>>                         ...
>>                 will continue to work (although one might argue that it
>> deserves not to).
>>     (2) Speed - the proposed version is almost certainly slightly slower
>> when called with a sequence.
>>          For what it's worth, I tried to measure the difference on my
>> several-years-old Windows PC, but it was hard to measure accurately, even
>> over millions of iterations.
>>          All I can say is that it appeared to be a small fraction of a
>> millisecond, possibly in the region of 50 nanoseconds, per call.
>> Best wishes
>> Rob Cliffe
> --
> --Guido van Rossum (

From ethan at  Tue Apr 12 18:21:11 2016
From: ethan at (Ethan Furman)
Date: Tue, 12 Apr 2016 15:21:11 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/12/2016 02:53 PM, Guido van Rossum wrote:

> The problem with your version is that copying the input is slow if it is large.
> Raymond was referencing the top answer here:
> It's also slow though (draws N random values if the input is of length
> N).

So the objection is that because performance can vary widely it is 
better for users to select their own algorithm rather than rely on a 
one-size-fits-all stdlib solution?


From guido at  Tue Apr 12 18:25:05 2016
From: guido at (Guido van Rossum)
Date: Tue, 12 Apr 2016 15:25:05 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

Ow, sorry! That's the algorithm I meant to reference (the link I
actually gave is for a different situation). See also

It does indeed avoid storing a copy of the sequence -- at the cost of
N calls to random(). Try timing it -- I wouldn't be surprised if
copying a set of N elements into a tuple is a lot faster than N
random() calls.

So we're stuck with two alternatives, neither of which is always the
best: (1) just copy the sequence (like Rob's proposal) -- this loses
big time if the sequence is large and not already held in memory; (2)
use reservoir sampling, at the cost of many more random() calls.
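
To make the two alternatives concrete, here is a rough sketch of both. The
function names and the `rng` parameter are invented for this example and
are not part of the `random` module; the second function is reservoir
sampling with a reservoir of size one.

```python
import random


def choice_by_copy(iterable, rng=random):
    """Alternative (1): copy everything into a tuple, then index into it."""
    items = tuple(iterable)
    if not items:
        raise IndexError('Cannot choose from an empty iterable')
    return rng.choice(items)


def choice_by_reservoir(iterable, rng=random):
    """Alternative (2): keep one candidate; one random draw per item."""
    chosen = None
    count = 0
    for count, item in enumerate(iterable, start=1):
        # Replace the current candidate with probability 1/count.
        if rng.randrange(count) == 0:
            chosen = item
    if count == 0:
        raise IndexError('Cannot choose from an empty iterable')
    return chosen
```

The first version needs memory for all N items but only one random draw;
the second needs constant memory but N draws, which is the trade-off
described above. Timing both on realistic inputs (for example with
`timeit`) is the only way to know which one wins in a given case.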

Since Rob didn't link to the quoted conversation I don't have it handy
to guess what Raymond meant with "Bentley" either, but I'm guessing
Jon Bentley wrote about reservoir sampling (before it was named that).
I recall hearing about the algorithm for the first time in the late
'80s from a coworker with Bell Labs ties.

(Later, Ethan wrote)
> So the objection is that because performance can vary widely it is better for users to select their own algorithm rather than rely on a one-size-fits-all stdlib solution?

That's pretty much it. Were this 1995, I probably would have put some
toy in the stdlib. But these days we should be more careful than that.


On Tue, Apr 12, 2016 at 3:09 PM, Marco Cognetta
<cognetta.marco at> wrote:
> Maybe I am misunderstanding but for a generator, couldn't you avoid
> storing it in memory and just using the following online algorithm to
> save space?
> (I hope this message goes through correctly, first time participating
> in a discussion on this mailing list).
> -Marco

--Guido van Rossum (

From ben+python at  Tue Apr 12 18:57:47 2016
From: ben+python at (Ben Finney)
Date: Wed, 13 Apr 2016 08:57:47 +1000
Subject: [Python-ideas] random.choice on non-sequence
References: <>
Message-ID: <>

Rob Cliffe <rob.cliffe at> writes:

> It surprised me a bit the first time I realised that random.choice did
> not work on a set. (One expects "everything" to "just work" in Python!
> :-) )

I am +1 on the notion that an instance of ‘tuple’, ‘list’, ‘set’, and
even ‘dict’, should each be accepted as input for ‘random.choice’.

> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random
> module' on Python-Ideas, one of which was to allow random.choice to
> work on an arbitrary iterable.

I am -1 on the notion of an arbitrary iterable as ‘random.choice’ input.

Arbitrary iterables may be consumed merely by being iterated. This will
force the caller to make a copy, which may as well be a normal sequence
type, defeating the purpose of accepting that ‘arbitrary iterable’ type.

Arbitrary iterables may never finish iterating. This means the call
to ‘random.choice’ would sometimes never return.

The ‘random.choice’ function should IMO not be responsible for dealing
with those cases.

Instead, a more moderate proposal would be to have ‘random.choice’
accept an arbitrary container.

If the object implements the container protocol (by which I think I mean
that it conforms to ?, it is safe to iterate
and to treat its items as a collection from which to choose a random
item.

Rob, does that proposal satisfy the requirements that motivated this?
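
A sketch of what the container-only variant might look like. The ABC meant
above did not survive in the archive, so this example assumes
`collections.abc.Collection` (sized, iterable, supports `in`; available
since Python 3.6); the function name and the `rng` parameter are made up
for illustration.

```python
import random
from collections.abc import Collection, Sequence


def choice_from_container(container, rng=random):
    """Choose a random item from a sized container; reject bare iterators."""
    if not isinstance(container, Collection):
        # Generators and other one-shot iterables end up here.
        raise TypeError('expected a sized, iterable container')
    if len(container) == 0:
        raise IndexError('Cannot choose from an empty container')
    if isinstance(container, Sequence):
        return rng.choice(container)      # O(1): index directly
    return rng.choice(tuple(container))   # O(N): copy, then index
```

For a `dict` this picks among the keys, since iterating a mapping yields
its keys. The copy in the last line is exactly the O(N) cost that makes a
set or mapping slower than a sequence here.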

 \        “Some people, when confronted with a problem, think ‘I know, |
  `\       I'll use regular expressions’. Now they have two problems.” |
_o__)                           -Jamie Zawinski, in alt.religion.emacs |
Ben Finney

From guido at  Tue Apr 12 19:04:11 2016
From: guido at (Guido van Rossum)
Date: Tue, 12 Apr 2016 16:04:11 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <> <>
Message-ID: <>

This still seems wrong *unless* we add a protocol to select a random
item in O(1) time. There currently isn't one for sets and mappings --
only for sequences. It would be pretty terrible if someone wrote code
to get N items from a set of size N and their code ended up being

On Tue, Apr 12, 2016 at 3:57 PM, Ben Finney <ben+python at> wrote:
> Rob Cliffe <rob.cliffe at> writes:
>> It surprised me a bit the first time I realised that random.choice did
>> not work on a set. (One expects "everything" to "just work" in Python!
>> :-) )
> I am +1 on the notion that an instance of ‘tuple’, ‘list’, ‘set’, and
> even ‘dict’, should each be accepted as input for ‘random.choice’.
>> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random
>> module' on Python-Ideas, one of which was to allow random.choice to
>> work on an arbitrary iterable.
> I am -1 on the notion of an arbitrary iterable as ‘random.choice’ input.
> Arbitrary iterables may be consumed merely by being iterated. This will
> force the caller to make a copy, which may as well be a normal sequence
> type, defeating the purpose of accepting that ‘arbitrary iterable’ type.
> Arbitrary iterables may never finish iterating. This means the call
> to ‘random.choice’ would sometimes never return.
> The ‘random.choice’ function should IMO not be responsible for dealing
> with those cases.
> Instead, a more moderate proposal would be to have ‘random.choice’
> accept an arbitrary container.
> If the object implements the container protocol (by which I think I mean
> that it conforms to ?, it is safe to iterate
> and to treat its items as a collection from which to choose a random
> item.
> Rob, does that proposal satisfy the requirements that motivated this?

--Guido van Rossum (

From mahmoud at  Tue Apr 12 19:05:27 2016
From: mahmoud at (Mahmoud Hashemi)
Date: Tue, 12 Apr 2016 16:05:27 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

Reservoir sampling is super handy. For a Pythonic introduction, along with
links to implementation, see here:

Even though I love the elegance enough to write and illustrate about it,
I'd be very surprised to have found it built into Python at this low of a
level.


On Tue, Apr 12, 2016 at 3:25 PM, Guido van Rossum <guido at> wrote:

> Ow, sorry! That's the algorithm I meant to reference (the link I
> actually gave is for a different situation). See also
> It does indeed avoid storing a copy of the sequence -- at the cost of
> N calls to random(). Try timing it -- I wouldn't be surprised if
> copying a set of N elements into a tuple is a lot faster than N
> random() calls.
> So we're stuck with two alternatives, neither of which is always the
> best: (1) just copy the sequence (like Rob's proposal) -- this loses
> big time if the sequence is large and not already held in memory; (2)
> use reservoir sampling, at the cost of many more random() calls.
> Since Rob didn't link to the quoted conversation I don't have it handy
> to guess what Raymond meant with "Bentley" either, but I'm guessing
> Jon Bentley wrote about reservoir sampling (before it was named that).
> I recall hearing about the algorithm for the first time in the late
> '80s from a coworker with Bell Labs ties.
> (Later, Ethan wrote)
> > So the objection is that because performance can vary widely it is
> better for users to select their own algorithm rather than rely on a
> one-size-fits-all stdlib solution?
> That's pretty much it. Were this 1995, I probably would have put some
> toy in the stdlib. But these days we should be more careful than that.
> --Guido

From joejev at  Tue Apr 12 19:20:12 2016
From: joejev at (Joseph Jevnik)
Date: Tue, 12 Apr 2016 19:20:12 -0400
Subject: [Python-ideas] Modifying yield from's return value
In-Reply-To: <>
References: <>
Message-ID: <>

Changing the builtin map to do this would be questionable because there is
some performance overhead to dispatching to the inner coroutine's `send`
methods. `__next__` is implemented as a tp slot; however, coroutine `send`
is just a normal method that needs to be looked up every time we call next.
While my implementation is not as optimized as it could be, it is about
twice as slow as builtin map (and 4x faster is . I think it would be better
to keep the library separate until there is a lot of demand for this
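
To show where the extra dispatch comes from, here is a rough pure-Python
sketch of a `comap`-like wrapper. It is only an illustration and not the
actual `cotoolz` code; the class name `ComapSketch` and the helpers `echo`
and `tag` are invented for this example.

```python
class ComapSketch:
    """Apply func to every value yielded by coroutine, forwarding
    send/throw/close so that `yield from` keeps its "pipe" behaviour."""

    def __init__(self, func, coroutine):
        self._func = func
        self._coroutine = coroutine

    def __iter__(self):
        return self

    def __next__(self):
        return self._func(next(self._coroutine))

    def send(self, value):
        # An ordinary attribute lookup plus method call on every send;
        # this is the overhead that plain map() does not pay.
        return self._func(self._coroutine.send(value))

    def throw(self, *exc_info):
        return self._func(self._coroutine.throw(*exc_info))

    def close(self):
        self._coroutine.close()


def echo():
    """Toy coroutine: yields back whatever was last sent to it."""
    received = None
    while True:
        received = yield received


def tag(value):
    return ('mapped', value)


def outer():
    yield from ComapSketch(tag, echo())


gen = outer()
primed = next(gen)      # goes through __next__
echoed = gen.send(3)    # yield from forwards this to ComapSketch.send
```

Builtin `map` only implements `__next__`, which is why a value sent
through `yield from map(f, inner())` fails instead of reaching `inner`.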

On Tue, Apr 12, 2016 at 6:02 PM, Bar Harel <bzvi7919 at> wrote:

> Perhaps something like this should go in the standard library? Or change
> the builtin map to forward everything? Changing the builtin map in this
> case will be backwards compatible and will overall be beneficial. If
> someone made a workaround, the workaround would still work, but now the
> builtins will do it per se.

From cognetta.marco at  Tue Apr 12 20:07:15 2016
From: cognetta.marco at (Marco Cognetta)
Date: Tue, 12 Apr 2016 20:07:15 -0400
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

At the risk of making this not very easily accessible, couldn't we use
both approaches (reservoir sampling and just loading the whole thing
into memory) and apply the ski-rental problem? We could come up with
some cost for calling random and use that to determine when we should
just give up and load it into memory if we do not know the length of
the sequence produced by the generator. If the user knows something
about the generator beforehand they could specify in some optional
parameter whether or not they want to use reservoir sampling or just
load it into memory at the start.
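
One possible reading of this hybrid, as a sketch only: sample reservoir
style for the first `threshold` items, then give up and load whatever is
left into memory. The `threshold` parameter stands in for the "cost of
calling random" and is a made-up knob, as are the function name and the
`rng` parameter; no attempt is made here to derive the threshold from a
real cost model.

```python
import random


def hybrid_choice(iterable, threshold=1000, rng=random):
    """Reservoir-sample up to `threshold` items, then copy the remainder."""
    it = iter(iterable)
    chosen = None
    seen = 0
    for item in it:
        seen += 1
        # Size-one reservoir: keep the new item with probability 1/seen.
        if rng.randrange(seen) == 0:
            chosen = item
        if seen >= threshold:
            break
    rest = tuple(it)  # empty if the iterable ended before the threshold
    total = seen + len(rest)
    if total == 0:
        raise IndexError('Cannot choose from an empty iterable')
    # Keep the reservoir pick with probability seen/total, otherwise pick
    # uniformly from the copied remainder; every item ends up at 1/total.
    if rng.randrange(total) < seen:
        return chosen
    return rng.choice(rest)
```

Short inputs never get copied, and long inputs cost at most `threshold`
draws in the loop plus one or two more for the final pick.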

The theory side of me likes this, but I admit it's a little ugly to be
included in Python's stdlib.


On Tue, Apr 12, 2016 at 7:05 PM, Mahmoud Hashemi <mahmoud at> wrote:
> Reservoir sampling is super handy. For a Pythonic introduction, along with
> links to implementation, see here:
> Even though I love the elegance enough to write and illustrate about it, I'd
> be very surprised to have found it built into Python at this low of a level.
> Mahmoud
> On Tue, Apr 12, 2016 at 3:25 PM, Guido van Rossum <guido at> wrote:
>> Ow, sorry! That's the algorithm I meant to reference (the link I
>> actually gave is for a different situation). See also
>> It does indeed avoid storing a copy of the sequence -- at the cost of
>> N calls to random(). Try timing it -- I wouldn't be surprised if
>> copying a set of N elements into a tuple is a lot faster than N
>> random() calls.
>> So we're stuck with two alternatives, neither of which is always the
>> best: (1) just copy the sequence (like Rob's proposal) -- this loses
>> big time if the sequence is large and not already held in memory; (2)
>> use reservoir sampling, at the cost of many more random() calls.
>> Since Rob didn't link to the quoted conversation I don't have it handy
>> to guess what Raymond meant with "Bentley" either, but I'm guessing
>> Jon Bentley wrote about reservoir sampling (before it was named that).
>> I recall hearing about the algorithm for the first time in the late
>> '80s from a coworker with Bell Labs ties.
>> (Later, Ethan wrote)
>> > So the objection is that because performance can vary widely it is
>> > better for users to select their own algorithm rather than rely on a
>> > one-size-fits-all stdlib solution?
>> That's pretty much it. Were this 1995, I probably would have put some
>> toy in the stdlib. But these days we should be more careful than that.
>> --Guido
>> On Tue, Apr 12, 2016 at 3:09 PM, Marco Cognetta
>> <cognetta.marco at> wrote:
>> > Maybe I am misunderstanding but for a generator, couldn't you avoid
>> > storing it in memory and just using the following online algorithm to
>> > save space?
>> >
>> > (I hope this message goes through correctly, first time participating
>> > in a discussion on this mailing list).
>> >
>> > -Marco
>> >
>> > On Tue, Apr 12, 2016 at 5:53 PM, Guido van Rossum <guido at>
>> > wrote:
>> >> The problem with your version is that copying the input is slow if it
>> >> is large.
>> >>
>> >> Raymond was referencing the top answer here:
>> >>
>> >>
>> >> It's also slow though (draws N random values if the input is of length
>> >> N).
>> >>
>> >> On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe <rob.cliffe at>
>> >> wrote:
>> >>> It surprised me a bit the first time I realised that random.choice did
>> >>> not
>> >>> work on a set.  (One expects "everything" to "just work" in Python!
>> >>> :-) )
>> >>> There is a simple workaround (convert the set to a tuple or list
>> >>> before
>> >>> passing it) but ...
>> >>>
>> >>> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random
>> >>> module'
>> >>> on Python-Ideas, one of which was to allow random.choice to work on an
>> >>> arbitrary iterable.
>> >>> Raymond Hettinger commented
>> >>>     "It could be generalized to arbitrary iterables (Bentley provides
>> >>> an
>> >>> example of how to do this) but it is fragile (i.e. falls apart badly
>> >>> with
>> >>> weak random number generators) and doesn't correspond well with real
>> >>> use
>> >>> cases."
>> >>> Sven replied
>> >>>     "Again, there definitely are real world use cases ..."
>> >>>
>> >>> I would like to revive this suggestion.  I don't understand Raymond's
>> >>> comment.  It seems to me that a reasonable implementation is better
>> >>> and more
>> >>> friendly than none, particularly for newbies.
>> >>> This is the source code for random.choice in Python 3.2 (as far as I
>> >>> know it
>> >>> hasn't changed since):
>> >>>
>> >>>
>> >>>     def choice(self, seq):
>> >>>         """Choose a random element from a non-empty sequence."""
>> >>>         try:
>> >>>             i = self._randbelow(len(seq))
>> >>>         except ValueError:
>> >>>             raise IndexError('Cannot choose from an empty sequence')
>> >>>         return seq[i]
>> >>>
>> >>>
>> >>> And here is my proposed (tested) version:
>> >>>
>> >>>
>> >>>     def choice(self, it):
>> >>>         """Choose a random element from a non-empty iterable."""
>> >>>         try:
>> >>>             it[0]
>> >>>         except IndexError:
>> >>>             raise IndexError('Cannot choose from an empty sequence')
>> >>>         except TypeError:
>> >>>             it = tuple(it)
>> >>>         try:
>> >>>             i = self._randbelow(len(it))
>> >>>         except ValueError:
>> >>>             raise IndexError('Cannot choose from an empty iterable')
>> >>>         return it[i]
>> >>>
>> >>>
>> >>> This works on (e.g.) a list/tuple/string, a set, a dictionary view or
>> >>> a
>> >>> generator.
>> >>> Obviously the generator has to be 'fresh' (i.e. any previously
>> >>> consumed
>> >>> values will be excluded from the choice) and 'throw-away'
>> >>> (random.choice
>> >>> will exhaust it).
>> >>> But it means you can write code like
>> >>>     random.choice(x for x in xrange(10) if x%3)    # this feels quite
>> >>> "Pythonic" to me!
>> >>>
>> >>> (I experimented with versions that, when passed an object that
>> >>> supported
>> >>> len() but not indexing, viz. a set, iterated through the object as far
>> >>> as
>> >>> necessary instead of converting it to a list.  But I found that this
>> >>> was
>> >>> slower in practice as well as more complicated.  There might be a
>> >>> memory
>> >>> issue with huge iterables, but we are no worse off using existing
>> >>> workarounds for those.)
>> >>>
>> >>> Downsides of proposed version:
>> >>>     (1) Backward compatibility.  Could mask an error if a "wrong"
>> >>> object,
>> >>> but one that happens to be an iterable, is passed to random.choice.
>> >>>           There is also slightly different behaviour if a dictionary
>> >>> (D) is
>> >>> passed (an unusual thing to do):
>> >>>                 The current version will get a random integer in the
>> >>> range
>> >>> [0, len(D)), try to use that as a key of D,
>> >>>                    and raise KeyError if that fails.
>> >>>                 The proposed version behaves similarly except that it
>> >>> will
>> >>> always raise "KeyError: 0" if 0 is not a key of D.
>> >>>                 One could argue that this is an improvement, if it
>> >>> matters
>> >>> at all (error reproducibility).
>> >>>                 And code that deliberately exploits the existing
>> >>> behaviour,
>> >>> provided that it uses a dictionary whose keys are consecutive integers
>> >>> from
>> >>> 0 upwards, e.g.
>> >>>                     D = { 0 : 'zero', 1 : 'one', 2 : 'two', 3 :
>> >>> 'three', 4 :
>> >>> 'four' }
>> >>>                     d = random.choice(D)    # equivalent, in this
>> >>> example,
>> >>> to "d = random.choice(list(D.values()))"
>> >>>                     if d == 'zero':
>> >>>                         ...
>> >>>                 will continue to work (although one might argue that
>> >>> it
>> >>> deserves not to).
>> >>>     (2) Speed - the proposed version is almost certainly slightly
>> >>> slower
>> >>> when called with a sequence.
>> >>>          For what it's worth, I tried to measure the difference on my
>> >>> several-years-old Windows PC, but it was hard to measure accurately,
>> >>> even
>> >>> over millions of iterations.
>> >>>          All I can say is that it appeared to be a small fraction of a
>> >>> millisecond, possibly in the region of 50 nanoseconds, per call.
>> >>>
>> >>> Best wishes
>> >>> Rob Cliffe

From greg.ewing at  Tue Apr 12 21:42:41 2016
From: greg.ewing at (Greg Ewing)
Date: Wed, 13 Apr 2016 13:42:41 +1200
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum wrote:
> The problem with your version is that copying the input is slow if it is large.

Maybe sets should have a method that returns an indexable
view? The order could be defined as equivalent to iteration
order, and it would allow things like random.choice to
work efficiently on sets.


From guido at  Tue Apr 12 21:59:10 2016
From: guido at (Guido van Rossum)
Date: Tue, 12 Apr 2016 18:59:10 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 12, 2016 at 6:42 PM, Greg Ewing <greg.ewing at> wrote:
> Maybe sets should have a method that returns an indexable
> view? The order could be defined as equivalent to iteration
> order, and it would allow things like random.choice to
> work efficiently on sets.

How would you implement it though? There are gaps in the hash table.

--Guido van Rossum (

From rosuav at  Tue Apr 12 22:02:43 2016
From: rosuav at (Chris Angelico)
Date: Wed, 13 Apr 2016 12:02:43 +1000
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Apr 13, 2016 at 7:40 AM, Rob Cliffe <rob.cliffe at> wrote:
> And here is my proposed (tested) version:
>     def choice(self, it):
>         """Choose a random element from a non-empty iterable."""
>         try:
>             it[0]
>         except IndexError:
>             raise IndexError('Cannot choose from an empty sequence')
>         except TypeError:
>             it = tuple(it)
>         try:
>             i = self._randbelow(len(it))
>         except ValueError:
>             raise IndexError('Cannot choose from an empty iterable')
>         return it[i]
> This works on (e.g.) a list/tuple/string, a set, a dictionary view or a
> generator.
> Obviously the generator has to be 'fresh' (i.e. any previously consumed
> values will be excluded from the choice) and 'throw-away' (random.choice
> will exhaust it).
> But it means you can write code like
>     random.choice(x for x in xrange(10) if x%3)    # this feels quite
> "Pythonic" to me!

Small point of order: Pretty much everything discussed here on
python-ideas is about Python 3. It's best to make sure your code works
with the latest CPython (currently 3.5), as a change like this would
be landing in 3.6 at the earliest. So what I'd be looking at is this:

random.choice(x for x in range(10) if x%3)

AFAIK this doesn't change your point at all, but it is worth being
careful of; Python 3's range object isn't quite the same as Python 2's
xrange, and it's possible something might "just work". (For the
inverse case, "if x%3 == 0", you can simply use
random.choice(range(0, 10, 3)) to do what you want.)

I don't like the proposed acceptance of arbitrary iterables. In the
extremely rare case where you actually do want that, you can always
manually wrap it in list() or tuple(). But your original use-case does
have merit:

> It surprised me a bit the first time I realised that random.choice did not work on a set.

A set has a length, but it can't be indexed. It should be perfectly
reasonable to ask for a random element out of a set! So here's my
counter-proposal: Make random.choice accept any iterable with a
__len__.

    def choice(self, coll):
        """Choose a random element from a non-empty collection."""
        try:
            i = self._randbelow(len(coll))
        except ValueError:
            raise IndexError('Cannot choose from an empty collection')
        try:
            return coll[i]
        except TypeError:
            for _, value in zip(range(i+1), coll):
                pass
            return value

Feel free to bikeshed the method of iterating part way into a
collection (enumerate() and a comparison? call iter, then call next
that many times, then return next(it)?), but the basic concept is that
you have to have a length and then you iterate into it that far.

It's still not arbitrary iterables, but it'll handle sets and dict
views. Handling dicts directly may be a little tricky; since they
support subscripting, they'll either raise KeyError or silently return
a value, where the iteration-based return value would be a key.
Breaking this out into its own function would be reliable there (take
out the try/except and just go straight into iteration).
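
The broken-out function described above might look like the following sketch. The name `choice_by_iteration` is hypothetical, and `itertools.islice` is just one of the bikeshed options for stepping part way into a collection. It needs only `len()` and iteration, so it costs O(N) on every call.

```python
import random
from itertools import islice


def choice_by_iteration(coll, rng=random):
    """Choose a random element from a non-empty sized iterable.

    Hypothetical helper: never subscripts `coll`, so a dict yields a
    random key rather than risking KeyError.
    """
    n = len(coll)
    if n == 0:
        raise IndexError('Cannot choose from an empty collection')
    i = rng.randrange(n)
    # Skip the first i items and return the next one.
    return next(islice(coll, i, None))
```

Because it goes straight to iteration, sets, dict views and dicts all behave the same way.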


From tim.peters at  Tue Apr 12 22:07:43 2016
From: tim.peters at (Tim Peters)
Date: Tue, 12 Apr 2016 21:07:43 -0500
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

[Greg Ewing]
> Maybe sets should have a method that returns an indexable
> view? The order could be defined as equivalent to iteration
> order, and it would allow things like random.choice to
> work efficiently on sets.

Well, sets and dicts are implemented very similarly in CPython.  Note
that a dict view (whether of keys, values or items) doesn't support
indexing!  It's unclear how that could be added efficiently, and the
same applies to sets.  Iteration works pretty efficiently, because a
hidden "search finger" is maintained internally, skipping over the
gaps in the hash table as needed (so iteration can, internally, make a
number of probes equal to the number of hash slots, and no more than
that) - but that's no help for indexing by a random integer.

I'll just add that I've never had a real need for a random selection
from a set.  For all the many set algorithms I've coded, an
"arbitrary" element worked just as well as a "random" element, and
set.pop() works fine for that.

From guido at  Tue Apr 12 22:11:38 2016
From: guido at (Guido van Rossum)
Date: Tue, 12 Apr 2016 19:11:38 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

This is no good. Who wants a choice() that is O(1) on sequences but
degrades to O(N) if the argument is a set?

I see the current behavior as a feature: it works when the argument is
indexable, and when it isn't, it fails cleanly, prompting the user to
use their brain and decide what's right. Maybe you're programming a
toy example. Then you can just call random.choice(list(x)) and move
on. Or maybe you're trying to do something serious -- then it behooves
you to copy the set into a list variable once and then repeatedly
choose from that list, or maybe you should have put the data in a list
in the first place.

But putting the toy solution in the stdlib is a bad idea, and so is
putting a bad (O(N)) algorithm there.

So the status quo wins for a reason!
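
A minimal sketch of the "copy once, then choose repeatedly" pattern described above. The set contents and variable names are made up for illustration.

```python
import random

# Example data; any set would do.
colours = {"red", "green", "blue", "cyan"}

# Pay the O(N) copy exactly once...
pool = list(colours)

# ...then every subsequent draw is O(1).
draws = [random.choice(pool) for _ in range(10)]
```

Calling `random.choice(list(colours))` inside the loop instead would redo the O(N) copy on every draw.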


On Tue, Apr 12, 2016 at 7:02 PM, Chris Angelico <rosuav at> wrote:
> On Wed, Apr 13, 2016 at 7:40 AM, Rob Cliffe <rob.cliffe at> wrote:
>> And here is my proposed (tested) version:
>>     def choice(self, it):
>>         """Choose a random element from a non-empty iterable."""
>>         try:
>>             it[0]
>>         except IndexError:
>>             raise IndexError('Cannot choose from an empty sequence')
>>         except TypeError:
>>             it = tuple(it)
>>         try:
>>             i = self._randbelow(len(it))
>>         except ValueError:
>>             raise IndexError('Cannot choose from an empty iterable')
>>         return it[i]
>> This works on (e.g.) a list/tuple/string, a set, a dictionary view or a
>> generator.
>> Obviously the generator has to be 'fresh' (i.e. any previously consumed
>> values will be excluded from the choice) and 'throw-away' (random.choice
>> will exhaust it).
>> But it means you can write code like
>>     random.choice(x for x in xrange(10) if x%3)    # this feels quite
>> "Pythonic" to me!
> Small point of order: Pretty much everything discussed here on
> python-ideas is about Python 3. It's best to make sure your code works
> with the latest CPython (currently 3.5), as a change like this would
> be landing in 3.6 at the earliest. So what I'd be looking at is this:
> random.choice(x for x in range(10) if x%3)
> AFAIK this doesn't change your point at all, but it is worth being
> careful of; Python 3's range object isn't quite the same as Python 2's
> xrange, and it's possible something might "just work". (For the
> inverse case, "if x%3 == 0", you can simply use
> random.choice(range(0, 10, 3)) to do what you want.)
> I don't like the proposed acceptance of arbitrary iterables. In the
> extremely rare case where you actually do want that, you can always
> manually wrap it in list() or tuple(). But your original use-case does
> have merit:
>> It surprised me a bit the first time I realised that random.choice did not work on a set.
> A set has a length, but it can't be indexed. It should be perfectly
> reasonable to ask for a random element out of a set! So here's my
> counter-proposal: Make random.choice accept any iterable with a
> __len__.
>     def choice(self, coll):
>         """Choose a random element from a non-empty collection."""
>         try:
>             i = self._randbelow(len(coll))
>         except ValueError:
>             raise IndexError('Cannot choose from an empty collection')
>         try:
>             return coll[i]
>         except TypeError:
>             for _, value in zip(range(i+1), coll):
>                 pass
>             return value
> Feel free to bikeshed the method of iterating part way into a
> collection (enumerate() and a comparison? call iter, then call next
> that many times, then return next(it)?), but the basic concept is that
> you have to have a length and then you iterate into it that far.
> It's still not arbitrary iterables, but it'll handle sets and dict
> views. Handling dicts directly may be a little tricky; since they
> support subscripting, they'll either raise KeyError or silently return
> a value, where the iteration-based return value would be a key.
> Breaking this out into its own function would be reliable there (take
> out the try/except and just go straight into iteration).
> ChrisA

--Guido van Rossum (

From rosuav at  Tue Apr 12 22:15:28 2016
From: rosuav at (Chris Angelico)
Date: Wed, 13 Apr 2016 12:15:28 +1000
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Apr 13, 2016 at 12:11 PM, Guido van Rossum <guido at> wrote:
> This is no good. Who wants a choice() that is O(1) on sequences but
> degrades to O(N) if the argument is a set?

Fair point. Suggestion withdrawn.

If you want something that does the iteration-till-it-finds-something,
that should be a separate function. (And then it'd work automatically
on dicts, too.)


From chris.barker at  Wed Apr 13 00:52:31 2016
From: chris.barker at (Chris Barker - NOAA Federal)
Date: Tue, 12 Apr 2016 21:52:31 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <-176815014396412028@unknownmsgid>

I also haven't had a use case for random items  from a set, but have
had a use case for randomly ( not arbitrarily ) choosing from a dict
(see trigrams:

As dicts and sets share implementation, why not both?

Anyway, the hash table has gaps, but wouldn't it be sufficiently
random to pick a random index, and if it's a gap, pick another one? I
suppose in theory, this could be an infinite process, but in practice,
it would be O(1) with a constant of two or three... Better than
iterating through on average half of the keys.

( how many gaps are there? I have no idea )

I know nothing about the implementation, and have not thought
carefully about the statistics, but it seems do-able. I'll just shut
up if I'm way off base.

BTW, isn't it impossible to randomly select from an infinite iterable anyway?


> On Apr 12, 2016, at 6:43 PM, Greg Ewing <greg.ewing at> wrote:
> Guido van Rossum wrote:
>> The problem with your version is that copying the input is slow if it is large.
> Maybe sets should have a method that returns an indexable
> view? The order could be defined as equivalent to iteration
> order, and it would allow things like random.choice to
> work efficiently on sets.
> --
> Greg

From rob.cliffe at  Wed Apr 13 00:09:30 2016
From: rob.cliffe at (Rob Cliffe)
Date: Wed, 13 Apr 2016 05:09:30 +0100
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <> <>
Message-ID: <>

Thanks to everyone for all the feedback so far.

Interesting.  I had not heard of reservoir sampling, and it had not 
occurred to me that one could select a uniformly random element from a 
sequence of unknown length without copying it.

I am with Ben in that I consider the most important type for 
random.choice to accept, that it doesn't already, is "set".  If just 
that could be achieved, yes Ben, I think that would be a plus.
(The fact that I could write a version that handled generators 
(efficiently or not) just seemed to be a bonus.)
ISTM however that there is a problem accepting dict, because of the 
ambiguity - does it mean a random key, a random value, or a random 
(key,value) item?
(I would pick a random key (consistent with "for k in MyDict"), but ISTM 
"explicit is better than implicit".)

There are problems with arbitrary iterables as Ben points out.  But I 
don't see these as an objection to making random.choice accept them, because
       (1) It is easy to produce examples of stdlib functions that can 
be crashed with infinite iterables, e.g.
                 >>> def forever():
                 ...        while 1: yield 1
                 >>> itertools.product(forever(), forever())
                 Traceback (most recent call last):
                   File "<stdin>", line 1, in <module>
         (2) As I indicated in my original post, it is understood that 
generators are "throw-away", so may need to be copied.
'Consenting adults', no?

One small point: Whenever I've tested it (quite a few times, for 
different projects), I've always found that it was faster to copy to a 
tuple than to a list.

I did a little experimenting (admittedly non-rigorous, on one platform 
only, and using Python 2.7.10, not Python 3, and using code which could 
very possibly be improved) on selecting a random element from a 
generator, and found that
     for small or moderate generators, reservoir sampling was almost 
always slower than generating a tuple
     as the generator length increased up to roughly 10,000, the ratio
             (time taken by reservoir)  / (time taken by tuple)
         increased, reaching a maximum of over 4
     as the generator length increased further, the ratio started to 
decrease, although for a length of 80 million (about as large as I could 
test) it was still over 3.
(This would change if the reservoir approach could be recoded in a way 
that was amazingly faster than the way I did it, but, arrogance aside, I 
doubt that.  I will supply details of my code on request.)
I think this is a tribute to the efficiency of tuple building.
It's also enough to convince me that reservoir sampling, or Marco 
Cognetta's compromise approach, are non-starters.
More rigorous testing would be necessary to convince the rest of the world.

I am also pretty well convinced from actual tests (not from theoretical 
reasoning) that:
     the "convert it to a tuple" recipe is not far off O(N), both for 
sets and generators (it gets worse for generators that produce objects 
with large memory footprints), and is certainly fast enough to be useful 
(the point at which it gets unacceptably slow is generally not far from 
the point where you run out of memory).
I did try an approach similar to Chris Angelico's of iterating over sets 
up to the selection point (my bikeshed actually used enumerate rather 
than zip), but it was much slower than the "tuple-ise" algorithm.

On 13/04/2016 00:04, Guido van Rossum wrote:
> This still seems wrong *unless* we add a protocol to select a random
> item in O(1) time. There currently isn't one for sets and mappings --
> only for sequences.
Hm, would this need a change to the structure of dicts and sets?
Is there by any chance a quick way of getting the Nth item added in an 
>   It would be pretty terrible if someone wrote code
> to get N items from a set of size N and their code ended up being
> O(N**2).
Er, I'm not quite sure what you're saying here.
random.sample allows you to select k items from a set of size N (it 
converts the set to a tuple).
If k=N, i.e. you're trying to select the whole set, that's a silly thing 
to do if you know you're doing it.
But even in this case, random.sample is approximately O(N); I tested it.
If someone writes their own inefficient version of random.sample, that's 
their problem.
Have I misunderstood?  Do you mean selecting N items with replacement?

Hm, it occurs to me that random.sample itself could be extended to 
choose k unique random elements from a general iterable.
I have seen a reservoir sampling algorithm that claims to be O(N):
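
For reference, a minimal sketch of one well-known O(N) scheme, usually called Algorithm R. It is not necessarily the algorithm behind the claim above, and `reservoir_sample` is a hypothetical name, not a proposed API.

```python
import random


def reservoir_sample(iterable, k, rng=random):
    """Return k distinct items chosen uniformly from an iterable.

    Hypothetical helper: one pass, O(k) memory, one random draw per
    item after the first k.
    """
    reservoir = []
    for n, item in enumerate(iterable, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item n replaces a random slot with probability k/n.
            j = rng.randrange(n)
            if j < k:
                reservoir[j] = item
    if len(reservoir) < k:
        raise ValueError('Sample larger than population')
    return reservoir
```

Unlike random.sample, the result is not in random order: the first k items stay in their original slots unless they are replaced.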


  On Tue, Apr 12, 2016 at 3:57 PM, Ben Finney 
<ben+python at> wrote:
>> Rob Cliffe <rob.cliffe at> writes:
>>> It surprised me a bit the first time I realised that random.choice did
>>> not work on a set. (One expects "everything" to "just work" in Python!
>>> :-) )
>> I am +1 on the notion that an instance of 'tuple', 'list', 'set', and
>> even 'dict', should each be accepted as input for 'random.choice'.
>>> On 2011-06-23, Sven Marnach posted 'A few suggestions for the random
>>> module' on Python-Ideas, one of which was to allow random.choice to
>>> work on an arbitrary iterable.
>> I am -1 on the notion of an arbitrary iterable as 'random.choice' input.
>> Arbitrary iterables may be consumed merely by being iterated. This will
>> force the caller to make a copy, which may as well be a normal sequence
>> type, defeating the purpose of accepting that 'arbitrary iterable' type.
>> Arbitrary iterables may never finish iterating. This means the call
>> to 'random.choice' would sometimes never return.
>> The 'random.choice' function should IMO not be responsible for dealing
>> with those cases.
>> Instead, a more moderate proposal would be to have 'random.choice'
>> accept an arbitrary container.
>> If the object implements the container protocol (by which I think I mean
>> that it conforms to ?, it is safe to iterate
>> and to treat its items as a collection from which to choose a random
>> item.
>> Rob, does that proposal satisfy the requirements that motivated this?
>> --
>>   \        "Some people, when confronted with a problem, think 'I know, |
>>    `\       I'll use regular expressions'. Now they have two problems." |
>> _o__)                           -Jamie Zawinski, in alt.religion.emacs |
>> Ben Finney

From tim.peters at  Wed Apr 13 01:08:47 2016
From: tim.peters at (Tim Peters)
Date: Wed, 13 Apr 2016 00:08:47 -0500
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <-176815014396412028@unknownmsgid>
References: <>
 <> <-176815014396412028@unknownmsgid>
Message-ID: <>

[Chris Barker - NOAA Federal <chris.barker at>]
> ...
> Anyway, the hash table has gaps, but wouldn't it be sufficiently
> random to pick a random index, and if it's a gap, pick another one?

That would be a correct implementation, but ...

> I suppose in theory, this could be an infinite process, but in practice,
> it would be O(1) with a constant of two or three... Better than
> iterating through on average half of the keys.

There's an upper limit on how dense a CPython dict or set can become
(the load factor doesn't exceed 2/3), but no lower limit.  For
example, it's easy to end up with a dict holding a single entry hiding
among millions of empty slots (dicts are never resized on key
deletion, only on key insertion).
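
The no-shrink-on-deletion behaviour can be observed roughly as follows. This is a sketch that assumes CPython, where `sys.getsizeof` reports the memory held by the dict's table; exact byte counts vary by version, so none are shown.

```python
import sys

# Build a large dict, then delete every key but one.
d = dict.fromkeys(range(100_000))
size_full = sys.getsizeof(d)

for key in range(1, 100_000):
    del d[key]
size_sparse = sys.getsizeof(d)

# Rebuilding copies the single surviving entry into a fresh, small table.
size_rebuilt = sys.getsizeof(dict(d))

print(len(d), size_full, size_sparse, size_rebuilt)
```

On CPython the second and third numbers should match, which is why a "pick a random slot and retry on a gap" scheme can need an unbounded number of retries.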

> ...
> BTW, isn't it impossible to randomly select from an infinite iterable anyway?

Of course, but it is possible to do uniform random selection, in one
pass using constant space and in linear time, from an iterable whose
length isn't known in advance (simple case of "reservoir sampling").
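
A minimal sketch of that simple case, for illustration only. The name `reservoir_choice` is hypothetical; the technique is the standard "keep the n-th item with probability 1/n" rule.

```python
import random


def reservoir_choice(iterable, rng=random):
    """Pick one item uniformly at random in a single pass.

    Hypothetical helper: constant space, linear time, one random draw
    per item.
    """
    chosen = None
    count = 0
    for count, item in enumerate(iterable, start=1):
        # Replace the current candidate with probability 1/count.
        if rng.randrange(count) == 0:
            chosen = item
    if count == 0:
        raise IndexError('Cannot choose from an empty iterable')
    return chosen
```

Each item ends up selected with probability 1/N, at the cost of N calls to the random number generator.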

From grant.jenks at  Wed Apr 13 01:20:07 2016
From: grant.jenks at (Grant Jenks)
Date: Tue, 12 Apr 2016 22:20:07 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe <rob.cliffe at> wrote:
> It surprised me a bit the first time I realised that random.choice did not
> work on a set.  (One expects "everything" to "just work" in Python! :-) )

As you point out in your proposed solution, the problem is `random.choice`
expects an index-able value. You can easily create an index-able set using
the sortedcontainers.SortedSet interface even if the elements aren't
comparable. Simply use:

from sortedcontainers import SortedSet

def zero(value):
    return 0

class IndexableSet(SortedSet):
    def __init__(self, *args, **kwargs):
        super(IndexableSet, self).__init__(*args, key=zero, **kwargs)

values = IndexableSet(range(100))

import random

random.choice(values)

`__contains__`, `__iter__`, `len`, `add`, `__getitem__` and `__delitem__` will
all be fast. But `discard` will take potentially linear time. So:

index = random.randrange(len(values))
value = values[index]
del values[index]

will be much faster than:

value = random.choice(values)
values.discard(value)

Otherwise `IndexableSet` will work as a drop-in replacement for
your `set` needs. Read more about the SortedContainers project at


From jsbueno at  Wed Apr 13 01:29:28 2016
From: jsbueno at (Joao S. O. Bueno)
Date: Wed, 13 Apr 2016 02:29:28 -0300
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

Maybe, instead of all of this, since as others have put it there is hardly
a "one size fits all" solution, the nice thing to have would be a "choice"
or "randompop" method for sets.

Sets are the one natural kind of object from which it would seem
normal to be able to pick a random element (or pop one), since they
have no order to start with, and the one with which it is frustrating
to realize that "random.choice" fails.

So either "random.set_choice" and "random.set_pop" functions in
random, or "choice" and "pop_random" on the "set" interface itself,
could be more practical in terms of real-world usage than speculating
about a way for choice to work on iterables of unknown size, among
other objects.
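
As a rough sketch of what such helpers could look like in pure Python: the names `set_choice` and `set_pop` come from the suggestion above and do not exist in the random module, and the copy-to-tuple approach is an assumption made for brevity.

```python
import random


def set_choice(s, rng=random):
    """Return a random element of a non-empty set (O(N) copy per call)."""
    if not s:
        raise IndexError('Cannot choose from an empty set')
    return rng.choice(tuple(s))


def set_pop(s, rng=random):
    """Remove and return a random element of a non-empty set."""
    value = set_choice(s, rng)
    s.remove(value)
    return value
```

A built-in version would only be worth having if it could avoid the O(N) copy that this sketch pays on every call.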

On 13 April 2016 at 02:20, Grant Jenks <grant.jenks at> wrote:
> On Tue, Apr 12, 2016 at 2:40 PM, Rob Cliffe <rob.cliffe at> wrote:
>> It surprised me a bit the first time I realised that random.choice did not
>> work on a set.  (One expects "everything" to "just work" in Python! :-) )
> As you point out in your proposed solution, the problem is `random.choice`
> expects an index-able value. You can easily create an index-able set using
> the sortedcontainers.SortedSet interface even if the elements aren't
> comparable. Simply use:
> ```python
> from sortedcontainers import SortedSet
> def zero(value):
>     return 0
> class IndexableSet(SortedSet):
>     def __init__(self, *args, **kwargs):
>         super(IndexableSet, self).__init__(*args, key=zero, **kwargs)
> values = IndexableSet(range(100))
> import random
> random.choice(values)
> ```
> `__contains__`, `__iter__`, `len`, `add`, `__getitem__` and `__delitem__` will
> all be fast. But `discard` will take potentially linear time. So:
> ```
> index = random.randrange(len(values))
> value = values[index]
> del values[index]
> ```
> will be much faster than:
> ```
> value = random.choice(values)
> values.discard(value)
> ```
> Otherwise `IndexableSet` will work as a drop-in replacement for
> your `set` needs. Read more about the SortedContainers project at
> Grant

From p.f.moore at  Wed Apr 13 04:32:47 2016
From: p.f.moore at (Paul Moore)
Date: Wed, 13 Apr 2016 09:32:47 +0100
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

On 13 April 2016 at 06:29, Joao S. O. Bueno <jsbueno at> wrote:
> Maybe, instead of all of this, since as put by others, hardly there would be a
> "one size fits all" - the nice thing to have would be a "choice" or
> "randompop"  method for sets.

Given that your choice method for sets would be little more than

    value = random.choice(tuple(s))

it seems like it's probably overkill to build it into Python -
particularly as doing so restricts the implementation to Python 3.6+
whereas writing it out by hand works for any version. It also makes
the complexity of the operation explicit, which is a good thing.
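
For illustration, the by-hand version could be wrapped up roughly as
below. This is only a sketch: the helper names set_choice and
set_pop_random are made up for this example and are not part of the
random module or the set type.

```python
import random


def set_choice(s):
    """Return a uniformly chosen element of the set s.

    The set is copied into a tuple first, so each call costs O(len(s)).
    """
    return random.choice(tuple(s))


def set_pop_random(s):
    """Remove and return a uniformly chosen element of the set s."""
    value = set_choice(s)
    s.remove(value)
    return value


colours = {"red", "green", "blue"}
picked = set_pop_random(colours)
```

Writing the tuple() call out keeps the linear cost visible at the call
site, which is the point made above.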


From rob.cliffe at  Wed Apr 13 05:47:53 2016
From: rob.cliffe at (Rob Cliffe)
Date: Wed, 13 Apr 2016 10:47:53 +0100
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

On 13/04/2016 09:32, Paul Moore wrote:
> On 13 April 2016 at 06:29, Joao S. O. Bueno<jsbueno at>  wrote:
>> Maybe, instead of all of this, since as put by others, hardly there would be a
>> "one size fits all" - the nice thing to have would be a "choice" or
>> "randompop"  method for sets.
> Given that your choice method for sets would be little more than
>      value = random.choice(tuple(s))
> it seems like it's probably overkill to build it into Python -
> particularly as doing so restricts the implementation to Python 3.6+
> whereas writing it out by hand works for any version. It also makes
> the complexity of the operation explicit, which is a good thing.
> Paul
Isn't there an inconsistency that random.sample caters to a set by 
converting it to a tuple, but random.choice doesn't?
Surely random.sample(someSet, k) will always be slower than a 
"tuple-ising" random.choice(someSet) implementation?
And it wouldn't prevent you from converting your set (or other iterable) 
to a sequence explicitly.

From tjreedy at  Wed Apr 13 06:36:04 2016
From: tjreedy at (Terry Reedy)
Date: Wed, 13 Apr 2016 06:36:04 -0400
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <-176815014396412028@unknownmsgid>
References: <>
 <> <-176815014396412028@unknownmsgid>
Message-ID: <nel7fd$7fq$>

On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote:

> BTW, isn't it impossible to randomly select from an infinite iterable anyway?

With equal probability, yes, impossible.
With skewed probabilities, no, possible.

Terry Jan Reedy

From rosuav at  Wed Apr 13 06:46:08 2016
From: rosuav at (Chris Angelico)
Date: Wed, 13 Apr 2016 20:46:08 +1000
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <nel7fd$7fq$>
References: <>
 <-176815014396412028@unknownmsgid> <nel7fd$7fq$>
Message-ID: <>

On Wed, Apr 13, 2016 at 8:36 PM, Terry Reedy <tjreedy at> wrote:
> On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote:
>> BTW, isn't it impossible to randomly select from an infinite iterable
>> anyway?
> With equal probability, yes, impossible.
> With skewed probabilities, no, possible.

What, you mean like this?

import random

def choice(it):
    it = iter(it)
    value = next(it)
    try:
        while random.randrange(2):
            value = next(it)
    except StopIteration: pass
    return value

I'm not sure how useful it is, but it does accept potentially infinite
iterables, and does return values selected at random...


From ncoghlan at  Wed Apr 13 09:31:54 2016
From: ncoghlan at (Nick Coghlan)
Date: Wed, 13 Apr 2016 23:31:54 +1000
Subject: [Python-ideas] Dunder method to make object str-like
In-Reply-To: <>
References: <>
 <> <>
 <> <>
 <> <>
Message-ID: <>

On 13 April 2016 at 03:33, Brett Cannon <brett at> wrote:
> On Tue, 12 Apr 2016 at 10:18 Nikolaus Rath <Nikolaus at> wrote:
>> On Apr 11 2016, Ethan Furman
>> <ethan-gcWI5d7PMXnvaiG9KC9N7Q at> wrote:
>> >> As far as I can see, implementing a protocol instead of adding a few
>> >> isinstance checks is more likely to make the life of a CPython
>> >> developer
>> >> harder than easier.
>> >
>> > I disagree.  And the protocol idea was not mine, so apparently other
>> > core-devs also disagree (or think it's worth it, regardless).
>> I haven't found any email explaining why a protocol would make things
>> easier than the isinstance() approach (and I read most of the threads
>> both here and on -dev), so I was assuming that the core-devs in question
>> don't disagree but haven't considered the second approach.
> I disagree with the idea. :) Type checking tends to be too strict as it
> prevents duck typing. And locking ourselves to only what's in the stdlib is
> way too restrictive when there are alternative path libraries out there
> already. Plus it's short-sighted to assume no one will ever come up with a
> better path library, and so tying us down to only pathlib.PurePath
> explicitly would be a mistake long-term when specifying a single method
> keeps things flexible (this is the same reason __index__() exists and we
> simply don't say indexing has to be with a subclass of int or something).

In addition to these general API design points, there's also a key
pragmatic point, which is that we probably want
importlib._bootstrap_external to be able to use the new API, which
means it needs to work before any non-builtin and non-frozen modules
are available. That's easy with a protocol, hard with a concrete

Most folks don't need to worry about "How do we make this code work
when the interpreter isn't fully configured yet?", but we don't always
have that luxury :)
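
To make the __index__() comparison quoted above concrete, here is a
small sketch of how a single-method protocol lets an unrelated class
take part in indexing without subclassing int. The class name Offset is
hypothetical and only for illustration.

```python
class Offset:
    """Hypothetical integer-like type that does not subclass int."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Sequence indexing and slicing call this method; no
        # isinstance(x, int) check is involved.
        return self._value


letters = ["a", "b", "c", "d"]
second = letters[Offset(1)]
tail = letters[Offset(2):]
```

Any third-party type can opt in the same way, which is the flexibility
a path protocol would be expected to give alternative path libraries.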

As for why it wasn't specifically discussed before now, "protocols are
preferable to dependencies on concrete classes" is such an entrenched
design pattern in Python by this point that we tend to take it for
granted, so unless someone specifically asks for an explanation, we're
unlikely to spell it out again. (Thanks for doing that, by the way - I
suspect you're far from the only one that was puzzled by the apparent


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From chris.barker at  Wed Apr 13 11:31:15 2016
From: chris.barker at (Chris Barker - NOAA Federal)
Date: Wed, 13 Apr 2016 08:31:15 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
 <> <-176815014396412028@unknownmsgid>
Message-ID: <5974248239762161974@unknownmsgid>

> There's an upper limit on how dense a CPython dict or set can become
> (the load factor doesn't exceed 2/3), but no lower limit.  For
> example, it's easy to end up with a dict holding a single entry hiding
> among millions of empty slots (dicts are never resized on key
> deletion, only on key insertion).

Easy, yes. Common? I wonder. If it were common then wouldn't there be
good reason to resize the hash table when that occurred? Aside from
being able to select random items, of course...


>> ...
>> BTW, isn't it impossible to randomly select from an infinite iterable anyway?
> Of course, but it is possible to do uniform random selection, in one
> pass using constant space and in linear time, from an iterable whose
> length isn't known in advance (simple case of "reservoir sampling").

From guido at  Wed Apr 13 12:16:55 2016
From: guido at (Guido van Rossum)
Date: Wed, 13 Apr 2016 09:16:55 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <5974248239762161974@unknownmsgid>
References: <>
 <> <-176815014396412028@unknownmsgid>
Message-ID: <>

On Wed, Apr 13, 2016 at 8:31 AM, Chris Barker - NOAA Federal
<chris.barker at> wrote:
>> There's an upper limit on how dense a CPython dict or set can become
>> (the load factor doesn't exceed 2/3), but no lower limit.  For
>> example, it's easy to end up with a dict holding a single entry hiding
>> among millions of empty slots (dicts are never resized on key
>> deletion, only on key insertion).
> Easy, yes. Common? I wonder. If it were common then wouldn't there be
> good reason to resize the hash table when that occurred? Aside from
> being able to select random items, of course...

So it's becoming clear that if we wanted to do this right for sets
(and dicts) the choice() implementation should be part of the set/dict
implementation. It can then take care of compacting the set if its
density is too low.

But honestly I don't see the use case come up enough to warrant all that effort.

--Guido van Rossum (

From mal at  Wed Apr 13 12:33:12 2016
From: mal at (M.-A. Lemburg)
Date: Wed, 13 Apr 2016 18:33:12 +0200
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
 <> <-176815014396412028@unknownmsgid>
Message-ID: <>

On 13.04.2016 07:08, Tim Peters wrote:
> [Chris Barker - NOAA Federal <chris.barker at>]
>> ...
>> Anyway, the hash table has gaps, but wouldn't it be sufficiently
>> random to pick a random index, and if it's a gap, pick another one?
> That would be a correct implementation, but ...
>> I suppose in theory, this could be in infinite process, but in practice,
>> it would be O(1) with a constant of two or three... Better than
>> iterating through on average half of the keys.
> There's an upper limit on how dense a CPython dict or set can become
> (the load factor doesn't exceed 2/3), but no lower limit.  For
> example, it's easy to end up with a dict holding a single entry hiding
> among millions of empty slots (dicts are never resized on key
> deletion, only on key insertion).

Converting a set to a list is O(N) (with N being the number
of slots allocated for the set, with N > n, the number of used keys),
so any gap skipping logic won't be better in performance than doing:

import random

my_set = {1, 2, 3, 4}
l = list(my_set)
selection = [random.choice(l) for x in l]

print (selection)

You'd only have a memory advantage, AFAICT.
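
For comparison, the gap-skipping idea from the quoted discussion might
look like the sketch below. It runs over a simulated slot table, since
CPython does not expose a set's internal table to Python code; the list
model, the EMPTY sentinel and the function name are all assumptions
made for this example.

```python
import random

EMPTY = object()  # sentinel marking an unused slot in the simulated table


def choice_by_probing(slots, rng=random):
    """Pick a uniformly random occupied slot by retrying on gaps.

    The expected number of probes is len(slots) divided by the number
    of occupied slots, so a nearly empty table makes this slow.
    """
    # O(N) safety check for this demo only; a real hash table would
    # already know how many entries it holds.
    if all(slot is EMPTY for slot in slots):
        raise IndexError("no occupied slots")
    while True:
        candidate = slots[rng.randrange(len(slots))]
        if candidate is not EMPTY:
            return candidate


# Eight slots, three of them occupied.
table = [EMPTY, "a", EMPTY, EMPTY, "b", EMPTY, "c", EMPTY]
picked = choice_by_probing(table)
```

With a single entry hiding among millions of empty slots, as in the
example quoted above, the retry loop would need millions of probes on
average.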

>> ...
>> BTW, isn't it impossible to randomly select from an infinite iterable anyway?
> Of course, but it is possible to do uniform random selection, in one
> pass using constant space and in linear time, from an iterable whose
> length isn't known in advance (simple case of "reservoir sampling").

Marc-Andre Lemburg


From guido at  Wed Apr 13 12:31:05 2016
From: guido at (Guido van Rossum)
Date: Wed, 13 Apr 2016 09:31:05 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Apr 13, 2016 at 2:47 AM, Rob Cliffe <rob.cliffe at> wrote:
> Isn't there an inconsistency that random.sample caters to a set by
> converting it to a tuple, but random.choice doesn't?

Perhaps because the use cases are different? Over the years I've
learned that inconsistencies aren't always signs of sloppy thinking --
they may actually point to deep issues that aren't apparently on the

I imagine the typical use case for sample() to be something that
samples the population once and then does something to the sample; the
next time sample() is called the population is probably different
(e.g. the next lottery has a different set of players).

But I imagine a fairly common use case for choice() to be choosing
from the same population over and over, and that's exactly the case
where the copying implementation you're proposing would be a small

--Guido van Rossum (

From tim.peters at  Wed Apr 13 14:04:46 2016
From: tim.peters at (Tim Peters)
Date: Wed, 13 Apr 2016 13:04:46 -0500
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <5974248239762161974@unknownmsgid>
References: <>
 <> <-176815014396412028@unknownmsgid>
Message-ID: <>

>> There's an upper limit on how dense a CPython dict or set can become
>> (the load factor doesn't exceed 2/3), but no lower limit.  For
>> example, it's easy to end up with a dict holding a single entry hiding
>> among millions of empty slots (dicts are never resized on key
>> deletion, only on key insertion).

[Chris Barker - NOAA Federal <chris.barker at>]
> Easy, yes. Common? I wonder.

Depends on the app.  I doubt it's common across all apps.

> If it were common then wouldn't there be good reason to resize the
> hash table when that occurred? Aside from being able to select random
> items, of course...

Shrinking the table would have no primary effect on speed of
subsequent access or deletion - it's O(1) expected regardless.
Shrinking the table is expensive when it's done (the entire object has
to be traversed, and the items reinserted one by one, into a smaller

Regardless, I'd be loath to add a conditional branch on every
deletion to check.  Nothing is free, and the check would be a waste of
cycles every time for apps that don't care about O(1) uniform random
dict/set selection.  Note that I don't care about the "wasted" memory,
though.  But then I don't care about O(1) uniform random dict/set
selection either ;-)

From greg.ewing at  Wed Apr 13 17:47:37 2016
From: greg.ewing at (Greg Ewing)
Date: Thu, 14 Apr 2016 09:47:37 +1200
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
 <> <-176815014396412028@unknownmsgid>
Message-ID: <>

Chris Angelico wrote:
> On Wed, Apr 13, 2016 at 8:36 PM, Terry Reedy <tjreedy at> wrote:
>>On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote:
>>>BTW, isn't it impossible to randomly select from an infinite iterable
>>With equal probability, yes, impossible.
> def choice(it):
>     it = iter(it)
>     value = next(it)
>     try:
>         while random.randrange(2):
>             value = next(it)
>     except StopIteration: pass
>     return value

I think Terry meant that you can't pick just one item that's
equally likely to be any of the infinitely many items returned
by the iterator.

You can prove that by considering that the probability of
a given item being returned would have to be 1/infinity,
which is zero -- so you can't return anything!


From ethan at  Wed Apr 13 18:55:04 2016
From: ethan at (Ethan Furman)
Date: Wed, 13 Apr 2016 15:55:04 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 04/12/2016 09:09 PM, Rob Cliffe wrote:

> I did a little experimenting (admittedly non-rigorous, on one platform
> only, and using Python 2.7.10, not Python 3, and using code which could
> very possibly be improved) on selecting a random element from a
> generator, and found that
>      for small or moderate generators, reservoir sampling was almost
> always slower than generating a tuple
>      as the generator length increased up to roughly 10,000, the ratio
>              (time taken by reservoir)  / (time taken by tuple)
>          increased, reaching a maximum of over 4
>      as the generator length increased further, the ratio started to
> decrease, although for a length of 80 million (about as large as I could
> test) it was still over 3.

I suspect this proves the point -- reservoir sampling is good not 
because it is fast, but because it won't drain your memory keeping items 
you will not return.
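
For reference, the single-item case of reservoir sampling mentioned in
this thread can be sketched as follows. The function name
reservoir_choice is made up here, and this is not necessarily the code
that was timed above.

```python
import random


def reservoir_choice(iterable, rng=random):
    """Return one uniformly chosen item from an iterable of unknown length.

    One pass, constant extra space. Raises IndexError if the iterable
    is empty.
    """
    chosen = None
    count = 0
    for count, item in enumerate(iterable, start=1):
        # Replace the current pick with probability 1/count.
        if rng.randrange(count) == 0:
            chosen = item
    if count == 0:
        raise IndexError("cannot choose from an empty iterable")
    return chosen


picked = reservoir_choice(x * x for x in range(10))
```

It makes one randrange() call per item, which is a plausible reason for
it being slower than building a tuple, while never holding more than
one candidate in memory.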


From tjreedy at  Wed Apr 13 19:29:53 2016
From: tjreedy at (Terry Reedy)
Date: Wed, 13 Apr 2016 19:29:53 -0400
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
 <> <-176815014396412028@unknownmsgid>
Message-ID: <nemkqb$rl$>

On 4/13/2016 6:46 AM, Chris Angelico wrote:
> On Wed, Apr 13, 2016 at 8:36 PM, Terry Reedy <tjreedy at> wrote:
>> On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote:
>>> BTW, isn't it impossible to randomly select from an infinite iterable
>>> anyway?
>> With equal probability, yes, impossible.

I have seen too many mathematical or statistical writers who ought to 
know better write "Let N be a random integer..." with no indication of a 
finite bound or other than a uniform distribution.  No wonder students 
and readers sometimes get confused.

>> With skewed probabilities, no, possible.
> What, you mean like this?
> def choice(it):
>      it = iter(it)
>      value = next(it)
>      try:
>          while random.randrange(2):
>              value = next(it)
>      except StopIteration: pass
>      return value

With a perfect source of random numbers, this will halt with probability 
one.  With current pseudorandom generators, this will definitely halt. 
And yes, both theoretically and practically, this is an example of 
skewed probabilities  -- a waiting time distribution.

> I'm not sure how useful it is, but it does accept potentially infinite
> iterables, and does return values selected at random...

People often want variates selected from a non-uniform distribution. 
Often, a uniform variate can be transformed.  Sometimes multiple uniform 
variates are needed.

If 'it' above is itertools.count, the above models the waiting time to 
get 'heads' (or 'tails') with a fair coin.  Waiting times might be 
obtained more efficiently, perhaps, with, say, randrange(2**16), with 
another draw for as long as the value is 0 plus some arithmetic that 
uses int.bit_length.  (Details to be verified.)
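
One possible way to fill in those details is sketched below. The
function name waiting_time and its parameters are assumptions for
illustration. It counts the "tails" seen before the first "heads",
reading the flips from the most significant bit of each draw downwards.

```python
import random


def waiting_time(randrange=random.randrange, bits=16):
    """Count fair-coin tails (0 bits) seen before the first heads (1 bit).

    Draws `bits` coin flips per call to randrange instead of one.
    """
    skipped = 0
    while True:
        r = randrange(2 ** bits)
        if r:
            # Number of leading zero bits in the `bits`-wide form of r.
            return skipped + bits - r.bit_length()
        skipped += bits  # every flip in this draw was tails; draw again


k = waiting_time()
```

Under this scheme a result of k has probability (1/2)**(k + 1), the
same distribution as the number of next() calls made inside the loop
of the choice() function quoted above.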

Terry Jan Reedy

From steve at  Wed Apr 13 21:30:38 2016
From: steve at (Steven D'Aprano)
Date: Thu, 14 Apr 2016 11:30:38 +1000
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
 <> <-176815014396412028@unknownmsgid>
Message-ID: <>

On Thu, Apr 14, 2016 at 09:47:37AM +1200, Greg Ewing wrote:
> Chris Angelico wrote:
> >On Wed, Apr 13, 2016 at 8:36 PM, Terry Reedy <tjreedy at> wrote:
> >
> >>On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote:
> >>
> >>>BTW, isn't it impossible to randomly select from an infinite iterable
> >>>anyway?
> >>
> >>With equal probability, yes, impossible.

> I think Terry meant that you can't pick just one item that's
> equally likely to be any of the infinitely many items returned
> by the iterator.

Correct. That's equivalent to choosing a positive integer with uniform 
probability distribution and no upper bound.

> You can prove that by considering that the probability of
> a given item being returned would have to be 1/infinity,
> which is zero -- so you can't return anything!

That's not how probability works :-)

Consider a dart which is thrown at a dartboard. The probability of it 
landing on any specific point is zero, since the area of a single point 
is zero. Nevertheless, the dart does hit somewhere!

A formal and precise treatment would have to involve calculus and limits 
as the probability approaches zero, rather than a flat out "the 
probability is zero, therefore it's impossible".

Slightly less formally, we can say (only horrifying mathematicians a 
little bit) that the probability of any specific number is an 
infinitesimal number.

While it is *mathematically* meaningful to talk about selecting a random 
positive integer uniformly, it's hard to do much more than that. The mean 
(average) is undefined[1]. A typical value chosen would have a vast 
number of digits, far larger than anything that could be stored in 
computer memory. Indeed Almost All[2] of the values we generate would be 
so large that we have no notation for writing it down (and not enough 
space in the universe to write it even if we did). So it is impossible 
in practice to select a random integer with uniform distribution and no 
upper bound.

Non-uniform distributions, though, are easy :-)

[1] While weird, this is not all that weird. For example, selecting 
numbers from a Cauchy distribution also has an undefined mean. What this 
means in practice is that the *sample mean* will not converge as you 
take more and more samples: the more samples you take, the more wildly 
the average will jump all over the place.
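
A quick way to watch this happen is sketched below. The function name
is made up for the example; the variates come from the usual
inverse-CDF transform of a uniform variate.

```python
import math
import random


def cauchy_running_means(n, seed=None):
    """Return the running sample means of n standard Cauchy variates."""
    rng = random.Random(seed)
    means = []
    total = 0.0
    for i in range(1, n + 1):
        # Inverse-CDF transform of a uniform variate on [0, 1).
        total += math.tan(math.pi * (rng.random() - 0.5))
        means.append(total / i)
    return means


means = cauchy_running_means(10_000, seed=0)
```

Printing every thousandth entry of means for a few different seeds
should show the running mean failing to settle, in contrast with what
the same experiment does for, say, random.gauss.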



From greg.ewing at  Thu Apr 14 02:03:08 2016
From: greg.ewing at (Greg Ewing)
Date: Thu, 14 Apr 2016 18:03:08 +1200
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
 <> <-176815014396412028@unknownmsgid>
 <> <>
Message-ID: <>

Steven D'Aprano wrote:
> A formal and precise treatment would have to involve calculus and limits 
> as the probability approaches zero, rather than a flat out "the 
> probability is zero, therefore it's impossible".

The limit of p = 1/n as n goes to infinity is zero.
Events with zero probability can't happen. I don't
know how it can be made more rigorous than that.

I think what this means is that if you somehow wrote
a program to draw a number from an infinite uniform
distribution, it would never terminate.

> A typical value chosen would have a vast
> number of digits, far larger than anything that could be stored in 
> computer memory.

I'm not sure it even makes sense to talk about a
typical number, because for any number you pick,
there are infinitely many numbers with more digits,
but only finitely many with the same or fewer digits.
So the probability of getting only that many digits
is zero, too!

Infinity: Just say no.


From joshua.morton13 at  Thu Apr 14 02:59:44 2016
From: joshua.morton13 at (Joshua Morton)
Date: Thu, 14 Apr 2016 06:59:44 +0000
Subject: [Python-ideas] Dictionary views are not entirely 'set like'
In-Reply-To: <>
References: <>
Message-ID: <>

These were exactly my thoughts. I wanted to bump this, since it was drowned
out by more important things like Paths and invertible booleans and almost
no discussion was had on the main issue, that of dict views not acting like
sets. Since things have died down, is that behavior that should be
remedied, and if so should it be backwards compatible (make set more
permissive), treat it as a bugfix (make the views raise errors)? And should
it include union/other methods to keep performance on the usecase that now
throws a bug?


On Thu, Apr 7, 2016 at 1:53 PM Michael Selik <mike at> wrote:

> > On Apr 6, 2016, at 10:38 AM, Joshua Morton <joshua.morton13 at>
> wrote:
> >     set() | []  # 2
> >     {}.keys() | []  # 3
> Looks like this should be standardized. Either both raise TypeError, or
> both return a set. My preference would be TypeError, but that might be
> worse for backwards-compatibility.
> >     {}.keys().union(set())  # 6
> Seems to me that the pipe operator is staying on MappingView, so it's
> reasonable to add a corresponding ``.union`` to mimic sets. And
> intersection, etc.
> >     {}.values() == {}.values()  # 9
> >     d = {}; d.values() == d.values()  # 10
> It's weird, but float('nan') != float('nan'). I'm not particularly
> bothered by this.
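
For anyone who wants to reproduce the cases quoted above, here is a
small sketch that records what each expression does. The helper name
outcome is made up for this example, and the results noted afterwards
are what CPython 3 currently does; they may differ if the behaviour is
ever changed.

```python
def outcome(thunk):
    """Return the value of thunk(), or the exception class name if it raises."""
    try:
        return thunk()
    except Exception as exc:
        return type(exc).__name__


d = {"a": 1}
results = {
    "set | list": outcome(lambda: set() | []),
    "keys | list": outcome(lambda: d.keys() | ["b"]),
    "keys.union": outcome(lambda: d.keys().union(set())),
    "values == values": outcome(lambda: d.values() == d.values()),
}
```

On CPython 3 the first entry is "TypeError", the second is the set
{"a", "b"}, the third is "AttributeError", and the last is False.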

From ian.g.kelly at  Thu Apr 14 04:03:00 2016
From: ian.g.kelly at (Ian Kelly)
Date: Thu, 14 Apr 2016 02:03:00 -0600
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
 <> <-176815014396412028@unknownmsgid>
 <> <>
Message-ID: <>

On Thu, Apr 14, 2016 at 12:03 AM, Greg Ewing
<greg.ewing at> wrote:
> Steven D'Aprano wrote:
>> A formal and precise treatment would have to involve calculus and limits
>> as the probability approaches zero, rather than a flat out "the probability
>> is zero, therefore it's impossible".
> The limit of p = 1/n as n goes to infinity is zero.
> Events with zero probability can't happen. I don't
> know how it can be made more rigorous than that.

This only holds true if the sample space is finite. The mathematical
term for an event with zero probability that nonetheless can happen is
"almost never".


As usual, infinity is weird.

From stephen at  Thu Apr 14 04:16:06 2016
From: stephen at (Stephen J. Turnbull)
Date: Thu, 14 Apr 2016 17:16:06 +0900
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
 <-176815014396412028@unknownmsgid> <nel7fd$7fq$>
Message-ID: <>

Greg Ewing writes:

 > The limit of p = 1/n as n goes to infinity is zero.
 > Events with zero probability can't happen. I don't
 > know how it can be made more rigorous than that.

Sorry, as Steven pointed out, events *that are modeled as* having zero
probability happen all the time.  What you should not do with such
events is give them positive (ie, atomic) weight in statistical
calculations (= theory), and therefore you should not *predict* their
occurrence as individuals (unless you have no fear of becoming a
laughingstock = practice).

 > Infinity: Just say no.

I thought your name was Ewing.  Is it really Brouwer?<wink/>

(Don't bother, I know.  The reference to Brouwer is close enough for


N.B. "Straight man" is not an accurate translation.

From nikita at  Thu Apr 14 04:23:05 2016
From: nikita at (Nikita Nemkin)
Date: Thu, 14 Apr 2016 13:23:05 +0500
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP 3121/PEP
Message-ID: <>

Reading PEP 3121/PEP 489 I can't stop wondering, why do extension
modules require such specialized lifecycle APIs? Why not just let
them subclass ModuleType? (Or any type, really, but ModuleType might
be a good place to define some standard behavior.)

Module instance naturally encapsulates C level module state.
GC/finalization happens just like for any other object. PEP 3121
becomes redundant.

Two-step initialization (PEP 489) can be achieved by defining
a new kind of PyInit_XXX entry point, returning a module *type*,
instead of a module *instance*. No extra API needed beyond that!

Now, importer can simply instantiate this module type, passing
__name__, __file__ and the rest. ModuleType.tp_new will perform
attribute init, sys.modules registration etc.
OR
the importer can manually pull tp_new/tp_init/attribute setup, supplanting
type_call. (This is closer to the current way of doing things.)

Actual module initialization ("executing the module body")
happens in tp_init. reload() is equivalent to calling tp_init again.

Subinterpreter interaction becomes transparent: every interpreter
instantiates its own module copy. "Singleton" modules with
external global state should fail second instantiation
(maybe by deriving from a special SingletonModuleType subclass
that will handle it for them).

Additionally, custom module type allows fine grained attribute
access control (aka metamodules), useful to many complex modules.
C synchronized module "variables" become super-easy to define
(tp_members). For lazy loading and importing there's
tp_getattro, tp_getset, etc.
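
As a pure-Python analogy of the attribute-control idea (not the C-level
API being proposed), a ModuleType subclass with lazily computed
attributes might look like the sketch below. The class name LazyModule
and its factories argument are hypothetical.

```python
import types


class LazyModule(types.ModuleType):
    """Hypothetical module type that computes some attributes on first access."""

    def __init__(self, name, factories):
        super().__init__(name)
        self._factories = dict(factories)

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails.
        factories = self.__dict__.get("_factories", {})
        if attr in factories:
            value = factories[attr]()
            setattr(self, attr, value)  # cache for later lookups
            return value
        raise AttributeError(attr)


mod = LazyModule("demo", {"answer": lambda: 6 * 7})
first = mod.answer
cached = "answer" in vars(mod)
```

In C the same effect would go through tp_getattro or tp_getset on the
module type, as described above.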

One problem not solved by this approach (nor the current approach)
is module state access from methods of extension types.
At least two solutions are possible:
1. Look up the module by name (sys.modules) or type (new per-interpreter
   cache).
2. Define a new METH_XXX calling convention (or flag) and pass both
   PyCFunctionObject.m_self and PyCFunctionObject.m_module
   to the C level method implementations.
Both can be implemented, #1 being simple and #2 being proper.

What do you think?

From encukou at  Thu Apr 14 05:57:19 2016
From: encukou at (Petr Viktorin)
Date: Thu, 14 Apr 2016 11:57:19 +0200
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP
 3121/PEP 489
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/14/2016 10:23 AM, Nikita Nemkin wrote:
> Reading PEP 3121/PEP 489 I can't stop wondering, why do extension
> modules require such specialized lifecycle APIs? Why not just let
> them subclass ModuleType? (Or any type, really, but ModuleType might
> be a good place to define some standard behavior.)

Good question.
I'll list some assumptions; if you don't share them we can talk more
about these:
- some things are easy in Python, but not pleasant to do correctly* in C:
  - subclassing
  - creating simple objects
  - setting attributes
- most modules don't need special functionality

* ("Correctly" includes things like error checking, which many
third-party modules skimp on.)

Most of the API is in the style of "leave this NULL unless you need
something special", i.e. simple things are easy, but the complex cases
are possible. Creating custom ModuleType subclasses is possible, but I
definitely wouldn't want to require every module author to do that.

A lot of the API is convenience features, which are important: they do
the right thing (for example, w.r.t. error checking), and they're easier
to use than alternatives. This makes the API grow into several layers;
for example:
- m_methods in the PyModuleDef
- PyModule_AddFunctions
- PyCFunction_NewEx & PyObject_SetAttrString
Usually you just set the first, but if you need more control (e.g.
you're creating modules dynamically), you can use the lower-level tools.

Your suggestion won't really help with this kind of complexity.

Oh, and there is a technical reason against subclassing ModuleType
unless necessary: Custom ModuleType subclasses cannot be made to work
with runpy (i.e. python -m). For ModuleType (the ones without a custom
create_module in the current API), this doesn't *currently* work, but
the PEPs were written so that it's "just" a question of spending some
development effort on runpy.

> Module instance naturally encapsulates C level module state.
> GC/finalization happens just like for any other object. PEP 3121
> becomes redundant.

It doesn't become fully redundant: m_methods is still useful.

The rest of PyModuleDef makes the more common complex cases simpler than
a full-blown ModuleType subclass.

> Two-step initialization (PEP 489) can be achieved by defining
> a new kind of PyInit_XXX entry point, returning a module *type*,
> instead of a module *instance*. No extra API needed beyond that!

With the current API, you don't return a module *instance*, but a module
*description* (PyModuleDef). This is a lot easier than creating a
subclass in C.
With your suggestion, I fear that someone would quickly come up with a
macro to automate creating simple ModuleType instances, and at that
point the API would be as complex as it is now, but every module
instance would now also have an extra ModuleType subclass, and I don't
think that's either simpler or more effective.

> Now, importer can simply instantiate this module type, passing
> __name__, __file__ and the rest. ModuleType.tp_new will perform
> attribute init, sys.modules registration etc.
> OR
> the importer can manually pull tp_new/tp_init/attribute setup, supplanting
> type_call. (This is closer to the current way of doing things.)
> Actual module initialization ("executing the module body")
> happens in tp_init. reload() is equivalent to calling tp_init again.
> Subinterpreter interaction becomes transparent: every interpreter
> instantiates its own module copy. "Singleton" modules with
> external global state should fail second instantiation
> (maybe by deriving from a special SingletonModuleType subclass
> that will handle it for them).

Current status: every interpreter instantiates its own module instance.
"Singleton" modules with external global state are marked as such, and
should be written so that they fail second instantiation. (Maybe the
failing can be automated by the import machinery, but that part is not
yet implemented).
It seems to me that adding a custom ModuleType subclass to the mix
wouldn't change much.

> Additionally, custom module type allows fine grained attribute
> access control (aka metamodules), useful to many complex modules.
> C synchronized module "variables" become super-easy to define
> (tp_members). For lazy loading and importing there's
> tp_getattro, tp_getset, etc.

Right. This is the kind of thing you *do* need a ModuleType subclass
for, and the current API makes it possible to do it.

> One problem not solved by this approach (nor the current approach)
> is module state access from methods of extension types.
> At least two solutions are possible:
> 1. Look up the module by name (sys.modules) or type (new per-interpreter
>    cache).
> 2. Define a new METH_XXX calling convention (or flag) and pass both
>    PyCFunctionObject.m_self and PyCFunctionObject.m_module
>    to the C level method implementations.
> Both can be implemented, #1 being simple and #2 being proper.

*This* is the real problem now. I think #2 is viable, and I'm slowly
(too slowly perhaps) working on it.

> What do you think?

I personally think that your suggestion wouldn't make the API
substantially simpler, assuming you would keep it as robust and easy to
use as the current solution. And if you would want to maintain backwards
compatibility (even with only the pre-PEP 489 state), it would be even
harder.

Many people thought about the current APIs, and (almost) all of the
unpleasant decisions we had to make do have their reasons. (And if you
ask about a more specific decision, I can give you the specific reasons.)

From nikita at  Thu Apr 14 09:51:15 2016
From: nikita at (Nikita Nemkin)
Date: Thu, 14 Apr 2016 18:51:15 +0500
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP
 3121/PEP 489
Message-ID: <>

On Thu, Apr 14, 2016 at 2:57 PM, Petr Viktorin <encukou at> wrote:
> On 04/14/2016 10:23 AM, Nikita Nemkin wrote:
>> Reading PEP 3121/PEP 489 I can't stop wondering, why do extension
>> modules require such specialized lifecycle APIs? Why not just let
>> them subclass ModuleType? (Or any type, really, but ModuleType might
>> be a good place to define some standard behavior.)
> Good question.
> I'll list some assumptions; if you don't share them we can talk more
> about these:
> - some things are easy in Python, but not pleasant to do correctly* in C:
>   - subclassing
>   - creating simple objects
>   - setting attributes
> - most modules don't need special functionality
> * ("Correctly" includes things like error checking, which many
> third-party modules skimp on.)
> Most of the API is in the style of "leave this NULL unless you need
> something special", i.e. simple things are easy, but the complex cases
> are possible. Creating custom ModuleType subclasses is possible, but I
> definitely wouldn't want to require every module author to do that.
> A lot of the API is convenience features, which are important: they do
> the right thing (for example, w.r.t. error checking), and they're easier
> to use than alternatives. This makes the API grow into several layers;
> for example:
> - m_methods in the PyModuleDef
> - PyModule_AddFunctions
> - PyCFunction_NewEx & PyObject_SetAttrString
> Usually you just set the first, but if you need more control (e.g.
> you're creating modules dynamically), you can use the lower-level tools.
> Your suggestion won't really help with this kind of complexity.

I agree that the C API is quite difficult and dangerous to use directly
(that's why I always use Cython), but I don't agree that a separate module
init API makes things simpler.

PyModuleDef appears to be a PyTypeObject surrogate with similar
(or identical?) semantics. That's an extra concept to learn about,
an extra bit of documentation to consult when writing new code.
PyTypeObject, on the other hand, is fundamental and unavoidable,
regardless of the module init system.

Practically speaking,
PyModuleDef_Init(&spam_def) and PyType_Ready(&spam_type)
differ only in the number of zeroes in their respective static structs.
There's also PyType_FromSpec (stable ABI), which might appeal to
some people more than PyType_Ready.

> Oh, and there is a technical reason against subclassing ModuleType
> unless necessary: Custom ModuleType subclasses cannot be made to work
> with runpy (i.e. python -m). For ModuleType (the ones without a custom
> create_module in the current API), this doesn't *currently* work, but
> the PEPs were written so that it's "just" a question of spending some
> development effort on runpy.

I didn't even know that python -m supported extension modules.

>> Module instance naturally encapsulates C level module state.
>> GC/finalization happens just like for any other object. PEP 3121
>> becomes redundant.
> It doesn't become fully redundant: m_methods is still useful.

ModuleType subclasses have tp_methods. A little check in PyType_Ready
can make it behave like PyModuleDef.m_methods. Modules don't need
normal methods anyway.

> With your suggestion, I fear that someone would quickly come up with a
> macro to automate creating simple ModuleType instances, and at that
> point the API would be as complex as it is now, but every module
> instance would now also have an extra ModuleType subclass, and I don't
> think that's either simpler or more effective.

One or two standard macros that help with ModuleType subclassing
are in order. Something like PyModule_HEAD_INIT, whose usage is
already obvious from the name.

> I personally think that your suggestion wouldn't make the API
> substantially simpler, assuming you would keep it as robust and easy to
> use as the current solution. And if you would want to maintain backwards
> compatibility (even with only the pre-PEP 489 state), it would be even
> harder.
> Many people thought about the current APIs, and (almost) all of the
> unpleasant decisions we had to make do have their reasons. (And if you
> ask about a more specific decision, I can give you the specific reasons.)

Regarding backward compatibility, PEP 3121 and most of PEP 489
will work just fine as a facade to a ModuleType subclass.

I understand that existing PEPs are well researched and there's no
real incentive to change. My proposal is more of a hypothetical kind.

From stefan at  Thu Apr 14 11:07:28 2016
From: stefan at (Stefan Krah)
Date: Thu, 14 Apr 2016 15:07:28 +0000 (UTC)
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP
 3121/PEP 489
References: <>
Message-ID: <>

Nikita Nemkin <nikita at ...> writes:
> I understand that existing PEPs are well researched and there's no
> real incentive to change. My proposal is more of a hypothetical kind.

Petr obviously has researched all this carefully, but there is
an incentive: _decimal for example takes a speed hit of 20% with
PEP-3121, so it's not implemented.  I suspect that the later PEP
also slows down modules (which does not matter most of the time).

Any new proposal should absolutely include the performance issue.

Stefan Krah

From walker_s at  Thu Apr 14 12:25:56 2016
From: walker_s at (SW)
Date: Thu, 14 Apr 2016 17:25:56 +0100
Subject: [Python-ideas] PEP8 operator must come before line break
Message-ID: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>

PEP8 says that: "The preferred place to break around a binary operator
is after the operator, not before it."

This is ignored in the multiline if examples, and seems to generally be
a bad idea as it negatively impacts clarity.

For example, the following seems much clearer as the entire line does
not need to be scanned to see the intent- only the start of the line is
needed to see how the different properties are used for filtering:
mylist = [
    item for item in mylist
    if item['property'] == 2
    and item['otherproperty'] == 'test'
]

The alternative seems less clear:
mylist = [
    item for item in mylist
    if item['property'] == 2 and
    item['otherproperty'] == 'test'
]

If this recommendation remains in force, it would be good to:
1. Follow it in the style guide.
2. Provide a rationale for it, as currently it seems arbitrary and
unhelpful.

Just raising this as it has bitten me a couple of times with style
checkers recently.


From guido at  Thu Apr 14 12:35:43 2016
From: guido at (Guido van Rossum)
Date: Thu, 14 Apr 2016 09:35:43 -0700
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

Where in PEP 8 does it violate its own advice? That's a bug. (The PEP
has many authors by now.)

As the PEP acknowledges, style is hard to agree over. It's even harder
to change an agreement that has been documented (if not always
followed consistently) for over 20 years.

My rationale for this rule is that ending a line in a binary operator
is a clear hint to the reader that the line isn't finished. (If you
think about it, a comma is a kind of binary operator, and you wouldn't
move the comma to the start of the continuation line, would you? :-)

On Thu, Apr 14, 2016 at 9:25 AM, SW <walker_s at> wrote:
> Hi,
> PEP8 says that: "The preferred place to break around a binary operator
> is after the operator, not before it."
> This is ignored in the multiline if examples, and seems to generally be
> a bad idea as it negatively impacts clarity.
> For example, the following seems much clearer as the entire line does
> not need to be scanned to see the intent- only the start of the line is
> needed to see how the different properties are used for filtering:
> mylist = [
>     item for item in mylist
>     if item['property'] == 2
>     and item['otherproperty'] == 'test'
> ]
> The alternative seems less clear:
> mylist = [
>     item for item in mylist
>     if item['property'] == 2 and
>     item['otherproperty'] == 'test'
> ]
> If this recommendation remains in force, it would be good to:
> 1. Follow it in the style guide.
> 2. Provide a rationale for it, as currently it seems arbitrary and
> unhelpful.
> Just raising this as it has bitten me a couple of times with style
> checkers recently.
> Thanks,
> S
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

--Guido van Rossum

From random832 at  Thu Apr 14 12:58:08 2016
From: random832 at (Random832)
Date: Thu, 14 Apr 2016 12:58:08 -0400
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

On Thu, Apr 14, 2016, at 12:35, Guido van Rossum wrote:
> Where in PEP 8 does it violate its own advice? That's a bug. (The PEP
> has many authors by now.)
> As the PEP acknowledges, style is hard to agree over. It's even harder
> to change an agreement that has been documented (if not always
> followed consistently) for over 20 years.
> My rationale for this rule is that ending a line in a binary operator
> is a clear hint to the reader that the line isn't finished. (If you
> think about it, a comma is a kind of binary operator, and you wouldn't
> move the comma to the start of the continuation line, would you? :-)

Well, yes, but "and" and "or" are English conjunctions in addition to
being binary operators and, a line break is a natural break in reading
and, something flows more naturally if you pause before conjunctions
rather than after them or, am I completely wrong about that?

What I'm saying is, maybe there should be a separate rule for "and" and
"or" in certain contexts, rather than a universal rule for all binary
operators.

From boekewurm at  Thu Apr 14 13:02:08 2016
From: boekewurm at (Matthias welp)
Date: Thu, 14 Apr 2016 19:02:08 +0200
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

> Where in PEP 8 does it violate its own advice

As the OP did not reply this fast, from the webpage (/dev/peps/pep-0008)

section indentation, just after 'Acceptable options in this situation
include, but are not limited to: '

# Add some extra indentation on the conditional continuation line.
if (this_is_one_thing
        and that_is_another_thing):
    do_something()

That is the only place I could find just now.

From guido at  Thu Apr 14 13:02:02 2016
From: guido at (Guido van Rossum)
Date: Thu, 14 Apr 2016 10:02:02 -0700
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

On Thu, Apr 14, 2016 at 9:58 AM, Random832 <random832 at> wrote:
> On Thu, Apr 14, 2016, at 12:35, Guido van Rossum wrote:
>> Where in PEP 8 does it violate its own advice? That's a bug. (The PEP
>> has many authors by now.)
>> As the PEP acknowledges, style is hard to agree over. It's even harder
>> to change an agreement that has been documented (if not always
>> followed consistently) for over 20 years.
>> My rationale for this rule is that ending a line in a binary operator
>> is a clear hint to the reader that the line isn't finished. (If you
>> think about it, a comma is a kind of binary operator, and you wouldn't
>> move the comma to the start of the continuation line, would you? :-)
> Well, yes, but "and" and "or" are English conjunctions in addition to
> being binary operators and, a line break is a natural break in reading
> and, something flows more naturally if you pause before conjunctions
> rather than after them or, am I completely wrong about that?
> What I'm saying is, maybe there should be a separate rule for "and" and
> "or" in certain contexts, rather than a universal rule for all binary
> operators.

OK, do you want to moderate the discussion about that particular
issue? If you can gather support from some of the usual suspects I'm
not against being persuaded.

--Guido van Rossum

From guido at  Thu Apr 14 13:23:59 2016
From: guido at (Guido van Rossum)
Date: Thu, 14 Apr 2016 10:23:59 -0700
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

Thanks, that was obviously an oversight. I've fixed the PEP.

If the discussion ends up with rough consensus on changing this I will
happily change it back (and change all other occurrences to match the
new rule).

Note that my request for "rough consensus" does *not* imply a vote. +1
and -1 votes (nor fractions in between) should not be posted --
however cogent arguments for/against the status quo (or for
relinquishing the rule altogether) are welcome.

On Thu, Apr 14, 2016 at 10:02 AM, Matthias welp <boekewurm at> wrote:
>> Where in PEP 8 does it violate its own advice
> As the OP did not reply this fast, from the webpage (/dev/peps/pep-0008)
> section indentation, just after 'Acceptable options in this situation
> include, but are not limited to: '
> # Add some extra indentation on the conditional continuation line.
> if (this_is_one_thing
>         and that_is_another_thing):
>     do_something()
> That is the only place I could find just now.

--Guido van Rossum

From walker_s at  Thu Apr 14 14:48:23 2016
From: walker_s at (SW)
Date: Thu, 14 Apr 2016 19:48:23 +0100
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <BLU436-SMTP5263DB3C44C4326F7FC3E0B8970@phx.gbl>

That'll teach me for stepping away from the computer...

As for changing an established rule, I agree that can be difficult. The
reason this one became an irritation for me is that it was only in the
last few months that I saw flake8 (my style complainer of choice) start
complaining about this, so it's not quite so entrenched as other
elements of style.

I agree that placing the binary operator at the end shows the line
should continue, and thus could be valid, but I also think that placing
it at the start of the next line shows the logic flow for each part of
the expression more clearly- as shown in the examples I originally gave.


On 14/04/16 18:23, Guido van Rossum wrote:
> Thanks, that was obviously an oversight. I've fixed the PEP.
> If the discussion ends up with rough consensus on changing this I will
> happily change it back (and change all other occurrences to match the
> new rule).
> Note that my request for "rough consensus" does *not* imply a vote. +1
> and -1 votes (nor fractions in between) should not be posted --
> however cogent arguments for/against the status quo (or for
> relinquishing the rule altogether) are welcome.
> On Thu, Apr 14, 2016 at 10:02 AM, Matthias welp <boekewurm at> wrote:
>>> Where in PEP 8 does it violate its own advice
>> As the OP did not reply this fast, from the webpage (/dev/peps/pep-0008)
>> section indentation, just after 'Acceptable options in this situation
>> include, but are not limited to: '
>> # Add some extra indentation on the conditional continuation line.
>> if (this_is_one_thing
>>         and that_is_another_thing):
>>     do_something()
>> That is the only place I could find just now.

From bruce at  Thu Apr 14 15:10:08 2016
From: bruce at (Bruce Leban)
Date: Thu, 14 Apr 2016 12:10:08 -0700
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <BLU436-SMTP5263DB3C44C4326F7FC3E0B8970@phx.gbl>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

I find the operator at the beginning of the line much more clear in code
like this:

innerWidth = (outerWidth
              - 2 * border_width
              - left_margin
              - right_margin)

outerHeight  = (innerHeight
                + (title_height if have_title else 0)
                + (subtitle_height if have_subtitle else 0)
                - (1 if have_title and have_subtitle else 0))

outerHeight  = (innerHeight
                + (title_height
                   if have_title
                   else 0)
                + (subtitle_height
                   if have_subtitle
                   else 0)
                - (1
                   if have_title and have_subtitle
                   else 0))

area = ((multiline_calculation_of_height)
        * (multiline_calculation_of_width))

The first two are taken and sanitized from real code.

--- Bruce

On Thu, Apr 14, 2016 at 11:48 AM, SW <walker_s at> wrote:

> That'll teach me for stepping away from the computer...
> As for changing an established rule, I agree that can be difficult. The
> reason this one became an irritation for me is that it was only in the
> last few months that I saw flake8 (my style complainer of choice) start
> complaining about this, so it's not quite so entrenched as other
> elements of style.
> I agree that placing the binary operator at the end shows the line
> should continue, and thus could be valid, but I also think that placing
> it at the start of the next line shows the logic flow for each part of
> the expression more clearly- as shown in the examples I originally gave.
> Thanks,
> S
> On 14/04/16 18:23, Guido van Rossum wrote:
> > Thanks, that was obviously an oversight. I've fixed the PEP.
> >
> > If the discussion ends up with rough consensus on changing this I will
> > happily change it back (and change all other occurrences to match the
> > new rule).
> >
> > Note that my request for "rough consensus" does *not* imply a vote. +1
> > and -1 votes (nor fractions in between) should not be posted --
> > however cogent arguments for/against the status quo (or for
> > relinquishing the rule altogether) are welcome.
> >
> > On Thu, Apr 14, 2016 at 10:02 AM, Matthias welp <boekewurm at>
> wrote:
> >>> Where in PEP 8 does it violate its own advice
> >> As the OP did not reply this fast, from the webpage (/dev/peps/pep-0008)
> >>
> >> section indentation, just after 'Acceptable options in this situation
> >> include, but are not limited to: '
> >>
> >> # Add some extra indentation on the conditional continuation line.
> >> if (this_is_one_thing
> >>         and that_is_another_thing):
> >>     do_something()
> >>
> >> That is the only place I could find just now.

From storchaka at  Thu Apr 14 15:22:03 2016
From: storchaka at (Serhiy Storchaka)
Date: Thu, 14 Apr 2016 22:22:03 +0300
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <neoqkr$jvb$>

On 14.04.16 20:23, Guido van Rossum wrote:
> Thanks, that was obviously an oversight. I've fixed the PEP.
> If the discussion ends up with rough consensus on changing this I will
> happily change it back (and change all other occurrences to match the
> new rule).
> Note that my request for "rough consensus" does *not* imply a vote. +1
> and -1 votes (nor fractions in between) should not be posted --
> however cogent arguments for/against the status quo (or for
> relinquishing the rule altogether) are welcome.

An argument for operators "+" and "-":

     result = expr1 +
     expr2

is syntax error, while

     result = expr1
     + expr2

is silent bug.
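
A small sketch of the difference described above, using compile() and exec() on the two spellings. The literals 1 and 2 stand in for expr1 and expr2, and the helper name try_compile is made up for this illustration:

```python
def try_compile(source):
    """Return a code object, or the SyntaxError raised while compiling."""
    try:
        return compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc


# Operator at the end of the first line: the statement is incomplete.
trailing = "result = 1 +\n2\n"
# Operator at the start of the second line: two separate statements.
leading = "result = 1\n+ 2\n"

trailing_outcome = try_compile(trailing)

namespace = {}
exec(try_compile(leading), namespace)
# "+ 2" was evaluated as its own expression statement and discarded,
# so result holds only the first operand.
silent_result = namespace["result"]
```

Here trailing_outcome is a SyntaxError instance, while the second source compiles and runs without complaint and leaves result equal to 1 rather than 3.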

From ianlee1521 at  Thu Apr 14 15:25:20 2016
From: ianlee1521 at (Ian Lee)
Date: Thu, 14 Apr 2016 12:25:20 -0700
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <BLU436-SMTP5263DB3C44C4326F7FC3E0B8970@phx.gbl>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

My preference has been towards having the binary operator at the front rather than the end since listening to Brandon Rhodes's 2012 PyCon Canada talk (specifically starting around [1]). He argues that following Knuth rather than PEP 8 [2] might be a better way to go.

Maybe my preference there is due to the Django work I was doing where I would get long query lines and didn't always want to split them into multiple lines [3]:

query = (Person.objects

[1] <>
[2] <>
[3] <>

~ Ian Lee | IanLee1521 at <mailto:IanLee1521 at>
> On Apr 14, 2016, at 11:48, SW <walker_s at> wrote:
> That'll teach me for stepping away from the computer...
> As for changing an established rule, I agree that can be difficult. The
> reason this one became an irritation for me is that it was only in the
> last few months that I saw flake8 (my style complainer of choice) start
> complaining about this, so it's not quite so entrenched as other
> elements of style.
> I agree that placing the binary operator at the end shows the line
> should continue, and thus could be valid, but I also think that placing
> it at the start of the next line shows the logic flow for each part of
> the expression more clearly- as shown in the examples I originally gave.
> Thanks,
> S
> On 14/04/16 18:23, Guido van Rossum wrote:
>> Thanks, that was obviously an oversight. I've fixed the PEP.
>> If the discussion ends up with rough consensus on changing this I will
>> happily change it back (and change all other occurrences to match the
>> new rule).
>> Note that my request for "rough consensus" does *not* imply a vote. +1
>> and -1 votes (nor fractions in between) should not be posted --
>> however cogent arguments for/against the status quo (or for
>> relinquishing the rule altogether) are welcome.
>> On Thu, Apr 14, 2016 at 10:02 AM, Matthias welp <boekewurm at> wrote:
>>>> Where in PEP 8 does it violate its own advice
>>> As the OP did not reply this fast, from the webpage (/dev/peps/pep-0008)
>>> section indentation, just after 'Acceptable options in this situation
>>> include, but are not limited to: '
>>> # Add some extra indentation on the conditional continuation line.
>>> if (this_is_one_thing
>>>        and that_is_another_thing):
>>>    do_something()
>>> That is the only place I could find just now.

From contrebasse at  Thu Apr 14 15:28:39 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Thu, 14 Apr 2016 19:28:39 +0000 (UTC)
Subject: [Python-ideas] PEP8 operator must come before line break
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

Guido van Rossum <guido at ...> writes:

> My rationale for this rule is that ending a line in a binary operator
> is a clear hint to the reader that the line isn't finished. (If you
> think about it, a comma is a kind of binary operator, and you wouldn't
> move the comma to the start of the continuation line, would you? 

I personally tend to look more at the start of the lines because that's
where the blocks are defined (by indentation). Also, the ends of the lines are
usually not aligned, which makes binary operators harder to see.
For these two reasons I always put binary operators at the start of
new lines, because that's where I have the best chance of seeing them, and I'm
in favor of changing this in PEP8.

From python at  Thu Apr 14 15:27:29 2016
From: python at (Erik)
Date: Thu, 14 Apr 2016 20:27:29 +0100
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

On 14/04/16 18:23, Guido van Rossum wrote:
> If the discussion ends up with rough consensus on changing this I will
> happily change it back (and change all other occurrences to match the
> new rule).

Interestingly, the sentence above specifically binds the 'and' with what 
follows, via parentheses. It is not part of the preceding text to 
indicate something follows ...

 > however cogent arguments for/against the status quo (or for
 > relinquishing the rule altogether) are welcome.

What I said above pretty much sums up my argument for this (it came up 
in a different context a month or so ago).

I think that is also Matthias's argument (though I won't speak for him).

English speakers will not usually accentuate the 'and's and 'or's in a 
sentence before pausing. To me, I read:

   if foo == bar and \
      baz == spam:

as: "if foo equals bar AND, baz equals spam".

(emphasis on the "and", a pause before 'baz').

Where I will read:

   if  foo == bar \
   and baz == spam:

as: "If foo equals bar, and baz equals spam"

(pause after 'bar').

Just my personal opinion. The problem with this sort of style issue is 
that it's a pattern-based thing. If one is used to reading code written 
with a particular pattern then it's hard to adjust to another. So I 
don't think you'll get a complete consensus one way or the other.


From python at  Thu Apr 14 15:41:58 2016
From: python at (Erik)
Date: Thu, 14 Apr 2016 20:41:58 +0100
Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come before
 line break
In-Reply-To: <neoqkr$jvb$>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

On 14/04/16 20:22, Serhiy Storchaka wrote:
> An argument for operators "+" and "-":
>      result = expr1 +
>      expr2
> is syntax error, while
>      result = expr1
>      + expr2
> is silent bug.

Interesting. But that bug potentially exists regardless of what PEP8 says.

This seems to me to be something that should be tackled separately.

I imagine that there are versions of 'expr2' that make this a valid and 
useful construct (which is why it remains) - if I've missed some 
historical discussion on this then please refer me to it, I'd like to 
understand what I'm missing on first glance.


From rosuav at  Thu Apr 14 16:04:12 2016
From: rosuav at (Chris Angelico)
Date: Fri, 15 Apr 2016 06:04:12 +1000
Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come
 before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
 <neoqkr$jvb$> <>
Message-ID: <>

On Fri, Apr 15, 2016 at 5:41 AM, Erik <python at> wrote:
> On 14/04/16 20:22, Serhiy Storchaka wrote:
>> An argument for operators "+" and "-":
>>      result = expr1 +
>>      expr2
>> is syntax error, while
>>      result = expr1
>>      + expr2
>> is silent bug.
> Interesting. But that bug potentially exists regardless of what PEP8 says.

Yes, but if you encourage people to spell it the first way, you can't
accidentally leave part of your expression out of your result.

However, I wouldn't write it like *either* of those. To me, the options are:

result = expr1 +
    expr2

and

result = expr1
    + expr2

And either of those would give an immediate IndentationError without
the parentheses. So indenting the continuation is even better than
choosing where to place the binary operator.
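
To illustrate the indentation point, here is a minimal sketch that compiles an indented continuation with and without parentheses. The helper name outcome and the literal operands are made up for this example:

```python
def outcome(source):
    """Run the source and return result, or the name of the error raised."""
    try:
        namespace = {}
        exec(compile(source, "<example>", "exec"), namespace)
        return namespace["result"]
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return type(exc).__name__


# Indented continuation lines without parentheses are rejected outright.
indented_trailing = outcome("result = 1 +\n    2\n")
indented_leading = outcome("result = 1\n    + 2\n")

# With parentheses, the same layout is one expression and works as intended.
parenthesized = outcome("result = (1\n    + 2)\n")
```

Both unparenthesized spellings fail at compile time instead of silently dropping an operand, and the parenthesized form evaluates the whole expression.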


From python at  Thu Apr 14 16:30:56 2016
From: python at (Erik)
Date: Thu, 14 Apr 2016 21:30:56 +0100
Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come
 before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
 <neoqkr$jvb$> <>
Message-ID: <>

On 14/04/16 21:04, Chris Angelico wrote:
> However, I wouldn't write it like *either* of those. To me, the options are:
> result = expr1 +
>      expr2
> and
> result = expr1
>      + expr 2

Agreed, and the second one is how I prefer to format my C code too.

> So indenting the continuation is even better than
> choosing where to place the binary operator.

The two are not mutually exclusive. Indenting the continuation tackles 
the "silent bug" issue, while choosing where to place the operator is 
purely a readability issue - hence I prefer the second of your examples 
... it addresses both problems from my POV.

However, that doesn't answer my question of when a line consisting of 
just "+ expr" is a useful thing.


From barry at  Thu Apr 14 16:34:36 2016
From: barry at (Barry Warsaw)
Date: Thu, 14 Apr 2016 16:34:36 -0400
Subject: [Python-ideas] PEP8 operator must come before line break
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

On Apr 14, 2016, at 05:25 PM, SW wrote:

>PEP8 says that: "The preferred place to break around a binary operator
>is after the operator, not before it."

Personally, I prefer keyword binary operators, e.g. 'and' and 'or',
after the line break but operator symbols (e.g. '*' or '+') before the
line break.  The way my editor syntax highlights the keywords but not
the operators makes this style the most readable to me.
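
A short sketch of what that mixed style might look like in practice. The variable names and values are invented for illustration and are not taken from any real code:

```python
width, height, border = 80, 24, 2
has_title, has_footer = True, False

# Symbol operators stay at the end of the line, before the break.
inner_area = ((width - 2 * border) *
              (height - 2 * border))

# Keyword operators move to the start of the continuation line.
needs_redraw = (inner_area > 0
                and (has_title
                     or has_footer))
```

Both layouts are ordinary parenthesized expressions, so the choice affects only how the code reads, not what it computes.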

However, I think the pep8 tool is too strict here.  PEP 8 the document doesn't
say the line break around binary operators is *required* just that it's a
preferred style.  Forcing the line break here seems like the tool is
overstepping and I would favor a relaxation of the tool.  Where I've adopted
flake8 and such, I've grumbled when it forces me to make this change.  (But I
guess not enough to file a bug. ;)

If the pep8 developers insist on reading PEP 8's "preferred" as a strict
requirement, then I would favor clarification in PEP 8 that such line breaks
are not required and that either choice of line break positioning around
binary operators is acceptable.

From zachary.ware+pyideas at  Thu Apr 14 16:41:36 2016
From: zachary.ware+pyideas at (Zachary Ware)
Date: Thu, 14 Apr 2016 15:41:36 -0500
Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come
 before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
 <neoqkr$jvb$> <>
Message-ID: <>

On Thu, Apr 14, 2016 at 3:30 PM, Erik <python at> wrote:
> However, that doesn't answer my question of when a line consisting of just
> "+ expr" is a useful thing.

It's probably not, but that's for a linter to point out.  For a
stupid, contrived, untested example, though:

class Foo:
    def __init__(self, value):
        self.value = value

    def __pos__(self):
        self.value += 1

    def __neg__(self):
        self.value -= 2

f = Foo(3)
+f
assert f.value == 4
-f
assert f.value == 2


From tjreedy at  Thu Apr 14 16:55:25 2016
From: tjreedy at (Terry Reedy)
Date: Thu, 14 Apr 2016 16:55:25 -0400
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <nep04s$djn$>

On 4/14/2016 1:23 PM, Guido van Rossum wrote:

> however cogent arguments for/against the status quo (or for
> relinquishing the rule altogether) are welcome.

Outside of Python, binary operators are (in the examples I can think of) 
more often more strongly associated with the second argument than the 
first, even to the point of switching the order of 2nd arg and operator.

English: Start with A, add B, subtract C, and assign the result to D.

   load A
   add B
   sub C
   stor D

Calculator tape (with literals, not symbols)
   B +
   C -
   D =

I suggest relinquishing the rule, except maybe to suggest consistency 
within an expression, if not the whole file.

Terry Jan Reedy

From python at  Thu Apr 14 17:22:47 2016
From: python at (Erik)
Date: Thu, 14 Apr 2016 22:22:47 +0100
Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come
 before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
 <neoqkr$jvb$> <>
Message-ID: <>

Hi Zach,

On 14/04/16 21:41, Zachary Ware wrote:
> It's probably not, but that's for a linter to point out.

I was waiting for the "linter" response (in a good way - I suspected 
that would eventually be the answer, but I was hoping to be surprised :)).

> stupid, contrived, untested example, though:

Yeah, OK. In nearly 20 years, I've never once needed to implement those 
dunder methods, but that's explanation enough for me - thanks.


From random832 at  Thu Apr 14 18:36:55 2016
From: random832 at (Random832)
Date: Thu, 14 Apr 2016 18:36:55 -0400
Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come
 before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
 <neoqkr$jvb$> <>
Message-ID: <>

On Thu, Apr 14, 2016, at 16:30, Erik wrote:
> However, that doesn't answer my question of when a line consisting of 
> just "+ expr" is a useful thing.

Well, it's a legitimate unary operator. I've never heard a good
explanation of what the point of it is, but the same can't be said
obviously for -expr.

As for allowing an expression on its own (sure, any operator _can_ have
side effects, but calling most operators for their side effects is a
huge stylistic problem with the code regardless)... the same can be said
for most operators, not just the unary + and -. It'd probably be
reasonable to make a static checking tool that gives you warnings if
almost any non-function-call expression is used as a statement. (along
with other things like most constructors, pure functions, etc). But I
think the idea of making it a syntax error has actually been brought up
recently and rejected.
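
A minimal sketch of the kind of static check described above, using only
the standard `ast` module. The function name and the choice of what to
exempt (calls, await/yield, bare strings used as docstrings) are
assumptions for illustration, not the behaviour of any existing linter:

```python
import ast


def find_useless_expression_statements(source):
    """Return (line number, source text) for expression statements that are not calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Expr):
            continue
        value = node.value
        # Calls, awaits and yields are normally written for their effect.
        if isinstance(value, (ast.Call, ast.Await, ast.Yield, ast.YieldFrom)):
            continue
        # A bare string is typically a docstring; let it through as well.
        if isinstance(value, ast.Constant) and isinstance(value.value, str):
            continue
        findings.append((node.lineno, ast.get_source_segment(source, node)))
    return sorted(findings)


sample = "x = 1\n+ x\nprint(x)\nx == 2\n"
for lineno, text in find_useless_expression_statements(sample):
    print(f"line {lineno}: expression statement may have no effect: {text}")
```

Such a check can only warn: as the Foo example earlier in the thread
shows, "+ x" may legitimately have side effects.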

From njs at  Thu Apr 14 20:45:10 2016
From: njs at (Nathaniel Smith)
Date: Thu, 14 Apr 2016 17:45:10 -0700
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

On Thu, Apr 14, 2016 at 12:28 PM, Joseph Martinot-Lagarde <
contrebasse at> wrote:
> Guido van Rossum <guido at ...> writes:
>> My rationale for this rule is that ending a line in a binary operator
>> is a clear hint to the reader that the line isn't finished. (If you
>> think about it, a comma is a kind of binary operator, and you wouldn't
>> move the comma to the start of the continuation line, would you?
> I personally tend to look more at the start of the lines because that's
> where the blocks are defined (by indentation). Also the ends of the lines are
> usually not aligned, which makes binary operators harder to see.
> Because of these two reasons I always put binary operators at the start of
> new lines, because that's where I have the most chance to see them, and I am
> in favor of changing this in PEP8.

This is the case that jumped to mind for me as well...

If I saw code like this in a code review I'd force the author to change it
because the style is outright misleading:

return (something1() +
        a * b +
        some_other_thing() ** 2 -
        f -
        normalizer)

We're summing a list of items with some things being negated, but that
structure is impossible to see, and worst of all, the association between
the operator and the thing being operated on is totally lost. OTOH if we
write like this:

return (something1()
        + a * b
        + some_other_thing() ** 2
        - f
        - normalizer)

then the list structure is immediately obvious, and it's immediately
obvious which terms are being added and which are being subtracted.

Similarly, who can even tell if this code is correct:

return (something1() +
        a * b +
        some_other_thing() ** 2 -
        f /
        normalizer)

but if the same code is formatted this way then it's immediately obvious
that the code is buggy:

return (something1()
        + a * b
        + some_other_thing() ** 2
        - f
        / normalizer)

It should be corrected to:

return ((something1()
         + a * b
         + some_other_thing() ** 2
         - f)
        / normalizer)

(I'm calling the final item "normalizer" because this pattern actually
comes up fairly often in bayesian computations -- if you're working in log
space then at the end of some computation you subtract off a magic
normalizing factor, and if you're working in linear space then at the end
you divide off a magic normalizing factor. You can also get a similar
pattern for chains of * and /, though it's less common.)
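
To make the precedence point concrete, here is a small sketch with
made-up numbers (the variable names and values are assumptions chosen
only for illustration). Division binds more tightly than subtraction, so
without the extra parentheses only the last term is normalized:

```python
total = 10.0
f = 4.0
normalizer = 2.0

# Division binds tighter than subtraction, so only f is divided.
buggy = (total
         - f
         / normalizer)

# Parenthesising the sum first divides the whole expression.
fixed = ((total
          - f)
         / normalizer)

print(buggy)  # 8.0
print(fixed)  # 3.0
```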

In all of these cases, the hanging indent makes the fact that we have a
continuation line obvious from across the room -- I don't need help knowing
that there's a continuation, I need help figuring out what the
computation actually does :-).

I actually find a similar effect for and/or chains, where the
beginning-of-line format makes things lined up and easier to scan -- e.g.
comparing these two options:

return (something1() and
        f() and
        some_expression > 1 and
        (some_other_thing() == whatever or
         blahblah or
         asdf))

return (something1()
        and f()
        and some_expression > 1
        and (some_other_thing() == whatever
             or blahblah
             or asdf))

then I find the second option dramatically more readable. But maybe that's
just me -- for and/or the argument is a bit less compelling, because you
don't regularly have chains of different operators with the same
precedence, like you do with +/-.

I guess this might have to do with a more underlying stylistic difference:
as soon as I have an expression that stretches across multiple lines, I try
to break it into pieces where each line is a meaningful unit, even if this
doesn't produce the minimum number of lines. For example I generally avoid
writing stuff like:

return (something1() and f() and some_expression > 1 and
        (some_other_thing() == whatever or blahblah or asdf))

return (something1() + a * b + some_other_thing() ** 2 -
        f - normalizer)


BIKESHEDDING REDUCTION ACT NOTICE: I hereby swear that I have said
everything useful I have to say about this topic and that this will be my
only contribution to this thread unless someone addresses me directly.

Nathaniel J. Smith --

From ethan at  Thu Apr 14 21:07:48 2016
From: ethan at (Ethan Furman)
Date: Thu, 14 Apr 2016 18:07:48 -0700
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

On 04/14/2016 05:45 PM, Nathaniel Smith wrote:
> On Thu, Apr 14, 2016 at 12:28 PM, Joseph Martinot-Lagarde wrote:

> I actually find a similar effect for and/or chains, where the
> beginning-of-line format makes things lined up and easier to scan --
> e.g. comparing these two options:
> return (something1() and
>          f() and
>          some_expression > 1 and
>          (some_other_thing() == whatever or
>           blahblah or
>           asdf))
> return (something1()
>          and f()
>          and some_expression > 1
>          and (some_other_thing() == whatever
>               or blahblah
>               or asdf))
> then I find the second option dramatically more readable. But maybe
> that's just me -- for and/or the argument is a bit less compelling,
> because you don't regularly have chains of different operators with the
> same precedence, like you do with +/-.

I agree.  ;)

Most of my continuation lines are with `and` and `or`, and I try to keep 
the logical pieces on one line with the join `and` or `or` beginning the 
next line.

Much easier for me to read.


From guido at  Thu Apr 14 22:28:20 2016
From: guido at (Guido van Rossum)
Date: Thu, 14 Apr 2016 19:28:20 -0700
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

OK, I get it. Brandon's slides with the Knuth references were
especially useful. So let's change the PEP!

Who wants to draft a diff?

--Guido van Rossum (

From steve at  Fri Apr 15 00:27:39 2016
From: steve at (Steven D'Aprano)
Date: Fri, 15 Apr 2016 14:27:39 +1000
Subject: [Python-ideas] Silent bugs - was Re: PEP8 operator must come
 before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
 <neoqkr$jvb$> <>
Message-ID: <>

On Thu, Apr 14, 2016 at 09:30:56PM +0100, Erik wrote:

> However, that doesn't answer my question of when a line consisting of 
> just "+ expr" is a useful thing.

Almost nowhere.

But, if you have a DSL that operates in an imperative fashion (giving 
commands, rather than returning results in a functional fashion) you 
might do something like:


which is modelled after the doctest directives:

# doctest: +SKIP

etc. I don't think this imperative style is a really good fit for 
Python's syntax, but it could work for some people.
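
As a purely hypothetical sketch of such an imperative DSL (the Flag
class and the flag names are assumptions made up for illustration, not
taken from this message or from doctest itself), unary plus and minus
can be given side effects so that a bare "+ expr" line does something:

```python
class Flag:
    """A hypothetical option that unary + enables and unary - disables."""

    def __init__(self, name):
        self.name = name
        self.enabled = False

    def __pos__(self):
        self.enabled = True
        return self

    def __neg__(self):
        self.enabled = False
        return self


SKIP = Flag("SKIP")
ELLIPSIS = Flag("ELLIPSIS")

+SKIP       # a statement consisting only of "+ expr": turns the flag on
+ELLIPSIS
-SKIP       # and "- expr" turns it off again

print(SKIP.enabled, ELLIPSIS.enabled)  # False True
```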


From ncoghlan at  Fri Apr 15 02:43:25 2016
From: ncoghlan at (Nick Coghlan)
Date: Fri, 15 Apr 2016 16:43:25 +1000
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP
 3121/PEP 489
In-Reply-To: <>
References: <>
Message-ID: <>

On 14 April 2016 at 23:51, Nikita Nemkin <nikita at> wrote:
> PyModuleDef appears to be PyTypeObject surrogate with similar
> (or identical?) semantics. That's an extra concept to learn about,
> an extra bit of documentation to consult when writing new code.
> PyTypeObject, on the other hand, is fundamental and unavoidable,
> regardless of module init system.

The internal details of PyTypeObject are eminently avoidable, either
by not defining your own custom types in C code at all (you can get a
long way with C level acceleration just by defining functions and
using instances of existing types), or else by only defining them
dynamically as heap types (the kind created by class statements) via

To answer your original question, though, PEP 489 needs to be read in the
context of PEP 451, which was the one that switched the overall import
system over to the multi-phase import model:

One of the main goals of importlib in general is to let the
interpreter do more of the heavy lifting for things that absolutely
have to be done correctly if you want your import hook or module to
behave "normally". With Python level modules, the interpreter has
always taken care of creating the module for "normal" imports, with
the module author only having to care about populating that namespace
with content (by running Python code).

PEP 451 extended that same convenience to authors of module loaders,
as they could now just define a custom exec_module, and use the
default module creation code rather than having to write their own as
part of a load_module implementation.
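
A minimal Python-level sketch of that division of labour, assuming a
made-up loader class and module name (ConstantsLoader and
demo_constants are hypothetical). The loader defines only exec_module
and inherits the default create_module, which returns None, so the
import machinery creates an ordinary module object for it:

```python
import importlib.abc
import importlib.util
import types


class ConstantsLoader(importlib.abc.Loader):
    """Hypothetical loader that only populates the module namespace."""

    def __init__(self, values):
        self._values = values

    # create_module is deliberately not overridden: the inherited version
    # returns None, which asks the import system for a default module.
    def exec_module(self, module):
        module.__dict__.update(self._values)


spec = importlib.util.spec_from_loader(
    "demo_constants", ConstantsLoader({"ANSWER": 42}))
module = importlib.util.module_from_spec(spec)  # default module creation
spec.loader.exec_module(module)                 # loader fills in the namespace

print(type(module) is types.ModuleType)  # True
print(module.ANSWER)  # 42
```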

PEP 489 then brought that capability to extension modules: extension
module authors can now decide not to worry about module creation at
all, and instead just use the Exec hook to populate the standard
module object that CPython provides by default.

That means caring about the module creation step in an extension
module is now primarily a matter of performance optimisation for
access to module global state - as Stefan notes, the indirection
mechanism in PEP 3121 can be significantly slower than using C level
static variables, and indirection through a Python level namespace is
likely to be even slower.

However, even in those cases, the PEP 489 mechanism gives the
extension module reliable access to information it didn't previously
have access to, since the create method receives a fully populated
module spec.

Once the question is narrowed down to "How can an extension module
fully support subinterpreters and multiple Py_Initialize/Finalize
cycles without incurring PEP 3121's performance overhead?" then the
short answer becomes "We don't know, but ideas for that are certainly
welcome, either here or over on import-sig".

Returning custom module subclasses from the Create hook is certainly
one mechanism for that (it's why supporting such subclasses was a
design goal for PEP 489), but like other current solutions, they run
afoul of the problem that methods defined in C extension modules
currently don't receive a reference to the defining module, only to
the class instance (which is a general problem with the way methods
are defined in C rather than a problem with the import system).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ianlee1521 at  Fri Apr 15 03:07:33 2016
From: ianlee1521 at (Ian Lee)
Date: Fri, 15 Apr 2016 00:07:33 -0700
Subject: [Python-ideas] PEP8 operator must come before line break
In-Reply-To: <>
References: <BLU436-SMTP729AF492CC1AE1A82CC892B8970@phx.gbl>
Message-ID: <>

> On Apr 14, 2016, at 19:28, Guido van Rossum <guido at> wrote:
> OK, I get it. Brandon's slides with the Knuth references were
> especially useful. So let's change the PEP!
> Who wants to draft a diff? <>

~ Ian Lee | IanLee1521 at <mailto:IanLee1521 at>

From nikita at  Fri Apr 15 04:59:35 2016
From: nikita at (Nikita Nemkin)
Date: Fri, 15 Apr 2016 13:59:35 +0500
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP
 3121/PEP 489
In-Reply-To: <>
References: <>
Message-ID: <>

Thanks for your input. I now see how things evolved to the present state.

In the context of PEP 451, my proposal would have been to move
all default module creation tasks to ModuleType.tp_new (taking
an optional spec parameter), making separate create and exec
unnecessary. Too late, I guess.

> Once the question is narrowed down to "How can an extension module
> fully support subinterpreters and multiple Py_Initialize/Finalize
> cycles without incurring PEP 3121's performance overhead?" then the
> short answer becomes "We don't know, but ideas for that are certainly
> welcome, either here or over on import-sig".

I mentioned the way to avoid state access overhead in my first post.
It's independent of module loading mechanism:

1) define a new "calling convention" flag like METH_GLOBALS.
2) store module ref in PyCFunctionObject.m_module
    (currently it stores only the module name)
3) pass module ref as an extra arg to methods with METH_GLOBALS flag.
4) PyModule_State, reimplemented as a macro, would amount to one
    indirection from the passed parameter.

I suspect that most C ABIs allow passing the extra arg unconditionally,
(this is certainly the case for x86 and x64 on Windows and Linux).
Meaning that METH_GLOBALS won't increase the actual number
of possible dispatch targets in PyCFunction_Call and won't impact
Python-to-C call performance at all.

From stefan at  Fri Apr 15 05:22:34 2016
From: stefan at (Stefan Krah)
Date: Fri, 15 Apr 2016 09:22:34 +0000 (UTC)
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP
 3121/PEP 489
References: <>
Message-ID: <>

Nikita Nemkin <nikita at ...> writes:
> I mentioned the way to avoid state access overhead in my first post.
> It's independent of module loading mechanism:

It's great to see people discussing this. I must clarify the 20%
slowdown figure that I posted earlier:

The slowdown was due to changing the static variables to module state
*and* the static types to heap types.  It was recommended at the time
to do both or nothing.

I haven't measured the module state impact in isolation.

Stefan Krah

From encukou at  Fri Apr 15 05:24:58 2016
From: encukou at (Petr Viktorin)
Date: Fri, 15 Apr 2016 11:24:58 +0200
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP
 3121/PEP 489
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/15/2016 10:59 AM, Nikita Nemkin wrote:
> Thanks for your input. I now see how things evolved to the present state.
> in the context of PEP 451, my proposal would have been to move
> all default module creation tasks to ModuleType.tp_new (taking
> an optional spec parameter), making separate create and exec
> unnecessary. Too late, I guess.
>> Once the question is narrowed down to "How can an extension module
>> fully support subinterpreters and multiple Py_Initialize/Finalize
>> cycles without incurring PEP 3121's performance overhead?" then the
>> short answer becomes "We don't know, but ideas for that are certainly
>> welcome, either here or over on import-sig".
> I mentioned the way to avoid state access overhead in my first post.
> It's independent of module loading mechanism:
> 1) define a new "calling convention" flag like METH_GLOBALS.
> 2) store module ref in PyCFunctionObject.m_module
>     (currently it stores only the module name)

Wouldn't that break backwards compatibility, though?

> 3) pass module ref as an extra arg to methods with METH_GLOBALS flag.
> 4) PyModule_State, reimplemented as a macro, would amount to one
>     indirection from the passed parameter.
> I suspect that most C ABIs allow to pass the extra arg unconditionally,
> (this is certainly the case for x86 and x64 on Windows and Linux).
> Meaning that METH_GLOBALS won't increase the actual number
> of possible dispatch targets in PyCFunction_Call and won't impact
> Python-to-C call performance at all.

My planned approach is a bit more flexible:
- Add a reference to the module (ht_module) to heap types
- Create a calling convention METH_METHOD, where methods are passed the
class that defines the method (which might be Py_TYPE(self) or a superclass
of it)

This way methods can get both module state and the class they are
defined on, and the replacement for PyModule_State is two indirections.

Still, both approaches won't work with slot methods (e.g. nb_add), where
there's no space in the API to add an extra argument. Nick proposed a
solution in import-sig [0], which is workable but not elegant. But, I think:
- METH_METHOD would be useful even if it doesn't solve the problem with
slot methods.
- A good solution to the slot methods problem is unlikely to render
METH_METHOD obsolete.
so perhaps solving the 90% case first would be OK.


From nikita at  Fri Apr 15 06:58:53 2016
From: nikita at (Nikita Nemkin)
Date: Fri, 15 Apr 2016 15:58:53 +0500
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP
 3121/PEP 489
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 15, 2016 at 2:24 PM, Petr Viktorin <encukou at> wrote:
>> 1) define a new "calling convention" flag like METH_GLOBALS.
>> 2) store module ref in PyCFunctionObject.m_module
>>     (currently it stores only the module name)
> Wouldn't that break backwards compatibility, though?

It will, and I consider this level of breakage acceptable. Alternatively,
another field can be added to this struct.

> My planned approach is a bit more flexible:
> - Add a reference to the module (ht_module) to heap types
> - Create a calling convention METH_METHOD, where methods are passed the
> class that defines the method (which might PyTYPE(self) or a superclass
> of it)
> This way methods can get both module state and the class they are
> defined on, and the replacement for PyModule_State is two indirections.

I've read the linked import-sig thread and realized the depth of issues
involved...
Those poor heap type methods don't even have access to their own type
pointer! In the light of that, your variant is more useful than mine.

Still, without a good slot support option, new METH_X conventions
don't look attractive at all. Such a fundamental change, yet it solves only
half of the problem.

Also, MRO walking is actually not as slow as it seems.
Checking Py_TYPE(self) and Py_TYPE(self)->tp_base *inline*
will minimize performance overhead in the (very) common case.

If non-slot methods had a suitable static anchor (the equivalent
of slot function address for slots), they could use MRO walking too.

From ncoghlan at  Fri Apr 15 07:01:49 2016
From: ncoghlan at (Nick Coghlan)
Date: Fri, 15 Apr 2016 21:01:49 +1000
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP
 3121/PEP 489
In-Reply-To: <>
References: <>
Message-ID: <>

On 15 April 2016 at 18:59, Nikita Nemkin <nikita at> wrote:
> Thanks for your input. I now see how things evolved to the present state.
> in the context of PEP 451, my proposal would have been to move
> all default module creation tasks to ModuleType.tp_new (taking
> an optional spec parameter), making separate create and exec
> unnecessary. Too late, I guess.

That doesn't work either, as not only aren't modules in general
actually required to be instances of ModuleType (see [1]), we also
need to be able to create modules to hold __main__, os, sys and
_frozen_importlib before we have an import system to manipulate.
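
The first point can be seen from pure Python. In this small sketch the
Settings class and the fake_settings name are made up for illustration;
the import statement simply returns whatever object is registered in
sys.modules, whether or not it is a ModuleType instance:

```python
import sys


class Settings:
    """An ordinary class; its instance will stand in for a module."""
    debug = True


# Register an arbitrary object under a module name.
sys.modules["fake_settings"] = Settings()

import fake_settings  # found in sys.modules, so no loader is involved

print(type(fake_settings).__name__)  # Settings
print(fake_settings.debug)  # True
```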

That's a large part of the reason we hived off import-sig from
python-ideas a while back - the import system involves a whole lot of
intertwined arcana stemming from accidents-of-implementation early in
Python's history, as well as the flexible import hook system that was
defined in PEP 302, so a separate list has proven useful for thrashing
out technical details, while we tend to use python-dev and
python-ideas more to check the end result is still comprehensible to
folks that aren't familiar with all those internals :)



Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From encukou at  Fri Apr 15 09:15:07 2016
From: encukou at (Petr Viktorin)
Date: Fri, 15 Apr 2016 15:15:07 +0200
Subject: [Python-ideas] Module lifecycle: simple alternative to PEP
 3121/PEP 489
In-Reply-To: <>
References: <>
Message-ID: <>

Let's move the discussion to import-sig, as Nick explained in the other
subthread. Please drop python-ideas from CC when you reply.

On 04/15/2016 12:58 PM, Nikita Nemkin wrote:
> On Fri, Apr 15, 2016 at 2:24 PM, Petr Viktorin <encukou at> wrote:
>>> 1) define a new "calling convention" flag like METH_GLOBALS.
>>> 2) store module ref in PyCFunctionObject.m_module
>>>     (currently it stores only the module name)
>> Wouldn't that break backwards compatibility, though?
> It will, and I consider this level of breakage acceptable. Alternatively,
> another field can be added to this struct.
>> My planned approach is a bit more flexible:
>> - Add a reference to the module (ht_module) to heap types
>> - Create a calling convention METH_METHOD, where methods are passed the
>> class that defines the method (which might PyTYPE(self) or a superclass
>> of it)
>> This way methods can get both module state and the class they are
>> defined on, and the replacement for PyModule_State is two indirections.
> I've read the linked import-sig thread and realized the depth of issues
> involved...
> Those poor heap type methods don't even have access their own type
> pointer! In the light of that, your variant is more useful than mine.
> Still, without a good slot support option, new METH_X conventions
> don't look attractive at all. Such fundamental change, but only solves
> half of the problem.

Well, it solves the problem for methods that have calling conventions,
and I'm pretty sure by now that a full solution will need this *plus*
another solution for slots. So I'm looking at the problems as two
separate parts, and I also think that when it comes to writing PEPs,
having two separate PEPs would make this more understandable.

> Also, MRO walking is actually not as slow as it seems.
> Checking Py_TYPE(self) and Py_TYPE(self)->tp_base *inline*
> will minimize performance overhead in the (very) common case.

I think so as well. This would mean that module state access in named
methods is fast; in slot methods it's possible (and usually fast
*enough*), and the full solution with __typeslots__ would still be possible.

> If non-slot methods had a suitable static anchor (the equivalent
> of slot function address for slots), they could use MRO walking too.

I think a new METH_* calling style and explicit pointers is a better
alternative here.

From egregius313 at  Fri Apr 15 09:41:41 2016
From: egregius313 at (Ed Minnix)
Date: Fri, 15 Apr 2016 09:41:41 -0400
Subject: [Python-ideas] Adding a pipe function to functools
Message-ID: <>


I have been looking over the toolz library and one of the functions I like the most is pipe. Since most programmers are familiar with piping (via the Unix `|` symbol), and it can help make tighter code, I think it would be nice to add it to the standard library (such as functools).
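
For readers who have not used it, a pipe function of this kind can be
sketched in a few lines with functools.reduce. This is an illustrative
sketch assuming the pipe(data, *funcs) calling style, not the toolz
source and not a concrete proposal for functools:

```python
from functools import reduce


def pipe(data, *funcs):
    """Pass data through each function in turn, left to right."""
    return reduce(lambda value, func: func(value), funcs, data)


# Square the numbers 0..9, add them up, and convert the total to a string.
result = pipe(range(10), lambda xs: (x * x for x in xs), sum, str)
print(result)  # 285
```

The call reads in the order the steps run, unlike the nested form
str(sum(x * x for x in range(10))).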

- Ed Minnix

From wes.turner at  Fri Apr 15 12:25:51 2016
From: wes.turner at (Wes Turner)
Date: Fri, 15 Apr 2016 11:25:51 -0500
Subject: [Python-ideas] Adding a pipe function to functools
In-Reply-To: <>
References: <>
Message-ID: <>

"[Python-ideas] The pipe protocol, a convention for extensible method
On Apr 15, 2016 9:51 AM, "Ed Minnix" <egregius313 at> wrote:


I have been looking over the toolz library and one of the functions I like
the most is pipe. Since most programmers are familiar with piping (via the
Unix `|` symbol), and it can help make tighter code, I think it would be
nice to add it to the standard library (such as functools).

- Ed Minnix

From richard.prosser at  Fri Apr 15 12:57:05 2016
From: richard.prosser at (Richard Prosser)
Date: Fri, 15 Apr 2016 17:57:05 +0100
Subject: [Python-ideas] Welcome to the "Python-ideas" mailing list
In-Reply-To: <>
References: <>
Message-ID: <>

Is there any mileage in having a naming convention to indicate the type of
a variable? I have never really liked the fact that the Python 'duck
typing' policy is so lax, yet the new "Type Hints" package for Python 3 is
rather clumsy, IMO.

For example:

github_response = requests.get('', auth=('user', 'pass'))
 # Derived from

The above request returns a Response object
and so the variable has 'response' in its name.

Likewise:

word_count = total_words_in_file('text_file')

where 'count' has been defined (in the IDE, by the user perhaps) as an
Integer and the function is known to return an Integer, perhaps via a local
'count' or 'total' variable.

I know that this has been attempted before but I think that an IDE like
PyCharm could actually check variable usage and issue a warning if a
conflict is detected. Also earlier usages of this 'Hungarian Notation' have
largely been applied to compiled languages - rather strangely, in the case
of known types - rather than an interpreted one like Python.
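
A rough sketch of what such a check could look like, written as a
runtime check over a dictionary of names rather than as an IDE feature.
The suffix table and the check_names function are hypothetical and only
illustrate the idea; no existing tool is implied:

```python
# Hypothetical suffix-to-type table; a real tool would let the user configure it.
SUFFIX_TYPES = {"_count": int, "_name": str}


def check_names(namespace):
    """Return (name, expected type, actual type) for values that contradict their suffix."""
    problems = []
    for name, value in namespace.items():
        for suffix, expected in SUFFIX_TYPES.items():
            if name.endswith(suffix) and not isinstance(value, expected):
                problems.append((name, expected.__name__, type(value).__name__))
    return problems


print(check_names({"word_count": 42, "line_count": "seven", "file_name": "a.txt"}))
# [('line_count', 'int', 'str')]
```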

Please note that I have shown suffixes above but prefixes could also be
valid. I am not sure about relying on 'type strings' *within* a variable
name however.

Is this idea feasible, do you think?

Thanks ...


PS Originally posted in

From guido at  Fri Apr 15 13:03:53 2016
From: guido at (Guido van Rossum)
Date: Fri, 15 Apr 2016 10:03:53 -0700
Subject: [Python-ideas] Welcome to the "Python-ideas" mailing list
In-Reply-To: <>
References: <>
Message-ID: <>

Thanks for the positive tone of your message.

I do think there's some benefit in having a naming convention that's
consistent within your own code base -- especially if you're working with a
team and you can agree on a convention before you have written much code.

Joel Spolsky agrees:

On Fri, Apr 15, 2016 at 9:57 AM, Richard Prosser <richard.prosser at>
wrote:

> Is there any mileage in having a naming convention to indicate the type of
> a variable? I have never really liked the fact that the Python 'duck
> typing' policy is so lax, yet the new "Type Hints" package for Python 3 is
> rather clumsy, IMO.
> For example:
> github_response = requests.get('', auth=('user', 'pass'))
>  # Derived from
> The above request returns a Response object
> and so the variable has 'response' in its name.
> Likewise:
> word_count = total_words_in_file('text_file')
> where 'count' has been defined (in the IDE, by the user perhaps) as an
> Integer and the function is known to return an Integer, perhaps via a local
> 'count' or 'total' variable.
> I know that this has been attempted before but I think that an IDE like
> PyCharm could actually check variable usage and issue a warning if a
> conflict is detected. Also earlier usages of this 'Hungarian Notation' have
> largely been applied to compiled languages - rather strangely, in the case
> of known types - rather than an interpreted one like Python.
> Please note that I have shown suffixes above but prefixes could also be
> valid. I am not sure about relying on 'type strings' *within* a variable
> name however.
> Is this idea feasible, do you think?
> Thanks ...
> Richard
> PS Originally posted in

--Guido van Rossum (

From ethan at  Fri Apr 15 13:05:15 2016
From: ethan at (Ethan Furman)
Date: Fri, 15 Apr 2016 10:05:15 -0700
Subject: [Python-ideas] Welcome to the "Python-ideas" mailing list
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/15/2016 09:57 AM, Richard Prosser wrote:

> Is there any mileage in having a naming convention to indicate the type
> of a variable? I have never really liked the fact that the Python 'duck
> typing' policy is so lax, yet the new "Type Hints" package for Python 3
> is rather clumsy, IMO.

Hungarian notation can be very helpful if used meaningfully.  Your 
examples are good:

word_count  (not int_count)

github_result (not str_result)

However, it is still a style question, and as such it will not be 
added/enforced by Python.

If you just want to discuss the merits of Hungarian Notation then 
Python-List is a better place to go.


From steve at  Fri Apr 15 13:04:13 2016
From: steve at (Steven D'Aprano)
Date: Sat, 16 Apr 2016 03:04:13 +1000
Subject: [Python-ideas] Adding a pipe function to functools
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 15, 2016 at 09:41:41AM -0400, Ed Minnix wrote:
> Hello,
> I have been looking over the toolz library and one of the functions I 
> like the most is pipe. Since most programmers are familiar with piping 
> (via the Unix `|` symbol), and it can help make tighter code, I think 
> it would be nice to add it to the standard library (such as 
> functools).

I don't know the toolz library, but I have this:

Is that the sort of thing you mean?

It's certainly not ready yet for the std lib, but if there's interest in 
it, I should be able to tidy it up and publish it. 


From jbvsmo at  Fri Apr 15 13:29:40 2016
From: jbvsmo at (=?UTF-8?Q?Jo=C3=A3o_Bernardo?=)
Date: Fri, 15 Apr 2016 14:29:40 -0300
Subject: [Python-ideas] Adding a pipe function to functools
In-Reply-To: <>
References: <>
Message-ID: <>

I really like this tool:

It is really a matter of implementing the __or__ operator on iterators. Or
possibly, as the module above does, adding the operator to all functions that
take an iterator as their first argument.
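
One way to get this effect without touching the built-in iterator types
is a small wrapper that implements the reflected operator __ror__. The
Pipe class below is a hypothetical sketch for illustration and is not
the API of the tool mentioned above:

```python
class Pipe:
    """Wrap a function so that `value | Pipe(func)` calls func(value)."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, value):
        # Called for `value | self` when value does not handle `|` itself.
        return self.func(value)


evens = Pipe(lambda xs: (x for x in xs if x % 2 == 0))
doubled = Pipe(lambda xs: (2 * x for x in xs))
as_list = Pipe(list)

result = range(6) | evens | doubled | as_list
print(result)  # [0, 4, 8]
```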

From julien at  Fri Apr 15 14:05:45 2016
From: julien at (Julien Palard)
Date: Fri, 15 Apr 2016 18:05:45 +0000
Subject: [Python-ideas] Adding a pipe function to functools
In-Reply-To: <>
References: <>
Message-ID: <>


On 15 April 2016 at 15:41:41 GMT+02:00, Ed Minnix <egregius313 at> wrote:
>Since most programmers are familiar with piping
>(via the Unix `|` symbol)

See also:

Julien Palard

From steve at  Fri Apr 15 14:19:02 2016
From: steve at (Steven D'Aprano)
Date: Sat, 16 Apr 2016 04:19:02 +1000
Subject: [Python-ideas] Hungarian notation [was Welcome ...]
In-Reply-To: <>
References: <>
Message-ID: <>

(Changing the subject line to something a little more relevant.)

Hi Richard, and welcome.

My responses are below, interleaved between your comments as 

On Fri, Apr 15, 2016 at 05:57:05PM +0100, Richard Prosser wrote:

> Is there any mileage in having a naming convention to indicate the type of
> a variable? 

Perhaps not mileage, as such, but maybe yardage or even inchage :-)

Certainly there are naming conventions which are very common:

s for strings;
n for ints;
i, j, k for indexes, especially in a loop;
o or obj for arbitrary objects of any type;
x, y for floats or arbitrary numbers;

etc. They're good for short, generic functions, and it wouldn't surprise 
me if linters and type-checkers had an option to complain if they see 
what looks like misused standard names:

n = 23  # okay
n = {}  # perhaps not?

For less generic functions, we should pick names which describe the 
purpose of the variable. That's usually far more important than its 
type, especially in languages like Python where *variables* (the names 
themselves) are untyped, only the values assigned to them have types.

> I have never really liked the fact that the Python 'duck
> typing' policy is so lax, yet the new "Type Hints" package for 
> Python 3 is rather clumsy, IMO.

Oh? Can you give an example of type annotations or declarations which 
aren't clumsy?

> For example:
> github_response = requests.get('',
> auth=('user', 'pass'))
>  # Derived from
> The above request returns a Response
> <> object
> and so the variable has 'response' in its name.

I consider that a perfectly reasonable variable name, since it describes 
the variable and gives an idea of its purpose. The fact that it is of 
type Response is not as important as the fact that it is a response from 
a web-server.

> Likewise:
> word_count = total_words_in_file('text_file')
> where 'count' has been defined (in the IDE, by the user perhaps) as an
> Integer 

Declaring that "count" is an integer doesn't tell us anything about 
"word_count". They're completely different variables.

> and the function is known to return an Integer, perhaps via a local
> 'count' or 'total' variable.

A good type-checker should be able to infer that if total_words_in_file 
returns an int, then word_count is also an int. (At least up to the 
point where it is re-bound to another value.)
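
A small sketch of what that looks like with function annotations. The body of total_words_in_file and the helper describe are assumptions made up for this example; only the function name comes from the message being replied to.

```python
def total_words_in_file(path: str) -> int:
    """Count whitespace-separated words in a text file."""
    with open(path, encoding="utf-8") as handle:
        return sum(len(line.split()) for line in handle)


def describe(path: str) -> str:
    # No annotation on word_count: a checker can infer ``int`` from the
    # declared return type of total_words_in_file.
    word_count = total_words_in_file(path)
    # A checker would be expected to reject ``word_count + " words"`` here,
    # since that adds an int to a str.
    return str(word_count) + " words"
```

The annotation is written once, on the function, and the variable name stays free to describe purpose rather than type.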

> I know that this has been attempted before but I think that an IDE like
> PyCharm could actually check variable usage and issue a warning if a
> conflict is detected.

That's the purpose of MyPy and related projects.

(By the way, for the avoidance of doubt, you should understand that 
Python will never force the use of type annotations or static typing.)

> Also earlier usages of this 'Hungarian Notation' have
> largely been applied to compiled languages - rather strangely, in the case
> of known types - rather than an interpreted one like Python.

You should be aware that there are two forms of Hungarian Notation. One 
is useless and widely despised, and the other is helpful but rarely used 
because its reputation was ruined by the other kind.

> Please note that I have shown suffixes above but prefixes could also be
> valid. I am not sure about relying on 'type strings' *within* a variable
> name however.

I'm not sure I understand what you mean here.

> Is this idea feasible, do you think?

If you mean, could Python enforce type hints through naming conventions, 
I think not. Do you really want to see code like:

numpages_Integer_Or_None = self.count_pages()

While some form of Hungarian Notation is useful, using it everywhere as 
a type hint is going to get frustrating really fast. A good type hint or 
declaration should happen once:

Declare numpages: Integer|None   # making up some syntax

numpages = self.count_pages()
if numpages is not None:
    for page_num in range(1, numpages+1):
        print("Page %d of %d." % (page_num, numpages))


numpages_Integer_Or_None = self.count_pages()
if numpages_Integer_Or_None is not None:
    for page_num in range(1, numpages_Integer_Or_None+1):
        print("Page %d of %d." % (page_num, numpages_Integer_Or_None))

The second is too tiresome to read and write. Using abbreviations will 
decrease the typing burden, but increase the burden of remembering what 
those abbreviations mean.
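
For comparison, here is roughly how the "declare it once" version can be spelled with the typing module, using a type comment on the variable. The Document class and page_labels function are invented for this sketch.

```python
from typing import List, Optional


class Document:
    def __init__(self, pages: Optional[int] = None) -> None:
        self._pages = pages

    def count_pages(self) -> Optional[int]:
        # None stands for "page count not known".
        return self._pages


def page_labels(doc: Document) -> List[str]:
    numpages = doc.count_pages()  # type: Optional[int]
    labels = []
    if numpages is not None:
        # Inside this branch a checker can narrow numpages to int.
        for page_num in range(1, numpages + 1):
            labels.append("Page %d of %d." % (page_num, numpages))
    return labels
```

The type appears once, next to the declaration, and the name numpages stays short everywhere else.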


From julien at  Fri Apr 15 14:24:22 2016
From: julien at (Julien Palard)
Date: Fri, 15 Apr 2016 18:24:22 +0000
Subject: [Python-ideas] Adding a pipe function to functools
In-Reply-To: <>
References: <>
Message-ID: <>


On 15 April 2016 at 19:29:40 GMT+02:00, "João Bernardo" <jbvsmo at> wrote:
>I really like this tool:

First, thank you João!

However, I almost never use it myself. I dislike the idea of exposing an overloaded operator (it may cause surprises), and there are in fact very few places where it makes code clearer without reducing maintainability. Working with a long pipe is opaque: no variables are involved, so there is no logging, no breakpoints, and no names, and well-chosen names help readability.

Also, using my Pipe module forces you to mix infix with prefix calls, and I dislike mixing syntaxes. Still, a DSL can be nice when it is really useful; think of SQL, typically.

Finally, the resulting code clearly does not look like Python code. Even if it is readable on its own terms, it may hurt readability by causing a surprise like "wow, what is that, how is it possible?", almost forcing one to read the Pipe docs instead of simply reading Python code.

All of this obviously also applies to the idea of adding an infix call syntax to the stdlib, so I'm -1 on it.

Julien Palard

From wes.turner at  Fri Apr 15 14:51:45 2016
From: wes.turner at (Wes Turner)
Date: Fri, 15 Apr 2016 13:51:45 -0500
Subject: [Python-ideas] Adding a pipe function to functools
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 15, 2016 9:51 AM, "Ed Minnix" <egregius313 at> wrote:
> Hello,
> I have been looking over the toolz library and one of the functions I
> like the most is pipe. Since most programmers are familiar with piping (via
> the Unix `|` symbol), and it can help make tighter code, I think it would
> be nice to add it to the standard library (such as functools).

Docs: http://

> def pipe(data, *funcs):
>    """ [...] """
>    for func in funcs:
>        data = func(data)
>    return data
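
For readers who have not used it, a short usage sketch of the function quoted above. The input string and the choice of str methods are just an example.

```python
def pipe(data, *funcs):
    """Pass data through each function in turn, left to right."""
    for func in funcs:
        data = func(data)
    return data


# Equivalent to str.split(str.lower(str.strip("  Hello World  "))),
# but read in the order the steps are applied.
words = pipe("  Hello World  ", str.strip, str.lower, str.split)
print(words)
```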

`|` is a "vertical bar":

In Python, | is the 'bitwise or' operator __or__:

* [ ] the operator. docs seem to omit ``|`` #TODO

... Sarge also has `|` within command strings for command pipelines

... functools.compose() & functools.partial()


    from functional import compose, partial
    import functools
    multi_compose = partial(functools.reduce, compose)


  * functools.compose is gone!

> - Ed Minnix
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From greg.ewing at  Fri Apr 15 20:32:00 2016
From: greg.ewing at (Greg Ewing)
Date: Sat, 16 Apr 2016 12:32:00 +1200
Subject: [Python-ideas] Welcome to the "Python-ideas" mailing list
In-Reply-To: <>
References: <>
Message-ID: <>

Richard Prosser wrote:
> For example:
> github_response = requests.get('', auth=('user', 'pass')) 
>  # Derived from
> The above request returns a |Response| 
> <> object 
> and so the variable has 'response' in its name.

Conventions like that can be useful, but only when they clarify
the code, which they don't always do. Sometimes they just make
it needlessly verbose and hard to read. So I would not like to
see any formal use made of such a convention.


From steve at  Sat Apr 16 01:06:33 2016
From: steve at (Steven D'Aprano)
Date: Sat, 16 Apr 2016 15:06:33 +1000
Subject: [Python-ideas] Adding a pipe function to functools
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 15, 2016 at 01:51:45PM -0500, Wes Turner wrote:

> *
>     from functional import compose, partial
>     import functools
>     multi_compose = partial(functools.reduce, compose)
> *
>   * functools.compose is gone!

functools.compose never existed. The above example is 
functional.compose, a third party library.
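
For what it's worth, a multi-argument compose needs nothing beyond functools.reduce from the standard library. This is only a sketch; the name compose and its right-to-left argument order are choices made for the example, not an existing stdlib function.

```python
import functools


def compose(*funcs):
    """compose(f, g, h)(x) == f(g(h(x))); compose() is the identity."""
    def compose_two(outer, inner):
        return lambda value: outer(inner(value))

    # Start from the identity function so that zero arguments also works.
    return functools.reduce(compose_two, funcs, lambda value: value)


add_one_then_double = compose(lambda n: n * 2, lambda n: n + 1)
print(add_one_then_double(3))
```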


From wes.turner at  Sat Apr 16 02:04:25 2016
From: wes.turner at (Wes Turner)
Date: Sat, 16 Apr 2016 01:04:25 -0500
Subject: [Python-ideas] Adding a pipe function to functools
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 16, 2016 12:07 AM, "Steven D'Aprano" <steve at> wrote:
> On Fri, Apr 15, 2016 at 01:51:45PM -0500, Wes Turner wrote:
> > *
> >
> >     from functional import compose, partial
> >     import functools
> >     multi_compose = partial(functools.reduce, compose)
> >
> > *
> >
> >   * functools.compose is gone!
> functools.compose never existed. The above example is
> functional.compose, a third party library.

my mistake. in my haste, I had thought that functional was functools.

it looks like funcy has both curry() and compose():

* #compose
* #compose
* #curry

and has __lshift__ and __rshift__ for F.__compose:


> --
> Steve

From rosuav at  Sat Apr 16 22:50:53 2016
From: rosuav at (Chris Angelico)
Date: Sun, 17 Apr 2016 12:50:53 +1000
Subject: [Python-ideas] __getattr__ bouncer for modules
Message-ID: <>

Every now and then there's been talk of making it easier to subclass
modules, and the most common use case that I can remember hearing
about is descriptor protocol requiring code on the type. (For
instance, you can't change a module-level constant into a property
without moving away from the default ModuleType.)

How bad would it be for the default ModuleType to have a __getattr__
function which defers to the instance? Something like this:

def __getattr__(self, name):
    if '__getattr__' in self.__dict__:
        return self.__dict__['__getattr__'](name)
    raise AttributeError

The biggest downside I'm seeing is that module attributes double as
global names, which might mean this would get checked for every global
name that ends up being resolved from the builtins (which is going to
be a LOT). But I'm not sure if that's even true.

Dumb idea? Already been thought of and rejected?


From random832 at  Sat Apr 16 23:26:44 2016
From: random832 at (Random832)
Date: Sat, 16 Apr 2016 23:26:44 -0400
Subject: [Python-ideas] __getattr__ bouncer for modules
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 16, 2016, at 22:50, Chris Angelico wrote:
> def __getattr__(self, name):
>     if '__getattr__' in self.__dict__:
>         return self.__dict__['__getattr__'](name)
>     raise AttributeError
> The biggest downside I'm seeing is that module attributes double as
> global names, which might mean this would get checked for every global
> name that ends up being resolved from the builtins (which is going to
> be a LOT). But I'm not sure if that's even true.

It is not. (Also, incidentally, defining a global called __class__ does
not set the module's class.)

I don't think this would be enough alone to let you use property
decorators on a module - you'd have to explicitly define a __getattr__
(and __setattr__). And of course make sure that the names you're using
as properties don't exist as real members of the module, since you're
using __getattr__ instead of __getattribute__.

From njs at  Sat Apr 16 23:40:14 2016
From: njs at (Nathaniel Smith)
Date: Sat, 16 Apr 2016 20:40:14 -0700
Subject: [Python-ideas] __getattr__ bouncer for modules
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 16, 2016 at 7:50 PM, Chris Angelico <rosuav at> wrote:
> Every now and then there's been talk of making it easier to subclass
> modules, and the most common use case that I can remember hearing
> about is descriptor protocol requiring code on the type. (For
> instance, you can't change a module-level constant into a property
> without moving away from the default ModuleType.)
> How bad would it be for the default ModuleType to have a __getattr__
> function which defers to the instance? Something like this:
> def __getattr__(self, name):
>     if '__getattr__' in self.__dict__:
>         return self.__dict__['__getattr__'](name)
>     raise AttributeError
> The biggest downside I'm seeing is that module attributes double as
> global names, which might mean this would get checked for every global
> name that ends up being resolved from the builtins (which is going to
> be a LOT). But I'm not sure if that's even true.

It's not true :-). Code executing inside the module has 'globals() is
mod.__dict__', so lookups go directly to mod.__dict__ and skip

However, starting in 3.5 cpython allows __class__ assignment on
modules, so you can implement custom __getattr__ on a module with:

  class ModuleWithMyGetattr(types.ModuleType):
      def __getattr__(self, name):
          # .. whatever you want ...

  sys.modules[__name__].__class__ = ModuleWithMyGetattr

The advantage of doing it this way is that you can also implement
other things like __dir__ (so tab completion on your new attributes
will work).
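
A self-contained sketch of the above, assuming Python 3.5 or later. To keep it runnable as a single script it builds a throwaway module object named demo instead of using sys.modules[__name__]; the _computed table and the peek function are invented for the example. The last part also illustrates the earlier point that global name lookups inside the module do not go through __getattr__.

```python
import types


class ModuleWithMyGetattr(types.ModuleType):
    # Hypothetical table of attributes that are computed on demand.
    _computed = {"answer": lambda: 42}

    def __getattr__(self, name):
        # Only called when the normal lookup (module __dict__, type) fails.
        try:
            return self._computed[name]()
        except KeyError:
            raise AttributeError(name) from None

    def __dir__(self):
        # Advertise the computed names so tab completion can find them.
        return sorted(set(super().__dir__()) | set(self._computed))


demo = types.ModuleType("demo")
demo.__class__ = ModuleWithMyGetattr

# Code running *inside* the module resolves globals straight from
# demo.__dict__, so a bare ``answer`` there is still a NameError.
exec("def peek():\n    return answer", demo.__dict__)
try:
    demo.peek()
    global_lookup_used_getattr = True
except NameError:
    global_lookup_used_getattr = False

print(demo.answer, "answer" in dir(demo), global_lookup_used_getattr)
```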

This package backports the functionality to earlier versions of CPython:

Basically just replace the explicit __class__ assignment with

  import metamodule
  metamodule.install(__name__, ModuleWithMyGetattr)

(See the links above for more details and a worked example.)


Nathaniel J. Smith --

From rosuav at  Sat Apr 16 23:42:46 2016
From: rosuav at (Chris Angelico)
Date: Sun, 17 Apr 2016 13:42:46 +1000
Subject: [Python-ideas] __getattr__ bouncer for modules
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Apr 17, 2016 at 1:26 PM, Random832 <random832 at> wrote:
> On Sat, Apr 16, 2016, at 22:50, Chris Angelico wrote:
>> def __getattr__(self, name):
>>     if '__getattr__' in self.__dict__:
>>         return self.__dict__['__getattr__'](name)
>>     raise AttributeError
>> The biggest downside I'm seeing is that module attributes double as
>> global names, which might mean this would get checked for every global
>> name that ends up being resolved from the builtins (which is going to
>> be a LOT). But I'm not sure if that's even true.
> It is not. (Also, incidentally, defining a global called __class__ does
> not set the module's class.)
> I don't think this would be enough alone to let you use property
> decorators on a module - you'd have to explicitly define a __getattr__
> (and __setattr__). And of course make sure that the names you're using
> as properties don't exist as real members of the module, since you're
> using __getattr__ instead of __getattribute__.

Right, it wouldn't automatically allow the use of properties *as
such*, but you would be able to achieve most of the same goal.

# Version 1

# Version 2 - doesn't work
    return 32 or 64

# Version 3 - could work
def __getattr__(name):
    if name == 'BITS_PER_WORD':
        return 32 or 64
    raise AttributeError

It's not as clean as actually supporting @property, but it could be
done without the "bootstrap problem" of trying to have a module
contain the class that it's to be an instance of. All you have to do
is define __getattr__ as a regular top-level function, and it'll get
called. You can then dispatch to property functions if you wish (eg
"""return globals()['_property_'+name]()"""), or just put all the code
straight into __getattr__.
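
A sketch of the dispatching variant. At the time of this thread the top-level __getattr__ hook was only a proposal; Python 3.7 later added it (PEP 562), so the code below assumes 3.7 or later. The module name platform_info, the _property_ prefix convention and the use of struct.calcsize are assumptions for illustration.

```python
import types

SOURCE = '''
import struct

def _property_BITS_PER_WORD():
    # Computed on each access instead of being stored as a constant.
    return struct.calcsize("P") * 8

def __getattr__(name):
    # Dispatch to a "_property_<name>" function if one is defined.
    try:
        getter = globals()["_property_" + name]
    except KeyError:
        raise AttributeError(name) from None
    return getter()
'''

# Build the module from source so the example stays in one file.
platform_info = types.ModuleType("platform_info")
exec(SOURCE, platform_info.__dict__)

print(platform_info.BITS_PER_WORD)
```

As noted above, this only fires for names that are not real members of the module, because it is __getattr__ rather than __getattribute__.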


From rosuav at  Sat Apr 16 23:47:08 2016
From: rosuav at (Chris Angelico)
Date: Sun, 17 Apr 2016 13:47:08 +1000
Subject: [Python-ideas] __getattr__ bouncer for modules
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Apr 17, 2016 at 1:40 PM, Nathaniel Smith <njs at> wrote:
> However, starting in 3.5 cpython allows __class__ assignment on
> modules, so you can implement custom __getattr__ on a module with:
>   class ModuleWithMyGetattr(types.ModuleType):
>       def __getattr__(self, name):
>           # .. whatever you want ...
>   sys.modules[__name__].__class__ = ModuleWithMyGetattr
> The advantage of doing it this way is that you can also implement
> other things like __dir__ (so tab completion on your new attributes
> will work).

Oooh! I did not know that. That's pretty much what I was thinking of -
you can have the class in the module that it's affecting. Coolness!

And inside that class, you can toss in @property and everything, so
it's exactly as clean as it wants to be. I like! Thanks Nathaniel.


From random832 at  Sun Apr 17 00:06:30 2016
From: random832 at (Random832)
Date: Sun, 17 Apr 2016 00:06:30 -0400
Subject: [Python-ideas] __getattr__ bouncer for modules
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 16, 2016, at 23:42, Chris Angelico wrote:
> It's not as clean as actually supporting @property, but it could be
> done without the "bootstrap problem" of trying to have a module
> contain the class that it's to be an instance of. All you have to do
> is define __getattr__ as a regular top-level function, and it'll get
> called. You can then dispatch to property functions if you wish (eg
> """return globals()['_property_'+name]()"""), or just put all the code
> straight into __getattr__.

Or you could have ModuleType.__getattr(ibute?)__ search through some
object other than the module itself for descriptors.

from types import ModuleType, SimpleNamespace
import sys

# __magic__ could simply be a class, but just to prove it doesn't have
# to be:

@property
def foo(self):
    return eval(input("get foo?"))

@foo.setter
def foo(self, value):
    print("set foo=" + repr(value))

def __call__(self, v1, v2):
    print("modcall" + repr((v1, v2)))

__magic__ = SimpleNamespace()
for name in 'foo __call__'.split():
    setattr(__magic__, name, globals()[name])
    del globals()[name]

class MagicModule(ModuleType):
    def __getattr__(self, name):
        descriptor = getattr(self.__magic__, name)
        return descriptor.__get__(self, None)

    def __setattr__(self, name, value):
        try:
            descriptor = getattr(self.__magic__, name)
        except AttributeError:
            return super().__setattr__(name, value)
        return descriptor.__set__(self, value)

    def __call__(self, *args, **kwargs):
        # Because why not?
        try:
            call = self.__magic__.__call__
        except AttributeError:
            raise TypeError("Could not access " + self.__name__ +
            ".__magic__.__call__ method.")
        return call(self, *args, **kwargs)

sys.modules[__name__].__class__ = MagicModule

From k7hoven at  Sun Apr 17 04:49:53 2016
From: k7hoven at (Koos Zevenhoven)
Date: Sun, 17 Apr 2016 11:49:53 +0300
Subject: [Python-ideas] __getattr__ bouncer for modules
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Apr 17, 2016 at 7:06 AM, Random832 <random832 at> wrote:
> On Sat, Apr 16, 2016, at 23:42, Chris Angelico wrote:
>> It's not as clean as actually supporting @property, but it could be
>> done without the "bootstrap problem" of trying to have a module
>> contain the class that it's to be an instance of. All you have to do
>> is define __getattr__ as a regular top-level function, and it'll get
>> called. You can then dispatch to property functions if you wish (eg
>> """return globals()['_property_'+name]()"""), or just put all the code
>> straight into __getattr__.
> Or you could have ModuleType.__getattr(ibute?)__ search through some
> object other than the module itself for descriptors.
> from types import ModuleType, SimpleNamespace
> import sys
> [...]

Not for a module, but for a *package*, this was reasonably easy (well,
at least after figuring out how ;-) to do in pre-3.5 Python: One would
import a submodule in and then, within that submodule,
reconstruct the package using a subclass of ModuleType and replace the
original package in sys.modules.
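
Roughly, the replacement step looks like the sketch below. This is not the code of any particular package: the names PackageWithProperties, replace_in_sys_modules and demo_pkg are invented, and a bare module object stands in for a real package so that the example runs on its own.

```python
import sys
import types


class PackageWithProperties(types.ModuleType):
    @property
    def answer(self):
        return 42


def replace_in_sys_modules(name, module_class):
    """Swap sys.modules[name] for a module_class instance with the same contents."""
    original = sys.modules[name]
    replacement = module_class(name)
    # Carry over everything the original module had already defined.
    replacement.__dict__.update(original.__dict__)
    sys.modules[name] = replacement
    return replacement


# Stand-in for a real package that has already started importing.
sys.modules["demo_pkg"] = types.ModuleType("demo_pkg")
replace_in_sys_modules("demo_pkg", PackageWithProperties)

import demo_pkg  # now resolves to the replacement object

print(type(demo_pkg).__name__, demo_pkg.answer)
```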


From pavol.lisy at  Sun Apr 17 14:24:21 2016
From: pavol.lisy at (Pavol Lisy)
Date: Sun, 17 Apr 2016 20:24:21 +0200
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-04-09 17:43 GMT+02:00, Steven D'Aprano <steve at>:
> On Fri, Apr 08, 2016 at 02:52:45PM +0900, Stephen J. Turnbull wrote:
>> I just don't see why the current
>> behaviors of &|^ are particularly useful, since you'll have to guard
>> all bitwise expressions against non-bool truthies and falsies.
> flag ^ flag is useful since we don't have a boolean-xor operator and
> bitwise-xor does the right thing for bools. And I suppose some people
> might prefer & and | over boolean-and and boolean-or because they're
> shorter and require less typing. I don't think that's a particularly
> good reason for using them, and as you say, you do have to guard
> against non-bools slipping, but Consenting Adults applies.

They are also useful if you need to avoid short-circuit evaluation.
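
A small demonstration of the difference; the check helper exists only to record which operands were actually evaluated.

```python
calls = []


def check(label, result):
    """Record that this operand was evaluated, then return a bool."""
    calls.append(label)
    return result


calls.clear()
with_and = check("a", False) and check("b", True)
evaluated_by_and = list(calls)   # "b" is skipped once "a" is False

calls.clear()
with_ampersand = check("a", False) & check("b", True)
evaluated_by_ampersand = list(calls)   # both operands are always evaluated

print(evaluated_by_and, evaluated_by_ampersand)
```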

From k7hoven at  Sun Apr 17 15:28:58 2016
From: k7hoven at (Koos Zevenhoven)
Date: Sun, 17 Apr 2016 22:28:58 +0300
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Apr 2, 2016 at 11:22 AM, Random832 <random832 at> wrote:
> On Sat, Apr 2, 2016, at 02:37, Koos Zevenhoven wrote:
>>  python -e "random.randint(0,10)"
>> which would automatically import stdlib if their names appear in the
>> expression.
> #!/usr/bin/env python3
> import sys
> class magicdict(dict):
>     def __getitem__(self, x):
>         try:
>             return super().__getitem__(x)
>         except KeyError:
>             try:
>                 mod = __import__(x)
>                 self[x] = mod
>                 return mod
>             except ImportError:
>                 raise KeyError
> g = magicdict()
> for arg in sys.argv[1:]:
>     try:
>         p, obj = True, eval(arg, g)
>     except SyntaxError:
>         p = False
>         exec(arg, g)
>     if p:
>         sys.displayhook(obj)
> Handling modules inside packages is left as an exercise for the reader.

Thanks :).

There were some positive reactions to this in the discussions two
weeks ago, so I decided to go on and implement this further. Then I
kind of forgot about it, but now the other getattr thread reminded me
of this, so I came back to it. The implementation now requires Python
3.5+, but I could also do it the same way as I did in my 'np' package
[1], which in the newest version (released three weeks ago), uses
different module-magic-method approaches for Python<3.5 and >= 3.5, so
it even works on Python 2.

So here's

I suppose this could be on pypi, and one could do things like "random.randint(0,10)"


    python -m oneline "random.randint(0,10)"

Any thoughts?



From leewangzhong+python at  Sun Apr 17 21:04:11 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Sun, 17 Apr 2016 21:04:11 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

Two different (probably radical) ideas, with the same justification: reduce
often-useless output in the REPL, which can flood out terminal history and
overwhelm the user.

1. Limit the output per entered command: If you type into the REPL (AKA
interactive shell),

    list(range(n))
and you forgot that you set n to 10**10, the interpreter should not print
more than a page of output. Instead, it will print a few lines ("... and
approximately X more lines"), and tell you how to print more. (E.g. "Call
'_more()' for more. Call '_full()' for full output.")

Alternatively, have "less"-like behavior.

2. Only print a few parts of the stack trace. In particular, for a
recursive or mutually recursive function, if the error was due to maximum
recursion (is this reasonably distinguishable? the error is
`RuntimeError('maximum recursion depth exceeded')`), try to print each
function on the stack once each.

Again, there should be a message telling you how to get the full stacktrace
printed. EXACTLY how, preferably in a way that is easy to type, so that a
typo won't cause the trace to be lost. It should not use `sys.something()`,
because the user's first few encounters with this message will result in,
"NameError: name 'sys' is not defined".

A few possible rules to reduce stacktrace size:
- Always show the last (top) frame(?).
- Hide any other stdlib funcs directly below (above) it.
- If a function appears more than once in a row, show it once, with the
note, "(and X recursive calls)".
- If functions otherwise appear more than once (usually by mutual
recursion?), and there is a run of them, list them as, "(Mutual recursion:
'x' (5 times), 'y' (144 times), 'z' (13 times).)".
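
To make the "more than once in a row" rule concrete, here is a minimal sketch that works on a plain list of function names (which could be taken from traceback.extract_tb). The function name summarize_frames is invented, and the mutual-recursion rule is not handled.

```python
import itertools


def summarize_frames(function_names):
    """Collapse runs of the same function into one line with a count."""
    lines = []
    for name, run in itertools.groupby(function_names):
        count = len(list(run))
        if count == 1:
            lines.append(name)
        else:
            lines.append("%s (and %d recursive calls)" % (name, count - 1))
    return lines


stack = ["<module>", "main", "fact", "fact", "fact", "helper"]
for line in summarize_frames(stack):
    print(line)
```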

These two behaviors and their corresponding functions could go into a
special module which is (by default) loaded by the interactive shell. The
behaviors can be turned off with some command-line verbosity flag, or tuned
with command-line parameters (e.g. how many lines/pages to print).


  - Excessive output floods the terminal window. Some terminals have a
limit on output history (Windows *defaults* to 300 lines), or don't let you
scroll up at all (at least, my noob self couldn't figure it out when I did
get flooded).

  - Students learning Python, and also everyone else using Python, don't
usually need 99% of a 10000-line output or stack trace.

  - People who want the full output are probably advanced users with, like,
high-limit or unlimited window size, and advanced users are more likely to
look for a verbosity flag, or use a third-party REPL. Default should be
newbie friendly, because advanced users can work around it.

Thoughts? Even if the specific proposals are unworkable, is limiting REPL
output (by default) a reasonable goal?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

From ben+python at  Sun Apr 17 21:42:57 2016
From: ben+python at (Ben Finney)
Date: Mon, 18 Apr 2016 11:42:57 +1000
Subject: [Python-ideas] Have REPL print less by default
References: <>
Message-ID: <>

"Franklin? Lee"
<leewangzhong+python at> writes:

> 1. Limit the output per entered command: If you type into the REPL (AKA
> interactive shell),
>     list(range(n))
> and you forgot that you set n to 10**10, the interpreter should not
> print more than a page of output.

For that specific example, when I run it, the output is quite short:

    $ python3

    >>> n = 10**10
    >>> list(range(n))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>

I take the point though: some objects have a very long "repr" output.

> Instead, it will print a few lines ("... and approximately X more
> lines"), and tell you how to print more. (E.g. "Call '_more()' for
> more. Call '_full()' for full output.")

Perhaps you want a different "print the interactive-REPL-safe text
representation of this object" function, and to have the interactive
REPL use that new function to represent objects.

 \      “I am too firm in my consciousness of the marvelous to be ever |
  `\     fascinated by the mere supernatural …” -- Joseph Conrad, _The |
_o__)                                                     Shadow-Line_ |
Ben Finney

From dan at  Sun Apr 17 22:33:31 2016
From: dan at (Dan Sommers)
Date: Mon, 18 Apr 2016 02:33:31 +0000 (UTC)
Subject: [Python-ideas] Have REPL print less by default
References: <>
Message-ID: <nf1h1r$bu6$>

On Sun, 17 Apr 2016 21:04:11 -0400, Franklin? Lee wrote:

> These two behaviors and their corresponding functions could go into a
> special module which is (by default) loaded by the interactive shell. The
> behaviors can be turned off with some command-line verbosity flag, or tuned
> with command-line parameters (e.g. how many lines/pages to print).

The existing pretty print module already does some of what you want.
Instead of creating a new module, maybe extending that one would have
greater benefits to more users.

>   - People who want the full output are probably advanced users with, like,
> high-limit or unlimited window size, and advanced users are more likely to
> look for a verbosity flag, or use a third-party REPL. Default should be
> newbie friendly, because advanced users can work around it.

Judging what others might find more friendly, or guessing what actions
they are more likely to take, can be dangerous.

> Thoughts? Even if the specific proposals are unworkable, is limiting
> REPL output (by default) a reasonable goal?

IMO, the default REPL should be as simple as possible, and respond as
directly as possible to what I ask it to do.  It should also be [highly]
configurable for when you want more complexity, like summarizing stack
traces and paginating output.

From leewangzhong+python at  Sun Apr 17 23:03:21 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Sun, 17 Apr 2016 23:03:21 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 17, 2016 9:43 PM, "Ben Finney" <ben+python at> wrote:
> Perhaps you want a different ?print the interactive-REPL-safe text
> representation of this object? function, and to have the interactive
> REPL use that new function to represent objects.

You mean that the REPL should call `repr_limited(...)` instead of
`repr(...)`, and not that class makers should implement `__repr_limited__`,
right? I think one could make a `pprint` class for that.
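
Something along these lines, as a sketch: it uses the stdlib reprlib module rather than pprint, and the names repr_limited and limited_displayhook, as well as the specific limits, are made up for the example.

```python
import builtins
import reprlib

# A Repr instance truncates containers and long strings when formatting.
_limited = reprlib.Repr()
_limited.maxlist = 20
_limited.maxstring = 200
_limited.maxother = 200


def repr_limited(value):
    """Like repr(), but abbreviated for large values."""
    return _limited.repr(value)


def limited_displayhook(value):
    # Same contract as the default hook: show nothing for None,
    # and remember the last value in builtins._ .
    if value is None:
        return
    builtins._ = None
    print(repr_limited(value))
    builtins._ = value
```

Assigning `sys.displayhook = limited_displayhook` in an interactive session would then affect only what the REPL echoes; repr() itself and anything print()ed stay as they are.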

Thing is, I would also want the following to have limited output.

    >>> def dumb(n):
    ...        for i in range(n):
    ...                print(i)
    >>> dumb(10**10)

That could print the first 50 or so lines, then the REPL buffers the rest
and prints out the message about suppressed output.

On the other hand, for a long-running computation with logging statements,
I don't wanna start it, come back, and lose all that would have been
printed, just because I forgot that the REPL does that now. Storing it all
could be a use of memory that the computation might not be able to afford.

Possible solution:
1. After output starts getting buffered, print out warnings every X seconds
about Y lines being buffered.
2. Detect screen limit, and only store that many lines back. You don't lose
anything that wasn't already going to be lost. (Optional, but possibly
useful: Also warn about Z lines being lost due to screen limit.)
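
A rough sketch of point 2, keeping only the last N lines and counting what was dropped. The class name TailBuffer is invented, and detecting the real screen limit is left out; the limit is just a constructor argument here.

```python
import collections
import contextlib


class TailBuffer:
    """File-like object that remembers only the last max_lines lines."""

    def __init__(self, max_lines):
        self.lines = collections.deque(maxlen=max_lines)
        self.dropped = 0      # lines pushed out of the buffer so far
        self._partial = ""    # text written since the last newline

    def write(self, text):
        self._partial += text
        *complete, self._partial = self._partial.split("\n")
        for line in complete:
            if len(self.lines) == self.lines.maxlen:
                self.dropped += 1
            self.lines.append(line)
        return len(text)

    def flush(self):
        pass


buffer = TailBuffer(max_lines=3)
with contextlib.redirect_stdout(buffer):
    for i in range(10):
        print(i)

print(list(buffer.lines), buffer.dropped)
```

Memory use is bounded by max_lines regardless of how much the computation prints, which addresses the worry about a long-running job with logging statements.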

It was going to be in the terminal's memory, anyway, so just store it (as
efficiently as possible) in Python's memory. I think this only uses up to
twice as much total system memory (actual screen mem + dump). Since you
won't use the output as Python at least until the user comes back, it could
be lazy about converting to `str`, and maybe even compress the buffer on
the fly (if queue compression is cheaper than printing).

It gets trickier when you want user input during the eval part of the REPL.
3. Allow the user to interrupt the eval to print the last page of output,
or dump the entire current output, and then Python will continue the eval.
(Include in the warning, "Press Ctrl+?? to show <thing>.")
4. If the eval is waiting for user input, show the last page. (I don't know
how the user could ask for more pages without sending something that the
eval is allowed to interrupt as input. I don't understand terminal keyboard
control that much.)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

From ben+python at  Sun Apr 17 23:24:06 2016
From: ben+python at (Ben Finney)
Date: Mon, 18 Apr 2016 13:24:06 +1000
Subject: [Python-ideas] Have REPL print less by default
References: <>
Message-ID: <>

"Franklin? Lee"
<leewangzhong+python at> writes:

> You mean that the REPL should call `repr_limited(...)` instead of
> `repr(...)`, and not that class makers should implement
> `__repr_limited__`, right?

The former, yes: changing the behaviour of the interactive session,
without the need for the programmer to change their code.

To be clear, I'm asking whether that would meet your requirements :-)

> I think one could make a `pprint` class for that.

Sure, that would be a good place for it. I think that's much more
feasible than changing the behaviour of ‘repr’ for this purpose.

> Thing is, I would also want the following to have limited output.
>     >>> def dumb(n):
>     ...        for i in range(n):
>     ...                print(i)
>     ...
>     >>> dumb(10**10)

Again, it's only the interactive session which has the behaviour you
want changed.

That code run from a non-interactive session simply won't output
anything if there's no error, so I think ‘repr’ is not a problem by

It's only because the interactive session has the *additional* behaviour
of emitting the representation of the object resulting from evaluation,
that it has the behaviour you describe.

So this is increasing the likelihood that the best way to address your
request is not “change default ‘repr’ behaviour”, but “change the
representation function the interactive session uses for its special
object-representation output”.

 \         “The conflict between humanists and religionists has always |
  `\     been one between the torch of enlightenment and the chains of |
_o__)                        enslavement.” -- Wole Soyinka, 2014-08-10 |
Ben Finney

From njs at  Sun Apr 17 23:57:07 2016
From: njs at (Nathaniel Smith)
Date: Mon, 18 Apr 2016 03:57:07 +0000
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <nf1h1r$bu6$>
References: <>
Message-ID: <>

On Mon, Apr 18, 2016 at 2:33 AM, Dan Sommers <dan at> wrote:
> On Sun, 17 Apr 2016 21:04:11 -0400, Franklin? Lee wrote:
>> These two behaviors and their corresponding functions could go into a
>> special module which is (by default) loaded by the interactive shell. The
>> behaviors can be turned off with some command-line verbosity flag, or
>> with command-line parameters (e.g. how many lines/pages to print).
> The existing pretty print module already does some of what you want.
> Instead of creating a new module, maybe extending that one would have
> greater benefits to more users.

There's also IPython's repr system as prior art:

My impression (as I guess one of the few people who have tried to
systematically add _repr_pretty_ callbacks to their libraries) is that it's
only about 30% baked and has a number of flaws and limitations, but it's
still a significant step beyond __repr__ or the stdlib pprint.

(Some issues I've run into: there's something wonky in the line breaking
that often produces misaligned lines; integration with __repr__ is awkward
if you don't want to reimplement everything twice [1]; there's no built-in
control for compressing output down to ellipses; handling the most common
case of reprs that look like Foo(bar, baz, kwarg=1) is way more awkward
than it needs to be [2]; and IIRC there were some awkward limitations in
the formatting tools provided though I don't remember the details now. But
despite all this it does make it easy to write composable reprs, so e.g.
here's a syntax tree:

In [1]: import patsy
In [2]: patsy.parse_formula.parse_formula("y ~ 1 + (a * b)")
          Token('~', <Origin y ->~<- 1 + (a * b) (2-3)>),
                           <Origin ->y<- ~ 1 + (a * b) (0-1)>,
                     Token('+', <Origin y ~ 1 ->+<- (a * b) (6-7)>),
                                      <Origin y ~ ->1<- + (a * b) (4-5)>,
                                      <Origin y ~ 1 + (a ->*<- b) (11-12)>),
                                                 <Origin y ~ 1 + (->a<- * b) (9-10)>,
                                                 <Origin y ~ 1 + (a * ->b<-) (13-14)>,



Nathaniel J. Smith --

From leewangzhong+python at  Mon Apr 18 01:32:41 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Mon, 18 Apr 2016 01:32:41 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 17, 2016 11:24 PM, "Ben Finney" <ben+python at> wrote:
> Sure, that would be a good place for it. I think that's much more
> feasible than changing the behaviour of ‘repr’ for this purpose.

Huh? I never meant for any change to happen to `repr`. The desired behavior
is, "Set default: Each command entered into the REPL should have limited
output (without loss of capability for a newbie)." The proposal is to have
the REPL determine when it's going to output too much.

My example, as originally written, `print`d the huge object. I realized
that this was too specific, and I didn't *really* care about `print` as
such, so I changed it to (what I thought was) a more general example. I
should've just had multiple examples.

From ncoghlan at  Mon Apr 18 03:12:52 2016
From: ncoghlan at (Nick Coghlan)
Date: Mon, 18 Apr 2016 17:12:52 +1000
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On 18 April 2016 at 05:28, Koos Zevenhoven <k7hoven at> wrote:

> So here's
Neat, although you'll want to use importlib.import_module() rather than
calling __import__ directly (the latter won't behave the way you want when
importing submodules, as it returns the top level module for the import
statement to bind in the current namespace, rather than the imported
submodule).
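
A quick way to see the difference for yourself (json.decoder is just an arbitrary stdlib submodule picked for the demonstration):

```python
import importlib

# __import__ hands back the top-level package, because that is what the
# `import json.decoder` statement needs to bind to the name `json`.
top = __import__("json.decoder")
print(top.__name__)   # json

# import_module hands back the module that was actually named.
sub = importlib.import_module("json.decoder")
print(sub.__name__)   # json.decoder
```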

> I suppose this could be on pypi, and one could do things like
> "random.randint(0,10)"
> or
>     python -m oneline "random.randint(0,10)"
> Any thoughts?

There are certainly plenty of opportunities to make Python easier to invoke
for one-off commands. Another interesting example is pyp:

A completely undocumented hack I put together while playing one day was a
utility to do json -> json transformations via command line pipes:

The challenge with these kinds of things is getting them from "Hey, look at
this cool thing you can do" to "This will materially improve your
day-to-day programming experience". The former can still be fun to work on
as a hobby, but it's the latter that people need to get over the initial
adoption barrier.


> -Koos

Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Mon Apr 18 03:25:24 2016
From: ncoghlan at (Nick Coghlan)
Date: Mon, 18 Apr 2016 17:25:24 +1000
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On 18 April 2016 at 15:32, Franklin? Lee <leewangzhong+python at> wrote:

> On Apr 17, 2016 11:24 PM, "Ben Finney" <ben+python at> wrote:
> >
> > Sure, that would be a good place for it. I think that's much more
> > feasible than changing the behaviour of ‘repr’ for this purpose.
>
> Huh? I never meant for any change to happen to `repr`. The desired
> behavior is, "Set default: Each command entered into the REPL should have
> limited output (without loss of capability for a newbie)." The proposal is
> to have the REPL determine when it's going to output too much.

A few tips for folks that want to play with this:

- setting sys.displayhook controls how evaluated objects are displayed in
the default REPL
- setting sys.excepthook does the same for exceptions
- the PYTHONSTARTUP env var runs the given file when the default REPL starts

So folks are already free to change their REPL to work however they want it
to: set sys.displayhook and sys.excepthook from a PYTHONSTARTUP file.
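
As a rough sketch of what such a PYTHONSTARTUP file might contain: the names `MAX_CHARS`, `shorten` and `limited_displayhook` are invented here, and the cut-off is arbitrary. Note that repr() still builds the complete string before it is cut, so this limits what reaches the screen, not the work done.

```python
import builtins
import sys

MAX_CHARS = 2000  # assumed cut-off, chosen arbitrarily for this sketch


def shorten(text, limit=MAX_CHARS):
    """Cut `text` to `limit` characters and say how much was left out."""
    if len(text) <= limit:
        return text
    hidden = len(text) - limit
    return "%s... [%d more characters; print(_) shows everything]" % (
        text[:limit], hidden)


def limited_displayhook(value):
    """Behave like the default hook, but print a shortened repr."""
    if value is None:
        return
    builtins._ = None           # same dance as the default displayhook
    print(shorten(repr(value)))
    builtins._ = value          # keep the full object reachable as `_`


sys.displayhook = limited_displayhook
```

Because the full object is still bound to `_`, nothing is lost; only the echo is shortened.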

Changing the *default* REPL behaviour is a very different question, and
forks off in a couple of different directions:

- improved defaults for teaching novices? Perhaps the default REPL isn't
the best environment for that
- easier debugging at the REPL? Perhaps pprint should gain an
"install_displayhook()" option that overwrites sys.displayhook and
optionally allows enabling of an output pager

Since the more the REPL does, the more opportunities there are for it to
break when debugging, having the output hooks be as simple as possible is
quite desirable. However, making it easier to request something more
sophisticated via the pprint module seems like a plausible approach to me.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From mal at  Mon Apr 18 03:27:36 2016
From: mal at (M.-A. Lemburg)
Date: Mon, 18 Apr 2016 09:27:36 +0200
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On 18.04.2016 03:04, Franklin? Lee wrote:
> Two different (probably radical) ideas, with the same justification: reduce
> often-useless output in the REPL, which can flood out terminal history and
> overwhelm the user.
> 1. Limit the output per entered command: If you type into the REPL (AKA
> interactive shell),
>     list(range(n))
> and you forgot that you set n to 10**10, the interpreter should not print
> more than a page of output. Instead, it will print a few lines ("... and
> approximately X more lines"), and tell you how to print more. (E.g. "Call
> '_more()' for more. Call '_full()' for full output.")
> Alternatively, have "less"-like behavior.

You should be able to write your own sys.displayhook to accomplish this.

> 2. Only print a few parts of the stack trace. In particular, for a
> recursive or mutually recursive function, if the error was due to maximum
> recursion (is this reasonably distinguishable? the error is
> `RuntimeError('maximum recursion depth exceeded')`), try to print each
> function on the stack once each.
> Again, there should be a message telling you how to get the full stacktrace
> printed. EXACTLY how, preferably in a way that is easy to type, so that a
> typo won't cause the trace to be lost. It should not use `sys.something()`,
> because the user's first few encounters with this message will result in,
> "NameError: name 'sys' is not defined".

Same here with sys.excepthook.
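
For illustration only, a hook along these lines might look like the following. `collapse_repeats` and `short_excepthook` are invented names, and "same file, line and function" is just one possible definition of a repeated frame:

```python
import sys
import traceback


def collapse_repeats(frames):
    """Keep only the first occurrence of each (file, line, function)."""
    seen = set()
    kept = []
    for frame in frames:
        key = (frame.filename, frame.lineno, frame.name)
        if key not in seen:
            seen.add(key)
            kept.append(frame)
    return kept


def short_excepthook(exc_type, exc, tb):
    """Print the traceback with repeated frames shown once each."""
    frames = traceback.extract_tb(tb)
    kept = collapse_repeats(frames)
    print("Traceback (most recent call last):", file=sys.stderr)
    for line in traceback.format_list(kept):
        sys.stderr.write(line)
    if len(kept) < len(frames):
        print("  [%d repeated frames hidden]" % (len(frames) - len(kept)),
              file=sys.stderr)
    for line in traceback.format_exception_only(exc_type, exc):
        sys.stderr.write(line)


sys.excepthook = short_excepthook
```

The full traceback is not destroyed by this: in an interactive session the traceback object remains available as sys.last_traceback.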

FWIW: I see your point in certain situations, but don't think the
defaults should be such that you have to enable some variable to see
everything. This would make debugging harder than necessary, since
often enough (following Murphy's law) the most interesting information
would be hidden in some ellipsis.

Marc-Andre Lemburg


From leewangzhong+python at  Mon Apr 18 11:41:14 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Mon, 18 Apr 2016 11:41:14 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 18, 2016 3:25 AM, "Nick Coghlan" <ncoghlan at> wrote:
> On 18 April 2016 at 15:32, Franklin? Lee <leewangzhong+python at> wrote:
>> On Apr 17, 2016 11:24 PM, "Ben Finney" <ben+python at> wrote:
>> >
>> > Sure, that would be a good place for it. I think that's much more
>> > feasible than changing the behaviour of ‘repr’ for this purpose.
>>
>> Huh? I never meant for any change to happen to `repr`. The desired
>> behavior is, "Set default: Each command entered into the REPL should have
>> limited output (without loss of capability for a newbie)." The proposal is
>> to have the REPL determine when it's going to output too much.
>
> A few tips for folks that want to play with this:
>
> - setting sys.displayhook controls how evaluated objects are displayed in
>   the default REPL
> - setting sys.excepthook does the same for exceptions
> - the PYTHONSTARTUP env var runs the given file when the default REPL
>   starts
>
> So folks are already free to change their REPL to work however they want
> it to: set sys.displayhook and sys.excepthook from a PYTHONSTARTUP file

I don't want arguments like, "This can already be done, for yourself, if
you really need it." I use IPython's shell, with maximum output height, and
it was years ago that I used a terminal which I couldn't scroll.

I want arguments like, "This will break my workflow."

> - improved defaults for teaching novices? Perhaps the default REPL isn't
>   the best environment for that

Why not? I imagine that self-taught novices will use the default REPL more
than the advanced users.

From ethan at  Mon Apr 18 11:51:46 2016
From: ethan at (Ethan Furman)
Date: Mon, 18 Apr 2016 08:51:46 -0700
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/18/2016 08:41 AM, Franklin? Lee wrote:
> On Apr 18, 2016 3:25 AM, "Nick Coghlan" wrote:

>> A few tips for folks that want to play with this:
>> - setting sys.displayhook controls how evaluated objects are
>>   displayed in the default REPL
>> - setting sys.excepthook does the same for exceptions
>> - the PYTHONSTARTUP env var runs the given file when the default REPL
>>   starts
>> So folks are already free to change their REPL to work however they
>> want it to: set sys.displayhook and sys.excepthook from a PYTHONSTARTUP file
> I don't want arguments like, "This can already be done, for yourself, if
> you really need it." I use IPython's shell, with maximum output height,
> and it was years ago that I used a terminal which I couldn't scroll.
> I want arguments like, "This will break my workflow."

How about:  the default REPL is a basic tool, and the limitations of 
basic tools are what drive folks to seek out advanced tools. ?

Or Marc-Andre's:
> I [...] don't think the defaults should be such that you have to
> enable some variable to see everything. This would make debugging
> harder than necessary, since often enough (following Murphy's law)
> the most interesting information would be hidden in some ellipsis.

Or even Nick's (that you snipped):
> Since the more the REPL does, the more opportunities there are for
> it to break when debugging, having the output hooks be as simple as
> possible is quite desirable.

IMO those are all good reasons to leave the basic REPL alone.


From k7hoven at  Mon Apr 18 12:03:46 2016
From: k7hoven at (Koos Zevenhoven)
Date: Mon, 18 Apr 2016 19:03:46 +0300
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Apr 18, 2016 at 10:12 AM, Nick Coghlan <ncoghlan at> wrote:
> On 18 April 2016 at 05:28, Koos Zevenhoven <k7hoven at> wrote:
>> So here's
> Neat, although you'll want to use importlib.import_module() rather than
> calling __import__ directly (the latter won't behave the way you want when
> importing submodules, as it returns the top level module for the import
> statement to bind in the current namespace, rather than the imported
> submodule)

Thanks :). That is in fact why I had worked around it by grabbing
sys.modules[name] instead. Good to know import_module() already does
the right thing. I now changed the code to use import_module, assuming
that is the preferred way today. However, to prevent infinite
recursion when importing submodules, I now do a setattr(parentmodule,
submodulename, None) before the import (and delattr if the import
fails).

>> I suppose this could be on pypi, and one could do things like
>> "random.randint(0,10)"
>> or
>>     python -m oneline "random.randint(0,10)"
>> Any thoughts?
> There are certainly plenty of opportunities to make Python easier to invoke
> for one-off commands. Another interesting example is pyp:

This is nice, although solves a different problem.

> A completely undocumented hack I put together while playing one day was a
> utility to do json -> json transformations via command line pipes:

So it looks like it would work like this:

cat input.json | pycall "my.transformation.function" > output.json

Also a different problem, but cool.

> The challenge with these kinds of things is getting them from "Hey, look at
> this cool thing you can do" to "This will materially improve your day-to-day
> programming experience". The former can still be fun to work on as a hobby,
> but it's the latter that people need to get over the initial adoption
> barrier.

I think the users of could be people that now write lots of
bash scripts and work on the command line. So whenever someone asks a
question somewhere about how to do X on the linux command line, we
might have the answer: """

Q: On the linux commandline, how do I get only the filename from a
full path that is in $FILEPATH

A: Python has this. You can use the tools in os.path:

Filename:
$ "os.path.basename('$FILEPATH')"

Path to directory:
$ "os.path.dirname('$FILEPATH')"
"""
This might be more appealing than python -c. The whole point is to
make Python's power available and visible for a larger audience.


From wes.turner at  Mon Apr 18 13:30:01 2016
From: wes.turner at (Wes Turner)
Date: Mon, 18 Apr 2016 12:30:01 -0500
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 18, 2016 11:04 AM, "Koos Zevenhoven" <k7hoven at> wrote:
> On Mon, Apr 18, 2016 at 10:12 AM, Nick Coghlan <ncoghlan at> wrote:
> > On 18 April 2016 at 05:28, Koos Zevenhoven <k7hoven at> wrote:
> >>
> >> So here's
> >>
> >>
> >>
> >
> > Neat, although you'll want to use importlib.import_module() rather than
> > calling __import__ directly (the latter won't behave the way you want when
> > importing submodules, as it returns the top level module for the import
> > statement to bind in the current namespace, rather than the imported
> > submodule)
> >
> Thanks :). That is in fact why I had worked around it by grabbing
> sys.modules[name] instead. Good to know import_module() already does
> the right thing. I now changed the code to use import_module, assuming
> that is the preferred way today. However, to prevent infinite
> recursion when importing submodules, I now do a setattr(parentmodule,
> submodulename, None) before the import (and delattr if the import
> fails).
> >>
> >> I suppose this could be on pypi, and one could do things like
> >>
> >> "random.randint(0,10)"
> >>
> >> or
> >>
> >>     python -m oneline "random.randint(0,10)"
> >>
> >> Any thoughts?
> >
> >
> > There are certainly plenty of opportunities to make Python easier to invoke
> > for one-off commands. Another interesting example is pyp:
> >
> This is nice, although solves a different problem.
> > A completely undocumented hack I put together while playing one day was a
> > utility to do json -> json transformations via command line pipes:
> >
> So it looks like it would work like this:
> cat input.json | pycall "my.transformation.function" > output.json
> Also a different problem, but cool.
> > The challenge with these kinds of things is getting them from "Hey, look at
> > this cool thing you can do" to "This will materially improve your day-to-day
> > programming experience". The former can still be fun to work on as a hobby,
> > but it's the latter that people need to get over the initial adoption
> > barrier.
> I think the users of could be people that now write lots of
> bash scripts and work on the command line. So whenever someone asks a
> question somewhere about how to do X on the linux command line, we
> might have the answer: """
> Q: On the linux commandline, how do I get only the filename from a
> full path that is in $FILEPATH
> A: Python has this. You can use the tools in os.path:
> Filename:
> $ "os.path.basename('$FILEPATH')"
> Path to directory:
> $ "os.path.dirname('$FILEPATH')"
> """

FILEPATH='for'"example');"'"cat /etc/passwd", shell=True)'

> This might be more appealing than python -c. The whole point is to
> make Python's power available and visible for a larger audience.
> -Koos

From wes.turner at  Mon Apr 18 13:36:41 2016
From: wes.turner at (Wes Turner)
Date: Mon, 18 Apr 2016 12:36:41 -0500
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 18, 2016 12:30 PM, "Wes Turner" <wes.turner at> wrote:

> >
> > I think the users of could be people that now write lots of
> > bash scripts and work on the command line. So whenever someone asks a
> > question somewhere about how to do X on the linux command line, we
> > might have the answer: """
> >
> > Q: On the linux commandline, how do I get only the filename from a
> > full path that is in $FILEPATH
> >
> > A: Python has this. You can use the tools in os.path:
> >
> > Filename:
> > $ "os.path.basename('$FILEPATH')"
> >
> > Path to directory:
> > $ "os.path.dirname('$FILEPATH')"
> > """
> FILEPATH='for'"example');"'"cat /etc/passwd", shell=True)'

sys.argv[1]  (IFS=' ')
stdin (~IFS=$'\n')
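
One reading of the note above is: hand the path over as data (sys.argv or stdin) instead of splicing it into the code string. A small sketch of the difference, where `hostile` is an invented stand-in for whatever $FILEPATH might contain and `run` is a throwaway helper:

```python
import subprocess
import sys

# Stand-in for a hostile $FILEPATH value (hypothetical, for illustration).
hostile = "/tmp/x')); print(('injected"

# Unsafe: the value is spliced into the source text, so it can close the
# string literal and append a statement of its own.
unsafe_src = "import os.path; print(os.path.basename('%s'))" % hostile

# Safer: the source text is fixed and the value arrives as sys.argv[1].
safe_src = "import os.path, sys; print(os.path.basename(sys.argv[1]))"


def run(src, *args):
    """Run `src` with `python -c` and return its stdout as a list of lines."""
    proc = subprocess.run([sys.executable, "-c", src] + list(args),
                          stdout=subprocess.PIPE, universal_newlines=True)
    return proc.stdout.splitlines()


print(run(unsafe_src))         # two lines: the injected print() also ran
print(run(safe_src, hostile))  # one line: the value stayed plain data
```

The injected statement here is a harmless print(); the same hole would run anything else just as readily.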



(considering adding an argument (in addition to the existing -m) for

> >
> > This might be more appealing than python -c. The whole point is to
> > make Python's power available and visible for a larger audience.
> >
> > -Koos

From ben+python at  Mon Apr 18 14:39:32 2016
From: ben+python at (Ben Finney)
Date: Tue, 19 Apr 2016 04:39:32 +1000
Subject: [Python-ideas] Have REPL print less by default
References: <>
Message-ID: <>

"Franklin? Lee"
<leewangzhong+python at> writes:

> On Apr 18, 2016 3:25 AM, "Nick Coghlan" <ncoghlan at> wrote:
> > So folks are already free to change their REPL to work however they
> > want it to: set sys.displayhook and sys.excepthook from a
> I don't want arguments like, "This can already be done, for yourself,
> if you really need it."

That's your prerogative.

This forum is for discussing whether ideas are right for changing
Python, though. Arguments such as “This can already be done, for
yourself, if you need it” are salient and sufficient, and not to be

> I want arguments like, "This will break my workflow."

You're free to solicit those arguments. You'll need to acknowledge,
though, that they are in addition to the quite sufficient argument of
“already possible for people to get this in Python as is, if they want

 \         “I think Western civilization is more enlightened precisely |
  `\     because we have learned how to ignore our religious leaders.” |
_o__)                                              -- Bill Maher, 2003 |
Ben Finney

From wes.turner at  Mon Apr 18 15:20:47 2016
From: wes.turner at (Wes Turner)
Date: Mon, 18 Apr 2016 14:20:47 -0500
Subject: [Python-ideas] Add __main__ for uuid, random and urandom
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 18, 2016 12:36 PM, "Wes Turner" <wes.turner at> wrote:
> On Apr 18, 2016 12:30 PM, "Wes Turner" <wes.turner at> wrote:
> > >
> > > I think the users of could be people that now write lots of
> > > bash scripts and work on the command line. So whenever someone asks a
> > > question somewhere about how to do X on the linux command line, we
> > > might have the answer: """
> > >
> > > Q: On the linux commandline, how do I get only the filename from a
> > > full path that is in $FILEPATH
> > >
> > > A: Python has this. You can use the tools in os.path:
> > >
> > > Filename:
> > > $ "os.path.basename('$FILEPATH')"
> > >
> > > Path to directory:
> > > $ "os.path.dirname('$FILEPATH')"
> > > """
> >
> > FILEPATH='for'"example');"'"cat /etc/passwd",
> sys.argv[1]  (IFS=' ')
> stdin (~IFS=$'\n')
> ...
> *
> *
(considering adding an argument (in addition to the existing -m) for

another thing worth mentioning is that
`ls` prints '?' for certain characters in filenames (e.g. newlines $'\n')
so, | pipes  with ls and xargs are bad/wrong/unsafe:


$ touch 'file'$'\n''name'

$ ls 'file'* | xargs stat  #ERR
$ find . -maxdepth 1 -name 'file*' | xargs stat  #ERR

less unsafe (?):

>>> [x for x in os.listdir('.') if x.startswith('file')]  # ['file\nname']

$ find . -maxdepth 1 -name 'file*' -print0 | xargs -0 stat


* "CWE-93: Improper Neutralization of CRLF Sequences ('CRLF Injection')"

* CWE-78: Improper Neutralization of Special Elements used in an OS Command
('OS Command Injection')

> >
> > >
> > > This might be more appealing than python -c. The whole point is to
> > > make Python's power available and visible for a larger audience.
> > >
> > > -Koos

From joejev at  Mon Apr 18 15:33:22 2016
From: joejev at (Joseph Jevnik)
Date: Mon, 18 Apr 2016 15:33:22 -0400
Subject: [Python-ideas] pep 7 line break suggestion differs from pep 8
Message-ID: <>

I saw that there was recently a change to pep 8 to suggest adding a line
break before a binary operator. Pep 7 suggests the opposite:

> When you break a long expression at a binary operator, the operator goes
> at the end of the previous line, e.g.:

> if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
>     type->tp_dictoffset == b_size &&
>     (size_t)t_size == b_size + sizeof(PyObject *))
>     return 0; /* "Forgive" adding a __dict__ only */

I imagine that some of the readability reasons for making the change in
pep 8 will also translate to C; maybe pep 7 should also be updated.

From rosuav at  Mon Apr 18 16:27:05 2016
From: rosuav at (Chris Angelico)
Date: Tue, 19 Apr 2016 06:27:05 +1000
Subject: [Python-ideas] pep 7 line break suggestion differs from pep 8
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 5:33 AM, Joseph Jevnik <joejev at> wrote:
> I saw that there was recently a change to pep 8 to suggest adding a line
> break before a binary operator. Pep 7 suggests the opposite:
>> When you break a long expression at a binary operator, the operator goes
>> at the end of the previous line, e.g.:
>> if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
>>     type->tp_dictoffset == b_size &&
>>     (size_t)t_size == b_size + sizeof(PyObject *))
>>     return 0; /* "Forgive" adding a __dict__ only */
> I imagine that some of the reasons for making the change in pep 8 for
> readability reasons will also
> translate to C; maybe pep 7 should also be updated.

I would agree with this. Passing it directly to python-dev as that's
where the key decision makers are.


From guido at  Mon Apr 18 18:52:38 2016
From: guido at (Guido van Rossum)
Date: Mon, 18 Apr 2016 15:52:38 -0700
Subject: [Python-ideas] pep 7 line break suggestion differs from pep 8
In-Reply-To: <>
References: <>
Message-ID: <>

[ideas to bcc]

I'm not as excited about this as I am about the PEP 8 change.

PEP 8 affects most Python programmers.

But PEP 7 is really just for CPython and its extensions, and I don't think
it has found anything like as widespread a following as PEP 8.

I worry that if we change this in PEP 7 we'll just see either massively
inconsistent code or endless diffs that do nothing but change the
formatting (and occasionally introduce a bug).

And I don't think it would do as much good -- reading and understanding C
code is primarily a matter of knowing the language, and the audience is
much more heavily skewed towards experts.

IOW, -1.

On Mon, Apr 18, 2016 at 1:27 PM, Chris Angelico <rosuav at> wrote:

> On Tue, Apr 19, 2016 at 5:33 AM, Joseph Jevnik <joejev at> wrote:
> > I saw that there was recently a change to pep 8 to suggest adding a line
> > break before a binary operator. Pep 7 suggests the opposite:
> >
> >> When you break a long expression at a binary operator, the operator goes
> >> at the end of the previous line, e.g.:
> >
> >> if (type->tp_dictoffset != 0 && base->tp_dictoffset == 0 &&
> >>     type->tp_dictoffset == b_size &&
> >>     (size_t)t_size == b_size + sizeof(PyObject *))
> >>     return 0; /* "Forgive" adding a __dict__ only */
> >
> > I imagine that some of the reasons for making the change in pep 8 for
> > readability reasons will also
> > translate to C; maybe pep 7 should also be updated.
> I would agree with this. Passing it directly to python-dev as that's
> where the key decision makers are.
> ChrisA

--Guido van Rossum

From k7hoven at  Mon Apr 18 20:40:41 2016
From: k7hoven at (Koos Zevenhoven)
Date: Tue, 19 Apr 2016 03:40:41 +0300
Subject: [Python-ideas] Type hinting for path-related functions
Message-ID: <>

I actually proposed this already in one of the pathlib threads on
python-dev, but I decided to repost here, because this is easily seen
as a separate issue. I'll start with some introduction, then move on
to the actual type hinting part.

Our seemingly never-ending discussions about pathlib support in the
stdlib in various threads, first here on python-ideas, then even more
extensively on python-dev, have perhaps almost converged. The required
changes involve a protocol method, probably named __fspath__, which
any path-like type could implement to return a more, let's say,
"classical" path object such as a str. However, the protocol is
polymorphic and may also return bytes, which has a lot to do with the
fact that the stdlib itself is polymorphic and currently accepts str as
well as bytes paths almost everywhere, including the newly-introduced
os.scandir + DirEntry combination. The upcoming improvements will
further allow passing pathlib path objects as well as DirEntry objects
to any stdlib function that takes paths.

It came up, for instance here [1], that the function associated with
the protocol, potentially named os.fspath, will end up needing type
hints. This function takes pathlike objects and turns them into str or
bytes. There are various different scenarios [2] that can be
considered for code dealing with paths, but let's consider the case of
os.path.* and other traditional python path-related functions.

Some examples:


Currently, it takes str or bytes paths and returns a joined path of
the same type (mixing different types raises an exception).

In the future, it will also accept pathlib objects (underlying type
always str) and DirEntry (underlying type str or bytes) or third-party
path objects (underlying type str or bytes). The function will then
return a pathname of the underlying type.


Currently, it takes a str or bytes and returns the dirname of the same type.
In the future, it will also accept Path and DirEntry and return the
underlying type.

Let's consider the type hint of os.path.dirname at present and in the future:

Currently, one could write

def dirname(p: Union[str, bytes]) -> Union[str, bytes]:

While this is valid, it could be more precise:

pathstring = typing.TypeVar('pathstring', str, bytes)

def dirname(p: pathstring) -> pathstring:

This now contains the information that the return type is the same as
the argument type. The name 'pathstring' may be considered slightly
misleading because "byte strings" are not actually strings in Python
3, but at least it does not advertise the use of bytes as paths, which
is very rarely desirable.

But what about the future? There are two kinds of rich path objects,
those with an underlying type of str and those with an underlying type
of bytes. These should implement the __fspath__() protocol and return
their underlying type. However, we do care about what (underlying)
type is provided by the protocol, so we might want to introduce
something like typing.FSPath[underlying_type]:

FSPath[str]       # str-based pathlike, including str
FSPath[bytes]  # bytes-based pathlike, including bytes

And now, using the above defined TypeVar pathstring, the future
version of dirname would be type annotated as follows:

def dirname(p: FSPath[pathstring]) -> pathstring:
    ...
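
For illustration, the annotation above can be approximated in a Python version where the protocol exists. This sketch assumes Python 3.6 or later, where os.fspath() and os.PathLike are available (both postdate this message). StrPath is a hypothetical str-based rich path class invented for the example, and os.PathLike[AnyStr] merely stands in for the proposed FSPath[pathstring].

```python
import os
from typing import AnyStr, Union


class StrPath:
    """Hypothetical rich path object whose underlying type is str."""

    def __init__(self, raw: str) -> None:
        self._raw = raw

    def __fspath__(self) -> str:
        # The protocol method hands back the underlying "classical" path.
        return self._raw


# The parameter annotation is quoted so it is never evaluated at runtime.
def dirname(p: "Union[AnyStr, os.PathLike[AnyStr]]") -> AnyStr:
    """Return the directory part of p, typed as p's underlying type."""
    # os.fspath() passes str/bytes through and calls __fspath__() otherwise.
    return os.path.dirname(os.fspath(p))


str_result = dirname(StrPath("/tmp/spam/eggs.txt"))
bytes_result = dirname(b"/tmp/spam/eggs.txt")
```

Here a str-based path object yields a str and a bytes path yields bytes, which is the "same underlying type out as in" relationship the annotation is meant to express.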

It's getting late. I hope this made sense :).



From guido at  Mon Apr 18 21:27:00 2016
From: guido at (Guido van Rossum)
Date: Mon, 18 Apr 2016 18:27:00 -0700
Subject: [Python-ideas] Type hinting for path-related functions
In-Reply-To: <>
References: <>
Message-ID: <>

Your pathstring seems to be the same as the predefined (in, and
PEP 484) AnyStr.

You are indeed making sense, except that for various reasons the stdlib is
not likely to adopt in-line signature annotations yet -- not even for new

However once there's agreement on os.fspath() it can be added to the stubs

Is there going to be a PEP for os.fspath()? (I muted most of the
discussions so I'm not sure where it stands.)

On Mon, Apr 18, 2016 at 5:40 PM, Koos Zevenhoven <k7hoven at> wrote:

> I actually proposed this already in one of the pathlib threads on
> python-dev, but I decided to repost here, because this is easily seen
> as a separate issue. I'll start with some introduction, then moving on
> to the actual type hinting part.
> In our seemingly never-ending discussions about pathlib support in the
> stdlib in various threads, first here on python-ideas, then even more
> extensively on python-dev, have perhaps almost converged. The required
> changes involve a protocol method, probably named __fspath__, which
> any path-like type could implement to return a more, let's say,
> "classical" path object such as a str. However, the protocol is
> polymorphic and may also return bytes, which has a lot to do with the
> fact that the stdlib itself is polymorphic and currently accepts str as
> well as bytes paths almost everywhere, including the newly-introduced
> os.scandir + DirEntry combination. The upcoming improvements will
> further allow passing pathlib path objects as well as DirEntry objects
> to any stdlib function that takes paths.
> It came up, for instance here [1], that the function associated with
> the protocol, potentially named os.fspath, will end up needing type
> hints. This function takes pathlike objects and turns them into str or
> bytes. There are various different scenarios [2] that can be
> considered for code dealing with paths, but let's consider the case of
> os.path.* and other traditional python path-related functions.
> Some examples:
> os.path.join
> Currently, it takes str or bytes paths and returns a joined path of
> the same type (mixing different types raises an exception).
> In the future, it will also accept pathlib objects (underlying type
> always str) and DirEntry (underlying type str or bytes) or third-party
> path objects (underlying type str or bytes). The function will then
> return a pathname of the underlying type.
> os.path.dirname
> Currently, it takes a str or bytes and returns the dirname of the same
> type.
> In the future, it will also accept Path and DirEntry and return the
> underlying type.
> Let's consider the type hint of os.path.dirname at present and in the
> future:
> Currently, one could write
> def dirname(p: Union[str, bytes]) -> Union[str, bytes]:
>     ...
> While this is valid, it could be more precise:
> pathstring = typing.TypeVar('pathstring', str, bytes)
> def dirname(p: pathstring) -> pathstring:
>     ...
> This now contains the information that the return type is the same as
> the argument type. The name 'pathstring' may be considered slightly
> misleading because "byte strings" are not actually strings in Python
> 3, but at least it does not advertise the use of bytes as paths, which
> is very rarely desirable.
> But what about the future? There are two kinds of rich path objects,
> those with an underlying type of str and those with an underlying type
> of bytes. These should implement the __fspath__() protocol and return
> their underlying type. However, we do care about what (underlying)
> type is provided by the protocol, so we might want to introduce
> something like typing.FSPath[underlying_type]:
> FSPath[str]       # str-based pathlike, including str
> FSPath[bytes]  # bytes-based pathlike, including bytes
> And now, using the above defined TypeVar pathstring, the future
> version of dirname would be type annotated as follows:
> def dirname(p: FSPath[pathstring]) -> pathstring:
>     ...
> It's getting late. I hope this made sense :).
> -Koos
> [1]
> [2]

--Guido van Rossum (

From ethan at  Mon Apr 18 22:06:40 2016
From: ethan at (Ethan Furman)
Date: Mon, 18 Apr 2016 19:06:40 -0700
Subject: [Python-ideas] Type hinting for path-related functions
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/18/2016 06:27 PM, Guido van Rossum wrote:

> Is there going to be a PEP for os.fspath()? (I muted most of the
> discussions so I'm not sure where it stands.)

We're nearing the end of the discussions.  Brett Cannon and Chris 
Angelico will draw up an amendment to the pathlib PEP.


From leewangzhong+python at  Tue Apr 19 00:52:39 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Tue, 19 Apr 2016 00:52:39 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Apr 18, 2016 at 11:51 AM, Ethan Furman <ethan at> wrote:
> On 04/18/2016 08:41 AM, Franklin? Lee wrote:
>> I don't want arguments like, "This can already be done, for yourself, if
>> you really need it." I use IPython's shell, with maximum output height,
>> and it was years ago that I used a terminal which I couldn't scroll.
>> I want arguments like, "This will break my workflow."
> How about:  the default REPL is a basic tool, and the limitations of basic
> tools are what drive folks to seek out advanced tools. ?

The REPL is a basic tool for basic users, which is why it should "Do
the right thing" for people who wouldn't know better. I'm asking
whether this is the "right thing" for those basic users: Advanced
users are the ones who can use more than basic info.

> Or Marc-Andre's:
>> I [...] don't think the defaults should be such that you have to
>> enable some variable to see everything. This would make debugging
>> harder than necessary, since often enough (following Murphy's law)
>> the most interesting information would be hidden in some ellipsis.

I think it's a high priority not to lose information that was hidden.
In my proposal, I said that there should be a way to access it, right
after you cause it, which is very different from needing to enable a

If not for (memory) performance anxiety, I would've proposed that the
info would be stored indefinitely, or that you would always be able to
access the last three (or something) outputs/stacktraces, because I
was concerned about typos causing exceptions that would then push the
interesting stacktrace out.

(Hey, that's another possible proposal: maybe syntax/lookup errors on
the "R" part of "REPL" shouldn't count toward `sys.last_traceback`.
Has anyone here ever needed to debug or reprint syntax errors coming
from compiling the command they just entered?)

> Or even Nick's (that you snipped):
>> Since the more the REPL does, the more opportunities there are for
>> it to break when debugging, having the output hooks be as simple as
>> possible is quite desirable.

I read this as, "It'll be more complicated, so it MIGHT be buggy."
This seems solvable via design, and doesn't speak to whether it's
good, in principle, to limit output per input. If you still think it's
a good reason, then I probably didn't understand it correctly.

On Mon, Apr 18, 2016 at 2:39 PM, Ben Finney <ben+python at> wrote:
> "Franklin? Lee"
> <leewangzhong+python at> writes:
>> On Apr 18, 2016 3:25 AM, "Nick Coghlan" <ncoghlan at> wrote:
>> > So folks are already free to change their REPL to work however they
>> > want it to: set sys.displayhook and sys.excepthook from a
>> I don't want arguments like, "This can already be done, for yourself,
>> if you really need it."
> That's your prerogative.
> This forum is for discussing whether ideas are right for changing
> Python, though. Arguments such as "This can already be done, for
> yourself, if you need it" are salient and sufficient, and not to be
> dismissed.
>> I want arguments like, "This will break my workflow."
> You're free to solicit those arguments. You'll need to acknowledge,
> though, that they are in addition to the quite sufficient argument of
> "already possible for people to get this in Python as is, if they want
> it".

When they are presented as a solution to "my" problem, they are, to be
short, irrelevant. They try to address the speaker's need for the
feature on their own machine, when I am asking for opinions on both
the usefulness and harmfulness of such a feature, and the principle
behind the feature. They are distracting: people are talking more
about whether I can do it "myself" than how it's bad to have this, or
how it wouldn't help who I want to help.

Even if they weren't trying to solve "my" problem, I had non-advanced
users in mind, and these solutions tend to be about what advanced
users can do. I felt the need to point out what I'm looking for in the
arguments, because people are telling me how *I* can have this feature
by, for example, writing a display hook, or plugging into pprint.
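
For readers who have not seen one, the kind of display hook being referred to looks roughly like the sketch below. It is an illustration under assumptions, not the proposal itself: short_repr, short_displayhook and install are hypothetical names, and reprlib's default limits stand in for whatever truncation rule would actually be chosen.

```python
import builtins
import reprlib
import sys


def short_repr(value):
    """Abbreviated repr: reprlib shortens long containers and strings."""
    return reprlib.repr(value)


def short_displayhook(value):
    """Replacement for sys.displayhook that prints a shortened repr."""
    if value is None:
        return
    # Keep the full object reachable as `_`, so nothing is lost.
    builtins._ = value
    print(short_repr(value))


def install():
    """Opt in for the current interactive session."""
    sys.displayhook = short_displayhook
```

With reprlib's defaults, short_repr(list(range(1000))) gives '[0, 1, 2, 3, 4, 5, ...]', while the complete list stays available through `_`.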

From mike at  Tue Apr 19 01:21:46 2016
From: mike at (Michael Selik)
Date: Tue, 19 Apr 2016 05:21:46 +0000
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Apr 13, 2016 at 12:45 PM Guido van Rossum <guido at> wrote:

> On Wed, Apr 13, 2016 at 2:47 AM, Rob Cliffe <rob.cliffe at>
> wrote:
> > Isn't there an inconsistency that random.sample caters to a set by
> > converting it to a tuple, but random.choice doesn't?
> Perhaps because the use cases are different? Over the years I've
> learned that inconsistencies aren't always signs of sloppy thinking --
> they may actually point to deep issues that aren't apparently on the
> surface.
> I imagine the typical use case for sample() to be something that
> samples the population once and then does something to the sample; the
> next time sample() is called the population is probably different
> (e.g. the next lottery has a different set of players).
> But I imagine a fairly common use case for choice() to be choosing
> from the same population over and over, and that's exactly the case
> where the copying implementation you're proposing would be a small
> disaster.

Repeated random sampling from the same population is a common use case
("bootstrapping"). Perhaps the oversight was *allowing* sets as input into
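
To make the repeated-sampling concern concrete, here is a small sketch (the population values and the bootstrap_means name are made up for illustration). random.choice() indexes into its argument, so a set is rejected; converting to a sequence once, outside the loop, avoids paying for a copy on every draw.

```python
import random
import statistics

population = {3, 1, 4, 15, 9, 2, 6}

try:
    random.choice(population)  # sets cannot be indexed
    choice_on_set_failed = False
except TypeError:
    choice_on_set_failed = True

# Copy once, then resample with replacement many times ("bootstrapping").
pool = tuple(population)


def bootstrap_means(pool, rounds, rng):
    """Mean of each resample; every resample has len(pool) draws."""
    return [
        statistics.mean([rng.choice(pool) for _ in pool])
        for _ in range(rounds)
    ]


means = bootstrap_means(pool, rounds=100, rng=random.Random(0))
```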

From p.f.moore at  Tue Apr 19 04:09:25 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 19 Apr 2016 09:09:25 +0100
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On 19 April 2016 at 05:52, Franklin? Lee <leewangzhong+python at> wrote:
>> How about:  the default REPL is a basic tool, and the limitations of basic
>> tools are what drive folks to seek out advanced tools. ?
> The REPL is a basic tool for basic users, which is why it should "Do
> the right thing" for people who wouldn't know better. I'm asking
> whether this is the "right thing" for those basic users: Advanced
> users are the ones who can use more than basic info.

Basic users should probably be using a tool like IDLE, which has a bit
more support for beginners than the raw REPL. I view the REPL as more
important for intermediate or advanced users who want to quickly test
out an idea (at least, that's *my* usage of the REPL).

As I disagree with your statement "the REPL is a basic tool for basic
users" I don't find your conclusions compelling, sorry.


From k7hoven at  Tue Apr 19 05:35:18 2016
From: k7hoven at (Koos Zevenhoven)
Date: Tue, 19 Apr 2016 12:35:18 +0300
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 11:09 AM, Paul Moore <p.f.moore at> wrote:
> On 19 April 2016 at 05:52, Franklin? Lee <leewangzhong+python at> wrote:
>>> How about:  the default REPL is a basic tool, and the limitations of basic
>>> tools are what drive folks to seek out advanced tools. ?
>> The REPL is a basic tool for basic users, which is why it should "Do
>> the right thing" for people who wouldn't know better. I'm asking
>> whether this is the "right thing" for those basic users: Advanced
>> users are the ones who can use more than basic info.
> Basic users should probably be using a tool like IDLE, which has a bit
> more support for beginners than the raw REPL. I view the REPL as more
> important for intermediate or advanced users who want to quickly test
> out an idea (at least, that's *my* usage of the REPL).

I was once a basic user, but I still have no idea what "IDLE" is. Does
it come with python?

I have tried

$ idle
$ python -m idle
$ python -m IDLE
$ python --idle

To be honest, I do remember seeing a shortcut to IDLE in one of my
Windows python installations, and I've seen it come up in discussions.
However, it does not seem to me that IDLE is something that beginners
would know to turn to. I do use IPython. IPython is nice---too bad it
starts up slowly and is not recommended by default.

-- Koos

From leewangzhong+python at  Tue Apr 19 05:36:14 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Tue, 19 Apr 2016 05:36:14 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 19, 2016 4:09 AM, "Paul Moore" <p.f.moore at> wrote:
> On 19 April 2016 at 05:52, Franklin? Lee <leewangzhong+python at>
> wrote:
> >> How about:  the default REPL is a basic tool, and the limitations of
> >> basic
> >> tools are what drive folks to seek out advanced tools. ?
> >
> > The REPL is a basic tool for basic users, which is why it should "Do
> > the right thing" for people who wouldn't know better. I'm asking
> > whether this is the "right thing" for those basic users: Advanced
> > users are the ones who can use more than basic info.
> >
> Basic users should probably be using a tool like IDLE, which has a bit
> more support for beginners than the raw REPL.

You say "should"? Do you mean that it is likely, or do you mean that it is
what would happen in an ideal world? My college had CS students SSH into
the department's Linux server to compile and run their code, and many
teachers don't believe that students should start with fancy IDE features
like, er, syntax highlighting.

You (and most regulars on this list) can adjust your shell to the way you
like it, or use a more sophisticated shell, like IPython or bpython. On the
other hand, changing shells and adding display hooks to is not an
option for those who don't know it's an option.

> I view the REPL as more
> important for intermediate or advanced users who want to quickly test
> out an idea (at least, that's *my* usage of the REPL).

But that doesn't answer my question: would the proposed change hurt your

From p.f.moore at  Tue Apr 19 05:48:54 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 19 Apr 2016 10:48:54 +0100
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On 19 April 2016 at 10:36, Franklin? Lee <leewangzhong+python at> wrote:
> On Apr 19, 2016 4:09 AM, "Paul Moore" <p.f.moore at> wrote:
>> On 19 April 2016 at 05:52, Franklin? Lee <leewangzhong+python at>
>> wrote:
>> >> How about:  the default REPL is a basic tool, and the limitations of
>> >> basic
>> >> tools are what drive folks to seek out advanced tools. ?
>> >
>> > The REPL is a basic tool for basic users, which is why it should "Do
>> > the right thing" for people who wouldn't know better. I'm asking
>> > whether this is the "right thing" for those basic users: Advanced
>> > users are the ones who can use more than basic info.
>> >
>> Basic users should probably be using a tool like IDLE, which has a bit
>> more support for beginners than the raw REPL.
> You say "should"? Do you mean that it is likely, or do you mean that it is
> what would happen in an ideal world? My college had CS students SSH into the
> department's Linux server to compile and run their code, and many teachers
> don't believe that students should start with fancy IDE features like, er,
> syntax highlighting.

I mean that that is what I hear people on lists like this saying as "a
reasonable beginner environment". It means that the people I've
introduced to Python (on Windows) have tended to end up using IDLE
(either from the start menu or via "edit with IDLE" from the right
click menu on a script).

My experience (in business environments) is that people expect an IDE
when introduced to a new programming language, and IDLE, like it or
not, is what is available out of the box with Python.

> You (and most regulars on this list) can adjust your shell to the way you
> like it, or use a more sophisticated shell, like IPython or bpython. On the
> other hand, changing shells and adding display hooks to is not an
> option for those who don't know it's an option.

As a "scripting expert" and consultant, I typically get asked to knock
up scripts on a variety of environments. I do not normally have what
I'd describe as "my shell", just a series of basic "out of the box"
prompts people expect me to work at. So no, the luxury of configuring
the default experience is *not* something I typically have.

>> I view the REPL as more
>> important for intermediate or advanced users who want to quickly test
>> out an idea (at least, that's *my* usage of the REPL).
> But that doesn't answer my question: would the proposed change hurt your
> workflow?

Yes. If I get a stack trace, I want it all. And if I print something
out, I want to see it by default. The REPL for me is an investigative
environment for seeing exactly what's going on.

(Also, having the REPL behave differently depending on what version of
Python I have would be a problem - backward compatibility applies here
as much as anywhere else).


From contrebasse at  Tue Apr 19 05:49:44 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 09:49:44 +0000 (UTC)
Subject: [Python-ideas] Anonymous namedtuples
Message-ID: <>

Hi, list !

namedtuples are really great. I would like to use them even more, for
example for functions that return multiple arguments. The problem is that
namedtuples have to be "declared" beforehand, so it would be quite tedious
to declare a namedtuple per function, that's why I very rarely do it.

Another thing I don't like about namedtuples is the duplication of the name.
Typical declarations look like `Point = namedtuple('Point', ['x', 'y'])`,
where `Point` is repeated twice. I'll go one step further and say that
the name is useless most of the time, so let's just get rid of it.

Proposal
========

So I thought about a new (ok, maybe it has been proposed before but I
couldn't find it) syntax for anonymous namedtuples (I put the prints as
comments, otherwise gmane is complaining about top-posting):

    my_point = (x=12, y=16)
    # (x=12, y=16)
    my_point[0]
    # 12
    my_point.y
    # 16
    type(my_point)
    # <class 'anonymousnamedtuple'>

It's just a tuple, but with names. Parentheses would be mandatory because
`my_point = x = 12, y = 16` wouldn't work. Single-element anonymous
namedtuples would require a trailing comma, similarly to tuples.

I'd be happy to make a factory function for my personal use, but the order
of kwargs is not respected. To have an elegant syntax, it has to be a
construct of the language.
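
A minimal sketch of such a factory, under an assumption that did not hold when this was written: it relies on **kwargs preserving call order, which Python guarantees only from version 3.6 onward (PEP 468). The names anon and _classes are hypothetical. Unlike the proposal, each distinct sequence of field names gets its own class, cached so that repeated calls reuse it.

```python
from collections import namedtuple

_classes = {}  # cache: tuple of field names -> generated namedtuple class


def anon(**fields):
    """Build a namedtuple instance whose fields follow keyword order."""
    names = tuple(fields)
    cls = _classes.get(names)
    if cls is None:
        cls = _classes[names] = namedtuple("anonymousnamedtuple", names)
    return cls(**fields)


my_point = anon(x=12, y=16)
x, y = my_point  # still unpacks like a plain tuple
```

Indexing (my_point[0]) and attribute access (my_point.y) both work, as does _fields, since the result is an ordinary namedtuple instance.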

The created objects would all be of the same class (with a better name, of
course). As it adds no keyword to the language, it would not break

The created objects could support some namedtuples methods: _asdict,
_replace and _fields. The two other methods _make and _source would not apply.

Performance-wise, I guess that they would be slower than tuples and
namedtuples, but to me the additional usability trumps the performance hit.
If you have to care about performance, you can still use a namedtuple or
bare tuples.

Even further...

This initial idea is useful in itself, but it could be extended even
further. I've thought about a possible evolution.

The idea is that not all values have to be named: similarly to how args and
kwargs work for functions, there could be non-named values, with the same
limitations as the arguments (unnamed first, named afterwards). All values
could be retrieved by indexing, and named values could also be retrieved by
their attribute name.

This way they could be used in __getitem__ as proposed in PEP 472[0], so
that __getitem__ supports additional keyword arguments. It would
correspond to Strategy "named tuple", with some of the cons removed:

"The namedtuple fields, and thus the type, will have to change according to
the passed arguments. This can be a performance bottleneck, and makes it
impossible to guarantee that two subsequent index accesses get the same
Index class;"

That would still be true for the performance bottleneck, but since the class
would always be the same the second problem disappears.

To minimize the performance hit, a standard tuple would be passed to
__getitem__ if there is no keyword argument, and an anonymous namedtuple
would be passed if there is a keyword.
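
For reference, this is how indexing already hands its arguments to __getitem__ today: a single index arrives as-is, and several comma-separated indices arrive as one plain tuple. The Grid class is a hypothetical stand-in that just echoes what it receives.

```python
class Grid:
    """Hypothetical container that returns the key it was indexed with."""

    def __getitem__(self, key):
        return key


grid = Grid()
single = grid[3]        # key is the int 3
multi = grid[3, 5, 8]   # key is the tuple (3, 5, 8)
```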

"the _n "magic" fields are a bit unusual, but ipython already uses them for
result history."

Those wouldn't be needed if both named and unnamed values are allowed.

"Differently from a function, the two notations gridValues[x=3, y=5, z=8]
and gridValues[3,5,8] would not gracefully match if the order is modified at
call time (e.g. we ask for gridValues[y=5, z=8, x=3]) . In a function, we
can pre-define argument names so that keyword arguments are properly
matched. Not so in __getitem__ , leaving the task for interpreting and
matching to __getitem__ itself."

Indexing already doesn't behave like a function call. Keeping the argument
order (like this proposal would imply) is a special case of not keeping it,
while the reverse is not true, so the more general solution would be

Finally, I admit that I have no idea how to implement this; I love Python
but have never looked at its internals. Maybe my proposal is too naive, I
really hope not.

Thanks for your attention,



From rosuav at  Tue Apr 19 05:56:22 2016
From: rosuav at (Chris Angelico)
Date: Tue, 19 Apr 2016 19:56:22 +1000
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 7:35 PM, Koos Zevenhoven <k7hoven at> wrote:
> I was once a basic user, but I still have no idea what "IDLE" is. Does
> it come with python?
> I have tried
> $ idle
> $ python -m idle
> $ python -m IDLE
> $ python --idle

The first one will often work, but it depends on exactly how your
Python installation has been set up. (Also, not all Linux distros come
with Idle by default; you may have to install a python-idle package.)
The most reliable way to invoke Idle is:

$ python3 -m idlelib.idle

(or python2 if you want 2.7). It's more verbose than would be ideal,
and maybe it'd help to add one of your other attempts as an alias, but
normally Idle will be made available in your GUI, which is where a lot
of people will look for it.


From contrebasse at  Tue Apr 19 06:01:49 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 10:01:49 +0000 (UTC)
Subject: [Python-ideas] Have REPL print less by default
References: <>
Message-ID: <>

Franklin? Lee <leewangzhong+python at ...> writes:

> 2. Only print a few parts of the stack trace. In particular, for a
> recursive or mutually recursive function, if the error was due to maximum
> recursion (is this reasonably distinguishable? the error is
> `RuntimeError('maximum recursion depth exceeded')`), try to print each
> function on the stack once each.
>
> - If a function appears more than once in a row, show it once, with the
> note, "(and X recursive calls)".
> - If functions otherwise appears more than once (usually by mutual
> recursion?), and there is a run of them, list them as, "(Mutual recursion:
> 'x' (5 times), 'y' (144 times), 'z' (13 times).)".

I don't know about the other ideas, but simplifying the output of recursion
errors would be very useful. 990 repeats of the exact same lines add no
value at all, whether the user is an expert or a beginner.

Something like this is very clear in my mind:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/", line 10, in f
  File "/home/user/", line 10, in f
  File "/home/user/", line 10, in f (994 repeats)
  File "/home/user/", line 10, in f
  File "/home/user/", line 10, in f
RecursionError: maximum recursion depth exceeded
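
A rough sketch of the collapsing step, assuming the traceback is already available as a list of formatted lines. collapse_repeats and its threshold parameter are hypothetical, and the wording of the summary line is only one possibility. It is simpler than the layout above: it keeps one copy of a long run instead of a few leading and trailing copies.

```python
from itertools import groupby


def collapse_repeats(lines, threshold=3):
    """Replace long runs of identical lines with one line plus a note."""
    out = []
    for line, run in groupby(lines):
        count = len(list(run))
        if count > threshold:
            out.append(line)
            out.append(f"  [previous line repeated {count - 1} more times]")
        else:
            # Short runs are kept verbatim.
            out.extend([line] * count)
    return out


frames = ['File "<stdin>", line 1, in <module>']
frames += ['File "spam.py", line 10, in f'] * 998
collapsed = collapse_repeats(frames)
```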

From p.f.moore at  Tue Apr 19 06:03:35 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 19 Apr 2016 11:03:35 +0100
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On 19 April 2016 at 10:35, Koos Zevenhoven <k7hoven at> wrote:
> I was once a basic user, but I still have no idea what "IDLE" is. Does
> it come with python?
> I have tried
> $ idle
> $ python -m idle
> $ python -m IDLE
> $ python --idle
> To be honest, I do remember seeing a shortcut to IDLE in one of my
> Windows python installations, and I've seen it come up in discussions.
> However, it does not seem to me that IDLE is something that beginners
> would know to turn to. I do use IPython. IPython is nice---too bad it
> starts up slowly and is not recommended by default.

On Windows, it's set up as a shortcut in the start menu alongside
Python, with a tooltip "Launches IDLE, the interactive environment for
Python 3.5". And Python scripts have an "Edit with IDLE" option on the
right click menu. So it's where most Windows users would naturally
look when searching for "the Python GUI".

Discoverability of IDLE on other platforms isn't something I know much
about, sorry.

(I should also point out that I'm a relentless command line user on
Windows, so my comments above about Windows are based on my experience
watching what my "normal Windows user" colleagues do, not on my
personal habits...)


From rosuav at  Tue Apr 19 06:06:17 2016
From: rosuav at (Chris Angelico)
Date: Tue, 19 Apr 2016 20:06:17 +1000
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 7:49 PM, Joseph Martinot-Lagarde
<contrebasse at> wrote:
> I'd be happy to make a factory function for my personal use, but the order
> of kwargs is not respected. To have an elegant syntax, it has to be a
> construct of the language.

There have been proposals to have kwargs retain some order (either by
having it actually be an OrderedDict, or by changing the native dict
type to retain order under fairly restricted circumstances). With
that, you could craft the factory function easily.

It may be worth dusting off one of those proposals and seeing if it
can move forward.

That said, though: I don't often feel the yearning for quick
namedtuples. A function that today returns a namedtuple of 'x' and 'y'
can't in the future grow a 'z' without breaking any callers that
unpack "x, y = func()", so extensible return objects have to forego
unpacking (eg using a dict or SimpleNamespace). So you still have a
sharp division between "tuple" and "thing with arbitrary names", and a
namedtuple has to be soundly in the first camp. There *is* a use-case
for this, but it's fairly narrow - you have to have enough names that
you don't want to just unpack them everywhere, yet still have an
order, AND the set of names has to be constant. I suspect the correct
solution here is to make it easier for people to write their own
namedtuple factories, rather than granting them syntactic support.
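
The unpacking hazard can be shown in a few lines. This is only an illustration; ResultV1, ResultV2 and the field values are made up.

```python
from collections import namedtuple
from types import SimpleNamespace

ResultV1 = namedtuple("ResultV1", ["x", "y"])
ResultV2 = namedtuple("ResultV2", ["x", "y", "z"])  # later grows a 'z'

x, y = ResultV1(1, 2)  # existing callers unpack two values

try:
    x, y = ResultV2(1, 2, 3)  # same caller code against the new return type
    unpack_broke = False
except ValueError:
    unpack_broke = True

# An extensible return object gives up unpacking but can grow freely.
result = SimpleNamespace(x=1, y=2, z=3)
```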


From leewangzhong+python at  Tue Apr 19 06:10:29 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Tue, 19 Apr 2016 06:10:29 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 19, 2016 5:48 AM, "Paul Moore" <p.f.moore at> wrote:
> > You say "should"? Do you mean that it is likely, or do you mean that it is
> > what would happen in an ideal world? My college had CS students SSH into the
> > department's Linux server to compile and run their code, and many teachers
> > don't believe that students should start with fancy IDE features like, er,
> > syntax highlighting.
> I mean that that is what I hear people on lists like this saying as "a
> reasonable beginner environment". It means that the people I've
> introduced to Python (on Windows) have tended to end up using IDLE
> (either from the start menu or via "edit with IDLE" from the right
> click menu on a script).

But are they perhaps tending to use IDLE because you were the one
introducing them to it? Whether it is a recommended environment by the
consensus of mailing lists is hardly indicative that IDLE is what people
will start on.

> My experience (in business environments) is that people expect an IDE
> when introduced to a new programming language, and IDLE, like it or
> not, is what is available out of the box with Python.

They expect it => they know what an IDE is. Programmers are already
half-intermediate, especially programmers who use IDEs.

> > You (and most regulars on this list) can adjust your shell to the way you
> > like it, or use a more sophisticated shell, like IPython or bpython. On the
> > other hand, changing shells and adding display hooks to is not an
> > option for those who don't know it's an option.
> As a "scripting expert" and consultant, I typically get asked to knock
> up scripts on a variety of environments. I do not normally have what
> I'd describe as "my shell", just a series of basic "out of the box"
> prompts people expect me to work at. So no, the luxury of configuring
> the default experience is *not* something I typically have.

I suggested a command-line switch.

> >> I view the REPL as more
> >> important for intermediate or advanced users who want to quickly test
> >> out an idea (at least, that's *my* usage of the REPL).
> >
> > But that doesn't answer my question: would the proposed change hurt your
> > workflow?
> Yes. If I get a stack trace, I want it all. And if I print something
> out, I want to see it by default. The REPL for me is an investigative
> environment for seeing exactly what's going on.

And why is it an insufficient option for you to type, for example,
"_exfull()" to get the full trace?

> (Also, having the REPL behave differently depending on what version of
> Python I have would be a problem - backward compatibility applies here
> as much as anywhere else).

This is a concern, but not one that is enough, by itself, to justify the
status quo. I disagree that it applies *anywhere near as much* here, since
I don't see how it could break existing code. It will add a little
adjustment and a little more work sometimes, but it doesn't require, say,
refactoring a module, or every site page you ever downloaded.

From __peter__ at  Tue Apr 19 06:13:28 2016
From: __peter__ at (Peter Otten)
Date: Tue, 19 Apr 2016 12:13:28 +0200
Subject: [Python-ideas] Anonymous namedtuples
References: <>
Message-ID: <nf50cb$jug$>

Joseph Martinot-Lagarde wrote:

> Proposal
> ========
> So I thought about a new (ok, maybe it has been proposed before but I
> couldn't find it) syntax for anonymous namedtuples (I put the prints as
> comments, otherwise gmane is complaining about top-posting):
>     my_point = (x=12, y=16)
>     # (x=12, y=16)
>     my_point[0]
>     # 12
>     my_point.y
>     # 16
>     type(my_point)
>     # <class 'anonymousnamedtuple'>
> It's just a tuple, but with names. Parentheses would be mandatory because
> `my_point = x = 12, y = 16` wouldn't work. Single-element anonymous
> namedtuples would require a trailing comma, similarly to tuples.

That would be a great addition!

The parens could be optional for return values

return x=1, y=2

From p.f.moore at  Tue Apr 19 06:25:07 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 19 Apr 2016 11:25:07 +0100
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On 19 April 2016 at 11:06, Chris Angelico <rosuav at> wrote:
> That said, though: I don't often feel the yearning for quick
> namedtuples. A function that today returns a namedtuple of 'x' and 'y'
> can't in the future grow a 'z' without breaking any callers that
> unpack "x, y = func()", so extensible return objects have to forego
> unpacking (eg using a dict or SimpleNamespace).

Possibly the docs for namedtuple should refer the user to
SimpleNamespace as an alternative if "being a tuple subclass" isn't an
important requirement. Or possibly SimpleNamespace should be in the
collections module (alongside namedtuple) rather than in the types
module?

The issue may simply be that SimpleNamespace isn't as discoverable as
it should be?


From rosuav at  Tue Apr 19 06:37:08 2016
From: rosuav at (Chris Angelico)
Date: Tue, 19 Apr 2016 20:37:08 +1000
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 8:10 PM, Franklin? Lee
<leewangzhong+python at> wrote:
> This is a concern, but not one that is enough, by itself, to justify the
> status quo. I disagree that it applies *anywhere near as much* here, since I
> don't see how it could break existing code. It will add a little adjustment
> and a little more work sometimes, but it doesn't require, say, refactoring a
> module, or every site page you ever downloaded.

You have things slightly backwards. The status quo doesn't have to
justify itself; a change does. Particularly when there's ANY backward
incompatibility being brought in, the change has to be of material
benefit, or it isn't going to happen. Take, for example, this bug
report:

Notice that:

* The error is clearly *incorrect*. The interpreter should not be
raising SystemError in valid situations.
* The old behaviour (raising ValueError) is also incorrect, but Brett
offered it as a serious possibility.
* Only the latest CPython was changed (the 'default' branch). It
wasn't correspondingly fixed in 3.4 or 3.5.
* Nobody ever even *considered* changing 2.7's behaviour.

Backward compatibility CAN be broken, but it's a big deal. Anything
that can be done simply by changing a few local options is preferable
to a core language change. Remember, anyone can redistribute Python
with a modified or other init script, or create a shortcut
icon or shell script that invokes "python3 -i"
rather than simply running "python3" - and that script can do whatever
you want it to, including changing sys.displayhook and sys.excepthook.
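
A minimal sketch of the startup-script approach described above. The
names shorten, short_displayhook, short_excepthook, install and the
MAX_CHARS cutoff are made up for illustration; only sys.displayhook,
sys.excepthook and traceback.format_exception_only come from the
standard library.

```python
import builtins
import sys
import traceback

MAX_CHARS = 200  # hypothetical cutoff, not a CPython setting


def shorten(text, limit=MAX_CHARS):
    """Return text unchanged if it is short, else cut it and mark the cut."""
    if len(text) <= limit:
        return text
    return text[:limit] + "... [truncated]"


def short_displayhook(value):
    """Like the default hook, but print at most MAX_CHARS of the repr."""
    if value is None:
        return
    builtins._ = value  # keep the usual "_" convenience
    print(shorten(repr(value)))


def short_excepthook(exc_type, exc, tb):
    """Print only the final 'ExceptionType: message' line(s), no stack."""
    sys.stderr.write("".join(traceback.format_exception_only(exc_type, exc)))


def install():
    """Swap in the quieter hooks for the current interpreter session."""
    sys.displayhook = short_displayhook
    sys.excepthook = short_excepthook
```

Saved under some file name of your choosing with a final install()
call, a script like this could be run with "python3 -i" so that the
interactive session starts with the quieter hooks already in place.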


From k7hoven at  Tue Apr 19 06:41:45 2016
From: k7hoven at (Koos Zevenhoven)
Date: Tue, 19 Apr 2016 13:41:45 +0300
Subject: [Python-ideas] Type hinting for path-related functions
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 4:27 AM, Guido van Rossum <guido at> wrote:
> Your pathstring seems to be the same as the predefined (in, and
> PEP 484) AnyStr.

Oh, there too! :) I thought I will need a TypeVar, so I turned to
help(typing.TypeVar) to look up how to do that, and there it was,
right in front of me, just with a different name 'A':

A = TypeVar('A', str, bytes)

Anyway, it might make sense to consider defining 'pathstring' (or
'PathStr' for consistency?), even if it would be the same as AnyStr.
Then, hypothetically, if at any point in the far future, bytes paths
would be deprecated, it could be considered to make PathStr just str.
After all, we don't want just Any String, we want something that
represents a path (in a documentation sense).
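
As a rough illustration of the idea, assuming the alias would be spelled
PathStr (a hypothetical name, nothing in typing defines it) and using a
made-up helper last_component:

```python
from typing import TypeVar

# Hypothetical alias: constrained to str or bytes, like AnyStr.
PathStr = TypeVar("PathStr", str, bytes)


def last_component(path: PathStr) -> PathStr:
    """Return the part after the final '/', keeping the input's type."""
    sep = "/" if isinstance(path, str) else b"/"
    return path.rsplit(sep, 1)[-1]
```

A str path gives back a str and a bytes path gives back bytes, which is
the relationship the constrained TypeVar is meant to document.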

> You are indeed making sense, except that for various reasons the stdlib is
> not likely to adopt in-line signature annotations yet -- not even for new
> code.
> However once there's agreement on os.fspath() it can be added to the stubs
> in

I see, and I did have that impression already about the stdlib and
type hints, probably based on some of your writings. My intention was
to write these in the stub format, but apparently I need to look up
the stub syntax once more.

> Is there going to be a PEP for os.fspath()? (I muted most of the discussions
> so I'm not sure where it stands.)

It has not seemed like a good idea to discuss this (too?), but now
that you ask, I have been wondering how optimal it is to add this to
the pathlib PEP. While the changes do affect pathlib (even the code of
the module itself), this will affect ntpath, posixpath, os.scandir,
os.[other stuff], DirEntry (tempted to say os.DirEntry, but that is
not true), shutil.[stuff], (io.)open, and potentially all kinds of
random places in the stdlib, such as fileinput, filecmp, zipfile,
tarfile, tempfile (for the 'dir' keyword arguments), maybe even glob,
and fnmatch, to name a few :).

And now, if the FSPath[underlying_type] I just proposed ends up being
added to typing (by whatever name), this will even affect


From rosuav at  Tue Apr 19 06:46:02 2016
From: rosuav at (Chris Angelico)
Date: Tue, 19 Apr 2016 20:46:02 +1000
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 8:25 PM, Paul Moore <p.f.moore at> wrote:
> On 19 April 2016 at 11:06, Chris Angelico <rosuav at> wrote:
>> That said, though: I don't often feel the yearning for quick
>> namedtuples. A function that today returns a namedtuple of 'x' and 'y'
>> can't in the future grow a 'z' without breaking any callers that
>> unpack "x, y = func()", so extensible return objects have to forego
>> unpacking (eg using a dict or SimpleNamespace).
> Possibly the docs for namedtuple should refer the user to
> SimpleNamespace as an alternative if "being a tuple subclass" isn't an
> important requirement. Or possibly SimpleNamespace should be in the
> collections module (alongside namedtuple) rather than in the types
> module?
> The issue may simply be that SimpleNamespace isn't as discoverable as
> it should be?

That's entirely possible. For the situations where order is
insignificant and unpacking is unnecessary, a SimpleNamespace would be
perfect, if people knew about it. Probably not worth moving the actual
class, but a quick little docs link might help.
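
A small sketch of the trade-off being discussed. The names PointV1,
PointV2, func_v1, func_v2 and func_ns are invented for the example.

```python
from collections import namedtuple
from types import SimpleNamespace

PointV1 = namedtuple("PointV1", "x y")
PointV2 = namedtuple("PointV2", "x y z")  # later version grows a 'z'


def func_v1():
    return PointV1(1, 2)


def func_v2():
    return PointV2(1, 2, 3)


x, y = func_v1()  # fine with two fields

try:
    x, y = func_v2()  # same caller code against the three-field version
    unpack_ok = True
except ValueError:
    unpack_ok = False  # too many values to unpack


def func_ns():
    # No order and no unpacking, so adding attributes later is harmless.
    return SimpleNamespace(x=1, y=2, z=3)


result = func_ns()
```

Callers that only read result.x and result.y keep working when z is
added, at the cost of losing tuple unpacking and indexing.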


From leewangzhong+python at  Tue Apr 19 07:10:45 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Tue, 19 Apr 2016 07:10:45 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Apr 19, 2016 6:37 AM, "Chris Angelico" <rosuav at> wrote:
> On Tue, Apr 19, 2016 at 8:10 PM, Franklin? Lee
> <leewangzhong+python at> wrote:
> > This is a concern, but not one that is enough, by itself, to justify the
> > status quo. I disagree that it applies *anywhere near as much* here, since I
> > don't see how it could break existing code. It will add a little adjustment
> > and a little more work sometimes, but it doesn't require, say, refactoring a
> > module, or every site page you ever downloaded.
> >
> >
> You have things slightly backwards. The status quo doesn't have to
> justify itself; a change does. Particularly when there's ANY backward
> incompatibility being brought in, the change has to be of material
> benefit, or it isn't going to happen.

No, I meant exactly what I said. He made it as an argument for the status
quo, and I said that, alone, it is a weak argument. I don't mean anything
beyond that. I posted this to find out potential issues and to see whether
less output (by default) is a decent goal to have. I am pushing for better
counterarguments. I am not suggesting that change should be the default

The justification is already on the table: it will help people who are
still developing their skills and knowledge avoid getting too much
non-information. I expected arguments like, "When I was a newbie using the
REPL, this would not have helped me." Hardly anyone argued against it, but
just because it isn't discussed doesn't mean it isn't there.

>  Take, for example, this bug
> report:
> Notice that:
> * The error is clearly *incorrect*. The interpreter should not be
> raising SystemError in valid situations.
> * The old behaviour (raising ValueError) is also incorrect, but Brett
> offered it as a serious possibility.
> * Only the latest CPython was changed (the 'default' branch). It
> wasn't correspondingly fixed in 3.4 or 3.5.
> * Nobody ever even *considered* changing 2.7's behaviour.
> Backward compatibility CAN be broken, but it's a big deal.

Again, this is very different from the usual change. The incompatibility
that would be introduced is purely in the mind of the programmer:
-> "Oh, I am not getting the full output. But wait, there is a message
<- "Wow, that is a lot of output. I suppose I will have to deal with it,
since it is already here."

The change in the bug report affects existing code (like code that tries to
catch SystemError) and googlability (existing resources). This change only
affects existing understanding, which is exactly what you are able to
change when you are sitting at a REPL. That is a huge difference with
respect to compatibility.

> Anything
> that can be done simply by changing a few local options is preferable
> to a core language change. Remember, anyone can redistribute Python
> with a modified or other init script, or create a shortcut
> icon or shell script that invokes "python3 -i"
> rather than simply running "python3" - and that script can do whatever
> you want it to, including changing sys.displayhook and sys.excepthook.

Again, these are things that the ones who are supposed to benefit from the
proposal are least likely to have available to them. Adding more
distributions to the ecosystem just makes it that much worse for them.

From contrebasse at  Tue Apr 19 07:31:51 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 11:31:51 +0000 (UTC)
Subject: [Python-ideas] Anonymous namedtuples
References: <>
Message-ID: <>

Chris Angelico <rosuav at ...> writes:

> On Tue, Apr 19, 2016 at 7:49 PM, Joseph Martinot-Lagarde
> <contrebasse at ...> wrote:
> > I'd be happy to make a factory function for my personal use, but the order
> > of kwargs is not respected. To have an elegant syntax, it has to be a
> > construct of the language.
> There have been proposals to have kwargs retain some order (either by
> having it actually be an OrderedDict, or by changing the native dict
> type to retain order under fairly restricted circumstances). With
> that, you could craft the factory function easily.

I saw some proposals but forgot about them, thanks for the reminder.

I'd prefer my proposal for two reasons:
- the syntax is nicer than a factory function
- it wouldn't be possible to add keyword arguments to __getitem__ without
breaking compatibility

Those issues are relatively minor, I'd be happy with a factory function.

> That said, though: I don't often feel the yearning for quick
> namedtuples. A function that today returns a namedtuple of 'x' and 'y'
> can't in the future grow a 'z' without breaking any callers that
> unpack "x, y = func()", so extensible return objects have to forego
> unpacking (eg using a dict or SimpleNamespace).
My intention wasn't to allow extensible return objects, just easier access
to the return values.

> There *is* a use-case
> for this, but it's fairly narrow - you have to have enough names that
> you don't want to just unpack them everywhere, yet still have an
> order, AND the set of names has to be constant.
It's true that python already has some commodities in place. When I first
heard about namedtuples I was pretty excited, but when I tried in practice
they are not that easy to use, with lots of duplications (the name, and
possibly the field names) and a global declaration.

For all my use cases (not only function returns) I would completely replace
namedtuple by anonymous namedtuple.

From mistersheik at  Tue Apr 19 07:35:40 2016
From: mistersheik at (Neil Girdhar)
Date: Tue, 19 Apr 2016 04:35:40 -0700 (PDT)
Subject: [Python-ideas] Changing the meaning of bool.__invert__
In-Reply-To: <>
References: <>
Message-ID: <>

I'm +1 on this change because it makes sense as a user.

Note how numpy deals with invert and unsigned integers:

In [2]: a = np.uint8(10)

In [3]: ~a
Out[3]: 245

The result of invert staying within the same type makes sense to me.
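
For comparison, a sketch of what plain Python does today, plus the
8-bit masking that reproduces the uint8 result above without numpy.
The variable names are illustrative only, and the warning filter is
there on the assumption that some newer Python versions warn about
inverting a bool.

```python
import warnings

flag_true = True
flag_false = False

with warnings.catch_warnings():
    # Silence any deprecation notice so the sketch runs quietly.
    warnings.simplefilter("ignore", DeprecationWarning)
    inverted_true = ~flag_true    # bool subclasses int, so this is ~1
    inverted_false = ~flag_false  # and this is ~0

logical_not = not flag_true  # stays within bool

# Emulating an unsigned 8-bit invert by masking to the low 8 bits.
uint8_style = ~10 & 0xFF
```

Here inverted_true is the int -2 rather than False, which is the
behaviour the proposal wants to change.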

(Also, as an idealist, I believe that decoupling int and bool might one day 
many many years from now bring about the ideal of bool not subclassing int.)



On Saturday, April 9, 2016 at 12:25:57 PM UTC-4, Guido van Rossum wrote:
> Let me pronounce something here. This change is not worth the amount 
> of effort and pain a deprecation would cause everyone. Either we 
> change this quietly in 3.6 (adding it to What's New etc. of course) or 
> we don't do it at all. 
> -- 
> --Guido van Rossum ( 

From rosuav at  Tue Apr 19 07:49:19 2016
From: rosuav at (Chris Angelico)
Date: Tue, 19 Apr 2016 21:49:19 +1000
Subject: [Python-ideas] Fwd:  Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 9:31 PM, Joseph Martinot-Lagarde
<contrebasse at> wrote:
> Chris Angelico <rosuav at ...> writes:
>> On Tue, Apr 19, 2016 at 7:49 PM, Joseph Martinot-Lagarde
>> <contrebasse at ...> wrote:
>> > I'd be happy to make a factory function for my personal use, but the order
>> > of kwargs is not respected. To have an elegant syntax, it has to be a
>> > construct of the language.
>> There have been proposals to have kwargs retain some order (either by
>> having it actually be an OrderedDict, or by changing the native dict
>> type to retain order under fairly restricted circumstances). With
>> that, you could craft the factory function easily.
> I saw some proposals but forgot about them, thanks for the reminder.
> I'd prefer my proposal for two reasons:
> - the syntax is nicer than a factory function
> - it wouldn't be possible to add keyword arguments to __getitem__ without
> breaking compatibility
> Those issues are relatively minor, I'd be happy with a factory function.

Agreed, the syntax is a lot nicer - for this specific situation. Don't
forget, though, that every piece of new syntax carries with it the not
insignificant cost of complexity; and some things are better
represented by function calls than syntax (cf 'print'). With "from
types import SimpleNamespace as NS" at the top of your code, you can
use the keyword-arguments form of the constructor to be almost the
same as your proposal, without any new syntax. When a subsequent
maintainer looks at your code, s/he can quickly look up at the imports
to see what's going on, instead of having to learn another piece of
syntax.

> It's true that python already has some commodities in place. When I first
> heard about namedtuples I was pretty excited, but when I tried in practice
> they are not that easy to use, with lots of duplications (the name, and
> possibly the field names) and a global declaration.
> For all my use cases (not only function returns) I would completely replace
> namedtuple by anonymous namedtuple.

I have to agree. Fortunately it isn't hard to create a namedtuple factory:

from collections import namedtuple

def NS(fields):
    return namedtuple('anonymous', fields.split())

or possibly:

def NS(fields, *values):
    return namedtuple('anonymous', fields.split())(*values)

That cuts down the duplication some, but it's far from perfect.


From steve at  Tue Apr 19 08:06:28 2016
From: steve at (Steven D'Aprano)
Date: Tue, 19 Apr 2016 22:06:28 +1000
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 11:25:07AM +0100, Paul Moore wrote:

> Possibly the docs for namedtuple should refer the user to
> SimpleNamespace as an alternative if "being a tuple subclass" isn't an
> important requirement. Or possibly SimpleNamespace should be in the
> collections module (alongside namedtuple) rather than in the types
> module?


I'm repeatedly surprised that SimpleNamespace isn't in collections.

> The issue may simply be that SimpleNamespace isn't as discoverable as
> it should be?



From p.f.moore at  Tue Apr 19 08:26:31 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 19 Apr 2016 13:26:31 +0100
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On 19 April 2016 at 11:46, Chris Angelico <rosuav at> wrote:
>> The issue may simply be that SimpleNamespace isn't as discoverable as
>> it should be?
> That's entirely possible. For the situations where order is
> insignificant and unpacking is unnecessary, a SimpleNamespace would be
> perfect, if people knew about it. Probably not worth moving the actual
> class, but a quick little docs link might help.

From contrebasse at  Tue Apr 19 08:30:10 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 12:30:10 +0000 (UTC)
Subject: [Python-ideas] Fwd:  Anonymous namedtuples
References: <>
Message-ID: <>

Chris Angelico <rosuav at ...> writes:

> On Tue, Apr 19, 2016 at 9:31 PM, Joseph Martinot-Lagarde
> <contrebasse at ...> wrote:
> > Chris Angelico <rosuav <at> ...> writes:
> >
> >>
> >> On Tue, Apr 19, 2016 at 7:49 PM, Joseph Martinot-Lagarde
> >> <contrebasse <at> ...> wrote:
> >> > I'd be happy to make a factory function for my personal use, but the order
> >> > of kwargs is not respected. To have an elegant syntax, it has to be a
> >> > construct of the language.
> >>
> >> There have been proposals to have kwargs retain some order (either by
> >> having it actually be an OrderedDict, or by changing the native dict
> >> type to retain order under fairly restricted circumstances). With
> >> that, you could craft the factory function easily.
> >
> > I saw some proposals but forgot about them, thanks for the reminder.
> >
> > I'd prefer my proposal for two reasons:
> > - the syntax is nicer than a factory function
> > - it wouldn't be possible to add keyword arguments to __getitem__ without
> > breaking compatibility
> >
> > Those issues are relatively minor, I'd be happy with a factory function.
> Agreed, the syntax is a lot nicer - for this specific situation. Don't
> forget, though, that every piece of new syntax carries with it the not
> insignificant cost of complexity; and some things are better
> represented by function calls than syntax (cf 'print'). With "from
> types import SimpleNamespace as NS" at the top of your code, you can
> use the keyword-arguments form of the constructor to be almost the
> same as your proposal, without any new syntax. When a subsequent
> maintainer looks at your code, s/he can quickly look up at the imports
> to see what's going on, instead of having to learn another piece of
> syntax.

I agree on the principle, but this syntax looks less complex to me
compared to the actual namedtuple where you have to use a factory to create
a class to be able to use a namedtuple. And if you use a namedtuple as a
return value, the definition of the namedtuple class will typically be
relatively far away from the actual use of it.

SimpleNamespace is nice but would break compatibility for return values.

> I have to agree. Fortunately it isn't hard to create a namedtuple factory:
As you said before, it would be if the order of the keyword arguments was

> def NS(fields, *values):
>     return namedtuple('anonymous', fields.split())(*values)
> That cuts down the duplication some, but it's far from perfect.

It cuts down the duplication at the cost of readability.

There is also the Matlab way which looks a bit better:

    from collections import namedtuple

    def NS(*args):
        fields = args[::2]
        values = args[1::2]
        return namedtuple('anonymous', fields)(*values)

    print(NS("x", 12, "y", 15))

But readability counts, and it's very easy to introduce bugs as adding or
removing arguments can impact the whole chain. Initially coming from Matlab,
the greatest interest I found in python was named arguments for functions! :)

From contrebasse at  Tue Apr 19 08:34:37 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 12:34:37 +0000 (UTC)
Subject: [Python-ideas] Anonymous namedtuples
References: <>
Message-ID: <>

> I'm repeatedly surprised that SimpleNamespace isn't in collections.

+1 !

But I still think that my proposal has some value outside of SimpleNamespace. :)

From p.f.moore at  Tue Apr 19 08:47:09 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 19 Apr 2016 13:47:09 +0100
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On 19 April 2016 at 13:34, Joseph Martinot-Lagarde
<contrebasse at> wrote:
> But I still think that my proposal has some value outside of SimpleNamespace. :)

The biggest issue is likely to be that the added value doesn't justify
the cost of new syntax, and a language change. On the other hand,
functions preserving the order of keyword arguments (which would allow
you to write a cleaner factory function) is probably a change with
wider value, as well as being less intrusive (it's a semantic change,
but there's no change in syntax) and so may be more likely to get
through.

A lot of the difficulty with assessing proposals like this is
balancing the costs and benefits, particularly as the costs typically
affect far more people, most of whom don't participate in these lists,
so we have to take a cautious ("assume the worst") view.

Personally, I've not used namedtuple enough to feel that it warrants
dedicated syntax. OTOH, I have wished for something like
SimpleNamespace lots of times, so making that more discoverable would
have been a huge win for me!


From contrebasse at  Tue Apr 19 08:49:22 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 12:49:22 +0000 (UTC)
Subject: [Python-ideas] Anonymous namedtuples
References: <>
Message-ID: <>

> This way they could be used in __getitem__ as proposed in PEP 472[0]

One of the links in this PEP is where
the writer raises some concerns about named tuples:

- They pose problems with pickle because namedtuples need subclasses to
work. In my proposal it shouldn't be a problem anymore since there would be
one class.
- namedtuple() is a factory function requiring a 2-step initialization, a
name, and a subclass. Again it disappears in my proposal.

I'm not sure about the database problem since I don't use them.

From steve at  Tue Apr 19 08:59:58 2016
From: steve at (Steven D'Aprano)
Date: Tue, 19 Apr 2016 22:59:58 +1000
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 09:49:44AM +0000, Joseph Martinot-Lagarde wrote:
> Hi, list !
> namedtuples are really great. I would like to use them even more, for
> example for functions that return multiple arguments. The problem is that
> namedtuples have to be "declared" beforehand, so it would be quite tedious
> to declare a namedtuple by function, that's why I very rarely do it.

You aren't forced to declare the type ahead of time, you can just use it 
as part of an expression:

py> from collections import namedtuple
py> my_tuple = namedtuple("_", "x y z")(1, 2, 3)
py> print(my_tuple)
_(x=1, y=2, z=3)

However, there is a gotcha if you do this: each time you create an 
anonymous namedtuple, you create a new, distinct, class:

py> a = namedtuple("_", "x y")(1, 2)
py> b = namedtuple("_", "x y")(1, 2)
py> type(a) == type(b)
False

which could be both surprising and expensive.

One possible solution: keep your own cache of classes:

def nt(name, fields, _cache={}):
    cls = _cache.get((name, fields), None)
    if cls is None:
        cls = _cache[(name, fields)] = namedtuple(name, fields)
    return cls
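
A sketch of the same idea using functools.lru_cache in place of a
hand-written dict. The name nt_class is made up, and the field
specification is assumed to be a string (a list would not be hashable
and could not be used as a cache key).

```python
from collections import namedtuple
from functools import lru_cache


@lru_cache(maxsize=None)
def nt_class(name, fields):
    """Return one shared namedtuple class per (name, fields) pair."""
    return namedtuple(name, fields)


# Without a cache: equal values, but two separate classes.
a = namedtuple("_", "x y")(1, 2)
b = namedtuple("_", "x y")(1, 2)
distinct = type(a) is not type(b)

# With the cache: the class is created once and reused.
c = nt_class("_", "x y")(1, 2)
d = nt_class("_", "x y")(1, 2)
shared = type(c) is type(d)
```

Note that a == b still holds, since namedtuples compare as plain
tuples; only the classes differ.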

> Another thing I don't like about namedtuples is the duplication of the name.
> Typical declarations look like `Point = namedtuple('Point', ['x', 'y'])`,
> where `Point` is repeated two times. I'll go one step further and say that
> the name is useless most of the time, so let's just get rid of it.

The name is not useless. It is very useful for string representations, 
debugging, introspection, and generally having a clue what the object 
represents. How else do you know what kind of object you are dealing
with?

I agree that it is a little sad that we have to repeat the name twice, 
but strictly speaking, you don't even need to do that. For example:

class Point(namedtuple("AbstractPoint", "x y z")):
    def method(self):
        ...

In my opinion, avoiding having to repeat the name twice is a "nice to 
have", not a "must have".

> Proposal
> ========
> So I thought about a new (ok, maybe it has been proposed before but I
> couldn't find it) syntax for anonymous namedtuples (I put the prints as
> comments, otherwise gmane is complaining about top-posting):
>     my_point = (x=12, y=16)
>     # (x=12, y=16)
>     my_point[0]
>     # 12

How do you know that the 0th item is field "x"?

Keyword arguments are not ordered. Even if you could somehow determine 
the order that they are given, you can't use that information since we 
should expect that:

assert (x=12, y=16) == (y=16, x=12)

will pass. 

(Why? Because they're keyword arguments, and the order of keyword 
arguments shouldn't matter.)

So we have a problem that the indexing order (the same order is used for 
iteration and tuple unpacking) is not specified anywhere. This is a 
*major* problem for an ordered type like tuple.

namedtuple avoids this problem by requiring the user to specify the 
field name order before creating an instance.

SimpleNamespace avoids this problem by not being ordered or iterable.

I suppose we could put the fields in sorted order. But that's going to 
make life difficult for uses where we would like some other order, e.g. 
to match common conventions. Consider a 3D point in spherical
coordinates:

pt = (r=3, theta=0.5, phi=0.25)

In sorted order, pt == (0.25, 3, 0.5) which goes against the standard
mathematical definition.
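
To make the sorted-order option concrete, here is a sketch of a factory
that sorts the field names itself. The name sorted_nt is hypothetical;
nothing like it exists in the standard library.

```python
from collections import namedtuple


def sorted_nt(**fields):
    """Build a namedtuple whose fields are in alphabetical order."""
    names = sorted(fields)  # order no longer depends on the call site
    cls = namedtuple("anonymous", names)
    return cls(**fields)


pt = sorted_nt(r=3, theta=0.5, phi=0.25)
field_order = pt._fields
```

The call lists r, theta, phi, but field_order comes out as phi, r,
theta, so pt[0] is the phi value. The result is deterministic, yet it
is not the order the caller wrote or the conventional one.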

Namedtuples pre-define what fields are allowed, what they are 
called, and what order they appear in. Anonymous namedtuples as you 
describe them don't do any of these things.

Consider that since they're tuples, we should be able to provide the 
items as positional arguments to the construct, just like regular 

pt = (x=12, y=16)
type(pt)(1, 2)

This works fine with namedtuple, but how will it work with your 
proposal? And what happens if we do this?

type(pt)(1, 2, 3)

And of course, a naive implementation would suffer from the same issue 
as mentioned above, where every instance is a singleton of a distinct 
class. Python isn't a language where we care too much about memory use, 
but surely we don't want to be quite this profligate:

py> a = namedtuple("Point", "x y z")(1, 2, 3)
py> sys.getsizeof(a)  # reasonably small
py> sys.getsizeof(type(a))  # not so small

Obviously we don't want every instance to belong to a distinct class, so 
we need some sort of cache. SimpleNamespace solves this problem by 
making all instances belong to the same class. That's another difference 
between namedtuple and what you seem to want.


From ncoghlan at  Tue Apr 19 09:12:16 2016
From: ncoghlan at (Nick Coghlan)
Date: Tue, 19 Apr 2016 23:12:16 +1000
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On 19 April 2016 at 14:52, Franklin? Lee <leewangzhong+python at>
wrote:

> On Mon, Apr 18, 2016 at 11:51 AM, Ethan Furman <ethan at> wrote:
> > Or even Nick's (that you snipped):
> >> Since the more the REPL does, the more opportunities there are for
> >> it to break when debugging, having the output hooks be as simple as
> >> possible is quite desirable.
> I read this as, "It'll be more complicated, so it MIGHT be buggy."
> This seems solvable via design, and doesn't speak to whether it's
> good, in principle, to limit output per input. If you still think it's
> a good reason, then I probably didn't understand it correctly.

No, that's not what it means. It relates to one of the ways experienced
developers are able to debug code: by running it in their heads, and
comparing the result their brain computed with what actually happened. When
there's a discrepancy, either their expectation is wrong or the code is
wrong, and they need to investigate further to figure out which it is. When
folks say "Python fits my brain" this is often what they're talking about.

Implicit side effects from hidden code break that mental equivalence - it's
why effective uses of metaclasses, monkeypatching, and other techniques for
deliberately introducing changed implicit behaviour often also involve
introducing some kind of more local signal to help convey what is going on
(such as a naming convention, or ensuring the altered behaviour is used
consistently across the entire project).

The default REPL behaviour is appropriate for this "somewhat experienced
Pythonista tinkering with code to see how it behaves" use case - keeping
the results very close to what they would be if you typed the same line of
code into a text file and ran it that way. It's not necessarily the best
way to *learn* those equivalences, but that's also not what it's designed
for.

IPython's REPL is tailored for a different audience - their primary
audience is research scientists, and they want to be able to better eyeball
calculation results, rather than lower level Python instance
representations. As a result, it's much cleverer than the default REPL, but
it's also aiming to tap into people's intuitions about the shape of their
data and the expected outcomes of the operations they're performing on it,
rather than their ability to mentally run Python code specifically.

A REPL designed specifically for folks learning Python, like the one in the
Mu editor, or the direction IDLE seems to be going, would likely be better
off choosing different default settings for sys.displayhook and
sys.excepthook, but those changes would be best selected based on direct
observations of classrooms and workshops, and noting where folks get
confused or intimidated by the default settings. For environments other
than IDLE, they can also be iterated on at a much higher rate than we make
CPython releases.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From steve at  Tue Apr 19 09:24:34 2016
From: steve at (Steven D'Aprano)
Date: Tue, 19 Apr 2016 23:24:34 +1000
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 01:47:09PM +0100, Paul Moore wrote:
> On 19 April 2016 at 13:34, Joseph Martinot-Lagarde
> <contrebasse at> wrote:
> > But I still think that my proposal has some value outside of SimpleNamespace. :)
> The biggest issue is likely to be that the added value doesn't justify
> the cost of new syntax, and a language change. 

I think there are more problems with the proposal than just "not useful 
enough". See my previous post.

> On the other hand,
> functions preserving the order of keyword arguments (which would allow
> you to write a cleaner factory function) is probably a change with
> wider value, as well as being less intrusive (it's a semantic change,
> but there's no change in syntax) and so may be more likely to get
> through.

It's a pretty big semantic change though. As I point out in my previous 
post, at the moment we can be sure that changing the order of keyword 
arguments does not change anything: spam(a=1, b=2) and spam(b=2, a=1) 
will always have the same semantics. Making keyword arguments ordered 
will break that assumption, and even if it doesn't break any existing 
code, it will make it harder to reason about future code. No more can 
you trust that the order of keyword arguments will have no effect on the 
result of calling a function.

I don't see the current behaviour of keyword arguments as a limitation, 
I see it as a feature. I don't need to care about the order of keyword 
arguments, only their names.
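
A minimal sketch of the guarantee being described, using a throwaway
`spam` function invented for the example:

```python
def spam(a, b):
    return a - b


# Both calls bind a=1 and b=2, whatever order the keywords are written in.
assert spam(a=1, b=2) == spam(b=2, a=1) == -1
```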


From contrebasse at  Tue Apr 19 09:48:08 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 13:48:08 +0000 (UTC)
Subject: [Python-ideas] Anonymous namedtuples
References: <>
Message-ID: <>

Steven D'Aprano <steve at ...> writes:

> py> from collections import namedtuple
> py> my_tuple = namedtuple("_", "x y z")(1, 2, 3)
> py> print(my_tuple)
> _(x=1, y=2, z=3)

This looks nice with a short number of short attribute names, but in a real
case separating the values from their corresponding field hurts readability.

    my_tuple = namedtuple("_", "site, gisement, range_nominal, range_bista, "
                          "is_monostatic")(12.45, 45.0, 3000.0, 2998.5357, True)
    my_tuple = (site=12.45, gisement=45.0, range_nominal=3000.0,
                range_bista=2998.5357, is_monostatic=True)

As a side note, if the empty string would be allowed for the class name the
printed value would look better. The fact that it's not possible right now
looks like an implementation "detail".

> > Another thing I don't like about namedtuples is the duplication of the name.
> > Typical declarations look like `Point = namedtuple('Point', ['x', 'y'])`,
> > where `Point` is repeated two times. I'll go one step further and say that
> > the name is useless most of the time, so let's just get rid of it.
> The name is not useless. It is very useful for string representations, 
> debugging, introspection, and generally having a clue what the object 
> represents. How else do you know what kind of object you are dealing 
> with?
> In my opinion, avoiding having to repeat the name twice is a "nice to 
> have", not a "must have".

Of course the name is not completely useless, but you can often know what a
namedtuple represents in your application or your script from the field
names only (or the variable name). SimpleNamespace lives very well without a
name, for example.

> >     my_point = (x=12, y=16)
> >     # (x=12, y=16)
> >     my_point[0]
> >     # 12
> How do you know that the 0th item is field "x"?
> Keyword arguments are not ordered.

In my proposal these are not keyword arguments, they are a syntax construct
to build an anonymous namedtuple. That's why I can't just create a factory
function. The order is determined by the construction, and can be found by
introspection (simply printing the object will do).

I suppose that as a developer it's a shift from how keywords behave now,
but as a user a namedtuple is still a tuple, and it keeps the ordering.
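
For comparison, this is how an ordinary namedtuple already answers the
ordering question today. The `Point` class is just an example name; the
proposed anonymous form has no equivalent class declaration.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])

# Field order comes from the class definition, not from the call.
p = Point(y=16, x=12)
assert p._fields == ("x", "y")
assert p[0] == 12 and p.x == 12

# Unpacking follows the same order.
x, y = p
assert (x, y) == (12, 16)
```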

> Consider that since they're tuples, we should be able to provide the 
> items as positional arguments to the construct, just like regular 
> namedtuples:
> pt = (x=12, y=16)
> type(pt)(1, 2)

They inherit from tuple; that doesn't mean that they behave in all cases like
tuples. In that case I guess that this form would not be allowed. I don't
know how bad this looks for a python dev though!

> Obviously we don't want every instance to belong to a distinct class, so 
> we need some sort of cache. SimpleNamespace solves this problem by 
> making all instances belong to the same class. That's another difference 
> between namedtuple and what you seem to want.

Yes, that's another difference, I don't want to replicate namedtuples
exactly (otherwise I'd just use a namedtuple). I don't think that you need a
cache, more like a frozendict per instance which binds the fields to the
corresponding index. Maybe that's what you call a cache?

From your comments I get that a problem is that the only way to create an
anonymous namedtuple would be via the syntax construct, it's not possible to
use the class constructor directly. I would argue that it's not a problem
because if you need to do that you can just use a standard namedtuple, that's
exactly what it's here for. Anonymous namedtuples are for quick, easy, and
readable use.
Maybe that means that the added syntax complexity is not worth it.

From mike at  Tue Apr 19 10:03:35 2016
From: mike at (Michael Selik)
Date: Tue, 19 Apr 2016 14:03:35 +0000
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 5:36 AM Franklin? Lee <leewangzhong+python at>
wrote:

> On Apr 19, 2016 4:09 AM, "Paul Moore" <p.f.moore at> wrote:
> > Basic users should probably be using a tool like IDLE, which has a bit
> > more support for beginners than the raw REPL.
> My college had CS students SSH into the department's Linux server to
> compile and run their code, and many teachers don't believe that students
> should start with fancy IDE features like, er, syntax highlighting.
That's probably because your professors thought you were more advanced than
other new Pythonistas, because you were CS students. If I were in their
shoes, I might choose a different approach depending on the level of the

> But that doesn't answer my question: would the proposed change hurt your

It might. Would it affect doctests? Would it make indirect infinite
recursion more difficult to trace? Would it make me remember yet another
command line option or REPL option to turn on complete reprs? Would it
force me to explain yet another config setting to new programmers?

I think a beginner understands when they've printed something too big. I
see this happen frequently. They laugh, shake their heads, and retype
whatever they need to.

If they're using IDLE, they say, "OMG I crashed it!" then they close the
window or restart IDLE. I'd say it's more a problem in IDLE than in the
default REPL.

From contrebasse at  Tue Apr 19 10:09:45 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 14:09:45 +0000 (UTC)
Subject: [Python-ideas] Anonymous namedtuples
References: <>
Message-ID: <>

> That's entirely possible. For the situations where order is
> insignificant and unpacking is unnecessary, a SimpleNamespace would be
> perfect, if people knew about it. Probably not worth moving the actual
> class, but a quick little docs link might help.
How big a problem would it be to actually move the class to collections ?
For example at the top of the documentation there is a table with all the
available container classes, having SimpleNamespace would be far more
discoverable than a paragraph in namedtuple.

The class could still be importable from types.SimpleNamespace for backward
compatibility.

Sorry for my noob questions...

From p.f.moore at  Tue Apr 19 10:26:08 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 19 Apr 2016 15:26:08 +0100
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On 19 April 2016 at 15:09, Joseph Martinot-Lagarde
<contrebasse at> wrote:
>> That's entirely possible. For the situations where order is
>> insignificant and unpacking is unnecessary, a SimpleNamespace would be
>> perfect, if people knew about it. Probably not worth moving the actual
>> class, but a quick little docs link might help.
> How big a problem would it be to actually move the class to collections ?
> For example at the top of the documentation there is a table with all the
> available container classes, having SimpleNamespace would be far more
> discoverable than a paragraph in namedtuple.

Not incredibly hard, in theory. However...

Move the definition:

    SimpleNamespace = type(sys.implementation)

to collections (but note that this isn't *really* the definition, the
actual definition is in C, in the sys module, and is unnamed - to that
extent types, which is for exposing names for types that otherwise
aren't given built in names, is actually the right place for this
definition!) Update the documentation. Update the tests
(test/ - this probably involves *moving* those tests to Add a backward compatibility name in types. Add
some tests for that backward compatibility name.

Work out what to do about pickle compatibility:

>>> from pickle import dumps
>>> from types import SimpleNamespace
>>> dumps(SimpleNamespace(a=1))

Note that the type name (types.SimpleNamespace) is embedded in the
pickle. How do we avoid breaking pickles?
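
A small sketch that makes the embedding visible without relying on the
exact bytes, which vary with the pickle protocol:

```python
import pickle
from types import SimpleNamespace

data = pickle.dumps(SimpleNamespace(a=1))

# The pickle refers to the class by module and name, so both strings
# appear in the byte stream.
assert b"types" in data
assert b"SimpleNamespace" in data

# Loading works only while types.SimpleNamespace stays importable.
restored = pickle.loads(data)
assert restored == SimpleNamespace(a=1)
```

Any pickle written this way would keep looking for the class under
`types`, which is why the old name could not simply be dropped.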

That's probably far from everything - this was 5 minutes' quick
investigation, and I'm not very experienced at doing this type of
refactoring. I'm pretty sure I've missed some things.

> The class could still be importable from types.SimpleNamespace for backward
> compatibility.

Where it's defined matters in the pickle case - so it's not quite that simple.

> Sorry for my noob questions...

Not at all. It's precisely because it "seems simple" that the feedback
people get seems negative at times - I hope the above gives you a
better idea of what might be involved in such a seemingly simple
change. So thanks for asking, and giving me the opportunity to clarify

In summary: It's not likely to be worth the effort, even though it
looks simple at first glance.

From random832 at  Tue Apr 19 10:26:41 2016
From: random832 at (Random832)
Date: Tue, 19 Apr 2016 10:26:41 -0400
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016, at 06:25, Paul Moore wrote:
> Possibly the docs for namedtuple should refer the user to
> SimpleNamespace as an alternative if "being a tuple subclass" isn't an
> important requirement.

Just my two cents - for the use cases that I would find this useful for
(I usually end up just using tuple), a hashable "FrozenSimpleNamespace"
would be nice. Or even one that has immutable key fields that are used
as the basis for the hash but that you can hang other values off of
(SemiFrozen?)

A way to simply initialize a SimpleNamespace-like class with a passed-in
mapping, and have it inherit the equality/hash/etc from that (so
FrozenSimpleNamespace could use FrozenDict, you could do one with
OrderedDict, one with chainmap or with an easy-to-write custom dict that
provides the "immutable key plus mutable other" behavior), might provide
for all of these in a relatively easy way. Maybe have
SimpleNamespace([dict], /, **kwargs) that just throws the dict argument
(if present) into __dict__.
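
A minimal sketch of the frozen variant, under some assumptions: the class
name `FrozenNamespace` and its private `_fields` attribute are invented
here, a plain dict copy stands in for a real FrozenDict, and the
positional-only `/` marker needs Python 3.8 or later.

```python
class FrozenNamespace:
    """Hypothetical hashable, read-only namespace backed by a private dict."""

    __slots__ = ("_fields",)

    def __init__(self, mapping=(), /, **kwargs):
        # Bypass our own __setattr__, which rejects all assignments.
        object.__setattr__(self, "_fields", dict(mapping, **kwargs))

    def __getattr__(self, name):
        # Called only when normal lookup fails; guard against recursion
        # if _fields itself has not been set yet.
        if name == "_fields":
            raise AttributeError(name)
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        raise AttributeError("FrozenNamespace is read-only")

    def __eq__(self, other):
        if not isinstance(other, FrozenNamespace):
            return NotImplemented
        return self._fields == other._fields

    def __hash__(self):
        # Order-insensitive, like dict equality; values must be hashable.
        return hash(frozenset(self._fields.items()))


a = FrozenNamespace({"x": 1}, y=2)
b = FrozenNamespace(y=2, x=1)
assert a == b and hash(a) == hash(b)
```

Equality and hashing are delegated to the stored mapping, which is the
part of the idea that would let other dict types change the behaviour.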

If there were a FrozenOrderedDict you could even implement a very
namedtuple-like class that way.

From p.f.moore at  Tue Apr 19 10:39:48 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 19 Apr 2016 15:39:48 +0100
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On 19 April 2016 at 15:26, Random832 <random832 at> wrote:
> On Tue, Apr 19, 2016, at 06:25, Paul Moore wrote:
>> Possibly the docs for namedtuple should refer the user to
>> SimpleNamespace as an alternative if "being a tuple subclass" isn't an
>> important requirement.
> Just my two cents - for the use cases that I would find this useful for
> (I usually end up just using tuple), a hashable "FrozenSimpleNamespace"
> would be nice. Or even one that has immutable key fields that are used
> as the basis for the hash but that you can hang other values off of
> (SemiFrozen?)

Having done some digging just now... SimpleNamespace is basically just
exposing for end users, the type that is used to build the
sys.implementation object. The definition is *literally* nothing more
than

    SimpleNamespace = type(sys.implementation)

Variations such as you suggest may indeed be useful, but the existing
implementation was essentially a free benefit of "we use this one
ourselves in the core". Going beyond that, any additional types
probably need to justify themselves a bit better - and at that point
the usual "publish on PyPI as a standalone module and see how useful
they turn out to be" suggestion probably applies.
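
The identity mentioned above can be checked directly. This sketch assumes
CPython; whether other implementations define the name the same way is
not something this thread establishes.

```python
import sys
import types

# On CPython the public name is the very type of sys.implementation.
assert types.SimpleNamespace is type(sys.implementation)

# It behaves as a plain attribute bag.
ns = types.SimpleNamespace(x=12, y=16)
ns.z = 20
assert vars(ns) == {"x": 12, "y": 16, "z": 20}
```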


From contrebasse at  Tue Apr 19 10:51:35 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 14:51:35 +0000 (UTC)
Subject: [Python-ideas] Anonymous namedtuples
References: <>
Message-ID: <>

> Going beyond that, any additional types
> probably need to justify themselves a bit better - and at that point
> the usual "publish on PyPI as a standalone module and see how useful
> they turn out to be" suggestion probably applies.

The core of my proposal depends on modifications in the language itself, it
needs either adding a new syntax or keeping the order of keyword
arguments. Right now I don't see how it could work as a standalone module,
but I'll think about it.

From guido at  Tue Apr 19 11:06:30 2016
From: guido at (Guido van Rossum)
Date: Tue, 19 Apr 2016 08:06:30 -0700
Subject: [Python-ideas] Fwd: Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 4:49 AM, Chris Angelico <rosuav at> wrote:

> Fortunately it isn't hard to create a namedtuple factory:
> def NS(fields):
>     return namedtuple('anonymous', fields.split())
> or possibly:
> def NS(fields, *values):
>     return namedtuple('anonymous', fields.split())(*values)
> That cuts down the duplication some, but it's far from perfect.

Please don't recommend or spread this idiom!

Every call to namedtuple() creates a new class, which is a very expensive
operation. On my machine the simplest namedtuple call takes around 350 usec.
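
A rough way to see the difference on your own machine. The numbers depend
on the Python version and hardware, so none are claimed here; `make_class`
is a made-up helper standing in for the factory idiom.

```python
import timeit
from collections import namedtuple

Point = namedtuple("Point", "x y")  # class created once, up front


def make_class():
    # What the factory idiom does on every call: build a brand-new class.
    return namedtuple("anonymous", "x y")


# Each call really does produce a distinct class object.
assert make_class() is not make_class()

n = 200
class_time = timeit.timeit(make_class, number=n) / n
instance_time = timeit.timeit(lambda: Point(1, 2), number=n) / n
print("new class:    %.1f usec per call" % (class_time * 1e6))
print("new instance: %.1f usec per call" % (instance_time * 1e6))
```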

--Guido van Rossum (

From guido at  Tue Apr 19 11:13:53 2016
From: guido at (Guido van Rossum)
Date: Tue, 19 Apr 2016 08:13:53 -0700
Subject: [Python-ideas] Fwd: Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

A general warning about this (x=12, y=16) idea: It's not so different from
using dicts as cheap structs, constructing objects on the fly using either
{'x': 12, 'y': 16} or dict(x=12, y=16). The problem with all of these is
that if you use this idiom a lot, you're invariably going to run into cases
where a field is missing, and you're spending a lot of time tracking down
where the object was created incorrectly. Using a namedtuple (or any other
construct that asks you to pre-declare the type used) the incorrect
construction will cause a failure right at the point where it's being
incorrectly constructed, making it much simpler to diagnose.
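
A small sketch of the two failure modes, with `Point` as an example type:

```python
from collections import namedtuple

Point = namedtuple("Point", "x y")

# Pre-declared type: the mistake is reported where the object is built.
try:
    Point(x=12)
except TypeError as exc:
    construction_error = exc
else:
    raise AssertionError("expected a TypeError")

# Ad hoc dict: construction succeeds, and the failure only surfaces
# later, wherever the missing field is first used.
pt = dict(x=12)
try:
    pt["y"]
except KeyError as exc:
    late_error = exc
else:
    raise AssertionError("expected a KeyError")
```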

In fully typed languages like Haskell or Java this wouldn't be a problem,
but in languages like Python or Perl it is a real concern.

--Guido van Rossum (

From k7hoven at  Tue Apr 19 11:15:01 2016
From: k7hoven at (Koos Zevenhoven)
Date: Tue, 19 Apr 2016 18:15:01 +0300
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 5:39 PM, Paul Moore <p.f.moore at> wrote:
> Having done some digging just now... SimpleNamespace is basically just
> exposing for end users, the type that is used to build the
> sys.implementation object. The definition is *literally* nothing more
> than
>     SimpleNamespace = type(sys.implementation)

So, if this were added to 'collections' too, the documentation should
be moved there, and the 'types' documentation could mention this
somehow and recommend using collections.

Heck, if it gets another alias, why not give it another name too:
collections.ComplicatedNamespace ;-)

My main objection to putting SimpleNamespace in collections is that we
might want something even better in collections, perhaps


From p.f.moore at  Tue Apr 19 11:30:26 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 19 Apr 2016 16:30:26 +0100
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On 19 April 2016 at 15:51, Joseph Martinot-Lagarde
<contrebasse at> wrote:
>> Going beyond that, any additional types
>> probably need to justify themselves a bit better - and at that point
>> the usual "publish on PyPI as a standalone module and see how useful
>> they turn out to be" suggestion probably applies.
> The core of my proposal depends on modifications in the language itself, it
> needs either adding a new syntax or keeping the order of keyword
> arguments. Right now I don't see how it could work as a standalone module,
> but I'll think about it.

I was referring to Random832's suggested variants on SimpleNamespace.
As you say, your proposal is basically a language change and you're
completely right to assume that needs to be implemented in the core.

Sorry if I wasn't clear enough.

From steve at  Tue Apr 19 12:18:58 2016
From: steve at (Steven D'Aprano)
Date: Wed, 20 Apr 2016 02:18:58 +1000
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 11:12:16PM +1000, Nick Coghlan wrote:

> The default REPL behaviour is appropriate for this "somewhat experienced
> Pythonista tinkering with code to see how it behaves" use case - keeping
> the results very close to what they would be if you typed the same line of
> code into a text file and ran it that way. It's not necessarily the best
> way to *learn* those equivalences, but that's also not what it's designed
> for.

I mostly agree with what you say, but I would like to see one change to 
the default sys.excepthook: large numbers of *identical* traceback lines 
(as you often get with recursion errors) should be collapsed. For
example:

py> sys.setrecursionlimit(20)
py> fact(30)  # obvious recursive factorial
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 3, in fact
  File "<stdin>", line 2, in fact
RuntimeError: maximum recursion depth exceeded in comparison

Try as I might, I just don't see the value of manually counting all 
those 'File "<stdin>", line 3, in fact' lines to find out where the 
recursive call failed :-)

I think that it would be better if identical lines were collapsed, 
something like this:

import sys
import traceback
from itertools import groupby
TEMPLATE = "  [...repeat previous line %d times...]\n"

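def collapse(seq, minimum, template=TEMPLATE):
    # Yield short runs of identical lines unchanged; replace long runs
    # with one copy of the line plus a "repeat" marker.
    for key, group in groupby(seq):
        group = list(group)
        if len(group) < minimum:
            for item in group:
                yield item
        else:
            yield key
            yield template % (len(group)-1)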

def shortertb(*args):
    lines = traceback.format_exception(*args)
    sys.stderr.write(''.join(collapse(lines, 3)))

sys.excepthook = shortertb

which then gives tracebacks like this:

py> sys.setrecursionlimit(200)
py> a = fact(10000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in fact
  [...repeat previous line 197 times...]
  File "<stdin>", line 2, in fact
RuntimeError: maximum recursion depth exceeded in comparison


From guido at  Tue Apr 19 12:20:48 2016
From: guido at (Guido van Rossum)
Date: Tue, 19 Apr 2016 09:20:48 -0700
Subject: [Python-ideas] Type hinting for path-related functions
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 3:41 AM, Koos Zevenhoven <k7hoven at> wrote:

> On Tue, Apr 19, 2016 at 4:27 AM, Guido van Rossum <guido at>
> wrote:
> > Your pathstring seems to be the same as the predefined (in, and
> > PEP 484) AnyStr.
> Oh, there too! :) I thought I will need a TypeVar, so I turned to
> help(typing.TypeVar) to look up how to do that, and there it was,
> right in front of me, just with a different name 'A':
> A = TypeVar('A', str, bytes)
> Anyway, it might make sense to consider defining 'pathstring' (or
> 'PathStr' for consistency?), even if it would be the same as AnyStr.
> Then, hypothetically, if at any point in the far future, bytes paths
> would be deprecated, it could be considered to make PathStr just str.
> After all, we don't want just Any String, we want something that
> represents a path (in a documentation sense).

Unfortunately, until we implement something like "NewType" ( the type checkers won't check
whether you're actually using the right thing, so while the separate name
would add a bit of documentation, I doubt that you'll ever be able to
change the meaning of PathStr.
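
A sketch of both pieces, with caveats: it uses `typing.NewType` as it
exists in later versions of the typing module, and `PathStr` and
`basename` are illustrative names only, not an agreed API.

```python
import os.path
from typing import AnyStr, NewType


def basename(path: AnyStr) -> AnyStr:
    # AnyStr ties the return type to the argument: str in, str out;
    # bytes in, bytes out.
    return os.path.basename(path)


# Hypothetical distinct name for "a str that is a path".
PathStr = NewType("PathStr", str)

p = PathStr("/tmp/spam.txt")
assert type(p) is str  # no new class exists at run time
assert basename(p) == "spam.txt"
assert basename(b"/tmp/spam.txt") == b"spam.txt"
```

The distinction only exists for a static type checker; at run time the
value is an ordinary str.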

Also, I don't expect a future where bytes paths don't make sense, unless
Linux starts enforcing a normalized UTF-8 encoding in the kernel.

> > You are indeed making sense, except that for various reasons the stdlib is
> > not likely to adopt in-line signature annotations yet -- not even for new
> > code.
> >
> > However once there's agreement on os.fspath() it can be added to the stubs
> > in
> >
> I see, and I did have that impression already about the stdlib and
> type hints, probably based on some of your writings. My intention was
> to write these in the stub format, but apparently I need to look up
> the stub syntax once more.

Once there's a PEP, updating the stubs will be routine.

> > Is there going to be a PEP for os.fspath()? (I muted most of the discussions
> > so I'm not sure where it stands.)
> It has not seemed like a good idea to discuss this (too?), but now
> that you ask, I have been wondering how optimal it is to add this to
> the pathlib PEP. While the changes do affect pathlib (even the code of
> the module itself), this will affect ntpath, posixpath, os.scandir,
> os.[other stuff], DirEntry (tempted to say os.DirEntry, but that is
> not true), shutil.[stuff], (io.)open, and potentially all kinds of
> random places in the stdlib, such as fileinput, filecmp, zipfile,
> tarfile, tempfile (for the 'dir' keyword arguments), maybe even glob,
> and fnmatch, to name a few :).
> And now, if the FSPath[underlying_type] I just proposed ends up being
> added to typing (by whatever name), this will even affect

Personally I think it's better off as a separate PEP, unless it turns out
that it can be compressed to just the addition of a few paragraphs to the
original PEP 428.

--Guido van Rossum (

From steve at  Tue Apr 19 12:44:41 2016
From: steve at (Steven D'Aprano)
Date: Wed, 20 Apr 2016 02:44:41 +1000
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 10:06:28PM +1000, Steven D'Aprano wrote:
> On Tue, Apr 19, 2016 at 11:25:07AM +0100, Paul Moore wrote:
> > Possibly the docs for namedtuple should refer the user to
> > SimpleNamespace as an alternative if "being a tuple subclass" isn't an
> > important requirement. Or possibly SimpleNamespace should be in the
> > collections module (alongside namedtuple) rather than in the types
> > module?
> +1 
> I'm repeatedly surprised that SimpleNamespace isn't in collections.

I've thought about this some more, and I've decided that I don't think 
it should be in collections. It's not a collection.

Strangely, "collection" is not in the glossary:

(neither is "container") and there doesn't appear to be an ABC for what 
makes a collection, but the docs seem to suggest that "collections" are 
another word for "containers":


  8.3. collections -- Container datatypes

  Source code: Lib/collections/

  This module implements specialized container datatypes providing 
  alternatives to Python's general purpose built-in containers, dict, 
  list, set, and tuple.

which mostly matches the old (and I mean really old, going back to 
Python 1.5 days) informal usage of "collection" as "a string, list, 
tuple or dict", that is, a sequence or mapping. (These days, I'd add set 
to the list as well.)

SimpleNamespace is neither a sequence nor a mapping.

We do have a "Container" ABC:

and SimpleNamespace doesn't match that ABC either.
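
This can be checked with the ABCs in `collections.abc`; the namedtuple is
included only as a contrast:

```python
from collections import namedtuple
from collections.abc import Container, Iterable, Sized
from types import SimpleNamespace

ns = SimpleNamespace(x=12, y=16)
assert not isinstance(ns, Container)  # no __contains__
assert not isinstance(ns, Sized)      # no __len__
assert not isinstance(ns, Iterable)   # no __iter__

pt = namedtuple("Point", "x y")(12, 16)
assert isinstance(pt, Container) and 12 in pt
```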

So I think SimpleNamespace should stay where it is. I guess I'll just 
have to learn that it's a type, not a collection :-)


From jbvsmo at  Tue Apr 19 12:58:47 2016
From: jbvsmo at (João Bernardo)
Date: Tue, 19 Apr 2016 13:58:47 -0300
Subject: [Python-ideas] Fwd: Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 12:06 PM, Guido van Rossum <guido at> wrote:

> Every call to namedtuple() creates a new class, which is a very expensive
> operation. On my machine the simplest namedtuple call takes around 350 usec.

Isn't that because the code for namedtuple is using exec instead of a more
pythonic approach of metaclassing to make it "faster" but taking longer to
actually create the class?

Maybe instead of changing namedtuple (which seems to be a taboo), there
could be a new anonymous type. Something built-in, with its own syntax.

From guido at  Tue Apr 19 13:35:45 2016
From: guido at (Guido van Rossum)
Date: Tue, 19 Apr 2016 10:35:45 -0700
Subject: [Python-ideas] Fwd: Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 9:58 AM, João Bernardo <jbvsmo at> wrote:

> On Tue, Apr 19, 2016 at 12:06 PM, Guido van Rossum <guido at>
> wrote:
>> Every call to namedtuple() creates a new class, which is a very expensive
>> operation. On my machine the simplest namedtuple call takes around 350 usec.
> Isn't that because the code for namedtuple is using exec instead of a more
> pythonic approach of metaclassing to make it "faster" but taking longer to
> actually create the class?

That's part of it, but no matter how you do it, it's still creating a new
class object, you can't get around that. And class objects are just
expensive. Any idiom that ends up creating a new class object for each
instance that's created is a big anti-pattern.

> Maybe instead of changing namedtuple (which seems to be a taboo), there
> could be a new anonymous type. Something built-in, with its own syntax.

That's essentially what the (x=12, y=16) proposal is about, IIUC -- it
would just be a single new type, so (x=12, y=16).__class__ would be the
same class object as (a='', b=3.14).
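
To make the "single new type" reading concrete, here is a pure-Python
approximation. It is only a sketch: `Record` and its `_index` attribute
are invented names, the call syntax `Record(x=12, y=16)` stands in for the
proposed `(x=12, y=16)`, and it relies on keyword arguments keeping their
call order, which is only guaranteed from Python 3.6 on.

```python
class Record(tuple):
    """Hypothetical single class shared by every 'anonymous namedtuple'."""

    def __new__(cls, **fields):
        self = super().__new__(cls, fields.values())
        # Per-instance mapping from field name to position.
        self._index = {name: pos for pos, name in enumerate(fields)}
        return self

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        index = self.__dict__.get("_index", {})
        try:
            return self[index[name]]
        except KeyError:
            raise AttributeError(name) from None

    def __repr__(self):
        body = ", ".join(
            "%s=%r" % (name, self[pos]) for name, pos in self._index.items()
        )
        return "(%s)" % body


a = Record(x=12, y=16)
b = Record(a="", b=3.14)
assert type(a) is type(b)  # one class, different fields per instance
assert a.x == 12 and a[0] == 12
```

Because the field names live on the instance instead of the class, no new
class object is created per use, at the cost of an extra dict per object.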

But I have serious reservations about that idiom too.

--Guido van Rossum (

From random832 at  Tue Apr 19 13:36:47 2016
From: random832 at (Random832)
Date: Tue, 19 Apr 2016 13:36:47 -0400
Subject: [Python-ideas] Type hinting for path-related functions
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016, at 12:20, Guido van Rossum wrote:
> Also, I don't expect a future where bytes paths don't make sense, unless
> Linux starts enforcing a normalized UTF-8 encoding in the kernel.

Well, OSX does that now, but that's a whole other topic. Whether it is
useful to represent paths as the bytes type in Python code is orthogonal
to whether you can have paths that aren't valid strings in an encoding,
considering that surrogateescape lets you represent any sequence of
bytes as a str.
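
A short sketch of that round trip, using a file name that is not valid
UTF-8:

```python
raw = b"caf\xe9.txt"  # Latin-1 encoded name; not valid UTF-8

# Undecodable bytes become lone surrogates instead of raising an error.
text = raw.decode("utf-8", errors="surrogateescape")
assert text == "caf\udce9.txt"

# Encoding with the same handler restores the original bytes exactly.
assert text.encode("utf-8", errors="surrogateescape") == raw
```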

From k7hoven at  Tue Apr 19 13:40:06 2016
From: k7hoven at (Koos Zevenhoven)
Date: Tue, 19 Apr 2016 20:40:06 +0300
Subject: [Python-ideas] Type hinting for path-related functions
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 7:20 PM, Guido van Rossum <guido at> wrote:
> On Tue, Apr 19, 2016 at 3:41 AM, Koos Zevenhoven <k7hoven at> wrote:
>> On Tue, Apr 19, 2016 at 4:27 AM, Guido van Rossum <guido at>
>> wrote:
>> > Is there going to be a PEP for os.fspath()? (I muted most of the
>> > discussions
>> > so I'm not sure where it stands.)
>> It has not seemed like a good idea to discuss this (too?), but now
>> that you ask, I have been wondering how optimal it is to add this to
>> the pathlib PEP. While the changes do affect pathlib (even the code of
>> the module itself), this will affect ntpath, posixpath, os.scandir,
>> os.[other stuff], DirEntry (tempted to say os.DirEntry, but that is
>> not true), shutil.[stuff], (io.)open, and potentially all kinds of
>> random places in the stdlib, such as fileinput, filecmp, zipfile,
>> tarfile, tempfile (for the 'dir' keyword arguments), maybe even glob,
>> and fnmatch, to name a few :).
>> And now, if the FSPath[underlying_type] I just proposed ends up being
>> added to typing (by whatever name), this will even affect
> Personally I think it's better off as a separate PEP, unless it turns out
> that it can be compressed to just the addition of a few paragraphs to the
> original PEP 428.

While I could imagine the discussions having been shorter, it does not
seem like compressing everything into a few paragraphs is a good idea
either. And there are things that have not really been discussed, such
as the details of the 'typing' part and the list of affected modules,
which I tried to sketch above. Anyway, after all this, I wouldn't mind
to also work on the PEP if there will be a separate one---if that makes
any sense.


From guido at  Tue Apr 19 13:51:13 2016
From: guido at (Guido van Rossum)
Date: Tue, 19 Apr 2016 10:51:13 -0700
Subject: [Python-ideas] Type hinting for path-related functions
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 10:36 AM, Random832 <random832 at> wrote:

> On Tue, Apr 19, 2016, at 12:20, Guido van Rossum wrote:
> > Also, I don't expect a future where bytes paths don't make sense, unless
> > Linux starts enforcing a normalized UTF-8 encoding in the kernel.
> Well, OSX does that now, but that's a whole other topic. Whether it is
> useful to represent paths as the bytes type in Python code is orthogonal
> to whether you can have paths that aren't valid strings in an encoding,
> considering that surrogateescape lets you represent any sequence of
> bytes as a str.

I'm sorry, I just don't see support for bytes paths going away any time
soon (regardless of the alternative you bring up), so I think it's a waste
of breath to discuss it further.

--Guido van Rossum (

From __peter__ at  Tue Apr 19 13:55:59 2016
From: __peter__ at (Peter Otten)
Date: Tue, 19 Apr 2016 19:55:59 +0200
Subject: [Python-ideas] Fwd: Anonymous namedtuples
References: <>
Message-ID: <nf5rfg$soa$>

Guido van Rossum wrote:

> A general warning about this (x=12, y=16) idea: It's not so different from
> using dicts as cheap structs, constructing objects on the fly using either
> {'x': 12, 'y': 16} or dict(x=12, y=16). The problem with all of these is
> that if you use this idiom a lot, you're invariably going to run into
> cases where a field is missing, and you're spending a lot of time tracking
> down where the object was created incorrectly. Using a namedtuple (or any
> other construct that asks you to pre-declare the type used) the incorrect
> construction will cause a failure right at the point where it's being
> incorrectly constructed, making it much simpler to diagnose.
> In fully typed languages like Haskell or Java this wouldn't be a problem,
> but in languages like Python or Perl it is a real concern.

If there were support from the compiler for a function

def f():
   return x=1, y=2

the namedtuple("_", "x y") class could be stored in f.__code__.co_consts 
just like 1 and 2. Adding information about the file and line number could 
probably be handled like it's done for code objects of classes and functions 
defined inside a function.

While this approach still produces "late" failures debugging should at least 
be easier than for dict or SimpleNamespace instances.
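A minimal sketch of what that could amount to, emulated in today's Python. The name `_f_result` is hypothetical, and the default argument only stands in for the proposed co_consts slot; no such compiler support exists.

```python
from collections import namedtuple

def f(_f_result=namedtuple("_", "x y")):
    # The namedtuple class is built once, when the def statement runs,
    # much as a constant would be stored in f.__code__.co_consts.
    return _f_result(x=1, y=2)

a = f()
b = f()
print(a)                    # _(x=1, y=2)
print(type(a) is type(b))   # True: every call reuses the same class
```

Because the class is tied to one definition site, a wrong field name would at least point back to this function.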

From k7hoven at  Tue Apr 19 16:06:31 2016
From: k7hoven at (Koos Zevenhoven)
Date: Tue, 19 Apr 2016 23:06:31 +0300
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 7:18 PM, Steven D'Aprano <steve at> wrote:

> which then gives tracebacks like this:
> py> sys.setrecursionlimit(200)
> py> a = fact(10000)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 3, in fact
>   [...repeat previous line 197 times...]
>   File "<stdin>", line 2, in fact
> RuntimeError: maximum recursion depth exceeded in comparison

Something like this would also make sense (although not be quite as
useful) when there is infinite recursion. But it should recognize a
block of several lines repeating.


From contrebasse at  Tue Apr 19 16:25:30 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 20:25:30 +0000 (UTC)
Subject: [Python-ideas] Fwd: Anonymous namedtuples
References: <>
Message-ID: <>

Guido van Rossum <guido at ...> writes:

> A general warning about this (x=12, y=16) idea: It's not so different from
> using dicts as cheap structs, constructing objects on the fly using either
> {'x': 12, 'y': 16} or dict(x=12, y=16). The problem with all of these is
> that if you use this idiom a lot, you're invariably going to run into cases
> where a field is missing, and you're spending a lot of time tracking down
> where the object was created incorrectly. Using a namedtuple (or any other
> construct that asks you to pre-declare the type used) the incorrect
> construction will cause a failure right at the point where it's being
> incorrectly constructed, making it much simpler to diagnose.

I think that it depends a lot on what these are used for. What I primarily
thought about was being able to easily return a namedtuple as a function
output. In this case the namedtuple (anonymous or not) is created in a
single place, so having to declare the namedtuple before using it is of
limited interest, and has the drawbacks I presented before (code
duplication, declaration far away from the use).

What you're talking about is more like a container which would be used all
around an application, where you need to ensure the correctness of the struct
everywhere. In this case a standard namedtuple is a better fit.

I feel like it's similar to the separation you talked about before, script
vs application. My view is more script-like, yours is more application-like.


From contrebasse at  Tue Apr 19 16:38:06 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 20:38:06 +0000 (UTC)
Subject: [Python-ideas] Anonymous namedtuples
References: <>
Message-ID: <>

> It's a pretty big semantic change though. As I point out in my previous 
> post, at the moment we can be sure that changing the order of keyword 
> arguments does not change anything: spam(a=1, b=2) and spam(b=2, a=1) 
> will always have the same semantics. Making keyword arguments ordered 
> will break that assumption, and even if it doesn't break any existing 
> code, it will make it harder to reason about future code. No more can 
> you trust that the order of keyword arguments will have no effect on the 
> result of calling a function.
> I don't see the current behaviour of keyword arguments as a limitation, 
> I see it as a feature. I don't need to care about the order of keyword 
> arguments, only their names.
I think I finally understand your concern about keeping the order of
keyword arguments. Most of the time it wouldn't matter, but it can
potentially lead to very surprising behavior. Maybe the fact that we're all
consenting adults can help?

However in my proposal there is no function call, even if the syntax is
similar. The order of the fields has the same significance as in a
namedtuple, but it's another (easier) way to set them.
It's also why I proposed an evolution to use it in __getitem__ also, which
already doesn't behave like a standard function call either.
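A small sketch of the distinction, using only existing features. The names `spam`, `XY` and `YX` are illustrative, and the two namedtuple classes merely stand in for what (x=1, y=2) and (y=2, x=1) might produce under the proposal.

```python
from collections import namedtuple

def spam(a, b):
    return (a, b)

# For a function call, keyword order is irrelevant.
same_call = spam(a=1, b=2) == spam(b=2, a=1)

# For a namedtuple, field order is part of the value.
XY = namedtuple("XY", "x y")
YX = namedtuple("YX", "y x")
p = XY(x=1, y=2)
q = YX(y=2, x=1)

print(same_call)             # True
print(tuple(p), tuple(q))    # (1, 2) (2, 1)
print(p == q)                # False
```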


From contrebasse at  Tue Apr 19 16:47:13 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 20:47:13 +0000 (UTC)
Subject: [Python-ideas] Anonymous namedtuples
References: <>
Message-ID: <>

> SimpleNamespace is neither a sequence nor a mapping.
I don't know the exact definition of a mapping, but to me SimpleNamespace is
like a dict with restricted keys (valid attributes only) and a nicer syntax.
`my_dict["key"]` is roughly equivalent to `my_namespace.key`.
SimpleNamespace lacks lots of methods compared to dict. Maybe a
hypothetical ComplexNamespace would belong in the collections module?
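For reference, a short check of both points against the abstract base classes in collections.abc. The variable names are illustrative.

```python
from collections.abc import Mapping, Sequence
from types import SimpleNamespace

my_namespace = SimpleNamespace(key="value")
my_dict = {"key": "value"}

# SimpleNamespace registers as neither ABC.
print(isinstance(my_namespace, Mapping))    # False
print(isinstance(my_namespace, Sequence))   # False

# Attribute access and key access still reach the same data.
print(my_namespace.key == my_dict["key"])   # True
print(vars(my_namespace) == my_dict)        # True
```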


From srkunze at  Tue Apr 19 17:20:30 2016
From: srkunze at (Sven R. Kunze)
Date: Tue, 19 Apr 2016 23:20:30 +0200
Subject: [Python-ideas] Fwd: Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On 19.04.2016 19:35, Guido van Rossum wrote:
> That's essentially what the (x=12, y=16) proposal is about, IIUC -- it 
> would just be a single new type, so (x=12, y=16).__class__ would be 
> the same class object as (a='', b=3.14).

The proposal reminds me of JavaScript's "object".

So, "(x=12, y=16).__class__ == type(object())"?

> But I have serious reservations about that idiom too.

Me, too. I don't fully support this kind of on-the-fly construction, and 
I feel the same way about closures. But that might just be my personal 
feeling, because as such they escape proper testing and type support by IDEs 
(which reminds me strongly of Web development with JavaScript). On 
the other hand, it allows ultra-rapid prototyping from which one can 
strip down to "traditional development" step by step when needed (using 
classes, tests etc.).

This said, what are the reasons for your reservations?


From tjreedy at  Tue Apr 19 17:37:14 2016
From: tjreedy at (Terry Reedy)
Date: Tue, 19 Apr 2016 17:37:14 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <nf68fl$elv$>

On 4/19/2016 5:35 AM, Koos Zevenhoven wrote:

> I was once a basic user, but I still have no idea what "IDLE" is. Does
> it come with python?

Yes, unless explicitly omitted either in packaging or installation. 
(Some redistributors might put it with a separate 
tkinter/tix/idle/turtle/turtledemo package.  The windows installer has a 
box that can be unchecked.)

> I have tried
> $ idle

This and idle3 work on some but not all systems.

> $ python -m idle

python -m idlelib (or idlelib.idle -- required in 2.x)

> $ python -m IDLE
> $ python --idle

> To be honest, I do remember seeing a shortcut to IDLE in one of my
> Windows python installations,

Right.  Once run, the IDLE icon can be pinned to the taskbar.

> and I've seen it come up in discussions.
> However, it does not seem to me that IDLE is something that beginners
> would know to turn to.

Yet many do, perhaps because instructors and books suggest it and tell 
how to start it up.

Terry Jan Reedy

From contrebasse at  Tue Apr 19 18:02:55 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Tue, 19 Apr 2016 22:02:55 +0000 (UTC)
Subject: [Python-ideas] Fwd: Anonymous namedtuples
References: <>
Message-ID: <>

> On 19.04.2016 19:35, Guido van Rossum wrote:
> > That's essentially what the (x=12, y=16) proposal is about, IIUC -- it 
> > would just be a single new type, so (x=12, y=16).__class__ would be 
> > the same class object as (a='', b=3.14).
> The proposal reminds me of JavaScript's "object".
> So, "(x=12, y=16).__class__ == type(object())"?

No, it would have its own class, but it would be the same for all of these
objects, independently of the fields. This is different from namedtuples,
which need the creation of a subclass for each set of fields.
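To illustrate the difference with existing types: the names `P` and `Q` are made up, and SimpleNamespace is used only as an analogy for "one class regardless of fields", not as the proposed type.

```python
from collections import namedtuple
from types import SimpleNamespace

# Each set of fields needs its own namedtuple class.
P = namedtuple("P", "x y")
Q = namedtuple("Q", "a b")
print(type(P(12, 16)) is type(Q("", 3.14)))   # False

# A single class can hold any set of fields.
s1 = SimpleNamespace(x=12, y=16)
s2 = SimpleNamespace(a="", b=3.14)
print(type(s1) is type(s2))                   # True
```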

From tjreedy at  Tue Apr 19 18:02:54 2016
From: tjreedy at (Terry Reedy)
Date: Tue, 19 Apr 2016 18:02:54 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <nf69vq$5p9$>

On 4/19/2016 9:12 AM, Nick Coghlan wrote:

> The default REPL behaviour is appropriate for this "somewhat experienced
> Pythonista tinkering with code to see how it behaves" use case - keeping
> the results very close to what they would be if you typed the same line
> of code into a text file and ran it that way. It's not necessarily the
> best way to *learn* those equivalences, but that's also not what it's
> designed for.

The default REPL behavior should not throw away output.  In the Windows 
console, what is not displayed cannot be retrieved.  Reversible output 
editing is possible and appropriate in a GUI that can keep output 
separate from what is displayed.

> IPython's REPL is tailored for a different audience - their primary
> audience is research scientists, and they want to be able to better
> eyeball calculation results, rather than lower level Python instance
> representations. As a result, it's much cleverer than the default REPL,
> but it's also aiming to tap into people's intuitions about the shape of
> their data and the expected outcomes of the operations they're
> performing on it, rather than their ability to mentally run Python code
> specifically.
> A REPL designed specifically for folks learning Python, like the one in
> the Mu editor, or the direction IDLE seems to be going, would likely be
> better off choosing different default settings for sys.displayhook and
> sys.excepthook,

There is an external Squeezer extension to IDLE that more or less does 
what this thread proposes and is also reversible.  Deleted blocks are 
replaced by something that can be clicked on to expand the text.  There 
is a proposal to incorporate Squeezer into IDLE.  I have not reviewed 
the proposal yet because it would not solve any of my problems.  I am 
more interested in a way to put long output from help() into a separate 
text window instead of the shell.

> but those changes would be best selected based on direct
> observations of classrooms and workshops, and noting where folks get
> confused or intimidated by the default settings. For environments other
> than IDLE, they can also be iterated on at a much higher rate than we
> make CPython releases.

Terry Jan Reedy

From tjreedy at  Tue Apr 19 18:12:51 2016
From: tjreedy at (Terry Reedy)
Date: Tue, 19 Apr 2016 18:12:51 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <nf6aif$emb$>

On 4/19/2016 10:03 AM, Michael Selik wrote:
> On Tue, Apr 19, 2016 at 5:36 AM Franklin? Lee
> <leewangzhong+python at
> <mailto:leewangzhong%2Bpython at>>
> wrote:
>     On Apr 19, 2016 4:09 AM, "Paul Moore"
>     <p.f.moore at
>     <mailto:p.f.moore at>> wrote:
>     > Basic users should probably be using a tool like IDLE, which has a bit
>     > more support for beginners than the raw REPL.
>     My college had CS students SSH into the department's Linux server to
>     compile and run their code, and many teachers don't believe that
>     students should start with fancy IDE features like, er, syntax
>     highlighting.
> That's probably because your professors thought you were more advanced
> than other new Pythonistas, because you were CS students. If I were in
> their shoes, I might choose a different approach depending on the level
> of the course.
>> But that doesn't answer my question: would the proposed change hurt
>> your workflow?
> It might. Would it affect doctests? Would it make indirect infinite
> recursion more difficult to trace? Would it make me remember yet another
> command line option or REPL option to turn on complete reprs? Would it
> force me to explain yet another config setting to new programmers?
> I think a beginner understands when they've printed something too big. I
> see this happen frequently. They laugh, shake their heads, and retype
> whatever they need to.
> If they're using IDLE, they say, "OMG I crashed it!" then they close the
> window or restart IDLE.

Recursion limit tracebacks with 2000 short lines (under 80 chars) are 
not a problem for tk's Text widget.  Long lines, from printing something 
like 'a'*10000 or [1]*5000, make the widget sluggish.  Too many or too 
long may require a restart.

> I'd say it's more a problem in IDLE than in the default REPL.

It's a problem with using an underlying text widget optimized for 
managing 'sensible' length '\n' delimited lines.

Terry Jan Reedy

From tjreedy at  Tue Apr 19 18:28:40 2016
From: tjreedy at (Terry Reedy)
Date: Tue, 19 Apr 2016 18:28:40 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <nf6bg4$sa8$>

On 4/19/2016 12:18 PM, Steven D'Aprano wrote:
> I mostly agree with what you say, but I would like to see one change to
> the default sys.excepthook: large numbers of *identical* traceback lines
> (as you often get with recursion errors) should be collapsed. For
> example:

Tracebacks produce *pairs* of lines: the location and the line itself.
Replacing pairs with a count of repetitions would not lose information, 
and would make the info more visible.  I would require at least, say, 3 
repetitions before collapsing.

> py> sys.setrecursionlimit(20)
> py> fact(30)  # obvious recursive factorial
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 3, in fact
           return fact(n-1)
>   File "<stdin>", line 3, in fact
           return fact(n-1)
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 3, in fact
>   File "<stdin>", line 2, in fact
> RuntimeError: maximum recursion depth exceeded in comparison
> Try as I might, I just don't see the value of manually counting all
> those 'File "<stdin>", line 3, in fact' lines to find out where the
> recursive call failed :-)
> I think that it would be better if identical lines were collapsed,
> something like this:
> import sys
> import traceback
> from itertools import groupby
> TEMPLATE = "  [...repeat previous line %d times...]\n"
> def collapse(seq, minimum, template=TEMPLATE):
>     for key, group in groupby(seq):
>         group = list(group)
>         if len(group) < minimum:
>             for item in group:
>                 yield item
>         else:
>             yield key
>             yield template % (len(group)-1)
> def shortertb(*args):
>     lines = traceback.format_exception(*args)
>     sys.stderr.write(''.join(collapse(lines, 3)))
> sys.excepthook = shortertb
> which then gives tracebacks like this:
> py> sys.setrecursionlimit(200)
> py> a = fact(10000)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 3, in fact
          return fact(n-1)
>   [...repeat previous line 197 times...]
     [repeat previous pair of lines 197 times]
>   File "<stdin>", line 2, in fact
> RuntimeError: maximum recursion depth exceeded in comparison
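A rough sketch of the pair-based variant, assuming each traceback entry has already been split into a (location, source line) tuple. The function name `collapse_pairs`, the wording of the summary line and the sample source lines are all made up for illustration.

```python
from itertools import groupby

def collapse_pairs(entries, minimum=3):
    """Collapse runs of identical (location, source) pairs.

    Runs shorter than `minimum` are emitted unchanged; longer runs are
    emitted once, followed by a line saying how often the pair repeats.
    """
    out = []
    for (location, source), group in groupby(entries):
        count = len(list(group))
        if count < minimum:
            for _ in range(count):
                out.extend([location, source])
        else:
            out.extend([location, source])
            out.append("  [repeat previous pair of lines %d times]" % (count - 1))
    return out

# Hypothetical entries resembling the fact() example.
entries = (
    [('  File "<stdin>", line 1, in <module>', "    fact(30)")]
    + [('  File "<stdin>", line 3, in fact', "    return fact(n-1)")] * 18
    + [('  File "<stdin>", line 2, in fact', "    if n < 1:")]
)
lines = collapse_pairs(entries)
print("\n".join(lines))
```

Grouping on the whole pair means two calls from the same location with different source text would not be merged.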

Terry Jan Reedy

From ericsnowcurrently at  Tue Apr 19 20:26:42 2016
From: ericsnowcurrently at (Eric Snow)
Date: Tue, 19 Apr 2016 18:26:42 -0600
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 8:39 AM, Paul Moore <p.f.moore at> wrote:
> Having done some digging just now... SimpleNamespace is basically just
> exposing for end users, the type that is used to build the
> sys.implementation object. The definition is *literally* nothing more
> than
>     SimpleNamespace = type(sys.implementation)

FYI, sys.implementation was added for PEP 421 (May-ish 2012).  That
PEP has a small section on the type I used. [1]  At first it was
unclear if I should even expose type(sys.implementation). [2]  Nick
suggested types.SimpleNamespace, so I went with that. [3] :)

Note that the type is super simple/trivial, a basic subclass of object
with a **kwargs __init__() to pre-populate and a nice repr.  Both that
first link [1] and the docs [4] spell it out.  The point is that there
isn't much to the type.  It's a convenience, nothing else.  I used to
get roughly the same thing with "class SimpleNamespace: pass". :)
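A minimal sketch of a type with those two features, written in plain Python. The class name `Namespace` and the sample attributes are illustrative; this is not the actual implementation of types.SimpleNamespace.

```python
class Namespace:
    """Plain object whose attributes are set from keyword arguments."""

    def __init__(self, **kwargs):
        # Pre-populate the instance with the given attributes.
        self.__dict__.update(kwargs)

    def __repr__(self):
        items = ", ".join("%s=%r" % kv for kv in self.__dict__.items())
        return "%s(%s)" % (type(self).__name__, items)

ns = Namespace(name="cpython", cache_tag=None)
ns.extra = 1          # attributes can still be added afterwards
print(ns)             # Namespace(name='cpython', cache_tag=None, extra=1)
```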

FWIW, I proposed giving SimpleNamespace more exposure around the same
time as PEP 421, and the reaction was lukewarm. [5]  However, since
then it's been used in several other places in the stdlib and I know
of a few folks that have used it in their own projects.  Regardless,
ultimately the only reason I added it to the stdlib in the first place
[6] was because it's the type of sys.implementation and it made more
sense to expose it explicitly in the types module than implicitly via


[6] in contrast to argparse.Namespace and multiprocessing.managers.Namespace...

From ethan at  Tue Apr 19 20:53:51 2016
From: ethan at (Ethan Furman)
Date: Tue, 19 Apr 2016 17:53:51 -0700
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/19/2016 08:15 AM, Koos Zevenhoven wrote:

> My main objection to putting SimpleNamespace in collections is that we
> might want something even better in collections, perhaps
> ComplexNamespace?

A ComplexNamespace is a class.  :)


From random832 at  Tue Apr 19 22:04:40 2016
From: random832 at (Random832)
Date: Tue, 19 Apr 2016 22:04:40 -0400
Subject: [Python-ideas] Fwd: Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016, at 17:20, Sven R. Kunze wrote:
> So, "(x=12, y=16).__class__ == type(object())"?

Well, object does equality comparison by identity; you presumably would
want this class to use its contents.
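The difference is easy to see with existing types; SimpleNamespace and tuple serve here only as examples of content-based equality.

```python
from types import SimpleNamespace

# Plain objects compare by identity.
print(object() == object())                    # False

# Types that compare by contents behave differently.
print(SimpleNamespace(x=12, y=16) == SimpleNamespace(x=12, y=16))   # True
print((12, 16) == (12, 16))                    # True
```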

From steve at  Tue Apr 19 22:48:11 2016
From: steve at (Steven D'Aprano)
Date: Wed, 20 Apr 2016 12:48:11 +1000
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Apr 19, 2016 at 08:47:13PM +0000, Joseph Martinot-Lagarde wrote:
> > SimpleNamespace is neither a sequence nor a mapping.
> > 
> I don't know the exact definition of a mapping, but to me SimpleNamespace is
> like a dict with restricted keys (valid attributes only) and a nicer syntax.
> `my_dict["key"]` is roughly equivalent to `my_namespace.key`.

That is (I believe) how Javascript and PHP treat it, but they're not 
exactly the best languages to emulate, or at least not blindly.

Attributes and keys represent different concepts. Attributes represent 
an integral part of the object:

dog.tail
me.head
car.engine
book.cover

while keys represent arbitrary (or nearly so) data associated with some 
data collection:

books['The Lord Of The Rings']
kings['Henry VIII']
prisoner[239410]
colours['AliceBlue']

They don't just use different syntaxes, they have different purposes, 
and while it is tempting to (mis)use the shorter attribute syntax for 
key lookups:

colours.AliceBlue

it is risky to conflate the two. Suppose you have a book called 
"update" (presumably a book of experimental poetry by somebody who 
dislikes uppercase letters):


Either the key shadows the update method, or the method shadows the 
book, or you get an error. All of these scenarios are bad.
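A small sketch of one of those outcomes, assuming a hypothetical dict subclass (`AttrDict`, not a standard library class) that falls back to key lookup when attribute lookup fails. Here the method shadows the book.

```python
class AttrDict(dict):
    """Hypothetical dict that also exposes its keys as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup has failed.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

books = AttrDict()
books["hobbit"] = "The Hobbit"
books["update"] = "update (experimental poetry)"

print(books.hobbit)             # The Hobbit
print(books["update"])          # update (experimental poetry)
print(callable(books.update))   # True: dict.update wins, not the book
```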

I'd consider giving Python an alternate key-lookup syntax purely as 
syntactic sugar:

colours~AliceBlue  # sugar for colours['AliceBlue']
books~update  # books['update']

before I would consider adding a standard library class that conflates 
attribute- and key-lookup. If people want to do this in their own code 
(and I do see the attraction, even if I think it is a bad idea), so be 
it, but the standard library shouldn't encourage it.


From rosuav at  Tue Apr 19 23:00:29 2016
From: rosuav at (Chris Angelico)
Date: Wed, 20 Apr 2016 13:00:29 +1000
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Apr 20, 2016 at 12:48 PM, Steven D'Aprano <steve at> wrote:
> Attributes and keys represent different concepts. Attributes represent
> an integral part of the object:
> dog.tail
> me.head
> car.engine
> book.cover
> while keys represent arbitrary (or nearly so) data associated with some
> data collection:
> books['The Lord Of The Rings']
> kings['Henry VIII']
> prisoner[239410]
> colours['AliceBlue']
> They don't just use different syntaxes, they have different purposes,
> and while it is tempting to (mis)use the shorter attribute syntax for
> key lookups:
> colours.AliceBlue
> it is risky to conflate the two.

Enumerations blur the line. You could define 'colours' thus:

class colours(enum.IntEnum):
    AliceBlue = 0xf0f8ff

Is "AliceBlue" a fundamental attribute of the "colours" object, or is
it a piece of data associated with the collection of colours? Or, like
Danny Kaye, perchance a brilliant combination of both?
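Both spellings do work on an enum, which is part of why the line blurs:

```python
import enum

class colours(enum.IntEnum):
    AliceBlue = 0xf0f8ff

# Attribute-style and key-style lookup return the same member.
print(colours.AliceBlue is colours["AliceBlue"])   # True
print(colours.AliceBlue == 0xf0f8ff)               # True
print(colours(0xf0f8ff).name)                      # AliceBlue
```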


From mertz at  Tue Apr 19 23:12:19 2016
From: mertz at (David Mertz)
Date: Tue, 19 Apr 2016 20:12:19 -0700
Subject: [Python-ideas] random.choice on non-sequence
In-Reply-To: <>
References: <>
 <-176815014396412028@unknownmsgid> <nel7fd$7fq$>
Message-ID: <>

For any positive integer you select (including those with more digits than
there are particles in the universe), ALMOST ALL integers are larger than
your selection. I.e. the measure of those smaller remains zero.
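One way to read that statement is as a limit: the share of the integers 1..bound that are at most n shrinks toward zero as the bound grows. A quick numeric sketch; the helper name `share_at_most` is made up.

```python
from fractions import Fraction

def share_at_most(n, bound):
    """Fraction of the integers 1..bound that are <= n."""
    return Fraction(min(n, bound), bound)

n = 10**6
for bound in (10**6, 10**9, 10**12):
    # The share drops as the bound grows and tends to 0 in the limit.
    print(bound, float(share_at_most(n, bound)))
```
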
On Apr 13, 2016 6:31 PM, "Steven D'Aprano" <steve at> wrote:

> On Thu, Apr 14, 2016 at 09:47:37AM +1200, Greg Ewing wrote:
> > Chris Angelico wrote:
> > >On Wed, Apr 13, 2016 at 8:36 PM, Terry Reedy <tjreedy at> wrote:
> > >
> > >>On 4/13/2016 12:52 AM, Chris Barker - NOAA Federal wrote:
> > >>
> > >>>BTW, isn't it impossible to randomly select from an infinite iterable
> > >>>anyway?
> > >>
> > >>With equal probability, yes, impossible.
> [...]
> > I think Terry meant that you can't pick just one item that's
> > equally likely to be any of the infinitely many items returned
> > by the iterator.
> Correct. That's equivalent to choosing a positive integer with uniform
> probability distribution and no upper bound.
> > You can prove that by considering that the probability of
> > a given item being returned would have to be 1/infinity,
> > which is zero -- so you can't return anything!
> That's not how probability works :-)
> Consider a dart which is thrown at a dartboard. The probability of it
> landing on any specific point is zero, since the area of a single point
> is zero. Nevertheless, the dart does hit somewhere!
> A formal and precise treatment would have to involve calculus and limits
> as the probability approaches zero, rather than a flat out "the
> probability is zero, therefore it's impossible".
> Slightly less formally, we can say (only horrifying mathematicians a
> little bit) that the probability of any specific number is an
> infinitesimal number.
> While it is *mathematically* meaningful to talk about selecting a random
> positive integer uniformly, it's hard to do much more than that. The mean
> (average) is undefined[1]. A typical value chosen would have a vast
> number of digits, far larger than anything that could be stored in
> computer memory. Indeed Almost All[2] of the values we generate would be
> so large that we have no notation for writing it down (and not enough
> space in the universe to write it even if we did). So it is impossible
> in practice to select a random integer with uniform distribution and no
> upper bound.
> Non-uniform distributions, though, are easy :-)
> [1] While weird, this is not all that weird. For example, selecting
> numbers from a Cauchy distribution also has an undefined mean. What this
> means in practice is that the *sample mean* will not converge as you
> take more and more samples: the more samples you take, the more wildly
> the average will jump all over the place.
> [2]
> --
> Steve

From dan at  Tue Apr 19 23:18:25 2016
From: dan at (Dan Sommers)
Date: Wed, 20 Apr 2016 03:18:25 +0000 (UTC)
Subject: [Python-ideas] Have REPL print less by default
References: <>
Message-ID: <nf6se1$ue9$>

On Wed, 20 Apr 2016 02:18:58 +1000, Steven D'Aprano wrote:

> On Tue, Apr 19, 2016 at 11:12:16PM +1000, Nick Coghlan wrote:

>> The default REPL behaviour is appropriate for this "somewhat
>> experienced Pythonista tinkering with code to see how it behaves" use
>> case - keeping the results very close to what they would be if you
>> typed the same line of code into a text file and ran it that
>> way. It's not necessarily the best way to *learn* those equivalences,
>> but that's also not what it's designed for.

That's a pretty powerful argument:  running something in the REPL should
give the same results as running it from a command line.  When things
fail differently in different environments, the environments themselves
become suspects.

> I mostly agree with what you say, but I would like to see one change
> to the default sys.excepthook: large numbers of *identical* traceback
> lines (as you often get with recursion errors) should be
> collapsed. For example:


> Try as I might, I just don't see the value of manually counting all
> those 'File "<stdin>", line 3, in fact' lines to find out where the
> recursive call failed :-)

Yes, I see the smiley, but I would add specifically that the number of
calls in the stack trace from a recursive function is rarely the
important part.  When I write something recursive, and get *that* stack
trace, I don't have to scroll anywhere to look at anything to know that
I blew the termination condition(s).

(I'm agreeing with you:  there is no value of counting the number of
calls in the stack trace.)

If I suddenly got tiny stack traces, I'd spend *more* time realizing
what went wrong, until I retrained myself.

From tjreedy at  Tue Apr 19 23:33:38 2016
From: tjreedy at (Terry Reedy)
Date: Tue, 19 Apr 2016 23:33:38 -0400
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <nf6se1$ue9$>
References: <>
 <> <nf6se1$ue9$>
Message-ID: <nf6tbu$mbr$>

On 4/19/2016 11:18 PM, Dan Sommers wrote:
> On Wed, 20 Apr 2016 02:18:58 +1000, Steven D'Aprano wrote:
>> On Tue, Apr 19, 2016 at 11:12:16PM +1000, Nick Coghlan wrote:
>>> The default REPL behaviour is appropriate for this "somewhat
>>> experienced Pythonista tinkering with code to see how it behaves" use
>>> case - keeping the results very close to what they would be if you
>>> typed the same line of code into a text file and ran it that
>>> way. It's not necessarily the best way to *learn* those equivalences,
>>> but that's also not what it's designed for.
> That's a pretty powerful argument:  running something in the REPL should
> give the same results as running it from a command line.  When things
> fail differently in different environments, the environments themselves
> become suspects.

If RecursionError stack traces were condensed, I would have them 
continue to be the same (condensed) in either interactive or batch mode.

>> I mostly agree with what you say, but I would like to see one change
>> to the default sys.excepthook: large numbers of *identical* traceback
>> lines (as you often get with recursion errors) should be
>> collapsed. For example:
> [...]
>> Try as I might, I just don't see the value of manually counting all
>> those 'File "<stdin>", line 3, in fact' lines to find out where the
>> recursive call failed :-)
> Yes, I see the smiley, but I would add specifically that the number of
> calls in the stack trace from a recursive function is rarely the
> important part.  When I write something recursive, and get *that* stack
> trace, I don't have to scroll anywhere to look at anything to know that
> I blew the termination condition(s).
> (I'm agreeing with you:  there is no value of counting the number of
> calls in the stack trace.)
> If I suddenly got tiny stack traces, I'd spend *more* time realizing
> what went wrong, until I retrained myself.

Terry Jan Reedy

From ethan at  Wed Apr 20 00:07:38 2016
From: ethan at (Ethan Furman)
Date: Tue, 19 Apr 2016 21:07:38 -0700
Subject: [Python-ideas] Anonymous namedtuples
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/19/2016 08:00 PM, Chris Angelico wrote:
> On Wed, Apr 20, 2016 at 12:48 PM, Steven D'Aprano wrote:

>> Attributes and keys represent different concepts. Attributes represent
>> an integral part of the object:
>> [...]
>> it is risky to conflate the two.
> Enumerations blur the line. You could define 'colours' thus:
> class colours(enum.IntEnum):
>      AliceBlue = 0xf0f8ff

However, enumerations have a very specific purpose -- they aren't a 
generic collection you can modify and update at will (at least, not 
without a *lot* of effort).
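For instance, rebinding an existing member is rejected outright. A minimal check, using the same illustrative `colours` enum:

```python
import enum

class colours(enum.IntEnum):
    AliceBlue = 0xf0f8ff

try:
    colours.AliceBlue = 0x000000   # try to rebind an existing member
except AttributeError:
    outcome = "rejected"
else:
    outcome = "accepted"

print(outcome)   # rejected
```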


From vgr255 at  Wed Apr 20 00:07:11 2016
From: vgr255 at (Émanuel Barry)
Date: Wed, 20 Apr 2016 00:07:11 -0400
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have REPL
 print less by default)
Message-ID: <BLU403-EAS22061158D06BC971341ED84916D0@phx.gbl>

This idea was mentioned a couple of times in the previous thread, and it
seems reasonable to me. Getting recursion errors when testing a function in
the interactive prompt often screwed me over, as I have a limited scrollback
of 1000 lines at best on Windows (I never checked exactly how high/low the
limit was), and with Python's recursion limit of 1000, that's a whopping
1000 to 2000 lines in my console, effectively killing off anything useful I
might have wanted to see. Such as, for example, the function definition that
triggered the exception.

Of course, the same is true for actual programs, where tracebacks drown out
everything else useful. For all we know, a typo caused a NameError in one of
the functions which somehow triggered it to call itself over again until
Python decided it was enough, but you can't know that because your entire
scrollback is full of

File "<stdin>", line 1, in func

And, for programs that are user-facing, I take all tracebacks and redirect
them to a file that the user can submit to me so I can debug the issue, and
having tens of hundreds of the same line hurts readability.

The issue that the output from running a certain script in the REPL and as a
script would differ has been raised, and I am of the opinion that in no way,
shape or form should the two have different behaviours (the exception being
sys.ps1, sys.ps2 and builtins._ in interactive mode, which are 100% fine).
I'm suggesting a change to how Python handles specifically recursion limits,
to cut after a sane number of times (someone suggested to shrink the output
after 3 times), and simply stating how many more identical messages there
are. I would also like to extend this to any arbitrary loop of callers (for
example foo -> bar -> baz -> foo and so on would be counted as one "call"
for the purposes of this proposal).

Under this proposal, something similar to this would happen:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in func
  File "<stdin>", line 1, in func
  File "<stdin>", line 1, in func
    [Previous 1 message(s) repeated 996 more times]
RecursionError: maximum recursion depth exceeded

With multiple chained calls (I don't know how hard it would be to implement
this, probably not trivial):

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in foo
  File "<stdin>", line 1, in bar
  File "<stdin>", line 1, in baz
  File "<stdin>", line 1, in foo
  File "<stdin>", line 1, in bar
  File "<stdin>", line 1, in baz
  File "<stdin>", line 1, in foo
  File "<stdin>", line 1, in bar
  File "<stdin>", line 1, in baz
    [Previous 3 message(s) repeated 330 more times]
RecursionError: maximum recursion depth exceeded

We can probably afford to cut out immediately as we notice the recursion,
but even just 3 times with multiple chained calls isn't going to be anywhere
near as long as it currently is.
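
As a rough illustration of the collapsing described above, here is a sketch
that works on a plain list of frame labels rather than on real traceback
objects. The names collapse_cycles, keep and max_cycle are made up for this
example and are not part of any existing API; it is one possible approach
under those assumptions, not a proposed implementation.

```python
def collapse_cycles(frames, keep=3, max_cycle=8):
    """Fold repeated cycles in a list of hashable frame labels.

    Returns a new list whose items are either a frame label or a marker
    tuple ("repeat", cycle_length, extra_repetitions).
    """
    out = []
    i = 0
    n = len(frames)
    while i < n:
        folded = False
        # Try the shortest cycle first, up to an arbitrary bound.
        for k in range(1, min(max_cycle, (n - i) // 2) + 1):
            cycle = frames[i:i + k]
            reps = 1
            # Count how many times the cycle repeats back to back.
            while frames[i + reps * k:i + (reps + 1) * k] == cycle:
                reps += 1
            if reps > keep:
                out.extend(cycle * keep)
                out.append(("repeat", k, reps - keep))
                i += k * reps
                folded = True
                break
        if not folded:
            out.append(frames[i])
            i += 1
    return out


single = collapse_cycles(["<module>"] + ["func"] * 999)
chained = collapse_cycles(["<module>"] + ["foo", "bar", "baz"] * 333)
```

For the foo -> bar -> baz case this keeps the first three rounds and replaces
the rest with one marker, which a formatter could print as the "repeated N
more times" line.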

Thoughts? Suggestions? Something I forgot?


From ethan at  Wed Apr 20 00:26:24 2016
From: ethan at (Ethan Furman)
Date: Tue, 19 Apr 2016 21:26:24 -0700
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have
 REPL print less by default)
In-Reply-To: <BLU403-EAS22061158D06BC971341ED84916D0@phx.gbl>
References: <BLU403-EAS22061158D06BC971341ED84916D0@phx.gbl>
Message-ID: <>

On 04/19/2016 09:07 PM, Émanuel Barry wrote:

> Under this proposal, something similar to this would happen:
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
>    File "<stdin>", line 1, in func
>    File "<stdin>", line 1, in func
>    File "<stdin>", line 1, in func
>      [Previous 1 message(s) repeated 996 more times]
> RecursionError: maximum recursion depth exceeded
> With multiple chained calls (I don't know how hard it would be to implement
> this, probably not trivial):
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
>    File "<stdin>", line 1, in foo
>    File "<stdin>", line 1, in bar
>    File "<stdin>", line 1, in baz
>    File "<stdin>", line 1, in foo
>    File "<stdin>", line 1, in bar
>    File "<stdin>", line 1, in baz
>    File "<stdin>", line 1, in foo
>    File "<stdin>", line 1, in bar
>    File "<stdin>", line 1, in baz
>      [Previous 3 message(s) repeated 330 more times]
> RecursionError: maximum recursion depth exceeded

> Thoughts? Suggestions? Something I forgot?

I don't think you'll find anyone opposed -- someone just needs to do the 
work, and volunteer hours are scarce. :(


From tjreedy at  Wed Apr 20 02:19:38 2016
From: tjreedy at (Terry Reedy)
Date: Wed, 20 Apr 2016 02:19:38 -0400
On 4/20/2016 12:07 AM, Émanuel Barry wrote:
> This idea was mentioned a couple of times in the previous thread, and it
> seems reasonable to me. Getting recursion errors when testing a function in
> the interactive prompt often screwed me over, as I have a limited scrollback
> of 1000 lines at best on Windows (I never checked exactly how high/low the
> limit was), and with Python's recursion limit of 1000, that's a whopping
> 1000 to 2000 lines in my console, effectively killing off anything useful I
> might have wanted to see. Such as, for example, the function definition that
> triggered the exception.

For those who don't know, the Windows console uses a circular buffer of 
lines.  The default size was once 300, I believe -- too small to contain 
a run of the test suite.  (It seems to be less on Win 10).  The max with
win 10 appears to be 999 (x 4? unclear).

Terry Jan Reedy

On Wed, Apr 20, 2016 at 2:19 AM, Terry Reedy <tjreedy at> wrote:
> On 4/20/2016 12:07 AM, Émanuel Barry wrote:
>> This idea was mentioned a couple of times in the previous thread, and it
>> seems reasonable to me. Getting recursion errors when testing a function
>> in
>> the interactive prompt often screwed me over, as I have a limited
>> scrollback
>> of 1000 lines at best on Windows (I never checked exactly how high/low the
>> limit was), and with Python's recursion limit of 1000, that's a whopping
>> 1000 to 2000 lines in my console, effectively killing off anything useful
>> I
>> might have wanted to see. Such as, for example, the function definition
>> that
>> triggered the exception.
> For those who don't know, the Windows console uses a circular buffer of
> lines.  The default size was once 300, I believe -- too small to contain a
> run of the test suite.  (It seems to be less on Win 10).  The max with win 10
> appears to be 999 (x 4? unclear).

I'm guessing you pulled the 999x4 from the Options tab of the
properties dialog, but that option is actually for command history
(i.e. the ability to use the up arrow to recall previous commands).
That has a limit of 999 commands in the buffer, with a limit of 999
(not 4, which is the default) total buffers in memory across all open
CMD.EXE instances. (Not sure what happens when you open more instances
than that limit is set to.)

The setting for scrollback is found under the Layout tab, and is
disguised as the Height option in the Screen Buffer Size section. That
has a limit of 9,999 lines, and I do believe it defaulted to 300 (I
changed that setting so long ago that I no longer remember for sure
what the stock default was).

Perhaps to simplify the implementation, one could just look for 
repetitions/cycles that include the LAST trace line.
Rob Cliffe

On 20/04/2016 05:07, Émanuel Barry wrote:
> This idea was mentioned a couple of times in the previous thread, and it
> seems reasonable to me. Getting recursion errors when testing a function in
> the interactive prompt often screwed me over, as I have a limited scrollback
> of 1000 lines at best on Windows (I never checked exactly how high/low the
> limit was), and with Python's recursion limit of 1000, that's a whopping
> 1000 to 2000 lines in my console, effectively killing off anything useful I
> might have wanted to see. Such as, for example, the function definition that
> triggered the exception.
> Of course, the same is true for actual programs, where tracebacks drown off
> everything else useful. For all we know, a typo caused a NameError in one of
> the functions which somehow triggered it to call itself over again until
> Python decided it was enough, but you can't know that because your entire
> scrollback is full of
> File "<stdin>", line 1, in func
> And, for programs that are user-facing, I take all tracebacks and redirect
> them to a file that the user can submit me so I can debug the issue, and
> having tens of hundreds of the same line hurts readability.
> The issue that the output from running a certain script in the REPL and as a
> script would differ has been raised, and I am of the opinion that in no way,
> shape or form should the two have different behaviours (the exception being
> sys.ps1, sys.ps2 and builtins._ in interactive mode, which are 100% fine).
> I'm suggesting a change to how Python handles specifically recursion limits,
> to cut after a sane number of times (someone suggested to shrink the output
> after 3 times), and simply stating how many more identical messages there
> are. I would also like to extend this to any arbitrary loop of callers (for
> example foo -> bar -> baz -> foo and so on would be counted as one "call"
> for the purposes of this proposal).
> Under this proposal, something similar to this would happen:
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
>    File "<stdin>", line 1, in func
>    File "<stdin>", line 1, in func
>    File "<stdin>", line 1, in func
>      [Previous 1 message(s) repeated 996 more times]
> RecursionError: maximum recursion depth exceeded
> With multiple chained calls (I don't know how hard it would be to implement
> this, probably not trivial):
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
>    File "<stdin>", line 1, in foo
>    File "<stdin>", line 1, in bar
>    File "<stdin>", line 1, in baz
>    File "<stdin>", line 1, in foo
>    File "<stdin>", line 1, in bar
>    File "<stdin>", line 1, in baz
>    File "<stdin>", line 1, in foo
>    File "<stdin>", line 1, in bar
>    File "<stdin>", line 1, in baz
>      [Previous 3 message(s) repeated 330 more times]
> RecursionError: maximum recursion depth exceeded
> We can probably afford to cut out immediately as we notice the recursion,
> but even just 3 times with multiple chained calls isn't going to be anywhere
> near as long as it currently is.
> Thoughts? Suggestions? Something I forgot?
> -Emanuel
On Wed, Apr 20, 2016 at 1:28 AM, Terry Reedy <tjreedy at> wrote:
> On 4/19/2016 12:18 PM, Steven D'Aprano wrote:
>> I mostly agree with what you say, but I would like to see one change to
>> the default sys.excepthook: large numbers of *identical* traceback lines
>> (as you often get with recursion errors) should be collapsed. For
>> example:
> Tracebacks produce *pairs* of lines: the location and the line itself.

Not always, as Steven's example shows. For example:

def fun():
    fun()

fun()

Does that.

But yes, as I wrote in my previous email, it should recognize a block
of several lines repeating, too.


On Wed, Apr 20, 2016 at 12:41 PM, Rob Cliffe <rob.cliffe at> wrote:
> Perhaps to simplify the implementation, one could just look for
> repetitions/cycles that include the LAST trace line.
> Rob Cliffe

Maybe you mean the last line before the exception message?


Yes of course.

On 20/04/2016 11:41, Koos Zevenhoven wrote:
> On Wed, Apr 20, 2016 at 12:41 PM, Rob Cliffe <rob.cliffe at> wrote:
>> Perhaps to simplify the implementation, one could just look for
>> repetitions/cycles that include the LAST trace line.
>> Rob Cliffe
> Maybe you mean the last line before the exception message?
> -Koos

On Tue, Apr 19, 2016 at 4:06 AM, Chris Angelico <rosuav at> wrote:
> There have been proposals to have kwargs retain some order (either by
> having it actually be an OrderedDict, or by changing the native dict
> type to retain order under fairly restricted circumstances). With
> that, you could craft the factory function easily.
> It may be worth dusting off one of those proposals and seeing if it
> can move forward.

FYI, PEP 468: "Preserving the order of **kwargs in a function." [1]

The PEP exists for exactly the sort of use case being discussed here.
In my case it was an alternate enum proposal I was making back when
enums were under discussion.  Regardless, I'd still like to see the
PEP land.  It's just a matter of finding time.  I'll see if I can get
some consensus leading up to and at PyCon next month.

The actual implementation for PEP 468 is almost trivial.  Now that we
have a C-implementation of OrderedDict, it's mostly just a matter of
addressing performance concerns [2], particularly those expressed by
Guido, or falling back to one of the alternatives discussed in the

Note that a related concept is represented with my work to make the
class definition namespace ordered by default. [3]  It would allow
class decorators to have access to the order in which the class's
attributes were defined.



On 4/20/2016 6:39 AM, Koos Zevenhoven wrote:
> On Wed, Apr 20, 2016 at 1:28 AM, Terry Reedy <tjreedy at> wrote:
>> On 4/19/2016 12:18 PM, Steven D'Aprano wrote:
>>> I mostly agree with what you say, but I would like to see one change to
>>> the default sys.excepthook: large numbers of *identical* traceback lines
>>> (as you often get with recursion errors) should be collapsed. For
>>> example:
>> Tracebacks produce *pairs* of lines: the location and the line itself.
> Not always, as Steven's example shows. For example:
> def fun():
>     fun()
> fun()
> Does that.

The above produces pairs of lines, so I do not understand your point.

Terry Jan Reedy

On Wed, Apr 20, 2016 at 8:46 PM, Terry Reedy <tjreedy at> wrote:
> On 4/20/2016 6:39 AM, Koos Zevenhoven wrote:
>> On Wed, Apr 20, 2016 at 1:28 AM, Terry Reedy <tjreedy at> wrote:
>>> On 4/19/2016 12:18 PM, Steven D'Aprano wrote:
>>>> I mostly agree with what you say, but I would like to see one change to
>>>> the default sys.excepthook: large numbers of *identical* traceback lines
>>>> (as you often get with recursion errors) should be collapsed. For
>>>> example:
>>> Tracebacks produce *pairs* of lines: the location and the line itself.
>> Not always, as Steven's example shows. For example:
>> def fun():
>>     fun()
>> fun()
>> Does that.
> The above produces pairs of lines, so I do not understand your point.

Strange. For me it produces a repeating single line. Tried both on
2.7.6 and 3.5.1. Probably not worth discussing, though.


On Wed, Apr 20, 2016, at 13:57, Koos Zevenhoven wrote:
> On Wed, Apr 20, 2016 at 8:46 PM, Terry Reedy <tjreedy at> wrote:
> > On 4/20/2016 6:39 AM, Koos Zevenhoven wrote:
> >> def fun():
> >>     fun()
> >>
> >> fun()
> >>
> >> Does that.
> >
> >
> > The above produces pairs of lines, so I do not understand your point.
> >
> Strange. for me it produces a repeating single line. Tried both on
> 2.7.6. and 3.5.1. Probably not worth discussing, though.

It produces single lines in the interactive interpreter, pairs in a
file.

Any real implementation should do a comparison at the traceback data
level, though, rather than the string.
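
A minimal sketch of what comparing at the data level could look like, using
traceback.extract_tb from the standard library. The helper names
(_entry_key, collapsed_lines), the threshold parameter and the wording of
the marker line are assumptions made for this example only.

```python
import itertools
import sys
import traceback


def _entry_key(entry):
    # Compare the structured fields, not the rendered text.
    return (entry.filename, entry.lineno, entry.name)


def collapsed_lines(tb, threshold=3):
    """Return display lines for tb, folding runs of identical entries."""
    lines = []
    for (filename, lineno, name), run in itertools.groupby(
            traceback.extract_tb(tb), key=_entry_key):
        count = len(list(run))
        shown = min(count, threshold)
        location = '  File "%s", line %d, in %s' % (filename, lineno, name)
        lines.extend([location] * shown)
        if count > shown:
            lines.append('    [Previous entry repeated %d more times]'
                         % (count - shown))
    return lines


def countdown(n):
    # Recurse 50 times, then fail on purpose at the bottom of the stack.
    return countdown(n - 1) if n else 1 // 0


try:
    countdown(50)
except ZeroDivisionError:
    report = collapsed_lines(sys.exc_info()[2])
```

Because the grouping key ignores the source text, the same code handles both
the single-line entries of the interactive interpreter and the paired
entries produced when running from a file.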

On Apr 19, 2016 9:12 AM, "Nick Coghlan" <ncoghlan at> wrote:
> Implicit side effects from hidden code break that mental equivalence -
it's why effective uses of metaclasses, monkeypatching, and other
techniques for deliberately introducing changed implicit behaviour often
also involve introducing some kind of more local signal to help convey what
is going on (such as a naming convention, or ensuring the altered behaviour
is used consistently across the entire project).

But a local signal _is_ part of my original proposal:

    """Instead, it will print a few lines ("... and approximately X more
lines"), and tell you how to print more. (E.g. "Call '_more()' for more.
Call '_full()' for full output.")"""

    """Again, there should be a message telling you how to get the full
stacktrace printed. EXACTLY how, preferably in a way that is easy to type,
so that a typo won't cause the trace to be lost. It should not use
`sys.something()`, because the user's first few encounters with this
message will result in, "NameError: name 'sys' is not defined"."""

This also satisfies Terry Reedy's "reversibility" condition.

On Apr 19, 2016 10:03 AM, "Michael Selik" <mike at> wrote:
> On Tue, Apr 19, 2016 at 5:36 AM Franklin? Lee <
leewangzhong+python at> wrote:
>> On Apr 19, 2016 4:09 AM, "Paul Moore" <p.f.moore at> wrote:
>> > Basic users should probably be using a tool like IDLE, which has a bit
>> > more support for beginners than the raw REPL.
>> My college had CS students SSH into the department's Linux server to
compile and run their code, and many teachers don't believe that students
should start with fancy IDE features like, er, syntax highlighting.
> That's probably because your professors thought you were more advanced
than other new Pythonistas, because you were CS students. If I were in
their shoes, I might choose a different approach depending on the level of
the course.

I meant to have two separate clauses:

- In my college, there was a server for compiling and running code. It
allowed people to do so without installing a compiler (e.g. on a
general-purpose school computer). It also gave a standard compiler to test
against. I did not learn Python in school.

- Many teachers, all over the world, do not believe in IDE features for
beginners. (Therefore, there will be many *students* who don't learn
about IDLE.) You can want them to stop thinking that way, but the students
will still be out there.

> > But that doesn't answer my question: would the proposed change hurt
your workflow?
> It might. Would it affect doctests? Would it make indirect infinite
recursion more difficult to trace? Would it make me remember yet another
command line option or REPL option to turn on complete reprs? Would it
force me to explain yet another config setting to new programmers?

Let's be clear about what I am proposing. I prioritize not losing info and
making it obvious how to get at that info. My suggestion was to show less
output *and* have a message at the end saying how to get the output.

It obviously should not affect doctests, since the output there is not for
humans. An implementation that affects doctests should be considered buggy.
Perhaps have an option to allow it, but affecting doctests by default would
be an actual backward incompatibility, since it changes how existing code
*runs* (i.e. not just output, but different logical paths during runtime).

My idea for tracebacks of mutually recursive calls: See which functions are
on the stack more than once, and if they keep appearing, pack them up. The
algorithm should not pack up calls which happen only once within an
apparent cycle, and it should be clear which order the calls come in
(outside of folded mutually-recursive calls, of course).

    File "<stdin>", line 1, in f
    File "<stdin>", line 1, in g
    [Mutually recursive calls hidden: f (300), g (360)]
    File "<stdin>", line 1, in h
    File "<stdin>", line 1, in f
    File "<stdin>", line 1, in g
   [Mutual-recursive calls hidden: f (103), g (200)]
  RuntimeError: maximum recursion depth exceeded
  [963 calls hidden. Call _full_ex() to print full trace.]

Or maybe the second f and g are also folded into the second hidden bit. And
maybe it checks line numbers when deciding whether to print explicitly (but
not when folding).

Would that output make indirect infinite recursion more difficult for you
to debug?

> I think a beginner understands when they've printed something too big. I
see this happen frequently. They laugh, shake their heads, and retype
whatever they need to.

I am not proposing this as a better error message, but because a flooded
terminal is loss of information. It is also good to have significant info
close together, to reduce effort (and thus mental cache) between processing
of related ideas.

Have you never needed to see the output of previous lines? Re-entering the
earlier line might not work if the state of the program has changed. This
happens at the beginner level.
On Wed, Apr 20, 2016, at 14:25, Franklin? Lee wrote:
> Example:
>     File "<stdin>", line 1, in f
>     File "<stdin>", line 1, in g
>     [Mutually recursive calls hidden: f (300), g (360)]
>     File "<stdin>", line 1, in h
>     File "<stdin>", line 1, in f
>     File "<stdin>", line 1, in g
>    [Mutual-recursive calls hidden: f (103), g (200)]
>   RuntimeError: maximum recursion depth exceeded
>   [963 calls hidden. Call _full_ex() to print full trace.]
> Or maybe the second f and g are also folded into the second hidden
> bit. And maybe it checks line numbers when deciding whether to print
> explicitly (but not when folding).
> Would that output make indirect infinite recursion more difficult for
> you to debug?

You know what would make complicated infinite recursion easier to debug?
The arguments.

Is there a reliable way to determine what in f_locals corresponds to
arguments? My toy example below only works for named positional
arguments.

def magic(frame):
    code = frame.f_code
    fname = code.co_name
    argcount = code.co_argcount
    args = code.co_varnames[:argcount]
    values = tuple(frame.f_locals[a] for a in args)
    result = '%s%r' % (fname, values)
    if len(result) > 64:
        return fname + '(...)'
    return result

Maybe even tokenize the source line the error occurred on and print any
other locals whose name matches any token on the line. I guess I'll
leave it to the bikeshed design committee and say: WIBNI tracebacks
printed relevant frame variables FSDO relevant?
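
One possible answer to the f_locals question is inspect.getargvalues, which
reports the argument names along with the names of the * and ** parameters.
The sketch below is only an illustration: describe_call and its limit
parameter are invented for this example, and the values shown are whatever
the names are bound to when the frame is inspected, not necessarily what was
passed in.

```python
import inspect


def describe_call(frame, limit=64):
    """Return 'name(arg=value, ...)' for a frame, or 'name(...)' if too long."""
    info = inspect.getargvalues(frame)
    # An argument may have been deleted inside the function, so use .get().
    parts = ['%s=%r' % (name, info.locals.get(name, '<deleted>'))
             for name in info.args]
    if info.varargs:
        parts.append('*%s=%r' % (info.varargs, info.locals.get(info.varargs)))
    if info.keywords:
        parts.append('**%s=%r' % (info.keywords, info.locals.get(info.keywords)))
    fname = frame.f_code.co_name
    result = '%s(%s)' % (fname, ', '.join(parts))
    return result if len(result) <= limit else fname + '(...)'


def demo(a, b=2, *rest, flag=False, **extra):
    return describe_call(inspect.currentframe())


description = demo(1, 3, 4, 5, flag=True, x=9)
```

The length cutoff is kept from the toy example so that one huge argument
cannot flood the traceback again.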

On 4/20/2016 2:15 PM, Random832 wrote:
> On Wed, Apr 20, 2016, at 13:57, Koos Zevenhoven wrote:
>> On Wed, Apr 20, 2016 at 8:46 PM, Terry Reedy <tjreedy at> wrote:
>>> On 4/20/2016 6:39 AM, Koos Zevenhoven wrote:
>>>> def fun():
>>>>     fun()
>>>> fun()
>>>> Does that.
>>> The above produces pairs of lines, so I do not understand your point.
>> Strange. for me it produces a repeating single line. Tried both on
>> 2.7.6. and 3.5.1. Probably not worth discussing, though.
> It produces single lines in the interactive interpreter, pairs in a
> file.

Aha.  In IDLE's Shell, which generally simulates the interactive 
interpreter quite well, pairs are printed because interactive user input 
is exec'ed in a process running Python in normal batch mode.  I expect 
any GUI shell to act similarly.

I consider interactive interpreter omission of the line at fault to be a 
design buglet.  For a single line of input, it is not much of a problem 
to look back up.  But if one pastes a 20-line statement, finding line 
13, for instance, is not so quick.  And on Windows, a 1000 line 
traceback may erase the input so there is nothing to look at.

Terry Jan Reedy

On Wed, Apr 20, 2016 at 03:01:48PM -0400, Random832 wrote:

> You know what would make complicated infinite recursion easier to debug?
> The arguments.

Check out the cgitb module, which installs an except hook which (among 
many other things) prints the arguments to the functions. Run this 
sample code:

import cgitb
cgitb.enable(format='text')  # install the except hook, plain-text output

def spam(arg):
    if arg == 0:
        raise ValueError('+++ out of cheese error, redo from start +++')
    return spam(arg - 1)

def eggs(x):
    return spam(x) + 1

def cheese(a, b, c):
    return a or b or eggs(c)

cheese(0, None, 2)

and you will see output something like the following. (For brevity I have 
compressed some of the output.)

Python 3.3.0rc3: /usr/local/bin/python3.3
Thu Apr 21 10:58:22 2016

A problem occurred in a Python script.  Here is the sequence of
function calls leading up to the error, in the order they occurred.

 /home/steve/python/<stdin> in <module>()
 /home/steve/python/<stdin> in cheese(a=0, b=None, c=2)
 /home/steve/python/<stdin> in eggs(x=2)
 /home/steve/python/<stdin> in spam(arg=2)
 /home/steve/python/<stdin> in spam(arg=1)
 /home/steve/python/<stdin> in spam(arg=0)

ValueError: +++ out of cheese error, redo from start +++
    __cause__ = None
    __class__ = <class 'ValueError'>
    __traceback__ = <traceback object>
    args = ('+++ out of cheese error, redo from start +++',)
    with_traceback = <built-in method with_traceback of ValueError object>

The above is a description of an error in a Python program.  Here is
the original traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in cheese
  File "<stdin>", line 2, in eggs
  File "<stdin>", line 4, in spam
  File "<stdin>", line 4, in spam
  File "<stdin>", line 3, in spam
ValueError: +++ out of cheese error, redo from start +++


On Tue, Apr 19, 2016 at 06:28:40PM -0400, Terry Reedy wrote:
> On 4/19/2016 12:18 PM, Steven D'Aprano wrote:
> >I mostly agree with what you say, but I would like to see one change to
> >the default sys.excepthook: large numbers of *identical* traceback lines
> >(as you often get with recursion errors) should be collapsed. For
> >example:
> Tracebacks produce *pairs* of lines: the location and the line itself.

Only if the source code is available. The source isn't available for 
functions defined in the REPL, for C code, or for Python functions 
read from a .pyc file where the .py file is not available.

The code I gave before works with pairs of location + source, without 
any change. If I move the definition of fact into a file, so that the 
source lines are included, we get:

py> sys.setrecursionlimit(10)
py> fact(30)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/steve/python/", line 3, in fact
    return n*fact(n-1)
  [...repeat previous line 7 times...]
  File "/home/steve/python/", line 2, in fact
    if n < 1: return 1
RuntimeError: maximum recursion depth exceeded in comparison

I'd want to adjust the wording. Perhaps "previous entry" rather than 
previous line?

> Replacing pairs with a count of repetitions would not lose information, 
> and would make the info more visible.  I would require at least, say, 3 
> repetitions before collapsing.

That's what my example does: it only collapses the line/(pair of lines) 
if there are at least three identical repetitions.


On Wed, Apr 20, 2016 at 12:07:11AM -0400, Émanuel Barry wrote:
> This idea was mentioned a couple of times in the previous thread, and it
> seems reasonable to me. Getting recursion errors when testing a function in
> the interactive prompt often screwed me over, as I have a limited scrollback
> of 1000 lines at best on Windows (I never checked exactly how high/low the
> limit was), and with Python's recursion limit of 1000, that's a whopping
> 1000 to 2000 lines in my console, effectively killing off anything useful I
> might have wanted to see. Such as, for example, the function definition that
> triggered the exception.

Even if you have an effectively unlimited scrollback, getting thousands 
of identical lines is a PITA and rather distracting and annoying.

> I'm suggesting a change to how Python handles specifically recursion limits,

We should not limit this to specifically *recursion* limits. Despite the
error message printed out, it is not just recursive calls that count
towards the recursion limit. It is any function call.

In practice, though, it is difficult to run into this limit with regular 
non-recursive calls, but not impossible.
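
To illustrate, here is a small sketch in which no function object ever calls
itself, yet the limit is still reached. The name make_chain is made up for
this example. Note that the wrappers all share one code object, so the
resulting traceback would still consist of identical entries.

```python
import sys


def make_chain(depth):
    """Build `depth` nested wrappers around a function returning 0."""
    def base():
        return 0
    func = base
    for _ in range(depth):
        # Bind the current func as a default so each wrapper calls the
        # previous function object, never itself.
        def wrapper(inner=func):
            return inner() + 1
        func = wrapper
    return func


chain = make_chain(sys.getrecursionlimit() + 10)
try:
    chain()
    hit_limit = False
except RecursionError:
    hit_limit = True
```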

> to cut after a sane number of times (someone suggested to shrink the output
> after 3 times), and simply stating how many more identical messages there
> are. 

That would be me :-)

> I would also like to extend this to any arbitrary loop of callers (for
> example foo -> bar -> baz -> foo and so on would be counted as one "call"
> for the purposes of this proposal).

Seems reasonable.

> We can probably afford to cut out immediately as we notice the recursion,

No you can't. A bunch of recursive calls may be followed by non- 
recursive calls, or a different set of recursive calls. It might not 
even be a RuntimeError at the end.

For example, put these four functions in a file, "":

def fact(n):
    if n < 1: return 1
    return n*fact(n-1)

def rec(n):
    if n < 20:
        return spam(n)
    return n + rec(n-1)

def spam(n):
    return eggs(2*n)

def eggs(arg):
    return fact(arg//2)

Now I run this:

import fact

and I get this shorter traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/steve/python/", line 9, in rec
    return n + rec(n-1)
  [...repeat previous line 10 times...]
  File "/home/steve/python/", line 8, in rec
    return spam(n)
  File "/home/steve/python/", line 12, in spam
    return eggs(2*n)
  File "/home/steve/python/", line 15, in eggs
    return fact(arg//2)
  File "/home/steve/python/", line 3, in fact
    return n*fact(n-1)
  [...repeat previous line 18 times...]
  File "/home/steve/python/", line 2, in fact
    if n < 1: return 1
RuntimeError: maximum recursion depth exceeded in comparison

The point being, you cannot just stop processing traces when you find 
one repeated.


Sometimes users inherit from builtin types only to find that their 
overridden methods are not called.  Instead of this being a trap for 
unsuspecting users, I suggest overriding the __new__ method of these types 
so that it will raise with an informative exception explaining that, e.g., 
instead of inheriting from dict, you should inherit from UserDict.

I suggest this modification to any Python implementation that has special 
versions of classes that cannot easily be extended, such as CPython.  If 
another Python implementation allows dict (e.g.) to be extended easily, 
then it doesn't have to raise.
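
For readers who have not run into the trap, a short sketch of the behaviour
in question as it appears on CPython. The class names LoudDict and
LoudUserDict are invented for this illustration.

```python
from collections import UserDict


class LoudDict(dict):
    def __setitem__(self, key, value):
        # Doubles every value stored through item assignment.
        super().__setitem__(key, value * 2)


class LoudUserDict(UserDict):
    def __setitem__(self, key, value):
        super().__setitem__(key, value * 2)


d = LoudDict(a=1)   # dict's constructor bypasses the override
d.update(b=1)       # so does dict.update
d['c'] = 1          # only direct item assignment uses it

u = LoudUserDict(a=1)   # UserDict routes all three through __setitem__
u.update(b=1)
u['c'] = 1
```

On CPython, d ends up as {'a': 1, 'b': 1, 'c': 2}, while u.data is
{'a': 2, 'b': 2, 'c': 2}.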


On 04/20/2016 07:51 PM, Neil Girdhar wrote:

> Sometimes users inherit from builtin types only to find that their
> overridden methods are not called.  Instead of this being a trap for
> unsuspecting users, I suggest overriding the __new__ method of these
> types so that it will raise with an informative exception explaining
> that, e.g., instead of inheriting from dict, you should inherit from
> UserDict.

How, exactly, does that help?

And even if it does help (which I doubt), you're willing to break 
currently working code that successfully subclasses dicts, strings, 
tuples, etc., that then pass through dicts, strings, tuples, etc., to 
build their new class?

No thank you.


Specifically, I'm suggesting something like

class dict:
    def __new__(cls, *args, **kwargs):
        if cls is not dict:
            raise RuntimeError(
                "Cannot inherit from dict; inherit from UserDict instead")
        return super().__new__(cls)

On Wednesday, April 20, 2016 at 10:51:37 PM UTC-4, Neil Girdhar wrote:
> Sometimes users inherit from builtin types only to find that their 
> overridden methods are not called.  Instead of this being a trap for 
> unsuspecting users, I suggest overriding the __new__ method of these types 
> so that it will raise with an informative exception explaining that, e.g., 
> instead of inheriting from dict, you should inherit from UserDict.
> I suggest this modification to any Python implementation that has special 
> versions of classes that cannot easily be extended, such as CPython.  If 
> another Python implementation allows dict (e.g.) to be extended easily, 
> then it doesn't have to raise.
> Best,
> Neil
On Wed, Apr 20, 2016 at 07:51:37PM -0700, Neil Girdhar wrote:
> Sometimes users inherit from builtin types only to find that their 
> overridden methods are not called.  Instead of this being a trap for 
> unsuspecting users, I suggest overriding the __new__ method of these types 
> so that it will raise with an informative exception explaining that, e.g., 
> instead of inheriting from dict, you should inherit from UserDict.


What about those who don't want or need to inherit from UserDict, and 
are perfectly happy with inheriting from dict? Why should we break their 
working code for the sake of people whose code already isn't working?

Not everyone who inherits from dict tries to override __getitem__ and 
__setitem__ and are then surprised that other methods don't call their 
overridden methods. We shouldn't punish them.


I think inheriting directly from dict is simply bad code because CPython
doesn't promise that any of your overridden methods will be called.  The
fact that it silently doesn't call them is an inscrutable trap.  And
really, it's not much of a "punishment" to simply change your base class
name.

On Wed, Apr 20, 2016 at 11:16 PM Steven D'Aprano <steve at> wrote:

> On Wed, Apr 20, 2016 at 07:51:37PM -0700, Neil Girdhar wrote:
> > Sometimes users inherit from builtin types only to find that their
> > overridden methods are not called.  Instead of this being a trap for
> > unsuspecting users, I suggest overriding the __new__ method of these
> types
> > so that it will raise with an informative exception explaining that,
> e.g.,
> > instead of inheriting from dict, you should inherit from UserDict.
> -1
> What about those who don't want or need to inherit from UserDict, and
> are perfectly happy with inheriting from dict? Why should we break their
> working code for the sake of people whose code already isn't working?
> Not everyone who inherits from dict tries to override __getitem__ and
> __setitem__ and are then surprised that other methods don't call their
> overridden methods. We shouldn't punish them.
> --
> Steve
See my second comment

> > We can probably afford to cut out immediately as we notice the
> No you can't. A bunch of recursive calls may be followed by non-
> recursive calls, or a different set of recursive calls. It might not
> even be a RuntimeError at the end.

See my first comment *wink wink*

My proposal is so that we strip out repeated lines to avoid unnecessary
noise, *not* to strip out everything thereafter! I agree that my wording was
misleading, but I do want to keep all relevant information - only to strip
out consecutive and identical lines. If there's a 20-call recursion in the
middle of the stack for some odd reason (call itself with n+1 until n == 20
and then do something else?), some of it would be stripped out but the rest
would remain.

The core of my proposal is to enhance ability to read and debug large
tracebacks, which most of the times will be because of recursive calls.
We're shrinking tracebacks for recursive calls, and if that happens to
shrink other large tracebacks as well, that's a useful side effect and
should be considered, but that's not the core of my idea. If a
one-size-fits-all solution works here, I'll go for that and avoid dumping
yet another bucket of special cases (which aren't special enough to break
the rules) into the code. There are already enough.

> The point being, you cannot just stop processing traces when you find
> one repeated.

Yep, don't want that at all indeed.

Thanks for your input Steven :)


Apart from the fact that a lot of code requiring actual dicts (or str, list?) will break because it is no longer receiving a dict instance (or an instance of one of its subclasses), but a collections.UserDict instance, which is entirely unrelated to dict except that the two have methods with similar names.


As I understand it, you prefer to put the burden of supporting collections.UserDict not on the people subclassing dict, but on the library developers expecting actual dicts. I'm sure there are a couple of built-in functions (written in C) accepting dicts that would not work if they didn't get dicts. And getting built-in functions to accept instances of pure Python classes is nontrivial and tricky at best, and requires a lot more work than would make sense in this case.


I'm -1 on the idea.




From: Neil Girdhar
Sent: Thursday, April 21, 2016 12:34 AM


I think inheriting directly from dict is simply bad code because CPython doesn't promise that any of your overridden methods will be called.  The fact that it silently doesn't call them is an inscrutable trap.  And really, it's not much of a "punishment" to simply change your base class name.

On Thu, Apr 21, 2016 at 2:33 PM, Neil Girdhar <mistersheik at> wrote:
> I think inheriting directly from dict is simply bad code because CPython
> doesn't promise that any of your overridden methods will be called.  The
> fact that it silently doesn't call them is an inscrutable trap.  And really,
> it's not much of a "punishment" to simply change your base class name.

There are way too many cases that work just fine, though.

>>> class AutoCreateDict(dict):
...     def __missing__(self, key):
...         return "<autocreated %r>" % key
>>> acd = AutoCreateDict()
>>> acd["asdf"]
"<autocreated 'asdf'>"

Why should I inherit from UserDict instead? I have to import that from
somewhere (is it in types? collections? though presumably your error
message would tell me that), and then I have to contend with the fact
that my class is no longer a dictionary.

>>> class AutoCreateUserDict(collections.UserDict):
...     def __missing__(self, key):
...         return "<autocreated %r>" % key
>>> acud = AutoCreateUserDict()
>>> acud["qwer"]
"<autocreated 'qwer'>"
>>> json.dumps(acd)
'{}'
>>> json.dumps(acud)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/json/", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/local/lib/python3.6/json/", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/lib/python3.6/json/", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/lib/python3.6/json/", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'AutoCreateUserDict' is not JSON serializable

If you *do* push forward with this proposal, incidentally, I would
recommend not doing it in __new__, but changing it so the class is no
longer subclassable, as per bool:

>>> class X(int): pass
>>> class X(bool): pass
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type 'bool' is not an acceptable base type

But I am firmly -1 on disallowing dict subclasses just because they
aren't guaranteed to call all of your overridden methods.
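The difference described above can be reproduced in a few lines. This is a minimal sketch, assuming CPython 3 and only the standard library; the class names mirror the ones in the transcript, and passing `acud.data` to `json.dumps` is an illustrative workaround, not something proposed in this thread.

```python
import collections
import collections.abc
import json


class AutoCreateDict(dict):
    def __missing__(self, key):
        return "<autocreated %r>" % key


class AutoCreateUserDict(collections.UserDict):
    def __missing__(self, key):
        return "<autocreated %r>" % key


acd = AutoCreateDict(a=1)
acud = AutoCreateUserDict(a=1)

# Both are mappings, but only the dict subclass passes isinstance(..., dict).
is_dict = (isinstance(acd, dict), isinstance(acud, dict))
is_mapping = (
    isinstance(acd, collections.abc.Mapping),
    isinstance(acud, collections.abc.Mapping),
)

# json.dumps accepts the dict subclass directly.
dumped_dict = json.dumps(acd)

# The UserDict is rejected; its underlying plain dict (.data) is accepted.
try:
    json.dumps(acud)
    userdict_rejected = False
except TypeError:
    userdict_rejected = True
dumped_userdict_data = json.dumps(acud.data)
```

Whether a library should check for `dict` or for `collections.abc.Mapping` is exactly the point under dispute here; the sketch only shows what the standard `json` module does today.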


From rosuav at  Thu Apr 21 01:26:52 2016
From: rosuav at (Chris Angelico)
Date: Thu, 21 Apr 2016 15:26:52 +1000
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have
 REPL print less by default)
In-Reply-To: <BLU403-EAS3521232DB83C1793B235993916E0@phx.gbl>
References: <BLU403-EAS22061158D06BC971341ED84916D0@phx.gbl>
Message-ID: <>

On Thu, Apr 21, 2016 at 2:55 PM, Émanuel Barry <vgr255 at> wrote:
> The core of my proposal is to enhance ability to read and debug large
> tracebacks, which most of the times will be because of recursive calls.
> We're shrinking tracebacks for recursive calls, and if that happens to
> shrink other large tracebacks as well, that's a useful side effect and
> should be considered, but that's not the core of my idea.

No problem with that. In my experience, most RecursionErrors come from
*accidental* recursion, which is straight-forwardly infinite and
usually involves a single function. Consider this idiom:

>>> class id(int):
...     _orig_id = id
...     def __new__(cls, obj):
...         return super().__new__(cls, cls._orig_id(obj))
...     def __repr__(self):
...         return hex(self)
>>> obj = object()
>>> obj
<object object at 0x7ffb12290110>
>>> id(obj)
0x7ffb12290110

If I muck something up and accidentally call id() inside the
definition of __new__ (instead of cls._orig_id), it'll end up
infinitely recursing. A traceback shortener that recognizes only the
very simplest forms of repetition would work fine for this case. It
doesn't need a huge amount of intelligence - the traceback would have
a bunch of exactly identical lines, and it _would_ end with one of the
identical ones.
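The "very simplest form" of repetition handling can be sketched as follows. This is only an illustration under stated assumptions: each traceback entry is modelled as a single string, and the names `collapse_repeats` and `keep`, as well as the wording of the summary line, are hypothetical and not taken from any proposal in this thread.

```python
from itertools import groupby


def collapse_repeats(entries, keep=3):
    """Collapse runs of consecutive identical entries.

    The first `keep` copies of each run are kept; the rest are replaced by
    a single summary line. Non-consecutive repeats are left alone.
    """
    out = []
    for entry, run in groupby(entries):
        count = sum(1 for _ in run)
        out.extend([entry] * min(count, keep))
        if count > keep:
            out.append("  [previous entry repeated %d more times]" % (count - keep))
    return out


# A stand-in for an accidental infinite recursion: one frame repeated
# many times, surrounded by unrelated frames.
frames = (
    ['File "<stdin>", line 1, in <module>']
    + ['File "<stdin>", line 4, in __new__'] * 995
    + ['File "<stdin>", line 9, in helper']
)
short = collapse_repeats(frames)
```

Because only consecutive duplicates are collapsed, frames before and after the repeated run survive untouched, which addresses the concern that processing must not simply stop at the first repetition.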


From mistersheik at  Thu Apr 21 01:33:24 2016
From: mistersheik at (Neil Girdhar)
Date: Thu, 21 Apr 2016 05:33:24 +0000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016 at 1:15 AM Chris Angelico <rosuav at> wrote:

> On Thu, Apr 21, 2016 at 2:33 PM, Neil Girdhar <mistersheik at>
> wrote:
> > I think inheriting directly from dict is simply bad code because CPython
> > doesn't promise that any of your overridden methods will be called.  The
> > fact that it silently doesn't call them is an inscrutable trap.  And
> really,
> > it's not much of a "punishment" to simply change your base class name.
> There are way too many cases that work just fine, though.
> >>> class AutoCreateDict(dict):
> ...     def __missing__(self, key):
> ...         return "<autocreated %r>" % key
> ...
> >>> acd = AutoCreateDict()
> >>> acd["asdf"]
> "<autocreated 'asdf'>"
> Why should I inherit from UserDict instead? I have to import that from
> somewhere (is it in types? collections? though presumably your error
> message would tell me that), and then I have to contend with the fact
> that my class is no longer a dictionary.

Of course it's a dictionary.  It's an abc.Mapping, which is all a user of
your class should care about.  After all, it "quacks like a duck", which is
all that matters.

> >>> class AutoCreateUserDict(collections.UserDict):
> ...     def __missing__(self, key):
> ...         return "<autocreated %r>" % key
> ...
> >>> acud = AutoCreateUserDict()
> >>> acud["qwer"]
> "<autocreated 'qwer'>"
> >>> json.dumps(acd)
> '{}'
> >>> json.dumps(acud)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/local/lib/python3.6/json/", line 230, in dumps
>     return _default_encoder.encode(obj)
>   File "/usr/local/lib/python3.6/json/", line 199, in encode
>     chunks = self.iterencode(o, _one_shot=True)
>   File "/usr/local/lib/python3.6/json/", line 257, in iterencode
>     return _iterencode(o, 0)
>   File "/usr/local/lib/python3.6/json/", line 180, in default
>     o.__class__.__name__)
> TypeError: Object of type 'AutoCreateUserDict' is not JSON serializable
That's a fair point, but it seems like a bug in JSON.  They should have
checked if it's an abc.Mapping imho.

> If you *do* push forward with this proposal, incidentally, I would
> recommend not doing it in __new__, but changing it so the class is no
> longer subclassable, as per bool:
> >>> class X(int): pass
> ...
> >>> class X(bool): pass
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: type 'bool' is not an acceptable base type

Good point.

> But I am firmly -1 on disallowing dict subclasses just because they
> aren't guaranteed to call all of your overridden methods.
> ChrisA
On Thu, Apr 21, 2016 at 3:33 PM, Neil Girdhar <mistersheik at> wrote:
>> Why should I inherit from UserDict instead? I have to import that from
>> somewhere (is it in types? collections? though presumably your error
>> message would tell me that), and then I have to contend with the fact
>> that my class is no longer a dictionary.
> Of course it's a dictionary.  It's an abc.Mapping, which is all a user of
> your class should care about.  After all, it "quacks like a duck", which is
> all that matters.

Your first sentence is in conflict with your other statements. My type
is simply *not* a dictionary. You can start raising bug reports all
over the place saying "JSON should look for abc.Mapping rather than
dict", but I'm not even sure that it should.


From mistersheik at  Thu Apr 21 01:57:41 2016
From: mistersheik at (Neil Girdhar)
Date: Thu, 21 Apr 2016 05:57:41 +0000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

Lol, well, I think it should.

Anyway, the cons are that yes some people's bad code won't work.  The pros
are that unsuspecting users won't fall into the trap associated with
inheriting from dict, list or set.  Of course, people on python-ideas know
the pitfalls of inheriting from builtin types.   You're not users that
benefit from this change.  It's people who don't know the problems and have
no way of finding out until they spend a day debugging why overriding some
method on a derived class of a derived class that ultimately inherits from
dict doesn't work.  Experts always want to make their own lives better.
That's the problem with a language made by experts.  New users fall into
traps that you will never fall into (again).

On Thu, Apr 21, 2016 at 1:48 AM Chris Angelico <rosuav at> wrote:

> On Thu, Apr 21, 2016 at 3:33 PM, Neil Girdhar <mistersheik at>
> wrote:
> >> Why should I inherit from UserDict instead? I have to import that from
> >> somewhere (is it in types? collections? though presumably your error
> >> message would tell me that), and then I have to contend with the fact
> >> that my class is no longer a dictionary.
> >
> >
> > Of course it's a dictionary.  It's an abc.Mapping, which is all a user of
> > your class should care about.  After all, it "quacks like a duck", which
> is
> > all that matters.
> Your first sentence is in conflict with your other statements. My type
> is simply *not* a dictionary. You can start raising bug reports all
> over the place saying "JSON should look for abc.Mapping rather than
> dict", but I'm not even sure that it should.
> ChrisA
On 21 April 2016 at 12:51, Neil Girdhar <mistersheik at> wrote:

> Sometimes users inherit from builtin types only to find that their
> overridden methods are not called.  Instead of this being a trap for
> unsuspecting users, I suggest overriding the __new__ method of these types
> so that it will raise with an informative exception explaining that, e.g.,
> instead of inheriting from dict, you should inherit from UserDict.
> I suggest this modification to any Python implementation that has special
> versions of classes that cannot easily be extended, such as CPython.  If
> another Python implementation allows dict (e.g.) to be extended easily,
> then it doesn't have to raise.

Builtins can be extended, you just have to override all the methods where
you want to change the return type:

>>> from collections import defaultdict, Counter, OrderedDict
>>> issubclass(defaultdict, dict)
True
>>> issubclass(Counter, dict)
True
>>> issubclass(OrderedDict, dict)
True

This isn't hard as such, it's just tedious, so it's often simpler to use
the more subclass friendly variants that dynamically look up the type to
return and hence let you get away with overriding a smaller subset of the
methods.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From antony.lee at  Thu Apr 21 02:38:34 2016
From: antony.lee at (Antony Lee)
Date: Wed, 20 Apr 2016 23:38:34 -0700
Subject: [Python-ideas] Why was Path.path added?
Message-ID: <>

Well, it's already there and it's probably too late to remove it, but
introducing the idiom

path_str = getattr(arg, "path", arg)

was not necessary: you could also write

path_str = str(Path(arg))

which works just as well.

My 2c.,


From mistersheik at  Thu Apr 21 03:20:40 2016
From: mistersheik at (Neil Girdhar)
Date: Thu, 21 Apr 2016 07:20:40 +0000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

Good point, I withdraw my suggestion.  I think it's unfortunate for Python
to have traps like this, but I don't see a nice way to protect users while
still letting people do whatever they want.
On Thu, Apr 21, 2016 at 3:03 AM Neil Girdhar <mistersheik at> wrote:

> I guess that's fair.
> On Thu, Apr 21, 2016 at 2:36 AM Nick Coghlan <ncoghlan at> wrote:
>> On 21 April 2016 at 12:51, Neil Girdhar <mistersheik at> wrote:
>>> Sometimes users inherit from builtin types only to find that their
>>> overridden methods are not called.  Instead of this being a trap for
>>> unsuspecting users, I suggest overriding the __new__ method of these types
>>> so that it will raise with an informative exception explaining that, e.g.,
>>> instead of inheriting from dict, you should inherit from UserDict.
>>> I suggest this modification to any Python implementation that has
>>> special versions of classes that cannot easily be extended, such as
>>> CPython.  If another Python implementation allows dict (e.g.) to be
>>> extended easily, then it doesn't have to raise.
>> Builtins can be extended, you just have to override all the methods where
>> you want to change the return type:
>> >>> from collections import defaultdict, Counter, OrderedDict
>> >>> issubclass(defaultdict, dict)
>> True
>> >>> issubclass(Counter, dict)
>> True
>> >>> issubclass(OrderedDict, dict)
>> True
>> This isn't hard as such, it's just tedious, so it's often simpler to use
>> the more subclass friendly variants that dynamically look up the type to
>> return and hence let you get away with overriding a smaller subset of the
>> methods.
>> Cheers,
>> Nick.
>> --
>> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
On Thu, Apr 21, 2016 at 1:58 AM Neil Girdhar <mistersheik at> wrote:

> The pros are that unsuspecting users won't fall into the trap associated
> with inheriting from dict, list or set. ...  It's people who don't know the
> problems and have no way of finding out until they spend a day debugging
> why overriding some method on a derived class of a derived class that
> ultimately inherits from dict doesn't work.

I feel your pain. However, what you see as a failure or defect of the
builtins is actually a feature! :-)

The downstream programmer should not be expected to know the implementation
details of a dict. You don't want to read the source, you just want to use
the public interface. You should be able to subclass and override methods
as you like without worrying about hidden internal relationships. You
should be able to override ``__getitem__`` without accidentally affecting
things like ``values``.

I guess it's a case of "a little knowledge is dangerous". Someone who knows
nothing about dict implementation would not expect to see a change in
method A because of an override of method B. The expert is keeping the
novice safe. It's the journeyman who suffers.

From mistersheik at  Thu Apr 21 03:30:27 2016
From: mistersheik at (Neil Girdhar)
Date: Thu, 21 Apr 2016 07:30:27 +0000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

Wow, I disagree with this.  The fact that you can override getitem and it
doesn't fix initialization in the constructor to match is definitely
confusing.  The fact that __add__ called on a dict subclass doesn't return
a class of the same type is confusing.  This is why every Python book
advises people not to inherit from builtins.

Let's not get carried away.  This is an optimization for CPython -- not a

On Thu, Apr 21, 2016 at 3:27 AM Michael Selik <mike at> wrote:

> On Thu, Apr 21, 2016 at 1:58 AM Neil Girdhar <mistersheik at>
> wrote:
>> The pros are that unsuspecting users won't fall into the trap associated
>> with inheriting from dict, list or set. ...  It's people who don't know the
>> problems and have no way of finding out until they spend a day debugging
>> why overriding some method on a derived class of a derived class that
>> ultimately inherits from dict doesn't work.
> I feel your pain. However, what you see as a failure or defect of the
> builtins is actually a feature! :-)
> The downstream programmer should not be expected to know the
> implementation details of a dict. You don't want to read the source, you
> just want to use the public interface. You should be able to subclass and
> override methods as you like without worrying about hidden internal
> relationships. You should be able to override ``__getitem__`` without
> accidentally affecting things like ``values``.
> I guess it's a case of "a little knowledge is dangerous". Someone who
> knows nothing about dict implementation would not expect to see a change in
> method A because of an override of method B. The expert is keeping the
> novice safe. It's the journeyman who suffers.
On 21 April 2016 at 07:38, Antony Lee <antony.lee at> wrote:
> Well, it's already there and it's probably too late to remove it, but

pathlib is provisional, so it *can* be removed (and indeed probably
will be - see the endless threads here and on python-dev)

> introducing the idiom
> path_str = getattr(arg, "path", arg)
> was not necessary: you could also write
> path_str = str(Path(arg))
> which works just as well.

The latter requires you to import pathlib, which is not Python
2.7-compatible. A minor point, but one reason why the path attribute
offers simpler code. Also, str(Path(arg)) applies an unnecessary call
to Path() if arg is already a path. (And str(arg) is not OK, as it
will also accept things like ints).

If you want the full details, please read the various threads that
have been going on here for about a month. It isn't worth reiterating
the debate again, I'm afraid - most people are fairly burned out on
pathlib discussions by now.


From oscar.j.benjamin at  Thu Apr 21 04:44:53 2016
From: oscar.j.benjamin at (Oscar Benjamin)
Date: Thu, 21 Apr 2016 09:44:53 +0100
Subject: [Python-ideas] Have REPL print less by default
In-Reply-To: <>
References: <>
Message-ID: <>

On 21 April 2016 at 02:03, Steven D'Aprano <steve at> wrote:
> On Wed, Apr 20, 2016 at 03:01:48PM -0400, Random832 wrote:
>> You know what would make complicated infinite recursion easier to debug?
>> The arguments.
> Check out the cgitb module, which installs an except hook which (among
> many other things) prints the arguments to the functions. Run this
> sample code:
> import cgitb
> cgitb.enable(format='text')

Oh that's nice. It would be good if you could use it with -m like pdb

    $ python -m cgitb

rather than needing to insert it in the code.


From k7hoven at  Thu Apr 21 04:49:22 2016
From: k7hoven at (Koos Zevenhoven)
Date: Thu, 21 Apr 2016 11:49:22 +0300
Subject: [Python-ideas] Why was Path.path added?
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016 at 11:07 AM, Paul Moore <p.f.moore at> wrote:
> On 21 April 2016 at 07:38, Antony Lee <antony.lee at> wrote:
>> Well, it's already there and it's probably too late to remove it, but
> pathlib is provisional, so it *can* be removed (and indeed probably
> will be - see the endless threads here and on python-dev)
>> introducing the idiom
>> path_str = getattr(arg, "path", arg)
>> was not necessary: you could also write
>> path_str = str(Path(arg))
>> which works just as well.
> The latter requires you to import pathlib, which is not Python
> 2.7-compatible. A minor point, but one reason why the path attribute
> offers simpler code. Also, str(Path(arg)) applies an unnecessary call
> to Path() if arg is already a path. (And str(arg) is not OK, as it
> will also accept things like ints).
> If you want the full details, please read the various threads that
> have been going on here for about a month. It isn't worth reiterating
> the debate again, I'm afraid - most people are fairly burned out on
> pathlib discussions by now.

The need for .path has been brought up before, although not because
.path was unnecessary, but because we wanted something better. And
this better thing currently goes by the name fspath, so instead of
str(Path(arg)) you should be calling os.fspath(arg), which even works
for third-party path libraries that implement the protocol. Regarding
this, however, we are ceasing the discussions until we have a PEP.

A problem with .path, however, is that there should be only one way to
do it, and if I were to decide, I would probably remove it in
pathlib.PurePath before it's too late.
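For readers on Python 3.6 or later, where the protocol mentioned above eventually shipped as `os.fspath()` and `__fspath__()` (PEP 519), the idiom can be sketched like this. `ThirdPartyPath` and `to_path_str` are hypothetical names used only for illustration.

```python
import os
import pathlib


class ThirdPartyPath:
    """Hypothetical path type from outside the standard library."""

    def __init__(self, text):
        self._text = text

    def __fspath__(self):
        return self._text


def to_path_str(arg):
    # Accepts str, bytes, or any object implementing __fspath__;
    # anything else (an int, for instance) raises TypeError.
    return os.fspath(arg)


from_str = to_path_str("spam/eggs.txt")
from_pathlib = to_path_str(pathlib.PurePosixPath("spam/eggs.txt"))
from_third_party = to_path_str(ThirdPartyPath("spam/eggs.txt"))

try:
    to_path_str(42)
    int_rejected = False
except TypeError:
    int_rejected = True
```

Unlike `str(arg)`, this rejects non-path objects such as ints, and unlike `str(Path(arg))` it does not build an intermediate `Path` object.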


From steve at  Thu Apr 21 07:17:09 2016
From: steve at (Steven D'Aprano)
Date: Thu, 21 Apr 2016 21:17:09 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016 at 04:33:33AM +0000, Neil Girdhar wrote:
> I think inheriting directly from dict is simply bad code because CPython
> doesn't promise that any of your overridden methods will be called.

Python makes the same promise that it makes for *all* classes: if you 
override a method, and call that method on an instance of a subclass, 
the subclass' overridden implementation will be called. Dicts are no 
different from any other class, and have been since version 2.2 when 
types and classes were unified and inheriting from built-ins was first allowed.

In fact, dicts were *explicitly* listed by Guido as one of the built-ins 
which can be subclassed:

  "Let's start with the juiciest bit: you can subtype built-in 
   types like dictionaries and lists."

Your proposal would break at least two standard types, both of which 
subclass dict:

py> from collections import defaultdict, Counter
py> defaultdict.__mro__
(<class 'collections.defaultdict'>, <class 'dict'>, <class 'object'>)
py> Counter.__mro__
(<class 'collections.Counter'>, <class 'dict'>, <class 'object'>)

and it runs counter to a documented feature of dicts, that they can be 
subclassed and given a __missing__ method:

"If a SUBCLASS OF DICT [emphasis added] defines a method __missing__() 
and key is not present, the d[key] operation calls that method ..."

> The fact that it silently doesn't call them is an inscrutable trap.

That is not a fact.

What you should say is:

"The fact that dicts don't obey the undocumented assumptions I made 
about the implementation of methods is a trap."

and I will agree: correct, but it's a trap for hubris and foolishness, 
and the answer to that is, don't make unjustified assumptions about how 
dicts are implemented. And I can say that because I made exactly that 
wrong assumption too.

Nowhere does the documentation say that dict.update calls __setitem__. 
Nowhere does it say that dict.clear calls __delitem__. Nowhere does it 
say that dict.values calls __getitem__. And yet I assumed that they did 
all that.

When somebody makes the unjustified assumption that they do, then they 
will discover that calling the subclass' clear method fails to call the 
overridden __delitem__. It also fails to call the overridden __str__. 
The only difference is that nobody assumes that clear() calls __str__, 
but many people foolishly assume that it calls the __delitem__ method. 
Both assumptions have *exactly* the same justification: none at all.
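The behaviour being described is easy to observe directly. The following is a minimal sketch, assuming CPython (the thread itself stresses that none of this is documented, so other implementations may differ); `LoggingDict` and `LoggingUserDict` are hypothetical names.

```python
import collections


class LoggingDict(dict):
    """dict subclass that records every call to its own __setitem__."""

    def __init__(self, *args, **kwargs):
        self.calls = []
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.calls.append(key)
        super().__setitem__(key, value)


class LoggingUserDict(collections.UserDict):
    """Same idea, built on UserDict instead of dict."""

    def __init__(self, *args, **kwargs):
        self.calls = []
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.calls.append(key)
        super().__setitem__(key, value)


d = LoggingDict(a=1)       # the constructor fills the dict directly
d.update(b=2)              # so does update()
d["c"] = 3                 # only this goes through the override

u = LoggingUserDict(a=1)   # UserDict.__init__ goes through update()
u.update(b=2)              # and update() goes through __setitem__
u["c"] = 3
```

Both objects end up holding the same three keys; they differ only in which code paths consulted the overridden `__setitem__`.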

I made a bunch of stupid, foolish assumptions, based on absolutely 
nothing more than the idea that it stands to reason that dict must be 
implemented in this way, and got bitten. And I deserved it. I was wrong, 
and anyone making the same assumption is wrong.

Well, I say I was "bitten", but that over-dramatises the situation. What 
actually happened was that I wrote a subclass without doing any tests, 
then tested it, and discovered that it didn't work how I expected. I 
spent a few minutes playing with the class, added a few print 
statements, discovered that my assumptions were wrong, and then overrode 
the classes I actually wanted to override.

Twenty minutes of googling and reading the docs convinced me that, no, 
Python doesn't promise that dict.clear calls __delitem__. It would have 
been five minutes except I was especially stubborn and pig-headed that 
day and didn't want to admit that I was in the wrong. But I was.

Getting bitten by this was, in fact, a valuable lesson. I learned what I 
should have already known, what I had *intellectually* known but had 
ignored because "it stands to reason". Namely, if an implementation 
isn't documented, you cannot assume that it works in a particular way.

My sympathy level is zero, and my support for this proposal is negative.


From steve at  Thu Apr 21 07:30:37 2016
From: steve at (Steven D'Aprano)
Date: Thu, 21 Apr 2016 21:30:37 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016 at 04:36:54PM +1000, Nick Coghlan wrote:

> Builtins can be extended, you just have to override all the methods where
> you want to change the return type:

I wonder whether we should have a class decorator which automatically 
adds the appropriate methods?

It would need some sort of introspection to look at the superclass and 
add overridden methods. E.g. if the superclass is float, it would add a 
bunch of methods like:

def __add__(self, other):
    x = super().__add__(other)
    if x is NotImplemented:
        return x
    return type(self)(x)

but only if they aren't already overridden. Then you could do this:

@decorator  # I have no idea what to call it.
class MyInt(int):
    pass

and now MyInt() + MyInt() will return a MyInt, rather than a regular 
int.
Getting the list of dunder methods is easy, but telling whether or not 
they should return an instance of the subclass may not be.



From mike at  Thu Apr 21 07:44:00 2016
From: mike at (Michael Selik)
Date: Thu, 21 Apr 2016 11:44:00 +0000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016 at 7:30 AM Steven D'Aprano <steve at> wrote:

> On Thu, Apr 21, 2016 at 04:36:54PM +1000, Nick Coghlan wrote:
> > Builtins can be extended, you just have to override all the methods where
> > you want to change the return type:
> I wonder whether we should have a class decorator which automatically
> adds the appropriate methods?

If I'm not mistaken, some of the dunders need to be overridden as part of
the class definition and can't be added dynamically without some exec a la
namedtuple. If so, are you still interested in creating that decorator?
On Thu, Apr 21, 2016 at 9:44 PM, Michael Selik <mike at> wrote:
> On Thu, Apr 21, 2016 at 7:30 AM Steven D'Aprano <steve at> wrote:
>> On Thu, Apr 21, 2016 at 04:36:54PM +1000, Nick Coghlan wrote:
>> > Builtins can be extended, you just have to override all the methods
>> > where
>> > you want to change the return type:
>> I wonder whether we should have a class decorator which automatically
>> adds the appropriate methods?
> If I'm not mistaken, some of the dunders need to be overridden as part of
> the class definition and can't be added dynamically without some exec a la
> namedtuple. If so, are you still interested in creating that decorator?

Which ones? __new__ is already covered, and AFAIK all operator dunders
can be injected just fine.

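def operators_return_subclass(cls):
    """Class decorator: make the listed operators return cls, not the base."""
    if len(cls.__bases__) != 1:
        raise ValueError("Need to have exactly one base class")
    base, = cls.__bases__
    def dunder(name):
        def flip_to_subclass(self, other):
            x = getattr(super(cls, self), name)(other)
            # Only convert results of exactly the base type; NotImplemented
            # and other types (such as float) are passed through unchanged.
            if type(x) is base:
                return cls(x)
            return x
        setattr(cls, name, flip_to_subclass)
    # Python 3 has no __div__; "/" is handled by __truediv__.
    for meth in "add", "sub", "mul", "truediv":
        dunder("__" + meth + "__")
    return cls

@operators_return_subclass
class hexint(int):
    def __repr__(self):
        return hex(self)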

I'm not sure what set of dunders should be included though.

I've gone extra strict in this version; for instance, hexint/2 -->
float, not hexint. You could go either way.
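As a self-contained illustration of the "strict" behaviour just described, here is a variant of the same idea. It is a sketch, not the code from the message above: the decorator-factory form, the explicit list of dunder names, and the rule of skipping methods the class already defines are all assumptions made for this example.

```python
def operators_return_subclass(*names):
    """Class decorator factory: wrap the named dunders of the single base
    class so that results of exactly the base type are converted to cls."""
    def decorate(cls):
        (base,) = cls.__bases__  # raises ValueError unless exactly one base

        def wrap(name):
            base_method = getattr(base, name)

            def method(self, other):
                result = base_method(self, other)
                # NotImplemented and other types (e.g. float) pass through.
                return cls(result) if type(result) is base else result

            setattr(cls, name, method)

        for name in names:
            if name not in vars(cls):  # keep explicit overrides
                wrap(name)
        return cls
    return decorate


@operators_return_subclass("__add__", "__sub__", "__mul__", "__truediv__")
class hexint(int):
    def __repr__(self):
        return hex(self)


total = hexint(10) + 5   # int.__add__ returns an int, so it is converted
half = hexint(10) / 2    # int.__truediv__ returns a float, left alone
```

Choosing which dunders belong in the list is the open question raised above; the reflected variants (`__radd__` and friends) are deliberately left out of this sketch.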


From mike at  Thu Apr 21 08:50:36 2016
From: mike at (Michael Selik)
Date: Thu, 21 Apr 2016 12:50:36 +0000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016 at 8:42 AM Chris Angelico <rosuav at> wrote:

> On Thu, Apr 21, 2016 at 9:44 PM, Michael Selik <mike at> wrote:
> > On Thu, Apr 21, 2016 at 7:30 AM Steven D'Aprano <steve at>
> wrote:
> >>
> >> On Thu, Apr 21, 2016 at 04:36:54PM +1000, Nick Coghlan wrote:
> >>
> >> > Builtins can be extended, you just have to override all the methods
> >> > where
> >> > you want to change the return type:
> >>
> >> I wonder whether we should have a class decorator which automatically
> >> adds the appropriate methods?
> >
> >
> > If I'm not mistaken, some of the dunders need to be overridden as part of
> > the class definition and can't be added dynamically without some exec a
> la
> > namedtuple. If so, are you still interested in creating that decorator?
> Which ones? __new__ is already covered, and AFAIK all operator dunders
> can be injected just fine.

I have a vague memory of trouble with __repr__ when I once tried to make a
sort of mutable namedtuple, but now I can't reproduce the issue.
On Thu, Apr 21, 2016, at 07:17, Steven D'Aprano wrote:
> and it runs counter to a documented feature of dicts, that they can be 
> subclassed and given a __missing__ method:
> "If a SUBCLASS OF DICT [emphasis added] defines a method __missing__() 
> and key is not present, the d[key] operation calls that method ..."

So what method should be overridden to make a dict subclass useful as a
class or object dictionary (i.e. for attribute lookup to work with names
that have not been stored with dict.__setitem__)? Overriding __getitem__
or __missing__ doesn't work. My only consolation is that defaultdict
doesn't work either.

I can't even figure out how to get the real class dict, as I would need
if I were overriding __getattribute__ explicitly in the metaclass (which
also doesn't work) - cls.__dict__ returns a mappingproxy.

Alternatively, where, other than object and class dicts, are you
actually required to have a subclass of dict rather than a UserDict or
other duck-typed mapping?

Incidentally, why is __missing__ documented under defaultdict as "in
addition to the standard dict operations"?

From storchaka at  Thu Apr 21 11:52:56 2016
From: storchaka at (Serhiy Storchaka)
Date: Thu, 21 Apr 2016 18:52:56 +0300
Subject: [Python-ideas] Bytecode for calling function with keyword arguments
Message-ID: <nfat0p$84n$>

Currently a function call with keyword arguments (say `f(a, b, x=c, 
y=d)`) is compiled to the following bytecode:

       0 LOAD_NAME                0 (f)
       3 LOAD_NAME                1 (a)
       6 LOAD_NAME                2 (b)
       9 LOAD_CONST               0 ('x')
      12 LOAD_NAME                3 (c)
      15 LOAD_CONST               1 ('y')
      18 LOAD_NAME                4 (d)
      21 CALL_FUNCTION          514 (2 positional, 2 keyword pair)

For every positional argument its value is pushed on the stack, and for 
every keyword argument its name and its value are pushed on the stack. 
But keyword argument names are always constant strings! We can reorder 
the stack, and push keyword argument values and names separately. And we 
can save all keyword argument names for this call in a constant tuple 
and load it with a single bytecode command.

       0 LOAD_NAME                0 (f)
       3 LOAD_NAME                1 (a)
       6 LOAD_NAME                2 (b)
       9 LOAD_NAME                3 (c)
      12 LOAD_NAME                4 (d)
      15 LOAD_CONST               0 (('x', 'y'))
      18 CALL_FUNCTION2           2


Advantages:

1. We save one command for every keyword parameter after the first. This 
saves bytecode size and execution time.

2. Since the number of keyword arguments is obtained from the tuple's size, 
the new CALL_FUNCTION opcode needs only the number of positional arguments. 
Its argument is simpler and needs fewer bits (important for wordcode).

Disadvantages:

1. Increases the number of constants.
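
To make the proposed stack layout concrete, here is a rough pure-Python 
simulation of it. This is only a sketch under assumptions: the helper name 
call_function_kw and the list-as-stack model are made up for illustration 
and are not part of CPython or of any patch.

```python
def call_function_kw(stack, n_positional):
    """Simulate the proposed opcode on a list used as a value stack.

    Assumed layout, bottom to top: the callable, the positional values,
    the keyword values, and finally one constant tuple of keyword names.
    """
    names = stack.pop()                      # the constant tuple, e.g. ('x', 'y')
    n_keywords = len(names)                  # keyword count comes from the tuple
    kw_values = [stack.pop() for _ in range(n_keywords)][::-1]
    pos_values = [stack.pop() for _ in range(n_positional)][::-1]
    func = stack.pop()
    kwargs = dict(zip(names, kw_values))
    stack.append(func(*pos_values, **kwargs))


def f(a, b, x=None, y=None):
    return (a, b, x, y)


# Equivalent of f(1, 2, x=3, y=4) under the assumed layout.
stack = [f, 1, 2, 3, 4, ("x", "y")]
call_function_kw(stack, 2)
result = stack[0]
```

Note that the opcode argument here is only the positional count; the 
number of keyword arguments is never encoded separately.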

From rosuav at  Thu Apr 21 12:00:35 2016
From: rosuav at (Chris Angelico)
Date: Fri, 22 Apr 2016 02:00:35 +1000
Subject: [Python-ideas] Bytecode for calling function with keyword
In-Reply-To: <nfat0p$84n$>
References: <nfat0p$84n$>
Message-ID: <>

On Fri, Apr 22, 2016 at 1:52 AM, Serhiy Storchaka <storchaka at> wrote:
> 2. Since the number of keyword arguments is obtained from tuple's size, new
> CALL_FUNCTION opcode needs only the number of positional arguments. It's
> argument is simpler and needs less bits (important for wordcode).

What about calls that don't include any keyword arguments? Currently,
no pushing is done. Will you have two opcodes, one if there are kwargs
and one if there are not?

Also: How does this interact with **dict calls?


From storchaka at  Thu Apr 21 13:28:39 2016
From: storchaka at (Serhiy Storchaka)
Date: Thu, 21 Apr 2016 20:28:39 +0300
Subject: [Python-ideas] Bytecode for calling function with keyword
In-Reply-To: <>
References: <nfat0p$84n$>
Message-ID: <nfb2k8$81t$>

On 21.04.16 19:00, Chris Angelico wrote:
> On Fri, Apr 22, 2016 at 1:52 AM, Serhiy Storchaka <storchaka at> wrote:
>> 2. Since the number of keyword arguments is obtained from tuple's size, new
>> CALL_FUNCTION opcode needs only the number of positional arguments. It's
>> argument is simpler and needs less bits (important for wordcode).
> What about calls that don't include any keyword arguments? Currently,
> no pushing is done. Will you have two opcodes, one if there are kwargs
> and one if there are not?

Yes, we need either two opcodes, or a flag in the argument. I prefer the 
first approach to keep the argument short and simple.

> Also: How does this interact with **dict calls?

For now we have a separate opcode for calls with **kwargs. We will have 
a corresponding opcode with the new way to provide keyword arguments.

Instead of the 4 current opcodes we will need at most 8 new opcodes. Maybe 
fewer: because calls with *args or **kwargs are less performance critical, 
we can pack arguments in a tuple and a dict by separate opcodes and call 
a function with just a tuple and a dict.

From contrebasse at  Thu Apr 21 15:35:58 2016
From: contrebasse at (Joseph Martinot-Lagarde)
Date: Thu, 21 Apr 2016 19:35:58 +0000 (UTC)
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have
 REPL print less by default)
References: <BLU403-EAS22061158D06BC971341ED84916D0@phx.gbl>
Message-ID: <>

> No problem with that. In my experience, most RecursionErrors come from
> *accidental* recursion, which is straight-forwardly infinite and
> usually involves a single function.
It can come very easily with __getattr__, if __getattr__ itself uses an
undefined attribute. This recursion is usually not what the user wanted! :)
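
A minimal sketch of that failure mode, for readers who have not hit it 
yet. The class name Broken and the attribute _data are invented for the 
example.

```python
class Broken:
    def __getattr__(self, name):
        # self._data was never assigned, so looking it up fails normally
        # and falls back to __getattr__ again, which looks up self._data
        # again, and so on until the recursion limit is reached.
        return self._data[name]


outcome = None
try:
    Broken().anything
except RecursionError as exc:
    outcome = type(exc).__name__
```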

From ethan at  Thu Apr 21 16:15:57 2016
From: ethan at (Ethan Furman)
Date: Thu, 21 Apr 2016 13:15:57 -0700
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have
 REPL print less by default)
In-Reply-To: <>
References: <BLU403-EAS22061158D06BC971341ED84916D0@phx.gbl>
Message-ID: <>

On 04/21/2016 12:35 PM, Joseph Martinot-Lagarde wrote:

>> No problem with that. In my experience, most RecursionErrors come from
>> *accidental* recursion, which is straight-forwardly infinite and
>> usually involves a single function.
> It can come very easily with __getattr__, if __getattr__ itself uses an
> undefined attribute. This recursion is usually not what the user wanted! :)

True, but I'm pretty sure that falls in to the *accidental recursion* 
category.  ;)


From chris.barker at  Thu Apr 21 18:17:32 2016
From: chris.barker at (Chris Barker)
Date: Thu, 21 Apr 2016 15:17:32 -0700
Subject: [Python-ideas] Why was Path.path added?
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016 at 1:49 AM, Koos Zevenhoven <k7hoven at> wrote:

> Regarding
> this, however, we are seizing the discussions until we have a PEP.

I think you meant "ceasing" the discussion -- but in case that was
intentional, I like it! It really did need to be seized!




Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at
On Fri, Apr 22, 2016 at 8:17 AM, Chris Barker <chris.barker at> wrote:
> On Thu, Apr 21, 2016 at 1:49 AM, Koos Zevenhoven <k7hoven at> wrote:
>> Regarding
>> this, however, we are seizing the discussions until we have a PEP.
> I think you meant "ceasing" the discussion -- but case that was intentional,
> I like it! it really did need to be seized!
> :-)

Carpe Discussiem?

about to be slapped with a fish if he makes a 'carp' joke

From tjreedy at  Thu Apr 21 19:13:04 2016
From: tjreedy at (Terry Reedy)
Date: Thu, 21 Apr 2016 19:13:04 -0400
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <nfbmq1$7ti$>

On 4/21/2016 7:30 AM, Steven D'Aprano wrote:

> I wonder whether we should have a class decorator which automatically
> adds the appropriate methods?
> It would need some sort of introspection to look at the superclass and
> adds overridden methods? E.g. if the superclass is float, it would add a
> bunch of methods like:
> def __add__(self, other):
>     x = super().__add__(other)
>     if x is NotImplemented:
>         return x
>     return type(self)(x)
> but only if they aren't already overridden. Then you could do this:
> @decorator  # I have no idea what to call it.
> class MyInt(int):
>     pass
> and now MyInt() + MyInt() will return a MyInt, rather than a regular
> int.
> Getting the list of dunder methods is easy, but telling whether or not
> they should return an instance of the subclass may not be.
> Thoughts?

Interesting idea.  I would start with a TDDed fix_int_subclass, then a 
TDDed fix_list_subclass, and only then think about whether there is 
enough common code to refactor.
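
As a starting point for the int case, something along these lines might 
do. It is only a sketch: the decorator name returns_subclass and the 
hard-coded list of method names are assumptions, and deciding which 
dunders really ought to return the subclass is exactly the open question 
raised above.

```python
# Hypothetical, deliberately short list; a real tool would need more care.
ARITHMETIC_DUNDERS = (
    "__add__", "__radd__", "__sub__", "__rsub__",
    "__mul__", "__rmul__", "__neg__", "__abs__",
)


def returns_subclass(cls):
    """Class decorator: wrap inherited arithmetic dunders so that results
    are converted back to the subclass."""
    base = cls.__mro__[1]  # assumes single inheritance from the builtin

    def make_wrapper(name):
        base_method = getattr(base, name)

        def wrapper(self, *args):
            result = base_method(self, *args)
            if result is NotImplemented:
                return result
            return type(self)(result)

        wrapper.__name__ = name
        return wrapper

    for name in ARITHMETIC_DUNDERS:
        # Only add methods the subclass has not overridden itself.
        if name not in cls.__dict__ and hasattr(base, name):
            setattr(cls, name, make_wrapper(name))
    return cls


@returns_subclass
class MyInt(int):
    pass


total = MyInt(1) + MyInt(2)
mixed = MyInt(1) + 2.5  # int.__add__ gives NotImplemented, float takes over
```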

Terry Jan Reedy

> True, but I'm pretty sure that falls in to the *accidental recursion* 
> category.  ;)
Of course, I just wanted to present a less complicated example (and more
common) than the id example from Chris Angelico.

Anyway, shortening the traceback is useful in all cases, whether the
recursion is intentional or not.

From steve at  Thu Apr 21 22:00:17 2016
From: steve at (Steven D'Aprano)
Date: Fri, 22 Apr 2016 12:00:17 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016 at 09:43:17AM -0400, Random832 wrote:
> On Thu, Apr 21, 2016, at 07:17, Steven D'Aprano wrote:
> > and it runs counter to a documented feature of dicts, that they can be 
> > subclassed and given a __missing__ method:
> > 
> > "If a SUBCLASS OF DICT [emphasis added] defines a method __missing__() 
> > and key is not present, the d[key] operation calls that method ..."
> So what method should be overridden to make a dict subclass useful as a
> class or object dictionary (i.e. for attribute lookup to work with names
> that have not been stored with dict.__setitem__)? 

I don't understand your question. Or rather, if I have understood it, 
the question has a trivial answer: you don't have to override anything.

class MyDict(dict):
    pass

d = MyDict()
d.attr = 1
print(d.attr)
will do what you appear to be asking. If that's not what you actually 
mean, then you need to explain more carefully.

I will also point out that your question is based on a point of 
confusion. You ask:

"So what method should be overridden..."

but the answer to this in general must be "What makes you think a single 
method is sufficient?" This doesn't just apply to dicts, it applies in 
general to any class.

"What method do I override to make a subclass of int implement 
arithmetic with wrap-around (e.g. 999+1 = 0, 501*2 = 2)?"

> Overriding __getitem__
> or __missing__ doesn't work. My only consolation is that defaultdict
> doesn't work either.
> I can't even figure out how to get the real class dict, as I would need
> if I were overriding __getattribute__ explicitly in the metaclass (which
> also doesn't work) - cls.__dict__ returns a mappingproxy.

This doesn't make anything any clearer for me.

> Alternatively, where, other than object and class dicts, are you
> actually required to have a subclass of dict rather than a UserDict or
> other duck-typed mapping?

Anywhere you have to operate with code that does "if isinstance(x, 
dict)" checks.
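
A small illustration of the kind of check meant here. The function 
strict_consumer stands in for third-party code you cannot change; it is 
made up for the example.

```python
from collections import UserDict


def strict_consumer(x):
    # Stand-in for existing code that insists on a real dict.
    if isinstance(x, dict):
        return "accepted"
    return "rejected"


class MyDict(dict):
    pass


subclass_result = strict_consumer(MyDict(a=1))
userdict_result = strict_consumer(UserDict(a=1))
```

The dict subclass passes the check, while UserDict, although it is a 
perfectly good mapping, does not.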

> Incidentally, why is __missing__ documented under defaultdict as "in
> addition to the standard dict operations"?

Because defaultdict provides __missing__ in addition to the standard 
dict operations. Is there a problem with the current docs?
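
For comparison, here is a sketch of the two cases side by side: a plain 
dict subclass that supplies its own __missing__, and defaultdict, which 
ships with one. The class name Placeholder is invented for the example.

```python
from collections import defaultdict


class Placeholder(dict):
    def __missing__(self, key):
        # Called by d[key] when the key is absent; nothing is stored.
        return "<%r>" % key


d = Placeholder(a=1)
present = d["a"]
absent = d["zz"]

dd = defaultdict(list)
dd["k"].append(1)  # defaultdict's __missing__ stores the new list
```

In both cases only the d[key] operation consults __missing__; methods 
such as get() do not.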


From rosuav at  Thu Apr 21 22:13:00 2016
From: rosuav at (Chris Angelico)
Date: Fri, 22 Apr 2016 12:13:00 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 22, 2016 at 12:00 PM, Steven D'Aprano <steve at> wrote:
> On Thu, Apr 21, 2016 at 09:43:17AM -0400, Random832 wrote:
>> On Thu, Apr 21, 2016, at 07:17, Steven D'Aprano wrote:
>> > and it runs counter to a documented feature of dicts, that they can be
>> > subclassed and given a __missing__ method:
>> >
>> > "If a SUBCLASS OF DICT [emphasis added] defines a method __missing__()
>> > and key is not present, the d[key] operation calls that method ..."
>> So what method should be overridden to make a dict subclass useful as a
>> class or object dictionary (i.e. for attribute lookup to work with names
>> that have not been stored with dict.__setitem__)?
> I don't understand your question. Or rather, if I have understood it,
> the question has a trivial answer: you don't have to override anything.
> class MyDict(dict):
>     pass
> d = MyDict()
> d.attr = 1
> print(d.attr)
> will do what you appear to be asking. If that's not what you actually
> mean, then you need to explain more carefully.

"As a class or object dictionary". Consider:

>>> class X: pass
>>> X().__dict__
{}
>>> type(_)
<class 'dict'>

Now, what can I replace that with? A regular dict works fine:

>>> x = X()
>>> x.__dict__ = {"asdf": 1}
>>> x.asdf
1

But defining __missing__ doesn't create attributes automatically:

>>> class AutoCreateDict(dict):
...     def __missing__(self, key):
...         return "<%r>" % key
>>> x.__dict__ = AutoCreateDict()
>>> x.qwer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'X' object has no attribute 'qwer'

And neither does overriding __getitem__:

>>> class OtherAutoCreateDict(dict):
...     def __getitem__(self, key):
...         try: return super().__getitem__(key)
...         except KeyError: return "<%r>" % key
>>> x.__dict__ = OtherAutoCreateDict()
>>> x.zxcv
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'X' object has no attribute 'zxcv'

The question is a fair one. What can you do to make an object's
dictionary provide attributes that weren't put there with __setitem__?


From random832 at  Thu Apr 21 22:35:01 2016
From: random832 at (Random832)
Date: Thu, 21 Apr 2016 22:35:01 -0400
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016, at 22:00, Steven D'Aprano wrote:
> I don't understand your question. Or rather, if I have understood it, 
> the question has a trivial answer: you don't have to override anything.

I thought it went without saying that I want to be able to get items
other than by having them already stored in the dict under the exact
key. Like if I wanted to make a case-insensitive dict, or one where
numbers are equivalent to strings, or have it materialize a default
value for missing keys.

> "So what method should be overridden..."
> but the answer to this in general must be "What makes you think a single 
> method is sufficient?"

Well, I was assuming it doesn't call *more than one* method for the
specific task of looking up an attribute by name, even if there are
other methods for other tasks that I would have to override to get the
whole package of what behavior I want. As near as I can tell, it doesn't
call *any method at all*, but instead reaches inside PyDictObject's
internal structure.

> This doesn't just apply to dicts, it applies in 
> general to any class.
> "What method do I override to make a subclass of int implement 
> arithmetic with wrap-around (e.g. 999+1 = 0, 501*2 = 2)?"

Right, but I'm asking which method *one particular* expression calls.
This is more like "Which method on an object implements the + operator",
only it's "Which method on an object's type's namespace dict implements
the ability to look up attributes on that object?"

In code terms:

class mydict(dict):
    def ????????(self, name):
        if name == 'foo': return 'bar'

mytype = type('C',(),mydict())

=> desired result: mytype().foo == 'bar'

> > I can't even figure out how to get the real class dict, as I would need
> > if I were overriding __getattribute__ explicitly in the metaclass (which
> > also doesn't work) - cls.__dict__ returns a mappingproxy.
> This doesn't make anything any clearer for me.

Given a type, how do I get a reference to the dict instance that was
passed to the type's constructor?

Or, in code terms:

>>> mydict = {}
>>> mytype = type('C',(),mydict)
>>> mytype.__dict__ is mydict
False
>>> type(mytype.__dict__)
<class 'mappingproxy'>

def f(x): ???????? => desired result: f(mytype) is mydict

> > Alternatively, where, other than object and class dicts, are you
> > actually required to have a subclass of dict rather than a UserDict or
> > other duck-typed mapping?
> Anywhere you have to operate with code that does "if isinstance(x, 
> dict)" checks.

Fix the other code, because it's wrong. If you can't, monkey-patch its
builtins.

> > Incidentally, why is __missing__ documented under defaultdict as "in
> > addition to the standard dict operations"?
> Because defaultdict provides __missing__ in addition to the standard 
> dict operations. Is there a problem with the current docs?

When I read it earlier, the wording in defaultdict's documentation
seemed to suggest that what it provides is the ability to define a
__missing__ method and have it be called - and that, itself, *is* a
"standard dict operation" - rather than an implementation of the method.
It looks like I misinterpreted it though.

From vgr255 at  Thu Apr 21 22:38:08 2016
I implemented a small patch for this, over at

Comments and feedback are very much welcome :)


From steve at  Thu Apr 21 23:49:57 2016
From: steve at (Steven D'Aprano)
Date: Fri, 22 Apr 2016 13:49:57 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016 at 10:35:01PM -0400, Random832 wrote:
> On Thu, Apr 21, 2016, at 22:00, Steven D'Aprano wrote:
> > I don't understand your question. Or rather, if I have understood it, 
> > the question has a trivial answer: you don't have to override anything.
> I thought it went without saying 

Obviously not.

> that I want to be able to get items
> other than by having them already stored in the dict under the exact
> key. 

"Items"? Like items in a dict? Then you already have at least two ways: 
you override __getitem__, or add a __missing__ method to the dict.

What makes you think these techniques don't work?

> Like if I wanted to make a case-insensitive dict, or one where
> numbers are equivalent to strings, or have it materialize a default
> value for missing keys.
> > "So what method should be overridden..."
> > 
> > but the answer to this in general must be "What makes you think a single 
> > method is sufficient?"
> Well, I was assuming it doesn't call *more than one* method for the
> specific task of looking up an attribute by name, 

Where does attribute lookup come into this?

> even if there are
> other methods for other tasks that I would have to override to get the
> whole package of what behavior I want. As near as I can tell, it doesn't
> call *any method at all*, but instead reaches inside PyDictObject's
> internal structure.

Are you still talking about *attribute lookup* or are you back to *key 
lookup*?

Honestly Random, you have to be clear as to which you want, because they 
are different things. You can't just jump backwards and forwards 
between talking about dicts and attributes and expect people to 
understand what you mean.

Classes and instances may not have a __dict__ at all, if they define 
__slots__ instead, or if they are builtins:

py> (1).__dict__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute '__dict__'

> > This doesn't just apply to dicts, it applies in 
> > general to any class.
> > 
> > "What method do I override to make a subclass of int implement 
> > arithmetic with wrap-around (e.g. 999+1 = 0, 501*2 = 2)?"
> Right, but I'm asking which method *one particular* expression calls.
> This is more like "Which method on an object implements the + operator",

An excellent example, because the answer is *two* methods, __add__ and 
__radd__. You've made my point for me again.
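
To illustrate with the wrap-around example, here is a rough sketch. The 
class name Wrap1000 and the fixed modulus are assumptions made for the 
example, and only + and * are covered, which already takes four methods.

```python
class Wrap1000(int):
    """Sketch of an int whose + and * wrap around at 1000."""

    MODULUS = 1000

    def __new__(cls, value=0):
        return super().__new__(cls, int(value) % cls.MODULUS)

    def __add__(self, other):
        result = int.__add__(self, other)
        if result is NotImplemented:
            return result
        return type(self)(result)

    # Needed so that 1 + Wrap1000(999) also wraps.
    __radd__ = __add__

    def __mul__(self, other):
        result = int.__mul__(self, other)
        if result is NotImplemented:
            return result
        return type(self)(result)

    __rmul__ = __mul__


left = Wrap1000(999) + 1
right = 1 + Wrap1000(999)
product = Wrap1000(501) * 2
```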

> only it's "Which method on an object's type's namespace dict implements
> the ability to look up attributes on that object?"

What makes you think that such a method exists?

Have you read the rest of this thread? If not, I suggest you do, because 
the thread is all about making unjustified assumptions about how objects 
are implemented. You are doing it again. Where is it documented that 
there is a method on the object __dict__ (if such a __dict__ even 
exists!) that does this?

So you are asking the wrong question. Don't ask about an implementation 
detail that you *assume* exists. You should ask, not "which method", but 
"how do I customise attribute lookup?"

And you know the standard answer to that: override __getattribute__ or 
__getattr__. Do they not solve your problem?
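
For example, something like the following, with the class name AutoAttr 
made up for illustration, gives you attributes that were never stored 
anywhere:

```python
class AutoAttr:
    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        if name.startswith("__") and name.endswith("__"):
            # Leave special names alone so protocols that probe for
            # dunders keep behaving normally.
            raise AttributeError(name)
        return "<%r>" % name


obj = AutoAttr()
obj.real = 1
stored = obj.real
synthesized = obj.qwer
```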

Another smart question might be, what are the constraints and rules for 
customizing behaviour by setting the object __dict__ to something other 
than a regular builtin dict? I don't know the answer to that.

But don't make assumptions about the implementation, and having made 
those assumptions, assume that everyone else shares them and that they 
"go without saying". *Especially not* in a thread that talks about how 
silly it is to make those assumptions.

> Given a type, how do I get a reference to the dict instance that was
> passed to the type's constructor?

I believe that the answer to that is, you can't.

> > > Alternatively, where, other than object and class dicts, are you
> > > actually required to have a subclass of dict rather than a UserDict or
> > > other duck-typed mapping?
> > 
> > Anywhere you have to operate with code that does "if isinstance(x, 
> > dict)" checks.
> Fix the other code, because it's wrong. If you can't, monkey-patch its
> builtins.

You cannot assume that it is "wrong" just because it is inconvenient for 
you. Do you seriously think that monkey-patching the built-ins in 
production code is a good idea?


From random832 at  Fri Apr 22 00:45:34 2016
From: random832 at (Random832)
Date: Fri, 22 Apr 2016 00:45:34 -0400
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016, at 23:49, Steven D'Aprano wrote:
> "Items"? Like items in a dict? Then you already have at least two ways: 
> you override __getitem__, or add a __missing__ method to the dict.
> What makes you think these techniques don't work?

The fact that I tried them and they didn't work.

> Where does attribute lookup come into this?

Because looking up an attribute implies getting an item from the object
or class's __dict__ with a string key of the name of the attribute (see
below for the documented basis for this assumption). How are you not
following this? I don't believe you're not messing with me. 

> Classes and instances may not have a __dict__ at all, if they define 
> __slots__ instead, or if they are builtins:

If a class defines __slots__, the *instances* don't have a __dict__, but
the *class* sure as hell still does. Even, as it turns out, if the
metaclass had __slots__. (Not sure what's up with that, actually).

> What makes you think that such a method exists?
> Have you read the rest of this thread? If not, I suggest you do, because 
> the thread is all about making unjustified assumptions about how objects 
> are implemented. You are doing it again. Where is it documented that 
> there is a method on the object __dict__ (if such a __dict__ even 
> exists!) that does this?

My main concern here is the class dict. So, let's see...

### Class attribute references are translated to lookups in this
dictionary, e.g., C.x is translated to C.__dict__["x"]

Now, that is technically true (C.__dict__ is, we've established, not the
actual dict, but a "mappingproxy" object), but this behavior itself
contradicts the documentation:

### [type] With three arguments, [...] and the dict dictionary is the
namespace containing definitions for class body and becomes the __dict__
attribute.

Except, it *doesn't* become the __dict__ attribute - its contents are
*copied* into the __dict__ object, which is a new "mappingproxy" whose
contents will not reflect further updates to the dict that was passed
in.

And regarding the object __dict__, when such a __dict__ *does* exist
(since, unlike class dicts, you actually can set object dicts to be
arbitrary dict subclasses)

### The default behavior for attribute access is to get, set, or delete
the attribute from an object's dictionary. For instance, a.x has a
lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'],
and continuing through the base classes of type(a) excluding
metaclasses.

> > Given a type, how do I get a reference to the dict instance that was
> > passed to the type's constructor?
> I believe that the answer to that is, you can't.

I hadn't yet realized, when I asked this question, that the class
__dict__ "mappingproxy" is a new object (that isn't even a dict! "A
class has a namespace implemented by a dictionary object" was also
wrong) and doesn't retain a reference to the passed-in dict nor reflect
changes to it.
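
A short sketch of what I observed, in case anyone wants to reproduce it 
(the names namespace and C are arbitrary):

```python
import types

namespace = {"x": 1}
C = type("C", (), namespace)

# Later changes to the dict that was passed in are not seen by the class.
namespace["y"] = 2
sees_y = hasattr(C, "y")

# The class __dict__ is a read-only proxy, not the original dict.
proxy = C.__dict__
is_same_object = proxy is namespace

# Setting attributes on the class does show up through the proxy.
C.z = 3
```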

From ben+python at  Fri Apr 22 01:17:34 2016
From: ben+python at (Ben Finney)
Date: Fri, 22 Apr 2016 15:17:34 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
References: <>
Message-ID: <>

Random832 <random832 at> writes:

> On Thu, Apr 21, 2016, at 23:49, Steven D'Aprano wrote:
> > Where does attribute lookup come into this?
> Because looking up an attribute implies getting an item from the
> object or class's __dict__ with a string key of the name of the
> attribute (see below for the documented basis for this assumption).

No, that's not what it implies. The ‘__dict__’ of an object is an
implementation detail, and is not necessarily used for attribute lookup.

As the documentation says:

    A class has a namespace implemented by a dictionary object. Class
    attribute references are translated to lookups in this dictionary,
    e.g., C.x is translated to C.__dict__["x"] (although there are a
    number of hooks which allow for other means of locating attributes).


So no, attribute lookup does not imply getting an item from any
particular dictionary. Nothing in the documentation implies that; if
you find an exception, please file a bug report for that part of the
documentation.

> How are you not following this? I don't believe you're not messing
> with me.

Steven follows quite well; he is trying to get you to explain your
meaning so we can find where the mismatch is.

> > > Given a type, how do I get a reference to the dict instance that
> > > was passed to the type's constructor?
> > 
> > I believe that the answer to that is, you can't.
> I hadn't yet realized, when I asked this question, that the class
> __dict__ "mappingproxy" is a new object (that isn't even a dict! "A
> class has a namespace implemented by a dictionary object" was also
> wrong) and doesn't retain a reference to the passed-in dict nor
> reflect changes to it.

That's right. As documented, attribute lookup does not imply any
dictionary lookup for the attribute.

Attribute lookup by name is one thing. Dictionary lookup by key is quite
a different thing. Please let's keep the two distinct in any discussions.

 \        “Ubi dubium, ibi libertas.” (“Where there is doubt, there is |
  `\                                                        freedom.”) |
_o__)                                                                  |
Ben Finney

From greg.ewing at  Fri Apr 22 01:39:16 2016
From: greg.ewing at (Greg Ewing)
Date: Fri, 22 Apr 2016 17:39:16 +1200
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

Random832 wrote:
> If a class defines __slots__, the *instances* don't have a __dict__, but
> the *class* sure as hell still does. Even, as it turns out, if the
> metaclass had __slots__. (Not sure what's up with that, actually).

It's because metaclasses are all based on type, whose instances
have a dict. An object will always have a dict if any of its base
classes cause it to have a dict.
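
A quick sketch of both situations. The class names are invented, and the 
metaclass uses an empty __slots__ here.

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Meta(type):
    # A metaclass that declares (empty) __slots__ of its own.
    __slots__ = ()


class C(metaclass=Meta):
    pass


instance_has_dict = hasattr(Point(1, 2), "__dict__")
class_has_dict = hasattr(Point, "__dict__")
meta_instance_has_dict = hasattr(C, "__dict__")
```

Instances of Point have no __dict__, but Point itself does, and so does 
C, because every class is an instance of something derived from type.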


From rosuav at  Fri Apr 22 02:06:39 2016
From: rosuav at (Chris Angelico)
Date: Fri, 22 Apr 2016 16:06:39 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
On Fri, Apr 22, 2016 at 3:17 PM, Ben Finney <ben+python at> wrote:
> Random832 <random832 at> writes:
>> On Thu, Apr 21, 2016, at 23:49, Steven D'Aprano wrote:
>> > Where does attribute lookup come into this?
>> Because looking up an attribute implies getting an item from the
>> object or class's __dict__ with a string key of the name of the
>> attribute (see below for the documented basis for this assumption).
> No, that's not what it implies. The ‘__dict__’ of an object is an
> implementation detail, and is not necessarily used for attribute lookup.
> As the documentation says:
>     A class has a namespace implemented by a dictionary object. Class
>     attribute references are translated to lookups in this dictionary,
>     e.g., C.x is translated to C.__dict__["x"] (although there are a
>     number of hooks which allow for other means of locating attributes).
>     <URL:>

Okay. That definitely implies that the class itself has a real dict,
though. And it should be possible, using a metaclass, to manipulate
that, right?

>>> class AutoCreateDict(dict):
...     def __missing__(self, key):
...         return "<%r>" % key
...     def __repr__(self):
...         return "AutoCreateDict" + super().__repr__()
>>> class DemoMeta(type):
...     def __new__(*a):
...         print("new:", a)
...         return type.__new__(*a)
...     @classmethod
...     def __prepare__(*a):
...         print("prepare:", a)
...         return AutoCreateDict()
>>> class Demo(metaclass=DemoMeta): pass
prepare: (<class '__main__.DemoMeta'>, 'Demo', ())
new: (<class '__main__.DemoMeta'>, 'Demo', (),
AutoCreateDict{'__qualname__': 'Demo', '__module__': "<'__name__'>"})
>>> Demo.asdf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'Demo' has no attribute 'asdf'
>>> Demo().asdf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Demo' object has no attribute 'asdf'
>>> Demo.__dict__
mappingproxy({'__dict__': <attribute '__dict__' of 'Demo' objects>,
'__doc__': None, '__weakref__': <attribute '__weakref__' of 'Demo'
objects>, '__module__': "<'__name__'>"})
>>> Demo().__dict__
{}

Maybe I'm just misunderstanding how metaclasses should be written, but
this seems like it ought to work. And it's not making any use of the
AutoCreateDict - despite __prepare__ and __new__ clearly being called,
the latter with an AutoCreateDict instance. But by the time it gets
actually attached to the dictionary, we have a mappingproxy that
ignores __missing__. The docs say "are translated to", implying that
this will actually be equivalent.

Using a non-dict-subclass results in prompt rejection in the type() constructor:

>>> class DictLike:
...     def __getitem__(self, key):
...         print("DictLike get", key)
...         return "<%r>" % key
...     def __setitem__(self, key, value):
...         print("DictLike set", key, value)
>>> class Demo(metaclass=DemoMeta): pass
prepare: (<class '__main__.DemoMeta'>, 'Demo', ())
DictLike get __name__
DictLike set __module__ <'__name__'>
DictLike set __qualname__ Demo
new: (<class '__main__.DemoMeta'>, 'Demo', (), <__main__.DictLike
object at 0x7f2b41b96ac8>)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in __new__
TypeError: type() argument 3 must be dict, not DictLike

Is there any way to make use of this documented transformation?


From niki.spahiev at  Fri Apr 22 02:49:25 2016
On 21.04.2016 20:28, Serhiy Storchaka wrote:
> On 21.04.16 19:00, Chris Angelico wrote:
>> On Fri, Apr 22, 2016 at 1:52 AM, Serhiy Storchaka
>> <storchaka at> wrote:
>>> 2. Since the number of keyword arguments is obtained from tuple's
>>> size, new
>>> CALL_FUNCTION opcode needs only the number of positional arguments. It's
>>> argument is simpler and needs less bits (important for wordcode).
>> What about calls that don't include any keyword arguments? Currently,
>> no pushing is done. Will you have two opcodes, one if there are kwargs
>> and one if there are not?
> Yes, we need either two opcodes, or a flag in the argument. I prefer the
> first approach to keep the argument short and simple.
>> Also: How does this interact with **dict calls?
> For now we have separate opcode for calls with **kwargs. We will have
> corresponding opcode with new way to provide keyword arguments.
> Instead of 4 current opcodes we will need at most 8 new opcodes. Maybe
> fewer, because calls with *args or **kwargs are less performance critical,
> we can pack arguments in a tuple and a dict by separate opcodes and call
> a function just with a tuple and a dict.

We can limit the second argument (number of kw args) to 0 or 1 only, with
1 meaning a single tuple with the names.
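
To check how a given interpreter compiles a call with keyword
arguments today, the standard dis module can list the opcodes. This is
only an inspection sketch: the expression and the name `opnames` are
made up for illustration, and the opcodes differ between CPython
versions, so no particular output is assumed.

```python
import dis

# Compile a call with two positional and two keyword arguments.
code = compile("f(a, b, x=1, y=2)", "<demo>", "eval")

# Collect the opcode names; which call opcodes appear depends on the
# CPython version running this.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)

# Constants stored with the code object. An interpreter that passes
# the keyword names as one tuple would keep that tuple here.
print(code.co_consts)
```

Comparing the two printed values across interpreter versions shows
whether keyword names travel as separate stack items or as one tuple.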


From steve at  Fri Apr 22 07:03:35 2016
From: steve at (Steven D'Aprano)
Date: Fri, 22 Apr 2016 21:03:35 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 22, 2016 at 03:17:34PM +1000, Ben Finney wrote:
> Random832 <random832 at> writes:
> > On Thu, Apr 21, 2016, at 23:49, Steven D'Aprano wrote:
> > > Where does attribute lookup come into this?
> >
> > Because looking up an attribute implies getting an item from the
> > object or class's __dict__ with a string key of the name of the
> > attribute (see below for the documented basis for this assumption).
> No, that's not what it implies. The '__dict__' of an object is an
> implementation detail, and is not necessarily used for attribute lookup.

I think it's a bit more than an implementation detail. For example, I 
don't think it would be legitimate for (let's say) FooPython to decide 
to use instance.__tree__ (a binary tree) for instance attribute 

But I think that the details are far more complex than "attribute 
lookups are done by key access to a dict __dict__" (for example, if 
attribute lookups are done by inspecting obj.__dict__, how do you look 
up obj.__dict__?). And I don't think that the docs suggest that they are 
replaceable by dict subclasses.

> > How are you not following this? I don't believe you're not messing
> > with me.
> Steven follows quite well; he is trying to get you to explain your
> meaning so we can find where the mismatch is.

Not quite: my first email was genuinely confused. When I said I didn't 
understand what Random was trying to say, I meant it. By my second 
email, or at least the end of it, I could guess what he was trying to 
do, which (I think) is to replace __dict__ with a subclass instance in 
order to customise attribute lookup.


From steve at  Fri Apr 22 07:16:52 2016
From: steve at (Steven D'Aprano)
Date: Fri, 22 Apr 2016 21:16:52 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 22, 2016 at 12:45:34AM -0400, Random832 wrote:

> My main concern here is the class dict. So, let's see...
> ### Class attribute references are translated to lookups in this
> dictionary, e.g., C.x is translated to C.__dict__["x"]
> Now, that is technically true (C.__dict__ is, we've established, not the
> actual dict, but a "mappingproxy" object), but this behavior itself
> contradicts the documentation:
> ### [type] With three arguments, [...] and the dict dictionary is the
> namespace containing definitions for class body and becomes the __dict__
> attribute.
> Except, it *doesn't* become the __dict__ attribute - its contents are
> *copied* into the __dict__ object, which is a new "mappingproxy" whose
> contents will not reflect further updates to the dict that was passed
> in.

That is a very good point. I think that's a documentation bug.

> And regarding the object __dict__, when such a __dict__ *does* exist
> (since, unlike class dicts, you actually can set object dicts to be
> arbitrary dict subclasses)

True, but the documentation doesn't say that attribute lookup goes 
through the *full* dict key lookup, including support of __missing__ and 
__getitem__. I'll grant you that neither does the documentation say that 
it doesn't, so I'd call this a documentation bug.


From ethan at  Fri Apr 22 07:51:47 2016
From: ethan at (Ethan Furman)
Date: Fri, 22 Apr 2016 04:51:47 -0700
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/21/2016 11:06 PM, Chris Angelico wrote:
> On Fri, Apr 22, 2016 at 3:17 PM, Ben Finney wrote:
>> Random832 writes:

>>> Because looking up an attribute implies getting an item from the
>>> object or class's __dict__ with a string key of the name of the
>>> attribute (see below for the documented basis for this assumption).
>> No, that's not what it implies. The '__dict__' of an object is an
>> implementation detail, and is not necessarily used for attribute lookup.
>> As the documentation says:
>>      A class has a namespace implemented by a dictionary object. Class
>>      attribute references are translated to lookups in this dictionary,
>>      e.g., C.x is translated to C.__dict__["x"] (although there are a
>>      number of hooks which allow for other means of locating attributes).
>>      <URL:>
> Okay. That definitely implies that the class itself has a real dict,
> though.

A class does have a real dict() -- not a subclass, not a look-a-like, 
but a real, honest-to-goodness, freshly minted {}.

> And it should be possible, using a metaclass, to manipulate
> that, right?

Only for the purposes of class creation.  One of the final steps of 
creating a class (after __prepare__ and __new__ have been called) is to 
copy everything from whatever dict (sub)class you used into a brand-new 

> Maybe I'm just misunderstanding how metaclasses should be written, but
> this seems like it ought to work.

You have the metaclass portion right.

> And it's not making any use of the
> AutoCreateDict - despite __prepare__ and __new__ clearly being called,
> the latter with an AutoCreateDict instance.

Well, it would if you tried creating any attributes without values 
during class creation.

> But by the time it gets
> actually attached to the dictionary, we have a mappingproxy that
> ignores __missing__.

No, it's a dict -- we just don't get to directly fiddle with it (hence 
the return of a mappingproxy).

> The docs say "are translated to", implying that
> this will actually be equivalent.

> Is there any way to make use of this documented transformation?

There are already many hooks in place to supplement or replace attribute 
lookup -- use them instead.  :)
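
As one illustration of those hooks, a metaclass can define
__getattr__, which is called only after normal class attribute lookup
has failed. The names AutoAttrMeta and Demo and the placeholder format
are invented for this sketch; it is not code from the thread.

```python
class AutoAttrMeta(type):
    """Hypothetical metaclass: unknown class attributes get a placeholder."""

    def __getattr__(cls, name):
        # Reached only when the normal lookup on the class fails.
        if name.startswith("__") and name.endswith("__"):
            # Leave special names alone so protocol checks still fail cleanly.
            raise AttributeError(name)
        return "<%s>" % name


class Demo(metaclass=AutoAttrMeta):
    x = 1


print(Demo.x)        # found normally, __getattr__ is not called
print(Demo.missing)  # falls back to AutoAttrMeta.__getattr__
```

The hook applies to lookups on the class itself; instances of Demo
still raise AttributeError for unknown names.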


From ethan at  Fri Apr 22 07:58:53 2016
From: ethan at (Ethan Furman)
Date: Fri, 22 Apr 2016 04:58:53 -0700
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On 04/22/2016 04:16 AM, Steven D'Aprano wrote:
> On Fri, Apr 22, 2016 at 12:45:34AM -0400, Random832 wrote:

>> ### Class attribute references are translated to lookups in this
>> dictionary, e.g., C.x is translated to C.__dict__["x"]
>> Now, that is technically true (C.__dict__ is, we've established, not the
>> actual dict, but a "mappingproxy" object), but this behavior itself
>> contradicts the documentation:

No, it's a real dict() -- we just don't get direct access to it anymore.

>> ### [type] With three arguments, [...] and the dict dictionary is the
>> namespace containing definitions for class body and becomes the __dict__
>> attribute.
>> Except, it *doesn't* become the __dict__ attribute - its contents are
>> *copied* into the __dict__ object, which is a new "mappingproxy" whose
>> contents will not reflect further updates to the dict that was passed
>> in.

See above about the 'mappingproxy'.  As for updates, IIRC setattr() is 
the way to make changes these days.

> That is a very good point. I think that's a documentation bug.

Patches welcome.  ;)


From ncoghlan at  Fri Apr 22 09:24:10 2016
From: ncoghlan at (Nick Coghlan)
Date: Fri, 22 Apr 2016 23:24:10 +1000
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On 22 April 2016 at 16:06, Chris Angelico <rosuav at> wrote:

> Maybe I'm just misunderstanding how metaclasses should be written, but
> this seems like it ought to work. And it's not making any use of the
> AutoCreateDict - despite __prepare__ and __new__ clearly being called,
> the latter with an AutoCreateDict instance. But by the time it gets
> actually attached to the dictionary, we have a mappingproxy that
> ignores __missing__. The docs say "are translated to", implying that
> this will actually be equivalent.

The mapping used during class body execution can be customised via
__prepare__, but the resulting contents of that mapping are still copied to
a regular dictionary when constructing the class object itself.

We deliberately don't provide a mechanism to customise the runtime
dictionary used by object instances, regardless of whether they're normal
instances or type definitions. In combination with the __dict__ descriptor
only exposing a mapping proxy, this ensures that all Python level
modifications to the contents go through the descriptor machinery - you
can't get your hands on a mutable pointer to the post-creation namespace.
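
A minimal sketch of the two phases described above, assuming CPython.
The class names and the placeholder format are made up for
illustration: the dict subclass returned by __prepare__ is used while
the class body runs, but the finished class exposes only a
mappingproxy over a plain copy.

```python
import types


class AutoCreateDict(dict):
    """Hypothetical namespace: unknown names evaluate to a placeholder."""

    def __missing__(self, key):
        if key.startswith("__"):
            # Let names such as __name__ fall through to globals.
            raise KeyError(key)
        return "<%s>" % key


class DemoMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        return AutoCreateDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        return super().__new__(mcls, name, bases, namespace)


class Demo(metaclass=DemoMeta):
    # Not defined anywhere: resolved by __missing__ during the class body.
    x = undefined_name


print(Demo.x)                                            # value from __missing__
print(isinstance(Demo.__dict__, types.MappingProxyType))  # read-only proxy
print(hasattr(Demo, "y"))                                # __missing__ not consulted
```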

It looks like there *is* a missing detail in the data model docs in
relation to this, though:
should state explicitly that the namespace contents are copied to a plain
dict (which is then never exposed directly to Python code), but it doesn't.


P.S. I actually played around with an experimental interpreter build that
dropped the copy-to-a-new-namespace step back when I was working on It's
astonishingly broken in the number of ways it offers to corrupt the
interpreter state (since you can entirely bypass the descriptor machinery,
which the rest of the interpreter expects to be impossible if you're not
messing about with ctypes or C extensions), but kinda fun in the quirky
action at a distance it makes possible :)

Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
On Fri, Apr 22, 2016 at 11:24 PM, Nick Coghlan <ncoghlan at> wrote:
> It looks like there *is* a missing detail in the data model docs in relation
> to this, though:
> should state explicitly that the namespace contents are copied to a plain
> dict (which is then never exposed directly to Python code), but it doesn't.

Okay, that explains my confusion, at least :) I can't think of any
situations where I actually *want* a dict subclass (__getattr[ibute]__
or descriptors can do everything I can come up with), but it'd be good
to document that that's not possible.


From ericsnowcurrently at  Fri Apr 22 15:19:59 2016
From: ericsnowcurrently at (Eric Snow)
Date: Fri, 22 Apr 2016 13:19:59 -0600
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Apr 22, 2016 at 5:16 AM, Steven D'Aprano <steve at> wrote:
> On Fri, Apr 22, 2016 at 12:45:34AM -0400, Random832 wrote:
>> ### [type] With three arguments, [...] and the dict dictionary is the
>> namespace containing definitions for class body and becomes the __dict__
>> attribute.
>> Except, it *doesn't* become the __dict__ attribute - its contents are
>> *copied* into the __dict__ object, which is a new "mappingproxy" whose
>> contents will not reflect further updates to the dict that was passed
>> in.
> That is a very good point. I think that's a documentation bug.

mappingproxy (a.k.a. types.MappingProxyType) is exactly what it says:
a proxy for a mapping.  Basically, it is a wrapper around a

The namespace (a dict or dict sub-class, e.g. OrderedDict) passed to
type() is copied into a new dict and the new type's __dict__ is set to
a mappingproxy that wraps that copy.

So if there is a documentation bug then it is the ambiguity of the
word "becomes".  Perhaps it would be more correct as "is copied into".
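
The copy can be observed directly with the three-argument form of
type(). This is a small sketch with made-up names (namespace, C), not
code from the thread.

```python
namespace = {"x": 1}
C = type("C", (), namespace)

# Later changes to the original dict are not visible on the class.
namespace["y"] = 2
print(hasattr(C, "y"))

# Changes made through setattr() land in the class's own dict,
# not in the dict that was passed in.
setattr(C, "z", 3)
print("z" in C.__dict__, "z" in namespace)

# The mappingproxy itself rejects item assignment.
try:
    C.__dict__["w"] = 4
except TypeError as exc:
    print("TypeError:", exc)
```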

>> And regarding the object __dict__, when such a __dict__ *does* exist
>> (since, unlike class dicts, you actually can set object dicts to be
>> arbitrary dict subclasses)
> True, but the documentation doesn't say that attribute lookup goes
> through the *full* dict key lookup, including support of __missing__ and
> __getitem__. I'll grant you that neither does the documentation say that
> it doesn't, so I'd call this a documentation bug.

What would you say is the specific documentation bug?  That the
default attribute lookup (object.__getattribute__()) does not use the
obj.__dict__ attribute but rather the dict it points to (if it knows
about it)?  Or just that anything set to __dict__ is not guaranteed to
be honored by the default __getattribute__()?
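
The behaviour in question can be seen in a short sketch. It assumes
CPython; LoudDict and Obj are hypothetical names, and other
implementations are not guaranteed to behave the same way.

```python
class LoudDict(dict):
    """Hypothetical dict subclass whose overrides are easy to spot."""

    def __getitem__(self, key):
        return "from __getitem__"

    def __missing__(self, key):
        return "from __missing__"


class Obj:
    pass


o = Obj()
o.__dict__ = LoudDict(a=1)   # allowed: the value is a dict subclass

print(o.__dict__["a"])       # explicit item access uses the override
print(o.a)                   # attribute access reads the stored value
print(hasattr(o, "b"))       # __missing__ is not consulted
```

Explicit subscripting goes through the subclass, while the default
attribute lookup reads the underlying dict storage directly.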


From ericsnowcurrently at  Fri Apr 22 16:12:57 2016
From: ericsnowcurrently at (Eric Snow)
Date: Fri, 22 Apr 2016 14:12:57 -0600
Subject: [Python-ideas] Override dict.__new__ to raise if cls is not
 dict; do the same for str, list, etc.
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Apr 21, 2016 at 10:45 PM, Random832 <random832 at> wrote:
> On Thu, Apr 21, 2016, at 23:49, Steven D'Aprano wrote:
>> Where does attribute lookup come into this?
> Because looking up an attribute implies getting an item from the object
> or class's __dict__ with a string key of the name of the attribute (see
> below for the documented basis for this assumption)

Attribute access and item access communicate different things about
the namespaces on which they operate.  The fact that objects use
mappings under the hood for the default attribute access behavior is
an implementation detail.  Python does not document any mechanism to
hook arbitrary mappings into the default attribute access behavior
(i.e. object.__getattribute__()).

>> What makes you think that such a method exists?
>> Have you read the rest of this thread? If not, I suggest you do, because
>> the thread is all about making unjustified assumptions about how objects
>> are implemented. You are doing it again. Where is it documented that
>> there is a method on the object __dict__ (if such a __dict__ even
>> exists!) that does this?
> My main concern here is the class dict. So, let's see...
> ### Class attribute references are translated to lookups in this
> dictionary, e.g., C.x is translated to C.__dict__["x"]

Not exactly.  You still have to factor in descriptors (including
slots). [1]  In the absence of those then you are correct that the
current implementation of type.__getattribute__() (not the same as
object.__getattribute__(), BTW) is a lookup on the type's __dict__.
However, that is done directly on tp_dict and not on <TYPE>.__dict__,
if you want to talk about implementation details.  Of course, the
point is moot, as you've pointed out, since a type's __dict__ is both
unsettable and a read-only view.

> And regarding the object __dict__, when such a __dict__ *does* exist
> (since, unlike class dicts, you actually can set object dicts to be
> arbitrary dict subclasses)
> ### The default behavior for attribute access is to get, set, or delete
> the attribute from an object's dictionary. For instance, a.x has a
> lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'],
> and continuing through the base classes of type(a) excluding
> metaclasses.

Again, you also have to factor in descriptors. [2]

Regardless of that, it's important to realize that
object.__getattribute__() doesn't use any object's __dict__ attribute
for any of the lookups you've identified.  This is because you can't
really look up the object's __dict__ attribute *while* doing attribute
lookup.  It's the same way that the following doesn't work:

    class Spam:
        def __getattribute__(self, name):
            if self.__class__ == Spam:  # infinite recursion!!!
                ...
            return object.__getattribute__(self, name)

Instead you have to do this:

    class Spam:
        def __getattribute__(self, name):
            if object.__getattribute__(self, "__class__") == Spam:
                ...
            return object.__getattribute__(self, name)

Hence, object.__getattribute__() only does lookup on the dict
identified through the type's tp_dictoffset field.  However, as you've
noted, objects of custom classes have a settable __dict__ attribute.
This is because by default the mapping is tied to the object at the
tp_dictoffset of the object's type. [3]

Notably, the mapping must be of type dict or of a dict subclass.  What
this implies to me is that someone went to the trouble of allowing
folks to use some other dict (or OrderedDict, etc.) than the one you
get by default with a new object.  However, either they felt that
using a non-dict mapping type was asking for too much trouble, there
were performance concerns, or they did not want to go to the effort to
fix all the places that expect __dict__ to be an actual dict.  It's
probably all three.

Keep in mind that even with a dict subclass the implementation of
object.__getattribute__() can't sensibly use normal lookup when it
does the lookup on the underlying namespace dict.  Instead it uses
PyDict_GetItem(), which is basically equivalent to calling
dict.__getitem__(ns, attr).  Hence that underlying mapping must be a
dict or dict subclass, and any overridden __getitem__() method is



I have been thinking, and especially after Nick's comments, that it might be better to keep it as simple as possible to reduce the risk of errors happening while we're printing the tracebacks. Recursive functions (that go deep enough to trigger an exception) are usually within one function, in the REPL, or by using __getattr__ / __getattribute__. Nice to have? Sure. Necessary? Don't think as much.

I'm also a very solid -1 on the idea of prompting the user to do something like full_ex() to get the full stack trace. My rationale for such a change is that we need to be extremely precise in our message, to leave absolutely no room for a different interpretation than what we mean. Your message, for instance, is ambiguous. The fact it says that calls are hidden would let me think I just lost information about the stack trace (though that's but a wording issue). As a user, just seeing "Call _full_ex() to print full trace." would be an immediate red flag: Crap, I just lost precious debugging information to save on lines!

Might be just me, but that's my opinion anyway :)

    File "<stdin>", line 1, in f
    File "<stdin>", line 1, in g
    [Mutually recursive calls hidden: f (300), g (360)]
    File "<stdin>", line 1, in h
    File "<stdin>", line 1, in f
    File "<stdin>", line 1, in g
   [Mutual-recursive calls hidden: f (103), g (200)]
  RuntimeError: maximum recursion depth exceeded
  [963 calls hidden. Call _full_ex() to print full trace.]

This rule is easily modified to, "have been seen at least three times before." For functions that recurse at multiple lines, it can print out one message per line, but maybe it should count them all together in the "hidden" summary.

On Fri, Apr 22, 2016 at 10:00:12PM -0400, Émanuel Barry wrote:

> I'm also a very solid -1 on the idea of prompting the user to do 
> something like full_ex() to get the full stack trace.

I'm not sure who suggested that idea. I certainly didn't.

> My rationale for 
> such a change is that we need to be extremely precise in our message, 
> to leave absolutely no room for a different interpretation than what 
> we mean. Your message, for instance, is ambiguous.

I'm not sure who you are talking to here, or whose suggestion the 
following faked stack trace is.

> Example:
>     File "<stdin>", line 1, in f
>     File "<stdin>", line 1, in g
>     [Mutually recursive calls hidden: f (300), g (360)]

For the record, detecting subsequences of mutually recursive calls is not 
a trivial task. (It's not the same as cycle detection.) And I don't 
understand how you can get 300 calls to f and 360 calls to g in a 
scenario where f calls g and g calls f. Surely there would be equal 
number of calls to each?

>     File "<stdin>", line 1, in h
>     File "<stdin>", line 1, in f
>     File "<stdin>", line 1, in g
>    [Mutual-recursive calls hidden: f (103), g (200)]
>   RuntimeError: maximum recursion depth exceeded
>   [963 calls hidden. Call _full_ex() to print full trace.]
> This rule is easily modified to, "have been seen at least three times 
> before." For functions that recurse at multiple lines, it can print 
> out one message per line, but maybe it should count them all together 
> in the "hidden" summary.

I think the above over-complicates the situation. I want to deal with 
the low-hanging fruit that gives the biggest benefit for the least work 
(and least complexity!), not minimise every conceivable bit of 
redundancy in a stack trace. So, for the record, let me explicitly state 
my proposal:

Stack traces should collapse blocks of repeated identical, contiguous 
lines (or pairs of lines, where the source code is available for 
display). For brevity, whenever I refer to a "line" in the stack trace, 
I mean either a single line of the form:

    File "<stdin>", line 1, in f

when source code is not available, or a pair of lines of the form:

    File "path/to/", line 1, in f
      line of source code

when it is available.

Whenever a single, contiguous block of lines from the traceback 
consists of three or more identical lines (i.e. lines which compare 
equal using string equality), they should be collapsed down to:

    a single instance of that line

    followed by a message reporting the number of repetitions.

For example:

    File "<stdin>", line 1, in f
      return f(arg)
    File "<stdin>", line 1, in f
      return f(arg)
    File "<stdin>", line 1, in f
      return f(arg)
    File "<stdin>", line 1, in f
      return f(arg)
    File "<stdin>", line 1, in f
      return f(arg)
    File "<stdin>", line 1, in f
      return f(arg)

would be collapsed to something like this:

    File "<stdin>", line 1, in f
      return f(arg)
    [...previous call was repeated 5 more times...]

(In practice, you are more likely to see "repeated 1000 more times" than 
just five.)

This would not be collapsed, as the line numbers are not the same:

    File "<stdin>", line 1, in f
    File "<stdin>", line 2, in f
    File "<stdin>", line 3, in f
    File "<stdin>", line 4, in f
    File "<stdin>", line 5, in f
    File "<stdin>", line 6, in f

nor this:

    File "<stdin>", line 1, in f
    File "<stdin>", line 1, in g
    File "<stdin>", line 1, in f
    File "<stdin>", line 1, in g
    File "<stdin>", line 1, in f
    File "<stdin>", line 1, in g

My proposal does not include any provision for collapsing chains of 
mutual recursion. If somebody else wants to champion that as a separate 
proposal, please do, but I won't be making that proposal.

This will shrink the ugly and harmful huge stacktraces that we 
get from many accidental recursion errors, without hiding potentially 
useful information.
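
The rule above can be sketched with the standard traceback module.
This is an illustration under stated assumptions, not an
implementation from the thread: the helper names format_collapsed,
_frame_key and countdown are made up, and frames are compared by
(file, line number, function name) rather than by their formatted
text.

```python
import traceback
from itertools import groupby


def _frame_key(frame):
    # Two frames count as identical when all three fields match.
    return (frame.filename, frame.lineno, frame.name)


def format_collapsed(tb, threshold=3):
    """Format traceback `tb`, collapsing runs of identical frames."""
    out = []
    frames = traceback.extract_tb(tb)
    for _, run in groupby(frames, key=_frame_key):
        run = list(run)
        if len(run) < threshold:
            # Short runs are shown in full, one frame at a time.
            for frame in run:
                out.extend(traceback.format_list([frame]))
        else:
            # Show the first frame once, then summarise the rest.
            out.extend(traceback.format_list(run[:1]))
            out.append("    [...previous call was repeated %d more times...]\n"
                       % (len(run) - 1))
    return out


def countdown(n):
    if n == 0:
        raise ValueError("bottom reached")
    return countdown(n - 1)


try:
    countdown(10)
except ValueError as exc:
    print("".join(format_collapsed(exc.__traceback__)), end="")
```

Because only contiguous, identical frames are grouped, alternating
f/g frames from mutual recursion are left untouched, as in the last
example above.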


From tjreedy at  Fri Apr 22 23:15:15 2016
From: tjreedy at (Terry Reedy)
Date: Fri, 22 Apr 2016 23:15:15 -0400
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have
 REPL print less by default)
In-Reply-To: <BLU403-EAS1279634E3580DF262A3B8B91600@phx.gbl>
References: <BLU403-EAS22061158D06BC971341ED84916D0@phx.gbl>
Message-ID: <nfepca$5up$>

On 4/22/2016 10:00 PM, Émanuel Barry wrote:
> I have been thinking, and especially after Nick's comments, that it
> might be better to keep it as simple as possible to reduce the risk of
> errors happening while we're printing the tracebacks. Recursive
> functions (that go deep enough to trigger an exception) are usually
> within one function, in the REPL, or by using __getattr__ /
> __getattribute__. Nice to have? Sure. Necessary? Don't think as much.

I agree (I think) that the first issue and patch should be to detect and 
collapse n identical lines or pairs of lines in a row, where n > 3, with 
a message "The above pair of lines [or line] is repeated n-1 more times."
This will be an optional but nice new feature giving 80% benefit for 20% 
work.  There seems to be a consensus for this much.  Let's do it.

> I'm also a very solid -1 on the idea of prompting the user to do
> something like full_ex() to get the full stack trace. My rationale for
> such a change is that we need to be extremely precise in our message, to
> leave absolutely no room for a different interpretation than what we
> mean. Your message, for instance, is ambiguous. The fact it says that
> calls are hidden would let me think I just lost information about the
> stack trace (though that's but a wording issue). As a user, just seeing
> "Call _full_ex() to print full trace." would be an immediate red flag:
> Crap, I just lost precious debugging information to save on lines!
> Might be just me, but that's my opinion anyway :)

I agree that anything beyond the first step is more dubious.  I think 
extensions should be another proposal and possible issue.  Or left to 
GUI wrappers of an underlying Python.

> Example:
>     File "<stdin>", line 1, in f
>     File "<stdin>", line 1, in g
>     [Mutually recursive calls hidden: f (300), g (360)]
>     File "<stdin>", line 1, in h
>     File "<stdin>", line 1, in f
>     File "<stdin>", line 1, in g
>    [Mutual-recursive calls hidden: f (103), g (200)]
>   RuntimeError: maximum recursion depth exceeded
>   [963 calls hidden. Call _full_ex() to print full trace.]
> This rule is easily modified to, "have been seen at least three times
> before." For functions that recurse at multiple lines, it can print out
> one message per line, but maybe it should count them all together in the
> "hidden" summary.

Terry Jan Reedy

From leewangzhong+python at  Sat Apr 23 02:14:10 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Sat, 23 Apr 2016 02:14:10 -0400
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have
 REPL print less by default)
In-Reply-To: <BLU403-EAS1279634E3580DF262A3B8B91600@phx.gbl>
References: <BLU403-EAS22061158D06BC971341ED84916D0@phx.gbl>
Message-ID: <>

(Sorry, my previous message was sent from the wrong address, so only
Émanuel got it. I then replied to his reply, which had me reuse the
wrong address. Sorry, Émanuel, for this repeat.)

On Fri, Apr 22, 2016 at 10:00 PM, Émanuel Barry <vgr255 at> wrote:
> I have been thinking, and especially after Nick's comments, that it might be
> better to keep it as simple as possible to reduce the risk of errors
> happening while we're printing the tracebacks. Recursive functions (that go
> deep enough to trigger an exception) are usually within one function, in the
> REPL, or by using __getattr__ / __getattribute__. Nice to have? Sure.
> Necessary? Don't think as much.

Simple as in, unlikely to be affected by whatever caused the exception
in the first place?

Here's pseudocode for my suggestion.
(I assume appropriate definitions of `traceback`, `print_folded`, and
`print_the_thing`. I assume `func_name` is a qualified name (e.g.
`('f', '<stdin>')`).)

    seen = collections.Counter() # Counts number of times each (func,line_no) has been seen
    block = None # A block of potentially-hidden functions.
    prev_line_no = None # In case len(block) == 1.
    hidecount = 0 # Total number of hidden lines.
    for func_name, line_no in traceback:
        times = seen[func_name, line_no] += 1
        if times >= 3:
            if block is None:
                block = collections.Counter()
            block[func_name] += 1
            prev_line_no = line_no
        else:
            # This `if` can be a function which returns a hidecount,
            #  so we don't repeat ourselves at the end of the loop.
            if block is not None:
                if len(block) == 1: # don't need to hide
                    print_the_thing(next(block.keys()), prev_line_no)
                else:
                    print_folded(block)
                    hidecount += len(block)
                block = None
            print_the_thing(func_name, line_no)

    if block is not None:
        if len(block) == 1:
            print_the_thing(block[0])
        else:
            print_folded(block)
            hidecount += len(block)

> I'm also a very solid -1 on the idea of prompting the user to do something
> like full_ex() to get the full stack trace. My rationale for such a change
> is that we need to be extremely precise in our message, to leave absolutely
> no room for a different interpretation than what we mean. Your message, for
> instance, is ambiguous. The fact it says that calls are hidden would let me
> think I just lost information about the stack trace (though that's but a
> wording issue). As a user, just seeing "Call _full_ex() to print full
> trace." would be an immediate red flag: Crap, I just lost precious debugging
> information to save on lines!

My hiding is more complex (can't reproduce original output exactly),
so it would be important to have an obvious way to get the old
behavior. Someone else can propose the wording, if the hiding strategy
itself seems useful.

On Fri, Apr 22, 2016 at 11:50 PM, Steven D'Aprano <steve at> wrote:
> For the record, detecting subsequences of mutually recursive calls is not
> a trivial task. (It's not the same as cycle detection.) And I don't
> understand how you can get 300 calls to f and 360 calls to g in a
> scenario where f calls g and g calls f. Surely there would be equal
> number of calls to each?

It depends on how you define the problem. I look for, "contiguous
block of output, each line of which has been seen before."

(You can set up a trivial example where g calls itself 60 times if the
parameter is mod 300. Other examples might involve randomness, or
perhaps it's just a really deep recursion rather than a cycle. I'm
just giving an example of how you don't need to detect repeated

From steve at  Sat Apr 23 06:13:36 2016
From: steve at (Steven D'Aprano)
Date: Sat, 23 Apr 2016 20:13:36 +1000
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have
 REPL print less by default)
In-Reply-To: <>
References: <BLU403-EAS22061158D06BC971341ED84916D0@phx.gbl>
Message-ID: <>

On Sat, Apr 23, 2016 at 02:14:10AM -0400, Franklin? Lee wrote:

> Here's pseudocode for my suggestion.
> (I assume appropriate definitions of `traceback`, `print_folded`, and
> `print_the_thing`. I assume `func_name` is a qualified name (e.g.
> `('f', '<stdin>')`).)
>     seen = collections.Counter() # Counts number of times each (func,line_no) has been seen
>     block = None # A block of potentially-hidden functions.
>     prev_line_no = None # In case len(block) == 1.
>     hidecount = 0 # Total number of hidden lines.
>     for func_name, line_no in traceback:
>         times = seen[func_name, line_no] += 1
>         if times >= 3:
>             if block is None:
>                 block = collections.Counter()
>             block[func_name] += 1
>             prev_line_no = line_no
>         else:
>             # This `if` can be a function which returns a hidecount,
>             #  so we don't repeat ourselves at the end of the loop.
>             if block is not None:
>                 if len(block) == 1: # don't need to hide
>                     print_the_thing(next(block.keys()), prev_line_no)
>                 else:
>                     print_folded(block)
>                     hidecount += len(block)
>                 block = None
>             print_the_thing(func_name, line_no)
>     if block is not None:
>         if len(block) == 1:
>             print_the_thing(block[0])
>         else:
>             print_folded(block)
>             hidecount += len(block)

Just in case anyone missed it, here's my actual, working, code, which I 
now have in my PYTHONSTARTUP file.

import sys
import traceback
from itertools import groupby

TEMPLATE = "  [...previous call is repeated %d times...]\n"

def collapse(seq):
    # Run-length encode: runs of three or more equal items are collapsed.
    for key, group in groupby(seq):
        group = list(group)
        if len(group) < 3:
            for item in group:
                yield item
        else:
            yield key
            yield TEMPLATE % (len(group)-1)

def shortertb(*args):
    lines = traceback.format_exception(*args)
    # Write the collapsed traceback to stderr.
    sys.stderr.write(''.join(collapse(lines)))

sys.excepthook = shortertb

And here is an actual working example of it in action:

py> import fact  # Uses the obvious recursive algorithm.
py> fact.fact(50000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/steve/python/", line 3, in fact
    return n*fact(n-1)
  [...previous call is repeated 997 times...]
  File "/home/steve/python/", line 2, in fact
    if n < 1: return 1
RuntimeError: maximum recursion depth exceeded in comparison
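
A minimal, self-contained sketch of the same run-length idea, applied to a plain list of strings rather than a real traceback. The name `collapse_runs`, the `minimum` parameter, and the sample frame strings are made up for illustration and are not part of the code above:

```python
from itertools import groupby

def collapse_runs(lines, minimum=3):
    """Replace each run of `minimum` or more identical lines with one copy
    of the line followed by a repeat count; shorter runs pass through."""
    for key, group in groupby(lines):
        count = len(list(group))
        if count < minimum:
            for _ in range(count):
                yield key
        else:
            yield key
            yield "  [...previous call is repeated %d times...]\n" % (count - 1)

# Hypothetical stand-ins for formatted traceback lines.
frames = ["frame A\n"] + ["frame B\n"] * 5 + ["frame C\n"] * 2
result = list(collapse_runs(frames))
print("".join(result), end="")
```

Nothing is dropped here: the original list can be rebuilt from the output, because each count line records exactly how many copies were folded away.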

I'm not very interested in a complex, untested, incomplete, non-working 
chunk of pseudo-code when I have something which actually works in less 
than twenty lines. Especially since your version hides useful traceback 
information and requires the user to call a separate function to display 
the unmangled (and likely huge) traceback to find out what they're
missing.

In my version, they're not missing anything: it's a simple run-length 
encoding of the tracebacks, and nothing is hidden. The output is just 

So I'm afraid that, even if you manage to get your pseudo-code working 
and debugged, I'm going to vote a strong -1 on your proposal. Even if it 
works, I don't want lines to be hidden just because they've been seen 
before in some unrelated part of the traceback.

> My hiding is more complex (can't reproduce original output exactly),
> so it would be important to have an obvious way to get the old
> behavior. Someone else can propose the wording, if the hiding strategy
> itself seems useful.

To me, it seems harmful, not useful.


From storchaka at  Sat Apr 23 07:14:30 2016
From: storchaka at (Serhiy Storchaka)
Date: Sat, 23 Apr 2016 14:14:30 +0300
Subject: [Python-ideas] Bytecode for calling function with keyword
In-Reply-To: <nfb2k8$81t$>
References: <nfat0p$84n$>
Message-ID: <nffleo$73o$>

On 21.04.16 20:28, Serhiy Storchaka wrote:
> On 21.04.16 19:00, Chris Angelico wrote:
>> Also: How does this interact with **dict calls?
> For now we have separate opcode for calls with **kwargs. We will have
> corresponding opcode with new way to provide keyword arguments.
> Instead of 4 current opcodes we will need at most 8 new opcodes. May be
> less, because calls with *args or **kwargs is less performance critical,
> we can pack arguments in a tuple and a dict by separate opcodes and call
> a function just with a tuple and a dict.

Actually I think we need only 3 opcodes: one general and two specialized 
for common cases.

1. Call with fixed number of positional arguments. This covers about 90% 
of calls.

2. Call with fixed number of positional and keyword arguments. This 
covers about 10% of calls.

3. Call with variable number of positional and/or keyword arguments 
packed in a tuple and a dict. Only 0.5% of calls need this. Since *args 
and **kwargs can now be used multiple times in the call, there are 
opcodes for packing arguments: BUILD_LIST_UNPACK and 
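
The three call categories listed above can be illustrated from the Python side. This is only a sketch: the function names are hypothetical, and the opcode names that `dis` reports depend on the interpreter version, so the code inspects them without assuming any particular name.

```python
import dis

def target(*args, **kwargs):
    # Hypothetical callee that simply echoes what it received.
    return args, kwargs

def positional_only():
    # Category 1: fixed number of positional arguments.
    return target(1, 2)

def with_keywords():
    # Category 2: fixed number of positional and keyword arguments.
    return target(1, key=2)

def with_unpacking(args, kwargs):
    # Category 3: variable arguments packed in a tuple and a dict.
    return target(*args, **kwargs)

def call_opnames(func):
    """Return the names of call-related opcodes compiled for `func`."""
    return [ins.opname for ins in dis.get_instructions(func)
            if "CALL" in ins.opname]

for func in (positional_only, with_keywords, with_unpacking):
    print(func.__name__, call_opnames(func))
```

Running this on different CPython releases is a quick way to see how many distinct call opcodes a given version actually uses.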

From leewangzhong+python at  Sat Apr 23 20:44:27 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Sat, 23 Apr 2016 20:44:27 -0400
Subject: [Python-ideas] Shrink recursion error tracebacks (was: Have
 REPL print less by default)
In-Reply-To: <>
References: <BLU403-EAS22061158D06BC971341ED84916D0@phx.gbl>
Message-ID: <>

On Sat, Apr 23, 2016 at 6:13 AM, Steven D'Aprano <steve at> wrote:

> Just in case anyone missed it, here's my actual, working, code,

> I'm not very interested in a complex, untested, incomplete, non-working
> chunk of pseudo-code when I have something which actually works in less
> than twenty lines.

I'm not wedded to the idea. I'm just proposing it as an
alternative. I provided the pseudocode to express an idea, so that the
idea could be discussed. I've already identified one glaring bug, but
the quality of pseudocode itself is not really important: if there are
actual implementation obstacles, they can be discussed _if_ the idea
is considered useful.

Remember that we're all here for the good of Python.

> So I'm afraid that, even if you manage to get your pseudo-code working
> and debugged, I'm going to vote a strong -1 on your proposal. Even if it
> works, I don't want lines to be hidden just because they've been seen
> before in some unrelated part of the traceback.

In a single callstack, there ARE no unrelated parts of the traceback.
They've been seen before because they are part of a cycle: they
started a call chain that eventually leads back to them.

> Especially since your version hides useful traceback
> information and requires the user to call a separate function to display
> the unmangled (and likely huge) traceback to find out what they're
> missing.

In fact, it would probably only hide useful information if there is a
non-trivial graph cycle. In that case, the RLE will still give the
unmangled (and likely huge) traceback.
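
A small sketch of the case described above, assuming a mutual recursion between two hypothetical functions `f` and `g`. Because adjacent frames are never identical, grouping consecutive equal lines finds only runs of length 1, so a run-length encoding would collapse nothing:

```python
from itertools import groupby

# Hypothetical traceback lines for f -> g -> f -> g -> ...
frames = ["in f", "in g"] * 4

# Length of each run of consecutive identical lines.
runs = [(key, len(list(group))) for key, group in groupby(frames)]

longest = max(count for _, count in runs)
print(len(runs), "runs, longest run has length", longest)
```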

From guettliml at  Mon Apr 25 03:47:25 2016
From: guettliml at (=?UTF-8?Q?Thomas_G=c3=bcttler?=)
Date: Mon, 25 Apr 2016 09:47:25 +0200
Subject: [Python-ideas] unittest: assertEqual(a, b, msg): Show diff AND msg
Message-ID: <>

Up to now assertEqual(a, b, msg) outputs only the msg, not the diff.

I know that setting longMessage to True shows both the diff and the msg [1].

I think the sane default is to show the diff and the message for assertEqual().

What do you think?

   Thomas Güttler

[1] longMessage:

Thomas Guettler

From ram at  Mon Apr 25 04:38:17 2016
From: ram at (Ram Rachum)
Date: Mon, 25 Apr 2016 11:38:17 +0300
Subject: [Python-ideas] Allow with (x as y, z as w):
Message-ID: <>


I have a problem: I've got code that has a `with` statement with multiple
long names, so I have to break the line awkwardly using the \ character:

with some_darn_object.my_long_named_method_on_it(argument='yep') as foo, \
     a_different_darn_object.and_yet_another_method() as bar:

I tried wrapping the context managers in parentheses, but it looks like
it's not legal syntax:

  File "<stdin>", line 1
    with (some_darn_object.my_long_named_method_on_it(argument='yep') as
SyntaxError: invalid syntax

Another demonstration from the REPL:

Python 3.5.1 (v3.5.1:37a07cee5969, Dec  6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> with (x as y, z as w): pass
  File "<stdin>", line 1
    with (x as y, z as w): pass
SyntaxError: invalid syntax

I would like this syntax to be legal.

-------------- next part --------------
> On 25 Apr 2016, at 09:38, Ram Rachum <ram at> wrote:
> Hi,
> I have a problem: I've got code that has a `with` statement with multiple long names, so I have to break the line awkwardly using the \ character:


This request has come up a couple of times before. See the following prior discussions:

- <>
- <>

The upshot is: the restriction on context managers with parentheses like this is relatively likely to stick around. There have been discussions about the relative ambiguity of the syntax you're using here compared with the syntax of tuples, and some suggestions that it'll make life particularly tricky for the parser.

I don't have anything to add to those discussions, just wanted to suggest you may want to familiarise yourself with the previous times we had this conversation before continuing it.

On Mon, Apr 25, 2016 at 12:47 AM Thomas Güttler <
guettliml at> wrote:

> Up to now assertEqual(a, b, msg) outputs only the msg, not the diff.
> I know that setting longMessage to True shows the diff and the msg[1]
> I think the sane default is to show the diff and the message for
> assertEqual().
> What do you think?

longMessage already defaults to True in Python 3.

Changing the default in a future Python 2.7.xx release is unlikely as that
kind of change can catch people by surprise and cause problems in the
middle of a stable release.

Setting it to True manually in all your 2.x code: recommended!

-------------- next part --------------
On 25.04.2016 at 19:35, Gregory P. Smith wrote:
> On Mon, Apr 25, 2016 at 12:47 AM Thomas Güttler <guettliml at> wrote:
>     Up to now assertEqual(a, b, msg) outputs only the msg, not the diff.
>     I know that setting longMessage to True shows the diff and the msg[1]
>     I think the sane default is to show the diff and the message for assertEqual().
>     What do you think?
> longMessage already defaults to True in Python 3.
> Changing the default in a future Python 2.7.xx release is unlikely as that kind of change can catch people by surprise
> and cause problems in the middle of a stable release.

Thank you Gregory! I was blind.

First I read the Python2 docs, then I read the first line of the docs of Python3:

    If set to True then any explicit failure  .....

Yes, Python3 has the better default. Maybe the docs should get updated.
I guess the above sentence was copied from the old docs where you had to set True.

"If set to True then ..." is correct if you have a "math brain". But it is confusing
for newcomers.

Thomas Guettler

From barry at  Tue Apr 26 11:22:16 2016
From: barry at (Barry Warsaw)
Date: Tue, 26 Apr 2016 11:22:16 -0400
Subject: [Python-ideas] Allow with (x as y, z as w):
References: <>
Message-ID: <>

On Apr 25, 2016, at 12:36 PM, Cory Benfield wrote:

>This request has come up a couple of times before.
>The upshot is: the restriction on context managers with parentheses like this
>is relatively likely to stick around.

I highly recommend looking at contextlib.ExitStack.  Once I started using
this idiom, I found I wanted the requested with-feature much less frequently.
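
A short sketch of the idiom, for readers who have not used it. The `managed` helper and the `events` list are hypothetical stand-ins for real context managers; `ExitStack` and `enter_context` are the actual `contextlib` API:

```python
from contextlib import ExitStack, contextmanager

events = []

@contextmanager
def managed(name):
    # Hypothetical context manager that records when it is entered and exited.
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("exit " + name)

# Each context manager gets its own line, with no backslashes needed.
with ExitStack() as stack:
    foo = stack.enter_context(managed("foo"))
    bar = stack.enter_context(managed("bar"))
    events.append("body %s %s" % (foo, bar))

print(events)
```

The managers are exited in reverse order of entry when the `with` block ends, which matches the behaviour of a single `with a as x, b as y:` statement.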

-------------- next part --------------
Also, I think some people are too fundamentalist about rejecting all uses
of \ to break long lines. When combined with proper indentation of the
continuation line, if there is no alternative, I think it looks better than
artificially introduced parentheses. At least if there's only one or two
continuation lines. E.g.
very_long_variable_name = \
looks better to me than
very_long_variable_name = (
I'm just saying, there's a reason PEP 8's motto is "A Foolish Consistency
is the Hobgoblin of Little Minds".

On Tue, Apr 26, 2016 at 8:22 AM, Barry Warsaw <barry at> wrote:

> On Apr 25, 2016, at 12:36 PM, Cory Benfield wrote:
> >This request has come up a couple of times before.
> ...
> >The upshot is: the restriction on context managers with parentheses like
> this
> >is relatively likely to stick around.
> I highly recommend looking at contextlib.ExitStack.  Once I started using
> this idiom, I found I wanted the requested with-feature much less
> frequently.
> Cheers,
> -Barry

--Guido van Rossum (

I opened a docs issue for "longMessage"

On 26.04.2016 at 10:06, Thomas Güttler wrote:
> On 25.04.2016 at 19:35, Gregory P. Smith wrote:
>> On Mon, Apr 25, 2016 at 12:47 AM Thomas Güttler <guettliml at> wrote:
>>     Up to now assertEqual(a, b, msg) outputs only the msg, not the diff.
>>     I know that setting longMessage to True shows the diff and the msg[1]
>>     I think the sane default is to show the diff and the message for assertEqual().
>>     What do you think?
>> longMessage already defaults to True in Python 3.
>> Changing the default in a future Python 2.7.xx release is unlikely as that kind of change can catch people by surprise
>> and cause problems in the middle of a stable release.
> Thank you Gregory! I was blind.
> First I read the Python2 docs, then I read the first line of the docs of Python3:
>     If set to True then any explicit failure  .....
> Yes, Python3 has the better default. Maybe the docs should get updated.
> I guess the above sentence was copied from the old docs where you had to set True.
> "If set to True then ..." is correct if you have a "math brain". But it is confusing
> for new comers.

Thomas Guettler