From techtonik at  Wed Jan  1 12:58:35 2014
From: techtonik at (anatoly techtonik)
Date: Wed, 1 Jan 2014 14:58:35 +0300
Subject: [Python-ideas] os.architecture
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Dec 30, 2013 at 3:13 PM, Andrew Barnert <abarnert at> wrote:
> On Dec 30, 2013, at 0:56, anatoly techtonik <techtonik at> wrote:
>> Ok. Architecture is a fail in terminology. The word "OS architecture"
>> can mean many things, and it will be the same design flaw as
>> How about os.bitness instead?
> You missed the part where you were told that os is for OS services, not platform (including hardware, interpreter, and OS) information.

I've heard your opinion. Now why do you think os is for OS services?
Docs say os is about OS interfaces, to which bitness or architecture
is interface information.

> Anyway, "bitness" by itself doesn't tell you whether it will return 32 or 64 when running a 32-bit Python on 64-bit Windows

That's why it is "os.bitness", not "interpreter.bitness" or "cpu.bitness".

> It's just as potentially ambiguous as the functions that already exist

Do you still think so after my example above?

From thomasgrzybowski at  Wed Jan  1 21:12:58 2014
From: thomasgrzybowski at (tg)
Date: Wed, 01 Jan 2014 15:12:58 -0500
Subject: [Python-ideas] Reporting tools for python
Message-ID: <>

With the more general use of python for access to database
information, numpy and scipy analysis, and web posting, it seems that
there should be more and better means of reporting from python. Some of 
the existing tools are too low-level for general use (such as Reportlab).

As far as I can tell, there are no tools that approach the high-level
functionality of Proc Report, as used in SAS. Pagination with headers and footers,
and column-spanning headers are some specific tool limitations.  I
believe that there would be even more usage of python in science and
industry if there were better tools for reporting.

From phd at  Wed Jan  1 21:28:53 2014
From: phd at (Oleg Broytman)
Date: Wed, 1 Jan 2014 21:28:53 +0100
Subject: [Python-ideas] Reporting tools for python
In-Reply-To: <>
References: <>
Message-ID: <>


On Wed, Jan 01, 2014 at 03:12:58PM -0500, tg <thomasgrzybowski at> wrote:
> With the more general use of python for access to database
> information, numpy and scipy analysis, and web posting, it seems that
> there should be more and better means of reporting from python.

   Well, python is a programming language. It doesn't need any builtin
reporting. Even the standard library doesn't.
   (python-ideas is about ideas for python and stdlib, not third-party
libraries or applications.)

> As far as I can tell, there are no tools that approach the
> high-level functionality
> of Proc Report, as used in SAS. Pagination with headers and footers,
> and column-spanning headers are some specific tool limitations.

   Like ? It was written in our
company (not by me) and was in use for some time.

     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

From stephen at  Thu Jan  2 06:24:00 2014
From: stephen at (Stephen J. Turnbull)
Date: Thu, 02 Jan 2014 14:24:00 +0900
Subject: [Python-ideas] os.architecture
In-Reply-To: <>
References: <>
Message-ID: <>

anatoly techtonik writes:

 > I've heard your opinion. Now why do you think os is for OS
 > services?

Because everything in there is a Python wrapper for an OS service, and
because platform covers your use case.  That may not be obvious to
you.  But AFAICT (once explained) it works for most Pythonistas and is
a consistent point of view.  Your suggestion is nowhere near TOOWTDI,
so it's not going to happen.
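
For reference, a minimal sketch of how the existing modules already answer
the "bitness" question. The helper name interpreter_bits is made up for this
illustration; struct.calcsize, sys.maxsize, platform.machine and
platform.architecture are existing standard-library calls. Note that these
report different things (interpreter pointer size versus machine type), which
is the ambiguity discussed in this thread.

```python
import platform
import struct
import sys


def interpreter_bits():
    """Pointer size of the running interpreter, in bits (hypothetical helper)."""
    return struct.calcsize("P") * 8


# Bitness of the Python build itself, not of the OS or the CPU.
print("interpreter bits:", interpreter_bits())

# A second, commonly used check for a 64-bit interpreter build.
print("64-bit build:", sys.maxsize > 2**32)

# Machine type as reported by the OS; the value is platform-dependent.
print("machine:", platform.machine())

# Architecture of the interpreter executable, as a (bits, linkage) tuple.
print("executable:", platform.architecture()[0])
```

A 32-bit Python on 64-bit Windows would be expected to report 32 for the
first check while the machine type still describes the 64-bit host.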

From techtonik at  Wed Jan  1 20:01:13 2014
From: techtonik at (anatoly techtonik)
Date: Wed, 1 Jan 2014 22:01:13 +0300
Subject: [Python-ideas] Fixing __file__ to be absolute
Message-ID: <>

Fixing this thing will make me happy (or very sad if you'd like this).

Problem is described here:

Summary:
1.  chdir()
2.  dirname(__file__)
3.  FAIL

Proposal:
from __future__ import abs__file__

anatoly t.

From taleinat at  Thu Jan  2 13:37:56 2014
From: taleinat at (Tal Einat)
Date: Thu, 2 Jan 2014 14:37:56 +0200
Subject: [Python-ideas] Fixing __file__ to be absolute
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 1, 2014 at 9:01 PM, anatoly techtonik <techtonik at> wrote:
> Fixing this thing will make me happy (or very sad if you'd like this).
> Problem is described here:
> Summary:
> 1.  chdir()
> 2.  dirname(__file__)
> 3.  FAIL
> Proposal:
> from __future__ import abs__file__

Anatoly, this subject was already discussed on this list, just three
months ago, in a thread you started! [1]_

To quote one of Nick Coghlan's replies [2]_:

> Note that any remaining occurrences of non-absolute values in __file__ are
> generally considered bugs in the import system. However, we tend not to fix
> them in maintenance releases, since converting relative paths to absolute
> paths runs a risk of breaking user code.

> We're definitely *not* going to further pollute the module namespace with
> values that can be trivially and reliably derived from existing values.
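
As a rough illustration of "trivially derived", here is a sketch of the usual
workaround on versions where __file__ may be relative: capture the working
directory at import time, before anything calls chdir(), and resolve against
it. The helper name absolute_module_path and the path pkg/mod.py are
hypothetical, chosen only for the example.

```python
import os


def absolute_module_path(file_value, start_dir):
    """Resolve a possibly relative __file__ value against the directory
    that was current when the module was imported (hypothetical helper)."""
    # os.path.join() returns file_value unchanged if it is already absolute.
    return os.path.normpath(os.path.join(start_dir, file_value))


# Capture this at import time, before any os.chdir() call.
START_DIR = os.getcwd()

# Stand-in for a relative __file__ such as the one a script may see.
relative_file = os.path.join("pkg", "mod.py")

resolved = absolute_module_path(relative_file, START_DIR)
print(os.path.isabs(resolved))
```

In a real module the same effect is usually obtained with a single
os.path.abspath(__file__) executed at the top of the file.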

- Tal

.. [1]:
.. [2]:

From liam.marsh.home at  Thu Jan  2 12:57:49 2014
From: liam.marsh.home at (Liam Marsh)
Date: Thu, 2 Jan 2014 12:57:49 +0100
Subject: [Python-ideas] *var()*
In-Reply-To: <>
References: <>
Message-ID: <>

hello,here is my idea:
var():
input var name (str),
outputs var value
example:
>>>count1=1.34
>>>var('count',1)
1.34thank you and have a nice day!

From jeanpierreda at  Thu Jan  2 14:27:42 2014
From: jeanpierreda at (Devin Jeanpierre)
Date: Thu, 2 Jan 2014 05:27:42 -0800
Subject: [Python-ideas] *var()*
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 2, 2014 at 3:57 AM, Liam Marsh <liam.marsh.home at> wrote:
> hello,here is my idea:
> var():
> input var name (str),
> outputs var value
> example:
> >>>count1=1.34
> >>>var('count',1)
> 1.34thank you and have a nice day!

This is underspecified. What should it do for this code?

count = 3
def foo():
    print var('count', 1)

foo()

If the output is "1", then you're in luck and can already use
vars().get('count', 1)

Otherwise, I don't know a trivial one-liner to do it. Either way I'd
be -1 on its inclusion in Python, it encourages a bad idiom.

-- Devin

From brett at  Thu Jan  2 14:28:48 2014
From: brett at (Brett Cannon)
Date: Thu, 2 Jan 2014 08:28:48 -0500
Subject: [Python-ideas] Fixing __file__ to be absolute
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 2, 2014 at 7:37 AM, Tal Einat <taleinat at> wrote:

> On Wed, Jan 1, 2014 at 9:01 PM, anatoly techtonik <techtonik at>
> wrote:
> > Fixing this thing will make me happy (or very sad if you'd like this).
> >
> > Problem is described here:
> >
> > Summary:
> > 1.  chdir()
> > 2.  dirname(__file__)
> > 3.  FAIL
> >
> > Proposal:
> > from __future__ import abs__file__
> Anatoly, this subject was already discussed on this list, just three
> months ago, in a thread you started! [1]_
> To quote one of Nick Coghlan's replies [2]_:
> > Note that any remaining occurrences of non-absolute values in __file__
> are
> > generally considered bugs in the import system. However, we tend not to
> fix
> > them in maintenance releases, since converting relative paths to absolute
> > paths runs a risk of breaking user code.
> > We're definitely *not* going to further pollute the module namespace with
> > values that can be trivially and reliably derived from existing values.

This was also changed in Python 3.4 back in October:

From steve at  Thu Jan  2 14:35:00 2014
From: steve at (Steven D'Aprano)
Date: Fri, 3 Jan 2014 00:35:00 +1100
Subject: [Python-ideas] *var()*
In-Reply-To: <>
References: <>
Message-ID: <20140102133500.GM29356@ando>

On Thu, Jan 02, 2014 at 12:57:49PM +0100, Liam Marsh wrote:

> hello,here is my idea:
> var():
> input var name (str),
> outputs var value
> example:
> >>>count1=1.34
> >>>var('count',1)
> 1.34thank you and have a nice day!

Hello Liam, and welcome! Is this your first post here? I don't recall 
seeing your name before.

I'm afraid I don't quite understand your example above. The "thank you 
and have a nice day" confuses me, I don't understand where it comes 
from. Also, I'm not sure why you define a variable count1 = 1.34, and 
then pass "count", 1 as two separate arguments to the function. So I'm 
going to try to guess what your idea actually is, or at least what I 
think is reasonable, if I get it wrong please feel free to correct me.

You want a function, var(), which takes a single argument, the name of a 
variable, and then returns the value of that variable. E.g. given a 
variable "count1" set to the value 1.34, the function call:

    var("count1")

will return 1.34.

Is this what you mean?

If so, firstly, the name "var" is too close to the existing function 
"vars". This would cause confusion.

Secondly, you can already do this, or at least *almost* this, using 
the locals() and globals() functions. Both will return a dict containing 
the local and global variables, so you can look up the variable name 
easily using locals() and standard dictionary methods:

py> count1 = 1.34
py> locals()['count1']
1.34
py> locals().get('count2', 'default')
'default'

The only thing which is missing is that there's no way to look up a 
variable name if you don't know which scope it is in. Normally name 
resolution goes:

locals
nonlocals
globals
builtins

You can easily look up a local name, or a global name, using the
locals() and globals() function. With just a tiny bit more effort, you 
can also look in the builtins. But there's no way that I know of to look 
up a nonlocal name, or a name in an unspecified scope. Consequently, 
this *almost* works:

def lookup(name):
    import builtins
    for namespace in (locals(), globals(), vars(builtins)):
        try:
            return namespace[name]
        except KeyError:
            pass
    raise NameError("name '%s' not found" % name)

except for the nonlocal scope.

I would have guessed that you could get this working with eval, but if 
there is such a way, I can't work it out.

I think this would make a nice addition to the inspect module. I 
wouldn't want to see it as a builtin function, since it would encourage 
a style of programming which I think is poor, but for those occasional 
uses where you want to look up a variable from an unknown scope, I think 
this would be handy.
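
One caveat with the sketch above: locals() called inside lookup() returns
lookup's own local variables, not the caller's. A possible variant, assuming
CPython (sys._getframe is an implementation detail) and Python 3, inspects
the caller's frame instead. The depth parameter is an addition made for this
illustration, and enclosing-function scopes are still not searched reliably.

```python
import builtins
import sys


def lookup(name, depth=1):
    """Look up a name in the caller's locals, then globals, then builtins.

    depth=1 means the direct caller. Relies on sys._getframe, which is
    CPython-specific.
    """
    frame = sys._getframe(depth)
    for namespace in (frame.f_locals, frame.f_globals, vars(builtins)):
        try:
            return namespace[name]
        except KeyError:
            pass
    raise NameError("name %r not found" % name)


def demo():
    count1 = 1.34          # local to demo()
    return lookup("count1")


print(demo())
print(lookup("len") is len)
```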


From techtonik at  Thu Jan  2 14:46:53 2014
From: techtonik at (anatoly techtonik)
Date: Thu, 2 Jan 2014 16:46:53 +0300
Subject: [Python-ideas] Fixing __file__ to be absolute
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 2, 2014 at 4:28 PM, Brett Cannon <brett at> wrote:
> On Thu, Jan 2, 2014 at 7:37 AM, Tal Einat <taleinat at> wrote:
>> On Wed, Jan 1, 2014 at 9:01 PM, anatoly techtonik <techtonik at>
>> wrote:
>> > Fixing this thing will make me happy (or very sad if you'd like this).
>> >
>> > Problem is described here:
>> >
>> > Summary:
>> > 1.  chdir()
>> > 2.  dirname(__file__)
>> > 3.  FAIL
>> >
>> > Proposal:
>> > from __future__ import abs__file__
>> Anatoly, this subject was already discussed on this list, just three
>> months ago, in a thread you started! [1]_
>> To quote one of Nick Coghlan's replies [2]_:
>> > Note that any remaining occurrences of non-absolute values in __file__
>> > are
>> > generally considered bugs in the import system. However, we tend not to
>> > fix
>> > them in maintenance releases, since converting relative paths to
>> > absolute
>> > paths runs a risk of breaking user code.
>> > We're definitely *not* going to further pollute the module namespace
>> > with
>> > values that can be trivially and reliably derived from existing values.
> This was also changed in Python 3.4 back in October:

Thanks. That's just what I was looking for - a status update.
Links in emails are not telling anything about progress being
made, roadmap, problems and versions of Python. Seem like
tracker is a poor tool to track this stuff too.

Now, in spite of the recent Python 3 status update, the question is how
feasible it is to make this feature more visible and to implement it in
previous versions as from __future__ import abs__file__?

I'd like to ask for two perspectives:
1. technical feasibility
2. political obstacles (backward compatibility policy / process obstacles),
    even if they are obvious

Also, what is the process of nominating this feature for selection in
Python 2.8 (or whatever comes out of this incremental development idea)?

So, three questions with ideas in total.
anatoly t.

From brett at  Thu Jan  2 15:52:12 2014
From: brett at (Brett Cannon)
Date: Thu, 2 Jan 2014 09:52:12 -0500
Subject: [Python-ideas] Fixing __file__ to be absolute
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 2, 2014 at 8:46 AM, anatoly techtonik <techtonik at> wrote:

> On Thu, Jan 2, 2014 at 4:28 PM, Brett Cannon <brett at> wrote:
> > On Thu, Jan 2, 2014 at 7:37 AM, Tal Einat <taleinat at> wrote:
> >> On Wed, Jan 1, 2014 at 9:01 PM, anatoly techtonik <techtonik at>
> >> wrote:
> >> > Fixing this thing will make me happy (or very sad if you'd like this).
> >> >
> >> > Problem is described here:
> >> >
> >> > Summary:
> >> > 1.  chdir()
> >> > 2.  dirname(__file__)
> >> > 3.  FAIL
> >> >
> >> > Proposal:
> >> > from __future__ import abs__file__
> >>
> >> Anatoly, this subject was already discussed on this list, just three
> >> months ago, in a thread you started! [1]_
> >>
> >> To quote one of Nick Coghlan's replies [2]_:
> >>
> >> > Note that any remaining occurrences of non-absolute values in __file__
> >> > are
> >> > generally considered bugs in the import system. However, we tend not
> to
> >> > fix
> >> > them in maintenance releases, since converting relative paths to
> >> > absolute
> >> > paths runs a risk of breaking user code.
> >>
> >> > We're definitely *not* going to further pollute the module namespace
> >> > with
> >> > values that can be trivially and reliably derived from existing
> values.
> >
> >
> > This was also changed in Python 3.4 back in October:
> >
> Thanks. That's just what I was looking for - a status update.
> Links in emails are not telling anything about progress being
> made, roadmap, problems and versions of Python. Seem like
> tracker is a poor tool to track this stuff too.

It's not in released code yet so there is no way to really promote this in
a way that is guaranteed not to change. It will be in the What's New doc
for Python 3.4, though, when the final version is released:

> Now in spite of recent Python 3 status update, the question is how
> possible to make this feature more visible and implemented in
> previous version as from __future__ import abs__file__?

There is no chance that will ever happen.

> I'd like to ask for two perspectives:
> 1. technical feasibility

I don't see why it wouldn't be technically possible since I made it work in
Python 3.4.

> 2. political obstacles (backward compatibility policy / process obstacles),
>     even if they are obvious

It would be a total break in backwards-compatibility by adding a new
feature in a bugfix release and that's never acceptable (and that rule has
been in effect since Python 2.2.1).

> Also, what is the process of nominating this features to selection in
> Python 2.8 (or whatever comes out of this incremental development idea)?

There is no future Python 2.8 release so there is no process to nominate
something; PEP 404 is very clear on this: . And there is no "incremental
development idea" or something that's going to change the current
development process of Python so that part of the questions doesn't make
sense to me.

From liam.marsh.home at  Thu Jan  2 17:22:21 2014
From: liam.marsh.home at (Liam Marsh)
Date: Thu, 2 Jan 2014 17:22:21 +0100
Subject: [Python-ideas] *var()*
In-Reply-To: <>
References: <>
Message-ID: <>

dear Jeanpierre,
sorry, no.
for >>>count1=3,
    var('count1') or var(str('count',1)) will output 3
in fact, it is even better to use libraries,
and it was stupid to send the first email before trying another way.

2014/1/2 Devin Jeanpierre <jeanpierreda at>

> On Thu, Jan 2, 2014 at 3:57 AM, Liam Marsh <liam.marsh.home at>
> wrote:
> > hello,here is my idea:
> > var():
> > input var name (str),
> > outputs var value
> > example:
> >
> >>>>count1=1.34
> >>>>var('count',1)
> > 1.34
>thank you and have a nice day!
> This is underspecified. What should it do for this code?
> count = 3
> def foo():
>     print var('count', 1)
> foo()
> If the output is "1", then you're in luck and can already use
> vars().get('count', 1)
> Otherwise, I don't know a trivial one-liner to do it. Either way I'd
> be -1 on its inclusion in Python, it encourages a bad idiom.
> -- Devin

From denis.spir at  Thu Jan  2 17:39:24 2014
From: denis.spir at (spir)
Date: Thu, 02 Jan 2014 17:39:24 +0100
Subject: [Python-ideas] *var()*
In-Reply-To: <20140102133500.GM29356@ando>
References: <>
Message-ID: <>

On 01/02/2014 02:35 PM, Steven D'Aprano wrote:
> On Thu, Jan 02, 2014 at 12:57:49PM +0100, Liam Marsh wrote:
>> hello,here is my idea:
>> var():
>> input var name (str),
>> outputs var value
>> example:
>>>>> count1=1.34
>>>>> var('count',1)
>> 1.34thank you and have a nice day!
> Hello Liam, and welcome! Is this your first post here? I don't recall
> seeing your name before.
> I'm afraid I don't quite understand your example above. The "thank you
> and have a nice day" confuses me, I don't understand where it comes
> from. Also, I'm not sure why you define a variable count1 = 1.34, and
> then pass "count", 1 as two separate arguments to the function. So I'm
> going to try to guess what your idea actually is, or at least what I
> think is reasonable, if I get it wrong please feel free to correct me.
> You want a function, var(), which takes a single argument, the name of a
> variable, and then returns the value of that variable. E.g. given a
> variable "count1" set to the value 1.34, the function call:
> var("count1")
> will return 1.34.
> Is this what you mean?
> If so, firstly, the name "var" is too close to the existing function
> "vars". This would cause confusion.
> Secondly, you can already do this, or at least *almost* this, using
> the locals() and globals() functions. Both will return a dict containing
> the local and global variables, so you can look up the variable name
> easily using locals() and standard dictionary methods:
> py> count1 = 1.34
> py> locals()['count1']
> 1.34
> py> locals().get('count2', 'default')
> 'default'
> The only thing which is missing is that there's no way to look up a
> variable name if you don't know which scope it is in. Normally name
> resolution goes:
> locals
> nonlocals
> globals
> builtins
> You can easily look up a local name, or a global name, using the
> locals() and globals() function. With just a tiny bit more effort, you
> can also look in the builtins. But there's no way that I know of to look
> up a nonlocal name, or a name in an unspecified scope. Consequently,
> this *almost* works:
> def lookup(name):
>      import builtins
>      for namespace in (locals(), globals(), vars(builtins)):
>          try:
>              return namespace[name]
>          except KeyError:
>              pass
>      raise NameError("name '%s' not found" % name)
> except for the nonlocal scope.
> I would have guessed that you could get this working with eval, but if
> there is such a way, I can't work it out.
> I think this would make a nice addition to the inspect module. I
> wouldn't want to see it as a builtin function, since it would encourage
> a style of programming which I think is poor, but for those occasional
> uses where you want to look up a variable from an unknown scope, I think
> this would be handy.

I once used a direct try ... except NameError, which automagically looks up in 
the whole scope cascade:

i = 1
try: x = i
except NameError: x = None

# no "lookup-able" symbol 'j'
try: y = j
except NameError: y = None

print (x,y)     # ==>   1 None

Pretty practical.

[Actually, I've never had any need for this in real python code, it was to 
simulate variable strings (implanted as eg "Hello, {username}!"), which requires 
variable lookup by name, itself variable. But python already has the final 
feature (even twice, with % or format).]
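
To make that last point concrete, a small sketch of both built-in
mechanisms. The function names greet and greet_percent are made up for the
example; str.format_map and %-formatting with a mapping are standard Python 3
features.

```python
TEMPLATE = "Hello, {username}!"
PERCENT_TEMPLATE = "Hello, %(username)s!"


def greet(username):
    # format_map() looks each {name} up in the given mapping;
    # locals() here contains the parameter 'username'.
    return TEMPLATE.format_map(locals())


def greet_percent(username):
    # The older %-style does the same lookup by name in a dict.
    return PERCENT_TEMPLATE % {"username": username}


print(greet("Liam"))
print(greet_percent("Liam"))
```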


From james at  Thu Jan  2 21:29:21 2014
From: james at (James Powell)
Date: Thu, 02 Jan 2014 15:29:21 -0500
Subject: [Python-ideas] str.startswith taking any iterator instead of just
 tuple
Message-ID: <>

Some functions and methods allow the provision of a tuple of arguments
which will be looped over internally. e.g.,

    'spam'.startswith(('s', 'z')) # 'spam' starts with 's' or with 'z'
    isinstance(42, (float, int))

In these cases, CPython uses PyTuple_Check and PyTuple_GET_ITEM to
perform this internal iteration.

As a result, the following are considered invalid:

    'spam'.startswith(['s', 'z'])
    'spam'.startswith({'s', 'z'})
    'spam'.startswith(x for x in 'sz')

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: startswith first arg must be str, unicode, or tuple

There are two common workarounds:

    'spam'.startswith(tuple({'s', 'z'}))
    any('spam'.startswith(c) for c in {'s', 'z'})

Of course, the following construction already has a clear, separate meaning:

   'spam'.startswith('sz') # 'spam' starts with 'sz'

In these cases, could we supplant the PyTuple_Check with one that would
allow any iterator? Alternatively, could add this as an additional branch?

The code would look something like:

    it = PyObject_GetIter(subobj);
    if (it == NULL)
        return NULL;

    iternext = *Py_TYPE(it)->tp_iternext;
    for(;;) {
        substring = iternext(it);
        if (substring == NULL)
            Py_RETURN_FALSE;
        result = tailmatch(self, substring, start, end, -1);
        Py_DECREF(substring);
        if (result)
            Py_RETURN_TRUE;
    }
Of course, in the case of methods like .startswith, this would need to
ensure the following behaviour remains unchanged. The following should
always check if 'spam' starts with 'sz' not starts with 's' or with 'z':

    'spam'.startswith('sz')

I searched and python-ideas for any previous discussion
of this topic. If this seems reasonable, I can submit an enhancement to with a patch for unicodeobject.c:unicode_startswith

James Powell

follow: @dontusethiscode + @nycpython
attend: +

From guido at  Fri Jan  3 00:24:00 2014
From: guido at (Guido van Rossum)
Date: Thu, 2 Jan 2014 13:24:00 -1000
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

The current behavior is intentional, and the ambiguity of strings
themselves being iterables is the main reason. Since startswith() is
almost always called with a literal or tuple of literals anyway, I see
little need to extend the semantics. (I notice that you don't actually
give any examples where the iterator would be useful -- have you
encountered any, or are you just arguing for consistency's sake?)

On Thu, Jan 2, 2014 at 10:29 AM, James Powell <james at> wrote:
> Some functions and methods allow the provision of a tuple of arguments
> which will be looped over internally. e.g.,
>     'spam'.startswith(('s', 'z')) # 'spam' starts with 's' or with 'z'
>     isinstance(42, (float, int))
> In these cases, CPython uses PyTuple_Check and PyTuple_GET_ITEM to
> perform this internal iteration.
> As a result, the following are considered invalid:
>     'spam'.startswith(['s', 'z'])
>     'spam'.startswith({'s', 'z'})
>     'spam'.startswith(x for x in 'sz')
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: startswith first arg must be str, unicode, or tuple
> There are two common workarounds:
>     'spam'.startswith(tuple({'s', 'z'}))
>     any('spam'.startswith(c) for c in {'s', 'z'})
> Of course, the following construction already has a clear, separate meaning:
>    'spam'.startswith('sz') # 'spam' starts with 'sz'
> In these cases, could we supplant the PyTuple_Check with one that would
> allow any iterator? Alternatively, could add this as an additional branch?
> The code would look something like:
>     it = PyObject_GetIter(subobj);
>     if (it == NULL)
>         return NULL;
>     iternext = *Py_TYPE(it)->tp_iternext;
>     for(;;) {
>         substring = iternext(it);
>         if (substring == NULL)
>             Py_RETURN_FALSE;
>         result = tailmatch(self, substring, start, end, -1);
>         Py_DECREF(substring);
>         if (result)
>             Py_RETURN_TRUE;
>     }
> Of course, in the case of methods like .startswith, this would need to
> ensure the following behaviour remains unchanged. The following should
> always check if 'spam' starts with 'sz' not starts with 's' or with 'z':
>     'spam'.startswith('sz')
> I searched and python-ideas for any previous discussion
> of this topic. If this seems reasonable, I can submit an enhancement to
> with a patch for unicodeobject.c:unicode_startswith
> Cheers,
> James Powell
> follow: @dontusethiscode + @nycpython
> attend: +
> read:

--Guido van Rossum (

From amber.yust at  Fri Jan  3 00:33:59 2014
From: amber.yust at (Amber Yust)
Date: Thu, 02 Jan 2014 23:33:59 +0000
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
References: <>
Message-ID: <-7933402584649597485@gmail297201516>

I could see expanding to allow lists/sets as well as tuples being useful,
e.g. for using dynamically generated prefix lists without creating
additional tuple objects, but I don't see arbitrary iteration being

On Thu Jan 02 2014 at 3:25:20 PM, Guido van Rossum <guido at> wrote:

> The current behavior is intentional, and the ambiguity of strings
> themselves being iterables is the main reason. Since startswith() is
> almost always called with a literal or tuple of literals anyway, I see
> little need to extend the semantics. (I notice that you don't actually
> give any examples where the iterator would be useful -- have you
> encountered any, or are you just arguing for consistency's sake?)
> On Thu, Jan 2, 2014 at 10:29 AM, James Powell <james at>
> wrote:
> > Some functions and methods allow the provision of a tuple of arguments
> > which will be looped over internally. e.g.,
> >
> >     'spam'.startswith(('s', 'z')) # 'spam' starts with 's' or with 'z'
> >     isinstance(42, (float, int))
> >
> > In these cases, CPython uses PyTuple_Check and PyTuple_GET_ITEM to
> > perform this internal iteration.
> >
> > As a result, the following are considered invalid:
> >
> >     'spam'.startswith(['s', 'z'])
> >     'spam'.startswith({'s', 'z'})
> >     'spam'.startswith(x for x in 'sz')
> >
> >     Traceback (most recent call last):
> >       File "<stdin>", line 1, in <module>
> >     TypeError: startswith first arg must be str, unicode, or tuple
> >
> > There are two common workarounds:
> >
> >     'spam'.startswith(tuple({'s', 'z'}))
> >     any('spam'.startswith(c) for c in {'s', 'z'})
> >
> > Of course, the following construction already has a clear, separate
> meaning:
> >
> >    'spam'.startswith('sz') # 'spam' starts with 'sz'
> >
> > In these cases, could we supplant the PyTuple_Check with one that would
> > allow any iterator? Alternatively, could add this as an additional
> branch?
> >
> > The code would look something like:
> >
> >     it = PyObject_GetIter(subobj);
> >     if (it == NULL)
> >         return NULL;
> >
> >     iternext = *Py_TYPE(it)->tp_iternext;
> >     for(;;) {
> >         substring = iternext(it);
> >         if (substring == NULL)
> >             Py_RETURN_FALSE;
> >         result = tailmatch(self, substring, start, end, -1);
> >         Py_DECREF(substring);
> >         if (result)
> >             Py_RETURN_TRUE;
> >     }
> >
> > Of course, in the case of methods like .startswith, this would need to
> > ensure the following behaviour remains unchanged. The following should
> > always check if 'spam' starts with 'sz' not starts with 's' or with 'z':
> >
> >     'spam'.startswith('sz')
> >
> > I searched and python-ideas for any previous discussion
> > of this topic. If this seems reasonable, I can submit an enhancement to
> > with a patch for unicodeobject.c:unicode_startswith
> >
> > Cheers,
> > James Powell
> >
> > follow: @dontusethiscode + @nycpython
> > attend: +
> > read:
> >
> --
> --Guido van Rossum (

From james at  Fri Jan  3 00:37:56 2014
From: james at (James Powell)
Date: Thu, 02 Jan 2014 18:37:56 -0500
Subject: [Python-ideas] str.startswith taking any iterator instead of
 just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/02/2014 06:24 PM, Guido van Rossum wrote:
> The current behavior is intentional, and the ambiguity of strings
> themselves being iterables is the main reason. Since startswith() is
> almost always called with a literal or tuple of literals anyway, I see
> little need to extend the semantics. (I notice that you don't actually
> give any examples where the iterator would be useful -- have you
> encountered any, or are you just arguing for consistency's sake?)

This is driven by a real-world example wherein a large number of
prefixes are stored in a set, necessitating:

    any('spam'.startswith(c) for c in prefixes)
    # or
    'spam'.startswith(tuple(prefixes))
However, .startswith doesn't seem to be the only example of this, and
the other examples are free of the string/iterable ambiguity:

    isinstance(x, {int, float})

I do agree that it's definitely important to retain the behaviour of:

    'spam'.startswith('sz')

At the same time, I think the non-string iterable problem is already fairly
well-known and not a source of great confusion. How often has one typed:

   isinstance(x, Iterable) and not isinstance(x, str)
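
A sketch of how that idiom could be wrapped once so that callers do not
repeat it. The helper name as_prefix_tuple is hypothetical and is not part
of any proposed API; it only shows that a lone string must be special-cased
before the general iterable branch.

```python
from collections.abc import Iterable


def as_prefix_tuple(arg):
    """Normalise a str or an iterable of str into a tuple of prefixes."""
    if isinstance(arg, str):
        # A lone string is one prefix, never a sequence of characters.
        return (arg,)
    if isinstance(arg, Iterable):
        return tuple(arg)
    raise TypeError("expected str or an iterable of str")


print('spam'.startswith(as_prefix_tuple('sz')))          # single prefix 'sz'
print('spam'.startswith(as_prefix_tuple({'s', 'z'})))    # any of 's', 'z'
```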

James Powell

follow: @dontusethiscode + @nycpython
attend: +

From guido at  Fri Jan  3 00:59:04 2014
From: guido at (Guido van Rossum)
Date: Thu, 2 Jan 2014 13:59:04 -1000
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 2, 2014 at 1:37 PM, James Powell <james at> wrote:
> On 01/02/2014 06:24 PM, Guido van Rossum wrote:
>> The current behavior is intentional, and the ambiguity of strings
>> themselves being iterables is the main reason. Since startswith() is
>> almost always called with a literal or tuple of literals anyway, I see
>> little need to extend the semantics. (I notice that you don't actually
>> give any examples where the iterator would be useful -- have you
>> encountered any, or are you just arguing for consistency's sake?)
> This is driven by a real-world example wherein a large number of
> prefixes stored in a set, necessitating:
>     any('spam'.startswith(c) for c in prefixes)
>     # or
>     'spam'.startswith(tuple(prefixes))

Neither of these strikes me as bad. Also, depending on whether the set
of prefixes itself changes dynamically, it may be best to lift the
tuple() call out of the startswith() call.

Note that for performance, I suspect that the any() version will be
slower if you can avoid calling tuple() every time -- I recall once
finding that x.startswith('ab') benchmarked slower than x[:2] == 'ab'
because the name lookup for 'startswith' dominated the overall time.
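
For illustration, a sketch of lifting the tuple() call out of the loop. The
prefix set and the function names are invented for the example; no timings
are claimed here, and anyone who cares should measure with timeit on their
own data.

```python
# Hypothetical prefix set; in real code this might be built dynamically.
prefixes = {"sp", "eg", "ha"}

# Build the tuple once, as long as the set does not change afterwards.
PREFIX_TUPLE = tuple(prefixes)


def starts_with_any(word):
    # One method call; the loop over prefixes happens inside startswith().
    return word.startswith(PREFIX_TUPLE)


def starts_with_any_generator(word):
    # Equivalent result, but one startswith() call per prefix.
    return any(word.startswith(p) for p in prefixes)


for word in ("spam", "eggs", "toast"):
    print(word, starts_with_any(word), starts_with_any_generator(word))
```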

> However, .startswith doesn't seem to be the only example of this, and
> the other examples are free of the string/iterable ambiguity:
>     isinstance(x, {int, float})

But this is even less likely to have a dynamically generated argument.

And there could still be another ambiguity here: a metaclass could
conceivably make its instances (i.e. classes) iterable.
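
To make that ambiguity concrete, here is a small Python 3 sketch (IterableMeta, Color and members are invented names, not anything from the standard library) in which a class object is itself iterable, so an API accepting "a type or an iterable of types" could not tell which one it was given:

```python
from collections.abc import Iterable

class IterableMeta(type):
    # Hypothetical metaclass: iterating a class yields its `members`.
    def __iter__(cls):
        return iter(cls.members)

class Color(metaclass=IterableMeta):
    members = ("red", "green", "blue")

# The class object itself (not an instance of it) is now iterable.
color_names = list(Color)
class_is_iterable = isinstance(Color, Iterable)
instance_is_iterable = isinstance(Color(), Iterable)
```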

> I do agree that it's definitely important to retain the behaviour of:
>     'spam'.startswith('sz')

Duh. :-)

> At same time, I think the non-string iterable problem is already fairly
> well-known and not a source of great confusion. How often has one typed:
>    isinstance(x, Iterable) and not isinstance(x, str)

If you find yourself typing that a lot I think you have a bigger problem though.

All in all I hope you will give up your push for this feature. It just
doesn't seem all that important, and you really just move the
inconsistency to a different place (special-casing strings instead of
tuples).

--Guido van Rossum

From guido at  Fri Jan  3 01:16:39 2014
From: guido at (Guido van Rossum)
Date: Thu, 2 Jan 2014 14:16:39 -1000
Subject: [Python-ideas] *var()*
In-Reply-To: <20140102133500.GM29356@ando>
References: <>
Message-ID: <>

On Thu, Jan 2, 2014 at 3:35 AM, Steven D'Aprano <steve at> wrote:
> I would have guessed that you could get this working with eval, but if
> there is such a way, I can't work it out.

It's trivial if you directly invoke eval():

x = 42
def example():
  print 'first:', eval('x')
  y = 'hello world'
  print 'second:', eval('y')

will print

first: 42
second: hello world

Writing Liam's var() as a regular function would require using
sys._getframe() and won't access intermediate scopes; something like
this would at least find locals and globals:

import sys

def var(*args):
  name = ''.join(map(str, args))  # So var('count', 1) is the same as var('count1')
  frame = sys._getframe(1)  # Caller's frame
  return eval(name, frame.f_globals, frame.f_locals)

Now this works as desired:

x = 42
def example():
  print 'first:', var('x')
  y = 'hello world'
  print 'second:', var('y')
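
For readers on Python 3, the same idea might look like the sketch below. It assumes CPython (sys._getframe is an implementation detail), and the local names count1 and y are only there for illustration:

```python
import sys

def var(*args):
    # Join the pieces so var('count', 1) looks up the name 'count1'.
    name = "".join(map(str, args))
    frame = sys._getframe(1)  # the caller's frame (CPython-specific)
    # Only the caller's locals and globals are consulted, not enclosing scopes.
    return eval(name, frame.f_globals, frame.f_locals)

def example():
    count1 = 10
    y = "hello world"
    return var("count", 1), var("y")
```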

All in all, agreed this doesn't need to be added to the language,
given that it's easy enough to invoke eval() directly. (And advanced
programmers tend to use all kinds of other tricks to avoid the need.)

Two more things, especially for Liam:

(1) There was nothing stupid about your post -- welcome to the Python community!

(2) eval() is much more powerful than just variable lookup; if you
write a program that asks its user for a variable name and then pass
that to eval(), a clever user could trick your program into running
code you might not like to run, by typing an expression with a
side-effect as the "variable name". But if you're just beginning it's
probably best not to worry too much about such possibilities -- most
likely you yourself are the only user of your programs!
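
One possible way to sidestep that risk when only a variable lookup is wanted is to avoid eval() altogether and look the name up in a dictionary. This is a sketch under that assumption; lookup and settings are hypothetical names:

```python
def lookup(name, namespace):
    # Accept only a plain identifier, so an expression such as
    # "__import__('os').getcwd()" is rejected instead of being evaluated.
    if not name.isidentifier():
        raise ValueError("not a variable name: {!r}".format(name))
    return namespace[name]

settings = {"x": 42, "greeting": "hello world"}
value = lookup("x", settings)

try:
    lookup("__import__('os').getcwd()", settings)
    rejected = False
except ValueError:
    rejected = True
```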

--Guido van Rossum

From james at  Fri Jan  3 01:39:07 2014
From: james at (James Powell)
Date: Thu, 02 Jan 2014 19:39:07 -0500
Subject: [Python-ideas] str.startswith taking any iterator instead of
 just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/02/2014 06:59 PM, Guido van Rossum wrote:
>> This is driven by a real-world example wherein a large number of
>> prefixes stored in a set, necessitating:
>>     any('spam'.startswith(c) for c in prefixes)
>>     # or
>>     'spam'.startswith(tuple(prefixes))
> Neither of these strikes me as bad. Also, depending on whether the set
> of prefixes itself changes dynamically, it may be best to lift the
> tuple() call out of the startswith() call.

I agree. The any() formulation proves good enough in practice.

Creating a tuple can be a bit tricky, since the list of prefixes could
be large and could change.

>> However, .startswith doesn't seem to be the only example of this, and
>> the other examples are free of the string/iterable ambiguity:
>>     isinstance(x, {int, float})
> And there could still be another ambiguity here: a metaclass could
> conceivably make its instances (i.e. classes) iterable.

It's an interesting point that there's fundamental ambiguity between
providing an iterable of arguments or providing a single argument that
is itself an iterable (e.g., in the case of a type that is itself
iterable, like Enum)

In fact, I've actually warmed up to the any() formulation, because it
makes explicit which behaviour you want.

>> I do agree that it's definitely important to retain the behaviour of:
>>     'spam'.startswith('sz')
> Duh. :-)

You never know...

> All in all I hope you will give up your push for this feature. It just
> doesn't seem all that important, and you really just move the
> inconsistency to a different place (special-casing strings instead of
> tuples).

For these functions and methods, being able to provide a tuple of
arguments instead of a single argument seems mostly a convenience. It
allows the most common case of wanting to internalise the iteration with
a minimum of ambiguity. The any() or tuple() formulation are available
where needed.

In the end, I'm happy to drop the push for this feature.

(In general, I agree that there isn't a need to stamp out all
inconsistencies or to belabour the use of abstract types.)

James Powell

follow: @dontusethiscode + @nycpython
attend: +

From python at  Fri Jan  3 01:19:51 2014
From: python at (Alexander Heger)
Date: Fri, 3 Jan 2014 11:19:51 +1100
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

>>    isinstance(x, Iterable) and not isinstance(x, str)
> If you find yourself typing that a lot I think you have a bigger problem though.

How do you replace this?

From guido at  Fri Jan  3 01:49:14 2014
From: guido at (Guido van Rossum)
Date: Thu, 2 Jan 2014 14:49:14 -1000
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

By designing an API that doesn't require such overloading.

On Thursday, January 2, 2014, Alexander Heger wrote:

> >>    isinstance(x, Iterable) and not isinstance(x, str)
> >
> > If you find yourself typing that a lot I think you have a bigger problem
> though.
> How do you replace this?

--Guido van Rossum (on iPad)

From steve at  Fri Jan  3 02:18:34 2014
From: steve at (Steven D'Aprano)
Date: Fri, 3 Jan 2014 12:18:34 +1100
Subject: [Python-ideas] *var()*
In-Reply-To: <>
References: <>
Message-ID: <20140103011833.GP29356@ando>

On Thu, Jan 02, 2014 at 02:16:39PM -1000, Guido van Rossum wrote:
> On Thu, Jan 2, 2014 at 3:35 AM, Steven D'Aprano <steve at> wrote:
> > I would have guessed that you could get this working with eval, but if
> > there is such a way, I can't work it out.
> It's trivial if you directly invoke eval():

That's what I thought too, but I get surprising results with
nonlocals.

a = b = "global"

def test1():
    b = c = "nonlocal"
    def inner():
        d = "local"
        return (a, b, c, d)
    return inner()

def test2():
    b = c = "nonlocal"
    def inner():
        d = "local"
        c  # Need this or the function fails with NameError.
        return (eval('a'), eval('b'), eval('c'), eval('d'))
    return inner()

assert test1() == test2()  # Fails.

test1() returns ('global', 'nonlocal', 'nonlocal', 'local'), which is 
what I expect. But test2() returns ('global', 'global', 'nonlocal', 'local'),
which surprises me.

If I understand what is going on in test2's inner function, eval('b') 
doesn't see the nonlocal b so it picks up the global b. (If there is no 
global b, you get NameError.) But eval('c') sees the nonlocal c because 
we have a closure, due to the reference to c in the previous line.

If there's a way to get eval('b') to return "nonlocal" without having a 
closure, I don't know it. This suggests to me that you can't reliably 
look-up a nonlocal from an inner function using eval.
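
The effect can be reduced to a smaller Python 3 sketch (the names without_closure, with_closure and hidden are invented for the example, and it assumes there is no global called hidden):

```python
def without_closure():
    hidden = "nonlocal"
    def inner():
        # `hidden` is never referenced here, so inner() gets no closure cell
        # for it and eval() cannot find the name.
        try:
            return eval("hidden")
        except NameError:
            return "NameError"
    return inner()

def with_closure():
    hidden = "nonlocal"
    def inner():
        hidden  # Referencing the name makes it a free variable of inner().
        return eval("hidden")
    return inner()
```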


From python at  Fri Jan  3 04:54:57 2014
From: python at (Alexander Heger)
Date: Fri, 3 Jan 2014 14:54:57 +1100
Subject: [Python-ideas] strings as iterables - from str.startswith taking
 any iterator instead of just tuple
Message-ID: <>

> By designing an API that doesn't require such overloading.
> On Thursday, January 2, 2014, Alexander Heger wrote:
>> >>    isinstance(x, Iterable) and not isinstance(x, str)
>> >
>> > If you find yourself typing that a lot I think you have a bigger problem
>> > though.
>> How do you replace this?

for my applications this seemed the most natural way - have the method
deal with what it is fed, which could be strings or any kind of
collections or iterables of strings.  But never would I want to
disassemble strings into characters.  From the previous message I
gather that I am not the only one with this application case.

Generally, I find strings being iterables of characters as useful as
if integers were iterables of bits.  They should just be units.  They
already start out being not mutable.  I think it would be a positive
design change for Python 4 to make them units instead of being
iterables.  At least for me, there is much fewer applications where
the latter is useful than where it requires extra code.  Overall, it
makes the language less clean that a string is an iterable; a special
case we always have to code around.

I know it will break a lot of existing code, but so did the string
change from py2 to 3.  (It would break very few of my codes, though.)


From rosuav at  Fri Jan  3 04:59:51 2014
From: rosuav at (Chris Angelico)
Date: Fri, 3 Jan 2014 14:59:51 +1100
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 3, 2014 at 2:54 PM, Alexander Heger <python at> wrote:
> Generally, I find strings being iterables of characters as useful as
> if integers were iterables of bits.  They should just be units.

What this would mean is that any time you want to iterate over the
characters, you'd have to iterate over string.split('') instead. So
the question is, is that common enough to be a problem?

The other point that comes to mind is that iteration and indexing are
closely related. I think most people would agree that "abcde"[1]
should be 'b' (granted, there's room for debate as to whether that
should be a one-character string or an integer with the Unicode
codepoint, but either way); it's possible to iterate over anything by
indexing it with 0, then 1, then 2, etc, until it raises IndexError.
For a string to not be iterable, that identity would have to be
broken.


From breamoreboy at  Fri Jan  3 05:27:15 2014
From: breamoreboy at (Mark Lawrence)
Date: Fri, 03 Jan 2014 04:27:15 +0000
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <la5e5r$sk0$>

On 03/01/2014 03:54, Alexander Heger wrote:
> Generally, I find strings being iterables of characters as useful as
> if integers were iterables of bits.  They should just be units.  They
> already start out being not mutable.  I think it would be a positive
> design change for Python 4 to make them units instead of being
> iterables.  At least for me, there is much fewer applications where
> the latter is useful than where it requires extra code.  Overall, it
> makes the language less clean that a string is an iterable; a special
> case we always have to code around.

I find your terminology misleading.  A string is a sequence in the same 
way that list, tuple, range, bytes, bytearray and memoryview are.
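
That claim can be checked against the Sequence ABC in collections.abc. The sample values here are arbitrary, and this is just a quick sketch:

```python
from collections.abc import Sequence

samples = ["text", [1, 2], (1, 2), range(3), b"raw", bytearray(b"raw")]
all_sequences = all(isinstance(s, Sequence) for s in samples)

# A set is iterable but not a sequence: it has no indexing or ordering.
set_is_sequence = isinstance({1, 2}, Sequence)
```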

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From guido at  Fri Jan  3 05:58:06 2014
From: guido at (Guido van Rossum)
Date: Thu, 2 Jan 2014 18:58:06 -1000
Subject: [Python-ideas] *var()*
In-Reply-To: <20140103011833.GP29356@ando>
References: <>
Message-ID: <>

Right, that's why I said "won't access intermediate scopes"...

On Thursday, January 2, 2014, Steven D'Aprano wrote:

> On Thu, Jan 02, 2014 at 02:16:39PM -1000, Guido van Rossum wrote:
> > On Thu, Jan 2, 2014 at 3:35 AM, Steven D'Aprano <steve at> wrote:
> > > I would have guessed that you could get this working with eval, but if
> > > there is such a way, I can't work it out.
> >
> > It's trivial if you directly invoke eval():
> That's what I thought too, but I get surprising results with
> nonlocals.
> a = b = "global"
> def test1():
>     b = c = "nonlocal"
>     def inner():
>         d = "local"
>         return (a, b, c, d)
>     return inner()
> def test2():
>     b = c = "nonlocal"
>     def inner():
>         d = "local"
>         c  # Need this or the function fails with NameError.
>         return (eval('a'), eval('b'), eval('c'), eval('d'))
>     return inner()
> assert test1() == test2()  # Fails.
> test1() returns ('global', 'nonlocal', 'nonlocal', 'local'), which is
> what I expect. But test2() returns ('global', 'global', 'nonlocal',
> 'local'),
> which surprises me.
> If I understand what is going on in test2's inner function, eval('b')
> doesn't see the nonlocal b so it picks up the global b. (If there is no
> global b, you get NameError.) But eval('c') sees the nonlocal c because
> we have a closure, due to the reference to c in the previous line.
> If there's a way to get eval('b') to return "nonlocal" without having a
> closure, I don't know it. This suggests to me that you can't reliably
> look-up a nonlocal from an inner function using eval.
> --
> Steven

--Guido van Rossum (on iPad)

From tjreedy at  Fri Jan  3 10:23:14 2014
From: tjreedy at (Terry Reedy)
Date: Fri, 03 Jan 2014 04:23:14 -0500
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <la5vhn$ki$>

On 1/2/2014 10:59 PM, Chris Angelico wrote:
> On Fri, Jan 3, 2014 at 2:54 PM, Alexander Heger <python at> wrote:
>> Generally, I find strings being iterables of characters as useful as
>> if integers were iterables of bits.  They should just be units.
> What this would mean is that any time you want to iterate over the
> characters, you'd have to iterate over string.split('') instead. So
> the question is, is that common enough to be a problem?
> The other point that comes to mind is that iteration and indexing are
> closely related.

def iter(ob): # is something like (ignoring two param form)
   if hasattr(ob, '__iter__'):
     return ob.__iter__()
   elif hasattr(ob, '__getitem__'):
     return iterator(ob)  # generic iterator that indexes ob from 0 upward

In 2.x, str does *not* have .__iter__, so the second branch is taken.

 >>> iter('ab')
<iterator object at 0x0000000002ED56D8>

In 3.x, str *does* have .__iter__.

 >>> iter('ab')
<str_iterator object at 0x00000000037D2EB8>

If .__iter__ were removed, strings would revert to using the generic 
iterator and would *still* be iterable.
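
The fallback is easy to see with a user-defined class. In this sketch (Squares is a made-up example class) only __getitem__ is defined, yet the object can still be iterated because iter() indexes from 0 until IndexError is raised:

```python
class Squares:
    # Deliberately defines __getitem__ only, no __iter__.
    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if not 0 <= index < self.n:
            raise IndexError(index)
        return index * index

values = list(Squares(4))
has_iter = hasattr(Squares, "__iter__")
```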

> I think most people would agree that "abcde"[1]
> should be 'b' (granted, there's room for debate as to whether that
> should be a one-character string or an integer with the Unicode
> codepoint, but either way); it's possible to iterate over anything by
> indexing it with 0, then 1, then 2, etc, until it raises IndexError.
> For a string to not be iterable, that identity would have to be
> broken.

Which, to me, would be really ugly ;-).

Terry Jan Reedy

From denis.spir at  Fri Jan  3 11:19:35 2014
From: denis.spir at (spir)
Date: Fri, 03 Jan 2014 11:19:35 +0100
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/03/2014 04:54 AM, Alexander Heger wrote:
>> By designing an API that doesn't require such overloading.
>> On Thursday, January 2, 2014, Alexander Heger wrote:
>>>>>     isinstance(x, Iterable) and not isinstance(x, str)
>>>> If you find yourself typing that a lot I think you have a bigger problem
>>>> though.
>>> How do you replace this?
> for my applications this seemed the most natural way - have the method
> deal with what it is fed, which could be strings or any kind of
> collections or iterables of strings.  But never would I want to
> disassemble strings into characters.  From the previous message I
> gather that I am not the only one with this application case.
> Generally, I find strings being iterables of characters as useful as
> if integers were iterables of bits.  They should just be units.  They
> already start out being not mutable.  I think it would be a positive
> design change for Python 4 to make them units instead of being
> iterables.  At least for me, there is much fewer applications where
> the latter is useful than where it requires extra code.  Overall, it
> makes the language less clean that a string is an iterable; a special
> case we always have to code around.
> I know it will break a lot of existing code, but so did the string
> change from py2 to 3.  (It would break very few of my codes, though.)

I agree there is an occasional need, which I have also met in real code: it was parse
result data, which can be a string (terminal patterns, that really "eat" part of
the source) or a list (or other "true" iterable collection, for composite or
repetitive patterns). But the case is rare because it requires a coincidence of
conditions:
* both strings and collections may come as input
* both are valid, from the point of view of the app's logic
* one wants to iterate collections, but not strings

On the other hand, I find you dismiss much too quickly the real and very common need
to iterate strings (over their lowest units, code points), apparently on the sole
basis that in your own programming practice you don't need/want it.

We should not make iterating strings a special case (e.g. by requiring an explicit
call to an iterator, like "for ucode in s.ucodes()"), because the case is so common.
Instead we may consider finding a way to exclude strings in some collection
traversal idiom (for which I have no good proposal: the obvious one would be .items(),
but it's used for a different meaning), which would for instance raise an
exception on strings because they don't match the idiom ("str object has no
'items' attribute").


From ncoghlan at  Fri Jan  3 12:41:09 2014
From: ncoghlan at (Nick Coghlan)
Date: Fri, 3 Jan 2014 21:41:09 +1000
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 3 January 2014 20:19, spir <denis.spir at> wrote:
> On 01/03/2014 04:54 AM, Alexander Heger wrote:
>>> By designing an API that doesn't require such overloading.
>>> On Thursday, January 2, 2014, Alexander Heger wrote:
>>>>>>     isinstance(x, Iterable) and not isinstance(x, str)
>>>>> If you find yourself typing that a lot I think you have a bigger
>>>>> problem
>>>>> though.
>>>> How do you replace this?
>> for my applications this seemed the most natural way - have the method
>> deal with what it is fed, which could be strings or any kind of
>> collections or iterables of strings.  But never would I want to
>> disassemble strings into characters.  From the previous message I
>> gather that I am not the only one with this application case.
>> Generally, I find strings being iterables of characters as useful as
>> if integers were iterables of bits.  They should just be units.  They
>> already start out being not mutable.  I think it would be a positive
>> design change for Python 4 to make them units instead of being
>> iterables.  At least for me, there is much fewer applications where
>> the latter is useful than where it requires extra code.  Overall, it
>> makes the language less clean that a string is an iterable; a special
>> case we always have to code around.
>> I know it will break a lot of existing code, but so did the string
>> change from py2 to 3.  (It would break very few of my codes, though.)
> I agree there is an occasionnal need which I also met in real code: it was
> parse result data, which can be a string (terminal patterns, that really
> "eat" part of the source) or list (or otherwise "tre" iterable collection,
> for composite or repetitive patterns). But the case is rare because it
> requires coincidence of conditions:
> * both string and collections may come as input
> * both are valid, from the app's logics' point of view
> * one want to iterate collections, but not strings
> On the other hand, I find you much too quickly dismiss real and very common
> need to iterate strings (on the lowest units of code points), apparently on
> the only base that in your own programming practice you don't need/want it.
> We should not make iterating strings a special case (eg by requiring
> explicit call to an iterator like for ucode in s.ucodes() because the case
> is so common. Instead we may consider finding a way to exclude strings in
> some collection traversal idiom (for which I have good proposal: the obvious
> one would .items(), but it's used for a different meaning), which would for
> instance yield an exception on strings because they don't match the idiom
> ("str object has no 'items' attribute").

The underlying problem is that strings have a dual nature: you can
view them as either a sequence of code points (which is how Python
models them), or else you can view them as an opaque chunk of text
(which is often how you want to treat them in code that accepts either
containers or atomic values and treats them differently).

This has some interesting implications for API design.

"def f(*args)" handles the constraint fairly well, as f("astring") is
treated as a single value and f(*"string") is an unlikely mistake for
anyone to make.

"def f(iterable)" has problems in many cases, since f("string") is
treated as an iterable of code points, even if you'd prefer an
immediate error.
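
A short sketch of the first two signatures may help here. The function names tag_args and tag_iterable are hypothetical, and the explicit str check is just one way to get the "immediate error":

```python
def tag_args(*tags):
    # Each positional argument is one tag, so a lone string stays whole.
    return list(tags)

def tag_iterable(tags):
    # Reject a bare string up front instead of iterating its characters.
    if isinstance(tags, str):
        raise TypeError("expected an iterable of strings, not a single str")
    return list(tags)

whole = tag_args("spam")
several = tag_iterable(["spam", "eggs"])
# Without the check, a bare string would silently be split into characters.
split_up = list("spam")
```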

"def f(iterable_or_atomic)" also has problems, since strings will use
the "iterable" path, even if the atomic handling would be more
appropriate.

Algorithms that recursively descend into containers also need to deal
with the fact that doing so with strings causes an infinite loop
(since iterating over a string produces length 1 strings).

This is a genuine problem, which is why the question of how to cleanly
deal with these situations keeps coming up every couple of years, and
the current state of the art answer is "grit your teeth and use
isinstance(obj, str)" (or a configurable alternative).
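
As an example of the "grit your teeth" approach, a recursive flatten might be sketched like this (flatten is a hypothetical helper; treating bytes and bytearray as atomic alongside str is an assumption of the sketch):

```python
from collections.abc import Iterable

def flatten(obj):
    # Strings are iterable, and every character is itself a string, so
    # without this check the recursion below would never bottom out.
    if isinstance(obj, (str, bytes, bytearray)) or not isinstance(obj, Iterable):
        yield obj
        return
    for item in obj:
        yield from flatten(item)

flat = list(flatten([1, ["ab", (2, [3])], "cd"]))
```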

However, I'm wondering if it might be reasonable to add a new entry in
collections.abc for 3.5:

>>> from abc import ABC
>>> from collections.abc import Iterable
>>> class Atomic(ABC):
...     @classmethod
...     def __subclasshook__(cls, subclass):
...         if not issubclass(subclass, Iterable):
...             return True
...         return NotImplemented
...
>>> Atomic.register(str)
<class 'str'>
>>> Atomic.register(bytes)
<class 'bytes'>
>>> Atomic.register(bytearray)
<class 'bytearray'>
>>> isinstance(1, Atomic)
True
>>> isinstance(1.0, Atomic)
True
>>> isinstance(1j, Atomic)
True
>>> isinstance("Hello", Atomic)
True
>>> isinstance(b"Hello", Atomic)
True
>>> isinstance((), Atomic)
False
>>> isinstance([], Atomic)
False
>>> isinstance({}, Atomic)
False

Any type which wasn't iterable would automatically be considered
atomic, while some types which *are* iterable could *also* be
registered as atomic (with str, bytes and bytearray being the obvious
candidates, as shown above).

Armed with such an ABC, you could then write an "iter_non_atomic"
helper function as:

    def iter_non_atomic(iterable):
        if isinstance(iterable, Atomic):
            raise TypeError("{!r} is considered atomic".format(
                iterable.__class__.__name__))
        return iter(iterable)
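
The same ABC could also serve the "iterable_or_atomic" style of API. The following self-contained sketch repeats the Atomic definition so that it runs on its own; as_items is a hypothetical helper, not part of the proposal:

```python
from abc import ABC
from collections.abc import Iterable

class Atomic(ABC):
    @classmethod
    def __subclasshook__(cls, subclass):
        # Anything that is not iterable counts as atomic automatically.
        if not issubclass(subclass, Iterable):
            return True
        return NotImplemented

# Iterable types that should nevertheless be treated as single values.
for text_type in (str, bytes, bytearray):
    Atomic.register(text_type)

def as_items(value):
    # Hypothetical helper: wrap atomic values, pass containers through.
    if isinstance(value, Atomic):
        return [value]
    return list(value)
```

With this, as_items("spam") gives a one-element list, while a tuple or list of strings is traversed as usual.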


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From masklinn at  Fri Jan  3 13:12:41 2014
From: masklinn at (Masklinn)
Date: Fri, 3 Jan 2014 13:12:41 +0100
Subject: [Python-ideas] strings as iterables - from str.startswith
	taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 2014-01-03, at 12:41 , Nick Coghlan <ncoghlan at> wrote:
> "def f(iterable_or_atomic)" also has problems, since strings will use
> the "iterable" path, even if the atomic handling would be more
> appropriate.
> Algorithms that recursively descend into containers also need to deal
> with the fact that doing so with strings causes an infinite loop
> (since iterating over a string produces length 1 strings).
> This is a genuine problem, which is why the question of how to cleanly
> deal with these situations keeps coming up every couple of years, and
> the current state of the art answer is "grit your teeth and use
> isinstance(obj, str)" (or a configurable alternative).
> However, I'm wondering if it might be reasonable to add a new entry in
> collections.abc for 3.5:
>>>> from abc import ABC
>>>> from collections.abc import Iterable
>>>> class Atomic(ABC):
> ...     @classmethod
> ...     def __subclasshook__(cls, subclass):
> ...         if not issubclass(subclass, Iterable):
> ...             return True
> ...         return NotImplemented
> ...

I've used some sort of ad-hoc version of it enough that I think it's
a good idea, although I'd suggest "scalar": "atomic" also
exists (with very different semantics) in concurrency contexts, whereas
I believe scalar always means single-value (non-compound) data type.

>>>> Atomic.register(str)
> <class 'str'>
>>>> Atomic.register(bytes)
> <class 'bytes'>
>>>> Atomic.register(bytearray)
> <class 'bytearray'>
>>>> isinstance(1, Atomic)
> True
>>>> isinstance(1.0, Atomic)
> True
>>>> isinstance(1j, Atomic)
> True
>>>> isinstance("Hello", Atomic)
> True
>>>> isinstance(b"Hello", Atomic)
> True
>>>> isinstance((), Atomic)
> False
>>>> isinstance([], Atomic)
> False
>>>> isinstance({}, Atomic)
> False

From ncoghlan at  Fri Jan  3 13:30:31 2014
From: ncoghlan at (Nick Coghlan)
Date: Fri, 3 Jan 2014 22:30:31 +1000
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 3 January 2014 22:12, Masklinn <masklinn at> wrote:
> On 2014-01-03, at 12:41 , Nick Coghlan <ncoghlan at> wrote:
>> "def f(iterable_or_atomic)" also has problems, since strings will use
>> the "iterable" path, even if the atomic handling would be more
>> appropriate.
>> Algorithms that recursively descend into containers also need to deal
>> with the fact that doing so with strings causes an infinite loop
>> (since iterating over a string produces length 1 strings).
>> This is a genuine problem, which is why the question of how to cleanly
>> deal with these situations keeps coming up every couple of years, and
>> the current state of the art answer is "grit your teeth and use
>> isinstance(obj, str)" (or a configurable alternative).
>> However, I'm wondering if it might be reasonable to add a new entry in
>> collections.abc for 3.5:
>>>>> from abc import ABC
>>>>> from collections.abc import Iterable
>>>>> class Atomic(ABC):
>> ...     @classmethod
>> ...     def __subclasshook__(cls, subclass):
>> ...         if not issubclass(subclass, Iterable):
>> ...             return True
>> ...         return NotImplemented
>> ...
> I've used some sort of ad-hoc version of it enough that I think it's
> a good idea, although I'd suggest "scalar": "atomic" also
> exists (with very different semantics) in concurrency contexts, whereas
> I believe scalar always means single-value (non-compound) data type.

Yeah, that makes sense. I believe the NumPy folks run into a somewhat
similar issue with the subtle distinction between treating scalars as
scalars and treating them as zero-dimensional arrays.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From denis.spir at  Fri Jan  3 15:17:44 2014
From: denis.spir at (spir)
Date: Fri, 03 Jan 2014 15:17:44 +0100
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/03/2014 01:12 PM, Masklinn wrote:
> I've used some sort of ad-hoc version of it enough that I think it's
> a good idea, although I'd suggest "scalar": "atomic" also
> exists (with very different semantics) in concurrency contexts, whereas
> I believe scalar always means single-value (non-compound) data type.

I used to use, for non highly educated folks, "element" or "elementary" 
(considering "scalar" too rare a term, and "atomic" potentially misleading).


From joshua at  Fri Jan  3 15:17:19 2014
From: joshua at (Joshua Landau)
Date: Fri, 3 Jan 2014 14:17:19 +0000
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 3 January 2014 12:12, Masklinn <masklinn at> wrote:
> On 2014-01-03, at 12:41 , Nick Coghlan <ncoghlan at> wrote:
> I've used some sort of ad-hoc version of it enough that I think it's
> a good idea, although I'd suggest "scalar": "atomic" also
> exists (with very different semantics) in concurrency contexts, whereas
> I believe scalar always means single-value (non-compound) data type.

OTOH, to many non-mathematical people I hardly expect "is this scalar"
to feel nearly as meaningful a question as "is this atomic".

To bike-shed, how about "unitary".

Nevertheless, I like the idea and the problem is a real one.

From denis.spir at  Fri Jan  3 15:21:31 2014
From: denis.spir at (spir)
Date: Fri, 03 Jan 2014 15:21:31 +0100
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>	<>
Message-ID: <>

On 01/03/2014 12:41 PM, Nick Coghlan wrote:
> The underlying problem is that strings have a dual nature: you can
> view them as either a sequence of code points (which is how Python
> models them), or else you can view them as an opaque chunk of text
> (which is often how you want to treat them in code that accepts either
> containers or atomic values and treats them differently).
> This has some interesting implications for API design.
> "def f(*args)" handles the constraint fairly well, as f("astring") is
> treated as a single value and f(*"string") is an unlikely mistake for
> anyone to make.
> "def f(iterable)" has problems in many cases, since f("string") is
> treated as an iterable of code points, even if you'd prefer an
> immediate error.
> "def f(iterable_or_atomic)" also has problems, since strings will use
> the "iterable" path, even if the atomic handling would be more
> appropriate.
> Algorithms that recursively descend into containers also need to deal
> with the fact that doing so with strings causes an infinite loop
> (since iterating over a string produces length 1 strings).
> This is a genuine problem, which is why the question of how to cleanly
> deal with these situations keeps coming up every couple of years, and
> the current state of the art answer is "grit your teeth and use
> isinstance(obj, str)" (or a configurable alternative).
> However, I'm wondering if it might be reasonable to add a new entry in
> collections.abc for 3.5:
> >>> from abc import ABC
> >>> from collections.abc import Iterable
> >>> class Atomic(ABC):
> ...     @classmethod
> ...     def __subclasshook__(cls, subclass):
> ...         if not issubclass(subclass, Iterable):
> ...             return True
> ...         return NotImplemented
> ...
> >>> Atomic.register(str)
> <class 'str'>
> >>> Atomic.register(bytes)
> <class 'bytes'>
> >>> Atomic.register(bytearray)
> <class 'bytearray'>
> >>> isinstance(1, Atomic)
> True
> >>> isinstance(1.0, Atomic)
> True
> >>> isinstance(1j, Atomic)
> True
> >>> isinstance("Hello", Atomic)
> True
> >>> isinstance(b"Hello", Atomic)
> True
> >>> isinstance((), Atomic)
> False
> >>> isinstance([], Atomic)
> False
> >>> isinstance({}, Atomic)
> False
> Any type which wasn't iterable would automatically be considered
> atomic, while some types which *are* iterable could *also* be
> registered as atomic (with str, bytes and bytearray being the obvious
> candidates, as shown above).
> Armed with such an ABC, you could then write an "iter_non_atomic"
> helper function as:
>      def iter_non_atomic(iterable):
>          if isinstance(iterable, Atomic):
>              raise TypeError("{!r} is considered atomic".format(
>                  iterable.__class__.__name__))
>          return iter(iterable)

I like this solution. But would live with checking for type (usually str). The 
point is that, while not that uncommon, when the issue arises one has to deal 
with it at one or at most a few places in code (typically at the start of one or a few 
methods of a given type). It is not as if we had to carry an unneeded overload 
about everywhere.


From ncoghlan at  Fri Jan  3 15:39:15 2014
From: ncoghlan at (Nick Coghlan)
Date: Sat, 4 Jan 2014 00:39:15 +1000
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 4 January 2014 00:21, spir <denis.spir at> wrote:
> On 01/03/2014 12:41 PM, Nick Coghlan wrote:
>> Armed with such an ABC, you could then write an "iter_non_atomic"
>> helper function as:
>>      def iter_non_atomic(iterable):
>>          if isinstance(iterable, Atomic):
>>              raise TypeError("{!r} is considered atomic".format(
>>                  iterable.__class__.__name__))
>>          return iter(iterable)
> I like this solution. But would live with checking for type (usually str).

The ducktyping variant I've also used on occasion is "hasattr(obj,
'encode')" rather than an instance check against a concrete type (it
also has the benefit of picking up both str and unicode in Python 2
when writing 2/3 compatible code that can't rely on basestring, as
well as UserString instances)
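
A minimal sketch of that duck-typed check, for illustration only. The
helper names looks_like_text and iter_items are made up here, not an
existing API. Note that on Python 3 this does not catch bytes or
bytearray, since they have decode() rather than encode().

```python
from collections import UserString


def looks_like_text(obj):
    """Hypothetical helper: treat anything with an encode() method as text."""
    return hasattr(obj, "encode")


def iter_items(obj):
    """Iterate over obj, but refuse text-like values instead of splitting them."""
    if looks_like_text(obj):
        raise TypeError(
            "{!r} is text-like, not a collection".format(type(obj).__name__))
    return iter(obj)
```

With this, iter_items(["a", "b"]) yields the two items, while
iter_items("ab") raises TypeError rather than yielding "a" and "b".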

> The point is that, while not that uncommon, when the issue arises one has to
> deal with it at one or at most a few places in code (typically at the start of
> one or a few methods of a given type). It is not as if we had to carry an
> unneeded overload about everywhere.

Right, I see it as very similar to the "is that a sequence or a
mapping?" question that was one of the key motivations for adding the
ABC machinery in the first place. For that case, people historically
used a check like "hasattr(obj, 'keys')" (and I think we still do that
in a couple of places).

Here, the distinction is between true container types like sets,
dicts and lists, and more structured iterables like strings, where the
whole is substantially more than the sum of its parts.

Actually, that would be another way of carving out the distinction -
rather than trying to cover *all* Atomic types, just have an
AtomicIterable ABC that indicated any structure where applying
operations like "flatten" doesn't make sense. In addition to str,
bytes and bytearray, memoryview and namedtuple instances would also be
appropriate candidates.
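
A rough sketch of how that could look, assuming the ABC is a plain
marker with explicit registration. Nothing named AtomicIterable exists
in the stdlib, and flatten is just an illustrative helper. Only str,
bytes and bytearray are registered in this sketch.

```python
from abc import ABC
from collections.abc import Iterable


class AtomicIterable(ABC):
    """Hypothetical marker ABC: iterable, but not meant to be flattened."""


# Register the obvious candidates; other types could be added the same way.
for _type in (str, bytes, bytearray):
    AtomicIterable.register(_type)


def flatten(obj):
    """Recursively yield leaves, treating AtomicIterable instances as leaves."""
    if isinstance(obj, Iterable) and not isinstance(obj, AtomicIterable):
        for item in obj:
            yield from flatten(item)
    else:
        yield obj
```

Because strings are treated as leaves, flatten never recurses into the
length 1 strings that iterating a string produces.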

The Iterable suffix would indicate directly that this wasn't related
to concurrency.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From stephen at  Fri Jan  3 16:54:17 2014
From: stephen at (Stephen J. Turnbull)
Date: Sat, 04 Jan 2014 00:54:17 +0900
Subject: [Python-ideas] strings as iterables - from
	str.startswith	taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

Masklinn writes:

 > I've used some sort of ad-hoc version of it enough that I think it's
 > a good idea, although I'd suggest "scalar": "atomic" also
 > exists (with very different semantics) in concurrency contexts, whereas
 > I believe scalar always means single-value (non-compound) data type.

Sure, but if you're a Unicode geek "scalar" essentially means
"character", so a string ain't that!

Seriously, all the good words have been taken two or three times
already in some other field.  Pick one and don't worry about the
overloading -- learning to spell English is *much* harder.

From denis.spir at  Fri Jan  3 17:31:22 2014
From: denis.spir at (spir)
Date: Fri, 03 Jan 2014 17:31:22 +0100
Subject: [Python-ideas] strings as iterables - from
 str.startswith	taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/03/2014 04:54 PM, Stephen J. Turnbull wrote:
> Masklinn writes:
>   > I've used some sort of ad-hoc version of it enough that I think it's
>   > a good idea, although I'd suggest "scalar": "atomic" also
>   > exists (with very different semantics) in concurrency contexts, whereas
>   > I believe scalar always means single-value (non-compound) data type.
> Sure, but if you're a Unicode geek "scalar" essentially means
> "character", so a string ain't that!

Unfortunately in unicode slang "character" does not mean character ;-) (but, 
say, whatever a code point happens to represent)

> Seriously, all the good words have been taken two or three times
> already in some other field.  Pick one and don't worry about the
> overloading -- learning to spell English is *much* harder.

Thankfully no one needs spelling english corectly to program --except for 


From denis.spir at  Fri Jan  3 17:39:15 2014
From: denis.spir at (spir)
Date: Fri, 03 Jan 2014 17:39:15 +0100
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>	<>	<>	<>
Message-ID: <>

On 01/03/2014 03:39 PM, Nick Coghlan wrote:
> Here, the distinction is between true container types like sets,
> dicts and lists, and more structured iterables like strings, where the
> whole is substantially more than the sum of its parts.

That's it: the unique property of strings is that composing & combining are the 
same operation, while for true containers they are distinct: when combining sets 
(union), one gets a set at the same complexity level, whatever the items are, 
while when composing sets one gets a set of sets.

> Actually, that would be another way of carving out the distinction -
> rather than trying to cover *all* Atomic types, just have an
> AtomicIterable ABC that indicated any structure where applying
> operations like "flatten" doesn't make sense. In addition to str,
> bytes and bytearray, memoryview and namedtuple instances would also be
> appropriate candidates.

Yes, maybe it's more practical; but an ABC type common to strings (and the like) 
and atomic types also makes sense.


PS: I had another common use case at times, with trees whose leaves may be 
strings, or not (esp. for their str and repr methods).

From abarnert at  Fri Jan  3 18:27:21 2014
From: abarnert at (Andrew Barnert)
Date: Fri, 3 Jan 2014 09:27:21 -0800
Subject: [Python-ideas] strings as iterables - from str.startswith
	taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 3, 2014, at 6:39, Nick Coghlan <ncoghlan at> wrote:

> The Iterable suffix would indicate directly that this wasn't related
> to concurrency.

I don't know; something whose iter was guaranteed to return an iterator that I could next without synchronizing could be pretty handy. ;)

More seriously, I think a strength of your original version was having a single abstract type for both non-iterables and things that are iterable but you sometimes don't want to treat that way. A flatten function that uses "not isinstance(x, Iterable) or isinstance(x, AtomicIterable)" is less obvious than one that just uses "isinstance(x, Atomic)", and will be a source of 10x as many stupid "oops I used and instead of or" type bugs.

If there really is no acceptable name for the easier concept, the tradeoff could be worth it anyway, but I think it's worth trying harder for one.

One last question to bring up: Is there a reasonable/common use case where you do want to flatten multi-char strings to single-char strings, but then want to treat single-char strings as atoms? I can certainly imagine toy cases like that, but it could easily be so rarely useful that it's ok to leave that clumsy to write.
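
To make the comparison concrete, here is a sketch with both checks side
by side. It assumes Nick's Atomic ABC as posted, plus a plain marker
AtomicIterable; neither exists in the stdlib, and the two helper
function names are made up for this example.

```python
from abc import ABC
from collections.abc import Iterable


class Atomic(ABC):
    """Hypothetical ABC: non-iterables, plus explicitly registered iterables."""

    @classmethod
    def __subclasshook__(cls, subclass):
        if not issubclass(subclass, Iterable):
            return True
        return NotImplemented


class AtomicIterable(ABC):
    """Hypothetical marker ABC covering only the registered iterables."""


for _type in (str, bytes, bytearray):
    Atomic.register(_type)
    AtomicIterable.register(_type)


def is_atomic_single(x):
    # The single-concept spelling.
    return isinstance(x, Atomic)


def is_atomic_composite(x):
    # The two-concept spelling; easy to get wrong with "and" instead of "or".
    return not isinstance(x, Iterable) or isinstance(x, AtomicIterable)


samples = [1, 1.0, 1j, "Hello", b"Hello", bytearray(b"x"), (), [], {}, set(), None]
agree = all(is_atomic_single(s) == is_atomic_composite(s) for s in samples)
```

Both predicates give the same answer on these samples; the difference
is only in how much the caller has to spell out.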

From bruce at  Fri Jan  3 19:11:59 2014
From: bruce at (Bruce Leban)
Date: Fri, 3 Jan 2014 10:11:59 -0800
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 3, 2014 at 6:17 AM, Joshua Landau <joshua at> wrote:

> OTOH, to many non-mathematical people I hardly expect "is this scalar"
> to feel nearly as meaningful a question as "is this atomic".
> To bike-shed, how about "unitary".

"atomic" has the wrong meaning since it says it doesn't have any component
parts. Scalar has the right meaning.

As to the idea of making strings not iterable, that would break my code. I
write a lot of code to manipulate words (to create puzzles) and iterating
over strings is fundamental. In fact, I'd like to have strings as results
of iteration operations on strings:

>>> sorted('string')
'ginrst'
>>> list(itertools.permutations('bar'))
['bar', 'bra', 'abr', 'arb', 'rba', 'rab']

instead I have to write

>>> ''.join(sorted('string'))
>>> [''.join(s) for s in itertools.permutations('bar')]

This would probably break less code than making strings non-iterable, but
realize that there's approximately 0% chance this would ever change and
there's no easy way to cover every iteration operation. And it would
confuse people if sometimes:

(x.upper() for x in s)

returned an iterator and sometimes it returned a string.

--- Bruce
My guest puzzle for Puzzles Live:

From python at  Sat Jan  4 05:08:19 2014
From: python at (Alexander Heger)
Date: Sat, 04 Jan 2014 15:08:19 +1100
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

Dear Nick,

yes, defining an ABC for this case would be an excellent solution.



> However, I'm wondering if it might be reasonable to add a new entry in
> for 3.5:
> >>> from abc import ABC
> >>> from collections.abc import Iterable
> >>> class Atomic(ABC):
> ...     @classmethod
> ...     def __subclasshook__(cls, subclass):
> ...         if not issubclass(subclass, Iterable):
> ...             return True
> ...         return NotImplemented
> ...
> >>> Atomic.register(str)
> <class 'str'>
> >>> Atomic.register(bytes)
> <class 'bytes'>
> >>> Atomic.register(bytearray)
> <class 'bytearray'>
> >>> isinstance(1, Atomic)
> True
> >>> isinstance(1.0, Atomic)
> True
> >>> isinstance(1j, Atomic)
> True
> >>> isinstance("Hello", Atomic)
> True
> >>> isinstance(b"Hello", Atomic)
> True
> >>> isinstance((), Atomic)
> False
> >>> isinstance([], Atomic)
> False
> >>> isinstance({}, Atomic)
> False
> Any type which wasn't iterable would automatically be considered
> atomic, while some types which *are* iterable could *also* be
> registered as atomic (with str, bytes and bytearray being the obvious
> candidates, as shown above).
> Armed with such an ABC, you could then write an "iter_non_atomic"
> helper function as:
>      def iter_non_atomic(iterable):
>          if isinstance(iterable, Atomic):
>              raise TypeError("{!r} is considered atomic".format(
>                  iterable.__class__.__name__))
>          return iter(iterable)
> Cheers,
> Nick.

From python at  Sat Jan  4 05:23:59 2014
From: python at (Alexander Heger)
Date: Sat, 04 Jan 2014 15:23:59 +1100
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

> On Fri, Jan 3, 2014 at 2:54 PM, Alexander Heger <python at> wrote:
>> Generally, I find strings being iterables of characters as useful as
>> if integers were iterables of bits.  They should just be units.
> What this would mean is that any time you want to iterate over the
> characters, you'd have to iterate over string.split('') instead. So
> the question is, is that common enough to be a problem?

you could still have had str.iter()

> The other point that comes to mind is that iteration and indexing are
> closely related. I think most people would agree that "abcde"[1]
> should be 'b' (granted, there's room for debate as to whether that
> should be a one-character string or an integer with the Unicode
> codepoint, but either way); it's possible to iterate over anything by
> indexing it with 0, then 1, then 2, etc, until it raises IndexError.
> For a string to not be iterable, that identity would have to be
> broken.

OK, I admit that not being able to iterate over something that can be 
indexed may be confusing.  Though indexing of strings is somewhat 
special in many languages.


From rosuav at  Sat Jan  4 06:32:04 2014
From: rosuav at (Chris Angelico)
Date: Sat, 4 Jan 2014 16:32:04 +1100
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 4, 2014 at 3:23 PM, Alexander Heger <python at> wrote:
>> The other point that comes to mind is that iteration and indexing are
>> closely related. I think most people would agree that "abcde"[1]
>> should be 'b' (granted, there's room for debate as to whether that
>> should be a one-character string or an integer with the Unicode
>> codepoint, but either way); it's possible to iterate over anything by
>> indexing it with 0, then 1, then 2, etc, until it raises IndexError.
>> For a string to not be iterable, that identity would have to be
>> broken.
> OK, I admit that not being able to iterate over something that can be
> indexed may be confusing.  Though indexing of strings is somewhat special in
> many languages.

I don't know that it's particularly special. In some languages, a
string is simply an array of small integers (maybe bytes, maybe
Unicode codepoints), so when you index into one, you get the integers.
Python deems that the elements of a string are themselves strings,
which is somewhat special I suppose, but only because the
representation of a character is a short string. And of course, there
are languages that treat strings as simple atomic scalars, no
subscripting allowed at all - I don't think that's an advantage over
either of the above. :)

When you index a string, you get a character. Whatever the language
uses to represent a character, that's what you get. I don't think this
is particularly esoteric, but maybe that's just me.
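
A small sketch of that indexing/iteration link: a class that defines
only __getitem__ can still be iterated, because iter() falls back to
indexing from 0 until IndexError. The Letters class is invented for
this example.

```python
from collections.abc import Iterable


class Letters:
    """Illustrative wrapper that supports indexing but defines no __iter__."""

    def __init__(self, text):
        self._text = text

    def __getitem__(self, index):
        # Raises IndexError past the end, which is what stops the iteration.
        return self._text[index]


letters = Letters("abc")
collected = list(letters)                       # iterates via __getitem__
seen_as_iterable = isinstance(letters, Iterable)  # the ABC only looks for __iter__
```

One caveat for any ABC built on "not Iterable": the Iterable ABC only
checks for __iter__, so a type like this one would not be recognised as
iterable even though a for loop over it works.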


From denis.spir at  Sat Jan  4 11:22:16 2014
From: denis.spir at (spir)
Date: Sat, 04 Jan 2014 11:22:16 +0100
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/03/2014 07:11 PM, Bruce Leban wrote:
> As to the idea of making strings not iterable, that would break my code. I
> write a lot of code to manipulate words (to create puzzles) and iterating
> over strings is fundamental. In fact, I'd like to have strings as results
> of iteration operations on strings:
> >>> sorted('string')
> 'ginrst'
> >>> list(itertools.permutations('bar'))
> ['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
> instead I have to write
> >>> ''.join(sorted('string'))
> >>> [''.join(s) for s in itertools.permutations('bar')]

Maybe we just need a 'cat' or 'concat' [1] method for lists:
    sorted('string').cat()
    (s for s in itertools.permutations('bar')).cat()
(Then, a hard choice: should cat crash when items are not strings, or 
automagically stringify its operands? I wish join would do the latter.)


[1] I have not understood yet why "concatenation", instead of just "catenation". 
Literally means chaining (things) together; but I'm still trying to figure out 
how one can chain things apart ;-)
As if strings were called "withstrings" or "stringtogethers", more or less. 
Enlightenment welcome.

(Same for "concatenative languages"... of which one is called "cat"!)

From ram.rachum at  Sat Jan  4 23:41:01 2014
From: ram.rachum at (Ram Rachum)
Date: Sat, 4 Jan 2014 14:41:01 -0800 (PST)
Subject: [Python-ideas] `pathlib.Path.write` and ``
Message-ID: <>


I'd really like to have methods `pathlib.Path.write` and 
``. Untested implementation:

def read(self, binary=False):
    with self.open('br' if binary else 'r') as file:
        return file.read()

def write(self, data, binary=False):
    with self.open('bw' if binary else 'w') as file:
        file.write(data)

This will be super useful to me. Many file actions are one-liners like 
that, and avoiding putting the `with` clause in user code would be 
wonderful.
What do you think? 


From ram.rachum at  Sat Jan  4 23:05:27 2014
From: ram.rachum at (Ram Rachum)
Date: Sat, 4 Jan 2014 14:05:27 -0800 (PST)
Subject: [Python-ideas] Introduce constant: `pathlib.null_path`
Message-ID: <>

What do you think about introducing this constant in the `pathlib` module:

   null_path = pathlib.Path('\\Device\\Null') if os.name == 'nt' else pathlib.Path('/dev/null')


From steve at  Sat Jan  4 23:59:04 2014
From: steve at (Steven D'Aprano)
Date: Sun, 5 Jan 2014 09:59:04 +1100
Subject: [Python-ideas] strings as iterables - from str.startswith
	taking any iterator instead of just tuple
In-Reply-To: <>
References: <>
Message-ID: <20140104225857.GZ29356@ando>

On Sat, Jan 04, 2014 at 11:22:16AM +0100, spir wrote:
> On 01/03/2014 07:11 PM, Bruce Leban wrote:
> >As to the idea of making strings not iterable, that would break my code. I
> >write a lot of code to manipulate words (to create puzzles) and iterating
> >over strings is fundamental. In fact, I'd like to have strings as results
> >of iteration operations on strings:
> >
> > >>> sorted('string')
> >'ginrst'
> > >>> list(itertools.permutations('bar'))
> >['bar', 'bra', 'abr', 'arb', 'rba', 'rab']

That would be nice to have.

> >instead I have to write
> >
> > >>> ''.join(sorted('string'))
> > >>> [''.join(s) for s in itertools.permutations('bar')]

Which is a slight inconvenience, but not a great one. You can always 
save three characters by creating a helper function:

join = ''.join

> Maybe we just need a 'cat' or 'concat' [1] method for lists:
>    sorted('string').cat()
>    (s for s in itertools.permutations('bar')).cat()


Lists are general collections, giving them a method that depends on a 
specific kind of item is ugly. Adding that same method to generator 
expressions is even worse. We don't have list.sum() for adding lists of 
numbers, we have a sum() function that takes a list.

> (Then, a hard choice: should cat crash when items are not strings, or 
> automagically stringify its operands? I wish join would do the latter.)


Joining what you think is a list of strings but actually isn't is an 
error. The right thing to do in the face of an error is to raise an 
exception, not to silently hide the error. If you want to 
automatically convert arbitrary items into strings, it is better to 
explicitly do so:

''.join(str(x) for x in items)

than to have it magically, and incorrectly, happen implicitly.
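
To illustrate the difference, a sketch of the two behaviours as plain
functions rather than list methods. The names cat and cat_any are made
up for this example.

```python
def cat(items, sep=""):
    """Join an iterable of strings; str.join raises TypeError on non-strings."""
    return sep.join(items)


def cat_any(items, sep=""):
    """Explicitly convert every item to str before joining."""
    return sep.join(str(x) for x in items)
```

So cat(sorted('string')) gives 'ginrst', cat([1, 2]) raises TypeError,
and the conversion only happens when the caller asks for it with
cat_any.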

> [1] I have not understood yet why "concatenation", instead of just 
> "catenation". Literaly means chaining (things) together; but I'm still 
> trying to figure out how one can chain things apart ;-)

Chain your left arm to the wall on your left, and your right arm to the 
wall on your right. Your arms are now chained apart.

(Safe for work.)


From benjamin at  Sun Jan  5 00:25:40 2014
From: benjamin at (Benjamin Peterson)
Date: Sat, 4 Jan 2014 23:25:40 +0000 (UTC)
Subject: [Python-ideas]
References: <>
Message-ID: <>

Ram Rachum <ram.rachum at ...> writes:

> What do you think about introducing this constant in the `pathlib` module:
>    null_path = pathlib.Path('\\Device\\Null') if os.name == 'nt' else pathlib.Path('/dev/null')
What's wrong with pathlib.Path(os.devnull)?
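
For reference, a minimal sketch of that spelling. os.devnull is the
platform's null device path, so no per-OS branching is needed; the
variable names here are only illustrative.

```python
import os
import pathlib

# os.devnull is '/dev/null' on POSIX and 'nul' on Windows.
null_path = pathlib.Path(os.devnull)

# Anything written here is discarded by the operating system.
with null_path.open("w") as sink:
    sink.write("discarded")
```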

From victor.stinner at  Sun Jan  5 00:27:25 2014
From: victor.stinner at (Victor Stinner)
Date: Sun, 5 Jan 2014 00:27:25 +0100
Subject: [Python-ideas] Introduce constant: `pathlib.null_path`
In-Reply-To: <>
References: <>
Message-ID: <>

There is already os.path.devnull.

On 4 Jan 2014 23:48, "Ram Rachum" <ram.rachum at> wrote:

> What do you think about introducing this constant in the `pathlib` module:
>    null_path = pathlib.Path('\\Device\\Null') if os.name == 'nt' else
> pathlib.Path('/dev/null')
> Thanks,
> Ram.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From ram at  Sun Jan  5 00:27:54 2014
From: ram at (Ram Rachum)
Date: Sun, 5 Jan 2014 01:27:54 +0200
Subject: [Python-ideas] Introduce constant: `pathlib.null_path`
In-Reply-To: <>
References: <>
Message-ID: <>

Cool, I didn't know about that. Thanks!

On Sun, Jan 5, 2014 at 1:25 AM, Benjamin Peterson <benjamin at> wrote:

> Ram Rachum <ram.rachum at ...> writes:
> >
> > What do you think about introducing this constant in the `pathlib`
> module:
> >    null_path = pathlib.Path('\\Device\\Null') if os.name == 'nt' else
> pathlib.Path('/dev/null')
> What's wrong with pathlib.Path(os.devnull)?

From masklinn at  Sun Jan  5 00:30:07 2014
From: masklinn at (Masklinn)
Date: Sun, 5 Jan 2014 00:30:07 +0100
Subject: [Python-ideas] strings as iterables - from str.startswith
	taking any iterator instead of just tuple
In-Reply-To: <20140104225857.GZ29356@ando>
References: <>
 <> <20140104225857.GZ29356@ando>
Message-ID: <>

On 2014-01-04, at 23:59 , Steven D'Aprano <steve at> wrote:
> On Sat, Jan 04, 2014 at 11:22:16AM +0100, spir wrote:
>> On 01/03/2014 07:11 PM, Bruce Leban wrote:
>>> As to the idea of making strings not iterable, that would break my code. I
>>> write a lot of code to manipulate words (to create puzzles) and iterating
>>> over strings is fundamental. In fact, I'd like to have strings as results
>>> of iteration operations on strings:
>>> >>> sorted('string')
>>> 'ginrst'
>>> >>> list(itertools.permutations('bar'))
>>> ['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
> That would be nice to have.

More generally, it would be nice if a sequence type could specify how to
derive a new instance of itself (from an iterable for instance).
Constructors don't necessarily work (e.g. str's constructor). Clojure
has such a concept through the IPersistentCollection protocol:
empty(coll) creates a new (empty) instance of coll (clojure's
collections being immutable, it makes sense to create an empty
collection then add stuff into it via into() or conj())

From amber.yust at  Sun Jan  5 01:08:13 2014
From: amber.yust at (Amber Yust)
Date: Sun, 05 Jan 2014 00:08:13 +0000
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
References: <>
 <> <20140104225857.GZ29356@ando>
Message-ID: <-9056898550324328423@gmail297201516>

__fromiter__, anyone?

On Sat Jan 04 2014 at 3:31:59 PM, Masklinn <masklinn at> wrote:

> On 2014-01-04, at 23:59 , Steven D'Aprano <steve at> wrote:
> > On Sat, Jan 04, 2014 at 11:22:16AM +0100, spir wrote:
> >> On 01/03/2014 07:11 PM, Bruce Leban wrote:
> >>> As to the idea of making strings not iterable, that would break my
> code. I
> >>> write a lot of code to manipulate words (to create puzzles) and
> iterating
> >>> over strings is fundamental. In fact, I'd like to have strings as
> results
> >>> of iteration operations on strings:
> >>>
> >>> >>> sorted('string')
> >>> 'ginrst'
> >>> >>> list(itertools.permutations('bar'))
> >>> ['bar', 'bra', 'abr', 'arb', 'rba', 'rab']
> >
> > That would be nice to have.
> More generally, it would be nice if a sequence type could specify how to
> derive a new instance of itself (from an iterable for instance).
> Constructors don't necessarily work (e.g. str's constructor). Clojure
> has such a concept through the IPersistentCollection protocol:
> empty(coll) creates a new (empty) instance of coll (clojure's
> collections being immutable, it makes sense to create an empty
> collection then add stuff into it via into() or conj())

From at  Sun Jan  5 01:50:11 2014
From: at (Joshua Landau)
Date: Sun, 5 Jan 2014 00:50:11 +0000
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <-9056898550324328423@gmail297201516>
References: <>
 <> <20140104225857.GZ29356@ando>
Message-ID: <>

On Jan 5, 2014 12:08 AM, "Amber Yust" <amber.yust at> wrote:
> __fromiter__, anyone?

I'm unconvinced that it should be a dunder method. Do you expect it to be
used like

    fromiter(str, characters)

?

However, +1 on the name, +0 on the idea.
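
For what it's worth, a sketch of how such a function might behave.
Neither fromiter nor __fromiter__ exists in Python; the fallback rules
below (join for str, plain constructor otherwise) and the Bag class are
assumptions made for illustration.

```python
def fromiter(cls, iterable):
    """Hypothetical: build an instance of cls from the items of iterable."""
    hook = getattr(cls, "__fromiter__", None)
    if hook is not None:
        # A type that defines the (made-up) dunder decides for itself.
        return hook(iterable)
    if issubclass(cls, str):
        # str(iterable) would give a repr-like string, so join instead.
        return cls("".join(iterable))
    # list, tuple, set, bytes etc. already accept an iterable.
    return cls(iterable)


class Bag:
    """Illustrative type whose constructor takes items as separate arguments."""

    def __init__(self, *items):
        self.items = list(items)

    @classmethod
    def __fromiter__(cls, iterable):
        return cls(*iterable)
```

With this, fromiter(str, sorted('string')) gives 'ginrst' without the
explicit ''.join at the call site.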

From amber.yust at  Sun Jan  5 03:10:52 2014
From: amber.yust at (Amber Yust)
Date: Sun, 05 Jan 2014 02:10:52 +0000
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
References: <>
 <> <20140104225857.GZ29356@ando>
Message-ID: <-1263298433535047096@gmail297201516>

I'm thinking of it being analogous to the __getstate__ and __setstate__
dunders used by Pickle to allow customization of object creation.

On Sat Jan 04 2014 at 4:50:11 PM, Joshua Landau < at>

> On Jan 5, 2014 12:08 AM, "Amber Yust" <amber.yust at> wrote:
> >
> > __fromiter__, anyone?
> I'm unconvinced that it should be a dunder method. Do you expect it to be
> used like
>     fromiter(str, characters)
> ?
> However, +1 on the name, +0 on the idea.

From guido at  Sun Jan  5 08:24:03 2014
From: guido at (Guido van Rossum)
Date: Sat, 4 Jan 2014 21:24:03 -1000
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
In-Reply-To: <-1263298433535047096@gmail297201516>
References: <>
 <> <20140104225857.GZ29356@ando>
Message-ID: <>

Is this thread still about strings vs. other iterables?

First of all, the motivation for making strings iterable is that they
are indexable and sliceable, which means they act like sequences.

Historically, indexing and slicing predated the concept of iterators
in Python. Many other languages (starting with Pascal and C) also
treat strings as arrays; while many of those have a separate character
type, a few languages follow Python's example (or the other way
around, I don't feel like tracking the influences exactly, or even
finding examples -- I do know they exist). There are also languages
where strings are *not* considered arrays (I think this is the case in
Ruby and Perl). In such languages string manipulation is typically
done using regular expressions or similar APIs, although there are usually
also non-array APIs to get characters or substrings using indexes, but
those APIs may not be O(1), e.g. for reasons having to do with
decoding UTF-8 on the fly.

All in all I am happy with Python's string-as-array semantics and I
don't want to change this.

While I would like to encourage API designs that don't require
distinguishing between strings and other iterables (just like I prefer
APIs that don't require distinguishing between sequences and mappings,
or between callables and "plain values"), I realize that pragmatically
people are going to want to write such code, and an ABC seems a good

However, if "Atomic" is still under consideration, I would strongly
argue against that particular term. Given that a string is an array of
characters, calling it an "atom" (== indivisible) seems particularly
out of order. (And yes, I know that the use of the term in physics is
also a misnomer -- let's not repeat that mistake. :-)

Alas, I don't have a better name, but I'm sure the thesauriers will
find something. We have until Python 3.5 is released to agree on a
name. :-)

--Guido van Rossum (

From aquavitae69 at  Sun Jan  5 12:09:38 2014
From: aquavitae69 at (David Townshend)
Date: Sun, 5 Jan 2014 13:09:38 +0200
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

Reading this thread made me start to think about why a string is a
sequence, and I can't actually see any obvious reason, other than
historical ones. Every use case I can think of for iterating over a string
either involves first splitting the string, or would be better done with a
regex. Also, the only times I can recall using a string as a sequence are in
doctests (because it reads better than a list of characters) or in the
interpreter when I'm trying something out. I'm not suggesting changing it -
there's too much history for that, but I am interested to know if there is
some fundamental reason that strings are sequences. If a new string object
was being implemented now, would it be a sequence?

On 3 Jan 2014 02:49, "Guido van Rossum" <guido at> wrote:

> By designing an API that doesn't require such overloading.
> On Thursday, January 2, 2014, Alexander Heger wrote:
>> >>    isinstance(x, Iterable) and not isinstance(x, str)
>> >
>> > If you find yourself typing that a lot I think you have a bigger
>> problem though.
>> How do you replace this?
> --
> --Guido van Rossum (on iPad)

From ericsnowcurrently at  Sun Jan  5 18:49:26 2014
From: ericsnowcurrently at (Eric Snow)
Date: Sun, 5 Jan 2014 10:49:26 -0700
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 5, 2014 4:10 AM, "David Townshend" <aquavitae69 at> wrote:
> Reading this thread made me start to think about why a string is a
> sequence, and I can't actually see any obvious reason, other than
> historical ones.

Sometimes I think it would be more clear if strings weren't sequences but
had various attributes that exposed sequence "views", e.g. codepoints,
etc.  Making strings non-sequences isn't realistic at this point, but
adding the sequence view attributes may still be nice.

That said, at present it's not something I personally have any use case
for.  There was an article floating around the web recently where the
deficiencies of unicode implementations were discussed and I recall
something there or in related discussions about use cases for having
different views into a string.  Wow that was vague. :)  The different views
into unicode strings certainly come up from time to time on our lists.


From solipsis at  Sun Jan  5 18:53:16 2014
From: solipsis at (Antoine Pitrou)
Date: Sun, 5 Jan 2014 18:53:16 +0100
Subject: [Python-ideas] `pathlib.Path.write` and ``
References: <>
Message-ID: <20140105185316.7ac5084f@fsol>

On Sat, 4 Jan 2014 14:41:01 -0800 (PST)
Ram Rachum <ram.rachum at> wrote:
> This will be super useful to me. Many files actions are one liners like 
> that, and avoiding putting the `with` clause in user code would be 
> wonderful.
> What do you think? 

I agree something like that would be useful, I'm just not sure what the
ideal API would be. For starters I think "binary" shouldn't be an
argument: there should be separate methods for reading/writing text and
binary contents. Also, you need to be able to pass encoding and other
parameters for text files.
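
As a sketch of that shape, written as free functions over a Path rather
than methods. The function names are placeholders, not a proposed API,
and only encoding and errors are passed through here.

```python
import pathlib
import tempfile


def read_text(path, encoding=None, errors=None):
    """Return the decoded contents of path as a str."""
    with path.open("r", encoding=encoding, errors=errors) as f:
        return f.read()


def read_bytes(path):
    """Return the raw contents of path as bytes."""
    with path.open("rb") as f:
        return f.read()


def write_text(path, data, encoding=None, errors=None):
    """Write a str to path, returning the number of characters written."""
    with path.open("w", encoding=encoding, errors=errors) as f:
        return f.write(data)


def write_bytes(path, data):
    """Write bytes to path, returning the number of bytes written."""
    with path.open("wb") as f:
        return f.write(data)


# Small round trip in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "demo.txt"
    write_text(target, "h\u00e9llo", encoding="utf-8")
    text_back = read_text(target, encoding="utf-8")
    bytes_back = read_bytes(target)
```

Keeping text and binary apart means the return type never depends on a
flag, and the text variants have an obvious place for encoding options.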



From abarnert at  Sun Jan  5 18:48:31 2014
From: abarnert at (Andrew Barnert)
Date: Sun, 5 Jan 2014 09:48:31 -0800
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 5, 2014, at 3:09, David Townshend <aquavitae69 at> wrote:

> Reading this thread made me start to think about why a string is a sequence, and I can't actually see any obvious reason, other than historical ones.

You've seriously never indexed or sliced a string? Those are the two core operations in sequences, and they're obviously useful on strings.

> Every use case I can think of for iterating over a string either involves first splitting the string, or would be better done with a regex

People have mentioned use cases for iterating strings in this thread. And it's easy to think of more. There are all kinds of algorithms that treat strings as sequences of characters. Sure, many of these functions are already methods on str or otherwise built into the stdlib, but that just means they're implemented by iterating the string storage in C with a loop around "*++s". And if you want to extend that set of builtins with similar functions, how else would you do it but with a "for ch in s" loop? (Well, you could "for ch in list(s)", but that's still treating strings as iterables.) For example, many people are asked to write a rot13 function in one of their first classes. How would you write that if strings weren't iterables? There's no way a regex is going to help you here, unless you wanted to do something like using re.sub('.') as a convoluted and slow way of writing map.
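
A minimal sketch of the rot13 exercise mentioned above, written as a plain "for ch in s" loop. The function name and the ASCII-only handling are assumptions for illustration; the stdlib also ships a "rot_13" codec that can serve as a cross-check.

```python
import codecs


def rot13(s):
    """Rotate ASCII letters by 13 places; leave everything else alone."""
    out = []
    for ch in s:  # relies on str being iterable
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + 13) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + 13) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


encoded = rot13("Hello, World!")
```

Applying rot13 twice gives back the original text, and for ASCII input the result matches codecs.encode(s, "rot_13").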

From amber.yust at  Sun Jan  5 19:17:13 2014
From: amber.yust at (Amber Yust)
Date: Sun, 05 Jan 2014 18:17:13 +0000
Subject: [Python-ideas] strings as iterables - from str.startswith
 taking any iterator instead of just tuple
References: <>
 <> <20140104225857.GZ29356@ando>
Message-ID: <818997001994832807@gmail297201516>

For ABC names, perhaps "IndependentSequence" or "UnaffiliatedSequence"?
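
The kind of code such an ABC would serve is the usual special-casing of
strings when walking nested iterables. A minimal sketch, where flatten is
a hypothetical helper and the explicit (str, bytes) check stands in for
whatever the ABC ends up being called:

```python
from collections.abc import Iterable


def flatten(items):
    """Yield leaves of a nested iterable, treating strings as leaves."""
    for item in items:
        # Without the str/bytes exclusion, a one-character string would
        # recurse forever, since iterating it yields itself.
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item


flat = list(flatten([1, ["ab", [2, ("c", 3)]], "de"]))
```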
On Sat Jan 04 2014 at 11:25:23 PM, Guido van Rossum <guido at> wrote:

> Is this thread still about strings vs. other iterables?
> First of all, the motivation for making strings iterable is that they
> are indexable and sliceable, which means they act like sequences.
> Historically, indexing and slicing predated the concept of iterators
> in Python. Many other languages (starting with Pascal and C) also
> treat strings as arrays; while many of those have a separate character
> type, a few languages follow Python's example (or the other way
> around, I don't feel like tracking the influences exactly, or even
> finding examples -- I do know they exist). There are also languages
> where strings are *not* considered arrays (I think this is the case in
> Ruby and Perl). In such languages string manipulation is typically
> done using regular expressions or similar APIs, although there are usually
> also non-array APIs to get characters or substrings using indexes, but
> those APIs may not be O(1), e.g. for reasons having to do with
> decoding UTF-8 on the fly.
> All in all I am happy with Python's string-as-array semantics and I
> don't want to change this.
> While I would like to encourage API designs that don't require
> distinguishing between strings and other iterables (just like I prefer
> APIs that don't require distinguishing between sequences and mappings,
> or between callables and "plain values"), I realize that pragmatically
> people are going to want to write such code, and an ABC seems a good
> choice.
> However, if "Atomic" is still under consideration, I would strongly
> argue against that particular term. Given that a string is an array of
> characters, calling it an "atom" (== indivisible) seems particularly
> out of order. (And yes, I know that the use of the term in physics is
> also a misnomer -- let's not repeat that mistake. :-)
> Alas, I don't have a better name, but I'm sure the thesauriers will
> find something. We have until Python 3.5 is released to agree on a
> name. :-)
> --
> --Guido van Rossum (

From python at  Sun Jan  5 20:02:29 2014
From: python at (Alexander Heger)
Date: Mon, 6 Jan 2014 06:02:29 +1100
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

> People have mentioned use cases for iterating strings in this thread. And it's easy to think of more. There are all kinds of algorithms that treat strings as sequences of characters. Sure, many of these functions are already methods on str or otherwise built into the stdlib, but that just means they're implemented by iterating the string storage in C with a loop around "*++s". And if you want to extend that set of builtins with similar functions, how else would you do it but with a "for ch in s" loop? (Well, you could "for ch in list(s)", but that's still treating strings as iterables.) For example, many people are asked to write a rot13 function in one of their first classes. How would you write that if strings weren't iterables? There's no way a regex is going to help you here, unless you wanted to do something like using re.sub('.') as a convoluted and slow way of writing map.

whereas the issue seems now settled, you could use explicit functions
like str.iter(), str.codepoints(), str.substr(), ...

From ethan at  Sun Jan  5 20:33:51 2014
From: ethan at (Ethan Furman)
Date: Sun, 05 Jan 2014 11:33:51 -0800
Subject: [Python-ideas] a new bytestring type?
Message-ID: <>

As anyone who has worked with Python 3 and low-level protocols knows, Python 3 has no 'bytestring' type.  It has 
immutable and mutable versions of arrays of integers, otherwise known as 'bytes' and 'bytearray'.

How many would be interested in having a 'bytestring'?

What do you see as the distinguishing characteristics?


From amber.yust at  Sun Jan  5 20:58:04 2014
From: amber.yust at (Amber Yust)
Date: Sun, 05 Jan 2014 19:58:04 +0000
Subject: [Python-ideas]  a new bytestring type?
References: <>
Message-ID: <-3280845380621406811@gmail297201516>

How would you see this bytestring type as differentiating itself from
bytes? What use cases do you envision?

On Sun Jan 05 2014 at 11:56:46 AM, Ethan Furman <ethan at> wrote:

> As anyone who has worked with Python 3 and low-level protocols knows,
> Python 3 has no 'bytestring' type.  It has
> immutable and mutable versions of arrays of integers, otherwise known as
> 'bytes' and 'bytearray'.
> How many would be interested in having a 'bytestring'?
> What do you see as the distinguishing characteristics?
> --
> ~Ethan~

From ethan at  Sun Jan  5 21:04:20 2014
From: ethan at (Ethan Furman)
Date: Sun, 05 Jan 2014 12:04:20 -0800
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/05/2014 11:33 AM, Ethan Furman wrote:
> As anyone who has worked with Python 3 and low-level protocols knows, Python 3 has no 'bytestring' type.  It has
> immutable and mutable versions of arrays of integers, otherwise known as 'bytes' and 'bytearray'.
> How many would be interested in having a 'bytestring'?


> What do you see as the distinguishing characteristics?

Indexing returns a bytestring of length 1, not an integer

`bytestring(7)` either fails, or returns 'bytestring('\x07')' not 'bytestring(0, 0, 0, 0, 0, 0, 0)'
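
For comparison, a short sketch of what the existing bytes type does in
Python 3 for those two cases (the variable names are only illustrative):

```python
data = b"abc"

first = data[0]          # indexing yields an int, not a length-1 bytes
first_slice = data[0:1]  # slicing is the way to get a length-1 bytes

zeros = bytes(7)         # an int argument means "that many zero bytes"
single = bytes([7])      # an iterable of ints gives the byte values
```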


From lukasz at  Sun Jan  5 21:30:12 2014
From: lukasz at (Łukasz Langa)
Date: Sun, 5 Jan 2014 12:30:12 -0800
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <> <>
Message-ID: <>

On Jan 5, 2014, at 12:04 PM, Ethan Furman <ethan at> wrote:

> On 01/05/2014 11:33 AM, Ethan Furman wrote:
>> As anyone who has worked with Python 3 and low-level protocols knows, Python 3 has no 'bytestring' type.  It has
>> immutable and mutable versions of arrays of integers, otherwise known as 'bytes' and 'bytearray'.
>> How many would be interested in having a 'bytestring'?
> +1

"I don't always +1 on python-ideas, but when I do, I do it on my own posts."


Best regards,
Łukasz Langa

Twitter: @llanga
IRC: ambv on #python-dev

From ethan at  Sun Jan  5 21:08:05 2014
From: ethan at (Ethan Furman)
Date: Sun, 05 Jan 2014 12:08:05 -0800
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <-3280845380621406811@gmail297201516>
References: <>
Message-ID: <>

On 01/05/2014 11:58 AM, Amber Yust wrote:
> How would you see this bytestring type as differentiating itself from bytes? What use cases do you envision?

I put the questions there so others could fill in the blanks for themselves.  I have responded to the original question 
with two of the differentiating features (the two that bug me most, of course ;).


From solipsis at  Sun Jan  5 22:01:44 2014
From: solipsis at (Antoine Pitrou)
Date: Sun, 5 Jan 2014 22:01:44 +0100
Subject: [Python-ideas] a new bytestring type?
References: <>
Message-ID: <20140105220144.67c0a613@fsol>

On Sun, 05 Jan 2014 12:04:20 -0800
Ethan Furman <ethan at> wrote:
> On 01/05/2014 11:33 AM, Ethan Furman wrote:
> > As anyone who has worked with Python 3 and low-level protocols knows, Python 3 has no 'bytestring' type.  It has
> > immutable and mutable versions of arrays of integers, otherwise known as 'bytes' and 'bytearray'.
> >
> > How many would be interested in having a 'bytestring'?
> +1
> > What do you see as the distinguishing characteristics?
> Indexing returns a bytestring of length 1, not an integer
> `bytestring(7)` either fails, or returns 'bytestring('\x07')' not 'bytestring(0, 0, 0, 0, 0, 0, 0)'

I agree with that, but it's much too late, and I'm -10 on adding
another, similar but different, bytestring type.



From ethan at  Sun Jan  5 21:51:33 2014
From: ethan at (Ethan Furman)
Date: Sun, 05 Jan 2014 12:51:33 -0800
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 01/05/2014 12:30 PM, Łukasz Langa wrote:
> On Jan 5, 2014, at 12:04 PM, Ethan Furman <ethan at> wrote:
>> On 01/05/2014 11:33 AM, Ethan Furman wrote:
>>> As anyone who has worked with Python 3 and low-level protocols knows, Python 3 has no 'bytestring' type.  It has
>>> immutable and mutable versions of arrays of integers, otherwise known as 'bytes' and 'bytearray'.
>>> How many would be interested in having a 'bytestring'?
>> +1
> "I don't always +1 on python-ideas, but when I do, I do it on my own posts."

+1 QOTW !

From ncoghlan at  Sun Jan  5 23:57:24 2014
From: ncoghlan at (Nick Coghlan)
Date: Mon, 6 Jan 2014 08:57:24 +1000
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On 6 Jan 2014 03:56, "Ethan Furman" <ethan at> wrote:
> As anyone who has worked with Python 3 and low-level protocols knows,
Python 3 has no 'bytestring' type.  It has immutable and mutable versions
of arrays of integers, otherwise known as 'bytes' and 'bytearray'.
> How many would be interested in having a 'bytestring'?
> What do you see as the distinguishing characteristics?

I actually expected someone to have experimented with an "encodedstr" type
by now. This would be a type that behaved like the Python 2 str type, but
had an encoding attribute. On encountering Unicode text strings, it would
encode them appropriately.
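
A very rough sketch of what such an experiment might look like. The
class name, the constructor signature and the choice to subclass bytes
are all assumptions made for illustration, and only concatenation is
covered; it is nowhere near a full Python 2 str replacement.

```python
class EncodedStr(bytes):
    """Hypothetical sketch: bytes that remember which encoding they use."""

    def __new__(cls, data, encoding="utf-8"):
        if isinstance(data, str):
            data = data.encode(encoding)
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __add__(self, other):
        # Text operands are encoded with this object's encoding first.
        if isinstance(other, str):
            other = other.encode(self.encoding)
        elif not isinstance(other, (bytes, bytearray)):
            return NotImplemented
        return EncodedStr(bytes(self) + bytes(other), self.encoding)


greeting = EncodedStr("caf", encoding="latin-1") + "é"
```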

However, people have generally instead followed the model of decoding to
text and operating in that domain, since it avoids a lot of subtle issues
(like accidentally embedding byte order marks when concatenating strings).
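
The byte order mark problem is easy to reproduce with the stdlib "utf-16"
codec, which prepends a BOM on every encode. A small sketch:

```python
import codecs

left = "spam".encode("utf-16")   # the "utf-16" codec prepends a BOM
right = "eggs".encode("utf-16")
joined = left + right            # the second BOM now sits mid-stream

# Only the leading BOM is consumed; the embedded one decodes as U+FEFF.
decoded = joined.decode("utf-16")

# Decoding first and concatenating text avoids the stray character.
clean = left.decode("utf-16") + right.decode("utf-16")
```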

This is likely encouraged by the fact that str, bytes and bytearray don't
currently implement type coercion correctly (which in turn is due to a long
standing bug in the way the abstract C API handles sequence types defined
in C rather than Python), so an encodedstr type would need to inherit from
str or bytes to get interoperability, and then wouldn't interoperate with
the other one.


> --
> ~Ethan~

From rosuav at  Mon Jan  6 01:27:08 2014
From: rosuav at (Chris Angelico)
Date: Mon, 6 Jan 2014 11:27:08 +1100
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 6, 2014 at 4:48 AM, Andrew Barnert <abarnert at> wrote:
> And if you want to extend that set of builtins with similar functions, how else would you do it but with a "for ch in s" loop? (Well, you could "for ch in list(s)", but that's still treating strings as iterables.)

You could simply "for ch in s.split('')". A number of languages define
that to mean fracturing a string into one-character strings. Python
currently raises ValueError, so it won't break existing code.

But yes, it's easier to be able to iterate over a string.
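
A quick sketch of the current behaviour next to the iteration-based
spelling:

```python
s = "abc"

try:
    s.split("")
except ValueError as exc:
    error = str(exc)  # CPython reports an empty separator

chars = list(s)       # iteration already gives one-character strings
```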


From rosuav at  Mon Jan  6 01:38:17 2014
From: rosuav at (Chris Angelico)
Date: Mon, 6 Jan 2014 11:38:17 +1100
Subject: [Python-ideas] `pathlib.Path.write` and ``
In-Reply-To: <20140105185316.7ac5084f@fsol>
References: <>
Message-ID: <>

On Mon, Jan 6, 2014 at 4:53 AM, Antoine Pitrou <solipsis at> wrote:
> For starters I think "binary" shouldn't be an
> argument: there should be separate methods for reading/writing text and
> binary contents.

For reading, yes. For writing, the type of the 'data' argument should
say whether it's binary or text, without having to be told.
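
A minimal sketch of that idea for writing. write_file is a hypothetical
helper, not an existing function, and the utf-8 default is an assumption:

```python
import os
import tempfile


def write_file(filename, data, encoding="utf-8"):
    """Hypothetical helper: pick text or binary mode from the data's type."""
    if isinstance(data, str):
        with open(filename, "w", encoding=encoding) as f:
            f.write(data)
    else:
        with open(filename, "wb") as f:
            f.write(data)


with tempfile.TemporaryDirectory() as tmp:
    fn = os.path.join(tmp, "demo.bin")  # illustrative file name
    write_file(fn, "text")
    with open(fn, "rb") as f:
        after_text = f.read()
    write_file(fn, b"\x00\x01")
    with open(fn, "rb") as f:
        after_bytes = f.read()
```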

Not sure it belongs in pathlib, though. Here's the naive code to do a
simple translation on a file:

data = open(fn).read()

Doesn't use with, will probably work on CPython but risks trampling on
itself in other interpreters. Needs a solution. Will someone who's
told "hey, there's a potential problem in that code" go looking in
pathlib? I'm not sure about that. I'd be thinking about files and
strings, but not about paths. It'd be great as a built-in:


or in some namespace that screams "Hey look, file I/O", but I can't
imagine looking for it in pathlib.

Now that 'file' isn't a builtin, would it be worth having a file
module that has this sort of thing? Or would that cause too much


From dreamingforward at  Mon Jan  6 01:39:53 2014
From: dreamingforward at (Mark Janssen)
Date: Sun, 5 Jan 2014 18:39:53 -0600
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

> As anyone who has worked with Python 3 and low-level protocols knows, Python
> 3 has no 'bytestring' type.  It has immutable and mutable versions of arrays
> of integers, otherwise known as 'bytes' and 'bytearray'.

"arrays of integers"?  You mean, unsigned short ints?  There's an
important difference.  One references an abstraction, and one
references a concrete machine type.

The other consideration is knowing what you mean by "string", if you
mean something to be interpreted textually, then the convention is to
use unsigned chars to document your intentions, which "technically" is
the same (as far as memory layout is concerned).  (I say "technically"
because there is some space reserved for endian-ness which can change
the bit ordering.)

> How many would be interested in having a 'bytestring'?
> What do you see as the distinguishing characteristics?

What it *should* have is a bytes-type, which is a raw, 8-bit type
which may or may not be printable on the screen with quotation marks.
Different subtypes, e.g. class Text(bytes), can interpret those bytes as
they want (as a text string for example, with or without formatting
awareness for control codes).  Otherwise File(bytes) can interpret
those bytes as binary data, so as to write to the file system without
any transformation of the codes (i.e. raw).

I'm afraid this reply may not be up to the standards of the list, but
hopefully has some useful data that has gone without good


From abarnert at  Mon Jan  6 01:38:03 2014
From: abarnert at (Andrew Barnert)
Date: Sun, 5 Jan 2014 16:38:03 -0800
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 5, 2014, at 11:02, Alexander Heger <python at> wrote:

>> People have mentioned use cases for iterating strings in this thread. And it's easy to think of more. There are all kinds of algorithms that treat strings as sequences of characters. Sure, many of these functions are already methods on str or otherwise built into the stdlib, but that just means they're implemented by iterating the string storage in C with a loop around "*++s". And if you want to extend that set of builtins with similar functions, how else would you do it but with a "for ch in s" loop? (Well, you could "for ch in list(s)", but that's still treating strings as iterables.) For example, many people are asked to write a rot13 function in one of their first classes. How would you write that if strings weren't iterables? There's no way a regex is going to help you here, unless you wanted to do something like using re.sub('.') as a convoluted and slow way of writing map.
> whereas the issue seems now settled, you could use explicit functions
> like str.iter(), str.codepoints(), str.substr(), ...

Sure, and we could add list.iter(), list.slice(), etc. and get rid of iterables, indexing and slicing, entirely. If we add separate map and similar methods to every iterable type, we can even get rid of iterators. If it's good enough for ObjC, why should Python try to be more readable or concise?

From dreamingforward at  Mon Jan  6 01:45:28 2014
From: dreamingforward at (Mark Janssen)
Date: Sun, 5 Jan 2014 18:45:28 -0600
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

> "arrays of integers"?  You mean, unsigned short ints?  There's an
> important difference.  One references an abstraction, and one
> references a concrete machine type.
> The other consideration is knowing what you mean by "string", if you
> mean something to be interpreted textually, then the convention is to
> use unsigned chars to document your intentions, which "technically" is
> the same (as far as memory layout is concerned).  (I say "technically"
> because there is some space reserved for endian-ness which can change
> the bit ordering.)

One mistake I already wish to correct is in the last sentence:
"endian-ness" *always* changes or refers to the bit ordering.
Secondly, the term only applies to numerical (always integer, AFAIK)
representation -- not for chars.
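
As a point of reference for the terminology, Python's own integer
conversion API expresses byte order as the order of whole bytes within a
multi-byte integer. A small sketch using int.to_bytes and int.from_bytes:

```python
value = 0x0102

big = value.to_bytes(2, byteorder="big")        # most significant byte first
little = value.to_bytes(2, byteorder="little")  # least significant byte first

# A single byte comes out the same under either byte order.
one_big = (0x41).to_bytes(1, byteorder="big")
one_little = (0x41).to_bytes(1, byteorder="little")

round_trip = int.from_bytes(little, byteorder="little")
```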

Trying to be complete...


From cs at  Mon Jan  6 02:29:12 2014
From: cs at (Cameron Simpson)
Date: Mon, 6 Jan 2014 12:29:12 +1100
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On 05Jan2014 12:51, Ethan Furman <ethan at> wrote:
> On 01/05/2014 12:30 PM, Łukasz Langa wrote:
> >On Jan 5, 2014, at 12:04 PM, Ethan Furman <ethan at> wrote:
> >>On 01/05/2014 11:33 AM, Ethan Furman wrote:
> >>>As anyone who has worked with Python 3 and low-level protocols knows, Python 3 has no 'bytestring' type.  It has
> >>>immutable and mutable versions of arrays of integers, otherwise known as 'bytes' and 'bytearray'.
> >>>
> >>>How many would be interested in having a 'bytestring'?
> >>
> >>+1
> >
> >"I don't always +1 on python-ideas, but when I do, I do it on my own posts."
> +1 QOTW !


... but doesn't your +1 falsify the quote you're +1ing?
Cameron Simpson <cs at>

This person is currently undergoing electric shock therapy at Agnews
Developmental Center in San Jose, California. All his opinions are static,
please ignore him.  Thank you,  Nurse Ratched
- the sig quote of Bob "Another beer, please" Christ <bhatch at>

From dreamingforward at  Mon Jan  6 03:00:27 2014
From: dreamingforward at (Mark Janssen)
Date: Sun, 5 Jan 2014 20:00:27 -0600
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

>> "arrays of integers"?  You mean, unsigned short ints?  There's an
>> important difference.  One references an abstraction, and one
>> references a concrete machine type.
>> The other consideration is knowing what you mean by "string", if you
>> mean something to be interpreted textually, then the convention is to
>> use unsigned chars to document your intentions, which "technically" is
>> the same (as far as memory layout is concerned).  (I say "technically"
>> because there is some space reserved for endian-ness which can change
>> the bit ordering.)
> One mistake I already wish to correct ...
> Trying to be complete...

Come to think of it, this issue (the relationship between bytes, text,
and char/ints) may be the entire reason Python3 "uptake" hasn't
happened.  It gets back to the same old argument I've been trying to
make about "models of computation".  Python3 apparently did not
respect the machine and went the way of the "dark side", hence
scientific computing hasn't been as quick to convert to Python 3.

Specifically, the final issue with regard to bytes (and its
consequent model of computation) is thus:   1) how they maintain
representation on the file system (the "disk") vs. 2) how they are
represented and managed in memory.  This is the primary articulation
point regarding how the *abstraction of computing* relates to its
*implementation*.  This also relates to the Turing Machine and its
articulation with the underlying VonNeumann architecture
(implementation).

Ned, I hope you're finally understanding this.


From ethan at  Mon Jan  6 02:37:43 2014
From: ethan at (Ethan Furman)
Date: Sun, 05 Jan 2014 17:37:43 -0800
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/05/2014 05:29 PM, Cameron Simpson wrote:
> On 05Jan2014 12:51, Ethan Furman <ethan at> wrote:
>> On 01/05/2014 12:30 PM, Łukasz Langa wrote:
>>> On Jan 5, 2014, at 12:04 PM, Ethan Furman <ethan at> wrote:
>>>> On 01/05/2014 11:33 AM, Ethan Furman wrote:
>>>>> As anyone who has worked with Python 3 and low-level protocols knows, Python 3 has no 'bytestring' type.  It has
>>>>> immutable and mutable versions of arrays of integers, otherwise known as 'bytes' and 'bytearray'.
>>>>> How many would be interested in having a 'bytestring'?
>>>> +1
>>> "I don't always +1 on python-ideas, but when I do, I do it on my own posts."
>> +1 QOTW !
> +1 QOTW
> ... but doesn't your +1 falsify the quote you're +1ing?

Hrmmm.... well, just in case:


From ned at  Mon Jan  6 04:39:51 2014
From: ned at (Ned Batchelder)
Date: Sun, 05 Jan 2014 22:39:51 -0500
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On 1/5/14 9:00 PM, Mark Janssen wrote:
>>> "arrays of integers"?  You mean, unsigned short ints?  There's an
>>> important difference.  One references an abstraction, and one
>>> references a concrete machine type.
>>> The other consideration is knowing what you mean by "string", if you
>>> mean something to be interpreted textually, then the convention is to
>>> use unsigned chars to document your intentions, which "technically" is
>>> the same (as far as memory layout is concerned).  (I say "technically"
>>> because there is some space reserved for endian-ness which can change
>>> the bit ordering.)
>> One mistake I already wish to correct ...
>> Trying to be complete...
> Come to think of it, this issue (the relationship between bytes, text,
> and char/ints) may be the entire reason Python3 "uptake" hasn't
> happened.  It gets back to the same old argument I've been trying to
> make about "models of computation".  Python3 apparently did not
> respect the machine and went the way of the "dark side", hence
> scientific computing hasn't been as quick to convert to Python 3.
> Specifically, the final issue with regard to bytes (and its
> consequent model of computation) is thus:   1) how they maintain
> representation on the file system (the "disk") vs. 2) how they are
> represented and managed in memory.  This is the primary articulation
> point regarding how the *abstraction of computing* relates to its
> *implementation*.  This also relates to the Turing Machine and its
> articulation with the underlying VonNeumann architecture
> (implementation).
> Ned, I hope you're finally understanding this.
Mark, I think you are confusing my posts in Python-List with this 
thread.  I would rather you didn't address me: my interactions with you 
in the past have been unpleasant, especially where we've tried to get to 
the bottom of one of your typically obscure references to the theory of 
computation.  You've mocked and ignored me when I've tried to treat your 
ideas with respect, so I'm not going to make that mistake again.

> MarkJ

From tjreedy at  Mon Jan  6 06:08:15 2014
From: tjreedy at (Terry Reedy)
Date: Mon, 06 Jan 2014 00:08:15 -0500
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <laddnm$cmu$>

On 1/5/2014 12:48 PM, Andrew Barnert wrote:
> On Jan 5, 2014, at 3:09, David Townshend
> <aquavitae69 at> wrote:
>> Reading this thread made me start to think about why a string is a
>> sequence,

Because a string is defined in math/language theory as a sequence of 
symbols from an alphabet. If you want to invent or define something 
else, such as an atomic symbol type, please use a different term. For example:

class Symbol:
    def __init__(self, name):
        self._name = name  # optionally check that name is a string
    def __eq__(self, other):
        # Only compare against other Symbols; defer otherwise.
        if not isinstance(other, Symbol):
            return NotImplemented
        return self._name == other._name
    def __hash__(self):
        return hash(self._name)
    def __repr__(self):
        return 'Symbol({!r})'.format(self._name)
    __str__ = __repr__  # or define to taste

Now Symbols are hashable, equality-comparable, but not iterable.

In other words, I believe the desire for a non-iterable 'string' is a 
desire for something that is not really a string, but is perhaps being 
represented as a string merely for convenience. Using duples as 
linked-list nodes (which I have done), because one does not bother to 
define a node class is similar. Tuple iteration is equally meaningless 
in this context as string iteration is in symbol context.

> You've seriously never indexed or sliced a string? Those are the two
> core operations in sequences, and they're obviously useful on
> strings.

And as already explained, indexable means iterable.
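
The fallback in question can be seen with a class that defines only
__getitem__ and no __iter__. Letters is a hypothetical class used purely
for illustration:

```python
class Letters:
    """Indexable wrapper with no __iter__ method."""

    def __init__(self, text):
        self._text = text

    def __getitem__(self, index):
        return self._text[index]  # raises IndexError past the end


# iter() falls back to calling __getitem__ with 0, 1, 2, ... until
# IndexError is raised.
collected = list(Letters("abc"))
```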

>> Every use case I can think of for iterating over a string either
>> involves first splitting the string, or would be better done with a
>> regex

Splitting involves forward iteration. Regex matching adds backtracking 
on top of forward iteration. Please tell me a *string* algorithm that 
does *not* involve character iteration somewhere.

> People have mentioned use cases for iterating strings in this thread.
> And it's easy to think of more. There are all kinds of algorithms
> that treat strings as sequences of characters. Sure, many of these
> functions are already methods on str or otherwise built into the
> stdlib, but that just means they're implemented by iterating the
> string storage in C with a loop around "*++s".

I was going to make the same point. Strings have the following methods: 
'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 
'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 
'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 
'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 
'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 
'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 
'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 
'upper', 'zfill'. Written in Python (as in classes and PyPy!), nearly 
all start with 'for c in s:' (or 'in reversed(s)').  The ones that do 
not generally use len(s). Len(s) is calculated in str.__new__ with an 
internal iteration: 'for char added to string, increment len counter'.
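
A sketch of what one of those methods looks like when written in Python.
count_char is a hypothetical stand-in for s.count(ch), restricted to a
one-character argument:

```python
def count_char(s, ch):
    """Pure-Python stand-in for s.count(ch) with a one-character ch."""
    total = 0
    for c in s:  # the same 'for c in s:' loop described above
        if c == ch:
            total += 1
    return total


n = count_char("banana", "a")
```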

Comparing strings also involves iteration, hence so does sorting lists of
strings by comparison.

> And if you want to
> extend that set of builtins with similar functions, how else would
> you do it but with a "for ch in s" loop? (Well, you could "for ch in
> list(s)", but that's still treating strings as iterables.) For
> example, many people are asked to write a rot13 function in one of
> their first classes. How would you write that if strings weren't
> iterables? There's no way a regex is going to help you here, unless
> you wanted to do something like using re.sub('.') as a convoluted
> and slow way of writing map.

AFAIK, all the codecs iterate character by character.

Terry Jan Reedy

From stephen at  Mon Jan  6 08:35:28 2014
From: stephen at (Stephen J. Turnbull)
Date: Mon, 06 Jan 2014 16:35:28 +0900
Subject: [Python-ideas]  a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

Ethan Furman writes:

 > How many would be interested in having a 'bytestring'?

-1.  It's an attractive nuisance.

 > What do you see as the distinguishing characteristics?

Its main attraction is that it allows people who in practice only ever
deal with one non-Unicode encoding to ignore the fact that their data
is in fact encoded, and that their applications are very likely not
robust to data encoded differently.

While I sympathize with their problem to some extent (especially
people who are writing low-level web services), I don't think you'd
ever again be able to trust a 3rd-party module in a web context
without doing a thorough audit to ensure that all uses of
'bytestrings' are appropriate in themselves and appropriately guarded
against leaking garbage into other contexts.  Thus, "attractive
nuisance".

From bruce at  Mon Jan  6 08:06:10 2014
From: bruce at (Bruce Leban)
Date: Sun, 5 Jan 2014 23:06:10 -0800
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 5, 2014 at 9:48 AM, Andrew Barnert <abarnert at> wrote:

> > Reading this thread made me start to think about why a string is a
> sequence, and I can't actually see any obvious reason, other than
> historical ones.
> You've seriously never indexed or sliced a string? Those are the two core
> operations in sequences, and they're obviously useful on strings.

I am doing most coding in two languages right now: Python and Javascript. I
have never wished that Python had string.charAt(i) but I have often wished
that Javascript had string[i]. When I've iterated over the characters in a
string in Javascript, it has never occurred to me to write it using

By irrelevant analogy, I have never used complex numbers in Python or
Javascript and I can't see any obvious reason to support them. It just
confuses people who inadvertently write cmath.sqrt instead of math.sqrt.
For the few people that use complex numbers, they would be better served by
a tuple of real and imaginary parts. As someone who doesn't use them, my
opinion is clearly more important than that of those who use them.

--- Bruce
Learn how hackers think:

(Not serious about removing complex numbers from Python. If you didn't see
the sarcasm, sorry.)

From jeanpierreda at  Mon Jan  6 09:09:41 2014
From: jeanpierreda at (Devin Jeanpierre)
Date: Mon, 6 Jan 2014 00:09:41 -0800
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <laddnm$cmu$>
References: <>
 <> <laddnm$cmu$>
Message-ID: <>

On Sun, Jan 5, 2014 at 9:08 PM, Terry Reedy <tjreedy at> wrote:
> On 1/5/2014 12:48 PM, Andrew Barnert wrote:
>> On Jan 5, 2014, at 3:09, David Townshend
>> <aquavitae69 at> wrote:
>>> Reading this thread made me start to think about why a string is a
>>> sequence,
> Because a string is defined in math/language theory as a sequence of symbols
> from an alphabet. If you want to invent or define something else, such as an
> atomic symbol type, please use a different term. For example:

And sequences in math / CS are functions from the natural
numbers to elements of the sequence. Since isinstance(str,
types.FunctionType) isn't True, it must mean that Python strings
aren't strings.

But seriously, Python functions aren't functions, the set of Python
complex numbers is not the set of complex numbers, Python types aren't
types, and Python addition is not addition; mathematical terminology
in programming is evocative and
not actually literally true. Arguments based on trying to literally
copy math to the letter are flawed, probably irretrievably so.

The important feature of strings in math is not that they are
literally a sequence of characters, but that they correspond to a
sequence of characters isomorphically. You can represent them any way
you like, as long as you maintain that isomorphism, and the operations
with the right names do the right thing, etc. As evidence, observe
that not every programming language has its string type obey the
equivalent of Python's sequence interface or math's notion of
"sequence" per se (mapping naturals to elements). For example, Haskell
strings are linked lists; Rust strings are arrays behind the scenes
but don't expose it within the str type; etc.

It's not just strings, either. There are a multitude of ways of
defining the natural numbers -- maybe a natural number is a set of a
given structure (and which structure?), maybe it is a pair of integers
where the second integer is 1, maybe it is an infinite sequence of
rationals whose limit is a rational with denominator 1, maybe it is a
bitstring of arbitrary finite length. The usual construction in math
is the first, but Python uses the last one. To say Python doesn't
actually have natural numbers but does have strings, is absurd, but it
is what your logic points towards. If two things are equivalent,
everything said about one can be said about the other, and math is
about saying things about stuff, not about precise definitions of
structure -- those are chosen for convenience.

-- Devin

From jeanpierreda at  Mon Jan  6 09:19:44 2014
From: jeanpierreda at (Devin Jeanpierre)
Date: Mon, 6 Jan 2014 00:19:44 -0800
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
 <> <laddnm$cmu$>
Message-ID: <>

On Mon, Jan 6, 2014 at 12:09 AM, Devin Jeanpierre
<jeanpierreda at> wrote:
> On Sun, Jan 5, 2014 at 9:08 PM, Terry Reedy <tjreedy at> wrote:
>> On 1/5/2014 12:48 PM, Andrew Barnert wrote:
>>> On Jan 5, 2014, at 3:09, David Townshend
>>> <aquavitae69 at> wrote:
>>>> Reading this thread made me start to think about why a string is a
>>>> sequence,
>> Because a string is defined in math/language theory as a sequence of symbols
>> from an alphabet. If you want to invent or define something else, such as an
>> atomic symbol type, please use a different term. For example:
> And sequences in math / CS are functions from the natural
> numbers to elements of the sequence. Since isinstance(str,
> types.FunctionType) isn't True, it must mean that Python strings
> aren't strings.
> But seriously, Python functions aren't functions, the set of Python
> complex numbers is not the set of complex numbers, Python types aren't
> types, and Python addition is not addition; mathematical terminology
> in programming is evocative and
> not actually literally true. Arguments based on trying to literally
> copy math to the letter are flawed, probably irretrievably so.
> The important feature of strings in math is not that they are
> literally a sequence of characters, but that they correspond to a
> sequence of characters isomorphically. You can represent them any way
> you like, as long as you maintain that isomorphism, and the operations
> with the right names do the right thing, etc. As evidence, observe
> that not every programming language has its string type obey the
> equivalent of Python's sequence interface or math's notion of
> "sequence" per se (mapping naturals to elements). For example, Haskell
> strings are linked lists; Rust strings are arrays behind the scenes
> but don't expose it within the str type; etc.
> It's not just strings, either. There are a multitude of ways of
> defining the natural numbers -- maybe a natural number is a set of a
> given structure (and which structure?), maybe it is a pair of integers
> where the second integer is 1, maybe it is an infinite sequence of
> rationals whose limit is a rational with denominator 1, maybe it is a
> bitstring of arbitrary finite length. The usual construction in math
> is the first, but Python uses the last one. To say Python doesn't
> actually have natural numbers but does have strings, is absurd, but it
> is what your logic points towards. If two things are equivalent,
> everything said about one can be said about the other, and math is
> about saying things about stuff, not about precise definitions of
> structure -- those are chosen for convenience.

Apologies, I wasn't thinking much and bungled that last argument
(should've talked about integers instead of naturals; and even did,
for half of it...). Fixed:

[...] There are a multitude of ways of defining the integers -- maybe
an integer is an equivalence class over the pairs of naturals, maybe
it is a rational number with denominator 1, maybe it is an infinite
sequence of rationals whose limit is a rational with denominator 1,
maybe it is a two's complement bitstring of arbitrary length. The
usual construction in math is the first (or the second to last), but
Python uses the last one. To say that Python doesn't actually have
integers, but does have strings, is absurd, but [...]

-- Devin

From geertj at  Mon Jan  6 09:28:18 2014
From: geertj at (Geert Jansen)
Date: Mon, 6 Jan 2014 09:28:18 +0100
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 5, 2014 at 8:33 PM, Ethan Furman <ethan at> wrote:
> As anyone who has worked with Python 3 and low-level protocols knows, Python
> 3 has no 'bytestring' type.  It has immutable and mutable versions of arrays
> of integers, otherwise known as 'bytes' and 'bytearray'.
> How many would be interested in having a 'bytestring'?

I'm not missing a new type, but I am missing the format method on the
binary types.


From stephen at  Mon Jan  6 11:57:18 2014
From: stephen at (Stephen J. Turnbull)
Date: Mon, 06 Jan 2014 19:57:18 +0900
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

Geert Jansen writes:

 > I'm not missing a new type, but I am missing the format method on the
 > binary types.

I'm curious about precisely what your use cases are, and just what
formatting they need.

The problem that Python 2 code has over and over imposed on me is that
the temptation to avoid the overhead of conversion to and then from
unicode when processing text by just using str results in the
equivalent of

    bs1 = returns_a_bytestring_encoded_in_utf8()
    bs2 = returns_a_bytestring_encoded_in_koi8()

    bs3 = b'{0} {1}'.format(bs1, bs2)
    # and lose big when something expects valid UTF-8 in bs3

In low-level code, the assignments to bs1, bs2, and bs3 are likely to
be in three separate contexts, even three separate modules.  I
understand about consenting adults, but it's just too hard to enforce
good practice here if you make it easy to pass around and operate on
encoded bytestrings.  I don't see how you avoid this pitfall, except
by making it easier to pass around Unicode than encoded strings.  And
given that encoding and decoding are unavoidable, that means making
use of bytestrings with text semantics painful.
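
The pitfall described above can be reproduced in a few lines. This is
only a sketch: the sample strings and variable names are illustrative
assumptions, and plain concatenation stands in for the bytes.format
call under discussion.

```python
# Two byte strings that each look fine on their own, but were
# produced with different encodings (sample text is an assumption).
bs1 = "привет".encode("utf-8")
bs2 = "мир".encode("koi8_r")

# Joining them works without complaint, because bytes carry no
# record of the encoding that produced them.
bs3 = bs1 + b" " + bs2

# The damage only shows up later, when some consumer expects UTF-8.
try:
    bs3.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

print(is_valid_utf8)
```

Decoding each part with its own encoding before combining them avoids
the problem, which is the practice argued for above.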

So to answer my question from my own point of view, for example, I
would have no problem at all with

    b'{0:c}'.format(27) == b'\x1b'           # insert an ASCII ESC character

I would be leery of

    b'{0:s}'.format(b'\x1b[M') == b'\x1b[M'  # insert an ANSI control sequence

for the reason given above (for this use case, I would prefer

    blue_code = ord('M')                    # Or b'M', doesn't matter!
    b'\x1b[{0:c}'.format(blue_code) == b'\x1b[M'

-- and forgive me for not looking up my ANSI color sequences, it's
only luck if that's close) and I would consider

    b'{0:d}'.format(27) == b'27'             # insert the ASCII representation

to be an abomination since there's no reason to suppose that any given
bytestring is encoded in an ASCII-compatible way, or bigendian for
that matter.  Ditto everything else that involves representing a
number as a string of numeric characters.
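
For comparison, the three cases above can be spelled today without any
bytes.format method. This is a sketch using only bytes() and
str.encode(); the variable names are illustrative assumptions.

```python
# The {0:c} case: one byte whose value is the given integer.
esc = bytes([27])

# The preferred spelling of the control sequence: literal prefix
# plus a single byte computed from a code point.
blue_code = ord("M")
sequence = b"\x1b[" + bytes([blue_code])

# The {0:d} case: the number rendered as ASCII digits. Here the
# ASCII assumption is explicit in the encode() call.
decimal_text = str(27).encode("ascii")

print(esc, sequence, decimal_text)
```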

From steve at  Mon Jan  6 11:57:33 2014
From: steve at (Steven D'Aprano)
Date: Mon, 6 Jan 2014 21:57:33 +1100
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
Message-ID: <20140106105732.GI29356@ando>

On Sun, Jan 05, 2014 at 11:06:10PM -0800, Bruce Leban wrote:

> As someone who doesn't use them [complex numbers], my
> opinion is clearly more important than that of those that use them.




From abarnert at  Mon Jan  6 12:16:05 2014
From: abarnert at (Andrew Barnert)
Date: Mon, 6 Jan 2014 03:16:05 -0800 (PST)
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

From: Nick Coghlan <ncoghlan at>
Sent: Sunday, January 5, 2014 2:57 PM

>I actually expected someone to have experimented with an "encodedstr" type by now. This would be a type that behaved like the Python 2 str type, but had an encoding attribute. On encountering Unicode text strings, it would encode them appropriately.

I did something like this when I was first playing with 3.0, and I managed to find it.

I tried two different implementations, a bytes subclass that fakes being a str as well as possible by decoding on the fly (or, in some cases, by encoding its arguments on the fly), and a str that fakes being a bytes as well as possible by doing the opposite.

>However, people have generally instead followed the model of decoding to text and operating in that domain, since it avoids a lot of subtle issues (like accidentally embedding byte order marks when concatenating strings).

It's also conceptually cleaner to work with text as text instead of as bytes that you can sort of use as text.

Also, one major reason people resist working with text (or upgrading to 3.x) is the perceived performance costs of dealing with Unicode. But if you want to do any kind of string processing on your text beyond searching for ASCII header names and the like, you pretty much have to do it as Unicode or it's wrong. So, you'd need something that allows you to do those ASCII header searches in 8-bit-land, but either doesn't allow full string processing, or automatically decodes and re-encodes on the fly (which obviously isn't going to be faster).

>This is likely encouraged by the fact that str, bytes and bytearray don't currently implement type coercion correctly (which in turn is due to a long standing bug in the way the abstract C API handles sequence types defined in C rather than Python), so an encodedstr type would need to inherit from str or bytes to get interoperability, and then wouldn't interoperate with the other one.

What's the bug? Anyway, I started off with the idea of inheriting from str or bytes in the first place because it seemed more natural than delegating, so I guess I didn't run into it.

In general, it seems like you can interoperate just fine; an ebytes or estr (the names of my two classes) can, e.g., find, format, join, radd, whatever a bytes, str, ebytes, or estr without a problem, returning the appropriate types.

The problem is interacting with functions that explicitly want the other type. This includes C functions that, e.g., take a "U" parameter, like TextIOWrapper.write, but it's just as much of a problem with Python functions that check isinstance(str) (either to reject bytes, or to switch and do different things on bytes and str). So, you have to write things like "f.write(str(s))" instead of "f.write(s)" all over the place.

There's also a problem with functions that will take a str and do something useful, or take a bytes and do something stupid, like assume it must be in the appropriate encoding for the filesystem. An ebytes just looks like a bytes to such functions, and therefore does the wrong thing. Again, you have to do things like "open(str(s))" -- and, if you don't, instead of an error you get silent mojibake. (Which I guess is a good simulation of the Python 2 str type after all?)

I couldn't find a way around the problem for ebytes. For estr, I fought for a while to make it support the buffer protocol (I wrote a Cython wrapper to let me delegate to another buffer from Python so I wouldn't have to write the whole thing in C), which fixes the problems with most C API functions, but doesn't help at all for Python functions.

Meanwhile, there are some design issues that aren't entirely clear.

The most obvious one is the performance issue I raised above. Should we cache the Unicode? Maybe even pre-compute it? I went with no caching just because it was the simplest implementation.

Exactly which methods should act on bytes and which on characters? My initial cut was that searching-related methods like startswith, index, split, or replace should be bytes, while things like casefold and zfill should be Unicode. The division isn't entirely clear, but it's something to start with. (I also considered switching on the types of the other arguments -- e.g., replace would be byte-based when given a bytes or an ebytes of the same encoding, but Unicode-based when given a str or an ebytes of a different encoding -- but that seemed overly complicated.)

Should indexing and iteration return numbers, as with bytes?

It's obvious what encode should do (transcode to an ebytes in a different encoding), but what about decode? (I left bytes.decode alone, but I think that was a bad choice; that makes it an inverse to a change_encoding function that reinterprets the bytes as a different encoding, rather than an inverse to encode.)

All that being said, just being able to use format or % with a mix of str and known-encoding-bytes is pretty handy.
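
To make the idea concrete, here is a minimal sketch of a bytes subclass
that remembers its encoding. It is not the ebytes class described
above: the name EBytes, the transcode method, and the constructor
signature are all assumptions made for illustration, and none of the
searching or indexing questions raised above are addressed.

```python
class EBytes(bytes):
    """Hypothetical bytes subclass that carries its encoding along."""

    def __new__(cls, data, encoding="utf-8"):
        # Accept text and encode it, or accept already-encoded bytes.
        if isinstance(data, str):
            data = data.encode(encoding)
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __str__(self):
        # Decode on the fly, so str() and str.format() see text.
        return self.decode(self.encoding)

    def transcode(self, encoding):
        # Same text, different encoding (an assumed helper name).
        return EBytes(str(self).encode(encoding), encoding)


price = EBytes("café", "latin-1")
message = "{} costs {}".format(price, 3)
as_utf8 = price.transcode("utf-8")
print(message, as_utf8.encoding)
```

Because str.format falls back to str() for objects without their own
__format__, mixing str and known-encoding bytes in a text template
works with no further code, which is the convenience mentioned above.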

Anyway, in case anyone wants to take a look at it, I can't find the Cython wrapper, so I dropped estr, but cleaned up ebytes and made sure it works with 3.3 and 3.4 and uploaded it to? Please forgive the clunky way I wrote all the forwarding methods.

From geertj at  Mon Jan  6 12:19:08 2014
From: geertj at (Geert Jansen)
Date: Mon, 6 Jan 2014 12:19:08 +0100
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 6, 2014 at 11:57 AM, Stephen J. Turnbull <stephen at> wrote:

>  > I'm not missing a new type, but I am missing the format method on the
>  > binary types.
> I'm curious about precisely what your use cases are, and just what
> formatting they need.

One use case I came across was when creating chunks for the HTTP
chunked encoding. Chunks contain an ASCII header, a raw/encoded chunk
body, and an ASCII trailer. Using a bytes.format, it would look like
this:

  chunk = '{0:X}\r\n{1}\r\n'.format(len(buf), buf)

This is what I am using now:

  chunk = bytearray()
  chunk.extend('{0:X}\r\n'.format(len(buf)).encode('ascii'))
  chunk.extend(buf)
  chunk.extend('\r\n'.encode('ascii'))



From abarnert at  Mon Jan  6 12:34:31 2014
From: abarnert at (Andrew Barnert)
Date: Mon, 6 Jan 2014 03:34:31 -0800 (PST)
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

From: Geert Jansen <geertj at>

Sent: Monday, January 6, 2014 12:28 AM

> I'm not missing a new type, but I am missing the format method on the
> binary types.

I miss that too, but it's a bit tricky.

'{}'.format(x) calls str(x).

b'{}'.format(x) can't call bytes(x). At least not unless you want b'#{}'.format(6) to give you b'#\0\0\0\0\0\0'. Besides, most types don't provide a __bytes__, so even if it weren't for this problem, it wouldn't really be useful for anything except inserting bytes into other bytes. So, what _should_ it call?

You could add encoding and errors keyword parameters (defaulting to 'ascii' and 'strict'), so b'{}'.format(x, encoding='utf-8') calls str(x).encode('utf-8'), which solves all of those problems... except that now it means you can't stick bytes objects into bytes formats, which is even worse.

You could solve that by making objects that support the buffer protocol (like bytes) copy as-is instead of going through str and encode. That would mean you can't use bytes with a placeholder with any format flags, but maybe that's a good thing anyway (e.g., do you really want b'{:3}'.format(b'\xc3\xa9') to only pad to 2 characters instead of 3 because it's a 2-byte character?).

That would be enough to let you cram pre-encoded/formatted bytes, and things like numbers, into bytes formats made up of ASCII headers, which I think is 90% of what people want here. Does that seem worth pursuing?
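
The semantics proposed above can be sketched as an ordinary function.
Everything here is an assumption made for illustration: the name
bformat, the restriction to explicitly numbered fields such as {0},
and the choice to raise ValueError when a bytes-like value is given
format flags. It is not an existing API.

```python
import re

# Matches {0} or {0:spec}; auto-numbered {} is deliberately unsupported.
_FIELD = re.compile(rb"\{(\d+)(?::([^}]*))?\}")


def bformat(template, *args, encoding="ascii", errors="strict"):
    """Hypothetical bytes formatter following the rules sketched above."""

    def fill(match):
        value = args[int(match.group(1).decode("ascii"))]
        spec = (match.group(2) or b"").decode("ascii")
        if isinstance(value, (bytes, bytearray, memoryview)):
            # Buffer objects are copied as-is; flags make no sense here.
            if spec:
                raise ValueError("format flags not allowed for bytes-like values")
            return bytes(value)
        # Everything else goes through text formatting, then encode().
        return format(value, spec).encode(encoding, errors)

    return _FIELD.sub(fill, template)


chunk = bformat(b"{0:X}\r\n{1}\r\n", 255, b"\xc3\xa9")
header = bformat(b"Content-Length: {0}", 42)
print(chunk, header)
```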

From abarnert at  Mon Jan  6 12:52:33 2014
From: abarnert at (Andrew Barnert)
Date: Mon, 6 Jan 2014 03:52:33 -0800 (PST)
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

I didn't receive Stephen's email, so forgive me for replying through a reply.

From: Geert Jansen <geertj at>
Sent: Monday, January 6, 2014 3:19 AM

> On Mon, Jan 6, 2014 at 11:57 AM, Stephen J. Turnbull <stephen at>
> wrote:
>>  > I'm not missing a new type, but I am missing the format method on
>>  > the binary types.
>>  I'm curious about precisely what your use cases are, and just what
>>  formatting they need.

Besides Geert's chunked HTTP example, there are tons of Internet protocols and file formats (including Python source code!) that have ASCII headers (that in some way define an encoding for the actual payload). So things like b'Content-Length: {}'.format(len(payload)) or even b'Content-Type: text/html; charset={}'.format(encoding) are useful.

>>  I would consider
>>      b'{0:d}'.format(27) == b'27'             # insert the ASCII representation
>>  to be an abomination since there's no reason to suppose that any given
>>  bytestring is encoded in an ASCII-compatible way, or bigendian for
>>  that matter.  Ditto everything else that involves representing a
>>  number as a string of numeric characters.

Endianness isn't relevant here; b'{}'.format(32768) is b'32768', not b'\x80\x00' or b'\x00\x80'. That's what the d format means.
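
The distinction can be checked directly. This sketch contrasts the
decimal text representation with the two binary encodings mentioned
above; the variable names are illustrative.

```python
n = 32768

# What a 'd' format means: ASCII decimal digits, no byte order involved.
as_text = str(n).encode("ascii")

# Binary encodings of the same number, where byte order does matter.
big = n.to_bytes(2, "big")
little = n.to_bytes(2, "little")

print(as_text, big, little)
```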

As for assuming that it's ASCII-compatible, again, there are all kinds of protocols that work with any ASCII-compatible charset but don't work otherwise. Yeah, this can be a problem if you want to create an HTTP page or a Python source file in EBCDIC or UTF-16-LE -- but even then, if the headers are interpreted as pure ASCII and then the payload is extracted and decoded separately, it still works. In fact, it works better than if people try to construct everything as text and then encode, giving you illegal/unreadable EBCDIC headers, and this is a common incorrect workaround that Python 2-familiar people do when forced to deal with Python 3.

Obviously you could solve most of the same problems by formatting the headers as text, encoding them to ASCII, then concatenating the payload. And I'm not really worried about performance issues with that. But I am worried about convenience and readability -- compare the desired and actual versions of Geert's code.

As I said in my other email, I might be happy assuming ASCII-strict for everything that isn't a buffer, and copying bytes as-is for everything that is. That _might_ be more of an attractive nuisance than a useful feature, but... it definitely is attractive, and I'm not sure it's a nuisance.

From geertj at  Mon Jan  6 12:57:41 2014
From: geertj at (Geert Jansen)
Date: Mon, 6 Jan 2014 12:57:41 +0100
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 6, 2014 at 12:34 PM, Andrew Barnert <abarnert at> wrote:

> b'{}'.format(x) can't call bytes(x). At least not unless you want b'#{}'.format(6) to give you b'#\0\0\0\0\0\0'. Besides, most types don't provide a __bytes__, so even if it weren't for this problem, it wouldn't really be useful for anything except inserting bytes into other bytes. So, what _should_ it call?
> You could add encoding and errors keyword parameters (defaulting to 'ascii' and 'strict'), so b'{}'.format(x, encoding='utf-8') calls str(x).encode('utf-8'), which solves all of those problems... except that now it means you can't stick bytes objects into bytes formats, which is even worse.
> You could solve that by making objects that support the buffer protocol (like bytes) copy as-is instead of going through str and encode. That would mean you can't use bytes with a placeholder with any format flags, but maybe that's a good thing anyway (e.g., do you really want b'{:3}'.format(b'\xc3\xa9') to only pad to 2 characters instead of 3 because it's a 2-byte character?).
> That would be enough to let you cram pre-encoded/formatted bytes, and things like numbers, into bytes formats made up of ASCII headers, which I think is 90% of what people want here. Does that seem worth pursuing?

Agreed that probably the main case is inserting bytes objects verbatim
in a message with a small ASCII header and possibly trailer. Format
flags are useful, e.g. with chunked HTTP encoding you need to insert
the length in hex. But if those are only available for non-bytes
objects that'd probably be fine.

I'm not too familiar with the implementation of format() so I can't
say much about it.


From masklinn at  Mon Jan  6 12:59:13 2014
From: masklinn at (Masklinn)
Date: Mon, 6 Jan 2014 12:59:13 +0100
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On 2014-01-06, at 11:57 , Stephen J. Turnbull <stephen at> wrote:
> Geert Jansen writes:
>> I'm not missing a new type, but I am missing the format method on the
>> binary types.
> I'm curious about precisely what your use cases are, and just what
> formatting they need.

Building up protocol output, especially (but not solely) ascii-based
ones, from existing or computed parts. Basically the same reasons behind
Erlang's bit syntax (on the building side thereof):

Essentially a partial and more readable (especially more readable)
version of what `struct` provides, and one in which the "pattern" can
contain literal constant content. `struct` is nice, but it doesn't scale
very well to big binary creation, and it's fairly horrible when part of
the output is constant as constant parts *still* have to be patterned
and injected as parameters.
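
A small sketch of the complaint about constant parts. The 4-byte tag
and the field values are made-up examples; only struct.pack itself is
a real API here.

```python
import struct

version = 2
length = 513

# The constant tag cannot appear in the pattern itself: it needs a
# '4s' placeholder and has to be passed in as an argument.
packet = struct.pack(">4sBH", b"HDR1", version, length)

# The usual workaround is to concatenate the literal by hand.
same_packet = b"HDR1" + struct.pack(">BH", version, length)

print(packet)
```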

Also, no support for keyword arguments.

From denis.spir at  Mon Jan  6 13:34:01 2014
From: denis.spir at (spir)
Date: Mon, 06 Jan 2014 13:34:01 +0100
Subject: [Python-ideas] str.startswith taking any iterator instead of
 just tuple
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/05/2014 06:49 PM, Eric Snow wrote:
> On Jan 5, 2014 4:10 AM, "David Townshend" <aquavitae69 at> wrote:
>> Reading this thread made me start to think about why a string is a
> sequence, and I can't actually see any obvious reason, other than
> historical ones.
> Sometimes I think it would be more clear if strings weren't sequences but
> had various attributes that exposed sequence "views", e.g. codepoints,
> etc.  Making strings non-sequences isn't realistic at this point, but
> adding the sequence view attributes may still be nice.
> That said, at present it's not something I personally have any use case
> for.  There was an article floating around the web recently where the
> deficiencies of unicode implementations was discussed and I recall
> something there or in related discussions about use cases for having
> different views into a string.  Wow that was vague. :)  The different views
> into unicode strings certainly comes up from time to time on our lists.

This does not fit the picture as long as strings are indexable and sliceable, in 
my opinion.

But most importantly, from the user practice & experience perspective, and while
from a theoretical one it may be debatable, I consider it a great feature of
Python that everyday "mundane" string processing can be done using simple and
easy Python string routines (I include here indexing & slicing). Alternatives
would be regexes (read: Perl) and/or matching/parsing/searching libs (e.g.
pyparsing) everywhere in Python code; both are difficult, error-prone, hard to
debug. The former are plain esoteric (but terribly practical ;-), and I'm happy
to rarely have to decipher *others'* regexes when reading Python code (my own
are far easier, indeed ;-).


From ram.rachum at  Mon Jan  6 14:28:48 2014
From: ram.rachum at (Ram Rachum)
Date: Mon, 6 Jan 2014 05:28:48 -0800 (PST)
Subject: [Python-ideas] Getting file name of Path without suffix
Message-ID: <>

Hi guys,

What do you think about introducing this Path property:
    @property
    def suffixless_name(self):
        return self.name[:-len(self.suffix)] if self.suffix else self.name

It's simple but I'd really hate to have this conditional slicing in user
code.


From solipsis at  Mon Jan  6 14:46:05 2014
From: solipsis at (Antoine Pitrou)
Date: Mon, 6 Jan 2014 14:46:05 +0100
Subject: [Python-ideas] Getting file name of Path without suffix
References: <>
Message-ID: <20140106144605.15a3b3f5@fsol>

On Mon, 6 Jan 2014 05:28:48 -0800 (PST)
Ram Rachum <ram.rachum at> wrote:

> Hi guys,
> What do you think about introducing this Path property:
>     @property
>     def suffixless_name(self):
>         return self.name[:-len(self.suffix)] if self.suffix else self.name
> It's simple but I'd really hate to have this conditional slicing in user 
> code.

Have you tried .stem?
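
A quick comparison of the two, as a sketch. The standalone function
suffixless_name below is a hypothetical stand-in for the proposed
property, and the sample file names are made up.

```python
from pathlib import PurePosixPath


def suffixless_name(path):
    # Same conditional slicing as the proposed property.
    return path.name[:-len(path.suffix)] if path.suffix else path.name


samples = [PurePosixPath(p) for p in
           ("docs/report.txt", "archive.tar.gz", "README", ".bashrc")]

stems = [p.stem for p in samples]
sliced = [suffixless_name(p) for p in samples]
print(stems)
```

Only the last suffix is dropped in both cases, so "archive.tar.gz"
keeps its ".tar" part.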



From breamoreboy at  Mon Jan  6 14:57:55 2014
From: breamoreboy at (Mark Lawrence)
Date: Mon, 06 Jan 2014 13:57:55 +0000
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <laecou$gtg$>

On 06/01/2014 08:28, Geert Jansen wrote:
> On Sun, Jan 5, 2014 at 8:33 PM, Ethan Furman <ethan at> wrote:
>> As anyone who has worked with Python 3 and low-level protocols knows, Python
>> 3 has no 'bytestring' type.  It has immutable and mutable versions of arrays
>> of integers, otherwise known as 'bytes' and 'bytearray'.
>> How many would be interested in having a 'bytestring'?
> I'm not missing a new type, but I am missing the format method on the
> binary types.
> Regards,
> Geert

Is this what the new PEP 460 is aimed at or am I again barking in the 
wrong forest?

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From ncoghlan at  Mon Jan  6 15:50:40 2014
From: ncoghlan at (Nick Coghlan)
Date: Tue, 7 Jan 2014 00:50:40 +1000
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <laecou$gtg$>
References: <>
Message-ID: <>

On 6 Jan 2014 21:58, "Mark Lawrence" <breamoreboy at> wrote:
> On 06/01/2014 08:28, Geert Jansen wrote:
>> On Sun, Jan 5, 2014 at 8:33 PM, Ethan Furman <ethan at> wrote:
>>> As anyone who has worked with Python 3 and low-level protocols knows, Python
>>> 3 has no 'bytestring' type.  It has immutable and mutable versions of arrays
>>> of integers, otherwise known as 'bytes' and 'bytearray'.
>>> How many would be interested in having a 'bytestring'?
>> I'm not missing a new type, but I am missing the format method on the
>> binary types.
>> Regards,
>> Geert
> Is this what the new PEP 460 is aimed at or am I again barking in the
> wrong forest?

Yep, parallel discussions.



From ncoghlan at  Mon Jan  6 15:58:30 2014
From: ncoghlan at (Nick Coghlan)
Date: Tue, 7 Jan 2014 00:58:30 +1000
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On 6 Jan 2014 19:16, "Andrew Barnert" <abarnert at> wrote:
> From: Nick Coghlan <ncoghlan at>
> Sent: Sunday, January 5, 2014 2:57 PM
> >I actually expected someone to have experimented with an "encodedstr"
> >type by now. This would be a type that behaved like the Python 2 str type,
> >but had an encoding attribute. On encountering Unicode text strings, it
> >would encode them appropriately.
> I did something like this when I was first playing with 3.0, and I
> managed to find it.
> I tried two different implementations, a bytes subclass that fakes being
> a str as well as possible by decoding on the fly (or, in some cases, by
> encoding its arguments on the fly), and a str that fakes being a bytes as
> well as possible by doing the opposite.
> >However, people have generally instead followed the model of decoding to
> >text and operating in that domain, since it avoids a lot of subtle issues
> >(like accidentally embedding byte order marks when concatenating strings).
> It's also conceptually cleaner to work with text as text instead of as
> bytes that you can sort of use as text.
> Also, one major reason people resist working with text (or upgrading to
> 3.x) is the perceived performance costs of dealing with Unicode. But if you
> want to do any kind of string processing on your text beyond searching for
> ASCII header names and the like, you pretty much have to do it as Unicode
> or it's wrong. So, you'd need something that allows you to do those ASCII
> header searches in 8-bit-land, but either doesn't allow full string
> processing, or automatically decodes and re-encodes on the fly (which
> obviously isn't going to be faster).
> >This is likely encouraged by the fact that str, bytes and bytearray
> >don't currently implement type coercion correctly (which in turn is due to
> >a long standing bug in the way the abstract C API handles sequence types
> >defined in C rather than Python), so an encodedstr type would need to
> >inherit from str or bytes to get interoperability, and then wouldn't
> >interoperate with the other one.
> What's the bug?

CPython doesn't check for NotImplemented results from sq_concat or
sq_repeat, so the sequence implementations raise TypeError directly and the
RHS doesn't get consulted to see if it can handle the operation.
Subclassing works anyway because subclasses are always checked first even
when they're the RHS.

Thanks for the info on your experiences with attempting to implement an
encodedstr type. I still feel there is potential merit to the concept, but
it's certainly going to take some thought.


From stephen at  Mon Jan  6 18:14:07 2014
From: stephen at (Stephen J. Turnbull)
Date: Tue, 07 Jan 2014 02:14:07 +0900
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

Geert Jansen writes:

 > One use case I came across was when creating chunks for the HTTP
 > chunked encoding. Chunks contain a ascii header, a raw/encoded chunk
 > body, and an ascii trailer. Using a bytes.format, it would look like
 > this:
 >   chunk = '{0:X}\r\n{1}\r\n'.format(len(buf), buf)

You forgot the b prefix.

 > This is what I am using now:
 >   chunk = bytearray()
 >   chunk.extend('{0:X}\r\n'.format(len(buf)).encode('ascii'))
 >   chunk.extend(buf)
 >   chunk.extend('\r\n'.encode('ascii'))

Either of those is a big win compared to this?

    # OK, we'd want efficient definition of a bunch of these,
    # which is a cost.
    def itox (n):
        return '{0:X}'.format(n).encode('ascii')

    chunk = b'\r\n'.join([itox(len(buf)), buf, b''])

But see my response to Andrew, also.

From stephen at  Mon Jan  6 18:16:23 2014
From: stephen at (Stephen J. Turnbull)
Date: Tue, 07 Jan 2014 02:16:23 +0900
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <laecou$gtg$>
References: <>
Message-ID: <>

Mark Lawrence writes:

 > Is this what the new PEP 460 is aimed at or am I again barking in the 
 > wrong forest?

Sure, but that's only hours old.  And I think there's a better way.

From stephen at  Mon Jan  6 19:37:36 2014
From: stephen at (Stephen J. Turnbull)
Date: Tue, 07 Jan 2014 03:37:36 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a new
 bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

Aside: I just read Victor's PEP 460, and apparently a lot of the
assumptions I'm making are true!

Andrew Barnert writes:
 > From: Geert Jansen <geertj at>
 > > On Mon, Jan 6, 2014 at 11:57 AM, Stephen J. Turnbull <stephen at> 
 > > wrote:
 > > 
 > >>   > I'm not missing a new type, but I am missing the format method on 
 > >>   > the binary types.
 > >> 
 > >>  I'm curious about precisely what your use cases are, and just what
 > >>  formatting they need.
 > Besides Geert's chunked HTTP example, there are tons of internet
 > protocols and file formats (including Python source code!),

Python source code must use an ASCII-compatible encoding to use PEP
263.  No widechars, no EBCDIC.  But yes, I know about ASCII header
formats -- I'm a Mailman developer.

 > that have ASCII headers (that in some way define an encoding for
 > the actual payload). So things like
 > b'Content-Length: {}'.format(len(payload))
 > or even
 > b'Content-Type: text/html; charset={}'.format(encoding)
 > are useful.

Useful, sure.  But that much more useful than the alternative?  What's
wrong with

    def itob(n):
        # besides efficiency :-)
        return "{0:d}".format(n).encode('ascii')

    b'Content-Length: ' + itob(len(payload))

    b'Content-Type: text/html; charset=' + encoding

for such cases?  Not to forget that for cases with multiple parts to
combine, bytes.join() is way fast -- which matters to most people who
want these operations.  So I just don't see a real need for generic
formatting operations here.  (regex is another matter, but that's
already implemented.)
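
A minimal sketch of that approach, assuming the charset name is
already available as bytes (the concatenation above requires that).
The sample payload and the variable names are hypothetical.

```python
def itob(n):
    """Format an integer in decimal, as ASCII bytes."""
    return "{0:d}".format(n).encode('ascii')


payload = b'<html></html>'
encoding = b'utf-8'  # must be bytes to concatenate with a bytes literal

header_lines = [
    b'Content-Length: ' + itob(len(payload)),
    b'Content-Type: text/html; charset=' + encoding,
]

# Two trailing empty items give the CRLF that ends the last header line
# plus the blank line that terminates the header block.
headers = b'\r\n'.join(header_lines + [b'', b''])
message = headers + payload
```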

 > As for assuming that it's ASCII-compatible, again, there are all
 > kinds of protocols that work with any ASCII-compatible charset but
 > don't work otherwise.

If you *can* assume it's ASCII-compatible bytes, what's wrong with str
in Python 3?  The basic idea is to use

    inbytes.decode('ascii', errors='surrogateescape')

which will DTRT if you try to encode it without the surrogateescape
handler: it raises an exception unless the bytes is pure ASCII.  It's
memory-efficient for pure ASCII, and has all the string facilities we
love.  But of course it would be too painful for sending JPEGs by
chunked HTTP a la Geert.
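
As a sketch of what that looks like in current Python 3 (the sample
bytes and the variable names other than inbytes are assumptions made
for illustration):

```python
# An ASCII header block followed by two non-ASCII payload bytes.
inbytes = b'Content-Type: text/plain\r\n\r\n\xff\xfe'

# Non-ASCII bytes are mapped to lone surrogates (U+DC80..U+DCFF)
# instead of raising UnicodeDecodeError.
text = inbytes.decode('ascii', errors='surrogateescape')

# Encoding with the same handler restores the original bytes exactly.
roundtrip = text.encode('ascii', errors='surrogateescape')

# Encoding without the handler raises, because of the smuggled bytes.
try:
    text.encode('ascii')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Pure ASCII input encodes fine without the handler.
pure = b'Content-Length: 0'.decode('ascii', errors='surrogateescape')
pure_bytes = pure.encode('ascii')
```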

So ... now that we have the flexible string representation (PEP 393),
let's add a 7-bit representation!  (Don't take that too seriously,
there are interesting more general variants I'm not going to talk
about tonight.)

The 7-bit representation satisfies the following requirements:

1.  It is only produced on input by a new 'ascii-compatible' codec,
    which sets the "7-bit representation" flag in the str object on
    input if it encounters any non-ASCII bytes (if pure ASCII, it
    produces an 8-bit str object).  This will be slower than just
    reading in the bytes in many cases, but I hope not unacceptably so.

2.  When sliced, the result needs to be checked for non-ASCII bytes.
    If none, the result is promoted to 8-bit.

3.  When combined with a str in 8-bit representation:

    a.  If the 8-bit str contains any Latin-1 or C1 characters, both
        strs are promoted to 16-bit, and non-ASCII characters in the
        7-bit string are converted by the surrogateescape handler.

    b.  Otherwise they're combined into a 7-bit str.

4.  When combined with a str in 16-bit or 32-bit representation, the
    7-bit string is "decoded" to the same representation, as if using
    the 'ascii' codec with the 'surrogateescape' handler.

5.  String methods that would raise or produce undefined results if
    used on str containing surrogate-encoded bytes need to be taught
    to do the same on non-ASCII bytes in 7-bit str objects.

6.  On output the 'ascii-compatible' codec simply memcpy's 7-bit str
    and pure ASCII 8-bit str, and raises on anything else.  (Sorry,
    no, ISO 8859-1 does *not* get passed through without exception.)

7.  On output other codecs raise on a 7-bit str, unless the
    surrogateescape handler is in use.

IOW, it's almost as fast as bytes if you restrict yourself to ASCII-
compatible behavior, and you pay the price if you try to mix it with
"real" Unicode str objects.  Otherwise you can do anything with it you
could do with a str.

I don't think this actually has serious efficiency implications for
Unicode handling, since the relevant compatibility tests need to be
done anyway when combining strs.  All the expensive operations occur
when mixing 7-bit str and "real" non-ASCII Unicode, but we really
don't want to do that if we can avoid it, any more than we want to use
surrogate encoding if we can avoid it.

Efficiency for low-level protocols could be improved by having the
'ascii-compatible' codec always produce 7-bit.  I haven't thought
carefully about this yet.

For the same reasons, there should be few surprises where people
inadvertently mix 7-bit str with "real" Unicode, since creating 7-bit
is only done by the 'ascii-compatible' codec.  People who are doing
that will be using ASCII compatible protocols and should be used to
being careful with non-ASCII bytes.

Finally, none of the natural idioms require a b prefix on their
literals. :-)

N.B. Much of the above assumes that working with Unicode in 8-bit
representation is basically as efficient as working with bytes.  That
is an assumption on my part, I hope it's verified.


From dreamingforward at  Mon Jan  6 19:53:29 2014
From: dreamingforward at (Mark Janssen)
Date: Mon, 6 Jan 2014 12:53:29 -0600
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

>> How many would be interested in having a 'bytestring'?
> I'm not missing a new type, but I am missing the format method on the
> binary types.

Wouldn't a type "cast" like TextFile(bytestring) be sufficient?


From tjreedy at  Tue Jan  7 00:39:10 2014
From: tjreedy at (Terry Reedy)
Date: Mon, 06 Jan 2014 18:39:10 -0500
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <>
References: <>
 <> <laddnm$cmu$>
Message-ID: <lafeqm$jau$>

On 1/6/2014 3:09 AM, Devin Jeanpierre wrote:
> On Sun, Jan 5, 2014 at 9:08 PM, Terry Reedy <tjreedy at> wrote:

>>> On Jan 5, 2014, at 3:09, David Townshend
>>> <aquavitae69 at> wrote:
>>>> Reading this thread made me start to think about why a string is a
>>>> sequence,
>> Because a string is defined in math/language theory as a sequence of symbols
>> from an alphabet. If you want to invent or define something else, such as an
>> atomic symbol type, please use a different term. For example:
> And sequences in math / CS are functions from the natural
> numbers to elements of the sequence.

And functions (mappings) in math are defined either by a rule for 
calculating the output from the input or by a table (set of pairs) 
giving the output for each input. If the input domain is the finite 
sequence of counts from 0 to k, the table can be condensed to a sequence 
of k+1 output values.

 > Since isinstance(str, types.FunctionType) isn't True,

Python has multiple builtin callable types, and users can define more, 
so you need to expand that test. Anyway, since a string is not a 
function defined by rule, it must be a function defined by a table. 
Since the input domain is a finite sequence of counts, we can and do 
condense the table to a sequence of output values. Which is an expansion 
of what I said.

 > [snip]
Terry Jan Reedy

From ethan at  Mon Jan  6 23:59:11 2014
From: ethan at (Ethan Furman)
Date: Mon, 06 Jan 2014 14:59:11 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/06/2014 10:37 AM, Stephen J. Turnbull wrote:
> Comments?

Having a 7-bit str variant is definitely an interesting idea, but it wouldn't help me and is probably insufficient for 
network protocols as well.  The binary data I deal with occupies the full 0-255 range, some of which is actually encoded 
text (and I decode it before passing it back to the user), some of which is simple binary data, and some of which is 
simple ASCII (metadata about fields and whatnot).


From jeanpierreda at  Tue Jan  7 01:20:18 2014
From: jeanpierreda at (Devin Jeanpierre)
Date: Mon, 6 Jan 2014 16:20:18 -0800
Subject: [Python-ideas] str.startswith taking any iterator instead of
	just tuple
In-Reply-To: <lafeqm$jau$>
References: <>
 <> <laddnm$cmu$>
Message-ID: <>

On Mon, Jan 6, 2014 at 3:39 PM, Terry Reedy <tjreedy at> wrote:
>> Since isinstance(str, types.FunctionType) isn't True,
> Python has multiple builtin callable types, and users can define more, so
> you need to expand that test. Anyway, since a string is not a function
> defined by rule, it must be a function defined by a table. Since the input
> domain is a finite sequence of counts, we can and do condense the table to a
> sequence of output values. Which is an expansion of what I said.

No, I don't need to expand the test -- the limitation of the test was
the entire point. I was making fun of your argument that because the
mathematical terms are the same, therefore they must be the same in
Python. "strings are sequences in math, therefore they are in python"
is a superficial and fundamentally wrong argument. Here's another
argument of that form: "the nth element of a string is not a string in
math, therefore the nth element of a string is not a string in

That's a lie, of course.

There are too many ways that type of argument falls flat.

-- Devin

From stephen at  Tue Jan  7 06:05:47 2014
From: stephen at (Stephen J. Turnbull)
Date: Tue, 07 Jan 2014 14:05:47 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

Ethan Furman writes:

 > Having a 7-bit str variant is definitely an interesting idea, but
 > it wouldn't help me and is probably insufficient for network
 > protocols as well.

I'd like evidence for that latter.

 > The binary data I deal with occupies the full 0-255 range,

My proposal deals with such data.  It simply prevents the program from
interpreting the 128-255 range as Unicode characters.  You can still
use regexps etc on the full range 0-255.

 > some of which is actually encoded text (and I decode it before
 > passing it back to the user), some of which is simple binary data,
 > and some of which is simple ASCII (metadata about fields and
 > whatnot).

You're wrong, it would help you.  Encoded text must be decoded, and in
that case it doesn't help you.  Unless you can treat it as a single
ASCII-compatible encoding (eg, this works for ISO-8859 or KOI8), when
the proposal wins for you.  Binary data and pure ASCII, the proposal
wins for you, unless you're worried about spurious recognition of the
binary data as ASCII metadata.  In that last case, again, nothing is
going to help you as it's a domain problem.  My proposal is undefeated
in your use case.

From geertj at  Tue Jan  7 06:32:11 2014
From: geertj at (Geert Jansen)
Date: Tue, 7 Jan 2014 06:32:11 +0100
Subject: [Python-ideas] a new bytestring type?
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 6, 2014 at 7:53 PM, Mark Janssen <dreamingforward at> wrote:
>>> How many would be interested in having a 'bytestring'?
>> I'm not missing a new type, but I am missing the format method on the
>> binary types.
> Wouldn't a type "cast" like TextFile(bytestring) be sufficient?

Unless I'm missing something, no. For the use case described the
result needs to be a bytes object.


From ethan at  Tue Jan  7 06:51:11 2014
From: ethan at (Ethan Furman)
Date: Mon, 06 Jan 2014 21:51:11 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 01/06/2014 09:05 PM, Stephen J. Turnbull wrote:
> Ethan Furman writes:
>> The binary data I deal with occupies the full 0-255 range,
> My proposal deals with such data.  It simply prevents the program from
> interpreting the 128-255 range as Unicode characters.  You can still
> use regexps etc on the full range 0-255.
>> some of which is actually encoded text (and I decode it before
>> passing it back to the user), some of which is simple binary data,
>> and some of which is simple ASCII (metadata about fields and
>> whatnot).
> You're wrong, it would help you.  Encoded text must be decoded, and in
> that case it doesn't help you.  Unless you can treat it as a single
> ASCII-compatible encoding (eg, this works for ISO-8859 or KOI8), when
> the proposal wins for you.  Binary data and pure ASCII, the proposal
> wins for you, unless you're worried about spurious recognition of the
> binary data as ASCII metadata.  In that last case, again, nothing is
> going to help you as it's a domain problem.  My proposal is undefeated
> in your use case.

I just read your proposal again, and must admit I don't understand how it would help me, but I look forward to testing 
an implementation!

One wrinkle, though -- the data is binary, and if read would have to be read using the latin1 codec... although, I 
suppose I could open it, read the first 32 bytes, close it, figure out the encoding, reopen with the encoding.... hmmmm 
-- yup, still not sure how it would all work, but looking forward to testing it.


From stephen at  Tue Jan  7 14:00:17 2014
From: stephen at (Stephen J. Turnbull)
Date: Tue, 07 Jan 2014 22:00:17 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

Ethan Furman writes:

 > I just read your proposal again, and must admit I don't understand
 > how it would help me, but I look forward to testing an
 > implementation!
 > One wrinkle, though -- the data is binary, and if read would have
 > to be read using the latin1 codec...

That depends on what you mean by "binary".  If the binary payload is
just a blob that gets passed on (eg, as in an HTTP client receiving
and storing a JPEG file), you read the stream as 'ascii-compatible',
parse the headers using regexps or whatever, print any relevant parsed
data to logs using 'ascii-compatible', slice off the blob, and write
the blob to disk as 'ascii-compatible'.  This has the advantage over
latin1 that the bytes are marked as "uninterpreted text".  It doesn't
mean you can't create mojibake; you still can.  But Python will
complain if you try to output it as text in an encoding (unless you
use the 'surrogateescape' handler, in which case you're explicitly
accepting responsibility for any mess you create).

If you mean to process the binary, it would depend on what you want to
do whether it would help or not.  struct- and ctypes-style processing,
no, it won't help because you need to convert to bytes to use those.
(It might make sense to read the headers into a buffer this way, parse
them as ASCII-compatible text, and then read the rest as bytes.)  Pure
byte code, doesn't help, although it probably doesn't hurt.
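
The 'ascii-compatible' codec is only proposed and does not exist, so
the following sketch of the blob-passing workflow substitutes the
existing 'ascii' codec with the 'surrogateescape' handler.  The sample
response bytes and variable names are assumptions for illustration.

```python
import re

# A tiny HTTP-like response: ASCII headers, then a binary payload.
raw = (b'HTTP/1.1 200 OK\r\n'
       b'Content-Length: 4\r\n'
       b'\r\n'
       b'\xff\xd8\xff\xe0')

# Read the whole stream as text; payload bytes become lone surrogates.
text = raw.decode('ascii', errors='surrogateescape')

# Parse the headers with ordinary str tools.
head, sep, blob = text.partition('\r\n\r\n')
match = re.search(r'Content-Length: (\d+)', head)
length = int(match.group(1))

# Slice off the blob and turn it back into the original bytes.
blob_bytes = blob.encode('ascii', errors='surrogateescape')
```

The difference under the proposal would be that the payload never
gets widened to a 16-bit representation along the way.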

From steve at  Tue Jan  7 16:44:03 2014
From: steve at (Steven D'Aprano)
Date: Wed, 8 Jan 2014 02:44:03 +1100
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
	new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <20140107154401.GK29356@ando>

On Tue, Jan 07, 2014 at 03:37:36AM +0900, Stephen J. Turnbull wrote:

> So ... now that we have the flexible string representation (PEP 393),
> let's add a 7-bit representation!  (Don't take that too seriously,
> there are interesting more general variants I'm not going to talk
> about tonight.)
> The 7-bit representation satisfies the following requirements:
> 1.  It is only produced on input by a new 'ascii-compatible' codec,
>     which sets the "7-bit representation" flag in the str object on
>     input if it encounters any non-ASCII bytes (if pure ASCII, it
>     produces an 8-bit str object).  This will be slower than just
>     reading in the bytes in many cases, but I hope not unacceptably so.

I'm confused by your suggestion here. It seems to me that you've got the 
conditions backwards. (Or I don't understand them.) Perhaps a couple of 
examples will make it clear.

Suppose we take a pure-ASCII byte-string and decode it:

    b'abcd'.decode('ascii-compatible')

According to the above, this will produce a regular str object, 'abcd', 
using the regular 8-bit internal representation, and the "7-bit repr" 
flag cleared. Correct? (So the flag is *cleared* when all the chars in 
the string are 7-bit, and *set* when at least one is not. Yes?)

Suppose we take a byte-string with a non-ASCII byte:

    b'abc\xFF'.decode('ascii-compatible')

This will return... what? I think it returns a so-called 7-bit 
representation, but I'm not sure what it is a representation of. I 
presume the internals will actually contain the four bytes

    61 62 63 FF

and the "7-bit repr" flag will be set. Is that flag the only difference 
between these two strings?

    b'abc\xFF'.decode('ascii-compatible')
    'abc\xFF'

Presumably they will compare equal, yes?

> 2.  When sliced, the result needs to be checked for non-ASCII bytes.
>     If none, the result is promoted to 8-bit.
> 3.  When combined with a str in 8-bit representation:
>     a.  If the 8-bit str contains any Latin-1 or C1 characters, both
>         strs are promoted to 16-bit, and non-ASCII characters in the
>         7-bit string are converted by the surrogateescape handler.
>     b.  Otherwise they're combined into a 7-bit str.

A concrete example:

    s = b'abcd'.decode('ascii-compatible')
    t = 'x'  # ASCII-compatible
    s + t
    => returns 'abcdx', with the "7-bit repr" flag cleared.

    s = b'abcd'.decode('ascii-compatible')
    t = 'ÿ'  # U+00FF, non-ASCII.

    s + t
    => returns 'abcd\uDCFF', with the "7-bit repr" flag set

The \uDCFF at the end is the ÿ encoded with the surrogateescape error
handler.

There's a problem with this: two strings, visually indistinguishable, 
but differing only in the internal representation, give completely 
different results:

    b'abcd'.decode('ascii') + 'ÿ'
    => 'abcd\u00FF'

    b'abcd'.decode('ascii-compatible') + 'ÿ'
    => 'abcd\uDCFF'

> 4.  When combined with a str in 16-bit or 32-bit representation, the
>     7-bit string is "decoded" to the same representation, as if using
>     the 'ascii' codec with the 'surrogateescape' handler.

Another example:

    s = b'abcd'.decode('ascii-compatible')
    assert s == 'abcd'
    s + 'π'
    => returns what?

Your description confuses me. The "7-bit string" is already text, how do 
you decode it to the 16-bit internal representation? 

> 5.  String methods that would raise or produce undefined results if
>     used on str containing surrogate-encoded bytes need to be taught
>     to do the same on non-ASCII bytes in 7-bit str objects.

Do you have an example of such string methods?

> 6.  On output the 'ascii-compatible' codec simply memcpy's 7-bit str
>     and pure ASCII 8-bit str, and raises on anything else.  (Sorry,
>     no, ISO 8859-1 does *not* get passed through without exception.)
> 7.  On output other codecs raise on a 7-bit str, unless the
>     surrogateescape handler is in use.

What do you mean by "on output"? Do you mean when encoding?

This concerns me:

    b'abcd'.decode('ascii').encode('latin-1')
    => returns b'abcd'

    b'abcd'.decode('ascii-compatible').encode('latin-1')
    => raises

And yet, the two 'abcd' strings you get are visually indistinguishable, 
and only differ by a hidden, internal flag.

I've probably misunderstood something about your proposal, so please 
explain where I've gone wrong. Please give examples!


From ncoghlan at  Tue Jan  7 17:19:09 2014
From: ncoghlan at (Nick Coghlan)
Date: Wed, 8 Jan 2014 02:19:09 +1000
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <20140107154401.GK29356@ando>
References: <>
Message-ID: <>

On 7 Jan 2014 23:45, "Steven D'Aprano" <steve at> wrote:
> On Tue, Jan 07, 2014 at 03:37:36AM +0900, Stephen J. Turnbull wrote:
> > So ... now that we have the flexible string representation (PEP 393),
> > let's add a 7-bit representation!  (Don't take that too seriously,
> > there are interesting more general variants I'm not going to talk
> > about tonight.)
> >
> > The 7-bit representation satisfies the following requirements:
> >
> > 1.  It is only produced on input by a new 'ascii-compatible' codec,
> >     which sets the "7-bit representation" flag in the str object on
> >     input if it encounters any non-ASCII bytes (if pure ASCII, it
> >     produces an 8-bit str object).  This will be slower than just
> >     reading in the bytes in many cases, but I hope not unacceptably so.
> I'm confused by your suggestion here. It seems to me that you've got the
> conditions backwards. (Or I don't understand them.) Perhaps a couple of
> examples will make it clear.
> Suppose we take a pure-ASCII byte-string and decode it:
>     b'abcd'.decode('ascii-compatible')
> According to the above, this will produce a regular str object, 'abcd',
> using the regular 8-bit internal representation, and the "7-bit repr"
> flag cleared. Correct? (So the flag is *cleared* when all the chars in
> the string are 7-bit, and *set* when at least one is not. Yes?)
> Suppose we take a byte-string with a non-ASCII byte:
>     b'abc\xFF'.decode('ascii-compatible')
> This will return... what? I think it returns a so-called 7-bit
> representation, but I'm not sure what it is a representation of. I
> presume the internals will actually contain the four bytes
>     61 62 63 FF
> and the "7-bit repr" flag will be set. Is that flag the only difference
> between these two strings?
>     b'abc\xFF'.decode('ascii-compatible')
>     'abc\xFF'
> Presumably they will compare equal, yes?
> > 2.  When sliced, the result needs to be checked for non-ASCII bytes.
> >     If none, the result is promoted to 8-bit.
> >
> > 3.  When combined with a str in 8-bit representation:
> >
> >     a.  If the 8-bit str contains any Latin-1 or C1 characters, both
> >         strs are promoted to 16-bit, and non-ASCII characters in the
> >         7-bit string are converted by the surrogateescape handler.
> >
> >     b.  Otherwise they're combined into a 7-bit str.
> A concrete example:
>     s = b'abcd'.decode('ascii-compatible')
>     t = 'x'  # ASCII-compatible
>     s + t
>     => returns 'abcdx', with the "7-bit repr" flag cleared.
>     s = b'abcd'.decode('ascii-compatible')
>     t = 'ÿ'  # U+00FF, non-ASCII.
>     s + t
>     => returns 'abcd\uDCFF', with the "7-bit repr" flag set
> The \uDCFF at the end is the ÿ encoded with the surrogateescape error
> handler.
> There's a problem with this: two strings, visually indistinguishable,
> but differing only in the internal representation, give completely
> different results:
>     b'abcd'.decode('ascii') + 'ÿ'
>     => 'abcd\u00FF'
>     b'abcd'.decode('ascii-compatible') + 'ÿ'
>     => 'abcd\uDCFF'
> > 4.  When combined with a str in 16-bit or 32-bit representation, the
> >     7-bit string is "decoded" to the same representation, as if using
> >     the 'ascii' codec with the 'surrogateescape' handler.
> Another example:
>     s = b'abcd'.decode('ascii-compatible')
>     assert s == 'abcd'
>     s + 'π'
>     => returns what?
> Your description confuses me. The "7-bit string" is already text, how do
> you decode it to the 16-bit internal representation?
> > 5.  String methods that would raise or produce undefined results if
> >     used on str containing surrogate-encoded bytes need to be taught
> >     to do the same on non-ASCII bytes in 7-bit str objects.
> Do you have an example of such string methods?
> > 6.  On output the 'ascii-compatible' codec simply memcpy's 7-bit str
> >     and pure ASCII 8-bit str, and raises on anything else.  (Sorry,
> >     no, ISO 8859-1 does *not* get passed through without exception.)
> >
> > 7.  On output other codecs raise on a 7-bit str, unless the
> >     surrogateescape handler is in use.
> What do you mean by "on output"? Do you mean when encoding?
> This concerns me:
>     b'abcd'.decode('ascii').encode('latin-1')
>     => returns b'abcd'
>     b'abcd'.decode('ascii-compatible').encode('latin-1')
>     => raises
> And yet, the two 'abcd' strings you get are visually indistinguishable,
> and only differ by a hidden, internal flag.
> I've probably misunderstood something about your proposal, so please
> explain where I've gone wrong. Please give examples!

I haven't been following the discussion in detail ( and the
Py3 discussions have most of my attention this week), but I'm definitely
not clear on how this 7-bit proposal differs meaningfully from just using
ascii with the surrogateescape error handler.


> --
> Steven

From abarnert at  Tue Jan  7 18:46:15 2014
From: abarnert at (Andrew Barnert)
Date: Tue, 7 Jan 2014 09:46:15 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
	new bytestring type?]
In-Reply-To: <20140107154401.GK29356@ando>
References: <>
 <> <20140107154401.GK29356@ando>
Message-ID: <>

I think Stephen's name "7-bit" is confusing people. If you try to interpret the name sensibly, you get Steven's broken interpretation. But if you read it as a nonsense word and work through the logic, it all makes sense.

On Jan 7, 2014, at 7:44, Steven D'Aprano <steve at> wrote:

> On Tue, Jan 07, 2014 at 03:37:36AM +0900, Stephen J. Turnbull wrote:
>> So ... now that we have the flexible string representation (PEP 393),
>> let's add a 7-bit representation!  (Don't take that too seriously,
>> there are interesting more general variants I'm not going to talk
>> about tonight.)
>> The 7-bit representation satisfies the following requirements:
>> 1.  It is only produced on input by a new 'ascii-compatible' codec,
>>    which sets the "7-bit representation" flag in the str object on
>>    input if it encounters any non-ASCII bytes (if pure ASCII, it
>>    produces an 8-bit str object).  This will be slower than just
>>    reading in the bytes in many cases, but I hope not unacceptably so.
> I'm confused by your suggestion here. It seems to me that you've got the 
> conditions backwards. (Or I don't understand them.) Perhaps a couple of 
> examples will make it clear.
> Suppose we take a pure-ASCII byte-string and decode it:
>    b'abcd'.decode('ascii-compatible')
> According to the above, this will produce a regular str object, 'abcd', 
> using the regular 8-bit internal representation, and the "7-bit repr" 
> flag cleared. Correct? (So the flag is *cleared* when all the chars in 
> the string are 7-bit, and *set* when at least one is not. Yes?)

Correct. The floobl representation is not used because there are no non-ASCII bytes.

> Suppose we take a byte-string with a non-ASCII byte:
>    b'abc\xFF'.decode('ascii-compatible')
> This will return... what? I think it returns a so-called 7-bit 
> representation, but I'm not sure what it is a representation of.

The representation is the bytes 61 62 63 FF with the floobl flag set. It's a representation of an 'a' char, a 'b' char, a 'c' char, and a smuggled FF byte--identical to 'abc\uDCFF'.

(This last bit is the part I'm a bit wary of, as it promotes surrogate-escape to being an inherent part of the meaning of Unicode strings in Python. But maybe Stephen has an answer for that. And anyway, it's a much smaller problem than the one you think is there.)

> I 
> presume the internals will actually contain the four bytes
>    61 62 63 FF
> and the "7-bit repr" flag will be set. Is that flag the only difference 
> between these two strings?
>    b'abc\xFF'.decode('ascii-compatible')
>    'abc\xFF'

The floobl flag is the only difference between the two internal representations, but there's a big difference in the meaning.

> Presumably they will compare equal, yes?

I would hope not. One of them has the Unicode character U+FF, the other has smuggled byte 0xFF, so they'd better not compare equal.

However, the latter should compare equal to 'abc\uDCFF'. That's the entire key here: the new representation is nothing but a more compact way to represent strings that contain nothing but ASCII and surrogate escapes.
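
Since the 'ascii-compatible' codec is hypothetical, the intended
comparison semantics can only be sketched with what exists today,
namely 'ascii' plus the 'surrogateescape' handler.  Under that
stand-in (an assumption, not the proposed implementation):

```python
# Stand-in for b'abc\xFF'.decode('ascii-compatible').
smuggled = b'abc\xFF'.decode('ascii', errors='surrogateescape')

# A real Unicode string ending in U+00FF.
latin = 'abc\xFF'

# The smuggled byte is a lone surrogate, not the character U+00FF.
same_as_surrogate = (smuggled == 'abc\uDCFF')
same_as_latin = (smuggled == latin)

# Pure ASCII input gives an ordinary str either way.
plain = b'abcd'.decode('ascii', errors='surrogateescape')
```

The proposal would keep these results and change only how the first
string is stored.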

>> 2.  When sliced, the result needs to be checked for non-ASCII bytes.
>>    If none, the result is promoted to 8-bit.
>> 3.  When combined with a str in 8-bit representation:
>>    a.  If the 8-bit str contains any Latin-1 or C1 characters, both
>>        strs are promoted to 16-bit, and non-ASCII characters in the
>>        7-bit string are converted by the surrogateescape handler.
>>    b.  Otherwise they're combined into a 7-bit str.
> A concrete example:
>    s = b'abcd'.decode('ascii-compatible')
>    t = 'x'  # ASCII-compatible
>    s + t
>    => returns 'abcdx', with the "7-bit repr" flag cleared.

Right. Here both s and t are normal 8-bit string reprs in the first place, so the new logic doesn't even get invoked. So yes, that's what it returns.

>    s = b'abcd'.decode('ascii-compatible')
>    t = 'ÿ'  # U+00FF, non-ASCII.
>    s + t
>    => returns 'abcd\uDCFF', with the "7-bit repr" flag set

No, you've missed two key bits here. 

First, you're again adding two regular 8-bit-repr strings, not a non-ASCII-smuggling string plus an 8-bit, so the new logic doesn't get invoked at all.

Plus, even if s were a 7-bit-flagged string like b'ab\xfe'.decode('ascii-compatible'), that wouldn't turn t into \uDCFF. Only bytes in the floobl-flagged string are surrogate-escaped; characters in the normal string are handled normally. So you'd have 'ab\uDCFE\xFF'.

Also, both strings are promoted to 16-bit, and the floobl flag is never set with 16-bit or 32-bit representations.

> The \uDCFF at the end is the ÿ encoded with the surrogateescape error 
> handler.
> There's a problem with this: two strings, visually indistinguishable, 
> but differing only in the internal representation, give completely 
> different results:
>    b'abcd'.decode('ascii') + 'ÿ'
>    => 'abcd\u00FF'
>    b'abcd'.decode('ascii-compatible') + 'ÿ'
>    => 'abcd\uDCFF'

Nope, again, these both give the first result.

>> 4.  When combined with a str in 16-bit or 32-bit representation, the
>>    7-bit string is "decoded" to the same representation, as if using
>>    the 'ascii' codec with the 'surrogateescape' handler.
> Another example:
>    s = b'abcd'.decode('ascii-compatible')
>    assert s == 'abcd'
>    s + 'π'
>    => returns what?

'abcdπ'. Since the first one is a plain 8-bit string, and the second a plain 16-bit string, the new logic never even gets involved.

And again, if you change this so s is b'abc\xFE'.decode('ascii-compatible'), then you're adding a floobl string and a 16-bit string, so the FE byte gets encoded as DCFE, while the pi character is left unchanged, so you get 'abc\uDCFEπ'.

> Your description confuses me. The "7-bit string" is already text, how do 
> you decode it to the 16-bit internal representation? 

By decoding its representation as if it were bytes, using surrogate-escape.

>> 5.  String methods that would raise or produce undefined results if
>>    used on str containing surrogate-encoded bytes need to be taught
>>    to do the same on non-ASCII bytes in 7-bit str objects.
> Do you have an example of such string methods?
>> 6.  On output the 'ascii-compatible' codec simply memcpy's 7-bit str
>>    and pure ASCII 8-bit str, and raises on anything else.  (Sorry,
>>    no, ISO 8859-1 does *not* get passed through without exception.)
>> 7.  On output other codecs raise on a 7-bit str, unless the
>>    surrogateescape handler is in use.
> What do you mean by "on output"? Do you mean when encoding?

Presumably "output" means something like writing to a TextIOWrapper whose encoding is ascii-compatible. In which case you're right, it would be clearer to just say "when encoding".

However, I think there's a mistake in the design of 6 here. Surely encoding 'abc\uDCFF' should give you the bytes 61 62 63 FF, not an exception, right? (Unless the idea is that such a string is guaranteed to have a floobl-flagged 8-bit representation, not a 16-bit one, no matter how you try to create it in Python or in C, and I don't think the other rules make that guarantee.)
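
For comparison, this is what the existing 'ascii' codec does with such a string today. It is only a sketch of the expected bytes; the proposed 'ascii-compatible' codec is not available, and the variable names are illustrative.

```python
text = 'abc\udcff'  # 'abc' followed by a surrogate-escaped FF byte

# With surrogateescape, the lone surrogate turns back into the raw byte.
round_trip = text.encode('ascii', errors='surrogateescape')

# Without the handler, the same string cannot be encoded at all.
try:
    text.encode('ascii')
    strict_raised = False
except UnicodeEncodeError:
    strict_raised = True
```

Here `round_trip.hex()` is '616263ff', i.e. the bytes 61 62 63 FF.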

> This concerns me:
>    b'abcd'.decode('ascii').encode('latin-1')
>    => returns b'abcd'
>    b'abcd'.decode('ascii-compatible').encode('latin-1')
>    => raises

Nope. The decoding returns the string 'abcd', in normal 8-bit representation, in both cases. There are no non-ASCII bytes, so the floobl flag isn't set. So you get the same result either way.

> And yet, the two 'abcd' strings you get are visually indistinguishable, 
> and only differ by a hidden, internal flag.
> I've probably misunderstood something about your proposal, so please 
> explain where I've gone wrong. Please give examples!
> -- 
> Steven

From ethan at  Tue Jan  7 17:48:05 2014
From: ethan at (Ethan Furman)
Date: Tue, 07 Jan 2014 08:48:05 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

On 01/07/2014 05:00 AM, Stephen J. Turnbull wrote:
> Ethan Furman writes:
>> I just read your proposal again, and must admit I don't understand
>> how it would help me, but I look forward to testing an
>> implementation!
>> One wrinkle, though -- the data is binary, and if read would have
>> to be read using the latin1 codec...
> If you mean to process the binary, it would depend on what you want to
> do whether it would help or not.  struct- and ctypes-style processing,
> no, it won't help because you need to convert to bytes to use those.
> (It might make sense to read the headers into a buffer this way, parse
> them as ASCII-compatible text, and then read the rest as bytes.)  Pure
> byte code, doesn't help, although it probably doesn't hurt.

Sounds like it doesn't help me then.  My binary stream is mixed:

   - binary that has to be converted (4-byte ints, for example)
   - ascii that has to be converted (ints stored as ascii text)
   - encoded text (character and memo fields)

and the precise location of each varies from file to file.


From solipsis at  Tue Jan  7 18:57:33 2014
From: solipsis at (Antoine Pitrou)
Date: Tue, 7 Jan 2014 18:57:33 +0100
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
References: <>
Message-ID: <20140107185733.7ad1a3be@fsol>

On Tue, 07 Jan 2014 08:48:05 -0800
Ethan Furman <ethan at> wrote:
>    - ascii that has to be converted (ints stored as ascii text)
>    - encoded text (character and memo fields)

What is the difference supposed to be between those two?



From abarnert at  Tue Jan  7 19:11:07 2014
From: abarnert at (Andrew Barnert)
Date: Tue, 7 Jan 2014 10:11:07 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
	new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

I think there are three problems with your proposal--all of which I mentioned in the long reply to Steven, but I suspect many people tl;dr'd over that, and I like your proposal enough that I want to make sure either I'm wrong, or you fix them. So:

On Jan 6, 2014, at 10:37, "Stephen J. Turnbull" <stephen at> wrote:

> So ... now that we have the flexible string representation (PEP 393),
> let's add a 7-bit representation!

The name has confused both Steven and Nick into misinterpreting the idea, and it confused me until I read over the details twice and it finally clicked, and it still doesn't make sense after I understand what you mean.

This is an 8-bit representation where non-ASCII bytes are used to smuggle non-ASCII bytes. Just like the existing 16-bit representation where surrogate escapes are used to smuggle non-ASCII bytes. It's not a 7-bit representation unless there's nothing but ASCII in it--and it's never used in the case where there's nothing but ASCII. I'm not sure what the right word is, but this isn't it.

> 1.  It is only produced on input by a new 'ascii-compatible' codec,

This name might also be confusing people.
> 3.  When combined with a str in 8-bit representation:
>    a.  If the 8-bit str contains any Latin-1 or C1 characters, both
>        strs are promoted to 16-bit, and non-ASCII characters in the
>        7-bit string are converted by the surrogateescape handler.

This part worries me a bit. The bytes 61 62 63 FF in this new representation actually _mean_ 'abc' followed by a smuggled FF byte. But the words 0061 0062 0063 DCFF in a 16-bit representation just mean 'abc\uDCFF', which _can be interpreted_, via the surrogate-escape mechanism, as 'abc' and a smuggled byte, but don't actually _mean_ that. It seems like your proposal only works if we change it so that they really _do_ mean that.

> 6.  On output the 'ascii-compatible' codec simply memcpy's 7-bit str
>    and pure ASCII 8-bit str, and raises on anything else.

So if a 7-bit string gets converted to a surrogate-escaped 16-bit string, it can never be written out again? For a contrived example:

(b'abc\xff'.decode('ascii-compatible') + '\u1234')[:4].encode('ascii-compatible')

I'd expect to get back my b'abc\xff'. But your rules give me an exception.

Maybe you were expecting this to be taken care of in the slicing, but rule 1 makes that impossible; you can never get a 7-bit string by doing anything but decoding ascii-compatible (or combining two 7-bit strings).

I think ascii-compatible has to accept non-8-bit-repr strings (by encoding ASCII as ASCII and surrogate escapes as bytes and everything else is an exception). This is necessary because 61 62 63 FF (7-bit) and 0061 0062 0063 DCFF (16-bit) are the same string anyway. But it's especially necessary because the former can be silently converted into the latter (and there's no way to even test whether that's happened).

Of course that means biting the bullet and saying that \uDCFF in Python really means a smuggled FF byte, rather than just being a way to smuggle an FF byte through Unicode if you want to do so explicitly. But as I said above, I think you've already bitten that bullet.
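
A rough sketch of the encoding rule suggested above (ASCII as ASCII, surrogate escapes as bytes, everything else an exception). The helper name `encode_ascii_compatible` is hypothetical, and it simply delegates to the existing 'ascii' codec with surrogateescape, which already behaves this way; it says nothing about the proposed internal representation.

```python
def encode_ascii_compatible(text):
    """Hypothetical stand-in for encoding with the proposed codec."""
    return text.encode('ascii', errors='surrogateescape')


smuggled = b'abc\xff'.decode('ascii', errors='surrogateescape')
widened = smuggled + '\u1234'  # concatenation yields a wider string
sliced = widened[:4]           # back to 'abc' plus the escaped byte

# The contrived example round-trips once surrogate escapes are accepted.
assert encode_ascii_compatible(sliced) == b'abc\xff'

# A character that is neither ASCII nor a surrogate escape still raises.
try:
    encode_ascii_compatible(widened)
    raised = False
except UnicodeEncodeError:
    raised = True
```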

From stephen at  Tue Jan  7 19:33:19 2014
From: stephen at (Stephen J. Turnbull)
Date: Wed, 08 Jan 2014 03:33:19 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

Nick Coghlan writes:

 > I haven't been following the discussion in detail ( and
 > the Py3 discussions have most of my attention this week), but I'm
 > definitely not clear on how this 7-bit proposal differs meaningfully
 > from just using ascii with the surrogateescape error handler.
 > Cheers, Nick.

It doesn't differ meaningfully to me.  I doubt I'll be writing any
programs in the near future that aren't just as well and efficiently
done by decoding as ascii with surrogateescape.

It does give you an 8-bit representation, with the benefits that gives
you (very fast encode and fast decode), whereas the ascii +
surrogateescape approach gives you a 16-bit representation sometimes.
Some people seem to care about that, eg, it seems to fit the chunked
HTTP use-case perfectly.

It gives you an 8-bit almost-bytes type without the b prefix on
literals.  I don't know if that would actually be useful to anybody.

Finally (and again, I haven't thought this through) you have a halfway
house that can in principle be mixed more or less freely with either
bytes (and bytearray and memoryview) or Unicode, but not with both.
(There is intentionally no way to get back to "ascii-compatible"
representation from one of the other str representations, and in the
same way combining with one of the bytes types would give a bytes
type.)  I realize this probably doesn't work without modification
because as designed it *is* str and the type system wouldn't be able
to distinguish between the ascii-compatible representation and a str
in another representation.  So maybe this would bring us back to the
idea of a new bytestring type.

I'll get back to Steven's post later, but it and others seem to be
stuck in the greylist.  (Hate spam, hate spam, hate what spam does to

From ethan at  Tue Jan  7 19:10:19 2014
From: ethan at (Ethan Furman)
Date: Tue, 07 Jan 2014 10:10:19 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <20140107185733.7ad1a3be@fsol>
References: <>
 <> <>
 <> <>
 <> <>
Message-ID: <>

On 01/07/2014 09:57 AM, Antoine Pitrou wrote:
> On Tue, 07 Jan 2014 08:48:05 -0800
> Ethan Furman <ethan at> wrote:
>>     - ascii that has to be converted (ints stored as ascii text)
>>     - encoded text (character and memo fields)
> What is the difference supposed to be between those two?

The method used for conversion and the return type:

   - ascii-encoded text:  b'123' --> int(123)
   - encoded text (ascii or russian or asian or ...):  b'abc' --> u'abc'

and for completeness:

   - binary integer:  b'\x00\x01' --> int(1)
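
The three conversions can be sketched in Python 3 roughly as follows. The cp866 encoding and the big-endian byte order are assumptions chosen for the example; in a real file they would depend on the format.

```python
import struct

# ASCII digits stored as text: int() accepts bytes directly.
ascii_number = int(b'123')

# Encoded text: decode with the field's encoding (cp866 assumed here).
encoded = 'привет'.encode('cp866')
text = encoded.decode('cp866')

# Binary integer: interpret the raw bytes (big-endian assumed here).
binary_int = int.from_bytes(b'\x00\x01', byteorder='big')
(unpacked,) = struct.unpack('>H', b'\x00\x01')
```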


From solipsis at  Tue Jan  7 19:47:52 2014
From: solipsis at (Antoine Pitrou)
Date: Tue, 7 Jan 2014 19:47:52 +0100
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
References: <>
 <> <20140107185733.7ad1a3be@fsol>
Message-ID: <20140107194752.304604a1@fsol>

On Tue, 07 Jan 2014 10:10:19 -0800
Ethan Furman <ethan at> wrote:
> On 01/07/2014 09:57 AM, Antoine Pitrou wrote:
> > On Tue, 07 Jan 2014 08:48:05 -0800
> > Ethan Furman <ethan at> wrote:
> >>     - ascii that has to be converted (ints stored as ascii text)
> >>     - encoded text (character and memo fields)
> >
> > What is the difference supposed to be between those two?
> The method used for conversion and the return type:
>    - ascii-encoded text:  b'123' --> int(123)
>    - encoded text (ascii or russian or asian or ...):  b'abc' --> u'abc'

I'm sorry, I still don't parse this. What is it in Python 3.3 that
prevents you from doing this?



From ethan at  Tue Jan  7 19:38:40 2014
From: ethan at (Ethan Furman)
Date: Tue, 07 Jan 2014 10:38:40 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
Message-ID: <>

On 01/07/2014 10:22 AM, MRAB wrote:
> On 2014-01-07 17:46, Andrew Barnert wrote:
>> On Jan 7, 2014, at 7:44, Steven D'Aprano <steve at> wrote:
> I was thinking about Ethan's suggestion of introducing a new bytestring
> class and a lot of these suggestions are what I thought the bytestring
> class could do.

>>> Suppose we take a pure-ASCII byte-string and decode it:
>>>    b'abcd'.decode('ascii-compatible')
> That would be:
>      bytestring(b'abcd')
> or even:
>      bytestring('abcd')
> [snip]
>>> Suppose we take a byte-string with a non-ASCII byte:
>>>    b'abc\xFF'.decode('ascii-compatible')
> That would be:
>      bytestring(b'abc\xFF')
> Bytes outside the ASCII range would be mapped to Unicode low
> surrogates:
>      bytestring(b'abc\xFF') == bytestring('abc\uDCFF')

Not sure what you mean here.  The resulting bytes should be 'abc\xFF' and of length 4.


From python at  Tue Jan  7 20:32:26 2014
From: python at (MRAB)
Date: Tue, 07 Jan 2014 19:32:26 +0000
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
Message-ID: <>

On 2014-01-07 18:38, Ethan Furman wrote:
> On 01/07/2014 10:22 AM, MRAB wrote:
>> On 2014-01-07 17:46, Andrew Barnert wrote:
>>> On Jan 7, 2014, at 7:44, Steven D'Aprano <steve at> wrote:
>> I was thinking about Ethan's suggestion of introducing a new bytestring
>> class and a lot of these suggestions are what I thought the bytestring
>> class could do.
>>>> Suppose we take a pure-ASCII byte-string and decode it:
>>>>    b'abcd'.decode('ascii-compatible')
>> That would be:
>>      bytestring(b'abcd')
>> or even:
>>      bytestring('abcd')
>> [snip]
>>>> Suppose we take a byte-string with a non-ASCII byte:
>>>>    b'abc\xFF'.decode('ascii-compatible')
>> That would be:
>>      bytestring(b'abc\xFF')
>> Bytes outside the ASCII range would be mapped to Unicode low
>> surrogates:
>>      bytestring(b'abc\xFF') == bytestring('abc\uDCFF')
> Not sure what you mean here.  The resulting bytes should be 'abc\xFF' and of length 4.
'abc\xFF' is a Unicode string, but you wouldn't be able to convert it
to a bytestring because '\xFF' is a codepoint outside the ASCII range
and not a low surrogate.

From ethan at  Tue Jan  7 19:57:04 2014
From: ethan at (Ethan Furman)
Date: Tue, 07 Jan 2014 10:57:04 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <20140107194752.304604a1@fsol>
References: <>
 <> <>
 <> <>
 <> <>
 <20140107185733.7ad1a3be@fsol> <>
Message-ID: <>

On 01/07/2014 10:47 AM, Antoine Pitrou wrote:
> On Tue, 07 Jan 2014 10:10:19 -0800
> Ethan Furman <ethan at> wrote:
>> On 01/07/2014 09:57 AM, Antoine Pitrou wrote:
>>> On Tue, 07 Jan 2014 08:48:05 -0800
>>> Ethan Furman <ethan at> wrote:
>>>>      - ascii that has to be converted (ints stored as ascii text)
>>>>      - encoded text (character and memo fields)
>>> What is the difference supposed to be between those two?
>> The method used for conversion and the return type:
>>     - ascii-encoded text:  b'123' --> int(123)
>>     - encoded text (ascii or russian or asian or ...):  b'abc' --> u'abc'
> I'm sorry, I still don't parse this. What is it in Python 3.3 that
> prevents you from doing this?

Nothing at all, and that part works fine.

The trouble (for me) comes in when I try to use single bytes, either when creating or extracting.  The above examples 
were to show that Stephen J Turnbull's idea wouldn't work for me.


From solipsis at  Tue Jan  7 20:59:36 2014
From: solipsis at (Antoine Pitrou)
Date: Tue, 7 Jan 2014 20:59:36 +0100
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
References: <>
 <> <20140107185733.7ad1a3be@fsol>
 <> <20140107194752.304604a1@fsol>
Message-ID: <20140107205936.7706c393@fsol>

On Tue, 07 Jan 2014 10:57:04 -0800
Ethan Furman <ethan at> wrote:
> Nothing at all, and that part works fine.
> The trouble (for me) comes in when I try to use single bytes,
> either when creating or extracting.

Hmm... aren't you exaggerating the trouble? It's not very difficult to
work with single bytes in Python 3...



From ethan at  Tue Jan  7 21:07:15 2014
From: ethan at (Ethan Furman)
Date: Tue, 07 Jan 2014 12:07:15 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <20140107205936.7706c393@fsol>
References: <>
 <> <>
 <> <>
 <> <>
 <20140107185733.7ad1a3be@fsol> <>
 <20140107194752.304604a1@fsol> <>
Message-ID: <>

On 01/07/2014 11:59 AM, Antoine Pitrou wrote:
> On Tue, 07 Jan 2014 10:57:04 -0800
> Ethan Furman <ethan at> wrote:
>> Nothing at all, and that part works fine.
>> The trouble (for me) comes in when I try to use single bytes,
>> either when creating or extracting.
> Hmm... aren't you exaggerating the trouble? It's not very difficult to
> work with single bytes in Python 3...

No, I'm not.  I don't think of b'C' as the integer 67 any more than I think of the number 256 as the bytes b'\x01\x00'.
I don't think of a series of bytes as a container any more than I think of a series of characters as a container.


From solipsis at  Tue Jan  7 21:08:24 2014
From: solipsis at (Antoine Pitrou)
Date: Tue, 7 Jan 2014 21:08:24 +0100
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
References: <>
 <> <20140107185733.7ad1a3be@fsol>
 <> <20140107194752.304604a1@fsol>
 <> <20140107205936.7706c393@fsol>
Message-ID: <20140107210824.1a60792d@fsol>

On Tue, 07 Jan 2014 12:07:15 -0800
Ethan Furman <ethan at> wrote:
> On 01/07/2014 11:59 AM, Antoine Pitrou wrote:
> > On Tue, 07 Jan 2014 10:57:04 -0800
> > Ethan Furman <ethan at> wrote:
> >>
> >> Nothing at all, and that part works fine.
> >>
> >> The trouble (for me) comes in when I try to use single bytes,
> >> either when creating or extracting.
> >
> > Hmm... aren't you exaggerating the trouble? It's not very difficult to
> > work with single bytes in Python 3...
> No, I'm not.  I don't think of b'C' as the integer 67 any more than I
> think of the number 256 as the bytes b'\x01\x00'.

Ethan, can you please show a practical issue you're having?

From ethan at  Tue Jan  7 20:43:49 2014
From: ethan at (Ethan Furman)
Date: Tue, 07 Jan 2014 11:43:49 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
Message-ID: <>

On 01/07/2014 11:32 AM, MRAB wrote:
> On 2014-01-07 18:38, Ethan Furman wrote:
>> On 01/07/2014 10:22 AM, MRAB wrote:
>>>> On Jan 7, 2014, at 7:44, Steven D'Aprano <steve at> wrote:
>>>>> Suppose we take a byte-string with a non-ASCII byte:
>>>>>    b'abc\xFF'.decode('ascii-compatible')
>>> That would be:
>>>      bytestring(b'abc\xFF')
>>> Bytes outside the ASCII range would be mapped to Unicode low
>>> surrogates:
>>>      bytestring(b'abc\xFF') == bytestring('abc\uDCFF')
>> Not sure what you mean here.  The resulting bytes should be 'abc\xFF' and of length 4.
> 'abc\xFF' is a Unicode string, but you wouldn't be able to convert it
> to a bytestring because '\xFF' is a codepoint outside the ASCII range
> and not a low surrogate.

I can see terminology is going to be a pain in this thread.  ;)

My vision for a bytestring type (more refined):

   - made up of single bytes in the range 0 - 255 (no unicode anywhere)

   - indexing returns a bytestring of length 1, not an integer (as bytes does)

   - `bytestring(7)` either fails, or returns 'bytestring('\x07')' not 'bytestring(0, 0, 0, 0, 0, 0, 0)'

So my statement above of 'abc\xFF' should not be interpreted as a unicode string... I guess I'll use 'y' as an 
abbreviation for now: y'abc\xFF'.


From guido at  Tue Jan  7 21:49:47 2014
From: guido at (Guido van Rossum)
Date: Tue, 7 Jan 2014 10:49:47 -1000
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
Message-ID: <>

On Tue, Jan 7, 2014 at 9:43 AM, Ethan Furman <ethan at> wrote:
> My vision for a bytestring type (more refined):
>   - made up of single bytes in the range 0 - 255 (no unicode anywhere)
>   - indexing returns a bytestring of length 1, not an integer (as bytes
> does)
>   - `bytestring(7)` either fails, or returns 'bytestring('\x07')' not
> 'bytestring(0, 0, 0, 0, 0, 0, 0)'

It sounds like you are just unhappy with some of the behavior of the
bytes object. I agree that these two behaviors are suboptimal, but it
is just too late to change them, and it's not enough to add a new type
-- not by a long shot. The constructor behavior can be changed using a
custom factory function. The indexing behavior, unfortunately, needs
to be dealt with by changing b[i] into b[i:i+1] everywhere.
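
A minimal sketch of both workarounds. The factory name `byte` is hypothetical; everything else is standard `bytes` behavior.

```python
def byte(value):
    """Hypothetical factory: a length-1 bytes object from an int in 0..255."""
    return bytes([value])


data = b'abcdef'

# The constructor treats an int as a count of zero bytes...
assert bytes(3) == b'\x00\x00\x00'
# ...so a factory function is used to get a single byte instead.
assert byte(7) == b'\x07'

# Indexing yields an int; a one-element slice yields bytes.
assert data[3] == 100
assert data[3:4] == b'd'
```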

--Guido van Rossum (

From python at  Tue Jan  7 21:58:12 2014
From: python at (MRAB)
Date: Tue, 07 Jan 2014 20:58:12 +0000
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
 <> <>
Message-ID: <>

On 2014-01-07 19:43, Ethan Furman wrote:
> On 01/07/2014 11:32 AM, MRAB wrote:
>> On 2014-01-07 18:38, Ethan Furman wrote:
>>> On 01/07/2014 10:22 AM, MRAB wrote:
>>>>> On Jan 7, 2014, at 7:44, Steven D'Aprano <steve at> wrote:
>>>>>> Suppose we take a byte-string with a non-ASCII byte:
>>>>>>    b'abc\xFF'.decode('ascii-compatible')
>>>> That would be:
>>>>      bytestring(b'abc\xFF')
>>>> Bytes outside the ASCII range would be mapped to Unicode low
>>>> surrogates:
>>>>      bytestring(b'abc\xFF') == bytestring('abc\uDCFF')
>>> Not sure what you mean here.  The resulting bytes should be 'abc\xFF' and of length 4.
>> 'abc\xFF' is a Unicode string, but you wouldn't be able to convert it
>> to a bytestring because '\xFF' is a codepoint outside the ASCII range
>> and not a low surrogate.
> I can see terminology is going to be a pain in this thread.  ;)
> My vision for a bytestring type (more refined):
>     - made up of single bytes in the range 0 - 255 (no unicode anywhere)
>     - indexing returns a bytestring of length 1, not an integer (as bytes does)
>     - `bytestring(7)` either fails, or returns 'bytestring('\x07')' not 'bytestring(0, 0, 0, 0, 0, 0, 0)'
> So my statement above of 'abc\xFF' should not be interpreted as a unicode string... I guess I'll use 'y' as an
> abbreviation for now: y'abc\xFF'.
No disagreement there.

The point about Unicode is about how it could behave if mixed with
Unicode strings.

From ethan at  Tue Jan  7 21:49:11 2014
From: ethan at (Ethan Furman)
Date: Tue, 07 Jan 2014 12:49:11 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <20140107210824.1a60792d@fsol>
References: <>
 <> <>
 <> <>
 <> <>
 <20140107185733.7ad1a3be@fsol> <>
 <20140107194752.304604a1@fsol> <>
 <20140107205936.7706c393@fsol> <>
Message-ID: <>

On 01/07/2014 12:08 PM, Antoine Pitrou wrote:
> On Tue, 07 Jan 2014 12:07:15 -0800
> Ethan Furman <ethan at> wrote:
>> On 01/07/2014 11:59 AM, Antoine Pitrou wrote:
>>> On Tue, 07 Jan 2014 10:57:04 -0800
>>> Ethan Furman <ethan at> wrote:
>>>> Nothing at all, and that part works fine.
>>>> The trouble (for me) comes in when I try to use single bytes,
>>>> either when creating or extracting.
>>> Hmm... aren't you exaggerating the trouble? It's not very difficult to
>>> work with single bytes in Python 3...
>> No, I'm not.  I don't think of b'C' as the integer 67 any more than I
>> think of the number 256 as the bytes b'\x01\x00'.
> Ethan, can you please show a practical issue you're having?

Seriously?  You've already agreed with me on my first two points at the beginning of this thread.  It's safe to assume I 
was having practical issues with those points.


From ethan at  Tue Jan  7 21:58:40 2014
From: ethan at (Ethan Furman)
Date: Tue, 07 Jan 2014 12:58:40 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
 <> <>
Message-ID: <>

On 01/07/2014 12:49 PM, Guido van Rossum wrote:
> On Tue, Jan 7, 2014 at 9:43 AM, Ethan Furman <ethan at> wrote:
>> My vision for a bytestring type (more refined):
>>    - made up of single bytes in the range 0 - 255 (no unicode anywhere)
>>    - indexing returns a bytestring of length 1, not an integer (as bytes
>> does)
>>    - `bytestring(7)` either fails, or returns 'bytestring('\x07')' not
>> 'bytestring(0, 0, 0, 0, 0, 0, 0)'
> It sounds like you are just unhappy with some of the behavior of the
> bytes object. I agree that these two behaviors are suboptimal, but it
> is just too late to change them, and it's not enough to add a new type
> -- not by a long shot. The constructor behavior can be changed using a
> custom factory function. The indexing behavior, unfortunately, needs
> to be dealt with by changing b[i] into b[i:i+1] everywhere.

Of course I'm unhappy with it, it doesn't behave the way I think it should, and it's not consistent.

The reason I started the thread was to hopefully gather others' requirements to have a truly distinct and useful new 
type.  Doesn't seem to have happened, though.  :(

Is it too late to change the repr for bytes?  I can't think of anywhere else in the stdlib where what you see is not 
what you get:

--> [0, 1, 2]
[0, 1, 2]

--> [0, 1, 2][1]
1

--> {'this':'that', 'these':'those'}
{'this': 'that', 'these': 'those'}

--> {'this':'that', 'these':'those'}['these']
'those'

--> 'abcdef'
'abcdef'

--> 'abcdef'[3]
'd'

But of course with bytes:

--> b'abcdef'
b'abcdef'

--> b'abcdef'[3]
100


From solipsis at  Tue Jan  7 22:48:12 2014
From: solipsis at (Antoine Pitrou)
Date: Tue, 7 Jan 2014 22:48:12 +0100
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
References: <>
 <> <20140107185733.7ad1a3be@fsol>
 <> <20140107194752.304604a1@fsol>
 <> <20140107205936.7706c393@fsol>
 <> <20140107210824.1a60792d@fsol>
Message-ID: <20140107224812.7cb45316@fsol>

On Tue, 07 Jan 2014 12:49:11 -0800
Ethan Furman <ethan at> wrote:
> On 01/07/2014 12:08 PM, Antoine Pitrou wrote:
> > On Tue, 07 Jan 2014 12:07:15 -0800
> > Ethan Furman <ethan at> wrote:
> >> On 01/07/2014 11:59 AM, Antoine Pitrou wrote:
> >>> On Tue, 07 Jan 2014 10:57:04 -0800
> >>> Ethan Furman <ethan at> wrote:
> >>>>
> >>>> Nothing at all, and that part works fine.
> >>>>
> >>>> The trouble (for me) comes in when I try to use single bytes,
> >>>> either when creating or extracting.
> >>>
> >>> Hmm... aren't you exaggerating the trouble? It's not very difficult to
> >>> work with single bytes in Python 3...
> >>
> >> No, I'm not.  I don't think of b'C' as the integer 67 any more than I
> >> think of the number 256 as the bytes b'\x01\x00'.
> >
> > Ethan, can you please show a practical issue you're having?
> Seriously?  You've already agreed with me on my first two points at the beginning of this thread.  It's safe to assume I 
> was having practical issues with those points.

Well, I agree with those points, but I still think they're minor, and
not very hard to work around. Hence my comment about "exaggerating the



From guido at  Tue Jan  7 22:52:41 2014
From: guido at (Guido van Rossum)
Date: Tue, 7 Jan 2014 11:52:41 -1000
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
Message-ID: <>

On Tue, Jan 7, 2014 at 10:58 AM, Ethan Furman <ethan at> wrote:
> On 01/07/2014 12:49 PM, Guido van Rossum wrote:
>> On Tue, Jan 7, 2014 at 9:43 AM, Ethan Furman <ethan at> wrote:
>>> My vision for a bytestring type (more refined):
>>>    - made up of single bytes in the range 0 - 255 (no unicode anywhere)
>>>    - indexing returns a bytestring of length 1, not an integer (as bytes
>>> does)
>>>    - `bytestring(7)` either fails, or returns 'bytestring('\x07')' not
>>> 'bytestring(0, 0, 0, 0, 0, 0, 0)'
>> It sounds like you are just unhappy with some of the behavior of the
>> bytes object. I agree that these two behaviors are suboptimal, but it
>> is just too late to change them, and it's not enough to add a new type
>> -- not by a long shot. The constructor behavior can be changed using a
>> custom factory function. The indexing behavior, unfortunately, needs
>> to be dealt with by changing b[i] into b[i:i+1] everywhere.

> Of course I'm unhappy with it, it doesn't behave the way I think it should,
> and it's not consistent.

Consistent with what? (Before you rush in an answer, remember that
there are almost always multiple sides to a consistency argument.)

> The reason I started the thread was to hopefully gather others requirements
> to have a truly distinct and useful new type.  Doesn't seem to have
> happened, though.  :(

So now is the time to man up and live with it. It's not going to change.

> Is it too late to change the repr for bytes?


> I can't think of anywhere else
> in the stdlib where what you see is not what you get:
> --> [0, 1, 2]
> [0, 1, 2]
> --> [0, 1, 2][1]
> 1
> --> {'this':'that', 'these':'those'}
> {'this': 'that', 'these': 'those'}
> --> {'this':'that', 'these':'those'}['these']
> 'those'
> --> 'abcdef'
> 'abcdef'
> --> 'abcdef'[3]
> 'd'
> But of course with bytes:
> --> b'abcdef'
> b'abcdef'
> --> b'abcdef'[3]
> 100

I don't see what's wrong with those. Both produce valid expressions
that, when entered, compare equal to the object whose repr() was
printed. What more would you *want*?
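
The round-trip property described here can be checked directly. This is just a small sketch using `eval()` on the repr of each value from the examples.

```python
# Each repr is a valid expression that evaluates to an equal object.
for obj in (b'abcdef', b'abcdef'[3], 'abcdef', 'abcdef'[3]):
    assert eval(repr(obj)) == obj

bytes_repr = repr(b'abcdef')    # the text shown for the whole object
item_repr = repr(b'abcdef'[3])  # the text shown for one indexed item
```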

--Guido van Rossum (

From python at  Tue Jan  7 23:36:31 2014
From: python at (Alexander Heger)
Date: Wed, 8 Jan 2014 09:36:31 +1100
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

>> Of course I'm unhappy with it, it doesn't behave the way I think it should,
>> and it's not consistent.
> Consistent with what? (Before you rush in an answer, remember that
> there are almost always multiple sides to a consistency argument.)

> I don't see what's wrong with those. Both produce valid expressions
> that, when entered, compare equal to the object whose repr() was
> printed. What more would you *want*?

I find that the definition of str is inconsistent indeed, because the
items in a string are strings again, not characters (or code points).
I don't think there are too many other examples in Python where the
same is true; indexing a list does not give a list but the item that
is at the point.

In [4]: type(b'abc')
Out[4]: builtins.bytes

In [5]: type(b'abc'[1])
Out[5]: builtins.int

In [6]: type('abc')
Out[6]: builtins.str

In [7]: type('abc'[1])
Out[7]: builtins.str

there is no byte type in Python, so the closest is int (there is a
byte type in numpy); if there was one, indexing a byte array could
return that, but I assume the use case would be quite limited.  But
that there is no "characters" but only strings of length one is a
confusing concept.  It is as if scalars were the same as arrays of
length one.  These are different concepts, however.  (Though,
admittedly, numpy will take arrays of length 1 as scalars at least in
some cases as a convenience - though I think it should not as it
prevents users from writing consistent code that will be easy to read
later.  The same is here the case for Python with strings.)

In [11]: [1,2,3] + [1]
Out[11]: [1, 2, 3, 1]

In [12]: [1,2,3] + [1][0]
TypeError: can only concatenate list (not "int") to list

In [13]: 'abc' + 'd'
Out[13]: 'abcd'

In [14]: 'abc' + 'd'[0]
Out[14]: 'abcd'

so, yes, the interface to strings and arrays is inconsistent.  At
least in this aspect.
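
As an illustration of the comparison above (a small Python 3 sketch added for clarity, not part of the original session), this is what indexing and slicing return for bytes, str and list:

```python
data = b'abc'
text = 'abc'
items = [1, 2, 3]

# Indexing bytes yields an int; slicing yields bytes of length one.
byte_item = data[1]
byte_slice = data[1:2]

# Indexing str yields another str of length one.
char_item = text[1]

# Indexing a list yields the element itself, not a list.
list_item = items[1]

for value in (byte_item, byte_slice, char_item, list_item):
    print(type(value).__name__, repr(value))
```

So a length-one slice is the way to get a "bytes of length 1" out of a bytes object, while plain indexing gives the integer value.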

From tjreedy at  Tue Jan  7 23:38:42 2014
From: tjreedy at (Terry Reedy)
Date: Tue, 07 Jan 2014 17:38:42 -0500
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
 <> <>
Message-ID: <lahvla$ev8$>

On 1/7/2014 2:43 PM, Ethan Furman wrote:

> My vision for a bytestring type (more refined):

>    - made up of single bytes in the range 0 - 255 (no unicode anywhere)
>    - indexing returns a bytestring of length 1, not an integer (as bytes
> does)
>    - `bytestring(7)` either fails, or returns 'bytestring('\x07')' not
> 'bytestring(0, 0, 0, 0, 0, 0, 0)'

To me, a major feature of Python is that it a) has more than one basic 
structure type (versus just strings or symbolic expressions) but b) is 
conservative in its multiplicity. It is not minimal, but it is 
minimalistic. It took over a decade for Guido to agree that Python 
should have separate built-in bool and set classes instead of just using 
ints as bools and tuples, lists, and dicts as sets, or using imported 
classes for either.

The above describes a minor variation on bytes and seems to me to be a 
classic case for subclassing, whether in Python for ease or C for speed, 
in an imported module. The result could be kept private or made public 
as you wish. Yes, the minor differences would be important to you, the 
author of the subclass, but that is always the motivation for subclassing.

One of the major advances in Python was to make it possible (in 2.2) to 
subclass the basic builtin structure classes. It seems to me that 
subclasses that work in multiple versions of Python, such as are already 
being used, are the appropriate solution to the specialized problems 
that people have with the Python string builtins.
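
A minimal sketch of what such a subclass might look like in Python 3. The name `bytestring` and the exact behaviour are assumptions taken from Ethan's list above, not an existing class:

```python
class bytestring(bytes):
    """Hypothetical bytes subclass: indexing returns a length-1 bytestring."""

    def __new__(cls, source=b''):
        # Assumption: bytestring(7) means the single byte 0x07,
        # whereas plain bytes(7) means seven zero bytes.
        if isinstance(source, int):
            source = bytes([source])
        return super().__new__(cls, source)

    def __getitem__(self, index):
        result = super().__getitem__(index)
        if isinstance(index, slice):
            return type(self)(result)       # slices are already bytes
        return type(self)(bytes([result]))  # wrap the int in one byte


bs = bytestring(b'ABC')
print(bs[2], bs[0:2], bytestring(7))
```

Iteration is left untouched here, so `for x in bs` would still yield ints; a fuller version would have to decide what to do about `__iter__` as well.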

Terry Jan Reedy

From guido at  Wed Jan  8 00:06:39 2014
From: guido at (Guido van Rossum)
Date: Tue, 7 Jan 2014 13:06:39 -1000
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
Message-ID: <>

You're off-topic for this sub-thread. Ethan said he wanted to change
the repr() of bytes, but didn't specify what change he wanted. The
inconsistency in the *interface* is not under discussion any more
(I've already said I agree it is unfortunate, but not bad enough to
warrant a new type or a backward incompatible change).

On Tue, Jan 7, 2014 at 12:36 PM, Alexander Heger <python at> wrote:
>>> Of course I'm unhappy with it, it doesn't behave the way I think it should,
>>> and it's not consistent.
>> Consistent with what? (Before you rush in an answer, remember that
>> there are almost always multiple sides to a consistency argument.)
>> I don't see what's wrong with those. Both produce valid expressions
>> that, when entered, compare equal to the object whose repr() was
>> printed. What more would you *want*?
> I find that the definition str is inconsistent indeed, because the
> items in a string are strings again, not characters (or code points).
> I don't think there is too many other examples in Python where the
> same is true; indexing a list does not give a list but the item that
> is at the point.
> In [4]: type(b'abc')
> Out[4]: builtins.bytes
> In [5]: type(b'abc'[1])
> Out[5]: builtins.int
> In [6]: type('abc')
> Out[6]: builtins.str
> In [7]: type('abc'[1])
> Out[7]: builtins.str
> there is no byte type in Python, so the closest is int (there is a
> byte type in numpy); if there was one, indexing a byte array could
> return that, but I assume the use case would be quite limited.  But
> that there is no "characters" but only strings of length one is a
> confusing concept.  It is as of scalars were the same as arrays of
> length one.  These are different concepts, however.  (Though,
> admittedly, numpy will take arrays of length 1 as scalars at least in
> some cases as a convenience - though I think it should not as it
> prevent users from writing consistent code that will be easy to read
> later.  The same is here the case for Python with strings.)
> In [11]: [1,2,3] + [1]
> Out[11]: [1, 2, 3, 1]
> In [12]: [1,2,3] + [1][0]
> TypeError: can only concatenate list (not "int") to list
> In [13]: 'abc' + 'd'
> Out[13]: 'abcd'
> In [14]: 'abc' + 'd'[0]
> Out[14]: 'abcd'
> so, yes, the interface to strings and arrays is inconsistent.  At
> least in this aspect.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

--Guido van Rossum (

From dreamingforward at  Wed Jan  8 00:20:45 2014
From: dreamingforward at (Mark Janssen)
Date: Tue, 7 Jan 2014 17:20:45 -0600
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107185733.7ad1a3be@fsol>
 <> <20140107194752.304604a1@fsol>
 <> <20140107205936.7706c393@fsol>
Message-ID: <>

>>> The trouble (for me) comes in when I try to use single bytes,
>>> either when creating or extracting.
>> Hmm... aren't you exaggerating the trouble? It's not very difficult to
>> work with single bytes in Python 3...
> No, I'm not.  I don't think of b'C' as the integer 67 any more than I think
> of the number 256 as the bytes b'\x01\xFF'.

There's something fundamentally wrong with these brainfarts coming out
on the list.  Just how, Ethan, did you think you could represent
binary data in a text string, whether preceded by the char 'b' or not?
 What did you think you would do when you got to character 0, the
first (pseudo)-symbol in ASCII?

Why don't you jackasses start listening instead of wanking each other
with bullshit?


From dreamingforward at  Wed Jan  8 00:49:32 2014
From: dreamingforward at (Mark Janssen)
Date: Tue, 7 Jan 2014 17:49:32 -0600
Subject: [Python-ideas] The fools shall start sucking the cock.
Message-ID: <>

Okay,  how's everyone doing with their Python 2 vs.3,  bytes/unicode
vs. shit-extruder expertise?

Anyone need some relief, perhaps some guidance?

*kicks feet up to table*

From brett at  Wed Jan  8 01:07:19 2014
From: brett at (Brett Cannon)
Date: Tue, 7 Jan 2014 19:07:19 -0500
Subject: [Python-ideas] The fools shall start sucking the cock.
In-Reply-To: <>
References: <>
Message-ID: <>

That language is not called for (what the heck is the subject line even
supposed to mean?). While I'm not saying you can use a swear word here or
there to punctuate a statement, being this over-the-top is not considerate
of others.

On Tue, Jan 7, 2014 at 6:49 PM, Mark Janssen <dreamingforward at> wrote:

> Okay,  how's everyone doing with their Python 2 vs.3,  bytes/unicode
> vs. shit-extruder expertise?
> Anyone need some relief, perhaps some guidance?
> markj
> *kicks feet up to table*

From steve at  Wed Jan  8 01:39:11 2014
From: steve at (Steven D'Aprano)
Date: Wed, 8 Jan 2014 11:39:11 +1100
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
	new bytestring type?]
In-Reply-To: <>
References: <>
 <> <>
 <> <>
 <> <>
Message-ID: <20140108003910.GL29356@ando>

On Tue, Jan 07, 2014 at 08:48:05AM -0800, Ethan Furman wrote:

> [...] My binary stream is mixed:
>   - binary that has to be converted (4-byte ints, for example)
>   - ascii that has to be converted (ints stored as ascii text)
>   - encoded text (character and memo fields)

Ethan, you keep referring to ascii text and encoded text as if they are 
different things. They're not. You have a binary file containing bytes. 
Some of those bytes represent data of one kind (say, 4-byte ints). Some 
of those bytes represent data of a different kind (Latin-1 encoded text 
representing character and memo fields) and other bytes represent data 
of a third kind (ASCII encoded text representing ints, but you don't 
mention what the meaning of those ints is).

ASCII or Latin-1, the text is still encoded into bytes, and still needs 
to be decoded back to text. Since Latin-1 is a superset of ASCII, you 
could use Latin-1 for them all, and still get the same result.

Of course you can't just decode the entire file into Latin-1, since 
parts of it represent non-text data, but you could decode all the text 
parts individually using Latin-1 and/or ASCII.

(To those reading and wondering how I know the character and memo fields 
use Latin-1, Ethan has discussed this case on comp.lang.python.)
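
A minimal sketch of decoding each part of such a record individually. The record layout here (a 4-byte little-endian int, then a 2-byte ASCII number, then a Latin-1 memo) is made up for illustration and is not Ethan's actual file format:

```python
import struct

# Hypothetical record: binary int + ASCII digits + Latin-1 text.
record = struct.pack('<i', 1234) + b'35' + 'caf\xe9'.encode('latin-1')

# Binary part: unpack the 4-byte integer.
(count,) = struct.unpack('<i', record[:4])

# ASCII-encoded number: decode the text, then convert it.
age = int(record[4:6].decode('ascii'))

# Encoded text: decode with the field's own encoding.
memo = record[6:].decode('latin-1')

print(count, age, memo)
```

Only the slices that hold text are decoded; the binary slice never passes through a text codec.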


From ethan at  Wed Jan  8 01:56:37 2014
From: ethan at (Ethan Furman)
Date: Tue, 07 Jan 2014 16:56:37 -0800
Subject: [Python-ideas] [OT] banning Mark Janssen
Message-ID: <>


Mark Janssen's posts are becoming extremely abusive, which seems to me to be against the code of conduct.

Can we ban him, at least from the mailing lists?


From songofacandy at  Wed Jan  8 01:50:30 2014
From: songofacandy at (INADA Naoki)
Date: Wed, 8 Jan 2014 09:50:30 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <>
 <> <>
 <> <>
 <20140107185733.7ad1a3be@fsol> <>
 <20140107194752.304604a1@fsol> <>
 <20140107205936.7706c393@fsol> <>
Message-ID: <>

I'm the developer of PyMySQL (a pure-Python MySQL driver).
I'd like to share how I have suffered from bytes not having %-formatting.

MySQL-python is the most widely used DB-API 2.0 driver for MySQL.
Other MySQL drivers like PyMySQL and MySQL-connector-python are designed
to be as compatible with it as possible.
MySQL-python uses the 'format' paramstyle.

The MySQL protocol is basically encoded text, but it may contain arbitrary
(escaped) binary.
Here is a simplified example of constructing real SQL from an SQL format and
arguments. (Works only on Python 2.7)

def escape_string(s):
    # Simplified: only doubles single quotes.
    return s.replace("'", "''")

def convert(x):
    if isinstance(x, unicode):
        x = x.encode('utf-8')  # Use the encoding assigned to the connection
    if isinstance(x, str):
        x = "'" + escape_string(x) + "'"  # 'quoted and '' escaped string'
    else:
        x = str(x)  # like 42
    return x

def build_query(query, *args):
    if isinstance(query, unicode):
        query = query.encode('utf-8')
    return query % tuple(map(convert, args))

textdata = b"hello"
bindata = b"abc\xff\x00"
query = "UPDATE table SET textcol=%s, bincol=%s"

print build_query(query, textdata, bindata)

I can't port this to Python 3.
Fortunately, MySQL supports hex string literals like x'616263ff00',
so I use them, and PyMySQL supports binary data on Python 3.
But a hex string takes twice the space of normal (escaped) bytes.
This is why I don't use hex strings on Python 2.
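
For illustration, a minimal Python 3 sketch of building such a hex literal. The helper name `hex_literal` is hypothetical and this is not PyMySQL's actual code:

```python
import binascii

def hex_literal(data):
    """Render bytes as a MySQL hex string literal such as x'616263ff00'."""
    return "x'" + binascii.hexlify(data).decode('ascii') + "'"

bindata = b"abc\xff\x00"
literal = hex_literal(bindata)
print(literal)

# Each byte becomes two hex digits, hence the doubled size.
print(len(bindata), len(literal) - 3)
```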

On Wed, Jan 8, 2014 at 8:20 AM, Mark Janssen <dreamingforward at> wrote:

> >>> The trouble (for me) comes in when I try to use single bytes,
> >>> either when creating or extracting.
> >>
> >> Hmm... aren't you exaggerating the trouble? It's not very difficult to
> >> work with single bytes in Python 3...
> >
> > No, I'm not.  I don't think of b'C' as the integer 67 any more than I
> > think of the number 256 as the bytes b'\x01\xFF'.
> There's something fundamentally wrong with these brainfarts coming out
> on the list.  Just how, Ethan, did you think you could represent
> binary data in a text string, whether preceded by the char 'b' or not?
>  What did you think you would do when you got to character 0, the
> first (pseudo)-symbol in ASCII?
> Why don't you jackasses start listening instead of wanking each other
> with bullshit?
> markj

INADA Naoki  <songofacandy at>

From ethan at  Wed Jan  8 02:19:38 2014
From: ethan at (Ethan Furman)
Date: Tue, 07 Jan 2014 17:19:38 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <20140108003910.GL29356@ando>
References: <>
 <> <>
 <> <>
 <> <>
Message-ID: <>

On 01/07/2014 04:39 PM, Steven D'Aprano wrote:
> On Tue, Jan 07, 2014 at 08:48:05AM -0800, Ethan Furman wrote:
>> [...] My binary stream is mixed:
>>    - binary that has to be converted (4-byte ints, for example)
>>    - ascii that has to be converted (ints stored as ascii text)
>>    - encoded text (character and memo fields)
> Ethan, you keep referring to ascii text and encoded text as if they are
> different things. They're not.

Would you feel better if I called them ASCII-encoded text, and other-encoded text?  And they are different, if for no 
other reason than they are using different encodings.  Further, the ASCII-encoded text can be directly compared with 
byte sequences because . . . they're bytes! ;)

>  You have a binary file containing bytes.
> Some of those bytes represent data of one kind (say, 4-byte ints). Some
> of those bytes represent data of a different kind (Latin-1 encoded text
> representing character and memo fields) and other bytes represent data
> of a third kind (ASCII encoded text representing ints, but you don't
> mention what the meaning of those ints is).

ASCII-encoded text representing ints are ints.  I don't know what they mean, but presumably they have something to do with 
whatever the user named the field.  For example, I would imagine that b'35' in an AGE field meant 35 years; luckily I 
only have to give the user back the integer 35, not figure out what it's supposed to mean.

> ASCII or Latin-1, the text is still encoded into bytes, and still needs
> to be decoded back to text.

No, it doesn't.  I don't need to convert b'35' into u'35' to convert to 35.  I don't need to convert b'N' to u'N' to 
know I have a Numeric field, nor b'T' to u'T' to get True.
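
A small Python 3 sketch of the point: `int()` accepts ASCII digits as bytes directly, and single-byte markers can be compared as bytes. The field values below are made up for illustration:

```python
raw_age = b'35'
field_type = b'N'
raw_flag = b'T'

# int() parses ASCII digits from a bytes object without decoding to str.
age = int(raw_age)

# Type and boolean markers can be compared against bytes literals.
is_numeric = (field_type == b'N')
flag = (raw_flag == b'T')

print(age, is_numeric, flag)
```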


From steve at  Wed Jan  8 03:20:18 2014
From: steve at (Steven D'Aprano)
Date: Wed, 8 Jan 2014 13:20:18 +1100
Subject: [Python-ideas] [OT] banning Mark Janssen
In-Reply-To: <>
References: <>
Message-ID: <20140108022018.GN29356@ando>

On Tue, Jan 07, 2014 at 04:56:37PM -0800, Ethan Furman wrote:
> Moderators,
> Mark Janssen's posts are becoming extremely abusive, which seems to me to 
> be against he code of conduct.
> Can we ban him, at least from the mailing lists?

I think he should be given one formal warning, but won't object if the 
moderators decide to just kick his arse out of here. It isn't as if he 
contributes anything useful to the discussion.

For the record, I have no objection to swearing or profanity (we're all 
adults here, or at least we're supposed to act like them), but there is 
a difference between "rude words" and abuse, and Mark crosses the line 
into abuse.

(I would also like to preemptively state that I object in the strongest 
possible terms to a blanket "no swearing" policy, just in case anyone is 
thinking of introducing such a thing.)


From abarnert at  Wed Jan  8 04:32:22 2014
From: abarnert at (Andrew Barnert)
Date: Tue, 7 Jan 2014 19:32:22 -0800
Subject: [Python-ideas] The fools shall start sucking the cock.
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 7, 2014, at 16:07, Brett Cannon <brett at> wrote:

> That language is not called for

Personally, I find it useful. When I have no idea what a message means, sometimes that means I have to put more effort into it--maybe the author is way above my level of expertise, or maybe he's writing English as a third language--and sometimes it means I can just ignore it--maybe it's contentless, a troll, or the product of insanity. A subject line like this makes it much faster to figure out which case this is.

> (what the heck is the subject line even supposed to mean?). While I'm not saying you can use a swear word here or there to punctuate a statement, being this over-the-top is not considerate of others.
> On Tue, Jan 7, 2014 at 6:49 PM, Mark Janssen <dreamingforward at> wrote:
>> Okay,  how's everyone doing with their Python 2 vs.3,  bytes/unicode
>> vs. shit-extruder expertise?
>> Anyone need some relief, perhaps some guidance?
>> markj
>> *kicks feet up to table*

From at  Wed Jan  8 04:41:57 2014
From: at (Haoyi Li)
Date: Tue, 7 Jan 2014 19:41:57 -0800
Subject: [Python-ideas] [OT] banning Mark Janssen
In-Reply-To: <20140108022018.GN29356@ando>
References: <> <20140108022018.GN29356@ando>
Message-ID: <>

I'm for banning him. He has contributed in discussion occasionally, but
abuse is abuse.

On Tue, Jan 7, 2014 at 6:20 PM, Steven D'Aprano <steve at> wrote:

> On Tue, Jan 07, 2014 at 04:56:37PM -0800, Ethan Furman wrote:
> > Moderators,
> >
> > Mark Janssen's posts are becoming extremely abusive, which seems to me to
> > be against he code of conduct.
> >
> > Can we ban him, at least from the mailing lists?
> I think he should be given one formal warning, but won't object if the
> moderators decide to just kick his arse out of here. It isn't as if he
> contributes anything useful to the discussion.
> For the record, I have no objection to swearing or profanity (we're all
> adults here, or at least we're supposed to act like them), but there is
> a difference between "rude words" and abuse, and Mark crosses the line
> into abuse.
> (I would also like to preemptively state that I object in the strongest
> possible terms to a blanket "no swearing" policy, just in case anyone is
> thinking of introducing such a thing.)
> --
> Steven

From tim.peters at  Wed Jan  8 04:52:01 2014
From: tim.peters at (Tim Peters)
Date: Tue, 7 Jan 2014 21:52:01 -0600
Subject: [Python-ideas] The fools shall start sucking the cock.
In-Reply-To: <>
References: <>
Message-ID: <>

[Brett Cannon]
> That language is not called for (what the heck is the subject line even
> supposed to mean?).

Allow me to clarify:  a cock is the male of any species of bird, not
just a rooster.  I was confused too before I looked that up ;-)

> While I'm not saying you can use a swear word here or
> there to punctuate a statement, being this over-the-top is
> not considerate of others.

I blame it on the PSF.  Apparently we haven't been clear enough on
what we're looking for when voting on Community Service Awards:

still-wondering-what-the-wise-shall-start-doing-ly y'rs  - tim

From stephen at  Wed Jan  8 07:04:44 2014
From: stephen at (Stephen J. Turnbull)
Date: Wed, 08 Jan 2014 15:04:44 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was:
	a	new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

I'm responding here rather than directly to Steven because Andrew
explains it as well as I could.  In all cases where I don't comment,
Andrew is 100% correct as to my intended semantics.

The critical point is just that in cases where "the ASCII characters
are themselves" and an 8-bit representation is theoretically possible,
an 8-bit representation is used.  More precisely, if the identities of
128-255 as characters are not important to the programmer, these bytes
are not interpreted as characters, in the same way that surrogate-
escaped bytes are uninterpreted in the current representation.

Andrew Barnert writes:

 > I think Stephen's name "7-bit" is confusing people.

Indeed, and I apologize for confusing Steven in particular, which is
entirely due to that poor choice.

 > If you try to interpret the name sensibly, you get Steven's broken
 > interpretation. But if you read it as a nonsense word and work
 > through the logic, it all makes sense.

Maybe "ascii-compatible" is better.  It's a union type, including all
encodings where octets 0-127 receive the standard mapping to the ASCII
characters, but octets 128-255 are ambiguous.

 > > Suppose we take a byte-string with a non-ASCII byte:
 > > 
 > >    b'abc\xFF'.decode('ascii-compatible')
 > > 
 > > This will return... what? I think it returns a so-called 7-bit 
 > > representation, but I'm not sure what it is a representation of.
 > The representation is the bytes 61 62 63 FF with the floobl flag
 > set. It's a representation of an 'a' char, a 'b' char, a 'c' char,
 > and a smuggled FF byte--identical to 'abc\uDCFF'.

Except that it's an 8-bit representation invisible to Python except
for maybe the timeit package, yes.
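
For reference, this is how the existing surrogateescape error handler behaves in current Python 3 (an illustration of the str-level equivalence only; the proposed 'ascii-compatible' codec does not exist):

```python
raw = b'abc\xff'

# The non-ASCII byte 0xFF is smuggled in as the lone surrogate U+DCFF.
text = raw.decode('ascii', 'surrogateescape')
print(ascii(text))

# Encoding with the same handler restores the original bytes.
restored = text.encode('ascii', 'surrogateescape')

# Without the handler, a strict codec refuses the smuggled byte.
try:
    text.encode('utf-8')
except UnicodeEncodeError:
    strict_encode_failed = True
else:
    strict_encode_failed = False

print(restored, strict_encode_failed)
```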

 > (This last bit is the part I'm a bit wary of, as it promoted
 > surrogate-escape to being an inherent part of the meaning of
 > Unicode strings in Python.

They're already part of the inherent meaning of Unicode strings.  The
alternative is to read ASCII-compatible streams as latin1, which
*changes their meaning*.

 > > Your description confuses me. The "7-bit string" is already text, how do 
 > > you decode it to the 16-bit internal representation? 
 > By decoding its representation as if it were bytes, using surrogate-escape.

Strictly speaking, it's not a "decoding", it's a change of internal
representation.

 > >> 5.  String methods that would raise or produce undefined results if
 > >>    used on str containing surrogate-encoded bytes need to be taught
 > >>    to do the same on non-ASCII bytes in 7-bit str objects.
 > > 
 > > Do you have an example of such string methods?

No, I don't, but I imagined there might be some.  (My original example
was case conversion, but that doesn't work because Python doesn't
check for whether something is actually a code point that can be a
character, even -- it just notices that surrogate-encoded bytes don't
have alternative cases in the database and passes them through.)
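
A quick check of the case-conversion behaviour described above, as observed in CPython 3 (shown only as an illustration):

```python
smuggled = b'abc\xff'.decode('ascii', 'surrogateescape')

# upper() changes the ASCII letters and passes the surrogate through.
upper = smuggled.upper()
print(ascii(upper))

# The smuggled byte still round-trips after the conversion.
back = upper.encode('ascii', 'surrogateescape')
print(back)
```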

 > >> 7.  On output other codecs raise on a 7-bit str, unless the
 > >>    surrogateescape handler is in use.
 > > 
 > > What do you mean by "on output"? Do you mean when encoding?

Yes.  You (all, but Steven in particular) have my apology for the

 > However, I think there's a mistake in the design of 6 here. Surely
 > encoding 'abc\uDCFF' should give you the bytes 61 62 63 FF, not an
 > exception, right? (Unless the idea is that such a string is
 > guaranteed to have a floobl-flagged 8-bit representation, not a
 > 16-bit one, no matter how you try to create it in Python or in C,
 > and I don't think the other rules make that guarantee.)

Andrew is correct, that is a mistake in design.  I thought an 8-bit
representation was guaranteed in that case, with the "floobl" flag
set.  I think that Andrew's idea is correct, but this miss makes me
nervous about the coherence of the concept.

From stephen at  Wed Jan  8 07:08:17 2014
From: stephen at (Stephen J. Turnbull)
Date: Wed, 08 Jan 2014 15:08:17 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <lahvla$ev8$>
References: <>
 <> <lahvla$ev8$>
Message-ID: <>

Terry Reedy writes:

 > The above describes a minor variation on bytes and seems to me to be a 
 > classic case for subclassing, whether in Python for ease or C for speed, 
 > in an imported module.

I agree with you, but the discussion on python-dev indicates that the
majority of core devs, including Guido IIUC, disagree with us.  In
fact they want to add many str-like capabilities to bytes (and the
related mutable classes bytearray and memoryview).

From stephen at  Wed Jan  8 07:18:24 2014
From: stephen at (Stephen J. Turnbull)
Date: Wed, 08 Jan 2014 15:18:24 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

Ethan Furman writes:

 > Sounds like it doesn't help me then.  My binary stream is mixed:
 >    - binary that has to be converted (4-byte ints, for example)
 >    - ascii that has to be converted (ints stored as ascii text)
 >    - encoded text (character and memo fields)
 > and the precise location of each varies from file to file.

Yes, I understand all that, but without code examples (or rather
precise specification of the semantics you're implementing) I can't
discuss whether my 'ascii-compatible' (the Artist Formerly Known as
"7-bit representation") would help you write efficient and readable
code.  Cf. INADA-san's post for what would help me.

From breamoreboy at  Wed Jan  8 09:02:20 2014
From: breamoreboy at (Mark Lawrence)
Date: Wed, 08 Jan 2014 08:02:20 +0000
Subject: [Python-ideas] [OT] banning Mark Janssen
In-Reply-To: <>
References: <>
Message-ID: <laj0m8$ua3$>

On 08/01/2014 00:56, Ethan Furman wrote:
> Moderators,
> Mark Janssen's posts are becoming extremely abusive, which seems to me
> to be against he code of conduct.
> Can we ban him, at least from the mailing lists?
> --
> ~Ethan~

He's a complete waste of space, please get rid of him.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From ncoghlan at  Wed Jan  8 10:59:33 2014
From: ncoghlan at (Nick Coghlan)
Date: Wed, 8 Jan 2014 19:59:33 +1000
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <lahvla$ev8$>
Message-ID: <>

On 8 Jan 2014 14:08, "Stephen J. Turnbull" <stephen at> wrote:
> Terry Reedy writes:
>  > The above describes a minor variation on bytes and seems to me to be a
>  > classic case for subclassing, whether in Python for ease or C for speed,
>  > in an imported module.
> I agree with you, but the discussion on python-dev indicates that the
> majority of core devs, including Guido IIUC, disagree with us.  In
> fact they want to add many str-like capabilities to bytes (and the
> related mutable classes bytearray and memoryview).

That's far from a foregone conclusion. The main problem we've had over the
past few years is the inability to get past "just give us back the Python 2
str type" responses (from wire protocol developers attempting to migrate who
aren't happy with the approach of manipulating data in the text domain) and
on to actual experiments with a suitable type for wire protocol development
that interoperates nicely with the Python 3 text model.

Now that your proposal has been better explained, yes, I agree that
"asciibytes" and "asciistr" types would be well worth experimenting with. I
mention both, since it's far from clear if a str subclass or a bytes
subclass (or neither, although that may require bug fixes in CPython) would
be more convenient for this use case.

The key difference between such a type and a str with surrogate escaped
elements or a Python 2 bytestring is that it would attempt to implicitly
*encode* any Unicode text it encountered as strict ASCII text. This would
allow text and binary processing to share code paths, with limited risk of
producing mojibake (particularly since this type wouldn't be a builtin).

The type would also share the str behaviour of returning a single element
subsequence when indexed rather than an integer.
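
A rough sketch of how such an experiment might start, as a bytes subclass. The class below is hypothetical (only the name "asciibytes" comes from this message), and it covers just concatenation and indexing:

```python
class asciibytes(bytes):
    """Hypothetical bytes subclass that accepts ASCII-only str operands."""

    @staticmethod
    def _as_bytes(value):
        if isinstance(value, str):
            # Strict ASCII: non-ASCII text raises UnicodeEncodeError.
            return value.encode('ascii')
        return value

    def __add__(self, other):
        return type(self)(super().__add__(self._as_bytes(other)))

    def __getitem__(self, index):
        item = super().__getitem__(index)
        if isinstance(item, int):
            item = bytes([item])  # return a length-1 subsequence
        return type(self)(item)


request = asciibytes(b'GET ') + '/index.html' + b' HTTP/1.1'
print(request, request[0])
```

A real experiment would also need `__radd__`, `%`-formatting, `join` and so on; this only shows the implicit strict-ASCII encoding and the str-like indexing.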



From terrycwk1994 at  Wed Jan  8 11:16:47 2014
From: terrycwk1994 at (Terry Chia)
Date: Wed, 8 Jan 2014 18:16:47 +0800
Subject: [Python-ideas] Strong password hashing algorithms in the standard
Message-ID: <>

Hi all,

I would like to propose that a new library for strong password hashing
be included in the standard library. The proposed library should consist
of one or more strong password hashes like pbkdf2, bcrypt or scrypt.

There already exist third-party libraries like passlib[2] that accomplish
the same thing, but I feel that inclusion of the algorithms in the standard
library would do a lot to help people who are not as security-aware to do
the right thing when it comes to passwords.

Alternatively, if the idea of adding the algorithms into the standard
library does not have much support, I would like to see a warning added to
hashlib[3] discouraging its use for password hashing.
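
For context, a minimal sketch of salted PBKDF2 hashing using `hashlib.pbkdf2_hmac`, which is available from Python 3.4 on. The helper names and the iteration count are illustrative assumptions, not a recommendation:

```python
import hashlib
import hmac
import os

ITERATIONS = 100000  # assumed work factor; tune for your hardware

def hash_password(password, salt=None):
    """Return (salt, derived_key) for a password given as str."""
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'),
                              salt, ITERATIONS)
    return salt, key

def verify_password(password, salt, expected_key):
    """Recompute the key and compare it in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)

salt, key = hash_password('correct horse')
print(verify_password('correct horse', salt, key))
print(verify_password('wrong guess', salt, key))
```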




From breamoreboy at  Wed Jan  8 11:18:12 2014
From: breamoreboy at (Mark Lawrence)
Date: Wed, 08 Jan 2014 10:18:12 +0000
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
 <> <>
 <lahvla$ev8$> <>
Message-ID: <laj8l2$5s2$>

On 08/01/2014 09:59, Nick Coghlan wrote:
> Now that your proposal has been better explained, yes, I agree that
> "asciibytes" and "asciistr" types would be well worth experimenting
> with. I mention both, since it's far from clear if a str subclass or a
> bytes subclass (or neither, although that may require bug fixes in
> CPython) would be more convenient for this use case.

Could you subclass both to get the best of both worlds?  As in

class asciixyz(str, bytes):
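
For what it's worth, a quick check in CPython 3 suggests this particular combination is rejected at class creation time, because str and bytes have incompatible instance layouts (a small experiment, not a statement about other implementations):

```python
try:
    class asciixyz(str, bytes):
        pass
except TypeError as exc:
    # CPython reports an instance lay-out conflict between the bases.
    error = exc
else:
    error = None

print(type(error).__name__, error)
```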

> Cheers,
> Nick.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From solipsis at  Wed Jan  8 11:34:08 2014
From: solipsis at (Antoine Pitrou)
Date: Wed, 8 Jan 2014 11:34:08 +0100
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
References: <>
 <> <20140107185733.7ad1a3be@fsol>
 <> <20140107194752.304604a1@fsol>
 <> <20140107205936.7706c393@fsol>
Message-ID: <20140108113408.51509b48@fsol>

On Wed, 8 Jan 2014 09:50:30 +0900
INADA Naoki <songofacandy at> wrote:
> textdata = b"hello"

textdata shouldn't be a bytes object! If it's text it's a str.

> bindata = b"abc\xff\x00"
> query = "UPDATE table SET textcol=%s bincol=%s"
> print build_query(query, textdata, bindata)
> I can't port this to Python 3.

I'm sure you can port it. Just decode your bindata using
surrogateescape:

  bindata = bindata.decode('utf8', 'surrogateescape')

and then encode the query at the end:

  query = query.encode('utf8', 'surrogateescape')

It will be a little slower, though. 
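
Put together, a Python 3 port along these lines might look like the sketch below. It assumes a utf8 connection and keeps the simplified escaping of the original example, so it is an illustration rather than production code:

```python
def escape_string(s):
    # Simplified, as in the original example: only doubles single quotes.
    return s.replace("'", "''")

def convert(x):
    if isinstance(x, bytes):
        # Smuggle arbitrary bytes through str as lone surrogates.
        x = x.decode('utf8', 'surrogateescape')
    if isinstance(x, str):
        return "'" + escape_string(x) + "'"
    return str(x)  # like 42

def build_query(query, *args):
    text = query % tuple(map(convert, args))
    # Encoding with the same handler restores the original bytes.
    return text.encode('utf8', 'surrogateescape')

textdata = "hello"
bindata = b"abc\xff\x00"
query = "UPDATE table SET textcol=%s, bincol=%s"
result = build_query(query, textdata, bindata)
print(result)
```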



From solipsis at  Wed Jan  8 11:35:31 2014
From: solipsis at (Antoine Pitrou)
Date: Wed, 8 Jan 2014 11:35:31 +0100
Subject: [Python-ideas] The fools shall start sucking the cock.
References: <>
Message-ID: <20140108113531.4ebe0148@fsol>

Not to mention the utter lack of content.



On Tue, 7 Jan 2014 19:07:19 -0500
Brett Cannon <brett at> wrote:
> That language is not called for (what the heck is the subject line even
> supposed to mean?). While I'm not saying you can use a swear word here or
> there to punctuate a statement, being this over-the-top is not considerate
> of others.
> On Tue, Jan 7, 2014 at 6:49 PM, Mark Janssen <dreamingforward at> wrote:
> > Okay,  how's everyone doing with their Python 2 vs.3,  bytes/unicode
> > vs. shit-extruder expertise?
> >
> > Anyone need some relief, perhaps some guidance?
> >
> > markj
> > *kicks feet up to table*

From enric.tejedor at  Wed Jan  8 12:21:31 2014
From: enric.tejedor at (Enric Tejedor)
Date: Wed, 08 Jan 2014 12:21:31 +0100
Subject: [Python-ideas] Decorators on loops
Message-ID: <>


I would like to discuss a new use of python decorators. I apologize if
this has already been suggested before.

The basic idea would be to support decorators on loops, in addition to
functions and classes. Something like this:

@mydecorator
for i in range(10):
     # loop body

In mydecorator, I would like to have access to the loop body and the
iterable object.

In my case, I would use this to parallelize the iterations of the loop.
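
For comparison, something close to this can be approximated in current Python by writing the loop body as a function and decorating that. The `parallel_for` helper below is a hypothetical sketch built on `concurrent.futures`, not an existing API:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_for(iterable, max_workers=4):
    """Run the decorated function once per item, in a thread pool."""
    def decorator(body):
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # map() preserves the order of the iterable in its results.
            return list(pool.map(body, iterable))
    return decorator

@parallel_for(range(10))
def squares(i):
    # loop body
    return i * i

# After decoration, the name is bound to the list of results.
print(squares)
```

The decorator receives both the body (as a function) and the iterable, which is the access the proposal asks for; the cost is that the body cannot use break, continue or rebind local variables of the enclosing scope.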

Thank you for your feedback,



From songofacandy at  Wed Jan  8 12:31:10 2014
From: songofacandy at (INADA Naoki)
Date: Wed, 8 Jan 2014 20:31:10 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <20140108113408.51509b48@fsol>
References: <>
 <> <>
 <> <>
 <> <>
 <20140107185733.7ad1a3be@fsol> <>
 <20140107194752.304604a1@fsol> <>
 <20140107205936.7706c393@fsol> <>
Message-ID: <>

On Wed, Jan 8, 2014 at 7:34 PM, Antoine Pitrou <solipsis at> wrote:

> On Wed, 8 Jan 2014 09:50:30 +0900
> INADA Naoki <songofacandy at>
> wrote:
> >
> > textdata = b"hello"
> textdata shouldn't be a bytes object! If it's text it's a str.
PyMySQL and MySQL-python support both unicode text and encoded text.
So bytes may be text in MySQL if they are inserted into a TEXT or VARCHAR
column.

> > bindata = b"abc\xff\x00"
> > query = "UPDATE table SET textcol=%s bincol=%s"
> >
> > print build_query(query, textdata, bindata)
> >
> >
> > I can't port this to Python 3.
> I'm sure you can port it. Just decode your bindata using
> surrogateescape:
>   bindata = bindata.decode('utf8', 'surrogateescape')
> and then encode the query at the end:
>   query = query.encode('utf8', 'surrogateescape')
> It will be a little slower, though.

You're right. I had not considered using surrogateescape here.

But the MySQL connection may not be utf8. The default is latin1, and you can
use many encodings.
Some encodings don't ensure a round trip. With such an encoding,

bindata = bindata.decode('sjis', 'surrogateescape')
query = query % bindata
query.encode('sjis', 'surrogateescape')

may break bindata.

I may be able to use ascii for decoding when MySQL uses an ASCII-compatible
encoding:

bindata = bindata.decode('ascii', 'surrogateescape')
query = query % bindata
query.encode('sjis', 'surrogateescape')

But I think decode/encode with surrogateescape is not only slow, but also
dangerous when using an encoding other than ascii or utf8.

> Regards
> Antoine
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

INADA Naoki  <songofacandy at>

From solipsis at  Wed Jan  8 12:38:22 2014
From: solipsis at (Antoine Pitrou)
Date: Wed, 8 Jan 2014 12:38:22 +0100
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
References: <>
 <> <20140107185733.7ad1a3be@fsol>
 <> <20140107194752.304604a1@fsol>
 <> <20140107205936.7706c393@fsol>
Message-ID: <20140108123822.252fc642@fsol>

On Wed, 8 Jan 2014 20:31:10 +0900
INADA Naoki <songofacandy at>
> You're right. I've not considered using surrogateescape here.
> But MySQL connection may be not utf8. It's default latin1 and you can use
> many encoding.
> Some encoding doesn't ensure roundtrip. In such encoding,
> But I think decode/encode with surrogateescape is not only slow, but also
> dangerous when using
> encoding except ascii or utf8.

You're right. Thanks for exposing your use case, I think it's a good data
point for the bytes formatting PEP.



From terrycwk1994 at  Wed Jan  8 12:42:23 2014
From: terrycwk1994 at (Terry Chia)
Date: Wed, 8 Jan 2014 19:42:23 +0800
Subject: [Python-ideas] Strong password hashing algorithms in the
	standard library
In-Reply-To: <>
References: <>
Message-ID: <>

That's great!

Are there any plans to also include algorithms like bcrypt and scrypt given
that they are stronger than pbkdf2 for GPU/FPGA-using attackers?

Also, can the same warning be placed in older documentation, such as the 2.7
docs, given the large number of people still using 2.7?

On Wed, Jan 8, 2014 at 7:30 PM, Ronald Oussoren <ronaldoussoren at>wrote:

> On Jan 08, 2014, at 11:17 AM, Terry Chia <terrycwk1994 at> wrote:
> Hi all,
> I would like to propose that a new library for strong password hashing
> algorithms[1]
> be included in the standard library. The proposed library should have
> implementations
> of one or more strong password hashes like pbkdf2, bcrypt or scrypt.
> There already exist third party libraries like passlib[2] that
> accomplishes the same thing
> but I feel that inclusion of the algorithms in the standard library would
> do a lot to help
> people that are not as security-aware to do the right thing when it comes
> to password
> storage.
> Alternatively, if the idea of adding the algorithms into the standard
> library does not have
> much support, I would like to see a warning added to the hashlib[3]
> documentation
> discouraging its use for password hashing.
> Python 3.4 will include hashlib.pbkdf2_hmac, see <
> That documentation also warns about using a plain hash function for
> creating password hashes.
> Ronald

From solipsis at  Wed Jan  8 12:42:24 2014
From: solipsis at (Antoine Pitrou)
Date: Wed, 8 Jan 2014 12:42:24 +0100
Subject: [Python-ideas] Strong password hashing algorithms in the
	standard library
References: <>
Message-ID: <20140108124224.69dcb257@fsol>

Hi Terry,

On Wed, 8 Jan 2014 18:16:47 +0800
Terry Chia <terrycwk1994 at> wrote:
> I would like to propose that a new library for strong password hashing
> algorithms[1]
> be included in the standard library. The proposed library should have
> implementations
> of one or more strong password hashes like pbkdf2, bcrypt or scrypt.

In 3.4, hashlib has gained a pbkdf2 implementation:
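
A minimal sketch of how that function could be used for password storage
(the helper names, the salt size and the iteration count below are
illustrative assumptions, not recommendations from this thread):

```python
import hashlib
import hmac
import os

ITERATIONS = 100000  # illustrative value; tune it for your hardware


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Derive a key from a text password with PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password
    digest = hashlib.pbkdf2_hmac(
        'sha256', password.encode('utf-8'), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=ITERATIONS):
    """Recompute the derived key and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The salt and the iteration count have to be stored next to the derived key
so that the same derivation can be repeated at login time.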

I think other similar primitives should be added alongside. It's
probably enough to open an issue on

If you want guidance on how to contribute code, please take a look at
the developers' guide:

Best regards


From songofacandy at  Wed Jan  8 12:53:26 2014
From: songofacandy at (INADA Naoki)
Date: Wed, 8 Jan 2014 20:53:26 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <20140108123822.252fc642@fsol>
References: <>
 <> <>
 <> <>
 <> <>
 <20140107185733.7ad1a3be@fsol> <>
 <20140107194752.304604a1@fsol> <>
 <20140107205936.7706c393@fsol> <>
Message-ID: <>

FYI, I can easily make sample data that does not round-trip with iso2022-jp:

In [5]: b'\x1b$B\x1b(B'.decode('iso2022_jp')
Out[5]: ''

In [6]: b'\x1b$B\x1b(B'.decode('iso2022_jp',
'surrogateescape').encode('iso2022_jp', 'surrogateescape')
Out[6]: b''
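
The same check can be wrapped in a small helper to compare encodings. This
is only a sketch; the function name is made up for illustration:

```python
def survives_round_trip(data, encoding):
    """Return True if decoding and re-encoding `data` with the
    surrogateescape error handler gives back exactly the same bytes."""
    text = data.decode(encoding, 'surrogateescape')
    return text.encode(encoding, 'surrogateescape') == data


blob = b"abc\xff\x00"

# ASCII and UTF-8 map each undecodable byte to a lone surrogate and back.
print(survives_round_trip(blob, 'ascii'))
print(survives_round_trip(blob, 'utf-8'))

# A stateful encoding can silently drop bytes such as empty escape
# sequences, as in the session above.
print(survives_round_trip(b'\x1b$B\x1b(B', 'iso2022_jp'))
```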

On Wed, Jan 8, 2014 at 8:38 PM, Antoine Pitrou <solipsis at> wrote:

> On Wed, 8 Jan 2014 20:31:10 +0900
> INADA Naoki <songofacandy at>
> wrote:
> >
> > You're right. I've not considered using surrogateescape here.
> >
> > But MySQL connection may be not utf8. It's default latin1 and you can use
> > many encoding.
> > Some encoding doesn't ensure roundtrip. In such encoding,
> >
> [...]
> >
> > But I think decode/encode with surrogateescape is not only slow, but also
> > dangerous when using
> > encoding except ascii or utf8.
> You're right. Thanks exposing your use case, I think it's a good data
> point for the bytes formatting PEP.
> Regards
> Antoine.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

INADA Naoki  <songofacandy at>

From rosuav at  Wed Jan  8 12:59:10 2014
From: rosuav at (Chris Angelico)
Date: Wed, 8 Jan 2014 22:59:10 +1100
Subject: [Python-ideas] Decorators on loops
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 8, 2014 at 10:21 PM, Enric Tejedor <enric.tejedor at> wrote:
> The basic idea would be to support decorators on loops, in addition to
> functions and classes. Something like this:
> @mydecorator
> for i in range(10):
>      # loop body
> In mydecorator, I would like to have access to the loop body and the
> iterable object.
> In my case, I would use this to parallelize the iterations of the loop.

That's a nice theory, but the basic form of the decorator wouldn't
work. Here's how decorators work on functions:

@foo
def bar():
    pass

is the same as:

def bar():
    pass
bar = foo(bar)

It depends on there being something assigned-to. With loops, that's
not the case, so it's not possible to decorate them in the usual
sense.

Can you turn your loop into a map() call? Something like this:

def loop_body(i):
    # all the code for your loop body
list(map(loop_body, range(10)))

Once you have it in that form, you can use multiprocessing.Pool() and
its map() method, which will parallelize the loop for you (by
distributing it over a pool of subprocesses). Would that cover what
you need?
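
A rough sketch of that transformation, assuming a trivial stand-in for the
loop body (loop_body, run_serial and run_parallel are illustrative names):

```python
import multiprocessing


def loop_body(i):
    # Stand-in for the real loop body. It lives at module level so that
    # it can be pickled and sent to the worker processes.
    return i * i


def run_serial(n):
    # The plain loop, rewritten as a map() call.
    return list(map(loop_body, range(n)))


def run_parallel(n, processes=2):
    # Same call shape, but the iterations are spread over subprocesses.
    with multiprocessing.Pool(processes) as pool:
        return pool.map(loop_body, range(n))


if __name__ == "__main__":
    print(run_parallel(10) == run_serial(10))
```

The __main__ guard matters on platforms where worker processes re-import
the main module.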


From masklinn at  Wed Jan  8 13:08:31 2014
From: masklinn at (Masklinn)
Date: Wed, 8 Jan 2014 13:08:31 +0100
Subject: [Python-ideas] Decorators on loops
In-Reply-To: <>
References: <>
Message-ID: <>

On 2014-01-08, at 12:59 , Chris Angelico <rosuav at> wrote:

> On Wed, Jan 8, 2014 at 10:21 PM, Enric Tejedor <enric.tejedor at> wrote:
>> The basic idea would be to support decorators on loops, in addition to
>> functions and classes. Something like this:
>> @mydecorator
>> for i in range(10):
>>     # loop body
>> In mydecorator, I would like to have access to the loop body and the
>> iterable object.
>> In my case, I would use this to parallelize the iterations of the loop.
> That's a nice theory, but the basic form of the decorator wouldn't
> work. Here's how decorators work on functions:
> @foo
> def bar():
>    pass
> is the same as:
> def bar():
>    pass
> bar = foo(bar)
> It depends on there being something assigned-to. With loops, that's
> not the case, so it's not possible to decorate them in the usual
> sense.
> Can you turn your loop into a map() call? Something like this:
> def loop_body(i):
>    # all the code for your loop body
> list(map(loop_body, range(10)))
> Once you have it in that form, you can use multiprocessing.Pool() and
> its map() method, which will parallelize the loop for you (by
> distributing it over a pool of subprocesses). Would that cover what
> you need?

Alternatively, wrap the loop in a function and then do AST munging in
the decorator. Something similar (in spirit at least) to Numba

You could even do something like immediate function invocation in the
decorator, and bind the result to the function name, although I'm not
sure your coworkers will like you.
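
The immediate-invocation trick could look roughly like this (run_now is a
made-up name, and this sketch does no AST rewriting at all):

```python
def run_now(func):
    # Call the decorated function right away; whatever it returns is
    # what ends up bound to the function's name.
    return func()


@run_now
def total():
    acc = 0
    for i in range(10):
        acc += i
    return acc


# `total` is now the result of the loop, not a function.
print(total)
```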

From stephen at  Wed Jan  8 13:11:40 2014
From: stephen at (Stephen J. Turnbull)
Date: Wed, 08 Jan 2014 21:11:40 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107185733.7ad1a3be@fsol>
 <> <20140107194752.304604a1@fsol>
 <> <20140107205936.7706c393@fsol>
Message-ID: <>

>>>>> INADA Naoki writes:

 > I share my experience that I've suffered by bytes doesn't have %-format.
 > `MySQL-python is a most major DB-API 2.0 driver for MySQL.
 > MySQL-python uses 'format' paramstyle.

 > MySQL protocol is basically encoded text, but it may contain arbitrary
 > (escaped) binary.
 > Here is simplified example constructing real SQL from SQL format and
 > arguments. (Works only on Python 2.7)

'>' quotes are omitted for clarity and comments deleted.

    def escape_string(s):
        return s.replace("'", "''")

    def convert(x):
        if isinstance(x, unicode):
            x = x.encode('utf-8')
        if isinstance(x, str):
            x = "'" + escape_string(x) + "'"
        else:
            x = str(x)
        return x

    def build_query(query, *args):
        if isinstance(query, unicode):
            query = query.encode('utf-8')
        return query % tuple(map(convert, args))

    textdata = b"hello"
    bindata = b"abc\xff\x00"
    query = "UPDATE table SET textcol=%s bincol=%s"

    print build_query(query, textdata, bindata)

 > I can't port this to Python 3.

Why not?  The obvious translation is

    # This is Python 3!!
    def escape_string(s):
        return s.replace("'", "''")

    def convert(x):
        if isinstance(x, bytes):
            x = escape_string(x.decode('ascii', errors='surrogateescape'))
            x = "'" + x + "'"
        else:
            x = str(x)
        return x

    def build_query(query, *args):
        query = query % tuple(map(convert, args))
        return query.encode('utf-8', errors='surrogateescape')

    textdata = "hello"
    bindata = b"abc\xff\x00"
    query = "UPDATE table SET textcol=%s bincol=%s"

    print(build_query(query, textdata, bindata))

The main issue I can think of that you might have with this is that there will
need to be conversions to and from 16-bit representations, which take
up unnecessary space for bindata, and are relatively slow for bindata.
But it seems to me that these are second-order costs compared to the
other work an adapter needs to do.  What am I missing?
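
For reference, here is a self-contained variant of the translation that
runs as-is on Python 3. It is a sketch, not the adapter's real code: unlike
the version above it also quotes str arguments (an assumption about the
intended behaviour), and the doubling of quotes is far from a complete SQL
escaping scheme.

```python
def escape_string(s):
    return s.replace("'", "''")


def convert(x):
    if isinstance(x, bytes):
        # Smuggle arbitrary bytes through str as lone surrogates.
        x = x.decode('ascii', errors='surrogateescape')
    if isinstance(x, str):
        return "'" + escape_string(x) + "'"
    return str(x)  # numbers and similar


def build_query(query, *args):
    query = query % tuple(map(convert, args))
    # Lone surrogates are turned back into the original bytes here.
    return query.encode('utf-8', errors='surrogateescape')


result = build_query("UPDATE table SET textcol=%s bincol=%s",
                     "hello", b"abc\xff\x00")
print(result)
```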

With the proposed 'ascii-compatible' representation, if you have to
handle many MB of binary or textdata with non-ASCII characters,

    def convert(x):
        if isinstance(x, str):
            x = x.encode('utf-8').decode('ascii-compatible')
        elif isinstance(x, bytes):
            x = escape_string(x.decode('ascii-compatible'))
            x = "'" + x + "'"
        else:
            x = str(x)  # like 42
        return x

    def build_query(query, *args):
        query = convert(query) % tuple(map(convert, args))
        return query.encode('utf-8', errors='surrogateescape')

ensures that the '%' format operator is always dealing with 8-bit
representations only.  There might be a conversion from 16-bit to
8-bit for str, but there will be no conversions from 8-bit to 16-bit
representations.  I don't know if that makes '%' itself faster, but
it might.

From ned at  Wed Jan  8 13:17:36 2014
From: ned at (Ned Batchelder)
Date: Wed, 08 Jan 2014 07:17:36 -0500
Subject: [Python-ideas] Decorators on loops
In-Reply-To: <>
References: <>
Message-ID: <>

On 1/8/14 6:21 AM, Enric Tejedor wrote:
> Hello,
> I would like to discuss a new use of python decorators. I apologize if
> this has already been suggested before.
> The basic idea would be to support decorators on loops, in addition to
> functions and classes. Something like this:
> @mydecorator
> for i in range(10):
>       # loop body
> In mydecorator, I would like to have access to the loop body and the
> iterable object.
In the case of function and class decorators, Python has an object that 
can be passed to the decorator: the function or the class.  For a loop 
decorator, how would you "have access to the loop body"?  It sounds like 
it would have to be compiled differently, into a separate code object?

> In my case, I would use this to parallelize the iterations of the loop.
> Thank you for your feedback,
> Enric

From ronaldoussoren at  Wed Jan  8 12:30:12 2014
From: ronaldoussoren at (Ronald Oussoren)
Date: Wed, 08 Jan 2014 11:30:12 +0000 (GMT)
Subject: [Python-ideas] Strong password hashing algorithms in the
	standard	library
In-Reply-To: <>
Message-ID: <>

On Jan 08, 2014, at 11:17 AM, Terry Chia <terrycwk1994 at> wrote:

Hi all,

I would like to propose that a new library for strong password hashing algorithms[1]
be included in the standard library. The proposed library should have implementations
of one or more strong password hashes like pbkdf2, bcrypt or scrypt.

There already exist third party libraries like passlib[2] that accomplishes the same thing
but I feel that inclusion of the algorithms in the standard library would do a lot to help
people that are not as security-aware to do the right thing when it comes to password
storage.

Alternatively, if the idea of adding the algorithms into the standard library does not have
much support, I would like to see a warning added to the hashlib[3] documentation
discouraging its use for password hashing.

Python 3.4 will include hashlib.pbkdf2_hmac, see <>. That documentation also warns about using a plain hash function for creating password hashes.



From joshua at  Wed Jan  8 13:30:45 2014
From: joshua at (Joshua Landau)
Date: Wed, 8 Jan 2014 12:30:45 +0000
Subject: [Python-ideas] The fools shall start sucking the cock.
In-Reply-To: <>
References: <>
Message-ID: <>

On 8 January 2014 00:07, Brett Cannon <brett at> wrote:
> On Tue, Jan 7, 2014 at 6:49 PM, Mark Janssen <dreamingforward at> wrote:
>> [insults]
> That language is not called for (what the heck is the subject line even
> supposed to mean?). While I'm not saying you can use a swear word here or
> there to punctuate a statement, being this over-the-top is not considerate
> of others.


Mark Janssen,

You have been disregarding acceptable public etiquette for a while in
your posts, both on this python-ideas and python-list. This is a
request for you to stop.

From songofacandy at  Wed Jan  8 14:10:42 2014
From: songofacandy at (INADA Naoki)
Date: Wed, 8 Jan 2014 22:10:42 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <>
 <> <>
 <> <>
 <20140107185733.7ad1a3be@fsol> <>
 <20140107194752.304604a1@fsol> <>
 <20140107205936.7706c393@fsol> <>
Message-ID: <>

You're right.
As I said in my previous mail, I had not considered using surrogateescape.

But surrogateescape is not a silver bullet.
Decoding with ascii and encoding with the target encoding is not valid
unless the target encoding is ASCII-compatible.

In [29]: bindata = b'abc'
In [30]: bindata = bindata.decode('ascii', 'surrogateescape')
In [31]: text = 'abc'
In [32]: query = 'SET textcolumn=%s bincolumn=%s' % ("'" + text + "'", "'"
+ bindata + "'")
In [33]: query.encode('utf16', 'surrogateescape')
Out[33]: b"\xff\xfeS\x00E\x00T\x00

Fortunately, I can't use utf16 as client encoding with MySQL.
mysql> SET NAMES utf16;
ERROR 1231 (42000): Variable 'character_set_client' can't be set to the
value of 'utf16'

On Wed, Jan 8, 2014 at 9:11 PM, Stephen J. Turnbull <stephen at>wrote:

> >>>>> INADA Naoki writes:
>  > I share my experience that I've suffered by bytes doesn't have %-format.
>  > `MySQL-python is a most major DB-API 2.0 driver for MySQL.
>  > MySQL-python uses 'format' paramstyle.
>  > MySQL protocol is basically encoded text, but it may contain arbitrary
>  > (escaped) binary.
>  > Here is simplified example constructing real SQL from SQL format and
>  > arguments. (Works only on Python 2.7)
> '>' quotes are omitted for clarity and comments deleted.
>     def escape_string(s):
>         return s.replace("'", "''")
>     def convert(x):
>         if isinstance(x, unicode):
>             x = x.encode('utf-8')
>         if isinstance(x, str):
>             x = "'" + escape_string(x) + "'"
>         else:
>             x = str(x)
>         return x
>     def build_query(query, *args):
>         if isinstance(query, unicode):
>             query = query.encode('utf-8')
>         return query % tuple(map(convert, args))
>     textdata = b"hello"
>     bindata = b"abc\xff\x00"
>     query = "UPDATE table SET textcol=%s bincol=%s"
>     print build_query(query, textdata, bindata)
>  > I can't port this to Python 3.
> Why not?  The obvious translation is
>     # This is Python 3!!
>     def escape_string(s):
>         return s.replace("'", "''")
>     def convert(x):
>         if isinstance(x, bytes):
>             x = escape_string(x.decode('ascii', errors='surrogateescape'))
>             x = "'" + x + "'"
>         else:
>             x = str(x)
>         return x
>     def build_query(query, *args):
>         query = query % tuple(map(convert, args))
>         return query.encode('utf-8', errors='surrogateescape')
>     textdata = "hello"
>     bindata = b"abc\xff\x00"
>     query = "UPDATE table SET textcol=%s bincol=%s"
>     print build_query(query, textdata, bindata)
> The main issue I can think you might have with this is that there will
> need to be conversions to and from 16-bit representations, which take
> up unnecessary space for bindata, and are relatively slow for bindata.
> But it seems to me that these are second-order costs compared to the
> other work an adapter needs to do.  What am I missing?
> With the proposed 'ascii-compatible' representation, if you have to
> handle many MB of binary or textdata with non-ASCII characters,
>     def convert(x):
>         if isinstance(x, str):
>             x = x.encode('utf-8').decode('ascii-compatible')
>         elif isinstance(x, bytes):
>             x = escape_string(x.decode('ascii-compatible'))
>             x = "'" + x + "'"
>         else:
>             x = str(x)  # like 42
>         return x
>     def build_query(query, *args):
>         query = convert(query) % tuple(map(convert, args))
>         return query.encode('utf-8', errors='surrogateescape')
> ensures that the '%' format operator is always dealing with 8-bit
> representations only.  There might be a conversion from 16-bit to
> 8-bit for str, but there will be no conversions from 8-bit to 16-bit
> representations.  I don't know if that makes '%' itself faster, but
> it might.

INADA Naoki  <songofacandy at>

From dholth at  Wed Jan  8 14:52:34 2014
From: dholth at (Daniel Holth)
Date: Wed, 8 Jan 2014 08:52:34 -0500
Subject: [Python-ideas] Strong password hashing algorithms in the
	standard library
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 8, 2014 at 6:42 AM, Terry Chia <terrycwk1994 at> wrote:
> That's great!
> Are there any plans to also include algorithms like bcrypt and scrypt given
> that they are stronger than pbkdf2 for GPU/FPGA-using attackers?
> Also, can the same warning be placed on older documentations like the 2.7
> one given the large amount of people still using 2.7?

On some platforms crypt.crypt() can do bcrypt or the iterated SHA crypt
used in Red Hat etc.

From enric.tejedor at  Wed Jan  8 15:23:04 2014
From: enric.tejedor at (Enric Tejedor)
Date: Wed, 08 Jan 2014 15:23:04 +0100
Subject: [Python-ideas] Decorators on loops
In-Reply-To: <>
References: <>
Message-ID: <>

Thank you for your replies,

On 08/01/14 13:08, Masklinn wrote:
>> That's a nice theory, but the basic form of the decorator wouldn't
>> work. Here's how decorators work on functions:
>> @foo
>> def bar():
>>    pass
>> is the same as:
>> def bar():
>>    pass
>> bar = foo(bar)
>> It depends on there being something assigned-to. With loops, that's
>> not the case, so it's not possible to decorate them in the usual
>> sense.
>> Can you turn your loop into a map() call? Something like this:
>> def loop_body(i):
>>    # all the code for your loop body
>> list(map(loop_body, range(10)))
>> Once you have it in that form, you can use multiprocessing.Pool() and
>> its map() method, which will parallelize the loop for you (by
>> distributing it over a pool of subprocesses). Would that cover what
>> you need?
> Alternatively, wrap the loop in a function and then do AST munging in
> the decorator. Something similar (in spirit at least) to Numba
> (
> You could even do something like immediate function invocation in the
> decorator, and bind the result to the function name, although I'm not
> sure your coworkers will like you.

I would use this feature as a part of a parallel programming model for
Python apps.

Ideally, the programmer would place a decorator before their loops in
order to parallelize them, similarly to OpenMP and its pragmas.

Yes, I could make the programmer wrap the body of their loops in
functions and then decorate those functions:

# decorator of my PM library
def parallel(iterable):
    def call(func):
        # parallelize the iterations here, maybe with multiprocessing
        # and map for local execution, or another strategy for remote
        # execution
        pass
    return call

# user's code
@parallel(range(count))
def loop(i):
    # loop body
    pass
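
One way the placeholder could be filled in, purely as a sketch: this
version uses a thread pool from concurrent.futures as a stand-in for
whatever local or remote strategy the library would really use, and the
max_workers parameter is an assumption added for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel(iterable, max_workers=4):
    def call(func):
        # Run func once per item; map() keeps the results in input order.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(func, iterable))
    return call


@parallel(range(5))
def squares(i):
    return i * i


# The decorator ran the "loop" immediately, so `squares` is bound to the
# list of per-iteration results rather than to a function.
print(squares)
```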

But this solution requires programmers to modify the loops they want to
parallelize, and not simply place a decorator before them, like this:

@parallel
for i in range(count):
     # loop body

> In the case of function and class decorators, Python has an object that 
> can be passed to the decorator: the function or the class.  For a loop 
> decorator, how would you "have access to the loop body"?  It sounds like 
> it would have to be compiled differently, into a separate code object?
> --Ned.

Yes, perhaps when a loop had a decorator, the loop body could be
encapsulated and compiled as a function (similar to the "loop" function
I wrote), and that function object would be received by the decorator,
along with an iterable object that represents the iteration space. All
this would be hidden from the programmer, who would only decorate a
regular loop.




From rosuav at  Wed Jan  8 15:39:16 2014
From: rosuav at (Chris Angelico)
Date: Thu, 9 Jan 2014 01:39:16 +1100
Subject: [Python-ideas] Decorators on loops
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 9, 2014 at 1:23 AM, Enric Tejedor <enric.tejedor at> wrote:
> Yes, perhaps when a loop had a decorator, the loop body could be
> encapsulated and compiled as a function (similar to the "loop" function
> I wrote), and that function object would be received by the decorator,
> along with an iterable object that represents the iteration space. All
> this would be hidden from the programmer, who would only decorate a
> regular loop.

The biggest problem with that kind of magic is scoping. Look at this:

def func():
    best = 0
    for x in range(10):
        val = long_computation(x)
        if val > best: best = val
    return best

(Granted, this can be done with builtins, but let's keep the example simple.)

If the body of the loop becomes a new function, there needs to be a
nonlocal directive to make sure 'best' references the outer one:

def func():
    best = 0
    @parallelize(range(10))
    def body(x):
        nonlocal best
        val = long_computation(x)
        if val > best: best = val
    return best

This syntax would work, but it'll raise UnboundLocalError without the
nonlocal declaration. Since Python tags non-local variables (as
opposed to C-like languages, which tag local variables), there's no
easy way to just add another scope and have it function invisibly. Any
bit of magic that creates a local scope is going to cause problems in
any but the simplest cases. Far better to force people to be explicit
about it, and then the rules are clearer.
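
The scoping behaviour can be checked without any decorator at all. In this
sketch the two function names are illustrative and the loop body is reduced
to a comparison:

```python
def without_nonlocal():
    best = 0

    def body(x):
        # The assignment below makes `best` local to body(), so reading
        # it in the comparison fails before anything has been assigned.
        if x > best:
            best = x

    try:
        body(1)
    except UnboundLocalError:
        return "UnboundLocalError"
    return best


def with_nonlocal():
    best = 0

    def body(x):
        nonlocal best  # rebind the variable in the enclosing scope
        if x > best:
            best = x

    for x in range(10):
        body(x)
    return best


print(without_nonlocal())
print(with_nonlocal())
```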

Note that the parallelize decorator I use here would be a little
unusual, in that it has to actually call the function (and in fact
call it multiple times), and its return value is ignored. This would
work, but it might confuse people, so you'd want to name it something
that explains what's happening. It wouldn't be hard to write, in this
form, though - it'd basically just pass the iterable to

However, the example I give here wouldn't work (at least, I don't
think it would) with multiprocessing, because external variable scopes
would be duplicated, not shared, between processes. So once again,
you'd have to write your code with parallelization in mind, rather
than simply stick a decorator on a loop and have it fork out across


From alc at  Wed Jan  8 15:57:07 2014
From: alc at (=?UTF-8?Q?Alejandro_L=C3=B3pez_Correa?=)
Date: Wed, 8 Jan 2014 15:57:07 +0100
Subject: [Python-ideas] from __past__ import division, str, etc
Message-ID: <>


I'm new here. I am sorry if this idea has already been discussed, but I
have not found a way to search this list (I am not used to mailing lists at
all).

I've seen recently some discussion in reddit about python 2 vs python 3,
and the slow adoption of the latter. I am proposing here a pragmatic way to
speed up the process of porting old code and thus solving the split in the
community, which I believe is a serious threat. It is not clean, not at
all, but it might work: just give python 2 whiners what they [we] want, and
do it using "from __past__ import", in a similar way "from __future__
import" is used.

The advantage of this method is that porting old code would be trivial, and
each module could be rewritten at its own pace (for example, when a new
feature is required).

The 2to3 tool could be updated to perform as many safe changes as it
could (safe in the sense of 100% certainty of not breaking anything), and
import old features as needed. I am thinking both of language syntax, like
division behaviour, unicode, str, etc., and major library changes. Past
features may be added per request, and the 2to3 tool should allow users to
force the use of any of them, just in case. The whole process should be
almost automatic: the user might just run the tool at the root folder of
the code base, with any required command-line arguments to force some
features, and the tool would generate working python3 code.

There might be some issues regarding the interaction of new python3 code
with code that uses old features, maintaining a more complex code base, and
there might be other issues I am missing (like fundamental changes in
python 3 internal architecture that can't accommodate some older features),
but it might work and I think it could be useful to discuss this idea.

A potential non technical problem involves users abusing this mechanism to
write new code with old features, but I believe it is a minor risk if this
means the whole python community finally moves to python 3.

Hope this is useful.


From ncoghlan at  Wed Jan  8 16:05:38 2014
From: ncoghlan at (Nick Coghlan)
Date: Thu, 9 Jan 2014 01:05:38 +1000
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 January 2014 00:57, Alejandro López Correa <alc at> wrote:
> Hi,
> I'm new here. I am sorry if this idea has already been discussed, but I have
> not found a way to search this list (I am not used to mailing lists at all).
> I've seen recently some discussion in reddit about python 2 vs python 3, and
> the slow adoption of the latter. I am proposing here pragmatic way to speed
> up the process of porting old code and thus solving the split in the
> community, that I believe it is a serious threat. It is not clean, not at
> all, but it might work: just give python 2 whiners what they [we] want, and
> do it using "from __past__ import", in a similar way "from __future__
> import" is used.


You may want to read through
before lending too much weight to ill-informed Reddit commentary.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rosuav at  Wed Jan  8 16:17:44 2014
From: rosuav at (Chris Angelico)
Date: Thu, 9 Jan 2014 02:17:44 +1100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 9, 2014 at 1:57 AM, Alejandro López Correa <alc at> wrote:
> I am thinking both in language syntax like division behaviour, unicode, str,
> etc, and major library changes.

The point of the __future__ directive is to enable per-module changes,
which are applied at compile-time. The __future__ features spanning
the 2.x / 3.x gap are:

division (changes the meaning of an operator)
absolute_import (changes the way modules are searched for)
print_function (ditches some language magic in favour of a function)
unicode_literals (changes the meaning of unadorned quoted strings)
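
As a rough illustration (not part of the original discussion), the
standard __future__ module records when each of these directives was
first accepted and when its behaviour became the default. The 'features'
list below is just my own selection of the four names above:

```python
import __future__

# In a 2.7 module these would be enabled with, as the first statement:
#     from __future__ import division, absolute_import, print_function, unicode_literals
features = ["division", "absolute_import", "print_function", "unicode_literals"]

for name in features:
    feature = getattr(__future__, name)
    # getOptionalRelease(): first version that accepts the directive
    # getMandatoryRelease(): version where the behaviour became the default
    print(name, feature.getOptionalRelease()[:2], feature.getMandatoryRelease()[:2])

# Major version in which each feature stopped being optional
mandatory_majors = {
    name: getattr(__future__, name).getMandatoryRelease()[0] for name in features
}
```

Under Python 3 the directives are still accepted but have no effect,
which is why code written with them carries over unchanged.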

In theory, division and unicode_literals could probably be the targets
of a from __past__ directive, but there's little point. Change the
code now, use the directive, and then when you move to 3.x, the
directive does nothing. (The other two would be more of a problem - I
doubt the code to make print a statement exists in Py3, and the
complete rewrite of the import machinery would make old-style
importing dubious. In any case, you probably don't want old-style

The unicode and str (or str and bytes) types have been the subject of
some other discussions here on python-ideas, so I recommend reading up
on those threads; I won't try to reopen the discussion here. There've
been quite a few suggestions made, several of which could be quite
viable without even requiring interpreter changes.

Library changes are definitely not something you'd want a "from
__past__ import" statement for. That would be exceedingly messy.
However, there are a number of wrapper modules that let you bury the
2-vs-3 differences; instead of importing module X_name_1 or module
X_name_2, you simply import X from wrapper, and it'll automatically
give you the one you need. That at least covers the cases where the
APIs are the same and it's just the names that differ. When anything
more than that has changed, it wouldn't be possible to use a
per-module flag (as "from __future__ import" is) to change that
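
A minimal sketch of the wrapper idea, assuming the common case where
only the module name moved (urlparse is my own choice of example; real
wrapper modules cover many more names):

```python
# Hypothetical compat shim: resolve the 2-vs-3 name difference once, here,
# so the rest of the code base imports from a single place.
try:
    from urllib.parse import urlparse   # Python 3 location
except ImportError:
    from urlparse import urlparse       # Python 2 location

parts = urlparse("http://example.org/path?q=1")
print(parts.netloc, parts.path)
```

This only works because the API is identical in both locations; it does
nothing for modules whose behaviour changed.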

Once you feel the push to change interpreters and execute the code
under Python 3, it's best to make your code run properly under Py3,
rather than try to hold onto the past. Straddle the gap by continuing
to run a Py2 interpreter and progressively changing your code to use
__future__ print_function and division, and to get the text/bytes
distinction clear, and then the jump to Py3 will be way easier.


From ncoghlan at  Wed Jan  8 16:38:39 2014
From: ncoghlan at (Nick Coghlan)
Date: Thu, 9 Jan 2014 01:38:39 +1000
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 January 2014 01:05, Nick Coghlan <ncoghlan at> wrote:
> On 9 January 2014 00:57, Alejandro López Correa <alc at> wrote:
>> Hi,
>> I'm new here. I am sorry if this idea has already been discussed, but I have
>> not found a way to search this list (I am not used to mailing lists at all).
>> I've recently seen some discussion on reddit about python 2 vs python 3, and
>> the slow adoption of the latter. I am proposing here a pragmatic way to speed
>> up the process of porting old code and thus solving the split in the
>> community, which I believe is a serious threat. It is not clean, not at
>> all, but it might work: just give python 2 whiners what they [we] want, and
>> do it using "from __past__ import", in a similar way to how "from __future__
>> import" is used.
> Hi,
> You may want to read through
> before lending too much weight to ill-informed Reddit commentary.

My apologies, that was rather rude of me when you're offering to help
(I'm irritable at the moment since I've deemed it necessary to spend a
bunch of time over the past week updating my Python 3 Q & A rather
than enjoying my Christmas holidays, working on Python 3.4 or, this
week, enjoying 2014).

Anyway, the problems impacting wire protocol developers are known, but
it's been damnably difficult to get anything other than "I like Python
2 better" out of them when it comes to discussing possible *solutions*
(although even the descriptions of the problems have been useful in
guiding some changes over the course of the 3.x series). The primary
pain point for developers of binary protocol manipulation code is that
the Python 2 text model was *right* for boundary code that converts
binary data to text or structured data. However, it's wrong for
basically everything else, which is why we changed it for Python 3.
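
To make the difference concrete, here is a small sketch of the Python 3
behaviour at such a boundary (the request line is an invented example):

```python
raw = b"GET / HTTP/1.1"          # bytes as they arrive off the wire
text = raw.decode("ascii")       # explicit boundary: bytes -> str

try:
    raw + "\r\n"                 # Python 2 would have coerced implicitly
    mixed_ok = True
except TypeError:
    mixed_ok = False             # Python 3 refuses to mix bytes and str

print(text.split(), mixed_ok)
```

In Python 2 the concatenation would have succeeded silently for ASCII
data, which is convenient in boundary code and a source of latent bugs
elsewhere.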

The main challenge is thus getting people to stop asking the question
"How do we bring back the Python 2 text model" (which is never going
to happen - we changed the model for a reason), and instead ask "What
changes can be made to Python 3, such as introducing additional
purpose specific types, to make it a better language for wire protocol
development?". There's nothing actually *saying* "thou shalt only use
builtin types for manipulation of wire protocol data", but that's the
way all porting efforts have been carried out to date.

As part of addressing that, it's likely that certain kinds of Python 2
code will become easier to port to Python 3, but the bigger issue is
to actually try to improve wire protocol development in Python 3
rather than getting stuck on just recreating the deeply flawed Python
2 model.

I put some possible ideas for improvements at,
but what we really need at this point is some *experimentation* with
possible approaches (especially new types like asciiview and


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From brett at  Wed Jan  8 16:41:58 2014
From: brett at (Brett Cannon)
Date: Wed, 8 Jan 2014 10:41:58 -0500
Subject: [Python-ideas] The fools shall start sucking the cock.
In-Reply-To: <>
References: <>
Message-ID: <>

After others coming forward about previous behavior, this email is serving
as an official warning: one more infraction of the CoC and you will be
banned from this mailing list.

Please try to take this seriously and be respectful of others on this
mailing list.

On Tue, Jan 7, 2014 at 6:49 PM, Mark Janssen <dreamingforward at> wrote:

> Okay,  how's everyone doing with their Python 2 vs.3,  bytes/unicode
> vs. shit-extruder expertise?
> Anyone need some relief, perhaps some guidance?
> markj
> *kicks feet up to table*
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From stephen at  Wed Jan  8 16:46:14 2014
From: stephen at (Stephen J. Turnbull)
Date: Thu, 09 Jan 2014 00:46:14 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107185733.7ad1a3be@fsol>
 <> <20140107194752.304604a1@fsol>
 <> <20140107205936.7706c393@fsol>
Message-ID: <>

>>>>> INADA Naoki writes:
 > On Wed, Jan 8, 2014 at 7:34 PM, Antoine Pitrou <solipsis at> wrote:
 >> INADA Naoki <songofacandy at> wrote:

 > Some encodings don't ensure roundtrip.

In that case, in Python 2 you're depending on all "text" to be encoded
in the same encoding.  And even so you may be in trouble:

    def convert(x):
        if isinstance(x, unicode):
            x = x.encode(round_trip_not_guaranteed)

could cause your query to fail when it should succeed.  'x' is
user-supplied data, so you have no control over that.

 > I may be able to use ascii for decoding when mysql uses an
 > ascii-compatible encoding.

You can *always* use 'ascii', 'latin1', or 'utf-8' with
'surrogateescape' for decoding, and roundtrip is guaranteed.
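
A quick way to check this claim is to push every possible byte value
through each codec and back; 'blob' and 'results' are just names chosen
for this sketch:

```python
blob = bytes(range(256))   # every possible byte value, valid text or not

results = {}
for codec in ("ascii", "latin1", "utf-8"):
    # Undecodable bytes are smuggled as lone surrogates U+DC80..U+DCFF...
    smuggled = blob.decode(codec, errors="surrogateescape")
    # ...and the same handler turns them back into the original bytes.
    restored = smuggled.encode(codec, errors="surrogateescape")
    results[codec] = (restored == blob)

print(results)
```

For latin1 the handler is never actually invoked, since all 256 byte
values map directly to code points.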

 > But I think decode/encode with surrogateescape is not only slow,

Evidence?  Especially as compared with the connection overhead of the

 > but also dangerous when using encoding except ascii or utf8.

Or latin1.

But here's your code as translated to Python 3.3, assuming a
connection encoding of Shift JIS:

    # unchanged source, but this is Python 3 str == Unicode
    def escape_string(s):
        return s.replace("'", "''")

    def convert(x):
        if isinstance(x, str):                # Correct type unicode->str
            x = "'" + escape_string(x) + "'"
        elif isinstance(x, bytes):            # Correct type str->bytes
            # SAFE: ASCII is a Unicode subset, RT guaranteed.
            x = x.decode('ascii', errors='surrogateescape')
            x = "'" + escape_string(x) + "'"
            x = str(x)
        return x

    def build_query(query, *args):
        if isinstance(query, bytes):
            # want str for the format operator
            query = query.decode('sjis')
        query = query % tuple(map(convert, args))
        # CORRECT: for ASCII-compatible encodings, including Shift
        # JIS and Big 5, since the binary blob doesn't contain any
        # non-ASCII characters and the non-character bytes 128-255
        # will be restored properly by the error handler.
        return query.encode('sjis', errors='surrogateescape')

    textdata = b"hello"            # or "hello"
    bindata = b"abc\xff\x00"
    query = "UPDATE table SET textcol=%s, bincol=%s"

    print(build_query(query, textdata, bindata))

The only problem with correctness will occur if the MySQL connection
uses a non-ASCII-compatible encoding (UTF-16, fixed-width EUC) in the
query string, because the ASCII bytes in the blob will be "widened" by

Widechar encodings could actually be handled with a "binary" codec
that recognizes *no* characters and always surrogate-encodes every
byte.  But that's pretty obviously going to be unacceptable.

I guess bytes.format() is pretty well unstoppable at this point.

From brett at  Wed Jan  8 16:49:18 2014
From: brett at (Brett Cannon)
Date: Wed, 8 Jan 2014 10:49:18 -0500
Subject: [Python-ideas] [OT] banning Mark Janssen
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 7, 2014 at 7:56 PM, Ethan Furman <ethan at> wrote:

> Moderators,
> Mark Janssen's posts are becoming extremely abusive, which seems to me to
> be against the code of conduct.
> Can we ban him, at least from the mailing lists?

I actually issued a warning last night but since I accidentally sent it
from my personal address it got bounced; just sent it from the proper

I have publicly stated people get one warning before getting banned so I
don't want to circumvent that practice. If the CoC is broken again feel
free to point out where it happened and the appropriate action will be

From alc at  Wed Jan  8 17:22:02 2014
From: alc at (=?UTF-8?Q?Alejandro_L=C3=B3pez_Correa?=)
Date: Wed, 8 Jan 2014 17:22:02 +0100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

Answering both Chris Angelico and you, I think I have not made my point
clear. I am not really complaining about python 3. My opinion about the
changes (at least those of which I am aware) is that they are sane.
Unfortunately, many people are reluctant to change, and from what I've read
that is actually a problem (not that I have actual data). I think large
python 2 code bases won't change unless the benefits are larger than the
costs. In the costs we have to count the available developer time, for
example, and in many cases that means [a lot of] money. The idea is to
offer a solution to those programmers so they can trivially port their code
base, write new code in python 3 and rewrite old python 2 code as soon as

I am not suggesting offering back the whole python 2.7 by any means. Many
changes can be safely performed by the 2to3 tool, probably. My suggestion
is to offer a convenient way to bring everybody into python 3.

> My apologies, that was rather rude of me when you're offering to help
No worries.

> The main challenge is thus getting people to stop asking the question
> "How do we bring back the Python 2 text model" (which is never going
> to happen - we changed the model for a reason), and instead ask "What
> changes can be made to Python 3, such as introducing additional
> purpose specific types, to make it a better language for wire protocol
> development?". There's nothing actually *saying* "thou shalt only use
> builtin types for manipulation of wire protocol data", but that's the
> way all porting efforts have been carried out to date.
Enabling old functionality when required is not the same as bringing back
python 2, since python 3 is there by default and python 2 code won't work
by default. It means just providing a good way to make old code work. The
key part is the translation from 2 to 3. This does not mean that the code
has to run unchanged but that the translation may be performed
automatically, at least in 99.9% of cases. In practice this could involve a
mixture of changes to python 3 itself to support the 2to3 tool, and
improvements to the tool.

With a 2to3 tool that covers 99.99% of the cases, we could even have .py2
modules that would be translated transparently to .py when first used, in
the same way compilation works, raising an exception in case something goes

Anyway, I understand it is not a clean way to proceed, but something along
these lines might be the only way to speed up the adoption of python 3, and
minimise the risk of defection to other languages.


From ncoghlan at  Wed Jan  8 17:46:22 2014
From: ncoghlan at (Nick Coghlan)
Date: Thu, 9 Jan 2014 02:46:22 +1000
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 January 2014 02:22, Alejandro López Correa <alc at> wrote:
> Answering both Chris Angelico and you, I think I have not made my point
> clear. I am not really complaining about python 3. My opinion about the
> changes (at least those of which I am aware) is that they are sane.
> Unfortunately, many people are reluctant to change, and from what I've read
> that is actually a problem (not that I have actual data). I think large
> python 2 code bases won't change unless the benefits are larger than the
> costs.

This is mostly a communications problem on our part. I certainly
thought "5 years for Python 3 to be the default choice for new
projects" was fairly straightforward to interpret, but some overly
optimistic folks with less experience of corporate adoption rates
managed to misinterpret that as something more like "5 years until
more people are writing Python 3 code than Python 2 code". If the
latter was the goal, then we'd have a crisis, but it was never the
goal - Python 2 has a massive installed base, and it's going to take a
long time for new Python 3 projects and Python 2 to Python 3
migrations to overtake that.

> In the costs we have to count the available developer time, for
> example, and in many cases that means [a lot of] money. The idea is to offer
> a solution to those programmers so they can trivially port their code base,
> write new code in python 3 and rewrite old python 2 code as soon as
> possible.
> I am not suggesting offering back the whole python 2.7 by any means. Many
> changes can be safely performed by the 2to3 tool, probably. My suggestion is
> to offer a convenient way to bring everybody into python 3.

This is also largely an education problem. A couple of projects have
legitimate gripes about binary protocol handling in Python 3, and
since they have no pressing interest in migrating (and thus little
motivation to build the missing pieces of infrastructure themselves),
their response has been to tell the core team "we're not migrating
until *you* provide a suitable replacement for this particular Python
2 feature". It's a reasonable request, but hasn't been at the top of
the core team's priority list up to this point (that's now likely to
change for Python 3.5).

>> My apologies, that was rather rude of me when you're offering to help
> No worries.
>> The main challenge is thus getting people to stop asking the question
>> "How do we bring back the Python 2 text model" (which is never going
>> to happen - we changed the model for a reason), and instead ask "What
>> changes can be made to Python 3, such as introducing additional
>> purpose specific types, to make it a better language for wire protocol
>> development?". There's nothing actually *saying* "thou shalt only use
>> builtin types for manipulation of wire protocol data", but that's the
>> way all porting efforts have been carried out to date.
> Enabling old functionality when required is not the same as bringing back
> python 2, since python 3 is there by default and python 2 code won't work by
> default. It means just providing a good way to make old code work. The key
> part is the translation from 2 to 3. This does not mean that the code has to
> run unchanged but that the translation may be performed automatically, at
> least in 99.9% of cases. In practice this could involve a mixture of changes
> to python 3 itself to support the 2to3 tool, and improvements to the tool.

There's very little actually *missing* from Python 3 now, though, and
it's far from clear that the key remaining missing piece (a type for
manipulating ASCII compatible binary protocol data) can't be provided
as a library on PyPI.

> With a 2to3 tool that covers 99.99% of the cases, we could even have .py2
> modules that would be translated transparently to .py when first used, in
> the same way compilation works, raising an exception in case something goes
> wrong.

I'm pretty sure someone already wrote one of those - they're a
problem, because they mean the tracebacks for runtime exceptions don't
match the source code (that's one of the major reasons single-source
approaches came to dominate as the preferred migration mechanism for
libraries and frameworks, leaving 2to3 as an option mainly considered
by applications that can abandon Python 2 support when migrating to
Python 3).

> Anyway, I understand it is not a clean way to proceed, but something along
> these lines might be the only way to speed up the adoption of python 3, and
> minimise the risk of defection to other languages.

We're largely happy with the rate of adoption though - there were just
some folks that didn't grasp the kinds of time scales we're talking
about for a migration of this magnitude.

See this for more details:


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From barry at  Wed Jan  8 18:17:06 2014
From: barry at (Barry Warsaw)
Date: Wed, 8 Jan 2014 12:17:06 -0500
Subject: [Python-ideas] [OT] banning Mark Janssen
References: <>
Message-ID: <>

On Jan 08, 2014, at 01:20 PM, Steven D'Aprano wrote:

>(I would also like to preemptively state that I object in the strongest 
>possible terms to a blanket "no swearing" policy, just in case anyone is 
>thinking of introducing such a thing.)

"Swear" words in and of themselves don't violate the CoC.  It's how they're
used that matters.  (i.e. context is everything)


From stephen at  Wed Jan  8 18:21:12 2014
From: stephen at (Stephen J. Turnbull)
Date: Thu, 09 Jan 2014 02:21:12 +0900
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

Nick Coghlan writes:

 > Anyway, the problems impacting wire protocol developers are known,
 > but it's been damnably difficult to get anything other than "I like
 > Python 2 better" out of them when it comes to discussing possible
 > *solutions*

Good to know you feel that way too, I thought I just missed a lot of
important discussions. :-(

 > The main challenge is thus getting people to stop asking the question
 > "How do we bring back the Python 2 text model" (which is never going
 > to happen - we changed the model for a reason), and instead ask "What
 > changes can be made to Python 3, such as introducing additional
 > purpose specific types, to make it a better language for wire protocol
 > development?".

After spending enough time on Inada-san's use-case to find a real
problem with treating wire protocols as text, I've come to the
conclusion that those really are the same question, though.  Add even
a little bit of binary handling to a database connection, and even
though almost everything is actually just ASCII, the few things that
aren't blow everything up and you want everything to be bytes.

At that point you end up really wanting bytes to have pretty much
everything str does except maybe unidata lookups!
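
For a sense of where things stand, here is a small sketch of what bytes
already shares with str and one thing it lacks (the header line is an
invented example, and this reflects the Python 3 releases I'm aware of):

```python
line = b"Content-Type: text/html"

# Many str methods already have bytes equivalents...
name, _, value = line.partition(b": ")
print(name.lower(), value.upper(), line.startswith(b"Content"))

# ...but not all of them: there is no bytes.format()
has_format = hasattr(bytes, "format")
print(has_format)
```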

From enric.tejedor at  Wed Jan  8 18:47:45 2014
From: enric.tejedor at (Enric Tejedor)
Date: Wed, 08 Jan 2014 18:47:45 +0100
Subject: [Python-ideas] Decorators on loops
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>


> The biggest problem with that kind of magic is scoping. Look at this:
> def func():
>     best = 0
>     for x in range(10):
>         val = long_computation(x)
>         if val > best: best = val
>     return best
> (Granted, this can be done with builtins, but let's keep the example simple.)
> If the body of the loop becomes a new function, there needs to be a
> nonlocal directive to make sure 'best' references the outer one:
> def func():
>     best = 0
>     @parallelize(range(10))
>     def body(x):
>         nonlocal best
>         val = long_computation(x)
>         if val > best: best = val
>     return best
> This syntax would work, but it'll raise UnboundLocalError without the
> nonlocal declaration. Since Python tags non-local variables (as
> opposed to C-like languages, which tag local variables), there's no
> easy way to just add another scope and have it function invisibly. Any
> bit of magic that creates a local scope is going to cause problems in
> any but the simplest cases. Far better to force people to be explicit
> about it, and then the rules are clearer.
> Note that the parallelize decorator I use here would be a little
> unusual, in that it has to actually call the function (and in fact
> call it multiple times), and its return value is ignored. This would
> work, but it might confuse people, so you'd want to name it something
> that explains what's happening. It wouldn't be hard to write, in this
> form, though - it'd basically just pass the iterable to
> multiprocessing.Pool().map().
> However, the example I give here wouldn't work (at least, I don't
> think it would) with multiprocessing, because external variable scopes
> would be duplicated, not shared, between processes. So once again,
> you'd have to write your code with parallelization in mind, rather
> than simply stick a decorator on a loop and have it fork out across
> processes.
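
The scoping point can be reproduced without any parallelism. In this
sketch, run_for_each is a hypothetical sequential stand-in for the
parallelize decorator described above:

```python
def run_for_each(iterable):
    """Hypothetical stand-in for parallelize: calls the body sequentially."""
    def decorator(body):
        for item in iterable:
            body(item)
        # Like the decorator described above, the return value is ignored.
    return decorator

def without_nonlocal():
    best = 0
    try:
        @run_for_each(range(10))
        def body(x):
            # The assignment below makes 'best' local to body, so this
            # comparison reads an unbound local on the first call.
            if x > best:
                best = x
    except UnboundLocalError:
        return "UnboundLocalError"
    return best

def with_nonlocal():
    best = 0
    @run_for_each(range(10))
    def body(x):
        nonlocal best          # explicitly refer to the enclosing 'best'
        if x > best:
            best = x
    return best

print(without_nonlocal(), with_nonlocal())
```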

Correct, this is indeed a problem. It would be tricky to make this work
in the general case.

In a simpler scenario, we could assume that iterations won't update the
same data.
On the other hand, to prevent the UnboundLocalError, the variables
needed inside the loop could be passed to the decorator and appear in
the loop function's signature.

results = [0] * 10

@parallel(range(10), results)
def loop(i, results):
    results[i] = some_computation(i)

Then the decorator would be:

def parallel(*args):
    iterable = args[0]
    params = args[1:]
    def call(func):
        # create parallel invocations of func with iterable and params
        pass  # placeholder, scheduling logic would go here
    return call

I think this solution would work if you wanted to do things like
performing independent updates on a list.
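
One possible way to fill in the decorator, as a sketch only: this version
uses threads (concurrent.futures.ThreadPoolExecutor) rather than
processes, precisely so that the results list is shared. With
multiprocessing each worker would get its own copy. some_computation is
a placeholder workload.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel(iterable, *params):
    """Sketch: run func(item, *params) for each item on a thread pool."""
    def call(func):
        with ThreadPoolExecutor() as pool:
            # list() waits for completion and re-raises any worker exception
            list(pool.map(lambda item: func(item, *params), iterable))
        return func
    return call

def some_computation(i):
    return i * i               # placeholder for the real work

results = [0] * 10

@parallel(range(10), results)
def loop(i, results):
    results[i] = some_computation(i)   # each iteration writes its own slot

print(results)
```

Because every iteration touches a different index, no locking is needed
here; that assumption breaks as soon as iterations update shared state.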

Anyway now I see more clearly the implications of such a construct for
loops. Thanks again for your feedback,


> ChrisA

From stephen at  Wed Jan  8 18:47:41 2014
From: stephen at (Stephen J. Turnbull)
Date: Thu, 09 Jan 2014 02:47:41 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

Andrew Barnert writes:

 > >    a.  If the 8-bit str contains any Latin-1 or C1 characters, both
 > >        strs are promoted to 16-bit, and non-ASCII characters in the
 > >        7-bit string are converted by the surrogateescape handler.
 > This part worries me a bit. The bytes 61 62 63 FF in this new
 > representation actually _mean_ 'abc' followed by a smuggled FF
 > byte.

No, it doesn't.  It means 'abc' followed by something that cannot be
encoded by any codec without the surrogateescape handler.
'ascii-compatible' merely defaults to that handler.  I wouldn't
actually be too upset if I were told, no, you have to specify

 > > 6.  On output the 'ascii-compatible' codec simply memcpy's 7-bit str
 > >    and pure ASCII 8-bit str, and raises on anything else.
 > So if a 7-bit string gets converted to a surrogate-escaped 16-bit
 > string, it can never be written out again?

Of course it can.  Use .encode('ascii', errors='surrogateescape')

 > (b'abc\xff'.decode('ascii-compatible') + '\u1234')[:4].encode('ascii-compatible')
 > I'd expect to get back my b'abc\xff'. But your rules give me an
 > exception.

Yes.  This whole proposal was aimed at wire protocols.  It's very bad
if something intended to be ready to be squirted into the wire needs
(expensive) encoding.
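
For comparison, this is how the slicing example behaves today with the
existing 'ascii' codec and an explicit error handler, standing in for
the proposed 'ascii-compatible' codec (which does not exist):

```python
s = b"abc\xff".decode("ascii", errors="surrogateescape") + "\u1234"
head = s[:4]                       # 'abc' plus the escaped byte, U+DCFF

# With the handler, the smuggled byte is written back out unchanged.
wire = head.encode("ascii", errors="surrogateescape")
print(wire)

try:
    head.encode("ascii")           # strict: the lone surrogate is rejected
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```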

 > I think ascii-compatible has to accept non-8-bit-repr strings (by
 > encoding ASCII as ASCII and surrogate escapes as bytes and
 > everything else is an exception). This is necessary because 61 62
 > 63 FF (7-bit) and 0061 0062 0063 DCFF (16-bit) are the same string
 > anyway. But it's especially necessary because the former can be
 > silently converted into the latter (and there's no way to even test
 > whether that's happened).

Well, one way around that would be to require that the latter not
exist (convert it to "7-bit" during construction).

But I've come to the conclusion that this is all too irregular and
confusing.  I'm pretty sure that I can come up with a set of rules
that are not inherently self-contradictory, but I'm also pretty sure
that the resulting type will behave unintuitively for almost
everybody.  Also, despite my original thought, it's really hard to see
how unnecessary encode/decode cycles can be eliminated.  So I think I
need to go back to the drawing board.

So I hope I haven't wasted too much of your time; it's been very
educational for me.

From abarnert at  Wed Jan  8 18:57:10 2014
From: abarnert at (Andrew Barnert)
Date: Wed, 8 Jan 2014 09:57:10 -0800
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
	new bytestring type?]
In-Reply-To: <laj8l2$5s2$>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
 <> <>
 <lahvla$ev8$> <>
Message-ID: <>

On Jan 8, 2014, at 2:18, Mark Lawrence <breamoreboy at> wrote:

> On 08/01/2014 09:59, Nick Coghlan wrote:
>> Now that your proposal has been better explained, yes, I agree that
>> "asciibytes" and "asciistr" types would be well worth experimenting
>> with. I mention both, since it's far from clear if a str subclass or a
>> bytes subclass (or neither, although that may require bug fixes in
>> CPython) would be more convenient for this use case.
> Could you subclass both to get the best of both worlds?  As in
> class asciixyz(str, bytes):

You can't. (Try it.) More importantly, how would that work?
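
For anyone who does want to try it, a short sketch of what happens under
CPython (the exact message may differ on other implementations):

```python
try:
    class asciixyz(str, bytes):
        pass
    error = None
except TypeError as exc:
    # CPython rejects the class because str and bytes have
    # incompatible instance layouts.
    error = exc

print(type(error).__name__, error)
```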

You'd have the implementation of str (effectively a tagged union of char8/char16/char32 arrays) plus the separate implementation of bytes (effectively a char8 array). Do you leave the first one empty? And then avoid super() and instead explicitly delegate only to the bytes base?

That could work (at the relatively minimal cost of an extra empty '' worth of storage) as long as you don't run into any code that tries to use the internal details of the str. But unfortunately, most builtins and extension module functions _do_ try to use the internal details of the str. 

In CPython, for example, a function that takes a string usually does so by parsing the argument as, say, a u#, which gives you the character array from a str directly. Even functions that take str objects will usually at some point call string-protocol functions to get at their array.

The simple way around this is to make all such functions effectively call __str__ on any object that isn't a real str. But that would make almost _everything_ usable as a string--f.write(2) would now work. So you'd really need to create a new dunder method (and C API slot) __asstr__ that's only implemented by objects that really want to act like a str, not just have a str representation. Also, I'm not sure all such functions have a reasonable way to refcount the resulting str object properly. 
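
To show the current behaviour that an implicit __str__ call would
change, here is a minimal sketch using io.StringIO as a stand-in for a
text file:

```python
import io

f = io.StringIO()
try:
    f.write(2)                 # not a str: rejected rather than converted
    accepted = True
except TypeError:
    accepted = False

f.write(str(2))                # the conversion has to be explicit today
print(accepted, repr(f.getvalue()))
```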

The alternative would be to expose the entire string protocol into Python--including, most importantly, the methods to get at the array directly. I'm not sure how you'd even design the API for those methods in Python. We don't even expose the buffer protocol to Python today.

I didn't go into all this detail to try to prove that the idea is impossible, but rather in hopes that someone would have an answer that makes everything work. Making string-protocol strings more "pluggable" might have other benefits besides the "encodedstr" type. Imagine being able to build an explicitly UTF-16 type to make it faster and easier to deal with Win32 or Java or other such things. (Or could you just use encodedstr('utf-16-le') for that?) Or expose a "rope"-like type for large mutable strings. Or experiment with alternatives to the 3.3-style internal storage, like Stephen's ASCII-compatible byte-smuggling flag, by faking them in Python instead of building them in C. (That would probably be sufficient to find any holes in the specification, even if it wouldn't be very helpful for perf testing.)

From breamoreboy at  Wed Jan  8 19:11:10 2014
From: breamoreboy at (Mark Lawrence)
Date: Wed, 08 Jan 2014 18:11:10 +0000
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
 <> <>
 <lahvla$ev8$> <>
 <laj8l2$5s2$> <>
Message-ID: <lak4bs$6hp$>

On 08/01/2014 17:57, Andrew Barnert wrote:
> On Jan 8, 2014, at 2:18, Mark Lawrence <breamoreboy at> wrote:
>> On 08/01/2014 09:59, Nick Coghlan wrote:
>>> Now that your proposal has been better explained, yes, I agree that
>>> "asciibytes" and "asciistr" types would be well worth experimenting
>>> with. I mention both, since it's far from clear if a str subclass or a
>>> bytes subclass (or neither, although that may require bug fixes in
>>> CPython) would be more convenient for this use case.
>> Could you subclass both to get the best of both worlds?  As in
>> class asciixyz(str, bytes):
> You can't. (Try it.) More importantly, how would that work?

I haven't the faintest idea :)

> but rather in hopes that someone would have an answer that makes everything work.

The reason I threw this in in the first place.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From flying-sheep at  Wed Jan  8 19:35:34 2014
From: flying-sheep at (Philipp A.)
Date: Wed, 8 Jan 2014 19:35:34 +0100
Subject: [Python-ideas] [OT] banning Mark Janssen
In-Reply-To: <>
References: <> <20140108022018.GN29356@ando>
Message-ID: <>

2014/1/8 Barry Warsaw <barry at>

> "Swear" words in and of themselves don't violate the CoC.  It's how they're
> used that matters.  (i.e. context is everything)
> -Barry

i'd say the amount and kind of all words is irrelevant as long as meaning
is conveyed.

spam, personal insults, etc. are not ok, no matter what words they are
composed of.

From g.brandl at  Wed Jan  8 19:41:04 2014
From: g.brandl at (Georg Brandl)
Date: Wed, 08 Jan 2014 19:41:04 +0100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <lak62a$mm5$>

Am 08.01.2014 16:38, schrieb Nick Coghlan:
> On 9 January 2014 01:05, Nick Coghlan <ncoghlan at> wrote:
>> On 9 January 2014 00:57, Alejandro López Correa <alc at> wrote:
>>> Hi,
>>> I'm new here. I am sorry if this idea has already been discussed, but I have
>>> not found a way to search this list (I am not used to mailing lists at all).
>>> I've seen recently some discussion in reddit about python 2 vs python 3, and
>>> the slow adoption of the latter. I am proposing here a pragmatic way to speed
>>> up the process of porting old code and thus solving the split in the
>>> community, which I believe is a serious threat. It is not clean, not at
>>> all, but it might work: just give python 2 whiners what they [we] want, and
>>> do it using "from __past__ import", in a similar way "from __future__
>>> import" is used.
>> Hi,
>> You may want to read through
>> before lending too much weight to ill-informed Reddit commentary.
> My apologies, that was rather rude of me when you're offering to help
> (I'm irritable at the moment since I've deemed it necessary to spend a
> bunch of time over the past week updating my Python 3 Q & A rather
> than enjoying my Christmas holidays, working on Python 3.4 or, this
> week, enjoying 2014).

Please know that we all love you a bit more for that :)


From masklinn at  Wed Jan  8 19:53:11 2014
From: masklinn at (Masklinn)
Date: Wed, 8 Jan 2014 19:53:11 +0100
Subject: [Python-ideas] Decorators on loops
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 2014-01-08, at 18:47 , Enric Tejedor <enric.tejedor at> wrote:
> Correct, this is indeed a problem. It would be tricky to make this work
> in the general case.
> In a simpler scenario, we could assume that iterations won't update the
> same data.
> On the other hand, to prevent the UnboundLocalError, the variables
> needed inside the loop could be passed to the decorator and appear in
> the loop function's signature.
> results = [0] * 10
> @parallel(range(10), results)
> def loop(i, results):
>       results[i] = some_computation(i)

At this point you don't really need a decorator anymore, this is an
odd-ish way to write `results = map(some_computation, range(10))`, and
as others have noted the standard library already has a parallelized
version thereof:
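
The link that followed here did not survive archiving. As a hedged illustration only, the kind of parallel map meant is available in concurrent.futures (multiprocessing.Pool.map is similar); which of these the original message pointed at is an assumption. some_computation is a placeholder standing in for the function in the quoted example.

```python
from concurrent.futures import ThreadPoolExecutor


def some_computation(i):
    # Placeholder for the real per-item work from the quoted example.
    return i * i


def parallel_results(n=10):
    # Executor.map yields results in input order, like the builtin map().
    with ThreadPoolExecutor() as executor:
        return list(executor.map(some_computation, range(n)))


if __name__ == "__main__":
    print(parallel_results())
```

For CPU-bound work, ProcessPoolExecutor exposes the same map() method, provided the function and its arguments can be pickled.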

> Then the decorator would be:
> def parallel(*args):
>   iterable = args[0]
>   params = args[1:]
>   def call(func):
>       # create parallel invocations of func with iterable and params
>   return call
> I think this solution would work if you wanted to do things like
> performing independent updates on a list.
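
A rough sketch of how the quoted parallel decorator could be filled in, under stated assumptions: a thread pool is used for the "parallel invocations", the loop body runs once at decoration time, and some_computation is a placeholder. None of this comes from the original proposal beyond the signature.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel(*args):
    iterable = args[0]
    params = args[1:]

    def call(func):
        # Submit one invocation of func per item; params are passed along.
        with ThreadPoolExecutor() as executor:
            futures = [executor.submit(func, item, *params) for item in iterable]
            for future in futures:
                future.result()  # re-raise any exception from a worker
        return func

    return call


def some_computation(i):
    # Placeholder for the real per-iteration work.
    return i * 2


results = [0] * 10


@parallel(range(10), results)
def loop(i, results):
    results[i] = some_computation(i)
```

This only behaves sensibly because each iteration writes to a distinct index; iterations that touch shared state would need locking or a different design.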

The problem with a loop is that its semantics are too generic to make
anything even remotely close to such an assumption.

From at  Wed Jan  8 20:05:39 2014
From: at (Haoyi Li)
Date: Wed, 8 Jan 2014 11:05:39 -0800
Subject: [Python-ideas] Decorators on loops
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

> The problem of a loop being that its semantics are too generic to make
> anything even remotely close to such an assumption.

I think it's the opposite problem, really: the semantics (repeatedly calling
.next() on iter(...)) is too specific, and is incompatible with what you want.
What you want is a generic map() function which encodes the semantics
(independent updates) that you want, and map() is trivially parallelizable.
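
To make the "too specific" point concrete, here is a small sketch of what a for loop does under the iterator protocol. The helper name for_loop is invented for illustration, and next(iterator) is the Python 3 spelling of the .next() call mentioned above.

```python
def for_loop(iterable, body):
    """Hand-written equivalent of `for item in iterable: body(item)`."""
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)  # one item at a time, strictly in order
        except StopIteration:
            break
        body(item)


seen = []
for_loop(range(3), seen.append)
```

Each step depends on the previous next() call having completed, so the protocol itself is sequential; map() makes no such promise about the order in which the function is applied.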


From steve at  Wed Jan  8 23:01:05 2014
From: steve at (Steven D'Aprano)
Date: Thu, 9 Jan 2014 09:01:05 +1100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <20140108220104.GS29356@ando>

On Wed, Jan 08, 2014 at 05:22:02PM +0100, Alejandro López Correa wrote:
> Anyway, I understand it is not a clean way to proceed, but something along
> these lines might be the only way to speed up the adoption of python 3

One assumption in this discussion, and the various related discussions 
on Reddit and other places, is that adoption of Python 3 is too slow and 
needs to be sped up. I don't believe this is true. I believe adoption 
is just right and exactly what should be expected.

Alex Gaynor wrote a blog post a week or so ago claiming that, five years 
since Python 3 was first released, everyone should have migrated by now 
and that since only "five percent" (a figure which I believe he pulled 
out of thin air) have migrated, Python 3 has been a failure.

I challenge that belief. I've been hanging around here and on the 
Python-Dev list for a long time, and while I can't find any official 
pronouncement, the sense has always been that Python 3 adoption will 
take ten years, not five. (That's my recollection -- if any of the core 
developers wish to correct me, please do.) Rates of adoption are much, 
much higher than gossip on the Internet suggests. About 70% of the top 
200 projects on PyPI support Python 3, and downloads of Python 3 are 
very healthy, possibly even higher than downloads of Python 2. On the 
tutor list, I see a significant number of beginners using Python 3.

It seems to me that given the circumstances, Python 3 adoption is right 
where we should expect it to be half-way through a decade-long process. 
There will be a period at the start when hardly anyone will migrate, 
then a period of accelerating migration, which will accelerate further 
when the mainstream Linux distros start shipping Python 3 as their 
system Python (ArchLinux is not mainstream, but Fedora is planning the 
change), followed by a sudden rush in another four or five years when 
people realise that Python 2.7 becoming unmaintained is no longer a 
distant prospect but is about to happen.

For many people, waiting until the last minute is the most sensible 
thing that they can do. This gives time for the early adopters to 
discover and iron out all the wrinkles and difficulties. Rather than 
approaching this as "Python 3 has been a failure, what can we do to save 
it?" we should be approaching this as "Python 3 has been a success, what 
lessons can we take from the early adopters to make it even easier for 
the next wave of adopters?"

"from __past__ import spam" does not make it easier to adopt. It just 
makes it easier to *put off adopting*.
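
For context, the existing mechanism that the proposed "from __past__ import" would mirror can be inspected at runtime. This is only a sketch of the real __future__ module; no __past__ module exists, and the helper name division_feature is made up for illustration.

```python
import __future__


def division_feature():
    """Return (optional_release, mandatory_release) for true division."""
    feature = __future__.division
    # Each feature records when it became available as an opt-in and
    # when it became (or will become) the default behaviour.
    return feature.getOptionalRelease(), feature.getMandatoryRelease()


if __name__ == "__main__":
    optional, mandatory = division_feature()
    print("opt-in since:", optional)
    print("default since:", mandatory)
```

A __future__ import opts a module in to behaviour that later becomes the default, whereas the __past__ idea would opt a module out of the current default indefinitely.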

> and minimise the risk of defection to other languages.

People threaten that, but it is an irrational threat. (Mind you, people 
do silly, irrational things every day.) If you think it's hard to migrate 
from Python 2 to 3, when you get to keep 90% of your code base and most 
of the backward-incompatible changes are a few libraries that have been 
renamed and a handful of syntax changes, how hard will it be to throw 
away 100% of your code and start again with a completely different 
language?

From alc at  Thu Jan  9 00:14:21 2014
From: alc at (=?UTF-8?Q?Alejandro_L=C3=B3pez_Correa?=)
Date: Thu, 9 Jan 2014 00:14:21 +0100
Subject: [Python-ideas] from __past__ import division, str, etc
Message-ID: <>

I am posting this again. I am new to mailing lists and have realised
that I sent it only to Nick Coghlan four hours ago.

2014/1/8 Nick Coghlan <ncoghlan at>

> This is mostly a communications problem on our part. I certainly
> thought "5 years for Python 3 to be the default choice for new
> projects" was fairly straightforward to interpret
> [...]
> We're largely happy with the rate of adoption though - there were just
> some folks that didn't grasp the kinds of time scales we're talking
> about for a migration of this magnitude.
> See this for more details:

Ok, thanks. I see this has been a recurring topic and a lot of care
has been given. I am looking at a different issue, though. I am
thinking of existing projects, python 2 code in use, not even third
party libraries but end products. I fear that if existing projects
remain on python 2 for too long because the resources spent on the
migration are not worth the return, then when the time
finally comes for the upgrade another language might be chosen. At
that point the divergence between that ancient py2 code and the latest
py3 version would probably be greater than now, and other languages
may offer features like GIL-less multithreading. If this happens on a
large scale the whole python community may shrink and lose momentum. I
do not know whether this is a real risk or not, and it is really up to
you to assess it.

I think a convenient way to run old python 2 modules along with new
python 3 ones may be a good idea despite the cost. Even embedding a
complete python 2.7 interpreter that executes .py2 modules and somehow
shares the state with the main python 3 environment. I don't know
whether that "monstrosity" is feasible without changing python 3 core
too much, but it should help many people and nullify the risk of
losing them. A large community is desirable in this context.

> > With a 2to3 tool that covers 99.99% of the cases, we could even have .py2
> > modules that would be translated transparently to .py when first used
> I'm pretty sure someone already wrote one of those - they're a
> problem, because they mean the tracebacks for runtime exceptions don't
> match the source code

The idea is that exceptions that end up showing tracebacks should be,
uhmm, exceptional (the tool should work 99.9% of the time and we are
talking about working py2 code). When something happens, the problem
of a different source in the traceback could be handled by the
translation tool by adding annotations (even comments).


From rosuav at  Thu Jan  9 00:38:15 2014
From: rosuav at (Chris Angelico)
Date: Thu, 9 Jan 2014 10:38:15 +1100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 9, 2014 at 10:14 AM, Alejandro López Correa <alc at> wrote:
>> I'm pretty sure someone already wrote one of those - they're a
>> problem, because they mean the tracebacks for runtime exceptions don't
>> match the source code
> The idea is that exceptions that end up showing tracebacks should be,
> uhmm, exceptional (the tool should work 99.9% of the time and we are
> talking about working py2 code). When something happens, the problem
> of a different source in the traceback could be handled by the
> translation tool by adding annotations (even comments).

That's a sort-of-viable option (C preprocessors have used #line
directives for decades), but not really ideal. For it to work with
current Python, it would have to actually _be_  comments, so every
line would have to have something appended: # "" 213

How would that behave on arbitrary code? What if there's backslash
continuation? Will people know to go looking elsewhere?

Exceptions DO happen. And when they do, the language should try to
make it easy to figure out what's going on. I'm not sure how well that
would be served by this, especially given that it's not supposed to be
a normal workflow. If you build a new language that uses Python as its
back-end, then manipulating the source code WOULD be the normal
workflow, and in that case I'd wholeheartedly support editing the
recorded line numbers (I think you can do that with AST
manipulation??) so tracebacks show the original file and line. But
this shouldn't be that normal.
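
A minimal sketch of the AST route mentioned above, assuming the translation tool knows the original file name and a constant line offset (a real translator would need a per-line mapping). The names compile_translated, traceback_location and "legacy_module.py2" are hypothetical; ast.parse, ast.increment_lineno and compile are the standard-library calls doing the work.

```python
import ast
import traceback


def compile_translated(source, original_filename, line_offset=0):
    """Compile translated source so tracebacks cite the original file."""
    tree = ast.parse(source, filename=original_filename)
    # Shift every recorded line number by a fixed amount (an assumption:
    # real translations rarely move every line by the same distance).
    ast.increment_lineno(tree, line_offset)
    return compile(tree, original_filename, "exec")


def traceback_location():
    translated = "x = 1\nraise ValueError('boom')\n"
    code = compile_translated(translated, "legacy_module.py2", line_offset=40)
    try:
        exec(code, {})
    except ValueError as exc:
        # The last frame belongs to the translated code that raised.
        frame = traceback.extract_tb(exc.__traceback__)[-1]
        return frame.filename, frame.lineno


if __name__ == "__main__":
    print(traceback_location())
```

The traceback then names the original file and the shifted line instead of the generated source, with no marker comments needed in the translated text.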


From alc at  Thu Jan  9 00:34:22 2014
From: alc at (=?UTF-8?Q?Alejandro_L=C3=B3pez_Correa?=)
Date: Thu, 9 Jan 2014 00:34:22 +0100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <20140108220104.GS29356@ando>
References: <>
Message-ID: <>

2014/1/8 Steven D'Aprano <steve at>:
> About 70% of the top 200 projects on PyPI support Python 3, and
> downloads of Python 3 are very healthy, possibly even higher than
> downloads of Python 2.

I do not think that one is a particularly good metric. For each
project hosted at PyPI how many are not there? People have personal
projects, companies have internal software, and there are products
that contain at least some python and are targeted at final customers,
like games or Maya. Not everything is open source, but even if it is
proprietary software it is good to have it since that way more jobs
are offered and more people can earn money with this language, and
that is a guarantee for its long-term success.

>> and minimise the risk of defection to other languages.
> People threaten that, but it is an irrational threat. (Mind you, people
> do silly, irrational things every day.) If you think its hard to migrate
> from Python 2 to 3, when you get to keep 90% of your code base and most
> of the backward-incompatible changes are a few libraries that have been
> renamed and a handful of syntax changes, how hard will it be to throw
> away 100% of your code and start again with a completely different
> language?

I think human psychology works like that. Many people may delay the
acquisition of a new car, but once they are committed to buy a new one
they want the best they can afford (within their budget). Some
languages may gain momentum and gain the "cool" vibe. We saw the rise
of Ruby a while ago, and maybe a language that handles multiple
cores well could be a strong temptation in the future. If people keep
investing in python, small bits at a time, keeping their codebase
always up to date, it is more difficult, IMHO, to commit to a full
rewrite.

From breamoreboy at  Thu Jan  9 00:43:14 2014
From: breamoreboy at (Mark Lawrence)
Date: Wed, 08 Jan 2014 23:43:14 +0000
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <laknqg$iqs$>

On 08/01/2014 23:14, Alejandro López Correa wrote:
> I think a convenient way to run old python 2 modules along with new
> python 3 ones may be a good idea despite the cost.

One of the major costs, quoting Winston Churchill, will be blood, toil, 
tears and sweat.  How much of this are you personally intending to put 
into this effort, or are you happy to try and force core developers into 
a situation that many of them don't want to be in?

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From rosuav at  Thu Jan  9 00:54:23 2014
From: rosuav at (Chris Angelico)
Date: Thu, 9 Jan 2014 10:54:23 +1100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 9, 2014 at 10:34 AM, Alejandro López Correa <alc at> wrote:
> 2014/1/8 Steven D'Aprano <steve at>:
>> About 70% of the top 200 projects on PyPI support Python 3, and
>> downloads of Python 3 are very healthy, possibly even higher than
>> downloads of Python 2.
> I do not think that one is a particularly good metric. For each
> project hosted at PyPI how many are not there? People have personal
> projects, companies have internal software, and there are products
> that contain at least some python and are targeted at final customers,
> like games or Maya.

But what IS a good metric? How are you going to measure any of that?
It's better to at least use PyPI stats than to pull numbers out of a hat.

>>> and minimise the risk of defection to other languages.
>> People threaten that, but it is an irrational threat. (Mind you, people
>> do silly, irrational things every day.) If you think its hard to migrate
>> from Python 2 to 3, when you get to keep 90% of your code base and most
>> of the backward-incompatible changes are a few libraries that have been
>> renamed and a handful of syntax changes, how hard will it be to throw
>> away 100% of your code and start again with a completely different
>> language?
> I think human psychology works like that. Many people may delay the
> acquisition of a new car, but once they are committed to buy a new one
> they want the best they can afford (within their budget). Some
> languages may gain momentum and gain the "cool" vibe. We saw the rise
> of Ruby a while ago, and maybe a language that handles well multiple
> cores could be a strong temptation in the future. If people keep
> investing in python, small bits at a time, keeping their codebase
> always up to date, it is more difficult, IMHO, to commit to a full
> rewrite.

Maybe. But how much temptation would it need to be to induce a
complete rewrite? (Mind you, it's not always a *complete* rewrite.
I've been "porting" code from Win32 C++ to GTK Pike, and in the
process usually shortened it by 50% or better, but mostly what I'm
doing is reading the old code, taking maybe a few bits of it that are
so simple they'd be the same in nearly any language, and
reimplementing the original logic.) The expanded gap between Python
2.7 and Python 3.7 is mainly going to be features of 3.7 that you
could choose to use now that you've ported, rather than mandatory
changes. Python doesn't arbitrarily drop features or break stuff in
minor releases. That means the gap between 2.7 and 3.7 will still be
far FAR narrower than the gap between Python and Ruby - so,
correspondingly, the temptation to switch to Ruby would have to be
really strong. In the porting case I mentioned a moment ago, there
really was a very strong temptation (using Win32 APIs meant I was
bound to Windows (though Wine is a wonderful thing), and the C++ code
was going through stupid levels of overhead to manage memory and
such), so it was worth switching. I was NOT able to convince my boss
to switch our web site from PHP into Python, because he just couldn't
see enough benefit from changing language - but moving to a new PHP
was a much lower hump to get over. (Only a few things needed changing.)


From alc at  Thu Jan  9 01:08:23 2014
From: alc at (=?UTF-8?Q?Alejandro_L=C3=B3pez_Correa?=)
Date: Thu, 9 Jan 2014 01:08:23 +0100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <laknqg$iqs$>
References: <>
Message-ID: <>

2014/1/9 Mark Lawrence <breamoreboy at>:
> On 08/01/2014 23:14, Alejandro López Correa wrote:
>> I think a convenient way to run old python 2 modules along with new
>> python 3 ones may be a good idea despite the cost.
> One of the major costs, quoting Winston Churchill, will be blood, toil,
> tears and sweat.  How much of this are you personally intending to put into
> this effort, or are you happy to try and force core developers into a
> situation that many of them don't want to be in?

I am not trying to force anything. I am offering my views and some
ideas. Personally, I am not experiencing any problem with python 3
other than some missing third party libraries.

I am sorry if my posts seem rude (I do not know whether that is the
case). When writing in English, many times it is more like adjusting
what I want to say to what I know how to say.

That "despite the cost" was a poor choice of words. I agree that it is
not reasonable to expect a change that means "blood, toil, tears and
sweat" for the core developers when they are against it. However,
there might be easy solutions, not pretty but pragmatic, that could
be implemented without polluting the main code base a lot. But again,
I am not trying to force anything but convince people. I am starting
to realise this debate has been going on for too long and it seems to
have left some scars. I am sorry, it is brand new to me.


From alc at  Thu Jan  9 01:18:55 2014
From: alc at (=?UTF-8?Q?Alejandro_L=C3=B3pez_Correa?=)
Date: Thu, 9 Jan 2014 01:18:55 +0100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

2014/1/9 Chris Angelico <rosuav at>:
> On Thu, Jan 9, 2014 at 10:34 AM, Alejandro López Correa <alc at> wrote:
>> 2014/1/8 Steven D'Aprano <steve at>:
> But what IS a good metric? How are you going to measure any of that?
> It's better to at least use PyPI stats than to pull numbers out of a
> hat.

The problem I see is that this metric might be no better than just
guessing, or even worse, because it is clearly biased: it focuses on
open source projects hosted on PyPI. It is easy to measure, but maybe it is not
good to do so if that measure is used to make important decisions. In
my [very limited] experience, the number of open source projects pales
in comparison to that of projects kept "in the shadows".

> Maybe. But how much temptation would it need to be to induce a
> complete rewrite? (Mind you, it's not always a *complete* rewrite.
> I've been "porting" code from Win32 C++ to GTK Pike, and in the
> process usually shortened it by 50% or better, but mostly what I'm
> doing is reading the old code, taking maybe a few bits of it that are
> so simple they'd be the same in nearly any language, and
> reimplementing the original logic.) The expanded gap between Python
> 2.7 and Python 3.7 is mainly going to be features of 3.7 that you
> could choose to use now that you've ported, rather than mandatory
> changes. Python doesn't arbitrarily drop features or break stuff in
> minor releases. That means the gap between 2.7 and 3.7 will still be
> far FAR narrower than the gap between Python and Ruby - so,
> correspondingly, the temptation to switch to Ruby would have to be
> really strong. In the porting case I mentioned a moment ago, there
> really was a very strong temptation (using Win32 APIs meant I was
> bound to Windows (though Wine is a wonderful thing), and the C++ code
> was going through stupid levels of overhead to manage memory and
> such), so it was worth switching. I was NOT able to convince my boss
> to switch our web site from PHP into Python, because he just couldn't
> see enough benefit from changing language - but moving to a new PHP
> was a much lower hump to get over. (Only a few things needed
> changing.)

Fair enough. I think it is a good argument.


From amber.yust at  Thu Jan  9 02:19:30 2014
From: amber.yust at (Amber Yust)
Date: Wed, 8 Jan 2014 17:19:30 -0800
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

Also note that even if publicly visible projects are outnumbered by private
projects, the public projects tend to have a much larger impact on the
overall ecosystem, because they're used by many entities (whereas private
projects are typically only used by a single entity given their nature).

From greg.ewing at  Thu Jan  9 05:44:44 2014
From: greg.ewing at (Greg Ewing)
Date: Thu, 09 Jan 2014 17:44:44 +1300
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

Stephen J. Turnbull wrote:
> No, it doesn't.  It means 'abc' followed by something that cannot be
> encoded by any codec without the surrogateescape handler.
> 'ascii-compatible' merely defaults to that handler.  I wouldn't
> actually be too upset if I were told, no, you have to specify
> explicitly.

If I understand correctly, your intention is that
61 62 63 FF in this representation would simply be
a more compact version of 0061 0062 0063 DCFF,
with exactly the same semantics.

If that's right, then maybe something like "compressed
surrogateescape" or "8-bit surrogateescape" would be
a better name for it?

Also, it could be produced automatically where
possible by any decoding operation that specified
surrogateescape -- there wouldn't have to be a
dedicated encoding name for it (although there
could be for convenience).

It could also potentially be produced by any
slicing or other string operations that resulted
in characters within the appropriate ranges,
just like any of the other internal representations.
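
The correspondence between 61 62 63 FF and 0061 0062 0063 DCFF can be checked with the existing surrogateescape error handler. This is a minimal sketch of today's behaviour, not of the proposed compact representation; the function name surrogateescape_demo is invented for illustration.

```python
def surrogateescape_demo():
    raw = bytes([0x61, 0x62, 0x63, 0xFF])
    # Undecodable bytes 0x80-0xFF become lone surrogates U+DC80-U+DCFF.
    text = raw.decode("ascii", errors="surrogateescape")
    code_points = [format(ord(ch), "04X") for ch in text]
    # Encoding with the same handler restores the original bytes.
    round_trip = text.encode("ascii", errors="surrogateescape")
    return code_points, round_trip == raw


if __name__ == "__main__":
    print(surrogateescape_demo())
```

The proposal under discussion would store the same information in one byte per character instead of widening the whole string to hold U+DCFF.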


From stephen at  Thu Jan  9 09:40:04 2014
From: stephen at (Stephen J. Turnbull)
Date: Thu, 09 Jan 2014 17:40:04 +0900
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
Message-ID: <>

Greg Ewing writes:

 > If I understand correctly, your intention is that
 > 61 62 63 FF in this representation would simply be
 > a more compact version of 0061 0062 0063 DCFF,
 > with exactly the same semantics.

Pretty much so.  There remain some ambiguities and questions about
efficient implementability in my mind.

 > If that's right, then maybe something like "compressed
 > surrogateescape" or "8-bit surrogateescape" would be
 > a better name for it?

Maybe.  Thanks for the suggestion!

However, as I mentioned already I'm going to back off on this for a
while, because in the process of analyzing Inada-san's use case I
realized that by itself it doesn't save much besides space, and isn't
pretty to boot.

From victor.stinner at  Thu Jan  9 09:55:29 2014
From: victor.stinner at (Victor Stinner)
Date: Thu, 9 Jan 2014 09:55:29 +0100
Subject: [Python-ideas] The fools shall start sucking the cock.
In-Reply-To: <>
References: <>
Message-ID: <>

2014/1/8 Brett Cannon <brett at>:
> After others coming forward about previous behavior, this email is serving
> as an official warning: one more infraction of the CoC and you will be
> banned from this mailing list.

( CoC stands for Code of Conduct, text available at )


From ncoghlan at  Thu Jan  9 12:12:42 2014
From: ncoghlan at (Nick Coghlan)
Date: Thu, 9 Jan 2014 21:12:42 +1000
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <20140108220104.GS29356@ando>
References: <>
Message-ID: <>

On 9 Jan 2014 06:02, "Steven D'Aprano" <steve at> wrote:
> On Wed, Jan 08, 2014 at 05:22:02PM +0100, Alejandro López Correa wrote:
> [...]
> > Anyway, I understand it is not a clean way to proceed, but something along
> > these lines might be the only way to speed up the adoption of python 3
> One assumption in this discussion, and the various related discussions
> on Reddit and other places, is that adoption of Python 3 is too slow and
> needs to be sped up. I don't believe this is true. I believe adoption
> is just right and exactly what should be expected.
> Alex Gaynor wrote a blog post a week or so ago claiming that, five years
> since Python 3 was first released, everyone should have migrated by now
> and that since only "five percent" (a figure which I believe he pulled
> out of thin air) have migrated, Python 3 has been a failure.

Alex's numbers were real - they're based on user agent header analysis for
PyPI downloads. However, the other readily available metric is python.org
installer downloads, and those favour Python 3 (and that's even before we
publish 3.4, which has nice additions like pip, statistics and asyncio).

Alex is a smart guy, but I don't know how he managed to get "After 5 years,
Python 3 should be more widely used than Python 2" (clearly unrealistic)
from "After 5 years, Python 3 should be mature enough to be the default
choice for new projects". The latter is what we actually said, and,
allowing for the 6 month delay to replace 3.0's unusably slow IO stack in
3.1, still looks plausible given the updates coming in Python 3.4.

That article was actually the one that made me realise my Q&A needed a few
more questions and answers :)

> I challenge that belief. I've been hanging around here and on the
> Python-Dev list for a long time, and while I can't find any official
> pronouncement, the sense has always been that Python 3 adoption will
> take ten years, not five. (That's my recollection -- if any of the core
> developers wish to correct me, please do.)

5 years to be the default for new projects; we never set a goal for
overtaking Python 2 overall.

This Q and the one after it are most directly relevant:

> Rates of adoption are much,
> much higher than gossip on the Internet suggests. About 70% of the top
> 200 projects on PyPI support Python 3, and downloads of Python 3 are
> very healthy, possibly even higher than downloads of Python 2.
> On the
> tutor list, I see a significant number of beginners using Python 3.

All our discussions with distros and redistributors are also about *how* to
manage the transition, rather than *if*.

Red Hat providing commercial support for Python 3.3 as of last September is
a *big* deal, particularly given this week's announcement about CentOS
being adopted as an officially Red Hat sponsored project and the popularity
of CentOS as a platform in the scientific community.

> It seems to me that given the circumstances, Python 3 adoption is right
> where we should expect it to be half-way through a decade-long process.
> There will be a period at the start when hardly anyone will migrate,
> then a period of accelerating migration, which will accelerate further
> when the mainstream Linux distros start shipping Python 3 as their
> system Python (ArchLinux is not mainstream, but Fedora is planning the
> change), followed by a sudden rush in another four or five years when
> people realise that Python 2.7 becoming unmaintained is no longer a
> distant prospect but is about to happen.

Yup. I actually started adding a timeline to my Q&A:

> For many people, waiting until the last minute is the most sensible
> thing that they can do. This gives time for the early adopters to
> discover and iron out all the wrinkles and difficulties. Rather than
> approaching this as "Python 3 has been a failure, what can we do to save
> it?" we should be approaching this as "Python 3 has been a success, what
> lessons can we take from the early adopters to make it even easier for
> the next wave of adopters?"

We've definitely made some mistakes in the area of communications, though.
In particular, we probably should have had something like my Q&A available
as an authoritative information source from the beginning, instead of only
creating it around the release of Python 3.3.

> "from __past__ import spam" does not make it easier to adopt. It just
> makes it easier to *put off adopting*.
> > and minimise the risk of defection to other languages.
> People threaten that, but it is an irrational threat. (Mind you, people
> do silly, irrational things every day.) If you think it's hard to migrate
> from Python 2 to 3, when you get to keep 90% of your code base and most
> of the backward-incompatible changes are a few libraries that have been
> renamed and a handful of syntax changes, how hard will it be to throw
> away 100% of your code and start again with a completely different
> language?

Exactly. By the time 2.7 goes into security fix only mode, we will have
been maintaining Python 2 and Python 3 in parallel for more than *8 years*.
This is a deliberate choice on our part to allow plenty of time for users
to decide to migrate on their own, rather than attempting to force them to
migrate with the stick of a lack of support.

Even after that, commercial Python 2.x support will be available until at
least 2023, and likely longer.


> --
> Steven

From ncoghlan at  Thu Jan  9 12:16:51 2014
From: ncoghlan at (Nick Coghlan)
Date: Thu, 9 Jan 2014 21:16:51 +1000
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 Jan 2014 09:49, "Amber Yust" <amber.yust at> wrote:
> Also note that even if publicly visible projects are outnumbered by
> private projects, the public projects tend to have a much larger impact on
> the overall ecosystem, because they're used by many entities (whereas
> private projects are typically only used by a single entity given their
> nature).

It also mistakenly assumes our goal is to get existing *applications* to
migrate. It really isn't - we're obviously delighted if app developers
choose to switch (as it indicates we have created a compelling platform),
but we *needed* key library and framework developers to add Python 3
support in order to bootstrap the Python 3 development ecosystem.


> On Jan 8, 2014 5:13 PM, "Alejandro López Correa" <alc at> wrote:
>> 2014/1/9 Chris Angelico <rosuav at>:
>> > On Thu, Jan 9, 2014 at 10:34 AM, Alejandro López Correa <alc at>
>> >> 2014/1/8 Steven D'Aprano <steve at>:
>> >
>> > But what IS a good metric? How are you going to measure any of that?
>> > It's better to at least use PyPI stats than to pull numbers out of a
>> > hat.
>> >
>> The problem I see is that metric might be equal or worse than just
>> guessing because it is clearly biased: it focuses on open source
>> projects hosted on PyPI. It is easy to measure it, but maybe it is not
>> good to do so if that measure is used to make important decisions. In
>> my [very limited] experience, the number of open source projects pales
>> in comparison to that of projects kept "in the shadows".
>> > Maybe. But how much temptation would it need to be to induce a
>> > complete rewrite? (Mind you, it's not always a *complete* rewrite.
>> > I've been "porting" code from Win32 C++ to GTK Pike, and in the
>> > process usually shortened it by 50% or better, but mostly what I'm
>> > doing is reading the old code, taking maybe a few bits of it that are
>> > so simple they'd be the same in nearly any language, and
>> > reimplementing the original logic.) The expanded gap between Python
>> > 2.7 and Python 3.7 is mainly going to be features of 3.7 that you
>> > could choose to use now that you've ported, rather than mandatory
>> > changes. Python doesn't arbitrarily drop features or break stuff in
>> > minor releases. That means the gap between 2.7 and 3.7 will still be
>> > far FAR narrower than the gap between Python and Ruby - so,
>> > correspondingly, the temptation to switch to Ruby would have to be
>> > really strong. In the porting case I mentioned a moment ago, there
>> > really was a very strong temptation (using Win32 APIs meant I was
>> > bound to Windows (though Wine is a wonderful thing), and the C++ code
>> > was going through stupid levels of overhead to manage memory and
>> > such), so it was worth switching. I was NOT able to convince my boss
>> > to switch our web site from PHP into Python, because he just couldn't
>> > see enough benefit from changing language - but moving to a new PHP
>> > was a much lower hump to get over. (Only a few things needed
>> > changing.)
>> Fair enough. I think it is a good argument.
>> Alejandro

From g.rodola at  Thu Jan  9 15:03:27 2014
From: g.rodola at (Giampaolo Rodola')
Date: Thu, 9 Jan 2014 15:03:27 +0100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 9, 2014 at 12:16 PM, Nick Coghlan <ncoghlan at> wrote:

> On 9 Jan 2014 09:49, "Amber Yust" <amber.yust at> wrote:
> >
> > Also note that even if publicly visible projects are outnumbered by
> private projects, the public projects tend to have a much larger impact on
> the overall ecosystem, because they're used by many entities (whereas
> private projects are typically only used by a single entity given their
> nature).
> It also mistakenly assumes our goal is to get existing *applications* to
> migrate. It really isn't - we're obviously delighted if app developers
> choose to switch (as it indicates we have created a compelling platform),
> but we *needed* key library and framework developers to add Python 3
> support in order to bootstrap the Python 3 development ecosystem.

I think one of the key points here is that various important libs haven't
been ported yet:
Too many of them are still marked red and IMO that is the main reason why a
lot of people are being so hesitant, not unicode.
"boto" alone accounts for hundreds of thousands of potential users who simply
cannot migrate.
Django made the transition only a couple of months ago, which basically
means it's still in a beta state, and AFAIK fundamental projects such as
Twisted don't even have an ETA.
Considering 5 years have passed since Python 3.0 first made its appearance
I consider this a *serious* delay.
From a user standpoint this sort of appears as a signal which translates
into "if not even big project X has migrated after 5 years why should I?".
That's likely to apply even if project X is not within the list of your
dependencies, because you may not depend on X now but maybe you will in
the future, either because you need X or because Y requires X in order to
work. It is *crucial* for people maintaining those libraries to put Python
3 porting on top of their TODO list at the cost of not working on new
features.

--- Giampaolo

From ncoghlan at  Thu Jan  9 17:50:55 2014
From: ncoghlan at (Nick Coghlan)
Date: Fri, 10 Jan 2014 02:50:55 +1000
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 Jan 2014 22:03, "Giampaolo Rodola'" <g.rodola at> wrote:
> On Thu, Jan 9, 2014 at 12:16 PM, Nick Coghlan <ncoghlan at> wrote:
>> On 9 Jan 2014 09:49, "Amber Yust" <amber.yust at> wrote:
>> >
>> > Also note that even if publicly visible projects are outnumbered by
>> > private projects, the public projects tend to have a much larger impact on
>> > the overall ecosystem, because they're used by many entities (whereas
>> > private projects are typically only used by a single entity given their
>> > nature).
>> It also mistakenly assumes our goal is to get existing *applications* to
>> migrate. It really isn't - we're obviously delighted if app developers
>> choose to switch (as it indicates we have created a compelling platform),
>> but we *needed* key library and framework developers to add Python 3
>> support in order to bootstrap the Python 3 development ecosystem.
> True.
> I think one of the key points here is that different important libs
> haven't been ported yet:
> Too many of them are still marked red and IMO that is the main reason why
> a lot of people are being so hesitant, not unicode.
> "boto" alone counts as hundreds of thousands potential users which simply
> cannot migrate.
> Django made the transition only a couple of months ago, which basically
> means it's still in a beta state, and AFAIK fundamental projects such as
> Twisted don't even have an ETA.
> Considering 5 years have passed since Python 3.0 first made its
> appearance I consider this a *serious* delay.
> From a user standpoint this sort of appears as a signal which translates
> into "if neither big project X has migrated after 5 years why should I?".
> That's likely to apply even if project X is not within the list of your
> dependencies, because you may not depend on X now but maybe you will in
> the future, either because you need X or because Y requires X in order to
> work. It is *crucial* for people maintaining those libraries to put Python
> 3 porting on top of their TODO list at the cost of not working on new
> features.

This is still focusing on migrating *existing* applications. We're not
especially worried if existing applications keep using 2.7 - it's a good
language that is almost certain to be commercially supported for at least
another decade, even though upstream support will switch to security fix
only mode in 2015. If it ain't broke (and for existing applications, 2.7
generally ain't broke), don't fix it. But if a project has persistent
problems with application developers persistently introducing bugs by using
8-bit strings where they should be using Unicode, or otherwise running into
the assorted bug magnets we removed in Python 3, the migration may be worth
considering.

A user that starts with Python 3 simply wouldn't consider a dependency like
boto as an option, and would reach for asyncio rather than Twisted for
their explicit asynchronous programming needs.


> --- Giampaolo

From abarnert at  Thu Jan  9 19:00:47 2014
From: abarnert at (Andrew Barnert)
Date: Thu, 9 Jan 2014 10:00:47 -0800
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 9, 2014, at 8:50, Nick Coghlan <ncoghlan at> wrote:

> But if a project has persistent problems with application developers persistently introducing bugs by using 8-bit strings where they should be using Unicode, or otherwise running into the assorted bug magnets we removed in Python 3, the migration may be worth considering.

One thing to note:

For many applications, it's not that hard to migrate to the six-able subset of 2.7/3.3. This lets 2.x-centric contributors (including those who want to be able to just use the python that Apple or Ubuntu pre-installed on their dev box) keep contributing, allows you to continue using py2exe for your Windows binaries, and gives you an out if you run into the "but what if we later need some library that hasn't been ported yet" problem (which I think is drastically overblown, but it's certainly a common enough fear).

And meanwhile, from my experience, it's at least as hard to introduce subtle Unicode bugs in a dual-version code base as a 3.x-only code base, and just as easy to debug them, so you get at least one advantage over 2.x-only.

And being able to migrate gradually instead of having a flag-day release is always nice.
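
As a rough illustration of what a module in that shared 2.7/3.3 subset can look like, here is a minimal sketch that uses only the standard library rather than six itself. The names `text_type` and `decode_name` are hypothetical, chosen for this example only.

```python
from __future__ import division, print_function, unicode_literals

import sys

# On Python 2 the text type is `unicode`; on Python 3 it is `str`.
# The `unicode` name is only evaluated when running on Python 2.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def decode_name(raw, encoding="utf-8"):
    """Decode bytes at the boundary so the rest of the code sees only text."""
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


name = decode_name(b"L\xc3\xb3pez")  # bytes in, text out
half = 1 / 2                         # true division on both versions
print(name, half)
```

The idea is that bytes are decoded once at the edge of the program, so mixing 8-bit strings and text further in becomes much less likely on either version.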

In my day job, I work on a project that's written in multiple languages, and the python parts are all 2.7+/3.3+. While I miss being able to use some 3.3 features, and it's annoying to deal with problems like 2.7 using too much memory when processing giant XML files or the old version of sqlite in 2.7.2 panicking over a simple union and ignoring an index, it's still far better than having to debug mojibake in 2.7, or writing in node.js or ObjC.

From g.rodola at  Thu Jan  9 19:05:22 2014
From: g.rodola at (Giampaolo Rodola')
Date: Thu, 9 Jan 2014 19:05:22 +0100
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 9, 2014 at 5:50 PM, Nick Coghlan <ncoghlan at> wrote:

> On 9 Jan 2014 22:03, "Giampaolo Rodola'" <g.rodola at> wrote:
> >
> >
> >
> >
> > On Thu, Jan 9, 2014 at 12:16 PM, Nick Coghlan <ncoghlan at>
> wrote:
> >>
> >>
> >> On 9 Jan 2014 09:49, "Amber Yust" <amber.yust at> wrote:
> >> >
> >> > Also note that even if publicly visible projects are outnumbered by
> private projects, the public projects tend to have a much larger impact on
> the overall ecosystem, because they're used by many entities (whereas
> private projects are typically only used by a single entity given their
> nature).
> >>
> >> It also mistakenly assumes our goal is to get existing *applications*
> to migrate. It really isn't - we're obviously delighted if app developers
> choose to switch (as it indicates we have created a compelling platform),
> but we *needed* key library and framework developers to add Python 3
> support in order to bootstrap the Python 3 development ecosystem.
> >
> >
> > True.
> > I think one of the key points here is that different important libs
> haven't been ported yet:
> >
> > Too many of them are still marked red and IMO that is the main reason
> why a lot of people are being so hesitant, not unicode.
> > "boto" alone counts as hundreds of thousands potential users which
> simply cannot migrate.
> > Django made the transition only a couple of months ago, which basically
> means it's still in a beta state, and AFAIK fundamental projects such as
> Twisted don't even have an ETA.
> > Considering 5 years have passed since Python 3.0 first made it's
> appearance I consider this a *serious* delay.
> > From a user standpoint this sort of appears as a signal which translates
> into "if neither big project X has migrated after 5 years why should I?".
> > That's likely to apply even if project X is not within the list of your
> dependencies, because you may not depend from X now but maybe you will in
> the future, either because you need X or because Y requires X in order to
> work. It is *crucial* for people maintaining those libraries to put Python
> 3 porting on top of their TODO list at the cost of not working on new
> features.
> This is still focusing on migrating *existing* applications.
I was talking about existing third party libraries (Twisted, gevent, lxml
etc), not user applications.
In order to port user applications you need those libraries to be ported
first, and it is crucial that at least the most used ones are ported.

> A user that starts with Python 3 simply wouldn't consider a dependency
> like boto as an option
Why not? Note that I picked "boto" just because it's the first in that

> and would reach for asyncio rather than Twisted for their explicit
> asynchronous programming needs.
I wouldn't be so sure about that. We're talking about two very mature and
established projects, with tons of third-party components (see, each solving a common set
of problems in their own way, which will likely continue to be used
independently from asyncio (which is still in a beta state) for quite a

--- Giampaolo

From ncoghlan at  Thu Jan  9 19:29:20 2014
From: ncoghlan at (Nick Coghlan)
Date: Fri, 10 Jan 2014 04:29:20 +1000
Subject: [Python-ideas] from __past__ import division, str, etc
In-Reply-To: <>
References: <>
Message-ID: <>

On 10 Jan 2014 02:05, "Giampaolo Rodola'" <g.rodola at> wrote:
> On Thu, Jan 9, 2014 at 5:50 PM, Nick Coghlan <ncoghlan at> wrote:
>> On 9 Jan 2014 22:03, "Giampaolo Rodola'" <g.rodola at> wrote:
>> >
>> >
>> >
>> >
>> > On Thu, Jan 9, 2014 at 12:16 PM, Nick Coghlan <ncoghlan at>
>> >>
>> >>
>> >> On 9 Jan 2014 09:49, "Amber Yust" <amber.yust at> wrote:
>> >> >
>> >> > Also note that even if publicly visible projects are outnumbered by
>> >> > private projects, the public projects tend to have a much larger impact
>> >> > on the overall ecosystem, because they're used by many entities (whereas
>> >> > private projects are typically only used by a single entity given their
>> >> > nature).
>> >>
>> >> It also mistakenly assumes our goal is to get existing *applications*
>> >> to migrate. It really isn't - we're obviously delighted if app developers
>> >> choose to switch (as it indicates we have created a compelling platform),
>> >> but we *needed* key library and framework developers to add Python 3
>> >> support in order to bootstrap the Python 3 development ecosystem.
>> >
>> > True.
>> > I think one of the key points here is that different important libs
>> > haven't been ported yet:
>> >
>> > Too many of them are still marked red and IMO that is the main reason
>> > why a lot of people are being so hesitant, not unicode.
>> > "boto" alone counts as hundreds of thousands potential users which
>> > simply cannot migrate.
>> > Django made the transition only a couple of months ago, which
>> > basically means it's still in a beta state, and AFAIK fundamental projects
>> > such as Twisted don't even have an ETA.
>> > Considering 5 years have passed since Python 3.0 first made its
>> > appearance I consider this a *serious* delay.
>> > From a user standpoint this sort of appears as a signal which
>> > translates into "if neither big project X has migrated after 5 years why
>> > should I?".
>> > That's likely to apply even if project X is not within the list of
>> > your dependencies, because you may not depend on X now but maybe you will
>> > in the future, either because you need X or because Y requires X in order
>> > to work. It is *crucial* for people maintaining those libraries to put
>> > Python 3 porting on top of their TODO list at the cost of not working on
>> > new features.
>> This is still focusing on migrating *existing* applications.
> I was talking about existing third party libraries (Twisted, gevent, lxml
> etc), not user applications.
> In order to port user applications you need those libraries to be ported
> first, and it is crucial that at least the most used ones are ported.
>> A user that starts with Python 3 simply wouldn't consider a dependency
>> like boto as an option
> Why not? Note that I picked "boto" just because it's the first in that
>> and would reach for asyncio rather than Twisted for their explicit
>> asynchronous programming needs.
> I wouldn't be so sure about that. We're talking about two very mature and
> established projects, with tons of third-party components (see, each solving a common set
> of problems in their own way, which will likely continue to be used
> independently from asyncio (which is still in a beta state) for quite a

Yes, Python 3 will be an even *better* ecosystem as more of the Python 2
ecosystem becomes available. That is not in dispute.

The point is that *most new software* should be able to find appropriate
packages in Python 3 at this point in time, and also has access to modules
like "python-future", which make it relatively straightforward to downgrade
to Python 2.7 if you start in Python 3 and then find a Python 2 only
library that you absolutely positively have to depend on. This means that
those unported libraries aren't a reason to *start* a greenfield project in
Python 2.

"Python 3 by default" *also* doesn't mean "never any reason to start a new
Python 2 project instead".


> --- Giampaolo

From ron3200 at  Thu Jan  9 21:01:17 2014
From: ron3200 at (Ron Adam)
Date: Thu, 09 Jan 2014 14:01:17 -0600
Subject: [Python-ideas] RFC: bytestring as a str representation [was: a
 new bytestring type?]
In-Reply-To: <>
References: <>
 <> <20140107154401.GK29356@ando>
 <> <>
 <> <>
Message-ID: <lamv61$ucm$>

On 01/07/2014 04:36 PM, Alexander Heger wrote:
>>> Of course I'm unhappy with it, it doesn't behave the way I think it should,
>>> and it's not consistent.

>> Consistent with what? (Before you rush in an answer, remember that
>> there are almost always multiple sides to a consistency argument.)
>> I don't see what's wrong with those. Both produce valid expressions
>> that, when entered, compare equal to the object whose repr() was
>> printed. What more would you *want*?

> I find that the definition of str is inconsistent indeed, because the
> items in a string are strings again, not characters (or code points).
> I don't think there are too many other examples in Python where the
> same is true; indexing a list does not give a list but the item that
> is at the point.

If you use slices, then it's more consistent with strings.  A slice of a 
list gives you a list,  a slice of a string gives you a string.

The idea of sub-components always breaks down at some level.  Then it 
shifts to equivalent translations, rather than smaller units.  Like 
converting strings to bytes, and back again.  They aren't sub components of 
each other.
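
To make the slice-versus-index distinction concrete, here is a small sketch, assuming Python 3 semantics (where indexing bytes yields an int). The variable names are illustrative only.

```python
text = "spam"
items = [10, 20, 30]
data = text.encode("utf-8")

# Slicing keeps the container type in all three cases.
text_slice = text[0:1]    # a str
items_slice = items[0:1]  # a list
data_slice = data[0:1]    # a bytes object

# Indexing differs: str gives a length-1 str, list gives the element
# itself, and bytes (on Python 3) gives an int.
str_item = text[0]
list_item = items[0]
byte_item = data[0]

# str <-> bytes is a translation between equivalents, not the
# extraction of a sub-component.
round_trip = data.decode("utf-8")
```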

Where you draw the lines is dependent on how close you look.  (Python, 
bytecode, C code, assembly, bytes, bits, voltages, ...)

We can stay at the python level if we choose the viewpoint that an object 
is the Python code that creates that object.  We have to allow for the 
execution of that code in our understanding of it.


From ram.rachum at  Sat Jan 11 15:18:32 2014
From: ram.rachum at (Ram Rachum)
Date: Sat, 11 Jan 2014 06:18:32 -0800 (PST)
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
Message-ID: <>

I think that `OrderedDict.items().__getitem__` should be implemented, to 
solve this ugliness:

What do you think? 


From rosuav at  Sat Jan 11 15:36:11 2014
From: rosuav at (Chris Angelico)
Date: Sun, 12 Jan 2014 01:36:11 +1100
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 12, 2014 at 1:18 AM, Ram Rachum <ram.rachum at> wrote:
> I think that `OrderedDict.items().__getitem__` should be implemented, to
> solve this ugliness:
> What do you think?

Well, the first problem with that is that __getitem__ already exists,
and it's dict-style :) So you can't fetch out an item by its position
that way. But suppose you create a method that returns the Nth

The implementation in CPython 3.4 is a linked list, so getting an
arbitrary element by index would be quite inefficient. Getting
specifically the first can be done either with what you see in that
link (it could be made a tiny bit shorter, but not much), but anything
else would effectively entail iterating over the whole thing until you
get to that position, so you may as well do that explicitly.
Alternatively, if you're okay with it being a destructive operation,
you can use popitem() to snag the first (or last, if you wish)
key/value pair.
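
A minimal sketch of both options, for illustration: walking the items explicitly to reach position N, and using popitem() on a scratch object. The helper name `nth_item` is hypothetical, not an existing API.

```python
from collections import OrderedDict
from itertools import islice


def nth_item(od, n):
    """Return the (key, value) pair at position n by walking the items: O(n)."""
    try:
        return next(islice(od.items(), n, None))
    except StopIteration:
        raise IndexError("position out of range")


od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
second = nth_item(od, 1)

# popitem() removes the pair it returns; last=False takes it from the
# front, while the default (last=True) takes it from the end.
scratch = OrderedDict(od)
popped_first = scratch.popitem(last=False)
popped_last = scratch.popitem()
```

Note that `od` itself is untouched here only because the pops were done on `scratch`.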


From __peter__ at  Sat Jan 11 16:36:49 2014
From: __peter__ at (Peter Otten)
Date: Sat, 11 Jan 2014 16:36:49 +0100
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
References: <>
Message-ID: <larodu$qlm$>

Ram Rachum wrote:

> I think that `OrderedDict.items().__getitem__` should be implemented, to
> solve this ugliness:
> What do you think?

I think an O(N) __getitem__() is even uglier. Also, you should have really 
compelling reasons for allowing the interfaces of dict.items() and 
OrderedDict.items() to diverge.

Personally, I'd use a helper function

def first(items):
    for item in items:
        return item
    raise ValueError("No first item in an empty sequence")

and I don't understand why user thefourtheye is downvoted. Hiding a non-
obvious if small piece of code behind a self-explaining name seems like good 
programming practice.

From breamoreboy at  Sat Jan 11 16:55:19 2014
From: breamoreboy at (Mark Lawrence)
Date: Sat, 11 Jan 2014 15:55:19 +0000
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <>
References: <>
Message-ID: <larph8$6vo$>

On 11/01/2014 14:18, Ram Rachum wrote:
> I think that `OrderedDict.items().__getitem__` should be implemented, to
> solve this ugliness:
> What do you think?
> Thanks,
> Ram.

Use the more_itertools first function.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From rymg19 at  Sat Jan 11 21:51:28 2014
From: rymg19 at (Ryan Gonzalez)
Date: Sat, 11 Jan 2014 14:51:28 -0600
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <>
References: <>
Message-ID: <>

Based on your popitem idea:

get_first = lambda d: d.copy().popitem(last=False)
get_last = lambda d: d.copy().popitem(last=True)

On Sat, Jan 11, 2014 at 8:36 AM, Chris Angelico <rosuav at> wrote:

> On Sun, Jan 12, 2014 at 1:18 AM, Ram Rachum <ram.rachum at> wrote:
> > I think that `OrderedDict.items().__getitem__` should be implemented, to
> > solve this ugliness:
> >
> >
> >
> > What do you think?
> Well, the first problem with that is that __getitem__ already exists,
> and it's dict-style :) So you can't fetch out an item by its position
> that way. But suppose you create a method that returns the Nth
> element.
> The implementation in CPython 3.4 is a linked list, so getting an
> arbitrary element by index would be quite inefficient. Getting
> specifically the first can be done either with what you see in that
> link (it could be made a tiny bit shorter, but not much), but anything
> else would effectively entail iterating over the whole thing until you
> get to that position, so you may as well do that explicitly.
> Alternatively, if you're okay with it being a destructive operation,
> you can use popitem() to snag the first (or last, if you wish)
> key/value pair.
> ChrisA

When your hammer is C++, everything begins to look like a thumb.

From rosuav at  Sat Jan 11 22:41:03 2014
From: rosuav at (Chris Angelico)
Date: Sun, 12 Jan 2014 08:41:03 +1100
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 12, 2014 at 7:51 AM, Ryan Gonzalez <rymg19 at> wrote:
> Based on your popitem idea:
> get_first = lambda d: d.copy().popitem(last=False)
> get_last = lambda d: d.copy().popitem(last=True)

That's a destructive operation, though. Great if you want it, terrible
if you don't.


From grosser.meister.morti at  Sat Jan 11 22:47:26 2014
From: grosser.meister.morti at (Mathias Panzenböck)
Date: Sat, 11 Jan 2014 22:47:26 +0100
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <>
References: <>
Message-ID: <>

Why not:

	get_first = lambda d: next(iter(d.items()))

No need for a full copy of the dict.
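
The one-liner raises StopIteration on an empty dict. A possible variant with a fallback value, as a sketch only (`first_item` is a hypothetical name):

```python
from collections import OrderedDict


def first_item(d, default=None):
    """Return the first (key, value) pair without copying or mutating d.

    Falls back to `default` when d is empty instead of raising.
    """
    return next(iter(d.items()), default)


od = OrderedDict([("x", 1), ("y", 2)])
head = first_item(od)
empty_head = first_item(OrderedDict())
```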

On 01/11/2014 09:51 PM, Ryan Gonzalez wrote:
> Based on your popitem idea:
> get_first = lambda d: d.copy().popitem()
> get_last = lambda d: d.copy().popitem(last=True)
> On Sat, Jan 11, 2014 at 8:36 AM, Chris Angelico <rosuav at <mailto:rosuav at>> wrote:
>     On Sun, Jan 12, 2014 at 1:18 AM, Ram Rachum <ram.rachum at <mailto:ram.rachum at>> wrote:
>      > I think that `OrderedDict.items().__getitem__` should be implemented, to
>      > solve this ugliness:
>      >
>      >
>      >
>      > What do you think?
>     Well, the first problem with that is that __getitem__ already exists,
>     and it's dict-style :) So you can't fetch out an item by its position
>     that way. But suppose you create a method that returns the Nth
>     element.
>     The implementation in CPython 3.4 is a linked list, so getting an
>     arbitrary element by index would be quite inefficient. Getting
>     specifically the first can be done either with what you see in that
>     link (it could be made a tiny bit shorter, but not much), but anything
>     else would effectively entail iterating over the whole thing until you
>     get to that position, so you may as well do that explicitly.
>     Alternatively, if you're okay with it being a destructive operation,
>     you can use popitem() to snag the first (or last, if you wish)
>     key/value pair.
>     ChrisA
> --
> Ryan
> When your hammer is C++, everything begins to look like a thumb.

From rosuav at  Sat Jan 11 23:03:39 2014
From: rosuav at (Chris Angelico)
Date: Sun, 12 Jan 2014 09:03:39 +1100
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 12, 2014 at 8:47 AM, Mathias Panzenböck
<grosser.meister.morti at> wrote:
> Why not:
>         get_first = lambda d: next(iter(d.items()))
> No need for a full copy of the dict.
> On 01/11/2014 09:51 PM, Ryan Gonzalez wrote:
>> Based on your popitem idea:
>> get_first = lambda d: d.copy().popitem(last=False)
>> get_last = lambda d: d.copy().popitem(last=True)

Oh right. Yeah, copy(). So this isn't destructive, but as Mathias
says, it's probably inefficient. (I say "probably" because it's
theoretically possible to optimize the copy operation - but I don't
see anything like that in the source code.)


From breamoreboy at  Sat Jan 11 23:32:02 2014
From: breamoreboy at (Mark Lawrence)
Date: Sat, 11 Jan 2014 22:32:02 +0000
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <>
References: <>
Message-ID: <lasgp3$llu$>

On 11/01/2014 22:03, Chris Angelico wrote:
> On Sun, Jan 12, 2014 at 8:47 AM, Mathias Panzenböck
> <grosser.meister.morti at> wrote:
>> Why not:
>>          get_first = lambda d: next(iter(d.items()))
>> No need for a full copy of the dict.
>> On 01/11/2014 09:51 PM, Ryan Gonzalez wrote:
>>> Based on your popitem idea:
>>> get_first = lambda d: d.copy().popitem()
>>> get_last = lambda d: d.copy().popitem(last=True)
> Oh right. Yeah, copy(). So this isn't destructive, but as Mathias
> says, it's probably inefficient. (I say "probably" because it's
> theoretically possible to optimize the copy operation - but I don't
> see anything like that in the source code.)
> ChrisA

Surely a shallow copy isn't guaranteed to work properly in all cases anyway?

      D.copy() -> a shallow copy of D

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From rosuav at  Sat Jan 11 23:53:36 2014
From: rosuav at (Chris Angelico)
Date: Sun, 12 Jan 2014 09:53:36 +1100
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <lasgp3$llu$>
References: <>
Message-ID: <>

On Sun, Jan 12, 2014 at 9:32 AM, Mark Lawrence <breamoreboy at> wrote:
> Surely a shallow copy isn't guaranteed to work properly in all cases anyway?
> copy(...)
>      D.copy() -> a shallow copy of D

A shallow copy is sufficient if it's about to mutate the dictionary
itself (popitem). It's the right semantics... just the wrong
complexity, as it's expensive on large dictionaries :)
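A small sketch of that point, assuming a collections.OrderedDict: popping
from a shallow copy leaves the original untouched, but the copy itself is
O(n). Note that popitem() defaults to last=True, so fetching the first pair
needs last=False. The helper names here are illustrative only.

```python
from collections import OrderedDict


def first_via_copy(d):
    # Shallow copy, then pop from the front; the original is untouched.
    return d.copy().popitem(last=False)


def last_via_copy(d):
    # popitem() defaults to last=True, i.e. it pops the newest pair.
    return d.copy().popitem()


d = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
first = first_via_copy(d)
last = last_via_copy(d)
```

Both helpers copy every entry just to read one, which is the "wrong
complexity" part.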


From rymg19 at  Sun Jan 12 00:01:57 2014
From: rymg19 at (Ryan Gonzalez)
Date: Sat, 11 Jan 2014 17:01:57 -0600
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <>
References: <>
Message-ID: <>

It is pretty inefficient. As for getting the last item, however, I think
something like that might end up the best.

And, you've gotta admit, it isn't bad for a 30-second solution with no real
planning whatsoever.

On Sat, Jan 11, 2014 at 4:03 PM, Chris Angelico <rosuav at> wrote:

> On Sun, Jan 12, 2014 at 8:47 AM, Mathias Panzenböck
> <grosser.meister.morti at> wrote:
> > Why not:
> >
> >         get_first = lambda d: next(iter(d.items()))
> >
> > No need for a full copy of the dict.
> >
> >
> > On 01/11/2014 09:51 PM, Ryan Gonzalez wrote:
> >>
> >> Based on your popitem idea:
> >>
> >> get_first = lambda d: d.copy().popitem()
> >> get_last = lambda d: d.copy().popitem(last=True)
> Oh right. Yeah, copy(). So this isn't destructive, but as Mathias
> says, it's probably inefficient. (I say "probably" because it's
> theoretically possible to optimize the copy operation - but I don't
> see anything like that in the source code.)
> ChrisA

When your hammer is C++, everything begins to look like a thumb.

From at  Sun Jan 12 00:04:06 2014
From: at (Yury Selivanov)
Date: Sat, 11 Jan 2014 18:04:06 -0500
Subject: [Python-ideas] namedtuple baseclass
Message-ID: <>

Hello all,

I propose to add a baseclass for all namedtuples. Right now 'namedtuple'
function dynamically creates a class derived from 'tuple', which complicates
things like dynamic dispatch. Basically, the only way of checking if an
object is an instance of 'namedtuple' is to do "isinstance(o, tuple) and
hasattr(o, '_fields')".

One possible approach would be to:

1. Rename 'namedtuple' function to '_namedtuple'

2. Add a class 'namedtuple(tuple)', with its '__new__' method proxying
'_namedtuple' function

3. Modify the class template to derive namedtuples from the 'namedtuple'
class, instead of 'tuple'

This way, it's possible to simply write 'isinstance(o, namedtuple)'.
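
A rough approximation of the idea, not the actual patch: instead of changing
the class template, this sketch mixes a marker base class into the class that
the existing factory generates. The names NamedTupleBase and make_namedtuple
are hypothetical and exist only for illustration.

```python
from collections import namedtuple as _namedtuple


class NamedTupleBase(tuple):
    """Hypothetical common base; not part of the standard library."""
    __slots__ = ()


def make_namedtuple(typename, field_names):
    # Build the usual generated class, then derive a class that also
    # inherits from the marker base. The generated class comes first in
    # the MRO, so its __new__ and field properties are the ones used.
    generated = _namedtuple(typename, field_names)
    return type(typename, (generated, NamedTupleBase), {"__slots__": ()})


Pair = make_namedtuple("Pair", "a b")
p = Pair(23, 42)
```

With this, isinstance(p, NamedTupleBase) is true while a plain tuple fails
the same check.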

I have a working patch that implements the above logic (all python unittests
pass), so if you find this useful I can start an issue on

Thank you,

From rosuav at  Sun Jan 12 00:24:47 2014
From: rosuav at (Chris Angelico)
Date: Sun, 12 Jan 2014 10:24:47 +1100
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 12, 2014 at 10:01 AM, Ryan Gonzalez <rymg19 at> wrote:
> It is pretty inefficient. As for getting the last item, however, I think
> something like that might end up the best.

For getting the last item, reversed() should be as fast as iter() is
for getting the first - at least in CPython 3.4, which is what I was
looking at.
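
A minimal sketch of both lookups without copying, assuming an OrderedDict
(the helper names are just for illustration). Both raise StopIteration on an
empty dict.

```python
from collections import OrderedDict


def get_first(d):
    # iter() over the items view yields pairs in insertion order.
    return next(iter(d.items()))


def get_last(d):
    # reversed() over the OrderedDict walks the keys from the end.
    key = next(reversed(d))
    return key, d[key]


d = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
first = get_first(d)
last = get_last(d)
```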

> And, you've gotta admit, it isn't bad for a 30-second solution with no real
> planning whatsoever.

There is that :)


From jbvsmo at  Sun Jan 12 01:14:29 2014
From: jbvsmo at (João Bernardo)
Date: Sat, 11 Jan 2014 22:14:29 -0200
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <>

I never liked this implementation of namedtuple with "exec". I remember some
proposals (and even a working implementation) of namedtuple done with
metaclasses. I don't remember why they were rejected.

I think at least having a base class other than tuple is something useful.


João Bernardo

From at  Sun Jan 12 01:40:44 2014
From: at (Yury Selivanov)
Date: Sat, 11 Jan 2014 19:40:44 -0500
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <>

Yeah, while I was working on the patch, I thought about rewriting it all
without the use of "exec".  But that would be too much of a change 10 days
before RC1. Therefore, the proposed change is minimal, aimed to only
slightly improve the current design.


On Sat, Jan 11, 2014 at 7:14 PM, João Bernardo <jbvsmo at> wrote:

> I never liked this implementation of namedtuple with "exec". I remember
> some proposals
> (and even a working implementation) of namedtuple done with metaclasses. I
> Don't remember
> why they were rejected.
> I think at least having a base class other than tuple is something useful.
> +1
> João Bernardo

From steve at  Sun Jan 12 02:05:46 2014
From: steve at (Steven D'Aprano)
Date: Sun, 12 Jan 2014 12:05:46 +1100
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <20140112010542.GT3869@ando>

On Sat, Jan 11, 2014 at 06:04:06PM -0500, Yury Selivanov wrote:
> Hello all,
> I propose to add a baseclass for all namedtuples. Right now 'namedtuple'
> function dynamically creates a class derived from 'tuple', which complicates
> things like dynamic dispatch. Basically, the only way of checking if an
> object
> is an instance of 'namedtuple' is to do "isinstance(o, tuple) and
> hasattr(o, '_fields')".

Let me see if I understand your use-case. You want to dynamically 
dispatch on various objects. Given two objects:

p1 = (23, 42)
p2 = namedtuple("pair", "a b")(23, 42)
assert p1 == p2

you want to dispatch p1 and p2 differently. Is that correct?

Then, given a third object:

class Person(namedtuple("Person", "name sex age occupation id")):
    def say_hello(self):
        print("Hello %s" % self.name)

p3 = Person("Fred Smith", "M", 35, "nurse", 927056)

you want to dispatch p2 and p3 the same. Is that correct?

If I am correct, I wonder what sort of code you are writing that wants 
to treat p1 and p2 differently, and p2 and p3 the same. To me, this 
seems ill-advised. Apart from tuple (and object), p2 and p3 should not 
share a common base class, because they have nothing in common.

> This way, it's possible to simple write 'isinstance(o, namedtuple)'.

I am having difficulty thinking of circumstances where I would want to 
do that. 

-1 on the idea.


From at  Sun Jan 12 02:26:59 2014
From: at (Yury Selivanov)
Date: Sat, 11 Jan 2014 20:26:59 -0500
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <20140112010542.GT3869@ando>
References: <>
Message-ID: <>

Hi Steven,

On Sat, Jan 11, 2014 at 8:05 PM, Steven D'Aprano <steve at> wrote:
> On Sat, Jan 11, 2014 at 06:04:06PM -0500, Yury Selivanov wrote:
>> Hello all,
>> I propose to add a baseclass for all namedtuples. Right now 'namedtuple'
>> function dynamically creates a class derived from 'tuple', which complicates
>> things like dynamic dispatch. Basically, the only way of checking if an
>> object
>> is an instance of 'namedtuple' is to do "isinstance(o, tuple) and
>> hasattr(o, '_fields')".
> Let me see if I understand your use-case. You want to dynamically
> dispatch on various objects. Given two objects:
> p1 = (23, 42)
> p2 = namedtuple("pair", "a b")(23, 42)
> assert p1 == p2
> you want to dispatch p1 and p2 differently. Is that correct?
> Then, given a third object:
> class Person(namedtuple("Person", "name sex age occupation id")):
>     def say_hello(self):
>         print("Hello %s" %
> p3 = Person("Fred Smith", "M", 35, "nurse", 927056)
> you want to dispatch p2 and p3 the same. Is that correct?

Well, it all depends on a use case ;) In my concrete use case - yes,
more to that below.

> If I am correct, I wonder what sort of code you are writing that wants
> to treat p1 and p2 differently, and p2 and p3 the same. To me, this
> seems ill-advised. Apart from tuple (and object), p2 and p3 should not
> share a common base class, because they have nothing in common.

Well, everything in python is a subclass/instance of object, so what?
Yes, I think that different namedtuples should be an instance of some remote
common parent, derived from tuple, because they are different, they *are*
namedtuples after all. They have field names for the data stored in them,
and that is what distinguishes them from plain tuples.

> [...]
>> This way, it's possible to simple write 'isinstance(o, namedtuple)'.
> I am having difficulty thinking of circumstances where I would want to
> do that.

My use case: I have a system that dumps python objects to some
intermediate format, which is later converted to html, or dumped
in a terminal (for debug, reporting, and other purposes). And I want
to dump namedtuples with their field names/values (not as a simple
tuple).

I'm sure there are many more use cases than my current itch.
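
A minimal sketch of such a dumper using today's duck-typing check; the
function name and the choice of OrderedDict and list as the intermediate
format are assumptions made for illustration, not the actual system.

```python
from collections import OrderedDict, namedtuple


def dump(obj):
    """Convert obj into plain lists/dicts, keeping namedtuple field names."""
    fields = getattr(obj, "_fields", None)
    if isinstance(obj, tuple) and fields is not None:
        # namedtuple-like: pair each field name with its (dumped) value.
        return OrderedDict((name, dump(value)) for name, value in zip(fields, obj))
    if isinstance(obj, (tuple, list)):
        # Plain sequence: no names available, dump positionally.
        return [dump(item) for item in obj]
    return obj


Point = namedtuple("Point", "x y")
named = dump(Point(1, 2))
plain = dump((1, (2, 3)))
```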

Python has the richest and most beautiful OO facilities, we have lots
of ABCs and elegant exceptions tree, everything is well structured.
To me, it's logical, that one of the most commonly used classes should
have a proper base class.


From eric at  Sun Jan 12 02:27:39 2014
From: eric at (Eric V. Smith)
Date: Sat, 11 Jan 2014 20:27:39 -0500
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <20140112010542.GT3869@ando>
References: <>
Message-ID: <>

See also for a discussion of this issue. 


> On Jan 11, 2014, at 8:05 PM, Steven D'Aprano <steve at> wrote:
>> On Sat, Jan 11, 2014 at 06:04:06PM -0500, Yury Selivanov wrote:
>> Hello all,
>> I propose to add a baseclass for all namedtuples. Right now 'namedtuple'
>> function dynamically creates a class derived from 'tuple', which complicates
>> things like dynamic dispatch. Basically, the only way of checking if an
>> object
>> is an instance of 'namedtuple' is to do "isinstance(o, tuple) and
>> hasattr(o, '_fields')".
> Let me see if I understand your use-case. You want to dynamically 
> dispatch on various objects. Given two objects:
> p1 = (23, 42)
> p2 = namedtuple("pair", "a b")(23, 42)
> assert p1 == p2
> you want to dispatch p1 and p2 differently. Is that correct?
> Then, given a third object:
> class Person(namedtuple("Person", "name sex age occupation id")):
>    def say_hello(self):
>        print("Hello %s" %
> p3 = Person("Fred Smith", "M", 35, "nurse", 927056)
> you want to dispatch p2 and p3 the same. Is that correct?
> If I am correct, I wonder what sort of code you are writing that wants 
> to treat p1 and p2 differently, and p2 and p3 the same. To me, this 
> seems ill-advised. Apart from tuple (and object), p2 and p3 should not 
> share a common base class, because they have nothing in common.
> [...]
>> This way, it's possible to simple write 'isinstance(o, namedtuple)'.
> I am having difficulty thinking of circumstances where I would want to 
> do that. 
> -1 on the idea.
> -- 
> Steven

From at  Sun Jan 12 02:44:06 2014
From: at (Yury Selivanov)
Date: Sat, 11 Jan 2014 20:44:06 -0500
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <>

Hi Eric,

Thank you very much for bringing this up. I couldn't find that issue (perhaps,
because I was looking for an open ticket).

From the discussion there, it seems that Raymond and Guido agreed to
have a common base class for namedtuple for py3.3; however, that was in
2010/2011.

Perhaps, any doubts that existed at that time are not the case now?


On Sat, Jan 11, 2014 at 8:27 PM, Eric V. Smith <eric at> wrote:
> See also for a discussion of this issue.
> --
> Eric.
>> On Jan 11, 2014, at 8:05 PM, Steven D'Aprano <steve at> wrote:
>>> On Sat, Jan 11, 2014 at 06:04:06PM -0500, Yury Selivanov wrote:
>>> Hello all,
>>> I propose to add a baseclass for all namedtuples. Right now 'namedtuple'
>>> function dynamically creates a class derived from 'tuple', which complicates
>>> things like dynamic dispatch. Basically, the only way of checking if an
>>> object
>>> is an instance of 'namedtuple' is to do "isinstance(o, tuple) and
>>> hasattr(o, '_fields')".
>> Let me see if I understand your use-case. You want to dynamically
>> dispatch on various objects. Given two objects:
>> p1 = (23, 42)
>> p2 = namedtuple("pair", "a b")(23, 42)
>> assert p1 == p2
>> you want to dispatch p1 and p2 differently. Is that correct?
>> Then, given a third object:
>> class Person(namedtuple("Person", "name sex age occupation id")):
>>    def say_hello(self):
>>        print("Hello %s" %
>> p3 = Person("Fred Smith", "M", 35, "nurse", 927056)
>> you want to dispatch p2 and p3 the same. Is that correct?
>> If I am correct, I wonder what sort of code you are writing that wants
>> to treat p1 and p2 differently, and p2 and p3 the same. To me, this
>> seems ill-advised. Apart from tuple (and object), p2 and p3 should not
>> share a common base class, because they have nothing in common.
>> [...]
>>> This way, it's possible to simple write 'isinstance(o, namedtuple)'.
>> I am having difficulty thinking of circumstances where I would want to
>> do that.
>> -1 on the idea.
>> --
>> Steven

From abarnert at  Sun Jan 12 08:53:54 2014
From: abarnert at (Andrew Barnert)
Date: Sat, 11 Jan 2014 23:53:54 -0800
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <>

It sounds like the consensus there wasn't to have a base class for namedtuple, but instead to have an abc that all namedtuples, and C namedtuple-like types, would be registered with, and that would have no API beyond that of Sequence.

If I understand the original request in this thread, I'm not sure this would satisfy the use case. 

He's looking to detect namedtuples so he can extract their names along with their values. Which is a perfectly reasonable thing to do for the kind of reflective code he wants to write. It would presumably use code like this:

    if isinstance(x, NamedTuple):
        d = OrderedDict(zip(x._fields, x))

But that won't work with any abstract NamedTuple, only one that has a _fields member that lists the field names. So you'd need to write this:

    if isinstance(x, NamedTuple):
        try:
            d = OrderedDict(zip(x._fields, x))
        except AttributeError:
            pass  # whoops, it's an os.stat_result or something

And at that point, the isinstance check isn't helping anything over the duck typing on _fields, which you can already do today.

So to satisfy this use case, you'd either need an actual namedtuple base class instead of an abc, or an abc that adds some API for getting the field names (or name-value pairs). Either of which seems reasonable--except for the odd quirk of having a public API in a class that's prefixed with an underscore. (If it's not prefixed with an underscore, it can conflict with a field name, which defeats the whole purpose of namedtuple.)

Sent from a random iPhone

On Jan 11, 2014, at 17:44, Yury Selivanov < at> wrote:

> Hi Eric,
> Thank you very much for bringing this up. I couldn't find that issue (perhaps,
> because I was looking for an open ticket).
> From the discussion there, it seems that Raymond and Guido agreed to
> have a common base class for namedtuple for py3.3; however, that was in
> 2010/2011.
> Perhaps, any doubts that existed at that time are not the case now?
> Thanks,
> Yury
> On Sat, Jan 11, 2014 at 8:27 PM, Eric V. Smith <eric at> wrote:
>> See also for a discussion of this issue.
>> --
>> Eric.
>>> On Jan 11, 2014, at 8:05 PM, Steven D'Aprano <steve at> wrote:
>>>> On Sat, Jan 11, 2014 at 06:04:06PM -0500, Yury Selivanov wrote:
>>>> Hello all,
>>>> I propose to add a baseclass for all namedtuples. Right now 'namedtuple'
>>>> function dynamically creates a class derived from 'tuple', which complicates
>>>> things like dynamic dispatch. Basically, the only way of checking if an
>>>> object
>>>> is an instance of 'namedtuple' is to do "isinstance(o, tuple) and
>>>> hasattr(o, '_fields')".
>>> Let me see if I understand your use-case. You want to dynamically
>>> dispatch on various objects. Given two objects:
>>> p1 = (23, 42)
>>> p2 = namedtuple("pair", "a b")(23, 42)
>>> assert p1 == p2
>>> you want to dispatch p1 and p2 differently. Is that correct?
>>> Then, given a third object:
>>> class Person(namedtuple("Person", "name sex age occupation id")):
>>>   def say_hello(self):
>>>       print("Hello %s" %
>>> p3 = Person("Fred Smith", "M", 35, "nurse", 927056)
>>> you want to dispatch p2 and p3 the same. Is that correct?
>>> If I am correct, I wonder what sort of code you are writing that wants
>>> to treat p1 and p2 differently, and p2 and p3 the same. To me, this
>>> seems ill-advised. Apart from tuple (and object), p2 and p3 should not
>>> share a common base class, because they have nothing in common.
>>> [...]
>>>> This way, it's possible to simple write 'isinstance(o, namedtuple)'.
>>> I am having difficulty thinking of circumstances where I would want to
>>> do that.
>>> -1 on the idea.
>>> --
>>> Steven

From rosuav at  Sun Jan 12 09:17:56 2014
From: rosuav at (Chris Angelico)
Date: Sun, 12 Jan 2014 19:17:56 +1100
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 12, 2014 at 6:53 PM, Andrew Barnert <abarnert at> wrote:
> So to satisfy this use case, you'd either need an actual namedtuple base class instead of an abc, or an abc that adds some API for getting the field names (or name-value pairs). Either of which seems reasonable--except for the odd quirk of having a public API in a class that's prefixed with an underscore. (If it's not prefixed with an underscore, it can conflict with a field name, which defeats the whole purpose of namedtuple.)

Is compatibility with the current namedtuple important, or can this be
done another way? For instance, the fields could be retrieved with
__getitem__ instead:

# Hacking it in with a subclass. Gives no benefit
# but is a proof of concept.
class Point(namedtuple('Point', ['x', 'y'])):
    def __getitem__(self, which):
        if which=="fields": return self._fields
        return super().__getitem__(which)

>>> a=Point(1,2)
>>> a.x
1
>>> a.y
2
>>> a.fields
Traceback (most recent call last):
  File "<pyshell#233>", line 1, in <module>
    a.fields
AttributeError: 'Point' object has no attribute 'fields'
>>> a["fields"]
('x', 'y')
>>> a[0]
1
>>> a[1]
2

Normally, __getitem__ will be used with integers (since this is
basically a sequence, not a mapping). Would it break things to use a
string in this way? It's guaranteed not to collide with either form of
access (as a tuple, or as fields).


From techtonik at  Sun Jan 12 08:26:07 2014
From: techtonik at (anatoly techtonik)
Date: Sun, 12 Jan 2014 10:26:07 +0300
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 12, 2014 at 4:44 AM, Yury Selivanov < at> wrote:
> Perhaps, any doubts that existed at that time are not the case now?

Sometimes I feel that various questions about namedtuple class,
record and similar proposals need a separate FAQ, but like everybody
else I am too lazy to create one, so it never happens.

From steve at  Sun Jan 12 12:43:01 2014
From: steve at (Steven D'Aprano)
Date: Sun, 12 Jan 2014 22:43:01 +1100
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <20140112114301.GZ3869@ando>

On Sun, Jan 12, 2014 at 07:17:56PM +1100, Chris Angelico wrote:
> On Sun, Jan 12, 2014 at 6:53 PM, Andrew Barnert <abarnert at> wrote:
> > So to satisfy this use case, you'd either need an actual namedtuple 
> > base class instead of an abc, or an abc that adds some API for 
> > getting the field names (or name-value pairs). Either of which seems 
> > reasonable--except for the odd quirk of having a public API in a 
> > class that's prefixed with an underscore. (If it's not prefixed with 
> > an underscore, it can conflict with a field name, which defeats the 
> > whole purpose of namedtuple.)
> >
> Is compatibility with the current namedtuple important, or can this be
> done another way? For instance, the fields could be retrieved with
> __getitem__ instead:

It's a tuple. It already uses __getitem__ to return items indexed by 
position. Adding magic so that obj["fields"] is an alias for 
obj._fields is, well, horrible.

> # Hacking it in with a subclass. Gives no benefit
> # but is a proof of concept.
> class Point(namedtuple('Point', ['x', 'y'])):
>     def __getitem__(self, which):
>         if which=="fields": return self._fields
>         return super().__getitem__(which)

I think you missed that namedtuple like objects written in C don't 
have a _fields attribute, e.g. os.stat_result. If you're going to insist 
that they add special handling in __getitem__, wouldn't it just be cleaner and simpler 
to get them to add a _fields attribute?


* An ABC for namedtuple as agreed by Raymond and Guido wouldn't include 
any extra functionality beyond Sequence, so it doesn't guarantee the 
existence of _fields; that doesn't satisfy the use-case.

* An actual namedtuple superclass only works for the namedtuple factory 
function, not for C namedtuple-like types.

Both could be fixed -- Python could define a namedtuple superclass, and 
all relevant C types like os.stat_result could be changed to inherit 
from them. (But what of those which don't?) Or the ABC could be extended 
to include a promise of _fields, but that would exclude C types. Either 
way, in order to satisfy this use-case, there would be a whole lot of 
changes needed.

Or, you can duck-type:

if isinstance(o, tuple):
    try:
        fields = o._fields
    except AttributeError:
        fields = ...  # fall back

Have I missed something?


From rosuav at  Sun Jan 12 12:46:51 2014
From: rosuav at (Chris Angelico)
Date: Sun, 12 Jan 2014 22:46:51 +1100
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <20140112114301.GZ3869@ando>
References: <>
Message-ID: <>

On Sun, Jan 12, 2014 at 10:43 PM, Steven D'Aprano <steve at> wrote:
> It's a tuple. It already uses __getitem__ to return items indexed by
> position. Adding magic so that obj["fields"] is an alias for
> obj._fields is, well, horrible.

It's only an alias in the simple version that I did there. If it were
to be used as a means of avoiding the _fields reserved name, it
wouldn't be an alias. But yes, it is somewhat magical. I was hunting
for an out-of-band way to get that sort of information.


From steve at  Sun Jan 12 12:55:16 2014
From: steve at (Steven D'Aprano)
Date: Sun, 12 Jan 2014 22:55:16 +1100
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <20140112115516.GA3869@ando>

On Sun, Jan 12, 2014 at 10:46:51PM +1100, Chris Angelico wrote:
> On Sun, Jan 12, 2014 at 10:43 PM, Steven D'Aprano <steve at> wrote:
> > It's a tuple. It already uses __getitem__ to return items indexed by
> > position. Adding magic so that obj["fields"] is an alias for
> > obj._fields is, well, horrible.
> It's only an alias in the simple version that I did there. If it were
> to be used as a means of avoiding the _fields reserved name, it
> wouldn't be an alias. But yes, it is somewhat magical. I was hunting
> for an out-of-band way to get that sort of information.

I still don't get how you think this solves the problem that the OP's 
use-case is to use isinstance() to identify namedtuples, then read 
_fields. But with the (proposed, not implemented) namedtuple ABC, 
isinstance(o, NamedTuple) could be true and o._fields fail. Breaking 
backwards compatibility to write that as o["fields"] instead won't help, 
because it will still fail:

py> t = os.stat_result([1]*10)
py> t["fields"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: tuple indices must be integers, not str

Changing namedtuple is not enough.

Oh, and this is a backwards-compatibility breaking change, because 
_fields is part of the *public* API for namedtuple, despite the leading 
underscore.

So I fail to see how anything short of a massive re-engineering of not 
just namedtuple but also any C namedtuple-like types will satisfy the 
OP's use-case. Have I missed something?


From rosuav at  Sun Jan 12 13:07:46 2014
From: rosuav at (Chris Angelico)
Date: Sun, 12 Jan 2014 23:07:46 +1100
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <20140112115516.GA3869@ando>
References: <>
Message-ID: <>

On Sun, Jan 12, 2014 at 10:55 PM, Steven D'Aprano <steve at> wrote:
> On Sun, Jan 12, 2014 at 10:46:51PM +1100, Chris Angelico wrote:
>> On Sun, Jan 12, 2014 at 10:43 PM, Steven D'Aprano <steve at> wrote:
>> > It's a tuple. It already uses __getitem__ to return items indexed by
>> > position. Adding magic so that obj["fields"] is an alias for
>> > obj._fields is, well, horrible.
>> It's only an alias in the simple version that I did there. If it were
>> to be used as a means of avoiding the _fields reserved name, it
>> wouldn't be an alias. But yes, it is somewhat magical. I was hunting
>> for an out-of-band way to get that sort of information.
> I still don't get how you think this solves the problem that the OP's
> use-case is to use isinstance() to identify namedtuples, then read
> _fields.

That was a slightly tangential comment stemming from Andrew Barnert's
remark that using _fields for a public API is quirky. (Which is why I
quoted him in my post.) This would no longer use an underscore name
for something public. That's all.


From at  Sun Jan 12 13:51:56 2014
From: at (Yury Selivanov)
Date: Sun, 12 Jan 2014 07:51:56 -0500
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <20140112115516.GA3869@ando>
References: <>
Message-ID: <>


On Sun, Jan 12, 2014 at 6:55 AM, Steven D'Aprano <steve at> wrote:
> On Sun, Jan 12, 2014 at 10:46:51PM +1100, Chris Angelico wrote:
>> On Sun, Jan 12, 2014 at 10:43 PM, Steven D'Aprano <steve at> wrote:
>> > It's a tuple. It already uses __getitem__ to return items indexed by
>> > position. Adding magic so that obj["fields"] is an alias for
>> > obj._fields is, well, horrible.
>> It's only an alias in the simple version that I did there. If it were
>> to be used as a means of avoiding the _fields reserved name, it
>> wouldn't be an alias. But yes, it is somewhat magical. I was hunting
>> for an out-of-band way to get that sort of information.
> I still don't get how you think this solves the problem that the OP's
> use-case is to use isinstance() to identify namedtuples, then read
> _fields. But with the (proposed, not implemented) namedtuple ABC,
> isinstance(o, NamedTuple) could be true and o._fields fail.

If we decide to implement an ABC, then any class that satisfies it
should implement '_fields' (and _make, and other namedtuple public
methods) properly (this can be enforced in the ABC's '__subclasshook__')
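
A minimal sketch of what such a hook could look like. The ABC name
NamedTupleLike and the exact list of required attributes are assumptions
made for illustration; no such ABC exists in the standard library.

```python
from abc import ABCMeta
from collections import namedtuple


class NamedTupleLike(metaclass=ABCMeta):
    """Hypothetical ABC: tuples that expose the namedtuple API."""

    _required = ("_fields", "_make", "_replace", "_asdict")

    @classmethod
    def __subclasshook__(cls, C):
        if cls is NamedTupleLike:
            # Accept only tuple subclasses that define the whole API.
            if issubclass(C, tuple) and all(hasattr(C, name) for name in cls._required):
                return True
        return NotImplemented


Point = namedtuple("Point", "x y")
```

A class that is merely tuple-like, without _fields and friends, falls
through to NotImplemented and is therefore not treated as a subclass.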

> Breaking
> backwards compatibility to write that as o["fields"] instead won't help,
> because it will still fail:
> py> t = os.stat_result([1]*10)
> py> t["fields"]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: tuple indices must be integers, not str
> Changing namedtuple is not enough.
> Oh, and this is a backwards-compatibility breaking change, because
> _fields is part of the *public* API for namedtuple, despite the leading
> underscore.
> So I fail to see how anything short of a massive re-engineering of not
> just namedtuple but also any C namedtuple-like types will satisfy the
> OP's use-case. Have I missed something?

If we go with the ABC route, then we can simply implement '_fields' and
other namedtuple methods for the low-level C structure os.stat_result
is using later. But for now, stat_result is not a namedtuple (lacks all
of namedtuple API). So I'm not sure that C namedtuple-like types
should hold us back on this proposal.

BTW, ABC proposal aside: the current namedtuple implementation
creates the class from a template with an "exec" call. For every namedtuple,
its entire set of methods is created over and over again. Even for the
sake of memory efficiency, having a base class with *some* of the common
methods (which are currently in the template) is better.
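
A quick way to observe the duplication, as far as I can tell on the CPython
versions I'm aware of: two generated classes with identical fields still
carry separate function objects for the same methods.

```python
from collections import namedtuple

A = namedtuple("A", "x y")
B = namedtuple("B", "x y")

# Look in the class dictionaries directly so that descriptor binding
# does not get in the way of the identity comparison.
same_repr = A.__dict__["__repr__"] is B.__dict__["__repr__"]
same_replace = A.__dict__["_replace"] is B.__dict__["_replace"]
```

Both comparisons come out false, so a shared base class would hold one copy
of each such method instead of one per namedtuple class.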


From ericsnowcurrently at  Sun Jan 12 17:33:57 2014
From: ericsnowcurrently at (Eric Snow)
Date: Sun, 12 Jan 2014 09:33:57 -0700
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 12, 2014 5:52 AM, "Yury Selivanov" < at> wrote:
> BTW, ABC proposal aside: the current namedtuple implementation
> creates the class from a template with "exec" call. For every namedtuple,
> it's entire set of methods is created over and over again. Even for the
> memory efficiency sake, having a base class with *some* of the common
> methods (which are currently in the template) is better.

It's a trade-off.  We increase the definition-time cost by using exec, but
minimize the cost of traversing the attribute lookup chain when using
instances.  The purely ABC approach in the referenced issue preserves this
instance-favoring-optimization design.


From at  Sun Jan 12 17:45:44 2014
From: at (Yury Selivanov)
Date: Sun, 12 Jan 2014 11:45:44 -0500
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <>


On Sun, Jan 12, 2014 at 11:33 AM, Eric Snow <ericsnowcurrently at> wrote:
> On Jan 12, 2014 5:52 AM, "Yury Selivanov" < at> wrote:
>> BTW, ABC proposal aside: the current namedtuple implementation
>> creates the class from a template with "exec" call. For every namedtuple,
>> it's entire set of methods is created over and over again. Even for the
>> memory efficiency sake, having a base class with *some* of the common
>> methods (which are currently in the template) is better.
> It's a trade-off.  We increase the definition-time cost by using exec, but
> minimize the cost of traversing the attribute lookup chain when using
> instances.  The purely ABC approach in the referenced issue preserves this
> instance-favoring-optimization design.
> -eric

Correct me if I'm wrong, but what's the point of speeding up (2%?) attribute
lookup on "_make", "__repr__", and other namedtuple methods?  What matters
is the performance of "__getitem__" and field property access, but that would
be the same if a metaclass (or simple "type" call) is used to construct

Anyways, I'm not proposing to touch the main bulk of the current implementation
(and perhaps there are other reasons why it is as it is). The only thing I
think would be nice to have (for now) is a base class for namedtuples other
than tuple.
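
To make the idea concrete, here is a minimal sketch (not the stdlib
implementation) of a named tuple class built with a plain type() call on top
of a shared base class, so that the common methods exist only once.
NamedTupleBase and make_namedtuple are hypothetical names used only for
illustration, and error handling is kept to a bare minimum.

```python
from operator import itemgetter


class NamedTupleBase(tuple):
    """Hypothetical shared base: methods common to every generated class."""
    __slots__ = ()
    _fields = ()

    @classmethod
    def _make(cls, iterable):
        # Build an instance from any iterable, bypassing the per-class __new__.
        return tuple.__new__(cls, iterable)

    def __repr__(self):
        pairs = ", ".join(f"{n}={v!r}" for n, v in zip(self._fields, self))
        return f"{type(self).__name__}({pairs})"


def make_namedtuple(name, field_names):
    """Create a named tuple class with type() instead of exec (assumed helper)."""
    fields = tuple(field_names.split())

    def __new__(cls, *args):
        if len(args) != len(fields):
            raise TypeError(f"expected {len(fields)} arguments, got {len(args)}")
        return tuple.__new__(cls, args)

    namespace = {"__slots__": (), "_fields": fields, "__new__": __new__}
    for index, field in enumerate(fields):
        # Field access goes through a property wrapping itemgetter(index).
        namespace[field] = property(itemgetter(index))
    return type(name, (NamedTupleBase,), namespace)


Point = make_namedtuple("Point", "x y")
p = Point(1, 2)
```

With this layout, isinstance(p, NamedTupleBase) is true for every generated
class, while field access still goes through a per-class property.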

Thank you,

From raymond.hettinger at  Sun Jan 12 21:01:35 2014
From: raymond.hettinger at (Raymond Hettinger)
Date: Sun, 12 Jan 2014 20:01:35 +0000
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 11, 2014, at 11:04 PM, Yury Selivanov < at> wrote:

> I propose to add a baseclass for all namedtuples. Right now 'namedtuple'
> function dynamically creates a class derived from 'tuple', which complicates
> things like dynamic dispatch.

A named tuple is a protocol, not a class. 

Here's the glossary entry:

named tuple
Any tuple-like class whose indexable elements are also accessible using named attributes (for example, time.localtime() returns a tuple-like object where the year is accessible either with an index such as t[0] or with a named attribute like t.tm_year).

A named tuple can be a built-in type such as time.struct_time, or it can be created with a regular class definition. A full featured named tuple can also be created with the factory function collections.namedtuple(). The latter approach automatically provides extra features such as a self-documenting representation like Employee(name='jones', title='programmer').


> Basically, the only way of checking if an object
> is an instance of 'namedtuple' is to do "isinstance(o, tuple) and hasattr(o, '_fields')".

Yes, that is the correct way of doing it.

ABCs weren't meant to replace all instances of duck typing.
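
As a small illustration of the duck-typed check quoted above, the sketch below
wraps it in a helper. The name is_namedtuple is hypothetical and is not part
of the standard library.

```python
from collections import namedtuple


def is_namedtuple(obj):
    """Duck-typed check from the thread: a tuple that also exposes _fields."""
    return isinstance(obj, tuple) and hasattr(obj, "_fields")


Employee = namedtuple("Employee", "name title")
e = Employee("jones", "programmer")
```

A plain tuple such as (1, 2) fails the check because it has no _fields
attribute, while any class created by collections.namedtuple() passes it.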


P.S. Here's a link to previous discussion on the subject:


From tjreedy at  Sun Jan 12 22:17:20 2014
From: tjreedy at (Terry Reedy)
Date: Sun, 12 Jan 2014 16:17:20 -0500
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <lav0ol$h0n$>

On 1/12/2014 3:01 PM, Raymond Hettinger wrote:
> On Jan 11, 2014, at 11:04 PM, Yury Selivanov wrote:
>> I propose to add a baseclass for all namedtuples. Right now 'namedtuple'
>> function dynamically creates a class derived from 'tuple', which
>> complicates
>> things like dynamic dispatch.
> A named tuple is a protocol, not a class.
> Here's the glossary entry:
> '''
> named tuple
>     Any tuple-like class whose indexable elements are also accessible
>     using named attributes (for example, time.localtime() returns a
>     tuple-like object where the year is accessible either with an index
>     such as t[0] or with a named attribute like t.tm_year).
>     A named tuple can be a built-in type such as time.struct_time, or it
>     can be created with a regular class definition. A full featured named
>     tuple can also be created with the factory function
>     collections.namedtuple(). The latter approach automatically provides
>     extra features such as a self-documenting representation like
>     Employee(name='jones', title='programmer').
> '''

That is a really nice glossary entry. I had not seen it before.

>> Basically, the only way of checking if an object
>> is an instance of 'namedtuple' is to do "isinstance(o, tuple) and
>> hasattr(o, '_fields')".
> Yes, that is the correct way of doing it.

That looks fine to me also, so I agree that nothing new is needed.

Terry Jan Reedy

From abarnert at  Sun Jan 12 22:35:34 2014
From: abarnert at (Andrew Barnert)
Date: Sun, 12 Jan 2014 13:35:34 -0800 (PST)
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <20140112115516.GA3869@ando>
References: <>
Message-ID: <>

From: Steven D'Aprano <steve at>

Sent: Sunday, January 12, 2014 3:55 AM

> Changing namedtuple is not enough.

In fact, it's almost completely orthogonal to adding a NamedTuple ABC. Changing namedtuple shouldn't be necessary, and definitely won't be sufficient.

> So I fail to see how anything short of a massive re-engineering of not 
> just namedtuple but also any C namedtuple-like types will satisfy the 
> OP's use-case. Have I missed something?

I said pretty much the same thing yesterday, but on further reflection, I think it's a lot simpler than it looks.

First, let's write

    class NamedTuple(Sequence):
        @classmethod
        def __subclasshook__(cls, sub):
            if not issubclass(sub,
                return False
            try:
                sub._fields
                return True
            except:
                return NotImplemented

That's easy, and it works with namedtuple types with no change, and it should work with any Python wrapper type that's designed to emulate namedtuple without using it (e.g., if someone decides to write a custom implementation with a shared base class, so he can make all of his types share implementations for _make and friends, as has been suggested on this thread).
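
A runnable variant of that idea might look like the sketch below. It rests on
two assumptions that are not in the snippet above: the class being tested
against is collections.abc.Sequence, and the ABC uses ABCMeta directly rather
than inheriting from Sequence (a choice made here to keep the subclass check
simple). It is an illustration, not a proposed stdlib implementation.

```python
from abc import ABCMeta
from collections import namedtuple
from collections.abc import Sequence


class NamedTuple(metaclass=ABCMeta):
    """Sketch of an ABC that recognizes any sequence type exposing _fields."""

    @classmethod
    def __subclasshook__(cls, sub):
        if cls is not NamedTuple:
            # Only answer for the ABC itself, not for classes derived from it.
            return NotImplemented
        if not issubclass(sub, Sequence):
            return False
        # Assumption: the presence of _fields on the type is the whole test.
        return True if hasattr(sub, "_fields") else NotImplemented


Point = namedtuple("Point", "x y")
```

Because the check is structural, existing namedtuple classes pass it without
being registered or modified.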

So, what about C types? Obviously they don't generally supply _fields or anything else useful.

But most (all?) of the namedtuple-like types in builtins/stdlib are built with PyStructSequence, and adding _fields to them requires just a few lines at the end of PyStructSequence_InitType2:

    PyObject *_fields = PyTuple_New(desc->n_in_sequence);
    for (i = 0; i != desc->n_in_sequence; ++i) {
        PyObject *field = PyUnicode_FromString(desc->fields[i].name);
        PyTuple_SET_ITEM(_fields, i, field);
    }
    PyDict_SetItemString(dict, "_fields", _fields);

In fact, that might be worth doing even without the NamedTuple ABC proposal.

But StructSequence has only been an exposed, documented protocol since 3.3, so surely there are extension modules out there that do their namedtuple-like types manually. (In a quick look around, I couldn't find any examples, although I did find a couple with Python wrappers that create a namedtuple around the result returned by a C implementation function, but I'm sure they exist.)

Obviously you need to be able to get the field names from somewhere, whether that's an attribute or method on the type, copy-pasting from documentation or source, or even parsing the repr of an instance or something, but then you can just generate a wrapper from the type and its field names.

And we could just leave it at that: "Sorry, those aren't NamedTuple classes, but you can always implement a wrapper in Python yourself." Or we could add a wrapper-generator to the collections module. Something like this:

    def namedtuplize(cls, fields):
        if isinstance(fields, str):
            fields = fields.split()
        class Sub:
            _fields = fields
            def __init__(self, *args, **kwargs):
                self.values = cls(*args, **kwargs)
            def __repr__(self):
                return repr(self.values)
            # a handful of other special methods that can't be getattrified
            def __getattr__(self, attr):
                return getattr(self.values, attr)
        return Sub

    statfields = 'st_mode st_ino st_dev st_nlink st_uid st_gid st_size st_atime st_mtime st_ctime'
    Stat = namedtuplize(os.stat_result, statfields)
    stats = (Stat(os.stat(f)) for f in os.listdir('.'))

(I'm using os.stat_result as an example, even though it's already a PyStructSequence so you wouldn't need it here, only for lack of a real-life example.)

And then you can write a wrapper around os.stat that returns a Stat instead of an os.stat_result. Or, going the other way, in a quick&dirty script that just wraps a handful of these, you can even just wrap each object:

    def namedtuplify(obj, fields):
        return namedtuplize(type(obj), fields)(obj)

While the namedtuplize function could be useful in the stdlib, the namedtuplify function is less useful, there are many cases where it's a bad idea, and it's trivial to write yourself if you need it, so I wouldn't add that to collections, except maybe as a recipe in the docs.

One last thing: Either the ABC or the wrapper could also add _as_odict and the other methods that can be easily derived from _fields, because they're useful, and I frequently see people doing _as_odict by calling getattr(self, field) on each field.

From at  Sun Jan 12 22:58:19 2014
From: at (Yury Selivanov)
Date: Sun, 12 Jan 2014 16:58:19 -0500
Subject: [Python-ideas] namedtuple baseclass
In-Reply-To: <>
References: <>
Message-ID: <etPan.52d30ffb.6b8b4567.e749@ysmbp.local>


On January 12, 2014 at 3:01:42 PM, Raymond Hettinger wrote:
> On Jan 11, 2014, at 11:04 PM, Yury Selivanov wrote:
> > I propose to add a baseclass for all namedtuples. Right now 'namedtuple'
> > function dynamically creates a class derived from 'tuple', which complicates
> > things like dynamic dispatch.
> A named tuple is a protocol, not a class.

This line actually makes a lot of sense, thank you for the explanation.

Since it's a protocol, and a widely used one, how about reopening the
discussion (started in #7796) on adding an ABC?

I understand the issue with structseq, but we can have the ABC now for
regular named tuples. If/Once the named tuple API is implemented for
structseqs, it will automatically conform to the proposed ABC.

Thank you,

From abarnert at  Mon Jan 13 01:17:20 2014
From: abarnert at (Andrew Barnert)
Date: Sun, 12 Jan 2014 16:17:20 -0800 (PST)
Subject: [Python-ideas] Making PyStructSequence expose _fields (was Re:
	namedtuple base class)
In-Reply-To: <>
References: <>
Message-ID: <>

I don't think the proposed NamedTuple ABC adds anything on top of duck typing on _fields (or on whichever other method you need, and possibly checking for Sequence). As Raymond Hettinger summarized it nicely, namedtuple is a protocol, not a type.

But I think one of the ideas that came out of that discussion is worth pursuing on its own: giving a _fields member to every structseq type.

Most of the namedtuple-like classes in the builtins/stdlib, like os.stat_result, are implemented with PyStructSequence. Since 3.3, that's been a public, documented protocol. A structseq type is already a tuple. And it stores all the information needed to expose the fields to Python, it just doesn't expose them in any way. And making it do so is easy. (Either add it to the type __dict__ at type creation, or add a getter that generates it on the fly from tp_members.)

Of course a structseq can do more than a namedtuple. In particular, using a structseq via its _fields would mean that you miss its "non-sequence" fields, like st_mtime_ns. But then that's already true for using a structseq as a sequence, or just looking at its repr, so I don't think that's a problem. (The "visible fields" are visible for a reason.)

And this still wouldn't mean that _fields is part of the "named tuple protocol" described in the glossary, just that it's part of structseq types as well as collections.namedtuple types.

And this wouldn't give structseq an on-demand __dict__ so you can just call vars(s) instead of OrderedDict(zip(s._fields, s)).

Still, it seems like a clear win. A small patch, a bit of extra storage on each structseq type object (not on the instances), and now you can reflect on the most common kind of C named tuple types the same way you do on the most common kind of Python named tuple types.
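
For reference, the kind of reflection this enables can be sketched with a
regular collections.namedtuple class, which already has _fields. The helper
name fields_as_odict is hypothetical, and the sketch assumes only that the
object is a sequence whose type exposes _fields in positional order.

```python
from collections import OrderedDict, namedtuple


def fields_as_odict(obj):
    """Pair each name in _fields with the value at the same position."""
    return OrderedDict(zip(obj._fields, obj))


Point = namedtuple("Point", "x y")
p = Point(1, 2)
d = fields_as_odict(p)
```

For namedtuple instances this gives the same mapping as the built-in
_asdict() method; the point of the proposal is that the same helper would
then also work on structseq instances.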

From abarnert at  Mon Jan 13 01:32:21 2014
From: abarnert at (Andrew Barnert)
Date: Sun, 12 Jan 2014 16:32:21 -0800 (PST)
Subject: [Python-ideas] Making PyStructSequence expose _fields (was Re:
	namedtuple base class)
In-Reply-To: <>
References: <>
Message-ID: <>

Here's a quick patch:

diff -r bc5f257f5cc1 Lib/test/
--- a/Lib/test/test_structseq.py	Sun Jan 12 14:12:59 2014 -0800
+++ b/Lib/test/test_structseq.py	Sun Jan 12 16:31:15 2014 -0800
@@ -28,6 +28,16 @@
         for i in range(-len(t), len(t)-1):
             self.assertEqual(t[i], astuple[i])

+    def test_fields(self):
+        t = time.gmtime()
+        self.assertEqual(t._fields,
+                         ('tm_year', 'tm_mon', 'tm_mday', 'tm_hour', 'tm_min',
+                          'tm_sec', 'tm_wday', 'tm_yday', 'tm_isdst'))
+        st = os.stat(__file__)
+        self.assertIn("st_mode", st._fields)
+        self.assertIn("st_ino", st._fields)
+        self.assertIn("st_dev", st._fields)
+
     def test_repr(self):
         t = time.gmtime()
         self.assertTrue(repr(t))
diff -r bc5f257f5cc1 Objects/structseq.c
--- a/Objects/structseq.c	Sun Jan 12 14:12:59 2014 -0800
+++ b/Objects/structseq.c	Sun Jan 12 16:31:15 2014 -0800
@@ -7,6 +7,7 @@
 static char visible_length_key[] = "n_sequence_fields";
 static char real_length_key[] = "n_fields";
 static char unnamed_fields_key[] = "n_unnamed_fields";
+static char _fields_key[] = "_fields";

 /* Fields with this name have only a field index, not a field name.
    They are only allowed for indices < n_visible_fields. */
@@ -14,6 +15,7 @@
 _Py_IDENTIFIER(n_sequence_fields);
 _Py_IDENTIFIER(n_fields);
 _Py_IDENTIFIER(n_unnamed_fields);
+_Py_IDENTIFIER(_fields);

 #define VISIBLE_SIZE(op) Py_SIZE(op)
 #define VISIBLE_SIZE_TP(tp) PyLong_AsLong( \
@@ -327,6 +329,7 @@
     PyMemberDef* members;
     int n_members, n_unnamed_members, i, k;
     PyObject *v;
+    PyObject *_fields;

 #ifdef Py_TRACE_REFS
     /* if the type object was chained, unchain it first
@@ -389,6 +392,19 @@
     SET_DICT_FROM_INT(real_length_key, n_members);
     SET_DICT_FROM_INT(unnamed_fields_key, n_unnamed_members);

+    _fields = PyTuple_New(desc->n_in_sequence);
+    if (!_fields)
+        return -1;
+    for (i = 0; i != desc->n_in_sequence; ++i) {
+        PyObject *field = PyUnicode_FromString(members[i].name);
+        PyTuple_SET_ITEM(_fields, i, field);
+    }
+    if (PyDict_SetItemString(dict, _fields_key, _fields) < 0) {
+        Py_DECREF(_fields);
+        return -1;
+    }
+    Py_DECREF(_fields);
+
     return 0;
 }

@@ -417,7 +433,8 @@
 {
     if (_PyUnicode_FromId(&PyId_n_sequence_fields) == NULL
         || _PyUnicode_FromId(&PyId_n_fields) == NULL
-        || _PyUnicode_FromId(&PyId_n_unnamed_fields) == NULL)
+        || _PyUnicode_FromId(&PyId_n_unnamed_fields) == NULL
+        || _PyUnicode_FromId(&PyId__fields) == NULL)
         return -1;

     return 0;


From ethan at  Mon Jan 13 01:44:03 2014
From: ethan at (Ethan Furman)
Date: Sun, 12 Jan 2014 16:44:03 -0800
Subject: [Python-ideas] Making PyStructSequence expose _fields (was Re:
 namedtuple base class)
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/12/2014 04:32 PM, Andrew Barnert wrote:
> Here's a quick patch:

Please put the patch on the issue tracker[1].  Create a new issue if an appropriate one does not already exist.




From abarnert at  Mon Jan 13 02:16:20 2014
From: abarnert at (Andrew Barnert)
Date: Sun, 12 Jan 2014 17:16:20 -0800 (PST)
Subject: [Python-ideas] Making PyStructSequence expose _fields (was Re:
	namedtuple base class)
In-Reply-To: <>
References: <>
Message-ID: <>

See for the issue and patch. Thanks to Ethan Furman for telling me to post it there instead of here.


From ncoghlan at  Mon Jan 13 03:41:21 2014
From: ncoghlan at (Nick Coghlan)
Date: Mon, 13 Jan 2014 12:41:21 +1000
Subject: [Python-ideas] Making PyStructSequence expose _fields (was Re:
 namedtuple base class)
In-Reply-To: <>
References: <>
Message-ID: <>

On 13 Jan 2014 11:19, "Andrew Barnert" <abarnert at> wrote:
> See for the issue and patch. Thanks to Ethan Furman for telling me to
> post it there instead of here.

This approach sounds good to me for 3.5.

The ABC recipe might make a good addition to the ActiveState cookbook.



From musicdenotation at  Mon Jan 13 07:48:29 2014
From: musicdenotation at (musicdenotation at
Date: Mon, 13 Jan 2014 13:48:29 +0700
Subject: [Python-ideas] Multi-statement anonymous functions
Message-ID: <>

Proposed syntaxes:
> let function(*args,**kwargs):
>     ...body...
> function2(...args...):
>     ...body...
> in:
>     [statements]

> do:
>     [statements]
> where [function declarations in the same form as above]

Inspired by Haskell and Julia.

This has the advantage that declared functions aren't bound to names outside their context.

From amber.yust at  Mon Jan 13 08:11:49 2014
From: amber.yust at (Amber Yust)
Date: Mon, 13 Jan 2014 07:11:49 +0000
Subject: [Python-ideas]  Multi-statement anonymous functions
References: <>
Message-ID: <3296035059350497264@gmail297201516>

Can't you already essentially accomplish the same thing by simply nesting
function definitions within another function?
On Sun Jan 12 2014 at 10:49:10 PM, <musicdenotation at> wrote:

> Proposed syntaxes:
> > let function(*args,**kwargs):
> >     ...body...
> > function2(...args...):
> >     ...body...
> > in:
> >     [statements]
> > do:
> >     [statements]
> > where [function declarations in the same form as above]
> Inspired by Haskell and Julia.
> This has the advantage that declared functions aren't binded to names
> outside their context.

From abarnert at  Mon Jan 13 09:21:07 2014
From: abarnert at (Andrew Barnert)
Date: Mon, 13 Jan 2014 00:21:07 -0800 (PST)
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <>

From: "musicdenotation at" <musicdenotation at>

Sent: Sunday, January 12, 2014 10:48 PM

> Subject: [Python-ideas] Multi-statement anonymous functions
> Proposed syntaxes:
>>  let function(*args,**kwargs):
>>      ...body...
>>  function2(...args...):
>>      ...body...
>>  in:
>>      [statements]
>>  do:
>>      [statements]
>>  where [function declarations in the same form as above]
> Inspired by Haskell and Julia.
> This has the advantage that declared functions aren't binded to names 
> outside their context.

I think there's something interesting here, but I'm not seeing it. What's the actual use case for this?

If you haven't read PEP 403 and PEP 3150, you should; they both offer similar (but not identical) features in a way that seems more readable (both more compact, and "fronting" the most important part of the construct):

    @in statement that uses function1
    def function1(*args, **kwargs):
        body

    statement that uses function1 and var1 given:
        def function1(*args, **kwargs):
            body
        var1 = value

Meanwhile, my first question for your syntax is: Why limit it to function definitions? It's worth noting that a Haskell let statement creates local bindings for any values you want; it's not restricted to functions. And that restriction is the only thing that forces the awkward block structure (which would need to be parsed differently than existing Python structures, both by the compiler and by human readers). Why not just a let statement that lets you execute _any_ statements in a local scope, then use that scope:

    let:
        def function1(*args, **kwargs):
            body
        var = value
        any other statement you want
    in:
        statements

... or, for that matter, just a local-scope statement:

    local:
        def function1(*args, **kwargs):
            body
        var = value
        any other statement you want
        statements that use those definitions

This has an advantage over Nick Coghlan's two proposals in that you get to run a full suite with the local scope, instead of just a single statement. (His fronting of the statement makes that restriction necessary; yours doesn't.)

But I'm wondering why you need a local scope.

The let statement is necessary in Haskell because namespaces, like everything else, are immutable, and there are no real assignments; if you want to bind another variable, you have to create a new scope with that binding on top of the existing one. In Python, if you want to bind another variable, you just use an assignment/def/class/etc. And if you're worried about the name being accessible from outside of the namespace (e.g., if someone does a "from foo import *" on you), there are already idiomatic ways to deal with that: prefix the name with _, or give the module an __all__. Or, again: Python namespaces are mutable, so you can just del a binding after you're done with it if you really need to.
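
As a rough illustration of those three idioms (underscore prefix, __all__, and del), here is a minimal sketch. The module name demo_mod and every name inside SOURCE are made up for the example; the module is built in memory only so the snippet is self-contained.

```python
import sys
import types

# Hypothetical module source; all names here are invented for illustration.
SOURCE = '''
__all__ = ["public_api"]      # only this name is exported by "import *"

def _helper(x):               # leading underscore marks the name as private
    return x * 2

def public_api(x):
    return _helper(x) + 1

scratch = [public_api(n) for n in range(3)]
first = scratch[0]
del scratch                   # namespaces are mutable: drop the temporary
'''

def load_demo_module(name="demo_mod"):
    """Build a throwaway module from SOURCE and register it for import."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    sys.modules[name] = module
    return module

demo = load_demo_module()

# Simulate a client doing "from demo_mod import *" in its own namespace.
namespace = {}
exec("from demo_mod import *", namespace)
star_names = sorted(n for n in namespace if not n.startswith("__"))

print(star_names)                  # names the client actually received
print(hasattr(demo, "scratch"))    # whether the deleted temporary survives
```

Only the name listed in __all__ reaches the client, and the temporary is gone from the module once del has run, which is the behaviour the paragraph above relies on.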

Coming at it from a different angle, JavaScript, which has mutable namespaces very much like Python, needs local scopes pretty frequently. But that's only because it has no modules, so everything is in one giant global namespace, which makes it hard to avoid conflicts, figure out where things are defined, etc. So that doesn't seem to apply to Python either.

Also, in most cases where you _do_ need a local scope, just defining and calling a function works just fine. That's what people do in Python when they need a local binding for micro-optimization purposes. And the same idiom is used all over the place in JavaScript (which, again, needs local scopes much more often than Python). Is there a use case where that isn't appropriate?
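
For concreteness, a minimal sketch of that define-and-call idiom. The helper name _build_table and the values inside it are hypothetical, chosen only to show that nothing bound inside the function leaks into the enclosing namespace.

```python
import math

def _build_table():
    # Everything bound in here is local to this one call.
    sqrt = math.sqrt              # local alias: the micro-optimization idiom
    step = 0.5
    return [sqrt(i * step) for i in range(5)]

TABLE = _build_table()
del _build_table                  # optional: drop the helper once it has run

# Neither temporary name is visible at module level afterwards.
leaked = [name for name in ("sqrt", "step") if name in globals()]
print(TABLE)
print(leaked)
```

The function body plays the role of the proposed "let" block, and the call plays the role of "in"; the only cost is having to pick a name for the helper.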

From musicdenotation at  Mon Jan 13 11:23:01 2014
From: musicdenotation at (musicdenotation at
Date: Mon, 13 Jan 2014 17:23:01 +0700
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <>

I change my proposal:
> let:
>     [all new variables created here are local to the scope, can use global and nonlocal]
> in:
>     [all new variables created here belong to the surrounding scope, but variables introduced in the let statement will be usable and reassignable]
or:
> do:
>     [same semantics as in above]
> where:
>     [same semantics as let above]
Or my original proposal but with variable assignment allowed.

Actually, I made my original proposal that way because I didn't want to mess with globals() and locals().
And the where statement is to allow function definitions after their usage.

From musicdenotation at  Mon Jan 13 12:06:00 2014
From: musicdenotation at (musicdenotation at
Date: Mon, 13 Jan 2014 18:06:00 +0700
Subject: [Python-ideas] Multi-statement anonymous functions
Message-ID: <>

Mutable namespaces and modules are just workarounds and cannot be substituted for local namespaces.

From abarnert at  Mon Jan 13 12:13:15 2014
From: abarnert at (Andrew Barnert)
Date: Mon, 13 Jan 2014 03:13:15 -0800
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <>

So I assume you haven't read PEP 403 and 3150, and don't intend to, even though they directly relate to your idea?

From abarnert at  Mon Jan 13 12:16:43 2014
From: abarnert at (Andrew Barnert)
Date: Mon, 13 Jan 2014 03:16:43 -0800
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 13, 2014, at 3:06, musicdenotation at wrote:

> Mutable namespaces and modules are just workarounds and cannot be substituted for local namespaces.

Sure, in the exact same way that mutable file objects are just workarounds and cannot be substituted for an I/O monad.

If you don't think being able to write "a=3" and modify the current (module/class/local) scope is helpful, I think you may be using the wrong language.

From ncoghlan at  Mon Jan 13 14:53:08 2014
From: ncoghlan at (Nick Coghlan)
Date: Mon, 13 Jan 2014 23:53:08 +1000
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <>

On 13 January 2014 21:13, Andrew Barnert <abarnert at> wrote:
> So I assume you haven't read PEP 403 and 3150, and don't intend to, even
> though they directly relate to your idea?

In particular: :)

It isn't in PEP 3150 itself any more (since it was no longer relevant
after the PEP switched to explicit forward references), but any
multiple-name-binding based proposal needs to account for the torture
test I created back when PEP 3150 allowed implicit access to the
statement local scope:


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From musicdenotation at  Mon Jan 13 15:06:03 2014
From: musicdenotation at (musicdenotation at
Date: Mon, 13 Jan 2014 21:06:03 +0700
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <>

> On Jan 13, 2014, at 18:16, Andrew Barnert <abarnert at> wrote:
>> On Jan 13, 2014, at 3:06, musicdenotation at wrote:
>> Mutable namespaces and modules are just workarounds and cannot be substituted for local namespaces.
> Sure, in the exact same way that mutable file objects are just workarounds and cannot be substituted for an I/O monad.
> If you don't think being able to write "a=3" and modify the current (module/class/local) scope is helpful, I think you may be using the wrong language.
No, an I/O monad is a workaround for free side effects. What I want is a canonical, obvious, natural solution to a problem, not a workaround.

From masklinn at  Mon Jan 13 16:11:19 2014
From: masklinn at (Masklinn)
Date: Mon, 13 Jan 2014 16:11:19 +0100
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <>

On 2014-01-13, at 15:06 , musicdenotation at wrote:
> No, an I/O monad is a workaround for free side effects. What I want is a canonical, obvious, natural solution to a problem, not a workaround.

Monads are not workarounds for anything (any more than option types are a
workaround for a lack of null). They're a type-safe encoding of a
sequential computation; the IO monad is the application of that
concept to the IO subset of side-effecting computations. Monads are not
restricted to side-effecting computations (let alone IO ones): in
Haskell, option types and lists are also monadic types.
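
To make that last point concrete in this list's language, here is a loose Python sketch of the "bind" operation for an option-like type and for lists. It is only an approximation: the function names are invented for the example, None stands in for Haskell's Nothing (so a legitimate None value cannot be represented), and nothing here is checked by a type system the way it would be in Haskell.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")

def bind_option(value: Optional[A], f: Callable[[A], Optional[B]]) -> Optional[B]:
    """Chain a step that may fail; None short-circuits the rest."""
    return None if value is None else f(value)

def bind_list(values: Iterable[A], f: Callable[[A], Iterable[B]]) -> List[B]:
    """Chain a step that yields several results; flatten them in order."""
    return [y for x in values for y in f(x)]

def safe_reciprocal(x: float) -> Optional[float]:
    # A computation that can fail without raising.
    return None if x == 0 else 1 / x

def neighbours(n: int) -> List[int]:
    # A computation with several possible results.
    return [n - 1, n + 1]

ok = bind_option(bind_option(4.0, safe_reciprocal), safe_reciprocal)
failed = bind_option(bind_option(0.0, safe_reciprocal), safe_reciprocal)
two_steps = bind_list(bind_list([0], neighbours), neighbours)

print(ok, failed, two_steps)
```

Neither chain involves I/O or any other side effect; in both cases bind just sequences one computation after another.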

From musicdenotation at  Mon Jan 13 16:50:10 2014
From: musicdenotation at (musicdenotation at
Date: Mon, 13 Jan 2014 22:50:10 +0700
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <>

>> On Jan 13, 2014, at 18:13, Andrew Barnert <abarnert at> wrote:
> So I assume you haven't read PEP 403 and 3150, and don't intend to, even though they directly relate to your idea?
I have read them. I intentionally introduced two ways to do it because the let variation allows you to avoid polluting the namespace while still programming the "traditional" but more obvious Python way.

Trivia: The following lines of code contain a cultural reference. Do you know what it is? It is very, very recent.

print(not_a_foot, file=be_seen)

del it

From steve at  Tue Jan 14 03:33:50 2014
From: steve at (Steven D'Aprano)
Date: Tue, 14 Jan 2014 13:33:50 +1100
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <20140114023350.GD3403@ando>

On Mon, Jan 13, 2014 at 09:06:03PM +0700, musicdenotation at wrote:

> What I want is a canonical, obvious, natural solution to a problem, 
> not a workaround.

Please explain what the problem is, what you consider "canonical", 
"obvious", and "natural", and how we should distinguish a "solution" 
from a "workaround".

Dropping arbitrary syntax into our laps with no explanation of what it 
means and what is metasyntax does not help. For example, you proposed:

    let function(*args,**kwargs):
    where [function declarations in the same form as above]

I have no idea what that is supposed to mean. E.g. is "function" a 
keyword (part of the syntax) or the name of something (a new function 
perhaps?)? Are the dots and [] syntax or metasyntax?

Don't assume we are familiar with Haskell and Julia, or that we can tell 
which bits are metasyntax and which are intended as new syntax. A good 
way to proceed is to give examples of the syntax, show the expected 
output, and preferably include a plain English description of what it 
does that is new or different from existing syntax.

Thank you.


From musicdenotation at  Thu Jan 16 05:55:00 2014
From: musicdenotation at (musicdenotation at
Date: Thu, 16 Jan 2014 11:55:00 +0700
Subject: [Python-ideas] J
Message-ID: <>

An embedded and charset-unspecified text was scrubbed...
Name: cid:user:composed
URL: <>

From denis.spir at  Thu Jan 16 11:20:08 2014
From: denis.spir at (spir)
Date: Thu, 16 Jan 2014 11:20:08 +0100
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/11/2014 03:36 PM, Chris Angelico wrote:
> On Sun, Jan 12, 2014 at 1:18 AM, Ram Rachum <ram.rachum at> wrote:
>> I think that `OrderedDict.items().__getitem__` should be implemented, to
>> solve this ugliness:
>> What do you think?
> Well, the first problem with that is that __getitem__ already exists,
> and it's dict-style :) So you can't fetch out an item by its position
> that way. But suppose you create a method that returns the Nth
> element.
> The implementation in CPython 3.4 is a linked list, so getting an
> arbitrary element by index would be quite inefficient. [...]

I have occasionally implemented ordered sets or associative arrays in a really 
simple, stupid manner [1]: just store the items or entries (pairs) in an array 
(instead of, say, anywhere in memory at the allocator's convenience), in 
addition to the usual array of "buckets".

About efficiency, there is a kind of balance of benefits & costs: On average, 
common operations should in principle be faster due to memory compactness and 
the consequent cache usage. There is no cost in memory size (entries must be 
somewhere anyway). There is a little cost when the whole data structure grows, 
since now 2 arrays have to be resized up; to mitigate this, I provide a 'predim' 
(predimension) method that avoids most growing operations.

The point is that now entries form an array that keeps insertion order, can be 
traversed and even indexed. An issue, however, as for any flexible-size array, 
is with item deletion. I don't delete at once, which would require compacting 
the array every time an entry is removed; instead I just mark entries as deleted. 
Whenever a big proportion of items are removed [2], there is automatic 
compaction. But with such a trick, indexing is then invalid (and prevented); to 
retrieve this indexing feature, if items have been deleted, clients must first 
run a 'compact' method, that actually removes all deleted items at once (and, if 
it is worth it, resizes down). For traversal however, there is no issue: the 
implementation just needs to skip items marked as deleted.

I have no idea whether such a stupid way to make ordered sets/dicts is 
compatible with the present requirements or implementation for python's 
ordereddicts (but suspect it is not). And I guess the constraint on indexing 
does not really fit the python way, in that an implementation constraint leaks 
into the client interface. I just wanted, however, to say a few words about that 
scheme due to its simplicity and practicality. Comments welcome.


[1] The common need is usually for what I call "mod tables", used as symbol 
tables: a mod table is like a hash table, but with keys being unsigned ints, 
thus there is no hash, instead plain modulo. This makes a sort of sparse array, 
but ordered. Numeric keys actually represent interned strings which themselves 
are keys in symbol tables (scopes, namespaces...).

[2] To avoid a kind of threshold effect, compaction happens when the count of 
items is less than 3/8 of capacity, not half of it. There must be a hysteresis 
(difference of threshold to resize up versus down) to avoid instability in the 
hypothetical case where the count of items is close to a resizing capacity and 
items are constantly being put & removed.
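
A minimal Python sketch of the scheme described above may make the trade-off
concrete. It is an illustration under stated assumptions, not the author's
actual code: the names `ArrayOrderedDict` and `item_at` are invented here, a
plain dict stands in for the array of buckets, the length of the entry list
stands in for "capacity" in the 3/8 rule from [2], and 'predim' is left out.

```python
_DELETED = object()  # tombstone marking a removed entry


class ArrayOrderedDict:
    """Insertion-ordered mapping: entries live in a list, a dict maps key -> slot."""

    def __init__(self):
        self._entries = []  # (key, value) pairs or _DELETED, in insertion order
        self._slots = {}    # key -> index into _entries (stand-in for the buckets)

    def __setitem__(self, key, value):
        if key in self._slots:
            self._entries[self._slots[key]] = (key, value)
        else:
            self._slots[key] = len(self._entries)
            self._entries.append((key, value))

    def __getitem__(self, key):
        return self._entries[self._slots[key]][1]

    def __delitem__(self, key):
        # Mark the slot as deleted instead of shifting the array.
        self._entries[self._slots.pop(key)] = _DELETED
        # Assumed trigger: compact once live items fall below 3/8 of the slots.
        if len(self._slots) * 8 < len(self._entries) * 3:
            self.compact()

    def __len__(self):
        return len(self._slots)

    def __iter__(self):
        # Traversal simply skips tombstones.
        return (e[0] for e in self._entries if e is not _DELETED)

    def compact(self):
        """Drop all tombstones at once and rebuild the key -> slot map."""
        self._entries = [e for e in self._entries if e is not _DELETED]
        self._slots = {k: i for i, (k, _) in enumerate(self._entries)}

    def item_at(self, index):
        """Positional access; refused while tombstones are present."""
        if len(self._entries) != len(self._slots):
            raise RuntimeError("entries were deleted; call compact() first")
        return self._entries[index]
```

After a deletion, `item_at` refuses to work until `compact()` has been called,
which is exactly the implementation constraint leaking into the client
interface that the message worries about.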

From oscar.j.benjamin at  Thu Jan 16 11:53:42 2014
From: oscar.j.benjamin at (Oscar Benjamin)
Date: Thu, 16 Jan 2014 10:53:42 +0000
Subject: [Python-ideas] `OrderedDict.items().__getitem__`
In-Reply-To: <larodu$qlm$>
References: <>
Message-ID: <>

On Sat, Jan 11, 2014 at 04:36:49PM +0100, Peter Otten wrote:
> Ram Rachum wrote:
> > I think that `OrderedDict.items().__getitem__` should be implemented, to
> > solve this ugliness:
> > 
> >
> item-of-ordereddict-in-python-3
> > 
> > What do you think?
> I think an O(N) __getitem__() is even uglier. Also, you should have really 
> compelling reasons for allowing the interfaces of dict.items() and 
> OrderedDict.items() to diverge.

Agreed, but I do think that OrderedDict could be more helpful here. I haven't
wanted to get the first item before but I have wanted to get the last without
popping it off. Since this can be provided in O(1) I think it would make a
reasonable addition as a property of OrderedDict:

    def last(self):
        if not self:
            raise KeyError('dictionary is empty')
        return self.__root.prev

Just returning the key is sufficient but in my own use cases I would have wanted
the whole item which you could easily do:

    def lastitem(self):
        if not self:
            raise KeyError('dictionary is empty')
        key = self.__root.prev
        return key, self.__map[key]
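
For comparison, roughly the same result can be had through the public API
alone, without touching the private `__root` link (whose layout is an
implementation detail). A small sketch; the helper names `last_key` and
`last_item` are made up for illustration:

```python
from collections import OrderedDict


def last_key(od):
    """Return the most recently inserted key without removing it."""
    if not od:
        raise KeyError('dictionary is empty')
    # reversed() starts at the end, so only one step of iteration is needed.
    return next(reversed(od))


def last_item(od):
    """Return the most recently inserted (key, value) pair."""
    key = last_key(od)
    return key, od[key]


od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
print(last_item(od))
```

Because only the first element of the reversed iteration is consumed, this
should not need to walk the whole dictionary.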


From ram.rachum at  Fri Jan 17 14:00:50 2014
From: ram.rachum at (Ram Rachum)
Date: Fri, 17 Jan 2014 05:00:50 -0800 (PST)
Subject: [Python-ideas] Add `n_threads` argument to
Message-ID: <>


I'd like to use `concurrent.futures.ProcessPoolExecutor` but have each 
process contain multiple worker threads. We could have an `n_threads` 
argument to the constructor, defaulting to 1 to maintain backward 
compatibility, and setting a value higher than 1 would cause multiple 
threads to be spawned in each process.

What do you think? 


From elazarg at  Fri Jan 17 14:45:14 2014
From: elazarg at (=?UTF-8?B?15DXnNei15bXqA==?=)
Date: Fri, 17 Jan 2014 15:45:14 +0200
Subject: [Python-ideas] Make max() stable
Message-ID: <>

Hi all,

Given several objects with the same key, max() returns the first one:

    >>> key = lambda x: 0
    >>> max(1, 2, key=key)
    1

This means it is not stable, at least according to the definition in
"Elements of Programming" by Alexander Stepanov and Paul McJones (pg. 52):

"Informally, an algorithm is stable if it respects the original order of
equivalent objects. So if we think of minimum and maximum as selecting,
respectively, the smallest and the second smallest from a list of two
arguments, stability requires that when called with equivalent elements,
minimum should return the first and maximum should return the second."

A page later, In a side note, the authors mention that "STL incorrectly
requires that `max(a, b)` return `a` when `a` and `b` are equivalent." (As
a reminder, Stepanov is the chief designer of the STL).

So, I know this is not a big issue, to say the least, but is there any
reason *not* to return the last argument? Are we trying to be compatible
with STL somehow?

I admit I don't know of any real use case where this really matters, but I
can point out that

   >>> ab = (1, 2)
   >>> (min(ab, key=key), max(ab, key=key))
   (1, 1)

is visibly less sensible than (1, 2).

Hope I'm not being a crank here...


From steve at  Fri Jan 17 17:06:04 2014
From: steve at (Steven D'Aprano)
Date: Sat, 18 Jan 2014 03:06:04 +1100
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <20140117160604.GJ3915@ando>

On Fri, Jan 17, 2014 at 03:45:14PM +0200, אלעזר wrote:
> Hi all,
> Given several objects with the same key, max() returns the first one:
>     >>> key = lambda x: 0
>     >>> max(1, 2, key=key)
>     1

A more natural example showing that both min and max return the *first* 
of equal elements is:

py> values = (1, 2, 1.0, 2.0)
py> min(values)
1
py> max(values)
2

> This means it is not stable, at least according to the definition in
> "Elements of Programming" by Alexander Stepanov and Paul McJones (pg. 52):
> "Informally, an algorithm is stable if it respects the original order of
> equivalent objects.

Stability is normally only of concern with sorting algorithms. I'm not 
sure whether Stepanov and McJones' definition is widely accepted, or 
important. There are clear reasons for desiring sorting to be stable. 
Are there any clear reasons to desire max to be stable in this sense?

(There is another, unrelated, meaning of stability with regard to 
numeric algorithms, but it has nothing to do with the order of objects.)

Note that stability in this sense is only meaningful when your values 
are objects whose identity you care about as well as their value.

> So if we think of minimum and maximum as selecting,
> respectively, the smallest and the second smallest from a list of two
> arguments, stability requires that when called with equivalent elements,
> minimum should return the first and maximum should return the second."
> A page later, In a side note, the authors mention that "STL incorrectly
> requires that `max(a, b)` return `a` when `a` and `b` are equivalent." (As
> a reminder, Stepanov is the chief designer of the STL).
> So, I know this is not a big issue, to say the least, but is there any
> reason *not* to return the last argument? Are we trying to be compatible
> with STL somehow?

No, I expect that the result is an accident of implementation. 
Specifically, the min and max algorithms probably look something like 

min = first item
for each item:
    if item < min: min = item

max = first item
for each item:
    if item > max: max = item
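
The pseudocode above can be turned into runnable Python to check the claim.
This is only a sketch of the presumed algorithm, not the actual CPython
implementation; `first_min` and `first_max` are hypothetical names.

```python
def first_min(items):
    it = iter(items)
    best = next(it)      # empty input raises StopIteration here, unlike the builtin
    for item in it:
        if item < best:  # strict comparison: an equal later item never replaces best
            best = item
    return best


def first_max(items):
    it = iter(items)
    best = next(it)
    for item in it:
        if item > best:  # strict again, so the first of several equal maxima is kept
            best = item
    return best


values = (1, 2, 1.0, 2.0)
# Both results are the ints that come first, not the equal floats after them.
print(type(first_min(values)), type(first_max(values)))
```

The strict `<` and `>` are what make the first of several equal items win.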

> I admit I don't know of any real use case where this really matters, but I
> can point out that
>    >>> ab = (1, 2)
>    >>> (min(ab, key=key), max(ab, key=key))
>    (1, 1)
> is visibly less sensible than (1, 2).

Using your weird key function here is going to give bizarre results. 
Consider:

ab = (2, 1)
min(ab, key=key), max(ab, key=key)

Current behaviour is to return (2, 2). I don't think that returning 2 
for the minimum and 1 for the maximum is more sensible, but that's 
because the key function is not sensible, not because of any objection 
to making max stable in this sense.


From rosuav at  Fri Jan 17 17:15:50 2014
From: rosuav at (Chris Angelico)
Date: Sat, 18 Jan 2014 03:15:50 +1100
Subject: [Python-ideas] Make max() stable
In-Reply-To: <20140117160604.GJ3915@ando>
References: <>
Message-ID: <>

On Sat, Jan 18, 2014 at 3:06 AM, Steven D'Aprano <steve at> wrote:
> Using your weird key function here is going to give bizarre results.
> Consider:
> ab = (2, 1)
> min(ab, key=key), max(ab, key=key)
> Current behaviour is to return (2, 2). I don't think that returning 2
> for the minimum and 1 for the maximum is more sensible, but that's
> because the key function is not sensible, not because of any objection
> to making max stable in this sense.

Imagine implementing min and max this way (ignoring key= and the
possibility of a single iterable arg):

def min(*args):
    return sorted(args)[0]

def max(*args):
    return sorted(args)[-1]

By that definition, a stable sort means that:

lst = sorted((x,y))
assert lst == [min(lst), max(lst)]

will pass for any x and y.

That said, I don't see any particular use cases for this identity.
Maybe the OP can enlighten?


From tjreedy at  Sat Jan 18 00:34:24 2014
From: tjreedy at (Terry Reedy)
Date: Fri, 17 Jan 2014 18:34:24 -0500
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <lbcelr$a0f$>

On 1/17/2014 8:45 AM, אלעזר wrote:
> Hi all,
> Given several objects with the same key, max() returns the first one:
>      >>> key = lambda x: 0
>      >>> max(1, 2, key=key)
>      1

As documented: "If multiple items are maximal, the function returns the 
first one encountered."

> This means it is not stable, at least according to the definition in
> "Elements of Programming" by Alexander Stepanov and Paul McJones (pg. 52):
> "Informally, an algorithm is stable if it respects the original order of
> equivalent objects.

Min and max are inherently functions of multisets, with order irrelevant 
but duplicate values allowed. So I do not think 'stability' applies, 
even if a multiset is presented in some arbitrary order.

 > So if we think of minimum and maximum as selecting,
> respectively, the smallest and the second smallest from a list of two
> arguments,

Why not say largest and second largest, but why either rather than just 
smallest and largest?

> stability requires that when called with equivalent elements,
> minimum should return the first and maximum should return the second."

The Python dev who wrote the doc disagrees: "This is consistent with 
other sort-stability preserving tools such as sorted(iterable, 
key=keyfunc, reverse=True)[0] and heapq.nlargest(1, iterable, key=keyfunc)."
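
A quick check of the consistency the documentation describes, using made-up
records (the names and scores are purely illustrative) in which the first two
tie on the key:

```python
import heapq
from operator import itemgetter

# Hypothetical (name, score) records; "alice" and "bob" tie on score.
records = [("alice", 90), ("bob", 90), ("carol", 85)]
score = itemgetter(1)

by_max = max(records, key=score)
by_sorted = sorted(records, key=score, reverse=True)[0]
by_heapq = heapq.nlargest(1, records, key=score)[0]

# All three pick the first of the tied records.
print(by_max, by_sorted, by_heapq)
```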

A simpler reason for the current behavior is that it is more efficient 
to not rebind the internal variable when an equal object is encountered.

In any case, changing the definition and implementation of max will 
break any code that depends on returning first versus last. We have to 
have a good reason to do so. If there is no such code (other than the 
test code that checks the 'first encountered' behavior), then there is 
no need to change.

Terry Jan Reedy

From abarnert at  Sat Jan 18 01:09:55 2014
From: abarnert at (Andrew Barnert)
Date: Fri, 17 Jan 2014 16:09:55 -0800
Subject: [Python-ideas] Make max() stable
In-Reply-To: <lbcelr$a0f$>
References: <>
Message-ID: <>

On Jan 17, 2014, at 15:34, Terry Reedy <tjreedy at> wrote:

> On 1/17/2014 8:45 AM, אלעזר wrote:
>> Hi all,
>> Given several objects with the same key, max() returns the first one:
>>     >>> key = lambda x: 0
>>     >>> max(1, 2, key=key)
>>     1
> As documented: "If multiple items are maximal, the function returns the first one encountered."
>> This means it is not stable, at least according to the definition in
>> "Elements of Programming" by Alexander Stepanov and Paul McJones (pg. 52):
>> "Informally, an algorithm is stable if it respects the original order of
>> equivalent objects.
> Min and max are inherently functions of multisets, with order irrelevant but duplicate values allowed.

I'm not sure that's necessarily true. The maximal value of a sequence makes every bit as much sense as the maximal value of a set or multiset, and I think it comes up quite often. For example, if I have a series of experiments at different times, the (time, value) pairs have an obvious meaningful order, and asking for max(experiments, key=itemgetter(1)) is a meaningful thing to do.

But often, even for a sequence, you don't care which max you get.

And often, when you do care, you explicitly want the first. The high score on a video game belongs to the first person who reached that score, not to someone who later tied him.

Sure, _sometimes_ you want the last rather than the first. I've actually written variants of max, nlargest, groupby, etc. that track the last value instead of the first (e.g., to make groupby treat adjacent runs as a group) multiple times. But I wouldn't expect that to be the default behavior of any of those functions.

So, I think the existing design of all these functions is less surprising than the alternative, and adequately documented, and it's easy enough to write the alternative when you need it.
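
As an illustration of how little code the alternative takes, here is one
possible sketch of a max that keeps the last of several tied items. The name
`max_last` is hypothetical, and this is not one of the variants mentioned
above, just a minimal version of the idea:

```python
def max_last(iterable, key=None):
    """Return the last of the maximal items (the builtin max returns the first)."""
    it = iter(iterable)
    try:
        best = next(it)
    except StopIteration:
        raise ValueError("max_last() arg is an empty sequence") from None
    best_key = best if key is None else key(best)
    for item in it:
        item_key = item if key is None else key(item)
        if not item_key < best_key:  # a tie replaces the current best
            best, best_key = item, item_key
    return best


# Hypothetical high-score table: "bob" reached 70 first, "cy" tied later.
scores = [("ann", 50), ("bob", 70), ("cy", 70)]
by_score = lambda entry: entry[1]
print(max(scores, key=by_score), max_last(scores, key=by_score))
```

With the builtin, the high score stays with whoever got there first; with
`max_last` it moves to the most recent player to tie.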

> In any case, changing the definition and implementation of max will break any code that depends on returning first versus last.

This is about as perfect a reason as possible. If it ever matters, we can't change it; if it never matters, we have no reason to change it...

From tjreedy at  Sat Jan 18 01:48:55 2014
From: tjreedy at (Terry Reedy)
Date: Fri, 17 Jan 2014 19:48:55 -0500
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
 <lbcelr$a0f$> <>
Message-ID: <lbcj1h$liu$>

On 1/17/2014 7:09 PM, Andrew Barnert wrote:
> On Jan 17, 2014, at 15:34, Terry Reedy <tjreedy at> wrote:

>> Min and max are inherently functions of multisets, with order
>> irrelevant but duplicate values allowed.

I should have said a multiset of comparable objects.

> I'm not sure that's necessarily true. The maximal value of a sequence
> makes every bit as much sense as the maximal value of a set or
> multiset

A list of comparable objects *is* a multiset of comparable objects. So 
is any iterable of comparable objects. Which is why 'iterable of 
comparable objects' is the proper domain for max. Similar comments apply 
to any commutative associative operator.

> For example, if I have
> a series of experiments at different times, the (time, value) pairs
> have an obvious meaningful order, and asking for max(experiments,
> key=itemgetter(1)) is a meaningful thing to do.

max((value,time) for time,value in experiments)

gives the latest high value. In general

max((val,i) for i,val in enumerate(iterable))

does the same.

If max gave the last maximum, it would be trickier to get the first 
maximum, just as it is now to get the last minimum.
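
To make the tie-breaking concrete, here is a small sketch with made-up data.
Pairing each value with its index picks the last of the tied extremes for max;
negating the index is one way (an assumption of this sketch, not something
stated above) to steer the tie the other way:

```python
values = [3, 7, 2, 7, 2]  # illustrative data with a repeated max and a repeated min

# Last maximum: among equal values, the larger index wins the comparison.
last_max_val, last_max_idx = max((val, i) for i, val in enumerate(values))

# First maximum via the same trick: negate the index so the smaller one wins.
first_max_val, neg_idx = max((val, -i) for i, val in enumerate(values))
first_max_idx = -neg_idx

# Last minimum: min prefers the smaller second element, so negate the index.
last_min_val, neg_idx = min((val, -i) for i, val in enumerate(values))
last_min_idx = -neg_idx

print(last_max_idx, first_max_idx, last_min_idx)
```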

Terry Jan Reedy

From ethan at  Sat Jan 18 01:26:48 2014
From: ethan at (Ethan Furman)
Date: Fri, 17 Jan 2014 16:26:48 -0800
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
 <lbcelr$a0f$> <>
Message-ID: <>

On 01/17/2014 04:09 PM, Andrew Barnert wrote:
> This is about as perfect a reason as possible. If it ever matters, we can't change it; if it never matters, we have no reason to change it...



From ned at  Sat Jan 18 02:35:57 2014
From: ned at (Ned Batchelder)
Date: Fri, 17 Jan 2014 20:35:57 -0500
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

On 1/17/14 8:45 AM, אלעזר wrote:
> Hi all,
> Given several objects with the same key, max() returns the first one:
> >>> key = lambda x: 0
> >>> max(1, 2, key=key)
> 1
> This means it is not stable, at least according to the definition in 
> "Elements of Programming" by Alexander Stepanov and Paul McJones (pg. 52):
> "Informally, an algorithm is stable if it respects the original order 
> of equivalent objects. So if we think of minimum and maximum as 
> selecting, respectively, the smallest and the second smallest from a 
> list of two arguments, stability requires that when called with 
> equivalent elements, minimum should return the first and maximum 
> should return the second."
I don't understand this logic at all. Stability matters in sorting 
because sort() takes a sequence and returns a sequence, and for various 
reasons you might need to sort a list twice, with different criteria. 
Stability guarantees that the second sort won't discard the work of the 
first sort.

Is there an example of an actual problem that stability of min and max 
would make easier to solve?


From nas-python at  Sat Jan 18 04:22:19 2014
From: nas-python at (Neil Schemenauer)
Date: Fri, 17 Jan 2014 21:22:19 -0600
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python 3.x
Message-ID: <>

The transition to Python 3 is happening but there is still a massive
amount of code that needs to be ported.  One of the most disruptive
changes in Python 3 is the strict separation of bytes from unicode
strings.  Most of the other incompatible changes can be handled by

Here is a far out idea to make transition smoother.  Release version
2.8 of Python with nearly all Python 3.x incompatible changes except
for the bytes/unicode changes.  This could include:

- print as function

- default string literal as unicode

- return view objects for dict.keys(), etc

- rename modules in standard library

- rename long to int

- rename .next() to __next__()

- accept only new 'raise' syntax

- remove backticks for repr

- rename unicode to str

- removal of 'apply', 'buffer', 'callable', 'execfile'

- exec as function

- rename os.getcwdu() to os.getcwd()

- remove dict.has_key

- move intern to sys.intern()

- rename xrange to range

- remove xreadlines

New features of Python 3.x could be backported if easy since they
could be useful to entice developers to move from 2.7 to 2.8.

Problems with this idea:

- it would be a huge amount of work.  There are thousands of
  commits to Python 3.x since it was branched.  Most of them are not
  related to the above features but back porting them would still be
  a huge effort.  I tried backporting 'print' as a function just to get
  an idea of the work.

- if people install this new version of Python as the default, old
  scripts and programs will break.  I believe this breakage was the
  motivation for making Python 3 an all-at-once jump.  I'm not sure
  how to handle this, maybe this version could be used only by
  developers during their Python 3 porting efforts.  Alternatively,
  only install it as 'python2.8', never 'python' or 'python2'.

An alternative approach to producing Python 2.8 would be to start
with the Python 3.x latest branch.  Modify bytesobject and
unicodeobject to have as close to Python 2 behavior as practical.

A-journey-of-a-thousand-miles-begins-ly y'rs


From ethan at  Sat Jan 18 04:59:23 2014
From: ethan at (Ethan Furman)
Date: Fri, 17 Jan 2014 19:59:23 -0800
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <>
Message-ID: <>

This thread just happened not three weeks ago.  Python 2.8 ain't gonna happen.


From rosuav at  Sat Jan 18 05:49:39 2014
From: rosuav at (Chris Angelico)
Date: Sat, 18 Jan 2014 15:49:39 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 18, 2014 at 2:22 PM, Neil Schemenauer
<nas-python at> wrote:
> - it would be a huge amount of work.  There are thousands of
>   commits to Python 3.x since it was branched.  Most of them are not
>   related to the above features but back porting them would still be
>   a huge effort.  I tried backporting 'print' as a function just to get
>   an idea of the work.

Guido's time machine strikes again. Put this at the top of your script
and run it under 2.7 or 2.6:

from __future__ import print_function


From tjreedy at  Sat Jan 18 06:24:33 2014
From: tjreedy at (Terry Reedy)
Date: Sat, 18 Jan 2014 00:24:33 -0500
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <>
Message-ID: <lbd36b$87t$>

On 1/17/2014 10:22 PM, Neil Schemenauer wrote:
> The transition to Python 3 is happening but there is still a massive
> amount of code that needs to be ported.

For application code, why does it need to be ported?

>  One of the most disruptive
> changes in Python 3 is the strict separation of bytes from unicode
> strings.  Most of the other incompatible changes can be handled by
> 2to3.

For many application areas, the text problem seems to have been somewhat 
solved, to the point where people are writing 2&3 code successfully.

> Here is a far out idea to make transition smoother.   Release version
> 2.8 of Python with nearly all Python 3.x incompatible changes except
> for the bytes/unicode changes.

Various people have suggested versions of this idea. At one time, I 
could imagine it, even after PEP 404. But a 2.8 project should have 
started soon after 2.7 was released, with 2.8 released soon after 3.3 or 
certainly now with 3.4. I think it is too late now.

>  This could include:

I believe you left out the int division change.

> Problems with this idea:

People who cannot move to 3.x because of libraries could not move to 2.8 
for the same reason. Over half of the most commonly downloaded libraries 
already have 3.x versions.

Major linux distributions are already in the process of switching to 3.x 
as default Python.

> - it would be a huge amount of work.

Yes, and the current volunteer pydev group will not do it. So this is 
literally the wrong forum. Martijn Faassen posted the following on 
python-list on the 6th:

I've started an informal channel "#python2.8" on freenode. It's to 
discuss the potential for a Python 2.8 version -- to see whether there 
is interest in it, what it could contain, how it could facilitate 
porting to Python 3, who would work on it, etc. If you are interested in 
constructive discussion about a Python 2.8, please join.

I realize that if there is actual code created, and if it's not under 
the umbrella of the PSF, it couldn't be called "Python 2.8" due to 
trademark reasons. But that's premature - let's have some discussions 
first to see whether anything can happen.

>  There are thousands of
>    commits to Python 3.x since it was branched.  Most of them are not
>    related to the above features but back porting them would still be
>    a huge effort.  I tried backporting 'print' as a function just to get
>    an idea of the work.

You are unusual. Many 2.8 advocates want it handed to them for free.

Terry Jan Reedy

From stephen at  Sat Jan 18 06:26:19 2014
From: stephen at (Stephen J. Turnbull)
Date: Sat, 18 Jan 2014 14:26:19 +0900
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

Ned Batchelder writes:
 > On 1/17/14 8:45 AM, אלעזר wrote:

 > > "Informally, an algorithm is stable if it respects the original order 
 > > of equivalent objects. So if we think of minimum and maximum as 
 > > selecting, respectively, the smallest and the second smallest from a 
 > > list of two arguments, stability requires that when called with 
 > > equivalent elements, minimum should return the first and maximum 
 > > should return the second."

 > I don't understand this logic at all. Stability matters in sorting 
 > because sort() takes a sequence and returns a sequence, and for various 
 > reasons you might need to sort a list twice, with different criteria. 
 > Stability guarantees that the second sort won't discard the work of the 
 > first sort.

Two comments.  First, I don't understand at all why earlier members of
a sequence may be presumed to be smaller.  It could easily go the
other way around.

Second, since these operations are *selections* from a collection
(which might impose order or not, which might impose uniqueness or
not), it's the same problem that Steven D'Aprano faced in defining
mode for the statistics PEP: Do you admit failure (here,
noncomparability of some of the maximal items), so that a value that
is none of the items must be returned?  In the case of multiple
equivalent values, do you return a representative or the collection?

From tim.peters at  Sat Jan 18 06:37:28 2014
From: tim.peters at (Tim Peters)
Date: Fri, 17 Jan 2014 23:37:28 -0600
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

[אלעזר <elazarg at>]
> Given several objects with the same key, max() returns the first one:
>     >>> key = lambda x: 0
>     >>> max(1, 2, key=key)
>     1
> This means it is not stable, at least according to the definition in
> "Elements of Programming" by Alexander Stepanov and Paul McJones (pg. 52):
> "Informally, an algorithm is stable if it respects the original order of
> equivalent objects. So if we think of minimum and maximum as selecting,
> respectively, the smallest and the second smallest from a list of two
> arguments, stability requires that when called with equivalent elements,
> minimum should return the first and maximum should return the second."

A sound argument, provided one accepts the "if".  But nobody in the
known history of the world *does* think of min and max that way
outside this silly quote ;-)

> A page later, In a side note, the authors mention that "STL incorrectly
> requires that `max(a, b)` return `a` when `a` and `b` are equivalent." (As a
> reminder, Stepanov is the chief designer of the STL).
> So, I know this is not a big issue, to say the least, but is there any
> reason *not* to return the last argument? Are we trying to be compatible
> with STL somehow?

No.  We're just doing what everyone *really* expects min and max to do
- including whoever implemented the STL's max().

> ...
> Hope I'm not being a crank here...

One removed from being a crank is not itself being a crank :-)

From abarnert at  Sat Jan 18 06:42:13 2014
From: abarnert at (Andrew Barnert)
Date: Fri, 17 Jan 2014 21:42:13 -0800 (PST)
Subject: [Python-ideas] Make max() stable
In-Reply-To: <lbcj1h$liu$>
References: <>
 <lbcelr$a0f$> <>
Message-ID: <>

From: Terry Reedy <tjreedy at>

Sent: Friday, January 17, 2014 4:48 PM

> On 1/17/2014 7:09 PM, Andrew Barnert wrote:

>>  On Jan 17, 2014, at 15:34, Terry Reedy <tjreedy at> wrote:
>>>  Min and max are inherently functions of multisets, with order
>>>  irrelevant but duplicate values allowed.
> I should have said a multiset of comparable objects.
>>  I'm not sure that's necessarily true. The maximal value of a sequence
>>  makes every bit as much sense as the maximal value of a set or
>>  multiset
> A list of comparable objects *is* a multiset of comparable objects.

No, a list is a multiset _with order_. Which is the whole point. You claimed that because it's a multiset, the order doesn't matter. But because the domain of max is a sequence (or, better, as you correctly point out, an iterable), not a multiset, the order does matter. Otherwise this entire question wouldn't arise in the first place.

>> ... asking for max(experiments, key=itemgetter(1)) is a meaningful thing to do.
> max((value,time) for time,value in experiments)

> gives the latest high value. In general
> max((val,i) for i,val in enumerate(iterable))
> does the same.

Sure, given my list of experiments in time order, these give the same result as my expression (except with the members of the tuple reversed, which we can ignore). And?

No matter how you write this, you're not just picking the highest value, you're picking the highest value _with the earliest time_. Which is a meaningful thing to do.

> If max gave the last maximum, it would be trickier to get the first maximum,
> just as it is now to get the last minimum.

Of course. If you read my whole message, that was exactly my point: both are useful. Well, that, and the fact that the current behavior is (a) useful more often than the opposite, and (b): compatible with reams of existing code. And therefore, it would be a bad idea to gratuitously change max to return the last instead of the first. Which I think you agree with completely, so I'm not sure why you're trying to disprove it.
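
A small sketch of the two spellings from this exchange, using made-up (time, value) data. When no values tie, both pick the same record; the difference only shows up on a tie, where max() with a key keeps the first maximal item it meets, while comparing (value, time) tuples lets the later time win.

```python
from operator import itemgetter

# Hypothetical (time, value) records, already in time order; two runs tie at 7.5.
experiments = [(1, 5.0), (2, 7.5), (3, 7.5), (4, 6.0)]

# max() with a key returns the first maximal item it meets: the earliest 7.5.
earliest_best = max(experiments, key=itemgetter(1))

# Comparing (value, time) tuples breaks the tie on time: the latest 7.5.
latest_best = max((value, time) for time, value in experiments)

print(earliest_best)  # (2, 7.5)
print(latest_best)    # (7.5, 3)
```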

From stephen at  Sat Jan 18 08:24:43 2014
From: stephen at (Stephen J. Turnbull)
Date: Sat, 18 Jan 2014 16:24:43 +0900
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

Andrew Barnert writes:

 > No, a list is a multiset _with order_. Which is the whole
 > point. You claimed that because it's a multiset, the order doesn't
 > matter. But because the domain of max is a sequence (or, better, as
 > you correctly point out, an iterable), not a multiset, the order
 > does matter. Otherwise this entire question wouldn't arise in the
 > first place.

Iterables need not have order in a sense that allows definition of
"stability".  That's why things like OrderedDict are necessary.

From jeanpierreda at  Sat Jan 18 08:40:49 2014
From: jeanpierreda at (Devin Jeanpierre)
Date: Fri, 17 Jan 2014 23:40:49 -0800
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 17, 2014 at 5:35 PM, Ned Batchelder <ned at> wrote:
> Is there an example of an actual problem that stability of min and max would
> make easier to solve?

In a language like C++, if min and max had the property specified
by the OP, you might do:

x = min(a, b);
y = max(a, b);

And then x is the smallest, and y is the other one, and it's simple and
easy and less code than an if statement. I suspect this is where the
desire comes from.

In Python, of course, you do x, y = sorted([a, b])

-- Devin

From abarnert at  Sat Jan 18 08:56:21 2014
From: abarnert at (Andrew Barnert)
Date: Fri, 17 Jan 2014 23:56:21 -0800 (PST)
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <>
Message-ID: <>

From: Neil Schemenauer <nas-python at>

Sent: Friday, January 17, 2014 7:22 PM

> Here is a far out idea to make transition smoother. Release version
> 2.8 of Python with nearly all Python 3.x incompatible changes except
> for the bytes/unicode changes.

What exactly do you mean by "the bytes/unicode changes"? There's a wide range of differences between 2.7 and 3.4 that could fall into this category. At least two of them you've specifically included in your proposed 2.8, including one of the three huge ones. Here are the ones I can think of off the top of my head, in rough order of most to least code-breaking:

 * No automatic conversions from bytes to unicode.
 * No automatic conversions from unicode to bytes.
 * Rename unicode to str (included in your suggestion).
 * File objects can be either unicode-based (text) or bytes-based (binary), defaulting to unicode.
 * The stdin/out/err files, StringIO, and various other common file objects are text.
 * __str__ (and __repr__) must return unicode, not bytes, and it's what print, "%s", default "{}", etc. call.
 * __bytes__ (the 3.x equivalent of 2.x's __str__) exists, but is not called by anything but bytes(), and is not supplied by most builtin/stdlib types (which is why, e.g., bytes(2) returns b'\0\0', not b'2').
 * Dozens of builtins and stdlib functions that used to work on bytes (or, in some cases, on either bytes or unicode) now work on unicode (e.g., csv.reader, json.loads).
 * Default string literal as unicode (included in your suggestion; already available with a future statement).
 * No bytes.encode or unicode.decode. (In 2.x, when used with codecs like 'ascii' or 'utf-8' these were almost always errors... but errors that a lot of badly-written code relies on to "work", as long as you never give it a non-ASCII character.)
 * No bytes.__mod__ or bytes.format (at least in 3.4; this may change later).
 * Bytes is an iterable of small ints rather than of single-char bytes.
 * File objects are the wrappers from the io module, not thin wrappers around C stdio.
 * All text files have universal newlines enabled, unless otherwise specified by the (not in 2.x) newline param.
 * Functions like chr and ord are based on Unicode code points, not bytes. (There are no bytes equivalent because there's no need if bytes is an iterable of ints.)
 * Different internal representation for unicode objects.
 * Different C API for unicode objects.
 * No basestring.
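
A few of the differences in this list can be checked directly. The snippet below is only an illustrative sketch, meant to be run under Python 3; the variable name mixing_error is made up for the example.

```python
# Run under Python 3; each check mirrors one item from the list.

assert bytes(2) == b'\x00\x00'            # bytes(int) is zero-filled, not b'2'
assert list(b'ab') == [97, 98]            # iterating bytes yields small ints
assert b'ab'[0] == 97 and b'ab'[:1] == b'a'
assert chr(97) == 'a' and ord('a') == 97  # chr/ord work on Unicode code points
assert not hasattr(bytes, 'encode')       # no bytes.encode
assert not hasattr(str, 'decode')         # no str.decode (str is the old unicode)

try:
    'abc' + b'def'                        # no automatic bytes/unicode conversion
except TypeError as exc:
    mixing_error = exc
else:
    mixing_error = None

print(type(mixing_error).__name__)        # TypeError
```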

So... which of these do you want, and which do you not?

I suspect that, whatever your exact answers, it would be a lot easier to fork 3.4 and port the 2.7 behavior you want than to fork 2.7 and backport almost all of 3.4.

And if you do it that way, you could even adapt the idea someone proposed a few weeks ago (not popular on this list, but maybe popular with your target audience) of turning each change on and off with a "from __past__ import misfeature" statement, so people could pick and choose the ones they need, and gradually remove past statements as they port from your forked 2.8 to real 3.4.

However, I also suspect that, whatever your exact answers, it won't be that useful. Look at people's reasons for not moving to 3.x:

 * If your app already works in 2.7, and has no need for any new 3.x-only packages, it makes perfect sense to stay with 2.7. Which means there's no reason to move to 2.8.
 * If your app works in 2.7, but you're worried that it will eventually become hard to find supported 2.7 installations to run on, would you really expect finding 2.8 installations to be easier?
 * If you're staying with 2.7 because your OS, hosting company, dev team, school, whatever provides it, there's no reason to go to 2.8.
 * If you depend on a package that hasn't been ported to 3.x... well, that's four separate issues.
    * If you depend on an in-house/small-market package that hasn't been ported, it's really the same case as "I have an app that works just fine in 2.7."
    * If you depend on a package that hasn't been ported because it's effectively moribund, it's not going to be ported to 2.8.
    * If you depend on a package that actually has been ported to 3.x, but you're too stupid to find information anywhere but blog posts or StackOverflow questions dated 2009 (which is depressingly common...), those posts are not going to tell you about 2.8.
    * If you depend on a package that's legitimately hard to port to 3.x, it obviously won't be ported to 2.8 yet either, and since it'll probably be a lower priority for the developers, even if 2.8 is an easier port than 3.4 there's no guarantee it'll come sooner. (Also, consider that typically, people depend on 6 packages that have been ported and 1 that hasn't; if they switch to 2.8, that'll be 7 packages they need to wait on rather than 1.)
 * If you have code that sort of works in 2.7 (if you're careful to feed it only ASCII), just renaming str will almost certainly break your code. If you fix it, it will be as easy to port to 3.4 as to 2.8. If you don't fix it... well, at best this is the same as the first case; if not, it's the same as the next one.
 * If you have code that's legitimately difficult to port to 3.x because, e.g., it relies on parsing and creating network messages or file formats that mix ASCII text and binary or encoded-text payloads, just renaming str will break your code. And it may be non-trivial to fix.

I'm having a hard time imagining code that would be easy to port to 2.8, but not to 3.x. For example:

    payload = <some object with a __str__ method to serialize it>
    sock.sendall('Header: {}\r\nAnother: {}\r\n\r\n{}'.format(
        headers['header'], headers['another'], payload))

Even with just the two changes you already suggested: First, you have to change the literal to a bytes literal. More seriously, you have to rename that payload type's __str__ method to __bytes__. And if it does any string stuff internally, like encoding JSON, that has to change. Meanwhile, your logging code probably relies on the same __str__ method actually returning a str, so you have to add one of those. Assuming headers is a dict of strs, you either need to go back up the chain (or into the API that provides it) and change that so it's been a dict of bytes all along, or you need to explicitly encode the headers here. That doesn't sound too hard overall... but that gives you working Python 3.5 code (assuming PEP 460 goes through). And there doesn't seem to be any shortcut that would give you working 2.8 code without also working in 3.5.
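
To make those steps concrete, here is one possible shape of the ported code. It is a sketch only: the Payload class, its JSON serialization, and the build_message helper are all hypothetical stand-ins, and it sidesteps bytes formatting entirely by formatting the headers as text and encoding them explicitly, so it does not depend on any particular 3.x release.

```python
import json

class Payload:
    """Hypothetical payload type; stands in for the object in the example."""
    def __init__(self, data):
        self.data = data

    def __str__(self):
        # Text form, still what logging, print and "{}".format() use.
        return json.dumps(self.data, sort_keys=True)

    def __bytes__(self):
        # Wire form, used only by an explicit bytes() call.
        return str(self).encode('utf-8')

def build_message(headers, payload):
    """Assemble the bytes that would be handed to sock.sendall()."""
    head = 'Header: {}\r\nAnother: {}\r\n\r\n'.format(
        headers['header'], headers['another'])
    # Encode the text headers explicitly, then append the payload bytes.
    return head.encode('ascii') + bytes(payload)

message = build_message({'header': 'a', 'another': 'b'}, Payload({'k': 1}))
print(message)   # b'Header: a\r\nAnother: b\r\n\r\n{"k": 1}'
```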

Also, one quick comment:

> - removal of 'apply', 'buffer', 'callable', 

'callable' exists in Python 3.2+.

Not a big deal, unless this implies that you're basing everything on the state of the ecosystem back in Python 3.1. I don't think that it does, but just in case: Three years ago, people didn't have much experience with porting yet (e.g., writing 2.x code and running it through 2to3 at install time was considered the best way to port things gradually...) and most of PyPI didn't exist for 3.x yet. Back then, this suggestion would have been a lot more compelling than it is today, because all anyone could say was, "Wait and see, we're hoping it'll be better" instead of "Look and see, it already is better."

From steve at  Sat Jan 18 09:12:48 2014
From: steve at (Steven D'Aprano)
Date: Sat, 18 Jan 2014 19:12:48 +1100
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <20140118081248.GN3915@ando>

On Fri, Jan 17, 2014 at 11:40:49PM -0800, Devin Jeanpierre wrote:
> On Fri, Jan 17, 2014 at 5:35 PM, Ned Batchelder <ned at> wrote:
> > Is there an example of an actual problem that stability of min and max would
> > make easier to solve?
> In a language like C++, if min and max had the property specified
> by the OP, you might do:
> x = min(a, b);
> y = max(a, b);
> And then x is the smallest, and y is the other one, and it's simple and
> easy and less code than an if statement.

But that's how max and min work right now, modulo that object identity 
is not important. If you care about object identity, you're probably 
doing something underhanded *wink*

Given the case that a and b are *equal* (as measured by the key 
function, if given) then it shouldn't matter whether you get

smallest = a
biggest = b

or

smallest = b
biggest = a

or

smallest = biggest = a

or

smallest = biggest = b
These variations only are meaningful if a and b are different types 
with the same value, or the same type but different identities. Even if 
these variations are important, I don't think there is any inherent 
benefit to one over the other.

Personally, I'd either keep the current behaviour, or purely for the 
symmetry, pick the so-called "stable" behaviour. But I don't see any 
rational reason for preferring one over the other. Now that Python 3 
documents the specific behaviour, it's not worth changing.

> I suspect this is where the desire comes from.
> In Python, of course, you do x, y = sorted([a, b])

Now the interesting thing here is that sorted *is* stable, so if a and b 
are equal, sorted([a, b]) is guaranteed to return [a, b]. Which gives 
the behaviour requested.
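
A quick check of that guarantee, as a sketch: the operands 1 and 1.0 are chosen here only because they compare equal yet remain distinguishable objects.

```python
a, b = 1, 1.0          # equal values, distinguishable objects

x, y = sorted([a, b])  # stable: equal items keep their input order
print(x is a, y is b)  # True True

x, y = sorted([b, a])  # reversed input, reversed output
print(x is b, y is a)  # True True
```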


From greg.ewing at  Sat Jan 18 09:25:23 2014
From: greg.ewing at (Greg Ewing)
Date: Sat, 18 Jan 2014 21:25:23 +1300
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

Devin Jeanpierre wrote:
> In a language like C++, if min and max had the property specified
> by the OP, you might do:
> x = min(a, b);
> y = max(a, b);
> And then x is the smallest, and y is the other one

With Python's current definition of max(), you can
get that effect using

x, y = min(a, b), max(b, a)

So max() *does* respect the order of its operands;
it's just that the order it respects may not be obvious
unless you're Dutch.
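
A minimal sketch of that trick, using 1 and 1.0 as stand-ins for two equal but distinguishable operands (the choice of values is just for illustration):

```python
a, b = 1, 1.0  # hypothetical equal-but-distinguishable operands

# min() and max() both return the first of several equal candidates,
# so reversing max()'s arguments makes it hand back the other operand.
x, y = min(a, b), max(b, a)
print(x is a, y is b)            # True True

# With the same argument order, both calls return the same object.
print(min(a, b) is max(a, b))    # True
```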


From abarnert at  Sat Jan 18 09:26:25 2014
From: abarnert at (Andrew Barnert)
Date: Sat, 18 Jan 2014 00:26:25 -0800 (PST)
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

From: Devin Jeanpierre <jeanpierreda at>

Sent: Friday, January 17, 2014 11:40 PM

> In a language like C++, if min and max had the property specified
> by the OP, you might do:
> x = min(a, b);
> y = max(a, b);
> And then x is the smallest, and y is the other one, and it's simple and
> easy and less code than an if statement. I suspect this is where the
> desire comes from.
> In Python, of course, you do x, y = sorted([a, b])

You can write the exact same thing in C++:

    template <typename T> whatever(T a, T b) {
        T x, y;
        tie(x, y) = tuplize(sorted(make_vector(a, b)));

And then x is the smallest, and y is the other one. Plus, all that static type safety makes it even better than the Python version, right? And extending all those helpers to take a variable number of arguments is a simple matter of template metaprogramming (either with a bit of preprocessor help via boost, or with template parameter packs).

Of course you need to write those three helper function templates. Making them work for two parameters is trivial; making them work for an arbitrary number of parameters is a simple matter of template metaprogramming, which any sixth-year C++ student can write in a matter of weeks; it's just partially specializing on parameter packs, which can be done with simple compile-time recursion.

And the advantage is that all that static typing makes sure that you get errors at compile time instead of run time if you make any mistakes, or often even if you don't.

From greg.ewing at  Sat Jan 18 09:37:52 2014
From: greg.ewing at (Greg Ewing)
Date: Sat, 18 Jan 2014 21:37:52 +1300
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

Andrew Barnert wrote:
> And the advantage is that all that static typing makes sure that you get
> errors at compile time instead of run time if you make any mistakes, or often
> even if you don't.

Also, if you do it right, the whole computation is
performed at compile time, so you don't even have to
run the program. This makes it very easy to deploy
in a cross-platform manner on minimally-specced


From jeanpierreda at  Sat Jan 18 11:40:50 2014
From: jeanpierreda at (Devin Jeanpierre)
Date: Sat, 18 Jan 2014 02:40:50 -0800
Subject: [Python-ideas] Make max() stable
In-Reply-To: <20140118081248.GN3915@ando>
References: <>
Message-ID: <>

On Sat, Jan 18, 2014 at 12:12 AM, Steven D'Aprano <steve at> wrote:
> These variations only are meaningful if a and b are different types
> with the same value, or the same type but different identities. Even if
> these variations are important, I don't think there is any inherent
> benefit to one over the other.

These variations are also important if a and b are just plain
different values, same type or no. This can happen if max/min are
passed a key function -- equality of a sort key doesn't mean the
values are interchangeable for all purposes.

If x and y are strings with the same length, min(x, y, key=len) +
max(x, y, key=len) is something different in each of those helpfully
enumerated cases, and that's with a well behaved type and a
superficially OK looking expression.
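
For instance, with two made-up strings of equal length, the expression gives different results depending on which variation applies:

```python
x, y = "ab", "cd"   # hypothetical strings of equal length

# Both calls see a tie on len() and return the first argument, x.
print(min(x, y, key=len) + max(x, y, key=len))   # abab

# Swapping the operands of max() yields the other string instead.
print(min(x, y, key=len) + max(y, x, key=len))   # abcd
```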

That said, considering (as Greg points out) that all four variations
can be transformed into one another by rearranging the arguments to
min and max, I think it's pretty clear that there's nothing strongly
favoring any one of them, so on that (and the rest) I agree.

-- Devin

From musicdenotation at  Sat Jan 18 15:52:17 2014
From: musicdenotation at (musicdenotation at
Date: Sat, 18 Jan 2014 21:52:17 +0700
Subject: [Python-ideas] Tail recursion elimination
Message-ID: <>

On April 22, 2009, Guido van Rossum wrote:
> First, as one commenter remarked, TRE is incompatible with nice stack traces: when a tail recursion is eliminated, there's no stack frame left to use to print a traceback when something goes wrong later. This will confuse users who inadvertently wrote something recursive (the recursion isn't obvious in the stack trace printed), and makes debugging hard. Providing an option to disable TRE seems wrong to me: Python's default is and should always be to be maximally helpful for debugging.
What are "nice" stack traces? If you mean stack traces that record every function call, they are not nice or helpful at all, given their length. Do loops have nice stack traces in that sense? No. When something goes wrong in a loop, you don't get to see every iteration in the stack trace. You debug loops by examining the values of variables in the iteration where it goes wrong. Likewise, you debug a tail recursive function by examining the arguments that went into the function call that blows up. And if you don't want TRE on by default, you can leave it off and offer an option to enable it.
> Second, the idea that TRE is merely an optimization, which each Python implementation can choose to implement or not, is wrong. Once tail recursion elimination exists, developers will start writing code that depends on it, and their code won't run on implementations that don't provide it: a typical Python implementation allows 1000 recursions, which is plenty for non-recursively written code and for code that recurses to traverse, for example, a typical parse tree, but not enough for a recursively written loop over a large list.
Yes, it is more of a language feature than an implementation feature. But once CPython implements it, I think other implementations will follow suit, or Python developers will simply not write code that relies on TRE, just as JavaScript developers don't use Mozilla-specific extensions like the let keyword.
> Third, I don't believe in recursion as the basis of all programming. This is a fundamental belief of certain computer scientists, especially those who love Scheme and like to teach programming by starting with a "cons" cell and recursion. But to me, seeing recursion as the basis of everything else is just a nice theoretical approach to fundamental mathematics (turtles all the way down), not a day-to-day tool.

This isn't a valid argument. That something isn't fundamental is almost never an argument to leave it out. (Except for Scheme.)
> Of course, none of this does anything to address my first three arguments. Is it really such a big deal to rewrite your function to use a loop? (After all TRE only addresses recursion that can easily be replaced by a loop. :-)

It isn't a big deal, yes. But why restrict programmers? Python is a multi-paradigm language anyway: you can write imperative or functional code with(out) object orientation. "There should be one -- and only one -- obvious way to do it"? You are misinterpreting it. It is about having only one obvious way to do things, but you have removed another obvious way to do certain things because certain problems are better expressed using recursion rather than looping (traversing a tree, for example). I think the most obvious way to do something is how it is defined. If there were really only one way to do anything, there should be no for-loops, list comprehensions, or even object orientation at all in Python. Remember that recursion is either something lower- or higher-level than looping. And if you don't like abstractions over primitive concepts, you should be coding in machine code right now.
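
To illustrate what is at stake, here is a small sketch (function names invented for the example) of a tail-recursive function next to its loop equivalent. Without tail call elimination, the recursive form runs into the interpreter's recursion limit, while the loop does not.

```python
import sys

def count_down(n):
    """Tail-recursive: the recursive call is the whole return expression."""
    if n == 0:
        return "done"
    return count_down(n - 1)

def count_down_loop(n):
    """The same computation written as a loop."""
    while n != 0:
        n -= 1
    return "done"

depth = sys.getrecursionlimit() * 10   # comfortably past the limit
try:
    outcome = count_down(depth)
except RecursionError:                 # RecursionError exists in Python 3.5+
    outcome = "RecursionError"

print(outcome)                 # RecursionError
print(count_down_loop(depth))  # done
```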


From jsbueno at  Sat Jan 18 16:08:38 2014
From: jsbueno at (Joao S. O. Bueno)
Date: Sat, 18 Jan 2014 13:08:38 -0200
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <>

You can use tail recursion elimination in Python as it is today.

So, if you are needing that, just package this reference implementation
in a module:-


On 18 January 2014 12:52,  <musicdenotation at> wrote:
> On April 22, 2009, Guido van Rossum wrote:
> First, as one commenter remarked, TRE is incompatible with nice stack
> traces: when a tail recursion is eliminated, there's no stack frame left to
> use to print a traceback when something goes wrong later. This will confuse
> users who inadvertently wrote something recursive (the recursion isn't
> obvious in the stack trace printed), and makes debugging hard. Providing an
> option to disable TRE seems wrong to me: Python's default is and should
> always be to be maximally helpful for debugging.
> What are "nice" stack traces? If you mean stack traces that record every
> function call then it is not nice and helpful at all given their length. Do
> loops have nice stack traces as you mean it? No. When something goes wrong
> in a loop, you don't get to see every iteration in the stack trace. You
> debug loops by examining the values of variables in the iteration it goes
> wrong. Likewise you debug a tail recursive function by examining the
> arguments that went into the function call that blows up. And if you don't
> want to turn on TRE by default, you can turn it off and offer an option to
> enable.
> Second, the idea that TRE is merely an optimization, which each Python
> implementation can choose to implement or not, is wrong. Once tail recursion
> elimination exists, developers will start writing code that depends on it,
> and their code won't run on implementations that don't provide it: a typical
> Python implementation allows 1000 recursions, which is plenty for
> non-recursively written code and for code that recurses to traverse, for
> example, a typical parse tree, but not enough for a recursively written loop
> over a large list.
> Yes, it is more of a language feature than an implementation feature. But
> once CPython implements it, I think other implementations will follow suit
> or Python developers will not write code that uses TRE just like JavaScript
> developers don't use Mozilla-specific extensions like let keyword.
> Third, I don't believe in recursion as the basis of all programming. This is
> a fundamental belief of certain computer scientists, especially those who
> love Scheme and like to teach programming by starting with a "cons" cell and
> recursion. But to me, seeing recursion as the basis of everything else is
> just a nice theoretical approach to fundamental mathematics (turtles all the
> way down), not a day-to-day tool.
> This isn't a valid argument. That something isn't fundamental is almost
> never an argument to leave it out. (Except for Scheme.)
> Of course, none of this does anything to address my first three arguments.
> Is it really such a big deal to rewrite your function to use a loop? (After
> all TRE only addresses recursion that can easily be replaced by a loop. :-)
> It isn't a big deal, yes. But why restrict programmers? Python is a
> multi-paradigm language anyway: you can write imperative or functional code
> with(out) object orientation. "There should be one -- and only one -- obvious
> way to do it"? You are misinterpreting it. It is about having only one
> obvious way to do things, but you have removed another obvious way to do
> certain things because certain problems are better expressed using recursion
> rather than looping (traversing a tree, for example). I think the most
> obvious way to do something is how it is defined. If there was really one
> way to do anything, there should be no for-loops, list comprehensions, or
> even object orientation at all in Python. Remember that recursion is either
> something lower- or higher-level than looping. And if you don't like
> abstractions over primitive concepts, you should be coding in machine code
> right now.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From musicdenotation at  Sat Jan 18 16:58:23 2014
From: musicdenotation at (musicdenotation at
Date: Sat, 18 Jan 2014 22:58:23 +0700
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <>

> On Jan 18, 2014, at 22:08, "Joao S. O. Bueno" <jsbueno at> wrote:

> You can use tail recursion elimination in Python as it is today.
I have seen many "implementations" of tail-call optimization, and their common problem is that they all require special syntax to work. I need a solution that is directly usable with Python's ordinary return statement.

From breamoreboy at  Sat Jan 18 17:28:20 2014
From: breamoreboy at (Mark Lawrence)
Date: Sat, 18 Jan 2014 16:28:20 +0000
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <lbea2s$uie$>

On 18/01/2014 15:58, 
musicdenotation at wrote:
>> On Jan 18, 2014, at 22:08, "Joao S. O. Bueno"
>> <jsbueno at
>> <mailto:jsbueno at>> wrote:
>> You can use tail recursion elimination in Python as it is today.
> I have seen many "implementations" of tail-call optimization, and their
> common problem is that they all require special syntax to work. I need a
> solution that is directly usable with Python's ordinary /return/
> statement.

Then implement one and publish it so everybody else can use it.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From stefan_ml at  Sat Jan 18 17:40:37 2014
From: stefan_ml at (Stefan Behnel)
Date: Sat, 18 Jan 2014 17:40:37 +0100
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <lbeapq$69h$>

musicdenotation at, 18.01.2014 16:58:
> I have seen many "implementations" of tail-call optimization, and their
> common problem is that they all require special syntax to work. I need a
> solution that is directly usable with Python's ordinary return
> statement.

What do you need it for?

(and note that your answer might be more suited for python-list than
python-ideas, in which case you may reply over there)


From jsbueno at  Sat Jan 18 18:12:29 2014
From: jsbueno at (Joao S. O. Bueno)
Date: Sat, 18 Jan 2014 15:12:29 -0200
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <>

This one is. What are you talking about?

On 18 January 2014 13:58,  <musicdenotation at> wrote:
> On Jan 18, 2014, at 22:08, "Joao S. O. Bueno" <jsbueno at> wrote:
> You can use tail recursion elimination in Python as it is today.
> I have seen many "implementations" of tail-call optimization, and their
> common problem is that they all require special syntax to work. I need a
> solution that is directly usable with Python's ordinary return statement.

What are you talking about? This one is usable with ordinary returns.
It just requires a decorator.
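
The implementation referred to here is not reproduced in this archive. Purely as an illustration of the general idea, the following is a sketch of one well-known decorator-based approach: the wrapper inspects the call stack, and when it finds itself being re-entered from its own wrapped function, it raises an exception carrying the new arguments so the outermost wrapper can loop instead of recursing. The names tail_call_optimized and _TailCall are made up for this sketch. It relies on the CPython-specific sys._getframe, gives wrong results for calls that are not in tail position, and can misfire when several decorated functions call each other, since all wrappers share one code object.

```python
import functools
import sys

class _TailCall(Exception):
    """Carries the arguments of a tail call back to the outermost wrapper."""
    def __init__(self, args, kwargs):
        self.call_args = args
        self.call_kwargs = kwargs

def tail_call_optimized(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        frame = sys._getframe()
        caller = frame.f_back
        grandcaller = caller.f_back if caller is not None else None
        # Re-entered from inside func, which was itself run by a wrapper:
        # unwind to that wrapper rather than growing the stack.
        if grandcaller is not None and grandcaller.f_code is frame.f_code:
            raise _TailCall(args, kwargs)
        while True:
            try:
                return func(*args, **kwargs)
            except _TailCall as call:
                args, kwargs = call.call_args, call.call_kwargs
    return wrapper

@tail_call_optimized
def total(n, acc=0):
    """Sum 1..n with an ordinary return statement in tail position."""
    if n == 0:
        return acc
    return total(n - 1, acc + n)

print(total(10000))  # 50005000
```

The decorated function keeps its plain return statement; the cost is an exception raised and caught on every iteration.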

From ned at  Sat Jan 18 19:02:28 2014
From: ned at (Ned Batchelder)
Date: Sat, 18 Jan 2014 13:02:28 -0500
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <>

On 1/18/14 10:58 AM, musicdenotation at wrote:
>> On Jan 18, 2014, at 22:08, "Joao S. O. Bueno" <jsbueno at 
>> <mailto:jsbueno at>> wrote:
>> You can use tail recursion elimination in Python as it is today.
> I have seen many "implementations" of tail-call optimization, and 
> their common problem is that they all require special syntax to work. 
> I need a solution that is directly usable with Python's ordinary 
> /return/ statement.
You haven't explained why you need it.


From abarnert at  Sat Jan 18 19:29:46 2014
From: abarnert at (Andrew Barnert)
Date: Sat, 18 Jan 2014 10:29:46 -0800
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <>

Whether or not you really need it, adding it to Python is a fun challenge that's worth trying.

The first problem is that CPython makes a C function call for every Python function call, and C doesn't eliminate tail calls; the only way to do it manually is with longjmp. So, you probably want to add it to either Stackless or PyPy instead.

Second, detecting tail recursion to eliminate it is pretty hard, and you have to do a runtime check to detect that the function being called is already on the stack, which potentially slows down every function call. Fortunately, eliminating _all_ tail calls instead of just recursive ones is a lot easier in Python, and it's better in just about every way.

Third, eliminating tail calls means they aren't on the stack at runtime, which means there's no obvious way to display useful tracebacks. I don't think too many Python users would accept the tradeoff of giving up good tracebacks to enable certain kinds of non-pythonic code, but even if you don't solve this, you can always maintain a fork the same way that Stackless has been doing.

Meanwhile, it will be a lot easier to do this in steps: First add a tail statement that's like return except that its expression must be a Call; instead of compiling to a return bytecode it replaces the call_function* with a tail_function*, which you can initially fake by doing a call and return. Next, write a real implementation of tail_function* that jumps instead of calling and returning. Next, write a simple keyhole optimizer that converts any call_function* followed immediately by return into a tail_function*, which makes your custom syntax unnecessary, so you can revert it. Finally, solve the traceback problem in some way. (Maybe you could do something tricky here: Split the stack frame object into the bit needed for tracebacks and the bit needed for actual calling; tail call elimination takes care of the second one, and a different optimization to detect and run-compress loops takes care of the first one. Making that fast enough so that it doesn't slow down every call unacceptably probably means keeping around a dict mapping some kind of "position" record to frames.)
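
The pattern such an optimizer would look for can at least be spotted from pure Python with the dis module. This is only a detection sketch, not an optimizer: it rewrites nothing, the helper name has_tail_call is invented here, and it assumes CPython's convention that call opcodes have names starting with "CALL" and that a function return compiles to RETURN_VALUE.

```python
import dis

def has_tail_call(func):
    """Return True if some call in func is immediately followed by a return."""
    instructions = list(dis.get_instructions(func))
    for current, following in zip(instructions, instructions[1:]):
        if current.opname.startswith("CALL") and following.opname == "RETURN_VALUE":
            return True
    return False

def helper(x):
    return x + 1

def tail(x):
    return helper(x)        # the call's result is returned untouched

def not_tail(x):
    return helper(x) + 1    # work remains after the call returns

print(has_tail_call(tail))
print(has_tail_call(not_tail))
```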

Sent from a random iPhone

On Jan 18, 2014, at 7:58, musicdenotation at wrote:

>> On Jan 18, 2014, at 22:08, "Joao S. O. Bueno" <jsbueno at> wrote:
>> You can use tail recursion elimination in Python as it is today.
> I have seen many "implementations" of tail-call optimization, and their common problem is that they all require special syntax to work. I need a solution that is directly usable with Python's ordinary return statement.

From tjreedy at  Sun Jan 19 00:09:39 2014
From: tjreedy at (Terry Reedy)
Date: Sat, 18 Jan 2014 18:09:39 -0500
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <lbf1je$n02$>

On 1/18/2014 9:52 AM, 
musicdenotation at wrote:

A few notes first:

1) A tail call is a 'top level' call in a return statement.
     return f(*args, **kwds)
A directly recursive call, where f refers to the function with the 
return statement, is a special case.

2) Whether a tail call is directly recursive cannot be determined, in 
Python, at compile time merely by looking at the function definition. If 
'f' is the definition name of the function, an assumption can be made, 
but doing so is a semantic change in the language.

3) 'Tail recursion elimination' has two quite different meanings.

3a. Compile the (in Python, assumed to be) directly recursive function 
as if it had been written with a while loop. This is typically done in 
languages that do not expose while loops. This eliminates the function 
call overhead as well as stack growth. This does nothing for indirect 
recursion through intermediate function calls.

3b. At runtime, make the call but reuse the current stack frame. This is 
easily done (at least in CPython) for all tail calls. But doing so for 
all tail calls will make stack traces pretty useless, as tail calls are 
rather common.  Determining whether the call is recursive, to limit 
stack reuse, takes extra work. Either choice only eliminates stack growth.
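
A minimal sketch of note 2, assuming nothing beyond ordinary name
rebinding (the names countdown and original are made up for
illustration): the call in the return statement looks directly
recursive, but the name is looked up at call time, so it may refer to a
different function by then.

```python
def countdown(n):
    # Looks like a directly recursive tail call...
    if n == 0:
        return "done"
    return countdown(n - 1)

original = countdown

def countdown(n):
    # ...but the global name 'countdown' now refers to this function.
    return "rebound at %d" % n

# The first function's "recursive" call lands in the second function.
result = original(3)
print(result)
```

A compiler that turned the first definition into a loop would change
what original(3) returns, which is the semantic change mentioned above.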

Comments on some of your responses to a *subset* of TRE issues.

> On April 22, 2009, Guido van Rossum wrote:
>> First, <the stack trace issue>
> Do [while/for] loops have nice stack traces as you mean it?

No, if you want a complete trace, you must add a print statement inside 
the loop. Looping with tail recursion gives you the complete trace for free.

>> Second, <what is fundamental>
What is fundamental, besides alternation, is the ability to express an
unbounded amount of repetition, with variation, in a small finite amount 
of code.  I call this computational induction, in analogy to 
mathematical induction. The alternative implementations include recursion 
and explicit while/for loops. There are also the combinator and fixed 
point approaches.

>> <writing repetition with explicit loops>

> I think the most obvious way to do something is how it is /defined/.

Most recursive definitions are naturally written with body recursion 
rather than tail recursion. A simple example (without input checking) is 
the factorial function.

def fact(n):
   if n: return fact(n-1) * n
   else: return 1

To get TRE, one must re-write in tail-recursive form. (Python default 
arguments actually make this easier than in most languages.)

def fact(N, n=1, result=1):
   if n<N: return fact(N, n+1, result*(n+1))
   else: return result

I have intentionally written the tail form so it calculates intermediate 
results in the same order, making no assumption that the reduction 
operator is either commutative or associative. Once the work of 
converting to tail form is done, conversion to while form is trivial and 
easier than the conversion from body to tail recursion.

def fact(N):
   n, result = 1, 1
   while n < N:
     n, result = n+1, result*(n+1)
   return result

But in fact, Python has another alternative, for loops.

def fact(N):
   result = 1
   for n in range(2, N+1):
     result = result * n
   return result

For loops are the proper way to write most collection processing that 
one could write with tail recursion, as it makes use of Python's 
iterator protocol.

def sum(iterable):
   sum = 0
   for n in iterable:
     sum = sum + n
   return sum

To write that with either recursion or while, one must call iter() and 
next() and catch StopIteration oneself. Try it (I have) and see if you 
really think it more natural. Lisp/Scheme get away with recursion for 
collection processing because there is only one collection type and 
clever pattern-matching syntax to deconstruct an instance to get one 
element at a time. I personally consider the efficiency of for loops, 
for both programmer and computer, to be a major reason to not bother 
with TRE.
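
For comparison, here is a rough sketch of the same summation written
three ways. The function names are invented for this illustration; the
while and recursive versions have to call iter() and next() and catch
StopIteration themselves, as described above.

```python
def total_for(iterable):
    # The for loop drives the iterator protocol implicitly.
    total = 0
    for n in iterable:
        total = total + n
    return total

def total_while(iterable):
    # Same loop with the iterator protocol spelled out by hand.
    it = iter(iterable)
    total = 0
    while True:
        try:
            n = next(it)
        except StopIteration:
            return total
        total = total + n

def total_recursive(iterable, total=0):
    # Tail-recursive form; iter() on an iterator returns the iterator itself.
    it = iter(iterable)
    try:
        n = next(it)
    except StopIteration:
        return total
    return total_recursive(it, total + n)
```

Without TRE, the recursive version is also limited by the interpreter's
recursion limit, so it only works for fairly short inputs.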

Terry Jan Reedy

From steve at  Sun Jan 19 01:45:16 2014
From: steve at (Steven D'Aprano)
Date: Sun, 19 Jan 2014 11:45:16 +1100
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <20140119004515.GP3915@ando>

On Sat, Jan 18, 2014 at 10:29:46AM -0800, Andrew Barnert wrote:

> Whether or not you really need it, adding it to Python is a fun 
> challenge that's worth trying.

"Need" is a funny thing. There's nothing you can do with a for-loop that 
you can't do with a while-loop, but that doesn't mean we don't "need" 
for-loops. Certain algorithms and idioms are simply better written in 
terms of for-loops, and certain algorithms are simply better written in 
terms of recursion than looping.

You can go a long way without recursion, or only shallow recursion. In 
15 years + of writing Python code, I've never been in a position that I 
couldn't solve a problem because of the lack of tail recursion. But 
every time I manually convert a recursive algorithm to an iterative one, 
I feel that I'm doing make-work, manually doing something which the 
compiler is much better at than I am, and the result is often less 
natural, or even awkward. (Trampolines? Ewww.)

> Third, eliminating tail calls means they aren't on the stack at 
> runtime, which means there's no obvious way to display useful 
> tracebacks. I don't think too many Python users would accept the 
> tradeoff of giving up good tracebacks to enable certain kinds of 
> non-pythonic code, 

What makes you say that it is "non-pythonic"? You seem to be assuming 
that *by definition* anything written recursively is non-pythonic. I do 
not subscribe to that view.

In fact, in some cases, I *would* willingly give up *non-useful* 
tracebacks for the ability to write more idiomatic code. Have you seen 
the typical recursive traceback? They're a great argument for "less is 
more". What normally happens is that you get a RuntimeError and the 
traceback blows away your xterm's buffer with hundreds of identical or 
near-identical lines. But even in the case where you didn't hit the 
recursion limit, the traceback is pretty much a picture of redundancy 
and noise:

py> a(7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./", line 2, in a
    return b(n-1)
  File "./", line 5, in b
    return c(n-1) + a(n)
  File "./", line 2, in a
    return b(n-1)
  File "./", line 5, in b
    return c(n-1) + a(n)
  File "./", line 2, in a
    return b(n-1)
  File "./", line 5, in b
    return c(n-1) + a(n)
  File "./", line 2, in a
    return b(n-1)
  File "./", line 5, in b
    return c(n-1) + a(n)
  File "./", line 2, in a
    return b(n-1)
  File "./", line 5, in b
    return c(n-1) + a(n)
  File "./", line 2, in a
    return b(n-1)
  File "./", line 5, in b
    return c(n-1) + a(n)
  File "./", line 9, in c
    return 1/n
ZeroDivisionError: division by zero

The only thing that I care about is the very last line, that function c 
tries to divide by zero. The rest of the traceback is just noise, I 
don't even look at it.

Now, it's okay if you disagree, or if you can see something useful in 
the traceback other than the last entry. I'm not suggesting that TCE 
should be compulsory. I would be happy with a commandline switch to 
turn it on, or better still, a decorator to apply it to certain 
functions and not others. I expect that I'd have TCE turned off for 
debugging. But perhaps not -- it's not like Haskell and Scheme 
programmers are unable to debug their recursive code.
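
As a point of reference, this is roughly what an opt-in decorator can
look like in pure Python today. It is only a sketch using a trampoline,
not a compiler-level TCE: the names TailCall and trampolined are
hypothetical, and the function has to return a TailCall marker instead
of making the call with an ordinary return statement.

```python
import functools

class TailCall(object):
    """Marker describing a call to make after the current frame returns."""
    def __init__(self, func, *args, **kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs

def trampolined(func):
    """Run func, then keep making the tail calls it returns, in a loop."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        while isinstance(result, TailCall):
            # Call the undecorated target so the stack does not grow.
            target = getattr(result.func, "__wrapped__", result.func)
            result = target(*result.args, **result.kwargs)
        return result
    return wrapper

@trampolined
def fact(N, n=1, result=1):
    if n < N:
        return TailCall(fact, N, n + 1, result * (n + 1))
    return result
```

With this, fact(3000) completes without hitting the recursion limit, at
the cost of the intermediate frames never appearing in a traceback.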

The point is that tracebacks are not sacrosanct, and, yes, I would like 
the choice between writing idiomatic recursive code and more extensive 
tracebacks. Trading off speed for convenience is perfectly Pythonic -- 
that's why we have the ability to write C extensions, is it not?

> but even if you don't solve this, you can always 
> maintain a fork the same way that Stackless has been doing.

Having to fork the entire compiler just to write a few functions in 
their most idiomatic, natural (recursive) form seems a bit extreme, 
wouldn't you say? 


From rymg19 at  Sun Jan 19 01:56:38 2014
From: rymg19 at (Ryan)
Date: Sat, 18 Jan 2014 18:56:38 -0600
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <>

I wrote one that uses decorators. How is that special syntax?

musicdenotation at wrote:
>> On Jan 18, 2014, at 22:08, "Joao S. O. Bueno" <jsbueno at> wrote:
>> You can use tail recursion elimination in Python as it is today.
>I have seen many "implementations" of tail-call optimization, and their
>common problem is that they all require special syntax to work. I need
>a solution that is directly usable with Python's ordinary return
>statement.

Sent from my Android phone with K-9 Mail. Please excuse my brevity.

From nas-python at  Sun Jan 19 02:13:32 2014
From: nas-python at (Neil Schemenauer)
Date: Sat, 18 Jan 2014 19:13:32 -0600
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <lbd36b$87t$>
References: <>
Message-ID: <>

On 2014-01-18, Terry Reedy wrote:
> On 1/17/2014 10:22 PM, Neil Schemenauer wrote:
> >The transition to Python 3 is happening but there is still a massive
> >amount of code that needs to be ported.
> For application code, why does it need to be ported.

Unless Python 2.x is going to be maintained in perpetuity then code
will have to be ported.  This point seems obvious to me.

> For many application areas, the text problem seems to have been
> somewhat solved, to the point where people are writing 2&3 code
> successfully.

Sure you can write code that's compatible with 2&3, that's not the
code I'm talking about.  I'm talking about the millions (maybe
billions) of lines of existing Python code.

> I think it too late now.

I disagree.  The amount of Python 2 code that exists exceeds the
amount of Python 3 by orders of magnitude.  That existing codebase
either stops evolving and stays Python 2 forever or we do all that's
practical to help people move it to a current version of Python.

> I believe you left out the int division change.

That should be on the list.

> People who cannot move to 3.x because of libraries could not move to
> 2.8 for the same reason. Over half of the most commonly downloaded
> libraries already have 3.x versions.

That's a necessary condition but the vast amount of existing Python
2 code has not been ported.  Lots of it would be private libraries
or applications.  You only have to look at the download stats for
the Python interpreter to confirm this.

> I realize that if there is actual code created, and if it's not
> under the umbrella of the PSF, it couldn't be called "Python 2.8"
> due to trademark reasons.

I don't give a shit what it's called.  A Python 2 fork is going to
happen whether the PSF blesses it or not, I can't believe that's
even a point of discussion.  People are still maintaining Cobol


From nas-python at  Sun Jan 19 02:14:13 2014
From: nas-python at (Neil Schemenauer)
Date: Sat, 18 Jan 2014 19:14:13 -0600
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <>
Message-ID: <>

On 2014-01-17, Andrew Barnert wrote:
> What exactly do you mean by "the bytes/unicode changes"?

I mean all of those things that you listed.   bytes = the Python 2.7
str object, str object is Python 2.7 unicode object.

> I suspect that, whatever your exact answers, it would be a lot
> easier to fork 3.4 and port the 2.7 behavior you want than to fork
> 2.7 and backport almost all of 3.4.

It's a lot of work no matter which way you do it.  That's one of the
biggest problems with this idea.

> And if you do it that way, you could even adapt the idea someone
> proposed a few weeks ago -- not popular on this list, but maybe
> popular with your target audience -- of turning each change on and
> off with a "from __past__ import misfeature" statement, so people
> could pick and choose the ones they need, and gradually remove
> past statements as they port from your forked 2.8 to real 3.4.

You can't make those changes with __future__/__past__ imports.  They
affect the whole runtime, not a single module.

> However, I also suspect that, whatever your exact answers, it
> won't be that useful. Look at people's reasons for not moving to
> 3.x:

Imagine I'm a developer with the Python 2.x codebase.  I'm either
lazy or I'm too busy with other company projects that I can't put
the effort into porting my 2.x code to 3.x, even if all the 3rd
party libraries have been ported.

How can we make it easier for them to move their code towards Python
3.x rather than keeping it as 2.x?  A maintained interpreter to run
Python 2.x code is going to continue to exist.  Some python-dev
people seem to suggest we can suggest that end of maintenance of
Python 2.7 is going to force people to port their code.  That's
ridiculous.

I want to make it more attractive for these developers to move
towards Python 3 rather than stalling out on Python 2.7 forever.
How best to do that is still to be determined.  I think my 2.8 idea
might be better than the status quo but it's just a crazy idea.

> I'm having a hard time imagining code that would be easy to port to 2.8, but not to 3.x. For example:
>     payload = <some object with a __str__ method to serialize it>
>     sock.sendall('Header: {}\r\nAnother: {}\r\n\r\n{}'.format(
>         headers['header'], headers['another'], payload))
> Even with just the two changes you already suggested: First, you
> have to change the literal to a bytes literal. 

That part is easy, could even be done with an automated tool
(change u' to ' and ' to b').

> More seriously, you have to rename that payload type's __str__
> method to __bytes__.

Nope, no __bytes__ in my proposed 2.8.

> And if it does any string stuff internally, like encoding JSON,
> that has to change. Meanwhile, your logging code probably relies
> on the same __str__ method actually returning a str, so you have to
> add one of those. Assuming headers is a dict of strs, you either
> need to go back up the chain (or into the API that provides it)
> and change that so it's been a dict of bytes all along, or you
> need to explicitly encode the headers here. That doesn't sound too
> hard overall... but that gives you working Python 3.5 code (assuming
> PEP 460 goes through). And there doesn't seem to be any shortcut
> that would give you working 2.8 code without also working in 3.5.

I think you are misunderstanding my proposal.  There would be no
problems like the ones you suggest: bytes() would be the Python 2.7
str class, and all the bytes/unicode internals would be like 2.7.  That's
basically the whole idea of this proposal, the bytes/str change in
3.x is the really disruptive one, separate it into separate
interpreter versions to make porting easier.


From rosuav at  Sun Jan 19 02:31:20 2014
From: rosuav at (Chris Angelico)
Date: Sun, 19 Jan 2014 12:31:20 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 19, 2014 at 12:14 PM, Neil Schemenauer
<nas-python at> wrote:
>  That's
> basically the whole idea of this proposal, the bytes/str change in
> 3.x is the really disruptive one, separate it into separate
> interpreter versions to make porting easier.

It may be disruptive to a whole lot of code that's been happily
oblivious to the whole issue, but it's also central to more and more
of Py3's library. It's going to become increasingly difficult to
backport stuff from Py3 to a system that doesn't have the same
back-end string handling.

If you're prepared to make a whole bunch of incompatible changes to
move to this hypothetical 2.8, why not make all the changes at once?
Unless 2.x will be maintained forever (with a 2.9, a 2.10, and so on),
the changes will have to be made. If it's so costly to make a full
pass over your code to port it to 3.3/3.4/3.5, surely it would be
twice as costly to make that exact same full pass to port it to 2.8,
and then another just the same to port 2.8 to 3.6?

I still maintain that the biggest complaints about the jump from 2.x
to 3.x are largely dealt with by 2.7 and 3.3/3.4. Yes, it's hard to
jump from 2.5 to 3.1, but you don't have to. Just stick with 2.x until
your users are all on 2.7, then strip out all the code that's
supporting pre-2.7 versions. Once you have that, you can start in with
some __future__ directives (division, print_function,
unicode_literals), and start sorting out the bytes/unicode distinction
*at your leisure*. (In some cases, that "sorting out" is simply a
matter of naming. I have some code that reads from a socket, and it's
divided into three parts: first pass works with "data" and handles
TELNET codes, second pass works with "text" and handles ANSI codes,
third pass works with "text" and handles newlines. It's obvious from
the parameter names that the conversion from bytes to Unicode has to
happen between the first and second passes.) Then, when you finally
come to port it to 3.x (which mightn't be for another however-many
years, when 2.7's support finally ends, or it might be even
later, when RHEL stops shipping patches, or it might not even be then
- code doesn't stop working just because it's not supported), you make
the jump to whichever version is most convenient. Currently, I'm
seeing 3.3 as easier to jump to than 3.2 (eg the redundant
compatibility notation u"str" is available), and 3.4 is getting some
more on that front; maybe some features won't make it into 3.4 so
they'll be in 3.5. And maybe it'll be 3.7 that you jump to. That's not
a problem. Whatever version you port it to, you make *one* assault on
your code, and there you are, taking advantage of exactly as much of
3.x as you feel like, and it's all working.
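
To make the naming idea concrete, here is a simplified sketch of such a
three-pass pipeline. It is an illustration only, not the actual code
described above: the function names are made up, the TELNET pass only
strips 3-byte IAC WILL/WONT/DO/DONT sequences, and the ANSI pass only
strips CSI sequences such as colour codes.

```python
import re

# Simplified: only 3-byte IAC WILL/WONT/DO/DONT negotiation sequences.
TELNET_NEGOTIATION = re.compile(rb"\xff[\xfb-\xfe].", re.DOTALL)
# Simplified: only CSI escape sequences, e.g. colour codes.
ANSI_CSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")

def strip_telnet(data):
    """First pass: works with 'data' (bytes in, bytes out)."""
    return TELNET_NEGOTIATION.sub(b"", data)

def strip_ansi(text):
    """Second pass: works with 'text' (str in, str out)."""
    return ANSI_CSI.sub("", text)

def split_lines(text):
    """Third pass: works with 'text' and handles newlines."""
    return text.splitlines()

def process(data, encoding="utf-8"):
    data = strip_telnet(data)
    # The one place where bytes become text: between passes one and two.
    text = data.decode(encoding)
    return split_lines(strip_ansi(text))
```

The parameter names alone show where the decode step belongs, which is
the whole of the "sorting out" in a case like this.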


From steve at  Sun Jan 19 03:04:04 2014
From: steve at (Steven D'Aprano)
Date: Sun, 19 Jan 2014 13:04:04 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
Message-ID: <20140119020404.GQ3915@ando>

On Sat, Jan 18, 2014 at 07:13:32PM -0600, Neil Schemenauer wrote:
> On 2014-01-18, Terry Reedy wrote:
> > On 1/17/2014 10:22 PM, Neil Schemenauer wrote:
> > >The transition to Python 3 is happening but there is still a massive
> > >amount of code that needs to be ported.
> > 
> > For application code, why does it need to be ported.
> Unless Python 2.x is going to be maintained in perpetuity then code
> will have to be ported.  This point seems obvious to me.

Why? If it isn't broken, don't break it. At last year's US PyCon, there 
was at least one person still using Python 1.5 in production. Doing so 
means that he gets no bug fixes or security updates for 1.5, but if he 
doesn't need them, that is no loss.

Python 2.7 will almost certainly still be supported by (for example) Red 
Hat until 2023, which is probably longer than most applications will be 
still in use.

> > For many application areas, the text problem seems to have been
> > somewhat solved, to the point where people are writing 2&3 code
> > successfully.
> Sure you can write code that's compatible with 2&3, that's not the
> code I'm talking about.  I'm talking about the millions (maybe
> billions) of lines of existing Python code.
> > I think it too late now.
> I disagree.  The amount of Python 2 code that exists exceeds the
> amount of Python 3 by orders of magnitude.  That existing codebase
> either stops evolving and stays Python 2 forever 

Why is that a problem? Some people will never migrate away from Python 
2.7/2.5/2.4/1.5. That's okay. A few months ago I ported an application 
from 2.3 to 2.6. It's not well recognised that Python 3 is not the first 
time Python broke backwards compatibility: string exceptions

raise "This is an error"

became a warning in 2.5 (I think) and a TypeError in 2.6. This 
application made extensive use of string exceptions. My customer was 
happy with 2.3 code for years, until they upgraded their server to a 
version of CentOS with 2.6, and that was the only reason they upgraded. 
I expect they will stick with 2.6 until such time as they upgrade the 
server again in another decade or so, and that's fine. They may never 
upgrade, and that's fine too.

> or we do all that's
> practical to help people move it to a current version of Python.

Define "all that's practical". How much hand-holding do they need? On 
the Python-Dev list, there are *hundreds* of emails about this issue, 
which is distracting the core devs from making Python 3 even more 

> I don't give a shit what it's called.  A Python 2 fork is going to
> happen whether the PSF blesses it or not

I doubt that. Stackless may try to call itself Python 2.8, but it won't 
be porting Python 3 features. Stackless wanted to release a 2.8, but it 
wouldn't contain any additional Py3 features; it would be a version bump 
to support a newer Microsoft compiler.

There are plenty of people who *say* they want a Python 2.8 with half 
the Python 3 features, but nobody as far as I can see is actually 
willing to do the work. If they were, why haven't they started? They 
don't need permission.


From steve at  Sun Jan 19 03:28:11 2014
From: steve at (Steven D'Aprano)
Date: Sun, 19 Jan 2014 13:28:11 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <>
Message-ID: <20140119022811.GR3915@ando>

On Fri, Jan 17, 2014 at 09:22:19PM -0600, Neil Schemenauer wrote:
> Here is a far out idea to make transition smoother.  Release version
> 2.8 of Python with nearly all Python 3.x incompatible changes except
> for the bytes/unicode changes.  This could include:

It's hardly a "far out" idea. You're not the first to suggest this. 
There are many people asking for -- demanding, almost -- a Python 2.8 
that provides only the subset of Python3 that they are interested in and 
gives them an excuse to avoid migrating for another three or five or ten 
years.

Because really that's what 2.8 is all about -- providing people an 
excuse to put off migrating for a bit longer. But the thing is, they've 
still got a good three or more years before 2.7 goes into "security 
patches only" mode, and likely years more before it becomes 
unmaintained. And then there's third-party support from companies like 
Red Hat. They will continue supporting Python 2 until the end of 2023.

I wonder whether the 2 to 3 transition might not have been handled 
better with a long-drawn out process of slowly adding 
backwards-incompatible changes a few at a time? This is like the old 
chestnut about whether it is better to ease yourself into a really cold 
swimming pool a little at a time, or get it over with in one go by 
diving in. In both cases, you have pain; is it better to have a lot of 
pain that only lasts a short while, or a little bit of pain that goes on 
and on and on and on...? I think that had Python decided to add 
backwards-incompatible changes a few at a time, people now would be 
complaining about that and demanding that there be a once-off cutover 

> - removal of 'apply', 'buffer', 'callable', 'execfile'

callable is back in Python 3.3.

> Problems with this idea:
> - it would be a huge amount of work. [...]
> - if people install this new version of Python as the default, old
>   scripts and programs will break. [...]

- It gives people an excuse to avoid migrating, and as sure as the sun 
rises in the east, will lead to people calling for Python 2.9 a few 
years from now.

> An alternative approach to producing Python 2.8 would be to start
> with the Python 3.x latest branch.  Modify bytesobject and
> unicodeobject to have as close to Python 2 behavior as practical.
> A-journey-of-a-thousand-miles-begins-ly y'rs

The journey *already began* back in Python 2.6. Python 2.6 is the start 
of the journey, it introduces dict views, next() builtin, from 
__future__ absolute_import print_function and unicode_literals, and 
probably more that I have forgotten.

So really, people have had 2.6 and 2.7 to ease the transition from 2.5 
to 3.x. If they haven't taken advantage of that, what makes you think 
that 2.8 and 2.9 will convince them to migrate?

But you don't have to believe me. Python is open source. Feel free to 
fork it and backport whatever features you like, and see how much 
interest you get from the wider community. Just don't call it "2.8", 
that sends the wrong message and is a pretty rude thing to do given that 
the core developers have said that they will not make a 2.8.

Just because there will not be a CPython 2.8 doesn't mean you can't go 
ahead with your plan to backport 3 features to a 2 base. Just call it 
something else.


From steve at  Sun Jan 19 03:34:19 2014
From: steve at (Steven D'Aprano)
Date: Sun, 19 Jan 2014 13:34:19 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <>
Message-ID: <20140119023419.GS3915@ando>

On Sat, Jan 18, 2014 at 07:14:13PM -0600, Neil Schemenauer wrote:
> On 2014-01-17, Andrew Barnert wrote:

> > And if you do it that way, you could even adapt the idea someone
> > proposed a few weeks ago -- not popular on this list, but maybe
> > popular with your target audience -- of turning each change on and
> > off with a "from __past__ import misfeature" statement, so people
> > could pick and choose the ones they need, and gradually remove
> > past statements as they port from your forked 2.8 to real 3.4.
> You can't make those changes with __future__/__past__ imports.  They
> affect the whole runtime, not a single module.

I believe you are wrong. from __future__ imports are designed to affect 
the single module they are in. I see no reason why from __past__ can't 
work the same way.

[steve at ando ~]$ cat
def func():
    return "Hello World"
[steve at ando ~]$ cat
from __future__ import unicode_literals

def func():
    return "Hello World"
[steve at ando ~]$ python2.7 -c "import b, a; print repr(b.func()), repr(a.func())"
u'Hello World' 'Hello World'
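
The same per-module scoping can be checked on a current Python 3, where
unicode_literals no longer changes anything. The sketch below is an
assumption-laden stand-in rather than the session above: it uses the
annotations future feature (Python 3.7 and later) and passes the future
flag to compile() for one compilation unit only; load_func is a made-up
helper name.

```python
import __future__

SOURCE = "def func(x: int):\n    return x\n"

def load_func(source, flags=0):
    """Compile source as its own unit with the given future flags."""
    namespace = {}
    # dont_inherit=True keeps this file's own future flags out of it.
    code = compile(source, "<demo>", "exec", flags=flags, dont_inherit=True)
    exec(code, namespace)
    return namespace["func"]

plain = load_func(SOURCE)
with_future = load_func(SOURCE, __future__.annotations.compiler_flag)

# Only the unit compiled with the flag stores annotations as strings.
print(plain.__annotations__)
print(with_future.__annotations__)
```

The flag changes how one piece of source is compiled and leaves every
other module untouched, which is the behaviour a from __past__ import
would need as well.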


From at  Sun Jan 19 04:42:00 2014
From: at (Haoyi Li)
Date: Sat, 18 Jan 2014 19:42:00 -0800
Subject: [Python-ideas] Tail recursion elimination
Message-ID: <3426697229381222197@unknownmsgid>

MacroPy also has a TCO implementation that uses trampolining.
It trades stack introspection for load-time-analysis, which could be a win
or a loss depending on how you view things.
From: Ryan
Sent: 1/18/2014 4:57 PM
To: musicdenotation at; Joao S. O. Bueno; python-ideas at
Subject: Re: [Python-ideas] Tail recursion elimination

I wrote one that uses decorators. How is that special syntax?

musicdenotation at wrote:
> On Jan 18, 2014, at 22:08, "Joao S. O. Bueno" <jsbueno at>
> wrote:
> You can use tail recursion elimination in Python as it is today.
> I have seen many "implementations" of tail-call optimization, and their
> common problem is that they all require special syntax to work. I need a
> solution that is directly usable with Python's ordinary *return* statement.
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

From ben+python at  Sun Jan 19 04:58:59 2014
From: ben+python at (Ben Finney)
Date: Sun, 19 Jan 2014 14:58:59 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
References: <> <lbd36b$87t$>
Message-ID: <>

Neil Schemenauer <nas-python at>

> On 2014-01-18, Terry Reedy wrote:
> > For application code, why does it need to be ported [to Python 3].
> Unless Python 2.x is going to be maintained in perpetuity then code
> will have to be ported.  This point seems obvious to me.

Maintained by whom? The PSF will stop maintaining Python 2, yes.

But that doesn't stop other parties -- Red Hat, ActiveState, etc. -- doing
so for whatever customers are still interested in compensating them for
their work.

So long as the cost of getting the Python interpreter maintained by
*someone* is lower than the perceived cost of porting to Python 3, the
code doesn't need to be ported.

This is a great and salient benefit of Python itself being free
software: Unlike a non-free software platform, no recipient of a
free-software Python is beholden to the vendor for ongoing maintenance.

That point seems obvious to me.

 \         "It is the fundamental duty of the citizen to resist and to |
  `\         restrain the violence of the state." --Noam Chomsky, 1971 |
_o__)                                                                  |
Ben Finney

From rymg19 at  Sun Jan 19 05:05:49 2014
From: rymg19 at (Ryan)
Date: Sat, 18 Jan 2014 22:05:49 -0600
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <3426697229381222197@unknownmsgid>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

Now there's a new library I need to try!

Haoyi Li < at> wrote:
>MacroPy also has an implementation of TCO implemented using
>trampolining. It trades stack introspection for load-time-analysis,
>which could be a win or a loss depending on how you view things.
>From: Ryan
>Sent: 1/18/2014 4:57 PM
>To: musicdenotation at; Joao S. O. Bueno;
>python-ideas at
>Subject: Re: [Python-ideas] Tail recursion elimination
>I wrote one that uses decorators. How is that special syntax?
>musicdenotation at wrote:
>> On Jan 18, 2014, at 22:08, "Joao S. O. Bueno" <jsbueno at>
>> wrote:
>> You can use tail recursion elimination in Python as it is today.
>> I have seen many "implementations" of tail-call optimization, and
>> their common problem is that they all require special syntax to
>> work. I need a solution that is directly usable with Python's
>> ordinary *return* statement.
>Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Sent from my Android phone with K-9 Mail. Please excuse my brevity.

From musicdenotation at  Sun Jan 19 08:39:15 2014
From: musicdenotation at (musicdenotation at
Date: Sun, 19 Jan 2014 14:39:15 +0700
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

I propose tail-call optimization to be added into CPython.

From abarnert at  Sun Jan 19 08:50:39 2014
From: abarnert at (Andrew Barnert)
Date: Sat, 18 Jan 2014 23:50:39 -0800
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 18, 2014, at 17:12, Neil Schemenauer <neil at> wrote:

> On 2014-01-17, Andrew Barnert wrote:
>> What exactly do you mean by "the bytes/unicode changes"?
> I mean all of those things that you listed.   bytes = the Python 2.7
> str object, str object is Python 2.7 unicode object.

If you really mean the things I listed, we already have that. It's called Python 3.4. If you want to fork it and rename it 2.8, I can't imagine who that would help.

A smaller list of changes _might_ mean a useful intermediate target, but if you're not even willing to think through the issues and discuss them, you're not going to come up with such a list.

>> I suspect that, whatever your exact answers, it would be a lot
>> easier to fork 3.4 and port the 2.7 behavior you want than to fork
>> 2.7 and backport almost all of 3.4.
> It's a lot of work no matter which way you do it.  That's one of the
> biggest problems with this idea.
>> And if you do it that way, you could even adapt the idea someone
>> proposed a few weeks ago -- not popular on this list, but maybe
>> popular with your target audience -- of turning each change on and
>> off with a "from __past__ import misfeature" statement, so people
>> could pick and choose the ones they need, and gradually remove
>> past statements as they port from your forked 2.8 to real 3.4.
> You can't make those changes with __future__/__past__ imports.  They
> affect the whole runtime, not a single module.

Sure you can. It already works for __future__ Unicode literals in 2.7. Most of the other changes would work just as well. A few might not--but again, you have to go through them one by one and decide.

>> However, I also suspect that, whatever your exact answers, it
>> won't be that useful. Look at people's reasons for not moving to
>> 3.x:
> Imagine I'm a developer with the Python 2.x codebase.  I'm either
> lazy or I'm too busy with other company projects that I can't put
> the effort into porting my 2.x code to 3.x, even if all the 3rd
> party libraries have been ported.
> How can we make it easier for them to move their code towards Python
> 3.x rather than keeping it as 2.x?  

Not by publishing something that requires the exact same code changes as 3.4 and calling it 2.8. That might trick a handful of devs, and help a handful of others trick their managers, but that's not much benefit.

> A maintained interpreter to run
> Python 2.x code is going to continue to exist.  Some python-dev
> people seem to suggest we can suggest that end of maintenance of
> Python 2.7 is going to force people to port their code.  That's
> ridiculous.

I've never heard anyone suggest this. The people who are most gung ho about 3.x are the ones who keep pointing out that many apps never need to port and that vendors like Red Hat are likely to continue supporting 2.7 long after the PSF stops doing so.

> I want to make it more attractive for these developers to move
> towards Python 3 rather than stalling out on Python 2.7 forever.
> How best to do that is still to be determined.  I think my 2.8 idea
> might be better than the status quo but it's just a crazy idea.

A crazy idea is one thing; a misinformed idea is another.
>> I'm having a hard time imagining code that would be easy to port to 2.8, but not to 3.x. For example:
>>     payload = <some object with a __str__ method to serialize it>
>>     sock.sendall('Header: {}\r\nAnother: {}\r\n\r\n{}'.format(
>>         headers['header'], headers['another'], payload))
>> Even with just the two changes you already suggested: First, you
>> have to change the literal to a bytes literal.
> That part is easy, could even be done with an automated tool
> (change u' to ' and ' to b').
>> More seriously, you have to rename that payload type's __str__
>> method to __bytes__.
> Nope, no __bytes__ in my proposed 2.8.

Then the code just doesn't work. The payload type's existing __str__ returns a bytes object, which raises a TypeError.
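A minimal sketch of that failure under Python 3 str/bytes rules, which is what the argument assumes. The Payload class is a hypothetical stand-in for the payload type in the example above.

```python
class Payload:
    """Hypothetical payload type whose __str__ still returns bytes, 2.x style."""

    def __str__(self):
        return b"serialized payload"

try:
    str(Payload())
    error = None
except TypeError as exc:
    # Python 3 refuses a __str__ that returns anything other than str.
    error = exc

# bytes objects also have no .format() method on Python 3.
bytes_has_format = hasattr(b"", "format")

print(type(error).__name__, bytes_has_format)
```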

>> And if it does any string stuff internally, like encoding JSON,
>> that has to change. Meanwhile, your logging code probably relies
>> on the same __str__ method actually returning a str, so you have to
>> add one of those. Assuming headers is a dict of strs, you either
>> need to go back up the chain (or into the API that provides it)
>> and change that so it's been a dict of bytes all along, or you
>> need to explicitly encode the headers here. That doesn't sound too
>> hard overall, but that gives you working Python 3.5 code (assuming
>> PEP 460 goes through). And there doesn't seem to be any shortcut
>> that would give you working 2.8 code without also working in 3.5.
> I think you are misunderstanding my proposal, no problems like the
> ones you suggest, bytes() would be the Python 2.7 str class.  All
> the internal bytes/unicode internals would be like 2.7.

You're contradicting yourself. You explicitly said that your proposal includes all of the changes I suggested. That includes, right near the very top, things like no automatic str/bytes conversions. But you seem to be assuming they would still exist even though you decided to remove them.

> That's
> basically the whole idea of this proposal, the bytes/str change in
> 3.x is the really disruptive one, separate it into separate
> interpreter versions 

But you've proposed that all of the elements of the str/bytes change should be part of 2.8, which means it will be just as disruptive as 3.4.

From ncoghlan at  Sun Jan 19 09:04:46 2014
From: ncoghlan at (Nick Coghlan)
Date: Sun, 19 Jan 2014 18:04:46 +1000
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
Message-ID: <>

On 19 January 2014 11:13, Neil Schemenauer <nas-python at> wrote:
> On 2014-01-18, Terry Reedy wrote:
>> On 1/17/2014 10:22 PM, Neil Schemenauer wrote:
>> >The transition to Python 3 is happening but there is still a massive
>> >amount of code that needs to be ported.
>> For application code, why does it need to be ported?
> Unless Python 2.x is going to be maintained in perpetuity then code
> will have to be ported.  This point seems obvious to me.

Red Hat will offer commercial Python 2 support until at least 2023
(since the RHEL7 beta was just released with Python 2.7 as the system
Python and the current lifecycle for RHEL releases is 10 years), and I
expect other commercial redistributors will similarly extend the
lifetime of Python 2 well beyond 2015 when the level of support we
provide for free reverts to security fix only mode. With CentOS and
other downstream community rebuilds of RHEL available, this even
includes the availability of *free* prebuilt versions.

So Python 2 application developers don't have anything to worry about
*if they're happy with Python 2.7 as it stands*, especially after
accounting for the Python 3 standard library modules that are also
available on PyPI for Python 2 (or are relatively easy to fork and
port back to Python 2, or just copy and paste the relevant code into a
private utility module).

However, now that we're approaching the release of Python 3.4 (the
second Python 3 release without a corresponding Python 2 release),
some Python 2 developers are finally beginning to realise how much
they had come to rely on the relatively steady cadence of new features
and functionality previously delivered in an easily consumable bundle
by the CPython core development team.

So, those developers are now faced with a few different options:

- invest in migrating to Python 3 themselves (the cost of which will
vary from being similar to any major Python version update, with most
of the cost being in compatibility testing, to substantially more
expensive, depending on the exact nature of the application, its
dependencies and the quality of their respective test suites)
- try to guilt the existing core developers into creating Python 2.8
for them for free (not going to happen, read PEP 404)
- try to hire enough of the core developers to convince Guido to
approve a Python 2.8 release from (not impossible, but
likely prohibitively expensive, since most, perhaps all, of the core
development team are already gainfully employed elsewhere)
- fork CPython to create their own Python 2.8 (also cost prohibitive
from a time and materials perspective, unless you already have the
infrastructure and community in place to maintain a CPython fork)

That last point is relevant to the discussions around Stackless 2.8:
CCP and the rest of the Stackless community have been maintaining a
CPython fork for so long that the idea of porting some of the
backwards compatible Python 3.3 and 3.4 changes over to a Stackless
2.8 release is a relatively straightforward one for them and something
they're seriously considering. However, significant compatibility
testing costs would still be incurred in a switch from CPython 2.7 to
Stackless 2.8, so conservative developers are still likely to stick
with the devil they know (most of the really interesting changes in
Python 3 are the backwards incompatible ones, so they won't be
backported, even in Stackless 2.8).

There's lots and lots of info about the state of the Python 3
transition here:

I'd call reading that Q&A the starting point for any discussion of
creating a Python 2.8 release, but it really isn't. The starting point
is a deep understanding of the business drivers of open source
based commercial operations and how they deal with cases where they
depend on things that upstream has decided not to support any more.
Sometimes they invest in the infrastructure needed to create their own
fork (since their motivations no longer align with the existing
development team's motivations), sometimes they pay commercial
redistributors to continue supporting the older version (an approach I
appreciate, since it represents one of the things that ultimately gets
me paid), sometimes they approach the existing development team (or a
related foundation) about directly funding continued development of
the version being discontinued and sometimes they decide to invest in
updating to the newer platform themselves.

This dynamic isn't unique to open source though, as it impacts even
large proprietary platform vendors like Microsoft - Windows XP almost
certainly remained supported for so long because a whole lot of paying
users weren't happy with the state of Windows Vista and offered
Microsoft enough money to ensure they could keep using Windows XP
until Windows 7 was available. The only difference there is that in
the proprietary case, the *only* option users have is to beg the
vendor to continue maintenance - the options of forking or paying
someone else to take up maintenance aren't available due to the
licensing restrictions on the proprietary platform.

Returning to the Python 3 case, as things currently stand, the
combination of Armin Ronacher's python-modernize with Benjamin
Peterson's six module is one approach to smooth migration, as is Ed
Schofield's python-future module and its futurize tool. For
application porting (which may be able to just drop Python 2 support
rather than needing to maintain Python 2 and Python 3 support in
parallel), Guido's original 2to3 conversion tool may suffice.

PEP 461 will likely add a binary interpolation feature *back* to
Python 3.5, removing an additional blocker to forward migrations for
current Python 2 users (just as PEP 414 did by restoring Unicode
literal support).
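For illustration, on interpreters that implement PEP 461 (Python 3.5 and later), binary interpolation looks roughly like this. The header names and values are invented for the sketch.

```python
# Hypothetical header values, already bytes, as a wire protocol would need.
headers = {"header": b"text/plain", "another": b"42"}
payload = b"body"

# %-interpolation directly on a bytes literal, with bytes arguments.
message = b"Header: %s\r\nAnother: %s\r\n\r\n%s" % (
    headers["header"], headers["another"], payload)

print(message)
```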

While the Stackless community are looking at creating a Stackless 2.8
release, and some Python 2 users may decide it is worth migrating to
the Stackless fork to gain access to any Python 3 features they decide
to backport, rather than migrating to Python 3 itself, this is all
perfectly fine - it's the open source model working *exactly as it is
supposed to*, by giving people the option to take steps that meet
*their* needs, rather than being completely subject to the desires of
the core development team.

The only thing people *don't* get to do is make suggestions about what
*should* happen without also explaining:

* what problem the suggestion is designed to solve, and how it
actually helps to solve it
* how the proposal is going to be resourced, especially when it is
something the existing development team have disclaimed any interest
in doing for free


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sun Jan 19 09:13:06 2014
From: ncoghlan at (Nick Coghlan)
Date: Sun, 19 Jan 2014 18:13:06 +1000
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <20140119020404.GQ3915@ando>
References: <> <lbd36b$87t$>
 <> <20140119020404.GQ3915@ando>
Message-ID: <>

On 19 January 2014 12:04, Steven D'Aprano <steve at> wrote:
> On Sat, Jan 18, 2014 at 07:13:32PM -0600, Neil Schemenauer wrote:
>> I disagree.  The amount of Python 2 code that exists exceeds the
>> amount of Python 3 by orders of magnitude.  That existing codebase
>> either stops evolving and stays Python 2 forever
> Why is that a problem? Some people will never migrate away from Python
> 2.7/2.5/2.4/1.5. That's okay. A few months ago I ported an application
> from 2.3 to 2.6. It's not well recognised that Python 3 is not the first
> time Python broke backwards compatibility: string exceptions
> raise "This is an error"
> became a warning in 2.5 (I think) and a SyntaxError in 2.6. This
> application made extensive use of string exceptions. My customer was
> happy with 2.3 code for years, until they upgraded their server to a
> version of Centos with 2.6, and that was the only reason they upgraded.
> I expect they will stick with 2.6 until such time as they upgrade the
> server again in another decade or so, and that's fine. They may never
> upgrade, and that's fine too.

For anyone that ever travels by plane, it can be worth watching
aircraft entertainment systems go through their boot cycle to see what
they're running on (the difficulty of getting software, even
entertainment software, approved to run on aircraft can make for very
long lead times). The last one I checked was based on Red Hat 7.1,
released in 2001 and unsupported for a very long time, but still
entirely serviceable for that particular use case.

Old doesn't always mean broken, sometimes it just annoys your
developers to be asked to use such old and blunt tools when newer and
sharper ones are available :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From musicdenotation at  Sun Jan 19 09:47:00 2014
From: musicdenotation at (musicdenotation at
Date: Sun, 19 Jan 2014 15:47:00 +0700
Subject: [Python-ideas] Make print() not append line break by default
Message-ID: <>

And add println()

From stefan_ml at  Sun Jan 19 10:10:13 2014
From: stefan_ml at (Stefan Behnel)
Date: Sun, 19 Jan 2014 10:10:13 +0100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <20140119022811.GR3915@ando>
References: <> <20140119022811.GR3915@ando>
Message-ID: <lbg4pa$d22$>

Steven D'Aprano, 19.01.2014 03:28:
> - It gives people an excuse to avoid migrating, and as sure as the sun 
> rises in the east, will lead to people calling for Python 2.9 a few 
> years from now.

Thank you, Steven, for taking the time to write this post.


From bruce at  Sun Jan 19 11:00:36 2014
From: bruce at (Bruce Leban)
Date: Sun, 19 Jan 2014 02:00:36 -0800
Subject: [Python-ideas] Make print() not append line break by default
In-Reply-To: <>
References: <>
Message-ID: <>

I think this is a great suggestion if the goal is to break lots of programs
for no good reason. Can we rename 'dict' to 'map' while we're at it?


The best suggestions are motivated by an actual problem or use case.
There's no problem here. Just use:

print(..., end='')

The bar for breaking changes is very high. This is -100 on a scale of 0 to

--- Bruce
On Sun, Jan 19, 2014 at 12:47 AM, <musicdenotation at> wrote:

> And add println()
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From breamoreboy at  Sun Jan 19 11:52:02 2014
From: breamoreboy at (Mark Lawrence)
Date: Sun, 19 Jan 2014 10:52:02 +0000
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
Message-ID: <lbgaob$421$>

On 19/01/2014 07:39, 
musicdenotation at wrote:
> I propose tail-call optimization to be added into CPython.

Then implement it so everybody else can use it.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From breamoreboy at  Sun Jan 19 12:01:38 2014
From: breamoreboy at (Mark Lawrence)
Date: Sun, 19 Jan 2014 11:01:38 +0000
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
Message-ID: <lbgbab$cjp$>

On 19/01/2014 01:13, Neil Schemenauer wrote:
> I don't give a shit what it's called.  A Python 2 fork is going to
> happen whether the PSF blesses it or not, I can't believe that's
> even a point of discussion.  People are still maintaining Cobol
> compilers.

I don't care what it's called either.  And I'll believe the fork when I 
see it.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From steve at  Sun Jan 19 12:08:03 2014
From: steve at (Steven D'Aprano)
Date: Sun, 19 Jan 2014 22:08:03 +1100
Subject: [Python-ideas] Make print() not append line break by default
In-Reply-To: <>
References: <>
Message-ID: <20140119110803.GT3915@ando>

On Sun, Jan 19, 2014 at 03:47:00PM +0700, musicdenotation at wrote:

> And add println()

Print natural log?

For the major use cases print is designed for, you want it to print a 
newline at the end. For those rare times you don't, print(..., end='') 
is simple enough.
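A short sketch of the existing keyword in use. Output is captured in a StringIO buffer only so the result can be inspected; the strings are arbitrary.

```python
import io

buf = io.StringIO()

# end='' suppresses the trailing newline for this one call only.
print("no newline", end="", file=buf)
# The default behaviour still appends "\n".
print(" | default newline", file=buf)

result = buf.getvalue()
print(repr(result))
```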

Besides, print has inserted a newline at the end since at least 
Python 1.5. There is a lot of code relying on that behaviour. Even if we 
wanted to change, backwards-compatibility considerations would prevent 


From tjreedy at  Sun Jan 19 12:09:56 2014
From: tjreedy at (Terry Reedy)
Date: Sun, 19 Jan 2014 06:09:56 -0500
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
Message-ID: <lbgbpu$h94$>

On 1/18/2014 8:13 PM, Neil Schemenauer wrote:
> On 2014-01-18, Terry Reedy wrote:

>> I realize that if there is actual code created, and if it's not
>> under the umbrella of the PSF, it couldn't be called "Python 2.8"
>> due to trademark reasons.

Except I did not. This is part of a quote from Martijn Faassen. You 
should have left the attribution and quote marks in.

> I don't give a shit what it's called.  A Python 2 fork is going to
> happen whether the PSF blesses it or not,

The core developers said years ago that if *other* people want to make a 
post 2.7 Python, just not called 'Python 2.8' (because we do care), they 
are free to. We *expect* that there will be commercial support (Red Hat, 
for instance) at least for keeping 2.7 updated to work on new platforms, 
perhaps with a few other patches.  If you are correct about the 
tremendous demand for a 'something 2.8', then some group should be able 
to make money creating and selling it. However, as far as I know, no 
person and no corporation has yet offered money to PSF or individual 
core developers to develop a possibly PSF-blessed Python 2.8.

 > I can't believe that's even a point of discussion.

You are the one who brought it up on *this* list, where it is mostly 
off-topic, because *this* list is about future Python 3 versions. That 
was the point of me directing you to Faassen's 'something 2.8' discussion.

Terry Jan Reedy

From stefan_ml at  Sun Jan 19 12:24:51 2014
From: stefan_ml at (Stefan Behnel)
Date: Sun, 19 Jan 2014 12:24:51 +0100
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
Message-ID: <lbgclo$qa2$>

Andrew Barnert, 18.01.2014 19:29:
> The first problem is that CPython makes a C function call for every 
> Python function call, and C doesn't eliminate tail calls; the only way 
> to do it manually is with longjmp

Many C compilers actually fold tail recursion into loops. However, that has
nothing to do with an *interpreter* that happens to be written in C not
eliminating tail recursion. There is no technical reason you couldn't do
TRE in CPython at the *interpreter* level. And this has nothing to do with


From jsbueno at  Sun Jan 19 12:54:42 2014
From: jsbueno at (Joao S. O. Bueno)
Date: Sun, 19 Jan 2014 09:54:42 -0200
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <lbgaob$421$>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

On 19 January 2014 08:52, Mark Lawrence <breamoreboy at> wrote:
> On 19/01/2014 07:39, musicdenotation at wrote:
>> I propose tail-call optimization to be added into CPython.
> Then implement it so everybody else can use it.

On second thought, it actually could be done at the VM level.
I am not a proponent, but after that second thought I have moved from "-1" to "+0".

I believe that any time one has the sequence:

             20 CALL_FUNCTION            1
             23 RETURN_VALUE

in byte code, the current stack frame could be discarded prior
to making the function call. Looking from 10000 meters, it feels
like it would not impact any other aspect of the language other than
automatically enabling tail-recursive calls.
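A rough way to spot that pattern with the dis module, as a sketch only. The helper name tail_call_offsets and the functions f, h and g are invented here, and because opcode names differ between CPython versions (recent 3.x releases use CALL rather than CALL_FUNCTION), the helper matches any opcode whose name starts with CALL.

```python
import dis

def tail_call_offsets(func):
    """Return bytecode offsets of call opcodes immediately followed by RETURN_VALUE."""
    instructions = list(dis.get_instructions(func))
    return [
        first.offset
        for first, second in zip(instructions, instructions[1:])
        if first.opname.startswith("CALL") and second.opname == "RETURN_VALUE"
    ]

def f(x):
    return g(x)        # tail call: nothing happens after g returns

def h(x):
    return g(x) + 1    # not a tail call: the addition runs after g returns

# g is never called here, so it does not need to exist for disassembly.
print(bool(tail_call_offsets(f)), bool(tail_call_offsets(h)))
```

This only detects candidate call sites; it says nothing about whether discarding the frame would be safe.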



From jsbueno at  Sun Jan 19 12:57:34 2014
From: jsbueno at (Joao S. O. Bueno)
Date: Sun, 19 Jan 2014 09:57:34 -0200
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

OTOH, since we are at it, we'd better check
the BDFL's 2009 opinion on the subject:

On 19 January 2014 09:54, Joao S. O. Bueno <jsbueno at> wrote:
> On 19 January 2014 08:52, Mark Lawrence <breamoreboy at> wrote:
>> On 19/01/2014 07:39, musicdenotation at wrote:
>>> I propose tail-call optimization to be added into CPython.
>> Then implement it so everybody else can use it.
> On a second though,  it actually could be done, at the VM level.
> I am not a proponent, but after my second though I am from "-1" to "+0".
> I believe that anytime one have the sequence:
>              20 CALL_FUNCTION            1
>              23 RETURN_VALUE
> in byte code, the current stack frame could be discarded prior
> to making the function call. Looking from 10000 meters, it feels
> like it would not impact any other aspect of the language but for
> enabling automatically tail recursion calls.

From tjreedy at  Sun Jan 19 13:12:16 2014
From: tjreedy at (Terry Reedy)
Date: Sun, 19 Jan 2014 07:12:16 -0500
Subject: [Python-ideas] Tail Call Optimization (was Re: Tail recursion
In-Reply-To: <20140119004515.GP3915@ando>
References: <>
 <> <20140119004515.GP3915@ando>
Message-ID: <lbgfeq$kok$>

TCO (Tail Call Optimization) means that when TCO is in effect and a tail 
call "return f(<args>)" is executed, the current execution context 
(stack frame) is used for the call instead of allocating a new one. What 
is 'optimized' is space usage. The effect on time is not clear.
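To make the space point concrete, a minimal sketch under current CPython, which does no TCO: a tail-recursive function still consumes one frame per call and hits the recursion limit, while the equivalent loop does not. The function names are made up for the example.

```python
import sys

def sum_to_recursive(n, acc=0):
    """Sum 1..n with a tail call; each call still gets a new frame."""
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)  # tail call

def sum_to_loop(n, acc=0):
    """Same computation with the tail call rewritten as a loop."""
    while n != 0:
        n, acc = n - 1, acc + n
    return acc

deep = sys.getrecursionlimit() * 2

try:
    sum_to_recursive(deep)
    recursive_ok = True
except RecursionError:
    recursive_ok = False

loop_result = sum_to_loop(deep)
print(recursive_ok, loop_result)
```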

On 1/18/2014 7:45 PM, Steven D'Aprano wrote:

> What makes you say that it is "non-pythonic"? You seem to be assuming
> that *by definition* anything written recursively is non-pythonic. I do
> not subscribe to that view.

Neither do I.

> In fact, in some cases, I *would* willingly give up *non-useful*
> tracebacks for the ability to write more idiomatic code.
> The point is that tracebacks are not sacrosanct, and, yes, I would like
> the choice between writing idiomatic recursive code and more extensive
> tracebacks. Trading off speed for convenience is perfectly Pythonic --
> that's why we have the ability to write C extensions, is it not?

Are you willing to do any of the work needed to make the option 
available, starting with a specification? If so, I have some ideas.

> Having to fork the entire compiler just to write a few functions in
> their most idiomatic, natural (recursive) form seems a bit extreme,
> wouldn't you say?

A 'fork' could consist of a relatively small patch that could be 
uploaded to, for instance, PyPI. I would not be surprised if 100-200 
lines might be enough.

Terry Jan Reedy

From ncoghlan at  Sun Jan 19 13:31:06 2014
From: ncoghlan at (Nick Coghlan)
Date: Sun, 19 Jan 2014 22:31:06 +1000
Subject: [Python-ideas] Tail Call Optimization (was Re: Tail recursion
In-Reply-To: <lbgfeq$kok$>
References: <>
 <20140119004515.GP3915@ando> <lbgfeq$kok$>
Message-ID: <>

On 19 January 2014 22:12, Terry Reedy <tjreedy at> wrote:
> TCO (Tail Call Optimization) means that when TCO is in effect and a tail
> call "return f(<args>)" is executed, the current execution context (stack
> frame) is used for the call instead of allocating a new one. What is
> 'optimized' is space usage. The effect on time is not clear.
> On 1/18/2014 7:45 PM, Steven D'Aprano wrote:
>> What makes you say that it is "non-pythonic"? You seem to be assuming
>> that *by definition* anything written recursively is non-pythonic. I do
>> not subscribe to that view.
> Neither do I.

Guido is on record as preferring iterative algorithms as more
comprehensible for more people, and explicitly opposed to adding tail
call optimisation. I tend to agree with him - functional programming
works OK in the small (and pure functions are a fine tool for managing
complexity), but to scale up in a way that fits people's brains, you
need to start writing code that looks more like a cookbook.

If you want inspiration on how to design a language for typical human
thought patterns, look to cookbooks, training guides and operator
manuals, not mathematics.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ned at  Sun Jan 19 13:56:21 2014
From: ned at (Ned Batchelder)
Date: Sun, 19 Jan 2014 07:56:21 -0500
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
 <> <lbgaob$421$>
Message-ID: <>

On 1/19/14 6:54 AM, Joao S. O. Bueno wrote:
> On 19 January 2014 08:52, Mark Lawrence <breamoreboy at> wrote:
>> On 19/01/2014 07:39, musicdenotation at wrote:
>>> I propose tail-call optimization to be added into CPython.
>> Then implement it so everybody else can use it.
> On a second though,  it actually could be done, at the VM level.
> I am not a proponent, but after my second though I am from "-1" to "+0".
> I believe that anytime one have the sequence:
>               20 CALL_FUNCTION            1
>               23 RETURN_VALUE
> in byte code, the current stack frame could be discarded prior
> to making the function call. Looking from 10000 meters, it feels
> like it would not impact any other aspect of the language but for
> enabling automatically tail recursion calls.
A big confusion here is between "tail recursion calls" and "tail 
calls".  This change would eliminate all tail calls, so that if f() 
ended by calling g(), then g would reuse the stack frame of f.  If g 
raises an exception, the stack trace would have no evidence of f in it.  
This is what people mean about unusable stack traces.  And don't forget 
that the stack is inspectable at runtime, so we aren't only talking 
about the visible stack trace produced on error, but the result of 
inspect.stack() etc, also.
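A small sketch of the information at stake, using made-up functions f and g. On today's CPython the traceback records f as the caller of g; with every tail call eliminated, the f entry would be gone.

```python
import traceback

def g():
    raise ValueError("boom")

def f():
    return g()  # tail call: f does nothing after g returns

try:
    f()
except ValueError as exc:
    # Function names from outermost frame to the frame that raised.
    frames = [entry.name for entry in traceback.extract_tb(exc.__traceback__)]

print(frames)
```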

Sure, if you eliminate only *recursive* tail calls, then the resulting 
stack traces aren't so bad, because you can do bookkeeping so that the 
1000 recursive calls to the same function are represented by one frame 
with an annotation of 1000 on it somewhere.  But how do you make your 
above code work only for recursive calls? And what about mutually 
recursive calls? Aren't those important too?

So we have two choices: the relatively easy job of eliminating all tail 
calls, which will throw away information we value, or the unsolved 
problem of how to eliminate recursive tail calls.


From musicdenotation at  Sun Jan 19 15:00:00 2014
From: musicdenotation at (musicdenotation at
Date: Sun, 19 Jan 2014 21:00:00 +0700
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
 <> <lbgaob$421$>
Message-ID: <>

> On Jan 19, 2014, at 18:57, "Joao S. O. Bueno" <jsbueno at> wrote:
> OTOH, since we are at it, we'd better check
> 2009 BDLF's opinion on the subject:
>> On 19 January 2014 09:54, Joao S. O. Bueno <jsbueno at> wrote:
>>> On 19 January 2014 08:52, Mark Lawrence <breamoreboy at> wrote:
>>>> On 19/01/2014 07:39, musicdenotation at wrote:
>>>> I propose tail-call optimization to be added into CPython.
>>> Then implement it so everybody else can use it.
>> On a second though,  it actually could be done, at the VM level.
>> I am not a proponent, but after my second though I am from "-1" to "+0".
>> I believe that anytime one have the sequence:
>>             20 CALL_FUNCTION            1
>>             23 RETURN_VALUE
>> in byte code, the current stack frame could be discarded prior
>> to making the function call. Looking from 10000 meters, it feels
>> like it would not impact any other aspect of the language but for
>> enabling automatically tail recursion calls.
Actually, my original post is a response to his arguments.

From denis.spir at  Sun Jan 19 15:03:05 2014
From: denis.spir at (spir)
Date: Sun, 19 Jan 2014 15:03:05 +0100
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <lbf1je$n02$>
References: <>
Message-ID: <>

On 01/19/2014 12:09 AM, Terry Reedy wrote:

I share all your views except for the following, which in my view is incomplete:

> 1) A tail call is a 'top level' call in a return statement.
>      return f(*args, **kwds)
> A directly recursive call, where f refers to the function with the return
> statement, is a special case.

This is also true for "actions" (as opposed to proper functions, whose purpose is 
to compute a new piece of data). It's actually rather common (also in C ;-). 
There is no data product. (see PS)


PS: An example may be an "action" writing out list items which calls, or rather 
delegates to, another action that writes additional items preceded by a separator.

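def write_items(stream, items):
    # Write the first item, then delegate the rest to another action.
    n = len(items)
    if n == 0:
        return
    stream.write(str(items[0]))
    if n == 1:
        return
    write_other_items(stream, items, n)  # tail call

def write_other_items(stream, items, n):
    # Write the remaining items, each preceded by a separator.
    for i in range(1, n):
        stream.write(" ")
        stream.write(str(items[i]))

from sys import stdout
write_items(stdout, [])
write_items(stdout, [1])
write_items(stdout, [1, 2, 3])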

From nas-python at  Sun Jan 19 15:18:19 2014
From: nas-python at (Neil Schemenauer)
Date: Sun, 19 Jan 2014 08:18:19 -0600
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <20140119022811.GR3915@ando>
References: <>
Message-ID: <>

On 2014-01-19, Steven D'Aprano wrote:
> [Neil]
> > - if people install this new version of Python as the default, old
> >   scripts and programs will break. [...]
> - It gives people an excuse to avoid migrating, and as sure as the sun 
> rises in the east, will lead to people calling for Python 2.9 a few 
> years from now.

That would be progress though.  My proposed 2.8 would have most of
the incompatible changes from 3.x so if people port to it they will be
much closer to 3.x.


From rosuav at  Sun Jan 19 15:22:38 2014
From: rosuav at (Chris Angelico)
Date: Mon, 20 Jan 2014 01:22:38 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <20140119022811.GR3915@ando>
Message-ID: <>

On Mon, Jan 20, 2014 at 1:18 AM, Neil Schemenauer
<nas-python at> wrote:
> On 2014-01-19, Steven D'Aprano wrote:
>> [Neil]
>> > - if people install this new version of Python as the default, old
>> >   scripts and programs will break. [...]
>> - It gives people an excuse to avoid migrating, and as sure as the sun
>> rises in the east, will lead to people calling for Python 2.9 a few
>> years from now.
> That would be progress though.  My proposed 2.8 would have most of
> the incompatible changes from 3.x so if people port it they will be
> much closer to 3.x.

I still haven't seen any evidence that porting half-way and then
half-way again later is going to be less work than just biting the
bullet and porting to 3.x (either sooner or later, whichever is more


From solipsis at  Sun Jan 19 15:23:08 2014
From: solipsis at (Antoine Pitrou)
Date: Sun, 19 Jan 2014 15:23:08 +0100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
References: <> <20140119022811.GR3915@ando>
Message-ID: <20140119152308.6da15740@fsol>

On Sun, 19 Jan 2014 08:18:19 -0600
Neil Schemenauer <nas-python at> wrote:
> On 2014-01-19, Steven D'Aprano wrote:
> > [Neil]
> > > - if people install this new version of Python as the default, old
> > >   scripts and programs will break. [...]
> > 
> > - It gives people an excuse to avoid migrating, and as sure as the sun 
> > rises in the east, will lead to people calling for Python 2.9 a few 
> > years from now.
> That would be progress though.  My proposed 2.8 would have most of
> the incompatible changes from 3.x so if people port it they will be
> much closer to 3.x.

Not sure how code that would be incompatible with both 2.7 and 3.x
should be considered progress.



From denis.spir at  Sun Jan 19 15:27:13 2014
From: denis.spir at (spir)
Date: Sun, 19 Jan 2014 15:27:13 +0100
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
 <> <lbgaob$421$>
Message-ID: <>

On 01/19/2014 12:54 PM, Joao S. O. Bueno wrote:
> On 19 January 2014 08:52, Mark Lawrence <breamoreboy at> wrote:
>> On 19/01/2014 07:39, musicdenotation at wrote:
>>> I propose tail-call optimization to be added into CPython.
>> Then implement it so everybody else can use it.
> On second thought, it actually could be done, at the VM level.
> I am not a proponent, but after my second thought I went from "-1" to "+0".
> I believe that anytime one has the sequence:
>               20 CALL_FUNCTION            1
>               23 RETURN_VALUE
> in byte code, the current stack frame could be discarded prior
> to making the function call. Looking from 10000 meters, it feels
> like it would not impact any other aspect of the language but for
> enabling automatically tail recursion calls.

You also need to adjust frame size, possibly even its structure (dunno, depends 
on implementation details of python's "calling convention" so to say), to get a 
right space (and disposition) for the callee's input variables.
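
For anyone who wants to see where that CALL/RETURN pattern shows up, here is a
rough sketch using the standard dis module. The helper name has_tail_call and
the two sample functions are made up for illustration. Opcode names differ
between CPython versions (CALL_FUNCTION in the versions discussed here, CALL in
newer ones), so the check only assumes "some opcode whose name starts with CALL,
immediately followed by RETURN_VALUE".

```python
import dis


def has_tail_call(func):
    """Return True if a call opcode is immediately followed by RETURN_VALUE.

    Hypothetical helper: any opname starting with "CALL" is treated as a
    call, because the exact names vary between CPython versions.
    """
    ops = [ins.opname for ins in dis.get_instructions(func)]
    return any(a.startswith("CALL") and b == "RETURN_VALUE"
               for a, b in zip(ops, ops[1:]))


def countdown(n):
    if n == 0:
        return 0
    return countdown(n - 1)      # the call result is returned unchanged


def fact(n):
    if n < 2:
        return 1
    return n * fact(n - 1)       # the multiplication runs after the call
```

Only countdown contains the pattern; in fact, the multiplication sits between
the call and the return, so its frame could not be discarded early.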


From denis.spir at  Sun Jan 19 16:07:16 2014
From: denis.spir at (spir)
Date: Sun, 19 Jan 2014 16:07:16 +0100
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <20140119004515.GP3915@ando>
References: <>
 <> <20140119004515.GP3915@ando>
Message-ID: <>

On 01/19/2014 01:45 AM, Steven D'Aprano wrote:
> What makes you say that it is "non-pythonic"? You seem to be assuming
> that *by definition* anything written recursively is non-pythonic. I do
> not subscribe to that view.

It is certainly hard to judge the size of the field of "naturally" recursive 
algorithms. First, it depends on applications or application domains, on thinking 
habits, on the kinds of data structures one is accustomed to. Second, there is 
much vagueness and ambiguity about the very term "recursion" in programming. 
Recursion and recurrence literally just mean re-running, that is, a cyclic, 
repetitive, looping, (re)iterative process.

The typical case used as example is factorial.
	fact(0) = 1      fact(n) = n * fact(n-1)

This is plain semantics. To get an *algorithm* to *compute* fact(n), one can 
interpret these semantics in 2 ways:
* forward iteration: start with base case (0) then as long as we don't reach n 
compute the next factorial
* backward iteration: start with n, then as long as we don't reach the base case 
compute the previous factorial
Both are recursive, but in programming we call the second case a recursion, 
while the former is at times called corecursion (see wikipedia); corecursion is 
just equivalent to plain loops, since it moves forward, and is just as easy to 
understand. [Which is strange, to say the least; I'd happily swap the terms.] [And 
the actual computing process of backward iteration is in fact forward, 
but this does not appear in code.]
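
A minimal sketch of the two readings, in Python. The function names
fact_forward and fact_backward are only illustrative labels for the "forward"
and "backward" iterations described above.

```python
def fact_forward(n):
    """Start at the base case fact(0) = 1 and work up to n (a plain loop)."""
    result = 1
    for k in range(1, n + 1):
        result *= k
    return result


def fact_backward(n):
    """Start at n and work down to the base case (what is usually called recursion)."""
    if n == 0:
        return 1
    return n * fact_backward(n - 1)
```

Both compute the same values; only the second one needs one stack frame per
step.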

Funnily enough (since sum is rarely used as example) the trivial case of sum can 
lead to a similar reasoning.

For quite a while I played with mostly functional languages, and as a 
consequence implemented many routines as "backward recursions" (ending with the 
base case), even when back to procedural programming, especially when dealing 
with tree-like or other linked data. I remember a case of a radix trie (see 
wikipedia) which I had a hard time getting right, actually a hard time 
*thinking*. Then by pure chance I stumbled on an implementation of tries in 
python, using "forward recursion", which was trivial to understand. I translated 
the logic to my C trie, very happily. This radically changed my view on 
"naturally" recursive algorithms.
The fact that tries (and other linked data when implemented the functional way) 
are *self-similar* (see wikipedia) data structures, and that the corresponding 
algorithms are thus "naturally" self-similar too, does not imply that *backward* 
recursion is the natural / simple / easy way.

(Now, I do agree that
     def fact (n): return 1 if n<2 else n * fact(n-1)
is very ok: simple and easy enough... once one gets the very principle of 
backward recursion, meaning thinking recurrence so-to-say inside out.)


From jsbueno at  Sun Jan 19 16:12:21 2014
From: jsbueno at (Joao S. O. Bueno)
Date: Sun, 19 Jan 2014 13:12:21 -0200
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

On 19 January 2014 12:27, spir <denis.spir at> wrote:
> On 01/19/2014 12:54 PM, Joao S. O. Bueno wrote:
>> On 19 January 2014 08:52, Mark Lawrence <breamoreboy at> wrote:
>>> On 19/01/2014 07:39, musicdenotation at wrote:
>>>> I propose tail-call optimization to be added into CPython.
>>> Then implement it so everybody else can use it.
>> On second thought, it actually could be done, at the VM level.
>> I am not a proponent, but after my second thought I went from "-1" to "+0".
>> I believe that anytime one has the sequence:
>>               20 CALL_FUNCTION            1
>>               23 RETURN_VALUE
>> in byte code, the current stack frame could be discarded prior
>> to making the function call. Looking from 10000 meters, it feels
>> like it would not impact any other aspect of the language but for
>> enabling automatically tail recursion calls.
> You also need to adjust frame size, possibly even its structure (dunno,
> depends on implementation details of python's "calling convention" so to
> say), to get a right space (and disposition) for the callee's input
> variables.

Not in this suggestion -  I did not propose re-using the frame,
as seems to be the case around the calls, precisely because of that:
frames in Python seem to be tied to the code object
within them. My suggestion is simply to discard the current frame
before building the frame for the call. (Maybe adding some logging
information on this next frame so that the stack trace could be complete)
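
Purely as a user-level illustration (this is not the VM change suggested above,
and the names trampoline and countdown are made up), a trampoline gives a
similar effect today: the caller's frame is gone before the next call starts,
because the function returns a thunk instead of making the call itself.

```python
def trampoline(func, *args):
    """Run func, then keep calling any zero-argument callable it returns."""
    result = func(*args)
    while callable(result):
        result = result()
    return result


def countdown(n):
    if n == 0:
        return "done"
    # Instead of calling countdown(n - 1) directly, hand back a thunk; this
    # frame is finished before the next one is created.
    return lambda: countdown(n - 1)
```

trampoline(countdown, 100000) runs without hitting the recursion limit. The
obvious limitation of this sketch is that a function whose real result is
itself callable would be misinterpreted.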


> Denis

From nicholas.cole at  Sun Jan 19 16:22:38 2014
From: nicholas.cole at (Nicholas Cole)
Date: Sun, 19 Jan 2014 15:22:38 +0000
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <20140119022811.GR3915@ando>
Message-ID: <>

On Sun, Jan 19, 2014 at 2:22 PM, Chris Angelico <rosuav at> wrote:
> On Mon, Jan 20, 2014 at 1:18 AM, Neil Schemenauer
> <nas-python at> wrote:
>> On 2014-01-19, Steven D'Aprano wrote:
>>> [Neil]
>>> > - if people install this new version of Python as the default, old
>>> >   scripts and programs will break. [...]
>>> - It gives people an excuse to avoid migrating, and as sure as the sun
>>> rises in the east, will lead to people calling for Python 2.9 a few
>>> years from now.
>> That would be progress though.  My proposed 2.8 would have most of
>> the incompatible changes from 3.x so if people port it they will be
>> much closer to 3.x.
> I still haven't seen any evidence that porting half-way and then
> half-way again later is going to be less work than just biting the
> bullet and porting to 3.x (either sooner or later, whichever is more
> convenient).

All of these threads do remind me of the Achilles Paradox.

From denis.spir at  Sun Jan 19 16:31:53 2014
From: denis.spir at (spir)
Date: Sun, 19 Jan 2014 16:31:53 +0100
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>	<>	<>	<lbgaob$421$>	<>	<>
Message-ID: <>

On 01/19/2014 04:12 PM, Joao S. O. Bueno wrote:
>> You also need to adjust frame size, possibly even its structure (dunno,
>> depends on implementation details of python's "calling convention" so to
>> say), to get a right space (and disposition) for the callee's input
>> variables.
> Not in this suggestion -  I did not propose re-using the frame,
> as seens to be the case around the calls, just because of that:
> these frames in Python seen to be tied to the code object
> within it. My suggestion is simply to discard the current frame
> before building the frame for the call. (Maybe adding some logging
> information on this next frame so that the stack trace could be complete)

All right, I did not rightly get your proposal.


From techtonik at  Sun Jan 19 10:08:27 2014
From: techtonik at (anatoly techtonik)
Date: Sun, 19 Jan 2014 12:08:27 +0300
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 18, 2014 at 6:22 AM, Neil Schemenauer
<nas-python at> wrote:
> The transition to Python 3 is happening but there is still a massive
> amount of code that needs to be ported.

That's a common illusion. Python 2 is a good binary language, Python 3
is a good text language. Leaving things as-is saves time and
energy. The constraints conflict: you can't get all of
1. readable language
2. work with strings as abstract unicode datapoints
3. work with strings as binary data

Python 2 was more explicit for unicode data (and this was tiresome for
text lovers) and Python 3 is explicit about binary (which makes life
harder for those who work with binary data).
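
A small Python 3 sketch of that explicitness. The variable names and the sample
string are only for illustration.

```python
data = b"caf\xc3\xa9"            # bytes: raw UTF-8 encoded data
text = data.decode("utf-8")      # str: abstract Unicode code points


def try_mixing():
    """Show that Python 3 refuses to combine bytes and str implicitly."""
    try:
        return data + text
    except TypeError as exc:
        return type(exc).__name__
```

Here len(data) is 5 while len(text) is 4, and try_mixing() reports a TypeError;
going between the two always takes an explicit encode() or decode().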

> One of the most disruptive
> changes in Python 3 is the strict separation of bytes from unicode
> strings.  Most of the other incompatible changes can be handled by
> 2to3.

2to3 is far from being a perfect tool, and certainly not a user-level one, but
I don't maintain a list of all the things that cause trouble. Probably the major
one is that there are no docs on how to write your own fixers (and you need that
for 3rd party projects).

What I disagree with is that incompatible changes can be handled by 2to3.
There are many internal things that make Python 3 awesome, but they
were not ported to Python 2, because people wanted "the next better
thing" and thought about Python 2 as a dead end. Some of us still think
this way, but I hope that recent threads made them more flexible.

Many internal features would be good to be backported into Python 2
series and these are invisible on 2to3 level.

> Here is a far out idea to make transition smoother.  Release version
> 2.8 of Python with nearly all Python 3.x incompatible changes except
> for the bytes/unicode changes.  This could include:
> - print as function
> - default string literal as unicode

And this will be literally the end of Python 2.8 in the same way as Python 3.
Just attach here the list of consequences. Good exercise for story-writing:
"And now all strings are unicorne".

> - return view objects for dict.keys(), etc
> - rename modules in standard library
> - rename long to int
> - rename .next() to __next__()
> - accept only new 'raise' syntax
> - remove backticks for repr
> - rename unicode to str
> - removal of 'apply', 'buffer', 'callable', 'execfile'
> - exec as function
> - rename os.getcwdu() to os.getcwd()
> - remove dict.has_key
> - move intern to sys.intern()
> - rename xrange to range
> - remove xreadlines
> New features of Python 3.x could be backported if easy since they
> could be useful to entice developers to move from 2.7 to 2.8.

What if people don't need a bloated Python with all these features?
What if Python 4 should not only move stdlib into modules, but features

I look at this list as an RPG called "Personal Python". You generate
your character by selecting traits you like. Some of them are conflicting
like "default binary vs unicode". Once your character is ready, you may
start to play with it. Probably something I'd expect from PyPy project,
but well it requires more engineering and experiment time than it is
possible in open source projects. Here is the idea without
implementation how to pack those features:

> An alternative approach to producing Python 2.8 would be to start
> with the Python 3.x latest branch.  Modify bytesobject and
> unicodeobject to have as close to Python 2 behavior as practical.

I'd start with PyPy. They need more help with Python 3 transition.

> A-journey-of-a-thousand-miles-begins-ly y'rs

From techtonik at  Sun Jan 19 10:16:59 2014
From: techtonik at (anatoly techtonik)
Date: Sun, 19 Jan 2014 12:16:59 +0300
Subject: [Python-ideas] Make print() not append line break by default
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 19, 2014 at 11:47 AM,  <musicdenotation at> wrote:
> And add println()

Python 2:

  import sys

  def echo(msg, lineend=''):
      # Write msg as-is; a line break is added only if lineend is given.
      sys.stdout.write(msg + lineend)

It is better than having a dozen print functions in the documentation
that make it unreadable.
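
For comparison, the Python 3 print() function already covers this case through
its end parameter. A short sketch, writing into an io.StringIO buffer (an
arbitrary choice, just so the result can be inspected):

```python
import io

buf = io.StringIO()
print("no newline", end="", file=buf)   # end="" suppresses the line break
print(" / with newline", file=buf)      # the default is end="\n"
```

After these two calls the buffer holds a single line ending in one newline.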
anatoly t.

From stephen at  Sun Jan 19 18:19:51 2014
From: stephen at (Stephen J. Turnbull)
Date: Mon, 20 Jan 2014 02:19:51 +0900
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
Message-ID: <>

Neil Schemenauer writes:
 > On 2014-01-18, Terry Reedy wrote:
 > > On 1/17/2014 10:22 PM, Neil Schemenauer wrote:
 > > >The transition to Python 3 is happening but there is still a massive
 > > >amount of code that needs to be ported.
 > > 
 > > For application code, why does it need to be ported?
 > Unless Python 2.x is going to be maintained in perpetuity then code
 > will have to be ported.  This point seems obvious to me.

But it's not even true.  Python 2.7 is a Turing-complete language, it
can do anything that any language can do as an abstract computation,
and 2.7.6 has extremely few bugs and sufficient bindings to OS
facilities to do almost anything in practice as well.  It's a pretty
darn good language.  Most Python 2 programs will probably be abandoned
before Python 2.7.6 will need additional maintenance beyond what is
already provided by various OS distros.

 > I disagree.  The amount of Python 2 code that exists exceeds the
 > amount of Python 3 by orders of magnitude.  That existing codebase
 > either stops evolving and stays Python 2 forever

But "stays Python 2 forever" != "stops evolving".  There is absolutely
nothing to stop a Python 2 program from evolving dramatically over the
indefinite future, any more than sticking to C89 stops a lot of C
programs from evolving.  I don't see any real reason to suppose that
most applications will find a true need to evolve in directions that
Python 2 doesn't support for quite a while.

 > A Python 2 fork is going to happen whether the PSF blesses it or
 > not, I can't believe that's even a point of discussion.

It's not a point of discussion.  In the same sense that COBOL
compilers continue to be maintained today, Python 2 was forked long
ago.  Not only are there non-CPython implementations of the language,
every distro (commercial or not) has their own patches (perhaps a null
set for Python 2.7.6).  That's not going to stop, and as Nick points
out Stackless is even likely to add some Python 3 features to their
implementation of 2.x.  But that's a specialty interest, and not even
all Stackless users will necessarily use those features.  I doubt many
commercial packagers of CPython will have customers interested in them
-- the Stackless guys want the Python 3 features for internal use as
much as for their clients IIRC.

But a fork of the kind you propose isn't going to happen.  Definitely
not under the auspices of the PSF, that's been settled with PEP 404.
Nor with volunteer labor -- there aren't any volunteers for that.  If
there were, they would have started long ago.  And I don't see a story
for a commercial fork, either.

The problem is that there's no "halfway point" here.  Porting a
program from Python 2 to Python 3 either does not require a
fundamental rethink of its internal text processing, or it does.  In
the former case, 2to3 does a pretty good job, and what's left is a
SMOP, mostly to fit appropriate decoding/encoding on to I/O.  In the
latter case, you've got big problems -- a complete redesign and an
audit of all code for conformance to the new design.  This is the
watershed; there's no way to create a language intermediate between
Python 2 and Python 3 so that porting Python 2 to Python-sqrt(6) is
half the work, and porting Python-sqrt(6) to Python 3 is half the

From stephen at  Sun Jan 19 18:26:30 2014
From: stephen at (Stephen J. Turnbull)
Date: Mon, 20 Jan 2014 02:26:30 +0900
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

Joao S. O. Bueno writes:

 > My suggestion is simply to discard the current frame before
 > building the frame for the call. (Maybe adding some logging
 > information on this next frame so that the stack trace could be
 > complete)

That way lies madness.  The logging information needs to be stored
somewhere.  If it's to be "complete", it may as well be in ... wait
for it ... a stack frame.

From abarnert at  Sun Jan 19 21:01:00 2014
From: abarnert at (Andrew Barnert)
Date: Sun, 19 Jan 2014 12:01:00 -0800 (PST)
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <20140119004515.GP3915@ando>
References: <>
 <> <20140119004515.GP3915@ando>
Message-ID: <>

From: Steven D'Aprano <steve at>

Sent: Saturday, January 18, 2014 4:45 PM

> On Sat, Jan 18, 2014 at 10:29:46AM -0800, Andrew Barnert wrote:
>>  Whether or not you really need it, adding it to Python is a fun 
>>  challenge that's worth trying.
> "Need" is a funny thing.

Which is why I made that point. It's not a completely objective question, and it may be hard for the OP (or you, or anyone else) to convince anyone that he "needs" it even though he does (or, more importantly, convince people that _they_ need it). If so, he doesn't have to let that stop him from writing and sharing an implementation. It may turn out that, once people have a chance to play with it, that will convince everyone better than any abstract argument he could make. If not, at least he's had fun, learned about CPython internals, and, most importantly, produced a fork that he can maintain as long as he thinks he needs it. Depending on your time and resources, that may not be worth doing, but that's the same decision as any other development project; there's nothing actually stopping anyone from doing it if it's worth their while, so anyone who wants this should consider whether it's worth their while to do it.

> You can go a long way without recursion, or only shallow recursion. In 
> 15 years + of writing Python code, I've never been in a position that I 
> couldn't solve a problem because of the lack of tail recursion. But 
> every time I manually convert a recursive algorithm to an iterative one, 
> I feel that I'm doing make-work, manually doing something which the 
> compiler is much better at than I am, and the result is often less 
> natural, or even awkward. (Trampolines? Ewww.)

But the same is true for converting a naive recursive algorithm to tail-recursive. It's unpleasant make-work, just like converting it to iteration. In a language like Common Lisp, it's about the same amount of work, but the tail-recursive version often ends up looking more natural. In a language like Python, where we typically deal in iterables rather than recursive data structures, I believe it would often be _more_ work rather than the same amount, and end up looking a lot less natural rather than more. I'm sure there would be exceptions, but I suspect they would be rare.
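
A small sketch of the three forms being compared, using summation as the
example. The function names are illustrative only.

```python
def total_naive(items):
    """Naive recursion: the addition happens after the recursive call returns."""
    if not items:
        return 0
    return items[0] + total_naive(items[1:])


def total_tail(items, acc=0):
    """Tail-recursive form: the running total travels in an accumulator."""
    if not items:
        return acc
    return total_tail(items[1:], acc + items[0])


def total_loop(items):
    """The iterative version most Python code would use."""
    acc = 0
    for item in items:
        acc += item
    return acc
```

Getting from the first form to the second takes the same kind of rewrite (an
explicit accumulator) as getting to the third.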

>>  Third, eliminating tail calls means they aren't on the stack at 
>>  runtime, which means there's no obvious way to display useful 
>>  tracebacks. I don't think too many Python users would accept the 
>>  tradeoff of giving up good tracebacks to enable certain kinds of 
>>  non-pythonic code, 
> What makes you say that it is "non-pythonic"? You seem to be assuming 
> that *by definition* anything written recursively is non-pythonic.

Not at all. There's plenty of code that's naturally recursive even in Python, and much of that code is written recursively today. For a good example, see os.walk.

However, the main driver for TCE is the ability to write looping constructs recursively, which is not possible without it (unless the thing you're looping over is guaranteed not to be too big). Look at any tutorial on tail recursion; it's always recursing over a cons list or something similar. And looping that way in Python will almost always be non-pythonic, because you will have to drive the iterable manually. Again, there are surely exceptions, but I doubt they'd be very common.
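
To make "drive the iterable manually" concrete, here is a hedged sketch. The
names count_matching_recursive, count_matching_loop and _DONE are invented for
the example.

```python
_DONE = object()   # sentinel marking an exhausted iterator


def count_matching_recursive(iterator, pred, acc=0):
    """Tail-recursive "loop": the iterator has to be advanced by hand."""
    item = next(iterator, _DONE)
    if item is _DONE:
        return acc
    return count_matching_recursive(iterator, pred,
                                    acc + (1 if pred(item) else 0))


def count_matching_loop(iterable, pred):
    """The same loop written the usual way."""
    return sum(1 for item in iterable if pred(item))
```

The recursive version needs a sentinel, an explicit next() call and an
accumulator, and without TCE it is still limited by the recursion depth.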

> In fact, in some cases, I *would* willingly give up *non-useful*
> tracebacks for the ability to write more idiomatic code. Have you seen 
> the typical recursive traceback?

But if you eliminate tail calls, you're not just eliminating recursive tracebacks; you're eliminating every stack frame that ends in a tail call. Which includes a huge number of useful frames.

If you restrict it to _only_ eliminating recursive tail calls, then it goes from something that can be done at compile time (as I showed in my previous email) to something that has to be done at runtime, making every function call slower. And it doesn't work with mutual or indirect recursion (unless you want to walk the whole stack to see if the function being called exists higher up, which makes it even slower, and also gets us back to eliminating useful tracebacks).

> py> a(7)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 9, in c
>     return 1/n
> ZeroDivisionError: division by zero
> The only thing that I care about is the very last line, that function c 
> tries to divide by zero. The rest of the traceback is just noise, I 
> don't even look at it.

Your example is not actually tail-recursive.

I'm guessing you know this, and decided that having something that blows up fast just to have an example of a recursive traceback was more important than having an example that also fits into the rest of the discussion, which is perfectly reasonable.

But it's still worth calling that out, because at least half the blog posts out there that say "Python sucks because it doesn't have TCE" prove Python's suckiness by showing a non-tail-recursive algorithm that would blow up exactly the same way in Scheme as in Python.
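
For reference, here is a guess at the three functions, reconstructed only from
the return statements visible in the quoted traceback; the original file is not
shown, so anything else it contained is unknown. The helper
frames_in_traceback is made up for this sketch.

```python
import traceback


def a(n):
    return b(n - 1)           # tail call: nothing left to do after b returns


def b(n):
    return c(n - 1) + a(n)    # not a tail call: the addition runs afterwards


def c(n):
    return 1 / n


def frames_in_traceback(start):
    """Call a(start) and count the traceback entries belonging to a, b or c."""
    try:
        a(start)
    except ZeroDivisionError as exc:
        names = [entry.name
                 for entry in traceback.extract_tb(exc.__traceback__)]
        return sum(name in ("a", "b", "c") for name in names)
    return 0
```

Only the call in a is in tail position. Both calls in b sit under an addition,
so b's frames would have to stay on the stack even with TCE.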

> Now, it's okay if you disagree, or if you can see something useful in
> the traceback other than the last entry.

Sure. Unless that line in b is the only place in your code that ever calls c, I think it would be useful to know how we got to c and why n is 0. If that isn't useful, then _no_ tracebacks are ever useful, not just recursive ones.

> I'm not suggesting that TCE 
> should be compulsory. I would be happy with a commandline switch to 
> turn it on, or better still, a decorator to apply it to certain 
> functions and not others. I expect that I'd have TCE turned off for 
> debugging.

But the primary reason people want TCE is to be able to write functions that otherwise wouldn't run. Nobody asks for TCE because they're concerned about 2KB wasted on stack traces in their shallow algorithm; they ask for it because their deep algorithm fails with a recursion error. So, turning it off to debug it means turning off the ability to reproduce the error you're trying to debug.
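
A sketch of the failure in question, assuming CPython's usual recursion limit
is in effect. The names count_up and runs_without_error are illustrative.

```python
import sys


def count_up(n, acc=0):
    """A loop written as a tail call; CPython keeps one frame per step."""
    if n == 0:
        return acc
    return count_up(n - 1, acc + 1)


def runs_without_error(n):
    """Report whether count_up(n) completes without a RecursionError."""
    try:
        count_up(n)
        return True
    except RecursionError:
        return False


shallow_ok = runs_without_error(50)
deep_ok = runs_without_error(sys.getrecursionlimit() + 100)
```

The shallow call completes and the deep one fails. With TCE switched off for
debugging, the deep case is exactly the one that can no longer be run.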

>>  but even if you don't solve this, you can always
>>  maintain a fork the same way that Stackless has been doing.
> Having to fork the entire compiler just to write a few functions in 
> their most idiomatic, natural (recursive) form seems a bit extreme, 
> wouldn't you say? 

Not necessarily.

The whole reason Stackless exists is to be able to write some algorithms in a natural way that wasn't possible with mainline CPython. At least early on, it looked at least plausible that Stackless would eventually be merged into the main core, although that turned out not to happen. There are some core language changes that were inspired by Stackless. Someone (Ralf Schmidt, I think?) was able to extract some of Stackless's functionality into a module that works with CPython, which is very cool. But even without any of that, people were able to use Stackless when they wanted to write code that required its features. That's surely better than not being able to write it, period.

And a TCE fork could go the same way. It might get merged into the core one day, or it might inspire some changes in the core, or it might turn out to be possible to extract the key functionality into a module for CPython, but even if none of that happens, you, and others, can still use your fork when you want to.

If you prefer to call it a patch or a branch or something else instead of a fork, that's fine, but it's basically the same amount of work either way, and there's nothing stopping anyone who wants it from doing it.

From at  Sun Jan 19 22:33:28 2014
From: at (Haoyi Li)
Date: Sun, 19 Jan 2014 13:33:28 -0800
Subject: [Python-ideas] Tail recursion elimination
Message-ID: <7822052862240502399@unknownmsgid>

> Having to fork the entire compiler just to write a few functions in
> their most idiomatic, natural (recursive) form seems a bit extreme,
> wouldn't you say?

You don't need to.

MacroPy's @tco decorator is about as easy as you could ask for: 'pip
install macropy', then 'from macropy.experimental.tco import macros, tco'.
It works for arbitrary tail-calls too, not just tail recursion.

If you haven't tried it out, complaining about the difficulty of
implementing tail-call-optimization yourself seems silly.

From: Andrew Barnert
Sent: 1/19/2014 12:04 PM
To: Steven D'Aprano; python-ideas at
Subject: Re: [Python-ideas] Tail recursion elimination
From: Steven D'Aprano <steve at>

Sent: Saturday, January 18, 2014 4:45 PM

> On Sat, Jan 18, 2014 at 10:29:46AM -0800, Andrew Barnert wrote:
>>  Whether or not you really need it, adding it to Python is a fun
>>  challenge that's worth trying.
> "Need" is a funny thing.

Which is why I made that point. It's not a completely objective
question, and it may be hard for the OP (or you, or anyone else) to
convince anyone that he "needs" it even though he does (or, more
importantly, convince people that _they_ need it). If so, he doesn't
have to let that stop him from writing and sharing an implementation.
It may turn out that, once people have a chance to play with it, that
will convince everyone better than any abstract argument he could
make. If not, at least he's had fun, learned about CPython internals,
and, most importantly, produced a fork that he can maintain as long as
he thinks he needs it. Depending on your time and resources, that may
not be worth doing, but that's the same decision as any other
development project; there's nothing actually stopping anyone from
doing it if it's worth their while, so anyone who wants this should
consider whether it's worth their while to do it.

> You can go a long way without recursion, or only shallow recursion. In
> 15 years + of writing Python code, I've never been in a position that I
> couldn't solve a problem because of the lack of tail recursion. But
> every time I manually convert a recursive algorithm to an iterative one,
> I feel that I'm doing make-work, manually doing something which the
> compiler is much better at than I am, and the result is often less
> natural, or even awkward. (Trampolines? Ewww.)

But the same is true for converting a naive recursive algorithm to
tail-recursive. It's unpleasant make-work, just like converting it to
iteration. In a language like Common Lisp, it's about the same amount
of work, but the tail-recursive version often ends up looking more
natural. In a language like Python, where we typically deal in
iterables rather than recursive data structures, I believe it would
often be _more_ work rather than the same amount, and end up looking a
lot less natural rather than more. I'm sure there would be exceptions,
but I suspect they would be rare.

>>  Third, eliminating tail calls means they aren't on the stack at
>>  runtime, which means there's no obvious way to display useful
>>  tracebacks. I don't think too many Python users would accept the
>>  tradeoff of giving up good tracebacks to enable certain kinds of
>>  non-pythonic code,
> What makes you say that it is "non-pythonic"? You seem to be assuming
> that *by definition* anything written recursively is non-pythonic.

Not at all. There's plenty of code that's naturally recursive even in
Python, and much of that code is written recursively today. For a good
example, see os.walk.

However, the main driver for TCE is the ability to write looping
constructs recursively, which is not possible without it (unless the
thing you're looping over is guaranteed not to be too big). Look at
any tutorial on tail recursion; it's always recursing over a cons list
or something similar. And looping that way in Python will almost
always be non-pythonic, because you will have to drive the iterable
manually. Again, there are surely exceptions, but I doubt they'd be
very common.

> In fact, in some cases, I *would* willingly give up *non-useful*
> tracebacks for the ability to write more idiomatic code. Have you seen
> the typical recursive traceback?

But if you eliminate tail calls, you're not just eliminating recursive
tracebacks; you're eliminating every stack frame that ends in a tail
call. Which includes a huge number of useful frames.

If you restrict it to _only_ eliminating recursive tail calls, then it
goes from something that can be done at compile time (as I showed in
my previous email) to something that has to be done at runtime, making
every function call slower. And it doesn't work with mutual or
indirect recursion (unless you want to walk the whole stack to see if
the function being called exists higher up, which makes it even slower,
and also gets us back to eliminating useful tracebacks).

> py> a(7)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 2, in a
>     return b(n-1)
>   File "./", line 5, in b
>     return c(n-1) + a(n)
>   File "./", line 9, in c
>     return 1/n
> ZeroDivisionError: division by zero
> The only thing that I care about is the very last line, that function c
> tries to divide by zero. The rest of the traceback is just noise, I
> don't even look at it.

Your example is not actually tail-recursive.

I'm guessing you know this, and decided that having something that
blows up fast just to have an example of a recursive traceback was
more important than having an example that also fits into the rest of
the discussion, which is perfectly reasonable.

But it's still worth calling that out, because at least half the blog
posts out there that say "Python sucks because it doesn't have TCE"
prove Python's suckiness by showing a non-tail-recursive algorithm
that would blow up exactly the same way in Scheme as in Python.

> Now, it's okay if you disagree, or if you can see something useful in
> the traceback other than the last entry.

Sure. Unless that line in b is the only place in your code that ever
calls c, I think it would be useful to know how we got to c and why n
is 0. If that isn't useful, then _no_ tracebacks are ever useful, not
just recursive ones.

> I'm not suggesting that TCE
> should be compulsary. I would be happy with a commandline switch to
> turn it on, or better still, a decorator to apply it to certain
> functions and not others. I expect that I'd have TCE turned off for
> debugging.

But the primary reason people want TCE is to be able to write
functions that otherwise wouldn't run. Nobody asks for TCE because
they're concerned about 2KB wasted on stack traces in their shallow
algorithm; they ask for it because their deep algorithm fails with a
recursion error. So, turning it off to debug it means turning off the
ability to reproduce the error you're trying to debug.
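
A small sketch of the "deep algorithm" case, with a toy function invented for illustration: the tail-recursive version dies with a recursion error once the depth exceeds the interpreter's limit, while the hand-written loop (what TCE would effectively give you) runs fine at the same depth.

```python
import sys

def count_down(n):
    # Tail-recursive: the recursive call is the last thing that happens.
    if n == 0:
        return 0
    return count_down(n - 1)

def count_down_loop(n):
    # The same algorithm with the tail call turned into a loop by hand.
    while n != 0:
        n -= 1
    return 0

depth = sys.getrecursionlimit() * 10   # deliberately deeper than the limit

try:
    count_down(depth)
    overflowed = False
except RecursionError:
    overflowed = True

print(overflowed, count_down_loop(depth))
```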

>>  but even if you don't solve this, you can always
>>  maintain a fork the same way that Stackless has been doing.
> Having to fork the entire compiler just to write a few functions in
> their most idiomatic, natural (recursive) form seems a bit extreme,
> wouldn't you say?

Not necessarily.

The whole reason Stackless exists is to be able to write some
algorithms in a natural way that wasn't possible with mainline
CPython. Early on, it looked at least plausible that
Stackless would eventually be merged into the main core, although that
turned out not to happen. There are some core language changes that
were inspired by Stackless. Someone (Ralf Schmidt, I think?) was able
to extract some of Stackless's functionality into a module that works
with CPython, which is very cool. But even without any of that, people
were able to use Stackless when they wanted to write code that
required its features. That's surely better than not being able to
write it, period.

And a TCE fork could go the same way. It might get merged into the
core one day, or it might inspire some changes in the core, or it
might turn out to be possible to extract the key functionality into a
module for CPython, but even if none of that happens, you, and others,
can still use your fork when you want to.

If you prefer to call it a patch or a branch or something else instead
of a fork, that's fine, but it's basically the same amount of work
either way, and there's nothing stopping anyone who wants it from
doing it.
Python-ideas mailing list
Python-ideas at
Code of Conduct:

From musicdenotation at  Sun Jan 19 23:32:30 2014
From: musicdenotation at (musicdenotation at
Date: Mon, 20 Jan 2014 05:32:30 +0700
Subject: [Python-ideas] Tail Call Optimization (was Re: Tail recursion
In-Reply-To: <>
References: <>
 <> <20140119004515.GP3915@ando>
Message-ID: <>

>> On Jan 19, 2014, at 19:31, Nick Coghlan <ncoghlan at> wrote:
>> On 19 January 2014 22:12, Terry Reedy <tjreedy at> wrote:
>> TCO (Tail Call Optimization) means that when TCO is in effect and a tail
>> call "return f(<args>)" is executed, the current execution context (stack
>> frame) is used for the call instead of allocating a new one. What is
>> 'optimized' is space usage. The effect on time is not clear.
>>> On 1/18/2014 7:45 PM, Steven D'Aprano wrote:
>>> What makes you say that it is "non-pythonic"? You seem to be assuming
>>> that *by definition* anything written recursively is non-pythonic. I do
>>> not subscribe to that view.
>> Neither do I.
> Guido is on record as preferring iterative algorithms as more
> comprehensible for more people, and explicitly opposed to adding tail
> call optimisation. I tend to agree with him - functional programming
> works OK in the small (and pure functions are a fine tool for managing
> complexity), but to scale up in a way that fits people's brains, you
> need to start writing code that looks more like a cookbook.
> If you want inspiration on how to design a language for typical human
> thought patterns, look to cookbooks, training guides and operator
> manuals, not mathematics.
> Nick
> -- 
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
See this:

It fits people's brains better because of familiarity, not "nature". While procedures in a guide (cookbook, user manual, ...) are better written imperatively because of the way things are done (so are user interfaces), the behind-the-scenes algorithms have no single "intuitive" form that applies in all cases. They are written imperatively because of performance (and later, familiarity).

Poor support for functional programming + Global Interpreter Lock = Outdated language.

From mcepl at  Sun Jan 19 23:33:56 2014
From: mcepl at (=?UTF-8?Q?Mat=C4=9Bj?= Cepl)
Date: Sun, 19 Jan 2014 23:33:56 +0100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
 <> <>
Message-ID: <>


On 2014-01-19, 03:58 GMT, you wrote:
> But that doesn't stop other parties - Red Hat, ActiveState,
> etc. - doing so for whatever customers are still interested in
> compensating them for their work.

a) necessary disclaimer: I AM not speaking for my employer, just 
words out of my ass.
b) The point which is overlooked here is that people promoting 
python 2.8 are not speaking for STABILITY in the sense RHEL is 
stable. They want further DEVELOPMENT and CHANGES of Python to 
improve and react to the changed circumstances.

That is not, as far as I understand it, the business Red Hat is 
in. Our customers ask us to support Python 2.7.* (or 2.6.* for 
RHEL-6, and 2.4.* for RHEL-5) with API UNCHANGED as it is now so 
that their applications developed now for RHEL 7 (or RHEL 6, 5, 
etc.) are running UNCHANGED. They are usually NOT interested in 
further development and changing Python API. So, I don't see us 
as rooting for the further development of Python 2.* API.





From tjreedy at  Mon Jan 20 00:13:56 2014
From: tjreedy at (Terry Reedy)
Date: Sun, 19 Jan 2014 18:13:56 -0500
Subject: [Python-ideas] return from (was Re: Tail recursion elimination)
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
 <> <lbgaob$421$>
Message-ID: <lbhm77$aei$>

Proposal (mostly not mine): add 'return from f(args)', in analogy with 
'yield from iterator', to return a value to the caller from an execution 
frame running f(args) (and either reuse or delete the frame that ran 
'return from'). The function name 'f' would not have to match the name 
of the function being compiled, so this would actually be TCO, even if it 
were nearly always used for recursive tail calls. That does mean that it 
would work for mutually tail recursive functions.
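
Since 'return from' does not exist, the closest thing one can run today is a trampoline. The sketch below is only an approximation of the proposed semantics: TailCall and trampoline are hypothetical names invented here, and returning a TailCall object stands in for 'return from f(args)'. It does show the point about mutual recursion: the callee need not be the function currently running.

```python
class TailCall:
    """Marker standing in for the proposed 'return from func(*args)'."""
    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

def trampoline(func, *args, **kwargs):
    # Run func; whenever it asks for a tail call, make that call from
    # here, so the Python call stack never grows past one extra frame.
    result = func(*args, **kwargs)
    while isinstance(result, TailCall):
        result = result.func(*result.args, **result.kwargs)
    return result

def is_even(n):
    if n == 0:
        return True
    return TailCall(is_odd, n - 1)     # would be: return from is_odd(n - 1)

def is_odd(n):
    if n == 0:
        return False
    return TailCall(is_even, n - 1)    # would be: return from is_even(n - 1)

print(trampoline(is_even, 100_000))
```

The depth of 100,000 is far beyond the default recursion limit, yet the trampolined version completes because each frame is finished before the next call starts.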

On 1/19/2014 6:57 AM, Joao S. O. Bueno wrote:
> OTOH, since we are at it, we'd better check
> 2009 BDFL's opinion on the subject:

I read through the comments and near the very end, in July 2013, Dan 
LaMotte said... '''
Definitely seems to be complicated/impossible to determine a function is 
tail recursion 'compliant' statically in python, however, what if it 
were an 'opt in' feature that uses a different 'return' keyword?

     def f(n):
         if n > 0:
             tailcall f(n - 1)
         return 0
'''
In additional paragraphs, he noted, among other things, that this makes 
the feature 'opt-in' on a function by function basis.

Guido replied "Dan: your proposal has the redeeming quality of clearly 
being a language feature rather than a possible optimization. I don't 
really expect there to be enough demand to actually add this to the 
language though. Maybe you can use macropy to play around with the idea 
though?"

???? then suggested 'return from'. My only contribution is to point 
out the analogy with the new, and initially strange, 'yield from'.

Guido seems to have said that if a) someone tries out the idea with 
macropy, and b) someone demonstrates enough demand, he might consider 
adding such a feature. So this seems to me the best option to pursue to 
get something into CPython. I also think it is the best proposal so far.

As for a), I have not looked at macropy, but:
On 1/19/2014 4:33 PM, Haoyi Li wrote:
> MacroPy's @tco decorator is about as easy as you could ask for. 'pip
> install macropy', 'from macropy.experimental.tco import macros, tco'
> is about as easy as you could ask for. Works for arbitrary tail-calls
> too, not just tail recursion.

That leaves b) for those of you who want the feature.

Any PEP should admit that the feature might be abused. Someone might write
   return from len(composite)
Unless return from refuses to delete the frame making a call to a C 
function, the effect would be to save a trivial O(1) space at the cost 
of deleting the most important line of a stack trace should len() raise. 
But I think this falls under the 'consenting adults' principle. A 
proposed doc should make it clear that the intended use is to make 
deeply recursive or mutually recursive functions run and not to replace 
all tail calls.

Terry Jan Reedy

From ben+python at  Mon Jan 20 00:35:41 2014
From: ben+python at (Ben Finney)
Date: Mon, 20 Jan 2014 10:35:41 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
References: <> <lbd36b$87t$>
 <> <>
Message-ID: <>

Matěj Cepl <mcepl at> writes:

> On 2014-01-19, 03:58 GMT, you wrote:
> > But that doesn't stop other parties - Red Hat, ActiveState,
> > etc. - doing so for whatever customers are still interested in
> > compensating them for their work.
> a) necessary disclaimer: I AM not speaking for my employer, just 
> words out of my ass.
> b) The point which is overlooked here, that people promoting 
> python 2.8 are not speaking for STABILITY in the sense RHEL is 
> stable. They want further DEVELOPMENT and CHANGES of Python to 
> improve and react to the changed circumstances.

I'm not overlooking that, I'm pointing out that Python is free software,
so *the option is there*, for those who want Python 2 maintained
indefinitely, to motivate and compensate some party to do it.

Python 2 is free software, so any capable party can fulfil the developer
and maintainer role without any further permission required. The PSF has
made it clear they will not be that party past a certain point; but
Python 2 is licensed freely from the PSF to all recipients, so the PSF's
decision not to maintain Python 2 in no way prevents anyone else doing
so.

So, what "people promoting the continuance of Python 2" are asking for
is entirely within their power to have, if they want it enough. Will
they do it? That's up to them; no-one is stopping them.

> That is not, as far as I understand it, the business Red Hat is in. [...]
> So, I don't see us as rooting for the further development of Python
> 2.* API.

And that's an entirely reasonable decision for Red Hat to make. My point
is that *nothing the PSF is doing prevents* such a party from choosing
to do so.

In other words, those who want Python 2 to continue need to either bite
the bullet and move their migration to Python 3 forward, or get
themselves organised and come up with an entity which will maintain
Python 2 for as long as they want it maintained.

It's no-one else's responsibility, and no-one else is stopping them. Put
up or shut up, folks!

 \      "Software patents provide one more means of controlling access |
  `\      to information. They are the tool of choice for the internet |
_o__)                                     highwayman." -Anthony Taylor |
Ben Finney

From timothy.c.delaney at  Mon Jan 20 00:39:52 2014
From: timothy.c.delaney at (Tim Delaney)
Date: Mon, 20 Jan 2014 10:39:52 +1100
Subject: [Python-ideas] return from (was Re: Tail recursion elimination)
In-Reply-To: <lbhm77$aei$>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

On 20 January 2014 10:13, Terry Reedy <tjreedy at> wrote:

> Proposal (mostly not mine): add 'return from f(args)', in analogy with
> 'yield from iterator', to return a value to the caller from an execution
> frame running f(args) (and either reuse or delete the frame that ran
> 'return from'). The function name 'f' would not have to match the name of
> the function being compiled, so this would actually be TCO, even if it were
> nearly always used for recursive tail calls. That does mean that it would
> work for mutually tail recursive functions.

As someone who is happy with the status quo, "return from" seems to me to
be the only sensible way to incorporate it into the language. Direct
analogy with yield from, clear semantics ... I like it.

> Any PEP should admit that the feature might be abused. Someone might write
>   return from len(composite)
> Unless return from refuses to delete the frame making a call to a C
> function, the effect would be to save a trivial O(1) space at the cost of
> deleting the most important line of a stack trace should len() raise. But I
> think this falls under the 'consenting adults' principle. A proposed doc
> should make it clear that the intended use is to make deeply recursive or
> mutually recursive functions run and not to replace all tail calls.

Consenting adults does make things nice and simple.

I'm not proposing the following semantics, but I can think of an
alternative that might be useful, but likely difficult (and costly) to
implement, and difficult to explain. When code goes through a "return
from", that frame is retained, but when a new frame for the same code
object is created in the call stack, you *then* delete the calling frame.

Hmm - actually, you could keep a structure (e.g. a dict) on the side
mapping code objects to the most recent frame for that code object - that
would make it reasonably cheap to do. Wouldn't get particularly large
either since you'd only be recording frames that continued through a
"return from".
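
A toy simulation of that side-table idea, to check that the bookkeeping really is cheap. Everything here is an assumption for illustration: the call stack is a plain list, frames are dummy objects, and "the calling frame" is read as the earlier frame for the same code object that continued through a "return from".

```python
class Frame:
    """Dummy frame: just remembers which code object it belongs to."""
    def __init__(self, code, arg):
        self.code = code
        self.arg = arg

def simulate(calls):
    """calls: sequence of (code_name, arg), each reached via 'return from'."""
    stack = []
    latest = {}      # code name -> most recent frame that went through 'return from'
    max_depth = 0
    for code, arg in calls:
        old = latest.pop(code, None)
        if old is not None:
            stack.remove(old)        # only now is the retained frame deleted
        frame = Frame(code, arg)
        stack.append(frame)
        latest[code] = frame
        max_depth = max(max_depth, len(stack))
    return stack, max_depth

# Mutual recursion a -> b -> a -> b ..., 1000 calls deep.
calls = [("a" if i % 2 == 0 else "b", i) for i in range(1000)]
final_stack, max_depth = simulate(calls)
print(max_depth, [frame.code for frame in final_stack])
```

Under these assumptions the stack never holds more than one frame per code object, so a traceback would still show the most recent a and b frames.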

Tim Delaney

From var.mail.daniel at  Mon Jan 20 00:41:18 2014
From: var.mail.daniel at (Daniel da Silva)
Date: Sun, 19 Jan 2014 18:41:18 -0500
Subject: [Python-ideas] Predicate Sets
Message-ID: <>

Below is a description of a very simple but immensely useful class called a
"predicate set". In combination with the set and list comprehensions they
would allow another natural layer of reasoning with mathematical set logic
in Python.

In my opinion, a concept like this would be best located in the functools
module.

*Overview:*
    Sets in mathematics can be defined by a list of elements without
repetitions, and alternatively by a predicate (function) that determines
inclusion. A predicate set would be a set-like class that is instantiated
with a predicate function that is called to determine ``a in
the_predicate_set''.

>>> myset = predicateset(lambda s: s.startswith('a'))
>>> 'xyz' in myset
False
>>> 'abc' in myset
True
>>> len(myset)
Traceback (most recent call last):
   [...]
TypeError

*Example Uses:*
# Dynamic excludes in searching
foo_files = search_files('foo', exclude=set(['a.out', 'Makefile']))
bar_files = search_files('bar', exclude=predicateset(lambda fname:
    fname.endswith('~'))) # exclude *~

# Use in place of a set with an ORM
validusernames = predicateset(lambda s: re.match('[a-zA-Z0-9]+', s))

class Users(db.Model):
    username = db.StringProperty(choices=validusernames)
    password = db.StringProperty()
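
For anyone who wants to experiment, a minimal sketch of such a class might look like the following. This is not an existing functools API; the class body, the choice of operators, and the use of re.fullmatch for the username check are all assumptions made for illustration.

```python
import re

class predicateset:
    """Hypothetical set-like object whose membership is decided by a predicate."""

    def __init__(self, predicate):
        self._predicate = predicate

    def __contains__(self, item):
        return bool(self._predicate(item))

    # Set algebra is expressed by building new predicates.
    def __or__(self, other):
        return predicateset(lambda x: x in self or x in other)

    def __and__(self, other):
        return predicateset(lambda x: x in self and x in other)

    def __invert__(self):
        return predicateset(lambda x: x not in self)

    # No __len__ or __iter__: the set may be infinite, so len() raises TypeError.

starts_with_a = predicateset(lambda s: s.startswith('a'))
valid_usernames = predicateset(
    lambda s: re.fullmatch(r'[a-zA-Z0-9]+', s) is not None)

both = starts_with_a & valid_usernames
print('abc1' in both, 'a b' in both, 'xyz' in ~starts_with_a)
```

Because `other` is only ever tested with `in`, the right-hand operand of `|` and `&` can be an ordinary set as well as another predicateset.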

From ethan at  Mon Jan 20 00:44:32 2014
From: ethan at (Ethan Furman)
Date: Sun, 19 Jan 2014 15:44:32 -0800
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
 <> <>
Message-ID: <>

On 01/19/2014 03:35 PM, Ben Finney wrote:
> In other words, those who want Python 2 to continue need to either bite
> the bullet and move their migration to Python 3 forward

Um, if they want Python 2 to continue, why would they migrate to Python 3?


From rosuav at  Mon Jan 20 00:49:59 2014
From: rosuav at (Chris Angelico)
Date: Mon, 20 Jan 2014 10:49:59 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
 <> <>
Message-ID: <>

On Mon, Jan 20, 2014 at 9:33 AM, Matěj Cepl <mcepl at> wrote:
> On 2014-01-19, 03:58 GMT, you wrote:
>> But that doesn't stop other parties - Red Hat, ActiveState,
>> etc. - doing so for whatever customers are still interested in
>> compensating them for their work.

Please, this is a list with lots of recipients. Don't say "you" wrote
here - use a name :) Thanks!


From ian at  Mon Jan 20 01:04:59 2014
From: ian at (Ian Foote)
Date: Mon, 20 Jan 2014 00:04:59 +0000
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <>


On 19/01/14 23:41, Daniel da Silva wrote:
> Below is a description of a very simple but immensely useful class 
> called a "predicate set". In combination with the set and list 
> comprehensions they would allow another natural layer of reasoning
> with mathematical set logic in Python.
> In my opinion, a concept like this would be best located in the 
> functools module.
> *Overview:* Sets in mathematics can be defined by a list of
> elements without repetitions, and alternatively by a predicate
> (function) that determines inclusion. A predicate set would be a
> set-like class that is instantiated with a predicate function that
> is called to determine ``a in the_predicate_set''.
>
> >>> myset = predicateset(lambda s: s.startswith('a'))
> >>> 'xyz' in myset
> False
> >>> 'abc' in myset
> True
> >>> len(myset)
> Traceback (most recent call last):
>    [...]
> TypeError
>
> *Example Uses:*
> # Dynamic excludes in searching
> foo_files = search_files('foo', exclude=set(['a.out', 'Makefile']))
> bar_files = search_files('bar', exclude=predicateset(lambda fname:
>     fname.endswith('~'))) # exclude *~
>
> # Use in place of a set with an ORM
> validusernames = predicateset(lambda s: re.match('[a-zA-Z0-9]+', s))
>
> class Users(db.Model):
>     username = db.StringProperty(choices=validusernames)
>     password = db.StringProperty()

Hi Daniel,

That's an interesting idea. I'm not sure it would be used enough to
include in the standard library though. Have you considered releasing
an implementation on PyPI? That has the advantage that people can
start using it earlier than would be possible if it was added to the
standard library.




From steve at  Mon Jan 20 01:06:45 2014
From: steve at (Steven D'Aprano)
Date: Mon, 20 Jan 2014 11:06:45 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
 <> <>
 <> <>
Message-ID: <20140120000645.GV3915@ando>

On Sun, Jan 19, 2014 at 03:44:32PM -0800, Ethan Furman wrote:
> On 01/19/2014 03:35 PM, Ben Finney wrote:
> >
> >In other words, those who want Python 2 to continue need to either bite
> >the bullet and move their migration to Python 3 forward
> Um, if they want Python 2 to continue, why would they migrate to Python 3?

Because you can't always get what you want. I want a pony, but since I 
can't afford one or have any place to keep it, I've made do without.


From ben+python at  Mon Jan 20 01:07:17 2014
From: ben+python at (Ben Finney)
Date: Mon, 20 Jan 2014 11:07:17 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
References: <> <lbd36b$87t$>
 <> <>
 <> <>
Message-ID: <>

Ethan Furman <ethan at> writes:

> On 01/19/2014 03:35 PM, Ben Finney wrote:
> >
> > In other words, those who want Python 2 to continue need to either
> > bite the bullet and move their migration to Python 3 forward
> Um, if they want Python 2 to continue, why would they migrate to
> Python 3?

One of the often-stated justifications for wanting Python 2 to continue
is that the party wants to migrate their code base to Python 3, but

With that clause, I'm pointing out that "we can't find anyone to
continue maintaining Python 2 the way we want for the price we want to
pay for the length of time we want to keep using Python 2" still leaves
the plaintiff with the option to hurry up and migrate to Python 3.

 \     "Airports are ugly. Some are very ugly. Some attain a degree of |
  `\        ugliness that can only be the result of a special effort." |
_o__)       -Douglas Adams, _The Long Dark Tea-Time of the Soul_, 1988 |
Ben Finney

From steve at  Mon Jan 20 01:16:40 2014
From: steve at (Steven D'Aprano)
Date: Mon, 20 Jan 2014 11:16:40 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <20140119022811.GR3915@ando>
Message-ID: <20140120001640.GW3915@ando>

On Sun, Jan 19, 2014 at 08:18:19AM -0600, Neil Schemenauer wrote:
> On 2014-01-19, Steven D'Aprano wrote:
> > [Neil]
> > > - if people install this new version of Python as the default, old
> > >   scripts and programs will break. [...]
> > 
> > - It gives people an excuse to avoid migrating, and as sure as the sun 
> > rises in the east, will lead to people calling for Python 2.9 a few 
> > years from now.
> That would be progress though.  My proposed 2.8 would have most of
> the incompatible changes from 3.x so if people port it they will be
> much closer to 3.x.

Progress towards what, though? You say that they will be "closer" to 
migrating, but another way to look at it is that they will be *further 
away* from migrating:

- the only work they have to do is the easy parts, like adapting from
  zip returning a list to zip returning an iterator, in other words
  the part of the migration which can be handled by a simple-minded
  mechanical script like 2to3;

- in return they get access to many of the desirable new features of
  Python 3;

- which reduces their incentive to tackle the big, difficult, 
  structural changes needed for Python 3 (e.g. handling text as 
  Unicode properly).

To me, that's a step backwards.
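
To illustrate the difference between the two kinds of change (a small sketch run on Python 3, with made-up data): the zip change is mechanical, wrap the call in list() wherever a list is needed, while the text model forces a decision about where bytes end and text begins.

```python
# Easy, mechanical part: zip() returns a one-shot iterator on Python 3.
pairs = zip([1, 2, 3], "abc")
first_pass = list(pairs)     # consumes the iterator
second_pass = list(pairs)    # already exhausted, so this is empty

# Hard, structural part: text and bytes are different types.
text = "caf\u00e9"           # 4 characters
raw = text.encode("utf-8")   # 5 bytes, because the last character needs two

print(first_pass, second_pass, len(text), len(raw))
```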

One aim here is for the core developers to have one code base to 
maintain, not two. My grateful thanks to them for taking on all this 
extra work, and it has been a lot of work, to make it easier for users 
to migrate, but enough is enough. Adding 2.8 will extend that burden on 
the core developers by at least three years (18 months of active 
development plus 18 months of security features); adding 2.9 by the 
same again. It is entirely appropriate for the core devs to draw a line 
and say *this is when we stop supporting Python 2*, and that line has 
been drawn a long time ago at 2.7.

If people don't migrate after a decade, they won't migrate after 16 
years, especially if they get "all the good bits" apart from the Unicode 
text model (which many English speakers don't care about), so what 
you're actually suggesting is that the core devs agree to an extra 3-5 
years of maintaining the 2.x series for the sake of people who will 
very likely never migrate to 3.x.


From steve at  Mon Jan 20 01:23:22 2014
From: steve at (Steven D'Aprano)
Date: Mon, 20 Jan 2014 11:23:22 +1100
Subject: [Python-ideas] Tail Call Optimization (was Re: Tail recursion
In-Reply-To: <lbgfeq$kok$>
References: <>
 <> <20140119004515.GP3915@ando>
Message-ID: <20140120002322.GX3915@ando>

On Sun, Jan 19, 2014 at 07:12:16AM -0500, Terry Reedy wrote:

> Are you willing to do any of the work needed to make the option 
> available, starting with a specification? If so, I have some ideas.

Given the amount of controversy over this, it would probably need a PEP. 
I might be able to start with a pre-PEP, time permitting, and see how 
that goes. (If those interminable bytes/unicode/2.8 threads on the 
Python-Dev list would start to die off, I might have more time to treat 
this seriously.)

> >Having to fork the entire compiler just to write a few functions in
> >their most idiomatic, natural (recursive) form seems a bit extreme,
> >wouldn't you say?
> A 'fork' could consist of a relatively small patch that could be 
> uploaded to, for instance, PyPI. I would not be surprised if 100-200 
> lines might be enough.

Lines of *C* though, right? Which means for anyone to use it, they would 
have to be willing to build Python from source, applying your patch, or 
the maintainer would have to volunteer to provide pre-built binaries. 
Neither of which is exactly a recipe for broad take-up.


From tjreedy at  Mon Jan 20 01:34:22 2014
From: tjreedy at (Terry Reedy)
Date: Sun, 19 Jan 2014 19:34:22 -0500
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <lbhqu2$n4r$>

On 1/19/2014 6:41 PM, Daniel da Silva wrote:
> Below is a description of a very simple but immensely useful class
> called a "predicate set". In combination with the set and list
> comprehensions they would allow another natural layer of reasoning with
> mathematical set logic in Python.

Sets defined by predicates are usually infinite and mathematical set 
logic works fine with such.

> *Overview:*
>      Sets in mathematics can be defined by a list of elements without
> repetitions, and alternatively by a predicate (function) that determines
> inclusion. A predicate set would be a set-like class that is
> instantiated with a predicate function that is called to determine ``a
> in the_predicate_set''.
> >>> myset = predicateset(lambda s: s.startswith('a'))
> >>> 'xyz' in myset
> False
> >>> 'abc' in myset
> True
> >>> len(myset)
> Traceback (most recent call last):
>    [...]
> TypeError

This illustrates the problem with the idea. Only containment is really 
straightforward. (I am aware that some operations could be implemented 
by defining new predicates. To combine sets with predicatesets, the 
sets would have to be represented by predicates, as done below.)

> *Example Uses:*
> # Dynamic excludes in searching
> foo_files = search_files('foo', exclude=set(['a.out', 'Makefile']))
> bar_files = search_files('bar', exclude=predicateset(lambda fname:
>     fname.endswith('~'))) # exclude *~
> # Use in place of a set with an ORM
> validusernames = predicateset(lambda s: re.match('[a-zA-Z0-9]+', s))

I think these examples are backwards. The APIs should accept functions 
either in addition to or instead of collections. It is trivial to turn a 
collection into a predicate:

 >>> p = {'a', 'b', 'c'}.__contains__
 >>> p('a')
 True
 >>> p('d')
 False

You need realistic examples that use other operations (but not len ;-).
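
As a sketch of what "accept functions in addition to collections" could look like: the search_files below is a hypothetical stand-in that just filters a list of names (the function in the original post was never defined), and the exclude handling is an assumption about how such an API might be written.

```python
def search_files(names, exclude=None):
    """Hypothetical stand-in: return the names that are not excluded.

    exclude may be None, a collection of names, or a predicate function.
    """
    if exclude is None:
        excluded = lambda name: False
    elif callable(exclude):
        excluded = exclude                  # predicate passed directly
    else:
        excluded = exclude.__contains__     # collection turned into a predicate
    return [name for name in names if not excluded(name)]

names = ['a.out', 'Makefile', 'main.c', 'main.c~']

by_collection = search_files(names, exclude={'a.out', 'Makefile'})
by_predicate = search_files(names, exclude=lambda fname: fname.endswith('~'))
print(by_collection, by_predicate)
```

With this shape no wrapper class is needed: callers with a collection pass it as is, and callers with a rule pass a function.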

Terry Jan Reedy

From steve at  Mon Jan 20 01:53:37 2014
From: steve at (Steven D'Aprano)
Date: Mon, 20 Jan 2014 11:53:37 +1100
Subject: [Python-ideas] Tail Call Optimization (was Re: Tail recursion
In-Reply-To: <>
References: <>
 <> <20140119004515.GP3915@ando>
Message-ID: <20140120005335.GY3915@ando>

On Sun, Jan 19, 2014 at 10:31:06PM +1000, Nick Coghlan wrote:

> Guido is on record as preferring iterative algorithms as more
> comprehensible for more people, and explicitly opposed to adding tail
> call optimisation. 

Many people struggle with recursion. Many people struggle with 
coroutines, and asynchronous programming, and Unicode. Some people never 
quite get the hang of object oriented programming. That doesn't imply 
that Python should only offer features which nobody struggles with. It 
would be a pretty bare language if that were the case :-)

> I tend to agree with him - functional programming
> works OK in the small (and pure functions are a fine tool for managing
> complexity), but to scale up in a way that fits people's brains, you
> need to start writing code that looks more like a cookbook.

Python is not a pure functional language. Adding TCE won't make it one. 
If somebody wants to write their app in a pure functional manner, 
they're either not going to use Python at all, or they'll do it 
regardless of the lack of TCE and just grumble that Python is only 
suitable for "toy" applications.

But as a *component* of a larger "cookbook" style application, pure 
functions are great. And some functions are more naturally written in 
recursive style rather than iterative. I have no interest in writing my 
entire app as a pure-functional app (if I wanted to do that, I'd use 
Haskell). But I do have great interest in being able to write functions 
in the most natural way possible, and that sometimes means recursively, 
without having to compromise for performance.

> If you want inspiration on how to design a language for typical human
> thought patterns, look to cookbooks, training guides and operator
> manuals, not mathematics.

And Python is a great example of that, but it's not really relevant to 
the idea of adding TCE. Or at least, it's no more relevant than are 
people's grumbles that adding such things as closures and coroutines 
makes Python more complex and too advanced for "ordinary programmers".

Adding TCE need not affect Python as a language. People who like 
iteration will still write iterative functions. People who think like 
Java programmers will still write Java in Python, people who think like 
bash scripters will still write bash in Python. The only addition is 
that people who think like Scheme programmers will have one less thing 
to complain about in Python *wink*

Most programmers write for themselves, or for a small group. Arguing 
that Sue (who can think recursively) ought to write her code using an 
iterative algorithm because Tom and Jerry won't otherwise understand it 
is not a terribly strong argument when Tom and Jerry aren't in Sue's 
target audience.


From mertz at  Mon Jan 20 02:15:44 2014
From: mertz at (David Mertz)
Date: Sun, 19 Jan 2014 17:15:44 -0800
Subject: [Python-ideas] return from (was Re: Tail recursion elimination)
In-Reply-To: <lbhm77$aei$>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

I was mostly disliking the idea of TCO during this discussion.  However,
the idiom of 'return from' seems sufficiently elegant and explicit--and has
exactly the semantics you'd expect from 'yield from'--that I am actually +1
on that idea.

Being an explicit construct, it definitely becomes a case of "consenting
adults" not of implicit magic.  I.e. you are declaring right in the code
that you don't expect to see a frame in a stack trace, which is fair
enough.  I mean, if you *really* wanted to you could muck around with
'sys._getframe(N).f_whatever' already which would give inaccurate
tracebacks too.  Probably there would be a way to remove frames from the
stack even, using some such trick in current python.
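
For reference, this is the kind of frame access I mean. It is a minimal CPython-specific sketch (sys._getframe is an implementation detail and the helper names are made up); it only reads the caller's frame and does not try to remove anything from the stack.

```python
import sys

def caller_name():
    # Frame 0 is this function; frame 1 is whoever called it.
    return sys._getframe(1).f_code.co_name

def outer():
    return caller_name()

print(outer())
```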

On Sun, Jan 19, 2014 at 3:13 PM, Terry Reedy <tjreedy at> wrote:

> Proposal (mostly not mine): add 'return from f(args)', in analogy with
> 'yield from iterator', to return a value to the caller from an execution
> frame running f(args) (and either reuse or delete the frame that ran
> 'return from'). The function name 'f' would not have to match the name of
> the function being compiled, so this would actually be TCO, even if it were
> nearly always used for recursive tail calls. That does mean that it would
> work for mutually tail recursive functions.
> On 1/19/2014 6:57 AM, Joao S. O. Bueno wrote:
>> OTOH, since we are at it, we'd better check
>> 2009 BDFL's opinion on the subject:
>> elimination.html
> I read through the comments and near the very end, in July 2013, Dan
> LaMotte said... '''
> Definitely seems to be complicated/impossible to determine a function is
> tail recursion 'compliant' statically in python, however, what if it were
> an 'opt in' feature that uses a different 'return' keyword?
>
>     def f(n):
>         if n > 0:
>             tailcall f(n - 1)
>         return 0
> '''
> In additional paragraphs, he noted, among other things, that this makes
> the feature 'opt-in' on a function by function basis.
> Guido replied "Dan: your proposal has the redeeming quality of clearly
> being a language feature rather than a possible optimization. I don't
> really expect there to be enough demand to actually add this to the
> language though. Maybe you can use macropy to play around with the idea
> though?"
> ???? then suggested 'return from'. My only contribution is to point out
> the analogy with the new, and initially strange, 'yield from'.
> Guido seems to have said that if a) someone tries out the idea with
> macropy, and b) someone demonstrates enough demand, he might consider
> adding such a feature. So this seems to me the best option to pursue to get
> something into CPython. I also think it is the best proposal so far.
> As for a), I have not looked at macropy, but:
> On 1/19/2014 4:33 PM, Haoyi Li wrote:
> > MacroPy's @tco decorator is about as easy as you could ask for. 'pip
> > install macropy', 'from macropy.experimental.tco import macros, tco'
> > is about as easy as you could ask for. Works for arbitrary tail-calls
> > too, not just tail recursion.
> That leaves b) for those of you who want the feature.
> Any PEP should admit that the feature might be abused. Someone might write
>   return from len(composite)
> Unless return from refuses to delete the frame making a call to a C
> function, the effect would be to save a trivial O(1) space at the cost of
> deleting the most important line of a stack trace should len() raise. But I
> think this falls under the 'consenting adults' principle. A proposed doc
> should make it clear that the intended use is to make deeply recursive or
> mutually recursive functions run and not to replace all tail calls.
> --
> Terry Jan Reedy

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From jeanpierreda at  Mon Jan 20 02:17:04 2014
From: jeanpierreda at (Devin Jeanpierre)
Date: Sun, 19 Jan 2014 17:17:04 -0800
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 19, 2014 at 3:41 PM, Daniel da Silva
<var.mail.daniel at> wrote:
> Below is a description of a very simple but immensely useful class called a
> "predicate set". In combination with the set and list comprehensions they
> would allow another natural layer of reasoning with mathematical set logic
> in Python.

Efficiently implementing the set operators (intersection, union, etc.)
requires using ROBDDs (reduced ordered binary decision diagrams),
which are complex enough to deserve their _own_ library. It's not a
simple task, and shouldn't be written from scratch.

That said, if you implemented it, and did it efficiently, I'd find it
hugely helpful. I ended up implementing it on my own in a bit of a
brute force fashion once (I used truth tables instead of BDDs):
(I make no claims to this being good or correct code)

-- Devin

From tjreedy at  Mon Jan 20 02:28:46 2014
From: tjreedy at (Terry Reedy)
Date: Sun, 19 Jan 2014 20:28:46 -0500
Subject: [Python-ideas] Tail Call Optimization (was Re: Tail recursion
In-Reply-To: <20140120002322.GX3915@ando>
References: <>
 <> <20140119004515.GP3915@ando>
 <lbgfeq$kok$> <20140120002322.GX3915@ando>
Message-ID: <lbhu42$kat$>

On 1/19/2014 7:23 PM, Steven D'Aprano wrote:
> On Sun, Jan 19, 2014 at 07:12:16AM -0500, Terry Reedy wrote:
>> Are you willing to do any of the work needed to make the option
>> available, starting with a specification? If so, I have some ideas.

Since writing the above, I came across the 'return from' idea, which I 
think is the best so far, and better than any of the 'ideas' I was 
thinking of. See my 'return from' post.

> Given the amount of controversy over this, it would probably need a PEP.
> I might be able to start with a pre-PEP, time permitting, and see how
> that goes. (If those interminable bytes/unicode/2.8 threads on the
> Python-Dev list would start to die off, I might have more time to treat
> this seriously.)

>> A 'fork' could consist of a relatively small patch that could be
>> uploaded to, for instance, PyPI. I would not be surprised if 100-200
>> lines might be enough.
> Lines of *C* though, right?


> Which means for anyone to use it, they would
> have to be willing to build Python from source, applying your patch, or
> the maintainer would have to volunteer to provide pre-built binaries.

A typical combination is source for *nix and a Windows installer.

> Neither of which is exactly a recipe for broad take-up.

Use of macropy.experimental.tco would give some indication of the 
popularity of the idea. Without using it, I do not know how close it is.

A 'return from' patch could start by copying the code that recognizes 
'yield from' and compiles it to a YIELD_FROM bytecode. (Or by looking at 
the part of the yield from patch that added the code.) Writing code to 
implement a RETURN_FROM bytecode, by modifying the RETURN_VALUE 
function, would be a separate step.
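
As a rough illustration of that starting point (a sketch only, not part
of any existing patch): the parser already turns 'yield from' into its
own AST node, while 'return from' is rejected by the current grammar.
The helper names has_yield_from and parses are made up for this example.

```python
import ast

YIELD_FROM_SRC = "def gen(it):\n    yield from it\n"
RETURN_FROM_SRC = "def f(n):\n    return from f(n - 1)\n"


def has_yield_from(source):
    """Return True if the parsed source contains a 'yield from' expression."""
    tree = ast.parse(source)
    return any(isinstance(node, ast.YieldFrom) for node in ast.walk(tree))


def parses(source):
    """Return True if the running interpreter's grammar accepts the source."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True
```

On a stock interpreter the first source parses and contains an
ast.YieldFrom node; the second is a SyntaxError, which is the part a
grammar patch would have to change.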

Terry Jan Reedy

From steve at  Mon Jan 20 02:49:19 2014
From: steve at (Steven D'Aprano)
Date: Mon, 20 Jan 2014 12:49:19 +1100
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <>
References: <>
 <> <20140119004515.GP3915@ando>
Message-ID: <20140120014919.GZ3915@ando>

On Sun, Jan 19, 2014 at 12:01:00PM -0800, Andrew Barnert wrote:
> From: Steven D'Aprano <steve at>
> > In fact, in some cases, I *would* willingly give up *non-useful*
> > tracebacks for the ability to write more idiomatic code. Have you seen 
> > the typical recursive traceback?
> But if you eliminate tail calls, you're not just eliminating recursive 
> tracebacks; you're eliminating every stack frame that ends in a tail 
> call. Which includes a huge number of useful frames.
> If you restrict it to _only_ eliminating recursive tail calls, then it 
> goes from something that can be done at compile time (as I showed in 
> my previous email) to something that has to be done at runtime, making 
> every function call slower. And it doesn't work with mutual or 
> indirect recursion (unless you want to walk the whole stack to see if 
> the function being called exists higher up -- which makes it even slower, 
> and also gets us back to eliminating useful tracebacks).

But if TCE becomes opt-in, say by the proposed "return from" syntax, 
then you can keep your cake and eat it too. I can decide at *edit* time, 
"this function should have TCE enabled", and leave the rest of my code 
to have the "normal" behaviour.

If the choice was "TCE everywhere" versus "TCE nowhere", I would choose 
nowhere too. But it need not be that choice.
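
For what it's worth, the opt-in behaviour can be approximated in
current Python with a trampoline. This is only a sketch under my own
assumptions: TailCall and tail_calls are hypothetical names, and
returning a TailCall object stands in for the proposed
'return from f(args)'.

```python
import functools


class TailCall:
    """Marker for a deferred call; stands in for 'return from func(*args)'."""

    def __init__(self, func, *args, **kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs


def tail_calls(func):
    """Opt a function in: run its chain of TailCall results in a loop."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        while isinstance(result, TailCall):
            # Unwrap decorated targets so each step runs in a fresh,
            # shallow frame instead of nesting trampolines.
            target = getattr(result.func, "__wrapped__", result.func)
            result = target(*result.args, **result.kwargs)
        return result

    return wrapper


@tail_calls
def is_even(n):
    return True if n == 0 else TailCall(is_odd, n - 1)


@tail_calls
def is_odd(n):
    return False if n == 0 else TailCall(is_even, n - 1)
```

Only the decorated functions lose their intermediate frames; everything
else keeps normal tracebacks. is_even(100000) runs without hitting the
recursion limit, and it also covers the mutually recursive case.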

> > py> a(7)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "./", line 2, in a
> >     return b(n-1)
> >   File "./", line 5, in b
> >     return c(n-1) + a(n)
> >   File "./", line 9, in c
> >     return 1/n
> > ZeroDivisionError: division by zero
> > 
> > The only thing that I care about is the very last line, that function c 
> > tries to divide by zero. The rest of the traceback is just noise, I 
> > don't even look at it.
> Your example is not actually tail-recursive.
> I'm guessing you know this, and decided that having something that 
> blows up fast just to have an example of a recursive traceback was 
> more important than having an example that also fits into the rest of 
> the discussion -- which is perfectly reasonable.

Yes, you got me. It was throw away code, which I've since thrown away, 
but if I recall correctly one of the three functions was tail-recursive. 
I was more concerned with making the rhetorical point that sometimes the 
only part of the traceback you care about is the bit that actually 
fails, at which point the rest of the traceback is noise and you might 
choose to prefer performance over a more detailed traceback.

> But it's still worth calling that out, because at least half the blog 
> posts out there that say "Python sucks because it doesn't have TCE" 
> prove Python's suckiness by showing a non-tail-recursive algorithm 
> that would blow up exactly the same way in Scheme as in Python.

I work with one of those guys :-(
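
To make the distinction concrete, here is a small sketch (function
names are mine, chosen for illustration). The first version is not
tail-recursive, so TCE would not help it in any language; the second
is, but current CPython still keeps one frame per call.

```python
import sys


def factorial(n):
    # Not a tail call: the multiplication happens after the recursive
    # call returns, so every frame has to stay alive.
    return 1 if n <= 1 else n * factorial(n - 1)


def factorial_acc(n, acc=1):
    # A tail call: the recursive call's result is returned unchanged.
    return acc if n <= 1 else factorial_acc(n - 1, acc * n)


def exceeds_stack(func, depth):
    """Return True if calling func(depth) raises RecursionError."""
    try:
        func(depth)
    except RecursionError:
        return True
    return False
```

Both give the same answers for small inputs, and both currently fail
for inputs deeper than sys.getrecursionlimit(). The line
'return c(n-1) + a(n)' in the traceback above is of the first kind.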

> > I'm not suggesting that TCE should be compulsory. I would be happy 
> > with a commandline switch to turn it on, or better still, a 
> > decorator to apply it to certain functions and not others. I expect 
> > that I'd have TCE turned off for debugging.
> But the primary reason people want TCE is to be able to write 
> functions that otherwise wouldn't run. Nobody asks for TCE because 
> they're concerned about 2KB wasted on stack traces in their shallow 
> algorithm; they ask for it because their deep algorithm fails with a 
> recursion error. So, turning it off to debug it means turning off the 
> ability to reproduce the error you're trying to debug.

You seem to be assuming that bugs in deep algorithms only manifest 
themselves in sufficiently deep data sets that turning TCE off will 
cause a recursion error before the true bug manifests, thus masking the 
bug you care about by mere lack of resources.

I don't believe that is the case for all bugs, or even a majority. If it 
is true for some bugs -- of course it will be -- then a solution is to 
add enough temporary debugging code (e.g. logging, or even just good ol' 
print) to see enough of what is going on that you can identify the bug, 
stacktrace or no stacktrace. Chances are you would have to write some 
temporary debugging code regardless of whether the algorithm was 
iterative or recursive, TCE or no TCE.


From tjreedy at  Mon Jan 20 02:58:17 2014
From: tjreedy at (Terry Reedy)
Date: Sun, 19 Jan 2014 20:58:17 -0500
Subject: [Python-ideas] return from (was Re: Tail recursion elimination)
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
 <> <lbgaob$421$>
Message-ID: <lbhvrd$54g$>

On 1/19/2014 8:15 PM, David Mertz wrote:

 > On Sun, Jan 19, 2014 at 3:13 PM, Terry Reedy
 >     Proposal (mostly not mine): add 'return from f(args)', in analogy
 >     with 'yield from iterator', to return a value to the caller from an
 >     execution frame running f(args) (and either reuse or delete the
 >     frame that ran 'return from'). The function name 'f' would not have
 >     to match the name of the function being compiled; this would
 >     actually be TCO, even if it were nearly always used for recursive
 >     tail calls. That does mean that it would work for mutually tail
 >     recursive functions.

 > I was mostly disliking the idea of TCO during this discussion.  However,
 > the idiom of 'return from' seems sufficiently elegant and explicit--and
 > has exactly the semantics you'd expect from 'yield from'--that I am
 > actually +1 on that idea.
 > Being an explicit construct, it definitely becomes a case of "consenting
 > adults" not of implicit magic.  I.e. you are declaring right in the code
 > that you don't expect to see a frame in a stack trace, which is fair
 > enough.  I mean, if you *really* wanted to you could muck around with
 > 'sys._getframe(N).f_whatever' already which would give inaccurate
 > tracebacks too.  Probably there would be a way to remove frames from
 > the stack even, using some such trick in current python.

Acting upon encountering a call-return bytecode pair has the following 
problems.

1. It is CPython specific and probably not portable to all 
implementations. Guido has cited this as a major block.

2. It must be optional, but how?

2A. A command line option is too broad. For some inputs, functions would 
return or crash depending on the option. Not good. Also, command line 
options do not work well when starting Python with icons.

2B. A future import would have a narrower scope but still might be too 
broad. It would also be an abuse because the 'future' would be a fake 
future that is partly now and partly never.

2C. A sys flag has the non-icon problems of a command line option.

An explicit indicator in the function avoids most of these problems. The 
only one I am not sure about is other implementations, but with explicit 
system independent syntax, there is at least a chance.

A developer can temporarily switch back to return (with small enough 
input) to get a full stack trace for exactly one function, just as one 
can temporarily add 'print' to get a 'loop trace' for exactly one loop.
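
For reference, the 'call-return bytecode pair' discussed above can be
spotted with the dis module. This is a sketch that assumes CPython,
where call opcodes have names starting with CALL and the return opcode
is RETURN_VALUE; the exact opcode names differ between versions and
other implementations need not have them at all, which is point 1. The
helper name ends_in_call_return is invented for the example.

```python
import dis


def ends_in_call_return(func):
    """Return True if some RETURN_VALUE directly follows a call instruction."""
    previous = None
    for instr in dis.get_instructions(func):
        if (instr.opname == "RETURN_VALUE" and previous is not None
                and previous.opname.startswith("CALL")):
            return True
        previous = instr
    return False


def g(n):
    return n


def tail(n):
    return g(n - 1)        # call immediately followed by return


def not_tail(n):
    return g(n - 1) + 1    # an addition sits between call and return
```

Only tail() has the pattern; not_tail() and g() do not.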

Terry Jan Reedy

From stephen at  Mon Jan 20 04:06:14 2014
From: stephen at (Stephen J. Turnbull)
Date: Mon, 20 Jan 2014 12:06:14 +0900
Subject: [Python-ideas] Create Python 2.8 as a transition step to
	Python	3.x
In-Reply-To: <20140120001640.GW3915@ando>
References: <> <20140119022811.GR3915@ando>
 <> <20140120001640.GW3915@ando>
Message-ID: <>

Steven D'Aprano writes:

 > To me, that's a step backwards.

I agree, but this kind of "step backwards" is a "consenting adults"
issue.  So let's avoid such pejorative terminology, and stick to the
line that a lot of resources would be required to create such a Python
2.8, and there's little benefit to be had.

From rosuav at  Mon Jan 20 04:30:35 2014
From: rosuav at (Chris Angelico)
Date: Mon, 20 Jan 2014 14:30:35 +1100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <20140119022811.GR3915@ando>
 <> <20140120001640.GW3915@ando>
Message-ID: <>

On Mon, Jan 20, 2014 at 2:06 PM, Stephen J. Turnbull <stephen at> wrote:
> Steven D'Aprano writes:
>  > To me, that's a step backwards.
> I agree, but this kind of "step backwards" is a "consenting adults"
> issue.  So let's avoid such pejorative terminology, and stick to the
> line that a lot of resources would be required to create such a Python
> 2.8, and there's little benefit to be had.

No, I'm with Steven on this. (Steven with a v, as opposed to Stephen
with a ph. It's like talking to the detectives in Tintin.) Even if it
cost no resources at all - if Python 2.8 already existed, exactly as
described - it would be a third Python to aim for (as well as 2.7 and
3.x). It's already hard enough to span lots of Python versions; adding
another that's deliberately and consciously incompatible with both the
primary branches would be a major problem. It may be that code that
runs on 2.7 and 3.4 will also automatically run on 2.8 (which seems
possible, but far from certain), but if not, 2.8 would cause problems
for everyone who tries to write code for every supported version. For
anything other than in-house scripts where one person/team controls
both the script and the interpreter it runs on, compatibility with
multiple versions will be critical; and adding something incompatible
with both current versions is an XKCD 927 situation [1]. No matter how
cheap or expensive it is to do, that's a problem *in itself*, so the
proposal has to justify itself enough to overcome that.



From bruce at  Mon Jan 20 04:40:42 2014
From: bruce at (Bruce Leban)
Date: Sun, 19 Jan 2014 19:40:42 -0800
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <20140120000645.GV3915@ando>
References: <> <lbd36b$87t$>
 <> <>
 <> <20140120000645.GV3915@ando>
Message-ID: <>

On Sun, Jan 19, 2014 at 4:06 PM, Steven D'Aprano <steve at> wrote:

> On Sun, Jan 19, 2014 at 03:44:32PM -0800, Ethan Furman wrote:
> > On 01/19/2014 03:35 PM, Ben Finney wrote:
> > >
> > >In other words, those who want Python 2 to continue need to either bite
> > >the bullet and move their migration to Python 3 forward
> >
> > Um, if they want Python 2 to continue, why would they migrate to Python 3?
> Because you can't always get what you want. I want a pony, but since I
> can't afford one or have any place to keep it, I've made do without.

I think the odds of Python getting

        from __future__ import pony

are slightly higher than there being a Python 2.8. I assume by "pony" you
really mean what I'd like to have:

        from __future__ import everything

since my goal is to write Python 3 compatible code even though I'm
temporarily stuck with Python 2 due to stack issues. The __future__ imports
make it easier to write forward compatible code. As it is, I have to list
the individual imports in every file and I also add:

        range = xrange

--- Bruce

From python at  Mon Jan 20 05:19:44 2014
From: python at (MRAB)
Date: Mon, 20 Jan 2014 04:19:44 +0000
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
 <> <>
 <> <>
Message-ID: <>

On 2014-01-20 03:40, Bruce Leban wrote:
> On Sun, Jan 19, 2014 at 4:06 PM, Steven D'Aprano <steve at> wrote:
>     On Sun, Jan 19, 2014 at 03:44:32PM -0800, Ethan Furman wrote:
>      > On 01/19/2014 03:35 PM, Ben Finney wrote:
>      > >
>      > >In other words, those who want Python 2 to continue need to either bite
>      > >the bullet and move their migration to Python 3 forward
>      >
>      > Um, if they want Python 2 to continue, why would they migrate to Python 3?
>     Because you can't always get what you want. I want a pony, but since I
>     can't afford one or have any place to keep it, I've made do without.
> I think the odds of Python getting
>          from __future__ import pony
> are slightly higher than there being a Python 2.8. I assume by "pony"
> you really mean what I'd like to have:
>          from __future__ import everything
That should be:

     from __future__ import *

although it would still be discouraged because you might find that
you're no longer able to get at some of the stuff you have already. :-)

> since my goal is to write Python 3 compatible code even though I'm
> temporarily stuck with Python 2 due to stack issues. The __future__
> imports make it easier to write forward compatible code. As it is, I
> have to list the individual imports in every file and I also add:
>          range = xrange

From abarnert at  Mon Jan 20 07:15:20 2014
From: abarnert at (Andrew Barnert)
Date: Sun, 19 Jan 2014 22:15:20 -0800
Subject: [Python-ideas] Tail recursion elimination
In-Reply-To: <20140120014919.GZ3915@ando>
References: <>
 <> <20140119004515.GP3915@ando>
Message-ID: <>

On Jan 19, 2014, at 17:49, Steven D'Aprano <steve at> wrote:

> But if TCE becomes opt-in, say by the proposed "return from" syntax, 
> then you can keep your cake and eat it too. I can decide at *edit* time, 
> "this function should have TCE enabled", and leave the rest of my code 
> to have the "normal" behaviour.

My first post on the subject suggested adding a new keyword (I think I used "tailcall", borrowed from Guido's post) to do explicit tail calls, and only building TCE as an automatic optimization on top of it (which I'm pretty sure could be done with a trivial peephole optimizer rule) if you still need it after that. So obviously, I agree with this.

And yes, "return from" is definitely better than "tailcall"--readable and understandable, no new keyword, etc.

And I still think this would be a fun project even though I don't think I would ever use it. I tried effectively this same design against Stackless 2.6 a few years ago, but it sometimes leaked, and would crash whenever a C function called a Python function that tail called, and I ran out of free time to debug any further. The point is, this isn't a massive impossible project; many of the people insisting they want it are probably capable of writing it, even if they've never tried hacking on the interpreter. (The grammar is a huge pain the first time, however...)

From abarnert at  Mon Jan 20 07:36:25 2014
From: abarnert at (Andrew Barnert)
Date: Sun, 19 Jan 2014 22:36:25 -0800
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
 <> <>
 <> <>
Message-ID: <>

On Jan 19, 2014, at 19:40, Bruce Leban <bruce at> wrote:

> On Sun, Jan 19, 2014 at 4:06 PM, Steven D'Aprano <steve at> wrote:
>> On Sun, Jan 19, 2014 at 03:44:32PM -0800, Ethan Furman wrote:
>> > On 01/19/2014 03:35 PM, Ben Finney wrote:
>> > >
>> > >In other words, those who want Python 2 to continue need to either bite
>> > >the bullet and move their migration to Python 3 forward
>> >
>> > Um, if they want Python 2 to continue, why would they migrate to Python 3?
>> Because you can't always get what you want. I want a pony, but since I
>> can't afford one or have any place to keep it, I've made do without.
> I think the odds of Python getting
>         from __future__ import pony
> are slightly higher than there being a Python 2.8. I assume by "pony" you really mean what I'd like to have:
>         from __future__ import everything

If that existed, I wouldn't use it. Without it, I know my 2.6+/3.3+ code will work until 3.7. With it, if 3.5 added a new future feature, my code may only work until 3.4. That's not worth it for the convenience of saving a few characters.

> since my goal is to write Python 3 compatible code even though I'm temporarily stuck with Python 2 due to stack issues. The __future__ imports make it easier to write forward compatible code. As it is, I have to list the individual imports in every file and I also add:
>         range = xrange

There are only four live future features in 2.6 and 2.7, and you can fit them all into one statement that fits in 80 columns. Which you can put into your project template, and then you're done with it.
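
If you want to check that count rather than take my word for it, the
__future__ module records when each feature became optional and when it
became (or will become) mandatory. A quick sketch; live_features is a
made-up helper name, and "live" here means optional by the oldest
version you target and not yet mandatory in the newest:

```python
import __future__


def live_features(oldest=(2, 6), newest=(2, 7)):
    """Features optional in `oldest` and still not mandatory in `newest`."""
    live = []
    for name in __future__.all_feature_names:
        feature = getattr(__future__, name)
        optional = feature.getOptionalRelease()
        mandatory = feature.getMandatoryRelease()
        if optional[:2] <= oldest and (mandatory is None or mandatory[:2] > newest):
            live.append(name)
    return sorted(live)
```

With the defaults this reports absolute_import, division,
print_function and unicode_literals, so the template line can be
written as:

    from __future__ import (absolute_import, division, print_function,
                            unicode_literals)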

And then I usually have one more line, "from sixify import *", where sixify is a project-specific collection of imports from six. (And then the challenge is fighting to stop people from putting non-six-related things into sixify and turning it into one of those "stdafx.h" messes that every windows c++ app has.)

From abarnert at  Mon Jan 20 08:56:49 2014
From: abarnert at (Andrew Barnert)
Date: Sun, 19 Jan 2014 23:56:49 -0800 (PST)
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <>

From: Daniel da Silva <var.mail.daniel at>
Sent: Sunday, January 19, 2014 3:41 PM

>    Sets in mathematics can be defined by a list of elements without repetitions, and alternatively by a predicate (function) that determines inclusion.

The whole point of modern set theory is that sets cannot be defined by a predicate alone; only by a predicate _and a set to apply it over_. Which we already have in set comprehensions.

And your suggestion has the exact same problem that naive set theory had:

>>> myset = predicateset(lambda s: s.startswith('a'))
>>> 'xyz' in myset

>>> russellset = predicateset(lambda s: s not in s)
>>> russellset in russellset

Presumably this should cause the computer to scream "DOES NOT COMPUTE!" and blow up, which I think would be hard to implement in CPython.

Still, this could be useful despite not being mathematically consistent. Python functions don't have to be mathematical functions, and you could easily just state that using a predicateset that turns out to be a proper class is undefined behavior, so it's perfectly acceptable if an implementation wants to hang forever or fail with a recursion error or whatever.

Anyway, the way you've designed this, as far as I can tell, there's nothing stopping it from being a module on PyPI that you can come back and propose for inclusion in the stdlib if a lot of people start using it. So I'd say go for it. (And you can even propose syntax, a comprehension with no for clause: {x if expression(x)}, if it's popular enough that seems warranted.)

Also, this isn't a Set in Python terms -- or an Iterable or a Sized; it's just a Container. Which is perfectly reasonable, and means len(s) and iter(s) failing is exactly what you should expect. But the name could lead people to expect it to be a Set. Then again, "predicatecontainer" sounds horrible, so maybe the small potential for confusion is fine.

You still need to work out the details. Most of them seem easy, but there are some interesting questions.

 * It's presumably immutable, and therefore Hashable. (It can fail if its predicate isn't -- which most callables are, but that's not guaranteed -- but I believe that's fine for Hashables.)
 * Is the predicate callable accessible through a public name, or do you have to access it through __contains__?
 * Presumably intersection, union, difference, and symmetric_difference with another predicateset do the obvious thing (and/or/and not/xor the predicates). Or is there something more efficient you could do? There are some modules on PyPI that deal with boolean combinations of predicates; maybe just borrow the design or even import the implementation from one of them?
 * intersection with a set or other Iterable can return a set, equivalent to {x for x in s if x in ps}. And __rand__ allows it to work in the wrong direction when using the operator. But set.intersection(predicateset) will raise a TypeError, and there's not much you can do about that. (And the same goes for the other methods.)
 * union, difference, and symmetric difference with an Iterable presumably turn the other argument into a predicateset(lambda x: x in s) and then operate on that? Or is there a better way to do it?
 * isdisjoint with a set or other iterable is easy, but what about with another predicateset? An error?
 * issubset and issuperset don't seem implementable, except in the special case that one predicate is made by intersection or union from the other; do they just not exist?
 * Do you want other operations from naive set theory that don't make sense for Python sets, like the unconstrained complement? They could all be implemented with the existing operations and a set of all things (e.g., self.complement() is just predicateset(lambda x: True).difference(self)), so maybe not. But they might be convenient. (Again, tying in with the boolean-predicates libraries, most of them have a "not" type operation.)
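
To make those questions concrete, here is a minimal sketch of one set
of answers. It is not the original proposal's code: the class name
PredicateSet, the public predicate attribute, and the choice to wrap
plain containers in a membership test are all my assumptions.

```python
from collections.abc import Container, Iterable


class PredicateSet(Container):
    """Container whose membership is decided by a predicate."""

    def __init__(self, predicate):
        self.predicate = predicate  # public, so callers need not go via __contains__

    def __contains__(self, item):
        return bool(self.predicate(item))

    @staticmethod
    def _as_predicate(other):
        # Another PredicateSet contributes its predicate; any other
        # container is wrapped in a membership test.
        if isinstance(other, PredicateSet):
            return other.predicate
        return lambda x: x in other

    def __and__(self, other):
        if isinstance(other, PredicateSet):
            return PredicateSet(lambda x: self.predicate(x) and other.predicate(x))
        if isinstance(other, Iterable):
            # Intersection with a finite iterable can be a real set.
            return {x for x in other if x in self}
        return NotImplemented

    __rand__ = __and__  # lets 'some_set & ps' work via the reflected operator

    def __or__(self, other):
        q = self._as_predicate(other)
        return PredicateSet(lambda x: self.predicate(x) or q(x))

    def __sub__(self, other):
        q = self._as_predicate(other)
        return PredicateSet(lambda x: self.predicate(x) and not q(x))

    def __xor__(self, other):
        q = self._as_predicate(other)
        return PredicateSet(lambda x: bool(self.predicate(x)) != bool(q(x)))

    def complement(self):
        return PredicateSet(lambda x: not self.predicate(x))
```

With this, {'apple', 'bob'} & PredicateSet(lambda s: s.startswith('a'))
gives {'apple'}, while {'apple'}.intersection(ps) still raises
TypeError because a PredicateSet is not iterable. issubset,
issuperset, and isdisjoint against another PredicateSet are left out,
since I don't see how to decide them in general.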

The big problem is coming up with a compelling use case. This one doesn't sell me:

    bar_files = search_files('bar', exclude=predicateset(lambda fname: not fname.endswith('~')))

It seems like it would make more sense to have exclude take a function, so you could just write:

    bar_files = search_files('bar', exclude=lambda fname: not fname.endswith('~'))

In general, calling a function is just as easy, natural, and readable as testing membership; calling filter or using a comprehension would generally be simpler than creating a predicateset just to use intersection; etc.

And in cases where sometimes a container is useful, but sometimes a function is better... well, look at re.sub or BeautifulSoup.find. I've seen people who didn't know that you could pass a function to re.sub, but nobody who, on seeing it, had any trouble understanding what it did.

Maybe there's a use for "legacy" APIs that were designed around containers and would be hard to change. For example, many file-picker dialogs let you specify the acceptable extensions, but not a filter function. But in most cases, that's because they're ultimately calling some underlying C/ObjC/.NET/whatever function that needs an array, and a predicateset won't help there anyway. (Or, put another way, they're not designed around containers, they're designed around iterables.)

From rosuav at  Mon Jan 20 09:09:53 2014
From: rosuav at (Chris Angelico)
Date: Mon, 20 Jan 2014 19:09:53 +1100
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 20, 2014 at 6:56 PM, Andrew Barnert <abarnert at> wrote:
> Also, this isn't a Set in Python terms -- or an Iterable or a Sized; it's just a Container. Which is perfectly reasonable, and means len(s) and iter(s) failing is exactly what you should expect. But the name could lead people to expect it to be a Set. Then again, "predicatecontainer" sounds horrible, so maybe the small potential for confusion is fine.

If I might be permitted to bikeshed the name a little: My first
thought (from the subject line) was that this was a set *of*
predicates, not a set *defined by a* predicate. But a frozenset isn't
a set of frozens either, so this might be less confusing than I


From ncoghlan at  Mon Jan 20 09:55:57 2014
From: ncoghlan at (Nick Coghlan)
Date: Mon, 20 Jan 2014 18:55:57 +1000
Subject: [Python-ideas] return from (was Re: Tail recursion elimination)
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

On 20 Jan 2014 11:16, "David Mertz" <mertz at> wrote:
> I was mostly disliking the idea of TCO during this discussion.  However,
> the idiom of 'return from' seems sufficiently elegant and explicit--and has
> exactly the semantics you'd expect from 'yield from'--that I am actually +1
> on that idea.

I agree that a PEP for "return from" would be interesting. It also gives
debuggers something to latch on to in order to handle the new scenario
(just as they needed some adjustment to handle "yield from").

"return from" could also be explicitly disallowed in try blocks and with
statements (since those inherently conflict with the idea of reusing the
current frame for a different call).

By keeping a list of references to the elided calls (perhaps using counts
for more efficient handling of recursive calls), you could even partially
reconstruct the missing parts of the traceback.
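
A rough sketch of that bookkeeping, purely illustrative: ElidedCalls is
a hypothetical helper, and the loop in countdown() stands in for what a
real 'return from' implementation would do when it reuses the frame.

```python
class ElidedCalls:
    """Run-length log of calls whose frames were dropped."""

    def __init__(self):
        self.runs = []  # list of [function name, consecutive count]

    def record(self, name):
        if self.runs and self.runs[-1][0] == name:
            self.runs[-1][1] += 1      # same function again: bump the count
        else:
            self.runs.append([name, 1])

    def format(self):
        return "\n".join(
            "  [elided] {} (x{})".format(name, count) for name, count in self.runs
        )


def countdown(n, log):
    # Loop form of a tail-recursive countdown; each iteration stands in
    # for one 'return from countdown(n - 1)' and is recorded in the log.
    while n > 0:
        log.record("countdown")
        n -= 1
    return 1 / n  # deliberately raises ZeroDivisionError at the bottom
```

A deep self-recursive chain then costs one list entry rather than one
frame per call, and a traceback printer could splice log.format() in
where the missing frames would have been.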

> Being an explicit construct, it definitely becomes a case of "consenting
> adults" not of implicit magic.  I.e. you are declaring right in the code
> that you don't expect to see a frame in a stack trace, which is fair
> enough.  I mean, if you *really* wanted to you could muck around with
> 'sys._getframe(N).f_whatever' already which would give inaccurate
> tracebacks too.  Probably there would be a way to remove frames from the
> stack even, using some such trick in current python.

Yep, we do that (from C) in importlib to try to reduce the infrastructure
noise in the tracebacks shown to users.


> On Sun, Jan 19, 2014 at 3:13 PM, Terry Reedy <tjreedy at> wrote:
>> Proposal (mostly not mine): add 'return from f(args)', in analogy with
'yield from iterator', to return a value to the caller from an execution
frame running f(args) (and either reuse or delete the frame that ran
'return from'). The function name 'f' would not have to match the name of
the function being compiled, this would actually be TCO, even if it were
nearly always used for recursive tail calls. That does mean that is would
work for mutually tail recursive functions.
>> On 1/19/2014 6:57 AM, Joao S. O. Bueno wrote:
>>> OTOH, since we are at it, we'd better check
>>> 2009 BDLF's opinion on the subject:
>> I read throught the comments and near the very end, in July 2013, Dan
LaMotte said... '''
>> Definitely seems to be complicated/impossible to determine a function is
tail recursion 'compliant' statically in python, however, what if it were
an 'opt in' feature that uses a different 'return' keyword?
>>     def f(n):
>>     if n > 0:
>>     tailcall f(n - 1)
>>     return 0
>> '''
>> In additional paragraphs, he noted, among other things, that this makes
the feature 'opt-in' on a function by function basis.
>> Guido replied "Dan: your proposal has the redeeming quality of clearly
being a language feature rather than a possible optimization. I don't
really expect there to be enough demand to actually add this to the
language though. Maybe you can use macropy to play around with the idea
>> ???? then suggested 'return from'. My only contribution is to point out
the analogy with the new, and initially strange, 'yield from'.
>> Guido seems to have said that if a) someone tries out the idea with
macropy, and b) someone demonstrates enough demand, he might consider
adding such a feature. So this seems to me the best option to pursue to get
something into CPython. I also think it is the best proposal so far.
>> As for a), I have not looked as macropy, but:
>> On 1/19/2014 4:33 PM, Haoyi Li wrote:> MacroPy's @tco decorator is about
as easy as you could ask for. 'pip
>> > install macropy', 'from macropy.experimental.tco import macros, tco' >
is about as easy as you could ask for. Works for arbitrary tail-calls >
too, not just tail recursion.
>> That leaves b) for those of you who want the feature.
>> Any PEP should admit that the feature might be abused. Someone might
write
>>   return from len(composite)
>> Unless return from refuses to delete the frame making a call to a C
function, the effect would be to save a trivial O(1) space as the cost of
deleting the most important line of a stack trace should len() raise. But I
think this falls under the 'consenting adults' principle. A proposed doc
should make it clear that the intended use is to make deeply recursive or
mutually recursive functions run and not to replace all tail calls.
>> --
>> Terry Jan Reedy
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons.  Intellectual property is
> to the 21st century what the slave trade was to the 16th.

From jeanpierreda at  Mon Jan 20 11:26:27 2014
From: jeanpierreda at (Devin Jeanpierre)
Date: Mon, 20 Jan 2014 02:26:27 -0800
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 19, 2014 at 11:56 PM, Andrew Barnert <abarnert at> wrote:
> From: Daniel da Silva <var.mail.daniel at>
>>    Sets in mathematics can be defined by a list of elements without repetitions, and alternatively by a predicate (function) that determines inclusion.
> The whole point of modern set theory is that sets cannot be defined by a predicate alone; only by a predicate _and a set to apply it over_. Which we already have in set comprehensions.
> And your suggestion has the exact same problem that naive set theory had:
> >>> myset = predicateset(lambda s: s.startswith('a'))
> >>> 'xyz' in myset
> False
> >>> russellset = predicateset(lambda s: s not in s)
> >>> russellset in russellset
> Presumably this should cause the computer to scream "DOES NOT COMPUTE!" and blow up, which I think would be hard to implement in CPython.
> Still, this could be useful despite not being mathematically consistent.

No; what you have shown is that a predicateset can't both accept the
function you specified, and also have its containment method always
return a value (as opposed to raising an exception or not halting).
You have not shown that the idea of a predicateset is inherently
contradictory, unless that idea includes both of those facts -- and
that would indeed be silly, since, as you've shown, that is an idea
with self-contradicting requirements.
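
To make that concrete, here is a minimal sketch. The class name PredicateSet is hypothetical (it stands in for the predicateset used above), and only the membership test is implemented. Under those assumptions the Russell-style predicate does not produce a contradiction; the containment test simply never returns a value, which CPython reports as a RecursionError.

```python
class PredicateSet:
    """Hypothetical set-like object whose membership is decided by a predicate."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __contains__(self, item):
        # Membership is whatever the predicate says; it may raise or never return.
        return bool(self.predicate(item))


myset = PredicateSet(lambda s: s.startswith('a'))
print('abc' in myset)   # True
print('xyz' in myset)   # False

russellset = PredicateSet(lambda s: s not in s)
try:
    russellset in russellset   # the predicate asks the same question again, forever
except RecursionError:
    print("containment test did not return a value")
```

The object accepts the self-referential predicate, so it gives up the other requirement: that containment always returns.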

In contrast, naive set theory thought all of those things: a set can
be defined in that way, and a set either contains something or not,
but not neither and not both. And Russell proved that this is impossible.

There is not any kind of fundamental problem with the idea of a Python
set-like object defined by Python predicates. Python sets aren't
mathematical sets, and Python predicates aren't mathematical
predicates. Things can be different from how they are described in
mathematics, without being internally inconsistent, and without being

> The big problem is coming up with a compelling use case.
> In general, calling a function is just as easy, natural, and readable as testing membership; calling filter or using a comprehension would generally be simpler than creating a predicateset just to use intersection; etc.

Yes. If a predicate set is just a thin wrapper around predicates, it
is pointless. IMO the only utility of specially wrapping predicates is
allowing them to be combined efficiently, but the bulk of the work
there is just in manipulating sets of bitvectors (best done with
ROBDDs as far as I know). Arguably the work after that is trivial.

-- Devin

From denis.spir at  Mon Jan 20 13:12:31 2014
From: denis.spir at (spir)
Date: Mon, 20 Jan 2014 13:12:31 +0100
Subject: [Python-ideas] Tail Call Optimization -- natural? intuitive?
In-Reply-To: <>
References: <>
 <> <20140119004515.GP3915@ando>
Message-ID: <>

On 01/19/2014 11:32 PM, musicdenotation at wrote:
> It fits peoples' brains more because of familiarity, not "nature".

That people often use "intuitive" or "natural" instead of "familiar" or "usual" 
does not mean, logically, that there is no better intuitive or natural choice. 
That people misuse a term does not imply it has no proper meaning.

For instance, closed intervals are more intuitive or natural, obviously (but for 
some reason I don't know). If you ask someone to count from 1 to 9, you will 
probably be surprised to hear him/her start from 2 or stop after 8. If you are 
asked to choose a letter between c and g, you will probably be surprised to hear 
that 'c' or 'g' is no good choice.

[This does not mean that closed intervals are the right choice in programming; 
I'm just discussing the notions of intuitive or natural, which relate to the 
way we spontaneously think or understand. Programming may require unintuitive or 
unnatural design choices, for some other, independent reasons; dunno. For that 
matter, I think the right choice may be neither [i,j] closed nor [i,k[ 
half-closed intervals, but (i,n) ranges, where n is the number of items.]
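
The three notations can be compared side by side in a small sketch. The helper names closed, half_open and counted are made up for illustration; each one is expressed with Python's existing half-open slices.

```python
def closed(seq, i, j):
    """Items i..j, both ends included: the [i,j] notation."""
    return seq[i:j + 1]

def half_open(seq, i, k):
    """Items i..k-1, where k is the past-the-end index: the [i,k[ notation."""
    return seq[i:k]

def counted(seq, i, n):
    """n items starting at index i: the (i,n) notation."""
    return seq[i:i + n]

letters = "abcdefghij"
# All three calls select the same five letters, 'c' through 'g'.
print(closed(letters, 2, 6))     # cdefg
print(half_open(letters, 2, 7))  # cdefg
print(counted(letters, 2, 5))    # cdefg
```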

About the case of recursion, and whether it is intuitive or natural, I think 
(see some previous post) that it is very, very hard to judge. It is so abstract, 
and obviously difficult to grasp. It requires understanding recurrence (remember 
the difficulty most people had with it at school?) and then turning it inside 
out *in mind* like a sock ;-), to produce an algorithm running backwards, while 
still understanding that it will do the right thing (because in fact it computes 
forwards behind the scenes, which is totally implicit, and again hard to get).

About optimisation of tail calls, I share Guido's "pronouncement". Mainly 
because these optimisable (backward) recursive algorithms are the ones one can 
easily express as a forward algorithm (using loops and/or corecursion), if I 
understand the issue well (of which I'm not 100% sure, but I don't know any 
counter-example). The issue of stack traces and programmer feedback is just 
another reason for me (not decisive, because such algorithms often require 
inserting debug prints anyway, to understand what actually happens and/or 
diagnose a bug).
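
As an illustration of that claim, here is a sketch with Euclid's algorithm, chosen only as an example (it does not come from the thread). The recursive version ends in a tail call; the loop version does the same work by rebinding the arguments.

```python
def gcd_recursive(a, b):
    # The recursive call is the last thing done: a tail call.
    if b == 0:
        return a
    return gcd_recursive(b, a % b)

def gcd_loop(a, b):
    # Same computation: the tail call becomes a rebinding of the arguments.
    while b != 0:
        a, b = b, a % b
    return a

print(gcd_recursive(1071, 462))  # 21
print(gcd_loop(1071, 462))       # 21
```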


From rosuav at  Mon Jan 20 14:09:35 2014
From: rosuav at (Chris Angelico)
Date: Tue, 21 Jan 2014 00:09:35 +1100
Subject: [Python-ideas] Tail Call Optimization -- natural? intuitive?
In-Reply-To: <>
References: <>
 <20140119004515.GP3915@ando> <lbgfeq$kok$>
Message-ID: <>

On Mon, Jan 20, 2014 at 11:12 PM, spir <denis.spir at> wrote:
> For instance, closed intervals are more intuitive or natural, obviously (but
> for some reason I don't know). If you ask someone to count from 1 to 9, you
> will probably be surprised to hear him/her start from 2 or stop after 8. If
> you are asked to choose a letter between c and g, you will probably be
> surprised to hear that 'c' or 'g' is no good choice.

I'm not so sure about that. The half-open interval makes as much sense
as the fully closed - all you have to do is interpret the indices as
being *between* elements. Take, for example, Scripture verses. (Quotes
© 1973, 1978, 1984, 2011 by Biblica, Inc.® Used by permission. All
rights reserved worldwide. Copyright notice included for license
compliance. Note that I'm using bracketed numbers to indicate the
beginnings of verses - in a printed Bible, these would normally be in
superscript.)
John 14:
[31] To the Jews who had believed him, Jesus said, "If you hold to my
teaching, you are really my disciples. [32] Then you will know the
truth, and the truth will set you free." [33]

This passage is normally referred to as "John 14:31-32", but as you
see, the verse marker [32] is in the middle of the quote. Using a
half-open interval, this would start at "John 14:31" and end at "John
14:33". Half-open means: "Begin at the beginning, go on till you come
to the end, then stop", as the King of Hearts instructed the White
Rabbit.

It's easy to indicate the beginning of a chapter: your start reference
is verse 1. Here's the beginning of the account of the creation of the
world:

[1] In the beginning God created the heavens and the earth. [2] Now
the earth was formless and empty, darkness was over the surface of the
deep, and the Spirit of God was hovering over the waters. [3] And God
said, "Let there be light," and there was light. [4] God saw that the
light was good, and he separated the light from the darkness. [5] God
called the light "day," and the darkness he called "night." And there
was evening, and there was morning -- the first day. [6]

Common parlance: Genesis 1:1-5. Half-open: Genesis 1:1-6. Conclusion:
Tie. No argument to be made for either side. But what if you're
looking at the *end* of a chapter? Here are a few verses from later on
in Genesis 1:

[29] Then God said, "I give you every seed-bearing plant on the face
of the whole earth and every tree that has fruit with seed in it. They
will be yours for food. [30] And to all the beasts of the earth and
all the birds of the air and all the creatures that move on the
ground -- everything that has the breath of life in it -- I give every green
plant for food." And it was so. [31] God saw all that he had made, and
it was very good. And there was evening, and there was morning -- the
sixth day.

Common parlance: Genesis 1:29-31. Half-open: Genesis 1:29-2:1. It's
much more obvious by the latter that this passage extends exactly to
the end of the chapter.

Obviously it's way WAY too late to change the way Bible references are
written, any more than Melway could renumber their maps all of a
sudden. Massive case of lock-in and backward-incompatibility with
existing code. But I put it to you that the half-open would make at
least as much sense as the closed, in any situation where there are
boundaries with contents between them.
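
Python's own slices follow this "boundaries with contents between them" reading, which a short sketch can show. The list below is placeholder data, not real verse text.

```python
verses = ["v1", "v2", "v3", "v4", "v5", "v6"]   # placeholder items

a, b, c = 0, 2, 5          # boundaries *between* items, not item numbers

first = verses[a:b]        # items between boundary 0 and boundary 2
second = verses[b:c]       # starts exactly where the first range stopped

assert len(first) == b - a             # length is just the difference
assert first + second == verses[a:c]   # adjacent ranges join with no overlap or gap
assert verses[c:] == ["v6"]            # "to the end" needs no special last index
```

With closed intervals the second range would have to start at b + 1, and the length would be j - i + 1.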

Note, by the way, that I'm not looking at anything involving backward
scanning or wider strides, both of which Python's slice notation
supports. Neither of those is inherently real-world intuitive, so the
exact semantics can be defined as whatever makes sense in code. (And
there was some discussion a little while ago about exactly that.) I'm
just looking at the very simple and common case of referencing a
subset of consecutive elements from a much larger whole.

The closed interval makes more sense when the indices somehow *are*
the values being retrieved. When you count from 1 to 9, you expect
nine numbers: 1, 2, ..., 8, 9. When you list odd numbers from 1 to 9,
you expect 1, 3, 5, 7, 9. But what if you're looking at a container
train and numbering the twenty-foot-equivalent-units (TEU) that it
has? A 40-foot container requires 2 TEU, a 60-foot container requires
3 TEU. A "reefer" (refridgerated container) might require an extra
slot, or at least it might be a 56-footer and consume 3 TEU. One wagon
might, if you're lucky, carry 5 TEU; numbering them 1 through 5 would
be obvious, but numbering the boundaries between them as 0 through 5
is better at handling the multiple TEU containers. (Even more so when
you look at double-stacked containers. An over-height 40-foot
container could consume 2 TEU horizontally and 2 TEU vertically, and
be put in slots (0,0)-(2,2). This is, in fact, exactly how a GTK2
Table layout works.) Both types of intervals have their places.


From denis.spir at  Mon Jan 20 15:29:47 2014
From: denis.spir at (spir)
Date: Mon, 20 Jan 2014 15:29:47 +0100
Subject: [Python-ideas] Tail Call Optimization -- natural? intuitive?
In-Reply-To: <>
References: <>
 <> <20140119004515.GP3915@ando>
 <> <>
Message-ID: <>

On 01/20/2014 02:09 PM, Chris Angelico wrote:
> On Mon, Jan 20, 2014 at 11:12 PM, spir <denis.spir at> wrote:
>> For instance, closed intervals are more intuitive or natural, obviously (but
>> for some reason I don't know). If you ask someone to count from 1 to 9, you
>> will probably be surprised to hear him/her start from 2 or stop after 8. If
>> you are asked to choose a letter between c and g, you will probably be
>> surprised to hear that 'c' or 'g' is no good choice.
> I'm not so sure about that. The half-open interval makes as much sense
> as the fully closed - all you have to do is interpret the indices as
> being *between* elements. Take, for example, Scripture verses. (Quotes
> © 1973, 1978, 1984, 2011 by Biblica, Inc.® Used by permission. All
> rights reserved worldwide. Copyright notice included for license
> compliance. Note that I'm using bracketed numbers to indicate the
> beginnings of verses - in a printed Bible, these would normally be in
> superscript.)
> John 14:
> [31] To the Jews who had believed him, Jesus said, "If you hold to my
> teaching, you are really my disciples. [32] Then you will know the
> truth, and the truth will set you free." [33]
> This passage is normally referred to as "John 14:31-32", but as you
> see, the verse marker [32] is in the middle of the quote. Using a
> half-open interval, this would start at "John 14:31" and end at "John
> 14:33". Half-open means: "Begin at the beginning, go on till you come
> to the end, then stop", as the King of Hearts instructed the White
> Rabbit.
> It's easy to indicate the beginning of a chapter: your start reference
> is verse 1. Here's the beginning of the account of the creation of the
> world:
> [1] In the beginning God created the heavens and the earth. [2] Now
> the earth was formless and empty, darkness was over the surface of the
> deep, and the Spirit of God was hovering over the waters. [3] And God
> said, "Let there be light," and there was light. [4] God saw that the
> light was good, and he separated the light from the darkness. [5] God
> called the light "day," and the darkness he called "night." And there
> was evening, and there was morning -- the first day. [6]
> Common parlance: Genesis 1:1-5. Half-open: Genesis 1:1-6. Conclusion:
> Tie. No argument to be made for either side. But what if you're
> looking at the *end* of a chapter? Here are a few verses from later on
> in Genesis 1:
> [29] Then God said, "I give you every seed-bearing plant on the face
> of the whole earth and every tree that has fruit with seed in it. They
> will be yours for food. [30] And to all the beasts of the earth and
> all the birds of the air and all the creatures that move on the
> ground -- everything that has the breath of life in it -- I give every green
> plant for food." And it was so. [31] God saw all that he had made, and
> it was very good. And there was evening, and there was morning -- the
> sixth day.
> Common parlance: Genesis 1:29-31. Half-open: Genesis 1:29-2:1. It's
> much more obvious by the latter that this passage extends exactly to
> the end of the chapter.

I do agree with your reasoning, it is indeed totally logical. However, it is not 
at all intuitive or natural (maybe this is why Bible refs do not work your way 
;-) dunno).

This is probably related to the issue of program indices interpreted as ordinals 
[*] or offsets. Apparently, obviously in fact, people intuitively or naturally 
interpret them as ordinals; which breaks your logic or conflicts with it. 
Whether it's "much more obvious" (quoting you in the last paragraph above) is 
also a question of how you interpret indices: if they're ordinals for you, then 
Genesis 1:29-31 is perfectly clear on where the ref'ed passage stops.

[*] "ordinal" in the mathematical or linguistic sense, meaning a natural number 
holding the rank of an item in a sequence (not python's ord())

> Obviously it's way WAY too late to change the way Bible references are
> written, any more than Melway could renumber their maps all of a
> sudden. Massive case of lock-in and backward-incompatibility with
> existing code. But I put it to you that the half-open would make at
> least as much sense as the closed, in any situation where there are
> boundaries with contents between them.
> Note, by the way, that I'm not looking at anything involving backward
> scanning or wider strides, both of which Python's slice notation
> supports. Neither of those is inherently real-world intuitive, so the
> exact semantics can be defined as whatever makes sense in code. (And
> there was some discussion a little while ago about exactly that.) I'm
> just looking at the very simple and common case of referencing a
> subset of consecutive elements from a much larger whole.
> The closed interval makes more sense when the indices somehow *are*
> the values being retrieved.

You are right; see also note below on the case where [i,k[ is actually 
advantageous by itself.

> When you count from 1 to 9, you expect
> nine numbers: 1, 2, ..., 8, 9. When you list odd numbers from 1 to 9,
> you expect 1, 3, 5, 7, 9. But what if you're looking at a container
> train and numbering the twenty-foot-equivalent-units (TEU) that it
> has? A 40-foot container requires 2 TEU, a 60-foot container requires
> 3 TEU. A "reefer" (refridgerated container) might require an extra
> slot, or at least it might be a 56-footer and consume 3 TEU. One wagon
> might, if you're lucky, carry 5 TEU; numbering them 1 through 5 would
> be obvious, but numbering the boundaries between them as 0 through 5
> is better at handling the multiple TEU containers. (Even more so when
> you look at double-stacked containers. An over-height 40-foot
> container could consume 2 TEU horizontally and 2 TEU vertically, and
> be put in slots (0,0)-(2,2). This is, in fact, exactly how a GTK2
> Table layout works.) Both types of intervals have their places.

I also think there may be 2 kinds of notations for slices and such, one being 
[i,j] and the other maybe (i,n) where n is the number of items, rather than 
[i,k[ where k is the "post-last" or "past-the-end" index. Reasons to think along 
that path:

* since n is not an index, it avoids all thinking trouble and misinterpretations 
with k as opposed to j; in particular, it avoids the "intuitive conflict" evoked 
above

* n makes sense and is useful by itself (e.g. think of typical arrays {p,n} or 
slices/views {i,n}, or of algorithms for copy, compare, traversal, concat, map...)

* when [i,k[ works better than [i,j], most often it's because we have n (k=i+n) 
or need n (n=k-i), thus we avoid +1 or -1; this, rather than any worth of k by 
itself

* other cases where [i,k[ seems to work nicely are "self-feeding" in fact: we 
have & need [i,k[ just because the language uses that, but the same would be true 
whatever the interval notation (e.g. the language returns i,k from a builtin 
function searching for something in a sequence, and we then use it to get a 
subsequence)

* the only advantage of k by itself, logically, I think, is when scanning a 
non-terminated token (e.g. a number): we must pass the last item (digit) to know 
the token is finished, thus end up holding i & k, not i & j; however, if we use 
(i,n) notation, it's easy enough to write (i,k-i), so no big deal, just as in 
the opposite case; and this situation is obviously, I guess, a small minority 
of the uses of intervals [1]

Anyway, I think only practice with the alternatives and talk among 
non-ideologically blinded programmers can tell us what's worthwhile or not.


[1] However, from a semantic point of view, [i,k[ is problematic even in this 
very case where it seems nicer at first sight, because we don't need to type -1. 
Say we're scanning for a number and there is "1234567" in source:
	<-----> n
	i     jk
We get i & k. If the lang uses [i,k[ intervals, then we just write it that way 
to get the right substring, and are pleased not to have to type -1. However, the 
"semantic truth" (if I may say) is that we stopped scanning *after* the last 
digit, and need to slice up to the *previous* character. This is not written in 
s[i,k]; where is the idea "up to the previous position" expressed in this 
notation? "Previous" translates to -1 in arithmetic or programming. For the 
notation to be semantically correct, it should say "-1" somewhere. And this is 
why closed intervals s[i,k-1] are superior, from the semantic perspective, even 
in this case, the very case where half-open intervals superficially look nicer. 
Half-open intervals do not say what they mean, so-to-say, they cheat ;-) (booh!)
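
The scanning case can be sketched as follows. The function scan_number and the sample string are illustrative assumptions; the point is only to show i, j, k and n side by side.

```python
def scan_number(s, i):
    """Return k, the index just past the run of digits starting at s[i]."""
    k = i
    while k < len(s) and s[k].isdigit():
        k += 1
    return k

source = "x = 1234567;"
i = source.index("1")
k = scan_number(source, i)   # scanning stops *after* the last digit
j = k - 1                    # index of the last digit itself
n = k - i                    # number of digits

print(source[i:k])       # half-open [i,k[  -> 1234567
print(source[i:j + 1])   # closed [i,j]     -> 1234567
print(source[i:i + n])   # counted (i,n)    -> 1234567
```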

A related point (semantics, thinking) is that, as far as I know, many 
programmers in languages using [i,k[ just do *not* think about it. They just 
know from experience that it works in most cases (for the reasons listed above) 
but do not reason, for instance in this case, along the lines: "all right, we 
stop after the last digit, thus need to slice up to the previous position, thus 
a half-open interval is right here". No, they seem to do it blindly, like an 
automaton. I asked other programmers about that when I noticed it was true of me 
(I use it blindly, and don't know in a given case why/how it works unless I stop 
and *start* to think). I just prefer to be a programmer who thinks rather than a 
coding machine, but that's just me.

From rosuav at  Mon Jan 20 15:50:23 2014
From: rosuav at (Chris Angelico)
Date: Tue, 21 Jan 2014 01:50:23 +1100
Subject: [Python-ideas] Tail Call Optimization -- natural? intuitive?
In-Reply-To: <>
References: <>
 <20140119004515.GP3915@ando> <lbgfeq$kok$>
Message-ID: <>

On Tue, Jan 21, 2014 at 1:29 AM, spir <denis.spir at> wrote:
> I also think there may be 2 kinds of notations for slices and such, one
> beeing [i,j] and the other maybe (i,n) where n is the number of items,
> rather than [i,k[ where k is the "post-last" or "past-the-end" index.

This is why REXX has the "DO... FOR" loop syntax. You can code a loop thus:

do i=1 to 5 /* 1, 2, 3, 4, 5 */

do i=1 to 5 by 2 /* 1, 3, 5 */

do i=1 by 2 for 6 /* 1, 3, 5, 7, 9, 11 */

The 'for N' criterion specifies the number of iterations to do,
regardless of the stop position. (REXX doesn't have slice notation, so
loops are the nearest equivalent.) It would be quite reasonable to
create a slice-like object in Python, but I'm not sure how to put all
of this functionality into syntax that's tight enough to be useful -
nobody wants to write foo[slice(1,None,2,count=5)] !
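
For comparison, the three REXX loops above can already be spelled in Python with range and itertools. The helper counted_slice at the end is hypothetical and only handles a positive step; it shows one way a count-limited slice could be built from an ordinary one.

```python
from itertools import count, islice

# do i=1 to 5          -> closed range, so the stop is 5 + 1 in Python
print(list(range(1, 5 + 1)))          # [1, 2, 3, 4, 5]

# do i=1 to 5 by 2
print(list(range(1, 5 + 1, 2)))       # [1, 3, 5]

# do i=1 by 2 for 6    -> start and step, limited by a count rather than a stop
print(list(islice(count(1, 2), 6)))   # [1, 3, 5, 7, 9, 11]

def counted_slice(seq, start, step, n):
    """Hypothetical helper: n items of seq, beginning at start, stepping by step."""
    return seq[start:start + step * n:step]

print(counted_slice(list(range(100)), 1, 2, 5))   # [1, 3, 5, 7, 9]
```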


From denis.spir at  Mon Jan 20 16:29:39 2014
From: denis.spir at (spir)
Date: Mon, 20 Jan 2014 16:29:39 +0100
Subject: [Python-ideas] return from -- breadth of usage
In-Reply-To: <lbhm77$aei$>
References: <3426697229381222197@unknownmsgid>
 <> <lbgaob$421$>
Message-ID: <>

I think tail calls are very common. Consider the following examples:

def perform (input):	# a "action"
     data = prepare(input)
     process(data)   # tail call

def result (input):	# a "function" properly speaking
     data = prepare(input)
     return process(data)   # tail call

def case1 (input):
     if cond(input):
         <deal with common case in place>
     deal_with_special_case()    # tail call

def case2 (input):
     if cond(input):
         <deal with special case in place>
     deal_with_common_case()     # tail call

def perform_cases (input):
     if cond1(input):
         case1(input)    # tail call
     elif cond2(input):
         case2(input)    # tail call
     elif cond3(input):
         case3(input)    # tail call

def result_cases (input):
     if cond1(input):
         return case1(input)    # tail call
     elif cond2(input):
         return case2(input)    # tail call
     elif cond3(input):
         return case3(input)    # tail call

There are probably many more typical *schemas* of common tail call use cases. In 
any case, tail calls are very frequent, serve quite varied purposes, and are not 
specific to functional or functional-like programming. Instead, we all use tail 
calls constantly, without even thinking about it, just like we constantly speak 
in prose ;-). [1]

My point of view is not that a tail call is a special (maybe very rare) kind 
of call, but that there are 2 kinds of calls, maybe of equal importance:
* delegation: another proc is passed the responsibility of performing a task, or 
achieving the rest of it (tail call)
* assistance: another proc is used to assist in a main task, still controlled 
and assumed by the main proc (sub call)

I guess there are 2 main situations of delegation (tail calls):
* the main proc sorts out cases and delegates in some or all cases
* the main proc prepares the task and a delegate achieves it
which may be mixed. (I may be missing some, for sure.)

"return from" may well do the job, but imo it encourages wrong views about tail 
calls. Maybe "pass" would do the job better. When a delegate f performs an 
action (action, examples 'perform' & 'perform_cases' & 'case*' above), it can be 
interpreted as "pass the responsibility of the task to f", or just "pass by f". 
When a delegate f computes a result (function, examples 'result' & 
'result_cases' above) it can be interpreted as "pass f's result back to the 
caller". (There is a similar ambiguity with "return", which also matches a 
semantic ambiguity.)


[1] Allusion to

From rosuav at  Mon Jan 20 16:39:33 2014
From: rosuav at (Chris Angelico)
Date: Tue, 21 Jan 2014 02:39:33 +1100
Subject: [Python-ideas] return from -- breadth of usage
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
 <lbhm77$aei$> <>
Message-ID: <>

On Tue, Jan 21, 2014 at 2:29 AM, spir <denis.spir at> wrote:
> def perform (input):    # a "action"
>     data = prepare(input)
>     process(data)   # tail call
> def result (input):     # a "function" properly speaking
>     data = prepare(input)
>     return process(data)   # tail call

To Python, the second one could be a tail call, but the first one
isn't. It's really:

def perform (input):    # a "action"
    data = prepare(input)
    process(data)
    return None

If process() happens to return None, then it becomes a tail call, but
since Python has no way of knowing if this will be the case, it can't
optimize anything away. (Conversely, if the interpreter knew that
perform()'s return value was going to be ignored, the same
optimization could be made, but it can't assume that either.)

But if 'return from' syntax is added, I don't think it'll be much of
an issue to put explicit return statements in functions where you know
it'll always be None.

def perform (input):    # a "action"
    data = prepare(input)
    return from process(data) # now a tail call
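
Since 'return from' is not valid Python today, the effect can only be approximated. Below is a minimal trampoline sketch; the names TailCall and trampoline are hypothetical, and the approach is a well-known workaround rather than what the proposal would actually compile to. Returning a TailCall object stands in for 'return from f(args)'.

```python
class TailCall:
    """Hypothetical marker meaning: call func(*args) in place of my caller."""
    def __init__(self, func, *args):
        self.func = func
        self.args = args

def trampoline(func, *args):
    # Run func; while it asks for a tail call, perform it from this one loop,
    # so the Python stack does not grow with each delegated call.
    result = func(*args)
    while isinstance(result, TailCall):
        result = result.func(*result.args)
    return result

def is_even(n):
    if n == 0:
        return True
    return TailCall(is_odd, n - 1)    # stands in for: return from is_odd(n - 1)

def is_odd(n):
    if n == 0:
        return False
    return TailCall(is_even, n - 1)   # stands in for: return from is_even(n - 1)

print(trampoline(is_even, 100000))   # True
```

Written with plain 'return is_odd(n - 1)', the same mutually recursive pair would hit the recursion limit long before n reached 100000.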


From jonathan at  Mon Jan 20 16:57:48 2014
From: jonathan at (Jonathan Slenders)
Date: Mon, 20 Jan 2014 16:57:48 +0100
Subject: [Python-ideas] return from (was Re: Tail recursion elimination)
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

Interesting. I very much like the "return from" syntax. It's explicit and
consistent enough with "yield from".

When using coroutines, it currently also happens that at some points you
have a choice to drop certain frames from the stack. Take for instance the
following:

def a():
    result = yield from b() # 'b' is another coroutine
    return result

or often written as:

def a():
    return (yield from b())

You could write it as:

def a():
    return b()

In the last example, you delegate to another coroutine, removing 'a' from
the stack.
(see this discussion:!topic/python-tulip/5xW44wh5Krs )
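
The difference can be tried out with plain generators. In the sketch below, b is a stand-in coroutine and run is a hypothetical driver that plays the role of an event loop; neither is taken from any particular framework.

```python
def b():
    yield "step"          # stands in for waiting on some event
    return 42

def a_delegating():
    result = yield from b()   # a_delegating's frame stays alive while b runs
    return result

def a_direct():
    return b()                # plain function: hands back b's generator and is gone

def run(gen):
    """Drive a generator to completion and return its final value."""
    try:
        while True:
            next(gen)
    except StopIteration as stop:
        return stop.value

print(run(a_delegating()))   # 42
print(run(a_direct()))       # 42
```

Both give the same result to the caller; only the first keeps a frame for 'a' on the stack while 'b' is running.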

2014/1/20 Nick Coghlan <ncoghlan at>

> On 20 Jan 2014 11:16, "David Mertz" <mertz at> wrote:
> >
> > I was mostly disliking the idea of TCO during this discussion.  However,
> the idiom of 'return from' seems sufficiently elegant and explicit--and has
> exactly the semantics you'd expect from 'yield from'--that I am actually +1
> on that idea.
> I agree that a PEP for "return from" would be interesting. It also gives
> debuggers something to latch on to in order to handle the new scenario
> (just as they needed some adjustment to handle "yield from").
> "return from" could also be explicitly disallowed in try blocks and with
> statements (since those inherently conflict with the idea of reusing the
> current frame for a different call).
> By keeping a list of references to the ellided calls (perhaps using counts
> for more efficient handling of recursive calls), you could even partially
> reconstruct the missing parts of the traceback.
> > Being an explicit construct, it definitely becomes a case of "consenting
> adults" not of implicit magic.  I.e. you are declaring right in the code
> that you don't expect to see a frame in a stack trace, which is fair
> enough.  I mean, if you *really* wanted to you could muck around with
> 'sys._getframe(N).f_whatever' already which would give inaccurate
> tracebacks too.  Probably there would be a way to removed frames from the
> stack even, using some such trick in current python.
> Yep, we do that (from C) in importlib to try to reduce the infrastructure
> noise in the tracebacks shown to users.
> Cheers,
> Nick.
> >
> >
> > On Sun, Jan 19, 2014 at 3:13 PM, Terry Reedy <tjreedy at> wrote:
> >>
> >> Proposal (mostly not mine): add 'return from f(args)', in analogy with
> 'yield from iterator', to return a value to the caller from an execution
> frame running f(args) (and either reuse or delete the frame that ran
> 'return from'). The function name 'f' would not have to match the name of
> the function being compiled, this would actually be TCO, even if it were
> nearly always used for recursive tail calls. That does mean that is would
> work for mutually tail recursive functions.
> >>
> >> On 1/19/2014 6:57 AM, Joao S. O. Bueno wrote:
> >>>
> >>> OTOH, since we are at it, we'd better check
> >>> 2009 BDLF's opinion on the subject:
> >>>
> >>>
> >>
> >>
> >> I read throught the comments and near the very end, in July 2013, Dan
> LaMotte said... '''
> >> Definitely seems to be complicated/impossible to determine a function
> is tail recursion 'compliant' statically in python, however, what if it
> were an 'opt in' feature that uses a different 'return' keyword?
> >>
> >>     def f(n):
> >>     if n > 0:
> >>     tailcall f(n - 1)
> >>     return 0
> >> '''
> >> In additional paragraphs, he noted, among other things, that this makes
> the feature 'opt-in' on a function by function basis.
> >>
> >> Guido replied "Dan: your proposal has the redeeming quality of clearly
> being a language feature rather than a possible optimization. I don't
> really expect there to be enough demand to actually add this to the
> language though. Maybe you can use macropy to play around with the idea
> though?"
> >>
> >> ???? then suggested 'return from'. My only contribution is to point out
> the analogy with the new, and initially strange, 'yield from'.
> >>
> >> Guido seems to have said that if a) someone tries out the idea with
> macropy, and b) someone demonstrates enough demand, he might consider
> adding such a feature. So this seems to me the best option to pursue to get
> something into CPython. I also think it is the best proposal so far.
> >>
> >> As for a), I have not looked as macropy, but:
> >> On 1/19/2014 4:33 PM, Haoyi Li wrote:> MacroPy's @tco decorator is
> about as easy as you could ask for. 'pip
> >> > install macropy', 'from macropy.experimental.tco import macros, tco'
> > is about as easy as you could ask for. Works for arbitrary tail-calls >
> too, not just tail recursion.
> >>
> >> That leaves b) for those of you who want the feature.
> >>
> >> Any PEP should admit that the feature might be abused. Someone might
> write
> >>   return from len(composite)
> >> Unless return from refuses to delete the frame making a call to a C
> function, the effect would be to save a trivial O(1) space as the cost of
> deleting the most important line of a stack trace should len() raise. But I
> think this falls under the 'consenting adults' principle. A proposed doc
> should make it clear that the intended use is to make deeply recursive or
> mutually recursive functions run and not to replace all tail calls.
> >>
> >> --
> >> Terry Jan Reedy
> >>
> > --
> > Keeping medicines from the bloodstreams of the sick; food
> > from the bellies of the hungry; books from the hands of the
> > uneducated; technology from the underdeveloped; and putting
> > advocates of freedom in prisons.  Intellectual property is
> > to the 21st century what the slave trade was to the 16th.

From rymg19 at  Mon Jan 20 19:36:12 2014
From: rymg19 at (Ryan Gonzalez)
Date: Mon, 20 Jan 2014 12:36:12 -0600
Subject: [Python-ideas] Tail Call Optimization -- natural? intuitive?
In-Reply-To: <>
References: <>
 <20140119004515.GP3915@ando> <lbgfeq$kok$>
 <> <>
Message-ID: <>

+1 for the example you used.
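
For what it's worth, the half-open reading in the quoted message maps directly onto Python slices. A minimal sketch, assuming verse n is stored at index n - 1; the `verses` list and the `passage` helper are made up for illustration:

```python
# Hypothetical data: verse n lives at index n - 1.
verses = ["v1", "v2", "v3", "v4", "v5", "v6"]

def passage(start, stop):
    """Return verses start..stop-1, reading both numbers as boundary markers."""
    return verses[start - 1:stop - 1]

first_five = passage(1, 6)            # common parlance "1-5", half-open "1-6"
rest = passage(6, len(verses) + 1)    # runs exactly to the end of the chapter

# Adjacent half-open ranges join with no overlap and no gap,
# and the length of a range is simply stop - start.
assert first_five + rest == verses
assert len(passage(2, 5)) == 5 - 2
```

The stop marker of one passage is the start marker of the next, which is the property the "between elements" reading is after.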

On Mon, Jan 20, 2014 at 7:09 AM, Chris Angelico <rosuav at> wrote:

> On Mon, Jan 20, 2014 at 11:12 PM, spir <denis.spir at> wrote:
> > For instance, closed intervals are more intuitive or natural, obviously
> (but
> > for some reason I don't know). If you ask someone to count from 1 to 9,
> you
> > will probably be surprised to hear him/her start from 2 or stop after 8.
> If
> > you are asked to choose a letter between c and g, you will probably be
> > surprised to hear that 'c' or 'g' is no good choice.
> I'm not so sure about that. The half-open interval makes as much sense
> as the fully closed - all you have to do is interpret the indices as
> being *between* elements. Take, for example, Scripture verses. (Quotes
> © 1973, 1978, 1984, 2011 by Biblica, Inc. Used by permission. All
> rights reserved worldwide. Copyright notice included for license
> compliance. Note that I'm using bracketed numbers to indicate the
> beginnings of verses - in a printed Bible, these would normally be in
> superscript.)
> John 8:
> [31] To the Jews who had believed him, Jesus said, “If you hold to my
> teaching, you are really my disciples. [32] Then you will know the
> truth, and the truth will set you free.” [33]
> This passage is normally referred to as "John 8:31-32", but as you
> see, the verse marker [32] is in the middle of the quote. Using a
> half-open interval, this would start at "John 8:31" and end at "John
> 8:33". Half-open means: "Begin at the beginning, go on till you come
> to the end, then stop", as the King of Hearts instructed the White
> Rabbit.
> It's easy to indicate the beginning of a chapter: your start reference
> is verse 1. Here's the beginning of the account of the creation of the
> world:
> [1] In the beginning God created the heavens and the earth. [2] Now
> the earth was formless and empty, darkness was over the surface of the
> deep, and the Spirit of God was hovering over the waters. [3] And God
> said, “Let there be light,” and there was light. [4] God saw that the
> light was good, and he separated the light from the darkness. [5] God
> called the light “day,” and the darkness he called “night.” And there
> was evening, and there was morning--the first day. [6]
> Common parlance: Genesis 1:1-5. Half-open: Genesis 1:1-6. Conclusion:
> Tie. No argument to be made for either side. But what if you're
> looking at the *end* of a chapter? Here are a few verses from later on
> in Genesis 1:
> [29] Then God said, “I give you every seed-bearing plant on the face
> of the whole earth and every tree that has fruit with seed in it. They
> will be yours for food. [30] And to all the beasts of the earth and
> all the birds of the air and all the creatures that move on the
> ground--everything that has the breath of life in it--I give every green
> plant for food.” And it was so. [31] God saw all that he had made, and
> it was very good. And there was evening, and there was morning--the
> sixth day.
> Common parlance: Genesis 1:29-31. Half-open: Genesis 1:29-2:1. It's
> much more obvious by the latter that this passage extends exactly to
> the end of the chapter.
> Obviously it's way WAY too late to change the way Bible references are
> written, any more than Melway could renumber their maps all of a
> sudden. Massive case of lock-in and backward-incompatibility with
> existing code. But I put it to you that the half-open would make at
> least as much sense as the closed, in any situation where there are
> boundaries with contents between them.
> Note, by the way, that I'm not looking at anything involving backward
> scanning or wider strides, both of which Python's slice notation
> supports. Neither of those is inherently real-world intuitive, so the
> exact semantics can be defined as whatever makes sense in code. (And
> there was some discussion a little while ago about exactly that.) I'm
> just looking at the very simple and common case of referencing a
> subset of consecutive elements from a much larger whole.
> The closed interval makes more sense when the indices somehow *are*
> the values being retrieved. When you count from 1 to 9, you expect
> nine numbers: 1, 2, ..., 8, 9. When you list odd numbers from 1 to 9,
> you expect 1, 3, 5, 7, 9. But what if you're looking at a container
> train and numbering the twenty-foot-equivalent-units (TEU) that it
> has? A 40-foot container requires 2 TEU, a 60-foot container requires
> 3 TEU. A "reefer" (refrigerated container) might require an extra
> slot, or at least it might be a 56-footer and consume 3 TEU. One wagon
> might, if you're lucky, carry 5 TEU; numbering them 1 through 5 would
> be obvious, but numbering the boundaries between them as 0 through 5
> is better at handling the multiple TEU containers. (Even more so when
> you look at double-stacked containers. An over-height 40-foot
> container could consume 2 TEU horizontally and 2 TEU vertically, and
> be put in slots (0,0)-(2,2). This is, in fact, exactly how a GTK2
> Table layout works.) Both types of intervals have their places.
> ChrisA

When your hammer is C++, everything begins to look like a thumb.

From g.brandl at  Mon Jan 20 20:05:03 2014
From: g.brandl at (Georg Brandl)
Date: Mon, 20 Jan 2014 20:05:03 +0100
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <lbjrv4$als$>

Am 20.01.2014 08:56, schrieb Andrew Barnert:
> From: Daniel da Silva <var.mail.daniel at> Sent: Sunday, January 19,
> 2014 3:41 PM
>> Overview: Sets in mathematics can be defined by a list of elements without
>> repetitions, and alternatively by a predicate (function) that determines
>> inclusion.
> The whole point of modern set theory is that sets cannot be defined by a
> predicate alone; only by a predicate _and a set to apply it over_. Which we
> already have in set comprehensions.
> And your suggestion has the exact same problem that naive set theory had:
> >>> myset = predicateset(lambda s: s.startswith('a'))
> >>> 'xyz' in myset
> False
> >>> russellset = predicateset(lambda s: s not in s)
> >>> russellset in russelset
> Presumably this should cause the computer to scream "DOES NOT COMPUTE!" and
> blow up...

I think it will just raise a NameError...


From mertz at  Mon Jan 20 21:10:14 2014
From: mertz at (David Mertz)
Date: Mon, 20 Jan 2014 12:10:14 -0800
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <lbjrv4$als$>
References: <>
Message-ID: <>

Although a cute point, I'm not too concerned about the Russell's Paradox
issue.  The obvious implementation will get a "RuntimeError: maximum
recursion depth exceeded" in that case.  But then, no predicate is
guaranteed to halt, so that's not really special to the russellset.

On the other hand, even though I think the idea of a 'predicateset' is cute
mathematically, I'm not really sure what it actually gets you, even in

I am perfectly happy spelling this:

  mypset = predicateset(somefunc)
  if x in mypset: ...

as:

  if somefunc(x): ...

Even for the set operators, set comprehensions seem pretty much equally
good:

  such_that = {1, 2, 3} & mypset  # Looks nice, I agree

But then, this looks pretty nice also:

  such_that = {x for x in {1, 2, 3} if somefunc(x)}

OK, sure the predicateset version might save a few characters, but not all
that many.

If you want to combine predicate sets that's really just like combining
predicates.  It *does* sort of remind me that I'd like some standard HOFs
as builtins or in the standard library (probably in functools).  But still,
where you might write:

  in_both_sets = mypset & mypset2

It's not bad to write a small support module:

  def allP(*fns):
      return lambda x: all(f(x) for f in fns)

  def anyP(*fns):
      return lambda x: any(f(x) for f in fns)

Then express the intersection as:

  in_both_pred = allP(somefunc, somefunc2)

From there, you can just use the predicate 'in_both_pred' as above.
Similarly for union, define:

  in_either_pred = anyP(somefunc, somefunc2)
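
Putting the pieces above together, here is a small runnable sketch. The predicates `is_even` and `is_small` are hypothetical stand-ins for somefunc and somefunc2, and the `candidates` set is made up:

```python
def allP(*fns):
    # True only if every predicate accepts x (intersection of predicates).
    return lambda x: all(f(x) for f in fns)

def anyP(*fns):
    # True if at least one predicate accepts x (union of predicates).
    return lambda x: any(f(x) for f in fns)

# Hypothetical predicates standing in for somefunc and somefunc2.
def is_even(n):
    return n % 2 == 0

def is_small(n):
    return n < 10

in_both_pred = allP(is_even, is_small)
in_either_pred = anyP(is_even, is_small)

candidates = {3, 4, 12, 15}
both = {x for x in candidates if in_both_pred(x)}
either = {x for x in candidates if in_either_pred(x)}
```

The combined predicates are ordinary functions, so they drop straight into a set comprehension, filter(), or a further allP/anyP call.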

On Mon, Jan 20, 2014 at 11:05 AM, Georg Brandl <g.brandl at> wrote:

> Am 20.01.2014 08:56, schrieb Andrew Barnert:
> > From: Daniel da Silva <var.mail.daniel at> Sent: Sunday, January
> 19,
> > 2014 3:41 PM
> >
> >
> >> Overview: Sets in mathematics can be defined by a list of elements
> without
> >> repetitions, and alternatively by a predicate (function) that determines
> >> inclusion.
> >
> > The whole point of modern set theory is that sets cannot be defined by a
> > predicate alone; only by a predicate _and a set to apply it over_. Which
> we
> > already have in set comprehensions.
> >
> > And your suggestion has the exact same problem that naive set theory had:
> >
> > >>> myset = predicateset(lambda s: s.startswith('a'))
> > >>> 'xyz' in myset
> > False
> >
> > >>> russellset = predicateset(lambda s: s not in s)
> > >>> russellset in russelset
> >
> > Presumably this should cause the computer to scream "DOES NOT COMPUTE!"
> and
> > blow up...
> I think it will just raise a NameError...
> Georg

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From tjreedy at  Mon Jan 20 21:16:06 2014
From: tjreedy at (Terry Reedy)
Date: Mon, 20 Jan 2014 15:16:06 -0500
Subject: [Python-ideas] return from -- breadth of usage
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
 <> <lbgaob$421$>
 <lbhm77$aei$> <>
Message-ID: <lbk05q$v15$>

On 1/20/2014 10:29 AM, spir wrote:
> I think tail call is very common

Yes, they are. That is why space-optimizing all tail calls, and 
destroying proper tracebacks for all tail calls, is gross over-kill. 
Saving space is only needed when recursion would make the stack space 
used grow without any particular bound. (Note that this is only an issue 
for practical implementations, not pure mathematics.)

The point of the 'tail call' proposal is to have the programmer 
explicitly say when space conservation is needed, instead of asking the 
interpreter to magically make that determination.
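
To make the space issue concrete, here is a rough sketch of the case where it matters. The names `count_down` and `count_down_loop` are made up, and since 'return from' is only proposed syntax, a hand-written loop stands in for what it would buy:

```python
import sys

def count_down(n):
    # The recursive call is a tail call: nothing remains to be done after it
    # returns, yet CPython still keeps one frame per level.
    if n == 0:
        return "done"
    return count_down(n - 1)

def count_down_loop(n):
    # Same computation in constant stack space, written by hand.
    while n != 0:
        n -= 1
    return "done"

depth = sys.getrecursionlimit() + 100

try:
    recursive_result = count_down(depth)
except RuntimeError:  # RecursionError (a RuntimeError subclass) on 3.5+
    recursive_result = None

loop_result = count_down_loop(depth)
```

Only functions like count_down, whose depth grows with the input, need the treatment; an ordinary tail call such as returning len(x) gains nothing and loses a traceback line.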

Terry Jan Reedy

From denis.spir at  Mon Jan 20 22:53:09 2014
From: denis.spir at (spir)
Date: Mon, 20 Jan 2014 22:53:09 +0100
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/20/2014 12:41 AM, Daniel da Silva wrote:
> Below is a description of a very simple but immensely useful class called a
> "predicate set". In combination with the set and list comprehensions they
> would allow another natural layer of reasoning with mathematical set logic
> in Python.
> In my opinion, a concept like this would be best located in the functools
> module.
> *Overview:*
>      Sets in mathematics can be defined by a list of elements without
> repetitions, and alternatively by a predicate (function) that determines
> inclusion. A predicate set would be a set-like class that is instantiated
> with a predicate function that is called to determine ``a in
> the_predicate_set''.
> >>> myset = predicateset(lambda s: s.startswith('a'))
> >>> 'xyz' in myset
> False
> >>> 'abc' in myset
> True
> >>> len(myset)
> Traceback (most recent call last):
>    [...]
> TypeError
> *Example Uses:*
> # Dynamic excludes in searching
> foo_files = search_files('foo', exclude=set(['a.out', 'Makefile']))
> bar_files = search_files('bar', exclude=predicateset(lambda fname: not
> fname.endswith('~'))) # exclude *~
> # Use in place of a set with an ORM
> validusernames = predicateset(lambda s: re.match('[a-zA-Z0-9]+', s))
> class Users(db.Model):
>      username = db.StringProperty(choices=validusernames)
>      password = db.StringProperty()

While the theoretical interest is clear, I don't see the actual point. A 
predicate set without any actual set (in the ordinary prog sense) is just a 
criterion function (the predicate) returning a logical true/false, right? (Note: 
any logical func, any logical expression on a variable, does define a predicate 
set, doesn't it?) So, we already have this builtin ;-).

>>> crit = lambda s: s.startswith('a')
>>> crit("xyz")
False
>>> crit("abc")
True

One could make a trivial class to build such constructs as objects and implement 
the 'in' operator for them.

class PredSet:
     def __init__ (self, crit):
         self.crit = crit
     def __contains__ (self, x):
         return self.crit(x)

crit = lambda s: s.startswith('a')
s = PredSet(crit)
print("xyz" in s, "abc" in s)

But I don't see any advantage in terms of clarity: crit(x) is as clear, isn't it.

One also could add an actual set to such objects, which would automagically put 
items inside, eg whenever they are checked via the criterion func. (Somewhat 
like string pools.)

class PredSet:
     def __init__ (self, crit):
         self.crit = crit
         self.items = set()
     def __contains__ (self, x):
         if self.crit(x):
             self.items.add(x)
             return True
         return False

s = PredSet(crit)
print("xyz" in s, "abc" in s, "ablah" in s)

Would certainly be nice, but I cannot see any usage. All in all, I guess I'm 
missing the actual point.
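
For completeness, the only thing such a class adds over a bare criterion function is operator spelling. A sketch of that, assuming we only want 'in', &, | and ~ (the operator methods are my own addition, not part of the original proposal, and the two example sets are made up):

```python
class PredSet:
    """Set-like wrapper around a predicate; supports 'in', &, | and ~ only."""

    def __init__(self, crit):
        self.crit = crit

    def __contains__(self, x):
        return bool(self.crit(x))

    def __and__(self, other):
        # 'other' may be a PredSet or any container that supports 'in'.
        return PredSet(lambda x: x in self and x in other)

    def __or__(self, other):
        return PredSet(lambda x: x in self or x in other)

    def __invert__(self):
        return PredSet(lambda x: x not in self)

# Hypothetical example sets.
starts_a = PredSet(lambda s: s.startswith('a'))
short = PredSet(lambda s: len(s) <= 3)

both = starts_a & short
either = starts_a | short
neither = ~either
```

There is deliberately no __len__ or __iter__: a predicate alone cannot enumerate its members.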


From abarnert at  Mon Jan 20 23:41:22 2014
From: abarnert at (Andrew Barnert)
Date: Mon, 20 Jan 2014 14:41:22 -0800
Subject: [Python-ideas] return from -- breadth of usage
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
 <> <lbgaob$421$>
 <lbhm77$aei$> <>
Message-ID: <>

On Jan 20, 2014, at 7:39, Chris Angelico <rosuav at> wrote:

> If process() happens to return None, then it becomes a tail call, but
> since Python has no way of knowing if this will be the case, it can't
> optimize anything away. (Conversely, if the interpreter knew that
> perform()'s return value was going to be ignored, the same
> optimization could be made, but it can't assume that either.)
> But if 'return from' syntax is added, I don't think it'll be much of
> an issue to put explicit return statements in functions where you know
> it'll always be None.

This is a great argument for not just the idea of the explicit syntax, but also the "return from" name. I hadn't thought about the fact that (non-functional-style) code often ignores a None return value and then returns None, which automatic TCO can't handle, but explicit can. And in that case, "return from" expresses exactly the right thing, just as it does in the recursive case.

From abarnert at  Mon Jan 20 23:52:37 2014
From: abarnert at (Andrew Barnert)
Date: Mon, 20 Jan 2014 14:52:37 -0800
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 20, 2014, at 2:26, Devin Jeanpierre <jeanpierreda at> wrote:

> There is not any kind of fundamental problem with the idea of a Python
> set-like object defined by Python predicates. Python sets aren't
> mathematical sets, and Python predicates aren't mathematical
> predicates. Things can be different from how they are described in
> mathematics, without being internally inconsistent, and without being
> useless.

I made the exact same point in the rest of the paragraph that you cut off, except I said that python functions aren't mathematical functions instead of saying predicates.

The original post was suggesting that Python should have predicateset because that's how mathematicians define sets. That is wrong--and, more importantly, irrelevant. Whether a predicateset class is useful or not has to do with its usefulness in writing and reading Python programs, and nothing else. Maybe I should have made the point about it being irrelevant first, and just mentioned the fact that it's wrong as a parenthetical comment. But I'm just too fond of the idea of being able to write a program that Captain Kirk or Zoe Heriot can use to blow up the computer after it takes over the world, which sadly Python does not yet have. (If I remember right, the computer Zoe did it to was programmed in Algol.)

From eric at  Tue Jan 21 01:59:37 2014
From: eric at (Eric V. Smith)
Date: Mon, 20 Jan 2014 19:59:37 -0500
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
 <> <>
 <> <>
Message-ID: <>

On 1/19/2014 10:40 PM, Bruce Leban wrote:
> I think the odds of Python getting
>         from __future__ import pony
> are slightly higher than there being a Python 2.8. I assume by "pony"
> you really mean what I'd like to have:
>         from __future__ import everything
> since my goal is to write Python 3 compatible code even though I'm
> temporarily stuck with Python 2 due to stack issues. The __future__
> imports makes it easier to write forward compatible code. As it is, I
> have to list the individual imports in every file and I also add:
>         range = xrange

It's unfortunate we didn't add this (and all other changed builtins) to
future_builtins in 2.7.


From steve at  Tue Jan 21 02:07:26 2014
From: steve at (Steven D'Aprano)
Date: Tue, 21 Jan 2014 12:07:26 +1100
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <20140121010724.GA3915@ando>

On Sun, Jan 19, 2014 at 11:56:49PM -0800, Andrew Barnert wrote:

> And your suggestion has the exact same problem that naive set theory had:

> >>> russellset = predicateset(lambda s: s not in s)
> >>> russellset in russelset
> Presumably this should cause the computer to scream "DOES NOT 
> COMPUTE!" and blow up, which I think would be hard to implement in 
> CPython.

It should just raise an exception. I leave implementation as an exercise 
for the reader :-)

This sort of thing is a staple of bad old science fiction, where the 
Hero would save the world by getting the super-intelligent Artificial 
Intelligence Doomsday Computer to calculate some variation of the above. 
But of course, a *truly* intelligent computer would merely say "I see 
what you did there. Good try, feeble meatbag, but not good enough" and 
launch the missiles.

> The big problem is coming up with a compelling use case. This one doesn't sell me:
>     bar_files = search_files('bar', exclude=predicateset(lambda fname: not fname.endswith('~')))

If it's a project on PyPI, the only use-case necessary is the author 
thinks it's cool.

> It seems like it would make more sense to have exclude take a function, so you could just write:
>     bar_files = search_files('bar', exclude=lambda fname: not fname.endswith('~'))

What if you want to filter according to multiple conditions? A tuple of 
functions makes sense. Add a helper function that tests against those 
multiple functions, and you're halfway to this PredicateSet. Adding 
set-like methods seems like overkill.


From at  Tue Jan 21 02:51:36 2014
From: at (Haoyi Li)
Date: Mon, 20 Jan 2014 17:51:36 -0800
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <20140121010724.GA3915@ando>
References: <>
Message-ID: <>

> What if you want to filter according to multiple conditions?

What's wrong with

lambda fname: func1(fname) and func2(fname) and func3(fname)


On Mon, Jan 20, 2014 at 5:07 PM, Steven D'Aprano <steve at>wrote:

> On Sun, Jan 19, 2014 at 11:56:49PM -0800, Andrew Barnert wrote:
> > And your suggestion has the exact same problem that naive set theory had:
> > >>> russellset = predicateset(lambda s: s not in s)
> > >>> russellset in russelset
> >
> > Presumably this should cause the computer to scream "DOES NOT
> > COMPUTE!" and blow up, which I think would be hard to implement in
> > CPython.
> It should just raise an exception. I leave implementation as an exercise
> for the reader :-)
> This sort of thing is a staple of bad old science fiction, where the
> Hero would save the world by getting the super-intelligent Artificial
> Intelligence Doomsday Computer to calculate some variation of the above.
> But of course, a *truly* intelligent computer would merely say "I see
> what you did there. Good try, feeble meatbag, but not good enough" and
> launch the missiles.
> > The big problem is coming up with a compelling use case. This one
> doesn't sell me:
> >
> >     bar_files = search_files('bar', exclude=predicateset(lambda fname:
> not fname.endswith('~')))
> If it's a project on PyPI, the only use-case necessary is the author
> thinks it's cool.
> > It seems like it would make more sense to have exclude take a function, so you
> could just write:
> >
> >     bar_files = search_files('bar', exclude=lambda fname: not
> fname.endswith('~'))
> What if you want to filter according to multiple conditions? A tuple of
> functions makes sense. Add a helper function that tests against those
> multiple functions, and you're halfway to this PredicateSet. Adding
> set-like methods seems like overkill.
> --
> Steven

From steve at  Tue Jan 21 03:30:29 2014
From: steve at (Steven D'Aprano)
Date: Tue, 21 Jan 2014 13:30:29 +1100
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <20140121023029.GB3915@ando>

On Mon, Jan 20, 2014 at 05:51:36PM -0800, Haoyi Li wrote:
> > What if you want to filter according to multiple conditions?
> What's wrong with
> lambda fname: func1(fname) and func2(fname) and func3(fname)

That is a single compound condition, not multiple conditions.

Think about a GUI application with a file selection dialog box, or a 
search utility. You might offer a rich set of filters, all optional, all 
selectable by the user at runtime:

[x] Hidden dot files .foo
[ ] Backup files foo~
[x] File extensions:
    [ ] Images
    [x] Text files
    [ ] Java code
    [x] Custom: [ zip,tar,foo,bar,baz ]
[x] File owner: [ steve               ]
[ ] Group:      [                     ]
[ ] Modified date between: [       ] and [       ]

etc. It's not practical to create one single giant filter function that 
looks like this:

def filter(name):
    head, ext = os.path.splitext(name)
    return ( 
            (show_hidden_dot_files and name.startswith('.')) 
            and (show_backup_tilde_files and name.endswith('~')) 
            and (show_images and ext in list_of_image_extensions)
            and ... 

It would be a pain to maintain and extend, and testing would be 
horrible. Better to have each setting provide a single filter function, 
then combine the active filters into a list:

def filter(name, list_of_filters):
    for f in list_of_filters:
        if not f(name):
            return False
    return True

One might even use a class to represent the list of filters, and give it 
"all" and "any" methods, and allow multiple lists to combine so you can 
say things like:

  "show the file if *all* of these conditions are true, or if *any* of 
  these different conditions are true, but not if *any* of these 
  conditions are true"

which of course is terribly overkill for a simple file selection dialog 
box, but might be useful for a more complex search engine.

None of this should be read as supporting the original request to add 
PredicateSet into the standard library. But I encourage the OP to write 
his own library and put it on PyPI.
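
A rough sketch of the filter-list class described above, just to show how little code it takes. The class name `FilterList`, the `show` helper and the sample filters are all hypothetical:

```python
class FilterList:
    """Hypothetical container for filter functions selected at runtime."""

    def __init__(self, filters=()):
        self.filters = list(filters)

    def add(self, f):
        self.filters.append(f)

    def all(self, name):
        # True if every active filter accepts the name.
        return all(f(name) for f in self.filters)

    def any(self, name):
        # True if at least one active filter accepts the name.
        return any(f(name) for f in self.filters)

def show(name, require, allow, reject):
    """Show if all of 'require' hold or any of 'allow' hold,
    but not if any of 'reject' hold."""
    if reject.any(name):
        return False
    return require.all(name) or allow.any(name)

require = FilterList([lambda n: n.endswith('.txt'),
                      lambda n: not n.startswith('.')])
allow = FilterList([lambda n: n == 'Makefile'])
reject = FilterList([lambda n: n.endswith('~')])

names = ['notes.txt', '.hidden.txt', 'Makefile', 'notes.txt~', 'a.out']
visible = [n for n in names if show(n, require, allow, reject)]
```

Each checkbox in the dialog would add or remove one small function, and each of those functions can be tested on its own.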


From greg.ewing at  Tue Jan 21 05:56:11 2014
From: greg.ewing at (Greg Ewing)
Date: Tue, 21 Jan 2014 17:56:11 +1300
Subject: [Python-ideas] Tail Call Optimization -- natural? intuitive?
In-Reply-To: <>
References: <>
 <> <20140119004515.GP3915@ando>
 <> <>
Message-ID: <>

> On Mon, Jan 20, 2014 at 7:09 AM, Chris Angelico <rosuav at 
> <mailto:rosuav at>> wrote:
>     Note, by the way, that I'm not looking at anything involving backward
>     scanning

That would be for when you were reading your Bible
text backwards, looking for hidden Satanic references.


From ericsnowcurrently at  Tue Jan 21 07:26:22 2014
From: ericsnowcurrently at (Eric Snow)
Date: Mon, 20 Jan 2014 23:26:22 -0700
Subject: [Python-ideas] Add an attribute spec descriptor.
Message-ID: <>

Here's something I've thought about off and on for a while.

Occasionally it would be useful to me to have a class attribute I can
use to represent an attribute that will exist on *instances* of the
class.  Properties provide that to an extent, but they are data
descriptors which means they will not defer to like-named instance
attributes.  However, a similar non-data descriptor would fit the
bill.

For the sake of clarity, here is a simple implementation that
demonstrates what I mean.  I know it's asking a lot <wink>, but try to
focus on the idea rather than the code.  I've posted a more complete
(and feature-rich) implementation online [1].

class Attr:
    """A non-data descriptor specifying an instance attribute."""
    def __init__(self, name, doc=None):
        self.__name__ = name
        self.__doc__ = doc
    def __get__(self, obj, cls):
        if obj is None:
            return self
        else:
            # The attribute wasn't found on the instance.
            raise AttributeError(self.__name__)

def attribute(f=None):
    """A decorator that converts a function into an attribute spec."""
    return Attr(f.__name__, f.__doc__)

def attrs(names):
    """A class decorator that adds the requested attribute specs."""
    def decorator(cls):
        for name in names:
            attr = Attr(name)
            setattr(cls, name, attr)
        return cls
    return decorator

Other features not shown here (see [1]):

* an optional "default" Attr value
* an optional "type" Attr (derived from f.__annotations__['return'])
* __qualname__
* auto-setting self.__name__ during the first Attr.__get__() call
* a nice repr
* Attr.from_func()
* proper ABC handling in attrs() (not an obvious implementation)
* optionally inheriting docstrings

Such a descriptor is particularly useful for at least 2 things:

1. indicating that an abstractproperty is "implemented" on *instances*
of a class
2. introspecting (on the class) all the attributes of instances of a class


* "just use a property".  As already noted, a property would work, but
is somewhat cumbersome in the case of writable attributes.  A non-data
descriptor is a more natural fit in that case.
* for #1, "just use a normal class attribute".  This would mostly
work.  However, doing so effectively sets a default value, which you
may not want.  Furthermore, it may not be clear to readers of the code
(or of help()) what the point of the class attr is.
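
To illustrate how the simple version behaves, here is a self-contained usage sketch. The `Point` class is a made-up example, and Attr and attrs are repeated in cut-down form so the snippet runs on its own:

```python
class Attr:
    """A non-data descriptor specifying an instance attribute."""
    def __init__(self, name, doc=None):
        self.__name__ = name
        self.__doc__ = doc
    def __get__(self, obj, cls):
        if obj is None:
            return self
        # Only reached when the instance has no attribute of this name.
        raise AttributeError(self.__name__)

def attrs(names):
    """A class decorator that adds the requested attribute specs."""
    def decorator(cls):
        for name in names:
            setattr(cls, name, Attr(name))
        return cls
    return decorator

@attrs(['x', 'y'])
class Point:  # hypothetical example class
    def __init__(self, x):
        self.x = x  # 'y' is deliberately left unset

p = Point(1)

# Introspection on the class: which instance attributes are declared?
specs = sorted(name for name, value in vars(Point).items()
               if isinstance(value, Attr))

# A non-data descriptor defers to the instance dict, so p.x just works,
# while the unset p.y still raises AttributeError as usual.
try:
    p.y
    y_missing = False
except AttributeError:
    y_missing = True
```

Because Attr defines no __set__, plain assignment to p.y later on stores the value in the instance dict and shadows the descriptor, which is the point of making it a non-data descriptor.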



[2] Where would Attr/attribute/attrs live in the stdlib?  inspect? types?

From greg.ewing at  Tue Jan 21 07:46:12 2014
From: greg.ewing at (Greg Ewing)
Date: Tue, 21 Jan 2014 19:46:12 +1300
Subject: [Python-ideas] return from (was Re: Tail recursion elimination)
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
 <> <lbgaob$421$>
Message-ID: <>

Jonathan Slenders wrote:

>     @coroutine
>     def a():
>         return (yield from b())
> You could write it as:
>     def a():
>         return b()

I'm guessing you mean

    def a():
       return from b()

but that wouldn't be a coroutine, because it doesn't
contain a 'yield' anywhere.


From jonathan at  Tue Jan 21 08:27:52 2014
From: jonathan at (Jonathan Slenders)
Date: Tue, 21 Jan 2014 08:27:52 +0100
Subject: [Python-ideas] return from (was Re: Tail recursion elimination)
In-Reply-To: <>
References: <3426697229381222197@unknownmsgid>
Message-ID: <>

No I didn't. Those examples that I wrote are equivalent, except that the
second will miss a frame on the stack.

2014/1/21 Greg Ewing <greg.ewing at>

> Jonathan Slenders wrote:
>      @coroutine
>>     def a():
>>         return (yield from b())
>> You could write it as:
>>     def a():
>>         return b()
> I'm guessing you mean
>    def a():
>       return from b()
> but that wouldn't be a coroutine, because it doesn't
> contain a 'yield' anywhere.
> --
> Greg

From flying-sheep at  Tue Jan 21 09:20:38 2014
From: flying-sheep at (Philipp A.)
Date: Tue, 21 Jan 2014 09:20:38 +0100
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
 <> <>
 <> <>
Message-ID: <>

you'll have to do quite a bit:

# -*- coding: utf-8 -*-
from __future__ import print_function, division, unicode_literals, absolute_import
from io import open

range = xrange
str = unicode
basestring = (str, bytes)  #for isinstance()

2014/1/21 Eric V. Smith <eric at>

> On 1/19/2014 10:40 PM, Bruce Leban wrote:
> > I think the odds of Python getting
> >
> >         from __future__ import pony
> >
> > are slightly higher than there being a Python 2.8. I assume by "pony"
> > you really mean what I'd like to have:
> >
> >         from __future__ import everything
> >
> > since my goal is to write Python 3 compatible code even though I'm
> > temporarily stuck with Python 2 due to stack issues. The __future__
> > imports makes it easier to write forward compatible code. As it is, I
> > have to list the individual imports in every file and I also add:
> >
> >         range = xrange
> It's unfortunate we didn't add this (and all other changed builtins) to
> future_builtins in 2.7.
> Eric.

From abarnert at  Tue Jan 21 09:27:45 2014
From: abarnert at (Andrew Barnert)
Date: Tue, 21 Jan 2014 00:27:45 -0800
Subject: [Python-ideas] Create Python 2.8 as a transition step to Python
In-Reply-To: <>
References: <> <lbd36b$87t$>
 <> <>
 <> <>
Message-ID: <>

On Jan 21, 2014, at 0:20, "Philipp A." <flying-sheep at> wrote:

> you'll have to do quite a bit:
> # -*- coding: utf-8 -*-
> from __future__ import print_function, division, unicode_literals, absolute_import
> from io import open
> range = xrange
> str = unicode
> basestring = (str, bytes)  #for isinstance()
Plus importing imap and ifilter as map and filter, and renaming modules in some way rather than just builtins, and of course you have to wrap half of that in a try and/or if sys.version_info check, or it won't run in 3.x, which defeats the purpose...

Which is why I create a project-specific module so I can just "from sixify import *" (along with the future statement, of course) at the top of every module, and it's all taken care of in two lines.

> 2014/1/21 Eric V. Smith <eric at>
>> On 1/19/2014 10:40 PM, Bruce Leban wrote:
>> > I think the odds of Python getting
>> >
>> >         from __future__ import pony
>> >
>> > are slightly higher than there being a Python 2.8. I assume by "pony"
>> > you really mean what I'd like to have:
>> >
>> >         from __future__ import everything
>> >
>> > since my goal is to write Python 3 compatible code even though I'm
>> > temporarily stuck with Python 2 due to stack issues. The __future__
>> > imports make it easier to write forward compatible code. As it is, I
>> > have to list the individual imports in every file and I also add:
>> >
>> >         range = xrange
>> It's unfortunate we didn't add this (and all other changed builtins) to
>> future_builtins in 2.7.
>> Eric.

From at  Tue Jan 21 09:50:18 2014
From: at (Haoyi Li)
Date: Tue, 21 Jan 2014 00:50:18 -0800
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <20140121023029.GB3915@ando>
References: <>
Message-ID: <>


Doesn't

*all(map(list_of_filters, value))*

do what we want here, then? Maybe *imap* if you want the early bailout
behavior.

I'm all for having *all*, *map*, *any*, etc. be methods rather than
top-level-functions (yay, less namespace pollution!), but if we're talking
about a list of functions, it seems we can do exactly what we want very
concisely using normal list- and function- operations.

On Mon, Jan 20, 2014 at 6:30 PM, Steven D'Aprano <steve at>wrote:

> On Mon, Jan 20, 2014 at 05:51:36PM -0800, Haoyi Li wrote:
> > > What if you want to filter according to multiple conditions?
> >
> > What's wrong with
> >
> > lambda fname: func1(fname) and func2(fname) and func3(fname)
> That is a single compound condition, not multiple conditions.
> Think about a GUI application with a file selection dialog box, or a
> search utility. You might offer a rich set of filters, all optional, all
> selectable by the user at runtime:
> [x] Hidden dot files .foo
> [ ] Backup files foo~
> [x] File extensions:
>     [ ] Images
>     [x] Text files
>     [ ] Java code
>     [x] Custom: [ zip,tar,foo,bar,baz ]
> [x] File owner: [ steve               ]
> [ ] Group:      [                     ]
> [ ] Modified date between: [       ] and [       ]
> etc. It's not practical to create one single giant filter function that
> looks like this:
> def filter(name):
>     head, ext = os.path.splitext(name)
>     return (
>             (show_hidden_dot_files and name.startswith('.'))
>             and (show_backup_tilde_files and name.endswith('~'))
>             and (show_images and ext in list_of_image_extensions)
>             and ...
>             )
> It would be a pain to maintain and extend, and testing would be
> horrible. Better to have each setting provide a single filter function,
> then combine the active filters into a list:
> def filter(name, list_of_filters):
>     for f in list_of_filters:
>         if not f(name):
>             return False
>     return True
> One might even use a class to represent the list of filters, and give it
> "all" and "any" methods, and allow multiple lists to combine so you can
> say things like:
>   "show the file if *all* of these conditions are true, or if *any* of
>   these different conditions are true, but not if *any* of these
>   conditions are true"
> which of course is terribly overkill for a simple file selection dialog
> box, but might be useful for a more complex search engine.
> None of this should be read as supporting the original request to add
> PredicateSet into the standard library. But I encourage the OP to write
> his own library and put it on PyPI.
> --
> Steven

From at  Tue Jan 21 09:58:01 2014
From: at (Haoyi Li)
Date: Tue, 21 Jan 2014 00:58:01 -0800
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <>

> *all(map(list_of_filters, value))*

Scratch that, what I actually want is

*all(map(lambda f: f(value), list_of_filters))*

I always mix up the order of things going into *map* =(
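
A small sketch of the early-bailout point: on Python 3, map is lazy, so all() stops calling filters at the first false result (on Python 2, map builds the whole list first, which is why imap was suggested). The make_filter helper and the filter names are made up for the example.

```python
calls = []

def make_filter(name, result):
    """Build a predicate that records its name each time it is called."""
    def predicate(value):
        calls.append(name)
        return result
    return predicate

list_of_filters = [
    make_filter("first", True),
    make_filter("second", False),
    make_filter("third", True),
]

value = "example.txt"
# map() is lazy on Python 3, so all() never reaches the third filter.
accepted = all(map(lambda f: f(value), list_of_filters))
```

After this runs on Python 3, accepted is False and calls holds only "first" and "second".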

On Tue, Jan 21, 2014 at 12:50 AM, Haoyi Li < at> wrote:

> Doesn't
> *all(map(list_of_filters, value))*
> do what we want here, then? Maybe *imap* if you want the early bailout
> behavior.
> I'm all for having *all*, *map*, *any*, etc. be methods rather than
> top-level-functions (yay, less namespace pollution!), but if we're talking
> about a list of functions, it seems we can do exactly what we want very
> concisely using normal list- and function- operations.
> On Mon, Jan 20, 2014 at 6:30 PM, Steven D'Aprano <steve at>wrote:
>> On Mon, Jan 20, 2014 at 05:51:36PM -0800, Haoyi Li wrote:
>> > > What if you want to filter according to multiple conditions?
>> >
>> > What's wrong with
>> >
>> > lambda fname: func1(fname) and func2(fname) and func3(fname)
>> That is a single compound condition, not multiple conditions.
>> Think about a GUI application with a file selection dialog box, or a
>> search utility. You might offer a rich set of filters, all optional, all
>> selectable by the user at runtime:
>> [x] Hidden dot files .foo
>> [ ] Backup files foo~
>> [x] File extensions:
>>     [ ] Images
>>     [x] Text files
>>     [ ] Java code
>>     [x] Custom: [ zip,tar,foo,bar,baz ]
>> [x] File owner: [ steve               ]
>> [ ] Group:      [                     ]
>> [ ] Modified date between: [       ] and [       ]
>> etc. It's not practical to create one single giant filter function that
>> looks like this:
>> def filter(name):
>>     head, ext = os.path.splitext(name)
>>     return (
>>             (show_hidden_dot_files and name.startswith('.'))
>>             and (show_backup_tilde_files and name.endswith('~'))
>>             and (show_images and ext in list_of_image_extensions)
>>             and ...
>>             )
>> It would be a pain to maintain and extend, and testing would be
>> horrible. Better to have each setting provide a single filter function,
>> then combine the active filters into a list:
>> def filter(name, list_of_filters):
>>     for f in list_of_filters:
>>         if not f(name):
>>             return False
>>     return True
>> One might even use a class to represent the list of filters, and give it
>> "all" and "any" methods, and allow multiple lists to combine so you can
>> say things like:
>>   "show the file if *all* of these conditions are true, or if *any* of
>>   these different conditions are true, but not if *any* of these
>>   conditions are true"
>> which of course is terribly overkill for a simple file selection dialog
>> box, but might be useful for a more complex search engine.
>> None of this should be read as supporting the original request to add
>> PredicateSet into the standard library. But I encourage the OP to write
>> his own library and put it on PyPI.
>> --
>> Steven

From storchaka at  Tue Jan 21 10:09:32 2014
From: storchaka at (Serhiy Storchaka)
Date: Tue, 21 Jan 2014 11:09:32 +0200
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <>
References: <>
Message-ID: <lbldg0$4qp$>

21.01.14 10:58, Haoyi Li wrote:
>  > *all(map(list_of_filters, value))*
> Scratch that, what I actually want is
> *all(map(lambda f: f(value), list_of_filters))*
> I always mix up the order of things going into *map* =(

     all(f(value) for f in list_of_filters)

looks cleaner to me.

Perhaps slightly more efficient (but much less readable) form:

     all(map(operator.methodcaller('__call__', value), list_of_filters))
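
As a quick check that the spellings discussed here agree, the sketch below runs all three over the same filters. The predicates is_positive and is_even are invented for the example.

```python
import operator

def is_positive(n):
    return n > 0

def is_even(n):
    return n % 2 == 0

list_of_filters = [is_positive, is_even]

def passes_genexp(value):
    return all(f(value) for f in list_of_filters)

def passes_lambda_map(value):
    return all(map(lambda f: f(value), list_of_filters))

def passes_methodcaller(value):
    # methodcaller('__call__', value)(f) evaluates f.__call__(value)
    call_with_value = operator.methodcaller("__call__", value)
    return all(map(call_with_value, list_of_filters))

# Each value maps to the answers from the three spellings, in order.
results = {
    v: (passes_genexp(v), passes_lambda_map(v), passes_methodcaller(v))
    for v in (4, 3, -2)
}
```

All three give the same answer for every value; the generator expression is simply the easiest to read.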

From oscar.j.benjamin at  Tue Jan 21 11:36:19 2014
From: oscar.j.benjamin at (Oscar Benjamin)
Date: Tue, 21 Jan 2014 10:36:19 +0000
Subject: [Python-ideas] return from (was Re: Tail recursion elimination)
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 21, 2014 at 07:46:12PM +1300, Greg Ewing wrote:
> Jonathan Slenders wrote:
> >    @coroutine
> >    def a():
> >        return (yield from b())
> >
> >You could write it as:
> >
> >    def a():
> >        return b()
> I'm guessing you mean
>    def a():
>       return from b()
> but that wouldn't be a coroutine, because it doesn't
> contain a 'yield' anywhere.

If b() is a generator/iterator then the second example removes the frame
associated with a() from the stack while you iterate:

for x in a():
    # one less frame on the stack at this point
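
A rough way to see the difference is to count frames from inside b(). This sketch relies on sys._getframe, which is a CPython implementation detail, and the helper names (stack_depth, a_delegating, a_direct, measure) are made up for the example.

```python
import sys

def stack_depth():
    """Count the frames from the caller of this function down the stack."""
    frame = sys._getframe(1)
    depth = 0
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth

def b():
    # Report the stack depth as seen from inside b's own frame.
    yield stack_depth()

def a_delegating():
    # a's frame stays on the stack while b runs.
    return (yield from b())

def a_direct():
    # a's frame is gone as soon as the generator object is returned.
    return b()

def measure():
    delegated = next(a_delegating())
    direct = next(a_direct())
    return delegated, direct

delegated_depth, direct_depth = measure()
```

On CPython the delegating version should report one frame more than the direct one.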


From ncoghlan at  Tue Jan 21 12:59:34 2014
From: ncoghlan at (Nick Coghlan)
Date: Tue, 21 Jan 2014 21:59:34 +1000
Subject: [Python-ideas] Add an attribute spec descriptor.
In-Reply-To: <>
References: <>
Message-ID: <>

In selling this idea, I would focus on the immediate impact it could have
on "help(cls)", as well as the automated testing possibilities (checking
all attributes are set on an instance).

There's also the class-only descriptor behaviour we added for enums to
consider, where retrieval via an instance throws AttributeError.

Essentially - interesting idea, but one you can experiment with outside the
stdlib :)


From abarnert at  Tue Jan 21 13:27:36 2014
From: abarnert at (Andrew Barnert)
Date: Tue, 21 Jan 2014 04:27:36 -0800
Subject: [Python-ideas] Add `n_threads` argument to
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 21, 2014, at 4:20, Andrew Barnert <abarnert at> wrote:

> And this is very easy to solve: run the downloads on a thread pool, and as each one finishes, kick its post processing off to a process pool.

Wait, that's stupid. Even simpler: just use a flat process pool of 2N for everything (or whatever multiplier is appropriate for your load--although often a downloader doesn't want to do more than about 4-12 simultaneous downloads, which is already below 2N on most modern computers...).
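
Both arrangements can be sketched with concurrent.futures roughly as follows. The fake_download and post_process functions, the pool sizes, and the run_* names are placeholders invented for the example; a real downloader would substitute its own work functions.

```python
import hashlib
import os
import time
from concurrent.futures import (ProcessPoolExecutor, ThreadPoolExecutor,
                                as_completed)

def fake_download(item):
    """Stand-in for an I/O-bound download (hypothetical)."""
    time.sleep(0.01)
    return item.encode("utf-8")

def post_process(payload):
    """Stand-in for CPU-bound post-processing (hypothetical)."""
    return hashlib.sha256(payload).hexdigest()

def download_and_process(item):
    return post_process(fake_download(item))

def run_two_tier(items, n_threads=8, n_procs=4):
    """Download on a thread pool; hand each finished item to a process pool."""
    with ThreadPoolExecutor(n_threads) as threads, \
            ProcessPoolExecutor(n_procs) as procs:
        downloads = [threads.submit(fake_download, item) for item in items]
        jobs = [procs.submit(post_process, done.result())
                for done in as_completed(downloads)]
        return sorted(job.result() for job in jobs)

def run_flat(items, multiplier=2):
    """One flat process pool of 2N workers doing both steps per item."""
    workers = multiplier * (os.cpu_count() or 1)
    with ProcessPoolExecutor(workers) as procs:
        return sorted(procs.map(download_and_process, items))
```

Either runner should be called from under an "if __name__ == '__main__':" guard, since process pools may need to re-import the main module. Which one is faster depends on the load, so it is worth timing both.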

From abarnert at  Tue Jan 21 13:29:42 2014
From: abarnert at (Andrew Barnert)
Date: Tue, 21 Jan 2014 04:29:42 -0800
Subject: [Python-ideas] Fwd: Add `n_threads` argument to
References: <>
Message-ID: <>

Sent from a random iPhone

Begin forwarded message:

> From: Andrew Barnert <abarnert at>
> Date: January 21, 2014, 4:20:19 PST
> To: Ram Rachum <ram.rachum at>
> Cc: "python-ideas at" <python-ideas at>
> Subject: Re: [Python-ideas] Add `n_threads` argument to `concurrent.futures.ProcessPoolExecutor`
> On Jan 21, 2014, at 2:17, Ram Rachum <ram.rachum at> wrote:
>> If you're writing code that needs to use both a lot of IO and a lot of CPU. For example, you're downloading many items from the internet and then doing post-processing on them.
> Yes, but in that case, how could a single executor with n processes and m threads help at all? You can only have one thread per process doing CPU work; they're still going to end up blocking each other.
> And this is very easy to solve: run the downloads on a thread pool, and as each one finishes, kick its post processing off to a process pool.
> But you should be able to build the two-tier pool in under half an hour, and then you can test to find applications where it really does or doesn't help.
>> On Tue, Jan 21, 2014 at 10:42 AM, Andrew Barnert <abarnert at> wrote:
>>> On Jan 17, 2014, at 5:00, Ram Rachum <ram.rachum at> wrote:
>>> > Hi,
>>> >
>>> > I'd like to use `concurrent.futures.ProcessPoolExecutor` but have each process contain multiple worker threads. We could have an `n_threads` argument to the constructor, defaulting to 1 to maintain backward compatibility, and setting a value higher than 1 would cause multiple threads to be spawned in each process.
>>> What for?
>>> Generally you use processes because you can't use threads. Whether this is because you're running CPU-bound code that needs to get around the GIL, because you want complete isolation between tasks, because your platform doesn't support threads, or any other reason I can think of, you wouldn't want threads per process either.
>>> There are use cases for multiple processes of multiple threads, like running four independent IOCP-based servers (let them all try to use all your cores and let the kernel load balance among them), or isolated tasks with sharing-based subtasks... But those kinds of uses don't make sense in a single executor.

From abarnert at  Tue Jan 21 13:29:24 2014
From: abarnert at (Andrew Barnert)
Date: Tue, 21 Jan 2014 04:29:24 -0800
Subject: [Python-ideas] Fwd: Add `n_threads` argument to
References: <>
Message-ID: <>

Google apparently ate this message, and the next one, so... Forwarding them. Apologies for the mess. Apparently you can't just reply to messages that arrive on the list via Google Groups?

Sent from a random iPhone

Begin forwarded message:

> From: Andrew Barnert <abarnert at>
> Date: January 21, 2014, 0:42:11 PST
> To: Ram Rachum <ram.rachum at>
> Cc: "python-ideas at" <python-ideas at>
> Subject: Re: [Python-ideas] Add `n_threads` argument to `concurrent.futures.ProcessPoolExecutor`
> On Jan 17, 2014, at 5:00, Ram Rachum <ram.rachum at> wrote:
>> Hi,
>> I'd like to use `concurrent.futures.ProcessPoolExecutor` but have each process contain multiple worker threads. We could have an `n_threads` argument to the constructor, defaulting to 1 to maintain backward compatibility, and setting a value higher than 1 would cause multiple threads to be spawned in each process.
> What for? 
> Generally you use processes because you can't use threads. Whether this is because you're running CPU-bound code that needs to get around the GIL, because you want complete isolation between tasks, because your platform doesn't support threads, or any other reason I can think of, you wouldn't want threads per process either.
> There are use cases for multiple processes of multiple threads, like running four independent IOCP-based servers (let them all try to use all your cores and let the kernel load balance among them), or isolated tasks with sharing-based subtasks... But those kinds of uses don't make sense in a single executor.

From ethan at  Tue Jan 21 15:37:29 2014
From: ethan at (Ethan Furman)
Date: Tue, 21 Jan 2014 06:37:29 -0800
Subject: [Python-ideas] Add an attribute spec descriptor.
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/20/2014 10:26 PM, Eric Snow wrote:
> Occasionally it would be useful to me to have a class attribute I can
> use to represent an attribute that will exist on *instances* of the
> class.  Properties provide that to an extent, but they are data
> descriptors which means they will not defer to like-named instance
> attributes.  However, a similar non-data descriptor would fit the
> bill.

Have you checked out Lib/ ?

It may be worth building on that.


From rymg19 at  Tue Jan 21 18:45:42 2014
From: rymg19 at (Ryan Gonzalez)
Date: Tue, 21 Jan 2014 11:45:42 -0600
Subject: [Python-ideas] Tail Call Optimization -- natural? intuitive?
In-Reply-To: <>
References: <>
 <20140119004515.GP3915@ando> <lbgfeq$kok$>
 <> <>
Message-ID: <>

If someone does that, they have more problems than one.

On Mon, Jan 20, 2014 at 10:56 PM, Greg Ewing <greg.ewing at>wrote:

> On Mon, Jan 20, 2014 at 7:09 AM, Chris Angelico <rosuav at <mailto:
>> rosuav at>> wrote:
>>     Note, by the way, that I'm not looking at anything involving backward
>>     scanning
> That would be for when you were reading your Bible
> text backwards, looking for hidden Satanic references.
> --
> Greg

When your hammer is C++, everything begins to look like a thumb.

From mertz at  Tue Jan 21 19:39:20 2014
From: mertz at (David Mertz)
Date: Tue, 21 Jan 2014 10:39:20 -0800
Subject: [Python-ideas] Predicate Sets
In-Reply-To: <lbldg0$4qp$>
References: <>
Message-ID: <>

Isn't that exactly what I suggested up-thread with my suggested small
library of combinators? E.g.:

  def allP(*fns):
        return lambda x: all(f(x) for f in fns)

I like encapsulating it better since it encourages naming such combined
functions, e.g.:

  this_and_that = allP(this, that)

I feel like that encourages reuse and readability when one later wants to
write:

  set_with_predicate = {x for x in baseset if this_and_that(x)}

or:

  if this_and_that(x): ...
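
To make that concrete, here is a minimal sketch of such a combinator library applied to file names. Only allP appears above; anyP, notP, the is_* predicates and the sample names are assumptions added for illustration.

```python
def allP(*fns):
    """Predicate that is true when every given predicate accepts x."""
    return lambda x: all(f(x) for f in fns)

def anyP(*fns):
    """Predicate that is true when at least one given predicate accepts x."""
    return lambda x: any(f(x) for f in fns)

def notP(fn):
    """Predicate that inverts fn."""
    return lambda x: not fn(x)

def is_hidden(name):
    return name.startswith(".")

def is_backup(name):
    return name.endswith("~")

def is_text(name):
    return name.endswith(".txt")

# A named, reusable combination: text files that are neither hidden nor backups.
visible_text = allP(is_text, notP(anyP(is_hidden, is_backup)))

baseset = {"notes.txt", ".secret.txt", "draft.txt~", "image.png"}
set_with_predicate = {x for x in baseset if visible_text(x)}
```

With the sample names above, only "notes.txt" ends up in set_with_predicate.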

On Tue, Jan 21, 2014 at 1:09 AM, Serhiy Storchaka <storchaka at>wrote:

> 21.01.14 10:58, Haoyi Li wrote:
>>  > *all(map(list_of_filters, value))*
>> Scratch that, what I actually want is
>> *all(map(lambda f: f(value), list_of_filters))*
>> I always mix up the order of things going into *map* =(
>     all(f(value) for f in list_of_filters)
> looks cleaner to me.
> Perhaps slightly more efficient (but much less readable) form:
>     all(map(operator.methodcaller('__call__', value), list_of_filters))

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From abarnert at  Tue Jan 21 20:30:26 2014
From: abarnert at (Andrew Barnert)
Date: Tue, 21 Jan 2014 11:30:26 -0800 (PST)
Subject: [Python-ideas] Add `n_threads` argument to
In-Reply-To: <>
References: <>
Message-ID: <>

I slapped together a fork of concurrent/futures/ It's named "", and it just uses a ThreadPoolExecutor in the _process_worker function. You can get it at?, and a test program skeleton at?

Maybe you can find a use case where ProcessThreadPoolExecutor(4, 4) outperforms ProcessPoolExecutor(16). (I haven't been able to.)

> From: Ram Rachum <ram.rachum at>
>To: Andrew Barnert <abarnert at> 
>Cc: "python-ideas at" <python-ideas at> 
>Sent: Tuesday, January 21, 2014 2:17 AM
>Subject: Re: [Python-ideas] Add `n_threads` argument to `concurrent.futures.ProcessPoolExecutor`
>If you're writing code that needs to use both a lot of IO and a lot of CPU. For example, you're downloading many items from the internet and then doing post-processing on them.
>On Tue, Jan 21, 2014 at 10:42 AM, Andrew Barnert <abarnert at> wrote:
>On Jan 17, 2014, at 5:00, Ram Rachum <ram.rachum at> wrote:
>>> Hi,
>>> I'd like to use `concurrent.futures.ProcessPoolExecutor` but have each process contain multiple worker threads. We could have an `n_threads` argument to the constructor, defaulting to 1 to maintain backward compatibility, and setting a value higher than 1 would cause multiple threads to be spawned in each process.
>>What for?
>>Generally you use processes because you can't use threads. Whether this is because you're running CPU-bound code that needs to get around the GIL, because you want complete isolation between tasks, because your platform doesn't support threads, or any other reason I can think of, you wouldn't want threads per process either.
>>There are use cases for multiple processes of multiple threads, like running four independent IOCP-based servers (let them all try to use all your cores and let the kernel load balance among them), or isolated tasks with sharing-based subtasks... But those kinds of uses don't make sense in a single executor.

From ram.rachum at  Tue Jan 21 20:34:17 2014
From: ram.rachum at (Ram Rachum)
Date: Tue, 21 Jan 2014 21:34:17 +0200
Subject: [Python-ideas] Add `n_threads` argument to
In-Reply-To: <>
References: <>
Message-ID: <>

Thanks for writing this Andrew!

I think you're right, it doesn't really offer a performance advantage over
using multiple processes, so I guess I should stick to ProcessPoolExecutor.

Thanks for taking the time to write this!


On Tue, Jan 21, 2014 at 9:30 PM, Andrew Barnert <abarnert at> wrote:

> I slapped together a fork of concurrent/futures/ It's named
> "", and it just uses a ThreadPoolExecutor in the
> _process_worker function. You can get it at,
> and a test program skeleton at
> Maybe you can find a use case where ProcessThreadPoolExecutor(4, 4)
> outperforms ProcessPoolExecutor(16). (I haven't been able to.)
>   ------------------------------
>  *From:* Ram Rachum <ram.rachum at>
> *To:* Andrew Barnert <abarnert at>
> *Cc:* "python-ideas at" <python-ideas at>
> *Sent:* Tuesday, January 21, 2014 2:17 AM
> *Subject:* Re: [Python-ideas] Add `n_threads` argument to
> `concurrent.futures.ProcessPoolExecutor`
> If you're writing code that needs to use both a lot of IO and a lot of
> CPU. For example, you're downloading many items from the internet and then
> doing post-processing on them.
> On Tue, Jan 21, 2014 at 10:42 AM, Andrew Barnert <abarnert at>wrote:
> On Jan 17, 2014, at 5:00, Ram Rachum <ram.rachum at> wrote:
> > Hi,
> >
> > I'd like to use `concurrent.futures.ProcessPoolExecutor` but have each
> process contain multiple worker threads. We could have an `n_threads`
> argument to the constructor, defaulting to 1 to maintain backward
> compatibility, and setting a value higher than 1 would cause multiple
> threads to be spawned in each process.
> What for?
> Generally you use processes because you can't use threads. Whether this is
> because you're running CPU-bound code that needs to get around the GIL,
> because you want complete isolation between tasks, because your platform
> doesn't support threads, or any other reason I can think of, you wouldn't
> want threads per process either.
> There are use cases for multiple processes of multiple threads, like
> running four independent IOCP-based servers (let them all try to use all
> your cores and let the kernel load balance among them), or isolated tasks
> with sharing-based subtasks... But those kinds of uses don't make sense in
> a single executor.

From random832 at  Tue Jan 21 21:17:30 2014
From: random832 at (random832 at
Date: Tue, 21 Jan 2014 15:17:30 -0500
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 17, 2014, at 11:15, Chris Angelico wrote:
> By that definition, a stable sort means that:
> lst = sorted((x,y))
> assert lst == [min(lst), max(lst)]
> will pass for any x and y.

What definition of stable is this?
Why not assert lst == [min(lst), max(lst[::-1])]?
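
For reference, the builtin min and max are both documented to return the first of several tied elements, which is why the two assertions differ. A small sketch using 1 and 1.0, chosen here because they compare equal but are distinguishable by type:

```python
lst = sorted((1, 1.0))      # stable sort keeps the input order of ties

first_min = min(lst)        # first minimal element: the int
first_max = max(lst)        # first maximal element: also the int
last_max = max(lst[::-1])   # first maximal element of the reversed list: the float

# == cannot tell the tied values apart, so compare their types instead.
sorted_types = [type(v) for v in lst]
plain_types = [type(first_min), type(first_max)]
reversed_types = [type(first_min), type(last_max)]
```

Here lst == [min(lst), max(lst)] holds by equality, but only the reversed form reproduces the sorted list element for element (int, then float).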

From rosuav at  Tue Jan 21 21:24:33 2014
From: rosuav at (Chris Angelico)
Date: Wed, 22 Jan 2014 07:24:33 +1100
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 22, 2014 at 7:17 AM,  <random832 at> wrote:
> On Fri, Jan 17, 2014, at 11:15, Chris Angelico wrote:
>> By that definition, a stable sort means that:
>> lst = sorted((x,y))
>> assert lst == [min(lst), max(lst)]
>> will pass for any x and y.
> What definition of stable is this?
> Why not assert lst == [min(lst), max(lst[::-1])]?

The OP's definition.


From random832 at  Tue Jan 21 21:31:02 2014
From: random832 at (random832 at
Date: Tue, 21 Jan 2014 15:31:02 -0500
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 18, 2014, at 5:40, Devin Jeanpierre wrote:
> On Sat, Jan 18, 2014 at 12:12 AM, Steven D'Aprano <steve at>
> wrote:
> > These variations only are meaningful if a and b are different types
> > with the same value, or the same type but different identities. Even if
> > these variations are important, I don't think there is any inherent
> > benefit to one over the other.
> These variations are also important if a and b are just plain
> different values, same type or no. This can happen if max/min are
> passed a key function -- equality of a sort key doesn't mean the
> values are interchangeable for all purposes

I suspect you're getting hung up on two definitions of "value" - or
maybe two definitions of "identity".

Apropos of nothing, both functions will return NaN if it is the first
element of the list, but not if it is in any other position. Of course,
the behavior of sorting is also unreliable when faced with lists
containing NaN.
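
A short sketch of the NaN behaviour described above. Every comparison involving NaN is false, so a NaN in first position is never displaced and a NaN elsewhere never displaces the current candidate.

```python
import math

nan = float("nan")

nan_first = [nan, 1.0, 2.0]
nan_middle = [1.0, nan, 2.0]

# NaN first: it is the initial candidate and nothing compares below/above it.
min_first, max_first = min(nan_first), max(nan_first)

# NaN elsewhere: it never compares below/above the current candidate.
min_middle, max_middle = min(nan_middle), max(nan_middle)
```

With these inputs min_first and max_first are both NaN, while min_middle is 1.0 and max_middle is 2.0.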

From mertz at  Tue Jan 21 22:02:05 2014
From: mertz at (David Mertz)
Date: Tue, 21 Jan 2014 13:02:05 -0800
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

> Imagine implementing min and max this way (ignoring key= and the
> possibility of a single iterable arg):
> lst = sorted((x,y))
> assert lst == [min(lst), max(lst)]
> will pass for any x and y.

Well, that's not possible, of course, if one is willing to be slightly
perverse:

>>> from functools import total_ordering
>>> from random import random
>>> @total_ordering
... class SomewhatOrdered(object):
...     def __init__(self, val):
...         self.val = val
...     def __eq__(self, other):
...         return self.val == other.val
...     def __lt__(self, other):
...         return (self.val, random()) < (other.val, random())
...     def __repr__(self):
...         return repr(self.val)
...
>>> x, y, z = map(SomewhatOrdered, (1, 1.0, 2))

But even if you were slightly less perverse than this, *sets* (and set-like
collections) return elements in indeterminate order which the language does
not guarantee.  In particular, I do not think we are promised this holds:

  assert tuple(a)==tuple(b) if a==b else False

I can certainly construct a class where that won't hold (i.e. a set-like
class that iterates in a non-deterministic order; this need not even be
perverse, e.g. if it is 'AsyncResultsSet' that gets its data from I/O
source or parallel computations).

I have a feeling I could find plain old Python sets that would fail that,
but I'm not sure about it.
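
One way to probe this is to build two equal sets by inserting the same integers in opposite orders and compare their iteration order. The values 0 and 8 are a guess at a pair likely to collide in a small hash table; whether the orders actually differ is an implementation detail, so the sketch prints the result rather than assuming it.

```python
def build(values):
    """Insert values one at a time so the insertion order is explicit."""
    s = set()
    for v in values:
        s.add(v)
    return s

a = build([0, 8])
b = build([8, 0])

sets_equal = (a == b)
same_iteration_order = (tuple(a) == tuple(b))
print("equal:", sets_equal, "same iteration order:", same_iteration_order)
```

If same_iteration_order comes out False on a given interpreter, then plain sets do fail the invariant there.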

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From mertz at  Tue Jan 21 22:11:58 2014
From: mertz at (David Mertz)
Date: Tue, 21 Jan 2014 13:11:58 -0800
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

Slightly related, here's an invariant that I've wished would hold for a
decade, but isn't likely to, even in Python 4:

  assert all(not x<y for x,y in zip(a,b)) if a==b else False

But this is just a question of inequality versus identity and that sets and
dictionaries are, IMO, too sloppy about that.  That is, they behave exactly
as documented and as the BDFL has decreed, but I still feel uneasy about:

  >>> a = {1, 1+0j, 2}
  >>> b = {1+0j, 1, 2}
  >>> a
  {(1+0j), 2}
  >>> b
  {1, 2}
  >>> a == b
  True
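
The same effect can be reproduced with explicit add() calls, which avoids depending on the order in which a set display inserts its elements. The build helper is made up for the example. The point is that whichever equal value arrives first is the one the set keeps.

```python
def build(values):
    s = set()
    for v in values:
        s.add(v)   # add() leaves the existing element alone if an equal one is present
    return s

a = build([1, 1 + 0j, 2])   # the int 1 arrives first and is kept
b = build([1 + 0j, 1, 2])   # the complex 1+0j arrives first and is kept

types_in_a = sorted(type(v).__name__ for v in a)
types_in_b = sorted(type(v).__name__ for v in b)
```

Here a == b is true even though a holds two ints and b holds a complex and an int.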

On Tue, Jan 21, 2014 at 1:02 PM, David Mertz <mertz at> wrote:

> Imagine implementing min and max this way (ignoring key= and the
>> possibility of a single iterable arg):
>> lst = sorted((x,y))
>> assert lst == [min(lst), max(lst)]
>> will pass for any x and y.
> Well, that's not possible, of course, if one is willing to be slightly
> perverse:
> >>> @total_ordering
> ... class SomewhatOrdered(object):
> ...     def __init__(self, val):
> ...         self.val = val
> ...     def __eq__(self, other):
> ...         return self.val == other.val
> ...     def __lt__(self, other):
> ...         return (self.val, random()) < (other.val, random())
> ...     def __repr__(self):
> ...         return repr(self.val)
> ...
> >>> x, y, z = map(SomewhatOrdered, (1, 1.0, 2))
> But even if you were slightly less perverse than this, *sets* (and
> set-like collections) return elements in indeterminate order which the
> language does not guarantee.  In particular, I do not think we are promised
> this holds:
>   assert tuple(a)==tuple(b) if a==b else False
> I can certainly construct a class where that won't hold (i.e. a set-like
> class that iterates in a non-deterministic order; this need not even be
> perverse, e.g. if it is 'AsyncResultsSet' that gets its data from I/O
> source or parallel computations).
> I have a feeling I could find plain old Python sets that would fail that,
> but I'm not sure about it.

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From mertz at  Tue Jan 21 22:15:28 2014
From: mertz at (David Mertz)
Date: Tue, 21 Jan 2014 13:15:28 -0800
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 21, 2014 at 1:11 PM, David Mertz <mertz at> wrote:

> Slightly related, here's an invariant that I've wished would hold for a
> decade, but isn't likely to, even in Python 4:
>   assert all(not x<y for x,y in zip(a,b)) if a==b else False

Ooops, I meant:

  assert all(not x<y for x,y in zip(a,b)) if a==b else True

But the point is that it fails where a==b because equal elements may not be
"not unequal."

> But this is just a question of inequality versus identity and that sets and
> dictionaries are, IMO, too sloppy about that.  That is, they behave exactly
> as documented and as the BDFL has decreed, but I still feel uneasy about:

>   >>> a = {1, 1+0j, 2}
>   >>> b = {1+0j, 1, 2}
>   >>> a
>   {(1+0j), 2}
>   >>> b
>   {1, 2}
>   >>> a == b
>   True
> On Tue, Jan 21, 2014 at 1:02 PM, David Mertz <mertz at> wrote:
>>  Imagine implementing min and max this way (ignoring key= and the
>>> possibility of a single iterable arg):
>>> lst = sorted((x,y))
>>> assert lst == [min(lst), max(lst)]
>>> will pass for any x and y.
>> Well, that's not possible, of course, if one is willing to be slightly
>> perverse:
>> >>> @total_ordering
>> ... class SomewhatOrdered(object):
>> ...     def __init__(self, val):
>> ...         self.val = val
>> ...     def __eq__(self, other):
>> ...         return self.val == other.val
>> ...     def __lt__(self, other):
>> ...         return (self.val, random()) < (other.val, random())
>> ...     def __repr__(self):
>> ...         return repr(self.val)
>> ...
>> >>> x, y, z = map(SomewhatOrdered, (1, 1.0, 2))
>> But even if you were slightly less perverse than this, *sets* (and
>> set-like collections) return elements in indeterminate order which the
>> language does not guarantee.  In particular, I do not think we are promised
>> this holds:
>>   assert tuple(a)==tuple(b) if a==b else False
>> I can certainly construct a class where that won't hold (i.e. a set-like
>> class that iterates in a non-deterministic order; this need not even be
>> perverse, e.g. if it is 'AsyncResultsSet' that gets its data from I/O
>> source or parallel computations).
>> I have a feeling I could find plain old Python sets that would fail that,
>> but I'm not sure about it.

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From rosuav at  Tue Jan 21 22:21:50 2014
From: rosuav at (Chris Angelico)
Date: Wed, 22 Jan 2014 08:21:50 +1100
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 22, 2014 at 8:11 AM, David Mertz <mertz at> wrote:
> But this is just a question of inequality versus identity and that sets and
> dictionaries are, IMO, too sloppy about that.  That is, they behave exactly
> as documented and as the BDFL has decreed, but I still feel uneasy about:
>   >>> a = {1, 1+0j, 2}
>   >>> b = {1+0j, 1, 2}
>   >>> a
>   {(1+0j), 2}
>   >>> b
>   {1, 2}
>   >>> a == b
>   True

This is because Python's made the decision that an int, a float, and a
complex, representing the same number, should compare equal. I
personally think they shouldn't (partly because it implies that
they're all in some sort of tower, where the higher types can
represent the lower types perfectly, and can perfectly represent that
there's no further information - true of (float, complex) but not of
(int, float), and it leads to problems with large integers), but it's
a decision that's been made, and sets/dicts have to follow that. With
small numbers, it just means that there's an identity-vs-value
distinction (1 == 1.0 == 1+0j, but they're not is-identical), and sets
have always had and will always have that issue.
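A minimal sketch of the behaviour described above, using explicit insertion so the result does not depend on how a particular interpreter version evaluates set literals. The names `s`, `kept_type` and `big` are only illustrative.

```python
# Equal numbers of different types also hash equal, so a set treats them
# as one element and keeps whichever was inserted first.
s = set()
s.add(1)          # int goes in first
s.add(1.0)        # equal to 1, so nothing changes
s.add(1 + 0j)     # equal to 1, so nothing changes

kept_type = type(next(iter(s)))

assert 1 == 1.0 == 1 + 0j
assert hash(1) == hash(1.0) == hash(1 + 0j)
assert len(s) == 1

# The (int, float) step of the "tower" is lossy for large integers:
# two different ints can convert to the same float.
big = 2 ** 53 + 1
assert float(big) == float(big - 1)   # both round to 2.0 ** 53
assert big != big - 1
assert big != float(big)              # the comparison itself is exact
```

The last assertion shows that int/float comparison in Python is exact; the information loss happens in the conversion, not in `==`.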


From mertz at  Tue Jan 21 22:36:35 2014
From: mertz at (David Mertz)
Date: Tue, 21 Jan 2014 13:36:35 -0800
Subject: [Python-ideas] Make max() stable
In-Reply-To: <>
References: <>
Message-ID: <>

Oh yeah, this has been my bête noire for a long time.  I think I first
mentioned this in 2003 at:

Then later in an IBM developerWorks article in 2005:

(the URL for the IBM version seems to have gone 404).

I do know why things are as they are and how to work with them... but hey,
at least it let me coin the phrase "Incomparable abominations" which I am
still rather proud of.

On Tue, Jan 21, 2014 at 1:21 PM, Chris Angelico <rosuav at> wrote:

> On Wed, Jan 22, 2014 at 8:11 AM, David Mertz <mertz at> wrote:
> > But this is just a question of inequality versus identity and that sets and
> > dictionaries are, IMO, too sloppy about that.  That is, they behave exactly
> > as documented and as the BDFL has decreed, but I still feel uneasy about:
> >
> >   >>> a = {1, 1+0j, 2}
> >   >>> b = {1+0j, 1, 2}
> >   >>> a
> >   {(1+0j), 2}
> >   >>> b
> >   {1, 2}
> >   >>> a == b
> >   True
> This is because Python's made the decision that an int, a float, and a
> complex, representing the same number, should compare equal. I
> personally think they shouldn't (partly because it implies that
> they're all in some sort of tower, where the higher types can
> represent the lower types perfectly, and can perfectly represent that
> there's no further information - true of (float, complex) but not of
> (int, float), and it leads to problems with large integers), but it's
> a decision that's been made, and sets/dicts have to follow that. With
> small numbers, it just means that there's an identity-vs-value
> distinction (1 == 1.0 == 1+0j, but they're not is-identical), and sets
> have always had and will always have that issue.
> ChrisA

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From musicdenotation at  Wed Jan 22 13:43:26 2014
From: musicdenotation at (musicdenotation at
Date: Wed, 22 Jan 2014 19:43:26 +0700
Subject: [Python-ideas] Multi-statement anonymous functions
Message-ID: <>

1. Mutable namespaces and variables are for computation processes like while or for loops. They are not for temporary variables (that is why classes and functions have their own scopes).
2. I want not to worry about name clashes.

From steve at  Thu Jan 23 00:38:56 2014
From: steve at (Steven D'Aprano)
Date: Thu, 23 Jan 2014 10:38:56 +1100
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <20140122233856.GG3915@ando>

On Wed, Jan 22, 2014 at 07:43:26PM +0700, musicdenotation at wrote:

> 1. Mutable namespaces and variables are for computation processes like 
> while or for loops. They are not for temporary variables (that is why 
> classes and functions have their own scopes).
> 2. I want not to worry about name clashes.

You haven't quoted any context to these two points, so I don't really 
know how to interpret them.

As far as point 1 goes, yes, I cautiously agree, but I don't understand 
your point, what you think that fact implies, or what relevance it has 
to the question of multi-statement lambda.

As for point 2, I think everybody agrees that having to worry about name 
clashes is a bad thing. That's why modern programming languages like 
Python have multiple mechanisms for avoiding name clashes, e.g. functions and
modules. To say nothing of the good ol' fashioned technique of using 
naming conventions to avoid name clashes in the same scope. If somebody 
*routinely* and *frequently* finds themselves having to worry about 
clashes, they are probably doing something wrong.

But, I really don't understand your point. If you think this is relevant 
to the proposal, you should explain the connection, not just drop 
cryptic observations on the list.


From suresh_vv at  Thu Jan 23 08:20:04 2014
From: suresh_vv at (Suresh V.)
Date: Thu, 23 Jan 2014 12:50:04 +0530
Subject: [Python-ideas] __before__ and __after__ attributes for functions
Message-ID: <lbqfqp$7bn$>

Can we add these two attributes for every function/method where each is 
a list of callables with the same arguments as the function/method itself?

Pardon me if this has been discussed before. Pointers to past 
discussions (if any) appreciated.


From rosuav at  Thu Jan 23 08:52:31 2014
From: rosuav at (Chris Angelico)
Date: Thu, 23 Jan 2014 18:52:31 +1100
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbqfqp$7bn$>
References: <lbqfqp$7bn$>
Message-ID: <>

On Thu, Jan 23, 2014 at 6:20 PM, Suresh V. <suresh_vv at> wrote:
> Can we add these two attributes for every function/method where each is a
> list of callables with the same arguments as the function/method itself?
> Pardon me if this has been discussed before. Pointers to past discussions
> (if any) appreciated.

I'm not exactly sure what you're looking for here. What causes a
callable to be added to a function's __before__ list, and/or what will
be done with it?

If you mean that they'll be called before and after the function
itself, that can be more cleanly done with a decorator.


From suresh_vv at  Thu Jan 23 09:11:07 2014
From: suresh_vv at (Suresh V.)
Date: Thu, 23 Jan 2014 13:41:07 +0530
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
Message-ID: <lbqiqg$7g1$>

On Thursday 23 January 2014 01:22 PM, Chris Angelico wrote:
> On Thu, Jan 23, 2014 at 6:20 PM, Suresh V. <suresh_vv at> wrote:
>> Can we add these two attributes for every function/method where each is a
>> list of callables with the same arguments as the function/method itself?
>> Pardon me if this has been discussed before. Pointers to past discussions
>> (if any) appreciated.
> I'm not exactly sure what you're looking for here. What causes a
> callable to be added to a function's __before__ list, and/or what will
> be done with it?

These are modifiable attributes, so something can be added/deleted from 
the __before__ or __after__ lists.

> If you mean that they'll be called before and after the function
> itself, that can be more cleanly done with a decorator.

Yes. Each item in the list will be called in order immediately 
before/after each invocation of the function. This is kinda like 
decorators, but more flexible and simpler. Scope for abuse may be higher 
too :-)


From rosuav at  Thu Jan 23 09:20:44 2014
From: rosuav at (Chris Angelico)
Date: Thu, 23 Jan 2014 19:20:44 +1100
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbqiqg$7g1$>
References: <lbqfqp$7bn$>
Message-ID: <>

On Thu, Jan 23, 2014 at 7:11 PM, Suresh V. <suresh_vv at> wrote:
> On Thursday 23 January 2014 01:22 PM, Chris Angelico wrote:
>> On Thu, Jan 23, 2014 at 6:20 PM, Suresh V. <suresh_vv at> wrote:
>>> Can we add these two attributes for every function/method where each is a
>>> list of callables with the same arguments as the function/method itself?
>>> Pardon me if this has been discussed before. Pointers to past discussions
>>> (if any) appreciated.
>> I'm not exactly sure what you're looking for here. What causes a
>> callable to be added to a function's __before__ list, and/or what will
>> be done with it?
> These are modifiable attributes, so something can be added/deleted from the
> __before__ or __after__ lists.
>> If you mean that they'll be called before and after the function
>> itself, that can be more cleanly done with a decorator.
> Yes. Each item in the list will be called in order immediately before/after
> each invocation of the function. This is kinda like decorators, but more
> flexible and simpler. Scope for abuse may be higher too :-)

def prepostcall(func):
    def wrapper(*args,**kwargs):
        for f in wrapper.before: f(*args,**kwargs)
        ret = func(*args,**kwargs)
        for f in wrapper.after: f(*args,**kwargs)
        return ret
    wrapper.before = []
    wrapper.after = []
    return wrapper

@prepostcall
def foo(x,y,z):
    return x*y+z

foo.before.append(lambda x,y,z: print("Pre-call"))
foo.after.append(lambda x,y,z: print("Post-call"))

Now just deal with the question of whether the after functions should
be called if the wrapped function throws :)
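One possible answer to that last question, as a sketch only: run the "after" hooks in a `finally` clause so they fire whether or not the wrapped function raises. This is one policy among several, not something the thread settled on; `divide` and `calls` are made-up names for the demonstration.

```python
import functools

def prepostcall(func):
    """Variant of the wrapper above whose after-hooks run even on error."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for f in wrapper.before:
            f(*args, **kwargs)
        try:
            return func(*args, **kwargs)
        finally:
            # Runs on normal return and while an exception propagates.
            for f in wrapper.after:
                f(*args, **kwargs)
    wrapper.before = []
    wrapper.after = []
    return wrapper

calls = []

@prepostcall
def divide(x, y):
    return x / y

divide.after.append(lambda x, y: calls.append(("after", x, y)))

try:
    divide(1, 0)
except ZeroDivisionError:
    calls.append("raised")
```

With this policy the hook is recorded before the caller sees the exception. Dropping the `try`/`finally` gives the opposite policy, where a failing call skips the after-hooks entirely.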


From suresh_vv at  Thu Jan 23 09:31:50 2014
From: suresh_vv at (Suresh V.)
Date: Thu, 23 Jan 2014 14:01:50 +0530
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
Message-ID: <lbqk1a$l7i$>

Nicely done :-)

"foo" may come from a library or something, so rather than a decorator 
we may have to monkey patch it. Unless there is a nicer solution.

Will functools be a good place for something like this?

On Thursday 23 January 2014 01:50 PM, Chris Angelico wrote:
> On Thu, Jan 23, 2014 at 7:11 PM, Suresh V. <suresh_vv at> wrote:
>> On Thursday 23 January 2014 01:22 PM, Chris Angelico wrote:
>>> On Thu, Jan 23, 2014 at 6:20 PM, Suresh V. <suresh_vv at> wrote:
>>>> Can we add these two attributes for every function/method where each is a
>>>> list of callables with the same arguments as the function/method itself?
>>>> Pardon me if this has been discussed before. Pointers to past discussions
>>>> (if any) appreciated.
>>> I'm not exactly sure what you're looking for here. What causes a
>>> callable to be added to a function's __before__ list, and/or what will
>>> be done with it?
>> These are modifiable attributes, so something can be added/deleted from the
>> __before__ or __after__ lists.
>>> If you mean that they'll be called before and after the function
>>> itself, that can be more cleanly done with a decorator.
>> Yes. Each item in the list will be called in order immediately before/after
>> each invocation of the function. This is kinda like decorators, but more
>> flexible and simpler. Scope for abuse may be higher too :-)
> def prepostcall(func):
>      def wrapper(*args,**kwargs):
>          for f in wrapper.before: f(*args,**kwargs)
>          ret = func(*args,**kwargs)
>          for f in wrapper.after: f(*args,**kwargs)
>          return ret
>      wrapper.before = []
>      wrapper.after = []
>      return wrapper
> @prepostcall
> def foo(x,y,z):
>      return x*y+z
> foo.before.append(lambda x,y,z: print("Pre-call"))
> foo.after.append(lambda x,y,z: print("Post-call"))
> Now just deal with the question of whether the after functions should
> be called if the wrapped function throws :)

> ChrisA

From aquavitae69 at  Thu Jan 23 09:52:07 2014
From: aquavitae69 at (David Townshend)
Date: Thu, 23 Jan 2014 10:52:07 +0200
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbqk1a$l7i$>
References: <lbqfqp$7bn$>
Message-ID: <>

Maybe I'm missing something, but what's the use case, and why aren't plain
old decorators suitable?

On Thu, Jan 23, 2014 at 10:31 AM, Suresh V. <suresh_vv at> wrote:

> Nicely done :-)
> "foo" may come from a library or something, so rather than a decorator we
> may have to monkey patch it. Unless there is a nicer solution.
> Will functools be a good place for something like this?

From ncoghlan at  Thu Jan 23 09:57:27 2014
From: ncoghlan at (Nick Coghlan)
Date: Thu, 23 Jan 2014 18:57:27 +1000
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbqk1a$l7i$>
References: <lbqfqp$7bn$>
Message-ID: <>

On 23 Jan 2014 18:32, "Suresh V." <suresh_vv at> wrote:
> Nicely done :-)
> "foo" may come from a library or something, so rather than a decorator we
> may have to monkey patch it. Unless there is a nicer solution.
> Will functools be a good place for something like this?

Another idea along similar lines is the object model in Elk: (that's a before/after/around subclass
method model, designed specifically as an alternative to using super() to
call up to the parent implementation).

The main problem with the idea of doing this as a more general feature for
arbitrary callables is that it has most of the same downsides as
monkey-patching while being strictly less powerful and even more confusing
(since it would be difficult to model clearly in tracebacks).



From p.f.moore at  Thu Jan 23 10:08:32 2014
From: p.f.moore at (Paul Moore)
Date: Thu, 23 Jan 2014 09:08:32 +0000
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
Message-ID: <>

On 23 January 2014 08:57, Nick Coghlan <ncoghlan at> wrote:
> The main problem with the idea of doing this as a more general feature for
> arbitrary callables is that it has most of the same downsides as
> monkey-patching while being strictly less powerful and even more confusing
> (since it would be difficult to model clearly in tracebacks).

Also, this would add overhead to all function calls (even if no
before/after functions exist, checking the lists has a small cost) and
function call overhead is already higher than many people would like.
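A rough way to get a feel for this cost, assuming the pure-Python `prepostcall` wrapper from earlier in the thread as a stand-in. A built-in `__before__`/`__after__` check inside the interpreter would cost something different, so treat the numbers as an illustration of the kind of measurement, not as evidence about the proposal itself.

```python
import timeit

def prepostcall(func):
    def wrapper(*args, **kwargs):
        for f in wrapper.before:
            f(*args, **kwargs)
        ret = func(*args, **kwargs)
        for f in wrapper.after:
            f(*args, **kwargs)
        return ret
    wrapper.before = []
    wrapper.after = []
    return wrapper

def plain(x, y, z):
    return x * y + z

wrapped = prepostcall(plain)   # both hook lists stay empty

# Time the same call with and without the (empty) hook machinery.
t_plain = timeit.timeit(lambda: plain(1, 2, 3), number=20000)
t_wrapped = timeit.timeit(lambda: wrapped(1, 2, 3), number=20000)

print("plain:   %.4f s" % t_plain)
print("wrapped: %.4f s" % t_wrapped)
```

Even with empty lists the wrapped call pays for an extra frame, two attribute lookups and two loop setups on every invocation.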


From suresh_vv at  Thu Jan 23 10:17:50 2014
From: suresh_vv at (Suresh V.)
Date: Thu, 23 Jan 2014 14:47:50 +0530
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
Message-ID: <lbqmnj$jie$>

On Thursday 23 January 2014 02:22 PM, David Townshend wrote:
> Maybe I'm missing something, but what's the use case, and why aren't
> plain old decorators suitable?

Maybe they are.

Let us say I want to alter the way the smtplib.SMTP.sendmail method 
works. I would like it to call a function that I define. I can then add 
this function to the __before__ attribute of this library function.

Can this be done with decorators?


From aquavitae69 at  Thu Jan 23 10:27:55 2014
From: aquavitae69 at (David Townshend)
Date: Thu, 23 Jan 2014 11:27:55 +0200
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbqmnj$jie$>
References: <lbqfqp$7bn$>
Message-ID: <>

On Thu, Jan 23, 2014 at 11:17 AM, Suresh V. <suresh_vv at> wrote:

> On Thursday 23 January 2014 02:22 PM, David Townshend wrote:
>> Maybe I'm missing something, but what's the use case, and why aren't
>> plain old decorators suitable?
> May be they are.
> Let us say I want to alter the way the smtplib.SMTP.sendmail method works.
> I would like it to call a function that I define.I can then add this
> function to the __before__ attribute of this library function.
> Can this be done with decorators?

Not a decorator, but you can monkey patch it:

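    from functools import wraps
    original_sendmail = smtplib.SMTP.sendmail  # saved so the wrapper does not call itself

    @wraps(original_sendmail)
    def sendmail(*args, **kwargs):
        other_function()
        return original_sendmail(*args, **kwargs)
    smtplib.SMTP.sendmail = sendmail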

But I still don't see a good reason for using __before__ rather than the
above, other than slightly less typing.  In a specific project there might
be a lot of this going on and brevity would be justifiable, but in that
case writing your own decorator is easy enough.


From stephen at  Thu Jan 23 10:32:19 2014
From: stephen at (Stephen J. Turnbull)
Date: Thu, 23 Jan 2014 18:32:19 +0900
Subject: [Python-ideas] Multi-statement anonymous functions
In-Reply-To: <>
References: <>
Message-ID: <>

musicdenotation at writes:

 > 1. Mutable namespaces and variables are for computation processes
 > like while or for loops. They are not for temporary variables (that
 > is why classes and functions have their own scopes).
 > 2. I want not to worry about name clashes.

<sigh/>  Most of the things you have proposed in recent weeks have
long since been shot down to my knowledge, and I wouldn't be surprised
to find that the rest are dead on arrival, too.  And we have already
heard all the standard arguments *for*, and the people who make the
decisions weren't impressed then -- they had sufficient arguments
*against*.  They're pretty consistent about not having their minds
changed by neutrino strikes, too, so, no chance of random reversal.

That doesn't mean these issues *can't* be re-raised.  It does mean
people are going to lose patience with you if you don't bring answers
for at least some of the issues that got the ideas shot down in the
past with you.  Generic arguments in favor don't cut it for rejected
ideas.  And if you don't know what those issues are, strictly
speaking, asking is off-topic here (belongs on python-list).

I think the most successful radical in recent months has been Haoyi.
Grep the archives for his posts (including a proposal for
multistatement lambdas, IIRC, and another for "macros").  They are
exemplary as to the style you should bring to re-raising a defeated
proposal.  (Nor do you have to beat Haoyi's standard.  Just look at
them, they are, as I say, "exemplary".  Note that AFAIK he hasn't
actually *won* one yet<wink/>, but he's certainly got the Powers-That-
Be thinking seriously about his proposals.)

From suresh_vv at  Thu Jan 23 10:35:26 2014
From: suresh_vv at (Suresh V.)
Date: Thu, 23 Jan 2014 15:05:26 +0530
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
Message-ID: <lbqnoj$vk0$>

On Thursday 23 January 2014 02:27 PM, Nick Coghlan wrote:
> On 23 Jan 2014 18:32, "Suresh V." <suresh_vv at> wrote:
> > Nicely done :-)
> >
> > "foo" may come from a library or something, so rather than a
> > decorator we may have to monkey patch it. Unless there is a nicer solution.
> >
> > Will functools be a good place for something like this?
>
> Another idea along similar lines is the object model in Elk:
> (that's a before/after/around
> subclass method model, designed specifically as an alternative to using
> super() to call up to the parent implementation).

Thanks for the link. Has some interesting ideas.

> The main problem with the idea of doing this as a more general feature
> for arbitrary callables is that it has most of the same downsides as
> monkey-patching while being strictly less powerful and even more
> confusing (since it would be difficult to model clearly in tracebacks).

While being less powerful than monkey patching, it offers a more 
disciplined way by just adding before/after functionality. I don't see 
the problems with tracebacks, they just list the before/after function, 
which is like any other function.
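To make the traceback question concrete, here is a small sketch using the pure-Python `prepostcall` wrapper from earlier in the thread (a built-in implementation might report frames differently). `bad_hook` is a made-up hook that fails on purpose.

```python
import traceback

def prepostcall(func):
    def wrapper(*args, **kwargs):
        for f in wrapper.before:
            f(*args, **kwargs)
        ret = func(*args, **kwargs)
        for f in wrapper.after:
            f(*args, **kwargs)
        return ret
    wrapper.before = []
    wrapper.after = []
    return wrapper

@prepostcall
def foo(x):
    return x

def bad_hook(x):
    raise ValueError("hook failed")

foo.before.append(bad_hook)

try:
    foo(1)
except ValueError as exc:
    # Function names of the frames in the traceback, outermost first.
    frames = [fs.name for fs in traceback.extract_tb(exc.__traceback__)]

print(frames)
```

The innermost frames are `wrapper` and `bad_hook`. The name `foo` does not appear, because the original function never ran; the caller wrote `foo(1)` but has to work out that the failure came from a hook attached elsewhere.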

From suresh_vv at  Thu Jan 23 10:52:56 2014
From: suresh_vv at (Suresh V.)
Date: Thu, 23 Jan 2014 15:22:56 +0530
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
Message-ID: <lbqopd$c53$>

On Thursday 23 January 2014 02:57 PM, David Townshend wrote:

> Not a decorator, but you can monkey patch it:
>      @wraps(smtplib.SMTP.sendmail)
>      def sendmail(*args, **kwargs):
>          other_function()
>          return smtplib.SMTP.sendmail(*args, **kwargs)
>      smtplib.SMTP.sendmail = sendmail

Correct. I want to say something like:

from functools import prepostcall
smtplib.SMTP.sendmail = prepostcall(smtplib.SMTP.sendmail)
smtplib.SMTP.sendmail.before.append(other_function)

This seems less error-prone. And more conducive to multiple patching.

From rosuav at  Thu Jan 23 10:58:05 2014
From: rosuav at (Chris Angelico)
Date: Thu, 23 Jan 2014 20:58:05 +1100
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbqopd$c53$>
References: <lbqfqp$7bn$>
Message-ID: <>

On Thu, Jan 23, 2014 at 8:52 PM, Suresh V. <suresh_vv at> wrote:
> Correct. I want to say something like:
> from functools import prepostcall
> smtplib.SMTP.sendmail = prepostcall(smtplib.SMTP.sendmail)
> smtplib.SMTP.sendmail.before.append(other_function)
> This seems less error-prone. And more conducive to multiple patching.

Easy. Just replace the import statement with the def that I gave
above, and then it works. Or make your own module of "handy stuff" and
use that. Not everything has to be in the stdlib :)


From tjreedy at  Thu Jan 23 11:56:16 2014
From: tjreedy at (Terry Reedy)
Date: Thu, 23 Jan 2014 05:56:16 -0500
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbqk1a$l7i$>
References: <lbqfqp$7bn$>
Message-ID: <lbqsg4$po3$>

On 1/23/2014 3:31 AM, Suresh V. wrote:
Top-posting makes posts/threads somewhat harder to follow for readers.  A 
decorator is simply a function named before a function def that is 
called on the resulting function after the function is created. In other 
words, it is purely syntactic sugar and

@prepostcall
def foo(x,y,z):
      return x*y+z

is equivalent to

def foo(...
foo = prepostcall(foo)

For builtins, call the decorator function directly on the builtin. In 
other words, use the last line of the equivalent.

int = prepostcall(int)

or use another name if you do not want to mask int.
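
A minimal sketch of the equivalence described above, reusing the prepostcall definition quoted further down in this message. The names bar, traced_int and calls are made up for the illustration.

```python
def prepostcall(func):
    """Wrap func so callables in .before/.after run around each call."""
    def wrapper(*args, **kwargs):
        for f in wrapper.before:
            f(*args, **kwargs)
        ret = func(*args, **kwargs)
        for f in wrapper.after:
            f(*args, **kwargs)
        return ret
    wrapper.before = []
    wrapper.after = []
    return wrapper

# Decorator syntax ...
@prepostcall
def foo(x, y, z):
    return x * y + z

# ... does the same job as defining the function and rebinding the name by hand.
def bar(x, y, z):
    return x * y + z
bar = prepostcall(bar)

# A builtin has no def statement to decorate, so call the decorator directly.
# Binding the result to a new name avoids masking the builtin int.
traced_int = prepostcall(int)
calls = []
traced_int.before.append(lambda *args, **kwargs: calls.append(args))

print(foo(2, 3, 4), bar(2, 3, 4), traced_int("42"), calls)
```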

> Nicely done :-)
> "foo" may come from a library or something, so rather than a decorator
> we may have to monkey patch it. Unless there is a nicer solution.
> Will functools be a good place for something like this?
> On Thursday 23 January 2014 01:50 PM, Chris Angelico wrote:
>> On Thu, Jan 23, 2014 at 7:11 PM, Suresh V.
>> <suresh_vv at> wrote:
>>> On Thursday 23 January 2014 01:22 PM, Chris Angelico wrote:
>>>> On Thu, Jan 23, 2014 at 6:20 PM, Suresh V.
>>>> <suresh_vv at> wrote:
>>>>> Can we add these two attributes for every function/method where
>>>>> each is a
>>>>> list of callables with the same arguments as the function/method
>>>>> itself?
>>>>> Pardon me if this has been discussed before. Pointers to past
>>>>> discussions
>>>>> (if any) appreciated.
>>>> I'm not exactly sure what you're looking for here. What causes a
>>>> callable to be added to a function's __before__ list, and/or what will
>>>> be done with it?
>>> These are modifiable attributes, so something can be added/deleted
>>> from the
>>> __before__ or __after__ lists.
>>>> If you mean that they'll be called before and after the function
>>>> itself, that can be more cleanly done with a decorator.
>>> Yes. Each item in the list will be called in order immediately
>>> before/after
>>> each invocation of the function. This is kinda like decorators, but more
>>> flexible and simpler. Scope for abuse may be higher too :-)
>> def prepostcall(func):
>>      def wrapper(*args,**kwargs):
>>          for f in wrapper.before: f(*args,**kwargs)
>>          ret = func(*args,**kwargs)
>>          for f in wrapper.after: f(*args,**kwargs)
>>          return ret
>>      wrapper.before = []
>>      wrapper.after = []
>>      return wrapper
>> @prepostcall
>> def foo(x,y,z):
>>      return x*y+z
>> foo.before.append(lambda x,y,z: print("Pre-call"))
>> foo.after.append(lambda x,y,z: print("Post-call"))
>> Now just deal with the question of whether the after functions should
>> be called if the wrapped function throws :)
>> ChrisA
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at
>> Code of Conduct:

Terry Jan Reedy

From solipsis at  Thu Jan 23 16:08:36 2014
From: solipsis at (Antoine Pitrou)
Date: Thu, 23 Jan 2014 16:08:36 +0100
Subject: [Python-ideas] __before__ and __after__ attributes for functions
References: <lbqfqp$7bn$>
Message-ID: <20140123160836.3452d8ad@fsol>

On Thu, 23 Jan 2014 14:01:50 +0530
"Suresh V." <suresh_vv at> wrote:
> Nicely done :-)
> "foo" may come from a library or something, so rather than a decorator 
> we may have to monkey patch it. Unless there is a nicer solution.
> Will functools be a good place for something like this?

If you think this is interesting (for contract-based programming
perhaps?), I suggest it should first go into a library uploaded in
PyPI, so people can play with it and you refine the API.

Note that you could tweak Chris' implementation to be able to write

@prepostcall
def foo(x,y,z):
    return x*y+z

def foo_precond(x, y, z):
    ...

def foo_postcond(x, y, z):
    # XXX should the "after" function also receive the return value?
    ...
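
One possible reading of that tweak, as a sketch only: before and after become registration decorators instead of plain lists. The spellings @foo.before and @foo.after are an assumption, not something confirmed in this thread, and log exists only to make the call order visible.

```python
def prepostcall(func):
    before, after = [], []

    def wrapper(*args, **kwargs):
        for f in before:
            f(*args, **kwargs)
        ret = func(*args, **kwargs)
        for f in after:
            f(*args, **kwargs)
        return ret

    def register_before(f):
        before.append(f)
        return f          # hand the hook back unchanged so its name stays usable

    def register_after(f):
        after.append(f)
        return f

    wrapper.before = register_before   # assumed spelling
    wrapper.after = register_after     # assumed spelling
    return wrapper

log = []

@prepostcall
def foo(x, y, z):
    return x * y + z

@foo.before
def foo_precond(x, y, z):
    log.append("pre")

@foo.after
def foo_postcond(x, y, z):
    log.append("post")

result = foo(2, 3, 4)
```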



From rosuav at  Thu Jan 23 16:12:41 2014
From: rosuav at (Chris Angelico)
Date: Fri, 24 Jan 2014 02:12:41 +1100
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <20140123160836.3452d8ad@fsol>
References: <lbqfqp$7bn$>
 <lbqk1a$l7i$> <20140123160836.3452d8ad@fsol>
Message-ID: <>

On Fri, Jan 24, 2014 at 2:08 AM, Antoine Pitrou <solipsis at> wrote:
>     # XXX should the "after" function also receive the return value?

That's a possible consideration, but it messes up the "has the same
arguments" bit. Plus, what happens to the after function(s) if the
main function throws an error? (And what happens to the main if a
before function bombs?) Very hard to solve in the general case, which
is a good reason for this NOT to go into the stdlib, but just to be
implemented whenever it's wanted.
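
To make the open questions concrete, here is a sketch in which the behaviour on error is an explicit choice. The run_after_on_error flag is hypothetical; a failing before function simply propagates, so the main function is never called. Neither choice is claimed to be the right default.

```python
def prepostcall(func, run_after_on_error=False):
    def wrapper(*args, **kwargs):
        for f in wrapper.before:
            f(*args, **kwargs)          # an exception here skips func entirely
        try:
            ret = func(*args, **kwargs)
        except Exception:
            if run_after_on_error:
                for f in wrapper.after:
                    f(*args, **kwargs)
            raise                       # the original exception always propagates
        for f in wrapper.after:
            f(*args, **kwargs)
        return ret
    wrapper.before = []
    wrapper.after = []
    return wrapper

events = []

def fail(x):
    raise ValueError(x)

strict = prepostcall(fail)
strict.after.append(lambda x: events.append("strict-after"))

lenient = prepostcall(fail, run_after_on_error=True)
lenient.after.append(lambda x: events.append("lenient-after"))

for wrapped in (strict, lenient):
    try:
        wrapped(1)
    except ValueError:
        pass
```

Only the lenient wrapper records an event, which shows how much the observable behaviour depends on a decision the proposal leaves open.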


From abarnert at  Thu Jan 23 19:10:33 2014
From: abarnert at (Andrew Barnert)
Date: Thu, 23 Jan 2014 10:10:33 -0800
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
 <lbqk1a$l7i$> <20140123160836.3452d8ad@fsol>
Message-ID: <>

On Jan 23, 2014, at 7:12, Chris Angelico <rosuav at> wrote:

> On Fri, Jan 24, 2014 at 2:08 AM, Antoine Pitrou <solipsis at> wrote:
>>    # XXX should the "after" function also receive the return value?
> That's a possible consideration, but it messes up the "has the same
> arguments" bit. Plus, what happens to the after function(s) if the
> main function throws an error? (And what happens to the main if a
> before function bombs?) Very hard to solve in the general case, which
> is a good reason for this NOT to go into the stdlib, but just to be
> implemented whenever it's wanted.

There _might_ be good, usually-right answers to these questions.

But the only way we're likely to find them is if someone puts it up on PyPI and people start using it, not by guessing a priori. Which is another good reason not to go straight for the stdlib.

And a PyPI module can go crazy with options: have after functions that do or don't get the result based on an arg to the decorator, and that do or don't replace the result, and before functions that can return replacement args, and after_except functions that run on exception, get the exception, and can raise or return (think of deferred chaining options), or whatever else you can think of.
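
A rough sketch of what a couple of those options could look like in a third-party module. Every name here (pass_result, replace_result, after_except) is hypothetical.

```python
import functools

def prepostcall(pass_result=False, replace_result=False):
    """Decorator factory; the keyword options are illustrative only."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for f in wrapper.before:
                f(*args, **kwargs)
            try:
                ret = func(*args, **kwargs)
            except Exception as exc:
                # after_except hooks see the exception, then it is re-raised
                for f in wrapper.after_except:
                    f(exc, *args, **kwargs)
                raise
            for f in wrapper.after:
                out = f(ret, *args, **kwargs) if pass_result else f(*args, **kwargs)
                if replace_result:
                    ret = out
            return ret
        wrapper.before, wrapper.after, wrapper.after_except = [], [], []
        return wrapper
    return decorate

@prepostcall(pass_result=True, replace_result=True)
def total(a, b):
    return a + b

# This after hook receives the result first and returns a replacement.
total.after.append(lambda result, a, b: result * 2)
print(total(2, 3))
```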

From mertz at  Thu Jan 23 19:14:39 2014
From: mertz at (David Mertz)
Date: Thu, 23 Jan 2014 10:14:39 -0800
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbqk1a$l7i$>
References: <lbqfqp$7bn$>
Message-ID: <>

On Thu, Jan 23, 2014 at 12:31 AM, Suresh V. <suresh_vv at> wrote:

> Nicely done :-)
> "foo" may come from a library or something, so rather than a decorator we
> may have to monkey patch it. Unless there is a nicer solution.
> Will functools be a good place for something like this?

Not really monkey patching.  Just:

from library import foo
@prepostcall
def foo(*args, **kws):
    return foo(*args, **kws)

It's just rebinding the name 'foo' with the decorator.

> On Thursday 23 January 2014 01:50 PM, Chris Angelico wrote:
>> On Thu, Jan 23, 2014 at 7:11 PM, Suresh V. <suresh_vv at> wrote:
>>> On Thursday 23 January 2014 01:22 PM, Chris Angelico wrote:
>>>> On Thu, Jan 23, 2014 at 6:20 PM, Suresh V. <suresh_vv at> wrote:
>>>>> Can we add these two attributes for every function/method where each
>>>>> is a
>>>>> list of callables with the same arguments as the function/method
>>>>> itself?
>>>>> Pardon me if this has been discussed before. Pointers to past
>>>>> discussions
>>>>> (if any) appreciated.
>>>> I'm not exactly sure what you're looking for here. What causes a
>>>> callable to be added to a function's __before__ list, and/or what will
>>>> be done with it?
>>> These are modifiable attributes, so something can be added/deleted from
>>> the
>>> __before__ or __after__ lists.
>>>> If you mean that they'll be called before and after the function
>>>> itself, that can be more cleanly done with a decorator.
>>> Yes. Each item in the list will be called in order immediately
>>> before/after
>>> each invocation of the function. This is kinda like decorators, but more
>>> flexible and simpler. Scope for abuse may be higher too :-)
>> def prepostcall(func):
>>      def wrapper(*args,**kwargs):
>>          for f in wrapper.before: f(*args,**kwargs)
>>          ret = func(*args,**kwargs)
>>          for f in wrapper.after: f(*args,**kwargs)
>>          return ret
>>      wrapper.before = []
>>      wrapper.after = []
>>      return wrapper
>> @prepostcall
>> def foo(x,y,z):
>>      return x*y+z
>> foo.before.append(lambda x,y,z: print("Pre-call"))
>> foo.after.append(lambda x,y,z: print("Post-call"))
>> Now just deal with the question of whether the after functions should
>> be called if the wrapped function throws :)
>  ChrisA

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From rosuav at  Thu Jan 23 19:17:28 2014
From: rosuav at (Chris Angelico)
Date: Fri, 24 Jan 2014 05:17:28 +1100
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
Message-ID: <>

On Fri, Jan 24, 2014 at 5:14 AM, David Mertz <mertz at> wrote:
> from library import foo
> @prepostcall
> def foo(*args, **kws):
>     return foo(*args, **kws)

That's going to infinite-loop, so you'd need to do an 'as' import:

from library import foo as foo_original
@prepostcall
def foo(*args, **kws):
    return foo_original(*args, **kws)

Of course, this assumes you want to do a 'from' import in the first
place, rather than the more common approach of referencing
'' - if the latter, then it is monkeypatching you need.
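
The recursion can be reproduced with a real module. In this sketch math.sqrt stands in for the library function, and the decorator is left out because it does not change how the name is looked up.

```python
from math import sqrt            # stands in for "from library import foo"

def sqrt(x):                     # rebinds the name 'sqrt' in this namespace
    return sqrt(x)               # looked up at call time: finds this function

try:
    sqrt(4.0)
    outcome = "returned"
except RecursionError:
    outcome = "RecursionError"

from math import sqrt as sqrt_original

def sqrt(x):
    return sqrt_original(x)      # a distinct name, so it reaches the library

fixed = sqrt(4.0)
print(outcome, fixed)
```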


From mertz at  Thu Jan 23 19:31:45 2014
From: mertz at (David Mertz)
Date: Thu, 23 Jan 2014 10:31:45 -0800
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
Message-ID: <>

On Thu, Jan 23, 2014 at 10:17 AM, Chris Angelico <rosuav at> wrote:

> On Fri, Jan 24, 2014 at 5:14 AM, David Mertz <mertz at> wrote:
> > from library import foo
> > @prepostcall
> > def foo(*args, **kws):
> >     return foo(*args, **kws)
> That's going to infinite-loop, so you'd need to do an 'as' import:
> from library import foo as foo_original
> @prepostcall
> def foo(*args, **kws):
>     return foo_original(*args, **kws)
> Of course, this assumes you want to do a 'from' import in the first
> place, rather than the more common approach of referencing
> '' - if the latter, then it is monkeypatching you need.

All true.  For some reason I was thinking of the timing of the binding
wrongly re. the infinite-loop. But yes, obviously using a different name in
an 'as' import solves that.

> ChrisA

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From ethan at  Fri Jan 24 00:06:08 2014
From: ethan at (Ethan Furman)
Date: Thu, 23 Jan 2014 15:06:08 -0800
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbqk1a$l7i$>
References: <lbqfqp$7bn$>
Message-ID: <>

On 01/23/2014 12:31 AM, Suresh V. wrote:
> Will functools be a good place for something like this?

PyPI is a good place for this.  If it does well there, and stabilizes, /maybe/ it will get into the stdlib.


From suresh_vv at  Fri Jan 24 05:09:36 2014
From: suresh_vv at (Suresh V.)
Date: Fri, 24 Jan 2014 09:39:36 +0530
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
Message-ID: <lbsp1k$pf8$>

On Friday 24 January 2014 12:01 AM, David Mertz wrote:
> On Thu, Jan 23, 2014 at 10:17 AM, Chris Angelico
> <rosuav at
> <mailto:rosuav at>> wrote:
>     On Fri, Jan 24, 2014 at 5:14 AM, David Mertz
>     <mertz at
>     <mailto:mertz at>> wrote:
>      > from library import foo
>      > @prepostcall
>      > def foo(*args, **kws):
>      >     return foo(*args, **kws)
>     That's going to infinite-loop, so you'd need to do an 'as' import:
>     from library import foo as foo_original
>     @prepostcall
>     def foo(*args, **kws):
>          return foo_original(*args, **kws)
>     Of course, this assumes you want to do a 'from' import in the first
>     place, rather than the more common approach of referencing
>     '' - if the latter, then it is monkeypatching you need.
> All true.  For some reason I was thinking of the timing of the binding
> wrongly re. the infinite-loop. But yes, obviously using a different name
> in an 'as' import solves that.

Also it would mean that the client code imports from this package.
I would like client code to remain exactly as it is (continue to import 
from its original package) but the behavior is enhanced once this 
package is imported on startup.

From ethan at  Fri Jan 24 06:09:02 2014
From: ethan at (Ethan Furman)
Date: Thu, 23 Jan 2014 21:09:02 -0800
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbsp1k$pf8$>
References: <lbqfqp$7bn$>
Message-ID: <>

On 01/23/2014 08:09 PM, Suresh V. wrote:
> Also it would mean that the client code imports from this package.
> I would like client code to remain exactly as it is (continue to
> import from its original package) but the behavior is enhanced
>  once this package is imported on startup.

/Something/ has to adjust the pre and post conditions -- if not the client code, then what?


From rosuav at  Fri Jan 24 08:10:06 2014
From: rosuav at (Chris Angelico)
Date: Fri, 24 Jan 2014 18:10:06 +1100
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
 <lbsp1k$pf8$> <>
Message-ID: <>

On Fri, Jan 24, 2014 at 4:09 PM, Ethan Furman <ethan at> wrote:
> On 01/23/2014 08:09 PM, Suresh V. wrote:
>> Also it would mean that the client code imports from this package.
>> I would like client code to remain exactly as it is (continue to
>> import from its original package) but the behavior is enhanced
>>  once this package is imported on startup.
> /Something/ has to adjust the pre and post conditions -- if not the client
> code, then what?

import blah

import blah
import foo

With code like that, modifying/rebinding the 'quux' inside
won't affect what happens when foo is imported, ergo monkeypatching
the blah module is key.
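
A self-contained sketch of that difference. The blah module is built in memory with types.ModuleType, and client_call is a hypothetical stand-in for code in another module that looks up blah.quux at call time.

```python
import types

# Hypothetical stand-in for an installed module named 'blah'.
blah = types.ModuleType("blah")
blah.quux = lambda: "original"

# Stand-in for client code elsewhere that does "import blah"
# and calls blah.quux() at run time.
def client_call():
    return blah.quux()

# Defining a new 'quux' here only binds a name in this namespace.
def quux():
    return "wrapped locally"

after_rebind = client_call()     # the client still reaches blah.quux

# Monkeypatching: assign the wrapper onto the module object itself.
original = blah.quux
blah.quux = lambda: "wrapped " + original()

after_patch = client_call()
print(after_rebind, "/", after_patch)
```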


From suresh_vv at  Fri Jan 24 08:54:07 2014
From: suresh_vv at (Suresh V.)
Date: Fri, 24 Jan 2014 13:24:07 +0530
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
 <lbsp1k$pf8$> <>
Message-ID: <lbt66j$r48$>

On Friday 24 January 2014 10:39 AM, Ethan Furman wrote:
> On 01/23/2014 08:09 PM, Suresh V. wrote:
>> Also it would mean that the client code imports from this package.
>> I would like client code to remain exactly as it is (continue to
>> import from its original package) but the behavior is enhanced
>>  once this package is imported on startup.
> /Something/ has to adjust the pre and post conditions -- if not the
> client code, then what?

pre and post conditions are just one possible use of this.

Going back to my smtplib.SMTP.sendmail example.
No changes in bulk of client code.
Single patch module imported in main.

Client module (no changes)

     from smtplib import SMTP
     def send_email():
         SMTP.sendmail(...)

Patch module (new module)

     from smtplib import SMTP
     from prepost import prepostcall
     SMTP.sendmail = prepostcall(SMTP.sendmail)
     def my_other_func():
         pass
     SMTP.sendmail.before.append(my_other_func)

Main module (single line modification)

     import patch # new code
     import client
     client.send_email()

From rosuav at  Fri Jan 24 12:32:20 2014
From: rosuav at (Chris Angelico)
Date: Fri, 24 Jan 2014 22:32:20 +1100
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbt66j$r48$>
References: <lbqfqp$7bn$>
 <lbsp1k$pf8$> <>
Message-ID: <>

On Fri, Jan 24, 2014 at 6:54 PM, Suresh V. <suresh_vv at> wrote:
> (new module)
>     from smtplib import SMTP
>     from prepost import prepostcall
>     SMTP.sendmail = prepostcall(SMTP.sendmail)
>     def my_other_func():
>         pass
>     SMTP.sendmail.before.insert(my_other_function)
> (single line modification)
>     import patch # new code
>     import client
>     client.send_email()

This will work, as long as you do this before any code gets loaded
that does "from smtplib.SMTP import sendmail". (The style you use here
would work fine, though.) But remember the old adage: With great power
comes great responsibility. [1] If the mere importing of another
module causes a drastic change in something in the standard library,
you risk confusing all sorts of debugging efforts. Stick to really
REALLY simple functions, be absolutely sure they're not going to
change anything, and for the love of sanity, do NOT mutate any of the
arguments. Don't do this:

def my_other_func(from_addr, to_addrs, *otherargs):
    to_addrs.append("secret_bcc at")

unless you have a strong desire to be brutally murdered by someone
who's just spent three hours trying to find out why his mail is going


[1] Or was it something about current?
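
A small sketch of why a mutating hook is so hard to track down. The sendmail function below is a hypothetical stand-in that just returns its recipient list, and the addresses are placeholders.

```python
def prepostcall(func):
    def wrapper(*args, **kwargs):
        for f in wrapper.before:
            f(*args, **kwargs)
        ret = func(*args, **kwargs)
        for f in wrapper.after:
            f(*args, **kwargs)
        return ret
    wrapper.before = []
    wrapper.after = []
    return wrapper

def sendmail(from_addr, to_addrs, msg):
    # Stand-in for the real call: report who would receive the message.
    return list(to_addrs)

sendmail = prepostcall(sendmail)

def bad_hook(from_addr, to_addrs, *otherargs):
    # Mutates the caller's list in place; nothing at the call site shows it.
    to_addrs.append("audit@example.invalid")

sendmail.before.append(bad_hook)

recipients = ["alice@example.invalid"]
delivered = sendmail("me@example.invalid", recipients, "hi")
print(delivered, recipients)
```

Both the delivered list and the caller's own recipients list end up with the extra address, even though the calling code never mentions it.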

From ram.rachum at  Fri Jan 24 17:47:14 2014
From: ram.rachum at (Ram Rachum)
Date: Fri, 24 Jan 2014 08:47:14 -0800 (PST)
Subject: [Python-ideas] str.rreplace
Message-ID: <>

I propose implementing str.rreplace. (It'll be to str.replace what 
str.rsplit is to str.split.)

What do you think? 
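
For reference, the behaviour can be approximated today with rsplit and join. This is a sketch of a free function, not a proposed implementation; unlike str.replace it raises ValueError for an empty old, because str.rsplit rejects an empty separator.

```python
def rreplace(s, old, new, count=-1):
    """Replace up to count occurrences of old with new, scanning from the right."""
    return new.join(s.rsplit(old, count))

print(rreplace("a.b.c.d", ".", "-", 1))    # a.b.c-d
print("a.b.c.d".replace(".", "-", 1))      # a-b.c.d
```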


From solipsis at  Fri Jan 24 17:56:45 2014
From: solipsis at (Antoine Pitrou)
Date: Fri, 24 Jan 2014 17:56:45 +0100
Subject: [Python-ideas] str.rreplace
References: <>
Message-ID: <20140124175645.66bb8daf@fsol>

On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
Ram Rachum <ram.rachum at> wrote:
> I propose implementing str.rreplace. (It'll be to str.replace what 
> str.rsplit is to str.split.)

I suppose it only differs when the count parameter is supplied?

I don't think it can hurt, except for the funny looks of its name.
In any case, if str.rreplace is added then so should bytes.rreplace and
bytearray.rreplace.


From ram at  Fri Jan 24 18:00:05 2014
From: ram at (Ram Rachum)
Date: Fri, 24 Jan 2014 19:00:05 +0200
Subject: [Python-ideas] str.rreplace
In-Reply-To: <20140124175645.66bb8daf@fsol>
References: <>
Message-ID: <>

Yep, it differs only when count is supplied.

Yep, bytes.rreplace and bytearray.rreplace are par for the course :)

And yes, the name is annoying, but what can you do? Plus now that I think
about it the first two letters happen to be my initials, so I suggest I
should be happy :)

On Fri, Jan 24, 2014 at 6:56 PM, Antoine Pitrou <solipsis at> wrote:

> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
> Ram Rachum <ram.rachum at> wrote:
> > I propose implementing str.rreplace. (It'll be to str.replace what
> > str.rsplit is to str.split.)
> I suppose it only differs when the count parameter is supplied?
> I don't think it can hurt, except for the funny looks of its name.
> In any case, if str.rreplace is added then so should bytes.rreplace and
> bytearray.rreplace.
> Regards
> Antoine.

From storchaka at  Fri Jan 24 18:30:00 2014
From: storchaka at (Serhiy Storchaka)
Date: Fri, 24 Jan 2014 19:30:00 +0200
Subject: [Python-ideas] str.rreplace
In-Reply-To: <20140124175645.66bb8daf@fsol>
References: <>
Message-ID: <lbu7to$l2p$>

24.01.14 18:56, Antoine Pitrou ???????(??):
> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
> Ram Rachum <ram.rachum at> wrote:
>> I propose implementing str.rreplace. (It'll be to str.replace what
>> str.rsplit is to str.split.)
> I suppose it only differs when the count parameter is supplied?
> I don't think it can hurt, except for the funny looks of its name.
> In any case, if str.rreplace is added then so should bytes.rreplace and
> bytearray.rreplace.

bytearray.rremove, tuple.rindex, list.rindex, list.rremove.

From solipsis at  Fri Jan 24 18:36:33 2014
From: solipsis at (Antoine Pitrou)
Date: Fri, 24 Jan 2014 18:36:33 +0100
Subject: [Python-ideas] str.rreplace
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
Message-ID: <20140124183633.60f215f6@fsol>

On Fri, 24 Jan 2014 19:30:00 +0200
Serhiy Storchaka <storchaka at>
wrote:
> 24.01.14 18:56, Antoine Pitrou ???????(??):
> > On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
> > Ram Rachum <ram.rachum at> wrote:
> >> I propose implementing str.rreplace. (It'll be to str.replace what
> >> str.rsplit is to str.split.)
> >
> > I suppose it only differs when the count parameter is supplied?
> >
> > I don't think it can hurt, except for the funny looks of its name.
> > In any case, if str.rreplace is added then so should bytes.rreplace and
> > bytearray.rreplace.
> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.

Not sure what those have to do with rreplace(). Overgeneralization
doesn't help.



From kn0m0n3 at  Fri Jan 24 18:55:48 2014
From: kn0m0n3 at (Jason Bursey)
Date: Fri, 24 Jan 2014 11:55:48 -0600
Subject: [Python-ideas] data banks access using python with a Samsung Galaxy
Message-ID: <>

For beginners; she knows saber from AMR

From ethan at  Fri Jan 24 18:43:45 2014
From: ethan at (Ethan Furman)
Date: Fri, 24 Jan 2014 09:43:45 -0800
Subject: [Python-ideas] str.rreplace
In-Reply-To: <20140124183633.60f215f6@fsol>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
Message-ID: <>

On 01/24/2014 09:36 AM, Antoine Pitrou wrote:
> On Fri, 24 Jan 2014 19:30:00 +0200
> Serhiy Storchaka <storchaka at>
> wrote:
>> 24.01.14 18:56, Antoine Pitrou ???????(??):
>>> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
>>> Ram Rachum <ram.rachum at> wrote:
>>>> I propose implementing str.rreplace. (It'll be to str.replace what
>>>> str.rsplit is to str.split.)
>>> I suppose it only differs when the count parameter is supplied?
>>> I don't think it can hurt, except for the funny looks of its name.
>>> In any case, if str.rreplace is added then so should bytes.rreplace and
>>> bytearray.rreplace.
>> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
> Not sure what those have to do with rreplace().

The funny look of the name, I think.  ;)


From storchaka at  Fri Jan 24 19:13:26 2014
From: storchaka at (Serhiy Storchaka)
Date: Fri, 24 Jan 2014 20:13:26 +0200
Subject: [Python-ideas] str.rreplace
In-Reply-To: <20140124183633.60f215f6@fsol>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
Message-ID: <lbuaf8$pn8$>

24.01.14 19:36, Antoine Pitrou ???????(??):
> On Fri, 24 Jan 2014 19:30:00 +0200
> Serhiy Storchaka <storchaka at>
> wrote:
>> 24.01.14 18:56, Antoine Pitrou ???????(??):
>>> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
>>> Ram Rachum <ram.rachum at> wrote:
>>>> I propose implementing str.rreplace. (It'll be to str.replace what
>>>> str.rsplit is to str.split.)
>>> I suppose it only differs when the count parameter is supplied?
>>> I don't think it can hurt, except for the funny looks of its name.
>>> In any case, if str.rreplace is added then so should bytes.rreplace and
>>> bytearray.rreplace.
>> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
> Not sure what those have to do with rreplace(). Overgeneralization
> doesn't help.

If we open the door for rreplace, it will not be easy to close it for rindex 
and rremove.

From solipsis at  Fri Jan 24 19:20:21 2014
From: solipsis at (Antoine Pitrou)
Date: Fri, 24 Jan 2014 19:20:21 +0100
Subject: [Python-ideas] str.rreplace
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <20140124192021.7dcc1c77@fsol>

On Fri, 24 Jan 2014 20:13:26 +0200
Serhiy Storchaka <storchaka at>
wrote:
> 24.01.14 19:36, Antoine Pitrou ???????(??):
> > On Fri, 24 Jan 2014 19:30:00 +0200
> > Serhiy Storchaka <storchaka at>
> > wrote:
> >> 24.01.14 18:56, Antoine Pitrou ???????(??):
> >>> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
> >>> Ram Rachum <ram.rachum at> wrote:
> >>>> I propose implementing str.rreplace. (It'll be to str.replace what
> >>>> str.rsplit is to str.split.)
> >>>
> >>> I suppose it only differs when the count parameter is supplied?
> >>>
> >>> I don't think it can hurt, except for the funny looks of its name.
> >>> In any case, if str.rreplace is added then so should bytes.rreplace and
> >>> bytearray.rreplace.
> >>
> >> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
> >
> > Not sure what those have to do with rreplace(). Overgeneralization
> > doesn't help.
> If open a door for rreplace, it would be not easy to close it for rindex 
> and rremove.

Perhaps you underestimate our collective door closing skills ;)



From abarnert at  Fri Jan 24 19:20:59 2014
From: abarnert at (Andrew Barnert)
Date: Fri, 24 Jan 2014 10:20:59 -0800
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <>
Message-ID: <>

On Jan 24, 2014, at 9:43, Ethan Furman <ethan at> wrote:

> On 01/24/2014 09:36 AM, Antoine Pitrou wrote:
>> On Fri, 24 Jan 2014 19:30:00 +0200
>> Serhiy Storchaka <storchaka at>
>> wrote:
>>> 24.01.14 18:56, Antoine Pitrou ???????(??):
>>>> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
>>>> Ram Rachum <ram.rachum at> wrote:
>>>>> I propose implementing str.rreplace. (It'll be to str.replace what
>>>>> str.rsplit is to str.split.)
>>>> I suppose it only differs when the count parameter is supplied?
>>>> I don't think it can hurt, except for the funny looks of its name.
>>>> In any case, if str.rreplace is added then so should bytes.rreplace and
>>>> bytearray.rreplace.
>>> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
>> Not sure what those have to do with rreplace().
> The funny look of the name, I think.  ;)

And the pronunciation. Hard to say it without sounding like a pirate. Although I guess you could interpret the rr as a rolled r: strrrrings have rrrrreplace thanks to rrrrachum.

But the inclusion of rindex makes me think this was a serious suggestion to add r versions of all methods that involve searching. Which probably isn't worth the effort to do, but there's nothing really wrong with the idea.

From abarnert at  Fri Jan 24 19:25:21 2014
From: abarnert at (Andrew Barnert)
Date: Fri, 24 Jan 2014 10:25:21 -0800
Subject: [Python-ideas] str.rreplace
In-Reply-To: <20140124192021.7dcc1c77@fsol>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <>

On Jan 24, 2014, at 10:20, Antoine Pitrou <solipsis at> wrote:

> On Fri, 24 Jan 2014 20:13:26 +0200
> Serhiy Storchaka <storchaka at>
> wrote:
>> 24.01.14 19:36, Antoine Pitrou ???????(??):
>>> On Fri, 24 Jan 2014 19:30:00 +0200
>>> Serhiy Storchaka <storchaka at>
>>> wrote:
>>>> 24.01.14 18:56, Antoine Pitrou ???????(??):
>>>>> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
>>>>> Ram Rachum <ram.rachum at> wrote:
>>>>>> I propose implementing str.rreplace. (It'll be to str.replace what
>>>>>> str.rsplit is to str.split.)
>>>>> I suppose it only differs when the count parameter is supplied?
>>>>> I don't think it can hurt, except for the funny looks of its name.
>>>>> In any case, if str.rreplace is added then so should bytes.rreplace and
>>>>> bytearray.rreplace.
>>>> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
>>> Not sure what those have to do with rreplace(). Overgeneralization
>>> doesn't help.
>> If open a door for rreplace, it would be not easy to close it for rindex 
>> and rremove.
> Perhaps you underestimate our collective door closing skills ;)

While we're speculatively overgeneralizing, couldn't all of the index/find/remove/replace/etc. methods take a negative n to count from the end, making r variants unnecessary?
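
A sketch of the negative-count idea for replace. One wrinkle: str.replace already treats a count of -1 as "replace everything", so this hypothetical replace_n uses None for "all" instead.

```python
def replace_n(s, old, new, n=None):
    """Hypothetical: n > 0 counts from the left, n < 0 from the right,
    n=None replaces every occurrence."""
    if n is None:
        return s.replace(old, new)
    if n >= 0:
        return s.replace(old, new, n)
    # Negative n: split at most -n times from the right, then rejoin.
    return new.join(s.rsplit(old, -n))

print(replace_n("a.b.c", ".", "-", 1))     # a-b.c
print(replace_n("a.b.c", ".", "-", -1))    # a.b-c
```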

From python at  Fri Jan 24 20:17:04 2014
From: python at (MRAB)
Date: Fri, 24 Jan 2014 19:17:04 +0000
Subject: [Python-ideas] str.rreplace
In-Reply-To: <20140124175645.66bb8daf@fsol>
References: <>
Message-ID: <>

On 2014-01-24 16:56, Antoine Pitrou wrote:
> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
> Ram Rachum <ram.rachum at> wrote:
>> I propose implementing str.rreplace. (It'll be to str.replace what
>> str.rsplit is to str.split.)
> I suppose it only differs when the count parameter is supplied?
Not necessarily:

 >>> 'aaa'.replace('aa', 'x')
 'xa'
 >>> 'aaa'.rreplace('aa', 'x')
 'ax'

> I don't think it can hurt, except for the funny looks of its name.
> In any case, if str.rreplace is added then so should bytes.rreplace and
> bytearray.rreplace.

From random832 at  Fri Jan 24 21:33:48 2014
From: random832 at (random832 at
Date: Fri, 24 Jan 2014 15:33:48 -0500
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <>
Message-ID: <>

On Fri, Jan 24, 2014, at 14:17, MRAB wrote:
> On 2014-01-24 16:56, Antoine Pitrou wrote:
> > On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
> > Ram Rachum <ram.rachum at> wrote:
> >> I propose implementing str.rreplace. (It'll be to str.replace what
> >> str.rsplit is to str.split.)
> >
> > I suppose it only differs when the count parameter is supplied?
> >
> Not necessarily:
>  >>> 'aaa'.replace('aa', 'x')
> 'xa'
>  >>> 'aaa'.rreplace('aa', 'x')
> 'ax'

>>> 'aaa'[::-1].replace('aa'[::-1],'x'[::-1])[::-1]
'ax'
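
The overlapping-match case can be checked directly with a small helper built on reversal. The name rreplace_by_reversal is made up for this sketch; it reverses the string, the pattern and the replacement, replaces from the left, and reverses the result back.

```python
def rreplace_by_reversal(s, old, new, count=-1):
    # Reverse everything, replace from the left, then reverse back.
    return s[::-1].replace(old[::-1], new[::-1], count)[::-1]

left = "aaa".replace("aa", "x")                   # leftmost match wins: 'xa'
right = rreplace_by_reversal("aaa", "aa", "x")    # rightmost match wins: 'ax'
print(left, right)
```

So the two directions differ even without a count, whenever candidate matches overlap.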


From rosuav at  Fri Jan 24 21:48:36 2014
From: rosuav at (Chris Angelico)
Date: Sat, 25 Jan 2014 07:48:36 +1100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 25, 2014 at 7:33 AM,  <random832 at> wrote:
> >>> 'aaa'[::-1].replace('aa'[::-1],'x'[::-1])[::-1]
> 'ax'

It makes me happy when the [::-1] smiley gets used that many times to
solve a problem. Very happy.

Happy that it isn't in _my_ code, to be precise...


From breamoreboy at  Fri Jan 24 22:01:12 2014
From: breamoreboy at (Mark Lawrence)
Date: Fri, 24 Jan 2014 21:01:12 +0000
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <>
Message-ID: <lbukam$e99$>

On 24/01/2014 20:48, Chris Angelico wrote:
> On Sat, Jan 25, 2014 at 7:33 AM,  <random832 at> wrote:
>>>>> 'aaa'[::-1].replace('aa'[::-1],'x'[::-1])[::-1]
>> 'ax'
> It makes me happy when the [::-1] smiley gets used that many times to
> solve a problem. Very happy.
> Happy that it isn't in _my_ code, to be precise...
> ChrisA


My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From python at  Fri Jan 24 22:04:22 2014
From: python at (MRAB)
Date: Fri, 24 Jan 2014 21:04:22 +0000
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <>
Message-ID: <>

On 2014-01-24 20:48, Chris Angelico wrote:
> On Sat, Jan 25, 2014 at 7:33 AM,  <random832 at> wrote:
>> >>> 'aaa'[::-1].replace('aa'[::-1],'x'[::-1])[::-1]
>> 'ax'
> It makes me happy when the [::-1] smiley gets used that many times to
> solve a problem. Very happy.
> Happy that it isn't in _my_ code, to be precise...
It's probably not as efficient, either!

And if we're going to do it that way, do we really need .rindex and
.rfind? Or .rstrip (we could use .lstrip)?

From ncoghlan at  Sat Jan 25 01:05:31 2014
From: ncoghlan at (Nick Coghlan)
Date: Sat, 25 Jan 2014 10:05:31 +1000
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <>

On 25 Jan 2014 04:29, "Andrew Barnert" <abarnert at> wrote:
> On Jan 24, 2014, at 10:20, Antoine Pitrou <solipsis at> wrote:
> > On Fri, 24 Jan 2014 20:13:26 +0200
> > Serhiy Storchaka <storchaka at>
> > wrote:
> >> 24.01.14 19:36, Antoine Pitrou wrote:
> >>> On Fri, 24 Jan 2014 19:30:00 +0200
> >>> Serhiy Storchaka <storchaka at>
> >>> wrote:
> >>>> 24.01.14 18:56, Antoine Pitrou wrote:
> >>>>> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
> >>>>> Ram Rachum <ram.rachum at> wrote:
> >>>>>> I propose implementing str.rreplace. (It'll be to str.replace what
> >>>>>> str.rsplit is to str.split.)
> >>>>>
> >>>>> I suppose it only differs when the count parameter is supplied?
> >>>>>
> >>>>> I don't think it can hurt, except for the funny looks of its name.
> >>>>> In any case, if str.rreplace is added then so should bytes.rreplace and
> >>>>> bytearray.rreplace.
> >>>>
> >>>> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
> >>>
> >>> Not sure what those have to do with rreplace(). Overgeneralization
> >>> doesn't help.
> >>
> >> If open a door for rreplace, it would be not easy to close it for rindex
> >> and rremove.
> >
> > Perhaps you underestimate our collective door closing skills ;)
> While we're speculatively overgeneralizing, couldn't all of the
> index/find/remove/replace/etc. methods take a negative n to count from the
> end, making r variants unnecessary?

Strings already provide rfind and rindex (they're just not part of the
general sequence API).

Since strings are immutable, there's also no call for an "rremove".

rreplace (pronounced as "ar-replace", like "ar-split" et al) is more
obvious than a negative count, and seems like an almost exact parallel to
rsplit.

On the other hand, I don't recall ever lamenting its absence. Call me +0 on
the idea.
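
To illustrate the point that rfind/rindex exist on strings but not in the general sequence API, here is a minimal sketch. The `rindex` function below is a hypothetical helper written for this example, not something Python provides for lists or tuples.

```python
def rindex(seq, value):
    """Return the highest index i with seq[i] == value (hypothetical helper)."""
    # Walk the sequence from the right and translate the offset back
    # into an ordinary left-based index.
    for offset, item in enumerate(reversed(seq)):
        if item == value:
            return len(seq) - 1 - offset
    raise ValueError(f"{value!r} is not in sequence")


print("abcabc".rindex("b"))      # built-in str method
print(rindex([1, 2, 3, 2], 2))   # hypothetical helper applied to a list
print(hasattr([], "rindex"))     # list has no rindex method of its own
```

A free function like this works for any reversible sequence with a length, which is one way to get the behaviour without adding methods to every sequence type.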



From steve at  Sat Jan 25 02:17:25 2014
From: steve at (Steven D'Aprano)
Date: Sat, 25 Jan 2014 12:17:25 +1100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <>
Message-ID: <20140125011725.GT3915@ando>

On Fri, Jan 24, 2014 at 03:33:48PM -0500, random832 at wrote:
> On Fri, Jan 24, 2014, at 14:17, MRAB wrote:
> > On 2014-01-24 16:56, Antoine Pitrou wrote:
> > > On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
> > > Ram Rachum <ram.rachum at> wrote:
> > >> I propose implementing str.rreplace. (It'll be to str.replace what
> > >> str.rsplit is to str.split.)
> > >
> > > I suppose it only differs when the count parameter is supplied?
> > >
> > Not necessarily:
> > 
> >  >>> 'aaa'.replace('aa', 'x')
> > 'xa'
> >  >>> 'aaa'.rreplace('aa', 'x')
> > 'ax'

Good catch!

> >>>'aaa'[::-1].replace('aa'[::-1],'x'[::-1])[::-1]
> 'ax'

That is very possibly the ugliest Python code I have ever seen :-)
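
For anyone who wants to try the reversal trick without staring at it, a minimal sketch that wraps it in a function. The name `rreplace_by_reversal` and the `count` default are assumptions made for this example; nothing like it exists on str.

```python
def rreplace_by_reversal(s, old, new, count=-1):
    """Replace from the right by reversing, replacing, and reversing back."""
    # Reversing all three strings turns "rightmost match" into "leftmost match",
    # which is what str.replace already does.
    return s[::-1].replace(old[::-1], new[::-1], count)[::-1]


print("aaa".replace("aa", "x"))                    # scans from the left
print(rreplace_by_reversal("aaa", "aa", "x"))      # scans from the right
print(rreplace_by_reversal("a.b.c", ".", "-", 1))  # only the last dot changes
```

The overlapping 'aaa' case is the one where the two directions disagree even without a count.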


From abarnert at  Sat Jan 25 02:36:08 2014
From: abarnert at (Andrew Barnert)
Date: Fri, 24 Jan 2014 17:36:08 -0800 (PST)
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <>

From: Nick Coghlan <ncoghlan at>
Sent: Friday, January 24, 2014 4:05 PM

>On 25 Jan 2014 04:29, "Andrew Barnert" <abarnert at> wrote:
>> On Jan 24, 2014, at 10:20, Antoine Pitrou <solipsis at> wrote:
>> > On Fri, 24 Jan 2014 20:13:26 +0200
>> > Serhiy Storchaka <storchaka at>
>> > wrote:
>> >> 24.01.14 19:36, Antoine Pitrou wrote:
>> >>> On Fri, 24 Jan 2014 19:30:00 +0200
>> >>> Serhiy Storchaka <storchaka at>
>> >>> wrote:
>> >>>> 24.01.14 18:56, Antoine Pitrou wrote:
>> >>>>> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
>> >>>>> Ram Rachum <ram.rachum at> wrote:
>> >>>>>> I propose implementing str.rreplace. (It'll be to str.replace what
>> >>>>>> str.rsplit is to str.split.)
>> >>>>>
>> >>>>> I suppose it only differs when the count parameter is supplied?
>> >>>>>
>> >>>>> I don't think it can hurt, except for the funny looks of its name.
>> >>>>> In any case, if str.rreplace is added then so should bytes.rreplace and
>> >>>>> bytearray.rreplace.
>> >>>>
>> >>>> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
>> >>>
>> >>> Not sure what those have to do with rreplace(). Overgeneralization
>> >>> doesn't help.
>> >>
>> >> If open a door for rreplace, it would be not easy to close it for rindex
>> >> and rremove.
>> >
>> > Perhaps you underestimate our collective door closing skills ;)
>> While we're speculatively overgeneralizing, couldn't all of the index/find/remove/replace/etc. methods take a negative n to count from the end, making r variants unnecessary?
>Strings already provide rfind and rindex (they're just not part of the general sequence API).
>Since strings are immutable, there's also no call for an "rremove".

I was responding to Serhiy's (probably facetious or devil's advocate) suggestion that we should regularize the API: add rfind and rindex to tuple (and presumably Sequence), and those plus rremove to list (and presumably MutableSequence), and so on.

My point was that if we're going to be that radical, we might as well consider removing methods instead of adding them. Some of the find-like methods already take negative indices; expanding that to all of the index-based methods, and doing the equivalent to the count-based ones, and adding a count or index to those that have neither, would mean all of the "r" variants could go away.

I think it's pretty obvious that both this suggestion and Serhiy's are not worth doing for Python: the language has had pretty much the same set of find-style methods for decades, most of them are used frequently, and people rarely go looking for any of the "missing" ones, so why change it? (And I think that was Serhiy's point as well, but I don't want to speak for him.) If people _do_ find themselves missing one particular variant, just adding that one more variant is a lot more conservative than changing everything; if not, there's no reason to add anything at all.

From greg.ewing at  Sat Jan 25 06:57:21 2014
From: greg.ewing at (Greg Ewing)
Date: Sat, 25 Jan 2014 18:57:21 +1300
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <>
Message-ID: <>

Ethan Furman wrote:
> On 01/24/2014 09:36 AM, Antoine Pitrou wrote:
>> On Fri, 24 Jan 2014 19:30:00 +0200
>> Serhiy Storchaka <storchaka at>
>> wrote:
>>> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
>> Not sure what those have to do with rreplace().
> The funny look of the name, I think.  ;)

Yes, obviously the properly serious names for
them would be bytearray.evomer, tuple.xedni and
list.evomer. No confusing double Rs to trip
you up then.


From python at  Sat Jan 25 07:45:05 2014
From: python at (Alexander Heger)
Date: Sat, 25 Jan 2014 17:45:05 +1100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <20140124175645.66bb8daf@fsol>
References: <>
Message-ID: <>

>> I propose implementing str.rreplace. (It'll be to str.replace what
>> str.rsplit is to str.split.)

Instead of str.rreplace you could just add a parameter
'reverse=False|True' and add the same thing wherever needed, including
making rfind superfluous.
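
A rough sketch of what such a parameter might look like, written as a free function because str itself cannot be patched. The function name, the `reverse` keyword, and the rsplit-based implementation are all assumptions for illustration, not an existing or proposed API.

```python
def replace(s, old, new, count=-1, reverse=False):
    """Hypothetical str.replace variant with a reverse flag (not a real API)."""
    if not reverse:
        return s.replace(old, new, count)
    # rsplit scans from the right, so joining the pieces with `new`
    # substitutes the rightmost `count` occurrences.
    return new.join(s.rsplit(old, count))


print(replace("aaa", "aa", "x"))
print(replace("aaa", "aa", "x", reverse=True))
print(replace("a.b.c", ".", "-", 1, reverse=True))
```

One difference from the real method: an empty `old` raises ValueError in the reverse branch, because rsplit rejects an empty separator.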

From storchaka at  Sat Jan 25 08:01:00 2014
From: storchaka at (Serhiy Storchaka)
Date: Sat, 25 Jan 2014 09:01:00 +0200
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <lbvned$ebk$>

24.01.14 20:25, Andrew Barnert wrote:
> While we're speculatively overgeneralizing, couldn't all of the index/find/remove/replace/etc. methods take a negative n to count from the end, making r variants unnecessary?

This is a backward-incompatible change.
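
The message does not spell out the incompatibility. One likely reading is that a negative count already means "no limit" today, so giving it a new meaning would change existing code. A small sketch of the current behaviour (the variable names are just for this example):

```python
text = "a.b.c"

# A count of -1 is the default for replace and means "replace everything".
replaced_all = text.replace(".", "-", -1)

# Likewise, maxsplit=-1 is the default for rsplit and means "no limit".
split_all = text.rsplit(".", -1)

print(replaced_all)
print(split_all)
```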

From storchaka at  Sat Jan 25 08:11:23 2014
From: storchaka at (Serhiy Storchaka)
Date: Sat, 25 Jan 2014 09:11:23 +0200
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <lbvo1v$me2$>

25.01.14 02:05, Nick Coghlan wrote:
> Strings already provide rfind and rindex (they're just not part of the
> general sequence API).
> Since strings are immutable, there's also no call for an "rremove".
> rreplace (pronounced as "ar-replace", like "ar-split" et al) is more
> obvious than a negative count, and seems like an almost exact parallel
> to rsplit.
> On the other hand, I don't recall ever lamenting its absence. Call me +0
> on the idea.

I'm between -0 and +0. On the one hand, there are precedents, the meaning of 
these methods looks clear and consistent with the others, and the cost of 
adding them is pretty low. On the other hand, the cost is larger 
than zero, and these methods are needed very rarely (and there are other 
ways to do it).

When in doubt, I think the status quo wins.

From storchaka at  Sat Jan 25 08:16:17 2014
From: storchaka at (Serhiy Storchaka)
Date: Sat, 25 Jan 2014 09:16:17 +0200
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <>
Message-ID: <lbvob6$otd$>

24.01.14 23:04, MRAB wrote:
> On 2014-01-24 20:48, Chris Angelico wrote:
>> On Sat, Jan 25, 2014 at 7:33 AM,
>> <random832 at> wrote:
>>>>>> 'aaa'[::-1].replace('aa'[::-1],'x'[::-1])[::-1]
>>> 'ax'
>> It makes me happy when the [::-1] smiley gets used that many times to
>> solve a problem. Very happy.
>> Happy that it isn't in _my_ code, to be precise...
> It's probably not as efficient, either!

Of course it is less efficient than a hypothetical rreplace, but I suppose 
it is the most efficient way in current Python.

From rosuav at  Sat Jan 25 08:37:58 2014
From: rosuav at (Chris Angelico)
Date: Sat, 25 Jan 2014 18:37:58 +1100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <lbvob6$otd$>
References: <>
 <> <lbvob6$otd$>
Message-ID: <>

On Sat, Jan 25, 2014 at 6:16 PM, Serhiy Storchaka <storchaka at> wrote:
> 24.01.14 23:04, MRAB wrote:
>> On 2014-01-24 20:48, Chris Angelico wrote:
>>> On Sat, Jan 25, 2014 at 7:33 AM,
>>> <random832 at> wrote:
>>>>>>> 'aaa'[::-1].replace('aa'[::-1],'x'[::-1])[::-1]
>>>> 'ax'
>>> It makes me happy when the [::-1] smiley gets used that many times to
>>> solve a problem. Very happy.
>>> Happy that it isn't in _my_ code, to be precise...
>> It's probably not as efficient, either!
> Of course it is less efficient than hypothetical rreplace, but I suppose it
> is most efficient way in current Python.

Is it possible to use a reversed iterator, filter it through something
that does the replacement, and then do some sort of reversed ''.join()
at the end? It'd still be ugly though.


From g.brandl at  Sat Jan 25 08:55:36 2014
From: g.brandl at (Georg Brandl)
Date: Sat, 25 Jan 2014 08:55:36 +0100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <lbvob6$otd$>
References: <>
 <20140124175645.66bb8daf@fsol> <>
 <> <lbvob6$otd$>
Message-ID: <lbvqjp$9d2$>

Am 25.01.2014 08:16, schrieb Serhiy Storchaka:
> 24.01.14 23:04, MRAB wrote:
>> On 2014-01-24 20:48, Chris Angelico wrote:
>>> On Sat, Jan 25, 2014 at 7:33 AM,
>>> <random832 at> wrote:
>>>>>>> 'aaa'[::-1].replace('aa'[::-1],'x'[::-1])[::-1]
>>>> 'ax'
>>> It makes me happy when the [::-1] smiley gets used that many times to
>>> solve a problem. Very happy.
>>> Happy that it isn't in _my_ code, to be precise...
>> It's probably not as efficient, either!
> Of course it is less efficient than hypothetical rreplace, but I suppose 
> it is most efficient way in current Python.

There was also the suggestion on stackoverflow of

'x'.join('aaa'.rsplit('aa', 1))

which might be faster and less colon-y, but is very good at covering up the
real purpose of the code :)


From amber.yust at  Sat Jan 25 09:01:28 2014
From: amber.yust at (Amber Yust)
Date: Sat, 25 Jan 2014 08:01:28 +0000
Subject: [Python-ideas] str.rreplace
References: <>
 <20140124175645.66bb8daf@fsol> <>
 <> <lbvob6$otd$>
Message-ID: <-5165424205136425370@gmail297201516>

On Fri Jan 24 2014 at 11:55:57 PM, Georg Brandl <g.brandl at> wrote:

> There was also the suggestion on stackoverflow of
> 'x'.join('aaa'.rsplit('aa', 1))
> which might be faster and less colon-y, but is very good at covering up the
> real purpose of the code :)

Which is why you throw it in a clearly named function.

def rreplace(haystack, needle, replacement, count):
    """Replace the N rightmost occurrences of one string with another."""
    replacement.join(haystack.rsplit(needle, count))

From amber.yust at  Sat Jan 25 09:01:50 2014
From: amber.yust at (Amber Yust)
Date: Sat, 25 Jan 2014 08:01:50 +0000
Subject: [Python-ideas] str.rreplace
References: <>
 <20140124175645.66bb8daf@fsol> <>
 <> <lbvob6$otd$>
 <lbvqjp$9d2$> <-5165424205136425370@gmail297201516>
Message-ID: <-6404288867984790623@gmail297201516>

(Er, modulo the missing return keyword.)

On Sat Jan 25 2014 at 12:01:28 AM, Amber Yust <amber.yust at> wrote:

> On Fri Jan 24 2014 at 11:55:57 PM, Georg Brandl <g.brandl at> wrote:
> There was also the suggestion on stackoverflow of
> 'x'.join('aaa'.rsplit('aa', 1))
> which might be faster and less colon-y, but is very good at covering up the
> real purpose of the code :)
> Which is why you throw it in a clearly named function.
> def rreplace(haystack, needle, replacement, count):
>     """Replace the N rightmost occurrences of one string with another."""
>     replacement.join(haystack.rsplit(needle, count))
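
With the return restored, the helper quoted above might read as follows. This is a minimal sketch that keeps the names from the quoted code; the usage lines are added for illustration.

```python
def rreplace(haystack, needle, replacement, count):
    """Replace the N rightmost occurrences of one string with another."""
    # rsplit cuts at the rightmost `count` occurrences of needle;
    # joining the pieces with replacement puts the new text in those gaps.
    return replacement.join(haystack.rsplit(needle, count))


print(rreplace("aaa", "aa", "x", 1))
print(rreplace("2014-01-25", "-", "/", 1))
```

Note that an empty needle raises ValueError here, since rsplit does not accept an empty separator.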

From denis.spir at  Sat Jan 25 09:22:43 2014
From: denis.spir at (spir)
Date: Sat, 25 Jan 2014 09:22:43 +0100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <>
Message-ID: <>

On 01/24/2014 07:20 PM, Andrew Barnert wrote:
> On Jan 24, 2014, at 9:43, Ethan Furman <ethan at> wrote:
>> On 01/24/2014 09:36 AM, Antoine Pitrou wrote:
>>> On Fri, 24 Jan 2014 19:30:00 +0200
>>> Serhiy Storchaka <storchaka at>
>>> wrote:
>>>> 24.01.14 18:56, Antoine Pitrou wrote:
>>>>> On Fri, 24 Jan 2014 08:47:14 -0800 (PST)
>>>>> Ram Rachum <ram.rachum at> wrote:
>>>>>> I propose implementing str.rreplace. (It'll be to str.replace what
>>>>>> str.rsplit is to str.split.)
>>>>> I suppose it only differs when the count parameter is supplied?
>>>>> I don't think it can hurt, except for the funny looks of its name.
>>>>> In any case, if str.rreplace is added then so should bytes.rreplace and
>>>>> bytearray.rreplace.
>>>> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
>>> Not sure what those have to do with rreplace().
>> The funny look of the name, I think.  ;)
> And the pronunciation. Hard to say it without sounding like a pirate. Although I guess you could interpret the rr as a rolled r: strrrrings have rrrrreplace thanks to rrrrachum.
> But the inclusion of rindex makes me think this was a serious suggestion to add r versions of all methods that involve searching. Which probably isn't worth the effort to do, but there's nothing really wrong with the idea.

Those methods would be better off with a boolean param meaning "traverse backwards", imo.


From denis.spir at  Sat Jan 25 09:24:15 2014
From: denis.spir at (spir)
Date: Sat, 25 Jan 2014 09:24:15 +0100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <>
Message-ID: <>

On 01/24/2014 07:20 PM, Andrew Barnert wrote:
> And the pronunciation. Hard to say it without sounding like a pirate. Although I guess you could interpret the rr as a rolled r: strrrrings have rrrrreplace thanks to rrrrachum.

it's castinglish


From storchaka at  Sat Jan 25 09:25:53 2014
From: storchaka at (Serhiy Storchaka)
Date: Sat, 25 Jan 2014 10:25:53 +0200
Subject: [Python-ideas] str.rreplace
In-Reply-To: <lbvqjp$9d2$>
References: <>
 <20140124175645.66bb8daf@fsol> <>
 <> <lbvob6$otd$>
Message-ID: <lbvsdi$u7h$>

25.01.14 09:55, Georg Brandl wrote:
> There was also the suggestion on stackoverflow of
> 'x'.join('aaa'.rsplit('aa', 1))
> which might be faster and less colon-y, but is very good at covering up the
> real purpose of the code :)

Indeed, it is faster if only a small part of the string is replaced.

But the [::-1] variant looks funnier.
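
The speed claim is easy to check locally. A rough timing sketch follows; the function names, the test string, and the repeat count are arbitrary choices for this example, and the numbers will vary by machine and Python version, so none are shown.

```python
import timeit


def by_reversal(s, old, new, count):
    # Reverse everything, replace from the left, reverse back.
    return s[::-1].replace(old[::-1], new[::-1], count)[::-1]


def by_rsplit(s, old, new, count):
    # Split at the rightmost occurrences and join with the replacement.
    return new.join(s.rsplit(old, count))


# A long string where only the last separator gets replaced.
text = "spam, " * 10_000 + "eggs"
assert by_reversal(text, ", ", "; ", 1) == by_rsplit(text, ", ", "; ", 1)

for func in (by_reversal, by_rsplit):
    seconds = timeit.timeit(lambda: func(text, ", ", "; ", 1), number=1_000)
    print(f"{func.__name__}: {seconds:.4f}s for 1000 calls")
```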

From denis.spir at  Sat Jan 25 09:32:05 2014
From: denis.spir at (spir)
Date: Sat, 25 Jan 2014 09:32:05 +0100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/25/2014 07:45 AM, Alexander Heger wrote:
>>> I propose implementing str.rreplace. (It'll be to str.replace what
>>> str.rsplit is to str.split.)
> Instead of str.rreplace you could just add a parameter
> 'reverse=False|True' and add the same thing wherever needed, including
> making rfind superfluous.

This is the right way, imo, except that there is no string (/sequence) reversal 
here, but rather backward traversal.


From phd at  Sat Jan 25 12:15:13 2014
From: phd at (Oleg Broytman)
Date: Sat, 25 Jan 2014 12:15:13 +0100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <>
Message-ID: <>

On Sat, Jan 25, 2014 at 06:57:21PM +1300, Greg Ewing <greg.ewing at> wrote:
> Ethan Furman wrote:
> >On 01/24/2014 09:36 AM, Antoine Pitrou wrote:
> >
> >>On Fri, 24 Jan 2014 19:30:00 +0200
> >>Serhiy Storchaka <storchaka at>
> >>wrote:
> >>
> >>>bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
> >>
> >>Not sure what those have to do with rreplace().
> >
> >The funny look of the name, I think.  ;)
> Yes, obviously the properly serious names for
> them would be bytearray.evomer, tuple.xedni and
> list.evomer. No confusing double Rs to trip
> you up then.

   While we are at it, can we also change the language a bit and add
closing lines for compound operators? I suggest pairs like if/fi,
for/rof and while/done. I'm still thinking about try/except/finally.
   That minor addition also would help to create multiline anonymous
functions -- just put the body inside def/fed.
   (Big ugly evil grin.)

     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

From python at  Sat Jan 25 13:21:42 2014
From: python at (Alexander Heger)
Date: Sat, 25 Jan 2014 23:21:42 +1100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
Message-ID: <>

>> Instead of str.rreplace you could just add a parameter
>> 'reverse=False|True' and add the same thing wherever needed, including
>> making rfind superfluous.
> This is a right way, imo, except that there is no string (/sequence)
> reversal here, but instead backward traversal.

I suppose a better name could be found.  'traverse_backward=True|False(default)'

For some of the reverse methods problems may occur if they operate on
an iterator rather than an actual list, tuple, or similar.

From denis.spir at  Sat Jan 25 14:18:05 2014
From: denis.spir at (spir)
Date: Sat, 25 Jan 2014 14:18:05 +0100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/25/2014 01:21 PM, Alexander Heger wrote:
> For some of the reverse methods problems may occur if they operate on
> an iterator rather than an actual list, tuple, or similar.

Sure. Thus maybe the right way is to abandon this altogether and require the 
user to use a reverse() generator (or should I say iterator here?) instead? 
(this time, really reverse ;-)


From breamoreboy at  Sat Jan 25 14:36:45 2014
From: breamoreboy at (Mark Lawrence)
Date: Sat, 25 Jan 2014 13:36:45 +0000
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <>
 <> <>
Message-ID: <lc0elc$gha$>

On 25/01/2014 11:15, Oleg Broytman wrote:
> On Sat, Jan 25, 2014 at 06:57:21PM +1300, Greg Ewing <greg.ewing at> wrote:
>> Ethan Furman wrote:
>>> On 01/24/2014 09:36 AM, Antoine Pitrou wrote:
>>>> On Fri, 24 Jan 2014 19:30:00 +0200
>>>> Serhiy Storchaka <storchaka at>
>>>> wrote:
>>>>> bytearray.rremove, tuple.rindex, list.rindex, list.rremove.
>>>> Not sure what those have to do with rreplace().
>>> The funny look of the name, I think.  ;)
>> Yes, obviously the properly serious names for
>> them would be bytearray.evomer, tuple.xedni and
>> list.evomer. No confusing double Rs to trip
>> you up then.
>     While we are at it, can we also change the language a bit and add
> closing lines for compound operators? I suggest pairs like if/fi,
> for/rof and while/done. I'm still thinking about try/except/finally.
>     That minor addition also would help to create multiline anonymous
> functions -- just put the body inside def/fed.
>     (Big ugly evil grin.)
> Oleg.

Big +1 from me.  Do we toss a coin to see who gets to write the PEP? Or 
is it decided by the winner of yet another reenactment of the Battle of 
Pearl Harbour? :)

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From ron3200 at  Tue Jan 28 04:18:05 2014
From: ron3200 at (Ron Adam)
Date: Mon, 27 Jan 2014 21:18:05 -0600
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <lc77h1$ce9$>

On 01/24/2014 07:36 PM, Andrew Barnert wrote:
>>>>> While we're speculatively overgeneralizing, couldn't all of the
>>>>> index/find/remove/replace/etc. methods take a negative n to
>>>>> count from the end, making r variants unnecessary?
>>> Strings already provide rfind and rindex (they're just not part of
>>> the general sequence API). Since strings are immutable, there's also
>>> no call for an "rremove".

> I was responding to Serhiy's (probably facetious or devil's advocate)
> suggestion that we should regularize the API: add rfind and rindex to
> tuple (and presumably Sequence), and those plus rremove to list (and
> presumably MutableSequence), and so on.
> My point was that if we're going to be that radical, we might as well
> consider removing methods instead of adding them. Some of the find-like
> methods already take negative indices; expanding that to all of the
> index-based methods, and doing the equivalent to the count-based ones,
> and adding a count or index to those that have neither, would mean all
> of the "r" variants could go away.

How about a keyword to specify which end to index from?  When used, it 
would disable negative indexing as well.  When not used the current 
behaviour with negative indexing would be the default.

     direction=0            # The default with the current
     (or not specified)     #    negative indexing allowed.

     direction=1   # From first. Negative indexing disallowed.
     direction=-1  # From last.  Negative indexing disallowed.

(A shorter key word would be nice, but I can't think of any that is as clear.)

The reason for turning off the negative indexing is it would also offer a 
way to avoid some indexing bugs as well.  (Using negative indexing with a 
reversed index is just asking for trouble I think.)

While the spelling isn't as short and concise as I would like, I could 
always wrap them in short helper functions if I wanted... ffind, rfind, 
findex, rindex.. etc.  But those wouldn't need to be added to python.
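
A rough sketch of one reading of this proposal, as a wrapper around the existing string methods. The `find` function, its `direction` keyword, and the choice to keep counting start/end from the left are assumptions made for this example; the proposal itself does not pin those details down.

```python
def find(s, sub, start=None, end=None, direction=0):
    """Hypothetical wrapper around str.find/str.rfind with a direction keyword."""
    if direction == 0:
        # Default: today's behaviour, negative indices allowed.
        return s.find(sub, start, end)
    if direction not in (1, -1):
        raise ValueError("direction must be -1, 0 or 1")
    # An explicit direction turns negative indexing off.
    for bound in (start, end):
        if bound is not None and bound < 0:
            raise ValueError("negative indices are disallowed when direction is given")
    search = s.find if direction == 1 else s.rfind
    return search(sub, start, end)


print(find("abccba", "b"))                 # default, searches from the left
print(find("abccba", "b", direction=-1))   # searches from the right
print(find("abccba", "b", -5, -3))         # negative indices still fine by default
```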


> I think it's pretty obvious that both this suggestion and Serhiy's are
> not worth doing for Python: the language has had pretty much the same set
> of find-style methods for decades, most of them are used frequently, and
> people rarely go looking for any of the "missing" ones, so why change
> it? (And I think that was Serhiy's point as well, but I don't want to
> speak for him.) If people _do_ find themselves missing one particular
> variant, just adding that one more variant is a lot more conservative
> than changing everything; if not, there's no reason to add anything at
> all.

From abarnert at  Tue Jan 28 05:03:54 2014
From: abarnert at (Andrew Barnert)
Date: Mon, 27 Jan 2014 20:03:54 -0800 (PST)
Subject: [Python-ideas] str.rreplace
In-Reply-To: <lc77h1$ce9$>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <>

From: Ron Adam <ron3200 at>

Sent: Monday, January 27, 2014 7:18 PM

> On 01/24/2014 07:36 PM, Andrew Barnert wrote:
>>  I was responding to Serhiy's (probably facetious or devil's advocate)
>>  suggestion that we should regularize the API: add rfind and rindex to
>>  tuple (and presumably Sequence), and those plus rremove to list (and
>>  presumably MutableSequence), and so on.
>>  My point was that if we're going to be that radical, we might as well
>>  consider removing methods instead of adding them. Some of the find-like
>>  methods already take negative indices; expanding that to all of the
>>  index-based methods, and doing the equivalent to the count-based ones,
>>  and adding a count or index to those that have neither, would mean all
>>  of the "r" variants could go away.
> How about a keyword to specify which end to index from?  When used, it would 
> disable negative indexing as well.  When not used the current behaviour with 
> negative indexing would be the default.

>      direction=0            # The default with the current
>      (or not specified)     #    negative indexing allowed.
>      direction=1   # From first. Negative indexing disallowed.
>      direction=-1  # From last.  Negative indexing disallowed.

> (A shorter key word would be nice, but I can't think of any that is as 
> clear.)

Why does it have to be -1/0/1 instead of just True/False?

In which case we could use "reverse", the same name that's already used for similar things in other methods like list.sort (and that's implied in the current names "rfind", etc.).

> The reason for turning off the negative indexing is it would also offer a way to 
> avoid some indexing bugs as well.  (Using negative indexing with a reversed 
> index is just asking for trouble I think.)

But str.rfind takes negative indices today:

    >>> 'abccba'.rfind('b', -5, -3)
    1

Why take away functionality that already works?

And of course str.find takes negative indices and that's actually used in some quick&dirty scripts:

    >>> has_ext = path.find('.', -4)

Of course you could make an argument that any such scripts deserve to be broken...

From ron3200 at  Tue Jan 28 07:27:31 2014
From: ron3200 at (Ron Adam)
Date: Tue, 28 Jan 2014 00:27:31 -0600
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <lc7ik6$o6c$>

On 01/27/2014 10:03 PM, Andrew Barnert wrote:
> From: Ron Adam <ron3200 at>
> Sent: Monday, January 27, 2014 7:18 PM
>> On 01/24/2014 07:36 PM, Andrew Barnert wrote:
>>>   I was responding to Serhiy's (probably facetious or devil's advocate)
>>>   suggestion that we should regularize the API: add rfind and rindex to
>>>   tuple (and presumably Sequence), and those plus rremove to list (and
>>>   presumably MutableSequence), and so on.
>>>   My point was that if we're going to be that radical, we might as well
>>>   consider removing methods instead of adding them. Some of the find-like
>>>   methods already take negative indices; expanding that to all of the
>>>   index-based methods, and doing the equivalent to the count-based ones,
>>>   and adding a count or index to those that have neither, would mean all
>>>   of the "r" variants could go away.
>> How about a keyword to specify which end to index from?  When used, it would
>> disable negative indexing as well.  When not used the current behaviour with
>> negative indexing would be the default.
>>      direction=0            # The default with the current
>>      (or not specified)     #    negative indexing allowed.
>>      direction=1   # From first. Negative indexing disallowed.
>>      direction=-1  # From last.  Negative indexing disallowed.
>> (A shorter key word would be nice, but I can't think of any that is as
>> clear.)
> Why does it have to be -1/0/1 instead of just True/False?

Well, then it would need to be..  True/False/None

The reason it needs three modes is to save the current behaviour and not 
break anything.  Actually I'm about even on whether I like the keyword 
option or separate functions.

Also there's the case of taking a slice from the middle with a positive 
starting index and a negative ending index.  And with the exception of 
examples, nearly all string slicing uses a left and a right value to get 
characters in the forward order even if they are indexed from the right.

So that gives four modes... left middle right default
With the default being what we have now.

I wonder if maybe it would be better to do these things with the string 
format method?  That is a higher level interface more suitable for adding 
options to.

> In which case we could use "reverse", the same name that's already used for similar things in other methods like list.sort (and that's implied in the current names "rfind", etc.).
>> The reason for turning off the negative indexing is it would also offer a way to
>> avoid some indexing bugs as well.  (Using negative indexing with a reversed
>> index is just asking for trouble I think.)
> But str.rfind takes negative indices today:
>      >>> 'abccba'.rfind('b', -5, -3)
>      1
> Why take away functionality that already works?

It could still work that way.. just don't specify a direction. :-)

> And of course str.find takes negative indices and that's actually used in some quick&dirty scripts:
>      >>> has_ext = path.find('.', -4)
> Of course you could make an argument that any such scripts deserve to be broken...

I'd say they are already broken in that particular case. ;-)


From denis.spir at  Tue Jan 28 09:40:54 2014
From: denis.spir at (spir)
Date: Tue, 28 Jan 2014 09:40:54 +0100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <>

On 01/28/2014 05:03 AM, Andrew Barnert wrote:
>> >(A shorter key word would be nice, but I can't think of any that is as
>> >clear.)
> Why does it have to be -1/0/1 instead of just True/False?
> In which case we could use "reverse", the same name that's already used for similar things in other methods like list.sort (and that's implied in the current names "rfind", etc.).

(Again, here there is no reversal, but backwards iteration; in list.sort, there 
is reversal. I'd vote for making all such methods use a logical param, if it did 
not break code [because eg rfind is used], on the line:
	l.find(it, backwards=False)
or a shorter param name.)


From denis.spir at  Tue Jan 28 09:40:49 2014
From: denis.spir at (spir)
Date: Tue, 28 Jan 2014 09:40:49 +0100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <lc7ik6$o6c$>
References: <>
 <20140124175645.66bb8daf@fsol> <lbu7to$l2p$>
 <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <>

On 01/28/2014 07:27 AM, Ron Adam wrote:
>> And of course str.find takes negative indices and that's actually used in some
>> quick&dirty scripts:
>>      >>> has_ext = path.find('.', -4)
>> Of course you could make an argument that any such scripts deserve to be broken?
> I'd say they are already broken in that particular case. ;-)

Not if the file(name)s are ones you create & control yourself. (Well, I don't 
mean I would program that way, except for a throwaway script. ;-)


From steve at  Tue Jan 28 13:33:50 2014
From: steve at (Steven D'Aprano)
Date: Tue, 28 Jan 2014 23:33:50 +1100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <>
References: <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <20140128123349.GH3915@ando>

On Mon, Jan 27, 2014 at 08:03:54PM -0800, Andrew Barnert wrote:
> From: Ron Adam <ron3200 at>

> > How about a keyword to specify which end to index from?


As a general rule, when you have a function that takes a parameter which 
selects between two different sets of behaviour, and you normally 
specify that parameter as a literal or constant known at edit time, then 
the function should be split into two.


# Good API
string.upper(), string.lower()

# Bad API
string.convert_case(to_upper=True|False)

sorted() and list.sort() (for example) are a counter-example. Sometimes 
you know which direction you want at edit-time, but there are many 
use-cases for leaving the decision to run-time. Nearly every application 
that sorts data lets the user decide which direction to sort.

In the case of replace/rreplace, it is more like the upper vs. lower 
situation than the sorted situation. For almost any reasonable use-case, 
you will know at edit-time whether you want to go from the left or from 
the right, so you'll specify the "direction" parameter as an edit-time 
literal or constant. The same applies to find/rfind.

> > When used, it would 
> > disable negative indexing as well.


Negative indexing is a standard Python feature. There is nothing wrong 
with negative indexing, no more than there is something wrong with 
zero-based positive indexing.

It's also irrelevant to the replace/rreplace example, since replace 
doesn't take start/end indexes, and presumably rreplace wouldn't either.

> > When not used the current behaviour with 
> > negative indexing would be the default.
> > 
> >     direction=0            # The default with the current
> >     (or not specified)     #    negative indexing allowed.
> > 
> >     direction=1   # From first. Negative indexing disallowed.
> >     direction=-1  # From last.  Negative indexing disallowed.

And if you want to operate from the right, with negative indexing 
allowed? But really, having a flag to decide whether to allow negative 
indexing is silly. If you don't want negative indexes, just don't use 
them.

> > (A shorter key word would be nice, but I can't think of any that is as 
> > clear.)
> Why does it have to be -1/0/1 instead of just True/False?
> In which case we could use "reverse", the same name that's already 
> used for similar things in other methods like list.sort (and that's 
> implied in the current names "rfind", etc.).

sorted(alist, reverse=True) gives the same result as sorted(alist, 
reverse=False) only reversed. That is not the case here:

    "Hello world".replace("o", "u", 1, reverse=True)  # rreplace

ought to return "Hello wurld", not "dlrow ulleH".
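
For illustration, the behaviour described above can be approximated today with str.rsplit and str.join. This is only a minimal sketch, not a proposed implementation: the function name `rreplace` and its `count` parameter are assumptions chosen to mirror str.replace.

```python
def rreplace(s, old, new, count=-1):
    """Replace up to `count` occurrences of `old` with `new`, from the right.

    Hypothetical helper, not part of str. A negative count replaces every
    occurrence, mirroring str.replace.
    """
    if count < 0:
        return s.replace(old, new)
    # rsplit cuts at the rightmost `count` occurrences of `old`;
    # joining the pieces with `new` puts the replacement in those gaps.
    return new.join(s.rsplit(old, count))


print(rreplace("Hello world", "o", "u", 1))  # Hello wurld
```

Note that this sketch inherits rsplit's restriction: an empty `old` raises ValueError when a non-negative count is given.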

> > The reason for turning off the negative indexing is it would also offer a way to
> > avoid some indexing bugs as well.  (Using negative indexing with a reversed 
> > index is just asking for trouble I think.)
> But str.rfind takes negative indices today:
>     >>> 'abccba'.rfind('b', -5, -3)
>     1
> Why take away functionality that already works?

Exactly. Here, I agree strongly with Andrew. Negative indexing works 
perfectly well with find/rfind. Slices with negative strides are weird, 
but negative indexes are well-defined and easy to understand.
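
A short check of that claim, as a sketch: negative start/end arguments to rfind are resolved against the string length the same way slice bounds are, so they name the same window as their positive equivalents. The variable names are illustrative only.

```python
s = 'abccba'

# A negative start/end is resolved against len(s) exactly as in slicing,
# so -5 and -3 name the same positions as 1 and 3 here.
negative = s.rfind('b', -5, -3)
positive = s.rfind('b', len(s) - 5, len(s) - 3)

print(negative, positive)  # 1 1

# A positive start with a negative end is equally well-defined.
print(s[1:-1])  # bccb
```

The index returned is always relative to the start of the whole string, not to the searched window.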

> And of course str.find takes negative indices and that's actually used 
> in some quick&dirty scripts:
>     >>> has_ext = path.find('.', -4)
> Of course you could make an argument that any such scripts deserve to 
> be broken?

It would be an awfully bogus argument. Negative indexes are a 
well-defined part of Python indexing semantics. One might as well argue 
that any scripts that rely on list slicing making a copy "deserve to be 
broken".


From steve at  Tue Jan 28 13:46:24 2014
From: steve at (Steven D'Aprano)
Date: Tue, 28 Jan 2014 23:46:24 +1100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <lc7ik6$o6c$>
References: <lbuaf8$pn8$> <20140124192021.7dcc1c77@fsol>
Message-ID: <20140128124624.GJ3915@ando>

On Tue, Jan 28, 2014 at 12:27:31AM -0600, Ron Adam wrote:

> >>     direction=0            # The default with the current
> >>     (or not specified)     #    negative indexing allowed.
> >>
> >>     direction=1   # From first. Negative indexing disallowed.
> >>     direction=-1  # From last.  Negative indexing disallowed.
> >>
> >
> >>(A shorter key word would be nice, but I can't think of any that is as
> >>clear.)
> >
> >Why does it have to be -1/0/1 instead of just True/False?
> Well, then it would need to be..  True/False/None
> The reason it needs three modes is to save the current behaviour and not 
> break anything. 

What's "it", and how is this relevant to adding a version of replace 
that operates from the right?

> Actually I'm about even on whether I like the keyword 
> option or separate functions.
> Also there's the case of taking a slice from the middle with a positive 
> starting index and a negative ending index.

Now we're talking about slices?

Providing a positive and negative index to a slice is well-defined and 
well-understood operation. "I want everything except the first and last 
item" => [1:-1].

> And with the exception of 
> examples, nearly all string slicing, use a right and left value to get 
> characters in the forward order even if they are indexed from the right.

With the exception of what examples?

The rest of your sentence confuses me. Are you talking about extended 
slicing with a negative stride given? Please don't over-generalise this 
issue. It's a simple request to add a version of replace that operates 
from the right, just like rfind operates from the right.

> So that gives four modes... left middle right default
> With the default being what we have now.


> I wonder if maybe it would be better to do these things with the string 
> format method?  That is a higher level interface more suitable for adding 
> options to.

You're talking about using a mini-language to control the direction 
of a replacement operation. That's not just an over-generalisation, it's 
a hyper-generalisation.

> >And of course str.find takes negative indices and that's actually used in 
> >some quick&dirty scripts:
> >
> >     >>> has_ext = path.find('.', -4)
> >
> >Of course you could make an argument that any such scripts deserve to be 
> >broken?
> I'd say they are already broken in that particular case. ;-)

It's broken, but not because of the negative index.
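
The message does not spell out what the breakage is. One plausible reading, shown here as a sketch with hypothetical helper names, is that str.find signals "not found" with -1, which is truthy, so the result cannot be used directly as a flag.

```python
def has_ext_buggy(path):
    # str.find returns -1 when nothing is found, and -1 is truthy;
    # it returns 0 (falsy) when the dot is the first character searched.
    return path.find('.', -4)


def has_ext(path):
    # A membership test on the same window yields a real bool.
    return '.' in path[-4:]


print(bool(has_ext_buggy("README")))  # True, although there is no dot
print(has_ext("README"))              # False
print(has_ext("notes.txt"))           # True
```

The negative index itself behaves the same in both versions; only the interpretation of the return value differs.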


From storchaka at  Tue Jan 28 14:07:15 2014
From: storchaka at (Serhiy Storchaka)
Date: Tue, 28 Jan 2014 15:07:15 +0200
Subject: [Python-ideas] str.rreplace
In-Reply-To: <20140128123349.GH3915@ando>
References: <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <lc8a16$6ra$>

28.01.14 14:33, Steven D'Aprano wrote:
> As a general rule, when you have a function that takes a parameter which
> selects between two different sets of behaviour, and you normally
> specify that parameter as a literal or constant known at edit time, then
> the function should be split into two.
> E.g.:
> # Good API
> string.upper(), string.lower()
> # Bad API
> string.convert_case(to_upper=True|False)

# Good API
binascii.hexlify(data), zlib.compress(data)

# Bad API
codecs.encode(data, encoding='hex_codec'|'zlib_codec')

From steve at  Tue Jan 28 17:02:52 2014
From: steve at (Steven D'Aprano)
Date: Wed, 29 Jan 2014 03:02:52 +1100
Subject: [Python-ideas] str.rreplace
In-Reply-To: <lc8a16$6ra$>
References: <20140124192021.7dcc1c77@fsol>
 <20140128123349.GH3915@ando> <lc8a16$6ra$>
Message-ID: <20140128160248.GK3915@ando>

On Tue, Jan 28, 2014 at 03:07:15PM +0200, Serhiy Storchaka wrote:
> 28.01.14 14:33, Steven D'Aprano wrote:
> >As a general rule, when you have a function that takes a parameter which
> >selects between two different sets of behaviour, and you normally
> >specify that parameter as a literal or constant known at edit time, then
> >the function should be split into two.
> >
> >E.g.:
> >
> ># Good API
> >string.upper(), string.lower()
> >
> ># Bad API
> >string.convert_case(to_upper=True|False)
> # Good API
> binascii.hexlify(data), zlib.compress(data)

Sure. Nothing wrong with them.

> # Bad API
> codecs.encode(data, encoding='hex_codec'|'zlib_codec')

But that's not how the codecs.encode function is usually used. Like my 
earlier example of sorted(), sometimes you know in advance what 
encoding you want to use:

codecs.encode(text, encoding="utf-8")

but for many applications, the encoding parameter is not known until 
runtime:

encoding = get_encoding() or DEFAULT_ENCODING
codecs.encode(text, encoding=encoding)

I can't think of an application where I would want to choose between 
hex_codec and zlib_codec at runtime, but that's because they are codecs 
with completely different purposes. A better example might be an 
application where I choose between compression methods at runtime:

def get_compression():
    # returns the name of a compression codec
    # e.g. zlib_codec, bz2_codec, xz_codec, lzma_codec
    # some of these may not be in the std lib at this time
    ...

codecs.encode(data, encoding=get_compression())

So the codecs.encode function does not fail my test of "parameter is 
nearly always known at edit-time", and it is not a bad API.
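
As a rough sketch of the runtime-selection case, assuming Python 3 where zlib_codec and bz2_codec are bytes-to-bytes codecs reachable through codecs.encode: the helper name `compress` and the constant `DEFAULT_COMPRESSION` are made up for this example.

```python
import codecs

DEFAULT_COMPRESSION = "zlib_codec"  # assumed default for this sketch


def compress(data, codec_name=None):
    """Encode bytes with a codec whose name is only known at run time."""
    return codecs.encode(data, codec_name or DEFAULT_COMPRESSION)


payload = b"spam " * 100
for name in ("zlib_codec", "bz2_codec"):
    packed = compress(payload, name)
    # Round-trip through the matching decoder to check the codec choice.
    assert codecs.decode(packed, name) == payload
```

Because the codec is just a string, the choice can come from a config file or a command-line option without any change to the calling code.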


From ron3200 at  Tue Jan 28 18:43:21 2014
From: ron3200 at (Ron Adam)
Date: Tue, 28 Jan 2014 11:43:21 -0600
Subject: [Python-ideas] str.rreplace
In-Reply-To: <20140128123349.GH3915@ando>
References: <20140124183633.60f215f6@fsol> <lbuaf8$pn8$>
Message-ID: <lc8q7d$u5c$>

On 01/28/2014 06:33 AM, Steven D'Aprano wrote:
> On Mon, Jan 27, 2014 at 08:03:54PM -0800, Andrew Barnert wrote:
>> From: Ron Adam <ron3200 at>
>>> How about a keyword to specify which end to index from?
> -1
> As a general rule, when you have a function that takes a parameter which
> selects between two different sets of behaviour, and you normally
> specify that parameter as a literal or constant known at edit time, then
> the function should be split into two.
> E.g.:
> # Good API
> string.upper(), string.lower()
> # Bad API
> string.convert_case(to_upper=True|False)

You are correct, and I got my methods mixed up this morning ...  I was 
thinking of __getitem__ instead of index.  And related methods.

The issues I was referring to are not directly related as you pointed out.

In most cases I do think having separate functions or methods is better. 
And in this case it's no different than having partition and rpartition.

I think the argument against rreplace and the strangeness of its name is 
too late.  There are already a fair number of "r" methods.


From wolfgang.maier at  Mon Jan 27 18:41:02 2014
From: wolfgang.maier at (Wolfgang)
Date: Mon, 27 Jan 2014 09:41:02 -0800 (PST)
Subject: [Python-ideas] statistics module in Python3.4
Message-ID: <>

Dear all,
I am still testing the new statistics module and I found two cases where the 
behavior of the module seems suboptimal to me.
My most important concern is the module's internal _sum function and its 
implications, the other one is about passing Counter objects to module 
functions.

As for the first subject:
Specifically, I am not happy with the way the function handles different 
types. Currently _coerce_types gets called for every element in the 
function's input sequence and type conversion follows quite complicated 
rules, and - what is worse - makes the outcome of _sum() and thereby mean() 
dependent on the order of items in the input sequence, e.g.:

>>> mean((1,Fraction(2,3),1.0,Decimal(2.3),2.0, Decimal(5)))

>>> mean((1,Fraction(2,3),Decimal(2.3),1.0,2.0, Decimal(5)))
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    mean((1,Fraction(2,3),Decimal(2.3),1.0,2.0, Decimal(5)))
  File "C:\Python33\", line 369, in mean
    return _sum(data)/n
  File "C:\Python33\", line 157, in _sum
    T = _coerce_types(T, type(x))
  File "C:\Python33\", line 327, in _coerce_types
    raise TypeError('cannot coerce types %r and %r' % (T1, T2))
TypeError: cannot coerce types <class 'fractions.Fraction'> and <class 
'decimal.Decimal'>

(this is because when _sum iterates over the input type Fraction wins over 
int, then float wins over Fraction and over everything else that follows in 
the first example, but in the second case Fraction wins over int, but then 
Fraction vs Decimal is undefined and throws an error).

Confusing, isn't it? So here's the code of the _sum function:

def _sum(data, start=0):
    """_sum(data [, start]) -> value

    Return a high-precision sum of the given numeric data. If optional
    argument ``start`` is given, it is added to the total. If ``data`` is
    empty, ``start`` (defaulting to 0) is returned.


    >>> _sum([3, 2.25, 4.5, -0.5, 1.0], 0.75)

    Some sources of round-off error will be avoided:

    >>> _sum([1e50, 1, -1e50] * 1000)  # Built-in sum returns zero.

    Fractions and Decimals are also supported:

    >>> from fractions import Fraction as F
    >>> _sum([F(2, 3), F(7, 5), F(1, 4), F(5, 6)])
    Fraction(63, 20)

    >>> from decimal import Decimal as D
    >>> data = [D("0.1375"), D("0.2108"), D("0.3061"), D("0.0419")]
    >>> _sum(data)

    """

    n, d = _exact_ratio(start)
    T = type(start)
    partials = {d: n}  # map {denominator: sum of numerators}
    # Micro-optimizations.
    coerce_types = _coerce_types
    exact_ratio = _exact_ratio
    partials_get = partials.get
    # Add numerators for each denominator, and track the "current" type.
    for x in data:
        T = _coerce_types(T, type(x))
        n, d = exact_ratio(x)
        partials[d] = partials_get(d, 0) + n
    if None in partials:
        assert issubclass(T, (float, Decimal))
        assert not math.isfinite(partials[None])
        return T(partials[None])
    total = Fraction()
    for d, n in sorted(partials.items()):
        total += Fraction(n, d)
    if issubclass(T, int):
        assert total.denominator == 1
        return T(total.numerator)
    if issubclass(T, Decimal):
        return T(total.numerator)/total.denominator
    return T(total)

Internally, the function uses exact ratios for its calculations (which I 
think is very nice) and only goes through all the pain of coercing types to 
decide on the type T of the return value, 
where T is the final type resulting from the chain of conversions.

I think a much cleaner (and probably faster) implementation would be to 
gather first all the types in the input sequence, then decide what to 
return in an input order independent way. My tentative implementation:

def _sum2(data, start=None):
    if start is not None:
        t = set((type(start),))
        n, d = _exact_ratio(start)
    else:
        t = set()
        n = 0
        d = 1
    partials = {d: n}  # map {denominator: sum of numerators}

    # Micro-optimizations.
    exact_ratio = _exact_ratio
    partials_get = partials.get

    # Add numerators for each denominator, and build up a set of all types.
    for x in data:
        t.add(type(x))
        n, d = exact_ratio(x)
        partials[d] = partials_get(d, 0) + n
    T = _coerce_types(t) # decide which type to use based on set of all types
    if None in partials:
        assert issubclass(T, (float, Decimal))
        assert not math.isfinite(partials[None])
        return T(partials[None])
    total = Fraction()
    for d, n in sorted(partials.items()):
        total += Fraction(n, d)
    if issubclass(T, int):
        assert total.denominator == 1
        return T(total.numerator)
    if issubclass(T, Decimal):
        return T(total.numerator)/total.denominator
    return T(total)

this leaves the re-implementation of _coerce_types. Personally, I'd prefer 
something as simple as possible, maybe even:

def _coerce_types (types):
    if len(types) == 1:
        return next(iter(types))
    return float

, but that's just a suggestion.
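
To make the idea easy to try outside the statistics module, here is a self-contained sketch of the same approach. It is not the module's code: `simple_sum` and `coerce_types` are stand-in names, Fraction(x) replaces the private _exact_ratio helper, and non-finite values (inf, NaN) are deliberately not handled.

```python
from decimal import Decimal
from fractions import Fraction


def coerce_types(types):
    # The simple rule suggested above: one type in, same type out;
    # any mixture falls back to float.
    if len(types) == 1:
        return next(iter(types))
    return float


def simple_sum(data):
    """Exact sum whose result type does not depend on input order."""
    types = set()
    total = Fraction(0)
    for x in data:
        types.add(type(x))
        total += Fraction(x)  # exact for finite int, float, Fraction, Decimal
    T = coerce_types(types)
    if issubclass(T, int):
        return T(total.numerator)  # a sum of ints has denominator 1
    if issubclass(T, Decimal):
        return T(total.numerator) / total.denominator
    return T(total)


a = [1, Fraction(2, 3), 1.0, Decimal("2.5"), 2.0, Decimal(5)]
b = [1, Fraction(2, 3), Decimal("2.5"), 1.0, 2.0, Decimal(5)]
print(simple_sum(a) == simple_sum(b))  # True
```

Since the types are collected into a set before any decision is made, reordering the input cannot change either the value or the type of the result.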

In this case then:

>>> _sum2((1,Fraction(2,3),1.0,Decimal(2.3),2.0, Decimal(5)))/6

>>> _sum2((1,Fraction(2,3),Decimal(2.3),1.0,2.0, Decimal(5)))/6

let's check the examples from the _sum docstring just to be sure:

>>> _sum2([3, 2.25, 4.5, -0.5, 1.0], 0.75)

>>> _sum2([1e50, 1, -1e50] * 1000)  # Built-in sum returns zero.

>>> from fractions import Fraction as F
>>> _sum2([F(2, 3), F(7, 5), F(1, 4), F(5, 6)])
Fraction(63, 20)

>>> from decimal import Decimal as D
>>> data = [D("0.1375"), D("0.2108"), D("0.3061"), D("0.0419")]
>>> _sum2(data)

Now the second issue:
It is maybe more a matter of taste and concerns the effects of passing a 
Counter() object to various functions in the module.
I know this is undocumented and it's probably the user's fault if he tries 
that, but still:

>>> from collections import Counter
>>> c=Counter((1,1,1,1,2,2,2,2,2,3,3,3,3))
>>> c
Counter({1: 4, 2: 5, 3: 4})
>>> mode(c)
Cool, mode knows how to work with Counters (interpreting them as frequency 
tables)

>>> median(c)
Looks good

>>> mean(c)
Very well

But the truth is that only mode really works as you may think and we were 
just lucky with the other two:
>>> c=Counter((1,1,2))
>>> mean(c)

>>> median(c)

From a quick look at the code you can see that mode actually converts your 
input to a Counter behind the scenes anyway, so it has no problem.
mean and median, on the other hand, are simply iterating over their input, 
so if that input happens to be a mapping, they'll use just the keys.
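
The difference is easy to see in isolation. The sketch below uses only documented behaviour of collections.Counter; `keys_only` and `expanded` are illustrative names, and elements() is shown as one possible workaround rather than anything the statistics module does itself.

```python
from collections import Counter
from statistics import mean, median

c = Counter((1, 1, 2))

# Iterating a Counter (like any mapping) yields each key once.
keys_only = list(c)              # [1, 2]
# elements() repeats each key as many times as its count.
expanded = sorted(c.elements())  # [1, 1, 2]

print(mean(keys_only), median(keys_only))  # 1.5 1.5
print(median(expanded))                    # 1
```

Passing the expanded data gives the statistics of the underlying sample, at the cost of materialising every repeated value.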

I think there are two simple ways to avoid this pitfall:
1) add an explicit warning to the docs explaining this behavior or
2) make mean and median do the same magic with Counters as mode does, i.e. 
make them check for Counter as the input type and deal with it as if it 
were a frequency table. I'd favor this behavior because it looks like 
little extra code, but may be very useful in many situations. I'm not quite 
sure whether maybe even all mappings should be treated that way?

Ok, that's it for now I guess. Opinions anyone?


From guido at  Wed Jan 29 04:06:40 2014
From: guido at (Guido van Rossum)
Date: Tue, 28 Jan 2014 19:06:40 -0800
Subject: [Python-ideas] Need help designing subprocess API for Tulip
Message-ID: <>

If you're interested, please see us on the python-tulip mailing list at
Google Groups.

--Guido van Rossum (

From techtonik at  Wed Jan 29 08:44:30 2014
From: techtonik at (anatoly techtonik)
Date: Wed, 29 Jan 2014 10:44:30 +0300
Subject: [Python-ideas] Iterative development
Message-ID: <>

Yet another idea that some of you will find strange.

It is a parallel Python development process. It doesn't affect or
replace current practice, so nobody gets hurt. It is also about open
process, where openness means transparency (eliminate hidden
communication), inclusiveness (eliminate exclusive rights and
privileges) and accessibility (eliminate awkward practices and poor
user experience).

The idea is to split development of Python into two-week cycles. Every
two weeks is an "iteration". An iteration consists of phases:

1. Planning (one, two days)
2. Execution
3. Testing
4. Demo
5. Retrospective

Some of you who are familiar with the concept of a "sprint" and know something
about "agile" buzzwords will find this idea familiar. In fact, this is
borrowed from some of the best practices of working with remote teams
who use this methodology.

(Planning) So, during the first, planning phase, people who'd
like to participate choose what should be implemented in this
iteration. For that there should be a list of things to be done. This
list is called "backlog". People collaboratively estimate complexity
and sort the things by priority.

(Execution) You take a thing from backlog, mark that you're working on
it, so that other people who are also interested can find you. If you
need help, you split the thing into subtasks and make these tasks open
for people to find and jump in.

(Testing) This is a phase when work done is compared with actual thing
description. Sometimes this leads to new insights, new ideas, new bugs
and more work to be done in subsequent iteration. Sometimes it appears
that during execution the thing completely diverged from what was
originally planned.

(Demo) Demonstration of the things done. Record progress, give credit
and mark things in the backlog as done. The demo is made for a broader
community than just the list of participants.

(Retrospective) This is an important phase that is dedicated to
gathering and processing feedback to improve the iteration loop. Every
person reports what he/she liked and disliked, what was the % of
overall fun. Then some things and ideas are born from the
feedback - what can be improved - be it tools, interaction with
people or some other things that get in the way.

anatoly t.

From ethan at  Wed Jan 29 09:29:10 2014
From: ethan at (Ethan Furman)
Date: Wed, 29 Jan 2014 00:29:10 -0800
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/28/2014 11:44 PM, anatoly techtonik wrote:
> It is a parallel Python development process. It doesn't affect or
> replace current practice, so nobody gets hurt.

So you're saying that we would have the current model, plus this agile model?

>  It is also about open
> process, where openness means transparency (eliminate hidden
> communication),

What "hidden" communication?  Talking in person or on IRC?  Instead of ... where?

> inclusiveness (eliminate exclusive rights and privileges)

Exclusive rights?  You mean let any piece of code get committed?

>  and accessibility (eliminate awkward practices and poor
> user experience).

It is not possible to please everyone; it is also not possible to ensure a "good user experience" for everyone.

> The idea is to split development of Python into two weeks cycle.

80 hours?  Do you have any idea how long it takes some of us to put in 80 hours of Python development time?


From abarnert at  Wed Jan 29 09:57:37 2014
From: abarnert at (Andrew Barnert)
Date: Wed, 29 Jan 2014 00:57:37 -0800
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 28, 2014, at 23:44, anatoly techtonik <techtonik at> wrote:

> Yet another idea that some of you will find strange.

You do realize that Python is an open source project?

And that the only people who work on it full time are the ones being paid by some organization that generally has its own priorities?

> It is a parallel Python development process. It doesn't affect or
> replace current practice, so nobody gets hurt. It is also about open
> process, where openness means transparency (eliminate hidden
> communication), inclusiveness (eliminate exclusive rights and
> privileges) and accessibility (eliminate awkward practices and poor
> user experience).
> The idea is to split development of Python into two weeks cycle. Every
> two weeks is "iteration". Iteration consists of phases:
> 1. Planning (one, two days)
> 2. Execution
> 3. Testing
> 4. Demo
> 5. Retrospective
> Some of you, who familiar with concept of "sprint" and know something
> about "agile" buzzwords will find this idea familiar. In fact, this is
> borrowed from some of the best practices of working with remote teams
> who use this methodology.
> (Planning) So, during these the first, planning phase, people, who'd
> like to participate - choose what should be implemented in this
> iteration. For that there should be a list of things to be done. This
> list is called "backlog". People collaboratively estimate complexity
> and sort the things by priority.
> (Execution) You take a thing from backlog, mark that you're working on
> it, so that other people who are also interested can find you. If you
> need help, you split the thing into subtasks and make these tasks open
> for people to find and jump in.
> (Testing) This is a phase when work done is compared with actual thing
> description. Sometimes this leads to new insights, new ideas, new bugs
> and more work to be done in subsequent iteration. Sometimes it appears
> that during execution the thing completely diverged from what was
> originally planned.
> (Demo) Demonstration of the things done. Record progress, give credits
> and close mark things in backlog as done. Demo is made for broader
> community that just for a list of participants.
> (Retrospective) This is an important phase that is dedicated to
> gathering and processing feedback to improve the iteration loop. Every
> person reports what he/she liked and disliked, what was the % of
> overall fun. Then some things and ideas are being born from the
> feedback - what can be improved - being it tools, interaction with
> people or some other things that get in the way.
> -- 
> anatoly t.

From breamoreboy at  Wed Jan 29 10:29:21 2014
From: breamoreboy at (Mark Lawrence)
Date: Wed, 29 Jan 2014 09:29:21 +0000
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <lcahl7$cnp$>

On 29/01/2014 07:44, anatoly techtonik wrote:
> Yet another idea that some of you will find strange.

Instead of coming up with ideas, why not sign the contributors' 
agreement and come up with code that people can actually use?

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From ncoghlan at  Wed Jan 29 10:31:44 2014
From: ncoghlan at (Nick Coghlan)
Date: Wed, 29 Jan 2014 19:31:44 +1000
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On 29 Jan 2014 19:00, "Andrew Barnert" <abarnert at> wrote:
> On Jan 28, 2014, at 23:44, anatoly techtonik <techtonik at> wrote:
> > Yet another idea that some of you will find strange.
> You do realize that Python is an open source project?
> And that the only people who work on it full time are the ones being paid
> by some organization that generally has its own priorities?

Currently a group containing zero people, FWIW (even Guido only spends part
of his time on upstream work).


From tjreedy at  Wed Jan 29 10:47:26 2014
From: tjreedy at (Terry Reedy)
Date: Wed, 29 Jan 2014 04:47:26 -0500
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <lcain6$oun$>

On 1/29/2014 2:44 AM, anatoly techtonik wrote:

> The idea is to split development of Python into two weeks cycle. Every
> two weeks is "iteration". Iteration consists of phases:
> 1. Planning (one, two days)
> 2. Execution
> 3. Testing
> 4. Demo
> 5. Retrospective

This is more or less what we do now on an issue by issue basis. At a 
higher level, releases for the 'next' version already come out at 2 or 3 
week intervals from a0 to final. At a still higher level, we already have 
plans for 3.5 that we will start on as soon as 3.4.0 is out or after PyCon.

Terry Jan Reedy

From techtonik at  Wed Jan 29 10:11:44 2014
From: techtonik at (anatoly techtonik)
Date: Wed, 29 Jan 2014 12:11:44 +0300
Subject: [Python-ideas] Normalized Python
Message-ID: <>

Python is a cross-platform language, but I often find myself writing
sections specific for Windows and for Linux and sometimes even OS
setting specific code. In these moments I feel that Python is not more
cross-platform than C, for example.

What could be done?

Normalized Python - a set of default, standard behaviors that back up
common user expectations about cross-platform and system-independent
behavior regardless of backward compatibility and code compatibility.

This is needed, for example, to collect these features:
1. open files in binary mode by default
    because "text file" is a human abstraction, for operating
    system it is just another format of binary data, so default
    operation is to read this data without any preprocessing

2. open text files in utf-8 encoding
    because users can not know the encoding of operating
    system, their programs can not choose right encoding,
    therefore a best guess is to expect the most widely used
    encoding

3. treat stdout/stdin streams as binary
    because you don't want your data to be corrupted when
    you pass it in and out of Python via standard streams
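
For comparison, current Python already allows each of these behaviours to be requested explicitly. The sketch below covers the first two points with the built-in open(); the helper name `write_and_read` is made up for the example, and the temporary file is only there to keep it self-contained.

```python
import os
import tempfile


def write_and_read(text):
    """Write text as UTF-8, then read it back both raw and decoded."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Explicit encoding, and newline="" to switch off newline translation.
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write(text)
        # Binary mode returns the bytes exactly as stored on disk.
        with open(path, "rb") as f:
            raw = f.read()
        with open(path, encoding="utf-8", newline="") as f:
            decoded = f.read()
    finally:
        os.remove(path)
    return raw, decoded


raw, decoded = write_and_read("na\u00efve\n")
print(raw == "na\u00efve\n".encode("utf-8"), decoded == "na\u00efve\n")  # True True
```

For the third point, a standard interpreter exposes the underlying byte streams as sys.stdin.buffer and sys.stdout.buffer, so binary data can bypass the text layer without changing the defaults.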

Having a separate "Normalized Python" concept is needed to set
the context for developing and engineering ideas, instead of
concentrating on the sad reality of backward compatibility curse.
anatoly t.

From techtonik at  Wed Jan 29 13:29:21 2014
From: techtonik at (anatoly techtonik)
Date: Wed, 29 Jan 2014 15:29:21 +0300
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 11:29 AM, Ethan Furman <ethan at> wrote:
> On 01/28/2014 11:44 PM, anatoly techtonik wrote:
>> It is a parallel Python development process. It doesn't affect or
>> replace current practice, so nobody gets hurt.
> So you're saying that we would have the current model, plus this agile
> model?

I am saying that you're not forced to follow agile model if you don't like it.
You can do what you do as you did before.

>>  It is also about open
>> process, where openness means transparency (eliminate hidden
>> communication),
> What "hidden" communication?  Talking in person or on IRC?  Instead of ...
> where?

If information doesn't reach the recipient who wants to read it, it is "hidden".
Even if you talk in a public channel on IRC, the information is hidden from me
if I was not connected and the channel doesn't have public logs.

>> inclusiveness (eliminate exclusive rights and privileges)
> Exclusive rights?  You mean let any piece of code get committed?

There are many exclusive rights that keep people from contributing.
I don't want to touch them here, because it will move the thread into a different
area. To make it more specific, "inclusiveness" of the process is a process
too. You start with people who have full exclusive rights and are contributing, then
compare them to people who are willing to help but don't. Then you
remove the obstacles to include these people.

>>  and accessibility (eliminate awkward practices and poor
>> user experience).
> It is not possible to please everyone; it is also not possible to ensure a
> "good user experience" for everyone.

That's a general claim. I am sure that it is possible to reach the point where
everyone agrees that their experience is a "good enough user experience".

And there is a dedicated time in the process (the retrospective) to work on
just that.

>> The idea is to split development of Python into two weeks cycle.
> 80 hours?  Do you have any idea how long it takes some of us to put in 80
> hours of Python development time?

It is not development time. This two-week cycle is just ordinary time, which
may include 15 minutes of development time, a week, or nothing. It is up to
you how much you are willing to spend.

From rosuav at  Wed Jan 29 14:57:46 2014
From: rosuav at (Chris Angelico)
Date: Thu, 30 Jan 2014 00:57:46 +1100
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 11:29 PM, anatoly techtonik <techtonik at> wrote:
> If information doesn't reach a recipient who wants to read it, it is "hidden".
> Even if you talk in a public channel on IRC, the information is hidden from me
> if I was not connected and the channel doesn't have public logs.

Then if you care, connect. It's not hidden if you have the power to access it.

Here's a suggestion: Fork Python (that's legal, that's what open
source means) and start development using the model you advocate. If
it's massively better than what's happening, (a) developers will flock
to your model, and (b) the project could be completely handed over to
you, as happened with GCC.

Or alternatively, explain to us here what the real advantages are of
your new model. So far, what I've seen is "hey, here's an idea", and
not "here's what this idea will do to benefit Python"; and the idea
itself looks more suited to a big business than to open source. Maybe
someone who's actually used Agile will know what's so wonderful about
it, but unless every core dev *has*, a bit of explanation will help.


From breamoreboy at  Wed Jan 29 15:08:39 2014
From: breamoreboy at (Mark Lawrence)
Date: Wed, 29 Jan 2014 14:08:39 +0000
Subject: [Python-ideas] Normalized Python
In-Reply-To: <>
References: <>
Message-ID: <lcb20t$pu9$>

On 29/01/2014 09:11, anatoly techtonik wrote:
> Python is a cross-platform language, but I often find myself writing
> sections specific for Windows and for Linux and sometimes even OS
> setting specific code. In these moments I feel that Python is no more
> cross-platform than C, for example.
> What could be done?
> Normalized Python - a set of default, standard behaviors that backup
> common user expectations about cross-platform and system-independent
> behavior regardless of backward compatibility and code compatibility
> concerns.
> This is needed, for example, to collect these two features:
> 1. open files in binary mode by default
> why?
>      because "text file" is a human abstraction, for operating
>      system it is just another format of binary data, so default
>      operation is to read this data without any preprocessing
> 2. open text files in utf-8 encoding
> why?
>      because users can not know the encoding of operating
>      system, their programs can not choose right encoding,
>      therefore a best guess is to expect the most widely used
>      standard
> 3. treat stdout/stdin streams as binary
> why?
>      because you don't want your data to be corrupt when
>      you pass it in and out of Python via standard streams
> Having a separate "Normalized Python" concept is needed to set
> the context for developing and engineering ideas, instead of
> concentrating on the sad reality of backward compatibility curse.

I support what Chris Angelico has said on another thread, fork Python 
and if it's good enough everybody will flock to it.  This also avoids 
the problem with the CLA.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From ncoghlan at  Wed Jan 29 15:18:19 2014
From: ncoghlan at (Nick Coghlan)
Date: Thu, 30 Jan 2014 00:18:19 +1000
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On 29 January 2014 23:57, Chris Angelico <rosuav at> wrote:
> On Wed, Jan 29, 2014 at 11:29 PM, anatoly techtonik <techtonik at> wrote:
>> If information doesn't reach a recipient who wants to read it, it is "hidden".
>> Even if you talk in a public channel on IRC, the information is hidden from me
>> if I was not connected and the channel doesn't have public logs.
> Then if you care, connect. It's not hidden if you have the power to access it.
> Here's a suggestion: Fork Python (that's legal, that's what open
> source means) and start development using the model you advocate. If
> it's massively better than what's happening, (a) developers will flock
> to your model, and (b) the project could be completely handed over to
> you, as happened with GCC.
> Or alternatively, explain to us here what the real advantages are of
> your new model. So far, what I've seen is "hey, here's an idea", and
> not "here's what this idea will do to benefit Python"; and the idea
> itself looks more suited to a big business than to open source. Maybe
> someone who's actually used Agile will know what's so wonderful about
> it, but unless every core dev *has*, a bit of explanation will help.

Plenty of us have used it, and we know it's an entirely inappropriate
model for open source development projects with broad asynchronous
participation, as the time commitment needed to make the short cycle
work is antithetical to loose collaboration. It works well for a
focused team supporting a single application to meet the specific
needs of a single business, though.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rosuav at  Wed Jan 29 15:33:56 2014
From: rosuav at (Chris Angelico)
Date: Thu, 30 Jan 2014 01:33:56 +1100
Subject: [Python-ideas] Normalized Python
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 8:11 PM, anatoly techtonik <techtonik at> wrote:
> Normalized Python - a set of default, standard behaviors that backup
> common user expectations about cross-platform and system-independent
> behavior regardless of backward compatibility and code compatibility
> concerns.
> Having a separate "Normalized Python" concept is needed to set
> the context for developing and engineering ideas, instead of
> concentrating on the sad reality of backward compatibility curse.

You can achieve the first two simply by opening files with parameters.
There is NOTHING Windows-specific or Linux-specific in that. As of
Python 3, opening in text mode is the default... but you can override
that so easily. Why change the default (which breaks back compat) when
you can just change your code?
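
A minimal sketch of what "opening files with parameters" can look like,
assuming Python 3. The helper names read_raw and read_text_utf8 and the file
name demo.txt are made up for illustration and do not come from the thread.

```python
import os
import tempfile


def read_raw(path):
    # Binary mode: bytes come back exactly as stored, with no newline
    # or encoding translation on any platform.
    with open(path, "rb") as f:
        return f.read()


def read_text_utf8(path):
    # Text mode with an explicit encoding, so the platform default is
    # not consulted. newline="" also switches off newline translation.
    with open(path, "r", encoding="utf-8", newline="") as f:
        return f.read()


# Usage: write a small UTF-8 file, then read it back both ways.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical file
with open(demo_path, "wb") as f:
    f.write("caf\u00e9\r\n".encode("utf-8"))

raw = read_raw(demo_path)
text = read_text_utf8(demo_path)
```

Neither call branches on the operating system; the caller states the mode and
encoding instead of relying on defaults.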

And I believe you can reopen stdin/stdout as binary, if you really
want to, but that is a little harder. It's still not going to have any
platform-specific code in it. (As I've never written a filter for
binary files in Python, I've never had the need to read/write standard
streams in binary. But I've no doubt that someone who has can show you
how easy it is - I'd guess it's less than five lines of code, knowing
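
One way this might look, as a sketch assuming Python 3, where the text-mode
sys.stdin and sys.stdout normally expose their underlying binary streams
through a .buffer attribute. The copy_bytes helper and its chunk_size
parameter are illustrative names, not something from the thread.

```python
import io
import sys


def copy_bytes(src, dst, chunk_size=65536):
    # Copy raw bytes from src to dst without any decoding.
    # Returns the number of bytes copied.
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def main():
    # A real filter would call this: it reads and writes the binary
    # layers underneath the text-mode standard streams.
    copy_bytes(sys.stdin.buffer, sys.stdout.buffer)
    sys.stdout.buffer.flush()


# Check the helper with in-memory streams instead of the real stdin/stdout.
source = io.BytesIO(b"\x00\xff\r\n binary payload")
sink = io.BytesIO()
copied = copy_bytes(source, sink)
```

Nothing in the sketch is specific to Windows or Linux; main() is left uncalled
here so the example does not block waiting for input.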

> This is needed, for example, to collect these two features:

(Among our features are such diverse elements as... oh, wrong Pythons.)

> 1. open files in binary mode by default
> why?
>     because "text file" is a human abstraction, for operating
>     system it is just another format of binary data, so default
>     operation is to read this data without any preprocessing

A reasonably plausible argument. C++ follows that sort of model (you
shouldn't pay for anything you're not using). SQL mostly follows that
model (it generally takes more keywords to get the database to do more
work - compare "SELECT x FROM y" and "SELECT x FROM y ORDER BY z",
where the latter adds a sort phase; there are exceptions to this, like
UNION ALL vs UNION, but they're notable _because_ they're exceptions).
But it's nothing like a strong enough argument for changing. Creating
two subtly different languages is a major problem, especially when the
exact same syntax means different things. Imagine if I create a fork
of Python that's absolutely identical except that you create a set
with [1,2,3] and a list with {1,2,3}. All your code will be
syntactically correct, but suddenly it does something quite different.
That is a BAD idea. It would have to be *immensely* better to justify
the breakage; and this is only "arguably better". (The most obvious
contrary argument is that the default should do the thing most people
want most often, which is working with text files. This same argument
justifies the use of arbitrary-precision integers by default, instead
of requiring an explicit "long" type; I'm sure you'll agree that the
Py3 unification of these types was an advantage.)

> 2. open text files in utf-8 encoding
> why?
>     because users can not know the encoding of operating
>     system, their programs can not choose right encoding,
>     therefore a best guess is to expect the most widely used
>     standard

Yes, this one is an issue. Python lets the OS recommend a default
encoding, on the expectation that a Python script should fit into its
host platform, rather than that all platforms should conform to what
Python wants. A judgment call, and I'm sure there can be endless
debates about what Python should do, but since it can be overridden
with a single parameter on the open call, not a big deal IMO.

> 3. treat stdout/stdin streams as binary
> why?
>     because you don't want your data to be corrupt when
>     you pass it in and out of Python via standard streams

Most definitely NOT. The standard streams should, by default, be text
streams, and should have their encodings set according to what the
other side wants. If there's a way for the OS and Python to
communicate an encoding, that's absolutely perfect. Yes, there'll be a
few edge cases involving redirection, but that's pretty much
unsolvable anyway. The normal usage of Python MUST include Unicode;
and that means the most obvious way to produce output (the print
function) needs to write Unicode. So if stdout is a binary stream,
what's print going to do with a str? Encode it? If so, you just move
the issue - and print can send to multiple streams, so it'd need to
know which are text and which are binary, etc, etc. Or should it throw
an error, and force the programmer to do stuff like this:

CONSOLE_ENCODING = "utf-8" # add some logic for guessing this
s = "Hello, world!"

just to ensure that every programmer has to battle with the encodings
manually, in lots of places, instead of configuring it once (or, more
likely, having the default be right) and then having clean code

The only way that opening stdin/out as binary will prevent the
corruption of your data is if your data is fundamentally bytes. Most
programs, in any language, work with data that's fundamentally text;
granted, a lot of languages don't distinguish, but if you look at what
the programmer's doing, it's still text. Anything that prints "Hello,
world!" is printing text, not bytes, and if the console's encoding is
UTF-16, that should emit 26 bytes (plus any newline that's
appropriate). Forcing the programmer to think about this is completely
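
The byte count can be checked directly. This is a small sketch assuming
Python 3; note that Python's plain "utf-16" codec prepends a byte order mark,
so the 26-byte figure corresponds to an explicit-endian variant such as
"utf-16-le".

```python
s = "Hello, world!"

# 13 characters, 2 bytes each, no byte order mark.
utf16_le = s.encode("utf-16-le")

# The generic "utf-16" codec adds a 2-byte byte order mark in front.
utf16_bom = s.encode("utf-16")

# For comparison: pure ASCII text is 1 byte per character in UTF-8.
utf8 = s.encode("utf-8")

sizes = (len(s), len(utf16_le), len(utf16_bom), len(utf8))
```

The same str produces different byte sequences depending on the encoding,
which is the decision a text-mode stdout makes on the program's behalf.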

How many times do you actually come across these issues in porting?
How much effort would you really save if these measures were
implemented? If it's that important to you, fork CPython and create
this "Normalized Python" that does everything you want (and then,
linking this with the other thread, continue development of Normalized
Python according to an Agile model and see if people join you rather
than CPython). Good luck.


From rosuav at  Wed Jan 29 16:29:58 2014
From: rosuav at (Chris Angelico)
Date: Thu, 30 Jan 2014 02:29:58 +1100
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 11:29 PM, anatoly techtonik <techtonik at> wrote:
> You start with people who have full exclusive rights and contributing then
> compare them to people who are willing to help, but don't do this. Then you
> remove the obstacles to include these people.

There's a fundamental misunderstanding behind this, I think.

Contributions are valued, yes, but the purpose of an open source
project does not begin and end at "encouraging contributions from
every person on the planet". The goal of Python is to be a useful and
usable programming language, and if that's best served by a single
person doing all the coding, then that's how the project should be
run. (I'm preeeeeetty confident that's not the case, though.)

There's a general feeling around the world that dictatorships are bad,
democracy is good, and the more people you have involved in something,
the better. While this is not entirely false, it's not entirely true
either. In the Bible, in the book of Proverbs, God tells us several
times that multiple people's advice is of value. [1] [2] [3] But
that's advice, not decision making. When it comes down to a final
decision, it's almost always best to have a single person decide. A
business has a CEO, an orchestra has a conductor, there's only one
steering wheel in a car. And ultimately, trying to make every single
thought behind every single decision public is counter-productive too.
Ever tried to answer a child's "Why? Why? Why?" machine-gun? Yeah.

On another project, I've contributed a large number of patches. Some
fix bugs, some add features, some just fix little typos in
documentation. All of them were simply submitted to the core team,
reviewed, and ultimately applied, rejected, or modified. I'm not a
core dev. I can't push to the git repository. But if I were to be
given that power, it would be for reasons of convenience (if the core
devs decide that all my patches are getting applied anyway, and it's
easier for them to let me push my own), not transparency. You want to
know what's going on? Get involved. Then you'll know.

The people who care about the project will find a way to contribute.
That's a fundamental of the open source model. You don't like the
agreement that has to be signed before your patches will be accepted?
Then contribute by reviewing other people's patches, or verifying bug
reports, or whatever. The onus is not on the legal team to make
everything work for you; it's their job to make everything work for
the PSF. I haven't looked into the specifics of the agreement in
detail, but I'm confident that the PSF would not demand something just
for the sake of bureaucracy, so I'd trust that there's good reason for
all of it. (And hey, if you don't want to sign that, you can just
declare that your contributions are public domain, IIRC.)

I'm sure it's very American to demand that the people in power tell
you what they're doing. (Or insert any other country name there,
though I think the USA is at the forefront of this.) Trouble is, open
source projects simply aren't built that way.


[1] Prov 11:14
[2] Prov 15:22
[3] Prov 24:6

From amber.yust at  Wed Jan 29 16:36:58 2014
From: amber.yust at (Amber Yust)
Date: Wed, 29 Jan 2014 15:36:58 +0000
Subject: [Python-ideas] Iterative development
References: <>
Message-ID: <5676137352349587376@gmail297201516>

I agree with you Chris, but can we keep religion out of this?


From at  Wed Jan 29 16:52:31 2014
From: at (Haoyi Li)
Date: Wed, 29 Jan 2014 23:52:31 +0800
Subject: [Python-ideas] Iterative development
In-Reply-To: <5676137352349587376@gmail297201516>
References: <>
Message-ID: <>

> You want to know what's going on? Get involved. Then you'll know

+1. It's odd to complain about the project's organization and processes
when you haven't actually had any real experience with either. Getting
involved in some project run by other people isn't easy, but it's not
really that hard either in the world of open source.


From rosuav at  Wed Jan 29 17:18:12 2014
From: rosuav at (Chris Angelico)
Date: Thu, 30 Jan 2014 03:18:12 +1100
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 30, 2014 at 2:52 AM, Haoyi Li < at> wrote:
>> You want to know what's going on? Get involved. Then you'll know
> +1. It's odd to complain about the project's organization and processes when
> you haven't actually had any real experience with either. Getting involved
> in some project run by other people isn't easy, but it's not really that
> hard either in the world of open source.

I first met that concept with community groups, rather than open
source projects, but the result is similar. There were people who
desperately wanted to be in that "inner circle" of people who knew, a
year in advance, which Gilbert & Sullivan operas were going to be
performed, and who'd be directing them, and who would be playing which
roles, and so on. It's all announced sooner or later, but for some
people, they'd really rather it be "sooner" than "later". Well, that's
easily solved. Serve on the society's committee - then you know what's
happening, because you're helping to make it happen. And if you're
happy with a lesser advantage from lesser work, just swing by and help
us with our mail-out. You get to read the info we're sending before we
send it out... because you're helping us to send it out.

In one stroke, you call the bluff of anyone who just wanted handouts
of information, satisfy the desires of those who really care, and
maybe even get some extra help running the (all-volunteer)
organization. I call that a win! :)


From techtonik at  Wed Jan 29 13:48:26 2014
From: techtonik at (anatoly techtonik)
Date: Wed, 29 Jan 2014 15:48:26 +0300
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 11:57 AM, Andrew Barnert <abarnert at> wrote:
> On Jan 28, 2014, at 23:44, anatoly techtonik <techtonik at> wrote:
>> Yet another idea that some of you will find strange.
> You do realize that Python is an open source project?

Yes, captain.

However, I fail to see why you ask the question.
If you're saying that open source projects can't have any kind of methodology
to save time and coordinate efforts more efficiently, then I have to
disagree with

Example from good old times of 2011
I am certain there are other open source projects with similar processes.

> And that the only people who work on it full time are the ones being paid by some organization that generally has its own priorities?

You don't need to work full time to participate in a two-week cycle.
As I answered to Ethan, it is not development cycle time. It is just
an ordinary two weeks of time. You choose what you can do in these two
weeks and do it. You may find that you have more time than you
planned for, so you can see who is working on what
and help them (if possible).

From techtonik at  Wed Jan 29 14:08:21 2014
From: techtonik at (anatoly techtonik)
Date: Wed, 29 Jan 2014 16:08:21 +0300
Subject: [Python-ideas] Iterative development
In-Reply-To: <lcahl7$cnp$>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 12:29 PM, Mark Lawrence <breamoreboy at> wrote:
> On 29/01/2014 07:44, anatoly techtonik wrote:
>> Yet another idea that some of you will find strange.
> Instead of coming up with ideas, why not sign the contributors' agreement
> and come up with code that people can actually use?

replied to python-legal-sig

From rosuav at  Wed Jan 29 17:27:38 2014
From: rosuav at (Chris Angelico)
Date: Thu, 30 Jan 2014 03:27:38 +1100
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 11:48 PM, anatoly techtonik <techtonik at> wrote:
> You don't need to work full time to participate in a two-week cycle.
> As I answered to Ethan, it is not development cycle time. It is just
> an ordinary two weeks of time. You choose what you can do in these two
> weeks and do it. You may find that you have more time than you
> planned for, so you can see who is working on what
> and help them (if possible).

What does the two-week cycle achieve that current processes with the
bug tracker can't?

Please explain to us the benefits of the Agile model, as they apply to
a loose collaboration.


From taleinat at  Wed Jan 29 17:56:20 2014
From: taleinat at (Tal Einat)
Date: Wed, 29 Jan 2014 18:56:20 +0200
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 3:08 PM, anatoly techtonik <techtonik at> wrote:
> On Wed, Jan 29, 2014 at 12:29 PM, Mark Lawrence <breamoreboy at> wrote:
>> Instead of coming up with ideas, why not sign the contributors' agreement
>> and come up with code that people can actually use?
> replied to python-legal-sig

Basically, you refuse to sign a contributor agreement, but insist on
blaming the PSF for that.

Your position is simply unreasonable: You demand that the PSF should
either stop demanding a contributor agreement, accept your own
personal version of it, or spend a lot of time and energy attempting
to explain it to you. You accuse the PSF of being a needlessly
bureaucratic political body which is giving you a hard time just
because it can or because it doesn't care; whatever you may think,
that is the opposite of the truth. Furthermore, since you continue to
choose to phrase your arguments aggressively and offensively, how can
you expect anyone to consider your proposals?

Regarding the contributor agreement, please spend your time and energy
understanding it, instead of arguing about it here and blaming other
people. Otherwise, stop pestering people about it. What you demand
regarding the contributor agreement is not going to happen, period.

If you actually care about Python, find a way to contribute helpfully!
Even if you believe that we are the ones being stubborn and unhelpful,
it is up to you to find a way to work with us productively. For
example, I am sure you have noticed that few of your ideas posted here
have been helpful in any way, if any at all. I do believe that you
think these are good ideas, but surely you must see that nothing good
results from your posting them to this list. As it is, you have been
harming the development of Python considerably for many months by
pestering people on various mailing lists. If you want to help, you
must change your behavior!

- Tal

From abarnert at  Wed Jan 29 18:24:01 2014
From: abarnert at (Andrew Barnert)
Date: Wed, 29 Jan 2014 09:24:01 -0800
Subject: [Python-ideas] Normalized Python
In-Reply-To: <>
References: <>
Message-ID: <>

Chris, I pretty much agree with you, but there are two major additional points you didn't mention.

On Jan 29, 2014, at 6:33, Chris Angelico <rosuav at> wrote:

> On Wed, Jan 29, 2014 at 8:11 PM, anatoly techtonik <techtonik at> wrote:
>> 3. treat stdout/stdin streams as binary
>> why?
>>    because you don't want your data to be corrupt when
>>    you pass it in and out of Python via standard streams
> Most definitely NOT. The standard streams should, by default, be text
> streams, and should have their encodings set according to what the
> other side wants.

Note that when the other side is a Windows console, what it _really_ wants is for you not to use stdio, but to instead use the separate UTF-16-specific console APIs.

Fitting this into Python 3's cross-platform io model is a bit challenging, and not yet done, but certainly doable. (It's been discussed multiple times, both on this list and elsewhere.)

Fitting this into a Python 2-style io model as Anatoly suggests is completely impossible. Instead, every single program would have to either check that stdout.isatty() is true and the platform is Windows and explicitly use something other than stdout, or figure out the console encoding (which is hard to do from inside Python if you take away the stdout.encoding that Python provides for the text stdout today) and explicitly encode every string to be printed.

There's also the fact that the print function implicitly converts everything to a str for you, which wouldn't do any good if stdout were a binary file. Unlike Python 2, Python 3 has no way to convert arbitrary objects to byte strings, which means you would need a mandatory encoding keyword arg on every call to print that took any args that weren't bytes-compatible.

Between these two issues, the proposal would effectively give Python 3 all of the stdio/print problems that Python 2 had, and more, without any of Python 2's partial solutions to those problems.
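
The print problem can be seen with an in-memory binary stream standing in for
a binary stdout. This is a sketch assuming Python 3; print_bytes is a
hypothetical helper, written only to show the kind of explicit encoding every
call site would then need.

```python
import io

binary_out = io.BytesIO()  # stands in for a binary-mode stdout

try:
    print("Hello, world!", file=binary_out)
    print_failed = False
except TypeError:
    # print() hands a str to file.write(), and a binary stream's
    # write() accepts only bytes-like objects.
    print_failed = True


def print_bytes(*args, encoding, file, sep=" ", end="\n"):
    # Hypothetical replacement: convert to str as print() does, then
    # encode explicitly with an encoding the caller must supply.
    text = sep.join(str(a) for a in args) + end
    file.write(text.encode(encoding))


print_bytes("answer:", 42, encoding="utf-8", file=binary_out)
```

The encoding argument has no sensible default here, which is the "mandatory
encoding keyword" burden described above.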

From rosuav at  Wed Jan 29 18:36:33 2014
From: rosuav at (Chris Angelico)
Date: Thu, 30 Jan 2014 04:36:33 +1100
Subject: [Python-ideas] Normalized Python
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 30, 2014 at 4:24 AM, Andrew Barnert <abarnert at> wrote:
> Note that when the other side is a Windows console, what it _really_ wants is for you not to use stdio, but to instead use the separate UTF-16-specific console APIs.
> Fitting this into Python 3's cross-platform io model is a bit challenging, and not yet done, but certainly doable. (It's been discussed multiple times, both on this list and elsewhere.)

In the theoretical ideal, all that should be buried within the
definition of the print function (or what it calls on). I should be
able to write a program that says:

print("Copyright © 2014 My Name")

even if my name includes non-ASCII, even non-BMP, characters; and that
program should produce that output in whatever way is appropriate to
the platform. (If it's running on a printer, that should produce a
hard copy.) Now, maybe that ideal can't be attained, due to some
platforms' limitations or stupidity, and clean code is of value too,
but certainly the notion of "write a Unicode string to the most
obvious place of output" is one that ought *conceptually* to be
supported equally on all platforms, without my having to figure out
one from another.

Obviously if your terminal expects one encoding but announces another,
there's going to be a mess. The theoretical ideal works only when
negotiations are done properly. But again, that's outside of Python;
and if the next version of SomeWeirdOS introduces a new means of
announcing its console encoding, it should simply be a matter of
coding that into Python, *not* into every single script.


From ronaldoussoren at  Thu Jan 30 09:44:36 2014
From: ronaldoussoren at (Ronald Oussoren)
Date: Thu, 30 Jan 2014 09:44:36 +0100
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <lbt66j$r48$>
References: <lbqfqp$7bn$>
 <lbsp1k$pf8$> <>
Message-ID: <>

On 24 Jan 2014, at 08:54, Suresh V. <suresh_vv at> wrote:

> On Friday 24 January 2014 10:39 AM, Ethan Furman wrote:
>> On 01/23/2014 08:09 PM, Suresh V. wrote:
>>> Also it would mean that the client code imports from this package.
>>> I would like client code to remain exactly as it is (continue to
>>> import from its original package) but the behavior is enhanced
>>> once this package is imported on startup.
>> /Something/ has to adjust the pre and post conditions -- if not the
>> client code, then what?
> pre and post conditions are just one possible use of this.
> Going back to my smtplib.SMTP.sendmail example.
> No changes in bulk of client code.
> Single patch module imported in main.

Why is this a good thing? You seem to propose adding a mechanism that makes it easily possible to modify the behaviour of existing functions, which makes it harder to reason about code.   

While this is also possible without language changes, using the current monkey-patching mechanisms, it's at least clear that you're doing something naughty when writing the patching code :-)


From rosuav at  Thu Jan 30 13:49:44 2014
From: rosuav at (Chris Angelico)
Date: Thu, 30 Jan 2014 23:49:44 +1100
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 30, 2014 at 10:56 PM, anatoly techtonik <techtonik at> wrote:
> You say - "short cycles" are bad. In agile I'd say - let's try and see why.
> Maybe it's the cycles that are bad, maybe it's people who cannot sync
> this often, maybe there is a technical problem with communication that
> can be resolved by using the right tools.

The very concept of a cycle suggests a system that's more suited to a
business environment than general open source development. Forcing
people to pick up and set down work might be useful in the very short
period just before a version release (I've been seeing some stuff
about Argument Clinic - btw, kudos to the tireless people doing that,
it's a huge job - and how some of the work will be deferred to 3.5),
but most of the time, it's completely unnecessary. In big business,
you might have a couple dozen programmers working on some particular
job; in that two week cycle, each one could potentially put in quite a
few hours. I heard a figure of 80 hours quoted, but I'm dubious about
how many actual dev hours a salaried programmer would get done, in
between meetings and whatnot. Still, could easily be upwards of 50
hours. Forcing everyone to stop and re-check things every fifty dev
hours doesn't sound too bad. Now look at volunteers. Two weeks might
be anywhere from zero hours up to... well, the upper end doesn't
matter. But it could easily be just a single dev hour in that time.
Are you then going to force this person to set aside what he's
partially done, because of some arbitrary break point?

Now, what happens if you take Agile and eliminate the two-week period?
It begins to look very much like a pool of issues on a bug tracker.
You have a pile of stuff to do, someone picks up something he feels
like doing, posts a result back. Hmm, I wonder if that might be what's
already happening... Do you see now why I was, without any experience
of Agile, already dubious about its merits? And that even before Nick
stated from experience that it's not going to help.

Ideas are all very well, but they're useless without some form of
test-bed. The only perfect way to find out if an idea works or not is
to try it, and the onus is on the inventor to risk something for his
idea. Put the theory to work on some project. Once you can point to
some clear advantages *in practice*, you'll be able to recommend this
to other people. So... fork CPython, tell us all how wonderful your
version is going to be, and then show us how, in two weeks, or four
weeks, or six weeks, you can do amazing stuff with a motley crew of
programmers. Then we'll all take notice.


From rosuav at  Thu Jan 30 14:40:26 2014
From: rosuav at (Chris Angelico)
Date: Fri, 31 Jan 2014 00:40:26 +1100
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 31, 2014 at 12:25 AM, anatoly techtonik <techtonik at> wrote:
> A single dev hour is OK if you reached your goal. That's the point.
> You set the goals - you reach them. If you didn't reach them - you
> analyze and see what could be done better. It is all done in a relaxed
> and free manner, unlike the bloody corporate culture. You may invite
> other people to join the fun. People can find out what you are working
> on and offer help.
> This is the process.

Let's say you pick up something that's going to turn out to take you
three dev hours. You then put one hour of work into it, and the
two-week cut-off rolls around. What do you do?

In the current model, there is no cut-off, so you just keep your work
where it is until you find the time to finish it. Then you format it
as a patch, put it on the tracker issue, and move on. (Or, if you're a
core dev, I suppose you push it, see if the buildbots start looking
red and angry, and then move on. Either way.) It doesn't matter if
that took you one day, two weeks, or three months.

What you're suggesting is that people should conform to an arbitrary
number-of-days cutoff. That means that if the cut-off is getting
close, there's a *dis*incentive to pick up any job, because you won't
be able to finish it. Imagine if, when writing up a post for the
mailing list, you had to finish each sentence inside one minute as per
the clock. If it's currently showing hh:mm:49, you'd do better to not
start a sentence, because you probably can't finish it in eleven
seconds. Is that an advantage over "just write what you like, when you


From techtonik at  Thu Jan 30 12:24:44 2014
From: techtonik at (anatoly techtonik)
Date: Thu, 30 Jan 2014 14:24:44 +0300
Subject: [Python-ideas] Iterative development
In-Reply-To: <lcain6$oun$>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 12:47 PM, Terry Reedy <tjreedy at> wrote:
> On 1/29/2014 2:44 AM, anatoly techtonik wrote:
>> The idea is to split development of Python into two weeks cycle. Every
>> two weeks is "iteration". Iteration consists of phases:
>> 1. Planning (one, two days)
>> 2. Execution
>> 3. Testing
>> 4. Demo
>> 5. Retrospective
> This is more or less what we do now on an issue by issue basis. At a higher
> level, releases for the 'next' version already come out at 2 or 3 week
> intervals from a0 to final. At a higher level, we already have plans for 3.5
> that we will start on as soon as 3.4.0 is out or after PyCon.

It is quite obvious from outside that Python has some kind of process,
but it is quite hard for people from outside to sync to it, because it is not
open: it is not completely clear how the planning is done, which tasks
are available for the current sprint, what you can help with, and how to
track progress.

From techtonik at  Thu Jan 30 12:45:19 2014
From: techtonik at (anatoly techtonik)
Date: Thu, 30 Jan 2014 14:45:19 +0300
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 4:57 PM, Chris Angelico <rosuav at> wrote:
> On Wed, Jan 29, 2014 at 11:29 PM, anatoly techtonik <techtonik at> wrote:
> Here's a suggestion: Fork Python (that's legal, that's what open
> source means) and start development using the model you advocate. If
> it's massively better than what's happening, (a) developers will flock
> to your model, and (b) the project could be completely handed over to
> you, as happened with GCC.

There is a big difference between people who invent things and people who
do things. I am a lazy bastard who cannot do anything or hold down a job,
because I am constantly inventing new stuff that no one is able to implement.
Over the years I realized that the only good that I can do for humanity is to
develop a sustainable model. So far it hasn't happened, because it turned out
that people only work on their own ideas. I don't own my ideas - they are
free for everyone to explore and discuss. So if there is anything valuable -
take it. I don't need power over the project or money or anything in between.
The next day there will be another idea and another discussion.

It is nice to see communities that can develop ideas, that can realize that
people are different, and that use to the full the potential people are
capable of. It is also nice to see people evolve to act in new roles that
are uncommon for them. You won't like it, but it is also nice to see how
people become worse, because they are human, and to realize that everyone
is imperfect. What is not so nice is to see good things fail because people
cannot reuse technology to help them deal with the human factor.

> Or alternatively, explain to us here what the real advantages are of
> your new model. So far, what I've seen is "hey, here's an idea", and
> not "here's what this idea will do to benefit Python"; and the idea
> itself looks more suited to a big business than to open source. Maybe
> someone who's actually used Agile will know what's so wonderful about
> it, but unless every core dev *has*, a bit of explanation will help.

Ok. In short. There is only one advantage:

- increased visibility

which in turn results in

- increased interest

which in turn results in

- increased participation.

What problem does agile solve? There is one big problem: "increased
participation" is actually a negative factor for existing contributors,
because it takes more time from them.

Where does this "more time" come from? In the current model:

- increased participation == increased communication

If you constantly communicate, you don't have time for development
(probably the things that you like the most). How does agile help with that?

"agile" means just that - "flexible". If you see the problem, you are not
saying "we are all developers, nobody is interested in communications". No,
instead you're saying -- ok, we have a communication problem, what can we

In the current model, you cannot try anything, because you cannot set goals.
A goal is something that is at least:
- Measurable
- Time-bound

There are no time bounds and there is no measurement. These are not part of
the process, so you don't even have any means to solve the communication and
time deficiency problem. If we have a two-week cycle, we can at least set

From techtonik at  Thu Jan 30 12:56:35 2014
From: techtonik at (anatoly techtonik)
Date: Thu, 30 Jan 2014 14:56:35 +0300
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 5:18 PM, Nick Coghlan <ncoghlan at> wrote:
> On 29 January 2014 23:57, Chris Angelico <rosuav at> wrote:
>> On Wed, Jan 29, 2014 at 11:29 PM, anatoly techtonik <techtonik at> wrote:
>>> If information doesn't reach the recipient who want to read it, it is "hidden".
>>> Even if you talk in public channel on IRC, the information is hidden from me
>>> if I was not connected and channel doesn't have public logs.
>> Then if you care, connect. It's not hidden if you have the power to access it.
>> Here's a suggestion: Fork Python (that's legal, that's what open
>> source means) and start development using the model you advocate. If
>> it's massively better than what's happening, (a) developers will flock
>> to your model, and (b) the project could be completely handed over to
>> you, as happened with GCC.
>> Or alternatively, explain to us here what the real advantages are of
>> your new model. So far, what I've seen is "hey, here's an idea", and
>> not "here's what this idea will do to benefit Python"; and the idea
>> itself looks more suited to a big business than to open source. Maybe
>> someone who's actually used Agile will know what's so wonderful about
>> it, but unless every core dev *has*, a bit of explanation will help.
> Plenty of us have used it, and we know it's an entirely inappropriate
> model for open source development projects with broad asynchronous
> participation, as the time commitment needed to make the short cycle
> work is antithetical to loose collaboration. It works well for a
> focused team supporting a single application to meet the specific
> needs of a single business, though.

About *Agile*

I tried to avoid the word Agile, but since you are saying that you've used it,
let us agree on terminology. In my world *agile* means *flexible*, which
means *able to change*. It doesn't mean *scrum* or *two-week sprints*
or any of the hardcoded values that you put behind the phrase "entirely
inappropriate model for open source". Now that that's clear, let's move on.

"Asynchronous participation" is called "distributed development", and it
is used both by open source and by commercial companies a lot. If
Yahoo terminated this practice and Google didn't even try - that's a
problem of management of these companies. It doesn't mean it doesn't
work for professional teams or people *interested* in interacting this way.
Agile helps to analyze and improve distributed development processes
the same way it does for rigid corporate practices.

You say - "short cycles" are bad. In agile I'd say - let's try and see why.
Maybe it's the cycles that are bad, maybe it's people who cannot sync
this often, maybe there is a technical problem with communication that
can be resolved by using the right tools.

From techtonik at  Thu Jan 30 13:44:08 2014
From: techtonik at (anatoly techtonik)
Date: Thu, 30 Jan 2014 15:44:08 +0300
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 6:52 PM, Haoyi Li < at> wrote:
>> You want to know what's going on? Get involved. Then you'll know
> +1. It's odd to complain about the project's organization and processes when
> you haven't actually had any real experience with either. Getting involved
> in some project run by other people isn't easy, but it's not really that
> hard either in the world of open source.

I know that my experience is nothing compared to other people's, and therefore
I am even more interested in getting feedback from people deeply involved in
the organization and processes about the good and bad things in the original
idea of "Iterative Development" presented in the first thread message under this

From techtonik at  Thu Jan 30 13:52:43 2014
From: techtonik at (anatoly techtonik)
Date: Thu, 30 Jan 2014 15:52:43 +0300
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014 at 7:27 PM, Chris Angelico <rosuav at> wrote:
> On Wed, Jan 29, 2014 at 11:48 PM, anatoly techtonik <techtonik at> wrote:
>> You don't need to work full time to participate in two week cycle.
>> As I answered to Ethan, it is not development cycle time. It is just
>> ordinary two weeks time. You choose what you can do in these two
>> week and do this. You may find that you have more time than you've
>> planned during this time, so you can see who is working on what
>> and help them (if possible).
> What does the two-week cycle achieve that current processes with the
> bug tracker can't?

More fun with collaboration. For some people it is not fun to grok bugs
that they don't personally need solved, sometimes because of the complexity
of the problem, but helping someone else may be fun.

The current bug tracker doesn't show:
1. what is important for people who think like you do
2. what the current development focus is

So you cannot plan how to spend your time more effectively and how to
help with development.

> Please explain to us the benefits of the Agile model, as they apply to
> a loose collaboration.

As I said, there is no single Agile model. A model can be agile (adapting,
willing to change, flexible), natural or rigid.

In a rigid model you don't have a choice. Take the bug, commit, release.
In a natural model you may have additional and optional steps.
In an agile model, you have a feedback loop that lets you estimate how
good the model actually is and experiment with it to see if it can be

From techtonik at  Thu Jan 30 14:25:36 2014
From: techtonik at (anatoly techtonik)
Date: Thu, 30 Jan 2014 16:25:36 +0300
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 30, 2014 at 3:49 PM, Chris Angelico <rosuav at> wrote:
> On Thu, Jan 30, 2014 at 10:56 PM, anatoly techtonik <techtonik at> wrote:
>> You say - "short cycles" are bad. In agile I'd say - let's try and see why.
>> Maybe it's the cycles that are bad, maybe it's people who cannot sync
>> this often, maybe there is a technical problem with communication that
>> can be resolved by using the right tools.
> The very concept of a cycle suggests a system that's more suited to a
> business environment than general open source development.

The cycle is needed when you need some kind of visibility into the process.
For a business environment it is critical, because the business has to control
spending. For an open source environment it is critical for people, because
they need to plan their time.

> Forcing
> people to pick up and set down work might be useful in the very short
> period just before a version release (I've been seeing some stuff
> about Argument Clinic - btw, kudos to the tireless people doing that,
> it's a huge job - and how some of the work will be deferred to 3.5),

Again, it is not forcing anyone. It is just a process. You are free to fail
your development goal. It is not a business - there is nobody to fire you
or say that you're underperforming. There is no heroism either. If you do
not know your development pace, you can try to measure it; if you do
know it, you just realistically state what you are working on and whether
you need help with that.

> but most of the time, it's completely unnecessary. In big business,
> you might have a couple dozen programmers working on some particular
> job; in that two week cycle, each one could potentially put in quite a
> few hours. I heard a figure of 80 hours quoted, but I'm dubious about
> how many actual dev hours a salaried programmer would get done, in
> between meetings and whatnot. Still, could easily be upwards of 50
> hours. Forcing everyone to stop and re-check things every fifty dev
> hours doesn't sound too bad. Now look at volunteers. Two weeks might
> be anywhere from zero hours up to... well, the upper end doesn't
> matter. But it could easily be just a single dev hour in that time.
> Are you then going to force this person to set aside what he's
> partially done, because of some arbitrary break point?

A single dev hour is OK if you reached your goal. That's the point.
You set the goals - you reach them. If you didn't reach them - you
analyze and see what could be done better. It is all done in a relaxed
and free manner, unlike the bloody corporate culture. You may invite
other people to join the fun. People can find out what you are working
on and offer help.

This is the process.

> Now, what happens if you take Agile and eliminate the two-week period?
> It begins to look very much like a pool of issues on a bug tracker.
> You have a pile of stuff to do, someone picks up something he feels
> like doing, posts a result back. Hmm, I wonder if that might be what's
> already happening... Do you see now why I was, without any experience
> of Agile, already dubious about its merits? And that even before Nick
> stated from experience that it's not going to help.

That's crowdsourced development, not teamwork in a distributed
environment. And there is no place for a team to appear if everybody looks
at a big pile of garbage and picks the shiny metal plate that is
precious only to him.

The environment you've described does not encourage team formation and
collaboration in any way. More than that - it looks like people would even
object if commercial development teams offered their work. This has
already happened in the past with the "unladen swallow" project. The current
development process couldn't digest the result of that work, and people
didn't even try to adjust the process to make future efforts possible.

> Ideas are all very well, but they're useless without some form of
> test-bed. The only perfect way to find out if an idea works or not is
> to try it, and the onus is on the inventor to risk something for his
> idea. Put the theory to work on some project. Once you can point to
> some clear advantages *in practice*, you'll be able to recommend this
> to other people. So... fork CPython, tell us all how wonderful your
> version is going to be, and then show us how, in two weeks, or four
> weeks, or six weeks, you can do amazing stuff with a motley crew of
> programmers. Then we'll all take notice.

I don't want core devs to accept this process at all. I don't want to sell it
to them and I don't want them to follow it. =) It is completely optional,
and I just don't want them to put more obstacles in the way of new people who
would like to try it.

It may turn out that resistance to change is greater in open source projects
than in organizations. I just want to make sure that people are
aware that applying agile methodology to open source development is
possible, and I am inclined to think that it would bring more improvements to
Python itself than the de facto development processes.

From phd at  Thu Jan 30 16:35:27 2014
From: phd at (Oleg Broytman)
Date: Thu, 30 Jan 2014 16:35:27 +0100
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

anatoly, if you are trying to change the development process, you're
making two big mistakes:

1. You are late by twentysomething years.

2. You are trying to change the development process from the outside.
That never works. In the world of free software changes can only be made
from the inside. First become a good citizen, a valuable contributor,
then propose changes to the process.

     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

From zachary.ware+pyideas at  Thu Jan 30 17:17:52 2014
From: zachary.ware+pyideas at (Zachary Ware)
Date: Thu, 30 Jan 2014 10:17:52 -0600
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

I haven't been following this thread very closely, but I have to
disagree with you here, Anatoly.

On Thu, Jan 30, 2014 at 5:24 AM, anatoly techtonik <techtonik at> wrote:
> It is quite obvious from outside that Python has some kind of process,

Which is well documented in several places.  It can be tricky to
always find all of those places, but anyone who is interested can ask,
and will be quickly shown where to look.

> but it is quite hard to sync to it for people from outside,

I'm not sure what you mean here.  Every contributor starts from
"outside" of Python.  I found no difficulty in getting started when I
did, and I've seen several people start contributing successfully
since then.  It would be very hard to go from nothing to suddenly
contributing huge patches to the innermost details of Python at a
rapid pace, but that's not really what people (especially people new
to open source development, like I was) should be doing anyway.  Start
slow and small, build from there, and it's an easy and painless

> because it is not open

Here I must disagree emphatically.  My entire Python experience shows
me that everything about Python is as open as possible.  If you want
to know something, look for it.  If you can't find it, ask for it.  If
you can't be shown where it is, somebody (even yourself) will write it
down somewhere so the next person looking can find it.

> - is not completely clear how the planning is made,

I'm not sure what you mean here, what planning?  Anything that could
be construed as "planning" is done via the PEP process, which is well
documented in PEP 1.

> which tasks are available for current sprint, what you can help with and how to track
> the progress.

This is the very definition of a bug tracker, and Python's is quite
good for all of this.  There could stand to be some upkeep done on
some of the older issues: it would be good for an impartial person to
pick through and see whether an issue is still a problem, update any
patches to apply to current branches, manage the 'easy' tag, add the
proper people to the nosy list, etc.  This kind of thing would be a
great place for someone to contribute.  Honestly, just bringing all
tracker issues up to date would be a worthwhile sprint task in my


From random832 at  Thu Jan 30 17:38:07 2014
From: random832 at (random832 at
Date: Thu, 30 Jan 2014 11:38:07 -0500
Subject: [Python-ideas] Normalized Python
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 29, 2014, at 12:24, Andrew Barnert wrote:
> Fitting this into a Python 2-style io model as Anatoly suggests is
> completely impossible. Instead, every single program would have to either
> check that stdout.isatty

As a sidenote, isatty is broken on windows: it considers NUL to be a
tty. This is because it wraps a C function which in MSVC has the same

From at  Thu Jan 30 18:44:50 2014
From: at (Yury Selivanov)
Date: Thu, 30 Jan 2014 12:44:50 -0500
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
Message-ID: <>

On 1/27/2014, 12:41 PM, Wolfgang wrote:
> As for the first subject:
> Specifically, I am not happy with the way the function handles different
> types. Currently _coerce_types gets called for every element in the
> function's input sequence and type conversion follows quite complicated
> rules, and - what is worse - makes the outcome of _sum() and thereby mean()
> dependent on the order of items in the input sequence, e.g.:
>>>> mean((1,Fraction(2,3),1.0,Decimal(2.3),2.0, Decimal(5)))
> 1.9944444444444445
>>>> mean((1,Fraction(2,3),Decimal(2.3),1.0,2.0, Decimal(5)))
> Traceback (most recent call last):
>    File "<pyshell#7>", line 1, in <module>
>      mean((1,Fraction(2,3),Decimal(2.3),1.0,2.0, Decimal(5)))
>    File "C:\Python33\", line 369, in mean
>      return _sum(data)/n
>    File "C:\Python33\", line 157, in _sum
>      T = _coerce_types(T, type(x))
>    File "C:\Python33\", line 327, in _coerce_types
>      raise TypeError('cannot coerce types %r and %r' % (T1, T2))
> TypeError: cannot coerce types <class 'fractions.Fraction'> and <class
> 'decimal.Decimal'>
FWIW, I find some of the concerns Wolfgang raised quite valid.

Steven, what do you think?


From greg at  Thu Jan 30 19:15:04 2014
From: greg at (Gregory P. Smith)
Date: Thu, 30 Jan 2014 10:15:04 -0800
Subject: [Python-ideas]  statistics module in Python3.4
Message-ID: <>

(resending my the original had the wrong list address in the cc for some

---------- Forwarded message ----------
From: Gregory P. Smith <greg at>
Date: Thu, Jan 30, 2014 at 9:59 AM
Subject: Re: [Python-ideas] statistics module in Python3.4
To: Wolfgang <wolfgang.maier at>
Cc: python-ideas at, Steven D'Aprano <steve at>,
Larry Hastings <larry at>

+cc Steve, the PEP 450 author

On Mon, Jan 27, 2014 at 9:41 AM, Wolfgang <
wolfgang.maier at> wrote:

> Dear all,
> I am still testing the new statistics module and I found two cases where
> the behavior of the module seems suboptimal to me.
> My most important concern is the module's internal _sum function and its
> implications, the other one about passing Counter objects to module
> functions.
> As for the first subject:
> Specifically, I am not happy with the way the function handles different
> types. Currently _coerce_types gets called for every element in the
> function's input sequence and type conversion follows quite complicated
> rules, and - what is worse - makes the outcome of _sum() and thereby mean()
> dependent on the order of items in the input sequence, e.g.:
> >>> mean((1,Fraction(2,3),1.0,Decimal(2.3),2.0, Decimal(5)))
> 1.9944444444444445
> >>> mean((1,Fraction(2,3),Decimal(2.3),1.0,2.0, Decimal(5)))
> Traceback (most recent call last):
>   File "<pyshell#7>", line 1, in <module>
>     mean((1,Fraction(2,3),Decimal(2.3),1.0,2.0, Decimal(5)))
>   File "C:\Python33\", line 369, in mean
>     return _sum(data)/n
>   File "C:\Python33\", line 157, in _sum
>     T = _coerce_types(T, type(x))
>   File "C:\Python33\", line 327, in _coerce_types
>     raise TypeError('cannot coerce types %r and %r' % (T1, T2))
> TypeError: cannot coerce types <class 'fractions.Fraction'> and <class
> 'decimal.Decimal'>
> (this is because when _sum iterates over the input type Fraction wins over
> int, then float wins over Fraction and over everything else that follows in
> the first example, but in the second case Fraction wins over int, but then
> Fraction vs Decimal is undefined and throws an error).
> Confusing, isn't it? So here's the code of the _sum function:
> def _sum(data, start=0):
>     """_sum(data [, start]) -> value
>     Return a high-precision sum of the given numeric data. If optional
>     argument ``start`` is given, it is added to the total. If ``data`` is
>     empty, ``start`` (defaulting to 0) is returned.
>     Examples
>     --------
>     >>> _sum([3, 2.25, 4.5, -0.5, 1.0], 0.75)
>     11.0
>     Some sources of round-off error will be avoided:
>     >>> _sum([1e50, 1, -1e50] * 1000)  # Built-in sum returns zero.
>     1000.0
>     Fractions and Decimals are also supported:
>     >>> from fractions import Fraction as F
>     >>> _sum([F(2, 3), F(7, 5), F(1, 4), F(5, 6)])
>     Fraction(63, 20)
>     >>> from decimal import Decimal as D
>     >>> data = [D("0.1375"), D("0.2108"), D("0.3061"), D("0.0419")]
>     >>> _sum(data)
>     Decimal('0.6963')
>     """
>     n, d = _exact_ratio(start)
>     T = type(start)
>     partials = {d: n}  # map {denominator: sum of numerators}
>     # Micro-optimizations.
>     coerce_types = _coerce_types
>     exact_ratio = _exact_ratio
>     partials_get = partials.get
>     # Add numerators for each denominator, and track the "current" type.
>     for x in data:
>         T = _coerce_types(T, type(x))
>         n, d = exact_ratio(x)
>         partials[d] = partials_get(d, 0) + n
>     if None in partials:
>         assert issubclass(T, (float, Decimal))
>         assert not math.isfinite(partials[None])
>         return T(partials[None])
>     total = Fraction()
>     for d, n in sorted(partials.items()):
>         total += Fraction(n, d)
>     if issubclass(T, int):
>         assert total.denominator == 1
>         return T(total.numerator)
>     if issubclass(T, Decimal):
>         return T(total.numerator)/total.denominator
>     return T(total)
> Internally, the function uses exact ratios for its calculations (which I
> think is very nice) and only goes through all the pain of coercing types to
> return
> T(total.numerator)/total.denominator
> where T is the final type resulting from the chain of conversions.
> I think a much cleaner (and probably faster) implementation would be to
> gather first all the types in the input sequence, then decide what to
> return in an input order independent way.

+1 Agreed that this would be cleaner given your example above.

>  My tentative implementation:
> def _sum2(data, start=None):
>     if start is not None:
>         t = set((type(start),))
>         n, d = _exact_ratio(start)
>     else:
>         t = set()
>         n = 0
>         d = 1
>     partials = {d: n}  # map {denominator: sum of numerators}
>     # Micro-optimizations.
>     exact_ratio = _exact_ratio
>     partials_get = partials.get
>     # Add numerators for each denominator, and build up a set of all types.
>     for x in data:
>         t.add(type(x))
>         n, d = exact_ratio(x)
>         partials[d] = partials_get(d, 0) + n
>     T = _coerce_types(t)  # decide which type to use based on set of all types
>     if None in partials:
>         assert issubclass(T, (float, Decimal))
>         assert not math.isfinite(partials[None])
>         return T(partials[None])
>     total = Fraction()
>     for d, n in sorted(partials.items()):
>         total += Fraction(n, d)
>     if issubclass(T, int):
>         assert total.denominator == 1
>         return T(total.numerator)
>     if issubclass(T, Decimal):
>         return T(total.numerator)/total.denominator
>     return T(total)
> this leaves the re-implementation of _coerce_types. Personally, I'd prefer
> something as simple as possible, maybe even:
> def _coerce_types (types):
>     if len(types) == 1:
>         return next(iter(types))
>     return float
> , but that's just a suggestion.
> In this case then:
> >>> _sum2((1,Fraction(2,3),1.0,Decimal(2.3),2.0, Decimal(5)))/6
> 1.9944444444444445
> >>> _sum2((1,Fraction(2,3),Decimal(2.3),1.0,2.0, Decimal(5)))/6
> 1.9944444444444445
> let's check the examples from the _sum docstring just to be sure:
> >>> _sum2([3, 2.25, 4.5, -0.5, 1.0], 0.75)
> 11.0
> >>> _sum2([1e50, 1, -1e50] * 1000)  # Built-in sum returns zero.
> 1000.0
> >>> from fractions import Fraction as F
> >>> _sum2([F(2, 3), F(7, 5), F(1, 4), F(5, 6)])
> Fraction(63, 20)
> >>> from decimal import Decimal as D
> >>> data = [D("0.1375"), D("0.2108"), D("0.3061"), D("0.0419")]
> >>> _sum2(data)
> Decimal('0.6963')
> Now the second issue:
> It is maybe more a matter of taste and concerns the effects of passing a
> Counter() object to various functions in the module.
> I know this is undocumented and it's probably the user's fault if he tries
> that, but still:
> >>> from collections import Counter
> >>> c=Counter((1,1,1,1,2,2,2,2,2,3,3,3,3))
> >>> c
> Counter({1: 4, 2: 5, 3: 4})
> >>> mode(c)
> 2
> Cool, mode knows how to work with Counters (interpreting them as frequency
> tables)
> >>> median(c)
> 2
> Looks good
> >>> mean(c)
> 2.0
> Very well
> But the truth is that only mode really works as you may think and we were
> just lucky with the other two:
> >>> c=Counter((1,1,2))
> >>> mean(c)
> 1.5
> oops
> >>> median(c)
> 1.5
> hmm
> From a quick look at the code you can see that mode actually converts your
> input to a Counter behind the scenes anyway, so it has no problem.
> mean and median, on the other hand, are simply iterating over their input,
> so if that input happens to be a mapping, they'll use just the keys.
> I think there are two simple ways to avoid this pitfall:
> 1) add an explicit warning to the docs explaining this behavior or
> 2) make mean and median do the same magic with Counters as mode does, i.e.
> make them check for Counter as the input type and deal with it as if it
> were a frequency table. I'd favor this behavior because it looks like
> little extra code, but may be very useful in many situations. I'm not quite
> sure whether maybe even all mappings should be treated that way?

I think this definitely needs documenting. Even if a behavior isn't settled
on in time for 3.4 would it make sense to add some asserts to prevent
passing a Counter to mean and median for the time being so that this could
be improved in a later bugfix rather than becoming an odd behavior we need
to maintain compatibility with in the future?

It's very late in the release cycle so the best option for these kinds of
changes may be to just document them as known issues and behaviors that we
will or may fix in future releases.

I think Steve and Larry should make the call on that.

thanks for putting the new module through its paces!

From breamoreboy at  Thu Jan 30 19:16:23 2014
From: breamoreboy at (Mark Lawrence)
Date: Thu, 30 Jan 2014 18:16:23 +0000
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <lce4te$sn5$>

On 30/01/2014 12:52, anatoly techtonik wrote:
> So you can not plan how to spend your time more effectively and how to
> help with development.

Core dev time could be used more effectively if they weren't sidetracked 
by non-issues e.g. blithering idiots who keep reopening issues on the 
bug tracker.  I won't mention any names.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From abarnert at  Thu Jan 30 19:59:46 2014
From: abarnert at (Andrew Barnert)
Date: Thu, 30 Jan 2014 10:59:46 -0800
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 30, 2014, at 5:25, anatoly techtonik <techtonik at> wrote:

> It may happen that resistance to change for open source projects may
> be bigger than in organizations. I just want to make sure that people
> are aware that applying agile methodology to open source development is
> possible, and I am inclined to think that it would bring more positive
> improvements to Python itself than the de facto development processes.

Do you have any examples of an open source project (and not a company-driven one) that applied agile methodology and gained any benefits? Showing something concrete like that would make a far better argument than just rambling about what might be possible.

From g.brandl at  Thu Jan 30 21:15:36 2014
From: g.brandl at (Georg Brandl)
Date: Thu, 30 Jan 2014 21:15:36 +0100
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <lcebrb$lf9$>

Am 30.01.2014 17:17, schrieb Zachary Ware:
> I haven't been following this thread very closely, but I have to
> disagree with you here, Anatoly.
> On Thu, Jan 30, 2014 at 5:24 AM, anatoly techtonik <techtonik at> wrote:
>> It is quite obvious from outside that Python has some kind of process,
> Which is well documented in several places.  It can be tricky to
> always find all of those places, but anyone who is interested can ask,
> and will be quickly shown where to look.

Nowadays the development process is really well documented in the devguide.
If anything is still not in there, that should be fixed.

>> but it is quite hard to sync to it for people from outside,
> I'm not sure what you mean here.  Every contributor starts from
> "outside" of Python.  I found no difficulty in getting started when I
> did, and I've seen several people start contributing successfully
> since then.  It would be very hard to go from nothing to suddenly
> contributing huge patches to the innermost details of Python at a
> rapid pace, but that's not really what people (especially people new
> to open source development, like I was) should be doing anyway.  Start
> slow and small, build from there, and it's an easy and painless
> process.
>> because it is not open
> Here I must disagree emphatically.  My entire Python experience shows
> me that everything about Python is as open as possible.  If you want
> to know something, look for it.  If you can't find it, ask for it.

That's the key: *ask* for it.  Do not rant that you didn't find something,
complain that it wasn't in some random place you expected it, and then not
accept help and hints from people that weren't put off replying to you in
the first place.

>> - is not completely clear how the planning is made,
> I'm not sure what you mean here, what planning?  Anything that could
> be construed as "planning" is done via the PEP process, which is well
> documented in PEP 1.

We have tried quite a few times to make it clear to Anatoly that there is
no "planning" made apart from what you can read about in PEPs and mailing
lists.  Apparently he thinks there's a secret agenda, when in reality there
often is no (shared) agenda at all -- that's in the nature of an open source
project.  Of course individual developers may have private agendas.

>> which tasks are available for current sprint, what you can help with and how to track
>> the progress.
> This is the very definition of a bug tracker, and Python's is quite
> good for all of this.  There could stand to be some upkeep done on
> some of the older issues: it would be good for an impartial person to
> pick through and see whether an issue is still a problem, update any
> patches to apply to current branches, manage the 'easy' tag, add the
> proper people to the nosy list, etc.  This kind of thing would be a
> great place for someone to contribute.  Honestly, just bringing all
> tracker issues up to date would be a worthwhile sprint task in my
> opinion.

Few people have tried that because it's such a thankless task, but
there was definitely progress.


From wolfgang.maier at  Thu Jan 30 16:28:59 2014
From: wolfgang.maier at (Wolfgang)
Date: Thu, 30 Jan 2014 07:28:59 -0800 (PST)
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
Message-ID: <>

> Opinions anyone?

Nobody ?


From ethan at  Thu Jan 30 23:27:10 2014
From: ethan at (Ethan Furman)
Date: Thu, 30 Jan 2014 14:27:10 -0800
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/30/2014 07:28 AM, Wolfgang wrote:
>> Opinions anyone?
> Nobody ?

As a layman your concerns make sense to me.  :)


From breamoreboy at  Fri Jan 31 00:27:52 2014
From: breamoreboy at (Mark Lawrence)
Date: Thu, 30 Jan 2014 23:27:52 +0000
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
Message-ID: <lcen5g$sic$>

On 27/01/2014 17:41, Wolfgang wrote:

> Ok, that's it for now I guess. Opinions anyone?
> Best,
> Wolfgang

So this doesn't get lost I'd be inclined to raise two issues on the bug 
tracker.  It's also much easier for people to follow the issues there 
and better still, see what the actual outcome is.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From ethan at  Fri Jan 31 00:31:37 2014
From: ethan at (Ethan Furman)
Date: Thu, 30 Jan 2014 15:31:37 -0800
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <lcen5g$sic$>
References: <>
Message-ID: <>

On 01/30/2014 03:27 PM, Mark Lawrence wrote:
> On 27/01/2014 17:41, Wolfgang wrote:
>> Ok, that's it for now I guess. Opinions anyone?
>> Best,
>> Wolfgang
> So this doesn't get lost I'd be inclined to raise two issues on the bug tracker.  It's also much easier for people to
> follow the issues there and better still, see what the actual outcome is.

Checking first is usually good policy, but now that you've had positive feedback, raising some issues on the bug tracker [1] is
definitely a good idea.



From steve at  Fri Jan 31 02:07:25 2014
From: steve at (Steven D'Aprano)
Date: Fri, 31 Jan 2014 12:07:25 +1100
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
Message-ID: <20140131010724.GE3799@ando>

On Mon, Jan 27, 2014 at 09:41:02AM -0800, Wolfgang wrote:
> Dear all,
> I am still testing the new statistics module and I found two cases where the 
> behavior of the module seems suboptimal to me.
> My most important concern is the module's internal _sum function and its 
> implications, the other one about passing Counter objects to module 
> functions.

As the author of the module, I'm also concerned with the internal _sum 
function. That's why it's now a private function -- I originally 
intended for it to be a public function (see PEP 450).

> As for the first subject:
> Specifically, I am not happy with the way the function handles different 
> types. Currently _coerce_types gets called for every element in the 
> function's input sequence and type conversion follows quite complicated 
> rules, and - what is worse - makes the outcome of _sum() and thereby mean() 
> dependent on the order of items in the input sequence, e.g.:
> (this is because when _sum iterates over the input type Fraction wins over 
> int, then float wins over Fraction and over everything else that follows in 
> the first example, but in the second case Fraction wins over int, but then 
> Fraction vs Decimal is undefined and throws an error).
> Confusing, isn't it? 

I don't think so. The idea is that _sum() ought to reflect the standard, 
dare I say intuitive, behaviour of repeated application of the __add__ 
and __radd__ methods, as used by the plus operator. For example, int + 
<any numeric type> coerces to the other numeric type. What else would 
you expect?
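
For reference, a minimal sketch of the pairwise rules that the plus
operator already applies to these types, using only the standard library.
The variable names are made up for illustration and nothing here is taken
from the statistics module itself.

```python
from decimal import Decimal
from fractions import Fraction

# int + Fraction gives a Fraction.
int_plus_fraction = 1 + Fraction(2, 3)

# Fraction + float gives a float.
fraction_plus_float = Fraction(2, 3) + 1.0

# Fraction + Decimal is not defined by either type, so the plus
# operator raises TypeError.
try:
    Fraction(2, 3) + Decimal("2.5")
    fraction_plus_decimal_works = True
except TypeError:
    fraction_plus_decimal_works = False

print(type(int_plus_fraction).__name__)
print(type(fraction_plus_float).__name__)
print(fraction_plus_decimal_works)
```

Because each step only looks at two operands, the type a left-to-right
chain ends up with depends on which pairs happen to meet, which is the
order dependence under discussion.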

In mathematics the number 0.4 is the same whether you write it as 0.4, 
2/5, 0.4+0j, [0; 2, 2] or any other notation you care to invent. (That 
last one is a continued fraction.) In Python, the number 0.4 is 
represented by a value and a type, and managing the coercion rules for 
the different types can be fiddly and annoying. But they shouldn't be 
*confusing* -- we have a numeric tower, and if I've written the code 
correctly, the coercion rules ought to follow the tower as closely as 
possible.

> So here's the code of the _sum function:

You should expect that to change, if for no other reason than 
performance. At the moment, _sum is about two orders of magnitude 
slower than the built-in sum. I think I can get it to about one order of 
magnitude slower.

> I think a much cleaner (and probably faster) implementation would be to 
> gather first all the types in the input sequence, then decide what to 
> return in an input order independent way. My tentative implementation:

Thanks for this. I will add that to my collection of alternate versions 
of _sum.

> this leaves the re-implementation of _coerce_types. Personally, I'd prefer 
> something as simple as possible, maybe even:
> def _coerce_types (types):
>     if len(types) == 1:
>         return next(iter(types))
>     return float

I don't want to coerce everything to float unnecessarily. Floats are, in 
some ways, the worst choice for numeric values, at least from the 
perspective of accuracy and correctness. Floats violate several of the 
fundamental rules of mathematics, e.g. addition is not commutative:

py> 1e19 + (-1e19 + 0.1) == (1e19 + -1e19) + 0.1
False

One of my aims is to avoid raising TypeError unnecessarily. The 
statistics module is aimed at casual users who may not understand, or 
care about, the subtleties of numeric coercions, they just want to take 
the average of two values regardless of what sort of number they are. 
But having said that, I realise that mixed-type arithmetic is difficult, 
and I've avoided documenting the fact that the module will work on mixed 
types.

> Now the second issue:
> It is maybe more a matter of taste and concerns the effects of passing a 
> Counter() object to various functions in the module.

Interesting. If you think there is a use-case for passing Counters to 
the statistics functions (weighted data?) then perhaps they can be 
explicitly supported in 3.5. It's way too late for 3.4 to introduce new 

> From a quick look at the code you can see that mode actually converts your 
> input to a Counter behind the scenes anyway, so it has no problem.
> mean and median, on the other hand, are simply iterating over their input, 
> so if that input happens to be a mapping, they'll use just the keys.

Well yes :-)

I'm open to the suggestion that Counters should be treated specially. 
Would you be so kind as to raise an issue in the bug tracker?
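
To make the difference concrete, here is a small sketch that compares
iterating over a Counter's keys with treating it as a frequency table.
The helper `as_observations` is hypothetical and is not part of the
statistics module; it only relies on the documented `Counter.elements()`.

```python
from collections import Counter
from statistics import mean, median


def as_observations(data):
    """Hypothetical helper: expand a Counter into its individual observations."""
    if isinstance(data, Counter):
        return list(data.elements())
    return data


c = Counter((1, 1, 2))

# Plain iteration sees only the keys 1 and 2.
mean_of_keys = mean(c)
median_of_keys = median(c)

# Expanding the counts first sees the observations 1, 1, 2.
mean_of_observations = mean(as_observations(c))
median_of_observations = median(as_observations(c))

print(mean_of_keys, median_of_keys)
print(mean_of_observations, median_of_observations)
```

Expanding the counts is the simplest thing that could work; it would be
wasteful for large counts, where a weighted computation would be a better
fit.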

Thanks for the feedback,


From steve at  Fri Jan 31 02:27:05 2014
From: steve at (Steven D'Aprano)
Date: Fri, 31 Jan 2014 12:27:05 +1100
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
Message-ID: <20140131012705.GF3799@ando>

On Thu, Jan 30, 2014 at 11:03:38AM -0800, Larry Hastings wrote:
> On Mon, Jan 27, 2014 at 9:41 AM, Wolfgang 
> <wolfgang.maier at 
> <mailto:wolfgang.maier at>> wrote:
> >I think a much cleaner (and probably faster) implementation would be 
> >to gather first all the types in the input sequence, then decide what 
> >to return in an input order independent way.
> I'm willing to consider this a "bug fix".  And since it's a new function 
> in 3.4, we don't have an installed base.  So I'm willing to consider 
> fixing this for 3.4.

I'm hesitant to require two passes over the data in _sum. Some 
higher-order statistics like variance are currently implemented using 
two passes, but ultimately I'd like to support single-pass algorithms 
that can operate on large but finite iterators.

But I will consider it as an option.
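
As an illustration of what a single-pass algorithm over an iterator can
look like, here is a sketch of Welford's running mean and variance. This
is a different technique from anything in the module: it works in plain
floats rather than exact ratios, and the function name is made up for this
example.

```python
import statistics


def running_mean_variance(iterable):
    """Welford's method: one pass, constant memory, works on iterators."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in iterable:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n < 2:
        raise ValueError("at least two data points are required")
    return mean, m2 / (n - 1)  # sample variance


data = [2, 4, 4, 4, 5, 5, 7, 9]
# iter(data) can only be consumed once, like a generator.
single_pass_mean, single_pass_variance = running_mean_variance(iter(data))

print(single_pass_mean, single_pass_variance)
print(statistics.mean(data), statistics.variance(data))
```

The trade-off is that the result is subject to ordinary float rounding, so
it is only expected to agree with the exact computation to within a small
tolerance.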

I'm also hesitant to make the promise that _sum will be 
order-independent. Addition in Python isn't:

py> class A(int):
...     def __add__(self, other):
...             return type(self)(super().__add__(other))
...     def __repr__(self):
...             return "%s(%d)" % (type(self).__name__, self)
py> class B(A):
...     pass
py> A(1) + B(1)
A(2)
py> B(1) + A(1)
B(2)

> Yes, exactly.  If the support for Counter is half-baked, let's prevent 
> it from being used now.

I strongly disagree with this. Counters are currently treated the same 
as any other iterable, and built-in sum and math.fsum don't treat them 

py> from collections import Counter
py> c = Counter([1, 1, 1, 1, 1, 2])
py> c
Counter({1: 5, 2: 1})
py> sum(c)
3
py> from math import fsum
py> fsum(c)
3.0

If you're worried about people coming to rely on this, and thus running 
into trouble in the future if Counters get treated specially for (say) 
weighted data, then I'd accept a warning in the docs, or even a runtime 
warning. But not an exception.
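
A runtime warning along those lines could be sketched as follows. The
wrapper name and the wording of the message are assumptions made for this
example; the statistics module does not do this.

```python
import warnings
from collections import Counter
from collections.abc import Mapping
from statistics import mean


def mean_with_mapping_warning(data):
    """Hypothetical wrapper: warn, but do not fail, when given a mapping."""
    if isinstance(data, Mapping):
        warnings.warn(
            "mapping passed as data; only its keys are used",
            RuntimeWarning,
            stacklevel=2,
        )
    return mean(data)


# Record warnings instead of printing them, so the effect can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = mean_with_mapping_warning(Counter([1, 1, 1, 1, 1, 2]))

print(result)
print([str(w.message) for w in caught])
```

The existing behaviour is kept (the keys are averaged), so code that
already passes a mapping keeps working while the caller is told that the
counts were ignored.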


From rosuav at  Fri Jan 31 02:32:04 2014
From: rosuav at (Chris Angelico)
Date: Fri, 31 Jan 2014 12:32:04 +1100
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <20140131010724.GE3799@ando>
References: <>
Message-ID: <>

On Fri, Jan 31, 2014 at 12:07 PM, Steven D'Aprano <steve at> wrote:
> One of my aims is to avoid raising TypeError unnecessarily. The
> statistics module is aimed at casual users who may not understand, or
> care about, the subtleties of numeric coercions, they just want to take
> the average of two values regardless of what sort of number they are.
> But having said that, I realise that mixed-type arithmetic is difficult,
> and I've avoided documenting the fact that the module will work on mixed
> types.

Based on the current docs and common sense, I would expect that
Fraction and Decimal should normally be there exclusively, and that
the only type coercions would be int->float->complex (because it makes
natural sense to write a list of "floats" as [1.4, 2, 3.7], but it
doesn't make sense to write a list of Fractions as [Fraction(1,2),
7.8, Fraction(12,35)]). Any mishandling of Fraction or Decimal with
the other three types can be answered with "Well, you should be using
the same type everywhere". (Though it might be useful to allow
int->anything coercion, since that one's easy and safe.)
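
A rough sketch of that rule as an order-independent decision over the set
of types seen in the data. The function name and the exact error message
are made up here; this is one possible reading of the suggestion, not a
proposal for the real _coerce_types.

```python
from decimal import Decimal
from fractions import Fraction


def coerce_type_set(types):
    """Hypothetical: choose one result type from a set of input types."""
    # int coerces to anything, so it never decides the result by itself.
    others = {t for t in types if not issubclass(t, int)}
    if not others:
        return int
    if len(others) == 1:
        return next(iter(others))
    # The only remaining mix allowed is float with complex.
    if all(issubclass(t, (float, complex)) for t in others):
        if any(issubclass(t, complex) for t in others):
            return complex
        return float
    names = sorted(t.__name__ for t in others)
    raise TypeError("cannot mix types: " + ", ".join(names))


print(coerce_type_set({int, Fraction}).__name__)
print(coerce_type_set({int, float}).__name__)
print(coerce_type_set({float, complex}).__name__)
```

Since the input is a set, the answer cannot depend on the order of the
data; mixing Fraction, Decimal and float raises TypeError every time
rather than only for some orderings.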


From abarnert at  Fri Jan 31 04:47:54 2014
From: abarnert at (Andrew Barnert)
Date: Thu, 30 Jan 2014 19:47:54 -0800
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 30, 2014, at 17:32, Chris Angelico <rosuav at> wrote:

> On Fri, Jan 31, 2014 at 12:07 PM, Steven D'Aprano <steve at> wrote:
>> One of my aims is to avoid raising TypeError unnecessarily. The
>> statistics module is aimed at casual users who may not understand, or
>> care about, the subtleties of numeric coercions, they just want to take
>> the average of two values regardless of what sort of number they are.
>> But having said that, I realise that mixed-type arithmetic is difficult,
>> and I've avoided documenting the fact that the module will work on mixed
>> types.
> Based on the current docs and common sense, I would expect that
> Fraction and Decimal should normally be there exclusively, and that
> the only type coercions would be int->float->complex (because it makes
> natural sense to write a list of "floats" as [1.4, 2, 3.7], but it
> doesn't make sense to write a list of Fractions as [Fraction(1,2),
> 7.8, Fraction(12,35)]). Any mishandling of Fraction or Decimal with
> the other three types can be answered with "Well, you should be using
> the same type everywhere". (Though it might be useful to allow
> int->anything coercion, since that one's easy and safe.)

Except that large enough int values lose information, and even larger ones raise an exception:

    >>> float(pow(3, 50)) == pow(3, 50)
    False
    >>> float(1<<2000)
    OverflowError: int too large to convert to float

And that first one is the reason why statistics needs a custom sum in the first place.

When there are only 2 types involved in the sequence, you get the answer you wanted. The only problem raised by the examples in this thread is that with 3 or more types that aren't all mutually coercible but do have a path through them, you can sometimes get imprecise answers and other times get exceptions, and you might come to rely on one or the other.

So, rather than throwing out Stephen's carefully crafted and clearly worded rules and trying to come up with new ones, why not (for 3.4) just say that the order of coercions given values of 3 or more types is not documented and subject to change in the future (maybe even giving the examples from the initial email)?

From abarnert at  Fri Jan 31 04:49:14 2014
From: abarnert at (Andrew Barnert)
Date: Thu, 30 Jan 2014 19:49:14 -0800
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 30, 2014, at 19:47, Andrew Barnert <abarnert at> wrote:

> So, rather than throwing out Stephen's carefully crafted and clearly worded rules

Sorry, I meant Steven there.

(At least I hope I did, otherwise this will be doubly embarrassing...)

From steve at  Fri Jan 31 05:09:38 2014
From: steve at (Steven D'Aprano)
Date: Fri, 31 Jan 2014 15:09:38 +1100
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
Message-ID: <20140131040938.GG3799@ando>

On Thu, Jan 30, 2014 at 07:47:54PM -0800, Andrew Barnert wrote:

> So, rather than throwing out Stephen's carefully crafted and clearly 
> worded rules and trying to come up with new ones, why not (for 3.4) 
> just say that the order of coercions given values of 3 or more types 
> is not documented and subject to change in the future (maybe even 
> giving the examples from the initial email)? 

I am happy to have an explicit disclaimer in the docs saying the result 
of calculations on mixed types are not guaranteed and may be subject to 
change. Then for 3.5 we can consider this more carefully.


From rosuav at  Fri Jan 31 05:36:16 2014
From: rosuav at (Chris Angelico)
Date: Fri, 31 Jan 2014 15:36:16 +1100
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 31, 2014 at 2:47 PM, Andrew Barnert <abarnert at> wrote:
>> Based on the current docs and common sense, I would expect that
>> Fraction and Decimal should normally be there exclusively, and that
>> the only type coercions would be int->float->complex (because it makes
>> natural sense to write a list of "floats" as [1.4, 2, 3.7], but it
>> doesn't make sense to write a list of Fractions as [Fraction(1,2),
>> 7.8, Fraction(12,35)]). Any mishandling of Fraction or Decimal with
>> the other three types can be answered with "Well, you should be using
>> the same type everywhere". (Though it might be useful to allow
>> int->anything coercion, since that one's easy and safe.)
> Except that large enough int values lose information, and even larger ones raise an exception:
>     >>> float(pow(3, 50)) == pow(3, 50)
>     False
>     >>> float(1<<2000)
>     OverflowError: int too large to convert to float
> And that first one is the reason why statistics needs a custom sum in the first place.

I don't think it'd be possible to forbid int -> float coercion - the
Python community (and Steven himself) would raise an outcry. But
int->float is at least as safe as it's fundamentally possible to be.
Adding ".0" to the end of a literal (thus making it a float literal)
is, AFAIK, absolutely identical to wrapping it in "float(" and ")".
That's NOT true of float -> Fraction or float -> Decimal - going via
float will cost precision, but going via int ought to be safe.

>>> float(pow(3,50)) == pow(3.0,50)

The difference between int and any other type is going to be pretty
much the same whether you convert first or convert last. The only
distinction that I can think of is floating-point rounding errors,
which are already dealt with:

>>> statistics._sum([pow(2.0,53),1.0,1.0,1.0])
>>> sum([pow(2.0,53),1.0,1.0,1.0])

Since it handles this correctly with all floats, it'll handle it just
fine with some ints and some floats:

>>> sum([pow(2,53),1,1,1.0])
>>> statistics._sum([pow(2,53),1,1,1.0])

In this case, the builtin sum() happens to be correct, because it adds
the first ones as ints, and then converts to float at the end. Of
course, "correct" isn't quite correct - the true value based on real
number arithmetic is ...95, as can be seen in Python if they're all
ints. But I'm defining "correct" as "the same result that would be
obtained by calculating in real numbers and then converting to the
data type of the end result". And by that definition, builtin sum() is
correct as long as the float is right at the end, and
statistics._sum() is correct regardless of the order.

>>> statistics._sum([1.0,pow(2,53),1,1])
>>> sum([1.0,pow(2,53),1,1])
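
The same comparison can be reproduced with public functions only. This is
a sketch under two assumptions: plain left-to-right addition stands in for
a naive sum (the built-in sum() in recent CPython versions compensates for
float rounding, so it is avoided here), and summing exact Fractions stands
in for "calculating in real numbers". The helper name is made up.

```python
import math
from fractions import Fraction
from functools import reduce
from operator import add


def add_left_to_right(values):
    """Plain repeated use of the plus operator, with no compensation."""
    return reduce(add, values)


floats = [2.0 ** 53, 1.0, 1.0, 1.0]

naive_total = add_left_to_right(floats)
# Exact rational arithmetic, rounded to float once at the end.
exact_total = float(sum(Fraction(x) for x in floats))
fsum_total = math.fsum(floats)

# Mixed ints and floats: the position of the float matters for naive addition.
float_last = add_left_to_right([2 ** 53, 1, 1, 1.0])
float_first = add_left_to_right([1.0, 2 ** 53, 1, 1])

print(naive_total, exact_total, fsum_total)
print(float_last, float_first)
```

With the float at the end the ints are added exactly first, so only one
rounding happens; with the float at the start every addition is rounded.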

So in that sense, it's "safe" to cast all int to float if the result
is going to be float, unless an individual value is itself too big to
convert, but the final result (thanks to negative values) would have
been: I'm not sure how it's currently handled, but this particular
case is working:

>>> statistics._sum([1.0,1<<2000,0-(1<<2000)])

The biggest problem, then, is cross-casting between float, Fraction,
and Decimal. And anyone who's mixing those is asking for trouble
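
A short sketch of why the int conversions are the safe ones and the float
conversions are not, using only the standard constructors. The variable
names are made up for illustration.

```python
from decimal import Decimal
from fractions import Fraction

big = 3 ** 50

# int -> Fraction and int -> Decimal keep the exact value.
int_to_fraction_exact = Fraction(big) == big
int_to_decimal_exact = Decimal(big) == big

# A float literal has already been rounded to binary, and converting it
# carries that rounded value over unchanged.
tenth_via_float = Fraction(0.1)
float_to_fraction_exact = tenth_via_float == Fraction(1, 10)
float_to_decimal_exact = Decimal(0.1) == Decimal("0.1")

print(int_to_fraction_exact, int_to_decimal_exact)
print(float_to_fraction_exact, float_to_decimal_exact)
print(tenth_via_float)
```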


From rosuav at  Fri Jan 31 05:37:35 2014
From: rosuav at (Chris Angelico)
Date: Fri, 31 Jan 2014 15:37:35 +1100
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <20140131040938.GG3799@ando>
References: <>
Message-ID: <>

On Fri, Jan 31, 2014 at 3:09 PM, Steven D'Aprano <steve at> wrote:
> On Thu, Jan 30, 2014 at 07:47:54PM -0800, Andrew Barnert wrote:
>> So, rather than throwing out Stephen's carefully crafted and clearly
>> worded rules and trying to come up with new ones, why not (for 3.4)
>> just say that the order of coercions given values of 3 or more types
>> is not documented and subject to change in the future (maybe even
>> giving the examples from the initial email)?
> I am happy to have an explicit disclaimer in the docs saying the result
> of calculations on mixed types are not guaranteed and may be subject to
> change. Then for 3.5 we can consider this more carefully.



From larry at  Fri Jan 31 05:58:20 2014
From: larry at (Larry Hastings)
Date: Thu, 30 Jan 2014 20:58:20 -0800
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <20140131012705.GF3799@ando>
References: <>
 <> <20140131012705.GF3799@ando>
Message-ID: <>

On 01/30/2014 05:27 PM, Steven D'Aprano wrote:
> I'm hesitant to require two passes over the data in _sum. Some
> higher-order statistics like variance are currently implemented using
> two passes, but ultimately I'd like to support single-pass algorithms
> that can operate on large but finite iterators.
> But I will consider it as an option.
> I'm also hesitant to make the promise that _sum will be
> order-independent. Addition in Python isn't: [...]

I concede that this is mostly outside my expertise, and the statistics 
module and the PEP were your doing.  So you're the expert here and I 
will defer to you.

But.  My dim understanding of the *whole point* of the new statistics 
module was that it valued correctness over raw performance.  I assumed 
sorting values from small to large** before summing was *exactly* the 
sort of thing it was written to do.  If all we wanted were Python's 
existing semantics, why bother writing statistics._sum() in the first 
place?  Just use sum().
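
A tiny sketch of the ordering effect in question, using plain left-to-right
float addition. functools.reduce is used instead of the built-in sum()
because recent CPython versions compensate for float rounding inside
sum(); the values are chosen only for illustration.

```python
from functools import reduce
from operator import add

values = [1e16, 1.0, 1.0]

# Large value first: each 1.0 is lost to rounding when it is added.
large_first = reduce(add, values)

# Small values first: they combine to 2.0, which survives the final addition.
small_first = reduce(add, sorted(values))

print(large_first)
print(small_first)
print(small_first - large_first)
```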

On the other hand, I had missed the fact that this was an internal-only 
method.  If changing statistics._sum so it reordered the iterable to 
preserve correctness wouldn't change the behavior of any supported 
external APIs, then obviously there's no need, and I'd prefer to leave 
it alone for 3.4.  If you decided to change it for 3.5 and people were 
relying on its old behavior, that would be on them.  (Though a comment 
saying "I might change this later" would be welcome... if true.)

> If you're worried about people coming to rely on this, and thus running
> into trouble in the future if Counters get treated specially for (say)
> weighted data, then I'd accept a warning in the docs, or even a runtime
> warning. But not an exception.

The statistics module isn't marked as provisional.  So the semantics 
that ship with 3.4 are going to be set in stone.  Changing them later 
simply won't be an option--that will break code.  If you want to treat 
Counter objects differently in the future than you do now, then I agree 
with Wolfgang: the best course of action would be to add an exception 
now.  But again I'll defer to your judgment about what's best for your 


** Or high-precision to low-precision.  You know what I mean, the 
classic "if you add large numbers first you throw away precision and can 
wind up with a different result" thing.

From suresh_vv at  Fri Jan 31 06:36:46 2014
From: suresh_vv at (Suresh V.)
Date: Fri, 31 Jan 2014 11:06:46 +0530
Subject: [Python-ideas] __before__ and __after__ attributes for functions
In-Reply-To: <>
References: <lbqfqp$7bn$>
 <lbsp1k$pf8$> <>
 <lbt66j$r48$> <>
Message-ID: <lcfcp1$f8q$>

On Thursday 30 January 2014 02:14 PM, Ronald Oussoren wrote:
> On 24 Jan 2014, at 08:54, Suresh V. <suresh_vv at> wrote:
>> On Friday 24 January 2014 10:39 AM, Ethan Furman wrote:
>>> On 01/23/2014 08:09 PM, Suresh V. wrote:
>>>> Also it would mean that the client code imports from this package.
>>>> I would like client code to remain exactly as it is (continue to
>>>> import from its original package) but the behavior is enhanced
>>>> once this package is imported on startup.
>>> /Something/ has to adjust the pre and post conditions -- if not the
>>> client code, then what?
>> pre and post conditions are just one possible use of this.
>> Going back to my smtplib.SMTP.sendmail example.
>> No changes in bulk of client code.
>> Single patch module imported in main.
> Why is this a good thing? You seem to propose adding a mechanism that makes it easily possible to modify the behaviour of existing functions, which makes it harder to reason about code.

It is a "good thing" because it adheres to the "Open/Closed principle" 
better than monkey patching does. Meaning open to extension and closed 
to modification.

> While this is also possible without language changes with the current monkey patching mechanisms it's at least clear that you're doing something naughty when writing the patching code :-)

This is for those non-naughty times :-)

From tjreedy at  Fri Jan 31 07:01:33 2014
From: tjreedy at (Terry Reedy)
Date: Fri, 31 Jan 2014 01:01:33 -0500
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
 <> <20140131012705.GF3799@ando>
Message-ID: <lcfe7n$u6s$>

On 1/30/2014 11:58 PM, Larry Hastings wrote:

> The statistics module isn't marked as provisional.

Perhaps it should be, at least with respect to sums of mixed types and 
use of Counters.

 > So the semantics that ship with 3.4 are going to be set in stone.

Given the discussion here and previously, that seems premature.

Terry Jan Reedy

From stephen at  Fri Jan 31 06:56:39 2014
From: stephen at (Stephen J. Turnbull)
Date: Fri, 31 Jan 2014 14:56:39 +0900
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <20140131010724.GE3799@ando>
References: <>
Message-ID: <>

Steven D'Aprano writes:

 > Floats violate several of the fundamental rules of mathematics,
 > e.g. addition is not commutative:

AFAIK it is.

 > py> 1e19 + (-1e19 + 0.1) == (1e19 + -1e19) + 0.1
 > False

This is a failure of associativity, not commutativity.  Associativity
is in many ways a more fundamental property.
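
A minimal sketch of the distinction, assuming IEEE 754 doubles (which is
what CPython floats are on common platforms). The variable names are
only for illustration; the values are the ones from the quoted example.

```python
a, b, c = 1e19, -1e19, 0.1

# Commutativity: swapping the two operands of a single addition
# gives the same float.
commutative = (a + c) == (c + a)

# Associativity: regrouping three operands can change the result.
left = (a + b) + c    # 0.0 + 0.1
right = a + (b + c)   # b + c rounds back to -1e19, so 0.1 is lost
associative = left == right

print(commutative, associative)
```

Here the first comparison holds and the second does not, which is the
failure of associativity described above.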

From steve at  Fri Jan 31 09:18:20 2014
From: steve at (Steven D'Aprano)
Date: Fri, 31 Jan 2014 19:18:20 +1100
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
 <20140131010724.GE3799@ando> <>
Message-ID: <20140131081820.GH3799@ando>

On Fri, Jan 31, 2014 at 02:56:39PM +0900, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
>  > Floats violate several of the fundamental rules of mathematics,
>  > e.g. addition is not commutative:
> AFAIK it is.
>  > py> 1e19 + (-1e19 + 0.1) == (1e19 + -1e19) + 0.1
>  > False
> This is a failure of associativity, not commutativity.

Oops, you are correct. I got them mixed up.

However, commutativity of addition can be violated by Python numeric types, 
although not floats alone. E.g. the example I gave earlier of two int 


From abarnert at  Fri Jan 31 09:32:10 2014
From: abarnert at (Andrew Barnert)
Date: Fri, 31 Jan 2014 00:32:10 -0800
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
 <20140131010724.GE3799@ando> <>
Message-ID: <>

On Jan 30, 2014, at 21:56, "Stephen J. Turnbull" <stephen at> wrote:

> Steven D'Aprano writes:
>> Floats violate several of the fundamental rules of mathematics,
>> e.g. addition is not commutative:
> AFAIK it is.
>> py> 1e19 + (-1e19 + 0.1) == (1e19 + -1e19) + 0.1
>> False
> This is a failure of associativity, not commutativity.  Associativity
> is in many ways a more fundamental property.

Yeah, the only way commutativity can fail with IEEE floats is if you treat nan as a number and have at least two nans, at least one of them quiet.

But associativity failing isn't really fundamental. This example fails as a consequence of the axiom of (additive) identity not holding. (There is a unique "zero", but it's not true that, for all y, x+y=y implies x is that zero.) The overflow example fails because of closure not holding (unless you count inf and nan as numbers, in which case it again fails because zero fails even more badly).

If you just meant that you lose commutativity before associativity in compositions over fields, then yeah, I guess in that sense associativity is more fundamental.
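
A small sketch of the two failures mentioned above, again assuming IEEE
754 doubles. The names are made up for the illustration.

```python
x, y = 0.1, 1e19

# x is absorbed: x + y == y although x is not zero, so "x + y == y"
# does not let you conclude that x is the additive identity.
absorbed = (x + y == y) and (x != 0.0)

# Overflow: the sum of two finite floats is not a finite float.
big = 1e308
overflowed = big + big            # inf
regrouped_a = (big + big) - big   # inf - 1e308 is still inf
regrouped_b = big + (big - big)   # 1e308 + 0.0 is 1e308

print(absorbed, overflowed, regrouped_a == regrouped_b)
```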

From steve at  Fri Jan 31 09:56:13 2014
From: steve at (Steven D'Aprano)
Date: Fri, 31 Jan 2014 19:56:13 +1100
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
 <> <20140131012705.GF3799@ando>
Message-ID: <20140131085610.GI3799@ando>

On Thu, Jan 30, 2014 at 08:58:20PM -0800, Larry Hastings wrote:
> On 01/30/2014 05:27 PM, Steven D'Aprano wrote:
> >I'm hesitant to require two passes over the data in _sum. Some
> >higher-order statistics like variance are currently implemented using
> >two passes, but ultimately I'd like to support single-pass algorithms
> >that can operate on large but finite iterators.
> >
> >But I will consider it as an option.
> >
> >I'm also hesitant to make the promise that _sum will be
> >order-independent. Addition in Python isn't: [...]
> I concede that this is mostly outside my expertise, and the statistics 
> module and the PEP were your doing.  So you're the expert here and I 
> will defer to you.
> But.  My dim understanding of the *whole point* of the new statistics 
> module was that it valued correctness over raw performance.  I assumed 
> sorting values from small to large** before summing was *exactly* the 
> sort of thing it was written to do.  If all we wanted were Python's 
> existing semantics, why bother writing statistics._sum() in the first 
> place?  Just use sum().

_sum doesn't duplicate the semantics of built-in sum(). It is sort 
of a hybrid of sum and math.fsum: like sum, it tries to conserve types, 
and give a sensible result when there are mixed types. Like fsum, it 
tries to be higher precision.
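
To illustrate the two properties being combined, here is a rough sketch
using only public functions (it does not call the private _sum). A
plain left-to-right float accumulation stands in for naive summation;
the names below are only for the example.

```python
import math
from fractions import Fraction

tenths = [0.1] * 10

# Naive left-to-right float accumulation loses precision.
naive = 0.0
for v in tenths:
    naive += v

# math.fsum tracks the lost low-order bits, but always returns a float.
precise = math.fsum(tenths)

thirds = [Fraction(1, 3)] * 3

# Built-in sum conserves the Fraction type; fsum converts to float.
kept_type = sum(thirds)
as_float = math.fsum(thirds)

print(naive, precise, repr(kept_type), type(as_float).__name__)
```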

> On the other hand, I had missed the fact that this was an internal-only 
> method.  If changing _statistics._sum so it reordered the iterable to 
> preserve correctness wouldn't change the behavior of any supported 
> external APIs, then obviously there's no need, and I'd prefer to leave 
> it alone for 3.4.

Changes to _sum may be visible, because the external APIs such as mean 
and variance rely on it. For example, an extreme case: if I removed _sum 
and replaced it with math.fsum, then all of the external APIs will 
suddenly start outputting floats and nothing but floats. (I'm not 
intending to do that.)

I think that it is asking too much to promise that no statistics 
function will ever change its numeric result. I don't intend for them 
to become *less* accurate, but they might become *more* accurate. For 
example, currently the unit tests for variance pass with an acceptable 
tolerance of 1e-12 (relative error). Perhaps this needs to be 
documented? The random module does something similar:

> If you decided to change it for 3.5 and people were 
> relying on its old behavior, that would be on them.  (Though a comment 
> saying "I might change this later" would be welcome... if true.)
> >If you're worried about people coming to rely on this, and thus running
> >into trouble in the future if Counters get treated specially for (say)
> >weighted data, then I'd accept a warning in the docs, or even a runtime
> >warning. But not an exception.
> The statistics module isn't marked as provisional.  So the semantics 
> that ship with 3.4 are going to be set in stone.  Changing them later 
> simply won't be an option--that will break code.  If you want to treat 
> Counter objects differently in the future than you do now, then I agree 
> with Wolfgang: the best course of action would be to add an exception 
> now.  But again I'll defer to your judgment about what's best for your 
> module.

Hmmm. Well, that's a much stronger promise of backward compatibility 
than I would have expected. The fact that (say) variance works with a 
dict is a pure accident of implementation, not advertised or promised in 
any way. But I'll accept your ruling. I want to reserve the right to 
add special handling of mappings in the future. In order of preference 
(highest to least) I'd like to:

1) Put a note in the documentation that handling of mappings is subject 
to change;

2) As above, plus call warnings.warn(); or

3) Raise an exception (this one only if you insist).
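
For concreteness, option 2 could look roughly like the sketch below.
This is not code from the statistics module: the wrapper name, the
choice of FutureWarning, and the message text are all assumptions made
for the illustration.

```python
import statistics
import warnings
from collections.abc import Mapping


def mean_with_mapping_warning(data):
    """Hypothetical wrapper: warn when a mapping is passed, then defer
    to statistics.mean, which iterates over the keys only."""
    if isinstance(data, Mapping):
        warnings.warn(
            "handling of mappings is subject to change",
            FutureWarning,
            stacklevel=2,
        )
    return statistics.mean(data)
```

Option 3 would replace the warnings.warn() call with a raise, for
example of TypeError.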


From wolfgang.maier at  Fri Jan 31 09:57:26 2014
From: wolfgang.maier at (Wolfgang Maier)
Date: Fri, 31 Jan 2014 08:57:26 +0000 (UTC)
Subject: [Python-ideas] statistics module in Python3.4
References: <>
 <> <20140131012705.GF3799@ando>
Message-ID: <>

Steven D'Aprano <steve at ...> writes:

> On Thu, Jan 30, 2014 at 11:03:38AM -0800, Larry Hastings wrote:
> > On Mon, Jan 27, 2014 at 9:41 AM, Wolfgang 
> > <wolfgang.maier at ... 
> > <mailto:wolfgang.maier at ...>> wrote:
> > >I think a much cleaner (and probably faster) implementation would be 
> > >to gather first all the types in the input sequence, then decide what 
> > >to return in an input order independent way.
> > 
> > I'm willing to consider this a "bug fix".  And since it's a new function 
> > in 3.4, we don't have an installed base.  So I'm willing to consider 
> > fixing this for 3.4.
> I'm hesitant to require two passes over the data in _sum. Some 
> higher-order statistics like variance are currently implemented using 
> two passes, but ultimately I'd like to support single-pass algorithms 
> that can operate on large but finite iterators.
> But I will consider it as an option.
> I'm also hesitant to make the promise that _sum will be 
> order-independent. Addition in Python isn't:
> py> class A(int):
> ...     def __add__(self, other):
> ...             return type(self)(super().__add__(other))
> ...     def __repr__(self):
> ...             return "%s(%d)" % (type(self).__name__, self)
> ...
> py> class B(A):
> ...     pass
> ...
> py> A(1) + B(1)
> A(2)
> py> B(1) + A(1)
> B(2)

Hi Steven,
first of all let me say that I am quite amazed by the extent of the
discussion that is going on now. All I really meant is that there are two
special cases (mixed types in _sum and Counters as input to some functions)
that I find worth reconsidering in an otherwise really useful module.

Regarding your comments above and in other posts:
I never proposed two passes over the data. My implementation (below again
because many people seem to have missed it in my first rather long post)
gathers the input types in a set **while** calculating the sum in a single
for loop. It then calls _coerce_types passing this set only once:

def _sum2(data, start=None):
    if start is not None:
        t = set((type(start),))
        n, d = _exact_ratio(start)
    else:
        t = set()
        n = 0
        d = 1
    partials = {d: n}  # map {denominator: sum of numerators}

    # Micro-optimizations.
    exact_ratio = _exact_ratio
    partials_get = partials.get

    # Add numerators for each denominator, and build up a set of all types.
    for x in data:
        t.add(type(x))
        n, d = exact_ratio(x)
        partials[d] = partials_get(d, 0) + n
    T = _coerce_types(t) # decide which type to use based on set of all types
    if None in partials:
        assert issubclass(T, (float, Decimal))
        assert not math.isfinite(partials[None])
        return T(partials[None])
    total = Fraction()
    for d, n in sorted(partials.items()):
        total += Fraction(n, d)
    if issubclass(T, int):
        assert total.denominator == 1
        return T(total.numerator)
    if issubclass(T, Decimal):
        return T(total.numerator)/total.denominator
    return T(total) 

As for my tentative implementation of _coerce_types, it was really meant as
an example. Specifically, I said:

> Personally, I'd prefer something as simple as possible, maybe even:
> def _coerce_types (types):
>    if len(types) == 1:
>        return next(iter(types))
>    return float
> , but that's just a suggestion.

It is totally up to you to come up with something more along the lines of
your original, but I still think that making the behavior order-independent
comes at no performance-cost (if not a gain) and will make _sum's return
type more predictable. When I said the current behavior was confusing, I
didn't mean "not logical" or anything. The current rules are in fact very
precisely worked out, I just think they are too complicated to think them
through every time.
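
A self-contained, simplified sketch of the same idea, for readers who
want to run something. It is not statistics._sum: it only handles int,
float and Fraction, it sums exactly via Fraction, and the function
names are invented for the example. The point is only that the result
type is decided from the set of input types, not from their order.

```python
from fractions import Fraction


def coerce_types(types):
    # The simple rule suggested above: one type in, same type out;
    # any mixture falls back to float.
    if len(types) == 1:
        return next(iter(types))
    return float


def order_independent_sum(data):
    types = set()
    total = Fraction(0)
    for x in data:            # single pass over the data
        types.add(type(x))    # remember every type seen
        total += Fraction(x)  # exact for int, float and Fraction
    return coerce_types(types)(total)


print(order_independent_sum([1, 2.5]), order_independent_sum([2.5, 1]))
```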

You are right of course with your remark that addition in Python is also
order-dependent regarding the returned type, but in my opinion this is not
the point here. You are emphasizing that _sum is a private function of the
module, but mean is a public one and the behavior of mean is dictated by
that of _sum. Now when I call the mean function, then, of course, I know
that this will very most likely be implemented as adding all values then
dividing by their number, but in terms of encapsulation principles I
shouldn't be forced to think about this to understand the return value of
the function. In other words, it doesn't help here that _sum reflects the
behavior of __add__, all you should care about is that the behavior of
mean() is simple to explain and understand.

Again, this is just an opinion of somebody interested in having this
particular module well-designed from the beginning before things are set in

Best wishes,

From steve at  Fri Jan 31 10:04:41 2014
From: steve at (Steven D'Aprano)
Date: Fri, 31 Jan 2014 20:04:41 +1100
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
 <> <20140131012705.GF3799@ando>
Message-ID: <20140131090441.GJ3799@ando>

On Fri, Jan 31, 2014 at 08:57:26AM +0000, Wolfgang Maier wrote:
> Steven D'Aprano <steve at ...> writes:

> > I'm hesitant to require two passes over the data in _sum.
> Regarding your comments above and in other posts:
> I never proposed two passes over the data. My implementation (below again
> because many people seem to have missed it in my first rather long post)
> gathers the input types in a set **while** calculating the sum in a single
> for loop. It then calls _coerce_types passing this set only once:

Ah! I did miss it -- I just skimmed your implementation, and completely 
failed to realise the point you were making. Thank you for the 


From wolfgang.maier at  Fri Jan 31 11:12:15 2014
From: wolfgang.maier at (Wolfgang Maier)
Date: Fri, 31 Jan 2014 10:12:15 +0000 (UTC)
Subject: [Python-ideas] statistics module in Python3.4
References: <>
Message-ID: <>

Chris Angelico <rosuav at ...> writes:

> On Fri, Jan 31, 2014 at 12:07 PM, Steven D'Aprano <steve at ...> wrote:
> > One of my aims is to avoid raising TypeError unnecessarily. The
> > statistics module is aimed at casual users who may not understand, or
> > care about, the subtleties of numeric coercions, they just want to take
> > the average of two values regardless of what sort of number they are.
> > But having said that, I realise that mixed-type arithmetic is difficult,
> > and I've avoided documenting the fact that the module will work on mixed
> > types.
> Based on the current docs and common sense, I would expect that
> Fraction and Decimal should normally be there exclusively, and that
> the only type coercions would be int->float->complex (because it makes
> natural sense to write a list of "floats" as [1.4, 2, 3.7], but it
> doesn't make sense to write a list of Fractions as [Fraction(1,2),
> 7.8, Fraction(12,35)]). Any mishandling of Fraction or Decimal with
> the other three types can be answered with "Well, you should be using
> the same type everywhere". 

Well, that's simple to stick to as long as you are dealing with explicitly
typed input data sets, but what about things like:

a = transform_a_series_of_data_somehow(data)
b = transform_this_series_differently(data)

statistics.mean(a+b) # assuming a and b are lists of transformed values

potentially different types are far more difficult to spot here and the fact
that the result of the above might not be the same as, e.g.,:


is not making things easier to debug.

>(Though it might be useful to allow
> int->anything coercion, since that one's easy and safe.)

It should be mentioned here that complex numbers are not currently dealt
with by statistics._sum .

>>> statistics._sum((complex(1),))
Traceback (most recent call last):
  File "<pyshell#62>", line 1, in <module>
  File ".\", line 158, in _sum
    n, d = exact_ratio(x)
  File ".\", line 257, in _exact_ratio
    raise TypeError(msg.format(type(x).__name__)) from None
TypeError: can't convert type 'complex' to numerator/denominator
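
The same limitation can be seen through the public API. The sketch
below assumes that statistics.mean rejects complex input with a
TypeError, as the traceback above suggests, and falls back to plain
arithmetic; the variable names are only for the example.

```python
import statistics

values = [complex(1, 2), complex(3, -2)]

try:
    statistics.mean(values)
except TypeError:
    rejected = True
else:
    rejected = False

# Plain arithmetic works, since complex supports + and /.
manual_mean = sum(values) / len(values)

print(rejected, manual_mean)
```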


From at  Fri Jan 31 12:35:53 2014
From: at (Haoyi Li)
Date: Fri, 31 Jan 2014 03:35:53 -0800
Subject: [Python-ideas] Could the ast module's ASTs preserve
 source_length in addition to lineno and col_offset?
In-Reply-To: <>
References: <>
Message-ID: <>

Nothing happened, I suppose. People in general thought it was a good idea
but after looking at the python source code, I chickened out of
implementing it in favor of a dumb parse it till it
which sufficed for my purposes.
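
Note for later readers: this sketch assumes Python 3.8 or newer, where
AST nodes carry end_lineno and end_col_offset and the ast module offers
get_source_segment(). Neither existed when this thread was written.
The source string is made up for the example.

```python
import ast

source = "x = foo(1, 2) + bar"
tree = ast.parse(source)

# Pick the first call expression in the tree.
call = next(n for n in ast.walk(tree) if isinstance(n, ast.Call))

# Recover the exact source text of that node.
segment = ast.get_source_segment(source, call)

# For a node that sits on one line, its length is the offset difference.
length = call.end_col_offset - call.col_offset

print(repr(segment), length)
```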

On Fri, Jan 31, 2014 at 2:55 AM, Alexander Ivanov <alehander42 at> wrote:

> What happened :? (I am also interested in getting source_length/col_last
> kind of info. Is there an alternative Python ast wrapper/library which
> provides it?)
> On Friday, May 31, 2013 4:47:15 PM UTC+3, Nick Coghlan wrote:
>> On 31 May 2013 20:00, "Haoyi Li" <haoy... at> wrote:
>> >
>> > Ok, I'll give it a shot; I'm not familiar with the python codebase or
>> build process, but i'll puzzle it out. Where's the place to go for help
>> related to this sort of thing? python-dev?
>> Check the developer guide at docs., and if you have
>> any follow-up questions, sign up to the core-me... at list.
>> Cheers,
>> Nick.
>> >
>> >
>> > On Fri, May 31, 2013 at 1:04 AM, Nick Coghlan <ncog... at>
>> wrote:
>> >>
>> >>
>> >> On 31 May 2013 14:28, "Haoyi Li" <haoy... at> wrote:
>> >> >
>> >> > Anyone else have any thoughts about this? This seems like it would
>> be a pretty straightforward thing to do, and I would be happy to go through
>> the code and submit a patch. The only question is whether we want to do it
>> in the first place; are there any reasons it can't/shouldn't be done that
>> I'm not aware of?
>> >>
>> >> Seems reasonable to me, but would need to see a patch to give a
>> definite yes or no.
>> >>
>> >> Cheers,
>> >> Nick.
>> >>
>> >> >
>> >> >
>> >> > On Wed, May 29, 2013 at 8:09 PM, Steven D'Aprano <
>> st... at> wrote:
>> >> >>
>> >> >> On 30/05/13 10:04, Haoyi Li wrote:
>> >> >>>
>> >> >>> I don't need to keep the source code, I just need a single integer
>> for each
>> >> >>> node. I would then be able to reconstruct the source snippet.
>> >> >>
>> >> >>
>> >> >> And so you did say. Sorry for the noise.
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Steven

From ncoghlan at  Fri Jan 31 12:41:12 2014
From: ncoghlan at (Nick Coghlan)
Date: Fri, 31 Jan 2014 21:41:12 +1000
Subject: [Python-ideas] Iterative development
In-Reply-To: <lce4te$sn5$>
References: <>
Message-ID: <>

On 31 Jan 2014 04:17, "Mark Lawrence" <breamoreboy at> wrote:
> On 30/01/2014 12:52, anatoly techtonik wrote:
>> So you can not plan how to spend your time more effectively and how to
>> help with development.
> Core dev time could be used more effectively if they weren't sidetracked
by non-issues e.g. blithering idiots who keep reopening issues on the bug
tracker.  I won't mention any names.

Mark, as annoying as Anatoly is, this is still a violation of the list code
of conduct. If you find him too irritating to allow you to maintain
civility on the core lists when dealing with him, set your mail client to
ignore his messages (that's what most of the core devs have been doing for
quite some time).

Anatoly already got himself banned from with his antics,
and his moderation flag is set on all the core mailing lists. At the rate
he is going, he is not encouraging anyone to reconsider either decision,
and it's still possible for that moderation flag to be upgraded to an
outright ban from the mailing lists if the mods decide it would be
appropriate.

Regards,
Nick.

> --
> My fellow Pythonistas, ask not what our language can do for you, ask what
you can do for our language.
> Mark Lawrence

From wolfgang.maier at  Fri Jan 31 13:43:55 2014
From: wolfgang.maier at (Wolfgang Maier)
Date: Fri, 31 Jan 2014 12:43:55 +0000 (UTC)
Subject: [Python-ideas] statistics module in Python3.4
References: <>
 <> <20140131012705.GF3799@ando>
 <> <20140131085610.GI3799@ando>
Message-ID: <>

Steven D'Aprano <steve at ...> writes:

> On Thu, Jan 30, 2014 at 08:58:20PM -0800, Larry Hastings wrote:
> > 
> > The statistics module isn't marked as provisional.  So the semantics 
> > that ship with 3.4 are going to be set in stone.  Changing them later 
> > simply won't be an option--that will break code.  If you want to treat 
> > Counter objects differently in the future than you do now, then I agree 
> > with Wolfgang: the best course of action would be to add an exception 
> > now.  But again I'll defer to your judgment about what's best for your 
> > module.
> >
> Hmmm. Well, that's a much stronger promise of backward compatibility 
> than I would have expected. The fact that (say) variance works with a 
> dict is a pure accident of implementation, not advertised or promised in 
> any way. But I'll accept your ruling. I want to reserve the right to 
> add special handling of mappings in the future. In order of preference 
> (highest to least) I'd like to:
> 1) Put a note in the documentation that handling of mappings is subject 
> to change;
> 2) As above, plus call warnings.warn(); or
> 3) Raise an exception (this one only if you insist).

I thought about this further and, yes, I guess at least point 1) is
essential and even if that means marking the module as provisional it is a
bit sad, but worth it.
Mappings may be an excellent way of specifying frequencies and weights in an
elegant way. You could use them to calculate weighted means and variances,
and even to specify variable interval widths for median_grouped to calculate
a weighted median as defined here:

Most of this is quite easy to code I guess and it would be a pity to deprive
yourself of this possibility because people start passing mappings now and
start relying on iteration happening over keys only.
I agree with Larry that once this happens, it will be hard to change the
behavior even in 3.5.
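
To make the difference concrete, here is a sketch of what a weighted
mean over a {value: weight} mapping might compute, next to what
statistics.mean does with the same mapping today. The weighted_mean
helper is hypothetical and not part of the statistics module.

```python
import statistics
from collections import Counter


def weighted_mean(freqs):
    """Hypothetical helper: treat keys as values, values as weights."""
    total_weight = sum(freqs.values())
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    return sum(v * w for v, w in freqs.items()) / total_weight


counts = Counter([1, 1, 1, 5])          # {1: 3, 5: 1}

weighted = weighted_mean(counts)        # (1*3 + 5*1) / 4
keys_only = statistics.mean(counts)     # iterates over the keys 1 and 5

print(weighted, keys_only)
```

Code that relies on the keys-only result would change meaning if
mappings were later given the weighted interpretation.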


From breamoreboy at  Fri Jan 31 16:13:39 2014
From: breamoreboy at (Mark Lawrence)
Date: Fri, 31 Jan 2014 15:13:39 +0000
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <lcgeir$rnh$>

On 31/01/2014 11:41, Nick Coghlan wrote:
> On 31 Jan 2014 04:17, "Mark Lawrence"
> <breamoreboy at
> <mailto:breamoreboy at>> wrote:
>  >
>  > On 30/01/2014 12:52, anatoly techtonik wrote:
>  >>
>  >>
>  >> So you can not plan how to spend your time more effectively and how to
>  >> help with development.
>  >>
>  >
>  > Core dev time could be used more effectively if they weren't
> sidetracked by non-issues e.g. blithering idiots who keep reopening
> issues on the bug tracker.  I won't mention any names.
> Mark, as annoying as Anatoly is, this is still a violation of the list
> code of conduct. If you find him too irritating to allow you to maintain
> civility on the core lists when dealing with him, set your mail client
> to ignore his messages (that's what most of the core devs have been
> doing for quite some time).
> Anatoly already got himself banned from
> <> with his antics, and his moderation flag is set
> on all the core mailing lists. At the rate he is going, he is not
> encouraging anyone to reconsider either decision, and it's still
> possible for that moderation flag to be upgraded to an outright ban from
> the mailing lists if the mods decide it would be appropriate.
> Regards,
> Nick.

Who says I was getting at Anatoly?  Unless the English language has 
changed without my knowledge, you'll find that "idiots" and "names" are 
plural and not singular.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From rosuav at  Fri Jan 31 16:26:52 2014
From: rosuav at (Chris Angelico)
Date: Sat, 1 Feb 2014 02:26:52 +1100
Subject: [Python-ideas] Iterative development
In-Reply-To: <lcgeir$rnh$>
References: <>
Message-ID: <>

On Sat, Feb 1, 2014 at 2:13 AM, Mark Lawrence <breamoreboy at> wrote:
> Who says I was getting at Anotoly?  Unless the English language has changed
> without my knowledge, you'll find that "idiots" and "names" are plural and
> not singular.

That's a perfectly valid argument, in the same way that this is
perfectly valid code:

import math
math.pi = 3.159

def minsec(sec):
    global SECONDS_PER_MINUTE

def format_time(sec):
    min,sec = minsec(sec)
    return "%02d:%02d"%(sec,min)

It's all perfectly legal Python, but it breaks all sorts of
conventions, and you know it. In the context of this thread, it was
obvious to everyone what you were saying, and hiding behind the
technicality of plurality doesn't help you. Do please be honest with
yourself and us.


From breamoreboy at  Fri Jan 31 16:35:04 2014
From: breamoreboy at (Mark Lawrence)
Date: Fri, 31 Jan 2014 15:35:04 +0000
Subject: [Python-ideas] statistics module in Python3.4
In-Reply-To: <>
References: <>
 <> <20140131012705.GF3799@ando>
Message-ID: <lcgfr1$dd1$>

On 31/01/2014 08:57, Wolfgang Maier wrote:
> Hi Steven,
> first of all let me say that I am quite amazed by the extent of the
> discussion that is going on now. All I really meant is that there are two
> special cases (mixed types in _sum and Counters as input to some functions)
> that I find worth reconsidering in an otherwise really useful module.

Thanks for starting what I see as a very healthy debate that in the 
longer term is highly likely to make for a better statistics module. 
What more could a user ask for?

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From breamoreboy at  Fri Jan 31 16:45:16 2014
From: breamoreboy at (Mark Lawrence)
Date: Fri, 31 Jan 2014 15:45:16 +0000
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <lcgge5$l7f$>

On 31/01/2014 15:26, Chris Angelico wrote:
> On Sat, Feb 1, 2014 at 2:13 AM, Mark Lawrence <breamoreboy at> wrote:
>> Who says I was getting at Anotoly?  Unless the English language has changed
>> without my knowledge, you'll find that "idiots" and "names" are plural and
>> not singular.
> That's a perfectly valid argument, in the same way that this is
> perfectly valid code:
> #
> import math
> math.pi = 3.159
> def minsec(sec):
>      global SECONDS_PER_MINUTE
> def format_time(sec):
>      min,sec = minsec(sec)
>      return "%02d:%02d"%(sec,min)
> It's all perfectly legal Python, but it breaks all sorts of
> conventions, and you know it. In the context of this thread, it was
> obvious to everyone what you were saying, and hiding behind the
> technicality of plurality doesn't help you. Do please be honest with
> yourself and us.
> ChrisA

Asperger Syndrome sufferers are always honest.  Sadly I find it a major 
weakness that I have to live with.  We also take things literally and 
write things literally.  So your "obvious to everyone what you were 
saying" to me is clearly incorrect.  Please withdraw the comment.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From rosuav at  Fri Jan 31 17:15:07 2014
From: rosuav at (Chris Angelico)
Date: Sat, 1 Feb 2014 03:15:07 +1100
Subject: [Python-ideas] Iterative development
In-Reply-To: <lcgge5$l7f$>
References: <>
Message-ID: <>

On Sat, Feb 1, 2014 at 2:45 AM, Mark Lawrence <breamoreboy at> wrote:
> Asperger Syndrome sufferers are always honest.  Sadly I find it a major
> weakness that I have to live with.  We also take things literally and write
> things literally.  So your "obvious to everyone what you were saying" to me
> is clearly incorrect.  Please withdraw the comment.

I know what it's like to live with Aspergers, I have it myself (at
least, not formally diagnosed but it seems pretty likely). And I do
put my foot in my mouth pretty often. But that doesn't mean that I can
hide behind it as a shield when it's this obvious. You knew full well
what you were saying when you said you wouldn't mention any names.

Now, I do enjoy a good upper-class British insult-fest. That's a major
part of what makes quite a few British comedies work - the utterly
courteous, yet bitingly cutting, wit, barb, and counter-barb. The
tenor explains to the soprano that he was just disguised as a member
of the band, that he's really a much more important person, and she
says that she knew he was in disguise the minute she heard him play.
But when that wit is wielded inappropriately, the proper response is a
graceful apology or retraction... or, if circumstances demand, a
barbed retraction ("I implied in the House last week that the Hon
Member had the intelligence of a stuffed egg. This was inappropriate,
and I formally apologize for and retract this analogy. My breakfast
egg today deserved no less."), but you have to be VERY sure of your
ground before you take that option - claiming Aspergers is not

This'll probably sidetrack everyone terribly (for which I'm not sure
if I apologize; it might be a good thing for some people to get stuck
in TVTropes for a while) but this write-up about Asperger Syndrome
gives an excellent comment:
Most genuine Aspies don't see Aspergers as a 'Get Out Of Jerk Ass
Free' card, just an explanation.

If somebody offends you, then tells you they have Asperger Syndrome
and that's why they offended you, you can generally tell if this is
true by a simple observation - If the admittance is followed (or
preceded) by a genuine apology, it may be true. If it's followed by
the expectation that you should now apologise to them for being
offended, they're probably just jerks.

I'll let that speak for itself.


From ethan at  Fri Jan 31 17:28:57 2014
From: ethan at (Ethan Furman)
Date: Fri, 31 Jan 2014 08:28:57 -0800
Subject: [Python-ideas] [off-topic] Insults, English, and Aspergers
In-Reply-To: <lcgge5$l7f$>
References: <>
Message-ID: <>

On 01/31/2014 07:45 AM, Mark Lawrence wrote:
>> On Sat, Feb 1, 2014 at 2:13 AM, Mark Lawrence wrote:
>>>>On 01/30/2014 10:16 AM, Mark Lawrence wrote:
>>>>> Core dev time could be used more effectively if they weren't sidetracked
>>>>> by non-issues e.g. blithering idiots who keep reopening issues on the bug
>>>>>  tracker.  I won't mention any names.
>>> Who says I was getting at Anotoly?  Unless the English language has changed
>>> without my knowledge, you'll find that "idiots" and "names" are plural and
>>> not singular.
> Asperger Syndrome sufferers are always honest. [...]  We also take things
>  literally and write things literally.  So your "obvious to everyone what
> you were saying" to me is clearly incorrect.  Please withdraw the comment.

What a load of crap.

If you care to discuss this further, mail me off-list and stop wasting developer time.


From breamoreboy at  Fri Jan 31 18:28:34 2014
From: breamoreboy at (Mark Lawrence)
Date: Fri, 31 Jan 2014 17:28:34 +0000
Subject: [Python-ideas] [off-topic] Insults, English, and Aspergers
In-Reply-To: <>
References: <>
 <lcgge5$l7f$> <>
Message-ID: <lcgmf5$4p4$>

On 31/01/2014 16:28, Ethan Furman wrote:
> On 01/31/2014 07:45 AM, Mark Lawrence wrote:
>>> On Sat, Feb 1, 2014 at 2:13 AM, Mark Lawrence wrote:
>>>>> On 01/30/2014 10:16 AM, Mark Lawrence wrote:
>>>>>> Core dev time could be used more effectively if they weren't
>>>>>> sidetracked by non-issues e.g. blithering idiots who keep reopening
>>>>>> issues on the bug tracker.  I won't mention any names.
>>>> Who says I was getting at Anatoly?  Unless the English language has
>>>> changed without my knowledge, you'll find that "idiots" and "names"
>>>> are plural and not singular.
>> Asperger Syndrome sufferers are always honest. [...]  We also take things
>>  literally and write things literally.  So your "obvious to everyone what
>> you were saying" to me is clearly incorrect.  Please withdraw the
>> comment.
> What a load of crap.
> If you care to discuss this further, mail me off-list and stop wasting
> developer time.

Your opinion, clearly not mine.  Further I don't discuss anything that 
starts on a Python mailing list with people offline as I don't believe 
in discussing things behind other people's backs.

As for wasting developer time how much has been wasted by various people 
who've routinely insulted Python and by continuance its developers, and 
yet they're still allowed to spew their nonsense and get away with it? 
Yet again we're into the dual standards that annoy me so much, yet for 
speaking my mind I'm the one in the wrong.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From breamoreboy at  Fri Jan 31 18:32:11 2014
From: breamoreboy at (Mark Lawrence)
Date: Fri, 31 Jan 2014 17:32:11 +0000
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <lcgmls$76p$>

On 31/01/2014 16:15, Chris Angelico wrote:
> On Sat, Feb 1, 2014 at 2:45 AM, Mark Lawrence <breamoreboy at> wrote:
>> Asperger Syndrome sufferers are always honest.  Sadly I find it a major
>> weakness that I have to live with.  We also take things literally and write
>> things literally.  So your "obvious to everyone what you were saying" to me
>> is clearly incorrect.  Please withdraw the comment.
> I know what it's like to live with Aspergers, I have it myself (at
> least, not formally diagnosed but it seems pretty likely). And I do
> put my foot in my mouth pretty often. But that doesn't mean that I can
> hide behind it as a shield when it's this obvious. You knew full well
> what you were saying when you said you wouldn't mention any names.

Once again I most certainly *DID NOT*.  I knew full well what I was 
writing.  I quite deliberately used plurals for that very purpose. 
Please in future stick to your bible bashing as you clearly know far 
more about that than you know about Asperger, with myself having a 
formal diagnosis.

And please don't bother to withdraw your comment now or apologise, I 
wouldn't accept either as being in any way, shape or form genuine.

My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence

From rosuav at  Fri Jan 31 18:42:04 2014
From: rosuav at (Chris Angelico)
Date: Sat, 1 Feb 2014 04:42:04 +1100
Subject: [Python-ideas] Iterative development
In-Reply-To: <lcgmls$76p$>
References: <>
Message-ID: <>

On Sat, Feb 1, 2014 at 4:32 AM, Mark Lawrence <breamoreboy at> wrote:
> On 31/01/2014 16:15, Chris Angelico wrote:
>> On Sat, Feb 1, 2014 at 2:45 AM, Mark Lawrence <breamoreboy at> wrote:
>>> Asperger Syndrome sufferers are always honest.  Sadly I find it a major
>>> weakness that I have to live with.  We also take things literally and
>>> write things literally.  So your "obvious to everyone what you were
>>> saying" to me is clearly incorrect.  Please withdraw the comment.
>> I know what it's like to live with Aspergers, I have it myself (at
>> least, not formally diagnosed but it seems pretty likely). And I do
>> put my foot in my mouth pretty often. But that doesn't mean that I can
>> hide behind it as a shield when it's this obvious. You knew full well
>> what you were saying when you said you wouldn't mention any names.
> Once again I most certainly *DID NOT*.  I knew full well what I was writing.
> I quite deliberately used plurals for that very purpose. Please in future
> stick to your bible bashing as you clearly know far more about that than you
> know about Asperger, with myself having a formal diagnosis.
> And please don't bother to withdraw your comment now or apologise, I
> wouldn't accept either as being in any way, shape or form genuine.

I wouldn't withdraw my comment, because I still stand by it. If you
genuinely meant no specifics, then when someone pointed out how they
interpreted your statement, you would have apologized and made a
correction: "I didn't mean anyone in particular, I meant the way
there've been 50 issues reopened unnecessarily by 30 different people
lately", or something. But that wouldn't be true, would it? You really
did mean Anatoly, and that's why you said what you did. Believe you
me, I know more than you think I do. Think of Emma from "Once Upon A
Time" if you like - a strong ability to detect lying, based on a
metric ton of experience with it.


From rosuav at  Fri Jan 31 18:42:50 2014
From: rosuav at (Chris Angelico)
Date: Sat, 1 Feb 2014 04:42:50 +1100
Subject: [Python-ideas] [off-topic] Insults, English, and Aspergers
In-Reply-To: <lcgmf5$4p4$>
References: <>
 <lcgge5$l7f$> <>
Message-ID: <>

On Sat, Feb 1, 2014 at 4:28 AM, Mark Lawrence <breamoreboy at> wrote:
> As for wasting developer time how much has been wasted by various people
> who've routinely insulted Python and by continuance its developers, and yet
> they're still allowed to spew their nonsense and get away with it? Yet again
> we're into the dual standards that annoy me so much, yet for speaking my
> mind I'm the one in the wrong.

No, you're not in the wrong for speaking your mind. You're declared to
be (or treated as) in the wrong based on an objective analysis of the
content and style of your posts.

Anatoly is at fault too, but your issues are almost completely
tangential to his. They just happen to be in the same thread. Now
please, stop talking. Trust me, you're only digging yourself further
into a hole. This discussion is way way off topic, and I'm becoming
painfully aware that I've said way too much already myself.


From ctb at  Fri Jan 31 18:44:31 2014
From: ctb at (C. Titus Brown)
Date: Fri, 31 Jan 2014 09:44:31 -0800
Subject: [Python-ideas] Iterative development
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Feb 01, 2014 at 04:42:04AM +1100, Chris Angelico wrote:
> On Sat, Feb 1, 2014 at 4:32 AM, Mark Lawrence <breamoreboy at> wrote:
> > On 31/01/2014 16:15, Chris Angelico wrote:
> >>
> >> On Sat, Feb 1, 2014 at 2:45 AM, Mark Lawrence <breamoreboy at> wrote:
> >>>
> >>> Asperger Syndrome sufferers are always honest.  Sadly I find it a major
> >>> weakness that I have to live with.  We also take things literally and
> >>> write things literally.  So your "obvious to everyone what you were
> >>> saying" to me is clearly incorrect.  Please withdraw the comment.
> >>
> >>
> >> I know what it's like to live with Aspergers, I have it myself (at
> >> least, not formally diagnosed but it seems pretty likely). And I do
> >> put my foot in my mouth pretty often. But that doesn't mean that I can
> >> hide behind it as a shield when it's this obvious. You knew full well
> >> what you were saying when you said you wouldn't mention any names.
> >>
> >
> > Once again I most certainly *DID NOT*.  I knew full well what I was writing.
> > I quite deliberately used plurals for that very purpose. Please in future
> > stick to your bible bashing as you clearly know far more about that than you
> > know about Asperger, with myself having a formal diagnosis.
> >
> > And please don't bother to withdraw your comment now or apologise, I
> > wouldn't accept either as being in any way, shape or form genuine.
> I wouldn't withdraw my comment, because I still stand by it. If you
> genuinely meant no specifics, then when someone pointed out how they
> interpreted your statement, you would have apologized and made a
> correction: "I didn't mean anyone in particular, I meant the way
> there've been 50 issues reopened unnecessarily by 30 different people
> lately", or something. But that wouldn't be true, would it? You really
> did mean Anatoly, and that's why you said what you did. Believe you
> me, I know more than you think I do. Think of Emma from "Once Upon A
> Time" if you like - a strong ability to detect lying, based on a
> metric ton of experience with it.

Hi all,

this is getting rather meta, in a profoundly unproductive way.  Can we stick to
software development, Python, and non-personal-or-plural insults, please?

--titus [ <-- wearing moderator hat ]
C. Titus Brown, ctb at