From ncoghlan at  Fri Jan  1 00:51:36 2016
From: ncoghlan at (Nick Coghlan)
Date: Fri, 1 Jan 2016 15:51:36 +1000
Subject: [Python-ideas] Where to put non-collection ABCs (was:
 Deprecating the old-style sequence protocol)
In-Reply-To: <>
References: <>
Message-ID: <>

On 1 January 2016 at 08:18, Michael Selik <mike at> wrote:
> On Thu, Dec 31, 2015 at 12:48 PM Brett Cannon <brett at> wrote:
>> On Thu, Dec 31, 2015, 07:08 Alexander Walters <tritium-list at>
>> wrote:
>>> Would it be a good idea to mix 'concrete implementations of ABCs'*
>>> directly in the abc module where the tooling to create ABCs live, or to
>>> put it in a submodule?  I feel it should be a submodule, but that isn't
>>> based on vast experience.
> Locating collections ABCs in a submodule makes some sense, as there are 21
> of them and the collections module is important for beginners to learn
> without getting distracted by ABCs. Contrast that with the direct inclusion
> of ABCs in most other modules and it suggests the creation of a submodule
> for collections may have been motivated for the same reason as this
> discussion -- it didn't feel right to have certain ABCs directly in the
> collections module.

No need to speculate, the original motive for the move is documented
in the tracker: (which can be found
by looking at the commit history for the collections module: )

The problem was with folks getting confused between the abstract types
like Sequence and Mapping and the usable classes like deque, ChainMap,
OrderedDict, defaultdict, etc, rather than there being a lot of
non-collections related content in the file.

At the time, Callable was the only non-container related ABC in - most of the others now being considered for
relocation (Generator, Coroutine, Awaitable, AsyncIterable,
AsyncIterator) were added as part of the PEP 492 implementation in
3.5, and *that* was mostly driven by Iterable and Iterator already
being there so it was "logical" to also add Generator, AsyncIterable
and AsyncIterator, with Coroutine and Awaitable coming along for the ride.

That does raise the question of whether or not it's worth continuing
to publish the PEP 492 ABCs from - Guido formally
accepted PEP 492 with provisional status [1], so we have scope to do
the following:

- add abc.(Generator, Coroutine, Awaitable, AsyncIterable,
AsyncIterator) in 3.5.2 (keeping the aliases in
- drop the aliases for the PEP 492 ABCs in 3.6
- add abc.(Callable, Iterable, Iterator) in 3.6 (keeping the aliases
in indefinitely for Python 2 compatibility)


> If the non-collection ABCs are being moved out of the collections module and
> into the ``abc`` module, there's less reason to separate them into a
> submodule. Beginners don't venture into the abc module expecting to
> understand everything. It's natural to find a bunch of ABCs in a module
> called ``abc``. And ABCs are included directly in many other modules instead
> of being relegated to a less discoverable submodule like ````,
> ````, ````, etc. as many of those are focused on ABCs in
> the first place.

Right, the reason and make sense is that
when you import "importlib", you're probably interested in dynamic
imports, and when you import "collections", you're probably interested
in using one of the concrete container classes. The ABCs are only
relevant if you're wanting to do some kind of type checking or define
your own classes implementing the ABCs, so it makes sense to separate
them at the module level, not just in the documentation.

Other modules defining ABCs either don't need separation, or get their
separation in other ways:

abc: no separation needed, you're necessarily already thinking about
ABCs when importing this
typing: no separation needed, the only ABC is the one for defining generic types
email: no separation needed, the only ABC is the one for defining email policies
io: to use the io stack, you just call open() or use some other
file/stream opening API
numbers: to use the numeric tower, you use a builtin type,
fractions.Fraction, decimal.Decimal, or some other numeric type


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From surya.subbarao1 at  Sat Jan  2 13:14:55 2016
From: surya.subbarao1 at (u8y7541 The Awesome Person)
Date: Sat, 2 Jan 2016 10:14:55 -0800
Subject: [Python-ideas] Bad programming style in decorators?
Message-ID: <>

In most decorator tutorials, it's taught using functions inside functions.
Isn't this inefficient because every time the decorator is called, you're
redefining the function, which takes up much more memory? I prefer defining
decorators as classes with __call__ overridden. Is there a reason why
decorators are taught with functions inside functions?

-Surya Subbarao

From abarnert at  Sat Jan  2 13:56:19 2016
From: abarnert at (Andrew Barnert)
Date: Sat, 2 Jan 2016 10:56:19 -0800
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 2, 2016, at 10:14, u8y7541 The Awesome Person <surya.subbarao1 at> wrote:
> In most decorator tutorials, it's taught using functions inside functions. Isn't this inefficient because every time the decorator is called, you're redefining the function, which takes up much more memory?


No.

First, most decorators are only called once. For example:

    @lru_cache(maxsize=None)
    def fib(n):
        if n < 2:
            return n
        return fib(n-1) + fib(n-2)

The lru_cache function gets called once, and creates and returns a decorator function. That decorator function is passed fib, and creates and returns a wrapper function. That wrapper function is what gets stored in the globals as fib. It may then get called a zillion times, but it doesn't create or call any new function.
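
A minimal sketch of that call sequence, using a hypothetical traced_cache factory (a stand-in written for illustration, not how lru_cache is actually implemented) that counts how often each layer runs:

```python
import functools

# Counters for each layer; the names here are illustrative assumptions.
calls = {"factory": 0, "decorator": 0, "wrapper": 0}

def traced_cache():
    """Hypothetical decorator factory, loosely modelled on lru_cache."""
    calls["factory"] += 1

    def decorator(func):
        calls["decorator"] += 1
        cache = {}

        @functools.wraps(func)
        def wrapper(n):
            calls["wrapper"] += 1
            if n not in cache:
                cache[n] = func(n)
            return cache[n]

        return wrapper

    return decorator

@traced_cache()
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

result = fib(30)
print(calls)
```

The factory and the decorator each run once, at definition time; only the wrapper count grows as fib is called.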

So, why should you care whether lru_cache is implemented with a function or a class? You're talking about a difference of a few dozen bytes, once in the entire lifetime of your program.

Plus, where do you get the idea that a function object is "much larger"? Each new function that gets built uses the same code object, globals dict, etc., so you're only paying for the cost of a function object header, plus a tuple of cell objects (pointers) for any state variables. Your alternative is to create a class instance header (maybe a little smaller than a function object header), and store all those state variables in a dict (33-50% bigger even with the new split-instance-dict optimizations).

Anyway, I'm willing to bet that in this case, the function is ~256 bytes while the class is ~1024, so you're actually wasting rather than saving memory. But either way, it's far too little memory to care.

> I prefer defining decorators as classes with __call__ overridden. Is there a reason why decorators are taught with functions inside functions?

Because most decorators are more concise, more readable, and easier to understand that way. And that's far more important than a micro-optimization which may actually be a pessimization but which even more likely isn't going to matter at all.

From ncoghlan at  Sat Jan  2 21:42:31 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 3 Jan 2016 12:42:31 +1000
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

On 3 January 2016 at 04:56, Andrew Barnert via Python-ideas
<python-ideas at> wrote:
> On Jan 2, 2016, at 10:14, u8y7541 The Awesome Person <surya.subbarao1 at> wrote:
>> In most decorator tutorials, it's taught using functions inside functions. Isn't this inefficient because every time the decorator is called, you're redefining the function, which takes up much more memory?
> No.
> First, most decorators are only called once. For example:
>     @lru_cache(maxsize=None)
>     def fib(n):
>         if n < 2:
>             return n
>         return fib(n-1) + fib(n-2)
> The lru_cache function gets called once, and creates and returns a decorator function. That decorator function is passed fib, and creates and returns a wrapper function. That wrapper function is what gets stored in the globals as fib. It may then get called a zillion times, but it doesn't create or call any new function.
> So, why should you care whether lru_cache is implemented with a function or a class? You're talking about a difference of a few dozen bytes, once in the entire lifetime of your program.

We need to make a slight terminology clarification here, as the answer
to Surya's question changes depending on whether we're talking about
implementing wrapper functions inside decorators (like the
"_lru_cache_wrapper" that lru_cache wraps around the passed in
callable), or about implementing decorators inside decorator factories
(like the transient "decorating_function" that lru_cache uses to apply
the wrapper to the function being defined). Most of the time that
distinction isn't important, so folks use the more informal approach
of using "decorator" to refer to both decorators and decorator
factories, but this is a situation where the difference matters.

Every decorator factory does roughly the same thing: when called, it
produces a new instance of a callable type which accepts a single
function as its sole argument. From the perspective of the *user* of
the decorator factory, it doesn't matter whether internally that's
handled using a def statement, a lambda expression, functools.partial,
instantiating a class that defines a custom __call__ method, or some
other technique. It's also rare for decorator factories to be invoked
in code that's a performance bottleneck, so it's generally more
important to optimise for readability and maintainability when writing
them than it is to optimise for speed.
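
As a rough illustration of that interchangeability, here is a sketch of one decorator factory written four ways. The names (tag_def, tag_lambda, tag_partial, tag_class and the _apply_tag helper) are made up for this example; each factory returns a callable that accepts a single function:

```python
import functools

def _apply_tag(tag, func):
    """Hypothetical helper: build the wrapper that pairs a tag with the result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return (tag, func(*args, **kwargs))
    return wrapper

# 1. Nested def statement
def tag_def(tag):
    def decorator(func):
        return _apply_tag(tag, func)
    return decorator

# 2. lambda expression
def tag_lambda(tag):
    return lambda func: _apply_tag(tag, func)

# 3. functools.partial
def tag_partial(tag):
    return functools.partial(_apply_tag, tag)

# 4. Class with a custom __call__ method
class tag_class:
    def __init__(self, tag):
        self.tag = tag

    def __call__(self, func):
        return _apply_tag(self.tag, func)

results = []
for factory in (tag_def, tag_lambda, tag_partial, tag_class):
    @factory("x")
    def double(n):
        return 2 * n
    results.append(double(21))

print(results)
```

From the caller's side the four are indistinguishable, so the choice can be made on readability alone.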

The wrapper functions themselves, though, exist in a one:one
correspondence with the functions they're applied to - when you apply
functools.lru_cache to a function, the transient decorator produced by
the decorator factory only lasts as long as the execution of the
function definition, but the wrapper function lasts for as long as the
wrapped function does, and gets invoked every time that function is
called (and if a function is performance critical enough for the
results to be worth caching, then it's likely performance critical
enough to be thinking about micro-optimisations).

As such, from a micro-optimisation perspective, it's reasonable to
want to know the answers to:

* Which is faster, defining a new function object, or instantiating an
existing class?
* Which is faster, calling a function object that accepts a single
parameter, or calling a class with a custom __call__ method?
* Which uses more memory, defining a new function object, or
instantiating an existing class?

The answers to these questions can technically vary by implementation,
but in practice, CPython's likely to be representative of their
*relative* performance for any given implementation, so we can use it
to check whether or not our intuitions about relative speed and memory
consumption are correct.

For the first question then, here are the numbers I get locally for CPython 3.4:

$ python3 -m timeit "def f(): pass"
10000000 loops, best of 3: 0.0744 usec per loop
$ python3 -m timeit -s "class C: pass" "c = C()"
10000000 loops, best of 3: 0.113 usec per loop

The trick here is to realise that *at runtime*, a def statement is
really just instantiating a new instance of types.FunctionType - most
of the heavy lifting has already been done at compile time. The reason
it manages to be faster than typical class instantiation is because we
get to use customised bytecode operating on constant values rather
than having to look the class up by name and making a standard
function call:

 >>> dis.dis("def f(): pass")
  1           0 LOAD_CONST               0 (<code object f at 0x7fe875aff0c0, file "<dis>", line 1>)
              3 LOAD_CONST               1 ('f')
              6 MAKE_FUNCTION            0
              9 STORE_NAME               0 (f)
             12 LOAD_CONST               2 (None)
             15 RETURN_VALUE
 >>> dis.dis("c = C()")
  1           0 LOAD_NAME                0 (C)
              3 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
              6 STORE_NAME               1 (c)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
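
A small sketch of the same idea from the Python side (make_adder is a made-up example function): each execution of the inner def produces a distinct function object, but every one of them shares the single code object compiled ahead of time.

```python
import types

def make_adder(n):
    # The inner def runs each time make_adder is called.
    def add(x):
        return x + n
    return add

add1 = make_adder(1)
add2 = make_adder(2)

# Two distinct function objects...
distinct = add1 is not add2
# ...built around the one code object produced at compile time.
shared_code = add1.__code__ is add2.__code__
# Only the closure cells differ between them.
cells = (add1.__closure__[0].cell_contents,
         add2.__closure__[0].cell_contents)

print(type(add1) is types.FunctionType, distinct, shared_code, cells)
```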

For the second question:

$ python3 -m timeit -s "def f(arg): pass" "f(None)"
10000000 loops, best of 3: 0.111 usec per loop
$ python3 -m timeit -s "class C:" -s "  def __call__(self, arg): pass" -s "c = C()" "c(None)"
1000000 loops, best of 3: 0.232 usec per loop

Again, we see that the native function outperforms the class with a
custom __call__ method. There's no difference in the bytecode this
time, but rather a difference in what happens inside the CALL_FUNCTION
opcode: for the second case, we first have to retrieve the bound
c.__call__() method, and then call *that* as c.__call__(None), which
in turn internally calls C.__call__(c, None), while for the native
function case we get to skip straight to running the called function.
The speed difference can be significantly reduced (but not entirely
eliminated), by caching the bound method during setup:

$ python3 -m timeit -s "class C:" -s "  def __call__(self, arg): pass" -s "c_call = C().__call__" "c_call(None)"
10000000 loops, best of 3: 0.115 usec per loop
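
The same three-way comparison can be sketched as a script using the timeit module directly. The loop count and the labels are arbitrary choices for this example, and the absolute numbers will differ by machine and Python version, so none are shown here:

```python
import timeit

def f(arg):
    pass

class C:
    def __call__(self, arg):
        pass

c = C()
c_call = c.__call__  # bound method looked up once, then reused

candidates = {
    "plain function": f,
    "callable instance": c,
    "cached bound method": c_call,
}

number = 200_000  # arbitrary loop count for this sketch
timings = {}
for name, obj in candidates.items():
    # Take the best of three runs to reduce noise.
    timings[name] = min(
        timeit.repeat("call(None)", globals={"call": obj},
                      number=number, repeat=3)
    )

for name, seconds in timings.items():
    print(f"{name:>20}: {seconds / number * 1e9:.1f} ns per call")
```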

Finally, we get to the question of relative size: are function
instances larger or smaller than your typical class instance? Again,
we don't have to guess, we can use the interpreter to experiment and
check our assumptions:

    >>> import sys
    >>> def f(): pass
    >>> sys.getsizeof(f)
    >>> class C(): pass
    >>> sys.getsizeof(C())

That's a potentially noticeable difference if we're applying the
wrapper often enough - the native function is 80 bytes larger than an
empty standard class instance. Looking at the available data
attributes on f, we can see the likely causes of the difference:

    >>> set(dir(f)) - set(dir(C()))
    {'__code__', '__defaults__', '__name__', '__closure__', '__get__',
     '__kwdefaults__', '__qualname__', '__annotations__', '__globals__',
     '__call__'}

There are 10 additional attributes there, although 2 of them (__get__
and __call__) relate to methods our native function has defined, but
the empty class doesn't. The other 8 represent additional pieces of
data stored (or potentially stored) per function, that we don't store
for a typical class instance.

However, we also need to account for the overhead of defining a new
class object, and that's a non-trivial amount of memory when we're
talking about a size difference of only 80 bytes per wrapped function:

    >>> sys.getsizeof(C)

That means if a wrapper function is only used a few times in any given
run of the program, then native functions will be faster *and* use
less memory (at least on CPython). If the wrapper is used more often
than that, then native functions will still be the fastest option, but
not the lowest memory option.

Furthermore, if we decide to cache the bound __call__ method to reduce
the speed impact of using a custom __call__ method, we give up most of
the memory gains:

    >>> sys.getsizeof(C().__call__)

This all suggests that if your application is severely memory
constrained (e.g. it's running on an embedded interpreter like
MicroPython), then it *might* make sense to incur the extra complexity
of using classes with a custom __call__ method to define wrapper
functions, over just using a nested function. For more typical cases
though, the difference is going to disappear into the noise, so you're
likely to be better off defaulting to using nested function
definitions, and only switching to the class based version in cases
where it's more readable and maintainable (and in those cases
considering whether or not it might make sense to return the bound
__call__ method from the decorator, rather than the callable itself).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From surya.subbarao1 at  Sat Jan  2 22:00:38 2016
From: surya.subbarao1 at (u8y7541 The Awesome Person)
Date: Sat, 2 Jan 2016 19:00:38 -0800
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

> The wrapper functions themselves, though, exist in a one:one
> correspondence with the functions they're applied to - when you apply
> functools.lru_cache to a function, the transient decorator produced by
> the decorator factory only lasts as long as the execution of the
> function definition, but the wrapper function lasts for as long as the
> wrapped function does, and gets invoked every time that function is
> called (and if a function is performance critical enough for the
> results to be worth caching, then it's likely performance critical
> enough to be thinking about micro-optimisations). (Nick Coghlan)

Yes, that is what I was thinking of. Just like Quake's fast inverse square
root. Even though it is a micro-optimization, it greatly affects how fast
the game runs.

> But, as I explained, the function will _not_ be redefined and trashed every
> frame; it will be created one time. (Andrew Barnert)

Hmm... Nick says different...

> This all suggests that if your application is severely memory
> constrained (e.g. it's running on an embedded interpreter like
> MicroPython), then it *might* make sense to incur the extra complexity
> of using classes with a custom __call__ method to define wrapper
> functions, over just using a nested function. (Nick Coghlan)

Yes, I was thinking of that when I started this thread, but this thread is
just from my speculation.

-Surya Subbarao

From guido at  Sat Jan  2 22:34:48 2016
From: guido at (Guido van Rossum)
Date: Sat, 2 Jan 2016 20:34:48 -0700
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

If you want more discussion please discuss a specific example, showing the
decorator code itself (not just the decorator call).

The upshot is that creating a function object is pretty efficient and
probably more efficient than instantiating a class -- if you don't believe
that write a micro-benchmark.

On Sat, Jan 2, 2016 at 8:00 PM, u8y7541 The Awesome Person <
surya.subbarao1 at> wrote:

> The wrapper functions themselves, though, exist in a one:one
>> correspondence with the functions they're applied to - when you apply
>> functools.lru_cache to a function, the transient decorator produced by
>> the decorator factory only lasts as long as the execution of the
>> function definition, but the wrapper function lasts for as long as the
>> wrapped function does, and gets invoked every time that function is
>> called (and if a function is performance critical enough for the
>> results to be worth caching, then it's likely performance critical
>> enough to be thinking about micro-optimisations). (Nick Coghlan)
> Yes, that is what I was thinking of. Just like Quake's fast inverse square
> root. Even though it is a micro-optimization, it greatly affects how fast
> the game runs.
> But, as I explained, the function will _not_ be redefined and trashed
>> every frame; it will be created one time. (Andrew Barnert)
> Hmm... Nick says different...
> This all suggests that if your application is severely memory
>> constrained (e.g. it's running on an embedded interpreter like
>> MicroPython), then it *might* make sense to incur the extra complexity
>> of using classes with a custom __call__ method to define wrapper
>> functions, over just using a nested function. (Nick Coghlan)
> Yes, I was thinking of that when I started this thread, but this thread is
> just from my speculation.
> --
> -Surya Subbarao

--Guido van Rossum (

From ncoghlan at  Sat Jan  2 22:39:01 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 3 Jan 2016 13:39:01 +1000
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

On 3 January 2016 at 13:00, u8y7541 The Awesome Person
<surya.subbarao1 at> wrote:
>> The wrapper functions themselves, though, exist in a one:one
>> correspondence with the functions they're applied to - when you apply
>> functools.lru_cache to a function, the transient decorator produced by
>> the decorator factory only lasts as long as the execution of the
>> function definition, but the wrapper function lasts for as long as the
>> wrapped function does, and gets invoked every time that function is
>> called (and if a function is performance critical enough for the
>> results to be worth caching, then it's likely performance critical
>> enough to be thinking about micro-optimisations). (Nick Coghlan)
> Yes, that is what I was thinking of. Just like Quake's fast inverse square
> root. Even though it is a micro-optimization, it greatly affects how fast
> the game runs.

For Python, much bigger performance pay-offs are available without
changing the code by adopting tools like PyPy, Cython and Numba.
Worrying about micro-optimisations like this usually only makes sense
if a profiler has identified the code as a hotspot for your particular
workload (and sometimes not even then).

>> But, as I explained, the function will _not_ be redefined and trashed
>> every frame; it will be created one time. (Andrew Barnert)
> Hmm... Nick says different...
>> This all suggests that if your application is severely memory
>> constrained (e.g. it's running on an embedded interpreter like
>> MicroPython), then it *might* make sense to incur the extra complexity
>> of using classes with a custom __call__ method to define wrapper
>> functions, over just using a nested function. (Nick Coghlan)

The memory difference is only per function defined using the wrapper,
not per call. The second speed difference I described (how long the
CALL_FUNCTION opcode takes) is per call, and there native functions
are the clear winner (followed by bound methods, and custom callable
objects a relatively distant third).

The other thing to keep in mind is that the examples I showed were
focused specifically on measuring the differences in overhead, so the
function bodies don't actually do anything, and the class instances
didn't contain any state of their own. Adding even a single
instance/closure variable is likely to swamp the differences in memory
consumption between a native function and a class instance.
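
A rough way to see this is to give both versions one piece of state and add up the pieces each one references. The names below are invented for the example, and the totals are only approximate because sys.getsizeof() is shallow, so treat this as a sketch rather than a precise measurement:

```python
import sys

def make_counter_func():
    count = 0
    def wrapper():
        nonlocal count
        count += 1
        return count
    return wrapper

class CounterInstance:
    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        return self.count

func = make_counter_func()
inst = CounterInstance()

# Function object + closure tuple + the cell holding `count`.
func_total = (sys.getsizeof(func)
              + sys.getsizeof(func.__closure__)
              + sum(sys.getsizeof(cell) for cell in func.__closure__))

# Instance header + the per-instance attribute dict holding `count`.
inst_total = sys.getsizeof(inst) + sys.getsizeof(inst.__dict__)

print("closure-based wrapper:", func_total, "bytes (approx.)")
print("class-based wrapper:  ", inst_total, "bytes (approx.)")
```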


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From guido at  Sat Jan  2 22:48:04 2016
From: guido at (Guido van Rossum)
Date: Sat, 2 Jan 2016 20:48:04 -0700
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

Whoops, Nick already did the micro-benchmarks, and showed that creating a
function object is faster than instantiating a class. He also measured the
size, but I think he forgot that sys.getsizeof() doesn't report the size
(recursively) of contained objects -- a class instance references a dict
which is another 288 bytes (though if you care you can get rid of this by
using __slots__). I expect that calling an instance using __call__ is also
slower than calling a function (but you can do your own benchmarks :-).
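
A quick sketch of the getsizeof() point (Plain and Slotted are made-up class names, and the exact byte counts depend on the CPython version and platform, so none are quoted here):

```python
import sys

class Plain:
    def __call__(self, arg):
        pass

class Slotted:
    __slots__ = ()  # no per-instance __dict__

    def __call__(self, arg):
        pass

plain = Plain()
slotted = Slotted()

# getsizeof() is shallow: it does not follow the reference to __dict__.
shallow = sys.getsizeof(plain)
with_dict = shallow + sys.getsizeof(plain.__dict__)

print("Plain instance alone:      ", shallow)
print("Plain instance + __dict__: ", with_dict)
print("Slotted instance:          ", sys.getsizeof(slotted))
print("Slotted has __dict__?      ", hasattr(slotted, "__dict__"))
```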

On Sat, Jan 2, 2016 at 8:39 PM, Nick Coghlan <ncoghlan at> wrote:

> On 3 January 2016 at 13:00, u8y7541 The Awesome Person
> <surya.subbarao1 at> wrote:
> >
> >> The wrapper functions themselves, though, exist in a one:one
> >> correspondence with the functions they're applied to - when you apply
> >> functools.lru_cache to a function, the transient decorator produced by
> >> the decorator factory only lasts as long as the execution of the
> >> function definition, but the wrapper function lasts for as long as the
> >> wrapped function does, and gets invoked every time that function is
> >> called (and if a function is performance critical enough for the
> >> results to be worth caching, then it's likely performance critical
> >> enough to be thinking about micro-optimisations). (Nick Coghlan)
> >
> > Yes, that is what I was thinking of. Just like Quake's fast inverse
> square
> > root. Even though it is a micro-optimization, it greatly affects how fast
> > the game runs.
> For Python, much bigger performance pay-offs are available without
> changing the code by adopting tools like PyPy, Cython and Numba.
> Worrying about micro-optimisations like this usually only makes sense
> if a profiler has identified the code as a hotspot for your particular
> workload (and sometimes not even then).
> >> But, as I explained, the function will _not_ be redefined and trashed
> >> every frame; it will be created one time. (Andrew Barnert)
> >
> > Hmm... Nick says different...
> >
> >> This all suggests that if your application is severely memory
> >> constrained (e.g. it's running on an embedded interpreter like
> >> MicroPython), then it *might* make sense to incur the extra complexity
> >> of using classes with a custom __call__ method to define wrapper
> >> functions, over just using a nested function. (Nick Coghlan)
> The memory difference is only per function defined using the wrapper,
> not per call. The second speed difference I described (how long the
> CALL_FUNCTION opcode takes) is per call, and there native functions
> are the clear winner (followed by bound methods, and custom callable
> objects a relatively distant third).
> The other thing to keep in mind is that the examples I showed were
> focused specifically on measuring the differences in overhead, so the
> function bodies don't actually do anything, and the class instances
> didn't contain any state of their own. Adding even a single
> instance/closure variable is likely to swamp the differences in memory
> consumption between a native function and a class instance.
> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

--Guido van Rossum (

From abarnert at  Sat Jan  2 22:50:34 2016
From: abarnert at (Andrew Barnert)
Date: Sat, 2 Jan 2016 19:50:34 -0800
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 2, 2016, at 19:00, u8y7541 The Awesome Person <surya.subbarao1 at> wrote:
>> The wrapper functions themselves, though, exist in a one:one
>> correspondence with the functions they're applied to - when you apply
>> functools.lru_cache to a function, the transient decorator produced by
>> the decorator factory only lasts as long as the execution of the
>> function definition, but the wrapper function lasts for as long as the
>> wrapped function does, and gets invoked every time that function is
>> called (and if a function is performance critical enough for the
>> results to be worth caching, then it's likely performance critical
>> enough to be thinking about micro-optimisations). (Nick Coghlan)
> Yes, that is what I was thinking of. Just like Quake's fast inverse square root. Even though it is a micro-optimization, it greatly affects how fast the game runs.

Of course micro-optimizations _can_ matter--when you're optimizing the work done in the inner loop of a program that's CPU-bound, even a few percent can make a difference.

But that doesn't mean they _always_ matter. Saving 50ns in some code that runs thousands of times per frame makes a difference; saving 50ns in some code that happens once at startup does not. That's why we have profiling tools: so you can find the bit of your program where you're spending 99% of your time doing something a billion times, and optimize that part.

And it also doesn't mean that everything that sounds like it should be lighter is worth doing. You have to actually test it and see. In the typical case where you're replacing one function object with one class object and one instance object, that's actually taking more space, not less.

>> But, as I explained, the function will _not_ be redefined and trashed every frame; it will be created one time. (Andrew Barnert)
> Hmm... Nick says different...

No, Nick doesn't say different. Read it again. The wrapper function lives as long as the wrapped function lives. It doesn't get created anew each time you call it.

If you don't understand this, it may help to profile [fib(i) for i in range(10000)]. You'll see that the wrapper function gets called a ton of times, the wrapped function gets called 10000 times, and the factory function (which created wrapper functions) gets called 0 times.
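
One way to check this without a profiler is a sketch like the following, which relies on the cache_info() method that lru_cache attaches to the wrapper. Cache misses count how often the original fib body ran, and hits plus misses count how often the wrapper was invoked. The counter dict is an addition made for this example:

```python
from functools import lru_cache

counter = {"wrapped": 0}  # how often the original fib body runs

@lru_cache(maxsize=None)
def fib(n):
    counter["wrapped"] += 1
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

values = [fib(i) for i in range(10000)]

info = fib.cache_info()
wrapper_calls = info.hits + info.misses

print("wrapper invoked:     ", wrapper_calls)
print("wrapped body invoked:", counter["wrapped"])
```

The decorator factory does not appear in these counts at all, since it ran only once, when fib was defined.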

>> This all suggests that if your application is severely memory
>> constrained (e.g. it's running on an embedded interpreter like
>> MicroPython), then it *might* make sense to incur the extra complexity
>> of using classes with a custom __call__ method to define wrapper
>> functions, over just using a nested function. (Nick Coghlan)
> Yes, I was thinking of that when I started this thread, but this thread is just from my speculation. 

Nick is saying that there may be some cases where it might make sense to use a class. That doesn't at all support your idea that tutorials should teach using classes instead of functions. In general, using functions will be faster; in the most common case, using functions will use less memory; most importantly, in the vast majority of cases, it won't matter anyway.  Maybe a MicroPython tutorial should have a section on how running on a machine with only 4KB changes a lot of the usual tradeoffs, using a decorator as an example. But a tutorial on decorators should show using a function, because it's the simplest, most readable way to do it.

From surya.subbarao1 at  Sat Jan  2 22:56:08 2016
From: surya.subbarao1 at (u8y7541 The Awesome Person)
Date: Sat, 2 Jan 2016 19:56:08 -0800
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

> If you don't understand this, it may help to profile [fib(i) for i in
> range(10000)]. You'll see that the wrapper function gets called a ton of
> times, the wrapped function gets called 10000 times, and the factory
> function (which created wrapper functions) gets called 0 times.

Ah, I see now. Thank you.

On Sat, Jan 2, 2016 at 7:50 PM, Andrew Barnert <abarnert at> wrote:

> On Jan 2, 2016, at 19:00, u8y7541 The Awesome Person <
> surya.subbarao1 at> wrote:
> The wrapper functions themselves, though, exist in a one:one
>> correspondence with the functions they're applied to - when you apply
>> functools.lru_cache to a function, the transient decorator produced by
>> the decorator factory only lasts as long as the execution of the
>> function definition, but the wrapper function lasts for as long as the
>> wrapped function does, and gets invoked every time that function is
>> called (and if a function is performance critical enough for the
>> results to be worth caching, then it's likely performance critical
>> enough to be thinking about micro-optimisations). (Nick Coghlan)
> Yes, that is what I was thinking of. Just like Quake's fast inverse square
> root. Even though it is a micro-optimization, it greatly affects how fast
> the game runs.
> Of course micro-optimizations _can_ matter--when you're optimizing the
> work done in the inner loop of a program that's CPU-bound, even a few
> percent can make a difference.
> But that doesn't mean they _always_ matter. Saving 50ns in some code that
> runs thousands of times per frame makes a difference; saving 50ns in some
> code that happens once at startup does not. That's why we have profiling
> tools: so you can find the bit of your program where you're spending 99% of
> your time doing something a billion times, and optimize that part.
> And it also doesn't mean that everything that sounds like it should be
> lighter is worth doing. You have to actually test it and see. In the
> typical case where you're replacing one function object with one class
> object and one instance object, that's actually taking more space, not less.
> But, as I explained, the function will _not_ be redefined and trashed
>> every frame; it will be created one time. (Andrew Barnert)
> Hmm... Nick says different...
> No, Nick doesn't say different. Read it again. The wrapper function lives
> as long as the wrapped function lives. It doesn't get created anew each
> time you call it.
> If you don't understand this, it may help to profile [fib(i) for i in
> range(10000)]. You'll see that the wrapper function gets called a ton of
> times, the wrapped function gets called 10000 times, and the factory
> function (which created wrapper functions) gets called 0 times.
> This all suggests that if your application is severely memory
>> constrained (e.g. it's running on an embedded interpreter like
>> MicroPython), then it *might* make sense to incur the extra complexity
>> of using classes with a custom __call__ method to define wrapper
>> functions, over just using a nested function. (Nick Coghlan)
> Yes, I was thinking of that when I started this thread, but this thread is
> just from my speculation.
> Nick is saying that there may be some cases where it might make sense to
> use a class. That doesn't at all support your idea that tutorials should
> teach using classes instead of functions. In general, using functions will
> be faster; in the most common case, using functions will use less memory;
> most importantly, in the vast majority of cases, it won't matter anyway.
> Maybe a MicroPython tutorial should have a section on how running on a
> machine with only 4KB changes a lot of the usual tradeoffs, using a
> decorator as an example. But a tutorial on decorators should show using a
> function, because it's the simplest, most readable way to do it.
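
The profiling suggestion quoted above can be tried with plain counters. This is
only a sketch: memoize() is a hypothetical hand-written stand-in for
functools.lru_cache (so that each layer can be instrumented), and the calls
dict and the range of 1000 are illustration choices, not anything from the
thread.

```python
from functools import wraps

# Hypothetical counters, one per layer of the decorator machinery.
calls = {"factory": 0, "decorator": 0, "wrapper": 0, "wrapped": 0}

def memoize():
    """Decorator factory: runs once per @memoize() line."""
    calls["factory"] += 1

    def decorator(func):
        """Transient decorator: runs once per decorated function."""
        calls["decorator"] += 1
        cache = {}

        @wraps(func)
        def wrapper(n):
            """Long-lived wrapper: runs on every call to fib()."""
            calls["wrapper"] += 1
            if n not in cache:
                cache[n] = func(n)
            return cache[n]

        return wrapper

    return decorator

@memoize()
def fib(n):
    calls["wrapped"] += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

after_definition = dict(calls)          # snapshot before the hot loop
results = [fib(i) for i in range(1000)]
during_loop = {k: calls[k] - after_definition[k] for k in calls}
```

Inspecting during_loop should show the factory and the decorator at zero, the
wrapped function at one call per distinct argument, and the wrapper called
more often than that, since every recursive call goes through it.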

-Surya Subbarao

From random832 at  Sat Jan  2 23:01:06 2016
From: random832 at (Random832)
Date: Sat, 02 Jan 2016 23:01:06 -0500
Subject: [Python-ideas] Bad programming style in decorators?
References: <>
Message-ID: <>

Nick Coghlan  writes:
> (and if a function is performance critical enough for the
> results to be worth caching, then it's likely performance critical
> enough to be thinking about micro-optimisations).

Maybe. It could be that the "real" implementation is Very Expensive to
invoke, and/or that the characteristics of how the function is called
change the complexity class of an algorithm that calls it for a cached
vs non-cached version.

From ncoghlan at  Sun Jan  3 01:42:13 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 3 Jan 2016 16:42:13 +1000
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

On 3 January 2016 at 13:48, Guido van Rossum <guido at> wrote:
> Whoops, Nick already did the micro-benchmarks, and showed that creating a
> function object is faster than instantiating a class. He also measured the
> size, but I think he forgot that sys.getsizeof() doesn't report the size
> (recursively) of contained objects -- a class instance references a dict
> which is another 288 bytes (though if you care you can get rid of this by
> using __slots__).

You're right I forgot to account for that (54 bytes without __slots__
did seem surprisingly small!), but functions also always allocate
f.__annotations__ at the moment.

Always allocating f.__annotations__ actually puzzled me a bit - did we
do that for a specific reason, or did we just not think of setting it
to None when it's unused to save space the way we do for other
function attributes? (__closure__, __defaults__, etc)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From storchaka at  Sun Jan  3 06:55:36 2016
From: storchaka at (Serhiy Storchaka)
Date: Sun, 3 Jan 2016 13:55:36 +0200
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <n6b27o$k19$>

On 03.01.16 04:42, Nick Coghlan wrote:
> Finally, we get to the question of relative size: are function
> instances larger or smaller than your typical class instance? Again,
> we don't have to guess, we can use the interpreter to experiment and
> check our assumptions:
>      >>> import sys
>      >>> def f(): pass
>      ...
>      >>> sys.getsizeof(f)
>      136
>      >>> class C(): pass
>      ...
>      >>> sys.getsizeof(C())
>      56

sys.getsizeof() returns only the bare size of the object, not including 
the size of subobjects. To calculate total size you have to sum sizes of 
all subobjects recursively. [1]
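
A rough sketch of that recursive summation, under stated assumptions:
total_size() is a hypothetical helper (not a stdlib function), it only follows
instance __dict__s and the common built-in containers, and it ignores other
references such as a function's __code__ or __closure__.

```python
import sys

def total_size(obj, seen=None):
    """Roughly sum sys.getsizeof() over obj and the objects it references."""
    if seen is None:
        seen = set()
    if id(obj) in seen:                  # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += total_size(key, seen) + total_size(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += total_size(item, seen)
    instance_dict = getattr(obj, "__dict__", None)
    if isinstance(instance_dict, dict):  # skips the mappingproxy on classes
        size += total_size(instance_dict, seen)
    return size

class C:
    pass

class Slotted:
    __slots__ = ()

def f():
    pass

plain, slotted = C(), Slotted()
nested = [1, [2, 3]]
```

Comparing total_size(plain) with sys.getsizeof(plain) shows the per-instance
dict that the bare number leaves out, while the __slots__ instance has no such
dict to add. The exact byte counts vary by Python version and platform, so none
are quoted here.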


From surya.subbarao1 at  Sun Jan  3 13:57:45 2016
From: surya.subbarao1 at (u8y7541 The Awesome Person)
Date: Sun, 3 Jan 2016 10:57:45 -0800
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

Thanks for explaining the differences tho. I got confused between the
decorator and the decorator factory, thinking the decorator had a function
inside it. Sorry :)

On Sat, Jan 2, 2016 at 10:42 PM, Nick Coghlan <ncoghlan at> wrote:

> On 3 January 2016 at 13:48, Guido van Rossum <guido at> wrote:
> > Whoops, Nick already did the micro-benchmarks, and showed that creating a
> > function object is faster than instantiating a class. He also measured
> the
> > size, but I think he forgot that sys.getsizeof() doesn't report the size
> > (recursively) of contained objects -- a class instance references a dict
> > which is another 288 bytes (though if you care you can get rid of this by
> > using __slots__).
> You're right I forgot to account for that (54 bytes without __slots__
> did seem surprisingly small!), but functions also always allocate
> f.__annotations__ at the moment.
> Always allocating f.__annotations__ actually puzzled me a bit - did we
> do that for a specific reason, or did we just not think of setting it
> to None when it's unused to save space the way we do for other
> function attributes? (__closure__, __defaults__, etc)
> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

-Surya Subbarao

From guido at  Mon Jan  4 15:31:48 2016
From: guido at (Guido van Rossum)
Date: Mon, 4 Jan 2016 12:31:48 -0800
Subject: [Python-ideas] Deprecating the old-style sequence protocol
In-Reply-To: <>
References: <>
Message-ID: <>

[Adding python-ideas back -- I'm not sure why you dropped it but it looks
like an oversight, not intentional]

On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert <abarnert at> wrote:

> On Dec 27, 2015, at 09:04, Guido van Rossum <guido at> wrote:
> > If we want some way to turn something that just defines __getitem__ and
> __len__ into a proper sequence, it should just be made to inherit from
> Sequence, which supplies the default __iter__ and __reversed__.
> (Registration is *not* good enough here.)
> So, if I understand correctly, you're hoping that we can first make the
> old-style sequence protocol unnecessary, except for backward compatibility,
> and then maybe change the docs to only mention it for backward
> compatibility, and only then deprecate it?

That sounds about right.

> I think it's worth doing those first two steps, but not actually
> deprecating it, at least while Python 2.7 is still around; otherwise, for
> dual-version code, something like Steven D'Aprano's "Squares" type would
> have to copy Indexable from the 3.x stdlib or get it from some third-party
> module like six or backports.collections.

Yes, that's fine. Deprecation sometimes just has to take a really long time.

> > If we really want a way to turn something that just supports __getitem__
> into an Iterable maybe we can provide an additional ABC for that purpose;
> let's call it a HalfSequence until we've come up with a better name. (We
> can't use Iterable for this because Iterable should not reference
> __getitem__.)
> #25988 (using Nick's name Indexable, and the details from that post).

Oh, interesting. Though I have misgivings about that name.

> > I also think it's fine to introduce Reversible as another ABC and
> carefully fit it into the existing hierarchy. It should be a one-trick pony
> and be another base class for Sequence; it should not have a default
> implementation. (But this has been beaten to death in other threads -- it's
> time to just file an issue with a patch.)
> #25987.
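
For readers following along, here is a small sketch of the distinction being
discussed. OldSquares and Squares are hypothetical classes written in the
spirit of the "Squares" example mentioned above, not code from the thread.

```python
from collections.abc import Iterable, Sequence

class OldSquares:
    """Old-style sequence: only __len__ and __getitem__, no ABC."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i

class Squares(OldSquares, Sequence):
    """Same two methods, but inheriting Sequence supplies the mixins
    (__iter__, __reversed__, __contains__, index, count)."""

old, new = OldSquares(4), Squares(4)

# iter() falls back to calling __getitem__ with 0, 1, 2, ... until IndexError
old_items = list(old)
new_items = list(new)
```

Both objects iterate, but only the one that inherits from Sequence has a real
__iter__, so only it passes isinstance(..., Iterable).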


--Guido van Rossum (

From surya.subbarao1 at  Mon Jan  4 19:04:06 2016
From: surya.subbarao1 at (u8y7541 The Awesome Person)
Date: Mon, 4 Jan 2016 16:04:06 -0800
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

Yes, I knew this already. This is what I understood when I said I
understood. I thought the *decorator* had a function inside it, not
the *decorator factory*. So of course the decorator factory is only
called twice.

On Mon, Jan 4, 2016 at 10:48 AM, Andrew Barnert <abarnert at> wrote:

> (off-list, because I think this is no longer relevant to suggesting
> changes to Python)
> On Sunday, January 3, 2016 10:58 AM, u8y7541 The Awesome Person <
> surya.subbarao1 at> wrote:
> >Thanks for explaining the differences tho. I got confused between the
> decorator and the decorator factory, thinking the decorator had a function
> inside it. Sorry :)
> I think you're _still_ confused. Not your fault, because this is confusing
> stuff. The decorator actually does (usually) have a function definition in
> it too. But the decorated function--the wrapper that it defines--doesn't.
> And that's the thing that you usually call a zillion times, not the
> decorator or the decorator factory.
> Let's make it concrete and as simple as possible, and then walk through
> all the details:
>     from functools import wraps
>
>     def div(id):
>         def decorator(func):
>             @wraps(func)
>             def wrapper(*args, **kw):
>                 return "<div id='{}'>{}</div>".format(
>                     id, func(*args, **kw))
>             return wrapper
>         return decorator
>     @div('eggs')
>     def eggs():
>         return 'eggs'
>     @div('cheese')
>     def cheeses():
>         return '<ul><li>gouda</li><li>edam</li></ul>'
>     for _ in range(1000000):
>         print(eggs())
>         print(cheeses())
> When you're importing the module and hit that "@div('eggs')", that calls
> the "div" factory. The only other time that happens is at the
> "@div('cheese')". So, the factory does of course create a function, but the
> factory only gets called twice in your entire program. (Also, the functions
> it creates become garbage as soon as they're called, so by the time you get
> to the "for" loop, they've both been deleted.)
> When you finish the "def eggs():" or "def cheeses():" statement, the
> decorator function "decorator" returned by "div('eggs')" or "div('cheese')"
> gets called. And that decorator also creates a function. But each one only
> gets called once in your entire program, and there are only two of them, so
> that's only two extra function definitions. (Obviously these two aren't
> garbage--they're the functions you call inside the loop.)
> When you hit that "print(eggs())" line, you're calling the decorated
> function "wrapper", returned by the decorator function "decorator",
> returned by the decorator factory function "div". That function does not
> have a function definition inside of it. So, calling it a million times
> doesn't cost anything in function definitions.
> And of course "div" itself isn't garbage--you don't need it anymore, but
> if you don't tell Python "del div", it'll stick around. So, at your peak,
> in the middle of that "for" loop, you have 5 function definitions around
> (div, decorated eggs, original eggs, decorated cheese, original cheese).
> If you refactor things differently, you could have 5 functions, 2 class
> objects, 2 class instances (your intended class-style design); or 4
> functions, 1 class object, 2 class instances, 2 bound methods, (a simple
> class-style decorator); 4 functions, 1 class object, 2 class instances,
> (the smallest possible class-style decorator); 2 functions but with
> duplicated code and string constants (by inlining div directly into each
> function); or 3 functions (with eggs and cheese explicitly calling div--I
> suspect this would actually be smallest here); etc. The difference is
> going to be a few hundred bytes one way or the other, and the smallest
> possible design may have a severe cost in (time) performance or in
> readability. But, as you suggested, and Nick confirmed, there are cases
> where it matters. Which means it's worth knowing how to write all the
> different possibilities, and evaluate them analytically, and test them.
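
One way to make the function-versus-class comparison concrete is to write both
forms side by side. This is a sketch, not necessarily any of the exact layouts
enumerated above: Div and class_div are hypothetical names, and using
__slots__ is an assumption made to avoid the per-instance dict.

```python
from functools import wraps

def div(id):
    """Function-style: factory -> decorator -> wrapper closure."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kw):
            return "<div id='{}'>{}</div>".format(id, func(*args, **kw))
        return wrapper
    return decorator

class Div:
    """Class-style wrapper: one instance per decorated function."""
    __slots__ = ("id", "func")

    def __init__(self, id, func):
        self.id = id
        self.func = func

    def __call__(self, *args, **kw):
        return "<div id='{}'>{}</div>".format(self.id, self.func(*args, **kw))

def class_div(id):
    """Factory returning a decorator that builds a Div instance."""
    def decorator(func):
        return Div(id, func)
    return decorator

@div('eggs')
def eggs():
    return 'eggs'

@class_div('cheese')
def cheeses():
    return '<ul><li>gouda</li><li>edam</li></ul>'
```

Both decorated names behave the same when called. The class-style version
gives up the metadata copying that functools.wraps does for the closure, which
is part of the readability cost mentioned above.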

-Surya Subbarao

From guido at  Mon Jan  4 19:26:53 2016
From: guido at (Guido van Rossum)
Date: Mon, 4 Jan 2016 16:26:53 -0800
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 2, 2016 at 10:42 PM, Nick Coghlan <ncoghlan at> wrote:

> On 3 January 2016 at 13:48, Guido van Rossum <guido at> wrote:
> > Whoops, Nick already did the micro-benchmarks, and showed that creating a
> > function object is faster than instantiating a class. He also measured
> the
> > size, but I think he forgot that sys.getsizeof() doesn't report the size
> > (recursively) of contained objects -- a class instance references a dict
> > which is another 288 bytes (though if you care you can get rid of this by
> > using __slots__).
> You're right I forgot to account for that (54 bytes without __slots__
> did seem surprisingly small!), but functions also always allocate
> f.__annotations__ at the moment.
> Always allocating f.__annotations__ actually puzzled me a bit - did we
> do that for a specific reason, or did we just not think of setting it
> to None when it's unused to save space the way we do for other
> function attributes? (__closure__, __defaults__, etc)

Where do you see that happening? The code in funcobject.c seems to indicate
that it's created on demand. (And that's how I remember it always being.)

--Guido van Rossum (

From abarnert at  Mon Jan  4 21:46:38 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 4 Jan 2016 18:46:38 -0800
Subject: [Python-ideas] Deprecating the old-style sequence protocol
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 4, 2016, at 12:31, Guido van Rossum <guido at> wrote:
>> On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert <abarnert at> wrote:
>> On Dec 27, 2015, at 09:04, Guido van Rossum <guido at> wrote:
>> > If we really want a way to turn something that just supports __getitem__ into an Iterable maybe we can provide an additional ABC for that purpose; let's call it a HalfSequence until we've come up with a better name. (We can't use Iterable for this because Iterable should not reference __getitem__.)
>> #25988 (using Nick's name Indexable, and the details from that post).
> Oh, interesting. Though I have misgivings about that name.

Now that you mention it, I can see the confusion. I interpreted Nick's "Indexable" to mean "subscriptable by indexes (and slices of indexes)" as opposed to "subscriptable by arbitrary keys". But if I didn't already know what he intended, I suppose I could have instead guessed "usable as an index", which would be very misleading.

There don't seem to be any existing terms for this that don't relate to "sequence", so maybe your HalfSequence (or Sequential or SequentiallySubscriptable or something even more horrible than that last one?) is the best option?

Or, hopefully, someone _can_ come up with a better name. :)

From vgr255 at  Mon Jan  4 22:08:18 2016
From: vgr255 at (Emanuel Barry)
Date: Mon, 4 Jan 2016 22:08:18 -0500
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>,
Message-ID: <BLU172-W3965A686A6D2DD595700EE91F30@phx.gbl>

Output from both 3.4.1 and 3.5.0:
>>> def foo(): pass
>>> foo.__annotations__
{}
Probably an oversight. I'm also not a C expert, but func_get_annotations (line 396 and onwards in funcobject.c) explicitly returns a new, empty dict if the function doesn't have any annotations (unlike all the other slots, like __defaults__ or __kwdefaults__, which merely return None if they're not present).
From: guido at
Date: Mon, 4 Jan 2016 16:26:53 -0800
To: ncoghlan at
Subject: Re: [Python-ideas] Bad programming style in decorators?
CC: python-ideas at; surya.subbarao1 at

On Sat, Jan 2, 2016 at 10:42 PM, Nick Coghlan <ncoghlan at> wrote:
> On 3 January 2016 at 13:48, Guido van Rossum <guido at> wrote:
> > Whoops, Nick already did the micro-benchmarks, and showed that creating a
> > function object is faster than instantiating a class. He also measured the
> > size, but I think he forgot that sys.getsizeof() doesn't report the size
> > (recursively) of contained objects -- a class instance references a dict
> > which is another 288 bytes (though if you care you can get rid of this by
> > using __slots__).
> You're right I forgot to account for that (54 bytes without __slots__
> did seem surprisingly small!), but functions also always allocate
> f.__annotations__ at the moment.
> Always allocating f.__annotations__ actually puzzled me a bit - did we
> do that for a specific reason, or did we just not think of setting it
> to None when it's unused to save space the way we do for other
> function attributes? (__closure__, __defaults__, etc)

Where do you see that happening? The code in funcobject.c seems to indicate that it's created on demand. (And that's how I remember it always being.)
--Guido van Rossum (


From abarnert at  Mon Jan  4 22:33:30 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 4 Jan 2016 19:33:30 -0800
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <BLU172-W3965A686A6D2DD595700EE91F30@phx.gbl>
References: <>
Message-ID: <>

On Jan 4, 2016, at 19:08, Emanuel Barry <vgr255 at> wrote:
> Output from both 3.4.1 and 3.5.0:
> >>> def foo(): pass
> >>> foo.__annotations__
> {}
> Probably an oversight. I'm also not a C expert, but func_get_annotations (line 396 and onwards in funcobject.c) explicitly returns a new, empty dict if the function doesn't have any annotations (unlike all the other slots, like __defaults__ or __kwdefaults__, which merely return None if they're not present).

But that code implies if you just create a new function object and never check its __annotations__, it's not wasting any space for them. Otherwise, it wouldn't have to check for NULL there.

We can't test this _directly_ from Python, but with a bit of ctypes hackery and funcobject.h, we can define a PyFunctionObject(Structure)... or, keeping things a bit more concise but a lot more hacky for the purposes of email:

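    from ctypes import POINTER, c_void_p, cast

    def func(): pass
    # Slot indices assume the CPython 3.5 PyFunctionObject layout.
    pf = cast(id(func), POINTER(c_void_p))
    assert pf[2] == id(func.__code__)
    assert not pf[12]        # func_annotations is still NULL
    func.__annotations__     # reading the attribute creates the dict
    assert pf[12]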

So it is indeed NULL until you check it, and then it becomes something (an empty dict).

From guido at  Tue Jan  5 00:25:29 2016
From: guido at (Guido van Rossum)
Date: Mon, 4 Jan 2016 21:25:29 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

Following up on this, in theory the right way to walk a tree using pathlib
already exists, it's the rglob() method. E.g. all paths under /foo/bar
should be found as follows:

  for path in pathlib.Path('/foo/bar').rglob('**/*'):

The PermissionError bug you found is already reported: -- it even has  a patch but it's stuck in

Sadly there's another error: loops introduced by symlinks cause infinite
recursion. I filed that here: (The fix
should be judicious use of is_symlink(), but the code is a little

On Mon, Dec 28, 2015 at 11:25 AM, Chris Barker <chris.barker at>

> On Tue, Dec 22, 2015 at 4:23 PM, Guido van Rossum <guido at>
> wrote:
>> The two-level iteration forced upon you by os.walk() is indeed often
>> unnecessary -- but handling dirs and files separately usually makes sense,
> indeed, but not always, so a simple API that allows you to get a flat walk
> would be nice....
> Of course for that basic use case, you could just write your own wrapper
>>> around os.walk:
> sure, but having to write "little" wrappers for common needs is
> unfortunate...
> The problem isn't designing a nice walk API; it's integrating it with
>>> pathlib.*
> indeed -- I'd really like to see a *walk in pathlib itself. I've been
> trying to use pathlib whenever I need, well, a path, but then I find I
> almost immediately need to step out and use an os.path function, and have
> to string-fy it anyway -- makes me wonder what the point is..
>  And honestly, if open, os.walk, etc. aren't going to work with Path
>>> objects,
> but they should -- of course they should.....
> Truly pushing for adoption of a new abstraction like this takes many years
>> -- pathlib was new (and provisional) in 3.4 so it really hasn't been long
>> enough to give up on it. The OP hasn't!
> it will take many years for sure -- but the standard library could at least
> adopt it as much as possible.
> Path.walk would be a nice start :-)
> My example: one of our sysadmins wanted a little script to go through an
> entire drive (Windows), and check if any paths were longer than 256
> characters (Windows, remember..)
> I came up with this:
> def get_all_paths(start_dir='/'):
>     for dirpath, dirnames, filenames in os.walk(start_dir):
>         for filename in filenames:
>             yield os.path.join(dirpath, filename)
> too_long = []
> for p in get_all_paths('/'):
>     print("checking:", p)
>     if len(p) > 255:
>         too_long.append(p)
>         print("Path too long!")
> way too wordy!
> I started with pathlib, but that just made it worse.
> now that I think about it, maybe I could have simply used
> pathlib.Path.rglob....
> However, when I try that, I get a permission error:
> /Users/chris.barker/miniconda2/envs/py3/lib/python3.5/ in
> wrapped(pathobj, *args)
>     369         @functools.wraps(strfunc)
>     370         def wrapped(pathobj, *args):
> --> 371             return strfunc(str(pathobj), *args)
>     372         return staticmethod(wrapped)
>     373
> PermissionError: [Errno 13] Permission denied:
> '/Users/.chris.barker.xahome/caches/opendirectory'
> as the error comes inside the rglob() generator, I'm not sure how to tell
> it to ignore and move on....
> os.walk is somehow able to deal with this.
> -CHB
> --
> Christopher Barker, Ph.D.
> Oceanographer
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
> Chris.Barker at
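
The flat walk wished for in the quoted message can be sketched as a thin
wrapper that yields pathlib.Path objects. This is only an illustration:
walk_paths() and paths_longer_than() are hypothetical helper names, not
pathlib API, and the 255-character limit is taken from the quoted script.

```python
import os
from pathlib import Path

def walk_paths(start_dir, onerror=None):
    """Yield every file under start_dir as a Path, in os.walk() order.

    By default os.walk() ignores errors from listing a directory, so an
    unreadable directory is skipped rather than raising PermissionError.
    Pass an onerror callable to be told about such directories.
    """
    for dirpath, dirnames, filenames in os.walk(str(start_dir), onerror=onerror):
        base = Path(dirpath)
        for name in filenames:
            yield base / name

def paths_longer_than(start_dir, limit=255):
    """Return the files under start_dir whose full path exceeds limit characters."""
    return [p for p in walk_paths(start_dir) if len(str(p)) > limit]
```

With these helpers the quoted check reduces to a single call such as
paths_longer_than('/'), at the cost of going through os.walk() rather than a
pathlib-native traversal.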

--Guido van Rossum (

From surya.subbarao1 at  Tue Jan  5 00:32:49 2016
From: surya.subbarao1 at (u8y7541 The Awesome Person)
Date: Mon, 4 Jan 2016 21:32:49 -0800
Subject: [Python-ideas] How exactly does from ... import ... work?
Message-ID: <>

Suppose I have a file called randomFile.py which reads like this:
class A:
    def __init__(self, foo):
        self.foo = foo
        self.bar = bar(foo)
class B(A):
    pass
class C(B):
    pass
def bar(foo):
    return foo + 1

Suppose in another file in the same directory, I have another python
program.

from randomFile import C

# some code

When C has to be imported, B also has to be imported because it is the
parent. Therefore, A also has to be imported. This also results in the
function bar being imported. When from ... import ... is called, does
Python follow all the references and import everything that is needed, or
does it just import the whole namespace (making wildcard imports acceptable
:O)?

-Surya Subbarao

From rosuav at  Tue Jan  5 00:36:53 2016
From: rosuav at (Chris Angelico)
Date: Tue, 5 Jan 2016 16:36:53 +1100
Subject: [Python-ideas] How exactly does from ... import ... work?
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 5, 2016 at 4:32 PM, u8y7541 The Awesome Person
<surya.subbarao1 at> wrote:
> Suppose I have a file called randomFile.py which reads like this:
> class A:
>     def __init__(self, foo):
>         self.foo = foo
>         self.bar = bar(foo)
> class B(A):
>     pass
> class C(B):
>     pass
> def bar(foo):
>     return foo + 1
> Suppose in another file in the same directory, I have another python
> program.
> from randomFile import C
> # some code
> When C has to be imported, B also has to be imported because it is the
> parent. Therefore, A also has to be imported. This also results in the
> function bar being imported. When from ... import ... is called, does Python
> follow all the references and import everything that is needed, or does it
> just import the whole namespace (making wildcard imports acceptable :O)?

Not sure why this is on -ideas; explanations of how Python already
works would more normally go on python-list.

When you say "from X import Y", what Python does is, more-or-less:

import X
Y = X.Y
del X

The entire file gets executed, and then one symbol from it gets
imported into the current namespace.
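
A small experiment makes this visible. It is a sketch under stated
assumptions: the module is written to a temporary directory under the
hypothetical name random_file_demo so the example is self-contained, and its
contents mirror the classes from the question.

```python
import os
import sys
import tempfile

# Hypothetical module mirroring the one in the question.
SOURCE = """\
class A:
    def __init__(self, foo):
        self.foo = foo
        self.bar = bar(foo)

class B(A):
    pass

class C(B):
    pass

def bar(foo):
    return foo + 1
"""

tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "random_file_demo.py"), "w") as fh:
    fh.write(SOURCE)
sys.path.insert(0, tmpdir)

from random_file_demo import C

# The whole file was executed and the module object is cached here...
module = sys.modules["random_file_demo"]
# ...but only the requested name was bound in the importing scope.
names_here = dir()
```

The module object in sys.modules carries A, B, C and bar, which is why
C(1).bar works, while the importing scope gains only the name C.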


From ncoghlan at  Tue Jan  5 00:50:11 2016
From: ncoghlan at (Nick Coghlan)
Date: Tue, 5 Jan 2016 15:50:11 +1000
Subject: [Python-ideas] Deprecating the old-style sequence protocol
In-Reply-To: <>
References: <>
Message-ID: <>

On 5 January 2016 at 12:46, Andrew Barnert via Python-ideas
<python-ideas at> wrote:
> On Jan 4, 2016, at 12:31, Guido van Rossum <guido at> wrote:
> On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert <abarnert at> wrote:
>> On Dec 27, 2015, at 09:04, Guido van Rossum <guido at> wrote:
>> > If we really want a way to turn something that just supports __getitem__
>> > into an Iterable maybe we can provide an additional ABC for that purpose;
>> > let's call it a HalfSequence until we've come up with a better name. (We
>> > can't use Iterable for this because Iterable should not reference
>> > __getitem__.)
>> #25988 (using Nick's name Indexable, and the details from that post).
> Oh, interesting. Though I have misgivings about that name.
> Now that you mention it, I can see the confusion. I interpreted Nick's
> "Indexable" to mean "subscriptable by indexes (and slices of indexes)" as
> opposed to "subscriptable by arbitrary keys". But if I didn't already know
> what he intended, I suppose I could have instead guessed "usable as an
> index", which would be very misleading.
> There don't seem to be any existing terms for this that don't relate to
> "sequence", so maybe your HalfSequence (or Sequential or
> SequentiallySubscriptable or something even more horrible than that last
> one?) is the best option?
> Or, hopefully, someone _can_ come up with a better name. :)

I mainly suggested Indexable because it was the least-worst name I
could think of, and I'd previously suggested Index as the name for
"has an __index__ method" (in the context of typing, but it would also
work in the context of

The main alternative I've thought of is "IterableByIndex", which is
both explicit and accurate, with the only strike against it being
length.

Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Tue Jan  5 00:55:00 2016
From: ncoghlan at (Nick Coghlan)
Date: Tue, 5 Jan 2016 15:55:00 +1000
Subject: [Python-ideas] Bad programming style in decorators?
In-Reply-To: <>
References: <>
Message-ID: <>

On 5 January 2016 at 10:26, Guido van Rossum <guido at> wrote:
> On Sat, Jan 2, 2016 at 10:42 PM, Nick Coghlan <ncoghlan at> wrote:
>> Always allocating f.__annotations__ actually puzzled me a bit - did we
>> do that for a specific reason, or did we just not think of setting it
>> to None when it's unused to save space the way we do for other
>> function attributes? (__closure__, __defaults__, etc)
> Where do you see that happening? The code in funcobject.c seems to indicate
> that it's created on demand. (And that's how I remember it always being.)

I didn't check the code, only the behaviour, so I missed that querying
f.__annotations__ was implicitly creating the dictionary.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ethan at  Tue Jan  5 01:04:26 2016
From: ethan at (Ethan Furman)
Date: Mon, 04 Jan 2016 22:04:26 -0800
Subject: [Python-ideas] How exactly does from ... import ... work?
In-Reply-To: <>
References: <>
Message-ID: <>

This list is for ideas about future Python (the language) enhancements.

Please direct questions about how Python currently works to the tutor list:


From guido at  Tue Jan  5 01:49:18 2016
From: guido at (Guido van Rossum)
Date: Mon, 4 Jan 2016 22:49:18 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 4, 2016 at 9:25 PM, Guido van Rossum <guido at> wrote:

> Following up on this, in theory the right way to walk a tree using pathlib
> already exists, it's the rglob() method. E.g. all paths under /foo/bar
> should be found as follows:
>   for path in pathlib.Path('/foo/bar').rglob('**/*'):

Whoops, I just realized that I combined two ways of doing a recursive glob
here. It should be either rglob('*') or plain glob('**/*'). What I wrote
produces identical results, but at the cost of a lot of caching. :-)

Note that the PEP doesn't mention rglob() -- why do we even have it? It
seems rglob(pat) is exactly the same as glob('**/' + pat) (assuming os.sep
is '/'). No TOOWTDI here?
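
The claimed equivalence is easy to check on a throwaway tree. This is a
minimal sketch: the directory layout and file names are made up for the
example, and it compares results only, not how much work each spelling does.

```python
import tempfile
from pathlib import Path

# Build a small hypothetical tree: root/top.txt, root/a/mid.txt, root/a/b/deep.txt
root = Path(tempfile.mkdtemp())
(root / "a" / "b").mkdir(parents=True)
for rel in ("top.txt", "a/mid.txt", "a/b/deep.txt"):
    (root / rel).touch()

via_rglob = sorted(root.rglob("*.txt"))        # recursive glob, spelled one way
via_glob = sorted(root.glob("**/*.txt"))       # ...and the other
everything_rglob = sorted(root.rglob("*"))     # every file and directory below root
everything_glob = sorted(root.glob("**/*"))
```

On this tree the two spellings return the same paths for both patterns.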

--Guido van Rossum (

From tritium-list at  Tue Jan  5 09:13:24 2016
From: tritium-list at (Alexander Walters)
Date: Tue, 05 Jan 2016 09:13:24 -0500
Subject: [Python-ideas] Deprecating the old-style sequence protocol
In-Reply-To: <>
References: <>
Message-ID: <>

Devil's advocate: Please don't make me press shift more than twice in a 
base class name if you expect me to use it.  It just makes annoying 
avoidable typos more common.

'Subscripted' sounds good to me, if that's worth anything.

On 1/5/2016 00:50, Nick Coghlan wrote:
> On 5 January 2016 at 12:46, Andrew Barnert via Python-ideas
> <python-ideas at> wrote:
>> On Jan 4, 2016, at 12:31, Guido van Rossum <guido at> wrote:
>> On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert <abarnert at> wrote:
>>> On Dec 27, 2015, at 09:04, Guido van Rossum <guido at> wrote:
>>>> If we really want a way to turn something that just supports __getitem__
>>>> into an Iterable maybe we can provide an additional ABC for that purpose;
>>>> let's call it a HalfSequence until we've come up with a better name. (We
>>>> can't use Iterable for this because Iterable should not reference
>>>> __getitem__.)
>>> #25988 (using Nick's name Indexable, and the details from that post).
>> Oh, interesting. Though I have misgivings about that name.
>> Now that you mention it, I can see the confusion. I interpreted Nick's
>> "Indexable" to mean "subscriptable by indexes (and slices of indexes)" as
>> opposed to "subscriptable by arbitrary keys". But if I didn't already know
>> what he intended, I suppose I could have instead guessed "usable as an
>> index", which would be very misleading.
>> There don't seem to be any existing terms for this that don't relate to
>> "sequence", so maybe your HalfSequence (or Sequential or
>> SequentiallySubscriptable or something even more horrible than that last
>> one?) is the best option?
>> Or, hopefully, someone _can_ come up with a better name. :)
> I mainly suggested Indexable because it was the least-worst name I
> could think of, and I'd previously suggested Index as the name for
> "has an __index__ method" (in the context of typing, but it would also
> work in the context of
> The main alternative I've thought of is "IterableByIndex", which is
> both explicit and accurate, with the only strike against it being
> length.
> Cheers,
> Nick.

From chris.barker at  Tue Jan  5 11:30:00 2016
From: chris.barker at (Chris Barker - NOAA Federal)
Date: Tue, 5 Jan 2016 08:30:00 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <9171176375587048097@unknownmsgid>

Thanks for following up.

> it's the rglob() method. E.g. all paths under /foo/bar should be found as
>   for path in pathlib.Path('/foo/bar').rglob('**/*'):
> The PermissionError bug you found is already reported: -- it even has a patch but it's stuck in

Thanks for pinging that -- I had somehow assumed that the PermissionError
was intentional.

> Sadly there's another error: loops introduced by symlinks cause infinite
> recursion. I filed that here: (The fix
> should be judicious use of is_symlink(), but the code is a little


On Mon, Dec 28, 2015 at 11:25 AM, Chris Barker <chris.barker at>

> On Tue, Dec 22, 2015 at 4:23 PM, Guido van Rossum <guido at>
> wrote:
>> The two-level iteration forced upon you by os.walk() is indeed often
>> unnecessary -- but handling dirs and files separately usually makes sense,
> indeed, but not always, so a simple API that allows you to get a flat walk
> would be nice....
> Of course for that basic use case, you could just write your own wrapper
>>> around os.walk:
> sure, but having to write "little" wrappers for common needs is
> unfortunate...
> The problem isn't designing a nice walk API; it's integrating it with
>>> pathlib.*
> indeed -- I'd really like to see a *walk in pathlib itself. I've been
> trying to use pathlib whenever I need, well, a path, but then I find I
> almost immediately need to step out and use an os.path function, and have
> to string-fy it anyway -- makes me wonder what the point is..
>  And honestly, if open, os.walk, etc. aren't going to work with Path
>>> objects,
> but they should -- of course they should.....
> Truly pushing for adoption of a new abstraction like this takes many years
>> -- pathlib was new (and provisional) in 3.4 so it really hasn't been long
>> enough to give up on it. The OP hasn't!
> it will take many years for sure -- but the standard library could at least
> adopt it as much as possible.
> Path.walk would be a nice start :-)
> My example: one of our sysadmins wanted a little script to go through an
> entire drive (Windows), and check if any paths were longer than 256
> characters (Windows, remember..)
> I came up with this:
> def get_all_paths(start_dir='/'):
>     for dirpath, dirnames, filenames in os.walk(start_dir):
>         for filename in filenames:
>             yield os.path.join(dirpath, filename)
> too_long = []
> for p in get_all_paths('/'):
>     print("checking:", p)
>     if len(p) > 255:
>         too_long.append(p)
>         print("Path too long!")
> way too wordy!
> I started with pathlib, but that just made it worse.
> now that I think about it, maybe I could have simply used
> pathlib.Path.rglob....
> However, when I try that, I get a permission error:
> /Users/chris.barker/miniconda2/envs/py3/lib/python3.5/ in
> wrapped(pathobj, *args)
>     369         @functools.wraps(strfunc)
>     370         def wrapped(pathobj, *args):
> --> 371             return strfunc(str(pathobj), *args)
>     372         return staticmethod(wrapped)
>     373
> PermissionError: [Errno 13] Permission denied:
> '/Users/.chris.barker.xahome/caches/opendirectory'
> as the error comes from inside the rglob() generator, I'm not sure how to tell
> it to ignore and move on....
> os.walk is somehow able to deal with this.
> -CHB
> --
> Christopher Barker, Ph.D.
> Oceanographer
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
> Chris.Barker at

--Guido van Rossum (

From chris.barker at  Tue Jan  5 11:37:54 2016
From: chris.barker at (Chris Barker - NOAA Federal)
Date: Tue, 5 Jan 2016 08:37:54 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <8348957817041496645@unknownmsgid>

> Note that the PEP doesn't mention rglob() -- why do we even have it? It seems rglob(pat) is exactly the same as glob('**/' + pat) (assuming os.sep is '/'). No TOOWTDI here?

Much as I believe in TOOWTDI, I like having rglob(). "**/" is the kind
of magic a newbie ( like me :-) ) would have to research and understand.
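
To make the comparison concrete, here is a minimal sketch that runs both
spellings over a throwaway directory tree and compares the results. The
helper name recursive_matches and the file names are made up for
illustration; only Path.glob() and Path.rglob() come from pathlib.

```python
import pathlib
import tempfile


def recursive_matches(root, pat):
    """Return the sorted matches of pat under root, computed two ways."""
    root = pathlib.Path(root)
    via_rglob = sorted(root.rglob(pat))
    via_glob = sorted(root.glob('**/' + pat))
    return via_rglob, via_glob


with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)
    # Build a small tree: one file at the top and one at each nested level.
    (base / 'a' / 'b').mkdir(parents=True)
    (base / 'top.txt').write_text('x')
    (base / 'a' / 'mid.txt').write_text('x')
    (base / 'a' / 'b' / 'deep.txt').write_text('x')

    via_rglob, via_glob = recursive_matches(base, '*.txt')
    names = sorted(p.name for p in via_rglob)
```

On this tree both spellings find the same three files, including the one
directly under the root, since "**" also matches zero directories.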


From guido at  Tue Jan  5 11:45:11 2016
From: guido at (Guido van Rossum)
Date: Tue, 5 Jan 2016 08:45:11 -0800
Subject: [Python-ideas] Deprecating the old-style sequence protocol
In-Reply-To: <>
References: <>
Message-ID: <>

Or maybe Indexable is fine after all, since the arguments to __getitem__
are supposed to be objects with an __index__ method (e.g. Integral, but not

BTW, Maybe Index needs to be added to as an ABC? PEP 357, which
introduced it, sounds like it pre-dates ABCs.

On Tue, Jan 5, 2016 at 6:13 AM, Alexander Walters <tritium-list at>

> Devil's advocate:  Please don't make me press shift more than twice in a
> base class name if you expect me to use it.  It just makes annoying
> avoidable typos more common.
> 'Subscripted' sounds good to me, if that's worth anything.
> On 1/5/2016 00:50, Nick Coghlan wrote:
>> On 5 January 2016 at 12:46, Andrew Barnert via Python-ideas
>> <python-ideas at> wrote:
>>> On Jan 4, 2016, at 12:31, Guido van Rossum <guido at> wrote:
>>> On Fri, Jan 1, 2016 at 2:25 PM, Andrew Barnert <abarnert at>
>>> wrote:
>>>> On Dec 27, 2015, at 09:04, Guido van Rossum <guido at> wrote:
>>>>> If we really want a way to turn something that just supports
>>>>> __getitem__
>>>>> into an Iterable maybe we can provide an additional ABC for that
>>>>> purpose;
>>>>> let's call it a HalfSequence until we've come up with a better name.
>>>>> (We
>>>>> can't use Iterable for this because Iterable should not reference
>>>>> __getitem__.)
>>>> #25988 (using Nick's name Indexable, and the details from that post).
>>> Oh, interesting. Though I have misgivings about that name.
>>> Now that you mention it, I can see the confusion. I interpreted Nick's
>>> "Indexable" to mean "subscriptable by indexes (and slices of indexes)" as
>>> opposed to "subscriptable by arbitrary keys". But if I didn't already
>>> know
>>> what he intended, I suppose I could have instead guessed "usable as an
>>> index", which would be very misleading.
>>> There don't seem to be any existing terms for this that don't relate to
>>> "sequence", so maybe your HalfSequence (or Sequential or
>>> SequentiallySubscriptable or something even more horrible than that last
>>> one?) is the best option?
>>> Or, hopefully, someone _can_ come up with a better name. :)
>> I mainly suggested Indexable because it was the least-worst name I
>> could think of, and I'd previously suggested Index as the name for
>> "has an __index__ method" (in the context of typing, but it would also
>> work in the context of
>> The main alternative I've thought of is "IterableByIndex", which is
>> both explicit and accurate, with the only strike against it being
>> length.
>> Cheers,
>> Nick.

--Guido van Rossum (

From tritium-list at  Tue Jan  5 12:17:33 2016
From: tritium-list at (Alexander Walters)
Date: Tue, 05 Jan 2016 12:17:33 -0500
Subject: [Python-ideas] Deprecating the old-style sequence protocol
In-Reply-To: <>
References: <>
Message-ID: <>

On 1/5/2016 11:45, Guido van Rossum wrote:
> since the arguments to __getitem__ are supposed to be objects with an 
> __index__ method

...In the context of classic iteration only?

From abarnert at  Tue Jan  5 13:17:42 2016
From: abarnert at (Andrew Barnert)
Date: Tue, 5 Jan 2016 10:17:42 -0800
Subject: [Python-ideas] Deprecating the old-style sequence protocol
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 5, 2016, at 09:17, Alexander Walters <tritium-list at> wrote:
>> On 1/5/2016 11:45, Guido van Rossum wrote:
>> since the arguments to __getitem__ are supposed to be objects with an __index__ method
> ...In the context of classic iteration only?

Basically, in the context of what makes a sequence different from a mapping.

The idea here is to have a way to signal that a class follows the old-style sequence protocol, as opposed to being a mapping or some other use of __getitem__: you can access its elements by subscripting it with indexes from 0 up to the first one that raises IndexError (or up to __len__, if you provide it, but that isn't necessary). But this doesn't have to be airtight for type proofs or anything; if your class inherits from Indexable but then accepts integers too large to fit in an Index, that's fine.
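
As a concrete illustration of the old-style protocol described above, here
is a minimal sketch. The class name Squares is invented for the example; it
defines only __getitem__, with neither __iter__ nor __len__.

```python
from collections.abc import Iterable


class Squares:
    """Old-style sequence: subscriptable by 0, 1, 2, ... until IndexError."""

    def __init__(self, n):
        self._n = n

    def __getitem__(self, index):
        if not 0 <= index < self._n:
            raise IndexError(index)
        return index * index


squares = Squares(4)

# iter() falls back to calling __getitem__ with 0, 1, 2, ... and stops at
# the first IndexError, so the object can be looped over.
items = list(squares)

# The Iterable ABC only looks for __iter__, so it does not recognise it.
recognised = isinstance(squares, Iterable)
```

The mismatch between "can be looped over" and "is not an Iterable" is the
gap the proposed ABC is meant to let a class declare explicitly.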

From srkunze at  Tue Jan  5 13:41:11 2016
From: srkunze at (Sven R. Kunze)
Date: Tue, 5 Jan 2016 19:41:11 +0100
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

Don't get me wrong, but both glob('**/*') and rglob('*') sound quite
cryptic. Furthermore, globbing always sounds slow to me.

Is it fast?
And is there some way to leave out the '*' (three special characters for 
plain ol'everything)?
And how can I walk directories only and files only?

On 05.01.2016 07:49, Guido van Rossum wrote:
> On Mon, Jan 4, 2016 at 9:25 PM, Guido van Rossum <guido at 
> <mailto:guido at>> wrote:
>     Following up on this, in theory the right way to walk a tree using
>     pathlib already exists, it's the rglob() method. E.g. all paths
>     under /foo/bar should be found as follows:
>       for path in pathlib.Path('/foo/bar').rglob('**/*'):
> Whoops, I just realized that I combined two ways of doing a recursive 
> glob here. It should be either rglob('*') or plain glob('**/*'). What 
> I wrote produces identical results, but at the cost of a lot of 
> caching. :-)


From guido at  Tue Jan  5 15:21:09 2016
From: guido at (Guido van Rossum)
Date: Tue, 5 Jan 2016 12:21:09 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <8348957817041496645@unknownmsgid>
References: <>
Message-ID: <>

On Tue, Jan 5, 2016 at 8:37 AM, Chris Barker - NOAA Federal <
chris.barker at> wrote:

> > Note that the PEP doesn't mention rglob() -- why do we even have it? It
> seems rglob(pat) is exactly the same as glob('**/' + pat) (assuming os.sep
> is '/'). No TOOWTDI here?
> Much as I believe in TOOWTDI, I like having rglob(). "**/" is the kind
> of magic a newbie ( like me :-) ) would have to research and understand.

Sure. It's too late to remove it anyway.

Is there anything actionable here besides fixing the PermissionError and
the behavior under symlink loops? IMO if you want files only or directories
only you can just add a filter using e.g. is_dir():

p = pathlib.Path.cwd()
real_dirs =  [p for p in p.rglob('*') if p.is_dir() and not p.is_symlink()]

--Guido van Rossum (

From moloney at  Tue Jan  5 15:27:24 2016
From: moloney at (Brendan Moloney)
Date: Tue, 5 Jan 2016 20:27:24 +0000
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

The main issue is the lack of stat caching. That is why I wrote my own module around scandir which includes the DirEntry objects for each path so that the consumer can also do stuff with the cached stat info (like check if it is a file or directory). Often we won't need to call stat on the path at all, and if we do it will only be once.
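
For readers who have not used scandir, a rough sketch of the general idea
follows. This is not the module mentioned above; scan_tree is a made-up
name, and the only real API used is os.scandir() with its os.DirEntry
results (the "with" form assumes Python 3.6 or later).

```python
import os
import tempfile


def scan_tree(top):
    """Yield os.DirEntry objects for everything below top, recursively."""
    with os.scandir(top) as entries:
        for entry in entries:
            yield entry
            # is_dir() can usually be answered from the directory listing
            # itself, and any stat result it does need is cached on the entry.
            if entry.is_dir(follow_symlinks=False):
                yield from scan_tree(entry.path)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'sub'))
    for name in ('one.txt', os.path.join('sub', 'two.txt')):
        with open(os.path.join(tmp, name), 'w') as f:
            f.write('x')

    found = list(scan_tree(tmp))
    file_names = sorted(e.name for e in found if e.is_file())
    dir_names = sorted(e.name for e in found if e.is_dir())
```

Because the consumer receives the DirEntry rather than a bare string, the
file/directory checks at the end reuse whatever the entry already knows.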

Brendan Moloney
Research Associate
Advanced Imaging Research Center
Oregon Health Science University
From: Python-ideas [ at] on behalf of Guido van Rossum [guido at]
Sent: Tuesday, January 05, 2016 12:21 PM
To: Chris Barker - NOAA Federal
Cc: Python-Ideas
Subject: Re: [Python-ideas] find-like functionality in pathlib

On Tue, Jan 5, 2016 at 8:37 AM, Chris Barker - NOAA Federal <chris.barker at> wrote:
> Note that the PEP doesn't mention rglob() -- why do we even have it? It seems rglob(pat) is exactly the same as glob('**/' + pat) (assuming os.sep is '/'). No TOOWTDI here?

Much as I believe in TOOWTDI, I like having rglob(). "**/" is the kind
of magic a newbie ( like me :-) ) would have to research and understand.

Sure. It's too late to remove it anyway.

Is there anything actionable here besides fixing the PermissionError and the behavior under symlink loops? IMO if you want files only or directories only you can just add a filter using e.g. is_dir():

p = pathlib.Path.cwd()
real_dirs =  [p for p in p.rglob('*') if p.is_dir() and not p.is_symlink()]

--Guido van Rossum (<>)

From guido at  Tue Jan  5 16:04:21 2016
From: guido at (Guido van Rossum)
Date: Tue, 5 Jan 2016 13:04:21 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 5, 2016 at 12:27 PM, Brendan Moloney <moloney at> wrote:

> The main issue is the lack of stat caching. That is why I wrote my own
> module around scandir which includes the DirEntry objects for each path so
> that the consumer can also do stuff with the cached stat info (like check
> if it is a file or directory). Often we won't need to call stat on the path
> at all, and if we do it will only be once.

I wonder if stat() caching shouldn't be made an orthogonal optional feature
of Path objects somehow; it keeps coming back as useful in various cases
even though we don't want to enable it by default.

One problem with stat() caching is that Path objects are considered
immutable, and two Path objects referring to the same path are completely
interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')} is a
set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a',
cache_stat=True), the behavior of two instances of that object might be
observably different (if they were instantiated at times when the contents
of the filesystem was different). So maybe stat-caching Path instances
should be considered unequal, or perhaps unhashable. Or perhaps they should
only be considered equal if their stat() values are actually equal (i.e. if
the file's stat() info didn't change).
So this is a thorny issue that requires some real thought before we commit
to an API.
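
The interchangeability point can be checked directly. This small sketch
uses PurePosixPath so that it behaves the same on any platform; the
variable names are arbitrary.

```python
import pathlib

a1 = pathlib.PurePosixPath('/a')
a2 = pathlib.PurePosixPath('/a')

# Two distinct objects that compare equal and hash alike, so a set keeps
# only one of them.
unique = {a1, a2}
same_object = a1 is a2
```

Any stat info cached on a1 but not on a2 would be invisible to this
equality check, which is the source of the thorniness.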

We might also want to create Path instances directly from DirEntry objects.
(Interesting, the DirEntry API seems to be a subset of the Path API, except
for the .path attribute which is equivalent to the str() of a Path object.)

Maybe some of this can be done first as a 3rd party module forked from the
original 3rd party pathlib? seems
reasonably up to date.

--Guido van Rossum (

From barry at  Tue Jan  5 18:49:21 2016
From: barry at (Barry Warsaw)
Date: Tue, 5 Jan 2016 18:49:21 -0500
Subject: [Python-ideas] PEP 9 - plaintext PEP format - is officially
Message-ID: <>

I don't think this will be at all controversial.  Brett suggested, and there
was no disagreement from the PEP editors, that plain text PEPs be deprecated.
reStructuredText is clearly a better format, and all recent PEP submissions
have been in reST for a while now anyway.

I am therefore withdrawing[*] PEP 9 and have made other appropriate changes to
make it clear that only PEP 12 format is acceptable going forward.  The PEP
editors will not be converting the legacy PEPs to reST, nor will we currently
be renaming the relevant PEP source files to end with ".rst" since there's too
much tooling that would have to change to do so.  However, if either task
really interests you, please get in touch with the PEP editors.

it-only-took-15-years-ly y'rs,
-Barry (on behalf of the PEP editors)

[*] Status: Withdrawn being about the only currently appropriate resolution
status for process PEPs.

From greg.ewing at  Tue Jan  5 17:59:49 2016
From: greg.ewing at (Greg Ewing)
Date: Wed, 06 Jan 2016 11:59:49 +1300
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum wrote:
> I wonder if stat() caching shouldn't be made an orthogonal optional 
> feature of Path objects somehow; it keeps coming back as useful in 
> various cases even though we don't want to enable it by default.

Maybe path.stat() could return a PathWithStat object that
inherits from Path and can do everything that a Path can
do, but also contains cached stat info and has a suitable
set of attributes for accessing it.

This would make it clear at what point in time the info
is valid for, i.e. the moment you called stat(). It would
also provide an obvious way to refresh the info: calling
path_with_stat.stat() would give you a new PathWithStat
containing updated info.

Things like scandir could then return pre-populated
PathWithStat objects.


From guido at  Tue Jan  5 19:02:41 2016
From: guido at (Guido van Rossum)
Date: Tue, 5 Jan 2016 16:02:41 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 5, 2016 at 2:59 PM, Greg Ewing <greg.ewing at>

> Guido van Rossum wrote:
>> I wonder if stat() caching shouldn't be made an orthogonal optional
>> feature of Path objects somehow; it keeps coming back as useful in various
>> cases even though we don't want to enable it by default.
> Maybe path.stat() could return a PathWithStat object that
> inherits from Path and can do everything that a Path can
> do, but also contains cached stat info and has a suitable
> set of attributes for accessing it.

Well, Path.stat() is already defined and returns the same type of object
that os.stat() returns, and I don't think we should change that.

We could add a new method that does this, but as long as it inherits from
Path it wouldn't really address the issue with objects being == to each
other but holding different stat info.

> This would make it clear at what point in time the info
> is valid for, i.e. the moment you called stat(). It would
> also provide an obvious way to refresh the info: calling
> path_with_stat.stat() would give you a new PathWithStat
> containing updated info.
> Things like scandir could then return pre-populated
> PathWithStat objects.

I presume you are proposing a new Path.scandir() method -- the existing
os.scandir() method already returns DirEntry objects which we really don't
want to change at this point.

--Guido van Rossum (

From random832 at  Wed Jan  6 11:11:13 2016
From: random832 at (Random832)
Date: Wed, 06 Jan 2016 11:11:13 -0500
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote:
> One problem with stat() caching is that Path objects are considered
> immutable, and two Path objects referring to the same path are completely
> interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')} is
> a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a',
> cache_stat=True), the behavior of two instances of that object might be
> observably different (if they were instantiated at times when the
> contents of the filesystem was different). So maybe stat-caching Path instances
> should be considered unequal, or perhaps unhashable. Or perhaps they
> should only be considered equal if their stat() values are actually equal (i.e.
> if the file's stat() info didn't change).

What about a global cache?

From guido at  Wed Jan  6 11:48:30 2016
From: guido at (Guido van Rossum)
Date: Wed, 6 Jan 2016 08:48:30 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 6, 2016 at 8:11 AM, Random832 <random832 at> wrote:

> On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote:
> > One problem with stat() caching is that Path objects are considered
> > immutable, and two Path objects referring to the same path are completely
> > interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')} is
> > a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a',
> > cache_stat=True), the behavior of two instances of that object might be
> > observably different (if they were instantiated at times when the
> > contents of the filesystem was different). So maybe stat-caching Path
> instances
> > should be considered unequal, or perhaps unhashable. Or perhaps they
> > should only be considered equal if their stat() values are actually
> equal (i.e.
> > if the file's stat() info didn't change).
> What about a global cache?

It would have to use a weak dict so if the last reference goes away it
discards the cached stats for a given path, otherwise you'd have trouble
containing the cache size.

And caching Path objects should still not be comparable to non-caching Path
objects (which we will need to preserve the semantics that repeatedly
calling stat() on a Path object created the default way will always redo
the syscall). The main advantage would be that caching Path objects could
be compared safely.

It could still cause unexpected results. E.g. if you have just traversed
some big tree using caching, and saved some results (so hanging on to some
paths and hence their stat() results), and then you make some changes and
traverse it again to look for something else, you might accidentally be
seeing stale (i.e. cached) stat() results.

Maybe there's a middle ground, where the user can create a StatCache object
and pass it into Path creation and traversal operations. Paths with the
same StatCache object (or both None) compare equal if their path components
are equal. Paths with different StatCache objects never compare equal (but
otherwise are ordered by path as usual -- the StatCache object's identity
is only used when the paths are equal).
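
A very rough sketch of how the equality rule above could look, under
heavy assumptions: StatCache and CachedPath here are hypothetical names,
CachedPath wraps a Path rather than subclassing it, and ordering and
traversal are left out entirely. It is not the prototype or any pathlib
API, only an illustration of "same cache object (or both None) plus equal
path means equal".

```python
import os
import pathlib
import tempfile


class StatCache:
    """Hypothetical cache mapping path strings to os.stat_result objects."""

    def __init__(self):
        self._results = {}

    def stat(self, path):
        key = str(path)
        if key not in self._results:
            self._results[key] = os.stat(key)
        return self._results[key]


class CachedPath:
    """Hypothetical wrapper pairing a Path with an optional StatCache."""

    def __init__(self, path, cache=None):
        self.path = pathlib.Path(path)
        self.cache = cache

    def stat(self):
        # Without a cache every call redoes the syscall, like a default Path.
        if self.cache is None:
            return self.path.stat()
        return self.cache.stat(self.path)

    def __eq__(self, other):
        if not isinstance(other, CachedPath):
            return NotImplemented
        # Equal only if the paths match and the cache is the same object
        # (or both are None).
        return self.path == other.path and self.cache is other.cache

    def __hash__(self):
        return hash((self.path, id(self.cache)))


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'data.txt')
    with open(target, 'w') as f:
        f.write('abc')

    shared = StatCache()
    first = CachedPath(target, shared)
    second = CachedPath(target, shared)
    other = CachedPath(target, StatCache())
    plain = CachedPath(target)

    size_before = first.stat().st_size
    with open(target, 'w') as f:
        f.write('abcdef')
    size_cached = second.stat().st_size  # served from the shared cache
    size_fresh = plain.stat().st_size    # no cache, so a fresh os.stat()
```

The last three lines also show the staleness risk mentioned earlier: the
path sharing a cache still reports the size from before the rewrite.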

Are you (or anyone still reading this) interested in implementing this idea?

--Guido van Rossum (

From srkunze at  Wed Jan  6 12:04:38 2016
From: srkunze at (Sven R. Kunze)
Date: Wed, 6 Jan 2016 18:04:38 +0100
Subject: [Python-ideas] intuitive timedeltas like in go
Message-ID: <>


timedelta handling always felt cumbersome to me:

from datetime import timedelta

short_period = timedelta(seconds=10)
long_period = timedelta(hours=4, seconds=37)

Today, I came across this one 
and I found the creation of a 10 seconds timeout extremely intuitive. 
Would this represent a valuable addition to Python?

from datetime import second, hour

short_period = 10*second
long_period = 4*hour + 37*second


From ian.g.kelly at  Wed Jan  6 12:24:28 2016
From: ian.g.kelly at (Ian Kelly)
Date: Wed, 6 Jan 2016 10:24:28 -0700
Subject: [Python-ideas] intuitive timedeltas like in go
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 6, 2016 at 10:04 AM, Sven R. Kunze <srkunze at> wrote:
> Hi,
> timedelta handling always felt cumbersome to me:
> from datetime import timedelta
> short_period = timedelta(seconds=10)
> long_period = timedelta(hours=4, seconds=37)
> Today, I came across this one and
> I found the creation of a 10 seconds timeout extremely intuitive. Would this
> represent a valuable addition to Python?
> from datetime import second, hour
> short_period = 10*second
> long_period = 4*hour + 37*second

Anybody who wants this can already accomplish it with just a few extra lines:

>>> from datetime import timedelta
>>> second = timedelta(seconds=1)
>>> hour = timedelta(hours=1)
>>> 10*second
datetime.timedelta(0, 10)
>>> 4*hour + 37*second
datetime.timedelta(0, 14437)

From storchaka at  Wed Jan  6 12:36:30 2016
From: storchaka at (Serhiy Storchaka)
Date: Wed, 6 Jan 2016 19:36:30 +0200
Subject: [Python-ideas] intuitive timedeltas like in go
In-Reply-To: <>
References: <>
Message-ID: <n6jjav$jbu$>

On 06.01.16 19:04, Sven R. Kunze wrote:
> timedelta handling always felt cumbersome to me:
> from datetime import timedelta
> short_period = timedelta(seconds=10)
> long_period = timedelta(hours=4, seconds=37)
> Today, I came across this one
> and I found the creation of a 10 seconds timeout extremely intuitive.
> Would this represent a valuable addition to Python?
> from datetime import second, hour
> short_period = 10*second
> long_period = 4*hour + 37*second

Does Go support keyword arguments?

From guido at  Wed Jan  6 15:00:24 2016
From: guido at (Guido van Rossum)
Date: Wed, 6 Jan 2016 12:00:24 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 4, 2016 at 9:25 PM, Guido van Rossum <guido at> wrote:

> Following up on this, in theory the right way to walk a tree using pathlib
> already exists, it's the rglob() method. E.g. all paths under /foo/bar
> should be found as follows:
>   for path in pathlib.Path('/foo/bar').rglob('**/*'):
[actually, rglob('*') or glob('**/*')]

>       print(path)
> The PermissionError bug you found is already reported:
> -- it even has  a patch but it's stuck
> in review.

I committed this fix.

> Sadly there's another error: loops introduced by symlinks cause infinite
> recursion. I filed that here: (The fix
> should be judicious use of is_symlink(), but the code is a little
> convoluted.)

I committed a fix for this too (turned out to need just one call to

I also added a .path attribute to pathlib.*Path objects, so that p.path ==
str(p). You can now use the idiom getattr(arg, 'path', arg) to extract the
path from a pathlib.Path object, or from an os.DirEntry object, or fall
back to a plain string, without using str(arg), which would turn *any*
object into a string, which is never what you want to happen by default.

These changes will be released in Python 3.4.5, 3.5.2 and 3.6.
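
A small sketch of the getattr idiom in use. The helper name path_string
is invented, and the sketch exercises only os.DirEntry (which has a .path
attribute) and plain values; whether pathlib objects carry .path in the
Python you run is an assumption, so check before relying on it.

```python
import os
import tempfile


def path_string(arg):
    """Return arg.path if arg has a .path attribute, otherwise arg itself."""
    return getattr(arg, 'path', arg)


with tempfile.TemporaryDirectory() as tmp:
    full_name = os.path.join(tmp, 'example.txt')
    open(full_name, 'w').close()

    with os.scandir(tmp) as entries:
        entry = next(entries)

    from_entry = path_string(entry)       # DirEntry: uses its .path
    from_string = path_string(full_name)  # plain string: passed through
    from_number = path_string(42)         # unrelated object: left alone
```

Unlike str(arg), the idiom leaves an unrelated object such as 42 untouched
rather than silently turning it into the string '42'.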

--Guido van Rossum (

From guido at  Wed Jan  6 17:42:35 2016
From: guido at (Guido van Rossum)
Date: Wed, 6 Jan 2016 14:42:35 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

I couldn't help myself and coded up a prototype for the StatCache design I
sketched. See Feedback welcome! On my
Mac it only seems to offer limited benefits though...

On Wed, Jan 6, 2016 at 8:48 AM, Guido van Rossum <guido at> wrote:

> On Wed, Jan 6, 2016 at 8:11 AM, Random832 <random832 at> wrote:
>> On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote:
>> > One problem with stat() caching is that Path objects are considered
>> > immutable, and two Path objects referring to the same path are
>> completely
>> > interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')}
>> is
>> > a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a',
>> > cache_stat=True), the behavior of two instances of that object might be
>> > observably different (if they were instantiated at times when the
>> > contents of the filesystem was different). So maybe stat-caching Path
>> instances
>> > should be considered unequal, or perhaps unhashable. Or perhaps they
>> > should only be considered equal if their stat() values are actually
>> equal (i.e.
>> > if the file's stat() info didn't change).
>> What about a global cache?
> It would have to use a weak dict so if the last reference goes away it
> discards the cached stats for a given path, otherwise you'd have trouble
> containing the cache size.
> And caching Path objects should still not be comparable to non-caching
> Path objects (which we will need to preserve the semantics that repeatedly
> calling stat() on a Path object created the default way will always redo
> the syscall). The main advantage would be that caching Path objects could
> be compared safely.
> It could still cause unexpected results. E.g. if you have just traversed
> some big tree using caching, and saved some results (so hanging on to some
> paths and hence their stat() results), and then you make some changes and
> traverse it again to look for something else, you might accidentally be
> seeing stale (i.e. cached) stat() results.
> Maybe there's a middle ground, where the user can create a StatCache
> object and pass it into Path creation and traversal operations. Paths with
> the same StatCache object (or both None) compare equal if their path
> components are equal. Paths with different StatCache objects never compare
> equal (but otherwise are ordered by path as usual -- the StatCache object's
> identity is only used when the paths are equal).
> Are you (or anyone still reading this) interested in implementing this
> idea?
> --
> --Guido van Rossum (

--Guido van Rossum (

From moloney at  Wed Jan  6 17:48:45 2016
From: moloney at (Brendan Moloney)
Date: Wed, 6 Jan 2016 22:48:45 +0000
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

It's important to keep in mind that the main benefit of scandir is that you don't have to do ANY stat call in many cases, because the directory listing provides some subset of this info. On Linux you can at least tell if a path is a file or directory.  On Windows there is much more info provided by the directory listing. Avoiding subsequent stat calls is also nice, but not nearly as important due to OS level caching.

Brendan Moloney
Research Associate
Advanced Imaging Research Center
Oregon Health Science University
From: Python-ideas [ at] on behalf of Guido van Rossum [guido at]
Sent: Wednesday, January 06, 2016 2:42 PM
To: Random832
Cc: Python-Ideas
Subject: Re: [Python-ideas] find-like functionality in pathlib

I couldn't help myself and coded up a prototype for the StatCache design I sketched. See Feedback welcome! On my Mac it only seems to offer limited benefits though...


From chris.barker at  Wed Jan  6 23:35:00 2016
From: chris.barker at (Chris Barker - NOAA Federal)
Date: Wed, 6 Jan 2016 20:35:00 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <7140563289735493919@unknownmsgid>

>> The PermissionError bug you found is already reported
> I committed this fix.
> I also added a .path attribute to pathlib.*Path objects, so that p.path ==
> str(p). You can now use the idiom getattr(arg, 'path', arg) to extract the
> path from a pathlib.Path object, or from an os.DirEntry object, or fall
> back to a plain string, without using str(arg), which would turn *any*
> object into a string, which is never what you want to happen by default.

Very nice -- that opens the door to stdlib and third party modules taking
Path objects in addition to strings. Maybe we will see greater adoption of
pathlib after all!

> These changes will be released in Python 3.4.5, 3.5.2 and 3.6.
> --Guido van Rossum (

From wes.turner at  Thu Jan  7 04:03:01 2016
From: wes.turner at (Wes Turner)
Date: Thu, 7 Jan 2016 03:03:01 -0600
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

A bit OT, possibly, but this may be a long way around (to a cached *graph*
of paths and metadata) with similar use cases:, NetworkX edge, node dicts

def walk_path_into_graph(g, path_, errors='warn'):

This stats and reads limited image format metadata as CSV, TSV, JSON:

I suppose because of race conditions this metadata should actually be
stored in a filesystem triplestore with extended attributes and also
secontext attributes.

(... gnome-tracker reads filesystem stat data into RDF, for SPARQL).

BSP vertex messaging can probably handle cascading cache invalidation (with
supersteps).

On Jan 6, 2016 4:44 PM, "Guido van Rossum" <guido at> wrote:

> I couldn't help myself and coded up a prototype for the StatCache design I
> sketched. See Feedback welcome! On my
> Mac it only seems to offer limited benefits though...
> On Wed, Jan 6, 2016 at 8:48 AM, Guido van Rossum <guido at> wrote:
>> On Wed, Jan 6, 2016 at 8:11 AM, Random832 <random832 at> wrote:
>>> On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote:
>>> > One problem with stat() caching is that Path objects are considered
>>> > immutable, and two Path objects referring to the same path are
>>> completely
>>> > interchangeable. For example, {pathlib.Path('/a'), pathlib.Path('/a')}
>>> is
>>> > a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a',
>>> > cache_stat=True), the behavior of two instances of that object might be
>>> > observably different (if they were instantiated at times when the
>>> > contents of the filesystem was different). So maybe stat-caching Path
>>> instances
>>> > should be considered unequal, or perhaps unhashable. Or perhaps they
>>> > should only be considered equal if their stat() values are actually
>>> equal (i.e.
>>> > if the file's stat() info didn't change).
>>> What about a global cache?
>> It would have to use a weak dict so if the last reference goes away it
>> discards the cached stats for a given path, otherwise you'd have trouble
>> containing the cache size.
>> And caching Path objects should still not be comparable to non-caching
>> Path objects (which we will need to preserve the semantics that repeatedly
>> calling stat() on a Path object created the default way will always redo
>> the syscall). The main advantage would be that caching Path objects could
>> be compared safely.
>> It could still cause unexpected results. E.g. if you have just traversed
>> some big tree using caching, and saved some results (so hanging on to some
>> paths and hence their stat() results), and then you make some changes and
>> traverse it again to look for something else, you might accidentally be
>> seeing stale (i.e. cached) stat() results.
>> Maybe there's a middle ground, where the user can create a StatCache
>> object and pass it into Path creation and traversal operations. Paths with
>> the same StatCache object (or both None) compare equal if their path
>> components are equal. Paths with different StatCache objects never compare
>> equal (but otherwise are ordered by path as usual -- the StatCache object's
>> identity is only used when the paths are equal).
>> Are you (or anyone still reading this) interested in implementing this
>> idea?
>> --
>> --Guido van Rossum (
> --
> --Guido van Rossum (

From wes.turner at  Thu Jan  7 04:08:45 2016
From: wes.turner at (Wes Turner)
Date: Thu, 7 Jan 2016 03:08:45 -0600
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

The PyFilesystem filesystem abstraction APIs may also have / be in need of
a sensible .walk() API

  walk() Like listdir() but descends in to sub-directories

  walkdirs() Returns an iterable of paths to sub-directories

  walkfiles() Returns an iterable of file paths in a directory, and its
  sub-directories

On Jan 7, 2016 3:03 AM, "Wes Turner" <wes.turner at> wrote:

> A bit OT, possibly, but this may be a long way around (to a cached *graph*
> of paths and metadata) with similar use cases:
>, NetworkX edge, node dicts
> def walk_path_into_graph(g, path_, errors='warn'):
>     """
>     """
> This stats and reads limited image format metadata as CSV, TSV, JSON:
> I suppose because of race conditions this metadata should actually be
> stored in a filesystem triplestore with extended attributes and also
> secontext attributes.
> (... gnome-tracker reads filesystem stat data into RDF, for SPARQL).
> BSP vertex messaging can probably handle cascading cache invalidation
> (with supersteps).
> On Jan 6, 2016 4:44 PM, "Guido van Rossum" <guido at> wrote:
>> I couldn't help myself and coded up a prototype for the StatCache design
>> I sketched. See Feedback welcome! On
>> my Mac it only seems to offer limited benefits though...
>> On Wed, Jan 6, 2016 at 8:48 AM, Guido van Rossum <guido at>
>> wrote:
>>> On Wed, Jan 6, 2016 at 8:11 AM, Random832 <random832 at>
>>> wrote:
>>>> On Tue, Jan 5, 2016, at 16:04, Guido van Rossum wrote:
>>>> > One problem with stat() caching is that Path objects are considered
>>>> > immutable, and two Path objects referring to the same path are
>>>> completely
>>>> > interchangeable. For example, {pathlib.Path('/a'),
>>>> pathlib.Path('/a')} is
>>>> > a set of length 1: {PosixPath('/a')}. But if we had e.g. Path('/a',
>>>> > cache_stat=True), the behavior of two instances of that object might
>>>> be
>>>> > observably different (if they were instantiated at times when the
>>>> > contents of the filesystem was different). So maybe stat-caching Path
>>>> instances
>>>> > should be considered unequal, or perhaps unhashable. Or perhaps they
>>>> > should only be considered equal if their stat() values are actually
>>>> equal (i.e.
>>>> > if the file's stat() info didn't change).
>>>> What about a global cache?
>>> It would have to use a weak dict so if the last reference goes away it
>>> discards the cached stats for a given path, otherwise you'd have trouble
>>> containing the cache size.
>>> And caching Path objects should still not be comparable to non-caching
>>> Path objects (which we will need to preserve the semantics that repeatedly
>>> calling stat() on a Path object created the default way will always redo
>>> the syscall). The main advantage would be that caching Path objects could
>>> be compared safely.
>>> It could still cause unexpected results. E.g. if you have just traversed
>>> some big tree using caching, and saved some results (so hanging on to some
>>> paths and hence their stat() results), and then you make some changes and
>>> traverse it again to look for something else, you might accidentally be
>>> seeing stale (i.e. cached) stat() results.
>>> Maybe there's a middle ground, where the user can create a StatCache
>>> object and pass it into Path creation and traversal operations. Paths with
>>> the same StatCache object (or both None) compare equal if their path
>>> components are equal. Paths with different StatCache objects never compare
>>> equal (but otherwise are ordered by path as usual -- the StatCache object's
>>> identity is only used when the paths are equal).
>>> Are you (or anyone still reading this) interested in implementing this
>>> idea?
>>> --
>>> --Guido van Rossum (
>> --
>> --Guido van Rossum (

From em at  Thu Jan  7 04:20:35 2016
From: em at (Emil Stenström)
Date: Thu, 7 Jan 2016 10:20:35 +0100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8 if
 latin-1 fails in http.client
Message-ID: <>


I hope python-ideas is the right place to post this, I'm very new to 
this and appreciate a pointer in the right direction if this is not it.

The requests project is getting multiple bug reports about a problem in 
the stdlib http.client, so I thought I'd raise an issue about it here. 
The bug reports concern people posting http requests with unicode 
strings when they should be using utf-8 encoded strings.

Since RFC 2616 says latin-1 is the default encoding, http.client tries
that and fails with a UnicodeEncodeError.

My idea is NOT to change from latin-1 to something else, that would 
break compliance with the spec, but instead catch that exception, and 
try encoding with utf-8 instead. That would avoid breaking backward 
compatibility, unless someone specifically relied on that exception, 
which I think is very unlikely.
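
To make the proposal concrete, here is a minimal sketch of the fallback being suggested. It is not the http.client code; encode_body is a hypothetical helper written only to show the behaviour.

```python
def encode_body(body):
    """Encode a str body as Latin-1, falling back to UTF-8 on failure."""
    if isinstance(body, bytes):
        return body  # already encoded by the caller
    try:
        return body.encode("iso-8859-1")
    except UnicodeEncodeError:
        # Proposed fallback: characters outside Latin-1 go out as UTF-8.
        return body.encode("utf-8")
```

Note that with this rule the encoding on the wire depends on the characters in each individual body.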

This is also how HTTP libraries in other languages seem to deal with this;
sending in unicode just works:

In cURL (works fine):
curl -d "Celebrate ?"

In Ruby with http.rb (works fine):
require 'http'
r ="", :body => "Celebrate ?)

In Node with request (works fine):
var request = require('request');{url: '', body: "Celebrate ?"}, function 
(error, response, body) {

But Python 3 with requests crashes instead:
import requests
r ="http://localhost:8000/tag", data="Celebrate ?")

...with the following stacktrace:
   File "../lib/python3.4/http/", line 1127, in _send_request
     body = body.encode('iso-8859-1')
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 
14-15: ordinal not in range(256)


So the rationale for this idea is:

* http.client doesn't work the way beginners expect for very basic
use cases (posting unicode strings)
* Libraries in other languages behave like beginners expect, which
magnifies the problem.
* Changing the default latin-1 encoding probably isn't possible, because
it would break the spec...
* But catching the exception and trying to encode in utf-8 instead wouldn't
break the spec and would solve the problem.


Here's a couple of issues where people expect things to work differently:


Does this make sense?


From rosuav at  Thu Jan  7 04:49:55 2016
From: rosuav at (Chris Angelico)
Date: Thu, 7 Jan 2016 20:49:55 +1100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 7, 2016 at 8:20 PM, Emil Stenström <em at> wrote:
> So the rationale for this idea is:
> * http.client doesn't work the way beginners expect for very basic usecases (posting unicode strings)
> * Libraries in other languages behave like beginners expect, which magnifies the problem.
> * Changing the default latin-1 encoding probably isn't possible, because it would break the spec...
> * But catching the exception and try encoding in utf-8 instead wouldn't break the spec and solves the problem.
> ----
> Here's a couple of issues where people expect things to work differently:
> ----
> Does this make sense?

It makes sense, but I disagree with the suggestion. Having "Latin-1 or
UTF-8" as the effective default encoding is not a good idea, IMO;
sometimes I've *de*coded text using such heuristics (the other order,
of course; attempt UTF-8 decode, and if that fails, decode as Latin-1
or possibly CP-1252) as a means of coping with broken systems, but I
would much prefer the default to simply be one or the other.
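
For reference, the decoding heuristic mentioned above looks roughly like this. It is a sketch with a made-up function name; Latin-1 is used as the last resort because every byte sequence decodes under it.

```python
def decode_lenient(data):
    """Decode bytes as UTF-8, falling back to Latin-1 if that fails."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail,
        # but the result may be mojibake if the guess is wrong.
        return data.decode("latin-1")
```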

As the 'requests' module is not part of Python's standard library, it
would be free to change its own default, regardless of the behaviour
of http.client; whether that's a good idea or not is for the requests
community to decide (unless there's something specifically binding it
to http.client). But whether you're asking for a change in http.client
or in requests, I would disagree with the "either-or" approach; change
to a UTF-8 default, perhaps, but not to the hybrid.


From cory at  Thu Jan  7 05:07:41 2016
From: cory at (Cory Benfield)
Date: Thu, 7 Jan 2016 10:07:41 +0000
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

> On 7 Jan 2016, at 09:20, Emil Stenström <em at> wrote:
> Since RFC 2616 says latin-1 is the default encoding http.client tries that and fails with a UnicodeEncodeError.

I cannot stress this enough: there is *no* default encoding for HTTP bodies!

This conversation is very confused, and it all starts because of a thoroughly misleading comment in http.client.

Firstly, let's all remember that RFC 2616 is dead (hurrah!), now superseded by RFCs 7230 through 7235. However, http.client blames its decision on RFC 2616. Note the comment here[0]. This is (in my view) a *misreading* of RFC 2616 Section 3.7.1, which says:

> When no explicit charset
> parameter is provided by the sender, media subtypes of the "text"
> type are defined to have a default charset value of "ISO-8859-1" when
> received via HTTP.

The thing is, this paragraph is referring to MIME types: that is, when the Content-Type header reads "text/<something>" and specifies no charset parameter, the body should be encoded in ISO-8859-1 (Latin-1).

That, of course, is not the invariant this code enforces. Instead, this code spots the *only* explicit reference to a text encoding and chooses to use it for any unicode string sent by the user. That's a somewhat defensible decision, though it's not the one I'd have made.

*However*, that fallback was removed in RFC 7231. In appendix B of that RFC, we see this note:

> The default charset of ISO-8859-1 for text media types has been
> removed; the default is now whatever the media type definition says.
> Likewise, special treatment of ISO-8859-1 has been removed from the
> Accept-Charset header field.

This means there is no longer a default content encoding for HTTP, and instead the default encoding varies based on media type. The relevant RFC for this is RFC 6657, which specifies the following things:

- The default encoding for text/plain is US-ASCII
- All other text subtypes either MUST provide a charset parameter that explicitly indicates what their encoding is, or MUST NOT provide one under any circumstances and instead carry that information in their contents (e.g. HTML, XML). That is to say, there are no defaults for text/* encodings: only explicit encoding choices!
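
The two rules above can be sketched as a small lookup. This is an illustration only: charset_for is a hypothetical helper, and it uses the stdlib email.message.Message class purely as a convenient Content-Type parser.

```python
from email.message import Message


def charset_for(content_type):
    """Return the charset implied by a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lower-cased charset parameter, if any
    if charset is not None:
        return charset
    if msg.get_content_type() == "text/plain":
        return "us-ascii"  # the only default named above
    return None  # no default: the sender has to be explicit
```

A None result means there is nothing sensible to guess, which is the situation a bare unicode request body is in.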

This whole thing was really very confusing from the beginning. IMO, the only safe decision is for http.client to simply refuse to accept unicode strings *at all* as request bodies: the ambiguity over what they mean is simply too great. Requests has had a large number of bug reports from people who claimed that something "didn't work", when in practice there was just a disagreement over what the correct encoding of something was. And having written both a HTTP/1.1 and a HTTP/2 client myself, in both cases I restricted the arguments of HTTPConnection.send() to bytestrings.

For what it's worth, I don't believe it's a good idea to change the default body encoding of unicode strings. This is the kind of really perplexing change that takes working code that implicitly relies on this behaviour and breaks it. In my experience, breakage of this manner is particularly tricky to catch because anything that can be validly encoded as Latin-1 can be validly encoded as UTF-8, so the failure will manifest as request failures rather than tracebacks. In this instance I believe the http.client module has made its bed, and will need to lie in it.
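
A tiny example of why such a change would fail silently: the same text encodes without error under both codecs, but the bytes differ, so a server expecting Latin-1 would receive something it misreads rather than the client raising.

```python
text = "caf\u00e9"  # "café"

latin1_bytes = text.encode("latin-1")  # one byte for the accented letter
utf8_bytes = text.encode("utf-8")      # two bytes for the accented letter

# Both calls succeed, yet the payloads are not the same on the wire.
same_payload = latin1_bytes == utf8_bytes
```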

If this *did* change, Requests would (at least for the remainder of the 2.X release cycle) need to enforce the Latin-1 behaviour itself for the very same backward compatibility reasons, which removes any benefit we'd get from this anyway.

The really correct behaviour would be to tell users they cannot send unicode strings, because it makes no sense. That's a change I could get behind. But moving from one guess to another, even though the new guess is more likely to be right, seems to me to be misunderstanding the problem.


N.B: I should note that only one of the linked requests issues, #2838, is actually about the request body. Of the others, one is about unicode in the request URI and one is about unicode in header values. This set of related issues demonstrates an ongoing confusion amongst users about what unicode strings are and how they work, but that's a separate discussion to this one.


From p.f.moore at  Thu Jan  7 06:37:39 2016
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Jan 2016 11:37:39 +0000
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On 7 January 2016 at 09:20, Emil Stenström <em at> wrote:

> This is also how other languages http libraries seem to deal with this,
> sending in unicode just works:
> In cURL (works fine):
> curl -d "Celebrate ?"

In a Unix shell, this would be supplying a bytestring argument to the curl
exe, that encoded the characters in whatever language setting the user had
specified (likely UTF-8).

In Windows Powershell (the only Windows shell I can think of that would
support Unicode) what would happen would depend on how curl accessed its
command line. This probably relies on which specific CRT the code was built

> In Ruby with http.rb (works fine):
> require 'http'
> r ="", :body => "Celebrate ?)

I don't know how Ruby handles Unicode, but would that body argument
*actually* be Unicode, or would it be a UTF-8 encoded bytestring? I have a
vague recollection that Ruby uses a "utf-8 for internal string encodings"
model, which may mean it's not as strict as Python 3 is about separating
bytestrings and Unicode strings...

> In Node with request (works fine):
> var request = require('request');
>{url: '', body: "Celebrate ?"}, function
> (error, response, body) {
>     console.log(body)
> })

Same response here as for Ruby. It depends on the semantics of the language
regarding Unicode support as to what's happening here.

> But Python 3 with requests crashes instead:
> import requests
> r ="http://localhost:8000/tag", data="Celebrate ?")
> ...with the following stacktrace:
> ...
>   File "../lib/python3.4/http/", line 1127, in _send_request
>     body = body.encode('iso-8859-1')
> UnicodeEncodeError: 'latin-1' codec can't encode characters in position
> 14-15: ordinal not in range(256)

What does the requests documentation say it'll do with a Unicode string
being passed as POST data to a request where there's no encoding? If it
says it'll encode as latin-1, then that error is entirely correct. If it
says it'll encode in some other encoding, then it isn't doing so (and
that's a requests bug). If it's not explaining what it's doing, then the
requests documentation is doing its users a disservice by not explaining
the realities of sending Unicode over a byte-oriented protocol - and it's
also leaving a huge "undefined behaviour" hole that people are falling into.

I understand that beginners are confused by the apparent problem that other
environments "just work", but they really don't - and the problems will hit
the user further down the line, when the issue is harder to debug. For
example, you're completely ignoring the potential issue of what the target
server will do when faced with UTF-8 data - there's no guarantee that it
will work in general.

So IMO, this needs to be addressed as a documentation (and possibly code)
fix in requests. It's something of a shame that http.client doesn't
reject Unicode strings rather than making a silent assumption of the
encoding, but that's something we have to live with for backward
compatibility reasons. But there's no reason requests has to expose that
behaviour to the user.


From rosuav at  Thu Jan  7 06:53:57 2016
From: rosuav at (Chris Angelico)
Date: Thu, 7 Jan 2016 22:53:57 +1100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 7, 2016 at 10:37 PM, Paul Moore <p.f.moore at> wrote:
> So IMO, this needs to be addressed as a documentation (and possibly code) fix in requests. It's something of a shame that httplib.client doesn't reject Unicode strings rather than making a silent assumption of the encoding, but that's something we have to live with for backward compatibility reasons. But there's no reason requests has to expose that behaviour to the user.

Personally, I would be happy with any of three behaviours:

1) Raise TypeError and demand that byte strings be used
2) Encode as UTF-8, since that's most likely to "just work", and is
also consistent
3) Encode as ASCII, and let any errors bubble up.

But, backward compat.


From p.f.moore at  Thu Jan  7 07:09:23 2016
From: p.f.moore at (Paul Moore)
Date: Thu, 7 Jan 2016 12:09:23 +0000
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On 7 January 2016 at 11:53, Chris Angelico <rosuav at> wrote:
> 3) Encode as ASCII, and let any errors bubble up.

4) Encode as ASCII and catch UnicodeEncodeError and re-raise as a
TypeError "Unicode string supplied without an explicit encoding".
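
Option 4 could look something like the following sketch. The helper name require_ascii is invented for this example; the error text is the one suggested above.

```python
def require_ascii(body):
    """Accept bytes as-is; accept str only if it is pure ASCII."""
    if isinstance(body, bytes):
        return body
    try:
        return body.encode("ascii")
    except UnicodeEncodeError:
        # Hide the low-level codec error behind a clearer message.
        raise TypeError(
            "Unicode string supplied without an explicit encoding"
        ) from None
```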

IMO, the underlying encoding errors are very user-unfriendly, and
should nearly always be caught internally and replaced with something
more user friendly. Most of the user confusion I see from Unicode
issues could probably be significantly alleviated if the user was
presented with something better than a raw (en/de)coding error and
traceback.


From rosuav at  Thu Jan  7 07:29:31 2016
From: rosuav at (Chris Angelico)
Date: Thu, 7 Jan 2016 23:29:31 +1100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 7, 2016 at 11:09 PM, Paul Moore <p.f.moore at> wrote:
> On 7 January 2016 at 11:53, Chris Angelico <rosuav at> wrote:
>> 3) Encode as ASCII, and let any errors bubble up.
> 4) Encode as ASCII and catch UnicodeEncodeError and re-raise as a
> TypeError "Unicode string supplied without an explicit encoding".
> IMO, the underlying encoding errors are very user-unfriendly, and
> should nearly always be caught internally and replaced with something
> more user friendly. Most of the user confusion I see from Unicode
> issues could probably be significantly alleviated if the user was
> presented with something better than a raw (en/de)coding error and
> traceback.

Maybe. Same difference, though - permit ASCII-only, anything else is an error.


From steve at  Thu Jan  7 07:59:19 2016
From: steve at (Steven D'Aprano)
Date: Thu, 7 Jan 2016 23:59:19 +1100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:

> It makes sense, but I disagree with the suggestion. Having "Latin-1 or
> UTF-8" as the effective default encoding is not a good idea, IMO;

I'm curious what your reasoning is. That seems to be fairly common 
behaviour with some email clients, for example I seem to recall that
Thunderbird will try encoding emails as US-ASCII, if that fails, 
Latin-1, and only send UTF-8 if the other two don't work.

I'm not defending this tactic, but wondering what you have against it.


From em at  Thu Jan  7 08:11:01 2016
From: em at (Emil Stenström)
Date: Thu, 7 Jan 2016 14:11:01 +0100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On 2016-01-07 13:59, Steven D'Aprano wrote:
> On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:
>> It makes sense, but I disagree with the suggestion. Having "Latin-1 or
>> UTF-8" as the effective default encoding is not a good idea, IMO;
> I'm curious what your reasoning is. That seems to be fairly common
> behavious with some email clients, for example I seem to recall that
> Thunderbird will try encoding emails as US-ASCII, if that fails,
> Latin-1, and only send UTF-8 if the other two don't work.
> I'm not defending this tactic, but wondering what you have against it.

I'm fine with either tactic, either defaulting to utf-8 or trying them 
one after the other. The important thing for me is that the API works as 
expected by many.

My main reason for not changing the default was that it would break 
backwards compatibility, but only for the case that people sent latin-1 
strings as if they were unicode strings.

If the reading of the spec that led to using latin-1 is incorrect that 
really makes me question if having latin-1 there is a good idea from the

So I'm definitely pro switching to utf-8 as default as it would make the 
API work like many (including me) would expect.


From rosuav at  Thu Jan  7 08:25:33 2016
From: rosuav at (Chris Angelico)
Date: Fri, 8 Jan 2016 00:25:33 +1100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 7, 2016 at 11:59 PM, Steven D'Aprano <steve at> wrote:
> On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:
>> It makes sense, but I disagree with the suggestion. Having "Latin-1 or
>> UTF-8" as the effective default encoding is not a good idea, IMO;
> I'm curious what your reasoning is. That seems to be fairly common
> behavious with some email clients, for example I seem to recall that
> Thunderbird will try encoding emails as US-ASCII, if that fails,
> Latin-1, and only send UTF-8 if the other two don't work.
> I'm not defending this tactic, but wondering what you have against it.

An application is free to do that if it likes, although personally I
wouldn't bother. For a library, I'd much rather the rules be as simple
as possible. Maybe "ASCII or UTF-8" (since one is a strict subset of
the other), but not "ASCII or Latin-1 or UTF-8". I'd prefer something
extremely simple: if you don't specify an encoding, it has one
default. That corresponds to a function signature that says
encoding="UTF-8", and you can be 100% confident that omitting the
encoding parameter will do the same thing as passing "UTF-8".
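
In code, the single-default rule amounts to something like this sketch (prepare_body is a hypothetical name, not an existing API):

```python
def prepare_body(body, encoding="UTF-8"):
    """Encode str bodies with one fixed default; pass bytes through."""
    if isinstance(body, str):
        return body.encode(encoding)
    return body
```

Omitting the argument and passing encoding="UTF-8" explicitly give the same bytes, whatever the text contains.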


From guido at  Thu Jan  7 11:32:45 2016
From: guido at (Guido van Rossum)
Date: Thu, 7 Jan 2016 08:32:45 -0800
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

Thanks especially to Cory for digging into the source and the RFCs here!

Personally I'm perplexed that Requests, which claims to be "HTTP for
Humans" doesn't take care of this but just lets http/ blow up.
(However, IIUC both 2838 and 1822 are about the body.encode() call in
Python 3's http/ at _send_request(). 1926 seems to originate in
Requests itself; it's also Python 2.7.)

Anyways, if we were to follow the Python 3 philosophy regarding Unicode to
the letter we would have to reject the str type altogether here, and insist
on bytes. The error message could tell the caller what to do, e.g. "use
data.encode('utf-8') if you want the data to be encoded in UTF-8". (Then of
course the server might not like it.)

An alternative could be to look at the content-type header (if one is
given) and use the charset from there or the default from the RFC for the

But all these are rather painfully backwards incompatible, which is a big
concern here.

Maybe the best solution (most backward compatible *and* most likely to stem
the flood of bug reports) is to just catch the UnicodeError and replace its
message with something more Human-friendly, explaining that the data must
be encoded before sending it. Then the user can figure out what encoding to
use (though yes, most likely UTF-8 is it, so the message could suggest
trying that first).
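
A sketch of that last idea, assuming a hypothetical helper rather than the actual http.client code: the exception type stays the same, so existing except clauses keep working, and only the message changes.

```python
def encode_or_explain(body, encoding="iso-8859-1"):
    """Encode a str body, replacing the codec error text with advice."""
    try:
        return body.encode(encoding)
    except UnicodeEncodeError as err:
        # Re-raise the same exception type with a friendlier reason.
        raise UnicodeEncodeError(
            err.encoding,
            err.object,
            err.start,
            err.end,
            "body cannot be encoded implicitly; encode it before sending, "
            "e.g. data.encode('utf-8')",
        ) from None
```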

--Guido van Rossum (

From cory at  Thu Jan  7 11:46:50 2016
From: cory at (Cory Benfield)
Date: Thu, 7 Jan 2016 16:46:50 +0000
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

> On 7 Jan 2016, at 16:32, Guido van Rossum <guido at> wrote:
> Personally I'm perplexed that Requests, which claims to be "HTTP for Humans" doesn't take care of this but just lets http/ blow up. (However, IIUC both 2838 and 1822 are about the body.encode() call in Python 3's http/ at _send_request(). 1926 seems to originate in Requests itself; it's also Python 2.7.)

The main reason is historical: this was missed in the original (substantial) rewrite in requests 2.0, and as a result we can't change it without a backward compat break, just the same as Python. We'll probably fix it in 3.0.



From random832 at  Thu Jan  7 11:56:07 2016
From: random832 at (Random832)
Date: Thu, 07 Jan 2016 11:56:07 -0500
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 7, 2016, at 06:53, Chris Angelico wrote:
> On Thu, Jan 7, 2016 at 10:37 PM, Paul Moore <p.f.moore at> wrote:
> >
> > So IMO, this needs to be addressed as a documentation (and possibly code) fix in requests. It's something of a shame that httplib.client doesn't reject Unicode strings rather than making a silent assumption of the encoding, but that's something we have to live with for backward compatibility reasons. But there's no reason requests has to expose that behaviour to the user.
> >
> Personally, I would be happy with any of three behaviours:
> 1) Raise TypeError and demand that byte strings be used
> 2) Encode as UTF-8, since that's most likely to "just work", and is
> also consistent
> 3) Encode as ASCII, and let any errors bubble up.

What about:
4) Silently add a content type (default text/plain; charset=UTF-8) or
charset (if the user has specified a content type without one) if a
unicode string is used. If a byte string is used, use
application/octet-stream for the default content type and don't add a
charset in any case (even if the user-specified content type is text/*)
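
Roughly, that rule could be sketched as below. prepare_request is a hypothetical helper; honouring a charset the user already supplied is an assumption added here, and email.message.Message is used only to parse the header value.

```python
from email.message import Message


def prepare_request(body, content_type=None):
    """Return (body_bytes, content_type) for a str or bytes body."""
    if isinstance(body, bytes):
        # Bytes are sent untouched and never get a charset added.
        return body, content_type or "application/octet-stream"
    if content_type is None:
        content_type = "text/plain; charset=UTF-8"
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()
    if charset is None:
        # Content type given without a charset: pick UTF-8 and say so.
        charset = "utf-8"
        content_type += "; charset=UTF-8"
    return body.encode(charset), content_type
```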

From abarnert at  Thu Jan  7 11:57:42 2016
From: abarnert at (Andrew Barnert)
Date: Thu, 7 Jan 2016 08:57:42 -0800
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 7, 2016, at 01:20, Emil Stenström <em at> wrote:
> This is also how other languages http libraries seem to deal with this, sending in unicode just works:

No, sending Unicode as UTF-8 doesn't "just work", except when the server is expecting UTF-8. Otherwise, it just makes the problem harder to debug.

Most commonly, people who run into this problem with requests are trying to send JSON or form-encoded data. In either case, the solution is simple: just pass the object to the json= or data= parameter. It's only if you try to do it half-way yourself, calling json.dumps but then not calling .encode, that you run into a problem.
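
For anyone who does build the body by hand, the missing step is the explicit encode. A stdlib-only sketch (json_body is a made-up name; ensure_ascii=False is set so that non-ASCII characters actually reach the encoder instead of being escaped):

```python
import json


def json_body(obj):
    """Serialise obj to JSON text, then encode it explicitly as UTF-8."""
    text = json.dumps(obj, ensure_ascii=False)
    return text.encode("utf-8")
```

The result is bytes, so no implicit encoding guess is left for the HTTP layer to make.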

I've also seen people run into this uploading files. Again, if you let requests just take care of it for you (by passing it the filename or file object), it just works. But if you try to do it half-way, reading the whole file into memory as a string but not encoding it, that's when you have problems.

The solution in every case is simple: don't make things harder for yourself by doing extra work and then trying to use the lower-level API, just let requests do it for you.

Of course if you're using http.client or urllib instead of requests, you don't have that option. But if http.client is too low-level for you, the solution isn't to hack up http.client to be more magical when used by people who don't know what they're doing in hopes that it'll work more often than it'll cause further and harder-to-debug problems, it's to tell them to use requests if they don't want to learn what they're doing. 

From random832 at  Thu Jan  7 11:59:23 2016
From: random832 at (Random832)
Date: Thu, 07 Jan 2016 11:59:23 -0500
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 7, 2016, at 07:59, Steven D'Aprano wrote:
> On Thu, Jan 07, 2016 at 08:49:55PM +1100, Chris Angelico wrote:
> > It makes sense, but I disagree with the suggestion. Having "Latin-1 or
> > UTF-8" as the effective default encoding is not a good idea, IMO;
> I'm curious what your reasoning is. That seems to be fairly common 
> behavious with some email clients, for example I seem to recall that 
> Thunderbird will try encoding emails as US-ASCII, if that fails, 
> Latin-1, and only send UTF-8 if the other two don't work.

Sure, but it includes a content-type header with a charset parameter.

I think the behavior of encoding text but not including a charset
parameter is fundamentally broken. If the user supplies a charset
parameter, it should try to use the matching encoding, otherwise it
should pick an encoding (whether that is "always UTF-8" or some other
rule) and add the charset parameter.
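
A rough sketch of that rule, assuming a hypothetical helper named encode_body (not part of http.client or requests). It uses the standard email.message.Message class only to parse and extend the Content-Type value:

```python
from email.message import Message


def encode_body(text, content_type=None, default_charset="utf-8"):
    """Return (body_bytes, content_type) with body and charset kept in sync.

    Hypothetical helper: honours a user-supplied charset parameter,
    otherwise picks default_charset and adds it to the header value.
    """
    msg = Message()
    msg["Content-Type"] = content_type or "text/plain"
    charset = msg.get_content_charset()
    if charset is None:
        charset = default_charset
        msg.set_param("charset", charset)
    return text.encode(charset), msg["Content-Type"]
```

Either way the receiver is told which encoding was used, so a wrong guess shows up as an encoding error on the sending side rather than as mojibake on the server.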

From brett at  Thu Jan  7 12:01:08 2016
From: brett at (Brett Cannon)
Date: Thu, 07 Jan 2016 17:01:08 +0000
Subject: [Python-ideas] intuitive timedeltas like in go
In-Reply-To: <n6jjav$jbu$>
References: <> <n6jjav$jbu$>
Message-ID: <>

On Wed, 6 Jan 2016 at 09:37 Serhiy Storchaka <storchaka at> wrote:

> On 06.01.16 19:04, Sven R. Kunze wrote:
> > timedelta handling always felt cumbersome to me:
> >
> > from datetime import timedelta
> >
> > short_period = timedelta(seconds=10)
> > long_period = timedelta(hours=4, seconds=37)
> >
> > Today, I came across this one
> > and I found the creation of a 10 seconds timeout extremely intuitive.
> > Would this represent a valuable addition to Python?
> >
> > from datetime import second, hour
> >
> > short_period = 10*second
> > long_period = 4*hour + 37*second
> Does Go support keyword arguments?




From em at  Thu Jan  7 13:50:49 2016
From: em at (Emil Stenström)
Date: Thu, 7 Jan 2016 19:50:49 +0100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

Den 2016-01-07 kl. 17:46, skrev Cory Benfield:
>> On 7 Jan 2016, at 16:32, Guido van Rossum <guido at>
>> wrote:
>> Personally I'm perplexed that Requests, which claims to be "HTTP
>> for Humans" doesn't take care of this but just lets http/
>> blow up. (However, IIUC both 2838 and 1822 are about the
>> body.encode() call in Python 3's http/ at _send_request().
>> 1926 seems to originate in Requests itself; it's also Python 2.7.)
> The main reason is historical: this was missed in the original
> (substantial) rewrite in requests 2.0, and as a result we can't
> change it without a backward compat break, just the same as Python.
> We'll probably fix it in 3.0.

So as things stand:

* The general consensus seems to be that the raised error should be 
changed to something like: TypeError("Unicode string supplied without an 
explicit encoding")

* Python would like to change http.client to reject unicode input with 
an exception, but won't because of backwards compatibility

* Requests would like to do the same but won't because of backwards
compatibility

I think it will be very hard to find code that breaks because of a type 
change in the exception when sending invalid data. On the other hand, 
it's VERY easy to find people that are affected by the confusing error 
currently in use everywhere.

When a backward compatible change makes life easier for 99.9% of users, 
and 0.1% of users need to debug a TypeError with a very clear error 
message (which was probably a bug in their code to begin with), I'm 
starting to question having a policy that strict.


From guido at  Thu Jan  7 14:04:13 2016
From: guido at (Guido van Rossum)
Date: Thu, 7 Jan 2016 11:04:13 -0800
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Thu, Jan 7, 2016 at 10:50 AM, Emil Stenström <em at> wrote:

> Den 2016-01-07 kl. 17:46, skrev Cory Benfield:
>> On 7 Jan 2016, at 16:32, Guido van Rossum <guido at>
>>> wrote:
>>> Personally I'm perplexed that Requests, which claims to be "HTTP
>>> for Humans" doesn't take care of this but just lets http/
>>> blow up. (However, IIUC both 2838 and 1822 are about the
>>> body.encode() call in Python 3's http/ at _send_request().
>>> 1926 seems to originate in Requests itself; it's also Python 2.7.)
>> The main reason is historical: this was missed in the original
>> (substantial) rewrite in requests 2.0, and as a result we can?t
>> change it without a backward compat break, just the same as Python.
>> We?ll probably fix it in 3.0.
> So as things stand:
> * The general consensus seems to be that the raised error should be
> changed to something like: TypeError("Unicode string supplied without an
> explicit encoding")
> * Python would like to change http.client to reject unicode input with an
> exception, but won't because of backwards compatibility
> * Requests would like to do the same but won't because of backwards
> compatibility
> I think it will be very hard to find code that breaks because of a type
> change in the exception when sending invalid data. On the other hand, it's
> VERY easy to find people that are affected by the confusing error currently
> in use everywhere.
> When a backward compatible change makes life easier for 99.9% of users,
> and 0.1% of users need to debug a TypeError with a very clear error message
> (which was probably a bug in their code to begin with), I'm starting to
> question having a policy that strict.

What policy are you referring to? I don't think anyone objects against
making the error message clearer. The objection is against rejecting
unicode strings that in the past would have been successfully encoded using
Latin-1.

I'm not sure whether it's a good idea to change the exception type from
TypeError to UnicodeError -- the exception is really related to Unicode so
keeping UnicodeError but changing the message sounds like the right thing
to do. And this can be done independently in both Requests and the stdlib.

--Guido van Rossum (

From abarnert at  Thu Jan  7 14:31:45 2016
From: abarnert at (Andrew Barnert)
Date: Thu, 7 Jan 2016 19:31:45 +0000 (UTC)
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
Message-ID: <>

On Thursday, January 7, 2016 11:05 AM, Guido van Rossum <guido at> wrote:

> I'm not sure whether it's a good idea to change the exception type from TypeError to UnicodeError -- the exception is really related to Unicode so keeping UnicodeError but changing the message sounds like the right thing to do. And this can be done independently in both Requests and the stdlib.
That sounds like a good idea. A UnicodeEncodeError (or subclass of it?) with text like "HTTP body without encoding defaults to 'latin-1', which can't encode character '\u5555' in position 30: ordinal not in range(256)") would be pretty simple to implement, and would help a lot more than the current text. (And, for those who still can't figure it out, being a unique error message means that within a few days of the change, googling it should get a relevant StackOverflow answer, which isn't true for the generic encoding error message.)
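
A minimal sketch of what such an error could look like, assuming a hypothetical encode_body helper. The wording of the reason text is illustrative only and is not what http.client emits:

```python
def encode_body(body, encoding="latin-1"):
    """Encode a str body, re-raising failures with a more specific reason."""
    if isinstance(body, bytes):
        return body
    try:
        return body.encode(encoding)
    except UnicodeEncodeError as err:
        # Same exception type and positions, only the reason text changes.
        reason = (
            "HTTP body without an explicit encoding defaults to %r; "
            "encode the body yourself, for example body.encode('utf-8')"
            % encoding
        )
        raise UnicodeEncodeError(
            err.encoding, err.object, err.start, err.end, reason
        ) from None
```

Because the exception is still a UnicodeEncodeError, existing except clauses keep working and only the message changes.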

Requests could get fancier. For example, if the string starts with "{", make the error message ask if maybe they wanted to use json=obj instead of data=json.dumps(obj). But I think that wouldn't be appropriate for the stdlib. (Especially since http.client doesn't have a json parameter...) But then it sounds like Requests is planning to remove implicitly-Latin-1 strings via data= anyway in 3.0, which would solve the problem more simply.

From em at  Thu Jan  7 14:40:56 2016
From: em at (Emil Stenström)
Date: Thu, 7 Jan 2016 20:40:56 +0100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

Den 2016-01-07 kl. 20:04, skrev Guido van Rossum:
> What policy are you referring to?

I was reading, 
which specifies "raised exceptions", but I see now that it's only a draft.

 > I don't think anyone objects against
> making the error message clearer. The objection is against rejecting
> unicode strings that in the past would have been successfully encoded
> using Latin-1.

Then I misunderstood, sorry.

> I'm not sure whether it's a good idea to change the exception type from
> TypeError to UnicodeError -- the exception is really related to Unicode
> so keeping UnicodeError but changing the message sounds like the right
> thing to do. And this can be done independently in both Requests and the
> stdlib.

Agreed. I would also suggest adding the suggestion of encoding in 
"utf-8" specifically which is most likely what will fix the problem. As 
time goes by and more and more legacy systems disappear, this advice 
will become truer each year.


From abarnert at  Thu Jan  7 15:04:04 2016
From: abarnert at (Andrew Barnert)
Date: Thu, 7 Jan 2016 12:04:04 -0800
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Jan 7, 2016, at 11:40, Emil Stenström <em at> wrote:
> Agreed. I would also suggest adding the suggestion of encoding in "utf-8" specifically which is most likely what will fix the problem. As time goes by and more and more legacy systems disappear, this advice will become truer each year.

I disagree. Services that take raw, unformatted text as HTTP bodies and do something useful with it are disappearing in general, not changing the encoding they use for that raw, unformatted text from Latin-1 to UTF-8. And they were never that common in the first place.

So we shouldn't be making it easier to send raw, unformatted text as UTF-8; we should be making it easier to send JSON, form-encoded, multipart, XML, etc. Which, again, Requests already does.

From guido at  Thu Jan  7 15:24:27 2016
From: guido at (Guido van Rossum)
Date: Thu, 7 Jan 2016 12:24:27 -0800
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

It's time that someone files a tracker issue so we can move the remaining
discussion there.

On Thu, Jan 7, 2016 at 12:04 PM, Andrew Barnert <abarnert at> wrote:

> On Jan 7, 2016, at 11:40, Emil Stenstr?m <em at> wrote:
> >
> > Agreed. I would also suggest adding the suggestion of encoding in
> "utf-8" specifically which is most likely what will fix the problem. As
> time goes by and more and more legacy systems disappear, this advise will
> become truer each year.
> I disagree. Services that take raw, unformatted text as HTTP bodies and do
> something useful with it are disappearing in general, not changing the
> encoding they use for that raw, unformatted text from Latin-1 to UTF-8. And
> they were never that common in the first place.
> So we shouldn't be making it easier to send raw, unformatted text as
> UTF-8; we should be making it easier to send JSON, form-encoded, multipart,
> XML, etc. Which, again, Requests already does.

--Guido van Rossum (

From em at  Thu Jan  7 17:28:13 2016
From: em at (Emil Stenström)
Date: Thu, 7 Jan 2016 23:28:13 +0100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

Den 2016-01-07 kl. 21:24, skrev Guido van Rossum:
> It's time that someone files a tracker issue so we can move the
> remaining discussion there.

Here is the relevant issue:


From em at  Thu Jan  7 17:36:06 2016
From: em at (Emil Stenström)
Date: Thu, 7 Jan 2016 23:36:06 +0100
Subject: [Python-ideas] Fall back to encoding unicode strings in utf-8
 if latin-1 fails in http.client
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

Den 2016-01-07 kl. 21:04, skrev Andrew Barnert:
> I disagree. Services that take raw, unformatted text as HTTP bodies
> and do something useful with it are disappearing in general, not
> changing the encoding they use for that raw, unformatted text from
> Latin-1 to UTF-8. And they were never that common in the first
> place.

I just wrote a service like this last week. It takes raw unformatted 
text and returns part-of-speech tags for the text as JSON. That's common 
for NLP services that structure unstructured text. The rationale for 
accepting POST body is simply that it makes it very simple to call the 
service from curl:

curl -d "string here"

So there's no reason these kinds of services would be disappearing.

Let's continue the discussion in the bug tracker:


From victor.stinner at  Fri Jan  8 16:27:09 2016
From: victor.stinner at (Victor Stinner)
Date: Fri, 8 Jan 2016 22:27:09 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
Message-ID: <>


Here is a first PEP, part of a series of 3 PEPs to add an API to
implement a static Python optimizer specializing functions with
guards.
HTML version:

PEP: xxx
Title: Add dict.__version__
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner at>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6


Add a new read-only ``__version__`` property to ``dict`` and
``collections.UserDict`` types, incremented at each change.


In Python, the builtin ``dict`` type is used by many instructions. For
example, the ``LOAD_GLOBAL`` instruction searches for a variable in the
global namespace, or in the builtins namespace (two dict lookups).
Python uses ``dict`` for the builtins namespace, globals namespace, type
namespaces, instance namespaces, etc. The local namespace (namespace of
a function) is usually optimized to an array, but it can be a dict too.

Python is hard to optimize because almost everything is mutable: builtin
functions, function code, global variables, local variables, ... can be
modified at runtime. Implementing optimizations that respect Python
semantics requires detecting when "something changes": we will call these
checks "guards".

The speedup of optimizations depends on the speed of guard checks. This
PEP proposes to add a version to dictionaries to implement efficient
guards on namespaces.

Example of optimization: replace loading a global variable with a
constant.  This optimization requires a guard on the global variable to
check if it was modified. If the variable is modified, the variable must
be loaded at runtime, instead of using the constant.

Guard example

Pseudo-code of an efficient guard to check if a dictionary key was
modified (created, updated or deleted)::

    UNSET = object()

    class Guard:
        def __init__(self, dict, key):
            self.dict = dict
            self.key = key
            self.value = dict.get(key, UNSET)
            self.version = dict.__version__

        def check(self):
            """Return True if the dictionary value did not change."""
            version = self.dict.__version__
            if version == self.version:
                # Fast-path: avoid the dictionary lookup
                return True

            value = self.dict.get(self.key, UNSET)
            if value is self.value:
                # another key was modified:
                # cache the new dictionary version
                self.version = version
                return True

            return False


Add a read-only ``__version__`` property to builtin ``dict`` type and to
the ``collections.UserDict`` type. New empty dictionaries are initialized
to version ``0``. The version is incremented at each change:

* ``clear()`` if the dict was non-empty
* ``pop(key)`` if the key exists
* ``popitem()`` if the dict is non-empty
* ``setdefault(key, value)`` if the `key` does not exist
* ``__delitem__(key)`` if the key exists
* ``__setitem__(key, value)`` if the `key` doesn't exist or if the value
  is different
* ``update(...)`` if new values are different than existing values (the
  version can be incremented multiple times)


Example::

    >>> d = {}
    >>> d.__version__
    0
    >>> d['key'] = 'value'
    >>> d.__version__
    1
    >>> d['key'] = 'new value'
    >>> d.__version__
    2
    >>> del d['key']
    >>> d.__version__
    3

If a dictionary is created with items, the version is also incremented
at each dictionary insertion. Example::

    >>> d = dict(x=7, y=33)
    >>> d.__version__
    2

The version is not incremented if an existing key is set to the
same value, but only the identity of the value is tested, not the
content of the value. Example::

    >>> d = {}
    >>> value = object()
    >>> d['key'] = value
    >>> d.__version__
    1
    >>> d['key'] = value
    >>> d.__version__
    1

.. note::
   CPython uses singletons for some objects, such as integers in the
   range [-5; 256], the empty tuple, empty strings, Unicode strings of a
   single character in the range [U+0000; U+00FF], etc. When a key is
   set twice to the same singleton, the version is not modified.
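
The rules above can be tried out without a patched interpreter. The following is only a sketch: ``VersionedDict`` is a hypothetical pure-Python stand-in that exposes a plain ``version`` attribute instead of the proposed ``__version__`` property, and it covers only ``__setitem__``, ``__delitem__`` and ``update``:

```python
UNSET = object()


class VersionedDict(dict):
    """Hypothetical stand-in for a dict with the proposed version counter."""

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.version = 0
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        # Bump only for a new key or a different object (identity test).
        if key not in self or self[key] is not value:
            self.version += 1
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)  # raises KeyError first if key is missing
        self.version += 1

    def update(self, *args, **kwargs):
        # Route every insertion through __setitem__ so the version is bumped.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


class Guard:
    """Guard on one key, following the pseudo-code of the PEP."""

    def __init__(self, d, key):
        self.d = d
        self.key = key
        self.value = d.get(key, UNSET)
        self.version = d.version

    def check(self):
        if self.d.version == self.version:
            return True  # fast path: nothing changed at all
        if self.d.get(self.key, UNSET) is self.value:
            self.version = self.d.version  # another key was modified
            return True
        return False


ns = VersionedDict(x=7, y=33)
guard = Guard(ns, "x")
```

Changing ``ns["y"]`` leaves the guard valid (it only re-caches the version), whereas rebinding ``ns["x"]`` to another object makes ``guard.check()`` return False.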

The PEP is designed to implement guards on namespaces; in practice, only
the ``dict`` type can be used for namespaces.  ``collections.UserDict``
is modified because it must mimic ``dict``. ``collections.Mapping`` is

Integer overflow

The implementation uses the C unsigned integer type ``size_t`` to store
the version.  On 32-bit systems, the maximum version is ``2**32-1``
(more than ``4.2 * 10 ** 9``, 4 billion). On 64-bit systems, the maximum
version is ``2**64-1`` (more than ``1.8 * 10**19``).

The C code uses ``version++``. The behaviour on integer overflow of the
version is undefined. The minimum guarantee is that the version always
changes when the dictionary is modified.

The check ``dict.__version__ == old_version`` can be true after an
integer overflow, so a guard can return true even though the value
changed, which is wrong. The bug occurs if the dict is modified at least
``2**64`` times (on a 64-bit system) between two checks of the guard.
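
A small sketch of the problem, assuming the counter wraps around modulo ``2**N`` the way unsigned C integers do. An 8-bit counter is used here only so the wrap-around is reached quickly:

```python
BITS = 8                       # tiny counter, for illustration only
MASK = (1 << BITS) - 1


def bump(version):
    """Increment like an unsigned C integer that is BITS bits wide."""
    return (version + 1) & MASK


saved = 5                      # version remembered by a guard
version = saved
for _ in range(2 ** BITS):     # exactly 2**BITS modifications later...
    version = bump(version)

# ...the counter is back at the saved value, so the fast path would pass.
stale_guard_passes = (version == saved)
```

With a 64-bit counter the same coincidence needs ``2**64`` modifications between two checks.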

Using a more complex type (ex: ``PyLongObject``) to avoid the overflow
would slow down operations on the ``dict`` type. Even if there is a
theoretical risk of missing a value change, the risk is considered too low
compared to the slowdown of using a more complex type.


Add a version to each dict entry

A single version per dictionary requires keeping a strong reference to
the value, which can keep the value alive longer than expected. If we
also add a version per dictionary entry, the guard can rely on the entry
version and so avoid the strong reference to the value (only strong
references to the dictionary and the key are needed).

Changes: add a ``getversion(key)`` method to dictionary which returns
``None`` if the key doesn't exist. When a key is created or modified,
the entry version is set to the dictionary version which is incremented
at each change (create, modify, delete).

Pseudo-code of an efficient guard to check if a dict key was modified
using ``getversion()``::

    UNSET = object()

    class Guard:
        def __init__(self, dict, key):
            self.dict = dict
            self.key = key
            self.dict_version = dict.__version__
            self.entry_version = dict.getversion(key)

        def check(self):
            """Return True if the dictionary value did not change."""
            dict_version = self.dict.__version__
            if dict_version == self.dict_version:
                # Fast-path: avoid the dictionary lookup
                return True

            # lookup in the dictionary, but get the entry version,
            # not the value
            entry_version = self.dict.getversion(self.key)
            if entry_version == self.entry_version:
                # another key was modified:
                # cache the new dictionary version
                self.dict_version = dict_version
                return True

            return False
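
As with the first guard, this variant can be emulated in pure Python. The sketch below is not the proposed C implementation: ``EntryVersionedDict`` and ``EntryGuard`` are hypothetical names, the counter is a plain ``version`` attribute, and only ``__setitem__`` and ``__delitem__`` are covered:

```python
class EntryVersionedDict(dict):
    """Hypothetical stand-in: a dict version plus one version per entry."""

    def __init__(self):
        super().__init__()
        self.version = 0
        self._entry_versions = {}

    def __setitem__(self, key, value):
        if key not in self or self[key] is not value:
            self.version += 1
            # The entry remembers the dict version at which it last changed.
            self._entry_versions[key] = self.version
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1
        del self._entry_versions[key]

    def getversion(self, key):
        """Return the entry version, or None if the key does not exist."""
        return self._entry_versions.get(key)


class EntryGuard:
    """Guard that stores versions only, never the watched value."""

    def __init__(self, d, key):
        self.d = d
        self.key = key
        self.dict_version = d.version
        self.entry_version = d.getversion(key)

    def check(self):
        if self.d.version == self.dict_version:
            return True
        if self.d.getversion(self.key) == self.entry_version:
            self.dict_version = self.d.version
            return True
        return False
```

The guard holds no reference to the value, which is the point of this alternative. The extra ``_entry_versions`` mapping stands in for the per-entry field whose memory cost is discussed next.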

The main drawback of this option is the impact on the memory footprint.
It increases the size of each dictionary entry, so the overhead depends
on the number of buckets (dictionary entries, used or unused yet). For
example, it increases the size of each dictionary entry by 8 bytes on
a 64-bit system if we use ``size_t``.

In Python, the memory footprint matters and the trend is toward reducing
it. Examples:

* PEP 393 -- Flexible String Representation
* PEP 412 -- Key-Sharing Dictionary

Add a new dict subtype

Add a new ``verdict`` type, subtype of ``dict``. When guards are needed,
use the ``verdict`` for namespaces (module namespace, type namespace,
instance namespace, etc.) instead of ``dict``.

Leave the ``dict`` type unchanged to not add any overhead (memory
footprint) when guards are not needed.

Technical issue: a lot of C code in the wild, including the CPython core,
expects the exact ``dict`` type. Issues:

* ``exec()`` requires a ``dict`` for globals and locals. A lot of code
  uses ``globals={}``. It is not possible to cast the ``dict`` to a
  ``dict`` subtype because the caller expects the ``globals`` parameter
  to be modified (``dict`` is mutable).
* Functions directly call ``PyDict_xxx()`` functions, instead of calling
  ``PyObject_xxx()`` if the object is a ``dict`` subtype
* The ``PyDict_CheckExact()`` check fails on a ``dict`` subtype, whereas
  some functions require the exact ``dict`` type.
* ``Python/ceval.c`` does not completely support dict subtypes for

The ``exec()`` issue is a blocker issue.

Other issues:

* The garbage collector has special code to "untrack" ``dict``
  instances. If a ``dict`` subtype is used for namespaces, the garbage
  collector may be unable to break some reference cycles.
* Some functions have a fast-path for ``dict`` which would not be taken
  for ``dict`` subtypes, and so it would make Python a little bit

Usage of dict.__version__

astoptimizer of FAT Python

The astoptimizer of the FAT Python project implements many optimizations
which require guards on namespaces. Examples:

* Call pure builtins: to replace ``len("abc")`` with ``3``, guards on
  ``builtins.__dict__['len']`` and ``globals()['len']`` are required
* Loop unrolling: to unroll the loop ``for i in range(...): ...``,
  guards on ``builtins.__dict__['range']`` and ``globals()['range']``
  are required
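
A short illustration of why both guards are needed. Name lookup consults the globals of a function before the builtins, so a global called ``len`` silently changes what ``len("abc")`` means. The function name ``abc_length`` is made up for this sketch:

```python
source = 'def abc_length():\n    return len("abc")\n'

namespace = {}                  # fresh, module-like globals
exec(source, namespace)
abc_length = namespace["abc_length"]

before = abc_length()           # len is found in the builtins namespace

# Shadow the builtin with a global of the same name.
namespace["len"] = lambda obj: "shadowed"
after_global = abc_length()

del namespace["len"]            # the builtin is visible again
restored = abc_length()
```

Replacing ``builtins.len`` itself has the same effect through the other dictionary, so an optimizer that folded the call to ``3`` must watch both namespaces.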

The FAT Python project is a
static optimizer for Python 3.6.


According to Brett Cannon, one of the two main developers of Pyjion, Pyjion can
also benefit from a dictionary version to implement optimizations.

Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET Core

Unladen Swallow

Even if a dictionary version was not explicitly mentioned, optimizing globals
and builtins lookup was part of the Unladen Swallow plan: "Implement one of the
several proposed schemes for speeding lookups of globals and builtins."
Source: Unladen Swallow ProjectPlan

Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler implemented
with LLVM. The project stopped in 2011: Unladen Swallow Retrospective

Prior Art

Cached globals+builtins lookup

In 2006, Andrea Griffini proposed a patch implementing a "Cached
globals+builtins lookup optimization".
The patch adds a private ``timestamp`` field to dict.

See the thread on python-dev: About dictionary lookup caching

Globals / builtins cache

In 2010, Antoine Pitrou proposed a "Globals / builtins cache"
which adds a private
``ma_version`` field to the ``dict`` type. The patch adds a "global and
builtin cache" to functions and frames, and changes ``LOAD_GLOBAL`` and
``STORE_GLOBAL`` instructions to use the cache.


PySizer: a memory profiler for Python,
Google Summer of Code 2005 project by Nick Smallbone.

This project has a patch for CPython 2.4 which adds ``key_time`` and
``value_time`` fields to dictionary entries. It uses a global
process-wide counter for dictionaries, incremented each time that a
dictionary is modified. The times are used to decide when child objects
first appeared in their parent objects.


This document has been placed in the public domain.


From victor.stinner at  Fri Jan  8 16:31:40 2016
From: victor.stinner at (Victor Stinner)
Date: Fri, 8 Jan 2016 22:31:40 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
Message-ID: <>


Here is the second PEP, part of a series of 3 PEPs to add an API to
implement a static Python optimizer specializing functions with
guards.

HTML version:

PEP: xxx
Title: Specialized functions with guards
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner at>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6


Add an API to add specialized functions with guards to functions, to
support static optimizers respecting Python semantics.


Python is hard to optimize because almost everything is mutable: builtin
functions, function code, global variables, local variables, ... can be
modified at runtime. Implementing optimizations that respect Python
semantics requires detecting when "something changes": we will call these
checks "guards".

This PEP proposes to add a ``specialize()`` method to functions to add
specialized functions with guards. When the function is called, the
specialized function is used if nothing changed; otherwise the
original bytecode is used.

Writing an optimizer is out of the scope of this PEP.


Using bytecode

Replace ``chr(65)`` with ``"A"``::

    import myoptimizer

    def func():
        return chr(65)

    def fast_func():
        return "A"

    func.specialize(fast_func.__code__, [myoptimizer.GuardBuiltins("chr")])
    del fast_func

    print("func(): %s" % func())
    print("#specialized: %s" % len(func.get_specialized()))

    import builtins
    builtins.chr = lambda obj: "mock"

    print("func(): %s" % func())
    print("#specialized: %s" % len(func.get_specialized()))


    func(): A
    #specialized: 1

    func(): mock
    #specialized: 0

The hypothetical ``myoptimizer.GuardBuiltins("chr")`` is a guard on the
builtin ``chr()`` function and the ``chr`` name in the global namespace.
The guard fails if the builtin function is replaced or if a ``chr`` name
is defined in the global namespace.

The first call directly returns the string ``"A"``. The second call
removes the specialized function because the builtin ``chr()`` function
was replaced, and executes the original bytecode.

On a microbenchmark, calling the specialized function takes 88 ns,
whereas the original bytecode takes 145 ns (+57 ns): 1.6 times as fast.

Using builtin function

Replace a slow Python function calling ``chr(obj)`` with a direct call
to the builtin ``chr()`` function::

    import myoptimizer

    def func(arg):
        return chr(arg)

    func.specialize(chr, [myoptimizer.GuardBuiltins("chr")])

    print("func(65): %s" % func(65))
    print("#specialized: %s" % len(func.get_specialized()))

    import builtins
    builtins.chr = lambda obj: "mock"

    print("func(65): %s" % func(65))
    print("#specialized: %s" % len(func.get_specialized()))


    func(65): A
    #specialized: 1

    func(65): mock
    #specialized: 0

The first call directly calls the builtin ``chr()`` function (without
creating a Python frame). The second call removes the specialized
function because the builtin ``chr()`` function was replaced, and
executes the original bytecode.

On a microbenchmark, calling the specialized function takes 95 ns,
whereas the original bytecode takes 155 ns (+60 ns): 1.6 times as fast.
Calling directly ``chr(65)`` takes 76 ns.

Python Function Call

Pseudo-code to call a Python function having specialized functions with
guards::
    def call_func(func, *args, **kwargs):
        # by default, call the regular bytecode
        code = func.__code__.co_code
        specialized = func.get_specialized()
        nspecialized = len(specialized)

        index = 0
        while index < nspecialized:
            guard = specialized[index].guard
            # pass arguments, some guards need them
            check = guard(args, kwargs)
            if check == 1:
                # guard succeeded: we can use the specialized function
                code = specialized[index].code
                break
            elif check == -1:
                # guard will always fail: remove the specialized function
                del specialized[index]
                nspecialized -= 1
            elif check == 0:
                # guard failed temporarily
                index += 1

        # code can be a code object or any callable object
        execute_code(code, args, kwargs)


* Add two new methods to functions:

  - ``specialize(code, guards: list)``: add a specialized
    function with guards. `code` is a code object (ex:
    ``func2.__code__``) or any callable object (ex: ``len``).
    The specialization can be ignored if a guard already fails.
  - ``get_specialized()``: get the list of specialized functions with
    guards

* Base ``Guard`` type which can be used as a parent type to implement
  guards. Subclasses must implement a ``check()`` function and may
  implement an optional ``first_check()`` function. API:

  * ``int check(PyObject *guard, PyObject **stack)``: return 1 on
    success, 0 if the guard failed temporarily, -1 if the guard will
    always fail
  * ``int first_check(PyObject *guard, PyObject *func)``: return 0 on
    success, -1 if the guard will always fail
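
The dispatch logic can be emulated in pure Python to experiment with the idea. This is only a sketch under assumptions: ``Specializable`` and ``guard_name_is`` are hypothetical names, guards are plain callables taking ``(args, kwargs)`` and returning 1, 0 or -1, and a small dict stands in for the builtins namespace so that the real builtins are left untouched:

```python
class Specializable:
    """Hypothetical wrapper emulating specialize() / get_specialized()."""

    def __init__(self, func):
        self.func = func
        self.specialized = []           # list of (callable, guards)

    def specialize(self, fast, guards):
        self.specialized.append((fast, guards))

    def get_specialized(self):
        return list(self.specialized)

    def __call__(self, *args, **kwargs):
        index = 0
        while index < len(self.specialized):
            fast, guards = self.specialized[index]
            results = [guard(args, kwargs) for guard in guards]
            if all(result == 1 for result in results):
                return fast(*args, **kwargs)    # every guard succeeded
            if any(result == -1 for result in results):
                del self.specialized[index]     # will always fail: drop it
            else:
                index += 1                      # failed temporarily
        return self.func(*args, **kwargs)       # fall back to the original


def guard_name_is(namespace, name, expected):
    """Build a guard that fails for good once namespace[name] changes."""
    def guard(args, kwargs):
        return 1 if namespace.get(name) is expected else -1
    return guard


namespace = {"chr": chr}                # stand-in for the builtins namespace


def func():
    return namespace["chr"](65)


def fast_func():
    return "A"


func = Specializable(func)
func.specialize(fast_func, [guard_name_is(namespace, "chr", chr)])

first = func()
count_before = len(func.get_specialized())

namespace["chr"] = lambda obj: "mock"   # invalidates the guard
second = func()
count_after = len(func.get_specialized())
```

The first call takes the specialized path; after the name is rebound, the guard reports a permanent failure, the specialization is dropped, and the original function runs.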

Microbenchmark on ``python3.6 -m timeit -s 'def f(): pass' 'f()'`` (best
of 3 runs):

* Original Python: 79 ns
* Patched Python: 79 ns

According to this microbenchmark, the changes have no overhead on calling
a Python function without specialization.


When a function code is replaced (``func.__code__ = new_code``), all
specialized functions are removed.

When a function is serialized (by ``marshal`` or ``pickle`` for
example), specialized functions and guards are ignored (not serialized).


This document has been placed in the public domain.


From guido at  Fri Jan  8 18:04:58 2016
From: guido at (Guido van Rossum)
Date: Fri, 8 Jan 2016 15:04:58 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
Message-ID: <>

At Dropbox we're trying to be good citizens and we're working towards
introducing gradual typing (PEP 484) into our Python code bases (several
million lines of code). However, that code base is mostly still Python 2.7
and we believe that we should introduce gradual typing first and start
working on conversion to Python 3 second (since having static types in the
code can help a big refactoring like that).

Since Python 2 doesn't support function annotations we've had to look for
alternatives. We considered stub files, a magic codec, docstrings, and
additional `# type:` comments. In the end we decided that `# type:`
comments are the most robust approach. We've experimented a fair amount
with this and we have a proposal for a standard.

The proposal is very simple. Consider the following function with Python 3
annotations:

    def embezzle(self, account: str, funds: int = 1000000,
                 *fake_receipts: str) -> None:
        """Embezzle funds from account using fake receipts."""
        <code goes here>

An equivalent way to write this in Python 2 is the following:

    def embezzle(self, account, funds=1000000, *fake_receipts):
        # type: (str, int, *str) -> None
        """Embezzle funds from account using fake receipts."""
        <code goes here>

There are a few details to discuss:

- Every argument must be accounted for, except 'self' (for instance
methods) or 'cls' (for class methods). Also the return type is mandatory.
If in Python 3 you would omit some argument or the return type, the Python
2 notation should use 'Any'.

- If you're using names defined in the typing module, you must still import
them! (There's a backport on PyPI.)

- For `*args` and `**kwds`, put 1 or 2 stars in front of the corresponding
type annotation. As with Python 3 annotations, the annotation here denotes
the type of the individual argument values, not of the tuple/dict that you
receive as the special argument value 'args' or 'kwds'.

- The entire annotation must be one line. (However, see

We would like to propose this as a standard (either to be added to PEP 484
or as a new PEP) rather than making it a "proprietary" extension to mypy
only, so that others in a similar situation can also benefit.
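
As an illustration of how a tool might read the signature comment from the
Python 2 example above, here is a sketch using the standard `ast` module. It
assumes Python 3.8 or later, where `ast.parse()` accepts `type_comments=True`;
that support postdates this message and is not part of the proposal itself.
The helper name `signature_comments` is made up.

```python
import ast

SOURCE = '''
def embezzle(self, account, funds=1000000, *fake_receipts):
    # type: (str, int, *str) -> None
    """Embezzle funds from account using fake receipts."""
    pass
'''


def signature_comments(source):
    """Map each function name to the text of its `# type:` signature comment."""
    tree = ast.parse(source, type_comments=True)
    return {
        node.name: node.type_comment
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and node.type_comment
    }


print(signature_comments(SOURCE))
```

The comment is returned as plain text, so a checker still has to parse the
argument and return types out of it.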

A brief discussion of the considered alternatives:

- Stub files: this would complicate the analysis in mypy quite a bit,
because it would have to parse both the .py file and the .pyi file and
somehow combine the information gathered from both, and for each function
it would have to use the types from the stub file to type-check the body of
the function in the .py file. This would require a lot of additional
plumbing. And if we were using Python 3 we would want to use in-line
annotations anyway.

- A magic codec was implemented over a year ago ( but after using it
for a bit we didn't like it much. It slows down imports, it requires a `#
coding: mypy` declaration, it would conflict with pyxl (, things go horribly wrong when the codec
isn't installed and registered, other tools would be confused by the Python
3 syntax in Python 2 source code, and because of the way the codec was
implemented the Python interpreter would occasionally spit out confusing
error messages showing the codec's output (which is pretty bare-bones).

- While there are existing conventions for specifying types in docstrings,
we haven't been using any of these conventions (at least not consistently,
nor at an appreciable scale), and they are much more verbose if all you
want is adding argument annotations. We're working on a tool that
automatically adds type annotations[1], and such a tool would be
complicated by the need to integrate the generated annotations into
existing docstrings (which, in our code base, unfortunately are wildly
incongruous in their conventions).

- Finally, the proposed comment syntax is easy to mechanically translate
into standard Python 3 function annotations once we're ready to let go of
Python 2.7.

[1] I have a prototype of such a tool, implemented as a 2to3 fixer. It's a
bit over 200 lines. It's not very interesting yet, since it sets the types
of nearly all arguments to 'Any'. We're considering building a much more
advanced version that tries to guess much better argument types using some
form of whole-program analysis. I've heard that Facebook's Hack project got
a lot of mileage out of such a tool. I don't yet know how to write it
-- possibly we could use a variant of mypy's type inference engine, or
alternatively we might be able to use something like Jedi (

--Guido van Rossum (

From rosuav at  Fri Jan  8 20:00:46 2016
From: rosuav at (Chris Angelico)
Date: Sat, 9 Jan 2016 12:00:46 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 8:27 AM, Victor Stinner <victor.stinner at> wrote:
> Here is a first PEP, part of a series of 3 PEPs to add an API to
> implement a static Python optimizer specializing functions with
> guards.

Are you intending for these features to become part of the Python core
language, or are you discussing this as something that your alternate
implementation will do? If the former, send your PEP drafts to
peps at and we can get them assigned numbers; if the latter,
is there some specific subset of this which *is* for the language
core? (For example, MyPy has type checking, but PEP 484 isn't
proposing to include that in the core; all it asks is for a
'' to allow the code to run unchanged.)


From victor.stinner at  Fri Jan  8 20:09:39 2016
From: victor.stinner at (Victor Stinner)
Date: Sat, 9 Jan 2016 02:09:39 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-09 2:00 GMT+01:00 Chris Angelico <rosuav at>:
> Are you intending for these features to become part of the Python core
> language

Yes.


> If the former, send your PEP drafts to
> peps at and we can get them assigned numbers

My plan is to start a first round of discussion on python-ideas, then
get a PEP number for my PEPs before moving the discussion to
python-dev.


From rosuav at  Fri Jan  8 20:38:02 2016
From: rosuav at (Chris Angelico)
Date: Sat, 9 Jan 2016 12:38:02 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 12:09 PM, Victor Stinner
<victor.stinner at> wrote:
> 2016-01-09 2:00 GMT+01:00 Chris Angelico <rosuav at>:
>> Are you intending for these features to become part of the Python core
>> language
> Yes.
>> If the former, send your PEP drafts to
>> peps at and we can get them assigned numbers
> My plan is to start a first round of discussion on python-ideas, then
> get a PEP number for my PEPs before moving the discussion to
> python-dev.

The discussion on python-ideas can benefit from PEP numbers too,
particularly since you're putting three separate proposals up. ("Wait,
I know I saw a comment about that. Oh right, that was in PEP 142857,
not 142856.") But it's up to you.


From rosuav at  Fri Jan  8 20:42:49 2016
From: rosuav at (Chris Angelico)
Date: Sat, 9 Jan 2016 12:42:49 +1100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 8:31 AM, Victor Stinner <victor.stinner at> wrote:
> When a function is serialized (by ``marshal`` or ``pickle`` for
> example), specialized functions and guards are ignored (not serialized).

Does this mean that any code imported from a .pyc file cannot take
advantage of these kinds of optimizations?


From victor.stinner at  Fri Jan  8 20:59:23 2016
From: victor.stinner at (Victor Stinner)
Date: Sat, 9 Jan 2016 02:59:23 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-09 2:42 GMT+01:00 Chris Angelico <rosuav at>:
> On Sat, Jan 9, 2016 at 8:31 AM, Victor Stinner <victor.stinner at> wrote:
>> When a function is serialized (by ``marshal`` or ``pickle`` for
>> example), specialized functions and guards are ignored (not serialized).
> Does this mean that any code imported from a .pyc file cannot take
> advantage of these kinds of optimizations?

Ah yes, this sentence is confusing. It should not mention marshal, it's wrong.

A .pyc file doesn't contain functions... It only contains code
objects. Functions are only created at runtime.

Specialized functions are also added at runtime.
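
The distinction between code objects and functions can be seen with the
standard ``marshal`` and ``types`` modules. This is only an illustration of the
point above; the function ``greet`` is a made-up example.

```python
import marshal
import types


def greet(name):
    return "hello " + name


# A code object can be marshalled, which is what a .pyc file stores.
payload = marshal.dumps(greet.__code__)
code = marshal.loads(payload)

# A function is built around the code object at runtime.
rebuilt = types.FunctionType(code, globals())
print(rebuilt("world"))

# The function object itself is not something marshal can serialize.
try:
    marshal.dumps(greet)
except ValueError:
    function_is_marshallable = False
else:
    function_is_marshallable = True
print(function_is_marshallable)
```

Anything attached to the function object, as opposed to its code object,
therefore has to be recreated after import.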


From victor.stinner at  Fri Jan  8 21:01:39 2016
From: victor.stinner at (Victor Stinner)
Date: Sat, 9 Jan 2016 03:01:39 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-09 2:38 GMT+01:00 Chris Angelico <rosuav at>:
> The discussion on python-ideas can benefit from PEP numbers too,
> particularly since you're putting three separate proposals up. ("Wait,
> I know I saw a comment about that. Oh right, that was in PEP 142857,
> not 142856.") But it's up to you.

Hmm, I forgot to mention that I'm not 100% sure yet that I correctly
split my work on the FAT Python project into the right number of
PEPs. Maybe we could merge two PEPs, or a PEP should be split into
sub-PEPs because it requires too many changes (I'm thinking of the
third PEP, not published yet, it's still a "private" draft).


From steve at  Fri Jan  8 21:08:10 2016
From: steve at (Steven D'Aprano)
Date: Sat, 9 Jan 2016 13:08:10 +1100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 08, 2016 at 03:04:58PM -0800, Guido van Rossum wrote:

> Since Python 2 doesn't support function annotations we've had to look for
> alternatives. We considered stub files, a magic codec, docstrings, and
> additional `# type:` comments. In the end we decided that `# type:`
> comments are the most robust approach. We've experimented a fair amount
> with this and we have a proposal for a standard.

> - Stub files: this would complicate the analysis in mypy quite a bit,
> because it would have to parse both the .py file and the .pyi file and
> somehow combine the information gathered from both, and for each function
> it would have to use the types from the stub file to type-check the body of
> the function in the .py file. This would require a lot of additional
> plumbing. And if we were using Python 3 we would want to use in-line
> annotations anyway.

I don't understand this paragraph. Doesn't mypy (and any other type 
checker) have to support stub files? I thought that stub files are 
needed for extension files, among other things. So I would have expected 
that any Python 2 type checker would have to support stub files as well, 
regardless of whether inline #type comments are introduced or not.

Will Python 3 type checkers be expected to support #type comments as 
well as annotations and stub files?


From ncoghlan at  Fri Jan  8 23:47:33 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Jan 2016 14:47:33 +1000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 January 2016 at 12:01, Victor Stinner <victor.stinner at> wrote:
> 2016-01-09 2:38 GMT+01:00 Chris Angelico <rosuav at>:
>> The discussion on python-ideas can benefit from PEP numbers too,
>> particularly since you're putting three separate proposals up. ("Wait,
>> I know I saw a comment about that. Oh right, that was in PEP 142857,
>> not 142856.") But it's up to you.
> Hmm, I forgot to mention that I'm not 100% sure yet that I correctly
> split my work on the FAT Python project into the right number of
> PEPs. Maybe we could merge two PEPs, or a PEP should be split into
> sub-PEPs because it requires too many changes (I'm thinking of the
> third PEP, not published yet, it's still a "private" draft).

The first two proposals you've posted make sense to consider as
standalone changes, so it seems reasonable to assign them PEP numbers
now rather than waiting.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Jan  9 00:22:04 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Jan 2016 15:22:04 +1000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 January 2016 at 12:08, Steven D'Aprano <steve at> wrote:
> On Fri, Jan 08, 2016 at 03:04:58PM -0800, Guido van Rossum wrote:
> [...]
>> Since Python 2 doesn't support function annotations we've had to look for
>> alternatives. We considered stub files, a magic codec, docstrings, and
>> additional `# type:` comments. In the end we decided that `# type:`
>> comments are the most robust approach. We've experimented a fair amount
>> with this and we have a proposal for a standard.
> [...]
>> - Stub files: this would complicate the analysis in mypy quite a bit,
>> because it would have to parse both the .py file and the .pyi file and
>> somehow combine the information gathered from both, and for each function
>> it would have to use the types from the stub file to type-check the body of
>> the function in the .py file. This would require a lot of additional
>> plumbing. And if we were using Python 3 we would want to use in-line
>> annotations anyway.
> I don't understand this paragraph. Doesn't mypy (and any other type
> checker) have to support stub files? I thought that stub files are
> needed for extension files, among other things. So I would have expected
> that any Python 2 type checker would have to support stub files as well,
> regardless of whether inline #type comments are introduced or not.

Stub files are easy to use if you're using them *instead of* the
original source file (e.g. annotating extension modules, or typeshed
annotations for the standard library). Checking a stub file for
consistency against the published API of the corresponding module also
seems like it would be straightforward (while using both a stub file
*and* inline annotations for the same API seems like it would be a bad
idea, it's at least necessary to check that the *shape* of the API
matches, even if there's no type information).

However, if I'm understanding correctly, the problem Guido is talking
about here is a different one: analysing a function *implementation*
to ensure it is consistent with its own published API. That's
relatively straightforward with inline annotations (whether function
annotation based, comment based, or docstring based), but trickier if
you have to pause the analysis, go look for the right stub file, load
it, determine the expected public API, and then resume the analysis of
the original function.

The other downside of the stub file approach is the same reason it's
not the preferred approach in Python 3: you can't see the annotations
yourself when you're working on the function.

Folks working mostly on solo and small team projects may not see the
appeal of that, but when doing maintenance on large unfamiliar code
bases, the improved local reasoning those kinds of inline notes help
support can be very helpful.

> Will Python 3 type checkers be expected to support #type comments as
> well as annotations and stub files?

#type comment support is already required for variables and

That requirement for type checkers to support comment based type hints
would remain, even if we were to later add native syntactic support
for variable and attribute typing.

I read Guido's proposal here as offering something similar for
function annotations, only going in the other direction: providing a
variant spelling for function type hinting that can be used in single
source Python 2/3 code bases that can't use function annotations.

I don't have a strong opinion on the specifics, but am +1 on the
general idea - I think the approach Dropbox are pursuing of adopting
static type analysis first, and then migrating to Python 3 (or at
least single source Python 2/3 support) second is going to prove to be
a popular one, as it allows you to detect a lot of potential migration
issues without necessarily having to be able to exercise those code
paths in a test running under Python 3.

The 3 kinds of annotation would then have 3 clear function level use cases:

stub files: annotating third party libraries (e.g. for typeshed)
#type comments: annotating single source Python 2/3 code
function annotations: annotating Python 3 code


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From storchaka at  Sat Jan  9 01:03:12 2016
From: storchaka at (Serhiy Storchaka)
Date: Sat, 9 Jan 2016 08:03:12 +0200
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <n6q7r1$2n9$>

On 08.01.16 23:27, Victor Stinner wrote:
> Add a new read-only ``__version__`` property to ``dict`` and
> ``collections.UserDict`` types, incremented at each change.

This may not be the best name for a property. Many modules already have
a __version__ attribute, so this may cause confusion.

> The C code uses ``version++``. The behaviour on integer overflow of the
> version is undefined. The minimum guarantee is that the version always
> changes when the dictionary is modified.

For clarification, this code has defined behavior in C (we should avoid
introducing new undefined behaviors). Maybe you mean that the behavior
is not specified from the Python side (since it is platform and
implementation defined).

> Usage of dict.__version__
> =========================

This can also be used to better detect dict mutation during iteration.

From ncoghlan at  Sat Jan  9 02:43:41 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Jan 2016 17:43:41 +1000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <n6q7r1$2n9$>
References: <>
Message-ID: <>

On 9 January 2016 at 16:03, Serhiy Storchaka <storchaka at> wrote:
> On 08.01.16 23:27, Victor Stinner wrote:
>> Add a new read-only ``__version__`` property to ``dict`` and
>> ``collections.UserDict`` types, incremented at each change.
> This may not be the best name for a property. Many modules already have a
> __version__ attribute, so this may cause confusion.

The equivalent API for the global ABC object graph is

One of the reasons we chose that name is that even though it's a
number, the only operation with semantic significance is equality
testing, with the intended use case being cache invalidation when the
token changes value.

If we followed the same reasoning for Victor's proposal, then a
suitable attribute name would be "__cache_token__".
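
For reference, the standard library's `abc.get_cache_token()` behaves the way
described above: the token is only meaningful for equality comparison and
changes when a virtual subclass is registered. The classes `Shape` and
`Square` below are made up for this sketch.

```python
import abc


class Shape(abc.ABC):
    pass


class Square:
    pass


# Remember the token at the time a cache would be filled.
token_before = abc.get_cache_token()

# Registering a virtual subclass changes the ABC object graph.
Shape.register(Square)
token_after = abc.get_cache_token()

# Equality is the only meaningful test: a different token means "recompute".
cache_is_stale = token_before != token_after
print(cache_is_stale)
```

A cache keyed on subclass checks would store the token alongside its entries
and discard them once the comparison fails.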

>> The C code uses ``version++``. The behaviour on integer overflow of the
>> version is undefined. The minimum guarantee is that the version always
>> changes when the dictionary is modified.
> For clarification, this code has defined behavior in C (we should avoid
> introducing new undefined behaviors). Maybe you mean that the behavior is
> not specified from the Python side (since it is platform and implementation
> defined).

At least in recent versions of the standard*, overflow is defined on
unsigned types as wrapping modulo-N. It only remains formally
undefined for signed types.

*(I'm not sure about C89, but with MSVC getting their standards
compliance act together, we could potentially update our minimum C
version expectation in PEP 7 to C99 or even C11).
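
The wrapping behaviour of an unsigned counter can be observed from Python with
`ctypes`, which performs no overflow checking on its integer types. This is
only an illustration of modulo-N wrapping; the helper `next_version` is
hypothetical and is not taken from the patch.

```python
import ctypes

# Largest value a size_t can hold on this platform.
SIZE_MAX = ctypes.c_size_t(-1).value

# Storing SIZE_MAX + 1 in a size_t wraps around modulo 2**N.
wrapped = ctypes.c_size_t(SIZE_MAX + 1).value


def next_version(version):
    """Model of `version++` on an unsigned size_t counter."""
    return (version + 1) % (SIZE_MAX + 1)


print(wrapped, next_version(SIZE_MAX))
```

With wrapping, two successive versions always differ, which is the minimum
guarantee quoted above.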

>> Usage of dict.__version__
>> =========================
> This also can be used for better detecting dict mutating during iterating:

I initially thought the same thing, but the cache token will be
updated even if the keys all stay the same, and one of the values is
modified, while the mutation-during-iteration check is aimed at
detecting changes to the keys, rather than the values.
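
The existing check can be seen with a plain dict: rebinding values during
iteration is accepted, while adding a key is detected. A short sketch of the
current behaviour, with made-up keys:

```python
d = {"a": 1, "b": 2}

# Rebinding values for existing keys while iterating is allowed.
for key in d:
    d[key] += 10

# Adding a key changes the size, which the iterator detects on its next step.
error = None
try:
    for key in d:
        d[key + "_copy"] = d[key]
except RuntimeError as exc:
    error = str(exc)

print(d["a"], d["b"], error)
```

A counter that changes on every value update would flag the first loop too,
which is the mismatch described above.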


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From victor.stinner at  Sat Jan  9 02:57:26 2016
From: victor.stinner at (Victor Stinner)
Date: Sat, 9 Jan 2016 08:57:26 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <n6q7r1$2n9$>
References: <>
Message-ID: <>

On Saturday, 9 January 2016, Serhiy Storchaka <storchaka at> wrote:

> On 08.01.16 23:27, Victor Stinner wrote:
>> Add a new read-only ``__version__`` property to ``dict`` and
>> ``collections.UserDict`` types, incremented at each change.
> This may not be the best name for a property. Many modules already have
> a __version__ attribute, so this may cause confusion.

It's fine to have a __version__ property and a __version__ key in the same
dict. They are different. For a module, it's something like:

With moddict = globals():

- moddict.__version__ is the dict version
- moddict['__version__'] is the module version

Using the same name for different things is not new in Python. An example
still in the module namespace:

- moddict.__class__.__name__ is the dict class name
- moddict['__name__'] is the module name (or '__main__')
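
The second pair can be checked directly today. In this sketch the `json`
module is an arbitrary choice for the namespace; any imported module would do.

```python
import json

# The namespace dict of an imported module.
moddict = vars(json)

# An attribute of the dict's class versus a key stored in the dict.
dict_class_name = moddict.__class__.__name__
module_name = moddict["__name__"]

# The dict object itself has no __name__ attribute, so the two never collide.
dict_has_name_attribute = hasattr(moddict, "__name__")

print(dict_class_name, module_name, dict_has_name_attribute)
```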

"Version" is really my favorite name for the name feature. Sometimes I saw
"timestamp", but in my opinion it's more confusing because it's not related
to a clock.

> The C code uses ``version++``. The behaviour on integer overflow of the
>> version is undefined. The minimum guarantee is that the version always
>> changes when the dictionary is modified.
> For clarification, this code has defined behavior in C (we should avoid
> introducing new undefined behaviors). Maybe you mean that the behavior is
> not specified from the Python side (since it is platform and implementation
> defined).

The C type for version is unsigned (size_t). I hope that version++ is
defined but I was too lazy to check C specs for that :-) Does it wrap to 0
on overflow on all architectures (supported by Python)?

If not, it's easy to wrap manually:

version = (version==size_max) ? 0 : version+1;

> Usage of dict.__version__
>> =========================
> This also can be used for better detecting dict mutating during iterating:

Oh, cool.


From storchaka at  Sat Jan  9 03:55:29 2016
From: storchaka at (Serhiy Storchaka)
Date: Sat, 9 Jan 2016 10:55:29 +0200
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <n6qhu1$2h2$>

On 09.01.16 09:57, Victor Stinner wrote:
> On Saturday, 9 January 2016, Serhiy Storchaka wrote:
>     This may not be the best name for a property. Many modules already
>     have a __version__ attribute, so this may cause confusion.
> It's fine to have a __version__ property and a __version__ key in the
> same dict. They are different.

Oh, I meant not a confusion between a property and a key, but between
properties of two related objects. Perhaps someday we'll want to add
a property with the same meaning directly to the module object, but the
name is already in use there.

> "Version" is really my favorite name for the name feature. Sometimes I
> saw "timestamp", but in my opinion it's more confusing because it's not
> related to a clock.

Nick's "__cache_token__" LGTM.

From storchaka at  Sat Jan  9 03:57:42 2016
From: storchaka at (Serhiy Storchaka)
Date: Sat, 9 Jan 2016 10:57:42 +0200
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <n6qi26$2h2$>

On 09.01.16 09:43, Nick Coghlan wrote:
> If we followed the same reasoning for Victor's proposal, then a
> suitable attribute name would be "__cache_token__".


>> This also can be used for better detecting dict mutating during iterating:
> I initially thought the same thing, but the cache token will be
> updated even if the keys all stay the same, and one of the values is
> modified, while the mutation-during-iteration check is aimed at
> detecting changes to the keys, rather than the values.

This makes Raymond's objections even stronger.

From ncoghlan at  Sat Jan  9 04:08:54 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Jan 2016 19:08:54 +1000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <n6qhu1$2h2$>
References: <>
Message-ID: <>

On 9 January 2016 at 18:55, Serhiy Storchaka <storchaka at> wrote:
> On 09.01.16 09:57, Victor Stinner wrote:
>> On Saturday, 9 January 2016, Serhiy Storchaka wrote:
>>     This may not be the best name for a property. Many modules already
>>     have a __version__ attribute, so this may cause confusion.
>> It's fine to have a __version__ property and a __version__ key in the
>> same dict. They are different.
> Oh, I meant not a confusion between a property and a key, but between
> properties of two related objects. Perhaps someday we'll want to add a
> property with the same meaning directly to the module object, but the
> name is already in use there.

The confusion I was referring to was yet a third variant of possible
confusion: when people read "version", they're inevitably going to
think "module version" or "package version" (since dealing with those
kinds of versions is a day to day programming activity, regardless of
domain), not "cache validity token" (as "version" in that sense is a
technical term of art most programmers won't have encountered before).

Yes, technically, "version" and "cache validity token" refer to the
same thing in the context of data versioning, but the latter
emphasises what the additional piece of information is primarily *for*
in practical terms (checking if your caches are still valid), rather
than what it *is* in formal terms (the current version of the stored


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From victor.stinner at  Sat Jan  9 04:18:20 2016
From: victor.stinner at (Victor Stinner)
Date: Sat, 9 Jan 2016 10:18:20 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

Hi Nick,

2016-01-09 8:43 GMT+01:00 Nick Coghlan <ncoghlan at>:
>> For clarification, this code has defined behavior in C (we should avoid
>> introducing new undefined behaviors). Maybe you mean that the behavior is
>> not specified from the Python side (since it is platform and implementation
>> defined).
> At least in recent versions of the standard*, overflow is defined on
> unsigned types as wrapping modulo-N. It only remains formally
> undefined for signed types.
> *(I'm not sure about C89, but with MSVC getting their standards
> compliance act together, we could potentially update our minimum C
> version expectation in PEP 7 to C99 or even C11).


>>> Usage of dict.__version__
>>> =========================
>> This also can be used for better detecting dict mutating during iterating:
> I initially thought the same thing, but the cache token will be
> updated even if the keys all stay the same, and one of the values is
> modified, while the mutation-during-iteration check is aimed at
> detecting changes to the keys, rather than the values.

Serhiy's unit test ensures that creating a new key and deleting a key
during an iteration is detected as a dict mutation, even if the dict
size doesn't change. This use case works well with dict.__version__.
Any __setitem__() changes the version (except if the key already
exists and the value is exactly the same, id(old_value) ==
id(new_value)). Example:

>>> d={1 :1}
>>> len(d)
>>> d.__version__, len(d)
(1, 1)
>>> d[2]=2
>>> del d[1]
>>> d.__version__, len(d)
(3, 1)

Changing the value can be detected as well during iteration using
dict.__version__:

>>> d={1:1}
>>> d.__version__, len(d)
(1, 1)
>>> d[1]=2
>>> d.__version__, len(d)
(2, 1)

It would be nice to detect key mutation while iterating on
dict.keys(), but it would also be nice to detect value mutation
while iterating on dict.values() and dict.items(). No?
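
The counting rule described above can be approximated in pure Python. This is
a sketch only: ``VersionedDict`` and its ``version`` attribute are hypothetical
stand-ins for the proposed ``dict.__version__``, and only ``__setitem__`` and
``__delitem__`` are covered (``update()``, ``pop()``, ``clear()`` and friends
are ignored here).

```python
class VersionedDict(dict):
    """Hypothetical stand-in for a dict with the proposed version counter."""

    _MISSING = object()

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.version = 0
        for key, value in dict(*args, **kwargs).items():
            self[key] = value  # each initial insertion counts as one change

    def __setitem__(self, key, value):
        # Storing the very same object under an existing key is not a change.
        if self.get(key, self._MISSING) is not value:
            self.version += 1
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)  # a KeyError leaves the counter untouched
        self.version += 1


d = VersionedDict({1: 1})
snapshot = d.version

d[2] = 2      # new key: the counter moves
del d[1]      # deleted key: the counter moves again, size is back to 1

# The size is unchanged, but comparing versions still reveals the mutation.
mutated = d.version != snapshot
print(d.version, len(d), mutated)
```

An iterator or a guard would keep ``snapshot`` and compare it against the
current counter before trusting anything it cached about the dict.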


From pavol.lisy at  Sat Jan  9 04:54:11 2016
From: pavol.lisy at (Pavol Lisy)
Date: Sat, 9 Jan 2016 10:54:11 +0100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-09 0:04 GMT+01:00, Guido van Rossum <guido at>:
> [...]
> The proposal is very simple. Consider the following function with Python 3
> annotations:
>     def embezzle(self, account: str, funds: int = 1000000, *fake_receipts: str) -> None:
>         """Embezzle funds from account using fake receipts."""
>         <code goes here>
> An equivalent way to write this in Python 2 is the following:
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         # type: (str, int, *str) -> None
>         """Embezzle funds from account using fake receipts."""
>         <code goes here>
> [...]

Wouldn't something like this

    def embezzle(self, account, funds=1000000, *fake_receipts):
        # def embezzle(self, account: str, funds: int = 1000000, *fake_receipts: str) -> None:
        """Embezzle funds from account using fake receipts."""
        <code goes here>

make

1. the transition from Python 2 to Python 3 simpler?
2. Python 3 checkers easier to adapt to the new Python 2 standard?
3. the impact on the documentation about annotations smaller (and so the
knowledge base to be learned simpler)?

From victor.stinner at  Sat Jan  9 04:58:00 2016
From: victor.stinner at (Victor Stinner)
Date: Sat, 9 Jan 2016 10:58:00 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <n6qi26$2h2$>
References: <>
Message-ID: <>

2016-01-09 9:57 GMT+01:00 Serhiy Storchaka <storchaka at>:
>>> This also can be used for better detecting dict mutating during
>>> iterating:
> (...)
> This makes Raymond's objections even more strong.

Raymond has two major objections: memory footprint and performance. I
opened an issue with a patch implementing dict.__version__ and I ran
pybench.

pybench doesn't seem reliable: microbenchmarks on dict seem faster
with the patch, which doesn't make sense. I expect worse or the same
performance.

With my own timeit microbenchmarks, I don't see any slowdown with the
patch. For an unknown reason (it's really strange), dict operations
seem even faster with the patch.

For the memory footprint, it's clearly stated in the PEP that it adds
8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype"
section which explains why I proposed to modify directly the dict
type.

IMHO adding 8 bytes per dict is worth it. See for example
microbenchmarks on func.specialize() which rely on dict.__version__ to
implement efficient guards on namespaces:

Being 1.6 times as fast (155 ns => 95 ns) is better than the few percent
usually gained when optimizing dict operations.
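
To illustrate the guard idea, here is a minimal pure-Python sketch. It
assumes a hypothetical VersionedDict subclass that bumps a counter in
__setitem__ and __delitem__ only; the PEP proposes tracking this inside
the C dict type, and GuardedLookup is a made-up name, not the API of
the patch or of func.specialize().

```python
class VersionedDict(dict):
    """Hypothetical stand-in for a dict that exposes a version counter.

    Only __setitem__ and __delitem__ are tracked here; a real
    implementation would also have to cover update(), pop(), clear(), etc.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


class GuardedLookup:
    """Cache namespace[name] and reuse it while the version is unchanged."""

    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name
        self.cached_version = None
        self.cached_value = None
        self.slow_lookups = 0

    def get(self):
        if self.cached_version != self.namespace.version:
            # Guard failed: do the real lookup and remember the version.
            self.cached_value = self.namespace[self.name]
            self.cached_version = self.namespace.version
            self.slow_lookups += 1
        return self.cached_value


namespace = VersionedDict(len=len)
lookup = GuardedLookup(namespace, "len")
first = lookup.get()                # real lookup
second = lookup.get()               # served from the cache
namespace["len"] = lambda obj: 42   # mutation bumps the version
third = lookup.get()                # guard notices and looks up again
```

The guard costs one integer comparison per call, which is the kind of
check the version counter is meant to make cheap.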


From victor.stinner at  Sat Jan  9 05:16:38 2016
From: victor.stinner at (Victor Stinner)
Date: Sat, 9 Jan 2016 11:16:38 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-09 5:47 GMT+01:00 Nick Coghlan <ncoghlan at>:
> The first two proposals you've posted make sense to consider as
> standalone changes, so it seems reasonable to assign them PEP numbers
> now rather than waiting.

Ok fine, I requested 3 numbers for my first draft PEPs.


From mal at  Sat Jan  9 07:09:13 2016
From: mal at (M.-A. Lemburg)
Date: Sat, 9 Jan 2016 13:09:13 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On 09.01.2016 10:58, Victor Stinner wrote:
> 2016-01-09 9:57 GMT+01:00 Serhiy Storchaka <storchaka at>:
>>>> This also can be used for better detecting dict mutating during
>>>> iterating:
>> (...)
>> This makes Raymond's objections even more strong.
> Raymond has two major objections: memory footprint and performance. I
> opened an issue with a patch implementing dict.__version__ and I ran
> pybench:
> pybench doesn't seem reliable: microbenchmarks on dict seems faster
> with the patch, it doesn't make sense. I expect worse or same
> performance.
> With my own timeit microbenchmarks, I don't see any slowdown with the
> patch. For an unknown reason (it's really strange), dict operations
> seem even faster with the patch.

This can well be caused by a better memory alignment, which
depends on the CPU you're using.

> For the memory footprint, it's clearly stated in the PEP that it adds
> 8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype"
> section which explains why I proposed to modify directly the dict
> type.

Some questions:

* How would the implementation deal with wrap around of the
  version number for fast changing dicts (esp. on 32-bit platforms) ?

* Given that this is an optimization and not meant to be exact
  science, why would we need 64 bits worth of version information ?

  AFAIK, you only need the version information to be able to
  answer the question "did anything change compared to last time
  I looked ?".

  For an optimization it's good enough to get an answer "yes"
  for slow changing dicts and "no" for all other cases. False
  negatives don't really hurt. False positives are not allowed.

  What you'd need to answer the question is a way for the
  code in need of the information to remember the dict
  state and then later compare its remembered state
  with the now current state of the dict.

  dicts could do this with a 16-bit index into an array
  of state object slots which are set by the code tracking
  the dict.

  When it's time to check, the code would simply ask for the
  current index value and compare the state object in the
  array with the one it had set.

* Wouldn't it be possible to use the hash array itself to
  store the state index ?

  We could store the state object as regular key in the
  dict and filter this out when accessing the dict.

  Alternatively, we could try to use the free slots for
  storing these state objects by e.g. declaring a free
  slot as being NULL or a pointer to a state object.

Marc-Andre Lemburg

Professional Python Services directly from the Experts (#1, Jan 09 2016)
>>> Python Projects, Coaching and Consulting ...
>>> Python Database Interfaces ... 
>>> Plone/Zope Database Interfaces ... 

::: We implement business ideas - efficiently in both time and costs ::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From mal at  Sat Jan  9 07:48:14 2016
From: mal at (M.-A. Lemburg)
Date: Sat, 9 Jan 2016 13:48:14 +0100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On 09.01.2016 00:04, Guido van Rossum wrote:
> Since Python 2 doesn't support function annotations we've had to look for
> alternatives. We considered stub files, a magic codec, docstrings, and
> additional `# type:` comments. In the end we decided that `# type:`
> comments are the most robust approach. We've experimented a fair amount
> with this and we have a proposal for a standard.
> The proposal is very simple. Consider the following function with Python 3
> annotations:
>     def embezzle(self, account: str, funds: int = 1000000, *fake_receipts:
> str) -> None:
>         """Embezzle funds from account using fake receipts."""
>         <code goes here>
> An equivalent way to write this in Python 2 is the following:
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         # type: (str, int, *str) -> None
>         """Embezzle funds from account using fake receipts."""
>         <code goes here>

By using comments, the annotations would not be available at
runtime via an .__annotations__ attribute and every tool would
have to implement a parser for extracting them.

Wouldn't it be better and more in line with standard Python
syntax to use decorators to define them ?

    @typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        <code goes here>

This would work in Python 2 as well and could (optionally)
add an .__annotations__ attribute to the function/method,
automatically create a type annotations file upon import,
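
A minimal sketch of what such a decorator could look like, assuming it
only records the signature string. The name typehint comes from the
example above; the __typehint__ attribute and the Bank class are
invented for illustration, and nothing here parses the string or fills
in .__annotations__.

```python
def typehint(signature):
    """Hypothetical decorator: attach a type signature string to a function."""
    def decorator(func):
        # Store the raw string; a fuller version could parse it and
        # populate func.__annotations__ instead.
        func.__typehint__ = signature
        return func
    return decorator


class Bank(object):
    @typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        return None


recorded = Bank.embezzle.__typehint__
```

Because the decorator is ordinary call syntax, the same source should
run unchanged on Python 2.7 and Python 3.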

Marc-Andre Lemburg


From ncoghlan at  Sat Jan  9 08:12:49 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 9 Jan 2016 23:12:49 +1000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On 9 January 2016 at 19:18, Victor Stinner <victor.stinner at> wrote:
> It would be nice to detect keys mutation while iterating on
> dict.keys(), but it would also be nice to detect values mutation
> while iterating on dict.values() and dict.items(). No?

No, because mutating values as you go while iterating over a
dictionary is perfectly legal:

>>> data = dict.fromkeys(range(5))
>>> for k in data:
...     data[k] = k
>>> for k, v in data.items():
...     data[k] = v ** 2
>>> data
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

It's only changing the key in the dict that's problematic, as that's
the one that can affect the iteration order, regardless of whether
you're emitting keys, values, or both.
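
As a minimal sketch of that distinction, assuming CPython's usual
behaviour: rebinding values for existing keys is accepted, while adding
a key during iteration is detected as a size change and raises
RuntimeError.

```python
data = dict.fromkeys(range(5))

# Rebinding values for existing keys is fine: the key set never changes.
for k in data:
    data[k] = k * 10

# Adding a key while iterating changes the dict's size, which CPython
# detects on the next step of the iteration.
error = None
try:
    for k in data:
        data[k + 100] = None
except RuntimeError as exc:
    error = exc
```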

Raymond did mention that when closing the issue, but it was as an
aside in one of his bullet points, rather than as a full example.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From steve at  Sat Jan  9 08:21:10 2016
From: steve at (Steven D'Aprano)
Date: Sun, 10 Jan 2016 00:21:10 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 09, 2016 at 01:09:13PM +0100, M.-A. Lemburg wrote:

> * Given that this is an optimization and not meant to be exact
>   science, why would we need 64 bits worth of version information ?
>   AFAIK, you only need the version information to be able to
>   answer the question "did anything change compared to last time
>   I looked ?".
>   For an optimization it's good enough to get an answer "yes"
>   for slow changing dicts and "no" for all other cases.

I don't understand this. The question has nothing to do with 
how quickly or slowly the dict has changed, but only on whether or not 
it actually has changed. Maybe your dict has been stable for three 
hours, except for one change; or it changes a thousand times a second. 
Either way, it has still changed.

>   False
>   negatives don't really hurt. False positives are not allowed.

I think you have this backwards. False negatives potentially will 
introduce horrible bugs. A false negative means that you fail to notice 
when the dict has changed, when it actually has. ("Has the dict 
changed?" "No.") The result of that will be to apply the optimization 
when you shouldn't, and that is potentially catastrophic (the entirely 
wrong function is mysteriously called).

A false positive means you wrongly think the dict has changed when it 
hasn't. ("Has the dict changed?" "Yes.") That's still bad, because you 
miss out on the possibility of applying the optimization when you 
actually could have, but it's not so bad. So false positives (wrongly 
thinking the dict has changed when it hasn't) can be permitted, but 
false negatives shouldn't be.

>   What you'd need to answer the question is a way for the
>   code in need of the information to remember the dict
>   state and then later compare its remembered state
>   with the now current state of the dict.
>   dicts could do this with a 16-bit index into an array
>   of state object slots which are set by the code tracking
>   the dict.
>   When it's time to check, the code would simply ask for the
>   current index value and compare the state object in the
>   array with the one it had set.

If I've understood that correctly, and I may not have, that will only 
detect (some?) insertions and deletions to the dict, but fail to 
detect when an existing key has a new value bound.


From victor.stinner at  Sat Jan  9 08:24:07 2016
From: victor.stinner at (Victor Stinner)
Date: Sat, 9 Jan 2016 14:24:07 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-09 13:09 GMT+01:00 M.-A. Lemburg <mal at>:
> * How would the implementation deal with wrap around of the
>   version number for fast changing dicts (esp. on 32-bit platforms) ?

Let me try to do some maths.

haypo at selma$ python3 -m timeit 'd={}' 'for i in range(2**16): d[i]=i'
100 loops, best of 3: 7.01 msec per loop

haypo at selma$ python3
Python 3.4.3 (default, Jun 29 2015, 12:16:01)
>>> t=7.01e-3 / 2**16
>>> t*1e9

It looks like __setitem__() takes 107 ns on average. I guess that the
number depends a lot on the dictionary size, the number of required
resizes (rehashing all keys), etc. But well, it's just to have an

>>> print(datetime.timedelta(seconds=2**32 * t))

With a 32-bit version, less than 8 minutes are enough to hit the
integer overflow if each dict operation changes the dict version and
you modify a dict in a loop.

>>> print(2016 + datetime.timedelta(seconds=2**64 * t) / datetime.timedelta(days=365.25))

With a 64-bit version, the situation is very different: the next
overflow will not occur before the year 64 541 :-)
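
The arithmetic above can be reproduced with the short sketch below. It
takes the 7.01 msec timeit figure as given and assumes every dict
operation bumps the version; the variable names are only illustrative.

```python
import datetime

# Seconds per "d[i] = i", from the timeit run: 7.01 msec for 2**16 stores.
per_op = 7.01e-3 / 2**16
ns_per_op = per_op * 1e9          # roughly 107 ns

# Time needed to exhaust a 32-bit counter at that rate.
wrap_32 = datetime.timedelta(seconds=2**32 * per_op)

# Time needed to exhaust a 64-bit counter, expressed in years.
seconds_per_year = 365.25 * 24 * 3600
wrap_64_years = 2**64 * per_op / seconds_per_year
overflow_year = 2016 + wrap_64_years

print(wrap_32)
print(int(overflow_year))
```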

Maybe it's worth using a 64-bit version on 32-bit platforms? Python
3.5 already uses a 64-bit integer on 32-bit platforms to store a
timestamp in the private "pytime" API.

A guard only has a bug on integer overflow if the new version modulo
2^32 (or modulo 2^64) is equal to the old version. The bet is also
that it's "unlikely".

> * Given that this is an optimization and not meant to be exact
>   science, why would we need 64 bits worth of version information ?

If a guard says that nothing changed when something did change, it is a
real issue for me. It means that the optimization changes the Python

>   For an optimization it's good enough to get an answer "yes"
>   for slow changing dicts and "no" for all other cases. False
>   negatives don't really hurt. False positives are not allowed.

False negative means that you lose the optimization. It would be
annoying to see server performance degrade after N days because of an
integer overflow :-/ It can be a big issue. How do you choose the
number of servers if performance is not stable?


From rosuav at  Sat Jan  9 08:32:21 2016
From: rosuav at (Chris Angelico)
Date: Sun, 10 Jan 2016 00:32:21 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 12:21 AM, Steven D'Aprano <steve at> wrote:
> On Sat, Jan 09, 2016 at 01:09:13PM +0100, M.-A. Lemburg wrote:
>> * Given that this is an optimization and not meant to be exact
>>   science, why would we need 64 bits worth of version information ?
>>   AFAIK, you only need the version information to be able to
>>   answer the question "did anything change compared to last time
>>   I looked ?".
>>   For an optimization it's good enough to get an answer "yes"
>>   for slow changing dicts and "no" for all other cases.
> I don't understand this. The question has nothing to do with
> how quickly or slowly the dict has changed, but only on whether or not
> it actually has changed. Maybe your dict has been stable for three
> hours, except for one change; or it changes a thousand times a second.
> Either way, it has still changed.
>>   False
>>   negatives don't really hurt. False positives are not allowed.
> I think you have this backwards. False negatives potentially will
> introduce horrible bugs. A false negative means that you fail to notice
> when the dict has changed, when it actually has. ("Has the dict
> changed?" "No.") The result of that will be to apply the optimization
> when you shouldn't, and that is potentially catastrophic (the entirely
> wrong function is mysteriously called).
> A false positive means you wrongly think the dict has changed when it
> hasn't. ("Has the dict changed?" "Yes.") That's still bad, because you
> miss out on the possibility of applying the optimization when you
> actually could have, but it's not so bad. So false positives (wrongly
> thinking the dict has changed when it hasn't) can be permitted, but
> false negatives shouldn't be.

I think we're getting caught in terminology a bit. The original
question was "why a 64-bit counter". Here's my take on it:

* If the dict has changed but we say it hasn't, this is a critical
failure. M-A L called this a "false positive", which works if the
question is "may we use the optimized version".

* If the dict has changed exactly N times since it was last checked,
where N is the integer wrap-around period of the counter, a naive
counter comparison will show that it has not changed.

Consequently, a small counter is more problematic than a large one. If
the counter has 2**8 states, then collisions will be frequent, and
that would be bad. If it has 2**32 states, then a slow-changing dict
will last longer than any typical run of a program (if it changes,
say, once per second, you get over a century of uptime before it's a
problem), but a fast-changing dict could run into issues (change every
millisecond and you'll run into trouble after a couple of months). A
64-bit counter could handle ridiculously fast mutation (say, every
nanosecond) for a ridiculously long time (hundreds of years).

That's the only way that fast-changing and slow-changing have any meaning.
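
A small sketch of the wrap-around problem and of the rough uptime
figures above. WrappingVersion and looks_unchanged are hypothetical
names used only for illustration; the real counter would live inside
the dict implementation.

```python
class WrappingVersion:
    """Hypothetical version counter that wraps after 2**bits changes."""

    def __init__(self, bits):
        self.mask = (1 << bits) - 1
        self.value = 0

    def bump(self):
        self.value = (self.value + 1) & self.mask


def looks_unchanged(bits, changes):
    """Naive guard: compare the remembered counter with the current one."""
    counter = WrappingVersion(bits)
    remembered = counter.value
    for _ in range(changes):
        counter.bump()
    return counter.value == remembered


# Exactly one full wrap-around: the naive comparison is fooled.
small_collision = looks_unchanged(8, 2**8)
# One change fewer: the guard correctly sees a difference.
small_detected = looks_unchanged(8, 2**8 - 1)

seconds_per_year = 365.25 * 24 * 3600
years_32bit_at_1hz = 2**32 / seconds_per_year          # one change per second
days_32bit_at_1khz = 2**32 / 1000 / 86400              # one change per millisecond
years_64bit_at_1ghz = 2**64 / 1e9 / seconds_per_year   # one change per nanosecond
```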


From victor.stinner at  Sat Jan  9 08:42:22 2016
From: victor.stinner at (Victor Stinner)
Date: Sat, 9 Jan 2016 14:42:22 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>


2016-01-09 13:48 GMT+01:00 Neil Girdhar <mistersheik at>:
> How is this not just a poorer version of PyPy's optimizations?

This is a very good question :-) There are a lot of optimizers in the
wild, mostly JIT compilers. The problem is that most of them are
specific to numerical computations, and the remaining ones are generic
but not widely used. The most advanced and complete fast
implementation of Python is obviously PyPy. I haven't heard of many
deployments with PyPy. For example, PyPy is not used to install
OpenStack (a very large project which has a large number of
dependencies). I'm not even sure that PyPy is the favorite
implementation of Python used to run Django, to give another example
of a popular Python application.

PyPy is just amazing in terms of performance, but for an unknown
reason, it hasn't replaced CPython yet. PyPy has some drawbacks: it
only supports Python 2.7 and 3.2 (CPython is at version 3.5), it
has bad performance on the C API and I heard that performance is
not as amazing as expected on some applications. PyPy also has a worse
startup time and uses more memory. IMHO the major issue of Python is
the backward compatibility on the C API.

In short, almost all users are stuck with CPython, and CPython implements
close to 0 optimizations (come on, constant folding and dead code
elimination are not what I would call an "optimization" ;-)).

My goal is to fill the hole between CPython (0 optimization) and PyPy
(the reference for best performances).

I wrote a whole website to explain the status of the Python optimizers
and why I want to write my own optimizer:

> If what you want is optimization, it would be much better to devote time to a solution
> that can potentially yield orders of magnitude worth of speedup like PyPy
> rather than increasing language complexity for a minor payoff.

I disagree that my proposed changes increase the "language
complexity". According to early benchmarks, my changes have a
negligible impact on performance. I don't see how adding a read-only
__version__ property to dict makes the Python *language* more complex?

My whole design is based on the idea that my optimizer will be
optional. You will be free to not use it ;-)

And sorry, I'm not interested in contributing to PyPy.


From mal at  Sat Jan  9 08:50:06 2016
From: mal at (M.-A. Lemburg)
Date: Sat, 9 Jan 2016 14:50:06 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 09.01.2016 14:21, Steven D'Aprano wrote:
> On Sat, Jan 09, 2016 at 01:09:13PM +0100, M.-A. Lemburg wrote:
>> * Given that this is an optimization and not meant to be exact
>>   science, why would we need 64 bits worth of version information ?
>>   AFAIK, you only need the version information to be able to
>>   answer the question "did anything change compared to last time
>>   I looked ?".
>>   For an optimization it's good enough to get an answer "yes"
>>   for slow changing dicts and "no" for all other cases.
> I don't understand this. The question has nothing to do with 
> how quickly or slowly the dict has changed, but only on whether or not 
> it actually has changed. Maybe your dict has been stable for three 
> hours, except for one change; or it changes a thousand times a second. 
> Either way, it has still changed.

I was referring to how many versions will likely have passed
since the code querying the dict last looked. Most algorithms
won't be interested in the version number itself, but simply
want to know whether the dict has changed or not.

>>   False
>>   negatives don't really hurt. False positives are not allowed.
> I think you have this backwards.

With "false negatives" I meant: the code says the dict has
changed, even though it has not. With "false positives" I meant
the code says the dict has not changed, even though it has.

But you're right: I should have used more explicit definitions :-)

> False negatives potentially will
> introduce horrible bugs. A false negative means that you fail to notice 
> when the dict has changed, when it actually has. ("Has the dict 
> changed?" "No.") The result of that will be to apply the optimization 
> when you shouldn't, and that is potentially catastrophic (the entirely 
> wrong function is mysteriously called).
> A false positive means you wrongly think the dict has changed when it 
> hasn't. ("Has the dict changed?" "Yes.") That's still bad, because you 
> miss out on the possibility of applying the optimization when you 
> actually could have, but it's not so bad. So false positives (wrongly 
> thinking the dict has changed when it hasn't) can be permitted, but 
> false negatives shouldn't be.
>>   What you'd need to answer the question is a way for the
>>   code in need of the information to remember the dict
>>   state and then later compare its remembered state
>>   with the now current state of the dict.
>>   dicts could do this with a 16-bit index into an array
>>   of state object slots which are set by the code tracking
>>   the dict.
>>   When it's time to check, the code would simply ask for the
>>   current index value and compare the state object in the
>>   array with the one it had set.
> If I've understood that correctly, and I may not have, that will only 
> detect (some?) insertions and deletions to the dict, but fail to 
> detect when an existing key has a new value bound.

This depends on how the state object is managed by the dictionary

It's currently just a rough idea. Thinking about this some
more, I guess having external code set the state object
would result in potential race conditions, so not a good
idea.

The idea is to add a level of indirection to reduce the
memory overhead, under the assumption that only few
dictionaries will actually need to report changes.

Marc-Andre Lemburg


From mistersheik at  Sat Jan  9 09:55:08 2016
From: mistersheik at (Neil Girdhar)
Date: Sat, 9 Jan 2016 09:55:08 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner <victor.stinner at> wrote:

> Hi,
> 2016-01-09 13:48 GMT+01:00 Neil Girdhar <mistersheik at>:
> > How is this not just a poorer version of PyPy's optimizations?
> This is a very good question :-) There are a lot of optimizers in the
> wild, mostly JIT compilers. The problem is that most of them are
> specific to numerical computations, and the remaining ones are generic
> but not widely used. The most advanced and complete fast
> implementation of Python is obviously PyPy. I haven't heard of many
> deployments with PyPy. For example, PyPy is not used to install
> OpenStack (a very large project which has a big number of
> dependencies). I'm not even sure that PyPy is the favorite
> implementation of Python used to run Django, to give another example
> of popular Python application.
> PyPy is just amazing in term of performances, but for an unknown
> reason, it didn't replace CPython yet. PyPy has some drawbacks: it
> only supports Python 2.7 and 3.2 (CPython is at the version 3.5), it
> has bad performances on the C API and I heard that performances are
> not as amazing as expected on some applications. PyPy has also a worse
> startup time and use more memory. IMHO the major issue of Python is
> the backward compatibility on the C API.
> In short, almost all users are stuck at CPython and CPython implements
> close to 0 optimization (come on, constant folding and dead code
> elimination is not what I would call an "optimization" ;-)).
> My goal is to fill the hole between CPython (0 optimization) and PyPy
> (the reference for best performances).
> I wrote a whole website to explain the status of the Python optimizers
> and why I want to write my own optimizer:

I think this is admirable.  I also dream of faster Python.  However, we
have a fundamental disagreement about how to get there.  You can spend your
whole life adding one or two optimizations a year and Python may only end
up twice as fast as it is now, which would still be dog slow. A meaningful
speedup requires a JIT.  So, I question the value of this kind of change.

> > If what you want is optimization, it would be much better to devote time
> > to a solution that can potentially yield orders of magnitude worth of
> > speedup like PyPy rather than increasing language complexity for a minor
> > payoff.
> I disagree that my proposed changes increase the "language
> complexity". According to early benchmarks, my changes have a
> negligible impact on performance. I don't see how adding a read-only
> __version__ property to dict makes the Python *language* more complex?
It makes it more complex because you're adding a user-facing property.
Every little property adds up in the cognitive load of a language.  It also
means that all of the other Python implementations need to follow suit even
if their optimizations work differently.

What is the point of making __version__ an exposed property?  Why can't it
be a hidden variable in CPython's underlying implementation of dict?  If
some code needs to query __version__ to see if it's changed then CPython
should be the one trying to discover this pattern and automatically
generate the right code.  Ultimately, this is just a piece of a JIT, which
is the way this is going to end up.

> My whole design is based on the idea that my optimizer will be
> optional. You will be free to not use it ;-)
> And sorry, I'm not interested in contributing to PyPy.

That's fine, but I think you are probably wasting your time then :)  The
"hole between CPython and PyPy" disappears as soon as PyPy catches up to
CPython 3.5 with numpy, and then all of this work goes with it.

> Victor

From ram at  Sat Jan  9 10:13:38 2016
From: ram at (Ram Rachum)
Date: Sat, 9 Jan 2016 17:13:38 +0200
Subject: [Python-ideas] More friendly access to chmod
Message-ID: <>

Hi everyone,

What do you think about enabling a more friendly interface to chmod
information in Python? I believe that currently if I want to get chmod
information from a file, I need to do this:

my_path.stat().st_mode & 0o777

(I'm using `pathlib`.)

(If there's a nicer way than this, please let me know.)

This sucks. And the result is a number, like 511, which you then
have to call `oct` on to get 0o777. I'm not even happy with getting the
octal number. For some of us who live and breathe Linux, seeing a number
like 0o440 might be crystal-clear, since your mind automatically translates
that to the permissions that user/group/others have, but I haven't reached
that level.

I would really like an object-oriented approach to chmod, like an object
which I can ask "Does group have execute permissions?" and say "Please add
read permissions to everyone" etc. Just because Linux speaks in code
doesn't mean that we need to.

And of course, I'd want that on the `pathlib` module so I could do it all
on the path object without referencing another module.

What do you think?


From steve at  Sat Jan  9 10:39:31 2016
From: steve at (Steven D'Aprano)
Date: Sun, 10 Jan 2016 02:39:31 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 09, 2016 at 05:13:38PM +0200, Ram Rachum wrote:
> Hi everyone,
> What do you think about enabling a more friendly interface to chmod
> information in Python?

I think that would make an awesome tool added to your own personal 
toolbox. Once you are satisfied that it works well, then it would be 
really good to release it to the public as a third-party library or 
recipe on ActiveState or similar.

And then we can talk about whether or not it belongs in the stdlib.

> And of course, I'd want that on the `pathlib` module so I could do it all
> on the path object without referencing another module.

What's wrong with referencing other modules?


From ram at  Sat Jan  9 10:41:19 2016
From: ram at (Ram Rachum)
Date: Sat, 9 Jan 2016 17:41:19 +0200
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 5:39 PM, Steven D'Aprano <steve at> wrote:

> On Sat, Jan 09, 2016 at 05:13:38PM +0200, Ram Rachum wrote:
> > Hi everyone,
> >
> > What do you think about enabling a more friendly interface to chmod
> > information in Python?
> I think that would make an awesome tool added to your own personal
> toolbox. Once you are satisfied that it works well, then it would be
> really good to release it to the public as a third-party library or
> recipe on ActiveState or similar.
> And then we can talk about whether or not it belongs in the stdlib.

Okay. I'm working on it now, we'll see how it goes.

> > And of course, I'd want that on the `pathlib` module so I could do it all
> > on the path object without referencing another module.
> What's wrong with referencing other modules?
Not wrong, just desirable to avoid. For example, I think that doing
`path.chmod(x)` is preferable to `os.chmod(path, x)`.

From rosuav at  Sat Jan  9 10:59:41 2016
From: rosuav at (Chris Angelico)
Date: Sun, 10 Jan 2016 02:59:41 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 2:13 AM, Ram Rachum <ram at> wrote:
> What do you think about enabling a more friendly interface to chmod
> information in Python? I believe that currently if I want to get chmod
> information from a file, I need to do this:
> my_path.stat().st_mode & 0o777
> (I'm using `pathlib`.)
> (If there's a nicer way than this, please let me know.)

Have you looked at the 'stat' module? At very least, you can ask
questions like "Does group have execute permissions?" like this:

my_path.stat().st_mode & stat.S_IXGRP
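
As a minimal runnable sketch of that check, assuming a POSIX system:
the temporary file, its name and the 0o750 mode are made up for the
example, while the stat constants and stat.filemode() are the standard
library's own.

```python
import stat
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    my_path = Path(tmp) / "example.sh"
    my_path.write_text("#!/bin/sh\n")
    my_path.chmod(0o750)   # rwx for user, r-x for group, nothing for others

    mode = my_path.stat().st_mode
    group_can_execute = bool(mode & stat.S_IXGRP)
    others_can_read = bool(mode & stat.S_IROTH)
    printable = stat.filemode(mode)   # ls-style string for the same mode
```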

You can also get a printable rwxrwxrwx with:

stat.filemode(my_path.stat().st_mode)

From ram at  Sat Jan  9 11:06:57 2016
From: ram at (Ram Rachum)
Date: Sat, 9 Jan 2016 18:06:57 +0200
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 5:59 PM, Chris Angelico <rosuav at> wrote:

> On Sun, Jan 10, 2016 at 2:13 AM, Ram Rachum <ram at> wrote:
> >
> > What do you think about enabling a more friendly interface to chmod
> > information in Python? I believe that currently if I want to get chmod
> > information from a file, I need to do this:
> >
> > my_path.stat().st_mode & 0o777
> >
> > (I'm using `pathlib`.)
> >
> > (If there's a nicer way than this, please let me know.)
> Have you looked at the 'stat' module? At very least, you can ask
> questions like "Does group have execute permissions?" like this:
> my_path.stat().st_mode & stat.S_IXGRP

> You can also get a printable rwxrwxrwx with:
> stat.filemode(my_path.stat().st_mode)

Thanks for the reference. Personally I think that `my_path.stat().st_mode &
stat.S_IXGRP` is not human-readable enough. I'll work on a nicer API.
Probably this for the same action you described:

'x' in my_path.chmod()['g']
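
For comparison, a spelling like this can be approximated today on top of the stat module. The sketch below is only an illustration under assumptions: `permissions_by_class` is a hypothetical helper name, not an existing pathlib or stat API, and it deliberately ignores the setuid, setgid and sticky bits.

```python
import stat


def permissions_by_class(mode):
    """Map 'u', 'g' and 'o' to strings such as 'rwx' or 'r-x'.

    Hypothetical helper: only the lower nine permission bits are used.
    """
    text = stat.filemode(mode & 0o777)[1:]  # drop the file-type character
    return {'u': text[0:3], 'g': text[3:6], 'o': text[6:9]}


perms = permissions_by_class(0o100754)  # a regular file with mode 754
print(perms)
print('x' in perms['g'])
```

With a real file the argument would be `my_path.stat().st_mode`.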


From ericfahlgren at  Sat Jan  9 11:09:26 2016
From: ericfahlgren at (Eric Fahlgren)
Date: Sat, 9 Jan 2016 08:09:26 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <007001d14af8$1dcdb3a0$59691ae0$>

Pavol Lisy, Saturday, January 09, 2016 01:54:
> Could not something like this ->
>    def embezzle(self, account, funds=1000000, *fake_receipts):
>        # def embezzle(self, account: str, funds: int = 1000000, *fake_receipts: str) -> None:
>        """Embezzle funds from account using fake receipts."""
>        <code goes here>
> make
> 1. transition from python2 to python3 more simple?
> 2. python3 checkers more easily changeable to understand new python2 standard?
> 3. simpler impact on documentation (which also means a simpler knowledge base to learn) about annotations?

+1 on this, which is close to what I've been doing for a while now.

4. Educates people who have only seen Py2 prototypes to recognize what the Py3 annotations look like.

From rosuav at  Sat Jan  9 11:11:13 2016
From: rosuav at (Chris Angelico)
Date: Sun, 10 Jan 2016 03:11:13 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 3:06 AM, Ram Rachum <ram at> wrote:
> Thanks for the reference. Personally I think that `my_path.stat().st_mode &
> stat.S_IXGRP` is not human-readable enough. I'll work on a nicer API.
> Probably this for the same action you described:
> 'x' in my_path.chmod()['g']

Okay. I'm not sure how popular that'll be, but sure.

As an alternative API, you could have it return a tuple of permission
strings, which you'd use thus:

'gx' in my_path.mode() # Group eXecute permission is set
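
A minimal sketch of that alternative, assuming a free function rather than a method (`mode_flags` and the two-letter names are hypothetical, chosen here only to mirror the 'gx' example):

```python
import stat

# (name, bit) pairs: first letter is the class (u/g/o), second the permission.
_FLAGS = (
    ('ur', stat.S_IRUSR), ('uw', stat.S_IWUSR), ('ux', stat.S_IXUSR),
    ('gr', stat.S_IRGRP), ('gw', stat.S_IWGRP), ('gx', stat.S_IXGRP),
    ('or', stat.S_IROTH), ('ow', stat.S_IWOTH), ('ox', stat.S_IXOTH),
)


def mode_flags(mode):
    """Return a tuple of the permission names that are set in mode."""
    return tuple(name for name, bit in _FLAGS if mode & bit)


flags = mode_flags(0o750)
print(flags)
print('gx' in flags)  # group execute permission is set
```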

But scratch your own itch, and don't give in to the armchair advisers.


From victor.stinner at  Sat Jan  9 11:27:31 2016
From: victor.stinner at (Victor Stinner)
Date: Sat, 9 Jan 2016 17:27:31 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Saturday, 9 January 2016, Neil Girdhar <mistersheik at> wrote:
> On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner <victor.stinner at> wrote:
>> I wrote a whole website to explain the status of the Python optimizers
>> and why I want to write my own optimizer:
> I think this is admirable.  I also dream of faster Python.  However, we
> have a fundamental disagreement about how to get there.  You can spend your
> whole life adding one or two optimizations a year and Python may only end
> up twice as fast as it is now, which would still be dog slow. A meaningful
> speedup requires a JIT.  So, I question the value of this kind of change.

There are multiple JIT compilers for Python under active development: PyPy,
Pyston, Pyjion, Numba (numerical computation), etc.

I don't think that my work will slow down these projects. I hope that it
will create more competition and that we will cooperate. For example, I am
in contact with a Pythran developer who told me that my PEPs will help his
project. As I wrote in the dict.__version__ PEP, the dictionary version
will also be useful for Pyjion, according to Brett Cannon.

But Antoine Pitrou told me that the dictionary version will not help Numba.
Numba doesn't use dictionaries and already has its own
efficient implementation of guards.

> What is the point of making __version__ an exposed property?

Hum, technically I don't need it at the Python level. Guards are
implemented in C and access the field directly from the structure.

Having the property in Python helps with writing unit tests, writing
prototypes (experimenting with new things), etc.

> That's fine, but I think you are probably wasting your time then :)  The
> "hole between CPython and PyPy" disappears as soon as PyPy catches up to
> CPython 3.5 with numpy, and then all of this work goes with it.

PyPy has worked for many years, but it's still not widely used. Maybe
PyPy has drawbacks and the speedup is not enough to convince users to use
it? I'm not sure that Python 3.5 support will make PyPy immediately more
popular. Users still widely use Python 2 in practice.

Yes, better and faster numpy will help PyPy.


From mahmoud at  Sat Jan  9 11:51:18 2016
From: mahmoud at (Mahmoud Hashemi)
Date: Sat, 9 Jan 2016 08:51:18 -0800
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

I think it's a pretty common itch! Have you seen the boltons
<> implementation?


On Sat, Jan 9, 2016 at 8:11 AM, Chris Angelico <rosuav at> wrote:

> On Sun, Jan 10, 2016 at 3:06 AM, Ram Rachum <ram at> wrote:
> > Thanks for the reference. Personally I think that
> `my_path.stat().st_mode &
> > stat.S_IXGRP` is not human-readable enough. I'll work on a nicer API.
> > Probably this for the same action you described:
> >
> > 'x' in my_path.chmod()['g']
> >
> >
> Okay. I'm not sure how popular that'll be, but sure.
> As an alternative API, you could have it return a tuple of permission
> strings, which you'd use thus:
> 'gx' in my_path.mode() # Group eXecute permission is set
> But scratch your own itch, and don't give in to the armchair advisers.

From mistersheik at  Sat Jan  9 12:01:51 2016
From: mistersheik at (Neil Girdhar)
Date: Sat, 9 Jan 2016 12:01:51 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 11:27 AM, Victor Stinner <victor.stinner at> wrote:

> On Saturday, 9 January 2016, Neil Girdhar <mistersheik at> wrote:
>> On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner <victor.stinner at>
>> wrote:
>>> I wrote a whole website to explain the status of the Python optimizers
>>> and why I want to write my own optimizer:
>> I think this is admirable.  I also dream of faster Python.  However, we
>> have a fundamental disagreement about how to get there.  You can spend your
>> whole life adding one or two optimizations a year and Python may only end
>> up twice as fast as it is now, which would still be dog slow. A meaningful
>> speedup requires a JIT.  So, I question the value of this kind of change.
> There are multiple JIT compilers for Python under active development: PyPy,
> Pyston, Pyjion, Numba (numerical computation), etc.
> I don't think that my work will slow down these projects. I hope that it
> will create more competition and that we will cooperate. For example, I am
> in contact with a Pythran developer who told me that my PEPs will help his
> project. As I wrote in the dict.__version__ PEP, the dictionary version
> will also be useful for Pyjion, according to Brett Cannon.
> But Antoine Pitrou told me that the dictionary version will not help Numba.
> Numba doesn't use dictionaries and already has its own
> efficient implementation of guards.
>> What is the point of making __version__ an exposed property?
> Hum, technically I don't need it at the Python level. Guards are
> implemented in C and access the field directly from the structure.
> Having the property in Python helps with writing unit tests, writing
> prototypes (experimenting with new things), etc.

I understand what you mean, but if you can do this without changing the
language, I think that would be better.  Isn't it still possible to write
your unit tests in the same C interface that you expose "version" with?
Then the language will stay the same, but CPython would be faster, which is
what you wanted.

>> That's fine, but I think you are probably wasting your time then :)  The
>> "hole between CPython and PyPy" disappears as soon as PyPy catches up to
>> CPython 3.5 with numpy, and then all of this work goes with it.
> PyPy has worked for many years, but it's still not widely used. Maybe
> PyPy has drawbacks and the speedup is not enough to convince users to use
> it? I'm not sure that Python 3.5 support will make PyPy immediately more
> popular. Users still widely use Python 2 in practice.
> Yes, better and faster numpy will help PyPy.
> Victor

From jim.baker at  Sat Jan  9 13:52:39 2016
From: jim.baker at (Jim Baker)
Date: Sat, 9 Jan 2016 11:52:39 -0700
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <007001d14af8$1dcdb3a0$59691ae0$>
References: <>
Message-ID: <>

+1, I would really like to try out type annotation support in Jython, given
the potential for tying in with Java as a source of type annotations
(basically the equivalent of stubs for free). I'm planning on sprinting on
Jython 3 at PyCon, but let's face it, that's going to take a while to
really finish.

re the two approaches, both are workable with Jython:

* lib2to3 is something we should support in Jython 2.7. There are a couple
of data files that we don't support in the tests (too large of a method for
Java bytecode in, not terribly interesting), plus a
few other tests that should work. Therefore lib2to3 should be in the next
release (2.7.1).

* Jedi now works with the last commit to Jython 2.7 trunk, passing whatever
it means to run random tests using its sith script against its source. (The
sith test does not pass with either CPython or Jython's stdlib, starting

- Jim

On Sat, Jan 9, 2016 at 9:09 AM, Eric Fahlgren <ericfahlgren at>

> Pavol Lisy, Saturday, January 09, 2016 01:54:
> > Could not something like this ->
> >
> >    def embezzle(self, account, funds=1000000, *fake_receipts):
> >        # def embezzle(self, account: str, funds: int = 1000000,
> *fake_receipts: str) -> None:
> >        """Embezzle funds from account using fake receipts."""
> >        <code goes here>
> >
> > make
> > 1. transition from python2 to python3 more simple?
> > 2. python3 checkers more easily changeable to understand new python2
> > standard?
> > 3. simpler impact on documentation (which also means a simpler knowledge
> > base to learn) about annotations?
> +1 on this, which is close to what I've been doing for a while now.
> 4. Educates people who have only seen Py2 prototypes to recognize what the
> Py3 annotations look like.

From guido at  Sat Jan  9 14:30:50 2016
From: guido at (Guido van Rossum)
Date: Sat, 9 Jan 2016 11:30:50 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 1:54 AM, Pavol Lisy <pavol.lisy at> wrote:

> Could not something like this ->
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         # def embezzle(self, account: str, funds: int = 1000000,
> *fake_receipts: str) -> None:
>         """Embezzle funds from account using fake receipts."""
>         <code goes here>
> make
> 1. transition from python2 to python3 more simple?
> 2. python3 checkers more easily changeable to understand new python2
> standard?
> 3. simpler impact on documentation (which also means a simpler knowledge
> base to learn) about annotations?

There would still have to be some marker like "# type:" for the type
checker to recognize -- I'm sure that plain comments with alternate 'def'
statements are pretty common and we really don't want the type checker to
be confused by those.

I don't like that the form you propose has so much repetition -- the design
of Python 3 annotations intentionally is the least redundant possible, and
my (really Jukka's) proposal tries to keep that property.

Modifying type checkers to support this syntax is easy (Jukka already did
it for mypy).

Note that type checkers already have to parse the source code without the
help of Python's ast module, because there are other things in comments:
PEP 484 specifies variable annotations and a few forms of `# type: ignore`

Regarding the idea of a decorator, this was discussed and rejected for the
original PEP 484 proposal as well. The problem is similar to that with your
'def' proposal: too verbose. Also a decorator is more expensive (we're
envisioning adding many thousands of decorators, and it would weigh down
program startup). We don't envision needing to introspect __annotations__
at run time. (Also, we already use decorators quite heavily -- introducing
a @typehint decorator would make the code less readable due to excessive
stacking of decorators.)

Our needs at Dropbox are several: first, we want to add annotations to the
code so that new engineers can learn their way around the code quicker and
refactoring will be easier; second, we want to automatically check
conformance to the annotations as part of our code review and continuous
integration processes (this is where mypy comes in); third, once we have
annotated enough of the code we want to start converting it to Python 3
with as much automation as is feasible. The latter part is as yet unproven,
but there's got to be a better way than manually checking the output of
2to3 (whose main weakness is that it does not know the types of variables).
We see many benefits of annotations and automatically checking them using
mypy -- but we don't want them to affect the runtime at all.

--Guido van Rossum (

From brett at  Sat Jan  9 14:32:45 2016
From: brett at (Brett Cannon)
Date: Sat, 09 Jan 2016 19:32:45 +0000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, 9 Jan 2016 at 07:04 Neil Girdhar <mistersheik at> wrote:

> On Sat, Jan 9, 2016 at 8:42 AM, Victor Stinner <victor.stinner at>
> wrote:
>> Hi,
>> 2016-01-09 13:48 GMT+01:00 Neil Girdhar <mistersheik at>:
>> > How is this not just a poorer version of PyPy's optimizations?
>> This a very good question :-) There are a lot of optimizers in the
>> wild, mostly JIT compilers. The problem is that most of them are
>> specific to numerical computations, and the remaining ones are generic
>> but not widely used. The most advanced and complete fast
>> implementation of Python is obviously PyPy. I haven't heard of many
>> deployments with PyPy. For example, PyPy is not used to install
>> OpenStack (a very large project which has a big number of
>> dependencies). I'm not even sure that PyPy is the favorite
>> implementation of Python used to run Django, to give another example
>> of popular Python application.
>> PyPy is just amazing in term of performances, but for an unknown
>> reason, it didn't replace CPython yet. PyPy has some drawbacks: it
>> only supports Python 2.7 and 3.2 (CPython is at the version 3.5), it
>> has bad performances on the C API and I heard that performances are
>> not as amazing as expected on some applications. PyPy has also a worse
>> startup time and use more memory. IMHO the major issue of Python is
>> the backward compatibility on the C API.
>> In short, almost all users are stuck at CPython and CPython implements
>> close to 0 optimization (come on, constant folding and dead code
>> elimination is not what I would call an "optimization" ;-)).
>> My goal is to fill the hole between CPython (0 optimization) and PyPy
>> (the reference for best performances).
>> I wrote a whole website to explain the status of the Python optimizers
>> and why I want to write my own optimizer:
> I think this is admirable.  I also dream of faster Python.  However, we
> have a fundamental disagreement about how to get there.  You can spend your
> whole life adding one or two optimizations a year and Python may only end
> up twice as fast as it is now, which would still be dog slow. A meaningful
> speedup requires a JIT.  So, I question the value of this kind of change.

Obviously a JIT can help, but even they can benefit from this. For
instance, Pyjion could rely on this instead of creating our own guards for
built-in and global namespaces if we wanted to inline calls to certain

>> > If what you want is optimization, it would be much better to devote
>> time to a solution
>> > that can potentially yield orders of magnitude worth of speedup like
>> PyPy
>> > rather than increasing language complexity for a minor payoff.
>> I disagree that my proposed changes increase the "language
>> complexity". According to early benchmarks, my changes have a
>> negligible impact on performances. I don't see how adding a read-only
>> __version__ property to dict makes the Python *language* more complex?
> It makes it more complex because you're adding a user-facing property.
> Every little property adds up in the cognitive load of a language.  It also
> means that all of the other Python implementation need to follow suit even
> if their optimizations work differently.
> What is the point of making __version__ an exposed property?  Why can't it
> be a hidden variable in CPython's underlying implementation of dict?  If
> some code needs to query __version__ to see if it's changed then CPython
> should be the one trying to discover this pattern and automatically
> generate the right code.  Ultimately, this is just a piece of a JIT, which
> is the way this is going to end up.
>> My whole design is based on the idea that my optimizer will be
>> optional. You will be free to not use it ;-)
>> And sorry, I'm not interested to contribute to PyPy.
> That's fine, but I think you are probably wasting your time then :)  The
> "hole between CPython and PyPy" disappears as soon as PyPy catches up to
> CPython 3.5 with numpy, and then all of this work goes with it.

That doesn't solve the C API compatibility problem, nor other issues some
people have with PyPy deployments (e.g., inconsistent performance that
can't necessarily be relied upon).

From brett at  Sat Jan  9 14:46:51 2016
From: brett at (Brett Cannon)
Date: Sat, 09 Jan 2016 19:46:51 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, 9 Jan 2016 at 11:31 Guido van Rossum <guido at> wrote:

> On Sat, Jan 9, 2016 at 1:54 AM, Pavol Lisy <pavol.lisy at> wrote:
>> Could not something like this ->
>>     def embezzle(self, account, funds=1000000, *fake_receipts):
>>         # def embezzle(self, account: str, funds: int = 1000000,
>> *fake_receipts: str) -> None:
>>         """Embezzle funds from account using fake receipts."""
>>         <code goes here>
>> make
>> 1. transition from python2 to python3 more simple?
>> 2. python3 checkers more easily changeable to understand new python2
>> standard?
>> 3. simpler impact on documentation (which also means a simpler knowledge
>> base to learn) about annotations?
> There would still have to be some marker like "# type:" for the type
> checker to recognize -- I'm sure that plain comments with alternate 'def'
> statements are pretty common and we really don't want the type checker to
> be confused by those.
> I don't like that the form you propose has so much repetition -- the
> design of Python 3 annotations intentionally is the least redundant
> possible, and my (really Jukka's) proposal tries to keep that property.
> Modifying type checkers to support this syntax is easy (Jukka already did
> it for mypy).
> Note that type checkers already have to parse the source code without the
> help of Python's ast module, because there are other things in comments:
> PEP 484 specifies variable annotations and a few forms of `# type: ignore`
> comments.
> Regarding the idea of a decorator, this was discussed and rejected for the
> original PEP 484 proposal as well. The problem is similar to that with your
> 'def' proposal: too verbose. Also a decorator is more expensive (we're
> envisioning adding many thousands of decorators, and it would weigh down
> program startup). We don't envision needing to introspect __annotations__
> at run time. (Also, we already use decorators quite heavily -- introducing
> a @typehint decorator would make the code less readable due to excessive
> stacking of decorators.)
> Our needs at Dropbox are several: first, we want to add annotations to the
> code so that new engineers can learn their way around the code quicker and
> refactoring will be easier; second, we want to automatically check
> conformance to the annotations as part of our code review and continuous
> integration processes (this is where mypy comes in); third, once we have
> annotated enough of the code we want to start converting it to Python 3
> with as much automation as is feasible. The latter part is as yet unproven,
> but there's got to be a better way than manually checking the output of
> 2to3 (whose main weakness is that it does not know the types of variables).
> We see many benefits of annotations and automatically checking them using
> mypy -- but we don't want them to affect the runtime at all.

To help answer the question about whether this could help with porting code
to Python 3, the answer is "yes"; it's not essential but definitely would
be helpful.

Between Modernize, pylint, `python2.7 -3`, and `python3 -bb` you cover
almost all of the issues that can arise in moving to Python 3. But notice
that half of those tools are running your code under an interpreter with a
certain flag flipped, which means run-time checks that require excellent
test coverage. With type annotations you can do offline, static checking
which is less reliant on your tests covering all corner cases. Depending on
how the tools choose to handle representing str/unicode in Python 2/3 code
(i.e., say that if you specify the type as 'str' it's an error and anything
that is 'unicode' is considered the 'str' type in Python 3?), I don't see
why mypy can't have a 2/3 compatibility mode that warns against uses of,
e.g. the bytes type that don't directly translate between Python 2 and 3
like indexing. That kind of static warning would definitely be beneficial
to anyone moving their code over as they wouldn't need to rely on e.g.,
`python3 -bb ` and their tests  to catch that common issue with bytes and

There is also the benefit of gradual porting with this kind of offline
checking. Since you can slowly add more type information, you can slowly
catch more issues in your code. Relying on `python3 -bb`, though, requires
you have ported all of your code over first before running it under Python
3 to catch some issues.
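
As a concrete instance of the bytes behaviour that differs between Python 2 and 3, here is a small sketch (run under Python 3; the variable names are illustrative). Indexing is the kind of difference a static checker could flag without executing the code, whereas `python3 -bb` only reports bytes/str mix-ups on code paths that actually run.

```python
data = b"abc"

first = data[0]          # an int on Python 3; a length-1 str on Python 2
first_slice = data[0:1]  # a length-1 bytes object on both versions

print(first)
print(first_slice)
```

Code that must behave the same on both versions usually prefers the slice form.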

From tjreedy at  Sat Jan  9 15:24:10 2016
From: tjreedy at (Terry Reedy)
Date: Sat, 9 Jan 2016 15:24:10 -0500
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <n6rq9d$glk$>

On 1/8/2016 6:04 PM, Guido van Rossum wrote:
> At Dropbox we're trying to be good citizens and we're working towards
> introducing gradual typing (PEP 484) into our Python code bases (several
> million lines of code). However, that code base is mostly still Python
> 2.7 and we believe that we should introduce gradual typing first and
> start working on conversion to Python 3 second (since having static
> types in the code can help a big refactoring like that).
> Since Python 2 doesn't support function annotations we've had to look
> for alternatives. We considered stub files, a magic codec, docstrings,
> and additional `# type:` comments. In the end we decided that `# type:`
> comments are the most robust approach. We've experimented a fair amount
> with this and we have a proposal for a standard.
> The proposal is very simple. Consider the following function with Python
> 3 annotations:
>      def embezzle(self, account: str, funds: int = 1000000,
> *fake_receipts: str) -> None:
>          """Embezzle funds from account using fake receipts."""
>          <code goes here>
> An equivalent way to write this in Python 2 is the following:
>      def embezzle(self, account, funds=1000000, *fake_receipts):
>          # type: (str, int, *str) -> None
>          """Embezzle funds from account using fake receipts."""
>          <code goes here>

I find this separate signature line to be at least as readable as
the intermixed 3.x version.  I noticed the same thing as Lemburg (no
runtime .__annotations__ attributes), but am not sure whether adding them
in 2.x code is a good or bad thing.
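
The runtime difference can be checked directly. This is a small sketch run under Python 3; the two function names are made up, and 'self' is dropped so that plain functions can be used.

```python
def embezzle_py3(account: str, funds: int = 1000000, *fake_receipts: str) -> None:
    """Embezzle funds from account using fake receipts."""


def embezzle_py2(account, funds=1000000, *fake_receipts):
    # type: (str, int, *str) -> None
    """Embezzle funds from account using fake receipts."""


# Real annotations are stored on the function object ...
print(embezzle_py3.__annotations__)
# ... while the comment form leaves nothing behind at run time.
print(embezzle_py2.__annotations__)
```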

> There are a few details to discuss:
> - Every argument must be accounted for, except 'self' (for instance
> methods) or 'cls' (for class methods). Also the return type is
> mandatory. If in Python 3 you would omit some argument or the return
> type, the Python 2 notation should use 'Any'.
> - If you're using names defined in the typing module, you must still
> import them! (There's a backport on PyPI.)
> - For `*args` and `**kwds`, put 1 or 2 stars in front of the
> corresponding type annotation. As with Python 3 annotations, the
> annotation here denotes the type of the individual argument values, not
> of the tuple/dict that you receive as the special argument value 'args'
> or 'kwds'.
> - The entire annotation must be one line. (However, see

To me, really needed.

> We would like to propose this as a standard (either to be added to PEP
> 484 or as a new PEP) rather than making it a "proprietary" extension to
> mypy only, so that others in a similar situation can also benefit.

Since I am personally pretty much done with 2.x, the details do not 
matter to me, but I think a suggested standard approach is a good idea. 
  I also think a new informational PEP, with a reference added to 484, 
would be better.  'Type hints for 2.x and 2&3 code'

For a helpful tool, I would at least want something that added a 
template comment, without dummy 'Any's to be erased, to each function.

# type: (, , *) ->

A GUI with suggestions from both type-inferencing and from a name -> 
type dictionary would be even nicer.  Name to type would work really 
well for a project with consistent use of parameter names.

Terry Jan Reedy

From tjreedy at  Sat Jan  9 17:18:40 2016
From: tjreedy at (Terry Reedy)
Date: Sat, 9 Jan 2016 17:18:40 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <n6s103$i6f$>

On 1/8/2016 4:27 PM, Victor Stinner wrote:

> Add a new read-only ``__version__`` property to ``dict`` and
> ``collections.UserDict`` types, incremented at each change.

I agree with Neil Girdhar that this looks to me like a CPython-specific 
implementation detail that should not be imposed on other 
implementations.  For testing, perhaps we could add a dict_version 
function in that uses ctypes to access the internals.

Another reason to hide __version__ from the Python level is that its use 
seems to me rather tricky and bug-prone.

> Python is hard to optimize because almost everything is mutable: builtin
> functions, function code, global variables, local variables, ... can be
> modified at runtime.

I believe that C-coded functions are immutable.  But I believe that 
mutability otherwise undercuts what you are trying to do.

> Implementing optimizations respecting the Python
> semantic requires to detect when "something changes":

But as near as I can tell, your proposal cannot detect all relevant 
changes unless one is *very* careful.  A dict maps hashable objects to 
objects.  Objects represent values.  So a dict represents a mapping of 
values to values.  If an object is mutated, the object to object mapping 
is not changed, but the semantic value to value mapping *is* changed. 
In the following example, __version__ twice gives the 'wrong' answer 
from a value perspective.

d = {'f': [int]}
d['f'][0] = float # object mapping unchanged, value mapping changed
d['f'] = [float]  # object mapping changed, value mapping unchanged
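
The same example can be run against a pure-Python stand-in for the proposed counter. This is only a sketch under assumptions: `VersionedDict` is hypothetical, counts only `__setitem__` and `__delitem__` calls, and is not the C implementation described in the PEP.

```python
class VersionedDict(dict):
    """Hypothetical stand-in: bump a counter on item assignment/deletion."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


d = VersionedDict({'f': [int]})
v0 = d.version
d['f'][0] = float  # mutates the list in place; the dict itself is untouched
v1 = d.version
d['f'] = [float]   # rebinds the key to a new, equal list
v2 = d.version
print(v0, v1, v2)
```

The counter stays the same across the first change and moves on the second, which is the mismatch with the value-level view described above.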

> The astoptimizer of the FAT Python project implements many optimizations
> which require guards on namespaces. Examples:
> * Call pure builtins: to replace ``len("abc")`` with ``3``,

Replacing a call with a return value assumes that the function is 
immutable, deterministic, and without side-effect.  Perhaps this is what 
you meant by 'pure'.  Are you proposing to provide astoptimizer with 
either a whitelist or blacklist of builtins that qualify or not?

Aside from this, I don't find this example motivational.  I would either 
write '3' in the first place or write something like "slen = 
len('akjslkjgkjsdkfjsldjkfs')" outside of any loop.  I would more likely 
write something like "key = 'jlkjfksjlkdfjlskfjkslkjeicji'; key_len = 
len(key)" to keep a reference to both the string and its length.  Will 
astoptimizer 'propagate the constant' (in this case 'key')?

The question in my mind is whether real code has enough pure builtin 
calls of constants to justify the overhead.

> * Loop unrolling: to unroll the loop ``for i in range(...): ...``,

How often is this useful in modern real-world Python code?  Many old 
uses of range have been or could be replaced with enumerate or a 
collection iterator, making it less common than it once was.

How often is N small enough that one wants complete versus partial 
unrolling?  Wouldn't it be simpler to only use a (specialized) 
loop-unroller where range is known to be the builtin?

Terry Jan Reedy

From rosuav at  Sat Jan  9 17:19:15 2016
From: rosuav at (Chris Angelico)
Date: Sun, 10 Jan 2016 09:19:15 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 3:51 AM, Mahmoud Hashemi <mahmoud at> wrote:
> I think it's a pretty common itch! Have you seen the boltons implementation?

Yes it is, and no I haven't; everyone has a slightly different idea of
what makes a good API, and that's why I put that caveat onto my
suggestion. You can't make everyone happy, and APIs should not be
designed by committee :)


From victor.stinner at  Sat Jan  9 18:08:56 2016
From: victor.stinner at (Victor Stinner)
Date: Sun, 10 Jan 2016 00:08:56 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <n6s103$i6f$>
References: <>
Message-ID: <>

2016-01-09 23:18 GMT+01:00 Terry Reedy <tjreedy at>:
> But as near as I can tell, your proposal cannot detect all relevant changes
> unless one is *very* careful.  A dict maps hashable objects to objects.
> Objects represent values.  So a dict represents a mapping of values to
> values.  If an object is mutated, the object to object mapping is not
> changed, but the semantic value to value mapping *is* changed. In the
> following example, __version__ twice gives the 'wrong' answer from a value
> perspective.

dict.__version__ is a technical solution to implement efficient guards
on namespaces. You are right that it's not enough to detect every kind of
change. For example, to inline a function inside the same module, we
need a guard on the global variable, but also a guard on the function
itself. We need to disable the optimization if the function code
(func.__code__) is modified. Maybe a guard is also needed on default
values of function parameters. But guards on functions don't need to
modify CPython internals. It's already possible to implement efficient
guards on functions.
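
A guard on a function, as described above, could be sketched in pure
Python roughly as follows. The FuncGuard class and its attribute names
are hypothetical and are not taken from FAT Python; the sketch only
remembers the identity of __code__ and __defaults__ and reports whether
either one has been replaced.

```python
class FuncGuard:
    """Hypothetical guard: detects replacement of a function's code or defaults."""

    def __init__(self, func):
        self.func = func
        # Keep strong references so the identity comparison stays meaningful.
        self.code = func.__code__
        self.defaults = func.__defaults__

    def check(self):
        """Return True if the function looks unchanged since the guard was created."""
        return (self.func.__code__ is self.code
                and self.func.__defaults__ is self.defaults)


def add(a, b=1):
    return a + b


guard = FuncGuard(add)
unchanged_before = guard.check()

# Replace the code object; the function object itself stays the same.
add.__code__ = (lambda a, b=1: a - b).__code__
unchanged_after = guard.check()
```

Only identity is compared here, so the check costs two attribute reads
and no change to CPython internals is needed.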

> Replacing a call with a return value assumes that the function is immutable,
> deterministic, and without side-effect.

that the function is deterministic and has no side-effect, yep.

>  Perhaps this is what you meant by
> 'pure'.  Are you proposing to provide astoptimizer with either a whitelist
> or blacklist of builtins that qualify or not?

Currently, I'm using a whitelist of builtin functions which are known
to be pure. Later, I plan to detect pure functions automatically by
analyzing the (AST) code.

> Aside from this, I don't find this example motivational.  I would either
> write '3' in the first place or write something like "slen =
> len('akjslkjgkjsdkfjsldjkfs')" outside of any loop.  I would more likely
> write something like "key = 'jlkjfksjlkdfjlskfjkslkjeicji'; key_len =
> len(key)" to keep a reference to both the string and its length.  Will
> astoptimizer 'propogate the constant' (in this case 'key')?

FYI I already have a working implementation of the astoptimizer: it's
possible to run the full Python test suite with the optimizer.
Implemented optimizations:

Constant propagation and constant folding optimizations are
implemented. A single optimization is not interesting on its own; it
becomes more interesting when you combine optimizations, like constant
propagation + constant folding + loop unrolling.

> The question in my mind is whether real code has enough pure builtin calls
> of constants to justify the overhead.

Replacing len("abc") with 3 is not the goal of FAT Python. It's only
an example that is simple to understand.

> How often is this useful in modern real-world Python code?  Many old uses of
> range have been or could be replaced with enumerate or a collection
> iterator, making it less common than it once was.

IMHO the optimizations currently implemented will not provide any
major speedup. It will become more interesting with function inlining.
The idea is more to create an API to support pluggable static
optimizers.

> How often is N small enough that one wants complete versus partial
> unrolling?  Wouldn't it be simpler to only use a (specialized) loop-unroller
> where range is known to be the builtin?

What is the link between your question and dict.__version__?


From victor.stinner at  Sat Jan  9 18:14:10 2016
From: victor.stinner at (Victor Stinner)
Date: Sun, 10 Jan 2016 00:14:10 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-09 2:00 GMT+01:00 Chris Angelico <rosuav at>:
> (...) send your PEP drafts to peps at and we can get them assigned numbers

Ok, this PEP got the number 509:

"PEP 0509 -- Add dict.__version__"

FYI the second PEP got the number 510:

"PEP 0510 -- Specialized functions with guards"


From victor.stinner at  Sat Jan  9 18:16:13 2016
From: victor.stinner at (Victor Stinner)
Date: Sun, 10 Jan 2016 00:16:13 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: <>
References: <>
Message-ID: <>

> PEP: xxx
> Title: Specialized functions with guards

FYI I published the PEP and it got the number 510:

"PEP 0510 -- Specialized functions with guards"


From mistersheik at  Sat Jan  9 07:48:30 2016
From: mistersheik at (Neil Girdhar)
Date: Sat, 9 Jan 2016 04:48:30 -0800 (PST)
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

How is this not just a poorer version of PyPy's optimizations?  If what you 
want is optimization, it would be much better to devote time to a solution 
that can potentially yield orders of magnitude worth of speedup like PyPy 
rather than increasing language complexity for a minor payoff.



On Friday, January 8, 2016 at 4:27:53 PM UTC-5, Victor Stinner wrote:
> Hi, 
> Here is a first PEP, part of a series of 3 PEPs to add an API to
> implement a static Python optimizer specializing functions with 
> guards. 
> HTML version: 
> PEP: xxx 
> Title: Add dict.__version__ 
> Version: $Revision$ 
> Last-Modified: $Date$ 
> Author: Victor Stinner
> Status: Draft 
> Type: Standards Track 
> Content-Type: text/x-rst 
> Created: 4-January-2016 
> Python-Version: 3.6 
> Abstract 
> ======== 
> Add a new read-only ``__version__`` property to ``dict`` and 
> ``collections.UserDict`` types, incremented at each change. 
> Rationale 
> ========= 
> In Python, the builtin ``dict`` type is used by many instructions. For 
> example, the ``LOAD_GLOBAL`` instruction searches for a variable in the
> global namespace, or in the builtins namespace (two dict lookups). 
> Python uses ``dict`` for the builtins namespace, globals namespace, type 
> namespaces, instance namespaces, etc. The local namespace (namespace of 
> a function) is usually optimized to an array, but it can be a dict too. 
> Python is hard to optimize because almost everything is mutable: builtin 
> functions, function code, global variables, local variables, ... can be 
> modified at runtime. Implementing optimizations that respect Python
> semantics requires detecting when "something changes": we will call
> these checks "guards".
> The speedup of optimizations depends on the speed of guard checks. This 
> PEP proposes to add a version to dictionaries to implement efficient 
> guards on namespaces. 
> Example of optimization: replace loading a global variable with a 
> constant.  This optimization requires a guard on the global variable to 
> check if it was modified. If the variable is modified, the variable must 
> be loaded at runtime, instead of using the constant. 
> Guard example 
> ============= 
> Pseudo-code of an efficient guard to check if a dictionary key was 
> modified (created, updated or deleted):: 
>     UNSET = object() 
>     class Guard: 
>         def __init__(self, dict, key): 
>             self.dict = dict 
>             self.key = key 
>             self.value = dict.get(key, UNSET) 
>             self.version = dict.__version__ 
>         def check(self):
>             """Return True if the dictionary value did not change."""
>             version = self.dict.__version__ 
>             if version == self.version: 
>                 # Fast-path: avoid the dictionary lookup 
>                 return True 
>             value = self.dict.get(self.key, UNSET) 
>             if value is self.value:
>                 # another key was modified: 
>                 # cache the new dictionary version 
>                 self.version = version 
>                 return True 
>             return False 
> Changes 
> ======= 
> Add a read-only ``__version__`` property to builtin ``dict`` type and to 
> the ``collections.UserDict`` type. New empty dictionaries are initialized
> to version ``0``. The version is incremented at each change: 
> * ``clear()`` if the dict was non-empty 
> * ``pop(key)`` if the key exists 
> * ``popitem()`` if the dict is non-empty 
> * ``setdefault(key, value)`` if the `key` does not exist 
> * ``__delitem__(key)`` if the key exists
> * ``__setitem__(key, value)`` if the `key` doesn't exist or if the value 
>   is different 
> * ``update(...)`` if new values are different than existing values (the 
>   version can be incremented multiple times) 
> Example:: 
>     >>> d = {} 
>     >>> d.__version__ 
>     0 
>     >>> d['key'] = 'value' 
>     >>> d.__version__ 
>     1 
>     >>> d['key'] = 'new value' 
>     >>> d.__version__ 
>     2 
>     >>> del d['key'] 
>     >>> d.__version__ 
>     3 
> If a dictionary is created with items, the version is also incremented 
> at each dictionary insertion. Example:: 
>     >>> d=dict(x=7, y=33) 
>     >>> d.__version__ 
>     2 
> The version is not incremented if an existing key is modified to the
> same value, but only the identifier of the value is tested, not the 
> content of the value. Example:: 
>     >>> d={}
>     >>> value = object()
>     >>> d['key'] = value
>     >>> d.__version__
>     1
>     >>> d['key'] = value
>     >>> d.__version__
>     1
> .. note:: 
>    CPython uses some singleton like integers in the range [-5; 257], 
>    empty tuple, empty strings, Unicode strings of a single character in 
>    the range [U+0000; U+00FF], etc. When a key is set twice to the same 
>    singleton, the version is not modified. 
> The PEP is designed to implement guards on namespaces, only the ``dict`` 
> type can be used for namespaces in practice.  ``collections.UserDict`` 
> is modified because it must mimic ``dict``. ``collections.Mapping`` is
> unchanged. 
> Integer overflow 
> ================ 
> The implementation uses the C unsigned integer type ``size_t`` to store 
> the version.  On 32-bit systems, the maximum version is ``2**32-1`` 
> (more than ``4.2 * 10 ** 9``, 4 billion). On 64-bit systems, the maximum
> version is ``2**64-1`` (more than ``1.8 * 10**19``). 
> The C code uses ``version++``. The behaviour on integer overflow of the 
> version is undefined. The minimum guarantee is that the version always 
> changes when the dictionary is modified. 
> The check ``dict.__version__ == old_version`` can be true after an 
> integer overflow, so a guard can return true even if the value changed,
> which is wrong. The bug occurs if the dict is modified at least ``2**64`` 
> times (on 64-bit system) between two checks of the guard. 
> Using a more complex type (ex: ``PyLongObject``) to avoid the overflow 
> would slow down operations on the ``dict`` type. Even if there is a 
> theoretical risk of missing a value change, the risk is considered too low
> compared to the slow down of using a more complex type. 
> Alternatives 
> ============ 
> Add a version to each dict entry 
> -------------------------------- 
> A single version per dictionary requires keeping a strong reference to
> the value which can keep the value alive longer than expected. If we add 
> also a version per dictionary entry, the guard can rely on the entry 
> version and so avoid the strong reference to the value (only strong 
> references to a dictionary and key are needed). 
> Changes: add a ``getversion(key)`` method to dictionary which returns 
> ``None`` if the key doesn't exist. When a key is created or modified, 
> the entry version is set to the dictionary version which is incremented 
> at each change (create, modify, delete). 
> Pseudo-code of an efficient guard to check if a dict key was modified 
> using ``getversion()``:: 
>     UNSET = object() 
>     class Guard: 
>         def __init__(self, dict, key): 
>             self.dict = dict 
>             self.key = key 
>             self.dict_version = dict.__version__ 
>             self.entry_version = dict.getversion(key) 
>         def check(self):
>             """Return True if the dictionary value did not change."""
>             dict_version = self.dict.__version__
>             if dict_version == self.dict_version:
>                 # Fast-path: avoid the dictionary lookup 
>                 return True 
>             # lookup in the dictionary, but get the entry version, 
>             # not the value
>             entry_version = self.dict.getversion(self.key) 
>             if entry_version == self.entry_version: 
>                 # another key was modified: 
>                 # cache the new dictionary version 
>                 self.dict_version = dict_version 
>                 return True 
>             return False 
> The main drawback of this option is the impact on the memory footprint.
> It increases the size of each dictionary entry, so the overhead depends 
> on the number of buckets (dictionary entries, used or unused yet). For 
> example, it increases the size of each dictionary entry by 8 bytes on 
> 64-bit system if we use ``size_t``. 
> In Python, the memory footprint matters and the trend is more to reduce 
> it. Examples: 
> * `PEP 393 -- Flexible String Representation 
>   <>`_ 
> * `PEP 412 -- Key-Sharing Dictionary 
>   <>`_ 
> Add a new dict subtype 
> ---------------------- 
> Add a new ``verdict`` type, subtype of ``dict``. When guards are needed, 
> use the ``verdict`` for namespaces (module namespace, type namespace, 
> instance namespace, etc.) instead of ``dict``. 
> Leave the ``dict`` type unchanged to not add any overhead (memory 
> footprint) when guards are not needed. 
> Technical issue: a lot of C code in the wild, including CPython core, 
> expect the exact ``dict`` type. Issues: 
> * ``exec()`` requires a ``dict`` for globals and locals. A lot of code 
>   use ``globals={}``. It is not possible to cast the ``dict`` to a 
>   ``dict`` subtype because the caller expects the ``globals`` parameter 
>   to be modified (``dict`` is mutable). 
> * Functions call directly ``PyDict_xxx()`` functions, instead of calling 
>   ``PyObject_xxx()`` if the object is a ``dict`` subtype 
> * ``PyDict_CheckExact()`` check fails on ``dict`` subtype, whereas some 
>   functions require the exact ``dict`` type. 
> * ``Python/ceval.c`` does not completely support dict subtypes for
>   namespaces 
> The ``exec()`` issue is a blocker issue. 
> Other issues: 
> * The garbage collector has a special code to "untrack" ``dict`` 
>   instances. If a ``dict`` subtype is used for namespaces, the garbage 
>   collector may be unable to break some reference cycles. 
> * Some functions have a fast-path for ``dict`` which would not be taken 
>   for ``dict`` subtypes, and so it would make Python a little bit 
>   slower. 
> Usage of dict.__version__ 
> ========================= 
> astoptimizer of FAT Python 
> -------------------------- 
> The astoptimizer of the FAT Python project implements many optimizations 
> which require guards on namespaces. Examples: 
> * Call pure builtins: to replace ``len("abc")`` with ``3``, guards on 
>   ``builtins.__dict__['len']`` and ``globals()['len']`` are required 
> * Loop unrolling: to unroll the loop ``for i in range(...): ...``, 
>   guards on ``builtins.__dict__['range']`` and ``globals()['range']`` 
>   are required 
> The `FAT Python 
> <>`_ project is a 
> static optimizer for Python 3.6. 
> Pyjion 
> ------ 
> According to Brett Cannon, one of the two main developers of Pyjion,
> Pyjion can also benefit from dictionary version to implement
> optimizations.
> Pyjion is a JIT compiler for Python based upon CoreCLR (Microsoft .NET
> Core runtime).
> Unladen Swallow 
> --------------- 
> Even if dictionary version was not explicitly mentioned, optimizing
> globals and builtins lookup was part of the Unladen Swallow plan:
> "Implement one of the several proposed schemes for speeding lookups of
> globals and builtins." Source: `Unladen Swallow ProjectPlan
> <>`_.
> Unladen Swallow is a fork of CPython 2.6.1 adding a JIT compiler
> implemented with LLVM. The project stopped in 2011: `Unladen Swallow
> Retrospective <>`_.
> Prior Art 
> ========= 
> Cached globals+builtins lookup 
> ------------------------------ 
> In 2006, Andrea Griffini proposed a patch implementing a `Cached
> globals+builtins lookup optimization <>`_. 
> The patch adds a private ``timestamp`` field to dict. 
> See the thread on python-dev: `About dictionary lookup caching 
> <>`_. 
> Globals / builtins cache 
> ------------------------ 
> In 2010, Antoine Pitrou proposed a `Globals / builtins cache 
> <>`_ which adds a private 
> ``ma_version`` field to the ``dict`` type. The patch adds a "global and 
> builtin cache" to functions and frames, and changes ``LOAD_GLOBAL`` and 
> ``STORE_GLOBAL`` instructions to use the cache. 
> PySizer 
> ------- 
> `PySizer <>`_: a memory profiler for Python, 
> Google Summer of Code 2005 project by Nick Smallbone. 
> This project has a patch for CPython 2.4 which adds ``key_time`` and 
> ``value_time`` fields to dictionary entries. It uses a global 
> process-wide counter for dictionaries, incremented each time that a 
> dictionary is modified. The times are used to decide when child objects 
> first appeared in their parent objects. 
> Copyright 
> ========= 
> This document has been placed in the public domain. 
> -- 
> Victor 

From abarnert at  Sat Jan  9 22:22:41 2016
From: abarnert at (Andrew Barnert)
Date: Sat, 9 Jan 2016 19:22:41 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 9, 2016, at 04:48, Neil Girdhar <mistersheik at> wrote:
> How is this not just a poorer version of PyPy's optimizations?  If what you want is optimization, it would be much better to devote time to a solution that can potentially yield orders of magnitude worth of speedup like PyPy rather than increasing language complexity for a minor payoff.

I think he's already answered this twice between the two threads, plus at least once in the thread last year, not to mention similar questions from slightly different angles.

Which implies to me that the PEPs really need to anticipate and answer these questions.

From steve at  Sat Jan  9 22:31:41 2016
From: steve at (Steven D'Aprano)
Date: Sun, 10 Jan 2016 14:31:41 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 09, 2016 at 09:55:08AM -0500, Neil Girdhar wrote:

> I think this is admirable.  I also dream of faster Python.  However, we
> have a fundamental disagreement about how to get there.  You can spend your
> whole life adding one or two optimizations a year and Python may only end
> up twice as fast as it is now, which would still be dog slow. A meaningful
> speedup requires a JIT.  So, I question the value of this kind of change.

I think that's pessimistic and unrealistic. If Python were twice as fast 
as it is now, it would mean that scripts could process twice as much 
data in the same time as they do now. How is that not meaningful?

Sometimes I work hard to get a 5% or 10% improvement in speed of a 
function, because it's worth it. Doubling the speed is something I can 
only dream about.

As for a JIT, they have limited value for code that isn't long-running. 
As the PyPy FAQ says:

"Note also that our JIT has a very high warm-up cost, meaning that any 
program is slow at the beginning. If you want to compare the timings 
with CPython, even relatively simple programs need to run at least one 
second, preferrably at least a few seconds."

which means that PyPy is going to have little or no benefit for 
short-lived programs and scripts. But if you call those scripts 
thousands or tens of thousands of times (say, from the shell) the total 
amount of time can be considerable. Halving that time would be a good 

There is plenty of room in the Python ecosystem for many different 
approaches to optimization.

> It makes it more complex because you're adding a user-facing property.
> Every little property adds up in the cognitive load of a language.  It also
> means that all of the other Python implementation need to follow suit even
> if their optimizations work differently.

That second point is a reasonable criticism of Victor's idea.

> What is the point of making __version__ an exposed property?  Why can't it
> be a hidden variable in CPython's underlying implementation of dict?

Making it public means that anyone can make use of it. Just because 
Victor wants to use it for CPython optimizations doesn't mean that 
others can't or shouldn't make use of it for their own code. Victor 
wants to detect changes to globals() and builtins, but I might want 
to use it to detect changes to some other dict:

mydict = {'many': 1, 'keys': 2, 'with': 3, 'an': 4, 'invariant': 5}
v = mydict.__version__
...  # code that may or may not modify mydict
if v != mydict.__version__:
    ...  # the dict changed, so re-check the invariant

If Victor is right that tracking this version flag is cheap, then 
there's no reason not to expose it. Some people will find a good use for 
it, and others can ignore it.
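
Since dict has no __version__ attribute today, the usage above can only
be tried out against an emulation. The following is a minimal sketch,
assuming a hypothetical VersionedDict subclass with a plain "version"
attribute; it tracks only __setitem__ and __delitem__ and is not the
PEP's implementation.

```python
class VersionedDict(dict):
    """Hypothetical stand-in for the proposed dict.__version__.

    Only __setitem__ and __delitem__ are tracked; update(), pop(), clear()
    and friends are deliberately left out to keep the sketch short.
    """

    def __init__(self):
        super().__init__()
        self.version = 0

    def __setitem__(self, key, value):
        # Mirror the PEP: no bump when the key already maps to the same object.
        missing = object()
        if self.get(key, missing) is not value:
            self.version += 1
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)   # raises KeyError before any bump
        self.version += 1


value = 5
mydict = VersionedDict()
mydict['invariant'] = value
v = mydict.version

mydict['invariant'] = value    # same object stored again: no change
same_value_changed = (v != mydict.version)

mydict['other'] = 6            # a real modification
new_key_changed = (v != mydict.version)
```

The comparison is by identity ("is"), following the PEP's statement that
only the identifier of the value is tested, not its content.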


From steve at  Sat Jan  9 23:24:27 2016
From: steve at (Steven D'Aprano)
Date: Sun, 10 Jan 2016 15:24:27 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <n6s103$i6f$>
References: <>
Message-ID: <>

On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:
> On 1/8/2016 4:27 PM, Victor Stinner wrote:
> >Add a new read-only ``__version__`` property to ``dict`` and
> >``collections.UserDict`` types, incremented at each change.
> I agree with Neil Girdhar that this looks to me like a CPython-specific 
> implementation detail that should not be imposed on other 
> implementations.  For testing, perhaps we could add a dict_version 
> function in that uses ctypes to access the internals.
> Another reason to hide __version__ from the Python level is that its use 
> seems to me rather tricky and bug-prone.

What makes you say that? Isn't it a simple matter of:

v = mydict.__version__
maybe_modify(mydict)
if v != mydict.__version__:
    print("dict has changed")

which doesn't seem tricky or bug-prone to me.

The only thing I would consider is the risk that people will write v > 
mydict.__version__ instead of not equal, which is wrong if the flag 
overflows back to zero. But with a 64-bit flag, and one modification to 
the dict every nanosecond (i.e. a billion changes per second), it will 
take approximately 584 years before the counter overflows. I don't think 
this is a realistic scenario. How many computers do you know with an 
uptime of more than a decade?

(A 32-bit counter, on the other hand, will only take four seconds to 
overflow at that rate.)
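
The arithmetic behind those two figures can be checked with a few
lines. This is only a back-of-the-envelope sketch; the helper name
seconds_to_overflow is made up for illustration, and a 365.25-day year
is assumed.

```python
NS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumed: Julian year


def seconds_to_overflow(bits, changes_per_second=NS_PER_SECOND):
    """Seconds until a counter of the given width wraps around."""
    return 2**bits / changes_per_second


years_64 = seconds_to_overflow(64) / SECONDS_PER_YEAR
seconds_32 = seconds_to_overflow(32)

print(f"64-bit counter: {years_64:.1f} years")
print(f"32-bit counter: {seconds_32:.2f} seconds")
```

At one change per nanosecond the 64-bit counter lasts centuries, while
the 32-bit counter wraps within seconds, which is why the width matters
on 32-bit builds.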

> >Python is hard to optimize because almost everything is mutable: builtin
> >functions, function code, global variables, local variables, ... can be
> >modified at runtime.
> I believe that C-coded functions are immutable.  But I believe that 
> mutability otherwise undercuts what you are trying to do.

If I have understood Victor's intention correctly, what he's looking for 
is a way to quickly detect the shadowing or monkey-patching of builtins, 
so that if they *haven't* been shadowed/monkey-patched, functions can 
bypass the (slow) lookup process with a fast inline version.

Here's a sketch of the idea:

def demo(arg):
    return len(arg)

This has to do a time-consuming lookup of len in the globals, and if not 
found, then a second lookup in builtins. But 99.99% of the time, we 
haven't shadowed or monkey-patched len, so the compiler ought to be able 
to inline the function and skip the search. This is how static 
programming languages typically operate, and is one of the reasons why 
they're so fast.

In Python, you will often see functions like this:

def demo(arg, len=len):
    return len(arg)

which replace the slow global lookup with a fast local lookup, but at 
the cost of adding an extra parameter to the function call. Ugly and 
confusing. And, it has the side-effect that if you do shadow or 
monkey-patch len, the demo function won't see the new version, which may 
not be what you want.
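
That side-effect is easy to demonstrate. The sketch below temporarily
monkey-patches builtins.len and restores it afterwards; the function
names demo_global and demo_local are made up for the illustration.

```python
import builtins


def demo_global(arg):
    return len(arg)            # looked up in globals, then builtins, on every call


def demo_local(arg, len=len):
    return len(arg)            # bound once, when the def statement ran


original_len = builtins.len
builtins.len = lambda obj: 42  # monkey-patch the builtin
try:
    patched_global = demo_global("abc")
    patched_local = demo_local("abc")
finally:
    builtins.len = original_len    # always restore the real builtin
```

While the patch is active, demo_global sees the replacement and
demo_local keeps calling the original len it captured as a default.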

Victor wants to be able to make that idiom obsolete by allowing the 
compiler to automatically translate this:

def demo(arg):
    return len(arg)

into something like this:

def demo(arg):
    if len has been shadowed or monkey-patched:
        return len(arg)  # calls the new version
    else:
        return inlined or cached version of len(arg)

(I stress that you, the code's author, don't have to write the code like 
that, the compiler will automatically do this. And it won't just 
operate on len, it could potentially operate on any function that has 
no side-effects.)

This relies on the test for shadowing etc to be cheap, which Victor's 
tests suggest it is. But he needs a way to detect when the globals() and 
builtins.__dict__ dictionaries have been changed, hence his proposal.
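
A rough pure-Python sketch of that transformation follows. It assumes a
hypothetical NameGuard class and uses a plain dict named "namespace" as
a stand-in for a module's globals; without dict.__version__ the guard
has to do the two lookups every time, which is exactly the cost the
proposal wants to avoid.

```python
import builtins

_UNSET = object()


class NameGuard:
    """Hypothetical guard: has `name` been rebound in globals or builtins?"""

    def __init__(self, name, global_ns):
        self.name = name
        self.global_ns = global_ns
        self.global_value = global_ns.get(name, _UNSET)
        self.builtin_value = vars(builtins).get(name, _UNSET)

    def check(self):
        # With the proposed dict.__version__, these two lookups would only
        # run when a namespace's version number had changed.
        return (self.global_ns.get(self.name, _UNSET) is self.global_value
                and vars(builtins).get(self.name, _UNSET) is self.builtin_value)


namespace = {}                       # stands in for a module's globals()
guard = NameGuard('len', namespace)


def demo(arg):
    if guard.check():
        return arg.__len__()         # stand-in for an "inlined" len()
    # Slow path: the usual globals-then-builtins lookup.
    func = namespace.get('len', _UNSET)
    if func is _UNSET:
        func = builtins.len
    return func(arg)


fast_result = demo("abc")
namespace['len'] = lambda obj: -1    # shadow the builtin in "globals"
slow_result = demo("abc")
```

Once the name is shadowed, the guard fails and demo falls back to the
ordinary lookup, so the shadowing version is the one that gets called.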

> >Implementing optimizations respecting the Python
> >semantic requires to detect when "something changes":
> But as near as I can tell, your proposal cannot detect all relevant 
> changes unless one is *very* careful.  A dict maps hashable objects to 
> objects.  Objects represent values.  So a dict represents a mapping of 
> values to values.  If an object is mutated, the object to object mapping 
> is not changed, but the semantic value to value mapping *is* changed. 
> In the following example, __version__ twice gives the 'wrong' answer 
> from a value perspective.
> d = {'f': [int]}
> d['f'][0] = float # object mapping unchanged, value mapping changed
> d['f'] = [float]  # object mapping changed, value mapping unchanged

I don't think that matters for Victor's use-case. Going back to the toy 
example above, Victor doesn't need to detect internal modifications to 
the len built-in, because as you say it's immutable:

py> len.foo = "spam"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'builtin_function_or_method' object has no attribute
'foo'

He just needs to know if globals()['len'] and/or builtins.len are 
different (in any way) from how they were when the function "demo" was
compiled.

I'm sure that there are complications that I haven't thought of, but 
these sorts of static compiler optimizations aren't cutting edge 
computer science, they've been around for decades and are well-studied 
and well-understood.

> >The astoptimizer of the FAT Python project implements many optimizations
> >which require guards on namespaces. Examples:
> >
> >* Call pure builtins: to replace ``len("abc")`` with ``3``,
> Replacing a call with a return value assumes that the function is 
> immutable, deterministic, and without side-effect.  Perhaps this is what 
> you meant by 'pure'.  

Yes, "pure function" is the term of art for a function which is 
deterministic and free of side-effects.

Immutability is only important in the sense that if a function *is* pure 
now, you know it will be pure in the future as well.

> Are you proposing to provide astoptimizer with 
> either a whitelist or blacklist of builtins that qualify or not?

I don't think the implementation details of astoptimizer are important 
for this proposal.

> The question in my mind is whether real code has enough pure builtin 
> calls of constants to justify the overhead.

It's not just builtin calls of constants, this technique has much wider
application. If I understand Victor correctly, he thinks he can get 
function inlining, where instead of having to make a full function call 
to the built-in (which is slow), the compiler can jump directly to the 
function's implementation as if it were written inline.

Obviously you can't do this optimization if len has changed from the 
inlined version, hence Victor needs to detect changes to globals() and
builtins.

This shouldn't turn into a general critique of optimization techniques, 
but I think that Victor's PEP should justify why he is confident that 
these optimizations have a good chance to be worthwhile. It's not enough 
to end up with "well, we applied all the optimizations we could, and the 
good news is that Python is no slower". We want some evidence that it 
will actually be faster.


From rosuav at  Sun Jan 10 00:23:46 2016
From: rosuav at (Chris Angelico)
Date: Sun, 10 Jan 2016 16:23:46 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 3:24 PM, Steven D'Aprano <steve at> wrote:
> On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:
>> On 1/8/2016 4:27 PM, Victor Stinner wrote:
>> >Add a new read-only ``__version__`` property to ``dict`` and
>> >``collections.UserDict`` types, incremented at each change.
>> I agree with Neil Girdhar that this looks to me like a CPython-specific
>> implementation detail that should not be imposed on other
>> implementations.  For testing, perhaps we could add a dict_version
>> function in that uses ctypes to access the internals.
>> Another reason to hide __version__ from the Python level is that its use
>> seems to me rather tricky and bug-prone.
> What makes you say that? Isn't it a simple matter of:
> v = mydict.__version__
> maybe_modify(mydict)
> if v != mydict.__version__:
>     print("dict has changed")
> which doesn't seem tricky or bug-prone to me.

That doesn't. I would, however, expect that __version__ is a read-only
attribute. I can't imagine any justifiable excuse for changing it; if
you want to increment it, just mutate the dict in some unnecessary
way.

>> But as near as I can tell, your proposal cannot detect all relevant
>> changes unless one is *very* careful.  A dict maps hashable objects to
>> objects.  Objects represent values.  So a dict represents a mapping of
>> values to values.  If an object is mutated, the object to object mapping
>> is not changed, but the semantic value to value mapping *is* changed.
>> In the following example, __version__ twice gives the 'wrong' answer
>> from a value perspective.
>> d = {'f': [int]}
>> d['f'][0] = float # object mapping unchanged, value mapping changed
>> d['f'] = [float]  # object mapping changed, value mapping unchanged
> I don't think that matters for Victor's use-case. Going back to the toy
> example above, Victor doesn't need to detect internal modifications to
> the len built-in, because as you say it's immutable:
> py> len.foo = "spam"
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'builtin_function_or_method' object has no attribute
> 'foo'
> He just needs to know if globals()['len'] and/or builtins.len are
> different (in any way) from how they were when the function "demo" was
> compiled.

There's more to it than that. Yes, a dict maps values to values; but
the keys MUST be immutable (otherwise hashing has problems), and this
optimization doesn't actually care about the immutability of the
value. When you use the name "len" in a Python function, somewhere
along the way, that will resolve to some object. Currently, CPython
knows in advance that it isn't in the function-locals, but checks at
run-time for a global and then a built-in; all FAT Python is doing
differently is snapshotting the object referred to, and then having a
quick check to prove that globals and builtins haven't been mutated.

def enumerate_classes():
    return (cls.__name__ for cls in object.__subclasses__())

As long as nobody has *rebound* the name 'object', this will continue
to work - and it'll pick up new subclasses, which means that
something's mutable or non-pure in there. FAT Python should be able to
handle this just as easily as it handles an immutable. The only part
that has to be immutable is the string "len" or "object" that is used
as the key.

The significance of len being immutable and pure comes from the other
optimization, which is actually orthogonal to the non-rebound names
optimization, except that CPython already does this where it doesn't
depend on names.

CPython already constant-folds in situations where no names are
involved. That's how we maintain the illusion that there is such a
thing as a "complex literal":

>>> dis.dis(lambda: 1+2j)
  1           0 LOAD_CONST               3 ((1+2j))
              3 RETURN_VALUE

FAT Python proposes to do the same here:

>>> dis.dis(lambda: len("abc"))
  1           0 LOAD_GLOBAL              0 (len)
              3 LOAD_CONST               1 ('abc')
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 RETURN_VALUE
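
As an illustration of the kind of folding being proposed, here is a
minimal sketch using the standard ast module. The FoldLen transformer
is hypothetical and is not astoptimizer's code; it assumes the name
"len" still refers to the builtin, which is precisely what the guards
are meant to verify at run time.

```python
import ast


class FoldLen(ast.NodeTransformer):
    """Replace len(<string literal>) with the resulting integer constant."""

    def visit_Call(self, node):
        self.generic_visit(node)     # fold nested calls first
        if (isinstance(node.func, ast.Name) and node.func.id == 'len'
                and len(node.args) == 1 and not node.keywords
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            folded = ast.Constant(value=len(node.args[0].value))
            return ast.copy_location(folded, node)
        return node


tree = ast.parse('len("abc")', mode='eval')
tree = ast.fix_missing_locations(FoldLen().visit(tree))
folded_is_constant = isinstance(tree.body, ast.Constant)
result = eval(compile(tree, '<folded>', 'eval'))
```

After the transformation the expression body is a plain constant, so
the compiled code no longer needs to look up or call len at all.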

And that's where it might be important to check more than just the
identity of the object. If len were implemented in Python:

>>> def len(x):
...     l = 0
...     for l, _ in enumerate(x, 1): pass
...     return l
>>> len("abc")
3
>>> len
<function len at 0x7fc6111769d8>

then it would be possible to keep the same len object but change its behaviour.

>>> len.__code__ = (lambda x: 5).__code__
>>> len
<function len at 0x7fc6111769d8>
>>> len("abc")
5

Does anyone EVER do this? C compilers often have optimization levels
that can potentially alter the program's operation (eg replacing
division with multiplication by the reciprocal); if FAT Python has an
optimization flag that says "Assume no __code__ objects are ever
replaced", most programs would have no problem with it. (Having it
trigger an immediate exception would mean there's no "what the bleep
is going on" moment, and I still doubt it'll ever happen.)

I think there are some interesting possibilities here. Whether they
actually result in real improvement I don't know; but if FAT Python is
aiming to be fast at the "start program, do a tiny bit of work, and
then terminate" execution model (where JIT compilation can't help),
then it could potentially make Mercurial a *lot* faster to fiddle
with.


From ethan at  Sun Jan 10 01:25:25 2016
From: ethan at (Ethan Furman)
Date: Sat, 09 Jan 2016 22:25:25 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
 <n6s103$i6f$> <>
Message-ID: <>

On 01/09/2016 09:23 PM, Chris Angelico wrote:
> On Sun, Jan 10, 2016 at 3:24 PM, Steven D'Aprano <steve at> wrote:
>> On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:
>>> On 1/8/2016 4:27 PM, Victor Stinner wrote:
>>>> Add a new read-only ``__version__`` property to ``dict`` and
>>>> ``collections.UserDict`` types, incremented at each change.
>>> Another reason to hide __version__ from the Python level is that its use
>>> seems to me rather tricky and bug-prone.
>> What makes you say that? Isn't it a simple matter of:
>>  [snip]
>> which doesn't seem tricky or bug-prone to me.
> That doesn't. I would, however, expect that __version__ is a read-only
> attribute.

You mean like it says in the first quote of this message?  ;)


From rosuav at  Sun Jan 10 02:17:07 2016
From: rosuav at (Chris Angelico)
Date: Sun, 10 Jan 2016 18:17:07 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 5:25 PM, Ethan Furman <ethan at> wrote:
>> That doesn't. I would, however, expect that __version__ is a read-only
>> attribute.
> You mean like it says in the first quote of this message?  ;)

D'oh. Yep. Reminder, to self: Read through things twice. You never
know what you missed the first time.


From bunslow at  Sun Jan 10 03:27:36 2016
From: bunslow at (Bill Winslow)
Date: Sun, 10 Jan 2016 02:27:36 -0600
Subject: [Python-ideas] Fwd: Using functools.lru_cache only on some
 arguments of a function
In-Reply-To: <>
References: <>
Message-ID: <>

Sorry for the late reply everyone.

I think relying on closures, while a solution, is messy. I'd still much
prefer a way to tell lru_cache to merely ignore certain arguments. I'll use
some variant of the
storing-the-partial-progress-as-an-attribute-on-the-cached-recursive-function
method (either with the cached-recursive hidden from the top level via the
try/except stuff, or with a simpler wrapper function).

I've further considered my original proposal, and rather than naming it
"arg_filter", I realized that builtins like sorted(), min(), max(), etc all
already have the exact same thing -- a "key" argument which transforms the
elements to the user's purpose. (In the sorted/min/max case, it's called on
the elements of the argument rather than the argument itself, but it's
still the same concept.) So basically, my original proposal with renaming
from arg_filter to key, is tantamount to extending the same functionality
from sorted/min/max to lru_cache as well. As has been pointed out, my own
use case is almost certainly *not* the only use case. The implementation
and interface are both simple, and simpler than the alternatives which I'll
rely on for now (wrappers and closures, or worse, global singletons etc). I
would still like to see it in the stdlib in the future. I've appended a
largely similar patch with the proposed additions (there's some internal
variable renaming to avoid confusion, resulting in a longer diff).

Thanks again for all the input.



< def _make_key(args, kwds, typed,
> def _make_key(args, kwds, typed, key,
<     key = args
>     if key is not None:
>         args, kwds = key(args, kwds)
>     cache_key = args
<         key += kwd_mark
>         cache_key += kwd_mark
<             key += item
>             cache_key += item
<         key += tuple(type(v) for v in args)
>         cache_key += tuple(type(v) for v in args)
<             key += tuple(type(v) for k, v in sorted_items)
<     elif len(key) == 1 and type(key[0]) in fasttypes:
<         return key[0]
<     return _HashedSeq(key)
>             cache_key += tuple(type(v) for k, v in sorted_items)
>     elif len(cache_key) == 1 and type(cache_key[0]) in fasttypes:
>         return cache_key[0]
>     return _HashedSeq(cache_key)
< def lru_cache(maxsize=128, typed=False):
> def lru_cache(maxsize=128, typed=False, key=None):
>     If *key* is not None, it must be a callable which acts on the arguments
>     passed to the function. Its return value is used in place of the actual
>     arguments. It works analogously to the *key* argument to the builtins
>     sorted, max, and min.
>     if key is not None and not callable(key):
>         raise TypeError('Expected key to be a callable')
<         wrapper = _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo)
>         wrapper = _lru_cache_wrapper(user_function, maxsize, typed, key,
>                                      _CacheInfo)
< def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo):
> def _lru_cache_wrapper(user_function, maxsize, typed, key, _CacheInfo):
<             key = make_key(args, kwds, typed)
<             result = cache_get(key, sentinel)
>             cache_key = make_key(args, kwds, typed, key)
>             result = cache_get(cache_key, sentinel)
<             cache[key] = result
>             cache[cache_key] = result
<             key = make_key(args, kwds, typed)
>             cache_key = make_key(args, kwds, typed, key)
<                 link = cache_get(key)
>                 link = cache_get(cache_key)
<                 if key in cache:
>                 if cache_key in cache:
<                     oldroot[KEY] = key
>                     oldroot[KEY] = cache_key
<                     cache[key] = oldroot
>                     cache[cache_key] = oldroot
<                     link = [last, root, key, result]
<                     last[NEXT] = root[PREV] = cache[key] = link
>                     link = [last, root, cache_key, result]
>                     last[NEXT] = root[PREV] = cache[cache_key] = link
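
For comparison, roughly the same behaviour can be had today without patching functools. The following is a minimal sketch, not the patch above: `keyed_lru_cache` and `_KeyedCall` are hypothetical names, and it assumes the key function is called as ``key(*args, **kwargs)`` rather than ``key(args, kwds)``. The real arguments ride along inside an object that hashes and compares by the key only.

```python
import functools


class _KeyedCall:
    """Carries the real arguments but hashes and compares by the key alone."""

    __slots__ = ("key", "args", "kwargs")

    def __init__(self, key, args, kwargs):
        self.key = key
        self.args = args
        self.kwargs = kwargs

    def __hash__(self):
        return hash(self.key)

    def __eq__(self, other):
        return isinstance(other, _KeyedCall) and self.key == other.key


def keyed_lru_cache(key, maxsize=128):
    """Hypothetical decorator: cache on key(*args, **kwargs) only."""
    if not callable(key):
        raise TypeError("Expected key to be a callable")

    def decorator(func):
        @functools.lru_cache(maxsize=maxsize)
        def cached(call):
            # Only reached on a cache miss; uses the arguments of that call.
            return func(*call.args, **call.kwargs)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return cached(_KeyedCall(key(*args, **kwargs), args, kwargs))

        wrapper.cache_info = cached.cache_info
        wrapper.cache_clear = cached.cache_clear
        return wrapper

    return decorator


calls = []


@keyed_lru_cache(key=lambda n, scratch: n)  # ignore `scratch` when caching
def square(n, scratch):
    calls.append(n)
    return n * n


first = square(3, [])
second = square(3, ["different", "scratch"])
```

One cost of this approach is that each cache entry keeps the arguments of the call that created it alive, including the ignored ones.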

On Wed, Dec 30, 2015 at 11:10 PM, Michael Selik <mike at> wrote:

> On Tue, Dec 29, 2015 at 2:14 AM Franklin? Lee <
> leewangzhong+python at> wrote:
>> On Sat, Dec 12, 2015 at 1:34 PM, Michael Selik <mike at> wrote:
>> > On Fri, Dec 11, 2015, 8:20 PM Franklin? Lee <
>> leewangzhong+python at>
>> > wrote:
> > This whole thing is probably best implemented as two separate functions
>> > rather than using a closure, depending on how intertwined the code
>> paths are
>> > for the shortcut/non-shortcut versions.
>> I like the closure because it has semantic ownership: the inner
>> function is a worker for the outer function.
> True, a closure has better encapsulation, making it less likely someone
> will misuse the helper function. On the other hand, that means there's less
> modularity and it would be difficult for someone to use the inner function.
> It's hard to know the right choice without seeing the exact problem the
> original author was working on.
>> >> On Fri, Dec 11, 2015 at 8:01 PM, Franklin? Lee
>> >> <leewangzhong+python at> wrote:
>> >> > 1. Rewrite your recursive function so that the partial state is a
>> >> > nonlocal variable (in the closure), and memoize the recursive part.
>> >
>> > I'd flip the rare-case to the except block and put the normal-case in
>> the
>> > try block. I believe this will be more compute-efficient and more
>> readable.
>> The rare case is in the except block, though.
> You're correct. Sorry, I somehow misinterpreted the comment, "# To trigger
> the exception the first time" as indicating that code path would run only
> once.

From rmcgibbo at  Sun Jan 10 04:50:31 2016
From: rmcgibbo at (Robert McGibbon)
Date: Sun, 10 Jan 2016 01:50:31 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <n6rq9d$glk$>
References: <>
Message-ID: <>

On 1/8/2016 6:04 PM, Guido van Rossum wrote:

> At Dropbox we're trying to be good citizens and we're working towards
> introducing gradual typing (PEP 484) into our Python code bases (several
> million lines of code). However, that code base is mostly still Python
> 2.7 and we believe that we should introduce gradual typing first and
> start working on conversion to Python 3 second (since having static
> types in the code can help a big refactoring like that).

Big +1

I maintain some packages that are single-source 2/3 compatible packages,
thus we haven't been able to add type annotations yet (which I was
initially skeptical about, but now love) without dropping py2 support. So
even for packages that have already been ported to py3, this proposal would
be great.
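
As an illustration of what single-source code could look like under the proposal, here is a sketch using the signature type comment form that was later recorded in PEP 484 for Python 2.7 and straddling code. `join_names` is a made-up function; on 2.7 the `typing` import would come from the backport package.

```python
from typing import List  # on Python 2.7 this assumes the `typing` backport


def join_names(names, sep=", "):
    # type: (List[str], str) -> str
    """Join names with a separator; the types live in the comment above."""
    return sep.join(names)


result = join_names(["spam", "eggs"])
# No Python 3 annotation syntax is used, so the function stays importable on 2.7.
annotations = getattr(join_names, "__annotations__", {})
```

A type checker reads the comment; the interpreter ignores it, which is why the function carries no runtime annotations.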


From victor.stinner at  Sun Jan 10 08:01:05 2016
From: victor.stinner at (Victor Stinner)
Date: Sun, 10 Jan 2016 14:01:05 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>


Andrew Barnert:
> Which implies to me that the PEPs really need to anticipate and answer these questions.

The dict.__version__ PEP mentions FAT Python as a use case. In fact,
I should point to the func.specialize() PEP, which already partially
explains the motivation for static optimizers:

But OK, I will enhance the PEP 510 rationale to explain why static
optimizers make sense in Python, maybe even more sense than a JIT
compiler in some cases (short-lived programs). By the way, I think
that Mercurial is a good example of a short-lived program. (There is a
project for a local "server" to keep a process in the background; that
one would benefit from a JIT compiler.)


From victor.stinner at  Sun Jan 10 08:08:47 2016
From: victor.stinner at (Victor Stinner)
Date: Sun, 10 Jan 2016 14:08:47 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
 <n6s103$i6f$> <>
Message-ID: <>

2016-01-10 5:24 GMT+01:00 Steven D'Aprano <steve at>:
> Here's a sketch of the idea:
> def demo(arg):
>     return len(arg)
> (...)

For examples of guards and how they can be used, please see the PEP 510:


From victor.stinner at  Sun Jan 10 08:32:46 2016
From: victor.stinner at (Victor Stinner)
Date: Sun, 10 Jan 2016 14:32:46 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
 <n6s103$i6f$> <>
Message-ID: <>


2016-01-10 6:23 GMT+01:00 Chris Angelico <rosuav at>:
> Consider:
> def enumerate_classes():
>     return (cls.__name__ for cls in object.__subclasses__())
> As long as nobody has *rebound* the name 'object', this will continue
> to work - and it'll pick up new subclasses, which means that
> something's mutable or non-pure in there. FAT Python should be able to
> handle this just as easily as it handles an immutable. The only part
> that has to be immutable is the string "len" or "object" that is used
> as the key.

FYI I implemented a "copy builtin to constant" optimization which
replaces "LOAD_GLOBAL object" instruction with "LOAD_CONST <object

It uses a guard on the builtin and global namespaces to disable the
optimization if object is replaced.
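
A rough pure-Python picture of such a guard, for readers who have not looked at the implementation. This is a sketch only: `make_guarded` is a hypothetical helper, the real guards are written in C, and here the check is a plain identity test rather than a dictionary version comparison.

```python
import builtins


def make_guarded(name, module_globals, specialized, generic):
    """Use `specialized` only while `name` still resolves to the builtin
    that was current when the guard was created (hypothetical helper)."""
    snapshot = getattr(builtins, name)

    def guard_ok():
        # A global with the same name would shadow the builtin.
        if name in module_globals:
            return False
        # The builtin itself must not have been rebound.
        return builtins.__dict__.get(name) is snapshot

    def call(*args):
        return specialized(*args) if guard_ok() else generic(*args)

    return call


namespace = {}  # stands in for a module's globals

fast_len_abc = make_guarded(
    "len",
    namespace,
    specialized=lambda: 3,  # the pre-computed constant
    generic=lambda: eval('len("abc")', namespace),  # the original code
)

optimized = fast_len_abc()
namespace["len"] = lambda x: 5  # shadow the builtin in the "module"
deoptimized = fast_len_abc()
del namespace["len"]
restored = fast_len_abc()
```

The guard fails as soon as the name is shadowed, so the original code runs and the shadowing function is honoured.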

If you want to make object.__subclasses__ constant, we need more guards:

* guard on the object.__subclasses__ attribute
* guard on the private tp_version_tag attribute of the object type
* ... and it looks like object.__subclasses__ uses weak references, so
I'm not sure that it's really possible to make object.__subclasses__()
constant with guards. Is it really worth it? Is it a common case?

Oh... I just remember that the "type" type already implements a
version as I propose for dict. It's called "tp_version_tag" and it's
private. It has the C type "unsigned int" and it's incremented at each

> And that's where it might be important to check more than just the
> identity of the object. If len were implemented in Python:
>>>> def len(x):
> ...     l = 0
> ...     for l, _ in enumerate(x, 1): pass
> ...     return l
> ...
>>>> len("abc")
> 3
>>>> len
> <function len at 0x7fc6111769d8>
> then it would be possible to keep the same len object but change its behaviour.
>>>> len.__code__ = (lambda x: 5).__code__
>>>> len
> <function len at 0x7fc6111769d8>
>>>> len("abc")
> 5
> Does anyone EVER do this?

FAT Python implements a fat.GuardFunc which checks whether func.__code__
was replaced. It doesn't matter that replacing func.__code__ is
unlikely. An optimizer must not change Python semantics, otherwise it
will break some applications and cannot be used
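
The check itself is cheap. The sketch below shows the idea in pure Python; `CodeGuard` is a hypothetical stand-in written for this illustration and is not the fat.GuardFunc API.

```python
class CodeGuard:
    """Hypothetical stand-in: remembers a function's code object."""

    def __init__(self, func):
        self.func = func
        self.code = func.__code__

    def check(self):
        # True only while the function still runs the code we inspected.
        return self.func.__code__ is self.code


def my_len(x):
    n = 0
    for n, _ in enumerate(x, 1):
        pass
    return n


guard = CodeGuard(my_len)
ok_before = guard.check()
value_before = my_len("abc")

# Same function object, different behaviour.
my_len.__code__ = (lambda x: 5).__code__

ok_after = guard.check()
value_after = my_len("abc")
```

An identity check on the function object alone would have passed in both cases, which is why the code object has to be part of the guard.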

> if FAT Python has an
> optimization flag that says "Assume no __code__ objects are ever
> replaced", most programs would have no problem with it. (Having it
> trigger an immediate exception would mean there's no "what the bleep
> is going on" moment, and I still doubt it'll ever happen.)

In my plan, I will add an option to skip guards if you are 100% sure
that some things will never change. For example, if you control all
code of your application (not only the app itself, all modules) and
you know that func.__code__ is never replaced, you can skip
fat.GuardFunc (not emit them).

> I think there are some interesting possibilities here. Whether they
> actually result in real improvement I don't know; but if FAT Python is
> aiming to be fast at the "start program, do a tiny bit of work, and
> then terminate" execution model (where JIT compilation can't help),
> then it could potentially make Mercurial a *lot* faster to fiddle
> with.

FAT Python is designed to compile the code ahead of time. The
installation can be pre-optimized in a package, or optimized at
installation time, but it's not optimized when the program is started.
If the optimizations are efficient, the program will run faster, even
for short-lived programs (yes, like Mercurial).


From ericfahlgren at  Sun Jan 10 10:28:07 2016
From: ericfahlgren at (Eric Fahlgren)
Date: Sun, 10 Jan 2016 07:28:07 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <01b601d14bbb$82648b90$872da2b0$>

Steven D'Aprano Saturday, January 09, 2016 19:32:
> I think that's pessimistic and unrealistic. If Python were twice as fast
as it is now, it would mean that scripts could process twice as much data in
the same time as they do now. How is that not meaningful?
> Sometimes I work hard to get a 5% or 10% improvement in speed of a
function, because it's worth it. Doubling the speed is something I can only
dream about.

Often when I hear people complain about "tiny" improvements, I change the
context: "Ok, I'm going to raise your salary 5%, or is that too small and
you don't want it?"  Suddenly that 5% looks pretty good.

From rosuav at  Sun Jan 10 10:36:00 2016
From: rosuav at (Chris Angelico)
Date: Mon, 11 Jan 2016 02:36:00 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <01b601d14bbb$82648b90$872da2b0$>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 2:28 AM, Eric Fahlgren <ericfahlgren at> wrote:
> Steven D'Aprano Saturday, January 09, 2016 19:32:
>> I think that's pessimistic and unrealistic. If Python were twice as fast
> as it is now, it would mean that scripts could process twice as much data in
> the same time as they do now. How is that not meaningful?
>> Sometimes I work hard to get a 5% or 10% improvement in speed of a
> function, because it's worth it. Doubling the speed is something I can only
> dream about.
> Often when I hear people complain about "tiny" improvements, I change the
> context: "Ok, I'm going to raise your salary 5%, or is that too small and
> you don't want it?"  Suddenly that 5% looks pretty good.

Although realistically, it's more like saying "If you put in enough
overtime, I'll raise by 5% the rate you get paid for one of the many
types of work you do". Evaluating that depends on what proportion of
your salary comes from that type of work.
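
Put as arithmetic, this is Amdahl's law. A small sketch, with `overall_speedup` a name made up for the illustration and the 10% figure an arbitrary example:

```python
def overall_speedup(fraction, local_speedup):
    """Whole-program speedup when `fraction` of the run time gets
    `local_speedup` times faster (Amdahl's law)."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# A function made 5% faster that accounts for all of the run time...
everywhere = overall_speedup(1.0, 1.05)
# ...versus the same function accounting for a tenth of it.
one_tenth = overall_speedup(0.1, 1.05)
print(round(everywhere, 4), round(one_tenth, 4))
```

In the second case the program as a whole gets less than half a percent faster.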

5% across the board is pretty good. 5% to one function is only worth
serious effort if that's a hot spot. But broadly I do agree - 5% is


From nicholas.chammas at  Sun Jan 10 10:38:54 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sun, 10 Jan 2016 10:38:54 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <01b601d14bbb$82648b90$872da2b0$>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 10:28 AM, Eric Fahlgren <ericfahlgren at>
wrote:

> Often when I hear people complain about "tiny" improvements, I change the
> context: "Ok, I'm going to raise your salary 5%, or is that too small and
> you don't want it?"  Suddenly that 5% looks pretty good.

To extend this analogy a bit, I think Neil's objection was more along the
lines of "Why work an extra 5 hours a week for only a 5% raise?"

I don't think anyone's going to pooh-pooh a performance improvement. Neil's
concern is just about whether the benefit justifies the cost.


From victor.stinner at  Sun Jan 10 11:47:21 2016
From: victor.stinner at (Victor Stinner)
Date: Sun, 10 Jan 2016 17:47:21 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 10, 2016 4:39 PM, "Nicholas Chammas" <nicholas.chammas at> wrote:
> To extend this analogy a bit, I think Neil's objection was more along the
> lines of "Why work an extra 5 hours a week for only a 5% raise?"

Your analogy is wrong. I am working and you get the salary.


From mistersheik at  Sun Jan 10 11:48:35 2016
From: mistersheik at (Neil Girdhar)
Date: Sun, 10 Jan 2016 11:48:35 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

I read through this thread and just want to quickly address some good

First of all, I didn't mean to suggest that this kind of optimization is
not useful.  Of course, I will be thankful for any optimization that makes
it into CPython.  Making CPython faster is good, useful work.  It's just
that my dream of the future of Python is one where Python is faster than C
thanks to a very clever JIT.

> > I agree with Neil Girdhar that this looks to me like a CPython-specific
> > implementation detail that should not be imposed on other
> > implementations.  For testing, perhaps we could add a dict_version
> > function in that uses ctypes to access the internals.
> >
> > Another reason to hide __version__ from the Python level is that its use
> > seems to me rather tricky and bug-prone.
> What makes you say that? Isn't it a simple matter of:
> v = mydict.__version__
> maybe_modify(mydict)
> if v != mydict.__version__:
>     print("dict has changed")

This is exactly what I want to avoid.  If you want to do something like
this, I think you should do it in regular Python by subclassing dict and
overriding the mutating methods.  What happens if someone uses a custom
Mapping?  Do all custom Mappings need to implement __version__?   Do they
need a __version__ that indicates that no key-value pairs have changed, and
another version that indicates that nothing has changed (for example
OrderedDict has an order, sorteddict has a sort function; changing either
of those doesn't change key-value pairs).  This is not supposed to be
user-facing;  this is an interpreter optimization.
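
For concreteness, the subclassing approach might look like the sketch below. `VersionedDict` and its `version` attribute are hypothetical names, not part of the PEP, and the counter is deliberately conservative: it may increase when nothing actually changed, but never misses a successful mutating call.

```python
class VersionedDict(dict):
    """Illustrative only: counts mutating calls in pure Python."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0


def _bumping(name):
    method = getattr(dict, name)

    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        # Reached only if the underlying call did not raise.
        self.version += 1
        return result

    wrapper.__name__ = name
    return wrapper


# dict's own methods do not route through __setitem__, so each mutating
# method has to be wrapped separately.
for _name in ("__setitem__", "__delitem__", "clear", "pop", "popitem",
              "setdefault", "update", "__ior__"):
    if hasattr(dict, _name):  # __ior__ only exists on newer Pythons
        setattr(VersionedDict, _name, _bumping(_name))


mydict = VersionedDict(a=1)
v = mydict.version
mydict["b"] = 2
changed = v != mydict.version

v2 = mydict.version
lookup = mydict.get("a")  # reads do not touch the counter
unchanged = v2 == mydict.version
```

This works for user code that wants change detection, at the price of a Python-level call on every mutation.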

> Obviously a JIT can help, but even they can benefit from this. For
> instance, Pyjion could rely on this instead of creating our own guards for
> built-in and global namespaces if we wanted to inline calls to certain
> built-ins.

I understand that, but what if another JIT decides that instead of
__version__ being an attribute on the object, version is a global mapping
from objects to version numbers?  What if someone else wants to implement
it instead as a set of changed objects at every sequence point?  There are
many ways to do this optimization.  It's not obvious to me that everyone
will want to do it this way.

> C compilers often have optimization levels that can potentially alter the
> program's operation

Some of those optimizations lead to bugs that are very hard to track down.
One of the advantages of Python is that what you pay for in runtime, you
save ten-fold in development time.

In summary, I am 100% behind this idea if it were hidden from the user.



From steve at  Sun Jan 10 12:57:32 2016
From: steve at (Steven D'Aprano)
Date: Mon, 11 Jan 2016 04:57:32 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote:

> > v = mydict.__version__
> > maybe_modify(mydict)
> > if v != mydict.__version__:
> >     print("dict has changed")
> This is exactly what I want to avoid.  If you want to do something like
> this, I think you should do it in regular Python by subclassing dict and
> overriding the mutating methods.

That doesn't help Victor, because exec needs an actual dict, not
subclasses. Victor's PEP says this is a blocker.

I can already subclass dict to do that now. But if Victor's suggestion 
is accepted, then I don't need to. The functionality will already exist. 
Why shouldn't I use it?

> What happens if someone uses a custom Mapping?

If they inherit from dict or UserDict, they get this functionality for 
free. If they don't, they're responsible for implementing it if they 
want it.

> Do all custom Mappings need to implement __version__?

I believe the answer to that is No, but the PEP probably should clarify
that.


From mistersheik at  Sun Jan 10 13:35:10 2016
From: mistersheik at (Neil Girdhar)
Date: Sun, 10 Jan 2016 13:35:10 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 12:57 PM, Steven D'Aprano <steve at>
wrote:

> On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote:
> [...]
> > > v = mydict.__version__
> > > maybe_modify(mydict)
> > > if v != mydict.__version__:
> > >     print("dict has changed")
> >
> >
> > This is exactly what I want to avoid.  If you want to do something like
> > this, I think you should do it in regular Python by subclassing dict and
> > overriding the mutating methods.
> That doesn't help Victor, because exec needs an actual dict, not
> subclasses. Victor's PEP says this is a blocker.

No, he can still do what he wants transparently in the interpreter.  What I
want to avoid is Python users using __version__ in their own code.

> I can already subclass dict to do that now. But if Victor's suggestion
> is accepted, then I don't need to. The functionality will already exist.
> Why shouldn't I use it?

Because people write code for the abc "Mapping".  What you are suggesting
is then to add "__version__" to the abc Mapping, which I am against.
Mapping provides the minimum interface to be a mapping; there is no reason
that every Mapping should have a "__version__".

> > What happens if someone uses a custom Mapping?
> If they inherit from dict or UserDict, they get this functionality for
> free. If they don't, they're responsible for implementing it if they
> want it.

But they shouldn't have to implement it just so that code written for
Mappings works, as it does now.

> > Do all custom Mappings need to implement __version__?
> I believe the answer to that is No, but the PEP probably should clarify
> that.

If the answer is "no" then honestly no user should write code counting on
the existence of __version__.

> --
> Steve

From mike at  Sun Jan 10 14:03:55 2016
From: mike at (Michael Selik)
Date: Sun, 10 Jan 2016 19:03:55 +0000
Subject: [Python-ideas] Fwd: Using functools.lru_cache only on some
 arguments of a function
In-Reply-To: <>
References: <>
Message-ID: <>

Shouldn't the key function be called with ``key(*args, **kwargs)``?

It'd be helpful to see the entire revision, rather than just the diff. It's
easier for me to read at least.

On Sun, Jan 10, 2016, 3:27 AM Bill Winslow <bunslow at> wrote:

> Sorry for the late reply everyone.
> I think relying on closures, while a solution, is messy. I'd still much
> prefer a way to tell lru_cache to merely ignore certain arguments. I'll use
> some variant of the
> storing-the-partial-progress-as-an-attribute-on-the-cached-recursive-function
> method (either with the cached-recursive hidden from the top level via the
> try/except stuff, or with a more simple wrapper function).
> I've further considered my original proposal, and rather than naming it
> "arg_filter", I realized that builtins like sorted(), min(), max(), etc all
> already have the exact same thing -- a "key" argument which transforms the
> elements to the user's purpose. (In the sorted/min/max case, it's called on
> the elements of the argument rather than the argument itself, but it's
> still the same concept.) So basically, my original proposal with renaming
> from arg_filter to key, is tantamount to extending the same functionality
> from sorted/min/max to lru_cache as well. As has been pointed out, my own
> use case is almost certainly *not* the only use case. The implementation
> and interface are both simple, and simpler than the alternatives which I'll
> rely on for now (wrappers and closures, or worse, global singletons etc). I
> would still like to see it in the stdlib in the future. I've appended a
> largely similar patch with the proposed additions (there's some internal
> variable renaming to avoid confusion, resulting in a longer diff).
> Thanks again for all the input.
> -Bill
> -----------------------------------------------------------------------------------------------------------------
> diff
> 363c363
> < def _make_key(args, kwds, typed,
> ---
> > def _make_key(args, kwds, typed, key,
> 377c377,379
> <     key = args
> ---
> >     if key is not None:
> >         args, kwds = key(args, kwds)
> >     cache_key = args
> 380c382
> <         key += kwd_mark
> ---
> >         cache_key += kwd_mark
> 382c384
> <             key += item
> ---
> >             cache_key += item
> 384c386
> <         key += tuple(type(v) for v in args)
> ---
> >         cache_key += tuple(type(v) for v in args)
> 386,389c388,391
> <             key += tuple(type(v) for k, v in sorted_items)
> <     elif len(key) == 1 and type(key[0]) in fasttypes:
> <         return key[0]
> <     return _HashedSeq(key)
> ---
> >             cache_key += tuple(type(v) for k, v in sorted_items)
> >     elif len(cache_key) == 1 and type(cache_key[0]) in fasttypes:
> >         return cache_key[0]
> >     return _HashedSeq(cache_key)
> 391c393
> < def lru_cache(maxsize=128, typed=False):
> ---
> > def lru_cache(maxsize=128, typed=False, key=None):
> 400a403,407
> >     If *key* is not None, it must be a callable which acts on the arguments
> >     passed to the function. Its return value is used in place of the actual
> >     arguments. It works analogously to the *key* argument to the builtins
> >     sorted, max, and min.
> >
> 421a429,431
> >     if key is not None and not callable(key):
> >         raise TypeError('Expected key to be a callable')
> >
> 423c433,434
> <         wrapper = _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo)
> ---
> >         wrapper = _lru_cache_wrapper(user_function, maxsize, typed, key,
> >                                      _CacheInfo)
> 428c439
> < def _lru_cache_wrapper(user_function, maxsize, typed, _CacheInfo):
> ---
> > def _lru_cache_wrapper(user_function, maxsize, typed, key, _CacheInfo):
> 456,457c467,468
> <             key = make_key(args, kwds, typed)
> <             result = cache_get(key, sentinel)
> ---
> >             cache_key = make_key(args, kwds, typed, key)
> >             result = cache_get(cache_key, sentinel)
> 462c473
> <             cache[key] = result
> ---
> >             cache[cache_key] = result
> 471c482
> <             key = make_key(args, kwds, typed)
> ---
> >             cache_key = make_key(args, kwds, typed, key)
> 473c484
> <                 link = cache_get(key)
> ---
> >                 link = cache_get(cache_key)
> 487c498
> <                 if key in cache:
> ---
> >                 if cache_key in cache:
> 496c507
> <                     oldroot[KEY] = key
> ---
> >                     oldroot[KEY] = cache_key
> 513c524
> <                     cache[key] = oldroot
> ---
> >                     cache[cache_key] = oldroot
> 517,518c528,529
> <                     link = [last, root, key, result]
> <                     last[NEXT] = root[PREV] = cache[key] = link
> ---
> >                     link = [last, root, cache_key, result]
> >                     last[NEXT] = root[PREV] = cache[cache_key] = link
> On Wed, Dec 30, 2015 at 11:10 PM, Michael Selik <mike at> wrote:
>> On Tue, Dec 29, 2015 at 2:14 AM Franklin? Lee <
>> leewangzhong+python at> wrote:
>>> On Sat, Dec 12, 2015 at 1:34 PM, Michael Selik <mike at> wrote:
>>> > On Fri, Dec 11, 2015, 8:20 PM Franklin? Lee <
>>> leewangzhong+python at>
>>> > wrote:
>> > This whole thing is probably best implemented as two separate functions
>>> > rather than using a closure, depending on how intertwined the code
>>> paths are
>>> > for the shortcut/non-shortcut versions.
>>> I like the closure because it has semantic ownership: the inner
>>> function is a worker for the outer function.
>> True, a closure has better encapsulation, making it less likely someone
>> will misuse the helper function. On the other hand, that means there's less
>> modularity and it would be difficult for someone to use the inner function.
>> It's hard to know the right choice without seeing the exact problem the
>> original author was working on.
>>> >> On Fri, Dec 11, 2015 at 8:01 PM, Franklin? Lee
>>> >> <leewangzhong+python at> wrote:
>>> >> > 1. Rewrite your recursive function so that the partial state is a
>>> >> > nonlocal variable (in the closure), and memoize the recursive part.
>>> >
>>> > I'd flip the rare-case to the except block and put the normal-case in
>>> the
>>> > try block. I believe this will be more compute-efficient and more
>>> readable.
>>> The rare case is in the except block, though.
>> You're correct. Sorry, I somehow misinterpreted the comment, "# To
>> trigger the exception the first time" as indicating that code path would
>> run only once.

From victor.stinner at  Sun Jan 10 15:02:47 2016
From: victor.stinner at (Victor Stinner)
Date: Sun, 10 Jan 2016 21:02:47 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-10 18:57 GMT+01:00 Steven D'Aprano <steve at>:
>> Do all custom Mappings need to implement __version__?
> I believe the answer to that is No, but the PEP probably should clarify
> that.

In the PEP, I wrote "The PEP is designed to implement guards on
namespaces, only the dict type can be used for namespaces in practice.
collections.UserDict is modified because it must mimic dict.
collections.Mapping is unchanged."

Is that enough? If not, what would you suggest to make it more explicit?


From ethan at  Sun Jan 10 15:32:50 2016
From: ethan at (Ethan Furman)
Date: Sun, 10 Jan 2016 12:32:50 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/10/2016 12:02 PM, Victor Stinner wrote:
> 2016-01-10 18:57 GMT+01:00 Steven D'Aprano <steve at>:
>>> Do all custom Mappings need to implement __version__?
>> I believe the answer to that is No, but the PEP probably should clarify
>> that.
> In the PEP, I wrote "The PEP is designed to implement guards on
> namespaces, only the dict type can be used for namespaces in practice.
> collections.UserDict is modified because it must mimic dict.
> collections.Mapping is unchanged."
> Is it enough? If not, what do you suggest to be more explicit?

It is enough.


From jim.baker at  Sun Jan 10 16:51:13 2016
From: jim.baker at (Jim Baker)
Date: Sun, 10 Jan 2016 14:51:13 -0700
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

FWIW, we now fully support both Jedi and lib2to3 in Jython 2.7.1 master.
With some other work this weekend, we should be releasing 2.7.1 beta 3 and
then shortly a RC - we just fixed the last blocking bug.

On Sat, Jan 9, 2016 at 11:52 AM, Jim Baker <jim.baker at> wrote:

> +1, I would really like to try out type annotation support in Jython,
> given the potential for tying in with Java as a source of type annotations
> (basically the equivalent of stubs for free). I'm planning on sprinting on
> Jython 3 at PyCon, but let's face it, that's going to take a while to
> really finish.
> re the two approaches, both are workable with Jython:
> * lib2to3 is something we should support in Jython 2.7. There are a couple
> of data files that we don't support in the tests (too large of a method for
> Java bytecode in, not terribly interesting), plus a
> few other tests that should work. Therefore lib2to3 should be in the next
> release (2.7.1).
> * Jedi now works with the last commit to Jython 2.7 trunk, passing
> whatever it means to run random tests using its sith script against its
> source. (The sith test does not pass with either CPython or Jython's
> stdlib, starting with
> - Jim

From victor.stinner at  Sun Jan 10 18:30:12 2016
From: victor.stinner at (Victor Stinner)
Date: Mon, 11 Jan 2016 00:30:12 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-09 14:12 GMT+01:00 Nick Coghlan <ncoghlan at>:
> On 9 January 2016 at 19:18, Victor Stinner <victor.stinner at> wrote:
>> It would be nice to detect key mutation while iterating on
>> dict.keys(), but it would also be nice to detect value mutation
>> while iterating on dict.values() and dict.items(). No?
> No, because mutating values as you go while iterating over a
> dictionary is perfectly legal: (...)

Oh you're right. I removed the reference to the issue #19332 from the
PEP, since the PEP doesn't help. Too bad.
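
A minimal sketch of the distinction Nick describes, assuming CPython's
current behaviour (the dict contents are made up for illustration):
rebinding values during iteration is legal, while inserting a key is
detected by the iterator.

```python
d = {"a": 1, "b": 2}

# Rebinding the value of an existing key does not change the dict's size,
# so this loop is legal.
for key in d:
    d[key] += 10

# Inserting a new key changes the size, which the dict iterator detects
# on its next step.
raised = False
try:
    for key in d:
        d[key + "_copy"] = d[key]
except RuntimeError:
    raised = True
```

A version counter that increments on every value change would therefore
flag the first, perfectly valid, loop as a mutation.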


From steve at  Sun Jan 10 19:12:18 2016
From: steve at (Steven D'Aprano)
Date: Mon, 11 Jan 2016 11:12:18 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 09:02:47PM +0100, Victor Stinner wrote:
> 2016-01-10 18:57 GMT+01:00 Steven D'Aprano <steve at>:
> >> Do all custom Mappings need to implement __version__?
> >
> > I believe the answer to that is No, but the PEP probably should clarify
> > that.
> In the PEP, I wrote "The PEP is designed to implement guards on
> namespaces, only the dict type can be used for namespaces in practice.
> collections.UserDict is modified because it must mimic dict.
> collections.Mapping is unchanged."
> Is it enough? If not, what do you suggest to be more explicit?

You also should argue whether or not __version__ should be visible 
to users from pure Python, or only from C code (as Neil wants). In other 
words, should __version__ be part of the public API of dict, or an 
implementation detail?

(1) Make __version__ part of the public API.

Pros:

- Simpler implementation?
- Allows easier debugging.
- Users can make use of it for their own purposes.

Cons:

- Neil wants to avoid users making use of this feature.
  (Why, he hasn't explained, or if he did, I missed it.)
- All implementations (PyPy, Jython, etc.) must copy it.
- You lock in one specific implementation for guards and
  cannot change to another one.

(2) Keep __version__ private.

Pros:

- Other implementations can ignore it.
- You can change the implementation for guards.

Cons:

- Users may resort to ctypes to make use of it.
  (If they can.)


From victor.stinner at  Sun Jan 10 19:15:47 2016
From: victor.stinner at (Victor Stinner)
Date: Mon, 11 Jan 2016 01:15:47 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-10 19:35 GMT+01:00 Neil Girdhar <mistersheik at>:
> If the answer is "no" then honestly no user should write code counting on
> the existence of __version__.

For my use case, I don't need a public (read-only) property at the
Python level. When I wrote the PEP, I proposed a public property to
try to find more use cases and make the PEP more interesting.

I'm not sure anymore that it's worth it, since legitimate and good
counterarguments were listed:

* it gives more work to other Python implementations, whereas they
may not use or benefit from the overall API for static optimizers
(discussed in following PEPs). Except for guards used by static
optimizers, I don't see any use case for dictionary versioning.

* the behaviour on integer overflow is an implementation detail, it's
sad to have to describe it in the specification of a *Python*
property. Users expect Python to abstract the hardware.


From victor.stinner at  Sun Jan 10 19:20:33 2016
From: victor.stinner at (Victor Stinner)
Date: Mon, 11 Jan 2016 01:20:33 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-11 1:12 GMT+01:00 Steven D'Aprano <steve at>:
> Cons:
> - Users may resort to ctypes to make use of it.
>   (If they can.)

It's not something new. It's already possible to access any C private
attribute using ctypes. I don't think that it's a real issue. "We are
all consenting adults here" ;-)


From rosuav at  Sun Jan 10 19:27:45 2016
From: rosuav at (Chris Angelico)
Date: Mon, 11 Jan 2016 11:27:45 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 11:15 AM, Victor Stinner
<victor.stinner at> wrote:
> * the behaviour on integer overflow is an implementation detail, it's
> sad to have to describe it in the specification of a *Python*
> property. Users expect Python to abstract the hardware

Compromise: Document that it's an integer that changes every time the
dictionary is changed, and has a "vanishingly small chance" of ever
reusing a number. It'll trap the same people who try to use id(obj) as
a memory address, but at least it'll be documented as false.


From abarnert at  Sun Jan 10 19:37:56 2016
From: abarnert at (Andrew Barnert)
Date: Sun, 10 Jan 2016 16:37:56 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 10, 2016, at 10:35, Neil Girdhar <mistersheik at> wrote:
>> On Sun, Jan 10, 2016 at 12:57 PM, Steven D'Aprano <steve at> wrote:
>> On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote:
>> [...]
>> > > v = mydict.__version__
>> > > maybe_modify(mydict)
>> > > if v != mydict.__version__:
>> > >     print("dict has changed")
>> >
>> >
>> > This is exactly what I want to avoid.  If you want to do something like
>> > this, I think you should do it in regular Python by subclassing dict and
>> > overriding the mutating methods.
>> That doesn't help Victor, because exec needs an actual dict, not
>> subclasses. Victor's PEP says this is a blocker.
> No, he can still do what he wants transparently in the interpreter.  What I want to avoid is Python users using __version__ in their own code. 

Well, he could change exec so it can use arbitrary mappings (or at least dict subclasses), but I assume that's much harder and more disruptive than his proposed change.

Anyway, if I understand your point, it's this: __version__ should either be a private implementation-specific property of dicts, or it should be a property of all mappings; anything in between gets all the disadvantages of both.

If so, I agree with you. Encouraging people to use __version__ for other purposes besides namespace guards, but not doing anything to guarantee it actually exists anywhere besides namespaces, seems like a bad idea.

But there is still something in between public and totally internal to FAT Python. Making it a documented property of PyDict objects at the C API level is a different story--there are already plenty of ways that C code can use those objects that won't work with arbitrary mappings, so adding another doesn't seem like a problem. And even making it public but implementation-specific at the Python level may be useful for other CPython-specific optimizers (even if partially written in Python); if so, the danger that someone could abuse it for code that should work with arbitrary mappings or with another Python implementation is best dealt with by clearly documenting its non-portability and discouraging its abuse in the docs, not by hiding it.
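
For reference, the pure-Python alternative Neil suggests (subclassing dict and overriding the mutating methods) might look roughly like the sketch below. The class name VersionedDict, the attribute name version and the helper maybe_modify are hypothetical, and the list of overridden methods is not exhaustive (for example, in-place `|=` on newer Pythons is not covered).

```python
class VersionedDict(dict):
    """dict subclass that bumps a counter on every mutating call.

    Hypothetical illustration only; it conservatively bumps the counter
    even when a call ends up not changing anything.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def _bump(self):
        self.version += 1

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._bump()

    def __delitem__(self, key):
        super().__delitem__(key)
        self._bump()

    def clear(self):
        super().clear()
        self._bump()

    def pop(self, *args):
        result = super().pop(*args)
        self._bump()
        return result

    def popitem(self):
        result = super().popitem()
        self._bump()
        return result

    def setdefault(self, key, default=None):
        result = super().setdefault(key, default)
        self._bump()
        return result

    def update(self, *args, **kwargs):
        super().update(*args, **kwargs)
        self._bump()


def maybe_modify(d):
    d["spam"] = 1


mydict = VersionedDict(eggs=2)
v = mydict.version
maybe_modify(mydict)
changed = v != mydict.version
```

The amount of boilerplate, and the ease of missing a mutating path, is part of why this does not replace a counter maintained inside the dict implementation itself.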


From abarnert at  Sun Jan 10 19:53:14 2016
From: abarnert at (Andrew Barnert)
Date: Sun, 10 Jan 2016 16:53:14 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 10, 2016, at 05:01, Victor Stinner <victor.stinner at> wrote:
> Andrew Barnert:
>> Which implies to me that the PEPs really need to anticipate and answer these questions.
> The dict.__version__ PEP mentions FAT Python as a use case. In fact,
> I should point to the func.specialize() PEP which already explains
> partially the motivation for static optimizers:

Sure, linking to PEP 510 instead of repeating its whole rationale seems perfectly reasonable to me.

> But ok I will enhance the PEP 510 rationale to explain why static
> optimizers make sense in Python, maybe even more sense than a JIT
> compiler in some cases (short-lived programs). By the way, I think
> that Mercurial is a good example of a short-lived program.

If CPython is already faster than PyPy for hg, and your optimization makes it faster, then you've got a great answer for "why should anyone care about making CPython a little faster?" Can you benchmark that, or at least a toy app that simulates the same kind of work?

Anyway, my point is just that it would be nice if, the next time someone raises the same kind of objection (because I'll bet it comes up when you post to -dev on the next pass, from people who don't read -ideas), you could just say "read this section of PEP 509 and that section of PEP 510 and then tell me what objections you still have", instead of needing to repeat the arguments you've already made.

From victor.stinner at  Sun Jan 10 20:36:20 2016
From: victor.stinner at (Victor Stinner)
Date: Mon, 11 Jan 2016 02:36:20 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-11 1:53 GMT+01:00 Andrew Barnert <abarnert at>:
> If CPython is already faster than PyPy for hg, and your optimization makes it faster, then you've got a great answer for "why should anyone care about making CPython a little faster?" Can you benchmark that, or at least a toy app that simulates the same kind of work?

My optimizer now has a good library to implement optimizations, but I
haven't started implementing the optimizations that will provide real
speedups on real applications. I expect a speedup from function
inlining, detecting pure functions, elimination of "unused" variables
(after constant propagation), etc.

In short, since the optimizer is "incomplete", I don't even want to
start playing with benchmarks. You can play with microbenchmarks if
you want. Try FAT Python, it's a working Python 3.6:

Currently, you have to run it with "-X fat" to enable the optimizer.
But the command line argument may change, I'm still working on the


From steve at  Sun Jan 10 20:39:01 2016
From: steve at (Steven D'Aprano)
Date: Mon, 11 Jan 2016 12:39:01 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 01:15:47AM +0100, Victor Stinner wrote:

> * the behaviour on integer overflow is an implementation detail, it's
> sad to have to describe it in the specification of a *Python*
> property. Users expect Python to abstract the hardware

Is that a real possibility? A 32-bit counter will overflow, sure, but a
64-bit counter starting from zero should never overflow in a human
lifetime.
Even if we assume a billion increments per second (one per nanosecond), 
it would take over 584 years of continuous operation for the counter to 
overflow. What am I missing?

So I would be inclined to just document that the counter may overflow, 
and you should always compare it using == or != and not >. I think 
anything else is overkill.
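
The arithmetic can be checked with a short sketch (the function name and
the 365.25-day year are assumptions made for the example). It also shows
why only == and != are safe once a fixed-width counter wraps around.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_to_overflow(bits, increments_per_second):
    """Years of continuous counting before a `bits`-wide counter wraps."""
    return 2 ** bits / increments_per_second / SECONDS_PER_YEAR


# One increment per nanosecond on a 64-bit counter.
years_64 = years_to_overflow(64, 1e9)

# Simulate wraparound of an unsigned 64-bit counter.
MASK = 2 ** 64 - 1
before = MASK
after = (before + 1) & MASK   # wraps to 0

differs = after != before     # an inequality test still sees a change
looks_newer = after > before  # an ordering test is fooled by the wrap
```

Here years_64 comes out a little above 584, matching the figure quoted
above.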


From rosuav at  Sun Jan 10 20:55:24 2016
From: rosuav at (Chris Angelico)
Date: Mon, 11 Jan 2016 12:55:24 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 12:39 PM, Steven D'Aprano <steve at> wrote:
> On Mon, Jan 11, 2016 at 01:15:47AM +0100, Victor Stinner wrote:
>> * the behaviour on integer overflow is an implementation detail, it's
>> sad to have to describe it in the specification of a *Python*
>> property. Users expect Python to abstract the hardware
> Is that a real possibility? A 32-bit counter will overflow, sure, but a
> 64-bit counter starting from zero should never overflow in a human
> lifetime.
> Even if we assume a billion increments per second (one per nanosecond),
> it would take over 584 years of continuous operation for the counter to
> overflow. What am I missing?

You're missing that a 32-bit build of Python would then be allowed to
use a 32-bit counter. But if the spec says "64-bit counter", then
yeah, we can pretty much assume that it won't overflow.

Reasonable usage wouldn't include nanosecondly updates; I doubt you
could even achieve 1000 updates a second, sustained over a long period
of time, and that would only overflow every 50ish days. Unless there's
some bizarre lockstep system that forces you to run into the rollover,
it's going to be basically one chance in four billion that you hit the
exact equal counter. So even a 32-bit counter is unlikely to cause
problems in real-world situations; and anyone who's paranoid can just
insist on using a 64-bit build of Python. (Most of us probably are


From abarnert at  Sun Jan 10 21:25:49 2016
From: abarnert at (Andrew Barnert)
Date: Sun, 10 Jan 2016 18:25:49 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 10, 2016, at 17:55, Chris Angelico <rosuav at> wrote:
> You're missing that a 32-bit build of Python would then be allowed to
> use a 32-bit counter. But if the spec says "64-bit counter", then
> yeah, we can pretty much assume that it won't overflow.

As I understand it from Victor's PEP, the added cost of maintaining this counter is literally so small as to be unmeasurable against the cost of normal dict operations in microbenchmarks. If that's true, surely the cost of requiring a 64-bit counter is going to be acceptable?

I realize that some MicroPython projects will be targeting platforms where there's no fast way to do an inc64 (or where the available compilers are too dumb to do it the fast way), but those projects are probably not going to want FAT Python anyway. On a P3 or later x86 or an ARM 7 or something like that, the cost should be more than acceptable. Or at least it's worth testing.

From guido at  Sun Jan 10 22:17:48 2016
From: guido at (Guido van Rossum)
Date: Sun, 10 Jan 2016 19:17:48 -0800
Subject: [Python-ideas] Fwd: Why do equality tests between OrderedDict
 keys/values views behave not as expected?
In-Reply-To: <>
References: <>
Message-ID: <>

Seems like we dropped the ball... Is there any action item here?

--Guido van Rossum (

From tjreedy at  Mon Jan 11 01:04:02 2016
From: tjreedy at (Terry Reedy)
Date: Mon, 11 Jan 2016 01:04:02 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
 <n6s103$i6f$> <>
Message-ID: <n6vgkl$alv$>

On 1/10/2016 12:23 AM, Chris Angelico wrote:

(in response to Steven's response to my post)

> There's more to it than that. Yes, a dict maps values to values; but
> the keys MUST be immutable

Keys just have to be hashable; only hashes need to be immutable.  By
default, hashes depend on ids, which are immutable for a particular
object within a run.

> (otherwise hashing has problems),

only if the hash depends on values that mutate.  Some do.

> and this optimization
> doesn't actually care about the immutability of the value.

astoptimizer has multiple optimizations. One is not repeating name 
lookups. This is safe as long as the relevant dicts have not changed. I 
am guessing that you were pointing to this one.

Another is not repeating the call of a function with a particular value. 
This optimization, in general, is not safe even if dicts have not 
changed.  It *does* care about the nature of dict values -- in 
particular the nature of functions that are dict values.  It is the one 
*I* discussed, and the reason I claimed that using __version__ is tricky.

His toy example is conditionally replacing len('abc') (at
runtime) with 3, where 3 is computed *when the code is compiled*.
For this, it is crucial that builtin len is pure and immutable.
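
A rough pure-Python sketch of that conditional replacement follows. It is
not FAT Python's or astoptimizer's actual API: the helper name
make_guarded_call is hypothetical, a plain dict stands in for the builtins
namespace, and an identity check stands in for a dict version guard.

```python
def make_guarded_call(namespace):
    """Return a callable computing len('abc') with a guarded shortcut."""
    expected_len = namespace["len"]      # captured when "compiling"
    precomputed = expected_len("abc")    # computed once, up front

    def call():
        # Guard: is the name still bound to the function we optimized for?
        if namespace.get("len") is expected_len:
            return precomputed           # fast path, no call at all
        return namespace["len"]("abc")   # guard failed: do the real work

    return call


namespace = {"len": len}
call = make_guarded_call(namespace)
fast_result = call()

# Replacing len (as a test might do when mocking) disables the shortcut.
namespace["len"] = lambda obj: "mocked"
slow_result = call()
```

The guard only proves that the name is unchanged; that the precomputed
value is still correct also relies on len being pure, which is the extra
knowledge a whitelist has to supply.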

Victor is being super careful to not break code.  In response to my
question, Victor said astoptimizer uses a whitelist of pure builtins to
supplement the information supplied by .__version__.  Dict history,
summarized by __version__, is not always enough to answer 'is this
optimization safe?'  The nature of values is sometimes crucially important.

However, others might use __version__ *without* thinking through what 
other information is needed.  This is why I think its exposure is a bit 
dangerous.  19 years of experience suggests to me that misuse *will*
happen.  Victor just reported that CPython's type already has a
*private* version count.  The issue of exposing a new internal feature
is somewhat separate and comes after the decision to add it.

As you know, and even alluded to later in your post, CPython already 
replaces '1 + 1' with '2' at compile time.  Method int.__add__ is pure 
and immutable.  Since it (unlike len) also cannot be replaced or 
shadowed, the replacement can be complete, with '2' put in the code 
object (and .pyc if written), as if the programmer had actually written '2'.

 >>> from dis import dis
 >>> dis('1 + 1')
   1           0 LOAD_CONST               1 (2)
               3 RETURN_VALUE
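
Since the exact bytecode differs between CPython versions, a more portable
way to observe the same folding is to check that no BINARY_* instruction
remains. This is a sketch that assumes CPython; the name x in the second
expression is only a placeholder.

```python
import dis

folded = compile("1 + 1", "<example>", "eval")
unfolded = compile("x + 1", "<example>", "eval")


def has_binary_op(code):
    """True if the code object still performs a binary operation at runtime."""
    return any(ins.opname.startswith("BINARY")
               for ins in dis.get_instructions(code))


folded_has_add = has_binary_op(folded)      # constant was precomputed
unfolded_has_add = has_binary_op(unfolded)  # x is unknown at compile time
value = eval(folded)
```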

JIT compilers depend on the same properties of int, float, and str
operations, for instance, as well as the fact that unbox(Py object) and
box(machine value) are inverses, so that unbox(box(temp_machine_value))
can be replaced by temp_machine_value.

Terry Jan Reedy

From tjreedy at  Mon Jan 11 01:16:19 2016
From: tjreedy at (Terry Reedy)
Date: Mon, 11 Jan 2016 01:16:19 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
 <n6s103$i6f$> <>
Message-ID: <n6vhbm$lfj$>

On 1/9/2016 11:24 PM, Steven D'Aprano wrote:
> On Sat, Jan 09, 2016 at 05:18:40PM -0500, Terry Reedy wrote:

>> Another reason to hide __version__ from the Python level is that its use
>> seems to me rather tricky and bug-prone.
> What makes you say that?

We would like to replace slow tortoise steps with quick rabbit jumps. 
Is it safe?  For avoiding name lookups in dicts, careful dict guards 
using __version__ should be enough.  For avoiding function calls, they 
help but are not enough.

Optimization is empirically tricky and bug prone.

CPython has many private implementation details that have not been
exposed at the Python level because the expected gain is not worth the
expected pain.  If __version__ is added, I think exposing it should be
given separate consideration.

Terry Jan Reedy

From tjreedy at  Mon Jan 11 01:36:19 2016
From: tjreedy at (Terry Reedy)
Date: Mon, 11 Jan 2016 01:36:19 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <n6vih6$8lm$>

On 1/10/2016 3:02 PM, Victor Stinner wrote:

> In the PEP, I wrote "The PEP is designed to implement guards on
> namespaces, only the dict type can be used for namespaces in practice.
> collections.UserDict is modified because it must mimic dict.

collections.UserDict mimics the public interface of dict, not internal 
implementation details.  It uses an actual dict to do this.  If 
__version__ is not exposed at the python level, it will not be and 
should not be visible via UserDict.
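
A quick check of that claim, as a sketch using only the documented
UserDict interface (the key and value are made up):

```python
from collections import UserDict

ud = UserDict(spam=1)

# UserDict is a wrapper, not a dict subclass; the real dict lives in .data.
wraps_real_dict = type(ud.data) is dict
is_dict_subclass = isinstance(ud, dict)
value = ud["spam"]
```

Any dict-level implementation detail would therefore belong to ud.data,
not to the UserDict object itself.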

> collections.Mapping is unchanged."
> Is it enough? If no, what do you suggest to be more explicit?

Your minimal core proposal is or should be to add a possibly private 
.__version__ attribute to CPython dicts, so as to enable astoptimizer. 
Stick with that.  Stop inviting peripheral discussion and distractions. 
  Modifying UserDict and exposing __version__ to Python code are 
separate issues, and can be done later if later deemed to be desirable.

Terry Jan Reedy

From abarnert at  Mon Jan 11 01:48:28 2016
From: abarnert at (Andrew Barnert)
Date: Sun, 10 Jan 2016 22:48:28 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <n6vgkl$alv$>
References: <>
 <n6s103$i6f$> <>
Message-ID: <>

On Jan 10, 2016, at 22:04, Terry Reedy <tjreedy at> wrote:
> On 1/10/2016 12:23 AM, Chris Angelico wrote:
> (in response to Steven's response to my post)
>> There's more to it than that. Yes, a dict maps values to values; but
>> the keys MUST be immutable
> Keys just have to be hashable; only hashes need to be immutable.  

> By default, hashes depend on ids, which are immutable for a particular object within a run.
> (otherwise hashing has problems),
> only if the hash depends on values that mutate.  Some do.

But if equality depends on values, the hash has to depend on those same values. (Because two values that are equal have to hash equal.) Which means that if equality depends on any mutable values, the type can't be hashable. Which is why none of the built-in mutable types are hashable.

Of course Python doesn't stop you from writing your own types that can provide different hashes for equal values, or that can change hashes as they're mutated. It's even possible to use them as dict keys as long as you're very careful (the keys don't mutate in a way that changes either their hash or their equivalence while they're in the dict, and you never look up or add a key that's equal to an existing key but has a different hash).

But it's not _that_ much of an oversimplification to say that keys have to be immutable. And any dict-based optimizations can safely rely on the same thing basic dict usage relies on: if the keys _aren't_ actually immutable, they're coded and used carefully (as described above) so that you can't tell they're mutable.
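
A small sketch of what goes wrong when a key's hash-relevant state is mutated while it is in a dict. BadKey is a hypothetical class written only for this illustration.

```python
class BadKey:
    """Key whose equality and hash both depend on mutable state."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, BadKey) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


k = BadKey(1)
d = {k: "found"}

k.value = 2  # mutate the key while it is stored in the dict

# The entry was filed under the old hash, so the same object no longer finds it.
by_same_object = d.get(k)

# A fresh key with the old value has the right hash but compares unequal
# to the stored (now mutated) key.
by_old_value = d.get(BadKey(1))

# The entry is still physically in the dict.
still_stored = any(key is k for key in d)
```

Both lookups come back empty even though the entry is still there, which is the kind of breakage the "keys must be immutable" rule of thumb protects against.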

From rosuav at  Mon Jan 11 03:07:32 2016
From: rosuav at (Chris Angelico)
Date: Mon, 11 Jan 2016 19:07:32 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <n6vgkl$alv$>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 5:04 PM, Terry Reedy <tjreedy at> wrote:
> On 1/10/2016 12:23 AM, Chris Angelico wrote:
> (in response to Steven's response to my post)
>> There's more to it than that. Yes, a dict maps values to values; but
>> the keys MUST be immutable
> Keys just have to be hashable; only hashes need to be immutable.  By
> default, hashes depend on ids, which are immutable for a particular object
> within a run.

Yes, but if you're using the ID as the hash and identity as equality,
then *by definition* the only way to look up that key is with that
object. That means it doesn't matter to the lookup optimization if the
object itself has changed:

class Puddle(object): pass
d = {}
key, val = Puddle(), Puddle()
= "foo"; = "bar"
d[key] = val

snapshotted_d_key = d[key]
= "not foo"

The optimization in question is effectively using a local reference
like snapshotted_d_key rather than doing the actual lookup again. It
can safely do this even if the attributes of that key have changed,
because there is no way for that to affect the result of the lookup.
So in terms of dict lookups, whatever affects hash and equality *is*
the object's value; if that's its identity, then identity is the sole
value that object has.
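
A runnable sketch of the same idea. The attribute name label is an
assumption made for this example, since the attribute names did not
survive in the snippet above.

```python
class Puddle(object):
    pass


d = {}
key, val = Puddle(), Puddle()
key.label = "foo"   # "label" is a placeholder attribute name
val.label = "bar"
d[key] = val

snapshotted_d_key = d[key]

# Mutating the key object changes neither its identity-based hash nor
# its identity-based equality, so the lookup result cannot change.
key.label = "not foo"
same_result = d[key] is snapshotted_d_key
```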

>> and this optimization
>> doesn't actually care about the immutability of the value.
> astoptimizer has multiple optimizations. One is not repeating name lookups.
> This is safe as long as the relevant dicts have not changed. I am guessing
> that you were pointing to this one.

Yes, that's the one I was talking about.

> Another is not repeating the call of a function with a particular value.
> This optimization, in general, is not safe even if dicts have not changed.
> It *does* care about the nature of dict values -- in particular the nature
> of functions that are dict values.  It is the one *I* discussed, and the
> reason I claimed that using __version__ is tricky.

Okay. In that case, yes, it takes a lot more checks.

> His toy example is conditionally replacing len('abc') (at
> runtime) with 3, where 3 is computed *when the code is compiled*. For
> this, it is crucial that builtin len is pure and immutable.

Correct. I'm getting this mental picture of angelic grace, with a
chosen few most beautiful functions being commended for their purity,
immutability, and reverence.

> Victor is being super careful to not break code.  In response to my
> question, Victor said astoptimizer uses a whitelist of pure builtins to
> supplement the information supplied by .__version__.  Dict history,
> summarized by __version__, is not always enough to answer 'is this
> optimization safe?'  The nature of values is sometimes crucially important.

There would be very few operations that can be optimized like this. In
practical terms, the only ones that I can think of are what you might
call "computed literals" - like (2+3j), they aren't technically
literals, but the programmer thinks of them that way. Things like
module-level constants (the 'stat' module comes to mind), a small
handful of simple transformations, and maybe some text<->bytes
transformations (eg "abc".encode("ascii") could be replaced at
compile-time with b"abc"). There won't be very many others, I suspect.


From victor.stinner at  Mon Jan 11 05:00:24 2016
From: victor.stinner at (Victor Stinner)
Date: Mon, 11 Jan 2016 11:00:24 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
 <n6s103$i6f$> <>
Message-ID: <>


2016-01-11 9:07 GMT+01:00 Chris Angelico <rosuav at>:
> Yes, but if you're using the ID as the hash and identity as equality,
> then *by definition* the only way to look up that key is with that
> object. That means it doesn't matter to the lookup optimization if the
> object itself has changed:
> class Puddle(object): pass
> d = {}
> key, val = Puddle(), Puddle()
> = "foo"; = "bar"
> d[key] = val

IMHO the discussion has gone too far. See the PEP: the goal is to
implement efficient guards on namespaces. In namespaces, keys are
short immutable strings. Not funny objects. Keys come from the Python
source code, like "x" from "x=1".

Again, if the dict value is mutable (like functions implemented in
pure Python), there are dedicated guards for that, but no PEP is
required to implement these guards ;-) See PEP 510: to specialize
a function, you have to pass a *list* of guards. There is no
arbitrary limit on the number of guards :-) (But I expect to have fewer
than 10 guards in the common case, or more likely just a few.)

> There would be very few operations that can be optimized like this. In
> practical terms, the only ones that I can think of are what you might
> call "computed literals" - like (2+3j), they aren't technically
> literals, but the programmer thinks of them that way.

FYI the Python 2 peephole optimizer is not able to optimize all operations
like that because of technical issues, it's limited :-/ The Python 3
peephole optimizer is better ;-)

More generally, the optimizer is limited because it works on the
bytecode, which is difficult to manipulate. It's difficult to implement
even simple optimizations. For example, the peephole optimizer of Python 3
maintains a "stack of constants" to implement constant folding.
Implementing constant folding on the AST is much easier: you can browse
the subtree of a node with nice Python objects. If you are curious,
you can take a look at the constant folding optimization step of

It implements more optimizations than the peephole optimizer:

> Things like
> module-level constants (the 'stat' module comes to mind),

In Python, it's rare to manipulate constants directly. But it's common
to access constants coming from a different namespace, like constants
at the module level. To implement constant propagation on these
constants, we also need guards on the namespace to disable the
optimization when a "constant" is modified (which can be done in unit
tests, for example).
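
The idea of a specialized version protected by a list of guards can be
sketched in plain Python. This is not the PEP 510 API: the helper
specialize, the namespace dict and the constants PREFIX and SUFFIX are all
hypothetical, and identity checks stand in for dict version checks.

```python
def specialize(original, fast, guards):
    """Call `fast` while every guard holds, otherwise fall back to `original`."""
    def wrapper(*args, **kwargs):
        if all(guard() for guard in guards):
            return fast(*args, **kwargs)
        return original(*args, **kwargs)
    return wrapper


# Stand-in for a module namespace holding two "constants".
namespace = {"PREFIX": b"<", "SUFFIX": b">"}
seen_prefix, seen_suffix = namespace["PREFIX"], namespace["SUFFIX"]


def original():
    # Two namespace lookups plus a concatenation on every call.
    return namespace["PREFIX"] + namespace["SUFFIX"]


def fast():
    # Result of constant propagation and folding, computed ahead of time.
    return b"<>"


guards = [
    lambda: namespace.get("PREFIX") is seen_prefix,
    lambda: namespace.get("SUFFIX") is seen_suffix,
]

joined = specialize(original, fast, guards)
before = joined()

namespace["SUFFIX"] = b"]"   # e.g. a unit test patching the "constant"
after = joined()
```

Once a guard fails, the wrapper silently goes back to the unoptimized
code, so patching a "constant" keeps working.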

For example, the base64 module defines:

   _b32alphabet = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567'

and later in a function, it uses:

    {v: k for k, v in enumerate(_b32alphabet)}

This "complex" dict-comprehension calling enumerate() can be replaced
with a simpler dict literal: {65: 0, 66: 1, ...} or dict(((65, 0),
(66, 1), ...)). I don't know if it's the best example, I don't know if
it's really much faster, it's just to explain the general idea.
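
To see what the literal would have to contain, the comprehension can be
evaluated directly. This is only a check of the values; it says nothing
about how an optimizer would emit them.

```python
_b32alphabet = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567'

# Iterating over bytes yields integers, so the keys are byte values
# and the values are positions in the alphabet.
computed = {v: k for k, v in enumerate(_b32alphabet)}

first = computed[ord('A')]   # key 65
second = computed[ord('B')]  # key 66
last = computed[ord('7')]
size = len(computed)
```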

Another simple example: defines many constants like MARK =
b'(' and TUPLE = b't', defined at module-level. Later it uses for
example MARK + TUPLE. Using guards on the global namespace, it's
possible to replace MARK + TUPLE with b'(t' to avoid two dict lookups
and a call to byte string concatenation. Again, it's a simple example
to explain the principle.

Usually, a single optimization alone is not interesting. It's when you
combine optimizations that it becomes interesting. For example,
constant propagation + constant folding + simplify iterable + loop
unrolling + elimination of unused variables really makes the code
simpler (and more efficient).

> a small
> handful of simple transformations, and maybe some text<->bytes
> transformations (eg "abc".encode("ascii") could be replaced at
> compile-time with b"abc"). There won't be very many others, I suspect.

It's possible to optimize some method calls on builtin types without
guards since it's not possible to replace methods of builtin types. My
old AST optimizer implements such optimizations (I haven't reimplemented
them in my new AST optimizer yet), but alone they are not really
interesting in terms of performance.
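
A small sketch of why no guard is needed here, assuming CPython, where
builtin types reject attribute assignment:

```python
replaced = True
try:
    # Attempt to monkey-patch a method of a builtin type.
    str.encode = lambda self, *args: b"hacked"
except TypeError:
    replaced = False

# The original method is untouched, so folding this call at compile
# time into b"abc" could not be invalidated later.
encoded = "abc".encode("ascii")
```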


From mistersheik at  Mon Jan 11 05:18:59 2016
From: mistersheik at (Neil Girdhar)
Date: Mon, 11 Jan 2016 05:18:59 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 7:37 PM, Andrew Barnert <abarnert at> wrote:

> On Jan 10, 2016, at 10:35, Neil Girdhar <mistersheik at> wrote:
> On Sun, Jan 10, 2016 at 12:57 PM, Steven D'Aprano <steve at>
> wrote:
>> On Sun, Jan 10, 2016 at 11:48:35AM -0500, Neil Girdhar wrote:
>> [...]
>> > > v = mydict.__version__
>> > > maybe_modify(mydict)
>> > > if v != mydict.__version__:
>> > >     print("dict has changed")
>> >
>> >
>> > This is exactly what I want to avoid.  If you want to do something like
>> > this, I think you should do it in regular Python by subclassing dict and
>> > overriding the mutating methods.
>> That doesn't help Victor, because exec needs an actual dict, not
>> subclasses. Victor's PEP says this is a blocker.
> No, he can still do what he wants transparently in the interpreter.  What
> I want to avoid is Python users using __version__ in their own code.
> Well, he could change exec so it can use arbitrary mappings (or at least
> dict subclasses), but I assume that's much harder and more disruptive than
> his proposed change.
> Anyway, if I understand your point, it's this: __version__ should either
> be a private implementation-specific property of dicts, or it should be a
> property of all mappings; anything in between gets all the disadvantages of
> both.

Right.  I prefer the former since making it a property of mappings
bloats Mapping beyond a minimum interface.

> If so, I agree with you. Encouraging people to use __version__ for other
> purposes besides namespace guards, but not doing anything to guarantee it
> actually exists anywhere besides namespaces, seems like a bad idea.
> But there is still something in between public and totally internal to FAT
> Python. Making it a documented property of PyDict objects at the C API
> level is a different story--there are already plenty of ways that C code
> can use those objects that won't work with arbitrary mappings, so adding
> another doesn't seem like a problem.

Adding it to PyDict and exposing it in the C API is totally reasonable to
me.

> And even making it public but implementation-specific at the Python level
> may be useful for other CPython-specific optimizers (even if partially
> written in Python); if so, the best way to deal with the danger that
> someone could abuse it for code that should work with arbitrary mappings or
> with another Python implementation should be solved by clearly documenting
> its non-portability and discouraging its abuse in the docs, not by hiding
> it.
Here is where I have to disagree.  I hate it when experts say "we'll just
document it and then it's the user's fault for misusing it".  Yeah, you're
right, but as a user, it is very frustrating to have to read other people's
documentation.  You know that some elite Python programmer is going to
optimize his code using this and someone years later is going to scratch
his head wondering where __version__ is coming from.  Is it provided by
the caller?  Was it added to the object at some earlier point?  Finally,
he'll search the web, arrive at a stackoverflow question with 95 upvotes
that finally clears things up.  And for what?  Some minor optimization.
(Not Victor's optimization, but a Python user's optimization in Python
code.)

Python should make it easy to write clear code.  It's my opinion that
documentation is not a substitute for good language design, just as
comments are not a substitute for good code design.

Also, using this __version__ in source code is going to complicate
switching from CPython to any of the other Python implementations, so those
implementations will probably end up implementing it just to simplify
"porting", which would otherwise be painless.

Why don't we leave exposing __version__ in Python to another PEP?  Once
it's in the C API (as you proposed) you will be able to use it from Python
by writing an extension and then someone can demonstrate the value of
exposing it in Python by writing tests.
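
For reference, a minimal sketch of the pure-Python alternative mentioned
earlier in this thread (subclass dict and override the mutating methods).
VersionedDict, version and has_changed are illustrative names, and the list
of overridden methods is an assumption that is not meant to be exhaustive.

```python
class VersionedDict(dict):
    """dict subclass that counts mutations made through its Python API."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def _bump(self):
        self.version += 1

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._bump()

    def __delitem__(self, key):
        super().__delitem__(key)
        self._bump()

    def clear(self):
        super().clear()
        self._bump()

    def pop(self, *args):
        result = super().pop(*args)
        self._bump()  # may over-count when a default is returned; harmless
        return result

    def popitem(self):
        result = super().popitem()
        self._bump()
        return result

    def setdefault(self, key, default=None):
        result = super().setdefault(key, default)
        self._bump()
        return result

    def update(self, *args, **kwargs):
        super().update(*args, **kwargs)
        self._bump()


def has_changed(mapping, action):
    """Run action(mapping) and report whether the version moved."""
    before = mapping.version
    action(mapping)
    return mapping.version != before
```

As pointed out above, this does not cover the namespace case, since code
that works on the underlying dict directly never goes through these
overrides.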

From ncoghlan at  Mon Jan 11 05:23:35 2016
From: ncoghlan at (Nick Coghlan)
Date: Mon, 11 Jan 2016 20:23:35 +1000
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On 10 January 2016 at 08:19, Chris Angelico <rosuav at> wrote:
> On Sun, Jan 10, 2016 at 3:51 AM, Mahmoud Hashemi <mahmoud at> wrote:
>> I think it's a pretty common itch! Have you seen the boltons implementation?
> Yes it is, and no I haven't; everyone has a slightly different idea of
> what makes a good API, and that's why I put that caveat onto my
> suggestion. You can't make everyone happy, and APIs should not be
> designed by committee :)

In the context of Python as a cross-platform language, it's also
important to remember that POSIX-style user/group/other permissions
are only one form of file level access control - depending on your
filesystem and OS, there will be a range of others.

That significantly reduces the motivation to try to provide a platform
independent abstraction for an inherently platform specific concept
(at least in the standard library - PyPI is a different matter).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From abarnert at  Mon Jan 11 05:55:37 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 11 Jan 2016 02:55:37 -0800
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 11, 2016, at 02:23, Nick Coghlan <ncoghlan at> wrote:
> In the context of Python as a cross-platform language, it's also
> important to remember that POSIX-style user/group/other permissions
> are only one form of file level access control - depending on your
> filesystem and OS, there will be a range of others.

Well, not a _huge_ range; as far as I know, the only things you're ever likely to run into besides POSIX permissions or a simple read-only flag are ACLs*. But that's still enough of a range to worry about...

* Yes, NT and POSIX ACLs aren't quite identical, and the POSIX standard was never completed and there are some minor differences between the Linux and BSD implementations, and OS X confused things by using the NT design with a POSIX-ish API and completely unique tools, so using ACLs portably isn't trivial. But, except for the problem of representing ACLs for users who don't exist on the system, they're pretty much equivalent for almost anything you care about at the application level.

From steve at  Mon Jan 11 06:20:11 2016
From: steve at (Steven D'Aprano)
Date: Mon, 11 Jan 2016 22:20:11 +1100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 05:18:59AM -0500, Neil Girdhar wrote:

> Here is where I have to disagree.  I hate it when experts say "we'll just
> document it and then it's the user's fault for misusing it".  Yeah, you're
> right, but as a user, it is very frustrating to have to read other people's
> documentation.  You know that some elite Python programmer is going to
> optimize his code using this and someone years later is going to scratch
> his head wondering where __version__ is coming from.  Is it provided by
> the caller?  Was it added to the object at some earlier point?

Neil, don't you think you're being overly dramatic here? "Programmer 
needs to look up API feature, news at 11!" The same could be said about 
class.__name__, instance.__class__, obj.__doc__, module.__dict__ and 
indeed every single Python feature. Sufficiently inexperienced or naive 
programmers could be scratching their head over literally *anything*.

(I remember being perplexed by None the first time I read Python code. 
What was it and where did it come from? I had no idea.)

All those words for such a simple, and minor, point: every new API 
feature is one more thing for programmers to learn. We get that.

But the following is a good, strong argument:

> Also, using this __version__ in source code is going to complicate
> switching from CPython to any of the other Python implementations, so those
> implementations will probably end up implementing it just to simplify
> "porting", which would otherwise be painless.
> Why don't we leave exposing __version__ in Python to another PEP?  Once
> it's in the C API (as you proposed) you will be able to use it from Python
> by writing an extension and then someone can demonstrate the value of
> exposing it in Python by writing tests.

I can't really argue against this. As much as I would love to play 
around with __version__, I think you're right. It needs to prove itself 
before being exposed as a public API.


From ram at  Mon Jan 11 07:02:44 2016
From: ram at (Ram Rachum)
Date: Mon, 11 Jan 2016 14:02:44 +0200
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

Hi everyone,

I spent some time thinking about this. I came up with a big and impressive
API, then figured it's overkill, shelved it and made a simpler one :)

Here's my new preferred API. Assume that `path` is a `pathlib.Path` object.

        Checking the chmod of the file:
            int(path.chmod) # Get an int 420 which in octal is 0o644
            oct(path.chmod) # Get a string '0o644'
            str(path.chmod) # Get a string 'rw-r--r--'
            repr(path.chmod) # Get a string '<Chmod: rw-r--r-- / 0o644>'

        Modifying the chmod of the file:
            path.chmod(0o644) # Set chmod to 0o644 (for backward compatibility)
            path.chmod = 0o644 # Set chmod to 0o644
            path.chmod = 420 # Set chmod to 0o644, which is 420 in decimal
            path.chmod = other_path.chmod # Set chmod to be the same as that of some other file
            path.chmod = 'rw-r--r--' # Set chmod to 0o644
            path.chmod += '--x--x--x' # Add execute permission to everyone
            path.chmod -= '------rwx' # Remove all permissions from others

I've chosen += and -=, despite the fact they're not set operations, because
Python doesn't have __inand__. On an unrelated note, maybe we should have
__inand__? (I mean x ^~= y)

What do you think?
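
A minimal sketch of the value type such an API would need, leaving out the
wiring onto pathlib.Path entirely. The class name Chmod follows the repr
shown above; everything else (the helper _to_bits, the 9-character string
format, + and - meaning "set bits" and "clear bits") is an assumption made
for illustration.

```python
import operator

_FLAGS = "rwxrwxrwx"


def _to_bits(value):
    """Accept an int-like object or a 9-character string like 'rw-r--r--'."""
    if isinstance(value, str):
        if len(value) != len(_FLAGS):
            raise ValueError("expected 9 characters, got {!r}".format(value))
        bits = 0
        for position, (char, flag) in enumerate(zip(value, _FLAGS)):
            if char == flag:
                bits |= 1 << (8 - position)
            elif char != "-":
                raise ValueError("unexpected character {!r}".format(char))
        return bits
    return operator.index(value)


class Chmod:
    """Permission bits that convert to int, octal and 'rwx' text."""

    def __init__(self, value):
        self._bits = _to_bits(value)

    def __int__(self):
        return self._bits

    __index__ = __int__  # lets oct() and operator.index() accept a Chmod

    def __str__(self):
        return "".join(
            flag if self._bits & (1 << (8 - position)) else "-"
            for position, flag in enumerate(_FLAGS))

    def __repr__(self):
        return "<Chmod: {} / {}>".format(self, oct(self._bits))

    def __add__(self, other):
        # "+" sets the given bits (bitwise OR).
        return Chmod(self._bits | _to_bits(other))

    def __sub__(self, other):
        # "-" clears the given bits (bitwise AND with the complement).
        return Chmod(self._bits & ~_to_bits(other))
```

Because only __add__ and __sub__ are defined, "mode += '--x--x--x'" rebinds
the name to a new Chmod object rather than mutating the old one.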

On Sat, Jan 9, 2016 at 6:11 PM, Chris Angelico <rosuav at> wrote:

> On Sun, Jan 10, 2016 at 3:06 AM, Ram Rachum <ram at> wrote:
> > Thanks for the reference. Personally I think that
> > `my_path.stat().st_mode & stat.S_IXGRP` is not human-readable enough.
> > I'll work on a nicer API.
> > Probably this for the same action you described:
> >
> > 'x' in my_path.chmod()['g']
> >
> >
> Okay. I'm not sure how popular that'll be, but sure.
> As an alternative API, you could have it return a tuple of permission
> strings, which you'd use thus:
> 'gx' in my_path.mode() # Group eXecute permission is set
> But scratch your own itch, and don't give in to the armchair advisers.
> ChrisA

From rosuav at  Mon Jan 11 07:29:26 2016
From: rosuav at (Chris Angelico)
Date: Mon, 11 Jan 2016 23:29:26 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 11:02 PM, Ram Rachum <ram at> wrote:
> Here's my new preferred API. Assume that `path` is a `pathlib.Path` object.
>         Checking the chmod of the file:
>             int(path.chmod) # Get an int 420 which in octal is 0o644
>             oct(path.chmod) # Get a string '0o644'
>             str(path.chmod) # Get a string 'rw-r--r--'
>             repr(path.chmod) # Get a string '<Chmod: rw-r--r-- / 0o644>'
>         Modifying the chmod of the file:
>             path.chmod(0o644) # Set chmod to 0o644 (for backward
> compatibility)
>             path.chmod = 0o644 # Set chmod to 0o644
>             path.chmod = 420 # Set chmod to 0o644, which is 420 in decimal
>             path.chmod = other_path.chmod # Set chmod to be the same as that
> of some other file
>             path.chmod = 'rw-r--r--' # Set chmod to 0o644
>             path.chmod += '--x--x--x' # Add execute permission to everyone
>             path.chmod -= '------rwx' # Remove all permissions from others
> I've chosen += and -=, despite the fact they're not set operations, because
> Python doesn't have __inand__. On an unrelated note, maybe we should have
> __inand__? (I mean x ^~= y)
> What do you think?

The one thing I'd do differently is call it "mode" or "permissions"
rather than "chmod" (CHange MODe), and drop the callability. If you're
going to do it as property assignment, making that property also be
callable feels awkward (plus it'll be a pain to implement). But
otherwise, yeah! Looks great!


From jsbueno at  Mon Jan 11 07:41:37 2016
From: jsbueno at (Joao S. O. Bueno)
Date: Mon, 11 Jan 2016 10:41:37 -0200
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

If you are doing it OO and trying to create a human-usable API, then
why the hell stick with octal and string representations from the
1970s?

path.chmod.executable could return a named-tuple-like object, with
owner=True, group=False, all=False  - and conversely, you could have
path.chmod.owner to return (read=True, write=True, execute=True)

And thus one could simply do:
if path.chmod.owner.writable:
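
A possible shape for such named-tuple-like objects, as a sketch only. The
names Access, Permissions and describe are invented here, and the function
takes a plain integer mode (for example the st_mode value returned by
Path.stat()) instead of being attached to a path object.

```python
from collections import namedtuple

Access = namedtuple("Access", "read write execute")
Permissions = namedtuple("Permissions", "owner group other")


def describe(mode):
    """Split POSIX permission bits into per-class named tuples."""
    def access(shift):
        bits = (mode >> shift) & 0o7
        return Access(read=bool(bits & 4),
                      write=bool(bits & 2),
                      execute=bool(bits & 1))
    return Permissions(owner=access(6), group=access(3), other=access(0))
```

With this, a check reads as "if describe(mode).owner.write:", at the cost
of losing the compact octal and 'rwx' spellings.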

On 11 January 2016 at 10:29, Chris Angelico <rosuav at> wrote:
> On Mon, Jan 11, 2016 at 11:02 PM, Ram Rachum <ram at> wrote:
>> Here's my new preferred API. Assume that `path` is a `pathlib.Path` object.
>>         Checking the chmod of the file:
>>             int(path.chmod) # Get an int 420 which in octal is 0o644
>>             oct(path.chmod) # Get a string '0o644'
>>             str(path.chmod) # Get a string 'rw-r--r--'
>>             repr(path.chmod) # Get a string '<Chmod: rw-r--r-- / 0o644>'
>>         Modifying the chmod of the file:
>>             path.chmod(0o644) # Set chmod to 0o644 (for backward
>> compatibility)
>>             path.chmod = 0o644 # Set chmod to 0o644
>>             path.chmod = 420 # Set chmod to 0o644, which is 420 in decimal
>>             path.chmod = other_path.chmod # Set chmod to be the same as that
>> of some other file
>>             path.chmod = 'rw-r--r--' # Set chmod to 0o644
>>             path.chmod += '--x--x--x' # Add execute permission to everyone
>>             path.chmod -= '------rwx' # Remove all permissions from others
>> I've chosen += and -=, despite the fact they're not set operations, because
>> Python doesn't have __inand__. On an unrelated note, maybe we should have
>> __inand__? (I mean x ^~= y)
>> What do you think?
> The one thing I'd do differently is call it "mode" or "permissions"
> rather than "chmod" (CHange MODe), and drop the callability. If you're
> going to do it as property assignment, making that property also be
> callable feels awkward (plus it'll be a pain to implement). But
> otherwise, yeah! Looks great!
> ChrisA

From rosuav at  Mon Jan 11 07:47:48 2016
From: rosuav at (Chris Angelico)
Date: Mon, 11 Jan 2016 23:47:48 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 11:41 PM, Joao S. O. Bueno
<jsbueno at> wrote:
> If you are doing it OO and trying to create a human-usable API, then,
> why the hell to stick with
> octal and string representations from the 1970's?

Because they are compact and readable, even when you have lots of them
in a column.
$ ll /tmp
total 24
drwx------ 2 rosuav rosuav 4096 Jan  8 05:27 gpg-4K37Xk
drwxr-xr-x 2 root   root   4096 Jan  8 05:36 hsperfdata_root
drwxr-xr-x 2 rosuav rosuav 4096 Jan 11 22:35 hsperfdata_rosuav
drwx------ 2 root   root   4096 Jan  8 05:26 pulse-PKdhtXMmr18n
prwxr-xr-x 1 rosuav rosuav    0 Jan 11 22:26
drwx------ 2 rosuav rosuav 4096 Jan  8 05:27 ssh-Feo5RK7e1TV3
drwx------ 3 root   root   4096 Jan  8 05:27

You can see at a glance which ones are readable by people other than
their owners. That's worth keeping.

It doesn't have to be the ONLY way to do things, but it's definitely
one that I do not want to lose.


From marcin at  Mon Jan 11 07:57:02 2016
From: marcin at (Marcin Sztolcman)
Date: Mon, 11 Jan 2016 13:57:02 +0100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 1:02 PM, Ram Rachum <ram at> wrote:

> I spent some time thinking about this. I came up with a big and impressive
> API, then figured it's overkill, shelved it and made a simpler one :)
> Here's my new preferred API. Assume that `path` is a `pathlib.Path` object.
>         Checking the chmod of the file:
>             int(path.chmod) # Get an int 420 which in octal is 0o644
>             oct(path.chmod) # Get a string '0o644'
>             str(path.chmod) # Get a string 'rw-r--r--'
>             repr(path.chmod) # Get a string '<Chmod: rw-r--r-- / 0o644>'
>         Modifying the chmod of the file:
>             path.chmod(0o644) # Set chmod to 0o644 (for backward
> compatibility)
>             path.chmod = 0o644 # Set chmod to 0o644
>             path.chmod = 420 # Set chmod to 0o644, which is 420 in decimal
>             path.chmod = other_path.chmod # Set chmod to be the same as that
> of some other file
>             path.chmod = 'rw-r--r--' # Set chmod to 0o644
>             path.chmod += '--x--x--x' # Add execute permission to everyone
>             path.chmod -= '------rwx' # Remove all permissions from others
> I've chosen += and -=, despite the fact they're not set operations, because
> Python doesn't have __inand__. On an unrelated note, maybe we should have
> __inand__? (I mean x ^~= y)
> What do you think?

There is only one way...? ;)

My proposal (used for some time in a few private projects, but extracted
as a standalone project a few days ago):

Marcin Sztolcman :: ::

From victor.stinner at  Mon Jan 11 09:04:15 2016
From: victor.stinner at (Victor Stinner)
Date: Mon, 11 Jan 2016 15:04:15 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-11 11:18 GMT+01:00 Neil Girdhar <mistersheik at>:
>> No, he can still do what he wants transparently in the interpreter.  What
>> I want to avoid is Python users using __version__ in their own code.
>> Well, he could change exec so it can use arbitrary mappings (or at least
>> dict subclasses), but I assume that's much harder and more disruptive than
>> his proposed change.
>> Anyway, if I understand your point, it's this: __version__ should either
>> be a private implementation-specific property of dicts, or it should be a
>> property of all mappings; anything in between gets all the disadvantages of
>> both.
> Right.  I prefer the former since making it a property of mappings
> bloats Mapping beyond a minimum interface.

The discussion on adding a __version__ property on all mapping types
is interesting. I now agree that it's a boolean choice: no mapping
type must have a __version__ property, or all types must have it. It
would be annoying to get a cryptic issue when we pass a dict subtype
or a dict-like type to a function expecting a "mapping".

I *don't* want to require all mapping types to implement a __version__
property. Even if it's simple to implement, some types can be a simple
wrapper on top of an existing efficient mapping type which doesn't
implement such a property (or worse, has a similar *but different*
property). For example, Jython and IronPython probably reuse existing
mapping types of Java and .NET, and I don't think that they have such a
version property.

The Mapping ABC already requires a lot of methods, having to implement
yet another property would make the implementation even more complex
and difficult to maintain. My PEP 509 requires 8 methods (including
the constructor) to update the __version__.

> Here is where I have to disagree.  I hate it when experts say "we'll just
> document it and then it's the user's fault for misusing it".  Yeah, you're
> right, but as a user, it is very frustrating to have to read other people's
> documentation.  You know that some elite Python programmer is going to
> optimize his code using this and someone years later is going to scratch his
> head wondering where __version__ is coming from.  Is it provided by the
> caller?  Was it added to the object at some earlier point?  Finally, he'll
> search the web, arrive at a stackoverflow question with 95 upvotes that
> finally clears things up.  And for what?  Some minor optimization. (Not
> Victor's optimization, but a Python user's optimization in Python code.)

I agree that it would be bad practice to use __version__ widely in a
project to manually micro-optimize an application. Well,
micro-optimizations are bad practice in most cases ;-) Remember that
dict lookups have a complexity of O(1), that's why they are used for
namespaces ;-)

It's a bad idea because at the Python level, the dict lookup and
checking the version has... the same cost! (48.7 ns vs 47.5 ns... a
difference of 1 nanosecond)

haypo at smithers$ ./python -m timeit -s 'd = {str(i):i for i in range(100)}' 'd["33"] == 33'
10000000 loops, best of 3: 0.0487 usec per loop
haypo at smithers$ ./python -m timeit -s 'd = {str(i):i for i in range(100)}' 'd.__version__ == 100'
10000000 loops, best of 3: 0.0475 usec per loop

The difference is only visible at the C level:

* PyObject_GetItem: 16.5 ns
* PyDict_GetItem: 14.8 ns
* fat.GuardDict: 3.8 ns (check dict.__version__)

Well, 3.8 ns (guard) vs 14.8 ns (dict lookup) is nice but not so
amazing, a dict lookup is already *fast*. The difference between
guards and dict lookups is that a guard check has a complexity of O(1)
in the common case (if the dict was not modified), however many names
it protects.  For example, for an optimization relying on 10 global
variables in a function, checking them costs 148 ns with 10 dict
lookups, whereas the guard still costs only 3.8 ns (39x as fast).
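
The arithmetic behind the last figure is 10 x 14.8 ns = 148 ns, and
148 / 3.8 is roughly 39. The structure of such a guard can be sketched in
Python as follows; this is only an illustration of the fast path and slow
path, assuming a toy dict subclass with a version counter, not the C
implementation measured above. VersionedDict and Guard are made-up names.

```python
_MISSING = object()


class VersionedDict(dict):
    """Toy stand-in: only item assignment and deletion bump the version."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


class Guard:
    """Watch several names in one namespace with a single version check."""

    def __init__(self, namespace, names):
        self.namespace = namespace
        self.version = namespace.version
        self.watched = {name: namespace.get(name, _MISSING) for name in names}

    def check(self):
        # Fast path: one integer comparison, whatever the number of names.
        if self.namespace.version == self.version:
            return True
        # Slow path: the dict changed, so look at each watched name.
        for name, value in self.watched.items():
            if self.namespace.get(name, _MISSING) is not value:
                return False
        # Only unrelated keys changed: remember the new version.
        self.version = self.namespace.version
        return True
```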

The guards must be as cheap as possible, otherwise it will have to
work harder to implement more efficient optimizations :-D

Note: the performance of a dict lookup also depends if the key is
"interned" (in short, it's a kind of singleton to compare strings by
their address instead of having to compare character per character).
For code objects, Python interns strings which are made of characters
a-z, A-Z and "_".
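
A short illustration of what interning provides, using sys.intern()
explicitly. The helper make_name is invented, and whether two equal
strings built at run time happen to be the same object without interning
is an implementation detail that this sketch does not rely on.

```python
import sys


def make_name(parts):
    """Build a string at run time so it is not a compile-time constant."""
    return "".join(parts)


first = make_name(["spam", "_", "eggs"])
second = make_name(["spam", "_", "eggs"])

# After interning, equal strings are represented by one shared object,
# so a comparison can succeed on identity (address) alone.
interned_first = sys.intern(first)
interned_second = sys.intern(second)
```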

Well, it's just to confirm that yes, the PEP is designed to implement
fast guards in C, but it would be a bad idea to start to use it widely
at the Python level.

> Also, using this __version__ in source code is going to complicate switching
> from CPython to any of the other Python implementations, so those
> implementations will probably end up implementing it just to simplify
> "porting", which would otherwise be painless.

IMHO *if* we add __version__ to dict (or even to all mapping types),
it must be done for all Python implementations. It would be really
annoying to have to start putting some kind of #ifdef in the code for a
feature of a core builtin type (dict).

But again, I now agree to not expose the version at the Python level...

> Why don't we leave exposing __version__ in Python to another PEP?

According to this thread and my benchmark above, the __version__
property at the Python level is a *bad* idea. So I'm no longer
interested in exposing it.


From barry at  Mon Jan 11 10:10:41 2016
From: barry at (Barry Warsaw)
Date: Mon, 11 Jan 2016 10:10:41 -0500
Subject: [Python-ideas] PEP 9 - plaintext PEP format - is officially
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 11, 2016, at 03:25 PM, anatoly techtonik wrote:

>On Wed, Jan 6, 2016 at 2:49 AM, Barry Warsaw <barry at> wrote:
>> reStructuredText is clearly a better format
>Can you expand on that? I use markdown everywhere

reST is better than plain text.  Markdown is not a PEP format option.

>> all recent PEP submissions have been in reST for a while now anyway.
>Is it possible to query exact numbers automatically?

Feel free to grep the PEPs hg repo.

>What is the tooling support for handling PEP 9 and PEP 12?

UTSL.  Everything is in the PEPs hg repo.


From victor.stinner at  Mon Jan 11 11:49:21 2016
From: victor.stinner at (Victor Stinner)
Date: Mon, 11 Jan 2016 17:49:21 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

Thank you very much for the first round of comments on this PEP 509
(dict version). I posted a second version to python-dev. The main
changes since the first version are that the dictionary version is no
longer exposed at the Python level and that the field is now also
64 bits wide on 32-bit platforms. Please continue the discussion
there, this thread is now closed ;-)

It's now time to review my second PEP 510 (func.specialize), also
posted on the python-ideas list 3 days ago:
"RFC: PEP: Specialized functions with guards"!


From abarnert at  Mon Jan 11 11:49:45 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 11 Jan 2016 08:49:45 -0800
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 11, 2016, at 04:02, Ram Rachum <ram at> wrote:
> I've chosen += and -=, despite the fact they're not set operations, because Python doesn't have __inand__.

For a property that acts like a number, and presumably is implemented as a subclass of int, this seems like a horribly confusing idea.

> On an unrelated note, maybe we should have __inand__? (I mean x ^~= y)

First, why would you spell inand that way? x ^ ~y is the exclusive or of x and ~y, which is true for ~x and ~y and for x and y. That's completely different from nand, which is true for ~x and ~y, ~x and y, and x and y, but not x and ~y. And neither is what you want, which is true only for x and ~y, which you can easily write as x & ~y.

Second, why do you think you need an i-operation for a combined operator? x &= ~y does the same thing you'd expect from x &~= y.

And why do you think you need an overridable i-operator in the first place? If you call x |= y and x.__ior__ doesn't exist, it just compiles to the same as x = x | y. And, unless x is mutable (which would be very surprising for something that acts like an int), that's actually the way you want it to be interpreted anyway.

All of this implies that adding the 70s bitwise operator syntax for dealing with permissions doesn't help with concise but readable code so much as encourage people who don't actually understand bitwise operations to write things that aren't correct or to misread other people's code. What's wrong with just spelling it "clear"? Or, better, as attribute access ("p.chmod.executable = False" or " = False") or actual set operations with sets of enums instead of integers?

The other advantage of using named operations is that it lets you write things that are useful but can't be expressed in a single bitwise operation. For example, " = q.chmod.owner" is a lot simpler than "p.chmod = p.chmod & ~0o070 | (q.chmod >> 3) & 0o070".
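
The bitwise points above can be checked with a few lines of Python. The
function names are invented for this sketch; the last function is the
expression quoted in the previous paragraph, which copies the owner bits of
one mode into the group bits of another.

```python
def truth_table(op):
    """Evaluate a two-argument bitwise op on single bits (lowest bit only)."""
    return {(x, y): op(x, y) & 1 for x in (0, 1) for y in (0, 1)}


# x ^ ~y is true when both bits are equal; x & ~y only for x set, y clear.
XOR_NOT = truth_table(lambda x, y: x ^ ~y)
AND_NOT = truth_table(lambda x, y: x & ~y)


def clear_bits(mode, mask):
    """Remove the permissions in mask from mode."""
    return mode & ~mask


def copy_owner_to_group(p_mode, q_mode):
    """Give p's group the permissions of q's owner."""
    return p_mode & ~0o070 | (q_mode >> 3) & 0o070
```

The augmented form needs no special operator either: "x &= ~mask" has the
same effect as "x = clear_bits(x, mask)".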

Meanwhile, why are you calling the mode "chmod", which is an abbreviation for "change mode"? That's sort of readable but still weird for the cases when you're modifying it, but completely confusing for cases when you're just reading it off.

Have you looked at the existing alternatives on PyPI? If so, why isn't one of them good enough?

And meanwhile, why not just put your library on PyPI and see if others take it up and start using it? Is there a reason this has to be in the stdlib (and only available on 3.6+) to be usable?

From rosuav at  Mon Jan 11 11:53:04 2016
From: rosuav at (Chris Angelico)
Date: Tue, 12 Jan 2016 03:53:04 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 12, 2016 at 3:49 AM, Andrew Barnert <abarnert at> wrote:
> On Jan 11, 2016, at 04:02, Ram Rachum <ram at> wrote:
>> I've chosen += and -=, despite the fact they're not set operations, because Python doesn't have __inand__.
> For a property that acts like a number, and presumably is implemented as a subclass of int, this seems like a horribly confusing idea.

I would expect it NOT to be a subclass of int, actually - just that it
has __int__ (and maybe __index__) to convert it to one.


From guido at  Mon Jan 11 12:27:20 2016
From: guido at (Guido van Rossum)
Date: Mon, 11 Jan 2016 09:27:20 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

Unless there's a huge outcry I'm going to add this as an informational
section to PEP 484.

--Guido van Rossum (

From guido at  Mon Jan 11 12:38:14 2016
From: guido at (Guido van Rossum)
Date: Mon, 11 Jan 2016 09:38:14 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>


(I'm happy to change or move this if there *is* a serious concern -- but I
figured if there isn't I might as well get it over with.)

On Mon, Jan 11, 2016 at 9:27 AM, Guido van Rossum <guido at> wrote:

> Unless there's a huge outcry I'm going to add this as an informational
> section to PEP 484.
> --
> --Guido van Rossum (

--Guido van Rossum (

From kramm at  Mon Jan 11 13:10:20 2016
From: kramm at (Matthias Kramm)
Date: Mon, 11 Jan 2016 10:10:20 -0800 (PST)
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Friday, January 8, 2016 at 3:06:05 PM UTC-8, Guido van Rossum wrote:
> At Dropbox we're trying to be good citizens and we're working towards 
> introducing gradual typing (PEP 484) into our Python code bases (several 
> million lines of code). However, that code base is mostly still Python 2.7 
> and we believe that we should introduce gradual typing first and start 
> working on conversion to Python 3 second (since having static types in the 
> code can help a big refactoring like that).
> Since Python 2 doesn't support function annotations we've had to look for 
> alternatives. We considered stub files, a magic codec, docstrings, and 
> additional `# type:` comments. In the end we decided that `# type:` 
> comments are the most robust approach.

FWIW, we had the same problem at Google. (Almost) all our code is Python 2. 
However, we went the route of backporting the type annotations grammar from 
Python 3. We now run a custom Python 2 that knows about PEP 3107.

The primary reasons are aesthetic - PEP 484 syntax is already a bit hard on 
the eyes (capitalized container names, square brackets, quoting, ...) , and 
squeezing it all into comments wouldn't have helped matters, and would have 
hindered adoption.

We're still happy with our decision of running a custom Python 2, but your 
mileage might vary. It's certainly true that other tools (pylint etc.) need 
to learn to not be confused by the "odd" Python 2 syntax.
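
For comparison, a small sketch of the two spellings under discussion,
written so that both parse under Python 3. The function names are made up
and the bodies are empty; the comment form follows the `# type:` signature
syntax proposed in this thread.

```python
def embezzle_annotated(account: str, funds: int = 1000000,
                       *fake_receipts: str) -> None:
    """PEP 3107 annotation syntax: a syntax error on stock Python 2."""


def embezzle_commented(account, funds=1000000, *fake_receipts):
    # type: (str, int, *str) -> None
    """Comment-based form: plain Python 2 syntax, read by the type checker."""
```

The annotated form is visible at run time through __annotations__, while
the comment form leaves no trace in the function object.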

> [1] I have a prototype of such a tool, implemented as a 2to3 fixer. It's a 
> bit over 200 lines. It's not very interesting yet, since it sets the types 
> of nearly all arguments to 'Any'. We're considering building a much more 
> advanced version that tries to guess much better argument types using some 
> form of whole-program analysis. I've heard that Facebook's Hack project got 
> a lot of mileage out of such a tool. I don't know how to write it yet 
> -- possibly we could use a variant of mypy's type inference engine, or 
> alternatively we might be able to use something like Jedi (

pytype ( already does (context sensitive, 
path-sensitive) whole-program analysis, and we're working on making it 
(more) PEP 484 compatible. We're also writing a (2to3 based) tool for 
inserting the derived types back into the source code. Should we join 
forces?



From chris.barker at  Mon Jan 11 13:35:00 2016
From: chris.barker at (Chris Barker)
Date: Mon, 11 Jan 2016 10:35:00 -0800
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 7:41 AM, Ram Rachum <ram at> wrote:

>> What's wrong with referencing other modules?
> Not wrong, just desirable to avoid. For example, I think that doing
> `path.chmod(x)` is preferable to `os.chmod(path, x)`.

I often prefer OO structure as well, but you can get that by subclassing
Path -- it doesn't need to be in the stdlib.



Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at

From greg at  Mon Jan 11 13:42:30 2016
From: greg at (Gregory P. Smith)
Date: Mon, 11 Jan 2016 18:42:30 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 4:48 AM M.-A. Lemburg <mal at> wrote:

> On 09.01.2016 00:04, Guido van Rossum wrote:
> > Since Python 2 doesn't support function annotations we've had to look for
> > alternatives. We considered stub files, a magic codec, docstrings, and
> > additional `# type:` comments. In the end we decided that `# type:`
> > comments are the most robust approach. We've experimented a fair amount
> > with this and we have a proposal for a standard.
> >
> > The proposal is very simple. Consider the following function with Python 3
> > annotations:
> >
> >     def embezzle(self, account: str, funds: int = 1000000,
> >                  *fake_receipts: str) -> None:
> >         """Embezzle funds from account using fake receipts."""
> >         <code goes here>
> >
> > An equivalent way to write this in Python 2 is the following:
> >
> >     def embezzle(self, account, funds=1000000, *fake_receipts):
> >         # type: (str, int, *str) -> None
> >         """Embezzle funds from account using fake receipts."""
> >         <code goes here>
> By using comments, the annotations would not be available at
> runtime via an .__annotations__ attribute and every tool would
> have to implement a parser for extracting them.
> Wouldn't it be better and more in line with standard Python
> syntax to use decorators to define them ?
>     @typehint("(str, int, *str) -> None")
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         """Embezzle funds from account using fake receipts."""
>         <code goes here>
> This would work in Python 2 as well and could (optionally)
> add an .__annotations__ attribute to the function/method,
> automatically create a type annotations file upon import,
> etc.

The goal of the # type: comments as described is to have this information
for offline analysis of code, not to make it available at run time.  Yes, a
decorator syntax could be adopted if anyone needs that. I don't expect
anyone does. Decorators and attributes would add run time cpu and memory
overhead whether the information was going to be used at runtime or not
(likely not; nobody is likely to *deploy* code that looks at
__annotations__).


> --
> Marc-Andre Lemburg
> Professional Python Services directly from the Experts (#1, Jan 09 2016)
> >>> Python Projects, Coaching and Consulting ...
> >>> Python Database Interfaces ... 
> >>> Plone/Zope Database Interfaces ... 
> ________________________________________________________________________
> ::: We implement business ideas - efficiently in both time and costs :::
> Software, Skills and Services GmbH  Pastor-Loeh-Str.48
>     D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
>            Registered at Amtsgericht Duesseldorf: HRB 46611
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From greg at  Mon Jan 11 13:57:43 2016
From: greg at (Gregory P. Smith)
Date: Mon, 11 Jan 2016 18:57:43 +0000
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 6, 2016 at 3:05 PM Brendan Moloney <moloney at> wrote:

> It's important to keep in mind the main benefit of scandir is you don't
> have to do ANY stat call in many cases, because the directory listing
> provides some subset of this info. On Linux you can at least tell if a path
> is a file or directory.  On windows there is much more info provided by the
> directory listing. Avoiding subsequent stat calls is also nice, but not
> nearly as important due to OS level caching.

+1 - this was one of the two primary motivations behind scandir.  Anything
trying to reimplement a filesystem tree walker without using scandir is
going to have sub-standard performance.

If we ever offer anything with "find like functionality" related to
pathlib, it *needs* to be based on scandir.  Anything else would just be
repeating the convenient but untrue limiting assumptions of os.listdir:
That the contents of a directory can be loaded into memory and that we
don't mind re-querying the OS for stat information that it already gave us
but we threw away as part of reading the directory.
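
To make the point concrete, here is a minimal sketch (Python 3.6+ for the
context-manager form of os.scandir). The helper name count_entries is made up
for illustration; only os.scandir and the DirEntry methods are real APIs.

```python
import os


def count_entries(directory):
    """Return (n_dirs, n_files) for one directory, read via os.scandir()."""
    n_dirs = n_files = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            # DirEntry.is_dir()/is_file() can usually answer from the data
            # returned by the directory listing, without a separate stat call.
            if entry.is_dir(follow_symlinks=False):
                n_dirs += 1
            elif entry.is_file(follow_symlinks=False):
                n_files += 1
    return n_dirs, n_files
```

Entries are consumed one at a time, so the whole listing never has to sit in
memory the way it does with os.listdir.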


> Brendan Moloney
> Research Associate
> Advanced Imaging Research Center
> Oregon Health Science University
> *From:* Python-ideas [ at]
> on behalf of Guido van Rossum [guido at]
> *Sent:* Wednesday, January 06, 2016 2:42 PM
> *To:* Random832
> *Cc:* Python-Ideas
> *Subject:* Re: [Python-ideas] find-like functionality in pathlib
> I couldn't help myself and coded up a prototype for the StatCache design I
> sketched. See Feedback welcome! On my
> Mac it only seems to offer limited benefits though...

From p.f.moore at  Mon Jan 11 15:00:54 2016
From: p.f.moore at (Paul Moore)
Date: Mon, 11 Jan 2016 20:00:54 +0000
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

On 11 January 2016 at 18:57, Gregory P. Smith <greg at> wrote:
> On Wed, Jan 6, 2016 at 3:05 PM Brendan Moloney <moloney at> wrote:
>> It's important to keep in mind the main benefit of scandir is you don't
>> have to do ANY stat call in many cases, because the directory listing
>> provides some subset of this info. On Linux you can at least tell if a path
>> is a file or directory.  On windows there is much more info provided by the
>> directory listing. Avoiding subsequent stat calls is also nice, but not
>> nearly as important due to OS level caching.
> +1 - this was one of the two primary motivations behind scandir.  Anything
> trying to reimplement a filesystem tree walker without using scandir is
> going to have sub-standard performance.
> If we ever offer anything with "find like functionality" related to pathlib,
> it needs to be based on scandir.  Anything else would just be repeating the
> convenient but untrue limiting assumptions of os.listdir: That the contents
> of a directory can be loaded into memory and that we don't mind re-querying
> the OS for stat information that it already gave us but we threw away as
> part of reading the directory.

This is very much why I feel that we need something in pathlib. I
understand the motivation for not caching stat information in path
objects. And I don't have a viable design for how a "find-like
functionality" API should be implemented in pathlib. But as it stands,
I feel as though using pathlib for anything that does bulk filesystem
scans is deliberately choosing something that I know won't scale well.
So (in my mind) pathlib doesn't fulfil the role of "one obvious way to
do things". Which is a shame, because Path.rglob is very often far
closer to what I need in my programs than os.walk (even when it's just

In practice, by far the most common need I have[1] for filetree
walking is to want to get back a list of all the names of files
starting at a particular directory with the returned filenames
*relative to the given root*. Pathlib.rglob gives absolute pathnames.
os.walk gives the absolute directory name and the base filename.
Neither is what I want, although obviously in both cases it's pretty
trivial to extract the "relative to the root" part from the returned
data. But an API that gave that information directly, with
scandir-level speed and scalability, in the form of pathlib.Path
relative path objects, would be ideal for me[2].
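
A rough sketch of what such a utility function might look like, assuming
Python 3.6+. The name iter_relative_files is hypothetical and this is not an
existing pathlib API; it is just os.scandir plus Path arithmetic.

```python
import os
from pathlib import Path


def iter_relative_files(root):
    """Yield each non-directory entry below root as a Path relative to root."""
    pending = [Path()]  # relative directories still to scan; Path() is "."
    while pending:
        rel_dir = pending.pop()
        with os.scandir(os.path.join(root, rel_dir)) as entries:
            for entry in entries:
                rel_path = rel_dir / entry.name
                if entry.is_dir(follow_symlinks=False):
                    # Symlinked directories are not followed in this sketch.
                    pending.append(rel_path)
                else:
                    yield rel_path
```

It deliberately has no pruning, glob matching, or "include directories"
switches; those are the features that make the API design hard.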


[1] And yes, I know this means I should just write a utility function for it :-)
[2] The feature creep starts when people want to control things like
pruning particular directories such as '.git', or only matching
particular glob patterns, or choosing whether or not to include
directories in the output, or... Adding *those* features without
ending up with a Frankenstein's monster of an API is the challenge :-)

From abarnert at  Mon Jan 11 15:22:55 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 11 Jan 2016 12:22:55 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 11, 2016, at 10:42, Gregory P. Smith <greg at> wrote:
> The goal of the # type: comments as described is to have this information for offline analysis of code, not to make it available at run time.  Yes, a decorator syntax could be adopted if anyone needs that. I don't expect anyone does. Decorators and attributes would add run time cpu and memory overhead whether the information was going to be used at runtime or not (likely not; nobody is likely to deploy code that looks at __annotations__).

These same arguments were made against PEP 484 in the first place, and (I think rightly) dismissed.

3.x code with annotations incurs a memory overhead, even though most runtime code is never going to use them. That was considered to be acceptable. So why isn't it acceptable for the same code before it's ported to 3.x? Or, conversely, if it isn't acceptable in 2.x, why isn't it a serious blocking regression that, once the port is completed and you're running under 3.x, you're now wasting memory for those useless annotations?
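
For reference, this is the runtime state in question under Python 3, using the
example function from the proposal (the body is omitted here):

```python
def embezzle(self, account: str, funds: int = 1000000,
             *fake_receipts: str) -> None:
    """Embezzle funds from account using fake receipts."""


# The annotations are stored on the function object as an ordinary dict,
# whether or not any code ever reads them.
print(embezzle.__annotations__)
```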

Meanwhile, when _are_ annotations useful at runtime? Mostly during the kind of debugging that you'll be doing during something like a port from 2.x to 3.x. While you're still, by necessity, running under 2.x. If they're not useful there, it's hard to imagine why they'd be useful after the port is done, when you're deploying your 3.x code.

So it seems like using decorators (or backporting the syntax, as Google has done) had better be acceptable for 2.7, or the PEP 484 design has a serious problem, and in a few months we're going to see Dropbox and Google and everyone else demanding a way to use type hinting without wasting memory on annotations at runtime in 3.x.

From guido at  Mon Jan 11 15:41:48 2016
From: guido at (Guido van Rossum)
Date: Mon, 11 Jan 2016 12:41:48 -0800
Subject: [Python-ideas] find-like functionality in pathlib
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 10:57 AM, Gregory P. Smith <greg at> wrote:

> On Wed, Jan 6, 2016 at 3:05 PM Brendan Moloney <moloney at> wrote:
>> It's important to keep in mind the main benefit of scandir is you don't
>> have to do ANY stat call in many cases, because the directory listing
>> provides some subset of this info. On Linux you can at least tell if a path
>> is a file or directory.  On windows there is much more info provided by the
>> directory listing. Avoiding subsequent stat calls is also nice, but not
>> nearly as important due to OS level caching.
> +1 - this was one of the two primary motivations behind scandir.  Anything
> trying to reimplement a filesystem tree walker without using scandir is
> going to have sub-standard performance.
> If we ever offer anything with "find like functionality" related to
> pathlib, it *needs* to be based on scandir.  Anything else would just be
> repeating the convenient but untrue limiting assumptions of os.listdir:
> That the contents of a directory can be loaded into memory and that we
> don't mind re-querying the OS for stat information that it already gave us
> but we threw away as part of reading the directory.

And we already have this in the form of pathlib's [r]glob() methods.
There's a patch to the glob module in and
as soon as that's committed I hope that its author(s) will work on doing a
similar patch for pathlib's [r]glob (tracking this in

--Guido van Rossum (

From abarnert at  Mon Jan 11 15:44:18 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 11 Jan 2016 12:44:18 -0800
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 11, 2016, at 08:53, Chris Angelico <rosuav at> wrote:
>> On Tue, Jan 12, 2016 at 3:49 AM, Andrew Barnert <abarnert at> wrote:
>>> On Jan 11, 2016, at 04:02, Ram Rachum <ram at> wrote:
>>> I've chosen += and -=, despite the fact they're not set operations, because Python doesn't have __inand__.
>> For a property that acts like a number, and presumably is implemented as a subclass of int, this seems like a horribly confusing idea.
> I would expect it NOT to be a subclass of int, actually - just that it
> has __int__ (and maybe __index__) to convert it to one.

If you read his proposal, he wants oct(path.chmod) to work. That doesn't work on types with __int__. 

Of course it does work on types with __index__, but that's because the whole point of __index__ is to allow your type to act like an actual int everywhere that Python expects an int, rather than just something coercible to int. The point of PEP
357 was to allow numpy.int64 to act as close to a subtype of int as possible without actually being a subtype.
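
A small illustration of the difference under Python 3. The two classes are
made up for the example; only oct(), __int__ and __index__ are real.

```python
class OnlyInt:
    """Convertible with int(), but not usable where an integer is required."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value


class WithIndex:
    """Usable anywhere Python expects an actual integer."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


print(oct(WithIndex(0o644)))  # oct() goes through __index__

try:
    oct(OnlyInt(0o644))       # __int__ alone is not enough for oct()
except TypeError as exc:
    print("TypeError:", exc)
```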

It would be very surprising for, say, IntEnum (which subclasses int), or numpy.int64 (which uses __index__), to offer an __add__ method that actually did an or instead of an add. It will be just as surprising here. 

And the fact that he wants to make it possible (in fact, _encouraged_) to directly assign an int to the property makes it even more confusing.

For example, "p.chmod = q + 0o010" does one thing if q is an integer, and another thing if it's the chmod of another path object.
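
To show the kind of silent divergence I mean, here is a toy stand-in. Mode is
a hypothetical class written only for this example, not anything from the
proposal's actual code; it uses __index__ and makes + set bits.

```python
import operator


class Mode:
    """Hypothetical mode object whose + sets permission bits (bitwise OR)."""

    def __init__(self, bits):
        self.bits = operator.index(bits)

    def __index__(self):
        return self.bits

    def __add__(self, other):
        return Mode(self.bits | operator.index(other))


q_int = 0o110
q_mode = Mode(0o110)

print(oct(q_int + 0o010))   # integer addition
print(oct(q_mode + 0o010))  # bitwise OR; the 0o010 bit was already set
```

The same expression yields 0o120 for the plain int and 0o110 for the mode
object, and nothing raises.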

(Of course he also wants to be able to assign a string, but that's not a problem; if you mix up str and int, you get a nice TypeError, as opposed to mixing up int and an int subclass or __index__-using class, where you silently get incorrect behavior.)

From guido at  Mon Jan 11 16:22:11 2016
From: guido at (Guido van Rossum)
Date: Mon, 11 Jan 2016 13:22:11 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 10:10 AM, Matthias Kramm <kramm at> wrote:

> On Friday, January 8, 2016 at 3:06:05 PM UTC-8, Guido van Rossum wrote:
>> At Dropbox we're trying to be good citizens and we're working towards
>> introducing gradual typing (PEP 484) into our Python code bases (several
>> million lines of code). However, that code base is mostly still Python 2.7
>> and we believe that we should introduce gradual typing first and start
>> working on conversion to Python 3 second (since having static types in the
>> code can help a big refactoring like that).
>> Since Python 2 doesn't support function annotations we've had to look for
>> alternatives. We considered stub files, a magic codec, docstrings, and
>> additional `# type:` comments. In the end we decided that `# type:`
>> comments are the most robust approach.
> FWIW, we had the same problem at Google. (Almost) all our code is Python
> 2. However, we went the route of backporting the type annotations grammar
> from Python 3. We now run a custom Python 2 that knows about PEP 3107.

Yeah, we looked into this but we use many 3rd party tools that would not
know what to do with the new syntax, so that's why we went the route of
adding support for these comments to mypy.

> The primary reasons are aesthetic - PEP 484 syntax is already a bit hard
> on the eyes (capitalized container names, square brackets, quoting, ...) ,
> and squeezing it all into comments wouldn't have helped matters, and would
> have hindered adoption.

Possibly. I haven't had any pushback about this from the Dropbox engineers
who have seen this so far.

> We're still happy with our decision of running a custom Python 2, but your
> mileage might vary. It's certainly true that other tools (pylint etc.) need
> to learn to not be confused by the "odd" Python 2 syntax.

We had some relevant experience with pyxl, and basically it wasn't good --
too many tools had to have custom support added or simply can't be used on
files containing pyxl syntax. (

>> [1] I have a prototype of such a tool, implemented as a 2to3 fixer. It's
>> a bit over 200 lines. It's not very interesting yet, since it sets the
>> types of nearly all arguments to 'Any'. We're considering building a much
>> more advanced version that tries to guess much better argument types using
>> some form of whole-program analysis. I've heard that Facebook's Hack
>> project got a lot of mileage out of such a tool. I don't yet know how to
>> write it yet -- possibly we could use a variant of mypy's type inference
>> engine, or alternatively we might be able to use something like Jedi (
> pytype ( already does (context sensitive,
> path-sensitive) whole-program analysis, and we're working on making it
> (more) PEP 484 compatible. We're also writing a (2to3 based) tool for
> inserting the derived tools back into the source code. Should we join
> forces?

I would love to! Perhaps we can take this discussion off line?

--Guido van Rossum (

From guido at  Mon Jan 11 16:38:53 2016
From: guido at (Guido van Rossum)
Date: Mon, 11 Jan 2016 13:38:53 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 12:22 PM, Andrew Barnert <abarnert at> wrote:

> On Jan 11, 2016, at 10:42, Gregory P. Smith <greg at> wrote:
>> The goal of the # type: comments as described is to have this information
>> for offline analysis of code, not to make it available at run time.  Yes, a
>> decorator syntax could be adopted if anyone needs that. I don't expect
>> anyone does. Decorators and attributes would add run time cpu and memory
>> overhead whether the information was going to be used at runtime or not
>> (likely not; nobody is likely to *deploy* code that looks at
>> __annotations__).
> These same arguments were made against PEP 484 in the first place, and (I
> think rightly) dismissed.

The way I recall it the argument was made against using decorators for PEP
484 and we rightly decided not to use decorators.

> 3.x code with annotations incurs a memory overhead, even though most
> runtime code is never going to use them. That was considered to be
> acceptable. So why isn't it acceptable for the same code before it's ported
> to 3.x? Or, conversely, if it isn't acceptable in 2.x, why isn't it a
> serious blocking regression that, once the port is completed and you're
> running under 3.x, you're now wasting memory for those useless annotations?

I'm not objecting to the memory overhead of using decorators, but to the
execution time (the extra function call). And the scope for the proposal is
much smaller -- while PEP 484 is the first step on a long road towards
integrating gradual (i.e. OPTIONAL) typing into Python, the proposal on the
table today is only meant for annotating Python 2.7 code so we can get rid
of it more quickly.

> Meanwhile, when _are_ annotations useful at runtime? Mostly during the
> kind of debugging that you'll be doing during something like a port from
> 2.x to 3.x. While you're still, by necessity, running under 2.x. If they're
> not useful there, it's hard to imagine why they'd be useful after the port
> is done, when you're deploying your 3.x code.

I'm not sure how to respond to this -- I disagree with your prediction but
I don't think either of us really has any hard data from experience yet.

I am however going to be building the kind of experience that might
eventually be used to decide this, over the next few years. The first step
is going to introduce annotations into Python 2.7 code, and I know my
internal customers well enough to know that convincing them that we should
use decorators for annotations would be a much bigger battle than putting
annotations in comments. Since I have many other battles to fight I would
like this one to be as short as possible.

> So it seems like using decorators (or backporting the syntax, as Google has
> done) had better be acceptable for 2.7, or the PEP 484 design has a serious
> problem, and in a few months we're going to see Dropbox and Google and
> everyone else demanding a way to use type hinting without wasting memory on
> annotations at runtime in 3.x.

Again, I disagree with your assessment but it's difficult to prove anything
without hard data.

One possible argument may be that Python 3 offers a large package of
combined run-time advantages, with some cost that's hard to separate.
However, for Python 2.7 there's either a run-time cost or there's no
run-time cost -- there's no run-time benefit. And I don't want to have to
calculate how many extra machines we'll need to provision in order to make
up for the run-time cost.

--Guido van Rossum (

From pavol.lisy at  Mon Jan 11 16:48:16 2016
From: pavol.lisy at (Pavol Lisy)
Date: Mon, 11 Jan 2016 22:48:16 +0100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

What about this?

def embezzle(self, account: "PEP3107 annotation"):
    # type: (str) -> Any
    """Embezzle funds from account using fake receipts."""
    <code goes here>


And BTW in PEP484 text ->

Functions with the @no_type_check decorator or with a # type: ignore
comment should be treated as having no annotations.

could be probably? ->

Functions with the @no_type_check decorator or with a # type: ignore
comment should be treated as having no type hints.

From mertz at  Mon Jan 11 16:49:50 2016
From: mertz at (David Mertz)
Date: Mon, 11 Jan 2016 13:49:50 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

> > An equivalent way to write this in Python 2 is the following:
> >
> >     def embezzle(self, account, funds=1000000, *fake_receipts):
> >         # type: (str, int, *str) -> None
> >         """Embezzle funds from account using fake receipts."""
> >         <code goes here>
> By using comments, the annotations would not be available at
> runtime via an .__annotations__ attribute and every tool would
> have to implement a parser for extracting them.
> Wouldn't it be better and more in line with standard Python
> syntax to use decorators to define them ?
>     @typehint("(str, int, *str) -> None")
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         """Embezzle funds from account using fake receipts."""
>         <code goes here>

I really like MAL's variation much better.  Being able to see
.__annotations__ at runtime feels like an important feature that we'd give
up with the purely comment style.

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From guido at  Mon Jan 11 16:52:28 2016
From: guido at (Guido van Rossum)
Date: Mon, 11 Jan 2016 13:52:28 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 1:48 PM, Pavol Lisy <pavol.lisy at> wrote:

> What about this?
> def embezzle(self, account: "PEP3107 annotation"):
>     # type: (str) -> Any
>     """Embezzle funds from account using fake receipts."""
>     <code goes here>

I don't understand your proposal -- this is not valid Python 2.7 syntax so
we cannot use it.

> ---
> And BTW in PEP484 text ->
> Functions with the @no_type_check decorator or with a # type: ignore
> comment should be treated as having no annotations.
> could be probably? ->
> Functions with the @no_type_check decorator or with a # type: ignore
> comment should be treated as having no type hints.

In the context of the PEP the latter interpretation is already implied, so
I don't think I need to update the text.

--Guido van Rossum (

From greg at  Mon Jan 11 17:21:33 2016
From: greg at (Gregory P. Smith)
Date: Mon, 11 Jan 2016 22:21:33 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 1:50 PM David Mertz <mertz at> wrote:

> > An equivalent way to write this in Python 2 is the following:
>> >
>> >     def embezzle(self, account, funds=1000000, *fake_receipts):
>> >         # type: (str, int, *str) -> None
>> >         """Embezzle funds from account using fake receipts."""
>> >         <code goes here>
>> By using comments, the annotations would not be available at
>> runtime via an .__annotations__ attribute and every tool would
>> have to implement a parser for extracting them.
>> Wouldn't it be better and more in line with standard Python
>> syntax to use decorators to define them ?
>>     @typehint("(str, int, *str) -> None")
>>     def embezzle(self, account, funds=1000000, *fake_receipts):
>>         """Embezzle funds from account using fake receipts."""
>>         <code goes here>
> I really like MAL's variation much better.  Being able to see
> .__annotations__ at runtime feels like an important feature that we'd give
> up with the purely comment style.

I'd like people who demonstrate practical important production uses for
having .__annotations__ information available at runtime to champion that.
Both Google and Dropbox are looking at it as only being meaningful in the
offline code analysis context. Even our (Google's) modified 2.7 with
annotation grammar backported is just that, grammar only, no
.__annotations__ or even validation of names while parsing. It may as well
be a # type: comment. We explicitly chose not to use decorators due to
their resource usage side effects.

2.7.x itself understandably is... highly unlikely to be modified... to put
it lightly. So a backport of ignored annotation syntax is a non-starter
there. In that sense I think the # type: comments are fine and are pretty
much what I've been expecting to see. The only other alternative not yet
mentioned would be to put the information in the docstring. But that has
yet other side effects and challenges. So the comments make a lot of sense
to recommend for Python 2 within the PEP.
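
As a rough idea of how little an offline tool needs in order to find these
comments, here is a sketch using the stdlib tokenize module (written for
Python 3). The helper name find_type_comments is made up, and this is not how
mypy or pytype actually do it; it only locates the comments and does not
parse the signature inside them.

```python
import io
import tokenize

SOURCE = (
    "def embezzle(self, account, funds=1000000, *fake_receipts):\n"
    "    # type: (str, int, *str) -> None\n"
    '    """Embezzle funds from account using fake receipts."""\n'
    "    pass\n"
)

PREFIX = "# type:"


def find_type_comments(source):
    """Return (line_number, signature) pairs for every `# type:` comment."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and tok.string.startswith(PREFIX):
            found.append((tok.start[0], tok.string[len(PREFIX):].strip()))
    return found


print(find_type_comments(SOURCE))
```

Nothing here runs when the annotated module itself is imported, which is the
property being argued for.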

.__annotations__ isn't something any Python 2 code has ever had in the
past. It can continue to live without it. I do not believe we need to
formally recommend a decorator and its implementation in the PEP. (read
another way: I do not expect Guido to do that work... but anyone is free to
propose it and see if anyone else wants to adopt it)


From mal at  Mon Jan 11 17:38:40 2016
From: mal at (M.-A. Lemburg)
Date: Mon, 11 Jan 2016 23:38:40 +0100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On 11.01.2016 22:38, Guido van Rossum wrote:
> On Mon, Jan 11, 2016 at 12:22 PM, Andrew Barnert <abarnert at> wrote:
>> On Jan 11, 2016, at 10:42, Gregory P. Smith <greg at> wrote:
>> The goal of the # type: comments as described is to have this information
>> for offline analysis of code, not to make it available at run time.  Yes, a
>> decorator syntax could be adopted if anyone needs that. I don't expect
>> anyone does. Decorators and attributes would add run time cpu and memory
>> overhead whether the information was going to be used at runtime or not
>> (likely not; nobody is likely to *deploy* code that looks at
>> __annotations__).
>> These same arguments were made against PEP 484 in the first place, and (I
>> think rightly) dismissed.
> The way I recall it the argument was made against using decorators for PEP
> 484 and we rightly decided not to use decorators.

To clarify: My suggestion to use a simple decorator with essentially
the same syntax as proposed for the "# type: comments " was meant
as *additional* allowed syntax, not necessarily as the only one
to standardize.

I'm a bit worried that by standardizing on using comments
for these annotations only, we'll end up having people not
use the type annotations because they simply don't like the
style of having function bodies begin with comments instead
of doc-strings. I certainly wouldn't want to clutter up my
code like that. Tools parsing Python 2 source code may
also have a problem with this (e.g. not recognize the
doc-string anymore).

This simply reads better, IMO:

    @typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        <code goes here>

and it has the advantage of letting the decorator
do additional things, such as taking the annotations and
writing out a type annotations file for Python 3 and other
tools to use.
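
A minimal sketch of what such a decorator might look like. The name
typehint comes from the example above; storing the string in a
__typehint__ attribute is an assumption made here for illustration,
not something the proposal specifies.

```python
def typehint(signature):
    """Return a decorator that records a PEP 484 style type string."""
    def decorate(func):
        # Hypothetical attribute name; the thread does not define one.
        func.__typehint__ = signature
        return func  # the function itself is returned unwrapped
    return decorate


class Accountant(object):
    @typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        return None


print(Accountant.embezzle.__typehint__)
print(Accountant.embezzle.__doc__)
```

Because the function is returned unwrapped, the runtime cost in this
sketch is limited to one extra call at definition time plus the stored
string.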

We could also use a variant of the two proposals and
additionally allow this syntax:

    #@typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        <code goes here>

to avoid memory and runtime overhead, if that's a problem.

Moving from one to the other would then be a simple
search&replace over the source code.

Or we could have -O remove all those typehint decorator
calls from the byte code to a similar effect.

Code written for Python 2 & 3 will have to stick to the
proposed syntax for quite a while, so we should try to find
something that doesn't introduce a new syntax variant of how
to specify additional function/method properties, because people
are inevitably going to start using the same scheme for all
sorts of other crazy stuff and this would make Python code look
closer to Java than necessary, IMO:

def embezzle(self, account, funds=1000000, *fake_receipts):
    # type: (str, int, *str) -> None
    # raises: ValueError, TypeError
    # optimize: jit, inline_globals
    # tracebacks: hide_locals
    # reviewed_by: xyz, abc
    """Embezzle funds from account using fake receipts."""
    <code goes here>

Marc-Andre Lemburg

Professional Python Services directly from the Experts (#1, Jan 11 2016)
>>> Python Projects, Coaching and Consulting ...
>>> Python Database Interfaces ... 
>>> Plone/Zope Database Interfaces ... 

::: We implement business ideas - efficiently in both time and costs ::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From pavol.lisy at  Mon Jan 11 17:41:17 2016
From: pavol.lisy at (Pavol Lisy)
Date: Mon, 11 Jan 2016 23:41:17 +0100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-11 22:52 GMT+01:00, Guido van Rossum <guido at>:
> On Mon, Jan 11, 2016 at 1:48 PM, Pavol Lisy <pavol.lisy at> wrote:
>> What about this?
>> def embezzle(self, account: "PEP3107 annotation"):
>>     # type: (str) -> Any
>>     """Embezzle funds from account using fake receipts."""
>>     <code goes here>
> I don't understand your proposal -- this is not valid Python 2.7 syntax so
> we cannot use it.

I had two things in my mind:

1. suggest some possible impact in the future.

As long as we are writing code compatible with both Python 2 and
Python 3, we will have type hint comments under Python 3 too.

And because they are more compatible, there is a risk(?) that they could
become more popular than the original PEP 484 (for Python 3) proposal!

2. PEP 484 describes a way to support other uses of annotations
and proposes using

   # type: ignore

but a similar way to preserve other uses of annotations could be
(for example):

   # type: (str) -> Any

and this could combine the benefits of type-hinting tools and other kinds
of annotations, at least during a deprecation period (if there is one) for
other annotation types.

From victor.stinner at  Mon Jan 11 17:44:15 2016
From: victor.stinner at (Victor Stinner)
Date: Mon, 11 Jan 2016 23:44:15 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: <>
References: <>
Message-ID: <>

I discussed this PEP on the #pypy IRC channel. I will try to summarize
the discussion with comments on the PEP directly.

2016-01-08 22:31 GMT+01:00 Victor Stinner <victor.stinner at>:
> Add an API to add specialized functions with guards to functions, to
> support static optimizers respecting the Python semantic.

"respecting the Python semantics" is not 100% exact. In fact, my FAT
Python makes subtle changes to the "Python semantics". For example,
loop unrolling can completely remove the call to the range() function. If
a debugger is stepped instruction by instruction, the output is
different on an unrolled loop, since the range() call was removed, and
the loop body is copied. I should maybe elaborate this point in the
rationale, explain that a compromise must be found between the funny
"in Python, everything is mutable" and performance. But remember that
the whole thing (FAT Python, specialization, etc.) is developed
outside CPython and is fully optional.

> Changes
> =======
> * Add two new methods to functions:
>   - ``specialize(code, guards: list)``: add specialized
>     function with guard. `code` is a code object (ex:
>     ``func2.__code__``) or any callable object (ex: ``len``).
>     The specialization can be ignored if a guard already fails.

This method doesn't make sense at all in PyPy. The method is specific
to CPython since it relies on guards which have a pure C API (see
below). The PEP must be more explicit about that. IMHO it's perfectly
fine that PyPy makes this method a no-op (the method does exactly
nothing). It's already the case if a guard "always" fails in
first_check().

>   - ``get_specialized()``: get the list of specialized functions with
>     guards

Again, it doesn't make sense for PyPy. Since this method is only used
for unit tests, it can be converted to a function and put somewhere
else, maybe in the _testcapi module.

It's not a good idea to rely on this method in an application, it's
really an implementation detail.

> * Base ``Guard`` type

In fact, exposing the type at the C level is enough. There is no need
to expose it at Python level, since the type has no method nor data,
and it's not possible to use it in Python. We might expose it in a
different module, again, maybe in _testcapi for unit tests.

>   * ``int check(PyObject *guard, PyObject **stack)``: return 1 on
>     success, 0 if the guard failed temporarily, -1 if the guard will
>     always fail

I forgot the "int na" and "int nk" parameters to support keyword arguments.

Note for myself: I should add support for raising an exception.

>   * ``int first_check(PyObject *guard, PyObject *func)``: return 0 on
>     success, -1 if the guard will always fail

Note for myself: I should rename the method to "init()" and support
raising an exception.

> Behaviour
> =========
> When a function code is replaced (``func.__code__ = new_code``), all
> specialized functions are removed.

Moreover, the PEP must be clear about func.__code__ content:
func.specialize() must *not* modify func.__code__. It should be a
complete black box.
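
To make the intended behaviour concrete, here is a pure-Python model of
specialization with guards. It is only a sketch: the PEP describes a C
level API, and the Specializable class and the guard call signature
used here are hypothetical. The return codes (1 on success, 0 for a
temporary failure, -1 for a permanent one) mirror check() above.

```python
class Specializable:
    """Toy wrapper modelling func.specialize(); not the PEP's C implementation."""

    def __init__(self, func):
        self.func = func
        self.specialized = []  # list of (callable, guards) pairs

    def specialize(self, code, guards):
        self.specialized.append((code, list(guards)))

    def get_specialized(self):
        return list(self.specialized)

    def __call__(self, *args, **kwargs):
        for entry in list(self.specialized):
            code, guards = entry
            results = [guard(args, kwargs) for guard in guards]
            if any(result < 0 for result in results):
                # A guard will always fail: drop this specialization.
                self.specialized.remove(entry)
            elif all(result == 1 for result in results):
                return code(*args, **kwargs)
        # No usable specialization: fall back to the original function.
        return self.func(*args, **kwargs)


config = {"factor": 2}


def scale(x):
    return x * config["factor"]


def scale_by_two(x):
    # Specialized version, only valid while config["factor"] == 2.
    return x + x


def factor_is_two(args, kwargs):
    return 1 if config["factor"] == 2 else 0


fast_scale = Specializable(scale)
fast_scale.specialize(scale_by_two, [factor_is_two])
print(fast_scale(21))   # guard passes: the specialized code runs
config["factor"] = 3
print(fast_scale(21))   # guard fails temporarily: the original code runs
```

Note that the wrapper never touches scale.__code__; the specialized
code lives next to the original function, not inside it.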


From greg at  Mon Jan 11 17:53:25 2016
From: greg at (Gregory P. Smith)
Date: Mon, 11 Jan 2016 22:53:25 +0000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 8, 2016 at 11:44 PM Nick Coghlan <ncoghlan at> wrote:

> On 9 January 2016 at 16:03, Serhiy Storchaka <storchaka at> wrote:
> > On 08.01.16 23:27, Victor Stinner wrote:
> >>
> >> Add a new read-only ``__version__`` property to ``dict`` and
> >> ``collections.UserDict`` types, incremented at each change.
> >
> >
> > This may be not the best name for a property. Many modules already have
> the
> > __version__ attribute, this may make a confusion.
> The equivalent API for the global ABC object graph is
> abc.get_cache_token:
> One of the reasons we chose that name is that even though it's a
> number, the only operation with semantic significance is equality
> testing, with the intended use case being cache invalidation when the
> token changes value.
> If we followed the same reasoning for Victor's proposal, then a
> suitable attribute name would be "__cache_token__".

+1 for consistency.  For most imaginable uses the actual value and type of
the value don't matter; you just care if it is different than the value
you recorded earlier. How the token/version gets mutated should be up to
the implementation within defined parameters such as "the same value is
never re-used twice for the lifetime of a process" (which pretty much
guarantees some form of unsigned 64-bit counter increment - but an
implementation could choose to use 256 bit random numbers for all we really
care).  Calling it __version__ implies numeric, but that isn't a

we _really_ don't want someone to write code depending upon it being a
number and expecting it to change in a given manner so that they do
something conditional on math performed on that number rather than a simple
== vs !=.
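
For comparison, this is how the existing abc.get_cache_token() is
meant to be used (Python 3.4 and later). The Shape class is a
hypothetical ABC invented for this illustration; the point is that the
token is only ever compared for equality.

```python
import abc


class Shape(abc.ABC):
    """Hypothetical ABC used only for this illustration."""


token = abc.get_cache_token()
Shape.register(int)  # changes the global ABC object graph

# Only equality is meaningful: has anything changed since the token was taken?
if abc.get_cache_token() != token:
    print("ABC registrations changed; cached results must be discarded")
```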


From tjreedy at  Mon Jan 11 17:56:11 2016
From: tjreedy at (Terry Reedy)
Date: Mon, 11 Jan 2016 17:56:11 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
 <n6s103$i6f$> <>
 <n6vgkl$alv$> <>
Message-ID: <n71buf$t94$>

On 1/11/2016 1:48 AM, Andrew Barnert via Python-ideas wrote:
> On Jan 10, 2016, at 22:04, Terry Reedy
> <tjreedy at> wrote:
>> On 1/10/2016 12:23 AM, Chris Angelico wrote:
>> (in response to Steven's response to my post)
>>> There's more to it than that. Yes, a dict maps values to values;
>>> but the keys MUST be immutable
>> Keys just have to be hashable; only hashes need to be immutable.
>> By default, hashes depend on ids, which are immutable for a
>> particular object within a run.
>> (otherwise hashing has problems),

A '>' quote mark is missing here.  This line is from Chris.

>> only if the hash depends on values that mutate.  Some do.

In other words, hashes should not depend on values that mutate.  We all 
agree on that.

> But

We all three agree on the following.

> if equality depends on values, the hash has to depend on those
> same values. (Because two values that are equal have to hash equal.)
> Which means that if equality depends on any mutable values, the type
> can't be hashable. Which is why none of the built-in mutable types
> are hashable.

By default, object equality is based on ids.

> But it's not _that_ much of an oversimplification to say that keys
> have to be immutable.

By default, an instance of a subclass of object is mutable, hashable (by 
id, making the hash immutable), and usable as a dict key.  The majority 
of both builtin and user-defined classes follow this pattern and are 
quite usable as keys, contrary to the claim.
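
A small illustration of that default behaviour. The Account class is a
made-up example; it defines neither __eq__ nor __hash__, so it inherits
the identity-based versions from object.

```python
class Account:
    """Plain class: default identity-based __eq__ and __hash__."""

    def __init__(self, owner):
        self.owner = owner


acct = Account("alice")
balances = {acct: 100}
hash_before = hash(acct)

acct.owner = "bob"  # mutate the instance after using it as a key

print(hash(acct) == hash_before)  # the hash did not change
print(balances[acct])             # the lookup still works
```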

Classes with immutable instances (tuples, numbers, strings, frozen sets, 
some extension classes, and user classes that take special measures) are 
exceptions.  So are classes with mutable hashes (lists, sets, dicts, 
some extension classes, and user classes that override __eq__ and
__hash__).

Terry Jan Reedy

From greg at  Mon Jan 11 17:57:14 2016
From: greg at (Gregory P. Smith)
Date: Mon, 11 Jan 2016 22:57:14 +0000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 9, 2016 at 4:09 AM M.-A. Lemburg <mal at> wrote:

> On 09.01.2016 10:58, Victor Stinner wrote:
> > 2016-01-09 9:57 GMT+01:00 Serhiy Storchaka <storchaka at>:
> >>>> This also can be used for better detecting dict mutating during
> >>>> iterating:
> >>>>
> >> (...)
> >>
> >> This makes Raymond's objections even more strong.
> >
> > Raymond has two major objections: memory footprint and performance. I
> > opened an issue with a patch implementing dict.__version__ and I ran
> > pybench:
> >
> >
> > pybench doesn't seem reliable: microbenchmarks on dict seem faster
> > with the patch, it doesn't make sense. I expect worse or same
> > performance.
> >
> > With my own timeit microbenchmarks, I don't see any slowdown with the
> > patch. For an unknown reason (it's really strange), dict operations
> > seem even faster with the patch.
> This can well be caused by a better memory alignment, which
> depends on the CPU you're using.
> > For the memory footprint, it's clearly stated in the PEP that it adds
> > 8 bytes per dict (4 bytes on 32-bit platforms). See the "dict subtype"
> > section which explains why I proposed to modify directly the dict
> > type.
> Some questions:
> * How would the implementation deal with wrap around of the
>   version number for fast changing dicts (esp. on 32-bit platforms) ?
> * Given that this is an optimization and not meant to be exact
>   science, why would we need 64 bits worth of version information ?
>   AFAIK, you only need the version information to be able to
>   answer the question "did anything change compared to last time
>   I looked ?".
>   For an optimization it's good enough to get an answer "yes"
>   for slow changing dicts and "no" for all other cases. False
>   negatives don't really hurt. False positives are not allowed.
>   What you'd need to answer the question is a way for the
>   code in need of the information to remember the dict
>   state and then later compare its remembered state
>   with the now current state of the dict.
>   dicts could do this with a 16-bit index into an array
>   of state object slots which are set by the code tracking
>   the dict.
>   When it's time to check, the code would simply ask for the
>   current index value and compare the state object in the
>   array with the one it had set.

Given it is for optimization only with the fallback slow path being to do
an actual dict lookup, we could implement this using a single bit.

Every modification sets the bit.  There exists an API to clear the bit and
to query the bit.  Nothing else is needed.  The bit could be stored
creatively to avoid increasing the struct size, though ABI compatibility
may prevent that...

> * Wouldn't it be possible to use the hash array itself to
>   store the state index ?
>   We could store the state object as regular key in the
>   dict and filter this out when accessing the dict.
>   Alternatively, we could try to use the free slots for
>   storing these state objects by e.g. declaring a free
>   slot as being NULL or a pointer to a state object.

From abarnert at  Mon Jan 11 18:04:46 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 11 Jan 2016 15:04:46 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 11, 2016, at 13:38, Guido van Rossum <guido at> wrote:
>> On Mon, Jan 11, 2016 at 12:22 PM, Andrew Barnert <abarnert at> wrote:
>>> On Jan 11, 2016, at 10:42, Gregory P. Smith <greg at> wrote:
>>> The goal of the # type: comments as described is to have this information for offline analysis of code, not to make it available at run time.  Yes, a decorator syntax could be adopted if anyone needs that. I don't expect anyone does. Decorators and attributes would add run time cpu and memory overhead whether the information was going to be used at runtime or not (likely not; nobody is likely to deploy code that looks at __annotations__).
>> These same arguments were made against PEP 484 in the first place, and (I think rightly) dismissed.
> The way I recall it the argument was made against using decorators for PEP 484 and we rightly decided not to use decorators.

Sure. But you also decided that the type information has to be there at runtime.

Anyway, I don't buy GPS's argument, but I think I buy yours. Even if there are good reasons to have annotations at runtime, and they'd apply to debugging/introspecting/etc. code during a 2.7->3.6 port just as much as in new 3.6 work, I can see that they may not be worth _enough_ to justify the cost of extra runtime CPU (which can't be avoided in 2.7 the way it is in 3.6). And that, even if they were worth the cost, it may still not be worth trying to convince a team of that fact (especially without any hard information).

>> 3.x code with annotations incurs a memory overhead, even though most runtime code is never going to use them. That was considered to be acceptable. So why isn't it acceptable for the same code before it's ported to 3.x? Or, conversely, if it isn't acceptable in 2.x, why isn't it a serious blocking regression that, once the port is completed and you're running under 3.x, you're now wasting memory for those useless annotations?
> I'm not objecting to the memory overhead of using decorators,

OK, but GPS was. And he was also arguing that having annotations at runtime is useless. Which is an argument that was made against PEP 484, and considered and rejected at the time. Your argument is different, and seems convincing to me, but I can't retroactively change my reply to his email.
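
To make the runtime difference concrete, here is a small Python 3
sketch. The two function names are invented for the comparison; the
signatures are shortened versions of the embezzle example used earlier
in the thread.

```python
def annotated(account: str, funds: int = 1000000) -> None:
    """Python 3 annotation syntax: stored on the function object."""


def commented(account, funds=1000000):
    # type: (str, int) -> None
    """Type comment: visible only to tools that read the source."""


print(annotated.__annotations__)  # populated at runtime
print(commented.__annotations__)  # empty: nothing is kept at runtime
```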


From abarnert at  Mon Jan 11 18:30:18 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 11 Jan 2016 15:30:18 -0800
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <n71buf$t94$>
References: <>
 <n6s103$i6f$> <>
 <n6vgkl$alv$> <>
Message-ID: <>

On Jan 11, 2016, at 14:56, Terry Reedy <tjreedy at> wrote:
> Classes with immutable instances (tuples, numbers, strings, frozen sets, some extension classes, and user classes that take special measures) are exceptions.  So are classes with mutable hashes (lists, sets, dicts, some extension classes, and user classes that override __eq__ and __hash__).

I don't understand your terminology here. What are "classes with mutable hashes"? Your examples of lists, sets, and dicts don't have mutable hashes; they have no hashes. If you write "hash([])", you get a TypeError("unhashable type: 'list'"). And well-behaved extensions classes and user classes that override __eq__ and __hash__ provide immutable hashes and immutable equality to match, or they use __hash__=None if they need mutable equality.

Python can't actually stop you from creating a class with mutable hashes, and even putting instances of such a class in a dict, but that dict won't actually work right. So, there's nothing for a version-guarded dict to worry about there.
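
A short sketch of what "won't actually work right" looks like in
practice. BadKey is a deliberately broken, hypothetical class whose
hash follows a mutable attribute.

```python
class BadKey:
    """Deliberately broken: the hash depends on a mutable attribute."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, BadKey) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


key = BadKey(1)
d = {key: "found"}
key.value = 2  # the hash changes while the key is stored in the dict

print(key in d)                   # lookup uses the new hash and misses
print(any(k is key for k in d))   # yet the object is still stored in d
```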

From victor.stinner at  Mon Jan 11 18:34:59 2016
From: victor.stinner at (Victor Stinner)
Date: Tue, 12 Jan 2016 00:34:59 +0100
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

Marc-Andre Lemburg:
>> * Given that this is an optimization and not meant to be exact
>>   science, why would we need 64 bits worth of version information ?
>>   AFAIK, you only need the version information to be able to
>>   answer the question "did anything change compared to last time
>>   I looked ?".
>> (...)

Gregory P. Smith <greg at>:
> Given it is for optimization only with the fallback slow path being to do an
> actual dict lookup, we could implement this using a single bit.

You misunderstood the purpose of the PEP. The purpose is to implement
fast guards by avoiding dict lookups in the common case (when watched
keys are not modified) because dict lookups are fast, but still slower
than reading a field of a C structure and an integer comparison.

See the result of my microbenchmark:

We are talking about nanoseconds.

For the optimizations that I implemented in FAT Python, I bet that
watched keys are rarely modified. But it's common to modify the
watched namespaces. For example, a global namespace can be modified by
the "lazy module import" pattern: "global module; if module is None:
import module". Or a global variable can be a counter used to generate
identifiers, a counter modified regularly with "global counter; counter =
counter + 1" which changes the dictionary version.
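
The idea can be modelled in pure Python as follows. This is only a
sketch under stated assumptions: the PEP implements the version in C
inside dict itself, while VersionedDict and KeyGuard are hypothetical
names, and this toy only tracks item assignment and deletion.

```python
_MISSING = object()


class VersionedDict(dict):
    """Toy dict that bumps a counter on item assignment and deletion."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


class KeyGuard:
    """Checks that namespace[key] is still the object seen at creation."""

    def __init__(self, namespace, key):
        self.namespace = namespace
        self.key = key
        self.value = namespace[key]
        self.version = namespace.version

    def check(self):
        if self.namespace.version == self.version:
            return True  # fast path: one integer comparison, no lookup
        # Slow path: the namespace changed, so look up the watched key.
        if self.namespace.get(self.key, _MISSING) is self.value:
            self.version = self.namespace.version  # resynchronize
            return True
        return False


namespace = VersionedDict(len=len, counter=0)
guard = KeyGuard(namespace, "len")
namespace["counter"] = namespace["counter"] + 1  # unrelated key changes
print(guard.check())   # slow path once, but the guard still holds
namespace["len"] = lambda seq: 0                 # watched key replaced
print(guard.check())   # the guard fails
```

A single modified bit could not tell the two cases apart without
someone clearing it again, whereas each guard here keeps its own copy
of the last version it saw.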

>> * Wouldn't it be possible to use the hash array itself to
>>   store the state index ?
>>   We could store the state object as regular key in the
>>   dict and filter this out when accessing the dict.
>>   Alternatively, we could try to use the free slots for
>>   storing these state objects by e.g. declaring a free
>>   slot as being NULL or a pointer to a state object.

I'm sorry, I don't understand this idea.


From rosuav at  Mon Jan 11 19:12:20 2016
From: rosuav at (Chris Angelico)
Date: Tue, 12 Jan 2016 11:12:20 +1100
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 12, 2016 at 7:44 AM, Andrew Barnert <abarnert at> wrote:
> On Jan 11, 2016, at 08:53, Chris Angelico <rosuav at> wrote:
>>> On Tue, Jan 12, 2016 at 3:49 AM, Andrew Barnert <abarnert at> wrote:
>>>> On Jan 11, 2016, at 04:02, Ram Rachum <ram at> wrote:
>>>> I've chosen += and -=, despite the fact they're not set operations, because Python doesn't have __inand__.
>>> For a property that acts like a number, and presumably is implemented as a subclass of int, this seems like a horribly confusing idea.
>> I would expect it NOT to be a subclass of int, actually - just that it
>> has __int__ (and maybe __index__) to convert it to one.
> If you read his proposal, he wants oct(path.chmod) to work. That doesn't work on types with __int__.
> Of course it does work on types with __index__, but that's because the whole point of __index__ is to allow your type to act like an actual int everywhere that Python expects an int, rather than just something coercible to int. The point of PEP
> 357 was to allow numpy.int64 to act as close to a subtype of int as possible without actually being a subtype.

This is what I get for not actually testing stuff. I thought having
__int__ would work for oct. In that case, I would simply recommend
dropping that part of the proposal; retrieving the octal
representation can be spelled oct(int(x)), or maybe x.octal or
x.octal(). This is NOT an integer; it's much closer to a set of
bitwise enumeration.
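
A quick Python 3 check of the difference between the two hooks. The
class names OnlyInt and WithIndex are invented for this test; 0o644 is
just a typical permission value.

```python
class OnlyInt:
    """Convertible with int(), but not usable where an int is required."""

    def __int__(self):
        return 0o644


class WithIndex:
    """Usable anywhere Python expects a real integer."""

    def __index__(self):
        return 0o644


print(oct(WithIndex()))        # works: oct() calls __index__
print(oct(int(OnlyInt())))     # works: explicit conversion first
try:
    oct(OnlyInt())
except TypeError as exc:
    print("TypeError:", exc)   # __int__ alone is not enough for oct()
```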


From ncoghlan at  Mon Jan 11 19:47:57 2016
From: ncoghlan at (Nick Coghlan)
Date: Tue, 12 Jan 2016 10:47:57 +1000
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: <>
References: <>
Message-ID: <>

On 12 January 2016 at 08:44, Victor Stinner <victor.stinner at> wrote:
> I discussed this PEP on the #pypy IRC channel. I will try to summarize
> the discussion with comments on the PEP directly.
> 2016-01-08 22:31 GMT+01:00 Victor Stinner <victor.stinner at>:
>> Add an API to add specialized functions with guards to functions, to
>> support static optimizers respecting the Python semantic.
> "respecting the Python semantics" is not 100% exact. In fact, my FAT
> Python makes subtle changes to the "Python semantics". For example,
> loop unrolling can completely remove the call to the range() function. If
> a debugger is stepped instruction by instruction, the output is
> different on an unrolled loop, since the range() call was removed, and
> the loop body is copied. I should maybe elaborate this point in the
> rationale, explain that a compromise must be found between the funny
> "in Python, everything is mutable" and performance. But remember that
> the whole thing (FAT Python, specialization, etc.) is developed
> outside CPython and is fully optional.
>> Changes
>> =======
>> * Add two new methods to functions:
>>   - ``specialize(code, guards: list)``: add specialized
>>     function with guard. `code` is a code object (ex:
>>     ``func2.__code__``) or any callable object (ex: ``len``).
>>     The specialization can be ignored if a guard already fails.
> This method doesn't make sense at all in PyPy. The method is specific
> to CPython since it relies on guards which have a pure C API (see
> below). The PEP must be more explicit about that. IMHO it's perfectly
> fine that PyPy makes this method a no-op (the method exactly does
> nothing). It's already the case if a guard "always" fail in
> first_check().

Perhaps the specialisation call should also move to being a pure C
API, only exposed through _testcapi for testing purposes?

That would move both this and the dict versioning PEP into the same
territory as the dynamic memory allocator PEP: low level C plumbing
that enables interesting CPython specific extensions (like
tracemalloc, in the dynamic allocator case) without committing other
implementations to emulating features that aren't useful to them in
any way.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From tjreedy at  Mon Jan 11 20:09:12 2016
From: tjreedy at (Terry Reedy)
Date: Mon, 11 Jan 2016 20:09:12 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
 <n6s103$i6f$> <>
 <n6vgkl$alv$> <>
 <n71buf$t94$> <>
Message-ID: <n71jns$fra$>

On 1/11/2016 6:30 PM, Andrew Barnert via Python-ideas wrote:
> On Jan 11, 2016, at 14:56, Terry Reedy
> <tjreedy at> wrote:
>> Classes with immutable instances (tuples, numbers, strings, frozen
>> sets, some extension classes, and user classes that take special
>> measures) are exceptions.  So are classes with mutable hashes
>> (lists, sets, dicts, some extension classes, and user classes that
>> override __eq__ and __hash__).
> I don't understand your terminology here.

Yes, the term, as a negation, is wrong.  It should be 'classes that 
don't have immutable hashes'.  The list is right, except that 'override' 
should really be 'disable'.

Anyway, Victor changed the PEP and has moved on, so I will too.

Terry Jan Reedy

From tjreedy at  Mon Jan 11 20:44:45 2016
From: tjreedy at (Terry Reedy)
Date: Mon, 11 Jan 2016 20:44:45 -0500
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <n71lqh$dr0$>

On 1/11/2016 5:38 PM, M.-A. Lemburg wrote:

> To clarify: My suggestion to use a simple decorator with essentially
> the same syntax as proposed for the "# type: comments " was meant
> as *additional* allowed syntax, not necessarily as the only one
> to standardize.

Code with type comments will run on any standard 2.7 interpreter.  Code 
with an @typehint decorator will have to either run on a nonstandard 
interpreter or import 'typehint' from somewhere other than the stdlib or 
define 'typehint' at the top of the file or have the decorators stripped 
out before public distribution.  To me, these options come close to 
making the decorator inappropriate as a core dev recommendation.

However, the new section of the PEP could have a short paragraph that 
mentions @typehint(typestring) as a possible alternative (with the 
options given above) and recommend that if a decorator is used, then the 
name should be 'typehint' (or something else agreed on) and that the 
typestring should be a quoted version of what would follow '# type: ' in 
a comment, 'as already defined above' (in the previous recommendation).

In other words, Guido's current addition has two recommendations:
1. the syntax for a typestring
2. the use of a typestring (append it to a '# type: ' comment)
If a decorator alternative uses the same syntax, a checker would need 
just one typestring parser.  I think the conditional recommendation 
would be within the scope of what is appropriate for us to do.

> I'm a bit worried that by standardizing on using comments
> for these annotations only, we'll end up having people not
> use the type annotations because they simply don't like the
> style of having function bodies begin with comments instead
> of doc-strings. I certainly wouldn't want to clutter up my
> code like that. Tools parsing Python 2 source code may
> also have a problem with this (e.g. not recognize the
> doc-string anymore).

I have to admit that I was not fully cognizant before that a comment
could precede a docstring.
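
This is easy to confirm for the interpreter itself, since comments are
discarded before the function body is compiled. Whether third-party
tools that parse source text cope equally well is a separate question
that this sketch does not answer.

```python
def embezzle(self, account, funds=1000000, *fake_receipts):
    # type: (str, int, *str) -> None
    """Embezzle funds from account using fake receipts."""
    return None


# The string after the comment is still picked up as the docstring.
print(embezzle.__doc__)
```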

Terry Jan Reedy

From steve at  Mon Jan 11 20:39:11 2016
From: steve at (Steven D'Aprano)
Date: Tue, 12 Jan 2016 12:39:11 +1100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 12:22:55PM -0800, Andrew Barnert via Python-ideas wrote:

> in a few months we're going to see 
> Dropbox and Google and everyone else demanding a way to use type 
> hinting without wasting memory on annotations at runtime in 3.x.

I would be happy to see a runtime switch similar to -O that drops 
annotations in 3.x, similar to how -OO drops docstrings.
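
For reference, the existing -OO behaviour can be observed with a small
script like the following. The helper name doc_under is made up for
this sketch, and it assumes the PYTHONOPTIMIZE environment variable is
not set. No equivalent switch for annotations is shown because none is
described in this thread.

```python
import subprocess
import sys

# A tiny program that defines a function with a docstring and prints it.
SOURCE = "def f():\n    'docstring'\nprint(f.__doc__)"


def doc_under(*flags):
    """Run SOURCE in a child interpreter with the given flags."""
    result = subprocess.run(
        [sys.executable] + list(flags) + ["-c", SOURCE],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


print(doc_under())       # docstring kept
print(doc_under("-OO"))  # docstring dropped, __doc__ is None
```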


From at  Mon Jan 11 21:24:46 2016
From: at (Yury Selivanov)
Date: Mon, 11 Jan 2016 21:24:46 -0500
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: <>
References: <>
Message-ID: <>

On 2016-01-11 7:47 PM, Nick Coghlan wrote:
> Perhaps the specialisation call should also move to being a pure C
> API, only exposed through _testcapi for testing purposes?
> That would move both this and the dict versioning PEP into the same
> territory as the dynamic memory allocator PEP: low level C plumbing
> that enables interesting CPython specific extensions (like
> tracemalloc, in the dynamic allocator case) without committing other
> implementations to emulating features that aren't useful to them in
> any way.

+1.  Exposing 'FunctionType.specialize()'-like APIs to
Python level feels very wrong to me.


From barry at  Mon Jan 11 22:04:12 2016
From: barry at (Barry Warsaw)
Date: Mon, 11 Jan 2016 22:04:12 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
References: <>
Message-ID: <>

On Jan 09, 2016, at 10:58 AM, Victor Stinner wrote:

>IMHO adding 8 bytes per dict is worth it.

I'm not so sure.  There are already platforms where Python is unfeasible to
generally use (e.g. some mobile devices) at least in part because of memory
footprint.  Dicts are used everywhere so think about the kind of impact adding
8 bytes to every dict in an application running on such systems will have.


From mahmoud at  Mon Jan 11 22:10:10 2016
From: mahmoud at (Mahmoud Hashemi)
Date: Mon, 11 Jan 2016 19:10:10 -0800
Subject: [Python-ideas] More friendly access to chmod
In-Reply-To: <>
References: <>
Message-ID: <>

Seems like the committee has some designs after all?

is tested, on PyPI, and is even 2/3 compatible. And notice the lack of
"chmod" as a noun. ;)


On Mon, Jan 11, 2016 at 4:12 PM, Chris Angelico <rosuav at> wrote:

> On Tue, Jan 12, 2016 at 7:44 AM, Andrew Barnert <abarnert at>
> wrote:
> > On Jan 11, 2016, at 08:53, Chris Angelico <rosuav at> wrote:
> >>
> >>> On Tue, Jan 12, 2016 at 3:49 AM, Andrew Barnert <abarnert at>
> wrote:
> >>>> On Jan 11, 2016, at 04:02, Ram Rachum <ram at> wrote:
> >>>>
> >>>> I've chosen += and -=, despite the fact they're not set operations,
> because Python doesn't have __inand__.
> >>>
> >>> For a property that acts like a number, and presumably is implemented
> as a subclass of int, this seems like a horribly confusing idea.
> >>
> >> I would expect it NOT to be a subclass of int, actually - just that it
> >> has __int__ (and maybe __index__) to convert it to one.
> >
> > If you read his proposal, he wants oct(path.chmod) to work. That doesn't
> work on types with __int__.
> >
> > Of course it does work on types with __index__, but that's because the
> whole point of __index__ is to allow your type to act like an actual int
> everywhere that Python expects an int, rather than just something coercible
> to int. The point of PEP
> > 357 was to allow numpy.int64 to act as close to a subtype of int as
> possible without actually being a subtype.
> This is what I get for not actually testing stuff. I thought having
> __int__ would work for oct. In that case, I would simply recommend
> dropping that part of the proposal; retrieving the octal
> representation can be spelled oct(int(x)), or maybe x.octal or
> x.octal(). This is NOT an integer; it's much closer to a set of
> bitwise enumeration.
> ChrisA
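
The __int__ versus __index__ distinction discussed above can be checked
with a small sketch. The class names OnlyInt and WithIndex are hypothetical
and Python 3 is assumed; the value 0o644 is just a sample permission mask.

```python
class OnlyInt:
    """Coercible to int, but not usable where an int is required."""

    def __int__(self):
        return 0o644


class WithIndex:
    """Acts as an integer wherever Python expects one (PEP 357)."""

    def __index__(self):
        return 0o644


try:
    oct(OnlyInt())
except TypeError as exc:
    print("oct() rejected the __int__-only object:", exc)

print(oct(WithIndex()))        # accepted because __index__ is defined
print(oct(int(OnlyInt())))     # explicit conversion, the oct(int(x)) spelling
```

This is why oct(int(x)) works for any type with __int__, while a bare
oct(x) needs __index__.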

From ncoghlan at  Mon Jan 11 22:37:29 2016
From: ncoghlan at (Nick Coghlan)
Date: Tue, 12 Jan 2016 13:37:29 +1000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On 12 January 2016 at 13:04, Barry Warsaw <barry at> wrote:
> On Jan 09, 2016, at 10:58 AM, Victor Stinner wrote:
>>IMHO adding 8 bytes per dict is worth it.
> I'm not so sure.  There are already platforms where Python is unfeasible to
> generally use (e.g. some mobile devices) at least in part because of memory
> footprint.  Dicts are used everywhere so think about the kind of impact adding
> 8 bytes to every dict in an application running on such systems will have.

This is another advantage of making this a CPython specific internal
implementation detail - embedded focused variants like MicroPython
won't need to implement it.

The question then becomes "Are we willing to let CPython cede high
memory pressure environments to more specialised Python variants?",
and I think the answer to that is "yes".


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From mike at  Mon Jan 11 22:57:37 2016
From: mike at (Michael Selik)
Date: Tue, 12 Jan 2016 03:57:37 +0000
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 6:20 AM Steven D'Aprano <steve at> wrote:

> On Mon, Jan 11, 2016 at 05:18:59AM -0500, Neil Girdhar wrote:
> > Here is where I have to disagree.  I hate it when experts say "we'll just
> > document it and then it's the user's fault for misusing it".  Yeah,
> you're
> > right, but as a user, it is very frustrating to have to read other
> people's
> > documentation.  You know that some elite Python programmer is going to
> > optimize his code using this and someone years later is going to scratch
> > his head wondering where __version__ is coming from.  Is it the provided
> by
> > the caller?  Was it added to the object at some earlier point?
> Neil, don't you think you're being overly dramatic here? "Programmer
> needs to look up API feature, news at 11!" The same could be said about
> class.__name__, instance.__class__, obj.__doc__, module.__dict__ and
> indeed every single Python feature. Sufficiently inexperienced or naive
> programmers could be scratching their head over literally *anything*.

> All those words for such a simple, and minor, point: every new API
> feature is one more thing for programmers to learn. We get that.

I don't think Neil is being overly dramatic, nor is it a minor point.
Simple, yes, but important. If Python wants to maintain its enviable
position as the majority language for intro computer science of top
schools, it needs to stay an easily teachable language. The more junk
showing up in ``dir()`` the harder it is to learn. When it's unclear what
purpose a feature would have for an expert, why not err on the side of
caution and keep the language as usable for a newbie as possible?

But the following is a good, strong argument:
> > Also, using this __version__ in source code is going to complicate
> > switching from CPython to any of the other Python implementations, so
> those
> > implementations will probably end up implementing it just to simplify
> > "porting", which would otherwise be painless.
> >
> > Why don't we leave exposing __version__ in Python to another PEP?  Once
> > it's in the C API (as you proposed) you will be able to use it from
> Python
> > by writing an extension and then someone can demonstrate the value of
> > exposing it in Python by writing tests.
> I can't really argue against this. As much as I would love to play
> around with __version__, I think you're right. It needs to prove itself
> before being exposed as a public API.

From guido at  Mon Jan 11 23:38:59 2016
From: guido at (Guido van Rossum)
Date: Mon, 11 Jan 2016 20:38:59 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 5:39 PM, Steven D'Aprano <steve at> wrote:

> On Mon, Jan 11, 2016 at 12:22:55PM -0800, Andrew Barnert via Python-ideas
> wrote:
> > in a few months we're going to see
> > Dropbox and Google and everyone else demanding a way to use type
> > hinting without wasting memory on annotations at runtime in 3.x.
> I would be happy to see a runtime switch similar to -O that drops
> annotations in 3.x, similar to how -OO drops docstrings.

Actually my experience with -OO (and even -O) suggests that that's not a
great model (e.g. it can't work with libraries like PLY that inspect
docstrings). A better model might be to let people select this on a per
module basis. Though I could also see a future where __annotations__ is a
more space-efficient data structure than dict.

Have you already run into a situation where __annotations__ takes up too
much space?
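
To make the memory question concrete, here is a minimal sketch of what is
kept at runtime. The function greet() is a hypothetical example, and the
size reported by sys.getsizeof() covers only the dict itself on the
interpreter in use, not the objects it refers to.

```python
import sys


def greet(name: str, times: int = 1) -> str:
    """Return a greeting repeated `times` times."""
    return ", ".join(["hello " + name] * times)


# Annotations are stored in a plain dict on the function object.
print(greet.__annotations__)
print(sys.getsizeof(greet.__annotations__), "bytes for the dict itself")

# Under -OO the docstring is stripped, so this prints None in that mode.
print(greet.__doc__)
```

Libraries that read docstrings or annotations at runtime see different
results depending on such switches, which is the problem with a global flag.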

--Guido van Rossum

From victor.stinner at  Tue Jan 12 04:58:37 2016
From: victor.stinner at (Victor Stinner)
Date: Tue, 12 Jan 2016 10:58:37 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: <>
References: <>
Message-ID: <>


2016-01-12 1:47 GMT+01:00 Nick Coghlan <ncoghlan at>:
>> This method doesn't make sense at all in PyPy. The method is specific
>> to CPython since it relies on guards which have a pure C API (see
>> below). The PEP must be more explicit about that. IMHO it's perfectly
>> fine that PyPy makes this method a no-op (the method exactly does
>> nothing). It's already the case if a guard "always" fail in
>> first_check().
> Perhaps the specialisation call should also move to being a pure C
> API, only exposed through _testcapi for testing purposes?
> That would move both this and the dict versioning PEP into the same
> territory as the dynamic memory allocator PEP: low level C plumbing
> that enables interesting CPython specific extensions (like
> tracemalloc, in the dynamic allocator case) without committing other
> implementations to emulating features that aren't useful to them in
> any way.

I really like your idea :-) It solves many issues and technically it's
trivial to only add a C API and then expose it somewhere else at the
Python level (for example in my "fat" module, or as you said in
_testcapi for testing purposes).

Instead of adding func.specialize() and func.get_specialized() at
Python level, we can add *public* functions to the Python C API
(excluded from the stable ABI):

/* Add a specialized function with guards. Result:
 * - return 1 on success
 * - return 0 if the specialization has been ignored
 * - raise an exception and return -1 on error */
PyAPI_FUNC(int) PyFunction_Specialize(PyObject *func, PyObject *func2,
                                      PyObject *guards);

/* Get the list of specialized functions as a list of
 * (func, guards) where func is a callable or code object and guards
 * is a list of PyFuncGuard (or subtypes) objects.
 * Raise an exception and return NULL on error. */
PyAPI_FUNC(PyObject*) PyFunction_GetSpecialized(PyObject *func);

/* Get the specialized function of a function. stack is an array of PyObject*
 * objects: indexed arguments followed by (key, value) objects of keyword
 * arguments. na is the number of indexed arguments, nk is the number of
 * keyword arguments. stack contains na + nk * 2 objects.
 * Return a callable or a code object on success.
 * Raise an exception and return NULL on error. */
PyAPI_FUNC(PyObject*) PyFunction_GetSpecializedFunc(PyObject *func,
                                                    PyObject **stack,
                                                    int na, int nk);

Again, other Python implementations which don't want to implement
function specializations can implement these functions as no-op (it's
fine with the API):

* PyFunction_Specialize() just returns 0
* PyFunction_GetSpecialized() creates an empty list
* PyFunction_GetSpecializedFunc() returns the code object of the
function (which is not something new)

Or not implement these functions at all, since it doesn't make sense for them.
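
The intended behaviour of the three functions above can be modelled in pure
Python. This is only a conceptual sketch: the class Specializable, its
method names and the plain-callable guards are hypothetical stand-ins, not
the C implementation and not a Python-level API proposed by the PEP.

```python
class Specializable:
    """Toy model: a function plus a list of (specialized callable, guards)."""

    def __init__(self, func):
        self.func = func
        self.specialized = []

    def specialize(self, func2, guards):
        # Mirrors the "return 1 on success" convention of the C sketch.
        self.specialized.append((func2, list(guards)))
        return 1

    def get_specialized(self):
        return list(self.specialized)

    def get_specialized_func(self, *args, **kwargs):
        # Pick the first specialization whose guards all pass,
        # otherwise fall back to the original function.
        for func2, guards in self.specialized:
            if all(guard(*args, **kwargs) for guard in guards):
                return func2
        return self.func

    def __call__(self, *args, **kwargs):
        return self.get_specialized_func(*args, **kwargs)(*args, **kwargs)


def add(a, b):
    return a + b


def add_ints(a, b):
    # Stand-in for a faster variant that is only valid for exact ints.
    return int.__add__(a, b)


def both_exact_ints(a, b):
    return type(a) is int and type(b) is int


fast_add = Specializable(add)
fast_add.specialize(add_ints, [both_exact_ints])

print(fast_add(1, 2))       # guard passes, add_ints is used
print(fast_add("a", "b"))   # guard fails, the original add is used
```

An implementation that never specializes behaves like this model with an
empty list: every call falls through to the original function.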


First, I tried hard to avoid the need for a module to specialize
functions. My first API added a specialize() method to functions which
took a list of dictionaries to describe guards. The problem with this
API is that it exposes implementation details and makes it hard to
extend guards (implement new guards). Now the AST optimizer
injects "import fat" to optimize code when needed.

Hey, it's difficult to design a simple and obvious API!


From brett at  Tue Jan 12 11:52:43 2016
From: brett at (Brett Cannon)
Date: Tue, 12 Jan 2016 16:52:43 +0000
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, 12 Jan 2016 at 01:59 Victor Stinner <victor.stinner at> wrote:

> Hi,
> 2016-01-12 1:47 GMT+01:00 Nick Coghlan <ncoghlan at>:
> >> This method doesn't make sense at all in PyPy. The method is specific
> >> to CPython since it relies on guards which have a pure C API (see
> >> below). The PEP must be more explicit about that. IMHO it's perfectly
> >> fine that PyPy makes this method a no-op (the method exactly does
> >> nothing). It's already the case if a guard "always" fail in
> >> first_check().
> >
> > Perhaps the specialisation call should also move to being a pure C
> > API, only exposed through _testcapi for testing purposes?
> >
> > That would move both this and the dict versioning PEP into the same
> > territory as the dynamic memory allocator PEP: low level C plumbing
> > that enables interesting CPython specific extensions (like
> > tracemalloc, in the dynamic allocator case) without committing other
> > implementations to emulating features that aren't useful to them in
> > any way.
> I really like your idea :-) It solves many issues and technically it's
> trivial to only add a C API and then expose it somewhere else at the
> Python level (for example in my "fat" module, or as you said in
> _testcapi for testing purposes).
> Instead of adding func.specialize() and func.get_specialized() at
> Python level, we can add *public* functions to the Python C API
> (excluded from the stable ABI):
> /* Add a specialized function with guards. Result:
>  * - return 1 on success
>  * - return 0 if the specialization has been ignored
>  * - raise an exception and return -1 on error */
> PyAPI_FUNC(int) PyFunction_Specialize(PyObject *func, PyObject *func2,
>                                       PyObject *guards);
> /* Get the list of specialized functions as a list of
>  * (func, guards) where func is a callable or code object and guards
>  * is a list of PyFuncGuard (or subtypes) objects.
>  * Raise an exception and return NULL on error. */
> PyAPI_FUNC(PyObject*) PyFunction_GetSpecialized(PyObject *func);
> /* Get the specialized function of a function. stack is an array of
> PyObject*
>  * objects: indexed arguments followed by (key, value) objects of keyword
>  * arguments. na is the number of indexed arguments, nk is the number of
>  * keyword arguments. stack contains na + nk * 2 objects.
>  *
>  * Return a callable or a code object on success.
>  * Raise an exception and return NULL on error. */
> PyAPI_FUNC(PyObject*) PyFunction_GetSpecializedFunc(PyObject *func,
>                                                     PyObject **stack,
>                                                     int na, int nk);
> Again, other Python implementations which don't want to implement
> function specializations can implement these functions as no-op (it's
> fine with the API):
> * PyFunction_Specialize() just returns 0
> * PyFunction_GetSpecialized() creates an empty list
> * PyFunction_GetSpecializedFunc() returns the code object of the
> function (which is not something new)
> Or not implement these functions at all, since it doesn't make sense for
> them.

 This is somewhat similar to the JIT API we have been considering through
our Pyjion work:

* PyJit_Init()
* PyJit_RegisterCodeObject()
* PyJit_CompileCodeObject()

If both ideas gain traction we may want to talk about whether there is some
way to consolidate the APIs so we don't end up with a ton of different ways
to optimize code objects.

From barry at  Tue Jan 12 12:11:27 2016
From: barry at (Barry Warsaw)
Date: Tue, 12 Jan 2016 12:11:27 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
References: <>
Message-ID: <>

On Jan 12, 2016, at 01:37 PM, Nick Coghlan wrote:

>The question then becomes "Are we willing to let CPython cede high
>memory pressure environments to more specialised Python variants?",
>and I think the answer to that is "yes".

I'm not so willing to cede that space to alternative implementations, at least
not yet.  If this suite of ideas yields *significant* performance
improvements, it might be a worthwhile trade-off.  But I'm not in favor of
adding dict.__version__ in the hopes that we'll see that improvement; I think
we need proof.

That makes me think that 1) it should not be exposed to Python yet; 2) it
should be conditionally compiled in, and not by default.  This would allow
experimentation without committing us to long-term maintenance or an
across-the-board increase in memory pressures for speculative gains.


From victor.stinner at  Tue Jan 12 15:38:02 2016
From: victor.stinner at (Victor Stinner)
Date: Tue, 12 Jan 2016 21:38:02 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-12 17:52 GMT+01:00 Brett Cannon <brett at>:
>  This is somewhat similar to the JIT API we have been considering through
> our Pyjion work:
> * PyJit_Init()
> * PyJit_RegisterCodeObject()
> * PyJit_CompileCodeObject()
> If both ideas gain traction we may want to talk about whether there is some
> way to consolidate the APIs so we don't end up with a ton of different ways
> to optimize code objects.

Since the proposed changes add many "public" symbols (prefixed with
"Py_", but excluded from the stable ABI and only exposed at the C
level), I chose to add "public" functions. Are you ok with that?


From leewangzhong+python at  Tue Jan 12 18:26:07 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Tue, 12 Jan 2016 18:26:07 -0500
Subject: [Python-ideas] Fwd: Using functools.lru_cache only on some
 arguments of a function
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 10, 2016 at 3:27 AM, Bill Winslow <bunslow at> wrote:
> Sorry for the late reply everyone.
> I think relying on closures, while a solution, is messy. I'd still much
> prefer a way to tell lru_cache to merely ignore certain arguments.

Wait, why is it messy? The function is created inside the outer
function, and never gets released to the outside. I think it's
cleaner, because it's encapsulating the recursive part, the memo, and
the cached work. Besides, `lru_cache` is implemented using a closure,
and your solution of passing a key function might be used with a
closure on a nested function.

If you're solving dynamic programming puzzles, an outer/inner pairing
of non-recursive/recursive represents the fact that your memoization
can't be reused for different instances of the same problem. (See, for
example, Edit Distance, in which your
recursive parameters are indices into your non-recursive parameters.)
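
A minimal sketch of that outer/inner pairing, assuming the usual
Levenshtein recurrence; the function names edit_distance and dist are
illustrative. The strings live in the outer call, and the cached inner
function only sees the indices.

```python
from functools import lru_cache


def edit_distance(a, b):
    """Non-recursive outer function: owns the inputs and the memo."""

    @lru_cache(maxsize=None)
    def dist(i, j):
        """Recursive inner function: parameters are indices into a and b."""
        if i == len(a):
            return len(b) - j          # insert the rest of b
        if j == len(b):
            return len(a) - i          # delete the rest of a
        if a[i] == b[j]:
            return dist(i + 1, j + 1)  # characters match, no cost
        return 1 + min(
            dist(i + 1, j),            # delete a[i]
            dist(i, j + 1),            # insert b[j]
            dist(i + 1, j + 1),        # substitute a[i] with b[j]
        )

    return dist(0, 0)


print(edit_distance("kitten", "sitting"))
```

Each call to edit_distance() builds a fresh cache, so results for one pair
of strings are never mixed up with those for another pair.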

> I've further considered my original proposal, and rather than naming it
> "arg_filter", I realized that builtins like sorted(), min(), max(), etc all
> already have the exact same thing -- a "key" argument which transforms the
> elements to the user's purpose. (In the sorted/min/max case, it's called on
> the elements of the argument rather than the argument itself, but it's still
> the same concept.) So basically, my original proposal with renaming from
> arg_filter to key, is tantamount to extending the same functionality from
> sorted/min/max to lru_cache as well.

I like this conceptually, because the `key` parameter sort of lets you
customize the cache dict (or whatever). You can use, for example,
`str.lower` (though not directly).

Note that the key parameter in those other functions allows you to
call the key function only once per element, which is impossible for
this.

Should it be possible to specify a tuple for `key` to transform each
arg separately? In your case, you might pass in `(None, lambda x: 0)`
to specify that the first parameter shouldn't be transformed, and the
second parameter should be considered constant. But that's very
confusing: should `None` mean "ignore", or "don't transform" (like
`filter`)? Or we can use `False` for "ignore", perhaps.
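
To make the single-callable variant of the idea concrete, here is a rough
sketch. memoize_by_key is a hypothetical decorator, not part of functools,
and it uses an unbounded dict instead of real LRU eviction; the key function
is called with the same arguments as the wrapped function.

```python
import functools


def memoize_by_key(key):
    """Hypothetical decorator: cache results under key(*args, **kwargs)."""

    def decorator(func):
        cache = {}  # unbounded; a real version would add LRU eviction

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            k = key(*args, **kwargs)
            if k not in cache:
                cache[k] = func(*args, **kwargs)
            return cache[k]

        wrapper.cache = cache
        return wrapper

    return decorator


calls = []


@memoize_by_key(lambda n, log: n)  # ignore the second argument entirely
def square(n, log):
    log.append(n)
    return n * n


square(3, calls)
square(3, [])  # same key (3), so the function body is not run again
print(calls)


@memoize_by_key(lambda s: s.lower())  # case-insensitive cache key
def shout(s):
    return s.upper() + "!"


print(shout("Hi"), shout("HI"))
```

Note that the second call to shout() returns the result cached for "Hi",
which shows how a key function can silently merge distinct inputs.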

On Sun, Jan 10, 2016 at 2:03 PM, Michael Selik <mike at> wrote:
> Shouldn't the key function be called with ``key(*args, **kwargs)``?

Does `lru_cache` know how to deal with passing regular args as kwargs? Does it

From tjreedy at  Tue Jan 12 18:33:10 2016
From: tjreedy at (Terry Reedy)
Date: Tue, 12 Jan 2016 18:33:10 -0500
Subject: [Python-ideas] RFC: PEP: Add dict.__version__
In-Reply-To: <>
References: <>
Message-ID: <n742fu$24j$>

On 1/12/2016 12:11 PM, Barry Warsaw wrote:
> On Jan 12, 2016, at 01:37 PM, Nick Coghlan wrote:
>> The question then becomes "Are we willing to let CPython cede high
>> memory pressure environments to more specialised Python variants?",
>> and I think the answer to that is "yes".
> I'm not so willing to cede that space to alternative implementations, at least
> not yet.  If this suite of ideas yields *significant* performance
> improvements, it might be a worthwhile trade-off.  But I'm not in favor of
> adding dict.__version__ in the hopes that we'll see that improvement; I think
> we need proof.
> That makes me think that 1) it should not be exposed to Python yet; 2) it
> should be conditionally compiled in, and not by default.  This would allow
> experimentation without committing us to long-term maintenance or an
> across-the-board increase in memory pressures for speculative gains.

New modules can be labelled 'provisional', whose meaning includes 'might 
be removed'.  Can we do the same with new internal features?

Terry Jan Reedy

From victor.stinner at  Tue Jan 12 19:03:58 2016
From: victor.stinner at (Victor Stinner)
Date: Wed, 13 Jan 2016 01:03:58 +0100
Subject: [Python-ideas] RFC: PEP: Specialized functions with guards
In-Reply-To: <>
References: <>
Message-ID: <>

Thank you for the comments on the first version of PEP 510. I changed
it to only change the C API; there are no more changes to the Python
API. I just posted the second version of the PEP to python-dev. Please
move the discussion there.

If you want to review other PEPs on python-ideas, I'm going to post a
first version of my AST transformer PEP (PEP 511), stay tuned :-D
(yeah, I love working on 3 PEPs at the same time!)


From khali119 at  Tue Jan 12 20:11:55 2016
From: khali119 at (Muhammad Ahmed Khalid)
Date: Tue, 12 Jan 2016 19:11:55 -0600
Subject: [Python-ideas] Password masking for getpass.getpass
Message-ID: <>


I am working on a project and I am using getpass.getpass() to grab
passwords from the user.

Some of the users wanted asterisks to be displayed when they were typing in
the passwords for feedback i.e. how many characters were typed and how many
to backspace.

Therefore I have created a solution but I think the feature should come by
default (and the programmer should have the option to use a blank or any
other character).

The code lives here:

Please let me know about your thoughts on the issue. Also my apologies if
this is the wrong mailing list. Please guide me in the right direction if
that is the case.

King Mak

From rosuav at  Tue Jan 12 20:50:23 2016
From: rosuav at (Chris Angelico)
Date: Wed, 13 Jan 2016 12:50:23 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 13, 2016 at 12:11 PM, Muhammad Ahmed Khalid
<khali119 at> wrote:
> Therefore I have created a solution but I think the feature should come by
> default (and the programmer should have the option to use a blank or any
> other character).
> The code lives here:
> Please let me know about your thoughts on the issue. Also my apologies if
> this is the wrong mailing list. Please guide me in the right direction if
> that is the case.

First off, here's a direct link, bypassing the URL shortener.

The use of raw_input at the end suggests that you're planning this for
Python 2.7, and not for 3.x. Does your code work (apart from that,
which is insignificant) on Python 3? If not, this would be well worth
considering; this mailing list is all about new features for the new
versions of Python, which means 3.6 at the moment. Cool snippets of
Py2 code might be useful as ActiveState recipes, or as PyPI modules,
but they won't be added to the core language.

You've used the Windows-only msvcrt module. That means your snippet
works only on Windows, which you acknowledge in a comment underneath
it. I'm echoing that here on the list to make sure that's clear.

Have you checked PyPI for similar code that already exists?


From steve at  Tue Jan 12 20:54:14 2016
From: steve at (Steven D'Aprano)
Date: Wed, 13 Jan 2016 12:54:14 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 12, 2016 at 07:11:55PM -0600, Muhammad Ahmed Khalid wrote:
> Greetings,
> I am working on a project and I am using getpass.getpass() to grab
> passwords from the user.
> Some of the users wanted asterisks to be displayed when they were typing in
> the passwords for feedback i.e. how many characters were typed and how many
> to backspace.

I think that's an excellent idea.

The old convention on Linux and Unix is to just suppress all feedback, 
but even on Linux GUI applications normally show bullets • or asterisks.
Users who are less familiar with old-school Unix conventions have 
trouble with the standard password idiom of suppressing all feedback.

> Therefore I have created a solution but I think the feature should come by
> default (and the programmer should have the option to use a blank or any
> other character).

I think that the default should remain as it is now, but I would support 
adding an extra argument for getpass() to set the feedback character. 
But it would need to support POSIX systems (Unix, Linux and Mac OS X) as 
well as Windows.
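
As a rough illustration of what a cross-platform version might look like,
here is a sketch. The name getpass_masked and its mask, read_char and out
parameters are hypothetical, not an existing getpass API. It assumes a
single-character mask, reads keys via msvcrt on Windows and via
termios/tty in raw mode on POSIX, and leaves out many details a real
implementation needs (non-tty input, wide characters, signal handling).

```python
import sys


def _read_char():
    """Read one character without echo (platform-specific)."""
    try:
        import msvcrt  # Windows
        return msvcrt.getwch()
    except ImportError:
        import termios  # POSIX
        import tty
        fd = sys.stdin.fileno()
        old = termios.tcgetattr(fd)
        try:
            tty.setraw(fd)
            return sys.stdin.read(1)
        finally:
            termios.tcsetattr(fd, termios.TCSADRAIN, old)


def getpass_masked(prompt="Password: ", mask="*",
                   read_char=_read_char, out=sys.stderr):
    """Prompt for a password, echoing `mask` for each typed character."""
    out.write(prompt)
    out.flush()
    chars = []
    while True:
        ch = read_char()
        if ch in ("\r", "\n"):
            out.write("\n")
            out.flush()
            return "".join(chars)
        if ch == "\x03":          # Ctrl-C arrives as a character in raw mode
            raise KeyboardInterrupt
        if ch in ("", "\x04"):    # end of input or Ctrl-D
            raise EOFError
        if ch in ("\b", "\x7f"):  # backspace or delete
            if chars:
                chars.pop()
                if mask:
                    out.write("\b \b")  # erase the last mask character
        else:
            chars.append(ch)
            out.write(mask)
        out.flush()
```

Calling getpass_masked(mask="") would keep today's behaviour of showing no
feedback, so the default could stay as it is now.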


From mike at  Tue Jan 12 21:13:25 2016
From: mike at (Michael Selik)
Date: Wed, 13 Jan 2016 02:13:25 +0000
Subject: [Python-ideas] Fwd: Using functools.lru_cache only on some
 arguments of a function
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 12, 2016 at 6:26 PM Franklin? Lee <leewangzhong+python at> wrote:

> On Sun, Jan 10, 2016 at 3:27 AM, Bill Winslow <bunslow at> wrote:
> > Sorry for the late reply everyone.
> >
> > I think relying on closures, while a solution, is messy. I'd still much
> > prefer a way to tell lru_cache to merely ignore certain arguments.
> Wait, why is it messy? The function is created inside the outer
> function, and never gets released to the outside. I think it's
> cleaner, because it's encapsulating the recursive part, the memo, and
> the cached work. Besides, `lru_cache` is implemented using a closure,
> and your solution of passing a key function might be used with a
> closure on a nested function.
> If you're solving dynamic programming puzzles, an outer/inner pairing
> of non-recursive/recursive represents the fact that your memoization
> can't be reused for different instances of the same problem. (See, for
> example, Edit Distance
> (, in which your
> recursive parameters are indices into your non-recursive parameters.)
> > I've further considered my original proposal, and rather than naming it
> > "arg_filter", I realized that builtins like sorted(), min(), max(), etc
> all
> > already have the exact same thing -- a "key" argument which transforms
> the
> > elements to the user's purpose. (In the sorted/min/max case, it's called
> on
> > the elements of the argument rather than the argument itself, but it's
> still
> > the same concept.) So basically, my original proposal with renaming from
> > arg_filter to key, is tantamount to extending the same functionality from
> > sorted/min/max to lru_cache as well.
> I like this conceptually, because the `key` parameter sort of lets you
> customize the cache dict (or whatever). You can use, for example,
> `str.lower` (though not directly).
> Note that the key parameter in those other functions allows you to
> call the key function only once per element, which is impossible for
> this.
> Should it be possible to specify a tuple for `key` to transform each
> arg separately? In your case, you might pass in `(None, lambda x: 0)`
> to specify that the first parameter shouldn't be transformed, and the
> second parameter should be considered constant. But that's very
> confusing: should `None` mean "ignore", or "don't transform" (like
> `filter`)? Or we can use `False` for "ignore", perhaps.

I think his intention was to mimic the ``key`` argument of sorted, which
expects a function that takes 1 and only 1 positional argument.

Perhaps it's best to see the exact use case and a few other examples, to
get a better idea for the specifics, before implementing this feature?

On Sun, Jan 10, 2016 at 2:03 PM, Michael Selik <mike at> wrote:
> > Shouldn't the key function be called with ``key(*args, **kwargs)``?
> Does `lru_cache` know how to deal with passing regular args as kwargs?

Now that you mention it, I realized it treats the two differently.
``def foo(x): pass`` would store ``foo(42)`` and ``foo(x=42)`` as different
entries in the cache.
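
A quick way to observe this, assuming CPython's functools.lru_cache (the
documentation only says that distinct argument patterns may be treated as
distinct calls):

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def foo(x):
    return x * 2


foo(42)      # positional call: first miss
foo(x=42)    # keyword call: stored under a different key, second miss
info = foo.cache_info()
print(info)
```

Both calls count as misses, so a key function that received only the raw
args and kwargs would inherit the same split unless it normalised them.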

From phd at  Tue Jan 12 21:17:46 2016
From: phd at (Oleg Broytman)
Date: Wed, 13 Jan 2016 03:17:46 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <>


On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano <steve at> wrote:
> The old convention on Linux and Unix is to just suppress all feedback, 
> but even on Linux GUI applications normally show bullets • or asterisks.

   Modern GUIs show the real character for a short period of time and
then replace it with an asterisk.

     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

From steve at  Tue Jan 12 21:17:57 2016
From: steve at (Steven D'Aprano)
Date: Wed, 13 Jan 2016 13:17:57 +1100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 11, 2016 at 08:38:59PM -0800, Guido van Rossum wrote:
> On Mon, Jan 11, 2016 at 5:39 PM, Steven D'Aprano <steve at>
> wrote:
> > On Mon, Jan 11, 2016 at 12:22:55PM -0800, Andrew Barnert via Python-ideas
> > wrote:
> >
> > > in a few months we're going to see
> > > Dropbox and Google and everyone else demanding a way to use type
> > > hinting without wasting memory on annotations at runtime in 3.x.
> >
> > I would be happy to see a runtime switch similar to -O that drops
> > annotations in 3.x, similar to how -OO drops docstrings.
> Actually my experience with -OO (and even -O) suggests that that's not a
> great model (e.g. it can't work with libraries like PLY that inspect
> docstrings). A better model might be to let people select this on a per
> module basis. Though I could also see a future where __annotations__ is a
> more space-efficient data structure than dict.
> Have you already run into a situation where __annotations__ takes up too
> much space?

Not as such, but it does seem an obvious and low-impact place to save
some memory. Like doc strings, they're rarely used at runtime outside of 
the interactive interpreter.

But your suggestion sounds more useful.


From rosuav at  Tue Jan 12 21:22:02 2016
From: rosuav at (Chris Angelico)
Date: Wed, 13 Jan 2016 13:22:02 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman <phd at> wrote:
> Hi!
> On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano <steve at> wrote:
>> The old convention on Linux and Unix is to just suppress all feedback,
>> but even on Linux GUI applications normally show bullets • or asterisks.
>    Modern GUIs show the real character for a short period of time and
> then replace it with an asterisk.

Ugh. I've only seen that on mobile devices, not on any desktop GUI,
and I think it's a sop to the terrible keyboards they have. I hope
this NEVER becomes a standard on full-sized computers with real
keyboards.


From phd at  Tue Jan 12 21:45:08 2016
From: phd at (Oleg Broytman)
Date: Wed, 13 Jan 2016 03:45:08 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico <rosuav at> wrote:
> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman <phd at> wrote:
> > On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano <steve at> wrote:
> >> The old convention on Linux and Unix is to just suppress all feedback,
> >> but even on Linux GUI applications normally show bullets • or asterisks.
> >
> >    Modern GUIs show the real character for a short period of time and
> > then replace it with an asterisk.
> Ugh. I've only seen that on mobile devices, not on any desktop GUI,

   On desktop (Windows) I saw a password entry with a checkbox to switch
between real characters and asterisks.

> and I think it's a sop to the terrible keyboards they have. I hope
> this NEVER becomes a standard on full-sized computers with real
> keyboards.
> ChrisA

     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

From ethan at  Tue Jan 12 22:07:27 2016
From: ethan at (Ethan Furman)
Date: Tue, 12 Jan 2016 19:07:27 -0800
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/12/2016 06:45 PM, Oleg Broytman wrote:
> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
>>> On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:

>>>> The old convention on Linux and Unix is to just suppress all feedback,
>>>> but even on Linux GUI applications normally show bullets • or asterisks.
>>>     Modern GUIs show the real character for a short period of time and
>>> then replace it with an asterisk.
>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
>     On desktop (Windows) I saw a password entry with a checkbox to switch
> between real characters and asterisks.

While that can be handy, it is not the same as displaying each character 
as it is typed and then covering it with something else.  I agree with 
ChrisA and hope that never becomes the convention on non-mobile devices.


From leewangzhong+python at  Tue Jan 12 22:45:15 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Tue, 12 Jan 2016 22:45:15 -0500
Subject: [Python-ideas] Fwd: Using functools.lru_cache only on some
 arguments of a function
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 12, 2016 at 9:13 PM, Michael Selik <mike at> wrote:
> On Tue, Jan 12, 2016 at 6:26 PM Franklin? Lee
> <leewangzhong+python at> wrote:
>> Should it be possible to specify a tuple for `key` to transform each
>> arg separately? In your case, you might pass in `(None, lambda x: 0)`
>> to specify that the first parameter shouldn't be transformed, and the
>> second parameter should be considered constant. But that's very
>> confusing: should `None` mean "ignore", or "don't transform" (like
>> `filter`)? Or we can use `False` for "ignore", perhaps.
> I think his intention was to mimic the ``key`` argument of sorted, which
> expects a function that takes 1 and only 1 positional argument.

I know, but those three functions expect a sequence of single things,
while `lru_cache` expects several things. (Not exactly a tuple,
because that would be a single thing, but rather a collection of

> Perhaps it's best to see the exact use case and a few other examples, to get
> a better idea for the specifics, before implementing this feature?
>> On Sun, Jan 10, 2016 at 2:03 PM, Michael Selik <mike at> wrote:
>> > Shouldn't the key function be called with ``key(*args, **kwargs)``?
>> Does `lru_cache` know how to deal with passing regular args as kwargs?
> Now that you mention it, I realized it treats the two differently.
> ``def foo(x): pass`` would store ``foo(42)`` and ``foo(x=42)`` as different
> entries in the cache.

I feel like, at least ideally, there should be a way for
`update_wrapper`/`wraps` to unpack named arguments, so that wrappers
can truly reflect the params of the functions they wrap. (For example,
when inspecting a function, and by decoding kw-or-positional-args to
their place in `*args`.) It should also be possible to add or remove
args, though I'm not sure how useful that will be. (Also ideally, a
wrapper function would "pass up" its default args to the wrapper.)
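
The cache behaviour Michael describes is easy to confirm, and one possible
workaround is to bind the arguments to their parameter names before they
reach the cache. The sketch below is illustrative only: `normalized_cache`
is a made-up helper, not part of functools, and it assumes every argument
is hashable and every parameter can be passed by keyword.

```python
import functools
import inspect


@functools.lru_cache(maxsize=None)
def foo(x):
    return x * 2


foo(42)
foo(x=42)
# The positional and keyword spellings are stored as separate entries.
plain_misses = foo.cache_info().misses


def normalized_cache(func):
    """Hypothetical helper: give each logical call a single cache key."""
    sig = inspect.signature(func)

    @functools.lru_cache(maxsize=None)
    def cached(items):
        # items is a tuple of (name, value) pairs; call func by keyword.
        return func(**dict(items))

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        # Names are unique, so sorting never has to compare the values.
        return cached(tuple(sorted(bound.arguments.items())))

    wrapper.cache_info = cached.cache_info
    return wrapper


@normalized_cache
def bar(x, y=0):
    return x + y


bar(1)
bar(x=1)
bar(1, y=0)
info = bar.cache_info()
```

With this wrapper the three calls to `bar` share one entry, at the cost of
a signature bind on every call.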

From khali119 at  Tue Jan 12 23:31:38 2016
From: khali119 at (Muhammad Ahmed Khalid)
Date: Tue, 12 Jan 2016 22:31:38 -0600
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

I've gotten some ideas from people's emails and I think it is worth
investing more time in this feature. I will work to make the code
platform independent and Python 3 compatible.

The standard library code for getpass.getpass() actually does use msvcrt
on the Windows platform, so I think I'll keep my code like that, but I'll
add another function supporting Unix.

Considering the mobile device issue: there can always be options.
Developers can choose whether or not to implement the feature, and can
go further and let their users decide whether to use it. This is exactly
what I am aiming for with the desktop version too: the ability to choose.

~ KingMak

On Tue, Jan 12, 2016 at 9:07 PM, Ethan Furman <ethan at> wrote:

> On 01/12/2016 06:45 PM, Oleg Broytman wrote:
>> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
>>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
>>>> On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
>>>>> The old convention on Linux and Unix is to just suppress all feedback,
>>>>> but even on Linux GUI applications normally show bullets or
>>>>> asterisks.
>>>>     Modern GUIs show the real character for a short period of time and
>>>> then replace it with an asterisk.
>>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
>>     On desktop (Windows) I saw a password entry with a checkbox to switch
>> between real characters and asterisks.
> While that can be handy, it is not the same as displaying each character
> as it is typed and then covering it with something else.  I agree with
> ChrisA and hope that never becomes the convention on non-mobile devices.
> --
> ~Ethan~

From steve at  Wed Jan 13 05:04:43 2016
From: steve at (Steven D'Aprano)
Date: Wed, 13 Jan 2016 21:04:43 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman <phd at> wrote:
> > Hi!
> >
> > On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano <steve at> wrote:
> >> The old convention on Linux and Unix is to just suppress all feedback,
> >> but even on Linux GUI applications normally show bullets or asterisks.
> >
> >    Modern GUIs show the real character for a short period of time and
> > then replace it with an asterisk.
> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
> and I think it's a sop to the terrible keyboards they have. I hope
> this NEVER becomes a standard on full-sized computers with real
> keyboards.

I don't know... I'm about 35% convinced that obfuscating the password is 
just security theatre. I'm not sure that "shoulder surfing" of passwords 
is a significant threat.

But the other 65% tells me that we should continue to obfuscate.


From rosuav at  Wed Jan 13 05:19:41 2016
From: rosuav at (Chris Angelico)
Date: Wed, 13 Jan 2016 21:19:41 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 13, 2016 at 9:04 PM, Steven D'Aprano <steve at> wrote:
> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman <phd at> wrote:
>> > Hi!
>> >
>> > On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano <steve at> wrote:
>> >> The old convention on Linux and Unix is to just suppress all feedback,
>> >> but even on Linux GUI applications normally show bullets or asterisks.
>> >
>> >    Modern GUIs show the real character for a short period of time and
>> > then replace it with an asterisk.
>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
>> and I think it's a sop to the terrible keyboards they have. I hope
>> this NEVER becomes a standard on full-sized computers with real
>> keyboards.
> I don't know... I'm about 35% convinced that obfuscating the password is
> just security theatre. I'm not sure that "shoulder surfing" of passwords
> is a significant threat.
> But the other 65% tells me that we should continue to obfuscate.

In some situations it's absolutely appropriate to not hide the
password at all. (A lot of routers let me type in a wifi password
unobscured, for instance.) But if you're doing that, then just keep
the whole password visible, same as if you're asking for a user name.
Don't show the one last-typed character and then hide it.

You're quite probably right that obfuscating the display is security
theatre; but it's the security theatre that people are expecting. If
you're about to enter your credit card details into a web form, does
it really matter whether or not the form itself was downloaded over an
encrypted link? But people are used to "look for the padlock", which
means that NOT having the padlock will bother people. If you ask for a
password and it gets displayed, people will wonder if they're entering
it in the right place.

That said, though, I honestly don't think there's much value in seeing
the length of a password by the number of asterisks. Have you ever
looked at them and realized that you missed out a letter? But again,
they're what people expect...


From p.f.moore at  Wed Jan 13 05:26:08 2016
From: p.f.moore at (Paul Moore)
Date: Wed, 13 Jan 2016 10:26:08 +0000
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <>

On 13 January 2016 at 10:19, Chris Angelico <rosuav at> wrote:
> That said, though, I honestly don't think there's much value in seeing
> the length of a password by the number of asterisks. Have you ever
> looked at them and realized that you missed out a letter? But again,
> they're what people expect...

Personally, I frequently look at the line of asterisks and think "that
doesn't look right" - it helps me catch typos. Also, doing things like
deleting everything but the first N characters lets me retype from a
"known good" point. I tend to get uncomfortable when I get no feedback
at all, as is typical on Unix systems.

But yes, it's about expectations, and it depends what type of system
you typically work with. Although many, many people are used to seeing
feedback asterisks or similar, as that's the norm on Windows and in
many (most?) web applications.


From mal at  Wed Jan 13 05:36:07 2016
From: mal at (M.-A. Lemburg)
Date: Wed, 13 Jan 2016 11:36:07 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 13.01.2016 04:07, Ethan Furman wrote:
> On 01/12/2016 06:45 PM, Oleg Broytman wrote:
>> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
>>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
>>>> On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
>>>>> The old convention on Linux and Unix is to just suppress all feedback,
>>>>> but even on Linux GUI applications normally show bullets or asterisks.
>>>>     Modern GUIs show the real character for a short period of time and
>>>> then replace it with an asterisk.
>>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
>>     On desktop (Windows) I saw a password entry with a checkbox to switch
>> between real characters and asterisks.
> While that can be handy, it is not the same as displaying each character as it is typed and then
> covering it with something else.  I agree with ChrisA and hope that never becomes the convention on
> non-mobile devices.

At least in Windows GUIs, the password field only provides a
very thin layer to obfuscate the underlying password text:

More secure systems always show 8 bullets regardless of how
many characters the password actually has, and give only
limited feedback on each keypress, so the number of chars
in the password cannot be seen.

Not showing anything is certainly more secure than any other
method of providing user feedback, so I agree that we should
not make this the default.

Marc-Andre Lemburg

Professional Python Services directly from the Experts (#1, Jan 13 2016)
>>> Python Projects, Coaching and Consulting ...
>>> Python Database Interfaces ... 
>>> Plone/Zope Database Interfaces ... 

::: We implement business ideas - efficiently in both time and costs ::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From jonathan at  Wed Jan 13 06:00:06 2016
From: jonathan at (Jonathan Slenders)
Date: Wed, 13 Jan 2016 12:00:06 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>


prompt_toolkit can prompt for password input:

The input is displayed as asterisks and keeps all readline-like navigation.
The second is an example of password input where Ctrl-T toggles between
asterisks and fully visible text.

Feedback is welcome (create an issue), but this probably will never become
part of core Python.


2016-01-13 11:36 GMT+01:00 M.-A. Lemburg <mal at>:

> On 13.01.2016 04:07, Ethan Furman wrote:
> > On 01/12/2016 06:45 PM, Oleg Broytman wrote:
> >> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
> >>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman wrote:
> >>>> On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano wrote:
> >
> >>>>> The old convention on Linux and Unix is to just suppress all
> feedback,
> >>>>> but even on Linux GUI applications normally show bullets or
> asterisks.
> >>>>
> >>>>     Modern GUIs show the real character for a short period of time and
> >>>> then replace it with an asterisk.
> >>>
> >>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
> >>
> >>     On desktop (Windows) I saw a password entry with a checkbox to
> switch
> >> between real characters and asterisks.
> >
> > While that can be handy, it is not the same as displaying each character
> as it is typed and then
> > covering it with something else.  I agree with ChrisA and hope that
> never becomes the convention on
> > non-mobile devices.
> At least in Windows GUIs, the password field only provides a
> very thin layer to obfuscate the underlying password text:
> More secure systems always show 8 bullets regardless of how
> many characters the password actually has and only provide
> limited feedback when hitting a key without allowing to
> see the number of chars in the password.
> Not showing anything is certainly more secure than any other
> method of providing user feedback, so I agree that we should
> not make this the default.
> --
> Marc-Andre Lemburg

From python-ideas at  Wed Jan 13 12:56:44 2016
From: python-ideas at (Mike Miller)
Date: Wed, 13 Jan 2016 09:56:44 -0800
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <>

As in everything, it depends on the situation:

The Security Now podcast has also expressed doubt on the practice in common cases.

My take is that a few flags could control the behavior, perhaps with
convenient defaults: show_text=True, display_char=None, display_delay=0,
plus a Ctrl-T keybinding to toggle (as mentioned elsewhere).

A good case could also be made for the most secure defaults instead.  As long as 
the toggle keybinding were available it wouldn't be a great burden.  This is a 
console-only solution, correct?  So, Ctrl/Alt keys should be available.


On 2016-01-13 02:04, Steven D'Aprano wrote:
 > I don't know... I'm about 35% convinced that obfuscating the password is
 > just security theatre. I'm not sure that "shoulder surfing" of passwords
 > is a significant threat.
 > But the other 65% tells me that we should continue to obfuscate.

From tjreedy at  Wed Jan 13 19:29:40 2016
From: tjreedy at (Terry Reedy)
Date: Wed, 13 Jan 2016 19:29:40 -0500
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <n76q5t$dm6$>

On 1/12/2016 8:11 PM, Muhammad Ahmed Khalid wrote:
> Greetings,
> I am working on a project and I am using getpass.getpass() to grab
> passwords from the user.
> Some of the users wanted asterisks to be displayed when they were typing
> in the passwords for feedback i.e. how many characters were typed and
> how many to backspace.
 > Please let me know about your thoughts on the issue.

You are debating the wrong issue.  I work at home.  I HATE Password 
Masking Security Theatre.  Since I cannot reliably type 10 random hidden 
characters (or so sites tell me), it causes me endless grief for 
0.00000% gain.  If any of my passwords is stolen, it will, with 
probability 1.0 - epsilon, be part of one of the hacks that steal 
millions at a time from corporate sites.  Epsilon would be something 
other than a stranger looking over my shoulder.

PS: When UNIX decided to give no feedback, most people had one short 
easy-to-remember, easy-to-type password.  Not a hundred hard to remember 
and type.

Terry Jan Reedy

From g.brandl at  Thu Jan 14 04:50:09 2016
From: g.brandl at (Georg Brandl)
Date: Thu, 14 Jan 2016 10:50:09 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
Message-ID: <n77r0n$idd$>

On 01/13/2016 11:04 AM, Steven D'Aprano wrote:
> On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
>> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman <phd at> wrote:
>> > Hi!
>> >
>> > On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano <steve at> wrote:
>> >> The old convention on Linux and Unix is to just suppress all feedback,
>> >> but even on Linux GUI applications normally show bullets or asterisks.
>> >
>> >    Modern GUIs show the real character for a short period of time and
>> > then replace it with an asterisk.
>> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
>> and I think it's a sop to the terrible keyboards they have. I hope
>> this NEVER becomes a standard on full-sized computers with real
>> keyboards.
> I don't know... I'm about 35% convinced that obfuscating the password is 
> just security theatre. I'm not sure that "shoulder surfing" of passwords 
> is a significant threat.

This might not apply for people working from home, but at work I regularly
enter my own password or passwords for other systems with other people
intentionally looking over my shoulder (e.g. pair-programming, debugging,
confirming error reports etc.)  Should I ask them to look away from the
screen each time?


From rosuav at  Thu Jan 14 05:08:24 2016
From: rosuav at (Chris Angelico)
Date: Thu, 14 Jan 2016 21:08:24 +1100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <n77r0n$idd$>
References: <>
Message-ID: <>

On Thu, Jan 14, 2016 at 8:50 PM, Georg Brandl <g.brandl at> wrote:
> This might not apply for people working from home, but at work I regularly
> enter my own password or passwords for other systems with other people
> intentionally looking over my shoulder (e.g. pair-programming, debugging,
> confirming error reports etc.)  Should I ask them to look away from the
> screen each time?

Yes - and ask them to block their ears, too. The sound of your
keyboard can give away information about what your password is.


From khali119 at  Thu Jan 14 05:07:54 2016
From: khali119 at (Muhammad Ahmed Khalid)
Date: Thu, 14 Jan 2016 04:07:54 -0600
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <n77r0n$idd$>
References: <>
 <> <n77r0n$idd$>
Message-ID: <>

Regarding the issue of people watching the user type in the password:
unless the person looking is right next to the user, it doesn't really
matter if they look at the screen, because if password masking is enabled
they will only see the masking characters.

If the person looking is right next to the user then that person can just
look at the keyboard and the keys being pressed.

Also the main issue here is that there should be a choice provided by the
getpass function to provide feedback or not.

On Thu, Jan 14, 2016 at 3:50 AM, Georg Brandl <g.brandl at> wrote:

> On 01/13/2016 11:04 AM, Steven D'Aprano wrote:
> > On Wed, Jan 13, 2016 at 01:22:02PM +1100, Chris Angelico wrote:
> >> On Wed, Jan 13, 2016 at 1:17 PM, Oleg Broytman <phd at> wrote:
> >> > Hi!
> >> >
> >> > On Wed, Jan 13, 2016 at 12:54:14PM +1100, Steven D'Aprano <
> steve at> wrote:
> >> >> The old convention on Linux and Unix is to just suppress all
> feedback,
> >> >> but even on Linux GUI applications normally show bullets or
> asterisks.
> >> >
> >> >    Modern GUIs show the real character for a short period of time and
> >> > then replace it with an asterisk.
> >>
> >> Ugh. I've only seen that on mobile devices, not on any desktop GUI,
> >> and I think it's a sop to the terrible keyboards they have. I hope
> >> this NEVER becomes a standard on full-sized computers with real
> >> keyboards.
> >
> > I don't know... I'm about 35% convinced that obfuscating the password is
> > just security theatre. I'm not sure that "shoulder surfing" of passwords
> > is a significant threat.
> This might not apply for people working from home, but at work I regularly
> enter my own password or passwords for other systems with other people
> intentionally looking over my shoulder (e.g. pair-programming, debugging,
> confirming error reports etc.)  Should I ask them to look away from the
> screen each time?
> cheers,
> Georg

From khali119 at  Thu Jan 14 05:29:18 2016
From: khali119 at (Muhammad Ahmed Khalid)
Date: Thu, 14 Jan 2016 04:29:18 -0600
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
 <> <n77r0n$idd$>
Message-ID: <>

This discussion is kind of going into different directions and I want to
bring it back to the getpass function.

The original argument is that there should be a choice provided by the
getpass function of getting feedback or not. Currently the getpass function
does not provide any feedback and I just want to add the ability to make it
so that I can get some feedback.

Someone earlier mentioned that by default the function should not echo
anything back, and I totally agree with that. In fact I like that suggestion
a lot. Only when users want feedback would they change the parameters of
the function and supply whichever character they want for masking.

Please note that currently the first parameter of the getpass function is
the prompt. The second parameter could then be used as the masking character,
which would be None / blank by default.
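
To make the proposal concrete, here is a rough sketch of the masking loop
itself, separated from any terminal handling. Everything in it is an
assumption for illustration: `read_masked`, `read_char` and `write` are
made-up names, not getpass APIs. One thing worth noting is that
getpass.getpass() already takes `stream` as its second parameter, so a
mask character would probably have to be a new keyword argument rather
than the second positional one.

```python
import io


def read_masked(prompt, read_char, write, mask=None):
    """Hypothetical sketch: collect a password one character at a time.

    read_char() returns the next typed character and write(text) echoes
    text to the user. With mask=None nothing is echoed, as getpass does now.
    """
    write(prompt)
    chars = []
    while True:
        ch = read_char()
        if ch in ("\r", "\n", ""):      # Enter or end of input finishes
            break
        if ch in ("\b", "\x7f"):        # Backspace or Delete
            if chars:
                chars.pop()
                if mask:
                    write("\b \b")      # rub out the last mask character
            continue
        chars.append(ch)
        if mask:
            write(mask)
    write("\n")
    return "".join(chars)


# Simulated session: the user types "abx", backspaces once, then "c", Enter.
typed = iter("abx\bc\n")
screen = io.StringIO()
password = read_masked("Password: ", lambda: next(typed), screen.write, mask="*")

# Same keystrokes with the default: no feedback at all.
typed_silent = iter("abx\bc\n")
silent_screen = io.StringIO()
silent = read_masked("Password: ", lambda: next(typed_silent), silent_screen.write)
```

On Windows `read_char` could be something like msvcrt.getwch; on Unix the
terminal would first have to be switched out of line-buffered mode, which
is the platform-specific part this sketch leaves out.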

On Thu, Jan 14, 2016 at 4:08 AM, Chris Angelico <rosuav at> wrote:

> On Thu, Jan 14, 2016 at 8:50 PM, Georg Brandl <g.brandl at> wrote:
> > This might not apply for people working from home, but at work I
> regularly
> > enter my own password or passwords for other systems with other people
> > intentionally looking over my shoulder (e.g. pair-programming, debugging,
> > confirming error reports etc.)  Should I ask them to look away from the
> > screen each time?
> Yes - and ask them to block their ears, too. The sound of your
> keyboard can give away information about what your password is.
> ChrisA

From leewangzhong+python at  Thu Jan 14 05:32:46 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Thu, 14 Jan 2016 05:32:46 -0500
Subject: [Python-ideas] (FAT Python) Convert keyword arguments to positional?
Message-ID: <>

(FAT Python:

FAT Python uses guards to check whether a global name (for example,
the name for a function) has changed its value. Idea: If you know
exactly which function will be called, you can also optimize based on
the properties of that function.

According to Eli Bendersky's 2012 blog post[1] (which might be
outdated), a function call with keyword arguments is potentially
slower than one with only positional arguments.

    If the function in question accepts no arguments (marked by the
METH_NOARGS flag when the function is created) or just a single object
argument (METH_O flag), call_function doesn't go through the usual
argument packing and can call the underlying function pointer


    do_call ... implements the most generic form of calling. However,
there's one more optimization - if func is a PyFunction (an object
used internally to represent functions defined in Python code), a
separate path is taken - fast_function.


    ... PyCFunction objects that do [receive] keyword arguments [use
do_call instead of fast_function]. A curious aspect of this fact is
that it's somewhat more efficient to not pass keyword arguments to C
functions that either accept them or are fine with just positional

So maybe, in a function which uses FAT Python's guards, we can replace
some of the keyword calls to global functions with positional-only
calls. It might be a micro-optimization, but it's one that the Python
programmer doesn't have to worry about.

1. Is it possible to correctly determine, for a given function, which
positional parameters have which names?
2. Is it possible to change a function object's named parameters some
time after it's created (and inspected)?
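
Both questions can at least be explored from pure Python with the inspect
module. The sketch below says nothing about how FAT Python works
internally; `to_positional`, `greet`, `original` and `swapped` are names
made up for the example.

```python
import inspect


def greet(name, punctuation="!", *, shout=False):
    text = "Hello, " + name + punctuation
    return text.upper() if shout else text


def to_positional(func, args, kwargs):
    """Move keyword arguments into positional slots where the signature allows."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    # bound.args holds everything that can be passed positionally;
    # bound.kwargs keeps what must stay a keyword (here, keyword-only shout).
    return bound.args, bound.kwargs


pos, kw = to_positional(greet, ("world",), {"punctuation": "?", "shout": True})


# Question 2: the parameter names can change after the function was inspected,
# because a function's code object can be replaced in place.
def original(a, b):
    return a - b


def swapped(b, a):
    return a - b


names_before = tuple(inspect.signature(original).parameters)
result_before = original(1, 2)
original.__code__ = swapped.__code__
names_after = tuple(inspect.signature(original).parameters)
result_after = original(1, 2)
```

So the mapping from names to positions can be computed, but it is only
valid for as long as the function's code object stays the same.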

PS: I didn't feel like this was appropriate for either of Victor's
running PEP threads, and the third milestone thread is in the previous
month's archives, so I thought that making a new thread would be best.


From mal at  Thu Jan 14 05:39:54 2016
From: mal at (M.-A. Lemburg)
Date: Thu, 14 Jan 2016 11:39:54 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
 <> <n77r0n$idd$>
Message-ID: <>

On 14.01.2016 11:29, Muhammad Ahmed Khalid wrote:
> This discussion is kind of going into different directions and I want to
> bring it back to the getpass function.
> The original argument is that there should be a choice provided by the
> getpass function of getting feedback or not. Currently the getpass function
> does not provide any feedback and I just want to add the ability to make it
> so that I can get some feedback.
> Some one earlier mentioned that by default the function will not echo
> anything back and I totally agree with that. In fact I like that suggestion
> a lot. Only when the users* want feedback they can change the parameters of
> the function and add which ever character they want for masking.
> Please note that currently the first parameter of the getpass function is
> the Prompt. The second parameter can then be used as the masking character
> which can be None / blank by default.

If you can make this work cross-platform, I don't think anyone
would object to having such an option, as long as the default
remains "show nothing" :-)

For more complex password / key card / single signon / etc.
functionality, I believe a PyPI installable package would be

Marc-Andre Lemburg


From g.brandl at  Thu Jan 14 06:17:45 2016
From: g.brandl at (Georg Brandl)
Date: Thu, 14 Jan 2016 12:17:45 +0100
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <>
References: <>
 <> <n77r0n$idd$>
Message-ID: <n7804p$8b9$>

On 01/14/2016 11:07 AM, Muhammad Ahmed Khalid wrote:
> Regarding the issue of people looking at the user typing in the password. Unless
> the person looking is right next to the user, it doesn't really matter if they
> look at the screen, because if password masking is enabled they will only see
> the masking characters.
> If the person looking is right next to the user then that person can just look
> at the keyboard and the keys being pressed.

Well, I can type reasonably fast.

For Chris and anyone else who'd like to pretend not to know what I meant:  The
point is that these are *coworkers*, not *hackers*.  They have no reason to go
to great lengths to know these passwords.  But at the same time, they're not
theirs to know, and I'd like to keep them to myself, otherwise we could just
use one company-wide password for everything.

Having the password masked on the screen, where eyes will be anyway, is a good
compromise to avoid needless interruptions.


From jsbueno at  Thu Jan 14 11:06:46 2016
From: jsbueno at (Joao S. O. Bueno)
Date: Thu, 14 Jan 2016 14:06:46 -0200
Subject: [Python-ideas] Higher leavel heapq
Message-ID: <>


the heapq stdlib module is really handy, but a little low level -
in that it accepts a sequence, possibly only a list, as the heap object,
and that object has to be handled independently, outside the functions
provided there. (One can't otherwise insert or delete elements of that list
without destroying the heap, for example.)

It would be simple to have a higher level class that would do just
that, and simplify the use
of an ordered container - what about having an extra class there?

I have the snippet below I wrote on Stack Overflow a couple of years ago -
it is very handy. With a little more boilerplate and code hardening,
maybe it could
be a nice thing for the stdlib?

What do you say?


From guido at  Thu Jan 14 11:17:28 2016
From: guido at (Guido van Rossum)
Date: Thu, 14 Jan 2016 08:17:28 -0800
Subject: [Python-ideas] Higher leavel heapq
In-Reply-To: <>
References: <>
Message-ID: <>

Well, it's a lot of overhead for a very small bit of convenience. I say
let's not do this, it would just encourage people to settle for a slower
version. Not everything needs to be OO, you know!

On Thu, Jan 14, 2016 at 8:06 AM, Joao S. O. Bueno <jsbueno at>

> Hi,
> the heapq stdlib module is really handy, but a little low level -
> in that it accepts a sequence, possibly only a list, as the heap-object,
> and that object have to be handled independently, outside the functions
> provided in there. (One can't otherwise insert or delete elements of that
> list,
> without destroying the heap, for example).
> It would be simple to have a higher level class that would do just
> that, and simplify the use
> of an ordered container - what about having an extra class there?
> I have the snippet bellow I wrote on stack-overflow a couple years ago -
> it is very handy.With a little more boiler plate and code hardening,
> maybe it could
> be a nice thing for the stdlib?
> What do you say?
>   js
>  -><-

--Guido van Rossum (

From abarnert at  Thu Jan 14 11:54:50 2016
From: abarnert at (Andrew Barnert)
Date: Thu, 14 Jan 2016 08:54:50 -0800
Subject: [Python-ideas] Higher leavel heapq
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 14, 2016, at 08:06, Joao S. O. Bueno <jsbueno at> wrote:
> Hi,
> the heapq stdlib module is really handy, but a little low level -
> in that it accepts a sequence, possibly only a list, as the heap-object,
> and that object have to be handled independently, outside the functions
> provided in there. (One can't otherwise insert or delete elements of that list,
> without destroying the heap, for example).
> It would be simple to have a higher level class that would do just
> that, and simplify the use
> of an ordered container - what about having an extra class there?
> I have the snippet bellow I wrote on stack-overflow a couple years ago -
> it is very handy.With a little more boiler plate and code hardening,
> maybe it could
> be a nice thing for the stdlib?

Using (key(x), x) as the elements doesn't work if the real values aren't comparable, and isn't stable even if they are. So, to make this work fully generally, you have to add a third element, like (key(x), next(counter), x). But when that isn't necessary, it's pretty wasteful.
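
A small sketch of that three-element entry, using only the module functions. The `Task` class, `push` and `pop` are made up for the example; the point is only that the counter breaks ties, so the values themselves are never compared.

```python
import heapq
import itertools


class Task:
    """Deliberately defines no ordering, so Task < Task raises TypeError."""

    def __init__(self, name):
        self.name = name


# With plain (key, value) pairs, a tie on the key falls through to the values.
bad_heap = []
heapq.heappush(bad_heap, (1, Task("a")))
try:
    heapq.heappush(bad_heap, (1, Task("b")))
    two_tuple_failed = False
except TypeError:
    two_tuple_failed = True

counter = itertools.count()
heap = []


def push(heap, priority, item):
    # The counter is unique, so comparison stops before reaching the item,
    # and equal priorities pop in insertion order.
    heapq.heappush(heap, (priority, next(counter), item))


def pop(heap):
    priority, _, item = heapq.heappop(heap)
    return item


push(heap, 2, Task("write docs"))
push(heap, 1, Task("fix bug"))
push(heap, 1, Task("review patch"))
order = [pop(heap).name for _ in range(3)]
```

When the values are already comparable and stability doesn't matter, the extra element is just the overhead mentioned above.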

Also, for many uses, the key doesn't have anything to do with the values--e.g., a timer queue just uses the insertion time--so the sorting-style key function is misleading.

Also, a heap as a collection-like data structure isn't that useful on its own. There are a variety of iterative algorithms that use a heap internally, but they don't need to expose it to callers. (And most of the common ones are already included in the module.) And there are also a variety of data structures that use a heap internally, but they also don't need to expose it to callers. For example, read the section in the docs on priority queues, and try to implement a pqueue class on top of your wrapper class vs. directly against the module functions. And do the same with the nlargest function (you can find the source linked from the docs). In both cases, the version without the class is more readable, less code, and likely more efficient, and the API for users is the same, so what has the wrapper class bought you?

From victor.stinner at  Thu Jan 14 13:07:46 2016
From: victor.stinner at (Victor Stinner)
Date: Thu, 14 Jan 2016 19:07:46 +0100
Subject: [Python-ideas] (FAT Python) Convert keyword arguments to
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-14 11:32 GMT+01:00 Franklin? Lee <leewangzhong+python at>:
> (FAT Python:

FYI I moved the optimizer into a new project at GitHub to ease
contributions and experiments:

Running the tests works on Python 3.4, but running optimized code requires
a patched Python 3.6 (
repository which includes all patches).

> FAT Python uses guards to check whether a global name (for example,
> the name for a function) has changed its value. Idea: If you know
> exactly which function will be called, you can also optimize based on
> the properties of that function.

You need a guard on the function. The fat module provides such a guard:

Right now, it only watches func.__code__. I'm not sure that it's
enough. A function has many attributes which can change its behaviour
if they are modified: __defaults__, __closure__, __dict__,
__globals__, __kwdefaults__, __module__ (?), __name__ (?),
__qualname__ (?).
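
A small illustration of why watching only `__code__` may not be enough. This is plain Python and does not use the fat module; the function `greet` is made up for the example.

```python
def greet(name, punctuation="!"):
    return "Hello " + name + punctuation


code_before = greet.__code__
result_before = greet("world")

# Changing the defaults does not touch __code__ ...
greet.__defaults__ = ("?",)

# ... so a guard that only compares __code__ would still pass,
# even though the function now behaves differently.
code_unchanged = greet.__code__ is code_before
result_after = greet("world")

print(code_unchanged, result_before, result_after)
```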

> According to Eli Bendersky's 2012 blog post[1] (which might be
> outdated), a function call with keyword arguments is potentially
> slower than one with only positional arguments.

Yeah, ext_do_call() has to create a temporary dictionary, while
calling a function only with indexed parameters can *completely* avoid
the creation of any temporary object. PyEval_EvalFrameEx() takes a
stack (array of objects) which is used to pass parameters from
CALL_FUNCTION, but only for pure Python functions.

> So maybe, in a function which uses FAT Python's guards, we can replace
> some of the keyworded-calls to global function with positional-only
> calls. It might be a micro-optimization, but it's one that the Python
> programmer doesn't have to worry about.

It looks like you have a plan, and I think that you can implement the
optimization without changing the Python semantics.

> Concerns:
> 1. Is it possible to correctly determine, for a given function, which
> positional parameters have which names?

I think so. Just "read" the function prototype, no? Such info is
available in AST.
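
As a rough sketch of that idea (not what fatoptimizer does; both helper names are hypothetical), the parameter names can be read from the `ast.FunctionDef` node and used to turn a keyword call into a positional one. It assumes Python 3.8+ for `posonlyargs` and ignores `*args`, `**kwargs` and keyword-only parameters.

```python
import ast


def positional_parameter_names(func_source):
    """Return the names of the positional parameters of the first
    function defined in func_source, in declaration order."""
    func_def = ast.parse(func_source).body[0]
    if not isinstance(func_def, ast.FunctionDef):
        raise ValueError("expected a function definition")
    return [arg.arg for arg in func_def.args.posonlyargs + func_def.args.args]


def keywords_to_positional(names, args, kwargs):
    """Rewrite (args, kwargs) as a purely positional argument tuple.

    Raises ValueError when a keyword cannot be expressed positionally,
    for example because an earlier parameter relies on its default.
    """
    remaining = dict(kwargs)
    result = list(args)
    for name in names[len(args):]:
        if name not in remaining:
            break
        result.append(remaining.pop(name))
    if remaining:
        raise ValueError("cannot pass positionally: %r" % sorted(remaining))
    return tuple(result)


names = positional_parameter_names("def f(x, y, z=0): pass")
converted = keywords_to_positional(names, (1,), {"z": 3, "y": 2})
print(names, converted)
```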

> 2. Is it possible to change a function object's named parameters some
> time after it's created (and inspected)?

What do you think?

> PS: I didn't feel like this was appropriate for either of Victor's
> running PEP threads, and the third milestone thread is in the previous
> month's archives, so I thought that making a new thread would be best.

Yeah, it's better to start a separate thread, thanks.


From python-ideas at  Thu Jan 14 13:35:59 2016
From: python-ideas at (Mike Miller)
Date: Thu, 14 Jan 2016 10:35:59 -0800
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <n76q5t$dm6$>
References: <>
Message-ID: <>

Sounds like this default should be user configurable as well, not only by the 
developer.  Perhaps a function call to set the preference in PYTHONSTARTUP?  (or
other precedent).

A hotkey to toggle would be helpful for laptops, which tend to travel.


On 2016-01-13 16:29, Terry Reedy wrote:
> I HATE Password Masking Security Theatre.

From greg.ewing at  Thu Jan 14 16:16:20 2016
From: greg.ewing at (Greg Ewing)
Date: Fri, 15 Jan 2016 10:16:20 +1300
Subject: [Python-ideas] Password masking for getpass.getpass
In-Reply-To: <n77r0n$idd$>
References: <>
 <> <n77r0n$idd$>
Message-ID: <>

Georg Brandl wrote:
> I regularly
> enter my own password or passwords for other systems with other people
> intentionally looking over my shoulder (e.g. pair-programming, debugging,
> confirming error reports etc.)

I have a solution!

It requires a display capable of emitting light with selected
polarisation. The password field displays the password in
such a way that it can only be seen when wearing polaroid
glasses that are polarised in the correct direction. So a
pair of programmers can wear glasses that let them each
see their own password but not the other's. Bystanders
not wearing any glasses would not see either of them.


From leewangzhong+python at  Thu Jan 14 16:32:28 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Thu, 14 Jan 2016 16:32:28 -0500
Subject: [Python-ideas] (FAT Python) Convert keyword arguments to
In-Reply-To: <>
References: <>
Message-ID: <>

Maybe also have it substitute in the function's default args, if default
args take extra work (though it would take extra memory (new local
variables) and probably doesn't give any savings).

On Jan 14, 2016 1:08 PM, "Victor Stinner" <victor.stinner at> wrote:
> 2016-01-14 11:32 GMT+01:00 Franklin? Lee <leewangzhong+python at>:

> > Concerns:
> > 1. Is it possible to correctly determine, for a given function, which
> > positional parameters have which names?
> I think so. Just "read" the function prototype no? Such info is
> available in AST.
> > 2. Is it possible to change a function object's named parameters some
> > time after it's created (and inspected)?
> What do you think?

I'm not too familiar (yet) with the details of the AST.

I had function wrappers in mind. In particular, I would like to permit
"faked"/computed function signatures for wrappers based on what they wrap
(e.g. lru_cache, partial), and I'm not sure (though I suspect) that
computed signatures are compatible with immutable signatures (that is,
fixed upon creation).
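
To make the concern concrete, here is a small sketch using only `functools` and `inspect` (the functions `original` and `wrapper` are made up). It shows that the signature reported for a wrapper is computed from what it wraps, and can be replaced after the wrapper has been created.

```python
import functools
import inspect


def original(a, b=2):
    return a + b


@functools.wraps(original)
def wrapper(*args, **kwargs):
    return original(*args, **kwargs)


# inspect.signature() follows __wrapped__ by default, so the reported
# signature is the one of the wrapped function.
reported = str(inspect.signature(wrapper))
raw = str(inspect.signature(wrapper, follow_wrapped=False))

# The reported signature can also be replaced after creation.
wrapper.__signature__ = inspect.Signature(
    [inspect.Parameter("total", inspect.Parameter.POSITIONAL_OR_KEYWORD)])
recomputed = str(inspect.signature(wrapper))

print(reported, raw, recomputed)
```

An optimizer relying on a signature read once at creation time would therefore need some way to notice such changes.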

(Sorry for the double-mail, Victor. I will try to remember not to post from
the phone.)

From victor.stinner at  Fri Jan 15 11:10:25 2016
From: victor.stinner at (Victor Stinner)
Date: Fri, 15 Jan 2016 17:10:25 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
Message-ID: <>


Hi,

This PEP 511 is part of a series of 3 PEPs (509, 510, 511) adding an API
to implement a static Python optimizer specializing functions with
guards.

If the PEP is accepted, it will solve a long list of issues. Some of the
issues are old, like #1346238 which is 11 years old ;-) I found 12
such issues.


I worked to make the PEP more generic than "this hook is written for
FAT Python". Please read the full PEP to see a long list of existing
usages in Python of code transformers.

You may read again the discussion which occurred 4 years ago about the
same topic:
(the thread starts with an idea of an AST optimizer, but it moves quickly
to a generic API to transform the code)

Thanks to Red Hat for giving me time to experiment on this.

HTML version:

PEP: 511
Title: API for code transformers
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner at>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 4-January-2016
Python-Version: 3.6


Abstract
========

Propose an API to register bytecode and AST transformers. Also add a ``-o
OPTIM_TAG`` command line option to change ``.pyc`` filenames; ``-o
noopt`` disables the peephole optimizer. Raise an ``ImportError``
exception on import if the ``.pyc`` file is missing and the code
transformers required to transform the code are missing. Code
transformers are not needed for code transformed ahead of time (loaded from
``.pyc`` files).


Rationale
=========

Python does not provide a standard way to transform the code. Projects
transforming the code use various hooks. The MacroPy project uses an
import hook: it adds its own module finder in ``sys.meta_path`` to
hook its AST transformer. Another option is to monkey-patch the
builtin ``compile()`` function. There are even more options to
hook a code transformer.

Python 3.4 added a ``compile_source()`` method to
````. But code transformation is wider than
just importing modules, see described use cases below.

Writing an optimizer or a preprocessor is out of the scope of this PEP.

Usage 1: AST optimizer
----------------------

Transforming an Abstract Syntax Tree (AST) is a convenient
way to implement an optimizer. It's easier to work on the AST than
on the bytecode: the AST contains more information and is higher
level.

Since the optimization can be done ahead of time, complex but slow
optimizations can be implemented.

Examples of optimizations which can be implemented with an AST optimizer:

* Copy propagation:
  replace ``x=1; y=x`` with ``x=1; y=1``
* Constant folding:
  replace ``1+1`` with ``2``
* Dead code elimination
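
As an illustration of the constant folding item, here is a minimal sketch written with ``ast.NodeTransformer``. It is not the fatoptimizer implementation; the class name ``ConstantFolder`` is hypothetical, only ``+``, ``-`` and ``*`` on numbers are handled, and it assumes a Python version where ``ast.unparse()`` exists (3.9 or newer).

```python
import ast
import operator

# Operators folded by this sketch; anything else is left untouched.
_FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        op = _FOLDABLE.get(type(node.op))
        if (op is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            folded = ast.Constant(value=op(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


tree = ConstantFolder().visit(ast.parse("x = 1 + 1 + y * (2 * 3)"))
folded_source = ast.unparse(tree)
print(folded_source)
```

Only the sub-expressions made entirely of constants are replaced; ``y * 6`` stays a multiplication because ``y`` is not known.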

Using guards (see the `PEP 510
<>`_), it is possible to
implement a much wider choice of optimizations. Examples:

* Simplify iterable: replace ``range(3)`` with ``(0, 1, 2)`` when used
  as iterable
* `Loop unrolling <>`_
* Call pure builtins: replace ``len("abc")`` with ``3``
* Copy used builtin symbols to constants
* See also the optimizations implemented in fatoptimizer,
  a static optimizer for Python 3.6.

The following issues can be implemented with an AST optimizer:

* `Issue #1346238
  <>`_: A constant folding
  optimization pass for the AST
* `Issue #2181 <>`_:
  optimize out local variables at end of function
* `Issue #2499 <>`_:
  Fold unary + and not on constants
* `Issue #4264 <>`_:
  Patch: optimize code to use LIST_APPEND instead of calling list.append
* `Issue #7682 <>`_:
  Optimisation of if with constant expression
* `Issue #10399 <>`_: AST
  Optimization: inlining of function calls
* `Issue #11549 <>`_:
  Build-out an AST optimizer, moving some functionality out of the
  peephole optimizer
* `Issue #17068 <>`_:
  peephole optimization for constant strings
* `Issue #17430 <>`_:
  missed peephole optimization

Usage 2: Preprocessor
---------------------

A preprocessor can be easily implemented with an AST transformer. A
preprocessor has many different uses.

Some examples:

* Remove debug code like assertions and logs to make the code faster to
  run in production.
* `Tail-call Optimization <>`_
* Add profiling code
* `Lazy evaluation <>`_:
  see `lazy_python <>`_
  (bytecode transformer) and `lazy macro of MacroPy
  <>`_ (AST transformer)
* Change dictionary literals into collections.OrderedDict instances
* Declare constants: see @asconstants of codetransformer
* Domain Specific Language (DSL) like SQL queries. The
  Python language itself doesn't need to be modified. Previous attempts
  to implement DSL for SQL like `PEP 335 - Overloadable Boolean
  Operators <>`_ were rejected.
* Pattern Matching of functional languages
* String Interpolation, but `PEP 498 -- Literal String Interpolation
  <>`_ was merged into Python
  3.6.

`MacroPy <>`_ has a long list of
examples and use cases.

This PEP does not add any new code transformer. Using a code transformer
requires an external module, which must be registered manually.
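
For instance, the first preprocessor use listed above (removing assertions) could be sketched as follows. This uses only the existing ``ast`` module and the builtin ``compile()``, not the API proposed by this PEP; the class name ``StripAsserts`` is hypothetical.

```python
import ast


class StripAsserts(ast.NodeTransformer):
    def visit_Assert(self, node):
        # Replace the statement with "pass" rather than deleting it,
        # so that a block containing only an assert stays valid.
        return ast.copy_location(ast.Pass(), node)


source = """
def half(n):
    assert n % 2 == 0, "n must be even"
    return n // 2
"""

tree = StripAsserts().visit(ast.parse(source))
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<preprocessed>", "exec"), namespace)

# The assertion is gone, so an odd argument no longer raises.
result = namespace["half"](3)
print(result)
```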

See also `PyXfuscator <>`_: Python
obfuscator, deobfuscator, and user-assisted decompiler.

Usage 3: Disable all optimization
---------------------------------

Ned Batchelder asked to add an option to disable the peephole optimizer
because it makes code coverage more difficult to implement. See the
discussion on the python-ideas mailing list: `Disable all peephole
optimizations <>`_.

This PEP adds a new ``-o noopt`` command line option to disable the
peephole optimizer. In Python, it's as easy as::

    sys.set_code_transformers([])

It will fix the `Issue #2506 <>`_: Add
mechanism to disable optimizations.

Usage 4: Write new bytecode optimizers in Python
------------------------------------------------

Python 3.6 optimizes the code using a peephole optimizer. By
definition, a peephole optimizer has a narrow view of the code and so
can only implement basic optimizations. The optimizer rewrites the
bytecode. It is difficult to enhance it, because it is written in C.

With this PEP, it becomes possible to implement a new bytecode optimizer
in pure Python and experiment with new optimizations.

Some optimizations are easier to implement on the AST like constant
folding, but optimizations on the bytecode are still useful. For
example, when the AST is compiled to bytecode, useless jumps can be
emitted because the compiler is naive and does not try to optimize
anything.

Use Cases
=========

This section gives examples of use cases explaining when and how code
transformers will be used.

Interactive interpreter
-----------------------

It will be possible to use code transformers with the interactive
interpreter which is popular in Python and commonly used to demonstrate
Python.

The code is transformed at runtime and so the interpreter can be slower
when expensive code transformers are used.

Build a transformed package
---------------------------

It will be possible to build a package of the transformed code.

A transformer can have a configuration. The configuration is not stored
in the package.

All ``.pyc`` files of the package must be transformed with the same code
transformers and the same transformers configuration.

It is possible to build different ``.pyc`` files using different
optimizer tags. Example: ``fat`` for the default configuration and
``fat_inline`` for a different configuration with function inlining
enabled.

A package can contain ``.pyc`` files with different optimizer tags.

Install a package containing transformed .pyc files
---------------------------------------------------

It will be possible to install a package which contains transformed
``.pyc`` files.

All ``.pyc`` files with any optimizer tag contained in the package are
installed, not only for the current optimizer tag.

Build .pyc files when installing a package
------------------------------------------

If a package does not contain any ``.pyc`` files of the current
optimizer tag (or some ``.pyc`` files are missing), the ``.pyc`` are
created during the installation.

Code transformers of the optimizer tag are required. Otherwise, the
installation fails with an error.

Execute transformed code
------------------------

It will be possible to execute transformed code.

Raise an ``ImportError`` exception on import if the ``.pyc`` file of the
current optimizer tag is missing and the code transformers required to
transform the code are missing.

The interesting point here is that code transformers are not needed to
execute the transformed code if all required ``.pyc`` files are already
available.

Code transformer API
====================

A code transformer is a class with ``ast_transformer()`` and/or
``code_transformer()`` methods (API described below) and a ``name``
attribute.

For efficiency, do not define a ``code_transformer()`` or
``ast_transformer()`` method if it does nothing.

The ``name`` attribute (``str``) must be a short string used to identify
an optimizer. It is used to build a ``.pyc`` filename. The name must not
contain dots (``'.'``), dashes (``'-'``) or directory separators: dots
are used to separate fields in a ``.pyc`` filename and dashes are used
to join code transformer names to build the optimizer tag.

.. note::
   It would be nice to pass the fully qualified name of a module in the
   *context* when an AST transformer is used to transform a module on
   import, but it looks like the information is not available in



code_transformer() method
-------------------------

Prototype::

    def code_transformer(self, code, consts, names, lnotab, context):
        return (code, consts, names, lnotab)

Parameters:


* *code*: the bytecode (``bytes``)
* *consts*: a sequence of constants
* *names*: tuple of variable names
* *lnotab*: table mapping instruction offsets to line numbers
* *context*: an object with a ``filename`` attribute (``str``)

The code transformer is run after the compilation to bytecode.



ast_transformer() method
------------------------

Prototype::

    def ast_transformer(self, tree, context):
        return tree

Parameters:


* *tree*: an AST tree
* *context*: an object with a ``filename`` attribute (``str``)

It must return an AST tree. It can modify the AST tree in place, or
create a new AST tree.

The AST transformer is called after the creation of the AST by the
parser and before the compilation to bytecode. New attributes may be
added to *context* in the future.


Changes
=======

In short, add:

* ``-o OPTIM_TAG`` command line option
* ``ast.Constant``
* ``sys.get_code_transformers()``
* ``sys.implementation.optim_tag``
* ``sys.set_code_transformers(transformers)``

API to get/set code transformers
--------------------------------

Add new functions to register code transformers:

* ``sys.set_code_transformers(transformers)``: set the list of code
  transformers and update ``sys.implementation.optim_tag``
* ``sys.get_code_transformers()``: get the list of code
  transformers.

The order of code transformers matters. Running transformer A and then
transformer B can give a different output than running transformer B and
then transformer A.

Example to prepend a new code transformer::

    transformers = sys.get_code_transformers()
    transformers.insert(0, new_cool_transformer)
    sys.set_code_transformers(transformers)

All AST transformers are run sequentially (ex: the second transformer
gets the output of the first transformer as its input), and then all
bytecode transformers are run sequentially.
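
The chaining of AST transformers can be sketched in pure Python as follows. This is only an emulation for illustration: ``run_ast_transformers()`` and ``AppendSuffix`` are hypothetical names, the *context* object is modelled with ``types.SimpleNamespace``, and the bytecode transformer stage is left out.

```python
import ast
import types


class AppendSuffix:
    """Hypothetical transformer appending a suffix to every string constant."""

    def __init__(self, name, suffix):
        self.name = name
        self.suffix = suffix

    def ast_transformer(self, tree, context):
        for node in ast.walk(tree):
            if isinstance(node, ast.Constant) and isinstance(node.value, str):
                node.value += self.suffix
        return tree


def run_ast_transformers(source, transformers, filename="<string>"):
    """Parse an expression, apply each AST transformer in order, compile."""
    context = types.SimpleNamespace(filename=filename)
    tree = ast.parse(source, filename, mode="eval")
    for transformer in transformers:
        # Transformers without an ast_transformer() method are skipped.
        if hasattr(transformer, "ast_transformer"):
            tree = transformer.ast_transformer(tree, context)
    ast.fix_missing_locations(tree)
    return compile(tree, filename, "eval")


a = AppendSuffix("a", "-A")
b = AppendSuffix("b", "-B")
a_then_b = eval(run_ast_transformers("'x'", [a, b]))
b_then_a = eval(run_ast_transformers("'x'", [b, a]))
print(a_then_b, b_then_a)
```

The two results differ, which is the reason the order of the registered transformers is part of the optimizer tag.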

Optimizer tag
-------------

Changes:

* Add ``sys.implementation.optim_tag`` (``str``): optimization tag.
  The default optimization tag is ``'opt'``.
* Add a new ``-o OPTIM_TAG`` command line option to set
  ``sys.implementation.optim_tag``.

Changes on ``importlib``:

* ``importlib`` uses ``sys.implementation.optim_tag`` to build the
  ``.pyc`` filename when importing modules, instead of always using
  ``opt``. Remove also the special case for the optimizer level ``0``
  with the default optimizer tag ``'opt'`` to simplify the code.
* When loading a module, if the ``.pyc`` file is missing but the ``.py``
  is available, the ``.py`` is only used if code optimizers have the
  same optimizer tag as the current tag, otherwise an ``ImportError``
  exception is raised.

Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can
be compiled to import a module::

    def transformers_tag():
        transformers = sys.get_code_transformers()
        if not transformers:
            return 'noopt'
        # join the transformer names with dashes, in registration order
        return '-'.join(transformer.name
                        for transformer in transformers)

    def use_py():
        return (transformers_tag() == sys.implementation.optim_tag)

The order of ``sys.get_code_transformers()`` matters. For example, the
``fat`` transformer followed by the ``pythran`` transformer gives the
optimizer tag ``fat-pythran``.
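
Since ``sys.get_code_transformers()`` does not exist in any released Python, the tag computation can only be tried out with a stand-alone emulation. In the sketch below the transformer list and the current tag are passed explicitly; ``Fat``, ``Pythran`` and ``check_name()`` are hypothetical names used for illustration.

```python
class Fat:
    name = "fat"


class Pythran:
    name = "pythran"


def check_name(name):
    """Reject names that would make the .pyc filename ambiguous."""
    bad = {".", "-", "/", "\\"}.intersection(name)
    if bad:
        raise ValueError("invalid characters in transformer name %r: %s"
                         % (name, sorted(bad)))
    return name


def transformers_tag(transformers):
    if not transformers:
        return "noopt"
    return "-".join(check_name(t.name) for t in transformers)


def use_py(transformers, optim_tag):
    # The .py file may only be compiled if the registered transformers
    # produce exactly the requested optimizer tag.
    return transformers_tag(transformers) == optim_tag


tag = transformers_tag([Fat(), Pythran()])
print(tag, use_py([Fat()], "fat"), use_py([], "fat"))
```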

The behaviour of the ``importlib`` module is unchanged with the default
optimizer tag (``'opt'``).

Peephole optimizer
------------------

By default, ``sys.implementation.optim_tag`` is ``opt`` and
``sys.get_code_transformers()`` returns a list of one code transformer:
the peephole optimizer (optimize the bytecode).

Use ``-o noopt`` to disable the peephole optimizer. In this case, the
optimizer tag is ``noopt`` and no code transformer is registered.

Using the ``-o opt`` option has no effect.

AST enhancements
----------------

Enhancements to simplify the implementation of AST transformers:

* Add a new compiler flag ``PyCF_TRANSFORMED_AST`` to get the
  transformed AST. ``PyCF_ONLY_AST`` returns the AST before the
  transformers run.
* Add ``ast.Constant``: this type is not emitted by the compiler, but
  can be used in an AST transformer to simplify the code. It does not
  contain line number and column offset information on tuple or
  frozenset items.
* ``PyCodeObject.co_lnotab``: line number delta becomes signed to
  support moving instructions (note: need to modify MAGIC_NUMBER in
  importlib). Implemented in issue #26107.
* Enhance the bytecode compiler to support ``tuple`` and ``frozenset``
  constants. Currently, ``tuple`` and ``frozenset`` constants are
  created by the peephole transformer, after the bytecode compilation.
* ``marshal`` module: fix serialization of the empty frozenset singleton
* update ``Tools/parser/`` to support the new ``ast.Constant``
  node type


Examples
========

.pyc filenames
--------------

Examples of ``.pyc`` filenames for the ``os`` module.

With the default optimizer tag ``'opt'``:

===========================   ==================
.pyc filename                 Optimization level
===========================   ==================
``os.cpython-36.opt-0.pyc``                    0
``os.cpython-36.opt-1.pyc``                    1
``os.cpython-36.opt-2.pyc``                    2
===========================   ==================

With the ``'fat'`` optimizer tag:

===========================   ==================
.pyc filename                 Optimization level
===========================   ==================
``os.cpython-36.fat-0.pyc``                    0
``os.cpython-36.fat-1.pyc``                    1
``os.cpython-36.fat-2.pyc``                    2
===========================   ==================
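
The naming scheme shown in the two tables can be summarised by a small helper. This is a sketch derived only from the filenames above; ``pyc_filename()`` is a hypothetical function, not part of ``importlib``.

```python
def pyc_filename(module, cache_tag, optim_tag, level):
    """Build a .pyc filename following the pattern of the tables above:
    <module>.<cache tag>.<optimizer tag>-<optimization level>.pyc
    """
    return "{}.{}.{}-{}.pyc".format(module, cache_tag, optim_tag, level)


default_name = pyc_filename("os", "cpython-36", "opt", 0)
fat_name = pyc_filename("os", "cpython-36", "fat", 2)
print(default_name, fat_name)
```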

Bytecode transformer
--------------------

Scary bytecode transformer replacing all strings with
``"Ni! Ni! Ni!"``::

    import sys

    class BytecodeTransformer:
        name = "knights_who_say_ni"

        def code_transformer(self, code, consts, names, lnotab, context):
            consts = ['Ni! Ni! Ni!' if isinstance(const, str) else const
                      for const in consts]
            return (code, consts, names, lnotab)

    # replace existing code transformers with the new bytecode transformer
    sys.set_code_transformers([BytecodeTransformer()])

    # execute code which will be transformed by code_transformer()
    exec("print('Hello World!')")

Output::

    Ni! Ni! Ni!
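
The example above depends on the proposed ``sys.set_code_transformers()``. The same effect can be tried on an existing interpreter with the sketch below, which rewrites the constants of a code object by hand. It assumes Python 3.8 or newer for ``code.replace()``; ``replace_string_constants()`` is a hypothetical helper and nested code objects are not handled.

```python
import contextlib
import io


def replace_string_constants(code, replacement):
    """Return a copy of the code object with every string constant replaced."""
    new_consts = tuple(replacement if isinstance(const, str) else const
                       for const in code.co_consts)
    return code.replace(co_consts=new_consts)


code = compile("print('Hello World!')", "<example>", "exec")
transformed = replace_string_constants(code, "Ni! Ni! Ni!")

# Capture what the transformed code prints.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(transformed)
output = buffer.getvalue()
print(output, end="")
```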

AST transformer
---------------

Similarly to the bytecode transformer example, the AST transformer also
replaces all strings with ``"Ni! Ni! Ni!"``::

    import ast
    import sys

    class KnightsWhoSayNi(ast.NodeTransformer):
        def visit_Str(self, node):
            node.s = 'Ni! Ni! Ni!'
            return node

    class ASTTransformer:
        name = "knights_who_say_ni"

        def __init__(self):
            self.transformer = KnightsWhoSayNi()

        def ast_transformer(self, tree, context):
            self.transformer.visit(tree)
            return tree

    # replace existing code transformers with the new AST transformer
    sys.set_code_transformers([ASTTransformer()])

    # execute code which will be transformed by ast_transformer()
    exec("print('Hello World!')")

Output::

    Ni! Ni! Ni!
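
Without the proposed registration API, an equivalent transformation can be run by calling ``ast.parse()`` and ``compile()`` directly. The sketch below makes two assumptions that differ from the example above: it uses ``visit_Constant()`` (string literals are ``ast.Constant`` nodes in recent Python versions), and the helper ``run_transformed()`` is a hypothetical name.

```python
import ast
import contextlib
import io


class KnightsWhoSayNi(ast.NodeTransformer):
    def visit_Constant(self, node):
        if isinstance(node.value, str):
            new_node = ast.Constant(value="Ni! Ni! Ni!")
            return ast.copy_location(new_node, node)
        return node


def run_transformed(source):
    """Transform the AST of source, execute it, return what it printed."""
    tree = KnightsWhoSayNi().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(compile(tree, "<example>", "exec"), {})
    return buffer.getvalue()


output = run_transformed("print('Hello World!')")
print(output, end="")
```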

Other Python implementations
============================

The PEP 511 should be implemented by all Python implementations, but the
bytecode and the AST are not standardized.

By the way, even between minor versions of CPython, there are changes on
the AST API. There are differences, but only minor differences. It is
quite easy to write an AST transformer which works on Python 2.7 and
Python 3.5 for example.


Discussion
==========

* [Python-Dev] AST optimizer implemented in Python
  (August 2012)

Prior Art
=========

AST optimizers
--------------

In 2011, Eugene Toder proposed to rewrite some peephole optimizations in
a new AST optimizer: issue #11549, `Build-out an AST optimizer, moving
some functionality out of the peephole optimizer
<>`_.  The patch adds ``ast.Lit`` (it
was proposed to rename it to ``ast.Literal``).

In 2012, Victor Stinner wrote the `astoptimizer
<>`_ project, an AST optimizer
implementing various optimizations. Most interesting optimizations break
the Python semantics since no guard is used to disable optimization if
something changes.

In 2015, Victor Stinner wrote the `fatoptimizer
<>`_ project, an AST optimizer
specializing functions using guards.

The Issue #17515 `"Add sys.setasthook() to allow to use a custom AST"
optimizer <>`_ was a first attempt at an
API for code transformers, but specific to the AST.

Python Preprocessors
--------------------

* `MacroPy <>`_: MacroPy is an
  implementation of Syntactic Macros in the Python Programming Language.
  MacroPy provides a mechanism for user-defined functions (macros) to
  perform transformations on the abstract syntax tree (AST) of a Python
  program at import time.
* `pypreprocessor <>`_: C-style
  preprocessor directives in Python, like ``#define`` and ``#ifdef``

Bytecode transformers
---------------------

* `codetransformer <>`_:
  Bytecode transformers for CPython inspired by the ``ast`` module's
  ``NodeTransformer``.
* `byteplay <>`_: Byteplay lets you
  convert Python code objects into equivalent objects which are easy to
  play with, and lets you convert those objects back into living Python
  code objects. It's useful for applying crazy transformations on Python
  functions, and is also useful in learning Python byte code
  intricacies. See the byteplay documentation.

See also:

* `BytecodeAssembler <>`_


Copyright
=========

This document has been placed in the public domain.

From abarnert at  Fri Jan 15 11:12:58 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 15 Jan 2016 10:12:58 -0600
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

You linked to PEP 510 #changes. I think you wanted

Sent from my iPhone

> On Jan 15, 2016, at 10:10, Victor Stinner <victor.stinner at> wrote:
> Hi,
> This PEP 511 is part of a serie of 3 PEP (509, 510, 511) adding an API
> to implement a static Python optimizer specializing functions with
> guards.
> If the PEP is accepted, it will solve a long list of issues, some
> issues are old, like #1346238 which is 11 years old ;-) I found 12
> issues:
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> I worked to make the PEP more generic that "this hook is written for
> FAT Python". Please read the full PEP to see a long list of existing
> usages in Python of code transformers.
> You may read again the discussion which occurred 4 years ago about the
> same topic:
> (the thread starts with an idea of AST optimizer, but is moves quickly
> to a generic API to transform the code)
> Thanks to Red Hat for giving me time to experiment on this.
> Victor
> HTML version:
> PEP: 511
> Title: API for code transformers
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner <victor.stinner at>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 4-January-2016
> Python-Version: 3.6
> Abstract
> ========
> Propose an API to register bytecode and AST transformers. Add also ``-o
> OPTIM_TAG`` command line option to change ``.pyc`` filenames, ``-o
> noopt`` disables the peephole optimizer. Raise an ``ImportError``
> exception on import if the ``.pyc`` file is missing and the code
> transformers required to transform the code are missing.  code
> transformers are not needed code transformed ahead of time (loaded from
> ``.pyc`` files).
> Rationale
> =========
> Python does not provide a standard way to transform the code. Projects
> transforming the code use various hooks. The MacroPy project uses an
> import hook: it adds its own module finder in ``sys.meta_path`` to
> hook its AST transformer. Another option is to monkey-patch the
> builtin ``compile()`` function. There are even more options to
> hook a code transformer.
> Python 3.4 added a ``compile_source()`` method to
> ````. But code transformation is wider than
> just importing modules, see described use cases below.
> Writing an optimizer or a preprocessor is out of the scope of this PEP.
> Usage 1: AST optimizer
> ----------------------
> Transforming an Abstract Syntax Tree (AST) is a convenient
> way to implement an optimizer. It's easier to work on the AST than
> working on the bytecode, AST contains more information and is more high
> level.
> Since the optimization can done ahead of time, complex but slow
> optimizations can be implemented.
> Example of optimizations which can be implemented with an AST optimizer:
> * `Copy propagation
>  <>`_:
>  replace ``x=1; y=x`` with ``x=1; y=1``
> * `Constant folding
>  <>`_:
>  replace ``1+1`` with ``2``
> * `Dead code elimination
>  <>`_
> Using guards (see the `PEP 510
> <>`_), it is possible to
> implement a much wider choice of optimizations. Examples:
> * Simplify iterable: replace ``range(3)`` with ``(0, 1, 2)`` when used
>  as iterable
> * `Loop unrolling <>`_
> * Call pure builtins: replace ``len("abc")`` with ``3``
> * Copy used builtin symbols to constants
> * See also `optimizations implemented in fatoptimizer
>  <>`_,
>  a static optimizer for Python 3.6.
> The following issues can be implemented with an AST optimizer:
> * `Issue #1346238
>  <>`_: A constant folding
>  optimization pass for the AST
> * `Issue #2181 <>`_:
>  optimize out local variables at end of function
> * `Issue #2499 <>`_:
>  Fold unary + and not on constants
> * `Issue #4264 <>`_:
>  Patch: optimize code to use LIST_APPEND instead of calling list.append
> * `Issue #7682 <>`_:
>  Optimisation of if with constant expression
> * `Issue #10399 <>`_: AST
>  Optimization: inlining of function calls
> * `Issue #11549 <>`_:
>  Build-out an AST optimizer, moving some functionality out of the
>  peephole optimizer
> * `Issue #17068 <>`_:
>  peephole optimization for constant strings
> * `Issue #17430 <>`_:
>  missed peephole optimization
> Usage 2: Preprocessor
> ---------------------
> A preprocessor can be easily implemented with an AST transformer. A
> preprocessor has various and different usages.
> Some examples:
> * Remove debug code like assertions and logs to make the code faster to
>  run it for production.
> * `Tail-call Optimization <>`_
> * Add profiling code
> * `Lazy evaluation <>`_:
>  see `lazy_python <>`_
>  (bytecode transformer) and `lazy macro of MacroPy
>  <>`_ (AST transformer)
> * Change dictionary literals into collection.OrderedDict instances
> * Declare constants: see `@asconstants of codetransformer
>  <>`_
> * Domain Specific Language (DSL) like SQL queries. The
>  Python language itself doesn't need to be modified. Previous attempts
>  to implement DSL for SQL like `PEP 335 - Overloadable Boolean
>  Operators <>`_ was rejected.
> * Pattern Matching of functional languages
> * String Interpolation, but `PEP 498 -- Literal String Interpolation
>  <>`_ was merged into Python
>  3.6.
> `MacroPy <>`_ has a long list of
> examples and use cases.
> This PEP does not add any new code transformer. Using a code transformer
> will require an external module and to register it manually.
> See also `PyXfuscator <>`_: Python
> obfuscator, deobfuscator, and user-assisted decompiler.
> Usage 3: Disable all optimization
> ---------------------------------
> Ned Batchelder asked to add an option to disable the peephole optimizer
> because it makes code coverage more difficult to implement. See the
> discussion on the python-ideas mailing list: `Disable all peephole
> optimizations
> <>`_.
> This PEP adds a new ``-o noopt`` command line option to disable the
> peephole optimizer. In Python, it's as easy as::
>    sys.set_code_transformers([])
> It will fix the `Issue #2506 <>`_: Add
> mechanism to disable optimizations.
> Usage 4: Write new bytecode optimizers in Python
> ------------------------------------------------
> Python 3.6 optimizes the code using a peephole optimizer. By
> definition, a peephole optimizer has a narrow view of the code and so
> can only implement basic optimizations. The optimizer rewrites the
> bytecode. It is difficult to enhance it, because it written in C.
> With this PEP, it becomes possible to implement a new bytecode optimizer
> in pure Python and experiment new optimizations.
> Some optimizations are easier to implement on the AST like constant
> folding, but optimizations on the bytecode are still useful. For
> example, when the AST is compiled to bytecode, useless jumps can be
> emited because the compiler is naive and does not try to optimize
> anything.
> Use Cases
> =========
> This section give examples of use cases explaining when and how code
> transformers will be used.
> Interactive interpreter
> -----------------------
> It will be possible to use code transformers with the interactive
> interpreter which is popular in Python and commonly used to demonstrate
> Python.
> The code is transformed at runtime and so the interpreter can be slower
> when expensive code transformers are used.
> Build a transformed package
> ---------------------------
> It will be possible to build a package of the transformed code.
> A transformer can have a configuration. The configuration is not stored
> in the package.
> All ``.pyc`` files of the package must be transformed with the same code
> transformers and the same transformers configuration.
> It is possible to build different ``.pyc`` files using different
> optimizer tags. Example: ``fat`` for the default configuration and
> ``fat_inline`` for a different configuration with function inlining
> enabled.
> A package can contain ``.pyc`` files with different optimizer tags.
> Install a package containing transformed .pyc files
> ---------------------------------------------------
> It will be possible to install a package which contains transformed
> ``.pyc`` files.
> All ``.pyc`` files with any optimizer tag contained in the package are
> installed, not only for the current optimizer tag.
> Build .pyc files when installing a package
> ------------------------------------------
> If a package does not contain any ``.pyc`` files of the current
> optimizer tag (or some ``.pyc`` files are missing), the ``.pyc`` are
> created during the installation.
> Code transformers of the optimizer tag are required. Otherwise, the
> installation fails with an error.
> Execute transformed code
> ------------------------
> It will be possible to execute transformed code.
> Raise an ``ImportError`` exception on import if the ``.pyc`` file of the
> current optimizer tag is missing and the code transformers required to
> transform the code are missing.
> The interesting point here is that code transformers are not needed to
> execute the transformed code if all required ``.pyc`` files are already
> available.
> Code transformer API
> ====================
> A code transformer is a class with ``ast_transformer()`` and/or
> ``code_transformer()`` methods (API described below) and a ``name``
> attribute.
> For efficiency, do not define a ``code_transformer()`` or
> ``ast_transformer()`` method if it does nothing.
> The ``name`` attribute (``str``) must be a short string used to identify
> an optimizer. It is used to build a ``.pyc`` filename. The name must not
> contain dots (``'.'``), dashes (``'-'``) or directory separators: dots
> are used to separate fields in a ``.pyc`` filename and dashes are used
> to join code transformer names to build the optimizer tag.
> .. note::
>   It would be nice to pass the fully qualified name of a module in the
>   *context* when an AST transformer is used to transform a module on
>   import, but it looks like the information is not available in
>   ``PyParser_ASTFromStringObject()``.
> code_transformer()
> ------------------
> Prototype::
>    def code_transformer(code, consts, names, lnotab, context):
>        ...
>        return (code, consts, names, lnotab)
> Parameters:
> * *code*: the bytecode (``bytes``)
> * *consts*: a sequence of constants
> * *names*: tuple of variable names
> * *lnotab*: table mapping instruction offsets to line numbers
>  (``bytes``)
> The code transformer is run after the compilation to bytecode.
> ast_transformer()
> ------------------
> Prototype::
>    def ast_transformer(tree, context):
>        ...
>        return tree
> Parameters:
> * *tree*: an AST tree
> * *context*: an object with a ``filename`` attribute (``str``)
> It must return an AST tree. It can modify the AST tree in place, or
> create a new AST tree.
> The AST transformer is called after the creation of the AST by the
> parser and before the compilation to bytecode. New attributes may be
> added to *context* in the future.
> Changes
> =======
> In short, add:
> * ``-o OPTIM_TAG`` command line option
> * ``ast.Constant``
> * ``sys.get_code_transformers()``
> * ``sys.implementation.optim_tag``
> * ``sys.set_code_transformers(transformers)``
> API to get/set code transformers
> --------------------------------
> Add new functions to register code transformers:
> * ``sys.set_code_transformers(transformers)``: set the list of code
>  transformers and update ``sys.implementation.optim_tag``
> * ``sys.get_code_transformers()``: get the list of code
>  transformers.
> The order of code transformers matters. Running transformer A and then
> transformer B can give a different output than running transformer B and
> then transformer A.
> Example to prepend a new code transformer::
>    transformers = sys.get_code_transformers()
>    transformers.insert(0, new_cool_transformer)
>    sys.set_code_transformers(transformers)
> All AST transformers are run sequentially (ex: the second transformer
> gets the output of the first transformer), and then all bytecode
> transformers are run sequentially.
> Optimizer tag
> -------------
> Changes:
> * Add ``sys.implementation.optim_tag`` (``str``): optimization tag.
>  The default optimization tag is ``'opt'``.
> * Add a new ``-o OPTIM_TAG`` command line option to set
>  ``sys.implementation.optim_tag``.
> Changes on ``importlib``:
> * ``importlib`` uses ``sys.implementation.optim_tag`` to build the
>  ``.pyc`` filename when importing modules, instead of always using
>  ``opt``. Remove also the special case for the optimizer level ``0``
>  with the default optimizer tag ``'opt'`` to simplify the code.
> * When loading a module, if the ``.pyc`` file is missing but the ``.py``
>  is available, the ``.py`` is only used if code optimizers have the
>  same optimizer tag as the current tag, otherwise an ``ImportError``
>  exception is raised.
> Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can
> be compiled to import a module::
>    def transformers_tag():
>        transformers = sys.get_code_transformers()
>        if not transformers:
>            return 'noopt'
>        return '-'.join(transformer.name
>                        for transformer in transformers)
>    def use_py():
>        return (transformers_tag() == sys.implementation.optim_tag)
> The order of ``sys.get_code_transformers()`` matters. For example, the
> ``fat`` transformer followed by the ``pythran`` transformer gives the
> optimizer tag ``fat-pythran``.
> The behaviour of the ``importlib`` module is unchanged with the default
> optimizer tag (``'opt'``).
> Peephole optimizer
> ------------------
> By default, ``sys.implementation.optim_tag`` is ``opt`` and
> ``sys.get_code_transformers()`` returns a list of one code transformer:
> the peephole optimizer (optimize the bytecode).
> Use ``-o noopt`` to disable the peephole optimizer. In this case, the
> optimizer tag is ``noopt`` and no code transformer is registered.
> Using the ``-o opt`` option has no effect.
> AST enhancements
> ----------------
> Enhancements to simplify the implementation of AST transformers:
> * Add a new compiler flag ``PyCF_TRANSFORMED_AST`` to get the
>  transformed AST. ``PyCF_ONLY_AST`` returns the AST before the
>  transformers.
> * Add ``ast.Constant``: this type is not emitted by the compiler, but
>  can be used in an AST transformer to simplify the code. It does not
>  contain line number and column offset information on tuple or
>  frozenset items.
> * ``PyCodeObject.co_lnotab``: line number delta becomes signed to
>  support moving instructions (note: need to modify MAGIC_NUMBER in
>  importlib). Implemented in the `issue #26107
>  <>`_
> * Enhance the bytecode compiler to support ``tuple`` and ``frozenset``
>  constants. Currently, ``tuple`` and ``frozenset`` constants are
>  created by the peephole transformer, after the bytecode compilation.
> * ``marshal`` module: fix serialization of the empty frozenset singleton
> * update ``Tools/parser/`` to support the new ``ast.Constant``
>  node type
> Examples
> ========
> .pyc filenames
> --------------
> Example of ``.pyc`` filenames of the ``os`` module.
> With the default optimizer tag ``'opt'``:
> ===========================   ==================
> .pyc filename                 Optimization level
> ===========================   ==================
> ``os.cpython-36.opt-0.pyc``                    0
> ``os.cpython-36.opt-1.pyc``                    1
> ``os.cpython-36.opt-2.pyc``                    2
> ===========================   ==================
> With the ``'fat'`` optimizer tag:
> ===========================   ==================
> .pyc filename                 Optimization level
> ===========================   ==================
> ``os.cpython-36.fat-0.pyc``                    0
> ``os.cpython-36.fat-1.pyc``                    1
> ``os.cpython-36.fat-2.pyc``                    2
> ===========================   ==================
> Bytecode transformer
> --------------------
> Scary bytecode transformer replacing all strings with
> ``"Ni! Ni!  Ni!"``::
>    import sys
>    class BytecodeTransformer:
>        name = "knights_who_say_ni"
>        def code_transformer(self, code, consts, names, lnotab, context):
>            consts = ['Ni! Ni! Ni!' if isinstance(const, str) else const
>                      for const in consts]
>            return (code, consts, names, lnotab)
>    # replace existing code transformers with the new bytecode transformer
>    sys.set_code_transformers([BytecodeTransformer()])
>    # execute code which will be transformed by code_transformer()
>    exec("print('Hello World!')")
> Output::
>    Ni! Ni! Ni!
> AST transformer
> ---------------
> Similarly to the bytecode transformer example, the AST transformer also
> replaces all strings with ``"Ni! Ni! Ni!"``::
>    import ast
>    import sys
>    class KnightsWhoSayNi(ast.NodeTransformer):
>        def visit_Str(self, node):
>            node.s = 'Ni! Ni! Ni!'
>            return node
>    class ASTTransformer:
>        name = "knights_who_say_ni"
>        def __init__(self):
>            self.transformer = KnightsWhoSayNi()
>        def ast_transformer(self, tree, context):
>            self.transformer.visit(tree)
>            return tree
>    # replace existing code transformers with the new AST transformer
>    sys.set_code_transformers([ASTTransformer()])
>    # execute code which will be transformed by ast_transformer()
>    exec("print('Hello World!')")
> Output::
>    Ni! Ni! Ni!
> Other Python implementations
> ============================
> PEP 511 should be implemented by all Python implementations, but the
> bytecode and the AST are not standardized.
> By the way, even between minor versions of CPython, there are changes to
> the AST API. There are differences, but only minor differences. It is
> quite easy to write an AST transformer which works on Python 2.7 and
> Python 3.5 for example.
> Discussion
> ==========
> * `[Python-Dev] AST optimizer implemented in Python
>  <>`_
>  (August 2012)
> Prior Art
> =========
> AST optimizers
> --------------
> In 2011, Eugene Toder proposed to rewrite some peephole optimizations in
> a new AST optimizer: issue #11549, `Build-out an AST optimizer, moving
> some functionality out of the peephole optimizer
> <>`_.  The patch adds ``ast.Lit`` (it
> was proposed to rename it to ``ast.Literal``).
> In 2012, Victor Stinner wrote the `astoptimizer
> <>`_ project, an AST optimizer
> implementing various optimizations. Most interesting optimizations break
> the Python semantics since no guard is used to disable optimization if
> something changes.
> In 2015, Victor Stinner wrote the `fatoptimizer
> <>`_ project, an AST optimizer
> specializing functions using guards.
> The Issue #17515 `"Add sys.setasthook() to allow to use a custom AST
> optimizer" <>`_ was a first attempt at an
> API for code transformers, but specific to AST.
> Python Preprocessors
> --------------------
> * `MacroPy <>`_: MacroPy is an
>  implementation of Syntactic Macros in the Python Programming Language.
>  MacroPy provides a mechanism for user-defined functions (macros) to
>  perform transformations on the abstract syntax tree (AST) of a Python
>  program at import time.
> * `pypreprocessor <>`_: C-style
>  preprocessor directives in Python, like ``#define`` and ``#ifdef``
> Bytecode transformers
> ---------------------
> * `codetransformer <>`_:
>  Bytecode transformers for CPython inspired by the ``ast`` module's
>  ``NodeTransformer``.
> * `byteplay <>`_: Byteplay lets you
>  convert Python code objects into equivalent objects which are easy to
>  play with, and lets you convert those objects back into living Python
>  code objects. It's useful for applying crazy transformations on Python
>  functions, and is also useful in learning Python byte code
>  intricacies. See `byteplay documentation
>  <>`_.
> See also:
> * `BytecodeAssembler <>`_
> Copyright
> =========
> This document has been placed in the public domain.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From victor.stinner at  Fri Jan 15 12:11:39 2016
From: victor.stinner at (Victor Stinner)
Date: Fri, 15 Jan 2016 18:11:39 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

I have a fully working implementation of PEPs 509, 510 and 511 (all
together). You can install it to play with it if you want ;-)

Get and compile patched (FAT) Python with:
hg clone
cd fatpython
./configure && make

Enjoy slow and non optimized bytecode :-)
$ ./python -o noopt -c 'import dis; dis.dis(compile("1+1", "test", "exec"))'
  1           0 LOAD_CONST               0 (1)
              3 LOAD_CONST               0 (1)
              6 BINARY_ADD
              7 POP_TOP
              8 LOAD_CONST               1 (None)
             11 RETURN_VALUE
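
For comparison, here is a minimal sketch of the same kind of inspection
on a stock CPython, where the -o option does not exist. It assumes a
CPython 3.x interpreter that folds constants at compile time; the
expression and the variable name x are only illustrative, and the exact
opcodes printed differ between versions.

```python
import dis

# Compile a tiny module; a stock CPython is expected to fold the
# multiplication at compile time instead of emitting arithmetic opcodes.
code = compile("x = 60 * 60 * 24", "test", "exec")

# If the expression was folded, the result is in the constants table.
folded = 86400 in code.co_consts

# Show the bytecode that was actually generated.
dis.dis(code)
print("constant folded:", folded)
```

With "-o noopt" on the patched interpreter, the unfolded operands would
be loaded and combined at run time, as in the listing above.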

Ok, now if you want to play with fat & fatoptimizer modules (FAT Python):
./python -m venv ENV
cd ENV
git clone
git clone
(cd fat; ../bin/python install)
(cd fatoptimizer; ../bin/python install)
cd ..

I'm not using a virtual environment for my development, I prefer to
manually copy the fatoptimizer/fatoptimizer/ directory and the built .so
file of the fat module into the Lib/ directory of the standard library.
If you installed the patched Python into /opt/fatpython (./configure
--prefix=/opt/fatpython && make && sudo make install), you can also
use "python install" in fat/ and fatoptimizer/ to install
them easily.

The drawback of the virtualenv is that it's easy to use the wrong
python (./python vs ENV/bin/python) and don't have FAT Python enabled
because of which ignores silently
import errors in sitecustomize...

Ensure that FAT Python is enabled with:
$ ./python -X fat -c 'import sys; print(sys.implementation.optim_tag)'
You must get "fat-opt" (and not "opt").

Note: The optimizer tag is "fat-opt" and not "fat" because
fatoptimizer keeps the peephole optimizer.

Enable FAT Python using the "-X fat" command line option:
$ ENV/bin/python -X fat
>>> def func(): return len("abc")

>>> import dis
>>> dis.dis(func)
  1           0 LOAD_GLOBAL              0 (len)
              3 LOAD_CONST               1 ('abc')
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 RETURN_VALUE

>>> import fat
>>> fat.get_specialized(func)
[(<code object func at 0x7f9d3155b1e0, file "<stdin>", line 1>,
[<fat.GuardBuiltins object at 0x7f9d39191198>])]

>>> dis.dis(fat.get_specialized(func)[0][0])
  1           0 LOAD_CONST               1 (3)
              3 RETURN_VALUE

Play with microbenchmarks:
$ ENV/bin/python -m timeit -s 'def f(): return len("abc")' 'f()'
10000000 loops, best of 3: 0.122 usec per loop

$ ENV/bin/python -X fat -m timeit -s 'def f(): return len("abc")' 'f()'
10000000 loops, best of 3: 0.0932 usec per loop
Oh look! It's faster without having to touch the code ;-)

I'm using Lib/ to register the optimizer if -X fat is used:
import sys
if sys._xoptions.get('fat'):
    import fatoptimizer; fatoptimizer._register()

If you want to run optimized code without registering the optimizer,
it doesn't work because .pyc are missing:
$ ENV/bin/python -o fat-opt
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: missing AST transformers for
'.../Lib/encodings/': optim_tag='fat', transformers

You have to compile optimized .pyc files:
# the optimizer is slow, so add -v to enable fatoptimizer logs for more fun
ENV/bin/python -X fat -v -m compileall
# why does compileall not compile encodings/*.py?
ENV/bin/python -X fat -m py_compile

Finally, enjoy optimized code with no registered optimizer:
# hum, use maybe ENV/bin/activate instead of my magic tricks
$ export PYTHONPATH=ENV/lib/python3.6/site-packages/

$ ENV/bin/python -o fat-opt -c 'import sys;
print(sys.implementation.optim_tag, sys.get_code_transformers())'
fat-opt []

Remember that you cannot import .py files in this case, only .pyc:
$ touch
$ ENV/bin/python -o fat-opt -c 'import x'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: missing AST transformers for '.../':
optim_tag='fat-opt', transformers tag='noopt'
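
The check behind that ImportError can be sketched in plain Python,
without the patched interpreter. This is only an illustration of the
pseudo-code from the PEP: the transformer list and the current tag are
passed in explicitly instead of coming from sys.get_code_transformers()
and sys.implementation.optim_tag, the Transformer class is a
hypothetical stand-in, and 'opt' as the name of the peephole optimizer
is an assumption based on the "fat-opt" tag.

```python
class Transformer:
    """Hypothetical stand-in for a registered code transformer."""

    def __init__(self, name):
        self.name = name


def transformers_tag(transformers):
    # No transformer registered: the tag is 'noopt', as in the PEP.
    if not transformers:
        return 'noopt'
    # Names are joined with dashes, in registration order.
    return '-'.join(transformer.name for transformer in transformers)


def use_py(transformers, optim_tag):
    # A .py file may only be compiled if the registered transformers
    # produce the optimizer tag that is currently requested.
    return transformers_tag(transformers) == optim_tag


registered = [Transformer('fat'), Transformer('opt')]
print(transformers_tag(registered))
print(use_py(registered, 'fat-opt'))
print(use_py([], 'fat-opt'))
```

The last call corresponds to the failing "import x": no transformer is
registered, so the transformers tag is 'noopt' while the requested tag
is 'fat-opt', and the .py file must not be used.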


From brett at  Fri Jan 15 12:22:08 2016
From: brett at (Brett Cannon)
Date: Fri, 15 Jan 2016 17:22:08 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, 15 Jan 2016 at 08:11 Victor Stinner <victor.stinner at>

> [SNIP]

> Optimizer tag
> -------------
> Changes:
> * Add ``sys.implementation.optim_tag`` (``str``): optimization tag.
>   The default optimization tag is ``'opt'``.
> * Add a new ``-o OPTIM_TAG`` command line option to set
>   ``sys.implementation.optim_tag``.
> Changes on ``importlib``:
> * ``importlib`` uses ``sys.implementation.optim_tag`` to build the
>   ``.pyc`` filename when importing modules, instead of always using
>   ``opt``. Remove also the special case for the optimizer level ``0``
>   with the default optimizer tag ``'opt'`` to simplify the code.
> * When loading a module, if the ``.pyc`` file is missing but the ``.py``
>   is available, the ``.py`` is only used if code optimizers have the
>   same optimizer tag as the current tag, otherwise an ``ImportError``
>   exception is raised.
> Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can
> be compiled to import a module::
>     def transformers_tag():
>         transformers = sys.get_code_transformers()
>         if not transformers:
>             return 'noopt'
>         return '-'.join(transformer.name
>                         for transformer in transformers)
>     def use_py():
>         return (transformers_tag() == sys.implementation.optim_tag)
> The order of ``sys.get_code_transformers()`` matters. For example, the
> ``fat`` transformer followed by the ``pythran`` transformer gives the
> optimizer tag ``fat-pythran``.
> The behaviour of the ``importlib`` module is unchanged with the default
> optimizer tag (``'opt'``).

I just wanted to point out to people that the key part of this PEP is the
change in semantics of `-O` accepting an argument. Without this change
there is no way to cause import to pick up on optimized .pyc files that you
want it to use without abusing pre-existing .pyc filenames.

This also means that everything else is optional. That doesn't mean it
shouldn't be considered, mind you, as it makes using AST and bytecode
transformers more practical. But some `-O` change that allows user-defined
optimization tags is needed for any of this to work reasonably. From there
it's theoretically possible for someone to write their own compileall that
pre-compiles all Python code to .pyc files with a specific optimization tag
which they specify with `-O` using their own AST and bytecode transformers
and hence not need the transformation features built into sys/import.

I should also point out that this does get tricky in terms of how to handle
the stdlib if you have not pre-compiled it, e.g., if the first module
imported by Python is the encodings module then how to make sure the AST
optimizers are ready to go by the time that import happens?

And lastly, Victor proposes that all .pyc files get an optimization tag.
While there is nothing technically wrong with that, PEP 488
<> purposefully didn't do that in
the default case for backwards-compatibility, so that will need to be at
least mentioned in the PEP.
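
For reference, the PEP 488 naming that exists today can be observed with
importlib.util.cache_from_source(). A small sketch, assuming CPython 3.5
or newer; the file name os.py is only an example path and nothing is
read from disk:

```python
import importlib.util

# Default case: an empty optimization string adds no "opt-" part.
plain = importlib.util.cache_from_source('os.py', optimization='')

# Optimization level 1 (as with -O) adds an "opt-1" part.
level1 = importlib.util.cache_from_source('os.py', optimization=1)

print(plain)
print(level1)
```

The first name carries no optimization tag at all, which is the
backwards-compatible default case that the PEP 511 draft would change.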

From victor.stinner at  Fri Jan 15 12:40:13 2016
From: victor.stinner at (Victor Stinner)
Date: Fri, 15 Jan 2016 18:40:13 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-15 18:22 GMT+01:00 Brett Cannon <brett at>:
> I just wanted to point out to people that the key part of this PEP is the
> change in semantics of `-O` accepting an argument.

To be exact, it's a new "-o arg" option, it's different from -O and
-OO (uppercase). Since I don't know what to do with -O and -OO, I
simply kept them :-D

> I should also point out that this does get tricky in terms of how to handle
> the stdlib if you have not pre-compiled it, e.g., if the first module
> imported by Python is the encodings module then how to make sure the AST
> optimizers are ready to go by the time that import happens?

Since importlib reads sys.implementation.optim_tag at each import, it
works fine.

For example, you start with "opt" optimizer tag. You import everything
needed for fatoptimizer. Then calling sys.set_code_transformers() will
set a new optimizer tag (ex: "fat-opt"). But it works since the
required code transformers are now available.

The tricky part is more when you want to deploy an application without
the code transformer, you have to ensure that all .py files are
compiled to .pyc. But there are no technical issues in compiling them,
it's more of a practical issue.

See my second email with a lot of commands, I showed how .pyc are
created with different .pyc filenames. Or follow my commands to try my
"fatpython" fork to play yourself with the code ;-)

> And lastly, Victor proposes that all .pyc files get an optimization tag.
> While there is nothing technically wrong with that, PEP 488 purposefully
> didn't do that in the default case for backwards-compatibility, so that will
> need to be at least mentioned in the PEP.

The PEP already contains:
"Remove also the special case for the optimizer level 0 with the
default optimizer tag 'opt' to simplify the code."

Code relying on the exact .pyc filename (like unit tests) already has
to be modified to use the optimizer tag. It's just an opportunity to
simplify the code. I don't really care about this specific change ;-)


From ethan at  Fri Jan 15 13:22:56 2016
From: ethan at (Ethan Furman)
Date: Fri, 15 Jan 2016 10:22:56 -0800
Subject: [Python-ideas] Boolean value of an Enum member
Message-ID: <>

When Enum was being designed one of the questions considered was where 
to start autonumbering: zero or one.

As I remember the discussion we chose not to start with zero because we 
didn't want an enum member to be False by default, and having a member 
with value 0 be True was discordant.  So the functional API starts with 
1 unless overridden.  In fact, according to the Enum docs:

    The reason for defaulting to ``1`` as the starting number and
    not ``0`` is that ``0`` is ``False`` in a boolean sense, but
    enum members all evaluate to ``True``.

However, if the Enum is combined with some other type (str, int, float, 
etc), then most behaviour is determined by that type -- including 
boolean evaluation.  So the empty string, 0 values, etc, will cause that 
Enum member to evaluate as False.
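
A minimal sketch of the difference, using only the standard enum module;
the class and member names are made up for illustration:

```python
from enum import Enum, IntEnum


class Color(Enum):
    # Pure Enum: the value 0 does not make the member false.
    BLACK = 0
    WHITE = 1


class Level(IntEnum):
    # Mixed with int: truthiness comes from int.
    NONE = 0
    SOME = 1


print(bool(Color.BLACK))
print(bool(Level.NONE))
print(bool(Level.SOME))
```

Color.BLACK is true even though its value is 0, while Level.NONE is
false because int decides.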

So the question now is:  for a standard Enum (meaning no other type 
besides Enum is involved) should __bool__ look to the value of the Enum 
member to determine True/False, or should we always be True by default 
and make the Enum creator add their own __bool__ if they want something
different?

On the one hand we have backwards compatibility, which will take a 
version to change.

On the other hand we have a pretty basic difference in how zero/empty is 
handled between "pure" Enums and "mixed" Enums.

On the gripping hand we have . . .

Please respond with your thoughts on changing pure Enums to match mixed 
Enums or any experience you have had with relying on the "always True" 
behaviour or if you have implemented your own __bool__ to match the 
standard True/False meanings or if you have implemented your own 
__bool__ to match some other scheme entirely.


From guido at  Fri Jan 15 13:28:40 2016
From: guido at (Guido van Rossum)
Date: Fri, 15 Jan 2016 10:28:40 -0800
Subject: [Python-ideas] Boolean value of an Enum member
In-Reply-To: <>
References: <>
Message-ID: <>

Honestly I think it's too late to change. The proposal to change plain
Enums to False when their value is zero (or falsey) would be a huge
backward incompatibility. I don't think there's a reasonable path forward,
and also don't think there's a big reason to regret the current semantics.

On Fri, Jan 15, 2016 at 10:22 AM, Ethan Furman <ethan at> wrote:

> When Enum was being designed one of the questions considered was where to
> start autonumbering: zero or one.
> As I remember the discussion we chose not to start with zero because we
> didn't want an enum member to be False by default, and having a member with
> value 0 be True was discordant.  So the functional API starts with 1 unless
> overridden.  In fact, according to the Enum docs:
>    The reason for defaulting to ``1`` as the starting number and
>    not ``0`` is that ``0`` is ``False`` in a boolean sense, but
>    enum members all evaluate to ``True``.
> However, if the Enum is combined with some other type (str, int, float,
> etc), then most behaviour is determined by that type -- including boolean
> evaluation.  So the empty string, 0 values, etc, will cause that Enum
> member to evaluate as False.
> So the question now is:  for a standard Enum (meaning no other type
> besides Enum is involved) should __bool__ look to the value of the Enum
> member to determine True/False, or should we always be True by default and
> make the Enum creator add their own __bool__ if they want something
> different?
> On the one hand we have backwards compatibility, which will take a version
> to change.
> On the other hand we have a pretty basic difference in how zero/empty is
> handled between "pure" Enums and "mixed" Enums.
> On the gripping hand we have . . .
> Please respond with your thoughts on changing pure Enums to match mixed
> Enums or any experience you have had with relying on the "always True"
> behaviour or if you have implemented your own __bool__ to match the
> standard True/False meanings or if you have implemented your own __bool__
> to match some other scheme entirely.
> --
> ~Ethan~

--Guido van Rossum (

From barry at  Fri Jan 15 13:32:58 2016
From: barry at (Barry Warsaw)
Date: Fri, 15 Jan 2016 13:32:58 -0500
Subject: [Python-ideas] [Python-Dev] Boolean value of an Enum member
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 15, 2016, at 10:22 AM, Ethan Furman wrote:

>So the question now is: for a standard Enum (meaning no other type besides
>Enum is involved) should __bool__ look to the value of the Enum member to
>determine True/False, or should we always be True by default and make the
>Enum creator add their own __bool__ if they want something different?

The latter.  I think in general enums are primarily a symbolic value and don't
have truthiness.  It's also so easy to override when you define the enum that
it's not worth changing the current behavior.
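
As a rough sketch of how small that override is (the Switch class is a
made-up example, not something from the stdlib):

```python
from enum import Enum


class Switch(Enum):
    OFF = 0
    ON = 1

    def __bool__(self):
        # Opt in to value-based truthiness for this Enum only.
        return bool(self.value)


print(bool(Switch.OFF))
print(bool(Switch.ON))
```

Enums that do not define __bool__ keep the current "always True"
behaviour.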


From abarnert at  Fri Jan 15 14:41:15 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 15 Jan 2016 13:41:15 -0600
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 15, 2016, at 10:10, Victor Stinner <victor.stinner at> wrote:
> This PEP 511 is part of a series of 3 PEPs (509, 510, 511) adding an API
> to implement a static Python optimizer specializing functions with
> guards.

Some thoughts (and I realize that for many of these the answer will just be "that's out of scope for this PEP"):

* You can register transformers in any order, and they're run in the order specified, first all the AST transformers, then all the code transformers. That's very weird; it seems like it would be conceptually simpler to have a list of AST transformers, then a separate list of code transformers.

* Why are transformers objects with ast_transformer and code_transformer methods, but those methods don't take self? (Are they automatically static methods, like __new__?) It seems like the only advantage to require attaching them to a class is to associate each one with a name; surely there's a simpler way to do that. And is there ever a good use case for putting both in the same class, given that the code transformer isn't going to run on the output of the AST transformer but rather on the output of all subsequent AST transformers and all preceding code transformers? Why not just let them be functions, and use the function name (or maybe have a separate attribute to override that, which a simple decorator can apply)?

* Why does the code transformer only take consts and names? Surely you need varnames, and many of the other properties of code objects. And what's the use of lnotab if you can't set the base file and line? In fact, why not just pass a code object?

* It seems like 99% of all ast_transformer methods are just going to construct and apply an ast.NodeTransformer subclass. Why not just register the NodeTransformer subclass?

* The way it's written, it sounds like the main advantage of your proposal is that it makes it easier to write optimizations that need guards. But it also makes it easier to write the same kinds of optimizations that are already possible but a bit painful. It might be worth rewording a bit to make that clearer. 

* There are other reasons to write AST and bytecode transformations besides optimization. MacroPy, which you mentioned, is an obvious example. But also, playing with new ideas for Python is a lot easier if you can do most of it with a simple hook that only makes you deal with the level you care about, rather than hacking up everything from the grammar to the interpreter. So, that's an additional benefit you might want to mention in your proposal.

* In fact, I think this PEP could be useful even if the other two were rejected, if rewritten a bit.

* It might be useful to have an API that handled bytes and text (and tokens, but that requires refactoring the token stream API, which is a separate project) as well as AST and bytecode. For example, some language extensions add things that can't be parsed as a valid Python AST. This is particularly an issue when playing with new feature ideas. In some cases, a simple text preprocessor can convert it into code which can be compiled into AST nodes that you can then transform the way you want. At present, with import hooks being the best way to do any of these, there's no disparity that makes text transforms harder than AST transforms. But if we're going to have transformer objects with code_transformer and ast_transformer methods, but a text preprocessor still requires an import hook, that seems unfortunate. Is there a reason you can't add text_transformer as well? (And maybe bytes_transformer. And this would open the door to later add token_transformer in the same place--and for now, you can call tokenize, untokenize, and tokenize again inside a text_transformer.)

* I like that I can now compile to PyCF_AST or to PyCF_TRANSFORMED_AST. But can I call compile with an untransformed AST and the PyCF_TRANSFORMED_AST flag? This would be useful if I had some things that still worked via import hook--I could choose whether to hook in before or after the standard/registered set--e.g., if I'm using a text transformer, or a CPython compiled with a hacked-up grammar that generates dummy AST nodes for new language productions, I may want to then transform those to real nodes before the optimizers get to them. (This would be less necessary if we had text-transformer.)

* It seems like doing any non-trivial bytecode transforms will still require a third-party library like byteplay (which has trailed 2.6, 2.7, 3.x in general, and each new 3.x version by anywhere from 3 months to 4 years). Have you considered integrating some of that functionality into Python itself? Even if that's out of scope, a paragraph explaining how to use byteplay with a code_transformer, and why it isn't integrated into the proposal, might be helpful.

* One thing I've always wanted is a way to write decorators that transform at the AST level. But code objects only have bytecode and source; you have to manually recompile the source--making sure to use the same flags, globals, etc.--to get back to the AST. I think that will become even more of a problem now that you need separate ways to get the "basic" parse and the "post-all-installed-transformations" parse. Maybe this would be out of scope for your project, but having some way to access these rather than rebuild them could be very cool.
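
For illustration, the manual round trip described in the last point can be sketched with the stdlib alone. This is a minimal sketch and not part of the PEP: `rebuild_function`, `ast_transform`, and the toy `AddToMul` transformer are hypothetical names, and the sketch deliberately ignores compiler flags, closures, and the original filename, which is exactly the bookkeeping that makes the manual approach fragile.

```python
import ast
import inspect
import textwrap


class AddToMul(ast.NodeTransformer):
    """Toy transform (hypothetical): rewrite every ``a + b`` as ``a * b``."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


def rebuild_function(source, name, transformer, globals_):
    """Parse *source*, transform its AST, and return the rebuilt function."""
    tree = ast.parse(textwrap.dedent(source))
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            # Drop decorators so the rebuilt function is not decorated again.
            node.decorator_list = []
    tree = ast.fix_missing_locations(transformer.visit(tree))
    namespace = {}
    # Note: __future__ flags and closure variables are NOT preserved here.
    exec(compile(tree, "<ast_transform>", "exec"), globals_, namespace)
    return namespace[name]


def ast_transform(transformer):
    """Decorator form; needs the function's source file to be readable."""
    def decorate(func):
        return rebuild_function(inspect.getsource(func), func.__name__,
                                transformer, func.__globals__)
    return decorate


SOURCE = """
def combine(a, b):
    return a + b
"""
combine = rebuild_function(SOURCE, "combine", AddToMul(), {})
print(combine(3, 4))  # 3 * 4 after the transform
```

The decorator form relies on `inspect.getsource()`, so it only works when the source is still on disk; direct access to the parsed tree would remove that dependency.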

> If the PEP is accepted, it will solve a long list of issues, some
> issues are old, like #1346238 which is 11 years old ;-) I found 12
> issues:
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> I worked to make the PEP more generic than "this hook is written for
> FAT Python". Please read the full PEP to see a long list of existing
> usages in Python of code transformers.
> You may read again the discussion which occurred 4 years ago about the
> same topic:
> (the thread starts with an idea of AST optimizer, but it moves quickly
> to a generic API to transform the code)
> Thanks to Red Hat for giving me time to experiment on this.
> Victor
> HTML version:
> PEP: 511
> Title: API for code transformers
> Version: $Revision$
> Last-Modified: $Date$
> Author: Victor Stinner <victor.stinner at>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 4-January-2016
> Python-Version: 3.6
> Abstract
> ========
> Propose an API to register bytecode and AST transformers. Add also ``-o
> OPTIM_TAG`` command line option to change ``.pyc`` filenames, ``-o
> noopt`` disables the peephole optimizer. Raise an ``ImportError``
> exception on import if the ``.pyc`` file is missing and the code
> transformers required to transform the code are missing.  Code
> transformers are not needed for code transformed ahead of time (loaded
> from ``.pyc`` files).
> Rationale
> =========
> Python does not provide a standard way to transform the code. Projects
> transforming the code use various hooks. The MacroPy project uses an
> import hook: it adds its own module finder in ``sys.meta_path`` to
> hook its AST transformer. Another option is to monkey-patch the
> builtin ``compile()`` function. There are even more options to
> hook a code transformer.
> Python 3.4 added a ``compile_source()`` method to
> ````. But code transformation is wider than
> just importing modules, see described use cases below.
> Writing an optimizer or a preprocessor is out of the scope of this PEP.
> Usage 1: AST optimizer
> ----------------------
> Transforming an Abstract Syntax Tree (AST) is a convenient
> way to implement an optimizer. It's easier to work on the AST than
> working on the bytecode, AST contains more information and is more high
> level.
> Since the optimization can be done ahead of time, complex but slow
> optimizations can be implemented.
> Example of optimizations which can be implemented with an AST optimizer:
> * `Copy propagation
>  <>`_:
>  replace ``x=1; y=x`` with ``x=1; y=1``
> * `Constant folding
>  <>`_:
>  replace ``1+1`` with ``2``
> * `Dead code elimination
>  <>`_
> Using guards (see the `PEP 510
> <>`_), it is possible to
> implement a much wider choice of optimizations. Examples:
> * Simplify iterable: replace ``range(3)`` with ``(0, 1, 2)`` when used
>  as iterable
> * `Loop unrolling <>`_
> * Call pure builtins: replace ``len("abc")`` with ``3``
> * Copy used builtin symbols to constants
> * See also `optimizations implemented in fatoptimizer
>  <>`_,
>  a static optimizer for Python 3.6.
> The following issues can be implemented with an AST optimizer:
> * `Issue #1346238
>  <>`_: A constant folding
>  optimization pass for the AST
> * `Issue #2181 <>`_:
>  optimize out local variables at end of function
> * `Issue #2499 <>`_:
>  Fold unary + and not on constants
> * `Issue #4264 <>`_:
>  Patch: optimize code to use LIST_APPEND instead of calling list.append
> * `Issue #7682 <>`_:
>  Optimisation of if with constant expression
> * `Issue #10399 <>`_: AST
>  Optimization: inlining of function calls
> * `Issue #11549 <>`_:
>  Build-out an AST optimizer, moving some functionality out of the
>  peephole optimizer
> * `Issue #17068 <>`_:
>  peephole optimization for constant strings
> * `Issue #17430 <>`_:
>  missed peephole optimization
> Usage 2: Preprocessor
> ---------------------
> A preprocessor can be easily implemented with an AST transformer. A
> preprocessor has various and different usages.
> Some examples:
> * Remove debug code like assertions and logs to make the code faster to
>  run it for production.
> * `Tail-call Optimization <>`_
> * Add profiling code
> * `Lazy evaluation <>`_:
>  see `lazy_python <>`_
>  (bytecode transformer) and `lazy macro of MacroPy
>  <>`_ (AST transformer)
> * Change dictionary literals into collection.OrderedDict instances
> * Declare constants: see `@asconstants of codetransformer
>  <>`_
> * Domain Specific Language (DSL) like SQL queries. The
>  Python language itself doesn't need to be modified. Previous attempts
>  to implement DSL for SQL like `PEP 335 - Overloadable Boolean
>  Operators <>`_ was rejected.
> * Pattern Matching of functional languages
> * String Interpolation, but `PEP 498 -- Literal String Interpolation
>  <>`_ was merged into Python
>  3.6.
> `MacroPy <>`_ has a long list of
> examples and use cases.
> This PEP does not add any new code transformer. Using a code transformer
> will require an external module and to register it manually.
> See also `PyXfuscator <>`_: Python
> obfuscator, deobfuscator, and user-assisted decompiler.
> Usage 3: Disable all optimization
> ---------------------------------
> Ned Batchelder asked to add an option to disable the peephole optimizer
> because it makes code coverage more difficult to implement. See the
> discussion on the python-ideas mailing list: `Disable all peephole
> optimizations
> <>`_.
> This PEP adds a new ``-o noopt`` command line option to disable the
> peephole optimizer. In Python, it's as easy as::
>    sys.set_code_transformers([])
> It will fix the `Issue #2506 <>`_: Add
> mechanism to disable optimizations.
> Usage 4: Write new bytecode optimizers in Python
> ------------------------------------------------
> Python 3.6 optimizes the code using a peephole optimizer. By
> definition, a peephole optimizer has a narrow view of the code and so
> can only implement basic optimizations. The optimizer rewrites the
> bytecode. It is difficult to enhance it, because it is written in C.
> With this PEP, it becomes possible to implement a new bytecode optimizer
> in pure Python and experiment with new optimizations.
> Some optimizations are easier to implement on the AST like constant
> folding, but optimizations on the bytecode are still useful. For
> example, when the AST is compiled to bytecode, useless jumps can be
> emitted because the compiler is naive and does not try to optimize
> anything.
> Use Cases
> =========
> This section gives examples of use cases explaining when and how code
> transformers will be used.
> Interactive interpreter
> -----------------------
> It will be possible to use code transformers with the interactive
> interpreter which is popular in Python and commonly used to demonstrate
> Python.
> The code is transformed at runtime and so the interpreter can be slower
> when expensive code transformers are used.
> Build a transformed package
> ---------------------------
> It will be possible to build a package of the transformed code.
> A transformer can have a configuration. The configuration is not stored
> in the package.
> All ``.pyc`` files of the package must be transformed with the same code
> transformers and the same transformers configuration.
> It is possible to build different ``.pyc`` files using different
> optimizer tags. Example: ``fat`` for the default configuration and
> ``fat_inline`` for a different configuration with function inlining
> enabled.
> A package can contain ``.pyc`` files with different optimizer tags.
> Install a package containing transformed .pyc files
> ---------------------------------------------------
> It will be possible to install a package which contains transformed
> ``.pyc`` files.
> All ``.pyc`` files with any optimizer tag contained in the package are
> installed, not only for the current optimizer tag.
> Build .pyc files when installing a package
> ------------------------------------------
> If a package does not contain any ``.pyc`` files of the current
> optimizer tag (or some ``.pyc`` files are missing), the ``.pyc`` are
> created during the installation.
> Code transformers of the optimizer tag are required. Otherwise, the
> installation fails with an error.
> Execute transformed code
> ------------------------
> It will be possible to execute transformed code.
> Raise an ``ImportError`` exception on import if the ``.pyc`` file of the
> current optimizer tag is missing and the code transformers required to
> transform the code are missing.
> The interesting point here is that code transformers are not needed to
> execute the transformed code if all required ``.pyc`` files are already
> available.
> Code transformer API
> ====================
> A code transformer is a class with ``ast_transformer()`` and/or
> ``code_transformer()`` methods (API described below) and a ``name``
> attribute.
> For efficiency, do not define a ``code_transformer()`` or
> ``ast_transformer()`` method if it does nothing.
> The ``name`` attribute (``str``) must be a short string used to identify
> an optimizer. It is used to build a ``.pyc`` filename. The name must not
> contain dots (``'.'``), dashes (``'-'``) or directory separators: dots
> are used to separate fields in a ``.pyc`` filename and dashes are used
> to join code transformer names to build the optimizer tag.
> .. note::
>   It would be nice to pass the fully qualified name of a module in the
>   *context* when an AST transformer is used to transform a module on
>   import, but it looks like the information is not available in
>   ``PyParser_ASTFromStringObject()``.
> code_transformer()
> ------------------
> Prototype::
>    def code_transformer(code, consts, names, lnotab, context):
>        ...
>        return (code, consts, names, lnotab)
> Parameters:
> * *code*: the bytecode (``bytes``)
> * *consts*: a sequence of constants
> * *names*: tuple of variable names
> * *lnotab*: table mapping instruction offsets to line numbers
>  (``bytes``)
> The code transformer is run after the compilation to bytecode.
> ast_transformer()
> ------------------
> Prototype::
>    def ast_transformer(tree, context):
>        ...
>        return tree
> Parameters:
> * *tree*: an AST tree
> * *context*: an object with a ``filename`` attribute (``str``)
> It must return an AST tree. It can modify the AST tree in place, or
> create a new AST tree.
> The AST transformer is called after the creation of the AST by the
> parser and before the compilation to bytecode. New attributes may be
> added to *context* in the future.
> Changes
> =======
> In short, add:
> * ``-o OPTIM_TAG`` command line option
> * ``ast.Constant``
> * ``sys.get_code_transformers()``
> * ``sys.implementation.optim_tag``
> * ``sys.set_code_transformers(transformers)``
> API to get/set code transformers
> --------------------------------
> Add new functions to register code transformers:
> * ``sys.set_code_transformers(transformers)``: set the list of code
>  transformers and update ``sys.implementation.optim_tag``
> * ``sys.get_code_transformers()``: get the list of code
>  transformers.
> The order of code transformers matters. Running transformer A and then
> transformer B can give a different output than running transformer B and
> then transformer A.
> Example to prepend a new code transformer::
>    transformers = sys.get_code_transformers()
>    transformers.insert(0, new_cool_transformer)
>    sys.set_code_transformers(transformers)
> All AST transformers are run sequentially (ex: the second transformer
> gets the output of the first transformer as input), and then all bytecode
> transformers are run sequentially.
> Optimizer tag
> -------------
> Changes:
> * Add ``sys.implementation.optim_tag`` (``str``): optimization tag.
>  The default optimization tag is ``'opt'``.
> * Add a new ``-o OPTIM_TAG`` command line option to set
>  ``sys.implementation.optim_tag``.
> Changes on ``importlib``:
> * ``importlib`` uses ``sys.implementation.optim_tag`` to build the
>  ``.pyc`` filename when importing modules, instead of always using
>  ``opt``. Remove also the special case for the optimizer level ``0``
>  with the default optimizer tag ``'opt'`` to simplify the code.
> * When loading a module, if the ``.pyc`` file is missing but the ``.py``
>  is available, the ``.py`` is only used if code optimizers have the
>  same optimizer tag as the current tag, otherwise an ``ImportError``
>  exception is raised.
> Pseudo-code of a ``use_py()`` function to decide if a ``.py`` file can
> be compiled to import a module::
>    def transformers_tag():
>        transformers = sys.get_code_transformers()
>        if not transformers:
>            return 'noopt'
>        return '-'.join(transformer.name
>                        for transformer in transformers)
>    def use_py():
>        return (transformers_tag() == sys.implementation.optim_tag)
> The order of ``sys.get_code_transformers()`` matters. For example, the
> ``fat`` transformer followed by the ``pythran`` transformer gives the
> optimizer tag ``fat-pythran``.
> The behaviour of the ``importlib`` module is unchanged with the default
> optimizer tag (``'opt'``).
> Peephole optimizer
> ------------------
> By default, ``sys.implementation.optim_tag`` is ``opt`` and
> ``sys.get_code_transformers()`` returns a list of one code transformer:
> the peephole optimizer (optimize the bytecode).
> Use ``-o noopt`` to disable the peephole optimizer. In this case, the
> optimizer tag is ``noopt`` and no code transformer is registered.
> Using the ``-o opt`` option has no effect.
> AST enhancements
> ----------------
> Enhancements to simplify the implementation of AST transformers:
> * Add a new compiler flag ``PyCF_TRANSFORMED_AST`` to get the
>  transformed AST. ``PyCF_ONLY_AST`` returns the AST before the
>  transformers.
> * Add ``ast.Constant``: this type is not emitted by the compiler, but
>  can be used in an AST transformer to simplify the code. It does not
>  contain line number and column offset information on tuple or
>  frozenset items.
> * ``PyCodeObject.co_lnotab``: line number delta becomes signed to
>  support moving instructions (note: need to modify MAGIC_NUMBER in
>  importlib). Implemented in the `issue #26107
>  <>`_
> * Enhance the bytecode compiler to support ``tuple`` and ``frozenset``
>  constants. Currently, ``tuple`` and ``frozenset`` constants are
>  created by the peephole transformer, after the bytecode compilation.
> * ``marshal`` module: fix serialization of the empty frozenset singleton
> * update ``Tools/parser/`` to support the new ``ast.Constant``
>  node type
> Examples
> ========
> .pyc filenames
> --------------
> Example of ``.pyc`` filenames of the ``os`` module.
> With the default optimizer tag ``'opt'``:
> ===========================   ==================
> .pyc filename                 Optimization level
> ===========================   ==================
> ``os.cpython-36.opt-0.pyc``                    0
> ``os.cpython-36.opt-1.pyc``                    1
> ``os.cpython-36.opt-2.pyc``                    2
> ===========================   ==================
> With the ``'fat'`` optimizer tag:
> ===========================   ==================
> .pyc filename                 Optimization level
> ===========================   ==================
> ``os.cpython-36.fat-0.pyc``                    0
> ``os.cpython-36.fat-1.pyc``                    1
> ``os.cpython-36.fat-2.pyc``                    2
> ===========================   ==================
> Bytecode transformer
> --------------------
> Scary bytecode transformer replacing all strings with
> ``"Ni! Ni!  Ni!"``::
>    import sys
>    class BytecodeTransformer:
>        name = "knights_who_say_ni"
>        def code_transformer(self, code, consts, names, lnotab, context):
>            consts = ['Ni! Ni! Ni!' if isinstance(const, str) else const
>                      for const in consts]
>            return (code, consts, names, lnotab)
>    # replace existing code transformers with the new bytecode transformer
>    sys.set_code_transformers([BytecodeTransformer()])
>    # execute code which will be transformed by code_transformer()
>    exec("print('Hello World!')")
> Output::
>    Ni! Ni! Ni!
> AST transformer
> ---------------
> Similarly to the bytecode transformer example, the AST transformer also
> replaces all strings with ``"Ni! Ni! Ni!"``::
>    import ast
>    import sys
>    class KnightsWhoSayNi(ast.NodeTransformer):
>        def visit_Str(self, node):
>            node.s = 'Ni! Ni! Ni!'
>            return node
>    class ASTTransformer:
>        name = "knights_who_say_ni"
>        def __init__(self):
>            self.transformer = KnightsWhoSayNi()
>        def ast_transformer(self, tree, context):
>            self.transformer.visit(tree)
>            return tree
>    # replace existing code transformers with the new AST transformer
>    sys.set_code_transformers([ASTTransformer()])
>    # execute code which will be transformed by ast_transformer()
>    exec("print('Hello World!')")
> Output::
>    Ni! Ni! Ni!
> Other Python implementations
> ============================
> The PEP 511 should be implemented by all Python implementations, but the
> bytecode and the AST are not standardized.
> By the way, even between minor versions of CPython, there are changes on
> the AST API. There are differences, but only minor differences. It is
> quite easy to write an AST transformer which works on Python 2.7 and
> Python 3.5 for example.
> Discussion
> ==========
> * `[Python-Dev] AST optimizer implemented in Python
>  <>`_
>  (August 2012)
> Prior Art
> =========
> AST optimizers
> --------------
> In 2011, Eugene Toder proposed to rewrite some peephole optimizations in
> a new AST optimizer: issue #11549, `Build-out an AST optimizer, moving
> some functionality out of the peephole optimizer
> <>`_.  The patch adds ``ast.Lit`` (it
> was proposed to rename it to ``ast.Literal``).
> In 2012, Victor Stinner wrote the `astoptimizer
> <>`_ project, an AST optimizer
> implementing various optimizations. Most interesting optimizations break
> the Python semantics since no guard is used to disable optimization if
> something changes.
> In 2015, Victor Stinner wrote the `fatoptimizer
> <>`_ project, an AST optimizer
> specializing functions using guards.
> The Issue #17515 `"Add sys.setasthook() to allow to use a custom AST"
> optimizer <>`_ was a first attempt of
> API for code transformers, but specific to AST.
> Python Preprocessors
> --------------------
> * `MacroPy <>`_: MacroPy is an
>  implementation of Syntactic Macros in the Python Programming Language.
>  MacroPy provides a mechanism for user-defined functions (macros) to
>  perform transformations on the abstract syntax tree (AST) of a Python
>  program at import time.
> * `pypreprocessor <>`_: C-style
>  preprocessor directives in Python, like ``#define`` and ``#ifdef``
> Bytecode transformers
> ---------------------
> * `codetransformer <>`_:
>  Bytecode transformers for CPython inspired by the ``ast`` module's
>  ``NodeTransformer``.
> * `byteplay <>`_: Byteplay lets you
>  convert Python code objects into equivalent objects which are easy to
>  play with, and lets you convert those objects back into living Python
>  code objects. It's useful for applying crazy transformations on Python
>  functions, and is also useful in learning Python byte code
>  intricacies. See `byteplay documentation
>  <>`_.
> See also:
> * `BytecodeAssembler <>`_
> Copyright
> =========
> This document has been placed in the public domain.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From at  Fri Jan 15 15:39:53 2016
From: at (Yury Selivanov)
Date: Fri, 15 Jan 2016 15:39:53 -0500
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

Hi Victor,

On 2016-01-15 11:10 AM, Victor Stinner wrote:
> Hi,
> This PEP 511 is part of a series of 3 PEPs (509, 510, 511) adding an API
> to implement a static Python optimizer specializing functions with
> guards.

All your PEPs are very interesting, thanks for your hard work!
I'm very happy to see that we're trying to make CPython faster.

There are some comments below:

> If the PEP is accepted, it will solve a long list of issues, some
> issues are old, like #1346238 which is 11 years old ;-) I found 12
> issues:
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *

It's important to say that all of those issues (except 2506)
are not bugs, but proposals to implement some nano- and
micro- optimizations.

Issue 2506 is about having an option to disable the peephole
optimizer, which is a very narrow subset of what PEP 511
proposes to add.

> Usage 2: Preprocessor
> ---------------------
> A preprocessor can be easily implemented with an AST transformer. A
> preprocessor has various and different usages.
> Some examples:
> * Remove debug code like assertions and logs to make the code faster to
>    run it for production.
> * `Tail-call Optimization <>`_
> * Add profiling code
> * `Lazy evaluation <>`_:
>    see `lazy_python <>`_
>    (bytecode transformer) and `lazy macro of MacroPy
>    <>`_ (AST transformer)
> * Change dictionary literals into collection.OrderedDict instances
> * Declare constants: see `@asconstants of codetransformer
>    <>`_
> * Domain Specific Language (DSL) like SQL queries. The
>    Python language itself doesn't need to be modified. Previous attempts
>    to implement DSL for SQL like `PEP 335 - Overloadable Boolean
>    Operators <>`_ was rejected.
> * Pattern Matching of functional languages
> * String Interpolation, but `PEP 498 -- Literal String Interpolation
>    <>`_ was merged into Python
>    3.6.

I think that most of those examples are rather weak.  Things like
tail-call optimizations, constants declarations, pattern matching,
case classes (from MacroPy) are nice concepts, but they should be
either directly implemented in the Python language or not used at all.

Things like auto-changing dictionary literals to OrderedDict
objects or in-Python DSLs will only help in creating hard to
maintain code base.  I say this because I have a first-hand
experience with decorators that patch opcodes, and import
hooks that rewrite AST.  When you get back to your code years
after it was written, you usually regret doing those things.

All in all, I think that adding a blessed API for preprocessors
shouldn't be a focus of this PEP.  MacroPy works right now
with importlib, and I think it's a good solution for it.

I propose to only expose new APIs on the C level,
and explicitly mark them as provisional and experimental.
It should be clear, that those APIs are only for
*writing optimizers*, and nothing else.

[off-topic] I do think that having a macro system similar to
Rust might be a good idea.  However, macros in Rust have explicit
and distinct syntax, they have the necessary level of
documentation and tooling.  But this is a separate matter
deserving its own PEP ;)

> Usage 4: Write new bytecode optimizers in Python
> ------------------------------------------------
> Python 3.6 optimizes the code using a peephole optimizer. By
> definition, a peephole optimizer has a narrow view of the code and so
> can only implement basic optimizations. The optimizer rewrites the
> bytecode. It is difficult to enhance it, because it is written in C.
> With this PEP, it becomes possible to implement a new bytecode optimizer
> in pure Python and experiment with new optimizations.
> Some optimizations are easier to implement on the AST like constant
> folding, but optimizations on the bytecode are still useful. For
> example, when the AST is compiled to bytecode, useless jumps can be
> emitted because the compiler is naive and does not try to optimize
> anything.

Would it be possible to (or does it make any sense):

1. Add new APIs for AST transformers (only exposed on the C level).

2. Remove the peephole optimizer.

3. Re-implement peephole optimizer using new APIs in CPython
(peephole does some very basic optimizations).

4. Implement other basic optimizations (like limited constant
folding) in CPython.

5. Leave the door open for you and other people to add more
AST optimizers (so that FAT isn't locked to CPython's slow
release cycle)?

I also want to say this: I'm -1 on implementing all three PEPs
until we see that FAT is able to give us at least 10% performance
improvement on micro-benchmarks.  We still have several months
before 3.6beta to see if that's possible.


From victor.stinner at  Fri Jan 15 16:14:17 2016
From: victor.stinner at (Victor Stinner)
Date: Fri, 15 Jan 2016 22:14:17 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

Wow, giant emails (as mine, ok).

2016-01-15 20:41 GMT+01:00 Andrew Barnert <abarnert at>:
> * You can register transformers in any order, and they're run in the order specified, first all the AST transformers, then all the code transformers. That's very weird; it seems like it would be conceptually simpler to have a list of AST transformers, then a separate list of code transformers.

The goal is to have a short optimizer tag. I'm not sure yet that it
makes sense, but I would like to be able to transform AST and bytecode
in a single code transformer. I prefer to add a single get/set
function to sys, instead of two (4 new functions).
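
To make the ordering concrete, here is a rough emulation of how a single list of transformers would be applied: every `ast_transformer()` first, then every `code_transformer()`, with the tag built from the names. It is only a sketch under stated assumptions: `run_transformers`, `optimizer_tag`, and `Recorder` are hypothetical names, nothing here touches `sys`, and the `code_transformer(code, context)` signature taking a whole code object is a simplification of the draft prototype.

```python
import ast
from types import SimpleNamespace

calls = []  # records the order in which hooks are invoked


class Recorder:
    """Hypothetical transformer implementing both hooks as no-ops."""

    def __init__(self, name):
        self.name = name

    def ast_transformer(self, tree, context):
        calls.append(self.name + ".ast")
        return tree

    def code_transformer(self, code, context):
        # Simplified signature (assumption): a whole code object.
        calls.append(self.name + ".code")
        return code


def run_transformers(source, filename, transformers):
    """Emulate the proposed pipeline: all AST hooks, then all bytecode hooks."""
    context = SimpleNamespace(filename=filename)
    tree = ast.parse(source, filename)
    for transformer in transformers:
        hook = getattr(transformer, "ast_transformer", None)
        if hook is not None:
            tree = hook(tree, context)
    code = compile(tree, filename, "exec")
    for transformer in transformers:
        hook = getattr(transformer, "code_transformer", None)
        if hook is not None:
            code = hook(code, context)
    return code


def optimizer_tag(transformers):
    """Build the tag the way the PEP's pseudo-code does."""
    if not transformers:
        return "noopt"
    return "-".join(transformer.name for transformer in transformers)


transformers = [Recorder("fat"), Recorder("pythran")]
code = run_transformers("x = 1 + 1", "<demo>", transformers)
print(calls)
print(optimizer_tag(transformers))
```

With one list, a transformer that implements both hooks contributes a single name to the tag, which is the point of keeping one get/set pair.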

> * Why are transformers objects with ast_transformer and code_transformer methods, but those methods don't take self?

They take self. It's just a formatting issue (a mistake in the PEP) :-)
They take self parameter, see examples.

It's just hard to format a PEP correctly when you are used to Sphinx :-) I
started to use ".. method:: ..." but it doesn't work, it's the simpler
reST format ;-)

> It seems like the only advantage to require attaching them to a class is to associate each one with a name

I started with a function, but it's a little bit weird to set a name
attribute to a function ( = "fat"). Moreover, it's convenient
to store some data in the object. In fatoptimizer, I store the
configuration. Even in the most simple AST transformer example of the
PEP, the constructor creates an object:

It may be possible to use functions, but classes are just more
"natural" in Python.

> And is there ever a good use case for putting both in the same class, given that the code transformer isn't going to run on the output of the AST transformer but rather on the output of all subsequent AST transformers and all preceding code transformers?

The two methods are disconnected, but they are linked by the optimizer
tag. IMHO it makes sense to implement all optimizations (crazy stuff
in AST, simple optimizer like peephole on bytecode) in a single code
transformer. It avoids using a long optimizer tag like
"fat_ast-fat_bytecode". I also like short filenames.

> * Why does the code transformer only take consts and names? Surely you need varnames, and many of the other properties of code objects. And what's the use of lnotab if you can't set the base file and line? In fact, why not just pass a code object?

To be honest, I don't feel comfortable with a function taking 5
parameters which has to return a tuple of 4 items :-/ Especially if
it's only the first version, we may have to add more items.

code_transformer() API comes from PyCode_Optimize() API: the CPython
peephole optimizer.

PyAPI_FUNC(PyObject*) PyCode_Optimize(PyObject *code, PyObject* consts,
                                      PyObject *names, PyObject *lnotab);

The function modifies lnotab in-place and returns the modified code.

Passing a whole code object makes the API much simpler and code
objects contain all information. I take your suggestion, thanks.
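
As a sketch of what a whole-code-object hook might look like, the "Ni! Ni! Ni!" example can be written against a single code object. Everything here is an assumption rather than the final API: the `code_transformer(code, context)` signature and the `run_and_capture` helper are hypothetical, the hook is called by hand instead of being registered, and `CodeType.replace()` requires Python 3.8 or later.

```python
import contextlib
import io


def code_transformer(code, context=None):
    """Hypothetical whole-code-object hook: replace every string constant."""
    consts = tuple("Ni! Ni! Ni!" if isinstance(const, str) else const
                   for const in code.co_consts)
    # CodeType.replace() returns a modified copy; the input is left untouched.
    return code.replace(co_consts=consts)


def run_and_capture(code):
    """Execute *code* in a fresh namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue()


original = compile("print('Hello World!')", "<demo>", "exec")
transformed = code_transformer(original)
print(run_and_capture(transformed), end="")
```

Only top-level constants are rewritten; strings nested in tuples or in inner code objects would need a recursive walk.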

> * It seems like 99% of all ast_transformer methods are just going to construct and apply an ast.NodeTransformer subclass. Why not just register the NodeTransformer subclass?

fatoptimizer doesn't use ast.NodeTransformer ;-)

ast.NodeTransformer has a naive and inefficient design. For example,
fatoptimizer uses a metaclass to only create the mapping of visitors
once (visit_xxx methods). My transformer copies modified nodes to
leave the input tree unchanged. I need this to be able to duplicate a
tree later (to specialize functions).

(Maybe I can propose to enhance ast.NodeTransformer, but that's a
different topic.)

> * There are other reasons to write AST and bytecode transformations besides optimization. MacroPy, which you mentioned, is an obvious example. But also, playing with new ideas for Python is a lot easier if you can do most of it with a simple hook that only makes you deal with the level you care about, rather than hacking up everything from the grammar to the interpreter. So, that's an additional benefit you might want to mention in your proposal.

I wrote "A preprocessor has various and different usages." Maybe I can
elaborate :-)

It looks like it is possible to "implement" f-strings (PEP 498) using
macros. I think that it's a good example of experimenting with evolutions
of the language (without having to modify the C code which is much
more complex, Yury Selivanov may want to share his experience here for
this async/await PEP ;-)).

> * In fact, I think this PEP could be useful even if the other two were rejected, if rewritten a bit.

Yeah, I tried to split changes to make them independent.

Only PEP 509 (dict version) is linked to PEP 510 (func specialize).
Even alone, the PEP 509 can be used to implement the "copy globals to
locals/constants" optimization mentioned in the PEP (at least two
developers proposed changes to implement it! It was also in Unladen
Swallow plans).

> * It might be useful to have an API that handled bytes and text (and tokens, but that requires refactoring the token stream API, which is a separate project) as well as AST and bytecode.
> (...)
> Is there a reason you can't add text_transformer as well?

I don't know this part of the compiler.

Does Python already have an API to manipulate tokens, etc.? What about
other Python implementations?
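
For reference, CPython's stdlib does ship a token-level API in the `tokenize` module, and the tokenize/untokenize round trip suggested earlier can be sketched as below. The `text_transformer` name and the toy `nil` spelling are hypothetical and only illustrate a text-level hook; nothing here is part of the PEP.

```python
import io
import tokenize


def text_transformer(source):
    """Toy text-level transform: accept ``nil`` as a spelling of ``None``."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        string = tok.string
        if tok.type == tokenize.NAME and string == "nil":
            string = "None"
        # (type, string) pairs put untokenize() in its compatibility mode,
        # which ignores positions, so changing a token's length is safe.
        tokens.append((tok.type, string))
    return tokenize.untokenize(tokens)


namespace = {}
exec(text_transformer("x = nil\nlabel = 'nil stays inside strings'\n"),
     namespace)
print(namespace["x"], namespace["label"])
```

Working on tokens rather than raw text means the word `nil` inside a string literal is left alone, at the cost of losing the original spacing in compatibility mode.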

I proposed AST transformers because it's already commonly used in the wild.

I also proposed bytecode to replace the peephole optimizer: make it
optional and maybe implement a new one (in Python to be more easily be

The Hy language uses its own parser and emits Python AST. Why not
using this design?
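
A minimal sketch of that design, assuming a toy front end whose "programs" are nested tuples: the front end never produces Python text, it builds `ast` nodes directly and hands them to `compile()`. `compile_sexpr` is a hypothetical name, this is not how Hy itself is implemented, and `ast.Constant` requires Python 3.6 or later.

```python
import ast


def compile_sexpr(expr):
    """Translate ``("func", arg, ...)`` tuples into a Python AST expression."""
    if isinstance(expr, tuple):
        func, *args = expr
        return ast.Call(
            func=ast.Name(id=func, ctx=ast.Load()),
            args=[compile_sexpr(arg) for arg in args],
            keywords=[],
        )
    # Anything that is not a tuple is treated as a literal constant.
    return ast.Constant(value=expr)


program = ("max", 1, ("abs", -5))            # i.e. max(1, abs(-5))
tree = ast.Expression(body=compile_sexpr(program))
ast.fix_missing_locations(tree)              # nodes built by hand lack positions
result = eval(compile(tree, "<toy-frontend>", "eval"))
print(result)
```

Since the output is an ordinary AST, any AST transformer registered through the proposed API could in principle run on it unchanged.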

> (...) e.g., if I'm using a text transformer, (...)

IMHO you are going too far and it becomes out of the scope of the PEP.

You should also read the previous discussion:

> * It seems like doing any non-trivial bytecode transforms will still require a third-party library like byteplay (which has trailed 2.6, 2.7, 3.x in general, and each new 3.x version by anywhere from 3 months to 4 years). Have you considered integrating some of that functionality into Python itself?

To be honest, right now, I'm focused on fatoptimizer. I don't want to
integrate it in the stdlib because:

* it's incomplete: see the giant list if you
are bored
* the stdlib is moving ... is not really moving... well, the
development process is way too slow for such very young project
* fatoptimizer still changes the Python semantics in subtle ways which
should be tested in large applications and discussed point per point
* etc.

It's way too early to discuss that (at least for fatoptimizer).

Since pip became standard, I don't think that it's a real issue in practice.

> Even if that's out of scope, a paragraph explaining how to use byteplay with a code_transformer, and why it isn't integrated into the proposal, might be helpful.

byteplay doesn't seem to be maintained anymore. Last commit in 2010...

IMHO you can do the same as byteplay on the AST with much simpler
code. I only mentioned some projects modifying bytecode to gather ideas
about what can be done with a code transformer.

I don't think that it's worth adding more examples than the two "Ni!
Ni! Ni!" examples.

> * One thing I've always wanted is a way to write decorators that transform at the AST level. But code objects only have bytecode and source;

You should take a look at MacroPy, it looks like it has some crazy
stuff to modify the AST and compile at runtime. I'm not sure, I never
used MacroPy, I only read its documentation to generalize my PEP ;-)

Modifying and recompiling the code at runtime (using AST, something
higher level than bytecode) sounds like a Lisp feature and like a JIT
compiler, two cool things ;)


From victor.stinner at  Fri Jan 15 17:16:38 2016
From: victor.stinner at (Victor Stinner)
Date: Fri, 15 Jan 2016 23:16:38 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-15 21:39 GMT+01:00 Yury Selivanov < at>:
> All your PEPs are very interesting, thanks for your hard work!
> I'm very happy to see that we're trying to make CPython faster.


> It's important to say that all of those issues (except 2506)
> are not bugs, but proposals to implement some nano- and
> micro- optimizations.

Hum, let me see.

>> *

"A constant folding optimization pass for the AST" & "Build-out an AST
optimizer, moving some functionality out of the peephole optimizer"

Well, that's a way to start working on larger optimizations.

Anyway, the peephole optimizer has many limits. Raymond Hettinger
keeps repeating that it was designed to be simple and limited. And
each time, he has suggested reimplementing the peephole optimizer in pure
Python (as I'm proposing).

On AST, we can do much better than just 1+1, even without changing the
Python semantics.

But I agree that the speedups from such changes are minor. Without
specialization and guards, you are limited.

>> *

"optimize out local variables at end of function"

Alone, this optimization is not really interesting. But other
optimizations can produce inefficient code. Example with loop

    for i in range(2):

is replaced with:

    i = 0

    i = 1

with constant propagation, it becomes:

    i = 0

    i = 1

At this point, the i variable becomes useless and can be removed by the
optimization mentioned in
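The loop bodies seem to have been lost from the snippets above, so here is a hand-written sketch of the same stages using a hypothetical body, `out.append(i)`. Nothing here is produced by fatoptimizer; each function is only what the described step would give if applied by hand.

```python
def original():
    out = []
    for i in range(2):
        out.append(i)          # hypothetical loop body
    return out


def unrolled():
    # Step 1: the loop over range(2) is unrolled.
    out = []
    i = 0
    out.append(i)
    i = 1
    out.append(i)
    return out


def propagated():
    # Step 2: constant propagation replaces each read of i by its value.
    out = []
    i = 0                      # now a dead store: never read again
    out.append(0)
    i = 1                      # also a dead store
    out.append(1)
    return out


def dead_stores_removed():
    # Step 3: the useless assignments to i are dropped.
    out = []
    out.append(0)
    out.append(1)
    return out


print(original(), unrolled(), propagated(), dead_stores_removed())
```

The last step is only safe when nothing reads `i` afterwards (for example through `locals()`), which is the kind of semantic detail an optimizer has to guard against.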


>> *

"AST Optimization: inlining of function calls"

IMHO this one is really interesting. But again, not alone, but when
combined with other optimizations.

>> Usage 2: Preprocessor
>> ---------------------
>> A preprocessor can be easily implemented with an AST transformer. A
>> preprocessor has various and different usages.
>>   3.6.
> [..]
> I think that most of those examples are rather weak.  Things like
> tail-call optimizations, constants declarations, pattern matching,
> case classes (from MacroPy) are nice concepts, but they should be
> either directly implemented in Python language or not used at all
> (IMHO).

At least it allows experimenting with new things. If a transformer becomes
popular, we can start to discuss integrating it into Python.

About tail recursion, I recall that Guido wrote something about it:

I found a lot of code transformer projects. I understand that there
is a real need.

In a previous job, we used a text preprocessor to remove all calls to
log.debug() before releasing the code to production. It was in the
embedded world (set-top boxes), where performance matters. The
preprocessor was based on long and unreliable regular expressions. I
would prefer to use the AST for that. That's the first item in my list:
"Remove debug code like assertions and logs to make the code faster to
run it for production."
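A minimal sketch of that idea with the existing `ast` module, assuming the logger is always bound to a name called `log` (that name, the class `RemoveDebugLogs` and the sample function `work` are illustrative, not part of the PEP). Each `log.debug(...)` statement is replaced by `pass` so that a block never ends up empty.

```python
import ast


class RemoveDebugLogs(ast.NodeTransformer):
    """Replace statements of the form ``log.debug(...)`` with ``pass``."""

    def visit_Expr(self, node):
        call = node.value
        if (isinstance(call, ast.Call)
                and isinstance(call.func, ast.Attribute)
                and call.func.attr == "debug"
                and isinstance(call.func.value, ast.Name)
                and call.func.value.id == "log"):
            # Keep a placeholder so "if x: log.debug(...)" stays valid.
            return ast.copy_location(ast.Pass(), node)
        return node


source = '''
def work(log, x):
    log.debug("x = %r", x)
    if x > 10:
        log.debug("big value")
    return x * 2
'''

tree = RemoveDebugLogs().visit(ast.parse(source))
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<example>", "exec"), namespace)
work = namespace["work"]
```

Unlike a regular expression, this matches on the structure of the call, so a `log.debug` mentioned inside a string or a comment is left alone.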

> Things like auto-changing dictionary literals to OrderedDict
> objects or in-Python DSLs will only help in creating hard to
> maintain code base.  I say this because I have a first-hand
> experience with decorators that patch opcodes, and import
> hooks that rewrite AST.  When you get back to your code years
> after it was written, you usually regret about doing those things.

To be honest, I don't plan to use such macros; they look too magic
and change Python semantics too much. But I don't want to prevent
users from doing cool things in their sandbox. In my experience, Python
developers are good enough to make their own decisions.

When the f-string PEP was discussed, I was strongly opposed to allowing
*any* Python expression in f-strings. But Guido said that the language
designers must not restrict users. Well, something like that; I'm probably
misquoting him ;-)

> All in all, I think that adding a blessed API for preprocessors
> shouldn't be a focus of this PEP.  MacroPy works right now
> with importlib, and I think it's a good solution for it.

Do you mean that we should add the feature but add a warning in the
doc like "don't use it for evil things"?

I don't think that we can forbid specific uses of an API.
The only reliable way to ensure that users will not misuse an API
is to not add the API (reject the PEP) :-) So I chose instead to
document the different kinds of usage of code transformers, just to show
how they can be used.

> I propose to only expose new APIs on the C level,
> and explicitly mark them as provisional and experimental.
> It should be clear, that those APIs are only for
> *writing optimizers*, and nothing else.

Currently, the PEP adds:

* -o OPTIM_TAG command line option
* sys.implementation.optim_tag
* sys.get_code_transformers()
* sys.set_code_transformers(transformers)
* ast.Constant

importlib uses sys.implementation.optim_tag and
sys.get_code_transformers(). *If* we want to remove them, we should
find a way to expose this information to importlib.

I really like ast.Constant, I would like to add it, but it's really a
minor part of the PEP. I don't think that it's controversial.

PyCF_TRANSFORMED_AST can only be exposed at the C level.

"-o OPTIM_TAG command line option" is a shortcut to set
sys.implementation.optim_tag. optim_tag can be set manually. But the
problem is to be able to set the optim_tag before the first Python
module is imported. It doesn't seem easy to avoid this change.
According to Brett, the whole PEP can be simplified to this single
command line option :-)
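For comparison, here is how a tag already ends up in a .pyc file name today. This uses the existing PEP 488 `optimization` parameter of `importlib.util.cache_from_source()`, not the `sys.implementation.optim_tag` attribute proposed here; the file name `spam.py` and the tag `"fat"` are illustrative values only.

```python
import importlib.util

# Existing mechanism (PEP 488): the optimization tag becomes part of the
# cached bytecode file name, so differently optimized .pyc files can
# coexist in the same __pycache__ directory.
default_pyc = importlib.util.cache_from_source("spam.py", optimization="")
tagged_pyc = importlib.util.cache_from_source("spam.py", optimization="fat")

print(default_pyc)
print(tagged_pyc)
```

The name is computed when the import system looks for a cached file, which is why the tag has to be known before the first module is imported.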

> [off-topic] I do think that having a macro system similar to
> Rust might be a good idea.  However, macro in Rust have explicit
> and distinct syntax, they have the necessary level of
> documentation and tooling.  But this is a separate matter
> deserving its own PEP ;)

I agree that extending the Python syntax is out of the scope of the PEP 511.

> Would it be possible to (or does it make any sense):
> 1. Add new APIs for AST transformers (only exposed on the C
> level!)
> 2. Remove the peephole optimizer.

FYI my fatoptimizer is quite slow. But it implements a lot of
optimizations, much more than the Python peephole optimizer.

I fear that the conversions are expensive:

* AST (light) internal objects => Python (heavy) AST objects
* (run AST optimizers implemented in Python)
* Python (heavy) AST objects => AST (light) internal objects

So in the near future, I prefer to keep the peephole optimizer
implemented in C. The performance of the optimizer itself matters when
you run a short script using "python" (without compilation
ahead of time).

> I also want to say this: I'm -1 on implementing all three PEPs
> until we see that FAT is able to give us at least 10% performance
> improvement on micro-benchmarks.  We still have several months
> before 3.6beta to see if that's possible.

I prefer not to start benchmarking fatoptimizer yet because I spent 3
months just designing the API, fixing bugs, etc. I only spent a small
fraction of the time on writing optimizations. I expect significant
speedups with more optimizations like function inlining. If you are
curious, take a look at the todo list:

I understand that an optimizer which does not produce faster code is
not really interesting. My PEPs request many changes which become part
of the public API and have to be maintained later.

I already changed the PEP 509 and 510 to make the changes private
(only visible in the C API).


From abarnert at  Fri Jan 15 17:57:09 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 15 Jan 2016 16:57:09 -0600
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

Sent from my iPhone
> On Jan 15, 2016, at 15:14, Victor Stinner <victor.stinner at> wrote:
> Wow, giant emails (as mine, ok).

Well, this is a big idea, so it needs a big breakfast. I mean a big email. :) But fortunately, you had great answers to most of my points, which means I can snip them out of this reply and make it not quite as giant.
> 2016-01-15 20:41 GMT+01:00 Andrew Barnert <abarnert at>:
>> * You can register transformers in any order, and they're run in the order specified, first all the AST transformers, then all the code transformers. That's very weird; it seems like it would be conceptually simpler to have a list of AST transformers, then a separate list of code transformers.
> The goal is to have a short optimizer tag. I'm not sure yet that it
> makes sense, but I would like to be able to transform AST and bytecode
> in a single code transformer.

But that doesn't work as soon as there are even two of them: the bytecode #0 no longer runs after ast #0, but after ast #1; similarly, bytecode #1 no longer runs after ast #1, but after bytecode #0. So, it seems like whatever benefits you get by keeping them coupled will be illusory.
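A toy model of that ordering, assuming two transformer objects that each provide both hooks. The class `Tracing`, the helper `run_all` and the simplified hook signatures are made up for illustration and are not the PEP's implementation.

```python
class Tracing:
    """Hypothetical transformer that records when each hook runs."""

    def __init__(self, index, log):
        self.index = index
        self.log = log

    def ast_transformer(self, tree, context):
        self.log.append("ast #%d" % self.index)
        return tree

    def code_transformer(self, code, context):
        self.log.append("bytecode #%d" % self.index)
        return code


def run_all(transformers, tree, code, context=None):
    # All AST hooks run first, then all bytecode hooks, as described above.
    for transformer in transformers:
        tree = transformer.ast_transformer(tree, context)
    for transformer in transformers:
        code = transformer.code_transformer(code, context)
    return tree, code


log = []
run_all([Tracing(0, log), Tracing(1, log)], tree=None, code=None)
print(log)
```

In the recorded order, the bytecode hook of transformer #0 runs only after the AST hook of transformer #1, so the two halves of one transformer are not adjacent.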

> I prefer to add a single get/set
> function to sys, instead of two (4 new functions).

That's a good point. (I suppose you could have a pair of get/set functions that each set multiple lists instead of one, but that isn't really any simpler than multiple get/set functions...)

>> It seems like the only advantage to require attaching them to a class is to associate each one with a name
> I started with a function, but it's a little bit weird to set a name
> attribute to a function ( = "fat").

It looks a lot less weird with a decorator `@transform('fat')` that sets it for you.
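A sketch of what such a decorator could look like. The name `transform` and the `name` attribute follow the discussion above; neither exists in the stdlib, and the transformer body is a placeholder.

```python
def transform(name):
    """Hypothetical decorator: attach a transformer name to a function."""
    def decorate(func):
        func.name = name
        return func
    return decorate


@transform("fat")
def ast_transformer(tree, context):
    # A real transformer would rewrite the tree here.
    return tree


print(ast_transformer.name)
```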

> Moreover, it's convenient
> to store some data in the object. In fatoptimizer, I store the
> configuration. Even in the most simple AST transformer example of the
> PEP, the constructor creates an object:
> It may be possible to use functions, but classes are just more
> "natural" in Python.

In general, sure. But for data that isn't accessible from outside, and only needs to be used in a single call, a simple function (with the option of wrapping data in a closure) can be simpler. That's why so many decorators are functions that return a closure, not classes that build an object with a __call__ method.

But more specifically to this case, after looking over your examples, maybe the class makes sense here.

>> * There are other reasons to write AST and bytecode transformations besides optimization. MacroPy, which you mentioned, is an obvious example. But also, playing with new ideas for Python is a lot easier if you can do most of it with a simple hook that only makes you deal with the level you care about, rather than hacking up everything from the grammar to the interpreter. So, that's an additional benefit you might want to mention in your proposal.
> I wrote "A preprocessor has various and different usages." Maybe I can
> elaborate :-)

Sure. It's just a matter of emphasis, and whether more of it would help sell your idea or not. From the other big reply you got, maybe it would even hurt selling it... So, your call.

> It looks like it is possible to "implement" f-string (PEP 498) using
> macros. I think that it's a good example of experimenting evolutions
> of the language (without having to modify the C code which is much
> more complex, Yury Selivanov may want to share his experience here for
> this async/await PEP ;-)).

I did an experiment last year where I tried to add the same feature two ways (Haskell-style operator partials, so you can write `(* 2)` instead of `lambda x: x * 2` or `rpartial(mul, 2)` or whatever). First, I did all the steps to add it "for real", from the grammar through to the code generator. Second, I added a quick grammar hack to create a noop AST node, then did everything else in Python with an import hook: preprocessing the text to get the noop nodes, then preprocessing the AST to turn those into nodes that implement the intended semantics. As you might expect, the second version took a lot less time, required debugging a lot fewer segfaults, etc. And if your proposal removed the need for the import hook, it would be even simpler (and cleaner, too).

>> * It might be useful to have an API that handled bytes and text (and tokens, but that requires refactoring the token stream API, which is a separate project) as well as AST and bytecode.
>> (...)
>> Is there a reason you can't add text_transformer as well?
> I don't know this part of the compiler.
> Does Python already have an API to manipulate tokens, etc.? What about
> other Python implementations?

Well, Python does have an API to manipulate tokens, but it involves manually tokenizing the text, modifying the token stream, untokenizing it back to text, and then parsing and compiling the result, which is far from ideal. (In fact, in some cases you even need to encode back to bytes.) There's an open enhancement issue to make it easier to write token processors.
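The round trip described above can be sketched with the stdlib `tokenize` module. The helper `rename_name_tokens` and the sample source are illustrative; the point is only the tokenize, modify, untokenize, compile sequence.

```python
import io
import tokenize


def rename_name_tokens(source, old, new):
    """Rewrite every NAME token equal to `old`, then rebuild source text."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string == old:
            result.append((tok.type, new))
        else:
            result.append((tok.type, tok.string))
    # With (type, string) pairs, untokenize() regenerates text whose
    # spacing may differ from the original but which tokenizes the same.
    return tokenize.untokenize(result)


source = "answer = spam + 1\n"
new_source = rename_name_tokens(source, "spam", "eggs")

# The result is plain text again, so it still has to be parsed and compiled.
namespace = {"eggs": 41}
exec(compile(new_source, "<tokens>", "exec"), namespace)
print(namespace["answer"])
```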

But don't worry about that part for now. A text preprocessor step should be very easy to add, and useful on its own (and it opens the door for adding a token preprocessor between text and AST in the future when that becomes feasible).

I also mentioned a bytes preprocessor, which could munge the bytes before the decoding to text. But that seems a lot less useful. (Maybe if you needed an alternative to the coding-declaration syntax for some reason?) I only included it because it's another layer you can hook in an import hook today, so it seems like if it is left out, that should be an intentional decision, not just something nobody thought about.

> I proposed AST transformers because it's already commonly used in the wild.

Text preprocessors are also used in the wild. IIRC, Guido mentioned having written one that turns Python 3-style annotations into something that compiles as legal Python 2.7 (although he later abandoned it, because it turned out to be too hard to integrate with their other Python 2 tools).

(Token preprocessors are not used much in the wild, because it's painful to write them, nor are bytes preprocessors, because they're not that useful.)

> The Hy language uses its own parser and emits Python AST. Why not
> using this design?

By the same token, why not use your own code generator and emit Python bytecode, instead of just preprocessing ASTs?

If you're making a radical change, that makes sense. But for most uses, where you only want to make a small change on top of the normal processing, it makes a lot more sense to just hook the normal processing than to completely reproduce everything it does.

>> Even if that's out of scope, a paragraph explaining how to use byteplay with a code_transformer, and why it isn't integrated into the proposal, might be helpful.
> byteplay doesn't seem to be maintained anymore. Last commit in 2010...

There's a byteplay3 fork, which is maintained. But it doesn't support 3.5 yet. (As I mentioned, it's usually a few months to a few years behind each new Python release. Which is one reason integrating parts of it into the core might be nice. The dis module changes in 3.4 were basically integrating part of byteplay, and that part has paid off--the code in dis is automatically up to date with the compiler. There may be more you could do here. But probably it's out of scope for your project.)

> IMHO you can do the same as byteplay on the AST with much simpler
> code.

If that's really true, then you shouldn't include code_transformers in the PEP at all. You're just making things more complicated, in multiple ways, to enable a feature you don't think anyone will ever need.

However, based on my own experience, I think code transformers _are_ sometimes useful, but they usually require something like byteplay. Even just something as simple as removing an unnecessary jump instruction requires reordering the arguments of every other jump; something like merging two finally blocks would be a nightmare to do manually.
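To see why deleting an instruction is not a local edit, the stdlib `dis` module can list the offsets that other instructions jump to. The function `sign` is just a sample; the exact opcodes and offsets vary between Python versions, so none are shown here.

```python
import dis


def sign(x):
    if x < 0:
        return -1
    return 1


instructions = list(dis.get_instructions(sign))

# Offsets that some other instruction jumps to. Removing any instruction
# that precedes one of these shifts the offset, so the argument of the
# jump that refers to it would have to be recomputed.
jump_targets = [ins.offset for ins in instructions if ins.is_jump_target]

for ins in instructions:
    marker = ">>" if ins.is_jump_target else "  "
    print(marker, ins.offset, ins.opname, ins.argrepr)
```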

>> * One thing I've always wanted is a way to write decorators that transform at the AST level. But code objects only have bytecode and source;
> You should take a look at MacroPy,

Yes, I love MacroPy. But it doesn't provide the functionality I'm asking about here. (It _might_ be possible to write a macro that stores the AST on each function object; I haven't tried.)

Anyway, the reason I bring it up is that it's trivial to write a decorator that byteplay-hacks a function after compilation, and not much harder to write one that text-hacks the source and recompiles it, but taking the AST and recompiling it is more painful. Since your proposal is about making similar things easier in other cases, it could be nice to do that here as well. But, as I said at the top, I realize some of these ideas are out of scope; some of them are more about getting a definite "yeah, that might be cool but it's out of scope" as opposed to not knowing whether it had even been considered.  

> Modifying and recompiling the code at runtime (using AST, something
> higher level than bytecode) sounds like a Lisp feature and like JIT
> compiler, two cool stuff ;)

Well, part of the point of Lisp is that there is only one step--effectively, your source bytes are your AST. Python has to decode, tokenize, and parse to get to the AST. But being able to start there instead of repeating that work would give us the best of both worlds (as easy to do stuff as Lisp, but as readable as Python). 

From stephen at  Sat Jan 16 04:56:48 2016
From: stephen at (Stephen J. Turnbull)
Date: Sat, 16 Jan 2016 18:56:48 +0900
Subject: [Python-ideas]  Boolean value of an Enum member
In-Reply-To: <>
References: <>
Message-ID: <>

I don't understand why this was cross-posted.  Python-Dev removed from
addressees.

Ethan Furman writes:

 > When Enum was being designed one of the questions considered was where 
 > to start autonumbering: zero or one.
 > As I remember the discussion we chose not to start with zero because we 
 > didn't want an enum member to be False by default, and having a member 
 > with value 0 be True was discordant.  So the functional API starts with 
 > 1 unless overridden.  In fact, according to the Enum docs:
 >     The reason for defaulting to ``1`` as the starting number and
 >     not ``0`` is that ``0`` is ``False`` in a boolean sense, but
 >     enum members all evaluate to ``True``.
 > However, if the Enum is combined with some other type (str, int, float, 
 > etc), then most behaviour is determined by that type -- including 
 > boolean evaluation.  So the empty string, 0 values, etc, will cause that 
 > Enum member to evaluate as False.

Seems like perfectly desirable behavior to me.  A pure enum is a set
of mutually exclusive abstract symbolic values, and if you want one of
them to have specific behavior other than that you should say so.  If
you need a falsey value for a variable that takes pure Enum values,
"None" or "False" (or both!)  seems fine to me depending on the
semantics of the variable and dataset in question, and if neither
seems to fit the bill, define __bool__.  OTOH, an Enum which is
conceptually a set of symbolic names for constants of some type should
take on the semantics of that type, including truthiness of the values

Do you have a use case where that distinction seems totally
inappropriate, or have you merely been bitten by Emerson's Hobgoblin?
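A small illustration of the distinction being discussed; the class and member names are made up for the example.

```python
from enum import Enum, IntEnum


class Status(Enum):          # pure Enum: members are abstract symbols
    IDLE = 0
    BUSY = 1


class Level(IntEnum):        # mixed with int: members behave like ints
    NONE = 0
    LOW = 1


class Label(str, Enum):      # mixed with str: members behave like strings
    EMPTY = ""
    NAME = "name"


print(bool(Status.IDLE), bool(Level.NONE), bool(Label.EMPTY))
```

The pure member is true even though its value is 0, while the mixed-in members take their truthiness from `int` and `str`.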

From encukou at  Sat Jan 16 06:06:58 2016
From: encukou at (Petr Viktorin)
Date: Sat, 16 Jan 2016 12:06:58 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/15/2016 05:10 PM, Victor Stinner wrote:
> Hi,
> This PEP 511 is part of a series of 3 PEPs (509, 510, 511) adding an API
> to implement a static Python optimizer specializing functions with
> guards.
> If the PEP is accepted, it will solve a long list of issues, some
> issues are old, like #1346238 which is 11 years old ;-) I found 12
> issues:
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> *
> I worked to make the PEP more generic than "this hook is written for
> FAT Python". Please read the full PEP to see a long list of existing
> usages in Python of code transformers.
> You may read again the discussion which occurred 4 years ago about the
> same topic:
> (the thread starts with an idea of AST optimizer, but it moves quickly
> to a generic API to transform the code)
> Thanks to Red Hat for giving me time to experiment on this.
> Victor
> HTML version:

Thanks for your efforts on making Python faster!

This PEP addresses two things that would benefit from different
approaches: let's call them optimizers and extensions.

Optimizers, such as your FAT, don't change Python semantics. They're
designed to run on *all* code, including the standard library. It makes
sense to register them as early in interpreter startup as possible, but
if they're not registered, nothing breaks (things will just be slower).
Experiments with future syntax (like when async/await was being
developed) have the same needs.

Syntax extensions, such as MacroPy or Hy, tend to target specific
modules, with which they're closely coupled: The modules won't run
without the transformer. And with other modules, the transformer either
does nothing (as with MacroPy, hopefully), or would fail altogether (as
with Hy). So, they would benefit from specific packages opting in. The
effects of enabling them globally range from inefficiency (MacroPy) to
failures or needing workarounds (Hy).

The PEP is designed for optimizers. It would be good to stick to that use
case, at least as far as the registration is concerned. I suggest noting
in the documentation that Python semantics *must* be preserved, and
renaming the API, e.g.::


The "transformer" API can be used for syntax extensions as well, but the
registration needs to be different so the effects are localized. For
example it could be something like::

    importlib.util.import_with_transformer(
        'mypackage.specialmodule', MyTransformer())

or a special flag in packages::

    __transformers_for_submodules__ = [MyTransformer()]

or extending exec (which you actually might want to add to the PEP, to
make giving examples easier)::

    exec("print('Hello World!')", transformers=[MyTransformer()])

or making it easier to write an import hook with them, etc...

but all that would probably be out of scope for your PEP.
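Until something like that exists, the localized form can be approximated with a helper. This is only a sketch: `exec_with_transformers` and `DoubleIntegers` are hypothetical names, and the transformers are assumed to be ordinary `ast.NodeTransformer` instances rather than objects following the PEP's protocol.

```python
import ast


def exec_with_transformers(source, transformers, namespace=None,
                           filename="<string>"):
    """Parse, apply each AST transformer in order, compile, and exec."""
    tree = ast.parse(source, filename)
    for transformer in transformers:
        tree = transformer.visit(tree)
        ast.fix_missing_locations(tree)
    if namespace is None:
        namespace = {}
    exec(compile(tree, filename, "exec"), namespace)
    return namespace


class DoubleIntegers(ast.NodeTransformer):
    """Toy transformer: double every integer literal."""

    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            new_node = ast.Constant(value=node.value * 2)
            return ast.copy_location(new_node, node)
        return node


ns = exec_with_transformers("x = 21", [DoubleIntegers()])
print(ns["x"])
```

Only the code passed to the helper is affected, which is the localization property asked for above.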

Another thing: this snippet from the PEP sounds too verbose::

    transformers = sys.get_code_transformers()
    transformers.insert(0, new_cool_transformer)

Can this just be a list, as with sys.path? Using the "optimizers" term::

    sys.global_optimizers.insert(0, new_cool_transformer)


    def code_transformer(code, consts, names, lnotab, context):

It's a function, so it would be better to name it::

    def transform_code(code):

And this::

    def ast_transformer(tree, context):

might work better with keyword arguments::

    def transform_ast(tree, *, filename, **kwargs):

otherwise people might use context objects with other attributes than
"filename", breaking when a future PEP assigns a specific meaning to them.

It actually might be good to make the code transformer API extensible as
well, and synchronize with the AST transformer::

    def transform_code(code, *, filename, **kwargs):
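A short sketch of why the keyword-only form is easier to extend. The option `optimization_level` is a made-up future argument, and the transformer body is a placeholder.

```python
import ast


def transform_ast(tree, *, filename, **kwargs):
    # Options this transformer does not know about arrive in kwargs and
    # are ignored, so options added later do not break it.
    return tree


tree = ast.parse("x = 1")

# Works today, and keeps working if the caller starts passing new options.
same_tree = transform_ast(tree, filename="example.py")
still_same = transform_ast(tree, filename="example.py", optimization_level=2)

# filename cannot be passed positionally, so its meaning stays unambiguous.
try:
    transform_ast(tree, "example.py")
except TypeError as exc:
    positional_error = exc
else:
    positional_error = None
print(positional_error)
```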

From sjoerdjob at  Sat Jan 16 11:22:35 2016
From: sjoerdjob at (Sjoerd Job Postmus)
Date: Sat, 16 Jan 2016 17:22:35 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 16, 2016 at 12:06:58PM +0100, Petr Viktorin wrote:
> The "transformer" API can be used for syntax extensions as well, but the
> registration needs to be different so the effects are localized. For
> example it could be something like::
>     importlib.util.import_with_transformer(
>         'mypackage.specialmodule', MyTransformer())
> or a special flag in packages::
>     __transformers_for_submodules__ = [MyTransformer()]
> or extending exec (which you actually might want to add to the PEP, to
> make giving examples easier)::
>     exec("print('Hello World!')", transformers=[MyTransformer()])
> or making it easier to write an import hook with them, etc...

So, you'd have to supply the transformer used before importing? That
seems like a troublesome solution to me.

A better approach (to me) would require being able to document what
transformers need to be run inside the module itself. Something like

    #:Transformers modname.TransformerClassName, modname.OtherTransformerClassName

The reason why I would prefer this, is that it makes sense to document
the transformers needed in the module itself, instead of in the code
importing the module.

As you suggest (and rightly so) to localize the effects of the
registration, it makes sense to do the registration in the affected
module.

Of course there might be some cases where you want to import a module
using a transformer it does not need to know about, but I think that
would be less likely than the case where a module knows what
transformers should be applied to it.

As an added bonus, it would let you apply transformers to the
entry-point:

    #!/usr/bin/env python
    #:Transformers foo.BarTransformerMyCodeCanNotRunWithout
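A sketch of how an import hook might read such a declaration. The `#:Transformers` syntax is the one suggested in this message; the helper `declared_transformers` and the rule that only the leading comment block is scanned are assumptions made for the example.

```python
import re

_DECLARATION = re.compile(r"^#:Transformers\s+(.+)$")


def declared_transformers(source):
    """Return dotted transformer names declared in the leading comments."""
    names = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not stripped.startswith("#"):
            break  # first line of real code: stop scanning
        match = _DECLARATION.match(stripped)
        if match:
            names.extend(part.strip()
                         for part in match.group(1).split(",")
                         if part.strip())
    return names


example = (
    "#!/usr/bin/env python\n"
    "#:Transformers foo.BarTransformerMyCodeCanNotRunWithout\n"
    "print('hello')\n"
)
print(declared_transformers(example))
```

Resolving the dotted names to actual transformer objects, and deciding what to do when one cannot be imported, is left out here.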

But as you said, this support is probably outside the scope of the PEP
anyway.

Kind regards,
Sjoerd Job

From kevinjacobconway at  Sat Jan 16 11:56:05 2016
From: kevinjacobconway at (Kevin Conway)
Date: Sat, 16 Jan 2016 16:56:05 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

I'm a big fan of your motivation to build an optimizer for CPython code.
What I'm struggling with is understanding why this requires a PEP and
language modification. There are already several projects that manipulate
the AST for performance gains such as [1] or even my own ham fisted attempt

Would you please elaborate on why these external approaches fail and how
language modifications would make your approach successful?


On Sat, Jan 16, 2016, 10:30 Sjoerd Job Postmus <sjoerdjob at> wrote:

> On Sat, Jan 16, 2016 at 12:06:58PM +0100, Petr Viktorin wrote:
> > The "transformer" API can be used for syntax extensions as well, but the
> > registration needs to be different so the effects are localized. For
> > example it could be something like::
> >
> >     importlib.util.import_with_transformer(
> >         'mypackage.specialmodule', MyTransformer())
> >
> > or a special flag in packages::
> >
> >     __transformers_for_submodules__ = [MyTransformer()]
> >
> > or extending exec (which you actually might want to add to the PEP, to
> > make giving examples easier)::
> >
> >     exec("print('Hello World!')", transformers=[MyTransformer()])
> >
> > or making it easier to write an import hook with them, etc...
> So, you'd have to supply the transformer used before importing? That
> seems like a troublesome solution to me.
> A better approach (to me) would require being able to document what
> transformers need to be run inside the module itself. Something like
>     #:Transformers modname.TransformerClassName,
> modname.OtherTransformerClassName
> The reason why I would prefer this, is that it makes sense to document
> the transformers needed in the module itself, instead of in the code
> importing the module.
> As you suggest (and rightly so) to localize the effects of the
> registration, it makes sense to do the registration in the affected
> module.
> Of course there might be some cases where you want to import a module
> using a transformer it does not need to know about, but I think that
> would be less likely than the case where a module knows what
> transformers there should be applied.
> As an added bonus, it would let you apply transformers to the
> entry-point:
>     #!/usr/bin/env python
>     #:Transformers foo.BarTransformerMyCodeCanNotRunWithout
> But as you said, this support is probably outside the scope of the PEP
> anyway.
> Kind regards,
> Sjoerd Job

From andre.roberge at  Sat Jan 16 12:00:39 2016
From: andre.roberge at (Andre Roberge)
Date: Sat, 16 Jan 2016 13:00:39 -0400
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Sat, Jan 16, 2016 at 12:22 PM, Sjoerd Job Postmus <sjoerdjob at>

> >
> > or making it easier to write an import hook with them, etc...
> So, you'd have to supply the transformer used before importing? That
> seems like a troublesome solution to me.
> A better approach (to me) would require being able to document what
> transformers need to be run inside the module itself. Something like
>     #:Transformers modname.TransformerClassName,
> modname.OtherTransformerClassName
> The reason why I would prefer this, is that it makes sense to document
> the transformers needed in the module itself, instead of in the code
> importing the module.

+1 for this (but see below).  This is the approach I used when playing with
import hooks as shown in

and a few other posts I wrote about similar transformations.

> As you suggest (and rightly so) to localize the effects of the
> registration, it makes sense to do the registration in the affected
> module.
> Of course there might be some cases where you want to import a module
> using a transformer it does not need to know about, but I think that
> would be less likely than the case where a module knows what
> transformers there should be applied.
> As an added bonus, it would let you apply transformers to the
> entry-point:
>     #!/usr/bin/env python
>     #:Transformers foo.BarTransformerMyCodeCanNotRunWithout
> But as you said, this support is probably outside the scope of the PEP
> anyway.

While I would like to see some standard way to apply code transformations,
I agree that this is likely (and unfortunately) outside the scope of this
PEP.

 André Roberge

> Kind regards,
> Sjoerd Job

From ethan at  Sat Jan 16 13:17:56 2016
From: ethan at (Ethan Furman)
Date: Sat, 16 Jan 2016 10:17:56 -0800
Subject: [Python-ideas] Boolean value of an Enum member
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/16/2016 01:56 AM, Stephen J. Turnbull wrote:

 > I don't understand why this was cross-posted.  Python-Dev removed from
 > addressees.

Not everyone reads both lists, and I wanted the widest audience.

 > Ethan Furman writes:

 >> However, if the Enum is combined with some other type (str, int,
 >> float, etc), then most behaviour is determined by that type --
 >> including boolean evaluation.  So the empty string, 0 values, etc,
 >> will cause that Enum member to evaluate as False.
 > Do you have a use case where that distinction seems totally
 > inappropriate, or have you merely been bitten by Emerson's Hobgoblin?
Sadly, it was a case of failing memory.


From ethan at  Sat Jan 16 13:18:15 2016
From: ethan at (Ethan Furman)
Date: Sat, 16 Jan 2016 10:18:15 -0800
Subject: [Python-ideas] Boolean value of an Enum member
In-Reply-To: <>
References: <> <>
Message-ID: <>

[resending to lists -- sorry, Greg]

On 01/15/2016 12:36 PM, Greg Ewing wrote:
> Ethan Furman wrote:

>> So the question now is:  for a standard Enum (meaning no other type
>> besides Enum is involved) should __bool__ look to the value of the
>> Enum member to determine True/False, or should we always be True by
>> default and make the Enum creator add their own __bool__ if they want
>> something different?
> Can't you just specify a starting value of 0 if you
> want the enum to have a false value? That doesn't
> seem too onerous to me.

You can start with zero, but unless the Enum is mixed with a numeric 
type it will evaluate to True.

Also, there are other falsey values that a pure Enum member could 
have:  False, None, '', etc., to name a few.

However, as Barry said, writing your own is a whopping two lines of code:

   def __bool__(self):
     return bool(self._value_)
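
The three cases discussed in this thread can be compared side by side.
This is a minimal sketch, assuming the behaviour documented for current
Python 3 releases; the class names Plain, Mixed and ValueBased are
made up for illustration.

```python
from enum import Enum, IntEnum


class Plain(Enum):
    # Pure Enum: members are truthy regardless of their value.
    ZERO = 0
    EMPTY = ""


class Mixed(IntEnum):
    # Mixed with int: truthiness follows the int value.
    ZERO = 0
    ONE = 1


class ValueBased(Enum):
    # Pure Enum that opts in to value-based truthiness.
    ZERO = 0
    ONE = 1

    def __bool__(self):
        return bool(self._value_)


print(bool(Plain.ZERO), bool(Plain.EMPTY))          # True True
print(bool(Mixed.ZERO), bool(Mixed.ONE))            # False True
print(bool(ValueBased.ZERO), bool(ValueBased.ONE))  # False True
```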

With Barry and Guido's feedback this issue is closed.

Thanks everyone!


From brett at  Sat Jan 16 13:28:42 2016
From: brett at (Brett Cannon)
Date: Sat, 16 Jan 2016 18:28:42 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, 15 Jan 2016 at 09:40 Victor Stinner <victor.stinner at> wrote:

> 2016-01-15 18:22 GMT+01:00 Brett Cannon <brett at>:
> > I just wanted to point out to people that the key part of this PEP is the
> > change in semantics of `-O` accepting an argument.
> To be exact, it's a new "-o arg" option, it's different from -O and
> -OO (uppercase). Since I don't know what to do with -O and -OO, I
> simply kept them :-D
> > I should also point out that this does get tricky in terms of how to
> > handle the stdlib if you have not pre-compiled it, e.g., if the first
> > module imported by Python is the encodings module then how to make sure
> > the AST optimizers are ready to go by the time that import happens?
> Since importlib reads sys.implementation.optim_tag at each import, it
> works fine.
> For example, you start with "opt" optimizer tag. You import everything
> needed for fatoptimizer. Then calling sys.set_code_transformers() will
> set a new optimizer flag (ex: "fat-opt"). But it works since the
> required code transformers are now available.

I understand all of that; my point is what if you don't compile the stdlib
for your optimization? You have to import over 20 modules before user code
gets imported. My question is how do you expect the situation to be handled
where you didn't optimize the stdlib since the 'encodings' module is
imported before anything else? If you set your `-o` flag and you want to
fail imports if the .pyc isn't there, then wouldn't that mean you are going
to fail immediately when you try and import 'encodings' in Py_Initialize()?

> The tricky part is more when you want to deploy an application without
> the code transformer, you have to ensure that all .py files are
> compiled to .pyc. But there is no technical issues to compile them,
> it's more a practical issue.
> See my second email with a lot of commands, I showed how .pyc are
> created with different .pyc filenames. Or follow my commands to try my
> "fatpython" fork to play yourself with the code ;-)
> > And lastly, Victor proposes that all .pyc files get an optimization tag.
> > While there is nothing technically wrong with that, PEP 488 purposefully
> > didn't do that in the default case for backwards-compatibility, so that
> > will need to be at least mentioned in the PEP.
> The PEP already contains:
> "Remove also the special case for the optimizer level 0 with the
> default optimizer tag 'opt' to simplify the code."
> Code relying on the exact .pyc filename (like unit tests) already have
> to be modified to use the optimizer tag. It's just an opportunity to
> simplify the code. I don't really care of this specific change ;-)

Right, it's just that the backwards-compatibility issue should be mentioned
in the PEP.
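
For reference, the existing PEP 488 naming can be inspected with
importlib.util.cache_from_source(). This is only a sketch of the current
behaviour; "spam.py" is a placeholder file name, and the exact directory
and interpreter tag in the result depend on the running Python.

```python
import importlib.util

source = "spam.py"  # placeholder; the file does not need to exist

# Explicitly unoptimized: PEP 488 leaves the "opt-" tag out entirely,
# which is the backwards-compatible default case.
plain = importlib.util.cache_from_source(source, optimization="")

# Optimization level 1 (what -O produces): an "opt-1" tag is added.
opt1 = importlib.util.cache_from_source(source, optimization=1)

print(plain)
print(opt1)
```

Code that hard-codes the untagged name is what would need updating if
every .pyc file carried a tag.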


From ncoghlan at  Sat Jan 16 22:38:47 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 17 Jan 2016 13:38:47 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On 17 January 2016 at 04:28, Brett Cannon <brett at> wrote:
> On Fri, 15 Jan 2016 at 09:40 Victor Stinner <victor.stinner at>
> wrote:
>> 2016-01-15 18:22 GMT+01:00 Brett Cannon <brett at>:
>> > I just wanted to point out to people that the key part of this PEP is
>> > the change in semantics of `-O` accepting an argument.
>> To be exact, it's a new "-o arg" option, it's different from -O and
>> -OO (uppercase). Since I don't know what to do with -O and -OO, I
>> simply kept them :-D
>> > I should also point out that this does get tricky in terms of how to
>> > handle the stdlib if you have not pre-compiled it, e.g., if the first
>> > module imported by Python is the encodings module then how to make sure
>> > the AST optimizers are ready to go by the time that import happens?
>> Since importlib reads sys.implementation.optim_tag at each import, it
>> works fine.
>> For example, you start with "opt" optimizer tag. You import everything
>> needed for fatoptimizer. Then calling sys.set_code_transformers() will
>> set a new optimizer flag (ex: "fat-opt"). But it works since the
>> required code transformers are now available.
> I understand all of that; my point is what if you don't compile the stdlib
> for your optimization? You have to import over 20 modules before user code
> gets imported. My question is how do you expect the situation to be handled
> where you didn't optimize the stdlib since the 'encodings' module is
> imported before anything else? If you set your `-o` flag and you want to
> fail imports if the .pyc isn't there, then wouldn't that mean you are going
> to fail immediately when you try and import 'encodings' in Py_Initialize()?

I don't think that's a major problem - it seems to me that it's the
same as going for "pyc only" deployment with an embedded Python
interpreter, and then forgetting to include a precompiled standard library in
addition to your own components. Yes, it's going to fail, but the bug
is in the build process for your deployment artifacts rather than in
the runtime behaviour of CPython.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Jan 16 22:49:52 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 17 Jan 2016 13:49:52 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 17 January 2016 at 02:22, Sjoerd Job Postmus <sjoerdjob at> wrote:
> On Sat, Jan 16, 2016 at 12:06:58PM +0100, Petr Viktorin wrote:
>> The "transformer" API can be used for syntax extensions as well, but the
>> registration needs to be different so the effects are localized. For
>> example it could be something like::
>>     importlib.util.import_with_transformer(
>>         'mypackage.specialmodule', MyTransformer())
>> or a special flag in packages::
>>     __transformers_for_submodules__ = [MyTransformer()]
>> or extending exec (which you actually might want to add to the PEP, to
>> make giving examples easier)::
>>     exec("print('Hello World!')", transformers=[MyTransformer()])
>> or making it easier to write an import hook with them, etc...
> So, you'd have to supply the transformer used before importing? That
> seems like a troublesome solution to me.

I think Sjoerd's confusion here is a strong argument in favour of
clearly and permanently distinguishing semantics preserving code
optimizers (which can be sensibly applied externally and/or globally),
and semantically significant code transformers (which also need to be
taken into account when *reading* the code, and hence should be
visible locally, at least at the module level, and often at the
function level).

Making that distinction means we can be clear that the transformation
case is already well served by import hooks that process alternate
filename extensions rather than standard Python source or bytecode
files, encoding cookie tricks (which are visible as a comment in the
module header), and function decorators that alter the semantics of
the functions they're applied to.

The case which *isn't* currently well served is transparently applying
a semantics preserving code optimiser like FAT Python - that's a
decision for the person *running* the code, rather than the person
writing it, so this PEP is about providing the hooks at the
interpreter level to let them do that. While we can't *prevent* people
from using these new hooks with semantically significant transformers,
we *can* make it clear that we think actually doing so is a bad idea, as
it is likely to result in a tightly coupled, hard to maintain code base
that can't even be read reliably without understanding the transforms
that are being implicitly applied.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Jan 16 22:59:56 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 17 Jan 2016 13:59:56 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 17 January 2016 at 02:56, Kevin Conway <kevinjacobconway at> wrote:
> I'm a big fan of your motivation to build an optimizer for cPython code.
> What I'm struggling with is understanding why this requires a PEP and
> language modification. There are already several projects that manipulate
> the AST for performance gains such as [1] or even my own ham fisted attempt
> [2].
> Would you please elaborate on why these external approaches fail and how
> language modifications would make your approach successful.

Existing external optimizers (including Victor's own astoptimizer, the
venerable psyco, static compilers like Cython, and dynamic compilers
like Numba) make simplifying assumptions that technically break some
of Python's expected runtime semantics. They get away with that by
relying on the assumption that people will only apply them in
situations where the semantic differences don't matter.

That's not good enough for optimization passes that are enabled
globally: those need to be semantics *preserving*, so they can be
applied blindly to any piece of Python code, with the worst possible
outcome being "the optimization was automatically bypassed or disabled
at runtime due to its prerequisites no longer being met".

The PyPy JIT actually works in much the same way, it just does it
dynamically at runtime by tracing frequently run execution paths. This
is both a strength (it allows even more optimal code generation based
on the actual running application), and a weakness (it requires time
for the JIT to warm up by identifying critical execution paths,
tracing them, and substituting the optimised code).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From abarnert at  Sat Jan 16 23:28:54 2016
From: abarnert at (Andrew Barnert)
Date: Sat, 16 Jan 2016 20:28:54 -0800
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Jan 16, 2016, at 19:49, Nick Coghlan <ncoghlan at> wrote:
> Making that distinction means we can be clear that the transformation
> case is already well served by import hooks that process alternate
> filename extensions rather than standard Python source or bytecode
> files, encoding cookie tricks (which are visible as a comment in the
> module header), and function decorators that alter the semantics of
> the functions they're applied to.
> The case which *isn't* currently well served is transparently applying
> a semantics preserving code optimiser like FAT Python - that's a
> decision for the person *running* the code, rather than the person
> writing it...

I think something that isn't made clear in the rationale is why an import hook is good enough for most semantic extensions, but isn't good enough for global optimizers.

After all, it's not that hard to write a module that installs an import hook for normal .py files instead of .hy or .pyq or whatever files. Then, to optimize your own code, or a third-party library, you just import the optimizer module first; to optimize an application, you write a 2- or 3-line wrapper (which can be trivially automated a la setuptools entry point scripts) to import the optimizer and then start the app.
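
A quick-and-dirty version of such a hook might look like the sketch below. Everything named here (FoldIntArithmetic, OptimizingLoader, install) is hypothetical, the "optimizer" only folds +, - and * on integer literals, and the loader deliberately ignores .pyc files altogether instead of handling them properly.

```python
import ast
import operator
import sys
from importlib.machinery import (
    BYTECODE_SUFFIXES, EXTENSION_SUFFIXES, SOURCE_SUFFIXES,
    ExtensionFileLoader, FileFinder, SourceFileLoader, SourcelessFileLoader,
)

_INT_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class FoldIntArithmetic(ast.NodeTransformer):
    """Toy optimizer: fold +, - and * applied to two integer literals."""

    folded = 0  # class-level counter, only here so the effect is observable

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold nested operands first
        func = _INT_OPS.get(type(node.op))
        left, right = node.left, node.right
        if (func is not None
                and isinstance(left, ast.Constant) and type(left.value) is int
                and isinstance(right, ast.Constant) and type(right.value) is int):
            FoldIntArithmetic.folded += 1
            value = func(left.value, right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node


class OptimizingLoader(SourceFileLoader):
    """Compile .py files through the toy optimizer, never touching .pyc files."""

    def get_code(self, fullname):
        path = self.get_filename(fullname)
        tree = ast.parse(self.get_data(path), filename=path)
        tree = ast.fix_missing_locations(FoldIntArithmetic().visit(tree))
        return compile(tree, path, "exec", dont_inherit=True)


def install():
    """Put a FileFinder that uses OptimizingLoader ahead of the default one."""
    details = [
        (ExtensionFileLoader, EXTENSION_SUFFIXES),
        (OptimizingLoader, SOURCE_SUFFIXES),
        (SourcelessFileLoader, BYTECODE_SUFFIXES),
    ]
    sys.path_hooks.insert(0, FileFinder.path_hook(*details))
    sys.path_importer_cache.clear()  # drop finders created by the old hook
```

Calling install() before importing the target code is the "import the optimizer module first" step. Modules that were already imported, including the parts of the stdlib loaded during startup, are not affected.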

There are good reasons that isn't sufficient. For example, parts of the stdlib have already been imported before the top of the main module. While there are ways around that (I believe FAT comes with a script to recompile the stdlib into a venv or something?), they're clumsy and ad hoc, and it's unlikely two different optimizers would play nicely together. Also, making it work in a sensible way with .pyc files takes a decent amount of code, and will again be an ad-hoc solution that won't play well with other projects doing similar things. And there are people who write and execute long-running, optimization-ripe bits of code in the REPL (or at least in an IPython notebook), and that can't be handled with an import hook. Nor can code that extensively uses exec. And probably other reasons I haven't thought of.

Maybe the PEP should explain those reasons, so it's clear why this feature will help projects like FAT.

Then again, some of those same reasons seem to apply equally well to semantic extensions. Two extensions are no more likely to play together as import hooks than two optimizers, and yet in many cases there's no syntactic or semantic reason they couldn't. Extensions are probably even more useful than optimizations at the REPL. And so on. And this is all even more true for extensions that people write to explore a new feature idea than for things people want to publish as deployable code.

So, I'm still not convinced that the distinction really is critical here. I definitely don't see why it's a negative that the PEP can serve both purposes, even if people only want one of them served (normally, Python doesn't go out of its way to prevent writing certain kinds of code, it just becomes accepted that such code is not idiomatic; only when there's a real danger of attractive nuisance is the language modified to ban it), and I think it's potentially a positive.

From ncoghlan at  Sun Jan 17 01:06:41 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 17 Jan 2016 16:06:41 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 17 January 2016 at 14:28, Andrew Barnert <abarnert at> wrote:
> So, I'm still not convinced that the distinction really is critical here. I definitely don't see why it's a negative that the PEP can serve both purposes,

The main problem with globally enabled transformations of any kind is
that action at a distance in software design is generally a *bad
thing*. Python's tolerant of it because sometimes it's a *necessary*
thing that actually makes code more maintainable - using
monkeypatching for use cases like testing and monitoring means those
cases can be ignored when reading and writing the code, using
metaclasses lets you enlist the interpreter in defining "class-like"
objects that differ in some specific way from normal ones (e.g. ORMs,
ABCs, enums), using codecs lets you more easily provide configurable
encoding and decoding behaviour, etc.

While relying too heavily on those kinds of features can significantly
harm debuggability, the pay-off in readability is worth it often
enough for them to be officially supported language and runtime
features.

The kind of code transformation hooks that Victor is talking about
here are the ultimate in action at a distance - if it wants to, an
"optimizer" can completely throw away your code and substitute its
own. Import hooks do indeed give you a comparable level of power (at
least if you go so far as to write your own meta_path hook), but also
still miss the code that Python runs without importing it (__main__,
exec, eval, runpy, etc).

> even if people only want one of them served (normally, Python doesn't go out of its way to prevent writing certain kinds of code, it just becomes accepted that such code is not idiomatic; only when there's a real danger of attractive nuisance is the language modified to ban it), and I think it's potentially a positive.

That's all I'm suggesting - I think the proposed hooks should be
designed for globally enabled optimizations (and named accordingly),
but I don't think we should erect any specific barriers against using
them for other things. Designing them that way will provide a healthy
nudge towards the primary intended use case (transparently enabling
semantically compatible code optimizations), while still providing a
new transformation technique to projects like MacroPy.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From abarnert at  Sun Jan 17 01:22:57 2016
From: abarnert at (Andrew Barnert)
Date: Sat, 16 Jan 2016 22:22:57 -0800
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Jan 16, 2016, at 22:06, Nick Coghlan <ncoghlan at> wrote:
>> On 17 January 2016 at 14:28, Andrew Barnert <abarnert at> wrote:
>> So, I'm still not convinced that the distinction really is critical here. I definitely don't see why it's a negative that the PEP can serve both purposes,
> The main problem with globally enabled transformations of any kind is
> that action at a distance in software design is generally a *bad
> thing*. Python's tolerant of it because sometimes it's a *necessary*
> thing that actually makes code more maintainable


> That's all I'm suggesting - I think the proposed hooks should be
> designed for globally enabled optimizations (and named accordingly),
> but I don't think we should erect any specific barriers against using
> them for other things. Designing them that way will provide a healthy
> nudge towards the primary intended use case (transparently enabling
> semantically compatible code optimizations), while still providing a
> new transformation technique to projects like MacroPy.

OK, then I agree 100% on this part.

But on the main point, I still think it's important for the PEP to explain why import hooks aren't good enough for semantically-neutral global optimizations. As I said, I can think of multiple answers (top-level code, interaction with .pyc files, etc.), but as long as the PEP doesn't give those answers, people are going to keep asking (even years from now, when people want to know why TOOWTDI didn't apply here).

From victor.stinner at  Sun Jan 17 06:48:59 2016
From: victor.stinner at (Victor Stinner)
Date: Sun, 17 Jan 2016 12:48:59 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

2016-01-16 12:06 GMT+01:00 Petr Viktorin <encukou at>:
> This PEP addresses two things that would benefit from different
> approaches: let's call them optimizers and extensions.
> Optimizers, such as your FAT, don't change Python semantics. They're
> designed to run on *all* code, including the standard library. It makes
> sense to register them as early in interpreter startup as possible, but
> if they're not registered, nothing breaks (things will just be slower).
> Experiments with future syntax (like when async/await was being
> developed) have the same needs.
> Syntax extensions, such as MacroPy or Hy, tend to target specific
> modules, with which they're closely coupled: The modules won't run
> without the transformer. And with other modules, the transformer either
> does nothing (as with MacroPy, hopefully), or would fail altogether (as
> with Hy). So, they would benefit from specific packages opting in. The
> effects of enabling them globally range from inefficiency (MacroPy) to
> failures or needing workarounds (Hy).

To be clear, Hylang will not benefit from my PEP. That's why it is not
mentioned in the PEP.

"Syntax extensions" only look like a special case of optimizers. I'm not
sure that it's worth making them really different.

> The PEP is designed for optimizers. It would be good to stick to that use
> case, at least as far as the registration is concerned. I suggest noting
> in the documentation that Python semantics *must* be preserved, and
> renaming the API, e.g.::
>     sys.set_global_optimizers([])

I would prefer to not restrict the PEP to a specific usage.

> The "transformer" API can be used for syntax extensions as well, but the
> registration needs to be different so the effects are localized. For
> example it could be something like::
>     importlib.util.import_with_transformer(
>         'mypackage.specialmodule', MyTransformer())

Brett may help on this part. I don't think that it's the best way to use
importlib. importlib is already pluggable. As I wrote in the PEP, MacroPy
uses an import hook. (Maybe it should continue to use an import hook?)

> or a special flag in packages::
>     __transformers_for_submodules__ = [MyTransformer()]

Does it mean that you have to parse a .py file to then decide how to
transform it? It will slow down compilation of code not using transformers.

I would prefer to do that differently: always register transformers very
early, but configure each transformer to only apply it on some files. The
transformer can use the filename (file extension? importlib is currently
restricted to .py files by default no?), it can use a special variable in
the file (ex: fatoptimizer searches for a __fatoptimizer__ variable which is
used to configure the optimizer), a configuration loaded when the
transformer is created, etc.

> or extending exec (which you actually might want to add to the PEP, to
> make giving examples easier)::
>     exec("print('Hello World!')", transformers=[MyTransformer()])

There are a lot of ways to load, compile and execute code. Starting to add
optional parameters will end up like my old PEP 410, which was rejected
because it added an optional parameter to a lot of functions (at least 16
functions!). (It was not the only reason to reject the PEP.)
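
For the AST case, what the proposed exec() parameter would do can be
approximated today with a small wrapper. This is a sketch only:
exec_with_transformers() and ShoutStrings are hypothetical names, and
ShoutStrings stands in for the MyTransformer of the quoted example.

```python
import ast


class ShoutStrings(ast.NodeTransformer):
    """Stand-in for MyTransformer: upper-case every string literal."""

    def visit_Constant(self, node):
        if isinstance(node.value, str):
            return ast.copy_location(ast.Constant(value=node.value.upper()), node)
        return node


def exec_with_transformers(source, transformers, namespace=None,
                           filename="<string>"):
    """Parse, run each AST transformer in order, then compile and exec."""
    tree = ast.parse(source, filename=filename)
    for transformer in transformers:
        tree = transformer.visit(tree)
    ast.fix_missing_locations(tree)
    namespace = {} if namespace is None else namespace
    exec(compile(tree, filename, "exec"), namespace)
    return namespace


ns = exec_with_transformers("greeting = 'Hello World!'", [ShoutStrings()])
print(ns["greeting"])  # HELLO WORLD!
```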

Brett Cannon proposed to add hooks to importlib, but it would restrict the
feature to imports. See use cases in the PEP, I would like to use the same
code transformers everywhere.

> Another thing: this snippet from the PEP sounds too verbose::
>     transformers = sys.get_code_transformers()
>     transformers.insert(0, new_cool_transformer)
>     sys.set_code_transformers(transformers)
> Can this just be a list, as with sys.path? Using the "optimizers" term::
>     sys.global_optimizers.insert(0, new_cool_transformer)

set_code_transformers() checks the transformer name and ensures that the
transformer has at least an AST transformer or a bytecode transformer.
That's why it's a function and not a simple list.
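
In pure Python, that validation could look roughly like the sketch
below. It is a stand-in, not the C implementation: the module-level
list, the ValueError exception type and the "non-empty string" rule for
the name are all assumptions made for illustration.

```python
_code_transformers = []  # stand-in for interpreter-level state


def set_code_transformers(transformers):
    """Validate every transformer before replacing the registered list."""
    checked = []
    for transformer in transformers:
        name = getattr(transformer, "name", None)
        if not isinstance(name, str) or not name:
            raise ValueError(f"{transformer!r} needs a non-empty string 'name'")
        has_ast = callable(getattr(transformer, "ast_transformer", None))
        has_code = callable(getattr(transformer, "code_transformer", None))
        if not (has_ast or has_code):
            raise ValueError(
                f"{name!r} defines neither ast_transformer nor code_transformer")
        checked.append(transformer)
    _code_transformers[:] = checked


def get_code_transformers():
    """Return a copy, so changes must go through set_code_transformers()."""
    return list(_code_transformers)


class Noop:
    name = "noop"

    def ast_transformer(self, tree, context):
        return tree


class Broken:
    name = "broken"  # no transformer methods at all


set_code_transformers([Noop()])
try:
    set_code_transformers([Broken()])
except ValueError as exc:
    print("rejected:", exc)
```

The failed call leaves the previously registered list untouched, which a
bare list attribute could not guarantee.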

set_code_transformers() also gets the AST and bytecode transformers methods
only once, to provide a simple C structure for PyAST_CompileObject
(bytecode transformers) and PyParser_ASTFromStringObject (AST transformers).

Note: sys.implementation.cache_tag is modifiable without any check. If you
mess it up, importlib will probably fail badly. And the newly added
sys.implementation.optim_tag can also be modified without any check.

> This::
>     def code_transformer(code, consts, names, lnotab, context):
> It's a function, so it would be better to name it::
>     def transform_code(code):

Fair enough :-) But I want the context parameter to pass additional
information.

Note: if we pass a code object, the filename is already in the code object,
but there is other information (see below).

> And this::
>     def ast_transformer(tree, context):
> might work better with keyword arguments::
>     def transform_ast(tree, *, filename, **kwargs):
> otherwise people might use context objects with other attributes than
> "filename", breaking when a future PEP assigns a specific meaning to them.

The idea of a context object is to be "future-proof". Future versions of
Python can add new attributes without having to modify all code
transformers (or even worse, having to use kind of "#ifdef" in the code
depending on the Python version).
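
A transformer written against such a context object might read optional
attributes defensively, as in this sketch. It assumes only "filename" is
always present; SimpleNamespace stands in for the real context type, and
StripAsserts is a made-up example transformer.

```python
import ast
from types import SimpleNamespace


class StripAsserts(ast.NodeTransformer):
    """Replace every assert statement with pass."""

    def visit_Assert(self, node):
        return ast.copy_location(ast.Pass(), node)


def ast_transformer(tree, context):
    """Drop assert statements when the context reports optimization."""
    # Attributes other than 'filename' are looked up with a default, so
    # the same code works whether or not the interpreter provides them.
    level = getattr(context, "optimization_level", 0)
    if level >= 1:
        tree = ast.fix_missing_locations(StripAsserts().visit(tree))
    return tree


def count_asserts(tree):
    return sum(isinstance(node, ast.Assert) for node in ast.walk(tree))


source = "assert x > 0\ny = x\n"
old_style = SimpleNamespace(filename="demo.py")
new_style = SimpleNamespace(filename="demo.py", optimization_level=1)

kept = count_asserts(ast_transformer(ast.parse(source), old_style))
stripped = count_asserts(ast_transformer(ast.parse(source), new_style))
print(kept, stripped)  # 1 0
```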

> It actually might be good to make the code transformer API extensible as
> well, and synchronize with the AST transformer::
>     def transform_code(code, *, filename, **kwargs):

**kwargs and context is basically the same, but I prefer a single parameter
rather than an ugly **kwargs. IMHO "**kwargs" cannot be called an API.

By the way, I recently added the bytecode transformers to the PEP. In fact,
we can already add more information to its context:

* compiler_flags: flags like
* optimization_level (int): 0, 1 or 2 depending on the -O and -OO command
line options
* interactive (boolean): True if interactive mode
* etc.

=> see the compiler structure in Python/compile.c.

We will have to check that these attributes make sense to other Python
implementations, or make it clear in the PEP that, as with sys.implementation,
each Python implementation can add specific attributes, and only a few of
them are always available.


From ncoghlan at  Sun Jan 17 07:36:32 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 17 Jan 2016 22:36:32 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On 17 January 2016 at 21:48, Victor Stinner <victor.stinner at> wrote:
> 2016-01-16 12:06 GMT+01:00 Petr Viktorin <encukou at>:
>> The PEP is designed for optimizers. It would be good to stick to that use
>> case, at least as far as the registration is concerned. I suggest noting
>> in the documentation that Python semantics *must* be preserved, and
>> renaming the API, e.g.::
>>     sys.set_global_optimizers([])
> I would prefer to not restrict the PEP to a specific usage.

The problem I see with making the documentation and naming too generic
is that people won't know what the feature is useful for - a generic
term like "transformer" accurately describes these units of code, but
provides no hint as to why a developer might care about their

However, if the reason we're adding the capability is to make global
static optimizers feasible, then we can describe it accordingly (so
the answer to "Why does this feature exist?" becomes relatively self
evident), and have the fact that the feature can actually be used for
arbitrary transforms be an added bonus rather than the core intent.

Alternatively, we could follow the example of the atexit module, and
provide these hook registration capabilities through a new "atcompile"
module rather than through the sys module. Doing that would also
provide a namespace for doing things like allowing runtime caching of
compiled code objects - if there's no caching mechanism, then
optimising code compiled at runtime (rather than loading pre-optimised
code from bytecode files) could easily turn into a pessimisation if
the optimiser takes more time to run than is gained back in a single
execution of the optimised code relative to the unoptimised code.
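
As a rough sketch of what such runtime caching could mean, the helper
below keeps compiled code objects in a dictionary so that transformers
run once per distinct source text. The name compile_cached() and the
cache key are assumptions for illustration; no "atcompile" module
exists.

```python
import ast
import hashlib

_code_cache = {}  # (source digest, filename, transformer types) -> code


def compile_cached(source, filename, transformers=()):
    """Compile source through the transformers, reusing earlier results."""
    key = (
        hashlib.sha256(source.encode("utf-8")).hexdigest(),
        filename,
        tuple(type(t).__qualname__ for t in transformers),
    )
    code = _code_cache.get(key)
    if code is None:
        tree = ast.parse(source, filename=filename)
        for transformer in transformers:
            tree = transformer.visit(tree)
        ast.fix_missing_locations(tree)
        code = _code_cache[key] = compile(tree, filename, "exec")
    return code


first = compile_cached("x = 1 + 2", "<demo>")
second = compile_cached("x = 1 + 2", "<demo>")
print(first is second)  # True
```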


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From abarnert at  Sun Jan 17 11:27:03 2016
From: abarnert at (Andrew Barnert)
Date: Sun, 17 Jan 2016 08:27:03 -0800
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 17, 2016, at 03:48, Victor Stinner <victor.stinner at> wrote:
> 2016-01-16 12:06 GMT+01:00 Petr Viktorin <encukou at>:
> > The PEP is designed for optimizers. It would be good to stick to that use
> > case, at least as far as the registration is concerned. I suggest noting
> > in the documentation that Python semantics *must* be preserved, and
> > renaming the API, e.g.::
> >
> >     sys.set_global_optimizers([])
> I would prefer to not restrict the PEP to a specific usage.
> > The "transformer" API can be used for syntax extensions as well, but the
> > registration needs to be different so the effects are localized. For
> > example it could be something like::
> >
> >     importlib.util.import_with_transformer(
> >         'mypackage.specialmodule', MyTransformer())
> Brett may help on this part. I don't think that it's the best way to use importlib. importlib is already pluggable. As I wrote in the PEP, MacroPy uses an import hook. (Maybe it should continue to use an import hook?)
> > or a special flag in packages::
> >
> >     __transformers_for_submodules__ = [MyTransformer()]
> Does it mean that you have to parse a .py file to then decide how to transform it? It will slow down compilation of code not using transformers.
> I would prefer to do that differently: always register transformers very early, but configure each transformer to only apply it on some files.

At that point, you're exactly duplicating what can be done with import hooks.

I think this is part of the reason Nick suggested the PEP should largely ignore the issue of syntax extensions and experiments: because then you don't have to solve Petr's problem. Globally-applicable optimizers are either on or off globally, so the only API you need to control them is a simple global list. The fact that this same API works for some uses of extensions doesn't matter; the fact that it doesn't work for some other uses of extensions also doesn't matter; just design it for the intended use.

> The transformer can use the filename (file extension? importlib is currently restricted to .py files by default no?),

Everything goes through the same import machinery. The usual importer gets registered for .py files. Something like hylang can register for a different extension. Something like MacroPy can wrap the usual importer, then register to take over for .py files. (This isn't quite what MacroPy does, because it's designed to work with older versions of Python, with less powerful/simple customization opportunities, but it's what a new MacroPy-like project would do.) A global optimizer could also be written that way today. And doing this is a couple dozen lines of code (or about 5 lines to do it as a quick&dirty hack without worrying about portability or backward/forward compatibility).

The reason your PEP is necessary, I believe, is to overcome the limitations of such an import hook: to work at the REPL/notebook/etc. level, to allow multiple optimizers to play nicely without them having to agree on some wrapping protocol, to work with exec, etc. By keeping things simple and only serving the global case, you can (or, rather, you already have) come up with easier solutions to those issues--no need for enabling/disabling files by type or other information, no need for extra optional parameters to exec, etc.

(Or, if you aren't trying to overcome those limitations, then I'm not sure why your PEP is necessary. Import hooks already work, after all.)

> > Another thing: this snippet from the PEP sounds too verbose::
> >
> >     transformers = sys.get_code_transformers()
> >     transformers.insert(0, new_cool_transformer)
> >     sys.set_code_transformers(transformers)
> >
> > Can this just be a list, as with sys.path? Using the "optimizers" term::
> >
> >     sys.global_optimizers.insert(0, new_cool_transformer)
> set_code_transformers() checks the transformer name and ensures that the transformer has at least an AST transformer or a bytecode transformer. That's why it's a function and not a simple list.

That doesn't seem necessary. After all, sys.path doesn't check that you aren't assigning non-strings or strings that don't make valid paths, and nobody has ever complained that it's too hard to debug the case where you write `sys.path.insert(0, {1, 2, 3})` because the error comes at import time instead of locally.


From greg.ewing at  Sun Jan 17 16:54:10 2016
From: greg.ewing at (Greg Ewing)
Date: Mon, 18 Jan 2016 10:54:10 +1300
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

Concerning ways to allow a module to opt in to transformations
that change semantics, my first thought was to use an import
from a magic module:

from __extensions__ import modulename

This would have to appear before any other statements or
non-magic imports, like __future__ does. The named module
would be imported at compile time and some suitable
convention used to extract transformers from it.

The problem is that if your extension is in a package,
you want to be able to write

from __extensions__ import packagename.modulename

which is not valid syntax.

So instead of a magic module, maybe a magic namespace

import __extensions__.packagename.modulename
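
The syntax claim can be checked with the parser alone. This is only a sketch of the grammar point: it parses the two spellings without importing anything, and `__extensions__` has no special meaning in current Python.

```python
import ast


def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# A dotted name is not allowed after `import` in a from-import ...
dotted_from_import = parses("from __extensions__ import packagename.modulename")
# ... but a dotted module path is fine in a plain import statement.
magic_namespace = parses("import __extensions__.packagename.modulename")
```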


From encukou at  Mon Jan 18 04:50:49 2016
From: encukou at (Petr Viktorin)
Date: Mon, 18 Jan 2016 10:50:49 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/17/2016 12:48 PM, Victor Stinner wrote:
> 2016-01-16 12:06 GMT+01:00 Petr Viktorin <encukou at>:
>> This PEP addresses two things that would benefit from different
>> approaches: let's call them optimizers and extensions.
>> Optimizers, such as your FAT, don't change Python semantics. They're
>> designed to run on *all* code, including the standard library. It makes
>> sense to register them as early in interpreter startup as possible, but
>> if they're not registered, nothing breaks (things will just be slower).
>> Experiments with future syntax (like when async/await was being
>> developed) have the same needs.
>> Syntax extensions, such as MacroPy or Hy, tend to target specific
>> modules, with which they're closely coupled: The modules won't run
>> without the transformer. And with other modules, the transformer either
>> does nothing (as with MacroPy, hopefully), or would fail altogether (as
>> with Hy). So, they would benefit from specific packages opting in. The
>> effects of enabling them globally range from inefficiency (MacroPy) to
>> failures or needing workarounds (Hy).
> To be clear, Hylang will not benefit from my PEP. That's why it is not
> mentioned in the PEP.
> "Syntax extensions" only look like a special case of optimizers. I'm not
> sure that it's worth making them really different.

There is an important difference: optimizers should be installed
globally. But modules that don't opt in to a specific syntax extension
should not get compiled with it.

>> The PEP is designed for optimizers. It would be good to stick to that use
>> case, at least as far as the registration is concerned. I suggest noting
>> in the documentation that Python semantics *must* be preserved, and
>> renaming the API, e.g.::

My API examples seem to have led the conversation astray.
The point I wanted to make is that "syntax extensions" need a
registration API that only enables them for specific modules.

I admit the particular examples weren't very well thought out. I'm not
proposing adding *any* of them to the PEP: I'd be happy if the PEP stuck
to the "optimizers" use case and did that well.
The "extensions" case is worth another PEP, which can reuse the
transformers API (probably integrating it with importlib), but not the
registration API.

> I would prefer to do that differently: always register transformers
> very early, but configure each transformer to only apply it on some
> files. The transformer can use the filename (file extension?
> importlib is currently restricted to .py files by default no?), it
> can use a special variable in the file (ex: fatoptimizer searches
> for a __fatoptimizer__ variable which is used to configure the
> optimizer), a configuration loaded when the transformer is
> created, etc.

Why very early? If a syntax extension is used in some package, it should
only be activated right before that package is imported. And ideally it
shouldn't get a chance to be activated on other packages.

importlib is not restricted to .py (it can do .zip, .pyc, .so, etc. out
of the box). Actually, with import hooks, the *source* file that uses
the DSL can use a different extension (as opposed to the *.pyc getting a
different tag, as for optimizers).
For example, a library using a SQL DSL could look like::    (imports a package to set up the transformer)

That is probably what you want for syntax extensions. You can't really
look at special variables in the file, because the transformer needs to
be enabled before the code is compiled -- especially if text/tokenstream
transforms are added, so the file might not be valid "vanilla Python".

What's left is making it easy to register an import hook with a specific
PEP 511 transformer -- but again, that can be a different PEP.
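
As a rough sketch of the opt-in idea using only today's importlib: a loader subclass can run an AST transformer on just the files it is explicitly asked to load, leaving every other import alone. The `DoubleInts` transformer, the `DslLoader` class, the `load_dsl_module` helper and the `.dsl` extension are all made up for illustration; nothing here is part of PEP 511.

```python
import ast
import importlib.machinery
import importlib.util
import os
import tempfile


class DoubleInts(ast.NodeTransformer):
    """Toy stand-in for a syntax extension: doubles every int literal."""

    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(value=node.value * 2), node)
        return node


class DslLoader(importlib.machinery.SourceFileLoader):
    """Loader that applies the transformer only to files it loads itself."""

    def source_to_code(self, data, path, *, _optimize=-1):
        tree = DoubleInts().visit(ast.parse(data, filename=path))
        ast.fix_missing_locations(tree)
        return compile(tree, path, "exec", dont_inherit=True, optimize=_optimize)


def load_dsl_module(name, path):
    """Load one opted-in file through DslLoader; other imports are untouched."""
    spec = importlib.util.spec_from_file_location(
        name, path, loader=DslLoader(name, path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dsl")   # hypothetical source extension
    with open(path, "w") as f:
        f.write("x = 21\n")
    demo = load_dsl_module("demo_dsl", path)   # demo.x was transformed
```

Because the transformer lives in the loader, it is enabled before the source is compiled and never sees modules that did not opt in.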

From at  Mon Jan 18 11:45:37 2016
From: at (Yury Selivanov)
Date: Mon, 18 Jan 2016 11:45:37 -0500
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On 2016-01-17 7:36 AM, Nick Coghlan wrote:
> On 17 January 2016 at 21:48, Victor Stinner<victor.stinner at>  wrote:
>> >2016-01-16 12:06 GMT+01:00 Petr Viktorin<encukou at>:
>>> >>The PEP is designed for optimizers. It would be good to stick to that use
>>> >>case, at least as far as the registration is concerned. I suggest noting
>>> >>in the documentation that Python semantics*must*  be preserved, and
>>> >>renaming the API, e.g.::
>>> >>
>>> >>     sys.set_global_optimizers([])
>> >
>> >I would prefer to not restrict the PEP to a specific usage.
> The problem I see with making the documentation and naming too generic
> is that people won't know what the feature is useful for - a generic
> term like "transformer" accurately describes these units of code, but
> provides no hint as to why a developer might care about their
> existence.
> However, if the reason we're adding the capability is to make global
> static optimizers feasible, then we can describe it accordingly (so
> the answer to "Why does this feature exist?" becomes relatively self
> evident), and have the fact that the feature can actually be used for
> arbitrary transforms be an added bonus rather than the core intent.

+1.



From brett at  Mon Jan 18 11:52:42 2016
From: brett at (Brett Cannon)
Date: Mon, 18 Jan 2016 16:52:42 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, 16 Jan 2016 at 19:38 Nick Coghlan <ncoghlan at> wrote:

> On 17 January 2016 at 04:28, Brett Cannon <brett at> wrote:
> >
> > On Fri, 15 Jan 2016 at 09:40 Victor Stinner <victor.stinner at>
> > wrote:
> >>
> >> 2016-01-15 18:22 GMT+01:00 Brett Cannon <brett at>:
> >> > I just wanted to point out to people that the key part of this PEP is
> >> > the
> >> > change in semantics of `-O` accepting an argument.
> >>
> >> To be exact, it's a new "-o arg" option, it's different from -O and
> >> -OO (uppercase). Since I don't know what to do with -O and -OO, I
> >> simply kept them :-D
> >>
> >> > I should also point out that this does get tricky in terms of how to
> >> > handle
> >> > the stdlib if you have not pre-compiled it, e.g., if the first module
> >> > imported by Python is the encodings module then how to make sure the
> >> > optimizers are ready to go by the time that import happens?
> >>
> >> Since importlib reads sys.implementation.optim_tag at each import, it
> >> works fine.
> >>
> >> For example, you start with "opt" optimizer tag. You import everything
> >> needed for fatoptimizer. Then calling sys.set_code_transformers() will
> >> set a new optimizer flag (ex: "fat-opt"). But it works since the
> >> required code transformers are now available.
> >
> >
> > I understand all of that; my point is what if you don't compile the
> stdlib
> > for your optimization? You have to import over 20 modules before user
> code
> > gets imported. My question is how do you expect the situation to be
> handled
> > where you didn't optimize the stdlib since the 'encodings' module is
> > imported before anything else? If you set your `-o` flag and you want to
> > fail imports if the .pyc isn't there, then wouldn't that mean you are
> going
> > to fail immediately when you try and import 'encodings' in
> Py_Initialize()?
> I don't think that's a major problem - it seems to me that it's the
> same as going for "pyc only" deployment with an embedded Python
> interpreter, and then forgetting to include a precompiled standard library in
> addition to your own components. Yes, it's going to fail, but the bug
> is in the build process for your deployment artifacts rather than in
> the runtime behaviour of CPython.

It is the same, and that's my point. If we are going to enforce this import
requirement of having a matching .pyc file in order to do a proper import,
then we are already requiring an offline compilation which makes the
dynamic registering of optimizers a lot less necessary.

Now if we tweak the proposed semantics of `-o` to say "import this kind
of optimized .pyc file *if you can*, otherwise don't worry about it", then
having registered optimizers makes more sense as that gets around the
bootstrap problem with the stdlib. This would require optimizations to be
module-level and not application-level, though. This also makes the
difference between `-o` and `-O` even more pronounced as the latter is not
only required, but then restricted to only optimizations which affect what
syntax is executed instead of what AST transformations were applied. This
also means that the file name of the .pyc files should keep `opt-1`, etc.
and the AST transformation names get appended on as it would stack `-O` and

It really depends on what kinds of optimizations we expect people to do. If
we expect application-level optimizations then we need to enforce universal
importing of bytecode because it may make assumptions about other modules.
But if we limit it to module-level optimizations then it isn't quite so
critical that the .pyc files pre-exist, making it such that `-o` can be
more of a request than a requirement for importing modules of a certain
optimization. That also means if the same AST optimizers are not installed
it's no big deal since you just work with what you have (although you could
set it to raise an ImportWarning when an import didn't find a .pyc file of
the requested optimization *and* the needed AST optimizers weren't
available either).
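
A minimal sketch of the "request, not requirement" behaviour, assuming a hypothetical helper `pick_bytecode` and a made-up optimization tag `fat`. The only real API used is `importlib.util.cache_from_source`, whose `optimization` argument already controls the `opt-` part of the .pyc name.

```python
import importlib.util
import os
import warnings


def pick_bytecode(source_path, optimization):
    """Prefer the requested optimized .pyc; otherwise fall back and warn."""
    wanted = importlib.util.cache_from_source(source_path, optimization=optimization)
    if os.path.exists(wanted):
        return wanted
    warnings.warn(
        f"no {optimization!r}-optimized bytecode for {source_path}; "
        "falling back to the default .pyc",
        ImportWarning,
    )
    # optimization="" selects the plain, unoptimized cache file name.
    return importlib.util.cache_from_source(source_path, optimization="")


# File name that an optimized build of pkg/mod.py would use.
wanted_name = importlib.util.cache_from_source("pkg/mod.py", optimization="fat")
# pkg/mod.py has no such .pyc here, so this falls back with an ImportWarning.
fallback = pick_bytecode("pkg/mod.py", "fat")
```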

From random832 at  Mon Jan 18 11:54:32 2016
From: random832 at (Random832)
Date: Mon, 18 Jan 2016 11:54:32 -0500
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

> >> >2016-01-16 12:06 GMT+01:00 Petr Viktorin<encukou at>:
> >>> >>The PEP is designed for optimizers. It would be good to stick to that use
> >>> >>case, at least as far as the registration is concerned. I suggest noting
> >>> >>in the documentation that Python semantics*must*  be preserved, and
> >>> >>renaming the API, e.g.::
> >>> >>
> >>> >>     sys.set_global_optimizers([])

> > On 17 January 2016 at 21:48, Victor Stinner<victor.stinner at>  wrote:
> >> >I would prefer to not restrict the PEP to a specific usage.

> On 2016-01-17 7:36 AM, Nick Coghlan wrote:
> > The problem I see with making the documentation and naming too generic
> > is that people won't know what the feature is useful for - a generic
> > term like "transformer" accurately describes these units of code, but
> > provides no hint as to why a developer might care about their
> > existence.
> >
> > However, if the reason we're adding the capability is to make global
> > static optimizers feasible, then we can describe it accordingly (so
> > the answer to "Why does this feature exist?" becomes relatively self
> > evident), and have the fact that the feature can actually be used for
> > arbitrary transforms be an added bonus rather than the core intent.

On Mon, Jan 18, 2016, at 11:45, Yury Selivanov wrote:
> +1.

I think that it depends on how it's implemented. Having a _requirement_
that semantics _must_ be preserved suggests that they may not always be
applied, or may not be applied in a deterministic order.

From at  Mon Jan 18 12:04:31 2016
From: at (Yury Selivanov)
Date: Mon, 18 Jan 2016 12:04:31 -0500
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

>> On 2016-01-17 7:36 AM, Nick Coghlan wrote:
>>> The problem I see with making the documentation and naming too generic
>>> is that people won't know what the feature is useful for - a generic
>>> term like "transformer" accurately describes these units of code, but
>>> provides no hint as to why a developer might care about their
>>> existence.
>>> However, if the reason we're adding the capability is to make global
>>> static optimizers feasible, then we can describe it accordingly (so
>>> the answer to "Why does this feature exist?" becomes relatively self
>>> evident), and have the fact that the feature can actually be used for
>>> arbitrary transforms be an added bonus rather than the core intent.
> On Mon, Jan 18, 2016, at 11:45, Yury Selivanov wrote:
>> +1.
> I think that it depends on how it's implemented. Having a _requirement_
> that semantics _must_ be preserved suggests that they may not always be
> applied, or may not be applied in a deterministic order.

It just won't be possible to enforce that "requirement".

What Nick suggests (and I suggested in my email earlier in
this thread) is that we should name the APIs clearly to avoid
any confusion.

`sys.set_code_transformers` is less clear about what it should
be used for than `sys.set_code_optimizers`.


From random832 at  Mon Jan 18 12:26:30 2016
From: random832 at (Random832)
Date: Mon, 18 Jan 2016 12:26:30 -0500
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 18, 2016, at 12:04, Yury Selivanov wrote:
> > I think that it depends on how it's implemented. Having a _requirement_
> > that semantics _must_ be preserved suggests that they may not always be
> > applied, or may not be applied in a deterministic order.
> It just won't be possible to enforce that "requirement".

I'm not talking about mechanically enforcing it. I'm talking about it
being a documented requirement to write such code, and that people
*should not* use this feature for things that need to be applied 100% of
the time for their applications to work. Either we have to nail down
exactly when and how these things are invoked so that people can rely on
them, or they are only _useful_ for optimizations (and other
semantic-preserving things like instrumentation) rather than arbitrary

From haael at  Tue Jan 19 09:10:35 2016
From: haael at (haael at
Date: Tue, 19 Jan 2016 15:10:35 +0100
Subject: [Python-ideas] Explicit variable capture list
Message-ID: <ypnydezpdjtvtxsvsohu@vlmj>


C++ has a nice feature of explicit variable capture list for lambdas:

    int a = 1, b = 2, c = 3;
    auto fun = [a, b, c](int x, int y){ return a + b + c + x + y; };

This allows easy construction of closures. In Python to achieve that, you need to say:

    def make_closure(a, b, c):
        def fun(x, y):
            return a + b + c + x + y
        return fun
    a = 1
    b = 2
    c = 3
    fun = make_closure(a, b, c)

My proposal: create a special variable qualifier (like global and nonlocal) to automatically capture variables

    a = 1
    b = 2
    c = 3
    def fun(x, y):
        capture a, b, c
        return a + b + c + x + y

This will have an effect that symbols a, b and c in the body of the function have values as they had at the moment of function creation. The variables a, b, c must be defined at the time of function creation. If they are not, an error is thrown. The 'capture' qualifier may be combined with keywords global and nonlocal to change lookup behaviour.

To make it more useful, we also need some syntax for inline lambdas. I.e.:

    a = 1
    b = 2
    c = 3
    fun = lambda[a, b, c] x, y: a + b + c + x + y


From jeanpierreda at  Tue Jan 19 09:39:17 2016
From: jeanpierreda at (Devin Jeanpierre)
Date: Tue, 19 Jan 2016 06:39:17 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <ypnydezpdjtvtxsvsohu@vlmj>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Tue, Jan 19, 2016 at 6:10 AM, <haael at> wrote:
> Hi
> C++ has a nice feature of explicit variable capture list for lambdas:
>     int a = 1, b = 2, c = 3;
>     auto fun = [a, b, c](int x, int y){ return a + b + c + x + y; };
> This allows easy construction of closures. In Python to achieve that, you need to say:

This is worded very confusingly. Python has easy construction of
closures with implicit variable capture.

The difference has to do with "value semantics" in C++, which Python
doesn't have. If you were using int* variables in your C++ example,
you'd have the same semantics as Python does with its int references.

>     def make_closure(a, b, c):
>         def fun(x, y):
>             return a + b + c + x + y
>         return fun
>     a = 1
>     b = 2
>     c = 3
>     fun = make_closure(a, b, c)

The usual workaround is actually:

  a = 1
  b = 1
  c = 1
  def fun(x, y, a=a, b=b, c=c):
      return a + b + c + x + y

-- Devin

From jeanpierreda at  Tue Jan 19 09:45:59 2016
From: jeanpierreda at (Devin Jeanpierre)
Date: Tue, 19 Jan 2016 06:45:59 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Sorry, forget the first part entirely, I was still confused when I
wrote it. Definitely the semantics of values are very different, but
they don't matter for this.

I think the rough equivalent of the capture-by-copy C++ lambda is the
function definition I provided with default values.
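
A small sketch of the difference, with made-up function names `late` and `early`: an ordinary free variable is looked up when the function is called, while a default value is evaluated once, when the `def` statement runs.

```python
a = 1


def late(x):
    return a + x          # looks up `a` each time the function is called


def early(x, a=a):
    return a + x          # default was evaluated once, when `def` ran


a = 100                   # rebind the name after both definitions
late_result = late(0)     # sees the new binding
early_result = early(0)   # still uses the value captured at definition time
```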

-- Devin

On Tue, Jan 19, 2016 at 6:39 AM, Devin Jeanpierre
<jeanpierreda at> wrote:
> On Tue, Jan 19, 2016 at 6:10 AM, <haael at> wrote:
>> Hi
>> C++ has a nice feature of explicit variable capture list for lambdas:
>>     int a = 1, b = 2, c = 3;
>>     auto fun = [a, b, c](int x, int y){ return a + b + c + x + y; };
>> This allows easy construction of closures. In Python to achieve that, you need to say:
> This is worded very confusingly. Python has easy construction of
> closures with implicit variable capture.
> The difference has to do with "value semantics" in C++, which Python
> doesn't have. If you were using int* variables in your C++ example,
> you'd have the same semantics as Python does with its int references.
>>     def make_closure(a, b, c):
>>         def fun(x, y):
>>             return a + b + c + x + y
>>         return fun
>>     a = 1
>>     b = 2
>>     c = 3
>>     fun = make_closure(a, b, c)
> The usual workaround is actually:
>   a = 1
>   b = 1
>   c = 1
>   def fun(x, y, a=a, b=b, c=c):
>       return a + b + c + x + y
> -- Devin

From tjreedy at  Tue Jan 19 10:23:36 2016
From: tjreedy at (Terry Reedy)
Date: Tue, 19 Jan 2016 10:23:36 -0500
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <ypnydezpdjtvtxsvsohu@vlmj>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <n7lke6$j6o$>

On 1/19/2016 9:10 AM, haael at wrote:
> Hi
> C++ has a nice feature of explicit variable capture list for lambdas:
>      int a = 1, b = 2, c = 3;
>      auto fun = [a, b, c](int x, int y){ return a + b + c + x + y; };
> This allows easy construction of closures. In Python to achieve that, you need to say:
>      def make_closure(a, b, c):
>          def fun(x, y):
>              return a + b + c + x + y
>          return fun
>      a = 1
>      b = 2
>      c = 3
>      fun = make_closure(a, b, c)

The purpose of writing a make_closure function is so it can be called 
more than once, to make more than one closure.

f123 = make_closure(1, 2, 3)
f456 = make_closure(4, 5, 6)

> My proposal: create a special variable qualifier (like global and nonlocal) to automatically capture variables
>      a = 1
>      b = 2
>      c = 3
>      def fun(x, y):
>          capture a, b, c
>          return a + b + c + x + y
> This will have an effect that symbols a, b and c in the body of the function have values as they had at the moment of function creation. The variables a, b, c must be defined at the time of function creation. If they are not, an error is thrown. The 'capture' qualifier may be combined with keywords global and nonlocal to change lookup behaviour.

This only allows one version of fun, not multiple, so it is not 
equivalent at all.  As Devin stated, it is equivalent to using 
parameters with default argument values.

Terry Jan Reedy

From abarnert at  Tue Jan 19 11:22:51 2016
From: abarnert at (Andrew Barnert)
Date: Tue, 19 Jan 2016 08:22:51 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <ypnydezpdjtvtxsvsohu@vlmj>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Jan 19, 2016, at 06:10, haael at wrote:
> Hi
> C++ has a nice feature of explicit variable capture list for lambdas:
>    int a = 1, b = 2, c = 3;
>    auto fun = [a, b, c](int x, int y){ return a + b + c + x + y; };
> This allows easy construction of closures. In Python to achieve that, you need to say:
>    def make_closure(a, b, c):
>        def fun(x, y):
>            return a + b + c + x + y
>        return fun
>    a = 1
>    b = 2
>    c = 3
>    fun = make_closure(a, b, c)
> My proposal: create a special variable qualifier (like global and nonlocal) to automatically capture variables
>    a = 1
>    b = 2
>    c = 3
>    def fun(x, y):
>        capture a, b, c
>        return a + b + c + x + y
> This will have an effect that symbols a, b and c in the body of the function have values as they had at the moment of function creation. The variables a, b, c must be defined at the time of function creation. If they are not, an error is thrown. The 'capture' qualifier may be combined with keywords global and nonlocal to change lookup behaviour.

What you're suggesting is the exact opposite of what you say you're suggesting. Capturing a, b, and c in a closure is what Python already does. What you're trying to do is _not_ capture them and _not_ create a closure. So calling the statement "capture" is very misleading, and saying it "allows easy construction of closures" even more so.

In C++ terms, this:

    def fun(x, y): return a + b + c + x + y

means:

    auto fun = [&](int x, int y) { return a + b + c + x + y; };

It obviously doesn't mean this, as you imply:

    auto fun = [](int x, int y) { return a + b + c + x + y; };

... because that just gives you a compile-time error saying that local variables a, b, and c aren't defined, which is not what Python does.

If you're looking for a way to copy references to the values, instead of capturing the variables, you write this:

    def fun(x, y, a=a, b=b, c=c): return a + b + c + x + y

And if you want to actually copy the values themselves, you have to do that explicitly (which has no visible effect for ints, of course, but think about lists or dicts here):

    def fun(x, y, a=copy.copy(a), b=copy.copy(b), c=copy.copy(c)): return a + b + c + x + y

... because Python, unlike C++, never automatically copies values. (Again, think about lists or dicts. If passing them to a function or storing them in a variable made an automatic copy, as in C++, you'd be wasting lots of time and space copying them all over the place. That's why you have to explicitly create vector<int>& variables, or shared_ptr<vector<int>>, or pass around iterators instead of the container itself--because you almost never actually want to waste time and space making a copy if you're not mutating, and you almost always want the changes to be effective if you are mutating.)
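
To see the reference-versus-copy distinction with a mutable value, here is a short sketch (the names `shared` and `copied` are just for illustration):

```python
import copy

items = [1, 2]


def shared(x, items=items):
    # The default holds a reference to the same list object.
    return sum(items) + x


def copied(x, items=copy.copy(items)):
    # The default holds a separate shallow copy made at definition time.
    return sum(items) + x


items.append(10)          # mutate the original list in place
shared_result = shared(0) # sees the appended element
copied_result = copied(0) # does not
```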

From guido at  Tue Jan 19 11:47:28 2016
From: guido at (Guido van Rossum)
Date: Tue, 19 Jan 2016 08:47:28 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

I think it's reasonable to divert this discussion to "value capture". Not
sure if that's the usual terminology, but the idea should be that a
reference to the value is captured, rather than (as Python normally does
with closures) a reference to the variable (implemented as something called
a "cell").

(However let's please not consider whether the value should be copied or
deep-copied. Just capture the object reference at the point the capture is

The best syntax for such capture remains to be seen. ("Capture" seems to
universally make people think of "variable capture" which is the opposite
of what we want here.)
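
For anyone who hasn't looked at cells before, a minimal sketch of what "a reference to the variable" means in CPython (the function names are made up; `__closure__` and `cell_contents` are the real attributes):

```python
def make():
    n = 1

    def show():
        return n              # free variable, reached through a cell

    cell = show.__closure__[0]
    before = cell.cell_contents
    n = 2                     # rebinding the variable updates the shared cell
    after = cell.cell_contents
    return before, after, show()


cell_demo = make()
```

The closure and the enclosing function share one cell, so the inner function always sees the latest binding. Value capture would instead freeze the object reference at the point the capture is executed.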

On Tue, Jan 19, 2016 at 8:22 AM, Andrew Barnert via Python-ideas <
python-ideas at> wrote:

> On Jan 19, 2016, at 06:10, haael at wrote:
> >
> >
> > Hi
> >
> > C++ has a nice feature of explicit variable capture list for lambdas:
> >
> >    int a = 1, b = 2, c = 3;
> >    auto fun = [a, b, c](int x, int y){ return a + b + c + x + y; };
> >
> > This allows easy construction of closures. In Python to achieve that,
> you need to say:
> >
> >    def make_closure(a, b, c):
> >        def fun(x, y):
> >            return a + b + c + x + y
> >        return fun
> >    a = 1
> >    b = 2
> >    c = 3
> >    fun = make_closure(a, b, c)
> >
> > My proposal: create a special variable qualifier (like global and
> nonlocal) to automatically capture variables
> >
> >    a = 1
> >    b = 2
> >    c = 3
> >    def fun(x, y):
> >        capture a, b, c
> >        return a + b + c + x + y
> >
> > This will have an effect that symbols a, b and c in the body of the
> function have values as they had at the moment of function creation. The
> variables a, b, c must be defined at the time of function creation. If they
> are not, an error is thrown. The 'capture' qualifier may be combined with
> keywords global and nonlocal to change lookup behaviour.
> What you're suggesting is the exact opposite of what you say you're
> suggesting. Capturing a, b, and c in a closure is what Python already does.
> What you're trying to do is _not_ capture them and _not_ create a closure.
> So calling the statement "capture" is very misleading, and saying it
> "allows easy construction of closures" even more so.
> In C++ terms, this:
>     def fun(x, y): return a + b + c + x + y
> means:
>     auto fun = [&](int x, int y) { return a + b + c + x + y; };
> It obviously doesn't mean this, as you imply:
>     auto fun = [](int x, int y) { return a + b + c + x + y; };
> ... because that just gives you a compile-time error saying that local
> variables a, b, and c aren't defined, which is not what Python does.
> If you're looking for a way to copy references to the values, instead of
> capturing the variables, you write this:
>     def fun(x, y, a=a, b=b, c=c): return a + b + c + x + y
> And if you want to actually copy the values themselves, you have to do
> that explicitly (which has no visible effect for ints, of course, but think
> about lists or dicts here):
>     def fun(x, y, a=copy.copy(a), b=copy.copy(b), c=copy.copy(c)): return
> a + b + c + x + y
> ... because Python, unlike C++, never automatically copies values. (Again,
> think about lists or dicts. If passing them to a function or storing them
> in a variable made an automatic copy, as in C++, you'd be wasting lots of
> time and space copying them all over the place. That's why you have to
> explicitly create vector<int>& variables, or shared_ptr<vector<int>>, or
> pass around iterators instead of the container itself--because you almost
> never actually want to waste time and space making a copy if you're not
> mutating, and you almost always want the changes to be effective if you are
> mutating.)

--Guido van Rossum (

From storchaka at  Tue Jan 19 15:24:37 2016
From: storchaka at (Serhiy Storchaka)
Date: Tue, 19 Jan 2016 22:24:37 +0200
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <n7m627$5sq$>

On 19.01.16 18:47, Guido van Rossum wrote:
> I think it's reasonable to divert this discussion to "value capture".
> Not sure if that's the usual terminology, but the idea should be that a
> reference to the value is captured, rather than (as Python normally does
> with closures) a reference to the variable (implemented as something
> called a "cell").
> (However let's please not consider whether the value should be copied or
> deep-copied. Just capture the object reference at the point the capture
> is executed.)
> The best syntax for such capture remains to be seen. ("Capture" seems to
> universally make people think of "variable capture" which is the
> opposite of what we want here.)

A number of variants of more powerful syntax were proposed in [1]. In a
neighbouring topic, Scott Sanderson pointed to the asconstants decorator
in codetransformer [2], which patches the code object by substituting
references to the variable with references to the constant. Ryan
Gonzalez provided another implementation of a similar decorator [3].

Maybe this feature doesn't need new syntax, just a new decorator in
the stdlib.


From rosuav at  Tue Jan 19 18:29:48 2016
From: rosuav at (Chris Angelico)
Date: Wed, 20 Jan 2016 10:29:48 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Wed, Jan 20, 2016 at 3:47 AM, Guido van Rossum <guido at> wrote:
> I think it's reasonable to divert this discussion to "value capture". Not
> sure if that's the usual terminology, but the idea should be that a
> reference to the value is captured, rather than (as Python normally does
> with closures) a reference to the variable (implemented as something called
> a "cell").

+1. This would permit deprecation of the "def blah(...., len=len):"
optimization - all you need to do is set a value capture on the name


From guido at  Tue Jan 19 18:43:12 2016
From: guido at (Guido van Rossum)
Date: Tue, 19 Jan 2016 15:43:12 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <n7m627$5sq$>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Tue, Jan 19, 2016 at 12:24 PM, Serhiy Storchaka <storchaka at> wrote:
> A number of variants of more powerful syntax were proposed in [1]. In a
> neighbouring topic, Scott Sanderson pointed to the asconstants decorator in
> codetransformer [2], which patches the code object by substituting
> references to the variable with references to the constant. Ryan Gonzalez
> provided another implementation of a similar decorator [3].
> Maybe this feature doesn't need new syntax, just a new decorator in the
> stdlib.

Hmm... Using a decorator would mean that you'd probably have to add quotes
around the names of the variables whose values you want to capture, and
it'd require hacking the bytecode. That would mean that it'd only work for
CPython, and it'd not be a real part of the language. This feels like it
wants to be a language-level feature, like nonlocal.

> [1]
> [2]
> [3]

--Guido van Rossum (

From steve at  Tue Jan 19 18:51:15 2016
From: steve at (Steven D'Aprano)
Date: Wed, 20 Jan 2016 10:51:15 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <ypnydezpdjtvtxsvsohu@vlmj>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Tue, Jan 19, 2016 at 03:10:35PM +0100, haael at wrote:
> Hi
> C++ has a nice feature of explicit variable capture list for lambdas:
>     int a = 1, b = 2, c = 3;
>     auto fun = [a, b, c](int x, int y){ return a + b + c + x + y; };

For the benefit of those who don't speak C++, could you explain what 
that does? Are C++ name binding semantics the same as Python's?

Specifically, inside fun, does "a" refer to the global a? If you rebind
global a, what happens to fun?

fun(0, 0)  # returns 6
a = 0
fun(0, 0)  # returns 5 or 6?

> This allows easy construction of closures. In Python to achieve that, you need to say:
>     def make_closure(a, b, c):
>         def fun(x, y):
>             return a + b + c + x + y
>         return fun
>     a = 1
>     b = 2
>     c = 3
>     fun = make_closure(a, b, c)

I cannot tell whether the C++ semantics above are the same as the Python 
semantics here. Andrew's response to you suggests that it is not.

> My proposal: create a special variable qualifier (like global and 
> nonlocal) to automatically capture variables

"Variables" is an ambiguous term. I don't want to get into a debate 
about "Python doesn't have variables", but it's not clear what you mean 
here. Python certainly has names, and values, and when you talk about 
"variables" do you mean the name or the value or both?

>     a = 1
>     b = 2
>     c = 3
>     def fun(x, y):
>         capture a, b, c
>         return a + b + c + x + y
> This will have an effect that symbols a, b and c in the body of the 
> function have values as they had at the moment of function creation. 
> The variables a, b, c must be defined at the time of function 
> creation. If they are not, an error is thrown. 

If I have understood you correctly, we can already do that in 
Python, and don't even need a closure:

a, b, c = 1, 2, 3
fun = lambda x, y, a=a, b=b, c=c: a + b + c + x + y

will capture the current *value* of GLOBAL a, b and c, store them as 
default values, and use them as the LOCAL a, b and c.
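
As a quick check of that claim, here is a minimal sketch (the name "late" is
made up for the comparison) showing that the default-argument version keeps
the values seen when the lambda was created, while a plain lambda looks the
globals up again on every call:

```python
a, b, c = 1, 2, 3

# Defaults are evaluated once, when the lambda expression itself is evaluated.
fun = lambda x, y, a=a, b=b, c=c: a + b + c + x + y

# A plain lambda looks up the global names each time it is called.
late = lambda x, y: a + b + c + x + y

before = (fun(0, 0), late(0, 0))
a = 0  # rebind the global
after = (fun(0, 0), late(0, 0))

print(before)  # (6, 6)
print(after)   # (6, 5)
```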

You may consider it a strength or a weakness that they are exposed as 
regular function parameters:

fun(x, y)  # intended call signature
fun(x, y, a, b, c)  # actual call signature

but if you really care about hiding the extra parameters, a second 
approach will work:

from functools import partial
a, b, c = 1, 2, 3
fun = partial(lambda a, b, c, x, y: a + b + c + x + y, a, b, c)

If a, b, c are mutable objects, you can make a copy of the value:

import copy
fun = partial(lambda a, b, c, x, y: a + b + c + x + y, 
              a, b, copy.copy(c))

for example.
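
To illustrate why the copy can matter, here is a small sketch under the
assumption that c is a list (the helper name "total" is hypothetical). Without
the copy, partial() keeps a reference to the same list object, so later
in-place mutation is visible inside the function:

```python
import copy
from functools import partial

a, b, c = 1, 2, [3]

def total(a, b, c, x, y):
    # c is assumed to be a list of numbers here
    return a + b + sum(c) + x + y

shared = partial(total, a, b, c)             # holds a reference to the same list
frozen = partial(total, a, b, copy.copy(c))  # holds a private shallow copy

c.append(10)  # mutate the original list in place

print(shared(0, 0))  # 16
print(frozen(0, 0))  # 6
```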

Does your proposal behave any differently from these examples?


From steve at  Tue Jan 19 19:14:25 2016
From: steve at (Steven D'Aprano)
Date: Wed, 20 Jan 2016 11:14:25 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Wed, Jan 20, 2016 at 10:29:48AM +1100, Chris Angelico wrote:
> On Wed, Jan 20, 2016 at 3:47 AM, Guido van Rossum <guido at> wrote:
> > I think it's reasonable to divert this discussion to "value capture". Not
> > sure if that's the usual terminology, but the idea should be that a
> > reference to the value is captured, rather than (as Python normally does
> > with closures) a reference to the variable (implemented as something called
> > a "cell").
> +1. This would permit deprecation of the "def blah(...., len=len):"
> optimization - all you need to do is set a value capture on the name
> "len".

Some might argue that the default argument trick is already the One 
Obvious Way to capture a value in a function.

I don't think deprecation is the right word here, you can't deprecate 
"len=len" style code because it's just a special case of the more 
general name=expr function default argument syntax. I suppose a linter 
might complain if the expression on the right hand side is precisely the 
same as the name on the left, but _len=len would trivially work around 
that.


From rymg19 at  Tue Jan 19 19:33:55 2016
From: rymg19 at (Ryan Gonzalez)
Date: Tue, 19 Jan 2016 18:33:55 -0600
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On January 19, 2016 5:51:15 PM CST, Steven D'Aprano <steve at> wrote:
>On Tue, Jan 19, 2016 at 03:10:35PM +0100, haael at wrote:
>> Hi
>> C++ has a nice feature of explicit variable capture list for lambdas:
>>     int a = 1, b = 2, c = 3;
>>     auto fun = [a, b, c](int x, int y){ return a + b + c + x + y; };
>For the benefit of those who don't speak C++, could you explain what 
>that does? Are C++ name binding semantics the same as Python's?


>Specifically, inside fun, does "a" refer to the global a? If you rebind
>global a, what happens to fun?
>fun(0, 0)  # returns 6
>a = 0
>fun(0, 0)  # returns 5 or 6?

The given C++ lambda syntax copies the captured variables, so it would return 6.

This would return 5:

auto fun = [&a, &b, &c](int x, int y){ return a + b + c + x + y; };
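
For readers more at home in Python, the closest analogue of the two C++
capture modes is, roughly, a default argument (by value) versus an ordinary
closure (by reference to the variable). This is only an analogy, and the
function name below is made up for illustration:

```python
def make_both():
    a = 1
    by_value = lambda a=a: a   # like [a]: value fixed when the lambda is created
    by_reference = lambda: a   # like [&a]: reads the enclosing variable at call time
    a = 0                      # rebind after both lambdas exist
    return by_value(), by_reference()

print(make_both())  # (1, 0)
```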

>> This allows easy construction of closures. In Python to achieve that,
>you need to say:
>>     def make_closure(a, b, c):
>>         def fun(x, y):
>>             return a + b + c + x + y
>>         return fun
>>     a = 1
>>     b = 2
>>     c = 3
>>     fun = make_closure(a, b, c)
>I cannot tell whether the C++ semantics above are the same as the Python
>semantics here. Andrew's response to you suggests that it is not.
>> My proposal: create a special variable qualifier (like global and 
>> nonlocal) to automatically capture variables
>"Variables" is an ambiguous term. I don't want to get into a debate 
>about "Python doesn't have variables", but it's not clear what you mean
>here. Python certainly has names, and values, and when you talk about 
>"variables" do you mean the name or the value or both?
>>     a = 1
>>     b = 2
>>     c = 3
>>     def fun(x, y):
>>         capture a, b, c
>>         return a + b + c + x + y
>> This will have an effect that symbols a, b and c in the body of the 
>> function have values as they had at the moment of function creation. 
>> The variables a, b, c must be defined at the time of function 
>> creation. If they are not, an error is thrown. 
>If I have understood you correctly, we can already do that in 
>Python, and don't even need a closure:
>a, b, c = 1, 2, 3
>fun = lambda x, y, a=a, b=b, c=c: a + b + c + x + y
>will capture the current *value* of GLOBAL a, b and c, store them as 
>default values, and use them as the LOCAL a, b and c.
>You may consider it a strength or a weakness that they are exposed as 
>regular function parameters:
>fun(x, y)  # intended call signature
>fun(x, y, a, b, c)  # actual call signature
>but if you really care about hiding the extra parameters, a second 
>approach will work:
>from functools import partial
>a, b, c = 1, 2, 3
>fun = partial(lambda a, b, c, x, y: a + b + c + x + y, a, b, c)
>If a, b, c are mutable objects, you can make a copy of the value:
>fun = partial(lambda a, b, c, x, y: a + b + c + x + y, 
>              a, b, copy.copy(c)
>              )
>for example.
>Does your proposal behave any differently from these examples?

Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From steve at  Tue Jan 19 19:37:12 2016
From: steve at (Steven D'Aprano)
Date: Wed, 20 Jan 2016 11:37:12 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Tue, Jan 19, 2016 at 08:47:28AM -0800, Guido van Rossum wrote:

> I think it's reasonable to divert this discussion to "value capture". Not
> sure if that's the usual terminology, but the idea should be that a
> reference to the value is captured, rather than (as Python normally does
> with closures) a reference to the variable (implemented as something called
> a "cell").

If I understand you correctly, that's precisely what a function default 
argument does: capture the current value of the default value expression 
at the time the function is called.

This has the side-effect of exposing that as an argument, which may be 
undesirable. partial() can be used to work around that.

> (However let's please not consider whether the value should be copied or
> deep-copied. Just capture the object reference at the point the capture is
> executed.)
> The best syntax for such capture remains to be seen. ("Capture" seems to
> universally make people think of "variable capture" which is the opposite
> of what we want here.)

If I recall correctly, there was a recent(?) proposal for a "static" 
keyword with similar semantics:

def func(a):
   static b = expression

would guarantee that expression was evaluated exactly once. If that 
evaluation occurred when func was defined, rather than when it was first 
called, that might be the semantics you are looking for:

def func(a):
    static b = b  # captures the value of b from the enclosing scope

Scoping rules might be tricky to get right. Perhaps rather than a 
declaration, "static" might be better treated as a block:

def func(a):
    static:
        # Function initialisation section. Occurs once, when the 
        # def statement runs.
        b = b  # b on the left is local, b on the right is non-local
               # (just like in a parameter list)
    # Normal function body goes here.

But neither of these approaches would be good for lambdas. I'm okay with 
that -- lambda is a lightweight syntax, for lightweight needs. If your 
needs are great (doc strings, annotations, multiple statements) don't 
use lambda.


From rosuav at  Tue Jan 19 19:38:45 2016
From: rosuav at (Chris Angelico)
Date: Wed, 20 Jan 2016 11:38:45 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Wed, Jan 20, 2016 at 11:14 AM, Steven D'Aprano <steve at> wrote:
> On Wed, Jan 20, 2016 at 10:29:48AM +1100, Chris Angelico wrote:
>> On Wed, Jan 20, 2016 at 3:47 AM, Guido van Rossum <guido at> wrote:
>> > I think it's reasonable to divert this discussion to "value capture". Not
>> > sure if that's the usual terminology, but the idea should be that a
>> > reference to the value is captured, rather than (as Python normally does
>> > with closures) a reference to the variable (implemented as something called
>> > a "cell").
>> +1. This would permit deprecation of the "def blah(...., len=len):"
>> optimization - all you need to do is set a value capture on the name
>> "len".
> Some might argue that the default argument trick is already the One
> Obvious Way to capture a value in a function.

I disagree. There is nothing obvious about this, outside of the fact
that it's already used in so many places. It's not even obvious after
looking at the code.

> I don't think deprecation is the right word here, you can't deprecate
> "len=len" style code because it's just a special case of the more
> general name=expr function default argument syntax. I suppose a linter
> might complain if the expression on the right hand side is precisely the
> same as the name on the left, but _len=len would trivially work around
> that.

The deprecation isn't of named arguments with defaults, but of the use
of that for no reason other than optimization. IMO function arguments
should always exist primarily so a caller can override them. In
contrast, random.randrange has a parameter _int which is not mentioned
in the docs, and which should never be provided. Why should it even be
exposed? It exists solely as an optimization.
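
To make the optimization concrete, here is a small sketch (the function names
are invented for illustration) of what the trick changes: the name moves from
the code object's global/builtin name table into its local variables, and as a
side effect it becomes a parameter that any caller can override.

```python
def count_global(items):
    return len(items)            # 'len' is looked up in globals, then builtins

def count_local(items, len=len):
    return len(items)            # 'len' is a local, bound at definition time

print('len' in count_global.__code__.co_names)     # True
print('len' in count_local.__code__.co_varnames)   # True

# The cost discussed above: the "private" name is part of the signature.
print(count_local([1, 2, 3], len=lambda _: 0))     # 0
```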

Big one for the bike-shedding: Is this "capture as local" (the same
semantics as the default arg - if you rebind it, it changes for the
current invocation only), or "capture as static" (the same semantics
as a closure if you use the 'nonlocal' directive - if you rebind it,
it stays changed), or "capture as constant" (what people are usually
going to be doing anyway)?
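
The first two options can already be spelled in today's Python, which may help
when comparing them. A minimal sketch, with hypothetical helper names; "capture
as constant" has no direct equivalent, so it is not shown:

```python
def make_local(value):
    def f(value=value):          # "capture as local": default argument
        value += 1               # rebinding affects this call only
        return value
    return f

def make_static(value):
    def f():
        nonlocal value           # "capture as static": closure cell
        value += 1               # rebinding persists across calls
        return value
    return f

local_f = make_local(10)
static_f = make_static(10)

print(local_f(), local_f())      # 11 11
print(static_f(), static_f())    # 11 12
```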


From guido at  Tue Jan 19 20:01:42 2016
From: guido at (Guido van Rossum)
Date: Tue, 19 Jan 2016 17:01:42 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Tue, Jan 19, 2016 at 4:37 PM, Steven D'Aprano <steve at> wrote:

> On Tue, Jan 19, 2016 at 08:47:28AM -0800, Guido van Rossum wrote:
> > I think it's reasonable to divert this discussion to "value capture". Not
> > sure if that's the usual terminology, but the idea should be that a
> > reference to the value is captured, rather than (as Python normally does
> > with closures) a reference to the variable (implemented as something called
> > a "cell").
> If I understand you correctly, that's precisely what a function default
> argument does: capture the current value of the default value expression
> at the time the function is called.

I think you misspoke here (I don't think you actually believe what you said).

Function defaults capture the current value at the time the function is
defined.
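
A small sketch that makes the timing observable (all names here are made up):
the default expression has a side effect, so one can see that it runs exactly
once, when the def statement executes, and not on any call.

```python
calls = []

def make_default():
    calls.append("evaluated")    # record every evaluation of the default
    return len(calls)

def func(x, captured=make_default()):   # make_default() runs here, at def time
    return x + captured

n_at_definition = len(calls)     # already 1, before func is ever called
results = [func(0), func(0), func(0)]
n_after_calls = len(calls)       # still 1

print(n_at_definition, results, n_after_calls)  # 1 [1, 1, 1] 1
```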

> This has the side-effect of exposing that as an argument, which may be
> undesirable.

Indeed. It's also non-obvious to people who haven't seen it before.

> partial() can be used to work around that.

Hardly. Adding a partial() call usually makes code *less* obvious.

> > The best syntax for such capture remains to be seen. ("Capture" seems to
> > universally make people think of "variable capture" which is the opposite
> > of what we want here.)
> If I recall correctly, there was a recent(?) proposal for a "static"
> keyword with similar semantics:
> def func(a):
>    static b = expression
>    ...
> would guarantee that expression was evaluated exactly once.

Once per what? In the lifetime of the universe? Per CPython process start?
Per call?

J/K, I think I know what you meant -- once per function definition (same as
default values).

> If that
> evaluation occurred when func was defined, rather than when it was first
> called,

(FWIW, "when it was first called" would be a recipe for disaster and
irreproducible results.)

> that might be the semantics you are looking for:
> def func(a):
>     static b = b  # captures the value of b from the enclosing scope

Yeah, I think the OP proposed 'capture b' with these semantics.

> Scoping rules might be tricky to get right. Perhaps rather than a
> declaration, "static" might be better treated as a block:

Why? This does smell like a directive similar to global and nonlocal.

> def func(a):
>     static:
>         # Function initialisation section. Occurs once, when the
>         # def statement runs.
>         b = b  # b on the left is local, b on the right is non-local
>                # (just like in a parameter list)

Hm, this repetition of the name in parameter lists is actually a strike
against it, and the flexibility it adds (of allowing arbitrary expressions
to be captured) doesn't seem to be needed much in reality -- the examples
for the argument default pattern invariably use 'foo=foo, bar=bar'.

>     # Normal function body goes here.
> But neither of these approaches would be good for lambdas. I'm okay with
> that -- lambda is a lightweight syntax, for lightweight needs. If your
> needs are great (doc strings, annotations, multiple statements) don't
> use lambda.

Yeah, the connection with lambdas in C++ is unfortunate. In C++, IIRC, the
term lambda is used to refer to any function nested inside another, and
that's the only place where closures exist.

--Guido van Rossum (

From random832 at  Tue Jan 19 23:44:00 2016
From: random832 at (Random832)
Date: Tue, 19 Jan 2016 23:44:00 -0500
Subject: [Python-ideas] Explicit variable capture list
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Steven D'Aprano <steve at> writes:
> But neither of these approaches would be good for lambdas. I'm okay with 
> that -- lambda is a lightweight syntax, for lightweight needs. If your 
> needs are great (doc strings, annotations, multiple statements) don't 
> use lambda.

Yeah, but the fact that it's specifically part of C++'s lambda syntax
suggests that it is a very common thing to need with a lambda, doesn't
it? What about... lambda a, = b: [stuff with captured value b] ?

From guido at  Tue Jan 19 23:54:51 2016
From: guido at (Guido van Rossum)
Date: Tue, 19 Jan 2016 20:54:51 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
 <> <>
Message-ID: <>

On Tue, Jan 19, 2016 at 8:44 PM, Random832 <random832 at> wrote:

> Steven D'Aprano <steve at> writes:
> > But neither of these approaches would be good for lambdas. I'm okay with
> > that -- lambda is a lightweight syntax, for lightweight needs. If your
> > needs are great (doc strings, annotations, multiple statements) don't
> > use lambda.
> Yeah, but the fact that it's specifically part of C++'s lambda syntax
> suggests that it is a very common thing to need with a lambda, doesn't
> it?

No, that's because in C++ "lambdas" are the only things with closures.

> What about... lambda a, = b: [stuff with captured value b] ?


--Guido van Rossum (

From ncoghlan at  Wed Jan 20 01:37:48 2016
From: ncoghlan at (Nick Coghlan)
Date: Wed, 20 Jan 2016 16:37:48 +1000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On 20 January 2016 at 10:38, Chris Angelico <rosuav at> wrote:
> Big one for the bike-shedding: Is this "capture as local" (the same
> semantics as the default arg - if you rebind it, it changes for the
> current invocation only), or "capture as static" (the same semantics
> as a closure if you use the 'nonlocal' directive - if you rebind it,
> it stays changed), or "capture as constant" (what people are usually
> going to be doing anyway)?

The "shared value" approach can already be achieved by binding a
mutable object rather than an immutable one, and there's no runtime
speed difference between looking up a local and looking up a constant,
so I think it makes sense to just stick with "default argument
semantics, but without altering the function signature".

One possible name for such a directive would be "sharedlocal": it's in
most respects a local variable, but the given definition time
initialisation value is shared across all invocations to the function.

With that spelling:

    def f(*, len=len):

Would become:

    def f():
        sharedlocal len=len

And you'd also be able to do things like:

    def f():
        sharedlocal cache={}

Alternatively, if we just wanted to support early binding of
pre-existing names, then "bindlocal" could work:

    def f():
        bindlocal len

Either approach could be used to handle early binding of loop
iteration variables:

    for i in range(10):
        def f():
            sharedlocal i=i

    for i in range(10):
        def f():
            bindlocal i
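
For comparison, this is how the loop-variable case behaves in current Python,
using the default-argument spelling that either directive would replace. It is
only a sketch; the function name is hypothetical:

```python
def build():
    late, early = [], []
    for i in range(3):
        late.append(lambda: i)        # closes over the variable i
        early.append(lambda i=i: i)   # default snapshots the current value of i
    return [f() for f in late], [f() for f in early]

print(build())  # ([2, 2, 2], [0, 1, 2])
```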

I'd be -1 on bindlocal (I think dynamic optimisers like PyPy or Numba,
or static ones like Victor's FAT Python project are better answers
there), but "sharedlocal" is more interesting, since it means you can
avoid creating a closure if all you need is to persist a bit of state
between invocations of a function.
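
As a rough illustration of what "sharedlocal cache={}" would save, here are
the two spellings available today (names invented for the example): a mutable
default, which leaks into the signature, and a closure, which needs a factory
function.

```python
def fib_default(n, _cache={}):       # state kept in a mutable default argument
    if n < 2:
        return n
    if n not in _cache:
        _cache[n] = fib_default(n - 1) + fib_default(n - 2)
    return _cache[n]

def make_fib_closure():
    cache = {}                       # same state, hidden in a closure instead
    def fib(n):
        if n < 2:
            return n
        if n not in cache:
            cache[n] = fib(n - 1) + fib(n - 2)
        return cache[n]
    return fib

print(fib_default(30), make_fib_closure()(30))  # 832040 832040
```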


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ipipomme+python at  Wed Jan 20 06:13:44 2016
From: ipipomme+python at (Alexandre Figura)
Date: Wed, 20 Jan 2016 12:13:44 +0100
Subject: [Python-ideas] Fwd: Why do equality tests between OrderedDict
 keys/values views behave not as expected?
In-Reply-To: <>
References: <>
Message-ID: <>

If we put technical considerations aside, maybe we should just ask
ourselves what behavior we expect when doing equality tests between
ordered dictionaries. As a reminder:

>>> xy = OrderedDict([('x', None), ('y', None)])
>>> yx = OrderedDict([('y', None), ('x', None)])
>>> xy == yx
False
>>> xy.items() == yx.items()
True
>>> xy.keys() == yx.keys()
True
>>> xy.values() == yx.values()
False

So, it appears that:
1. equality tests between odict_values use object identity and not
equality,
2. equality tests between odict_keys do not respect order.

If it is not technically possible to change the current implementation,
maybe all we can do is just add a warning about current behavior in the
documentation?
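
Until then, one workaround that such a documentation note could mention is to
convert the views to lists before comparing, which gives order-sensitive,
value-based equality. A minimal sketch:

```python
from collections import OrderedDict

xy = OrderedDict([('x', None), ('y', None)])
yx = OrderedDict([('y', None), ('x', None)])

# Lists compare element by element, in order, using ==.
keys_equal = list(xy.keys()) == list(yx.keys())        # order differs
values_equal = list(xy.values()) == list(yx.values())  # [None, None] both times
items_equal = list(xy.items()) == list(yx.items())     # order differs

# The values view itself falls back to identity comparison.
same_view_equal = xy.values() == xy.values()

print(keys_equal, values_equal, items_equal, same_view_equal)
```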

On Mon, Jan 11, 2016 at 4:17 AM, Guido van Rossum <guido at> wrote:

> Seems like we dropped the ball... Is there any action item here?
> --
> --Guido van Rossum (
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From agustin.herranz at  Wed Jan 20 09:27:37 2016
From: agustin.herranz at (Agustín Herranz Cecilia)
Date: Wed, 20 Jan 2016 15:27:37 +0100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
Message-ID: <>

Hi! I came to this thread late and by coincidence, but after reading the
whole thread I want to share some thoughts:

The main concern is to add some kind of gradual typing to
Python 2 code. Because it is working Python 2 code, it can't use
annotations, and the types can't be added in code, so a type comment is the
way to go (independent of the convenience, or not, of having type
annotations available at runtime).

Someone pointed out that using a comment with the Python 3 annotation
signature is good for teaching people how to use annotations. I feel
that's not the point; the point is to get people used to type hints. For
this, the same syntax must be used across different Python versions, so
'function type comments' must also be available in Python 3 code.
This also allows people who can't/won't use annotations to use type hinting.

For this reason, I don't believe that the section "Suggested syntax for
Python 2.7 and straddling code" added to PEP 484 is the correct way to go.
The proper way is to add type comments for functions, as an extension of the
"Type comments" section or perhaps in a new PEP.

Some concerns that must be taken into account when adding function type
comments:

- Using the 'type comments' syntax of PEP 484, a function signature should
look like this:

def func(arg1, arg2):  # type: Callable[[int, int], int]
     """ Do something """
     return arg1 + arg2

- This easily becomes a long line, so it breaks PEP 8 and linters would
complain. So a way to put the type comment on another line needs to be
defined. Will the type comment go on the line after or on the line
before? Will putting it on another line be available only for function type
comments, or for other type comments too?

- With some kinds of complex types the type comment will surely become a
long line too. How will type comments be broken across several lines?

- GvR's proposal includes some syntactic sugar for function type
comments (" # type: (t_arg1, t_arg2) -> t_ret "). I think it's good, but
this must be an alternative to the typing module syntax (PEP 484), not the
preferred way (so that people get used to type hints). Is this syntactic
sugar compatible with generators? Could type analyzers
differentiate between a Callable and a Generator?
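
For reference, this is a sketch of how the function type comment syntax looks
in PEP 484 as published: the signature comment goes on the first line of the
body, and long signatures can instead put one comment per argument. The
function names and bodies below are placeholders; the comments are ignored at
runtime, so the code runs unchanged on Python 2 and 3.

```python
from typing import Optional  # referenced only inside the type comments

def func(arg1, arg2):
    # type: (int, int) -> int
    """Do something."""
    return arg1 + arg2

def send_email(address,     # type: str
               subject,     # type: str
               body=None,   # type: Optional[str]
               ):
    # type: (...) -> bool
    """Per-argument form, for signatures too long for one line."""
    return bool(address and subject)

print(func(1, 2))                    # 3
print(send_email("a@b", "hello"))    # True
```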

More concerns on type comments:

- As this is intended for gradually typing Python 2 code in order to port it
to Python 3, I think it would be convenient to add some sort of import that
is used only for type checking, and is imported only by the type analyzer,
not at runtime. This could be achieved by prepending "#type: " to the normal
import statement, something like:
     # type: import module
     # type: from package import module

- It must also be addressed how this works in a Python 2 to Python 3
environment, as there are types with the same name, str for example, that
work differently in each Python version. If the code targets only one
version, it uses the type names of that version. For 2/3 code, types could be
defined with a "py2" prefix in a module that could be called "py2types",
having "py2str", for example, to mark things that are of the Python 2 str
type. Python 3 types would not have prefixes.

I hope this reasoning and these ideas will be useful. I also hope that I have
expressed myself well enough; English is not my mother tongue.

Agustín Herranz.
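
On the checker-only import idea: the typing module as it exists today offers a
TYPE_CHECKING constant that serves a similar purpose without new comment
syntax. A minimal sketch, where the function and the choice of Decimal are
just placeholders:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by the type checker only; this branch is not executed at runtime.
    from decimal import Decimal

def double(amount):
    # type: (Decimal) -> Decimal
    return amount * 2

print(TYPE_CHECKING)  # False at runtime
```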

From guido at  Wed Jan 20 11:48:56 2016
From: guido at (Guido van Rossum)
Date: Wed, 20 Jan 2016 08:48:56 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

But 'shared' and 'local' are both the wrong words to use here. Also
probably this should syntactically be tied to the function header so the
time of evaluation is clear(er).

On Tue, Jan 19, 2016 at 10:37 PM, Nick Coghlan <ncoghlan at> wrote:

> On 20 January 2016 at 10:38, Chris Angelico <rosuav at> wrote:
> > Big one for the bike-shedding: Is this "capture as local" (the same
> > semantics as the default arg - if you rebind it, it changes for the
> > current invocation only), or "capture as static" (the same semantics
> > as a closure if you use the 'nonlocal' directive - if you rebind it,
> > it stays changed), or "capture as constant" (what people are usually
> > going to be doing anyway)?
> The "shared value" approach can already be achieved by binding a
> mutable object rather than an immutable one, and there's no runtime
> speed difference between looking up a local and looking up a constant,
> so I think it makes sense to just stick with "default argument
> semantics, but without altering the function signature"
> One possible name for such a directive would be "sharedlocal": it's in
> most respects a local variable, but the given definition time
> initialisation value is shared across all invocations to the function.
> With that spelling:
>     def f(*, len=len):
>          ...
> Would become:
>     def f():
>         sharedlocal len=len
>         ...
> And you'd also be able to do things like:
>     def f():
>         sharedlocal cache={}
> Alternatively, if we just wanted to support early binding of
> pre-existing names, then "bindlocal" could work:
>     def f():
>         bindlocal len
>         ...
> Either approach could be used to handle early binding of loop
> iteration variables:
>     for i in range(10):
>         def f():
>             sharedlocal i=i
>             ...
>     for i in range(10):
>         def f():
>             bindlocal i
>             ...
> I'd be -1 on bindlocal (I think dynamic optimisers like PyPy or Numba,
> or static ones like Victor's FAT Python project are better answers
> there), but "sharedlocal" is more interesting, since it means you can
> avoid creating a closure if all you need is to persist a bit of state
> between invocations of a function.
> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

--Guido van Rossum (

From abarnert at  Wed Jan 20 12:42:05 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 20 Jan 2016 09:42:05 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 20, 2016, at 06:27, Agustín Herranz Cecilia <agustin.herranz at> wrote:
> - GVR proposal includes some kind of syntactic sugar for function type comments (" # type: (t_arg1, t_arg2) -> t_ret "). I think it's good but this must be an alternative over typing module syntax (PEP484), not the preferred way (for people get used to typehints). Is this syntactic sugar compatible with generators? The type analyzers could be differentiate between a Callable and a Generator?

I'm pretty sure Generator is not the type of a generator function, but of a generator object. So to type a generator function, you just write `(int, int) -> Generator[int]`. Or, the long way, `Function[[int, int], Generator[int]]`.

(Of course you can use Callable instead of the more specific Function, or Iterator (or even Iterable) instead of the more specific Generator, if you want to be free to change the implementation to use an iterator class or something later, but normally you'd want the most specific type, I think.)
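
A sketch of both choices using type comments (function names are invented).
One caveat: in the typing module as published, Generator takes three
parameters (yield, send and return types), so the shorthand `Generator[int]`
above is spelled `Generator[int, None, None]` here.

```python
from typing import Generator, Iterator  # referenced only in the type comments

def countdown(start, step):
    # type: (int, int) -> Generator[int, None, None]
    # Most specific: callers may rely on this being a real generator.
    while start > 0:
        yield start
        start -= step

def evens(limit):
    # type: (int) -> Iterator[int]
    # Less specific: leaves room to return any iterator later.
    return iter(range(0, limit, 2))

print(list(countdown(5, 2)))  # [5, 3, 1]
print(list(evens(7)))         # [0, 2, 4, 6]
```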
> - As this is intended to gradual type python2 code to port it to python 3 I think it's convenient to add some sort of import that only be used for type checking, and be only imported by the type analyzer, not the runtime. This could be achieve by prepending "#type: " to the normal import statement, something like:
>    # type: import module
>    # type: from package import module

That sounds like a bad idea. If the typing module shadows some global, you won't get any errors, but your code will be misleading to a reader (and even worse if you from package.module import t). If the cost of the import is too high for Python 2, surely it's also too high for Python 3. And what other reason do you have for skipping it?

> - Also there must be addressed how it work on a python2 to python3 environment as there are types with the same name, str for example, that works differently on each python version. If the code is for only one version uses the type names of that version.

That's the same problem that exists at runtime, and people (and tools) already know how to deal with it: use bytes when you mean bytes, unicode when you mean unicode, and str when you mean whatever is "native" to the version you're running under and are willing to deal with it. So now you just have to do the same thing in type hints that you're already doing in constructors, isinstance checks, etc.

Of course many people use libraries like six to help them deal with this, which means that those libraries have to be type-hinted appropriately for both versions (maybe using different stubs for py2 and py3, with the right one selected at pip install time?), but if that's taken care of, user code should just work.
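
As a rough sketch of what that looks like in type comments: bytes is spelled
bytes, and "unicode text on either version" can be spelled with typing.Text
(an alias for unicode on Python 2 and str on Python 3). The helper names are
hypothetical.

```python
from typing import Text  # referenced only in the type comments

def to_text(data, encoding="utf-8"):
    # type: (bytes, str) -> Text
    """Decode bytes into the running version's Unicode text type."""
    return data.decode(encoding)

def to_bytes(text, encoding="utf-8"):
    # type: (Text, str) -> bytes
    """Encode Unicode text into bytes."""
    return text.encode(encoding)

print(to_text(b"caf\xc3\xa9") == u"caf\xe9")   # True
```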

From srkunze at  Wed Jan 20 13:39:09 2016
From: srkunze at (Sven R. Kunze)
Date: Wed, 20 Jan 2016 19:39:09 +0100
Subject: [Python-ideas] Fwd: Why do equality tests between OrderedDict
 keys/values views behave not as expected?
In-Reply-To: <>
References: <>
Message-ID: <>

Documentation is a very good idea.

Maybe, even raise an error when comparing values.


On 20.01.2016 12:13, Alexandre Figura wrote:
> If we put technical considerations aside, maybe we should just ask to 
> ourselves what behavior do we expect when doing equality tests between 
> ordered dictionaries. As a reminder:
> >>> xy = OrderedDict([('x', None), ('y', None)])
> >>> yx = OrderedDict([('y', None), ('x', None)])
> >>> xy == yx
> False
> >>> xy.items() == yx.items()
> True
> >>> xy.keys() == yx.keys()
> True
> >>> xy.values() == yx.values()
> False
> So, it appears that:
> 1. equality tests between odict_values use objects identity and not 
> equality,
> 2. equality tests between odict_keys do not respect order.
> If it is not technically possible to change the current 
> implementation, maybe all we can do is just add a warning about 
> current behavior in the documentation?
> On Mon, Jan 11, 2016 at 4:17 AM, Guido van Rossum <guido at 
> <mailto:guido at>> wrote:
>     Seems like we dropped the ball... Is there any action item here?
>     -- 
>     --Guido van Rossum ( <>)

From tjreedy at  Wed Jan 20 13:58:46 2016
From: tjreedy at (Terry Reedy)
Date: Wed, 20 Jan 2016 13:58:46 -0500
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <n7oldl$sd1$>

On 1/20/2016 11:48 AM, Guido van Rossum wrote:
> But 'shared' and 'local' are both the wrong words to use here. Also
> probably this should syntactically be tied to the function header so the
> time of evaluation is clear(er).

Use ';' in the parameter list, followed by name=expr pairs.  The 
question is whether the names after the ';' are initialized local variables, 
subject to rebinding at runtime, or named constants, with the names replaced 
by the values at definition time.  In the former case, a type hint could be 
included.  In the latter case, which is much better for optimization, 
the fixed object would already be typed.

def f(int a, int b=1; int c=2) => int

Terry Jan Reedy

From at  Wed Jan 20 14:05:17 2016
From: at (Yury Selivanov)
Date: Wed, 20 Jan 2016 14:05:17 -0500
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>


On 2016-01-20 1:37 AM, Nick Coghlan wrote:
> On 20 January 2016 at 10:38, Chris Angelico<rosuav at>  wrote:
>> >Big one for the bike-shedding: Is this "capture as local" (the same
>> >semantics as the default arg - if you rebind it, it changes for the
>> >current invocation only), or "capture as static" (the same semantics
>> >as a closure if you use the 'nonlocal' directive - if you rebind it,
>> >it stays changed), or "capture as constant" (what people are usually
>> >going to be doing anyway)?
> The "shared value" approach can already be achieved by binding a
> mutable object rather than an immutable one, and there's no runtime
> speed difference between looking up a local and looking up a constant,
> so I think it makes sense to just stick with "default argument
> semantics, but without altering the function signature"
> One possible name for such a directive would be "sharedlocal": it's in
> most respects a local variable, but the given definition time
> initialisation value is shared across all invocations to the function.
> With that spelling:
>      def f(*, len=len):
>           ...
> Would become:
>      def f():
>          sharedlocal len=len

FWIW I strongly believe that this feature (at least the
"len=len"-like optimizations) should be implemented as an
optimization in the interpreter.

We already have "nonlocal" and "global".  Having a third
modifier (such as sharedlocal, static, etc) will only
introduce confusion and make Python less comprehensible.


From abarnert at  Wed Jan 20 15:42:47 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 20 Jan 2016 20:42:47 +0000 (UTC)
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <>
Message-ID: <>

On Wednesday, January 20, 2016 11:05 AM, Yury Selivanov < at> wrote:

> FWIW I strongly believe that this feature (at least the
> "len=len"-like optimizations) should be implemented as an
> optimization in the interpreter.

The problem is that there are two reasonable interpretations for free variables--variable capture or value capture--and Python can only do one or the other automatically.

Python does variable capture, because that's what you usually want.[*] But when you _do_ want value capture, you need some way to signal it. In some cases, the only reason you want value capture is as an optimization, and maybe the optimizer can handle that for you. But sometimes there's a semantic reason you want it--such as the well known case (covered in the official Python Programming FAQ [1]) where you're trying to capture the separate values of an iteration variable in a bunch of separate functions defined in the loop. And we need some way to spell that.

Of course we already have a way to spell that, the `a=a` default value trick. And I personally think that's good enough. But if the community disagrees, and we come up with a new syntax, I don't see why we should stop people from also using that new syntax for the optimization case when they know they want it.[**]

  [*] Note that in C++, which people keep referring to, the Core Guidelines suggest using variable capture by default. And their main exception--use value capture when you need to keep something around beyond the lifetime of its original scope, because otherwise you'd get a dangling reference to a destroyed object--doesn't apply to Python.

  [**] I don't think people are abusing the default-value trick for optimization--I generally only see `len=len` in low-level library code that may end up getting used inside an inner loop--so I doubt they'd abuse any new syntax for the same thing.
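
For readers who have not run into the FAQ case, here is a minimal sketch in current Python of the two behaviours being contrasted. The names `by_variable` and `by_value` are made up for illustration; nothing here uses any proposed syntax.

```python
# Variable capture: every closure looks up the same loop variable `i`
# when it is called, so after the loop they all see its final value.
by_variable = [lambda x: x ** i for i in range(5)]

# Value capture via the default-value trick: `i=i` is evaluated once per
# lambda, at definition time, so each function keeps its own value.
by_value = [lambda x, i=i: x ** i for i in range(5)]

print([f(2) for f in by_variable])  # [16, 16, 16, 16, 16]
print([f(2) for f in by_value])     # [1, 2, 4, 8, 16]
```

The cost of the trick is that `i` becomes a real parameter, so a caller can override it by accident, which is the part any new syntax would be trying to remove.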


> We already have "nonlocal" and "global".  Having a third
> modifier (such as sharedlocal, static, etc) will only
> introduce confusion and make Python less comprehensible.

I agree with that. Also, none of the names people are proposing make much sense. "static" looks like a function-level static in C and its descendants, but does something completely different. "capture" means the exact opposite of what it says, and "sharedlocal" sounds like it's going to be "more shared" than the default for free variables when it's actually less shared.

From abarnert at  Wed Jan 20 15:55:30 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 20 Jan 2016 20:55:30 +0000 (UTC)
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <n7oldl$sd1$>
References: <n7oldl$sd1$>
Message-ID: <>

On Wednesday, January 20, 2016 10:59 AM, Terry Reedy <tjreedy at> wrote:

> Use ';' in the parameter list, followed by name=expr pairs.

This is the best option anyone's suggested (short of just not doing anything and telling people to keep using the default-value trick on the rare occasions where this is necessary). However, I'd suggest one minor change: for the common case of `j=j, len=len`, allow people to just write the name once. The compiler can definitely handle this:

    def spam(eggs; _i=i, j, len): 

> The question is whether names after are initialized local variables,
> subject to rebinding at runtime, or named constants, with the names
> replaced by the values at definition time.

They almost certainly should be variables, just like parameters, with the values stored in `__defaults__`. Otherwise, this code:

    powers = [lambda x; i: x**i for i in range(5)]

... produces functions with their own separate code objects, instead of functions that share a single code object. And this isn't some weird use case; the "defining functions in a loop that capture the loop iterator by value" is the paradigm case for this new feature. (It's even covered in the official FAQ.)

The performance cost of those separate code objects (and the cache misses caused when you try to use them in a loop) has almost no compensating performance gain (`LOAD_CONST` isn't faster than `LOAD_FAST`, and the initial copy from `__defaults__` at call time is about 1/10th the cost of either). And it's more complicated to implement (especially from where we are today), and less flexible for reflective code that munges functions.
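
The sharing can be observed today with the default-value spelling, which is what the variable-style semantics would presumably compile down to. This is only a sketch using the existing `i=i` trick as a stand-in for the proposed `;` syntax.

```python
powers = [lambda x, i=i: x ** i for i in range(5)]

# The lambda is compiled once; every function object reuses that code object.
shares_code = all(f.__code__ is powers[0].__code__ for f in powers)

# The per-function state lives in __defaults__, one tuple per function.
defaults = [f.__defaults__ for f in powers]

print(shares_code)  # True
print(defaults)     # [(0,), (1,), (2,), (3,), (4,)]
```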

> In the former case, a type hint could be included.  In the latter case,
> which is much better for optimization, the fixed object would already
> be typed.
>
> def f(int a, int b=1; int c=2) => int

You've got the syntax wrong. But, more importantly, besides the latter case (const vs. default) actually being worse for optimization, it isn't any better for type inference. In this function:

    def f(a: int, b: int=1; c=2) -> int:

or even this one:

    def f():
        for i in range(5):
            def local(x: int; i) -> int:
                return x**i
            yield local

... the type checker can infer the type of the captured variable (`c` or `i`): it's initialized with an int literal (first version) or the value of a variable that's been inferred as an int (second version); therefore, it's an int. So it can emit a warning if you assign anything but another int to it.

The only problem with your solution is that we now have three different variations that are all spelled very differently: 

def spam(i; j):  # captured by value 

def spam(i): 
    nonlocal j  # captured by variable 

def spam(i): # captured by variable if no assignment, else shadowed by a local

Is that acceptable?

From mike at  Wed Jan 20 18:56:00 2016
From: mike at (Michael Selik)
Date: Wed, 20 Jan 2016 23:56:00 +0000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Wed, Jan 20, 2016 at 2:05 PM Yury Selivanov < at> wrote:

> On 2016-01-20 1:37 AM, Nick Coghlan wrote:
> > On 20 January 2016 at 10:38, Chris Angelico<rosuav at>  wrote:
> > With that spelling:
> >
> >      def f(*, len=len):
> >           ...
> >
> > Would become:
> >
> >      def f():
> >          sharedlocal len=len
> FWIW I strongly believe that this feature (at least the
> "len=len"-like optimizations) should be implemented as an
> optimization in the interpreter.
> We already have "nonlocal" and "global".  Having a third
> modifier (such as sharedlocal, static, etc) will only
> introduce confusion and make Python less comprehensible.

If the purpose is to improve speed, it certainly feels like an interpreter
optimization. The other thread about adding ``ma_version`` to dicts might
be useful for quickening the global variable lookup.

If the purpose is to store the current global value, it might be reasonable
to add a language feature to make that more explicit. Beginners often
mistakenly think that default values are evaluated and assigned at
call-time instead of def-time. However, adding a new, more explicit
language feature wouldn't eliminate the current confusion. Instead we'd
have two ways to do it.
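
A small sketch of the def-time versus call-time distinction, for anyone who wants to check it at the prompt. The names `counter`, `at_def_time` and `at_call_time` are invented for the example.

```python
counter = 10

def at_def_time(value=counter):
    # `counter` was evaluated once, when the def statement ran.
    return value

def at_call_time():
    # A plain global lookup happens again on every call.
    return counter

counter = 20

print(at_def_time())   # 10
print(at_call_time())  # 20
```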

From steve at  Wed Jan 20 19:10:32 2016
From: steve at (Steven D'Aprano)
Date: Thu, 21 Jan 2016 11:10:32 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Tue, Jan 19, 2016 at 05:01:42PM -0800, Guido van Rossum wrote:
> On Tue, Jan 19, 2016 at 4:37 PM, Steven D'Aprano <steve at>
> wrote:
> > On Tue, Jan 19, 2016 at 08:47:28AM -0800, Guido van Rossum wrote:
> >
> > > I think it's reasonable to divert this discussion to "value capture".
> > If I understand you correctly, that's precisely what a function default
> > argument does: capture the current value of the default value expression
> > at the time the function is called.
> I think you misspoke here (I don't think you actually believe what you said
> :-).
> Function defaults capture the current value at the time the function is
> *defined*.

Oops! You got me. Yes, I meant defined, not called.

> > > The best syntax for such capture remains to be seen. ("Capture" seems to
> > > universally make people think of "variable capture" which is the opposite
> > > of what we want here.)
> >
> > If I recall correctly, there was a recent(?) proposal for a "static"
> > keyword with similar semantics:
> >
> > def func(a):
> >    static b = expression
> >    ...
> >
> > would guarantee that expression was evaluated exactly once.
> Once per what? In the lifetime of the universe? Per CPython process start?
> Per call?
> J/K, I think I know what you meant -- once per function definition (same as
> default values).

That's what I mean. Although, I am curious as to how we might implement 
the once per lifetime of the universe requirement :-)

> > If that
> > evaluation occurred when func was defined, rather than when it was first
> > called,
> (FWIW, "when it was first called" would be a recipe for disaster and
> irreproducible results.)

It probably would be a bug magnet. Good thing I'm not asking for that 
behaviour then :-)

> > Scoping rules might be tricky to get right. Perhaps rather than a
> > declaration, "static" might be better treated as a block:
> >
> Why? This does smell like a directive similar to global and nonlocal.

I'm just tossing the "static block" idea out for discussion, but if you 
want a justification here are two differences between capture/static 
and global/nonlocal which suggest they aren't that similar and so we 
shouldn't feel obliged to use the same syntax.

(1) global and nonlocal operate on *names*, not values. E.g. after 
"global x", x refers to a name in the global scope, not the local scope.

But "capture"/"static" doesn't affect the name, or the scope that x 
belongs to. x is still a local, it just gets pre-initialised to the 
value of x in the enclosing scope. That makes it more of a binding 
operation or assignment than a declaration.

(2) If we limit this to only capturing the same name, then we can only 
write (say) "static x", and that does look like a declaration. But maybe 
we want to allow the local name to differ from the global name:

    static x = y

or even arbitrary expressions on the right:

    static x = x + 1

Now that starts to look more like it should be in a block of code, 
especially if you have a lot of them:

    static x = x + 1
    static len = len
    static data = open("data.txt").read()

versus:

    static:
        x = x + 1
        len = len
        data = open("data.txt").read()
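
Point (1) can be illustrated with what exists today, taking the default-value trick as a rough stand-in for "static"/"capture" (that equivalence is an assumption, since no such statement exists). The helper names are made up.

```python
def make_counters():
    n = 0

    def by_name():
        nonlocal n  # operates on the *name*: rebinding is seen by the outer scope
        n += 1
        return n

    def by_value(n=n):
        # `n` is an ordinary local, pre-initialised at definition time;
        # rebinding it only affects the current call.
        n += 1
        return n

    return by_name, by_value

by_name, by_value = make_counters()
print([by_name() for _ in range(3)])   # [1, 2, 3]
print([by_value() for _ in range(3)])  # [1, 1, 1]
```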

I acknowledge that this goes beyond what the OP asked for, and I think 
that YAGNI is a reasonable response to the static block idea. I'm not 
going to champion it any further unless there's a bunch of interest from 
others. (I'm saving my energy for Eiffel-like require/ensure blocks 


From guido at  Wed Jan 20 19:11:03 2016
From: guido at (Guido van Rossum)
Date: Wed, 20 Jan 2016 16:11:03 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 20, 2016 at 9:42 AM, Andrew Barnert via Python-ideas <
python-ideas at> wrote:

> On Jan 20, 2016, at 06:27, Agustín Herranz Cecilia <
> agustin.herranz at> wrote:
> >
> > - GVR proposal includes some kind of syntactic sugar for function type
> comments (" # type: (t_arg1, t_arg2) -> t_ret "). I think it's good but
> this must be an alternative over typing module syntax (PEP484), not the
> preferred way (for people get used to typehints). Is this syntactic sugar
> compatible with generators? The type analyzers could be differentiate
> between a Callable and a Generator?
> I'm pretty sure Generator is not the type of a generator function, but of
> a generator object. So to type a generator function, you just write `(int,
> int) -> Generator[int]`. Or, the long way, `Function[[int, int],
> Generator[int]]`.

There is no 'Function' -- it existed in mypy before PEP 484 but was
replaced by 'Callable'. And you don't annotate a function def with '->
Callable' (unless it returns another function). The Callable type is only
needed in the signature of higher-order functions, i.e. functions that take
functions for arguments or return a function. For example, a simple map
function would be written like this:

def map(f: Callable[[T], S], a: List[T]) -> List[S]:
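
A runnable version of the same idea, as a sketch: the name `my_map` is used only to avoid shadowing the builtin, and the body is one plausible implementation rather than anything from the PEP.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
S = TypeVar("S")

def my_map(f: Callable[[T], S], a: List[T]) -> List[S]:
    # Callable appears only in the parameter annotation; the def itself
    # is annotated with ordinary argument and return types.
    return [f(item) for item in a]

lengths = my_map(len, ["a", "bb", "ccc"])
print(lengths)  # [1, 2, 3]
```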

As to generators, we just improved how mypy treats generators (
The Generator type has *three* parameters: the "yield" type (what's
yielded), the "send" type (what you send() into the generator, and what's
returned by yield), and the "return" type (what a return statement in the
generator returns, i.e. the value for the StopIteration exception). You can
also use Iterator if your generator doesn't expect its send() or throw()
methods to be called and it isn't returning a value for the benefit of
'yield from'.

For example, here's a simple generator that iterates over a list of
strings, skipping alternating values:

def skipper(a: List[str]) -> Iterator[str]:
    for i, s in enumerate(a):
        if i%2 == 0:
            yield s

and here's a coroutine returning a string (I know, it's pathetic, but it's
an example :-):

def readchar() -> Generator[Any, None, str]:
    # Implementation not shown
    ...
def readline() -> Generator[Any, None, str]:
    buf = ''
    while True:
        c = yield from readchar()
        if not c: break
        buf += c
        if c == '\n': break
    return buf

Here, in Generator[Any, None, str], the first parameter ('Any') refers to
the type yielded -- it actually yields Futures, but we don't care about
that (it's an asyncio implementation detail). The second parameter ('None')
is the type returned by yield -- again, it's an implementation detail and
we might just as well say 'Any' here. The third parameter (here 'str') is
the type actually returned by the 'return' statement.
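
To see all three parameters doing something, here is a small sketch that does not involve asyncio. The `averager` generator is invented for the example: it yields floats, accepts ints through send(), and returns a str.

```python
from typing import Generator

def averager() -> Generator[float, int, str]:
    total = 0
    count = 0
    average = 0.0
    while True:
        value = yield average  # yield type: float, send type: int
        if value < 0:
            break
        total += value
        count += 1
        average = total / count
    return "done after %d values" % count  # return type: str

gen = averager()
first = next(gen)      # run to the first yield
second = gen.send(10)
third = gen.send(20)
try:
    gen.send(-1)
except StopIteration as stop:
    result = stop.value  # the returned value rides on StopIteration

print(first, second, third, result)  # 0.0 10.0 15.0 done after 2 values
```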

It's illustrative to observe that the signature of readchar() is exactly
the same (since it also returns a string). OTOH the return type of e.g.
asyncio.sleep() is Generator[Any, None, None], because it doesn't return a
value.

This business is clearly still suboptimal -- we would like to introduce a
new type, perhaps named Coroutine, so that you can write Coroutine[T]
instead of Generator[Any, None, T]. But that would just be a shorthand. The
actual type of a generator object is always some parametrization of
Generator.

In any case, whatever we write after the -> (i.e., the return type) is
still the type of the value you get when you call the function. If the
function is a generator function, the value you get is a generator object,
and that's what the return type designates.

> (Of course you can use Callable instead of the more specific Function, or
> Iterator (or even Iterable) instead of the more specific Generator, if you
> want to be free to change the implementation to use an iterator class or
> something later, but normally you'd want the most specific type, I think.)

I don't know where you read about Callable vs. Function.

Regarding using Iterator[T] instead of Generator[..., ..., T], you are
correct.

Note that you *cannot* define a generator function as returning a
*subclass* of Iterator/Generator; there is no way to have a generator
function instantiate some other class as its return value. Consider
(ignoring generic types):

class MyIterator:
    def __next__(self): ...
    def __iter__(self): ...
    def bar(self): ...

def foo() -> MyIterator:

x = foo()  # Boom!

The type checker would assume that x has a method bar() based on the
declared return type for foo(), but it doesn't. (There are a few other
special cases, in addition to Generator and Iterator; declaring the return
type to be Any or object is allowed.)
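
The runtime side of this is easy to check. The sketch below leaves the misleading annotation off and just inspects what calling a generator function produces; the body of `foo` is an arbitrary placeholder.

```python
import types

class MyIterator:
    def __iter__(self):
        return self

    def __next__(self):
        raise StopIteration

    def bar(self):
        return "bar"

def foo():
    # Annotating this as "-> MyIterator" would be wrong: the interpreter
    # always builds a plain generator object for a generator function.
    yield 1

x = foo()
is_generator = isinstance(x, types.GeneratorType)
is_my_iterator = isinstance(x, MyIterator)
has_bar = hasattr(x, "bar")

print(is_generator, is_my_iterator, has_bar)  # True False False
```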

> > - As this is intended to gradual type python2 code to port it to python
> 3 I think it's convenient to add some sort of import that only be used for
> type checking, and be only imported by the type analyzer, not the runtime.
> This could be achieve by prepending "#type: " to the normal import
> statement, something like:
> >    # type: import module
> >    # type: from package import module
> That sounds like a bad idea. If the typing module shadows some global, you
> won't get any errors, but your code will be misleading to a reader (and
> even worse if you from package.module import t). If the cost of the import
> is too high for Python 2, surely it's also too high for Python 3. And what
> other reason do you have for skipping it?

Exactly. Even though (when using Python 2) all type annotations are in
comments, you still must write real imports. (This causes minor annoyances
with linters that warn about unused imports, but there are ways to teach

> > - Also there must be addressed how it work on a python2 to python3
> environment as there are types with the same name, str for example, that
> works differently on each python version. If the code is for only one
> version uses the type names of that version.
> That's the same problem that exists at runtime, and people (and tools)
> already know how to deal with it: use bytes when you mean bytes, unicode
> when you mean unicode, and str when you mean whatever is "native" to the
> version you're running under and are willing to deal with it. So now you
> just have to do the same thing in type hints that you're already doing in
> constructors, isinstance checks, etc.

This is actually still a real problem. But it has no bearing on the choice
of syntax for annotations in Python 2 or straddling code.

> Of course many people use libraries like six to help them deal with this,
> which means that those libraries have to be type-hinted appropriately for
> both versions (maybe using different stubs for py2 and py3, with the right
> one selected at pip install time?), but if that's taken care of, user code
> should just work.

Yeah, we could use help. There are some very rudimentary stubs for a few
things defined by six (, but we
need more. There's a PR but it's of bewildering size (

PS. I have a hard time following the rest of Agustin's comments. The
comment-based syntax I proposed for Python 2.7 does support exactly the
same functionality as the official PEP 484 syntax; the only thing it
doesn't allow is selectively leaving out types for some arguments -- you
must use 'Any' to fill those positions. It's not a problem in practice, and
it doesn't reduce functionality (omitted argument types are assumed to be
Any in PEP 484 too). I should also remark that mypy supports the
comment-based syntax in Python 2 mode as well as in Python 3 mode; but when
writing Python 3 only code, the non-comment version is strongly preferred.
(We plan to eventually produce a tool that converts the comments to
standard PEP 484 syntax).
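
As a concrete sketch of the comment form (the function `describe` is made up): the second argument has no interesting type, so the position is filled with 'Any' rather than left out. Note the real import of Any even though it is only used in a comment.

```python
from typing import Any

def describe(count, payload):
    # type: (int, Any) -> str
    return "%d: %r" % (count, payload)

print(describe(2, None))         # 2: None
print(describe.__annotations__)  # {} since the comment is invisible at runtime
```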

--Guido van Rossum (

From steve at  Wed Jan 20 19:52:25 2016
From: steve at (Steven D'Aprano)
Date: Thu, 21 Jan 2016 11:52:25 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <n7oldl$sd1$>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Wed, Jan 20, 2016 at 01:58:46PM -0500, Terry Reedy wrote:
> On 1/20/2016 11:48 AM, Guido van Rossum wrote:
> >But 'shared' and 'local' are both the wrong words to use here. Also
> >probably this should syntactically be tied to the function header so the
> >time of evaluation is clear(er).
> Use ';' in the parameter list, followed by name=expr pairs.  The 
> question is whether names after are initialized local variables, subject 
> to rebinding at runtime, or named constants, with the names replaced by 
> the values at definition time.  In the former case, a type hint could be
> included.  In the latter case, which is much better for optimization, 
> the fixed object would already be typed.
> def f(int a, int b=1; int c=2) => int

I almost like that.

The problem is that the difference between ; and , is visually 
indistinct and easy to mess up. I've occasionally typed ; in a parameter 
list and got a nice SyntaxError telling me I've messed up, but with your 
suggestion the function will just silently do the wrong thing.

I suggest a second "parameter list":

def func(a:int, b:int=1)(c:int)->int:

is morally equivalent to:

def func(a:int, b:int=1, c:int=c)->int:

except that c is not a parameter of the function and cannot be passed 
as an argument:

func(a=0, b=2)  # okay
func(a=0, b=2, c=1)  # TypeError
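
For comparison, the closest thing available in current Python is a factory function. This is only an approximation of the proposed semantics, and `_make_func` is a hypothetical helper name.

```python
c = 10

def _make_func(c):
    # `c` is bound here, once, when _make_func is called below.
    def func(a: int, b: int = 1) -> int:
        return a + b + c
    return func

func = _make_func(c)
c = 99  # rebinding the outer name does not affect func

print(func(a=0, b=2))  # 12

try:
    func(a=0, b=2, c=1)
except TypeError as exc:
    print("TypeError:", exc)
```

The second parameter list would give the same result without the extra level of nesting.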

We still lack a good term for what the (c) thingy should be called. I'm 
not really happy with either of "static" or "capture", but for lack of 
anything better I'll go with capture for the moment.

So a full function declaration looks like:


(Bike-shedders: do you prefer () [] or {} for the list of captures?)

CAPTURES is a comma-delimited list of local variable names, with 
optional type hint and optional bindings. Here are some examples:

    # Capture the values of x and y from the enclosing scope.
    # Both x and y must exist at func definition time.
    def func(arg)(x, y):
        # inside the body of func, x and y are locals

    # Same as above, except with type-hinting.
    # If x or y in the enclosing scope are not floats,
    # a checker should report a type error.
    def func(arg)(x:float, y:float):
        # inside the body of func, x and y are locals

    # Capture the values of x and y from the enclosing scope,
    # binding to names x and z.
    # Both x and y must exist at func definition time.
    def func(arg)(x, z=y):
        # inside the body of func, x and z are locals
        # while y would follow the usual scoping rules

    # Capture a copy of the value of dict d from the enclosing scope.
    # d must exist at func definition time.
    def func(arg)(d:dict=d.copy()):
        # inside the body of func, d is a local

If a capture consists of a name alone (or a name plus annotation), it 
declares a local variable of that name, and binds to it the captured 
value of the same name in the enclosing scope. E.g.:

x = 999
def func()(x):  # like x=x
    x += 1
    return (x, globals()['x'])

assert func() == (1000, 999)
x = 0
assert func() == (1000, 0)

If a capture consists of a name = expression, the expression is 
evaluated at function definition time, and the result captured.

y = 999
def func()(x=y+1):
    return x

assert func() == 1000
del y
assert func() == 1000
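
The same evaluation-time behaviour can be checked today with an ordinary default, which also shows the difference the capture list is meant to remove: with a default, x is still part of the signature. A sketch:

```python
import inspect

y = 999

def func(x=y + 1):
    return x

before = func()
del y           # the expression was already evaluated at definition time
after = func()

print(before, after)            # 1000 1000
print(inspect.signature(func))  # (x=1000)
print(func(5))                  # 5, because x can still be passed in
```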

Can we make this work with lambda? I think we can. The current lambda 
syntax is:

lambda params: expression

e.g. lambda x, y=y: x+y

Could we keep that (for backwards compatibility) but allow parens to 
optionally surround the parameter list? If so, then we can allow an 
optional second set of parens after the first, allowing captures:

lambda (x)(y): x+y

The difference between 

    lambda x,y=y: ...

and

    lambda (x)(y): ... 

is that the first takes two arguments, mandatory x and optional y (which 
defaults to the value of y from the enclosing scope), while the second 
only takes one argument, x.
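
The one-argument behaviour can be approximated now with an immediately-called outer lambda, which is roughly what the second set of parens would replace. The names `two_args` and `one_arg` are invented for the sketch.

```python
y = 10

# Default-value trick: y is an optional second parameter.
two_args = lambda x, y=y: x + y

# Immediately-called outer lambda: y is captured and is not a parameter.
one_arg = (lambda y: lambda x: x + y)(y)

y = 0  # neither function is affected by rebinding the outer name

print(two_args(1), two_args(1, 2))  # 11 3
print(one_arg(1))                   # 11
```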


From abarnert at  Wed Jan 20 19:54:53 2016
From: abarnert at (Andrew Barnert)
Date: Thu, 21 Jan 2016 00:54:53 +0000 (UTC)
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Wednesday, January 20, 2016 4:11 PM, Guido van Rossum <guido at> wrote:

>On Wed, Jan 20, 2016 at 9:42 AM, Andrew Barnert via Python-ideas <python-ideas at> wrote:
>On Jan 20, 2016, at 06:27, Agustín Herranz Cecilia <agustin.herranz at> wrote:
>>> - GVR proposal includes some kind of syntactic sugar for function type comments (" # type: (t_arg1, t_arg2) -> t_ret "). I think it's good but this must be an alternative over typing module syntax (PEP484), not the preferred way (for people get used to typehints). Is this syntactic sugar compatible with generators? The type analyzers could be differentiate between a Callable and a Generator?
>>I'm pretty sure Generator is not the type of a generator function, but of a generator object. So to type a generator function, you just write `(int, int) -> Generator[int]`. Or, the long way, `Function[[int, int], Generator[int]]`.

>There is no 'Function' -- it existed in mypy before PEP 484 but was replaced by 'Callable'. And you don't annotate a function def with '-> Callable' (unless it returns another function). 

Sorry about getting the `Function` from the initial proposal instead of the current PEP.

Anyway, I don't think the OP was suggesting that. If I interpreted his question right:

He was expecting that the comment `(int, int) -> int` was a way to annotate a function so it comes out as type `Callable[[int, int], int]`, which is correct. And he wanted to know how to instead write a comment for a generator function of type `GeneratorFunction[[int, int], int]`, and the answer is that you don't. There is no type needed for generator functions; they're just functions that return generators.

You're right that he doesn't need to know the actual type; you're never going to write that, you're just going to annotate the arguments and return value, or use the 2.x comment style:

    def f(arg1: int, arg2: int) -> Iterator[int]:
        ...

    def f(arg1, arg2):
        # type: (int, int) -> Iterator[int]
        ...

Either way, the type checker will determine that the type of the function is `Callable[[int, int], Iterator[int]]`, and the only reason you'll ever care is if that type shows up in an error message.

>Regarding using Iterator[T] instead of Generator[..., ..., T], you are correct.

>Note that you *cannot* define a generator function as returning a *subclass* of Iterator/Generator; 

But you could define it as returning the superclass `Iterable`, right? As I understand it, it's normal type variance, so any superclass will work; the only reason `Iterator` is special is that it happens to be simpler to specify than Generator and it's plausible that it isn't going to matter whether you've written a generator function or, say, a function that returns a list iterator.

> there is no way to have a generator function instantiate some other class as its return value.

If you really want that, you could always write a wrapper that forwards __next__, and a decorator that applies the wrapper. Can MyPy infer the type of the decorated function from the wrapped function and the decorator?

    # Can I leave this annotation off? And get one specialized to the actual
    # argument types of the wrapped function? That would be cool.
    def my_iterating(func: Callable[..., Iterator]) -> Callable[..., MyIterator]:
        def wrapper(*args, **kw):
            return MyIterator(func(*args, **kw))
        return wrapper

    def foo() -> Iterator[int]:

    x = foo()
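
Spelled out in full, the wrapper-plus-decorator idea might look like the sketch below. The `MyIterator` class body is an assumption (a thin wrapper that forwards `__next__`), and whether a checker can infer anything more precise than `Callable[..., MyIterator]` is left open.

```python
from typing import Any, Callable, Iterator

class MyIterator:
    def __init__(self, it: Iterator) -> None:
        self._it = it

    def __iter__(self) -> "MyIterator":
        return self

    def __next__(self) -> Any:
        # Forward to the wrapped generator (or any other iterator).
        return next(self._it)

    def bar(self) -> str:
        return "bar"

def my_iterating(func: Callable[..., Iterator]) -> Callable[..., MyIterator]:
    def wrapper(*args: Any, **kw: Any) -> MyIterator:
        return MyIterator(func(*args, **kw))
    return wrapper

@my_iterating
def foo() -> Iterator[int]:
    yield 1
    yield 2

x = foo()
print(x.bar())  # bar
print(list(x))  # [1, 2]
```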

From guido at  Wed Jan 20 20:04:21 2016
From: guido at (Guido van Rossum)
Date: Wed, 20 Jan 2016 17:04:21 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Wed, Jan 20, 2016 at 4:10 PM, Steven D'Aprano <steve at> wrote:

> I'm just tossing the "static block" idea out for discussion, but if you
> want a justification here are two differences between capture/static
> and global/nonlocal which suggest they aren't that similar and so we
> shouldn't feel obliged to use the same syntax.
> (1) global and nonlocal operate on *names*, not values. E.g. after
> "global x", x refers to a name in the global scope, not the local scope.
> But "capture"/"static" doesn't affect the name, or the scope that x
> belongs to. x is still a local, it just gets pre-initialised to the
> value of x in the enclosing scope. That makes it more of a binding
> operation or assignment than a declaration.
> (2) If we limit this to only capturing the same name, then we can only
> write (say) "static x", and that does look like a declaration. But maybe
> we want to allow the local name to differ from the global name:
>     static x = y
> or even arbitrary expressions on the right:
>     static x = x + 1
> Now that starts to look more like it should be in a block of code,
> especially if you have a lot of them:
>     static x = x + 1
>     static len = len
>     static data = open("data.txt").read()
> versus:
>     static:
>         x = x + 1
>         len = len
>         data = open("data.txt").read()
> I acknowledge that this goes beyond what the OP asked for, and I think
> that YAGNI is a reasonable response to the static block idea. I'm not
> going to champion it any further unless there's a bunch of interest from
> others.

Yeah, your arguments why it's different from global/nonlocal are
reasonable, but the question remains whether we really need all that
functionality. IIRC C++ lambdas only allow capturing a variable's value,
not an expression's.

So we should ask ourselves first: if we *only* had some directive that
captures some variables' values, essentially like the len=len argument
trick but without affecting the signature (e.g. just "static x, y, z"), how
much of the current pain would be addressed, and how much would remain?

> (I'm saving my energy for Eiffel-like require/ensure blocks
> *wink*).

Now you're making me curious.

--Guido van Rossum (

From guido at  Wed Jan 20 20:36:58 2016
From: guido at (Guido van Rossum)
Date: Wed, 20 Jan 2016 17:36:58 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 20, 2016 at 4:54 PM, Andrew Barnert <abarnert at> wrote:

> On Wednesday, January 20, 2016 4:11 PM, Guido van Rossum <guido at>
> wrote:
> >On Wed, Jan 20, 2016 at 9:42 AM, Andrew Barnert via Python-ideas <
> python-ideas at> wrote:
> >
> >On Jan 20, 2016, at 06:27, Agustín Herranz Cecilia <
> agustin.herranz at> wrote:
> >>>
> >>> - GVR proposal includes some kind of syntactic sugar for function type
> comments (" # type: (t_arg1, t_arg2) -> t_ret "). I think it's good but
> this must be an alternative over typing module syntax (PEP484), not the
> preferred way (for people get used to typehints). Is this syntactic sugar
> compatible with generators? The type analyzers could be differentiate
> between a Callable and a Generator?
> >>
> >>I'm pretty sure Generator is not the type of a generator function, but
> of a generator object. So to type a generator function, you just write
> `(int, int) -> Generator[int]`. Or, the long way, `Function[[int, int],
> Generator[int]]`.
> >
> >There is no 'Function' -- it existed in mypy before PEP 484 but was
> replaced by 'Callable'. And you don't annotate a function def with '->
> Callable' (unless it returns another function).
> Sorry about getting the `Function` from the initial proposal instead of
> the current PEP.
> Anyway, I don't think the OP was suggesting that. If I interpreted his
> question right:
> He was expecting that the comment `(int, int) -> int` was a way to
> annotate a function so it comes out as type `Callable[[int, int], int]`,
> which is correct.

Not really. I understand that you're saying that after:

def foo(a, b):
    # type: (int, int) -> str
    return str(a+b)

the type of 'foo' is 'Callable[[int, int], str]'.

But it really isn't. The type checker (e.g. mypy) knows more at this point:
it knows that foo has arguments named 'a' and 'b' and that e.g. calls like
'foo(1, b=2)' are valid. There's no way to express that using Callable.
Also Callable doesn't support argument defaults.
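
A small sketch of that difference, using the comment syntax under discussion. The helper call_positionally is hypothetical and only there to show what a Callable type can describe.

```python
from typing import Callable


def foo(a, b):
    # type: (int, int) -> str
    return str(a + b)


def call_positionally(f):
    # type: (Callable[[int, int], str]) -> str
    # Callable records only positional argument types and the return type,
    # so through 'f' only a positional call can be checked.
    return f(1, 2)


# For 'foo' itself the parameter names are known, so a keyword call is fine.
by_keyword = foo(1, b=2)
through_callable = call_positionally(foo)
```

Both calls work at runtime; the point is only about what a checker can know from each type.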

> And he wanted to know how to instead write a comment for a generator
> function of type `GeneratorFunction[[int, int], int]`, and the answer is
> that you don't. There is no type needed for generator functions; they're
> just functions that return generators.

Aha. No wonder I didn't get the question. :-(

> You're right that he doesn't need to know the actual type; you're never
> going to write that, you're just going to annotate the arguments and return
> value, or use the 2.x comment style:
>     def f(arg1: int, arg2: int) -> Iterator[int]
>     def f(arg1, arg2):
>         # type: (int, int) -> Iterator[int]
> Either way, the type checker will determine that type of the function is
> `Callable[[int, int], Iterator[int]]`, and the only reason you'll ever care
> is if that type shows up in an error message.

I don't think you can get the word 'Callable' to show up in an error message
unless it's part of the type as written somewhere. A name defined with
'def' is special and it shows up differently. (And so is a lambda.)

> >Regarding using Iterator[T] instead of Generator[..., ..., T], you are
> correct.
> >
> >Note that you *cannot* define a generator function as returning a
> *subclass* of Iterator/Generator;
> But you could define it as returning the superclass `Iterable`, right?


> As I understand it, it's normal type variance, so any superclass will
> work; the only reason `Iterator` is special is that it happens to be
> simpler to specify than Generator and it's plausible that it isn't going to
> matter whether you've written a generator function or, say, a function that
> returns a list iterator.


> > there is no way to have a generator function instantiate some other
> class as its return value.
> If you really want that, you could always write a wrapper that forwards
> __next__, and a decorator that applies the wrapper. Can MyPy infer the type
> of the decorated function from the wrapped function and the decorator?

I think that's an open question. Your example below is complicated because
of the *args, **kw pattern.

>     # Can I leave this annotation off? And get one specialized to the
> actual
>     # argument types of the wrapped function? That would be cool.

You can't -- mypy never infers a function's type from its inner workings.

However, some Googlers are working on a tool that does infer types:

It's early days though (relatively speaking), and I don't think it handles
this case yet.

>     def my_iterating(func: Callable[Any, Iterator]) -> Callable[Any,
> MyIterator]

Alas, PEP 484 is not powerful enough to describe the relationship between
the input and output functions. You'd want to do something that uses a type
variable to capture all arguments together, so you could write something

T = TypeVar('T')
S = TypeVar('S')
def my_iterating(func: Callable[T, Iterator[S]]) -> Callable[T,

>         @wraps(func)
>         def wrapper(*args, **kw):
>             return MyIterator(func(*args, **kw))
>         return wrapper
>     @my_iterating
>     def foo() -> Iterator[int]:
>         yield
>     x = foo()

The only reasonable way to do something like this without adding more
sophistication to PEP 484 would be to give up on the decorator and just
hardcode it using a pair of functions:

def foo() -> MyIterator[int]:
    return MyIterator(_foo())

# Implementation
def _foo() -> Iterator[int]:
    yield 0
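
To make the pair-of-functions pattern above runnable, here is a sketch with one possible MyIterator. The generic wrapper class is an assumption made for illustration; the message does not define it.

```python
from typing import Generic, Iterator, TypeVar

T = TypeVar("T")


class MyIterator(Generic[T]):
    """Assumed wrapper: forwards __next__ and adds an extra method."""

    def __init__(self, it: Iterator[T]) -> None:
        self._it = it

    def __iter__(self) -> "MyIterator[T]":
        return self

    def __next__(self) -> T:
        return next(self._it)

    def bar(self) -> str:
        return "bar"


# Implementation: an ordinary generator function.
def _foo() -> Iterator[int]:
    yield 0


# Public function: declared (and actually) returning the wrapper class.
def foo() -> MyIterator[int]:
    return MyIterator(_foo())


x = foo()
has_bar = x.bar()
items = list(x)
```

Here the declared return type matches what is really returned, so x.bar() is valid both for the checker and at runtime.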

--Guido van Rossum (

From greg.ewing at  Wed Jan 20 23:59:52 2016
From: greg.ewing at (Greg Ewing)
Date: Thu, 21 Jan 2016 17:59:52 +1300
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

My idea for handling this kind of thing is:

   for new x in things:
     funcs.append(lambda: dosomethingwith(x))

The 'new' modifier can be applied to any assignment target,
and conceptually has the effect of creating a new binding
instead of changing an existing binding.

There is a very simple way to implement this in CPython:
create a new cell each time instead of replacing the
contents of an existing cell.
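
The proposed 'new' syntax is not valid Python today, so the following sketch only emulates the effect: a helper function call is used to obtain a fresh cell per iteration. The function names are invented for illustration.

```python
def shared_cell(things):
    funcs = []
    for x in things:
        # Every lambda closes over the same cell for 'x'.
        funcs.append(lambda: x)
    return funcs


def fresh_cell(things):
    def bind(x):
        # Each call to bind() creates a new cell for its own 'x'.
        return lambda: x
    return [bind(x) for x in things]


shared = shared_cell([1, 2, 3])
fresh = fresh_cell([1, 2, 3])

shared_results = [f() for f in shared]
fresh_results = [f() for f in fresh]
print(shared_results)  # [3, 3, 3]
print(fresh_results)   # [1, 2, 3]

# The cell objects themselves can be inspected through __closure__.
same_cell = shared[0].__closure__[0] is shared[1].__closure__[0]
distinct_cells = fresh[0].__closure__[0] is not fresh[1].__closure__[0]
```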


From ncoghlan at  Thu Jan 21 02:54:15 2016
From: ncoghlan at (Nick Coghlan)
Date: Thu, 21 Jan 2016 17:54:15 +1000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <>
Message-ID: <>

On 21 January 2016 at 06:42, Andrew Barnert via Python-ideas
<python-ideas at> wrote:
> On Wednesday, January 20, 2016 11:05 AM, Yury Selivanov < at> wrote:
>> FWIW I strongly believe that this feature (at least the
>> "len=len"-like optimizations) should be implemented as an
>> optimization in the interpreter.
> The problem is that there are two reasonable interpretations for free variables--variable capture or value capture--and Python can only do one or the other automatically.

Can we please use the longstanding early binding and late binding
terminology for these two variants, rather than introducing new
phrases that just confuse the matter...


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From victor.stinner at  Thu Jan 21 03:48:21 2016
From: victor.stinner at (Victor Stinner)
Date: Thu, 21 Jan 2016 09:48:21 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>


Sorry, but I'm lost in this long thread. Do you want to extend the
Python language to declare constants in a function? Maybe I'm completely
off-topic, sorry.

2016-01-21 1:10 GMT+01:00 Steven D'Aprano <steve at>:
> (2) If we limit this to only capturing the same name, then we can only
> write (say) "static x", and that does look like a declaration. But maybe
> we want to allow the local name to differ from the global name:
>     static x = y

3 months ago, Serhiy Storchaka proposed a "const var = expr" syntax:

With a shortcut "const len" which is like "const len = len".

In the meantime, I implemented an optimization in my FAT Python
project: "Copy builtins to constant". It's quite simple: replace the
"LOAD_GLOBAL builtin" instruction with a "LOAD_CONST builtin"
instruction and "patch" the co_consts constants of a code object at
runtime. For example:

   def hello(): print("hello world")

is replaced with:

   def hello(): "LOAD_GLOBAL print"("hello world")
   hello.__code__ = fat.replace_consts(hello.__code__,
                                       {'LOAD_GLOBAL print': print})

Where fat.replace_consts() is a helper that creates a new code object,
replacing constants according to the specified mapping:

Replacing print(...) with "LOAD_GLOBAL"(...) is done in the
fatoptimizer (an AST optimizer):

We have to inject the builtin function at runtime. It cannot be done
when the code object is created by "def ..." because a code object can
only contain objects serializable by marshal (to be able to compile a
.py file to a .pyc file).
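
Both points can be checked with the standard library alone. The sketch below does not use FAT Python; it only shows that an unoptimized call to a builtin goes through LOAD_GLOBAL, and that marshal rejects a builtin function object.

```python
import dis
import marshal


def hello():
    print("hello world")


# The call to the builtin is compiled as a name lookup, not as a constant.
opnames = [ins.opname for ins in dis.get_instructions(hello)]
uses_load_global = "LOAD_GLOBAL" in opnames
print_is_constant = any(const is print for const in hello.__code__.co_consts)

# marshal cannot serialize a builtin function, which is why the builtin
# cannot simply be stored in co_consts when the .pyc file is written.
try:
    marshal.dumps(print)
    print_is_marshalable = True
except ValueError:
    print_is_marshalable = False
```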

> I acknowledge that this goes beyond what the OP asked for, and I think
> that YAGNI is a reasonable response to the static block idea. I'm not
> going to champion it any further unless there's a bunch of interest from
> others. (I'm saving my energy for Eiffel-like require/ensure blocks
> *wink*).

The difference between "def hello(print=print): ..." and Serhiy's
const idea (or my optimization) is that "def hello(print=print): ..."
changes the signature of the function which can be a serious issue in
an API.

Note: The other optimization "local_print = print" in the function is
only useful for loops (when the builtin is loaded multiple times) and
it still loads the builtin once per function call, whereas my
optimization uses a constant and so no lookup is required anymore.

Then guards are used to disable the optimization if builtins are
modified. See the PEP 510 for an explanation on that part.


From mal at  Thu Jan 21 04:39:43 2016
From: mal at (M.-A. Lemburg)
Date: Thu, 21 Jan 2016 10:39:43 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On 21.01.2016 09:48, Victor Stinner wrote:
> The difference between "def hello(print=print): ..." and Serhiy's
> const idea (or my optimization) is that "def hello(print=print): ..."
> changes the signature of the function which can be a serious issue in
> an API.
> Note: The other optimization "local_print = print" in the function is
> only useful for loops (when the builtin is loaded multiple times) and
> it still loads the builtin once per function call, whereas my
> optimization uses a constant and so no lookup is required anymore.
> Then guards are used to disable the optimization if builtins are
> modified. See the PEP 510 for an explanation on that part.

I ran performance tests on these optimization tricks (and
others) in 2014. See this talk:

(slides 33ff.)

The keyword trick doesn't really pay off in terms of added
performance vs. danger of introducing weird bugs.

Still, it would be great to have a way to say "please look
this symbol up at compile time and stick the result in a local
variable" (which is basically what the keyword trick does),
only in a form that's easier to detect when reading the code
and doesn't change the function signature.

A decorator could help with this (by transforming the byte
code and localizing the symbols), e.g.

@localize(len)
def f(seq):
    z = 0
    for x in seq:
       if x:
           z += len(x)
    return z

but the more we move language features to decorators, the
less readable the code will get by having long tails of
decorators on many functions (we don't really want our
functions to resemble snakes, do we ? :-)).
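
One way such a decorator could be approximated without touching bytecode is sketched below. This is a hypothetical implementation, not the one meant in this message: it rebuilds the function over a snapshot of its globals with the given names pinned, so it gives early binding rather than true local variables, and the function no longer sees later changes to the module globals. It also takes keyword arguments (len=len) instead of the positional form.

```python
import functools
import types


def localize(**bindings):
    """Hypothetical decorator: pin the given global/builtin names for one function."""
    def decorate(func):
        # Snapshot the function's globals and pin the requested names in the copy.
        frozen_globals = dict(func.__globals__)
        frozen_globals.update(bindings)
        rebuilt = types.FunctionType(
            func.__code__,
            frozen_globals,
            func.__name__,
            func.__defaults__,
            func.__closure__,
        )
        rebuilt.__kwdefaults__ = func.__kwdefaults__
        # Copy __doc__, __module__ and friends; the signature is left untouched.
        return functools.update_wrapper(rebuilt, func)
    return decorate


@localize(len=len)
def f(seq):
    z = 0
    for x in seq:
        if x:
            z += len(x)
    return z


result = f(["ab", "", "cde"])
```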

So perhaps it is indeed time for a new keyword to localize
symbols in a function or module, say:

# module scope localization, applies to all code objects in
# this module:
localize len

def f(seq):


def f(seq):
    # Localize len in this function, since we need it in
    # tight loops
    localize len

All that said, I don't really believe that this is a high
priority feature request. The gained performance win is
not all that great and only becomes relevant when used
in tight loops.

Marc-Andre Lemburg

Professional Python Services directly from the Experts (#1, Jan 21 2016)
>>> Python Projects, Coaching and Consulting ...
>>> Python Database Interfaces ... 
>>> Plone/Zope Database Interfaces ... 

::: We implement business ideas - efficiently in both time and costs ::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611

From wes.turner at  Thu Jan 21 05:14:58 2016
From: wes.turner at (Wes Turner)
Date: Thu, 21 Jan 2016 04:14:58 -0600
Subject: [Python-ideas] Fwd: Why do equality tests between OrderedDict
 keys/values views behave not as expected?
In-Reply-To: <>
References: <>
Message-ID: <>

* DOC: OrderedDict.values() comparisons in Python 3
  * Src:

What should it say?


.. `<


* "is surprising"
* Python 3 has `dict views`_

* :class:`OrderedDict` matches the dict interface in Python 2.7 and
  Python 3.

* Python 2 dict interface:
  * dict.viewkeys(), dict.viewvalues(), dict.viewitems()
  * dict.keys(), dict.values(), dict.items()
* Python 3 dict interface (:ref:`dictionary view objects`):
  * dict.keys(), dict.values(), dict.items()
  * list(dict.keys()), list(dict.values()), list(dict.items())

* In order to compare OrderedDict.values() **by value**
  you must either:

  * Cast values() to a sequence (e.g. a list) before comparison
  * Subclass :class:`OrderedDict` and wrap `values()`

.. code:: python

    from collections import OrderedDict
    a = 'a'
    x = 1
    y = x
    ab  =  [( a,  x), ('b', 2)]
    ba  =  [('b', 2),  (a,  y)]
    ab_odict      = OrderedDict(ab)
    ab_odict_     = OrderedDict(ab)
    ab_odict_copy = OrderedDict(list(ab))
    ba_odict      = OrderedDict(ba)
    ab_dict = dict(ab)
    ba_dict = dict(ba)

    # In Python 3,
    # OrderedDict.values.__eq__ does not compare by value:
    assert (     ab_odict.values()  ==      ab_odict_.values()) is False
    assert (list(ab_odict.values()) == list(ab_odict_.values())) is True

    # In Python 2.7 and 3,
    # OrderedDict.__eq__ compares ordered sequences
    assert (ab_odict == ab_odict_)     is True
    assert (ab_odict == ab_odict_copy) is True
    assert (ab_odict == ba_odict) is False
    assert (ab_dict  == ba_dict)  is True

    # - [ ] How to explain the x, y part?
    #   - in terms of references, __eq__, id(obj), __hash__
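
The snippet above covers values(). A companion sketch for keys() and items() on Python 3, using the xy/yx example quoted further down in this message, might look like this:

```python
from collections import OrderedDict

xy = OrderedDict([("x", None), ("y", None)])
yx = OrderedDict([("y", None), ("x", None)])

# OrderedDict.__eq__ is order-sensitive.
dicts_equal = xy == yx

# keys() and items() views are set-like, so order is ignored.
keys_equal = xy.keys() == yx.keys()
items_equal = xy.items() == yx.items()

# values() views define no __eq__, so two view objects compare by identity.
values_equal = xy.values() == xy.values()

# Converting to lists gives ordered, by-value comparison.
keys_equal_in_order = list(xy.keys()) == list(yx.keys())
values_equal_by_value = list(xy.values()) == list(yx.values())
```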


On Wed, Jan 20, 2016 at 12:39 PM, Sven R. Kunze <srkunze at> wrote:

> Documentation is a very good idea.
> Maybe, even raise an error when comparing values.
> Best,
> Sven
> On 20.01.2016 12:13, Alexandre Figura wrote:
> If we put technical considerations aside, maybe we should just ask to
> ourselves what behavior do we expect when doing equality tests between
> ordered dictionaries. As a reminder:
> >>> xy = OrderedDict([('x', None), ('y', None)])
> >>> yx = OrderedDict([('y', None), ('x', None)])
> >>> xy == yx
> False
> >>> xy.items() == yx.items()
> True
> >>> xy.keys() == yx.keys()
> True
> >>> xy.values() == yx.values()
> False
> So, it appears that:
> 1. equality tests between odict_values use objects identity and not
> equality,
> 2. equality tests between odict_keys do not respect order.
> If it is not technically possible to change the current implementation,
> maybe all we can do is just add a warning about current behavior in the
> documentation?
> On Mon, Jan 11, 2016 at 4:17 AM, Guido van Rossum < <guido at>
> guido at> wrote:
>> Seems like we dropped the ball... Is there any action item here?
>> --
>> --Guido van Rossum ( <>)

From victor.stinner at  Thu Jan 21 08:19:59 2016
From: victor.stinner at (Victor Stinner)
Date: Thu, 21 Jan 2016 14:19:59 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

2016-01-21 10:39 GMT+01:00 M.-A. Lemburg <mal at>:
> I ran performance tests on these optimization tricks (and
> others) in 2014. See this talk:
> (slides 33ff.)

Ah nice, thanks for the slides.

> The keyword trick doesn't really pay off in terms of added
> performance vs. danger of introducing weird bugs.

I ran a quick microbenchmark to measure the cost of LOAD_GLOBAL to
load a global: call func("abc") with

   mylen = len
   def func(obj): return mylen(obj)

Result:

117 ns: original bytecode (LOAD_GLOBAL)
109 ns: LOAD_CONST
116 ns: LOAD_CONST with guard

LOAD_CONST avoids 1 dict lookup (globals) and reduces the runtime by 8
ns: 7% faster. But the guard has a cost of 7 ns: we only win 1
nanosecond. Not really interesting here.

LOAD_CONST means that the LOAD_GLOBAL instruction has been replaced
with a LOAD_CONST instruction. The guard checks if the frame globals
and globals()['mylen'] didn't change.

I ran a second microbenchmark to measure the cost of LOAD_GLOBAL to
load a builtin: call func("abc") with

   def func(obj): return len(obj)

Result:

124 ns: original bytecode (LOAD_GLOBAL)
107 ns: LOAD_CONST
116 ns: LOAD_CONST with guard on builtins + globals

LOAD_CONST avoids 2 dict lookups (globals, builtins) and reduces the
runtime by 17 ns: 14% faster. But the guard has a cost of 9 ns: we win
8 nanoseconds, 6% faster.

Here the guard is more complex: it checks that the frame builtins, the
frame globals, builtins.__dict__['len'] and globals()['len'] didn't
change.
If you avoid guards, it's always faster, but it changes the Python semantics.

The speedup on such a very small example is low. It's more interesting
when the global or builtin variable is used in a loop: the speedup is
multiplied by the number of loop iterations.
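
For anyone who wants to try the loop case on their own machine, a rough timeit sketch follows. It compares a plain builtin lookup with a local alias inside the loop, which is not the LOAD_CONST patching measured above; absolute numbers vary by machine and Python version, so none are quoted here.

```python
import timeit


def with_global_lookup(seq):
    total = 0
    for x in seq:
        total += len(x)          # builtin looked up on every iteration
    return total


def with_local_alias(seq):
    local_len = len              # looked up once per call
    total = 0
    for x in seq:
        total += local_len(x)    # fast local load on every iteration
    return total


data = ["abc"] * 1000
t_global = timeit.timeit(lambda: with_global_lookup(data), number=50)
t_local = timeit.timeit(lambda: with_local_alias(data), number=50)
print("global lookup: %.6f s, local alias: %.6f s" % (t_global, t_local))
```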

> A decorator could help with this (by transforming the byte
> code and localizing the symbols), e.g.
> @localize(len)
> def f(seq):
>     z = 0
>     for x in seq:
>        if x:
>            z += len(x)
>     return z

FYI has such a decorator:
@asconstants(len=len).

> All that said, I don't really believe that this is a high
> priority feature request. The gained performance win is
> not all that great and only becomes relevant when used
> in tight loops.

Yeah, in the Python stdlib, the hack is only used for loops.


From mal at  Thu Jan 21 08:39:29 2016
From: mal at (M.-A. Lemburg)
Date: Thu, 21 Jan 2016 14:39:29 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On 21.01.2016 14:19, Victor Stinner wrote:
> 2016-01-21 10:39 GMT+01:00 M.-A. Lemburg <mal at>:
>> I ran performance tests on these optimization tricks (and
>> others) in 2014. See this talk:
>> (slides 33ff.)
> Ah nice, thanks for the slides.

Forgot to mention the benchmarks I used:

>> The keyword trick doesn't really pay off in terms of added
>> performance vs. danger of introducing weird bugs.
> I ran a quick microbenchmark to measure the cost of LOAD_GLOBAL to
> load a global: call func("abc") with
>    mylen = len
>    def func(obj): return mylen(obj)
> Result:
> 117 ns: original bytecode (LOAD_GLOBAL)
> 109 ns: LOAD_CONST
> 116 ns: LOAD_CONST with guard
> LOAD_CONST avoids 1 dict lookup (globals) and reduces the runtime by 8
> ns: 7% faster. But the guard has a cost of 7 ns: we only win 1
> nanosecond. Not really interesting here.
> LOAD_CONST means that the LOAD_GLOBAL instruction has been replaced
> with a LOAD_CONST instruction. The guard checks if the frame globals
> and globals()['mylen'] didn't change.
> I ran a second microbenchmark on func("abc") to measure the cost
> LOAD_GLOBAL to load a builtin: call func("abc") with
>    def func(obj): return len(obj)
> Result:
> 124 ns: original bytecode (LOAD_GLOBAL)
> 107 ns: LOAD_CONST
> 116 ns: LOAD_CONST with guard on builtins + globals
> LOAD_CONST avoids 2 dict lookup (globals, builtins) and reduces the
> runtime by 17 ns: 14% faster. But the guard has a cost of 9 ns: we win
> 8 nanosecond, 6% faster.
> Here is the guard is more complex: checks if the frame builtins, the
> frame globals, builtins.__dict__['len'] and globals()['len'] didn't
> change.
> If you avoid guards, it's always faster, but it changes the Python semantics.
> The speedup on such very small example is low. It's more interesting
> when the global or builtin variable is used in a loop: the speedup is
> multipled by the number of loop iterations.

Sure, but for those, you'd probably simply use the in-function

def f(seq):
    z = 0
    local_len = len
    for x in seq:
        if x:
            z += local_len(x)
    return z

This results in a LOAD_FAST inside the loop and is probably
the better way to speed things up.

>> A decorator could help with this (by transforming the byte
>> code and localizing the symbols), e.g.
>> @localize(len)
>> def f(seq):
>>     z = 0
>>     for x in seq:
>>        if x:
>>            z += len(x)
>>     return z
> FYI has such decorator:
> @asconstants(len=len).

Interesting :-)

>> All that said, I don't really believe that this is a high
>> priority feature request. The gained performance win is
>> not all that great and only becomes relevant when used
>> in tight loops.
> Yeah, in the Python stdlib, the hack is only used for loops.

Right. The only advantage I'd see in having a keyword
to "configure" the behavior is that you could easily
apply the change to a whole module/function without having
to add explicit localizations everywhere.

Marc-Andre Lemburg


From agustin.herranz at  Thu Jan 21 13:14:18 2016
From: agustin.herranz at (Agustín Herranz Cecilia)
Date: Thu, 21 Jan 2016 19:14:18 +0100
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On 2016/01/21 at 1:11, Guido van Rossum wrote:
> On Wed, Jan 20, 2016 at 9:42 AM, Andrew Barnert via Python-ideas 
> <python-ideas at <mailto:python-ideas at>> wrote:
>     On Jan 20, 2016, at 06:27, Agustín Herranz Cecilia
>     <agustin.herranz at <mailto:agustin.herranz at>> wrote:
>     >
>     > - GVR proposal includes some kind of syntactic sugar for
>     function type comments (" # type: (t_arg1, t_arg2) -> t_ret "). I
>     think it's good but this must be an alternative over typing module
>     syntax (PEP484), not the preferred way (for people get used to
>     typehints). Is this syntactic sugar compatible with generators?
>     The type analyzers could be differentiate between a Callable and a
>     Generator?
>     I'm pretty sure Generator is not the type of a generator function,
>     but of a generator object. So to type a generator function, you
>     just write `(int, int) -> Generator[int]`. Or, the long way,
>     `Function[[int, int], Generator[int]]`.
> There is no 'Function' -- it existed in mypy before PEP 484 but was 
> replaced by 'Callable'. And you don't annotate a function def with '-> 
> Callable' (unless it returns another function). The Callable type is 
> only needed in the signature of higher-order functions, i.e. functions 
> that take functions for arguments or return a function. For example, a 
> simple map function would be written like this:
> def map(f: Callable[[T], S], a: List[T]) -> List[S]:
>     ...
> As to generators, we just improved how mypy treats generators 
> ( 
> The Generator type has *three* parameters: the "yield" type (what's 
> yielded), the "send" type (what you send() into the generator, and 
> what's returned by yield), and the "return" type (what a return 
> statement in the generator returns, i.e. the value for the 
> StopIteration exception). You can also use Iterator if your generator 
> doesn't expect its send() or throw() messages to be called and it 
> isn't returning a value for the benefit of `yield from'.
> For example, here's a simple generator that iterates over a list of 
> strings, skipping alternating values:
> def skipper(a: List[str]) -> Iterator[str]:
>     for i, s in enumerate(a):
>         if i%2 == 0:
>             yield s
> and here's a coroutine returning a string (I know, it's pathetic, but 
> it's an example :-):
> @asyncio.coroutine
> def readchar() -> Generator[Any, None, str]:
>     # Implementation not shown
> @asyncio.coroutine
> def readline() -> Generator[Any, None, str]:
>     buf = ''
>     while True:
>         c = yield from readchar()
>         if not c: break
>         buf += c
>         if c == '\n': break
>     return buf
> Here, in Generator[Any, None, str], the first parameter ('Any') refers 
> to the type yielded -- it actually yields Futures, but we don't care 
> about that (it's an asyncio implementation detail). The second 
> parameter ('None') is the type returned by yield -- again, it's an 
> implementation detail and we might just as well say 'Any' here. The 
> third parameter (here 'str') is the type actually returned by the 
> 'return' statement.
> It's illustrative to observe that the signature of readchar() is 
> exactly the same (since it also returns a string). OTOH the return 
> type of e.g. asyncio.sleep() is Generator[Any, None, None], because it 
> doesn't return a value.
> This business is clearly still suboptimal -- we would like to 
> introduce a new type, perhaps named Coroutine, so that you can write 
> Coroutine[T] instead of Generator[Any, None, T]. But that would just 
> be a shorthand. The actual type of a generator object is always some 
> parametrization of Generator.
> In any case, whatever we write after the -> (i.e., the return type) is 
> still the type of the value you get when you call the function. If the 
> function is a generator function, the value you get is a generator 
> object, and that's what the return type designates.
>     (Of course you can use Callable instead of the more specific
>     Function, or Iterator (or even Iterable) instead of the more
>     specific Generator, if you want to be free to change the
>     implementation to use an iterator class or something later, but
>     normally you'd want the most specific type, I think.)
> I don't know where you read about Callable vs. Function.
> Regarding using Iterator[T] instead of Generator[..., ..., T], you are 
> correct.
> Note that you *cannot* define a generator function as returning a 
> *subclass* of Iterator/Generator; there is no way to have a generator 
> function instantiate some other class as its return value. Consider 
> (ignoring generic types):
> class MyIterator:
>     def __next__(self): ...
>     def __iter__(self): ...
>     def bar(self): ...
> def foo() -> MyIterator:
>     yield
> x = foo()
> # Boom!
> The type checker would assume that x has a method bar() based on the 
> declared return type for foo(), but it doesn't. (There are a few other 
> special cases, in addition to Generator and Iterator; declaring the 
> return type to be Any or object is allowed.)
This was a mistake on my side; I got confused. The generator is just the
return type of the callable, but the returned generator is also a
>     > - As this is intended to gradual type python2 code to port it to
>     python 3 I think it's convenient to add some sort of import that
>     only be used for type checking, and be only imported by the type
>     analyzer, not the runtime. This could be achieve by prepending
>     "#type: " to the normal import statement, something like:
>     >    # type: import module
>     >    # type: from package import module
>     That sounds like a bad idea. If the typing module shadows some
>     global, you won't get any errors, but your code will be misleading
>     to a reader (and even worse if you from package.module import t).
>     If the cost of the import is too high for Python 2, surely it's
>     also too high for Python 3. And what other reason do you have for
>     skipping it?
> Exactly. Even though (when using Python 2) all type annotations are in 
> comments, you still must write real imports. (This causes minor 
> annoyances with linters that warn about unused imports, but there are 
> ways to teach them.)
These type-comment 'imports' are not intended to shadow the current
namespace; they are intended to tell the analyzer where it can find
types that appear in type comments but are not in the current namespace,
without importing them into it. This surely complicates the analyzer's
task, but it helps avoid namespace pollution and also saves memory at
runtime.

The typical case I've found is when you use a third-party library (one
that doesn't have type information) and create objects with a factory.
The class of the objects is not needed anywhere, so it's not imported
into the current namespace; it's needed only for type analysis and
autocomplete.
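
As a point of comparison, later releases of the typing module (3.5.2 and up) provide a TYPE_CHECKING constant that covers a similar need with a real import. The sketch below uses the standard decimal module as a stand-in for a third-party library with a factory; parse_price is an invented name.

```python
import decimal  # stands in for a third-party library that exposes a factory
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by the type checker only; this branch never runs at runtime,
    # so the name 'Decimal' is not added to the module namespace.
    from decimal import Decimal


def parse_price(text):
    # type: (str) -> Decimal
    return decimal.Decimal(text)


price = parse_price("9.99")
```

The checker resolves 'Decimal' in the type comment, while at runtime TYPE_CHECKING is False and the import is skipped.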

>     > - Also there must be addressed how it work on a python2 to
>     python3 environment as there are types with the same name, str for
>     example, that works differently on each python version. If the
>     code is for only one version uses the type names of that version.
>     That's the same problem that exists at runtime, and people (and
>     tools) already know how to deal with it: use bytes when you mean
>     bytes, unicode when you mean unicode, and str when you mean
>     whatever is "native" to the version you're running under and are
>     willing to deal with it. So now you just have to do the same thing
>     in type hints that you're already doing in constructors,
>     isinstance checks, etc.
> This is actually still a real problem. But it has no bearing on the 
> choice of syntax for annotations in Python 2 or straddling code.

Yes, this is not directly related to the choice of syntax for
annotations. It is intended to help in the process of porting Python 2
code to Python 3, and it's outside the scope of the PEP but related to
the original problem. What I have in mind is some type aliases, so that
you could annotate a version-specific type to avoid ambiguity in code
that is used on different versions. In the end, what I originally tried
to say is that it would be good to have a conventional way to name these
type aliases.

These are intended for use during the porting process, to help automated
tools in a period of transition between versions. They are a way to tell
the analyzer that a type has a behavior that is perhaps different from
that of the same type on the running Python version.

For example: you start with some working Python 2 code that you want to
keep working. A code analysis tool can infer the types and annotate the
code. It can also check which parts are py2/py3 compatible and which are
not, and mark those types with the type aliases mentioned above. With
this, and test suites, one could calculate how much code needs to be
ported. Then you refactor to adapt the code to Python 3 while keeping
the code that still runs on Python 2 (it could be marked for automated
deletion), and when that's done, drop all the Python 2 code.
>     Of course many people use libraries like six to help them deal
>     with this, which means that those libraries have to be type-hinted
>     appropriately for both versions (maybe using different stubs for
>     py2 and py3, with the right one selected at pip install time?),
>     but if that's taken care of, user code should just work.
> Yeah, we could use help. There are some very rudimentary stubs for a 
> few things defined by six 
> (, 
> but we need more. There's a PR but it's of bewildering size 
> (
I think the process of porting is different from the process of
adapting code to work on both Python 2 and 3. Code with bytes, unicode
and str (don't mind) is neither Python 2 code nor Python 3 code. Lots of
libraries that are 2/3 compatible are just Python 2 code minimally
adapted to run on Python 3 with six, and are still developed in a
Python 2 style. When the time to drop Python 2 arrives, the refactoring
needed will be huge. There is also a recent article claiming "Stop
writing code that breaks on Python 4", which shows code that treats
Python 3 as the special case.

> PS. I have a hard time following the rest of Agustin's comments. The 
> comment-based syntax I proposed for Python 2.7 does support exactly 
> the same functionality as the official PEP 484 syntax; the only thing 
> it doesn't allow is selectively leaving out types for some arguments 
> -- you must use 'Any' to fill those positions. It's not a problem in 
> practice, and it doesn't reduce functionality (omitted argument types 
> are assumed to be Any in PEP 484 too). I should also remark that mypy 
> supports the comment-based syntax in Python 2 mode as well as in 
> Python 3 mode; but when writing Python 3 only code, the non-comment 
> version is strongly preferred. (We plan to eventually produce a tool 
> that converts the comments to standard PEP 484 syntax).
> -- 
> --Guido van Rossum ( <>)

My original point is that if comment-based function annotations are 
going to be added, they should be added to python 3 too, not only for 
the special case of "Python 2.7 and straddling code", even though, on 
python 3, type annotations are preferred.

I think that having the alternative of defining the types of a function 
in a type comment is a good thing, because annotations can become a 
mess, especially with complex types and default parameters, and I don't 
feel that readability should be part of what is optional in gradual typing.
Some examples from my own code:

class Field:
     def __init__(self, name: str,
                  extract: Callable[[str], str],
                  validate: Callable[[str], bool]=bool_test,
                  transform: Callable[[str], Any]=identity) -> 'Field':

class RepeatableField:
     def __init__(self,
                  extract: Callable[[str], str],
                  size: int,
                  fields: List[Field],
                  index_label: str,
                  index_transform: Callable[[int], str]=lambda x: str(x)
                  ) -> 'RepeatableField':

def filter_by(field_gen: Iterable[Dict[str, Any]],
              **kwargs) -> Generator[Dict[str, Any], Any, Any]:

So, to define a comment-based function annotation, two kinds of syntax 
should be accepted:
- an 'explicit' one, marking the type of the function according to the 
PEP484 syntax:

     def embezzle(self, account, funds=1000000, *fake_receipts):
         # type: Callable[[str, int, *str], None]
         """Embezzle funds from account using fake receipts."""
         <code goes here>

   as if it were a normal type comment:

     embezzle = get_embezzle_function()  # type: Callable[[str, int, *str], None]

- and another one that 'implicitly' defines the type of the function as 
Callable:

     def embezzle(self, account, funds=1000000, *fake_receipts):
         # type: (str, int, *str) -> None
         """Embezzle funds from account using fake receipts."""
         <code goes here>

Both ways are easily translated back and forth into python3 annotations.

Also, comment-based function annotations easily exceed one line's 
length, so it should be defined which syntax is used to break the 
line. As it said on

Those things should be in a PEP as a standard way to implement this, not 
only for mypy but also for other tools.
Accepting comment-based function annotations in python3 is good for 
migrating python 2/3 code, as it helps with refactoring and usage (better 
autocomplete), but making it a python2 feature and not a python3 one 
increases the gap between versions.

I hope I expressed myself better this time; if not, sorry about that.

Agustín Herranz


From guido at  Thu Jan 21 13:44:07 2016
From: guido at (Guido van Rossum)
Date: Thu, 21 Jan 2016 10:44:07 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 21, 2016 at 10:14 AM, Agustín Herranz Cecilia <
agustin.herranz at> wrote:

> On 2016/01/21 at 1:11, Guido van Rossum wrote:
> [...]
>> > - As this is intended to gradually type python2 code in order to port it
>> to python3, I think it's convenient to add some sort of import that is only
>> used for type checking, and is only imported by the type analyzer, not at
>> runtime. This could be achieved by prepending "# type: " to the normal
>> import statement, something like:
>> >    # type: import module
>> >    # type: from package import module
>> That sounds like a bad idea. If the typing module shadows some global,
>> you won't get any errors, but your code will be misleading to a reader (and
>> even worse if you from package.module import t). If the cost of the import
>> is too high for Python 2, surely it's also too high for Python 3. And what
>> other reason do you have for skipping it?
> Exactly. Even though (when using Python 2) all type annotations are in
> comments, you still must write real imports. (This causes minor annoyances
> with linters that warn about unused imports, but there are ways to teach
> them.)
> These type comment 'imports' are not intended to shadow the current
> namespace; they are intended to tell the analyzer where it can find the
> types that appear in type comments but are not in the current namespace,
> without importing them into it. This surely complicates the analyzer's task,
> but it helps avoid namespace pollution and also saves memory at runtime.
> The typical case I've found is when using a third-party library (that
> doesn't have type information) and you create objects with a factory. The
> class of the objects is not needed anywhere, so it's not imported into the
> current namespace; it's needed only for type analysis and autocomplete.

You're describing a case I have also encountered: we have a module with a
function foo

def foo(a):

and the intention is that a is an instance of a class A defined in another
module, which is not imported.

If we add annotations we have to add an import

from a_mod import A

def foo(a: A) -> str:

But the code that calls foo() is already importing A from a_mod somewhere,
so there's not really any time wasted -- the import is just done at a
different time.

At least, that's the theory.

In practice, indeed there are some unpleasant cases. For example, adding
the explicit import might create an import cycle, and A may not yet be
defined when foo_mod is loaded. We can't use the usual subterfuge, since we
really need the definition of A:

import a_mod

def foo(a: a_mod.A) -> str:

This will still fail if a_mod hasn't defined A yet because we reference
a_mod.A at load time (annotations are evaluated when the function
definition is executed).

So we end up with this:

import a_mod

def foo(a: 'a_mod.A') -> str:

This is both hard to read and probably wastes a lot of developer time
figuring out they have to do this.
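
The quoted form can be seen at work in a small sketch. This is only an
illustration: the class A and the function foo stand in for a_mod.A and foo
above, and everything lives in one module so the snippet runs on its own
(Python 3). The string annotation is stored as-is when the def executes and is
only resolved when a tool asks for it, here via typing.get_type_hints.

```python
import typing


def foo(a: 'A') -> str:
    # 'A' is only a string here, so A does not need to exist yet.
    return a.name


class A:
    # Defined after foo on purpose, standing in for a class that a
    # cyclic import has not made available yet.
    def __init__(self, name: str) -> None:
        self.name = name


# The raw annotation is still the string the author wrote.
raw_annotation = foo.__annotations__['a']

# Resolving the string happens later, once A exists.
resolved = typing.get_type_hints(foo)['a']

result = foo(A('spam'))
```

Dropping the quotes around 'A' would make the annotation a real name lookup,
which is the failure mode described above for Python versions that evaluate
annotations when the def runs.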

And there are other issues, e.g. some folks have tricks to work around
their start-up time by importing modules late (e.g. do the import inside
the function that needs that module).

In mypy there's another hack possible: it doesn't care if an import is
inside "if False". So you can write:

if False:
    from a_mod import A

def foo(a: 'A') -> str:

You still have to quote 'A' because A isn't actually defined at run time,
but it's the best we can do. When using type comments you can skip the
quotes:

if False:
    from a_mod import A

def foo(a):
    # type: (A) -> str

All of this is unpleasant but not unbearable -- the big constraint here is
that we don't want to add extra syntax (beyond PEP 3107, i.e. function
annotations), so that we can use mypy for Python 3.2 and up. And with the
type comments we even support 2.7.
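
For readers on newer versions: later releases of the typing module (3.5.2 and
up) provide a TYPE_CHECKING constant that plays the role of the "if False"
hack. A minimal sketch follows; decimal.Decimal is used as a stand-in for A,
and the function name describe is made up for illustration.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by the type checker, never executed at run time.
    from decimal import Decimal


def describe(value: 'Decimal') -> str:
    # The annotation stays quoted because Decimal is not bound at run time.
    return 'value=' + str(value)


annotation = describe.__annotations__['value']
```

As with "if False", the import costs nothing at run time, and the annotation
still has to be a string.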

>> > - It must also be addressed how this works in a python2 to python3
>> environment, as there are types with the same name, str for example, that
>> work differently on each python version. If the code is for only one
>> version, it uses the type names of that version.
>> That's the same problem that exists at runtime, and people (and tools)
>> already know how to deal with it: use bytes when you mean bytes, unicode
>> when you mean unicode, and str when you mean whatever is "native" to the
>> version you're running under and are willing to deal with it. So now you
>> just have to do the same thing in type hints that you're already doing in
>> constructors, isinstance checks, etc.
> This is actually still a real problem. But it has no bearing on the choice
> of syntax for annotations in Python 2 or straddling code.
> Yes, this is not directly related to the choice of syntax for annotations.
> It is intended to help in the process of porting python2 code
> to python3, and it's outside the PEP's scope but related to the original
> problem. What I have in mind is some type aliases, so you could annotate a
> version-specific type to avoid ambiguity in code that is used on
> different versions. In the end, what I originally tried to say is that it's
> good to have a conventional way to name these type aliases.

Yes, this is a useful thing to discuss.

Maybe we can standardize on the types defined by the 'six' package, which
is commonly used for 2-3 straddling code:

six.text_type (unicode in PY2, str in PY3)
six.binary_type (str in PY2, bytes in PY3)

Actually for the latter we might as well use bytes.
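
A minimal sketch of how such aliases behave, written without six so that it is
self-contained. The names text_type and binary_type mirror the six attributes
listed above but are defined by hand here (this is not six's source), to_text
is a hypothetical helper, and the type comment is illustrative only.

```python
import sys

if sys.version_info[0] >= 3:
    text_type = str       # what six.text_type is on PY3
    binary_type = bytes   # what six.binary_type is on PY3
else:
    text_type = unicode   # noqa: F821 (only evaluated on PY2)
    binary_type = str


def to_text(data, encoding='utf-8'):
    # type: (bytes, str) -> text_type
    """Decode a byte string into the text type of the running version."""
    return data.decode(encoding)


decoded = to_text(b'caf\xc3\xa9')
```

On either version, decoded is an instance of text_type, so the same
annotation reads correctly for PY2 and PY3.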

> These are intended to be used during the process of porting, to help some
> automated tools, in a period of transition between versions. It's a way to
> tell the analyzer that a type has a behavior that is perhaps different from
> that of the same type on the running python version.
> For example: you start with some working python2 code that you want to
> keep working. A code analysis tool can infer the types and annotate the
> code. It can also check which parts are py2/py3 compatible and which are
> not, and mark those types with the type aliases mentioned above. With this,
> plus test suites, you could calculate how much code needs to be ported.
> You then refactor to adapt the code to python3 while keeping the code that
> still has to run on python2 (it could be marked for automated deletion),
> and when that's done, drop all the python2 code.

Yes, that's the kind of process we're trying to develop. It's still early
days though -- people have gotten different workflows already using six and
tests and the notion of straddling code, __future__ imports, and PyPI
backports of some PY3 stdlib packages (e.g. contextlib2).

There's also a healthy set of tools that converts PY2 code to straddling
code, approximately (e.g. futurize and modernize). What's missing (as you
point out) is tools that help automating a larger part of the conversion
once PY2 code has been annotated.

But first we need to agree on how to annotate PY2 code.

> Of course many people use libraries like six to help them deal with this,
>> which means that those libraries have to be type-hinted appropriately for
>> both versions (maybe using different stubs for py2 and py3, with the right
>> one selected at pip install time?), but if that's taken care of, user code
>> should just work.
> Yeah, we could use help. There are some very rudimentary stubs for a few
> things defined by six (
> <>
> but
> we need more. There's a PR but it's of bewildering size (
> I think the process of porting is different from the process of adapting
> code to work on python 2/3. Code that uses bytes, unicode, and str (don't
> care which) is neither python2 code nor python3 code. Lots of libraries
> that are 2/3 compatible are just python2 code minimally adapted to run on
> python3 with six, and are still developed in a python2 style. When the time
> to drop python2 arrives, the refactoring needed will be huge. There is also
> an article that recently argued "Stop writing code that breaks on Python 4"
> and shows code that treats python3 as the special case.
> PS. I have a hard time following the rest of Agustin's comments. The
> comment-based syntax I proposed for Python 2.7 does support exactly the
> same functionality as the official PEP 484 syntax; the only thing it
> doesn't allow is selectively leaving out types for some arguments -- you
> must use 'Any' to fill those positions. It's not a problem in practice, and
> it doesn't reduce functionality (omitted argument types are assumed to be
> Any in PEP 484 too). I should also remark that mypy supports the
> comment-based syntax in Python 2 mode as well as in Python 3 mode; but when
> writing Python 3 only code, the non-comment version is strongly preferred.
> (We plan to eventually produce a tool that converts the comments to
> standard PEP 484 syntax).
> --
> --Guido van Rossum ( <>)
> My original point is that if comment-based function annotations are going
> to be added, they should be added to python 3 too, not only for the special
> case of "Python 2.7 and straddling code", even though, on python 3, type
> annotations are preferred.

The text I added to the end of PEP 484 already says so:

- Tools that support this syntax should support it regardless of the
  Python version being checked.  This is necessary in order to support
  code that straddles Python 2 and Python 3.

> I think that having the alternative of defining the types of a function in
> a type comment is a good thing, because annotations can become a mess,
> especially with complex types and default parameters, and I don't feel that
> readability should be part of what is optional in gradual typing.
> Some examples from my own code:
> class Field:
>     def __init__(self, name: str,
>                  extract: Callable[[str], str],
>                  validate: Callable[[str], bool]=bool_test,
>                  transform: Callable[[str], Any]=identity) -> 'Field':
> class RepeatableField:
>     def __init__(self,
>                  extract: Callable[[str], str],
>                  size: int,
>                  fields: List[Field],
>                  index_label: str,
>                  index_transform: Callable[[int], str]=lambda x: str(x)
>                  ) -> 'RepeatableField':
> def filter_by(field_gen: Iterable[Dict[str, Any]],
>               **kwargs) -> Generator[Dict[str, Any], Any, Any]:
> So, to define a comment-based function annotation, two kinds of syntax
> should be accepted:
> - an 'explicit' one, marking the type of the function according to the PEP484
> syntax:
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         # type: Callable[[str, int, *str], None]
>         """Embezzle funds from account using fake receipts."""
>         <code goes here>
>   as if it were a normal type comment:
>     embezzle = get_embezzle_function()  # type: Callable[[str, int, *str], None]
> - and another one that 'implicitly' defines the type of the function as
> Callable:
>     def embezzle(self, account, funds=1000000, *fake_receipts):
>         # type: (str, int, *str) -> None
>         """Embezzle funds from account using fake receipts."""
>         <code goes here>
> Both ways are easily translated back and forth into python3 annotations.

I don't see what adding support for

# type: Callable[[str, int, *str], None]

adds. It's more verbose, and when the 'implicit' notation is used, the type
checker already knows that embezzle is a function with that signature. You
can already do this (except for the *str part):

from typing import Callable

def embezzle(account, funds=1000000):
    # type: (str, int) -> None
    """Embezzle funds from account using fake receipts."""

f = None  # type: Callable[[str, int], None]

f = embezzle

f('a', 42)

However, note that no matter which notation you use, there's no way in PEP
484 to write the type of the original embezzle() function object using
Callable -- Callable does not have support for varargs like *fake_receipts.
If you want that the best place to bring it up is the typehinting tracker ( But it's going to be a tough
nut to crack, and the majority of places where Callable is needed (mostly
higher-order functions like filter/map) don't need it -- their function
arguments have purely positional arguments.

> Also, comment-based function annotations easily exceed one line's
> length, so it should be defined which syntax is used to break the line.
> As it said on
> Those things should be in a PEP as a standard way to implement this, not
> only for mypy but also for other tools.
> Accepting comment-based function annotations in python3 is good for
> migrating python 2/3 code, as it helps with refactoring and usage (better
> autocomplete), but making it a python2 feature and not a python3 one
> increases the gap between versions.

Consider it done. The time machine strikes again. :-)

> I hope I expressed myself better this time; if not, sorry about that.

It's perfectly fine this time!

> Agustín Herranz

--Guido van Rossum (

From abarnert at  Thu Jan 21 14:35:43 2016
From: abarnert at (Andrew Barnert)
Date: Thu, 21 Jan 2016 11:35:43 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Jan 20, 2016, at 20:59, Greg Ewing <greg.ewing at> wrote:
> My idea for handling this kind of thing is:
>  for new x in things:
>    funcs.append(lambda: dosomethingwith(x))
> The 'new' modifier can be applied to any assignment target,
> and conceptually has the effect of creating a new binding
> instead of changing an existing binding.

C# almost did this (but only in foreach statements, not all bindings), but in the end they decided that it was simpler to just make foreach _always_ create a new binding each time through the loop, instead of requiring new syntax.

I think most of the rationale is in one of Eric Lippert's blog posts with a name like "loop closures considered harmful" (I can't remember the exact title, and searching while typing sucks on a phone), but I can summarize here.

C# had the exact same problem, for the exact same reasons. And, since they don't have the default-value trick, the solution required defining a new local copy in the same scope as the function definition (which means, if you're defining the function in expression context, you have to wrap it in another lambda and call it).
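
For reference, the Python version of the problem and of the default-value trick looks roughly like this. It is a small sketch: dosomethingwith and things are placeholder names borrowed from Greg's example, and the multiplication is arbitrary.

```python
def dosomethingwith(x):
    return x * 10


things = [1, 2, 3]

# Every lambda closes over the same variable x, so each one sees
# whatever value x has when it is finally called.
late = []
for x in things:
    late.append(lambda: dosomethingwith(x))
late_results = [f() for f in late]

# The default-value trick: x=x is evaluated at definition time, so each
# lambda gets its own copy of the current value.
early = []
for x in things:
    early.append(lambda x=x: dosomethingwith(x))
early_results = [f() for f in early]
```

Here late_results comes out as three copies of the last value, while early_results keeps one value per iteration, which is what the proposed "for new x" spelling would give without the extra default argument.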

After years of closing bugs with "no, C# closures are not broken, what you're complaining about is the exact definition of a closure", they decided they had to do something about it. Every option they considered had some unacceptable feature, but in the end they decided that leaving it as-is was also unacceptable. So, borrowing a bit of practicality-beats-purity from some other language, they decided that a breaking semantic change, and making foreach and C-style for less consistent, and violating one of their fundamental design principles (left is always at least as outside as right) was the best choice.

Python doesn't have the left-outside principle to break (see comprehensions), doesn't have a C-style for to be consistent with, and has probably less rather than more performance impact (we know whether a loop variable is captured, and can skip it for non-cellvars). But it probably has more backward compatibility issues (nobody writes new code expecting it to work for C# 3 as well as C# 5, but people are still writing code that has to work with Python 2.7). So, unless we can be sure that nobody intentionally writes code with a free variable that captures a loop variable, the C# solution isn't available.

Which means your solution is probably the next best thing. And, while I don't see any compelling need for it anywhere other than loop variables, there's also no compelling reason to ban it elsewhere, so why not keep assignment targets consistent.

From tjreedy at  Thu Jan 21 23:08:02 2016
From: tjreedy at (Terry Reedy)
Date: Thu, 21 Jan 2016 23:08:02 -0500
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <n7s9v3$v13$>

On 1/21/2016 1:44 PM, Guido van Rossum wrote:

[Snip discussion of nitty-gritty issue of annotating code, especially 
2.7 code.]

I suspect that at this point making migration from 2.7 to 3.x *easier*, 
with annotations, will do more to encourage migration, overall, than yet 
another new module.  So I support this work even if I will not directly 
use it.

If you are looking for a PyCon talk topic, I think this, with your 
experiences up to that time, would be a good one.

Only slightly off topic, I also think it worthwhile to reiterate that 
pydev support for 2.7 really really will end in 2020, possibly on the 
first day, as now documented in the nice new front-page devguide chart.
I have read people saying (SO comments, I think) that there might or 
will be a security-patch only phase of some number of years *after* that.

> There's also a healthy set of tools that converts PY2 code to straddling
> code, approximately (e.g. futurize and modernize). What's missing (as
> you point out) is tools that help automating a larger part of the
> conversion once PY2 code has been annotated.

PEP 484 gives the motivation for 2.7 compatible type comments as "Some 
tools may want to support type annotations in code that must be 
compatible with Python 2.7. "  To me, this just implies running a static 
analyzer over *existing* code.  Using type hint comments to help 
automate conversion, if indeed possible, would be worth adding to the 

> But first we need to agree on how to annotate PY2 code.

Given the current addition to an accepted PEP, I thought we more or less 
had, at least provisionally.

Terry Jan Reedy

From brett at  Fri Jan 22 13:37:48 2016
From: brett at (Brett Cannon)
Date: Fri, 22 Jan 2016 18:37:48 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, 21 Jan 2016 at 10:45 Guido van Rossum <guido at> wrote:

> On Thu, Jan 21, 2016 at 10:14 AM, Agustín Herranz Cecilia <
> agustin.herranz at> wrote:
>> On 2016/01/21 at 1:11, Guido van Rossum wrote:
> [...]
> [SNIP]
>> > - It must also be addressed how this works in a python2 to python3
>>> environment, as there are types with the same name, str for example, that
>>> work differently on each python version. If the code is for only one
>>> version, it uses the type names of that version.
>>> That's the same problem that exists at runtime, and people (and tools)
>>> already know how to deal with it: use bytes when you mean bytes, unicode
>>> when you mean unicode, and str when you mean whatever is "native" to the
>>> version you're running under and are willing to deal with it. So now you
>>> just have to do the same thing in type hints that you're already doing in
>>> constructors, isinstance checks, etc.
>> This is actually still a real problem. But it has no bearing on the
>> choice of syntax for annotations in Python 2 or straddling code.
>> Yes, this is not directly related to the choice of syntax for annotations.
>> It is intended to help in the process of porting python2 code
>> to python3, and it's outside the PEP's scope but related to the original
>> problem. What I have in mind is some type aliases, so you could annotate a
>> version-specific type to avoid ambiguity in code that is used on
>> different versions. In the end, what I originally tried to say is that it's
>> good to have a conventional way to name these type aliases.
> Yes, this is a useful thing to discuss.
> Maybe we can standardize on the types defined by the 'six' package, which
> is commonly used for 2-3 straddling code:
> six.text_type (unicode in PY2, str in PY3)
> six.binary_type (str in PY2, bytes in PY3)
> Actually for the latter we might as well use bytes.

I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in
Python 3.

As for the textual type, I say either `text` or `unicode` since they are
both unambiguous between Python 2 and 3 and get the point across.

And does `str` represent the type for the specific version of Python mypy
is running under, or is it pegged to a specific representation across
Python 2 and 3? If it's the former then fine, else those people who use the
"native string" concept might want a way to say "I want the `str` type as
defined on the version of Python I'm running under" (personally I don't
promote the "native string" concept, but I know it has been brought up in
the past).


From guido at  Fri Jan 22 14:08:14 2016
From: guido at (Guido van Rossum)
Date: Fri, 22 Jan 2016 11:08:14 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 22, 2016 at 10:37 AM, Brett Cannon <brett at> wrote:

> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum <guido at> wrote:
>> On Thu, Jan 21, 2016 at 10:14 AM, Agustín Herranz Cecilia <
>> agustin.herranz at> wrote:
>> [...]
>> Yes, this is not directly related to the choice of syntax for annotations.
>> It is intended to help in the process of porting python2 code
>> to python3, and it's outside the PEP's scope but related to the original
>> problem. What I have in mind is some type aliases, so you could annotate a
>> version-specific type to avoid ambiguity in code that is used on
>> different versions. In the end, what I originally tried to say is that it's
>> good to have a conventional way to name these type aliases.
>> Yes, this is a useful thing to discuss.
>> Maybe we can standardize on the types defined by the 'six' package, which
>> is commonly used for 2-3 straddling code:
>> six.text_type (unicode in PY2, str in PY3)
>> six.binary_type (str in PY2, bytes in PY3)
>> Actually for the latter we might as well use bytes.
> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in
> Python 3.

OK, that's settled.

> As for the textual type, I say either `text` or `unicode` since they are
> both unambiguous between Python 2 and 3 and get the point across.

Then let's call it unicode. I suppose we can add this to In PY2,
typing.unicode is just the built-in unicode. In PY3, it's the built-in str.

> And does `str` represent the type for the specific version of Python mypy
> is running under, or is it pegged to a specific representation across
> Python 2 and 3? If it's the former then fine, else those people who use the
> "native string" concept might want a way to say "I want the `str` type as
> defined on the version of Python I'm running under" (personally I don't
> promote the "native string" concept, but I know it has been brought up in
> the past).

In mypy (and in typeshed and in, 'str' refers to the type named
str in the Python version for which you are checking -- i.e. by default
mypy checks in PY3 mode and str will be the unicode type; but "mypy --py2"
checks in PY2 mode and str will be the Python 2 8-bit string type. (This is
actually the only thing that makes sense IMO.)

There's one more thing that I wonder might be useful. In PY2 we have
basestring as the supertype of str and unicode. As far as mypy is concerned
it's almost the same as Union[str, unicode]. Maybe we could add this to as well so it's also available in PY3, in that case as a
shorthand for Union[str, unicode].

FWIW We are having a long discussion about this topic in the mypy tracker: -- interested parties are
invited to participate there!

--Guido van Rossum (

From random832 at  Fri Jan 22 14:19:24 2016
From: random832 at (Random832)
Date: Fri, 22 Jan 2016 14:19:24 -0500
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 22, 2016, at 14:08, Guido van Rossum wrote:
> In mypy (and in typeshed and in, 'str' refers to the type
> named
> str in the Python version for which you are checking -- i.e. by default
> mypy checks in PY3 mode and str will be the unicode type; but "mypy
> --py2"
> checks in PY2 mode and str will be the Python 2 8-bit string type. (This
> is
> actually the only thing that makes sense IMO.)

Why should it need to check both modes separately? Does it not work at a
level where it can see if the expression that a value originates from is
"native" (e.g. a literal with no u/b) or bytes/unicode?

From guido at  Fri Jan 22 14:24:13 2016
From: guido at (Guido van Rossum)
Date: Fri, 22 Jan 2016 11:24:13 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 22, 2016 at 11:19 AM, Random832 <random832 at> wrote:

> On Fri, Jan 22, 2016, at 14:08, Guido van Rossum wrote:
> > In mypy (and in typeshed and in, 'str' refers to the type named
> > str in the Python version for which you are checking -- i.e. by default
> > mypy checks in PY3 mode and str will be the unicode type; but "mypy
> --py2"
> > checks in PY2 mode and str will be the Python 2 8-bit string type. (This
> > is actually the only thing that makes sense IMO.)
> Why should it need to check both modes separately? Does it not work at a
> level where it can see if the expression that a value originates from is
> "native" (e.g. a literal with no u/b) or bytes/unicode?

There are many differences between PY2 and PY3, not the least in the stubs
for the stdlib. If you get an expression by calling a built-in function (or
anything else that's not a literal) the type depends on what's in the stub.
The architecture of mypy just isn't designed to take two different sets of
stubs (and other differences in rules, e.g. whether something's an iterator
because it defines '__next__' or 'next') into account at once.
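
To make the '__next__' versus 'next' difference concrete, here is a sketch of
a straddling iterator class. Countdown is a made-up example, not something
from mypy or the stdlib. It defines __next__ for PY3 and aliases next to it
for PY2, which is the kind of version-dependent rule a checker has to pick
one side of.

```python
class Countdown(object):
    """Iterate from start down to 1."""

    def __init__(self, start):
        # type: (int) -> None
        self.current = start

    def __iter__(self):
        # type: () -> Countdown
        return self

    def __next__(self):
        # type: () -> int
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    next = __next__  # PY2 spelling of the same method


values = list(Countdown(3))
```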

--Guido van Rossum (

From abarnert at  Fri Jan 22 14:35:55 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 22 Jan 2016 11:35:55 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 22, 2016, at 10:37, Brett Cannon <brett at> wrote:
>> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum <guido at> wrote:
>> Yes, this is a useful thing to discuss.
>> Maybe we can standardize on the types defined by the 'six' package, which is commonly used for 2-3 straddling code:
>> six.text_type (unicode in PY2, str in PY3)
>> six.binary_type (str in PY2, bytes in PY3)
>> Actually for the latter we might as well use bytes.
> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in Python 3.
> As for the textual type, I say either `text` or `unicode` since they are both unambiguous between Python 2 and 3 and get the point across.

The only problem is that, while bytes is a builtin type in both 2.7 and 3.x, with similar behavior (especially in 3.5, where simple %-formatting code works the same as in 2.7), unicode exists in 2.x but not 3.x, so that would require people writing something like "try: unicode except: unicode=str" at the top of every file (or monkeypatching builtins somewhere) for the annotations to actually be valid 3.x code. And, if you're going to do that, using something that's already wide-spread and as close to a de facto standard as possible, like the six type suggested by Guido, seems less disruptive than inventing a new standard (even if "text" or "unicode" is a little nicer than "six.text_type").
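
Spelled out, the shim would look something like the sketch below. It catches NameError specifically rather than using a bare except, and shout is a hypothetical function added only to show the annotation resolving on 3.x.

```python
try:
    unicode  # exists as a builtin on 2.x
except NameError:
    unicode = str  # 3.x: bind the name so annotations can refer to it


def shout(message: unicode) -> unicode:
    # With the shim in place, unicode is just another name for str here.
    return message.upper() + u'!'


shouted = shout(u'spam')
```

Every module that annotates with unicode would need those four lines (or an import of them), which is the cost being weighed against reusing six.text_type.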

(Or, of course, Guido could just get in his time machine and, along with restoring the u string literal prefix in 3.3, also restore the builtin name unicode as a synonym for str, and then this whole mail thread would fade out like Marty McFly.)

Also, don't forget "basestring", which some 2.x code uses. A lot of such code just drops bytes support when modernizing, but if not, it has to change to something that means basestring or str|unicode in 2.x and bytes|str in 3.x. Again, six has a solution for that, string_types, and mypy could standardize on that solution too.

> And does `str` represent the type for the specific version of Python mypy is running under, or is it pegged to a specific representation across Python 2 and 3? If it's the former then fine,

In six-based code, it means native string, and there are tools designed to help you go over all your str uses and decide which ones should be changed to something else (usually text_type or binary_type), but no special name to use when you decide "I really do want native str here". So, I think it makes sense for mypy to assume the same, rather than to encourage people to shadow or rebind str to make mypy happy in 2.x.

Speaking of native strings: six code often doesn't use native strings for __str__, instead using explicit text, and the @python_2_unicode_compatible class decorator. Will mypy need special support for that decorator to handle those types? If so, it's probably worth adding; otherwise, it would be encouraging people to stick with native strings instead of switching to text.


From guido at  Fri Jan 22 14:44:27 2016
From: guido at (Guido van Rossum)
Date: Fri, 22 Jan 2016 11:44:27 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

Looks like our messages crossed.

On Fri, Jan 22, 2016 at 11:35 AM, Andrew Barnert <abarnert at> wrote:

> On Jan 22, 2016, at 10:37, Brett Cannon <brett at> wrote:
> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum <guido at> wrote:
>> Yes, this is a useful thing to discuss.
>> Maybe we can standardize on the types defined by the 'six' package, which
>> is commonly used for 2-3 straddling code:
>> six.text_type (unicode in PY2, str in PY3)
>> six.binary_type (str in PY2, bytes in PY3)
>> Actually for the latter we might as well use bytes.
> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in
> Python 3.
> As for the textual type, I say either `text` or `unicode` since they are
> both unambiguous between Python 2 and 3 and get the point across.
> The only problem is that, while bytes is a builtin type in both 2.7 and
> 3.x, with similar behavior (especially in 3.5, where simple %-formatting
> code works the same as in 2.7), unicode exists in 2.x but not 3.x, so that
> would require people writing something like "try: unicode except:
> unicode=str" at the top of every file (or monkeypatching builtins
> somewhere) for the annotations to actually be valid 3.x code. And, if
> you're going to do that, using something that's already wide-spread and as
> close to a de facto standard as possible, like the six type suggested by
> Guido, seems less disruptive than inventing a new standard (even if "text"
> or "unicode" is a little nicer than "six.text_type").
> (Or, of course, Guido could just get in his time machine and, along with
> restoring the u string literal prefix in 3.3, also restore the builtin name
> unicode as a synonym for str, and then this whole mail thread would fade
> out like Marty McFly.)
> Also, don't forget "basestring", which some 2.x code uses. A lot of such
> code just drops bytes support when modernizing, but if not, it has to
> change to something that means basestring or str|unicode in 2.x and
> bytes|str in 3.x. Again, six has a solution for that, string_types, and
> mypy could standardize on that solution too.
> And does `str` represent the type for the specific version of Python mypy
> is running under, or is it pegged to a specific representation across
> Python 2 and 3? If it's the former then fine,
> In six-based code, it means native string, and there are tools designed to
> help you go over all your str uses and decide which ones should be changed
> to something else (usually text_type or binary_type), but no special name
> to use when you decide "I really do want native str here". So, I think it
> makes sense for mypy to assume the same, rather than to encourage people to
> shadow or rebind str to make mypy happy in 2.x.
> Speaking of native strings: six code often doesn't use native strings for
> __str__, instead using explicit text, and the @python_2_unicode_compatible
> class decorator. Will mypy need special support for that decorator to
> handle those types? If so, it's probably worth adding; otherwise, it would
> be encouraging people to stick with native strings instead of switching to
> text.

That decorator is in the typeshed stubs and appears to work -- although it
looks like it's just a noop even in PY2. If that requires tweaks please
submit a bug to the typeshed project tracker (

--Guido van Rossum (

From guido at  Fri Jan 22 15:00:57 2016
From: guido at (Guido van Rossum)
Date: Fri, 22 Jan 2016 12:00:57 -0800
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload outside
 stub files
Message-ID: <>

Ben Darnell (Tornado lead) brought up a good use case for allowing
@overload in regular Python files.

There's some discussion (some old, some new) here:

I now propose to allow @overload in non-stub (i.e. .py) files, but with the
following rule: a series of @overload-decorated functions must be followed
by an implementation function that's not @overload-decorated. Calling an
@overload-decorated function is still an error (I propose NotImplemented).
Due to the way repeated function definitions with the same name replace
each other, leaving only the last one active, this should work. E.g. for
Tornado's utf8() the full definition would look like this:

@overload
def utf8(value: None) -> None: ...
@overload
def utf8(value: bytes) -> bytes: ...
@overload
def utf8(value: str) -> bytes: ...  # or (unicode)->bytes, in PY2
def utf8(value):
    # Real implementation goes here.

NOTE: If you are trying to understand why we can't use a stub file here or
why we can't solve this with type variables or unions, please read the
issue and comment there if things are not clear. Here on python-ideas I'd
like to focus on seeing whether this amendment is non-controversial (apart
from tea party members who just want to repeal PEP 484 entirely :-).

I know that previously we wanted to come up with a complete solution for
multi-dispatch based on type annotations first, and there are philosophical
problems with using @overload (though it can be made to work using
sys._getframe()). The proposal here is *not* that solution. If you call the
@overload-decorated function, it will raise NotImplemented. (But if you
follow the rule, the @overload-decorated function objects are inaccessible
so this would only happen if you forgot or misspelled the final,
undecorated implementation function).
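
The rule can be tried out at runtime with a sketch like the following, assuming a typing module whose @overload may be used outside stub files as proposed here. The body of utf8 is a hypothetical stand-in, not Tornado's actual implementation.

```python
from typing import overload


@overload
def utf8(value: None) -> None: ...
@overload
def utf8(value: bytes) -> bytes: ...
@overload
def utf8(value: str) -> bytes: ...
def utf8(value):
    # The last definition wins at runtime, so this undecorated function
    # is the only one callers ever reach.
    if value is None or isinstance(value, bytes):
        return value
    return value.encode("utf-8")
```

A type checker reads the three decorated signatures; the interpreter only ever sees the final one.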

--Guido van Rossum (

From p.f.moore at  Fri Jan 22 15:58:43 2016
From: p.f.moore at (Paul Moore)
Date: Fri, 22 Jan 2016 20:58:43 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On 22 January 2016 at 19:08, Guido van Rossum <guido at> wrote:
> On Fri, Jan 22, 2016 at 10:37 AM, Brett Cannon <brett at> wrote:
>> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum <guido at> wrote:
>>> On Thu, Jan 21, 2016 at 10:14 AM, Agustín Herranz Cecilia
>>> <agustin.herranz at> wrote:
>>> [...]
>>> Yes, this is no related with the choice of syntax for annotations
>>> directly. This is intended to help in the process of porting python2 code to
>>> python3, and it's outside of the PEP scope but related to the original
>>> problem. What I have in mind is some type aliases so you could annotate a
>>> version specific type to avoid ambiguousness on code that it's used on
>>> different versions. At the end what I originally try to said is that it's
>>> good to have a convention way to name this type aliases.
>>> Yes, this is a useful thing to discuss.
>>> Maybe we can standardize on the types defined by the 'six' package, which
>>> is commonly used for 2-3 straddling code:
>>> six.text_type (unicode in PY2, str in PY3)
>>> six.binary_type (str in PY2, bytes in PY3)
>>> Actually for the latter we might as well use bytes.
>> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in
>> Python 3.
> OK, that's settled.
>> As for the textual type, I say either `text` or `unicode` since they are
>> both unambiguous between Python 2 and 3 and get the point across.
> Then let's call it unicode. I suppose we can add this to In PY2,
> typing.unicode is just the built-in unicode. In PY3, it's the built-in str.

This thread came to my attention just as I'd been thinking about a
related point.

For me, by far the worst Unicode-related porting issue I see is people
with a confused view of what type of data reading a file will give.
This is because open() returns a different type (byte stream or
character stream) depending on its arguments (specifically 'b' in the
mode) and it's frustratingly difficult to track this type across
function calls - especially in code originally written in a Python 2
environment where people *expect* to confuse bytes and strings in this
context. So, for example, I see a function read_one_byte which does, and works fine in real use when a data file (opened with
'b') is processed, but fails when sys.stdin is used (on Python 3 once
someone types a Unicode character).

As far as I know, there's no way for type annotations to capture this
distinction - either as they are at present in Python 3, or as being
discussed here. But what I'm not sure of is whether it's something
that *could* be tracked by a type checker. Of course I'm also not sure
I'm right when I say you can't do it right now :-)

Is this something worth including in the discussion, or is it a
completely separate topic?

From guido at  Fri Jan 22 16:11:24 2016
From: guido at (Guido van Rossum)
Date: Fri, 22 Jan 2016 13:11:24 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

Interesting. PEP 484 defines an IO generic class, so you can write IO[str]
or IO[bytes]. Maybe introducing separate helper functions that open files
in text or binary mode can complement this to get a solution?
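
A sketch of what such helper functions could look like, assuming the IO generic from PEP 484. The names `open_text`, `open_binary` and `first_byte` are hypothetical and not part of any library.

```python
import io
import os
import tempfile
from typing import IO


def open_text(path: str, encoding: str = "utf-8") -> IO[str]:
    """Open for reading as a character stream; read() returns text."""
    return io.open(path, "r", encoding=encoding)


def open_binary(path: str) -> IO[bytes]:
    """Open for reading as a byte stream; read() returns bytes."""
    return io.open(path, "rb")


def first_byte(stream: IO[bytes]) -> bytes:
    """Hypothetical consumer that only makes sense for byte streams."""
    return stream.read(1)


# Demo on a throwaway file holding the two-byte UTF-8 encoding of U+00E9.
fd, path = tempfile.mkstemp()
os.write(fd, u"\u00e9".encode("utf-8"))
os.close(fd)
with open_binary(path) as f:
    raw = first_byte(f)   # one byte of a two-byte sequence
with open_text(path) as f:
    char = f.read(1)      # one whole character
os.remove(path)
```

With these signatures, passing the result of open_text to first_byte is something a checker can reject, which is the distinction the mode flag hides.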


--Guido van Rossum (

From abarnert at  Fri Jan 22 16:40:21 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 22 Jan 2016 13:40:21 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 22, 2016, at 13:11, Guido van Rossum <guido at> wrote:
> Interesting. PEP 484 defines an IO generic class, so you can write IO[str] or IO[bytes]. Maybe introducing separate helper functions that open files in text or binary mode can complement this to get a solution?

The runtime types are a little weird here as well.

In 3.x, open returns different types depending on the value, rather than the type, of its inputs. Also, TextIOBase is a subclass of IOBase, even though it isn't a subtype in the LSP sense, so you have to test isinstance(IOBase) and not isinstance(TextIOBase) to know that read() is going to return bytes. That's all a little wonky, but not impossible to deal with.
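
The 3.x class relationships described above can be checked with in-memory streams. This is only a sketch: `reads_bytes` is a hypothetical, best-effort helper, and it says nothing about duck-typed file-like objects that don't inherit from the io ABCs.

```python
import io


def reads_bytes(stream):
    """Best-effort 3.x check that stream.read() yields bytes, not text."""
    return isinstance(stream, io.IOBase) and not isinstance(stream, io.TextIOBase)


text_stream = io.StringIO(u"abc")   # a TextIOBase, and therefore an IOBase
byte_stream = io.BytesIO(b"abc")    # an IOBase, but not a TextIOBase
```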

In 2.x, most file-like objects--including file itself, which open returns--don't satisfy either ABC, and most of them can return either type from read.

Having a different function for open-binary instead of a mode flag would solve this, but it seems a little late to be adding that now. You'd have to go through all your 2.x code and change every open to one of the two new functions just to statically type your code, and then change it again for 3.x. Plus, you'd need to do the same thing not just for the builtin open, but for every library that provides an open-like method.

Maybe this special case is special enough that static type checkers just have to deal with it specially? When the mode flag is a literal, process it; when it's forwarded from another function, it may be possible to get the type from there; otherwise, everything is just unicode|bytes and the type checker can't know any more unless you explicitly tell it (by annotating the variable the result of open is stored in).


From guido at  Fri Jan 22 18:17:27 2016
From: guido at (Guido van Rossum)
Date: Fri, 22 Jan 2016 15:17:27 -0800
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 22, 2016 at 1:40 PM, Andrew Barnert <abarnert at> wrote:

> On Jan 22, 2016, at 13:11, Guido van Rossum <guido at> wrote:
> Interesting. PEP 484 defines an IO generic class, so you can write IO[str]
> or IO[bytes]. Maybe introducing separate helper functions that open files
> in text or binary mode can complement this to get a solution?
> The runtime types are a little weird here as well.
> In 3.x, open returns different types depending on the value, rather than
> the type, of its inputs. Also, TextIOBase is a subclass of IOBase, even
> though it isn't a subtype in the LSP sense, so you have to test
> isinstance(IOBase) and not isinstance(TextIOBase) to know that read() is
> going to return bytes. That's all a little wonky, but not impossible to
> deal with.

Agreed. At this level it's really hard to fix. :-(

> In 2.x, most file-like objects--including file itself, which open
> returns--don't satisfy either ABC, and most of them can return either type
> from read.

Well, the type returned by the builtin open() never returns Unicode. For
duck types (and even StringIO) it's indeed a crapshoot. :-(

> Having a different function for open-binary instead of a mode flag would
> solve this, but it seems a little late to be adding that now. You'd have to
> go through all your 2.x code and change every open to one of the two new
> functions just to statically type your code, and then change it again for
> 3.x. Plus, you'd need to do the same thing not just for the builtin open,
> but for every library that provides an open-like method.

Yeah, painful. Though in most cases you can also patch up individual calls
using cast(IO[str], open(...)) etc.
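
A small sketch of that per-call patch-up, using a temporary file so it runs on its own. The file contents and variable names are made up for the example.

```python
import io
import os
import tempfile
from typing import IO, cast

fd, path = tempfile.mkstemp()
os.close(fd)

# cast() only informs the type checker; at runtime it returns its
# second argument unchanged, so nothing is converted or verified.
with cast(IO[str], io.open(path, "w", encoding="utf-8")) as out:
    out.write(u"text")
with cast(IO[bytes], io.open(path, "rb")) as inp:
    data = inp.read()
os.remove(path)
```

Because nothing is checked at runtime, a wrong cast silently misleads the checker, so this is a patch for individual calls rather than a general solution.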

> Maybe this special case is special enough that static type checkers just
> have to deal with it specially? When the mode flag is a literal, process
> it; when it's forwarded from another function, it may be possible to get
> the type from there; otherwise, everything is just unicode|bytes and the
> type checker can't know any more unless you explicitly tell it (by
> annotating the variable the result of open is stored in).

That would be a lot of work too. We have so many other
important-but-not-urgent things already that I would really like to push
back on this until someone has actually tried the alternative and tells us
how bad it is (like Ben Darnell did for @overload).

--Guido van Rossum (

From tjreedy at  Fri Jan 22 21:32:44 2016
From: tjreedy at (Terry Reedy)
Date: Fri, 22 Jan 2016 21:32:44 -0500
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
 outside stub files
In-Reply-To: <>
References: <>
Message-ID: <n7uooe$g30$>

On 1/22/2016 3:00 PM, Guido van Rossum wrote:
> Ben Darnell (Tornado lead) brought up a good use case for allowing
> @overload in regular Python files.
> There's some discussion (some old, some new) here:
> I now propose to allow @overload in non-stub (i.e. .py) files,

 From a naive point of view, it is the prohibition that is exceptional 
and in need of justification.  So its removal would seem non-problematical.

> but with
> the following rule: a series of @overload-decorated functions must be
> followed by an implementation function that's not @overload-decorated.
> Calling an @overload-decorated function is still an error (I propose
> NotImplemented). Due to the way repeated function definitions with the
> same name replace each other, leaving only the last one active, this
> should work. E.g. for Tornado's utf8() the full definition would look
> like this:
> @overload
> def utf8(value: None) -> None: ...
> @overload
> def utf8(value: bytes) -> bytes: ...
> @overload
> def utf8(value: str) -> bytes: ...  # or (unicode)->bytes, in PY2
> def utf8(value):
>     # Real implementation goes here.

Again, from a naive point of view, treating 'overload' the same as 'x', 
this seems wasteful, so the usage must be a consenting-adult tradeoff 
between the time taken to create function objects that get thrown away 
and the side-effect of 'overload'.

I do understand that non-beginners with expectations based on other 
languages, who don't know Python's specific usage of 'overload', may get 

Your proposed implementation is missing a return statement.

def overload(func):
     def overload_dummy(*args, **kwds):
         raise NotImplemented(
             "You should not call an overloaded function. "
             "A series of @overload-decorated functions "
             "outside a stub module should always be followed "
             "by an implementation that is not @overloaded.")

To avoid throwing away two functions with each def, I suggested moving 
the constant replacement outside of overload.

def _overload_dummy(*args, **kwds):
     raise NotImplemented(
         "You should not call an overloaded function. "
         "A series of @overload-decorated functions "
         "outside a stub module should always be followed "
         "by an implementation that is not @overloaded.")

def overload(func):
     return _overload_dummy
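
One caveat when trying the snippets above: NotImplemented is a singleton constant rather than an exception class, so calling it as NotImplemented("...") fails with a TypeError before anything is raised. The sketch below uses NotImplementedError instead. Here `overload` is a local stand-in for the decorator under discussion, not typing.overload, and `forgot_impl` is a hypothetical name used to keep a reference to the dummy.

```python
def _overload_dummy(*args, **kwds):
    raise NotImplementedError(
        "You should not call an overloaded function. "
        "A series of @overload-decorated functions "
        "outside a stub module should always be followed "
        "by an implementation that is not @overloaded.")


def overload(func):
    # Local stand-in: discard func and hand back the shared dummy.
    return _overload_dummy


@overload
def utf8(value): ...


forgot_impl = utf8  # what callers get if the final def is missing


def utf8(value):
    # The undecorated implementation replaces the dummy under this name.
    return value
```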

> NOTE: If you are trying to understand why we can't use a stub file here
> or why we can't solve this with type variables or unions, please read
> the issue and comment there if things are not clear. Here on
> python-ideas I'd like to focus on seeing whether this amendment is
> non-controversial (apart from tea party members who just want to repeal
> PEP 484 entirely :-).

Sorry, I don't see any connection between tea party philosophy and type 
hinting, except maybe in the opposite direction.  Maybe we should 
continue leaving external politics, US or otherwise, out of pydev 

> I know that previously we wanted to come up with a complete solution for
> multi-dispatch based on type annotations first, and there are
> philosophical problems with using @overload (though it can be made to
> work using sys._getframe()). The proposal here is *not* that solution.

It would be possible for _overload_dummy to add a reference to each func 
arg to a list or set thereof.  Perhaps you meant more or less the same 
with 'using sys._getframe()'.  The challenge would be writing a new 
'overload_complete' decorator, for the last function, that would combine 
the pieces.

Terry Jan Reedy

From abarnert at  Fri Jan 22 22:28:48 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 22 Jan 2016 19:28:48 -0800
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
 outside stub files
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 22, 2016, at 12:00, Guido van Rossum <guido at> wrote:
> Ben Darnell (Tornado lead) brought up a good use case for allowing @overload in regular Python files.
> There's some discussion (some old, some new) here:
> I now propose to allow @overload in non-stub (i.e. .py) files, but with the following rule: a series of @overload-decorated functions must be followed by an implementation function that's not @overload-decorated. Calling an @overload-decorated function is still an error (I propose NotImplemented). Due to the way repeated function definitions with the same name replace each other, leaving only the last one active, this should work. E.g. for Tornado's utf8() the full definition would look like this:
> @overload
> def utf8(value: None) -> None: ...
> @overload
> def utf8(value: bytes) -> bytes: ...
> @overload
> def utf8(value: str) -> bytes: ...  # or (unicode)->bytes, in PY2
> def utf8(value):
>     # Real implementation goes here.

It feels like this is too similar to singledispatch to be so different in details. I get the distinction (the former is for a function that has a single implementation but acts like a bunch of overloads that switch on type, the latter is for a function that's actually implemented as a bunch of overloads that switch on type), and I also get that it'll be much easier to extend overload to compile-time multiple dispatch than to extend singledispatch to run-time multiple dispatch (and you don't want the former to have to wait on the latter), and so on. But it still feels like someone has stapled together two languages here.

(Of course I feel the same way about typing.protocols and implicit ABCs, and I know you disagreed there, so I wouldn't be too surprised if you disagree here as well. But this is even _more_ of a distinct and parallel system than that was--at least typing.Sized is in some way related to, while overload is not related to singledispatch at all, so someone who finds the wrong one in a search seems much more liable to assume that Python just doesn't have the one he wants than to find it.)

Other than that detail, I like everything else: some feature like this should definitely be part of the type checker (the only alternative is horribly complex type annotations); if it's allowed in stub files, it should be allowed in source files; and the rule of "follow the overloads with the real implementation" seems like by far the simplest rule that could make this work.
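
For comparison, here is how the same utf8 example might look with functools.singledispatch, where each type really does get its own implementation at runtime. The function bodies and the helper names are hypothetical.

```python
from functools import singledispatch


@singledispatch
def utf8(value):
    # Fallback for types with no registered implementation.
    raise TypeError("unsupported type: %r" % type(value))


@utf8.register(type(None))
def _utf8_none(value):
    return None


@utf8.register(bytes)
def _utf8_bytes(value):
    return value


@utf8.register(str)
def _utf8_text(value):
    return value.encode("utf-8")
```

Here the dispatch happens at call time on the type of the first argument, whereas the @overload proposal only describes signatures to the type checker.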


From ncoghlan at  Fri Jan 22 23:04:43 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 23 Jan 2016 14:04:43 +1000
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
 outside stub files
In-Reply-To: <>
References: <>
Message-ID: <>

On 23 January 2016 at 06:00, Guido van Rossum <guido at> wrote:
> Ben Darnell (Tornado lead) brought up a good use case for allowing @overload
> in regular Python files.
> There's some discussion (some old, some new) here:
> I now propose to allow @overload in non-stub (i.e. .py) files, but with the
> following rule: a series of @overload-decorated functions must be followed
> by an implementation function that's not @overload-decorated. Calling an
> @overload-decorated function is still an error (I propose NotImplemented).
> Due to the way repeated function definitions with the same name replace each
> other, leaving only the last one active, this should work. E.g. for
> Tornado's utf8() the full definition would look like this:
> @overload
> def utf8(value: None) -> None: ...
> @overload
> def utf8(value: bytes) -> bytes: ...
> @overload
> def utf8(value: str) -> bytes: ...  # or (unicode)->bytes, in PY2
> def utf8(value):
>     # Real implementation goes here.

I share Andrew's concerns about the lack of integration between this
and functools.singledispatch, so would it be feasible to apply the
"magic comments" approach here, similar to the workarounds for
variable annotations and Py2 compatible function annotations?

That is, would it be possible to use a notation like the following?:

    def utf8(value):
        # type: (None) -> None
        # type: (bytes) -> bytes
        # type: (unicode) -> bytes

You're already going to have to allow this for single lines to handle
Py2 compatible annotations, so it seems reasonable to also extend it
to handle overloading while you're still figuring out a native syntax
for that.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From abarnert at  Fri Jan 22 23:50:52 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 22 Jan 2016 20:50:52 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Jan 21, 2016, at 00:48, Victor Stinner <victor.stinner at> wrote:
> Hi,
> Sorry but I'm lost in this long thread. 

I think the whole issue of const optimization is taking this discussion way off track, so let me try to summarize the actual issue.

What the thread is ultimately looking for is a solution to the "closures capturing loop variables" problem. This problem has been in the official programming FAQ[1] for decades, as "Why do lambdas defined in a loop with different values all return the same result?"

    powers = [lambda x: x**i for i in range(10)]

This gives you ten functions that all return x**9, which is probably not what you wanted.

The reason this is a problem is that Python uses "late binding", which in this context means that each of those functions is a closure that captures the variable i in a way that looks up the value of i at call time. All ten functions capture the same variable, and when you later call them, that variable's value is 9.
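
The late-binding behavior can be observed directly; `results` is just a name chosen for this sketch.

```python
powers = [lambda x: x**i for i in range(10)]

# Every closure looks up i when called, and by then the loop has
# finished with i == 9, so all ten functions compute x**9.
results = [f(2) for f in powers]
```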

Almost every language with real closures and for-each loops has the same problem, but someone who's coming to Python as a first language, or coming from a language like C that doesn't have those features, is almost guaranteed to be confused by this when he first sees it. (Presumably, that's why it's in the FAQ.)

The OP proposed that we should add some syntax, borrowed from C++, to function definitions that specifies that some things get captured by value. You could instead describe this as early binding the specified names, or as not capturing at all, but however you describe it, the idea is pretty simple. The obvious way to implement it is to copy the values into the function object at function-creation time, then copy them into locals at call time--exactly like default parameter values. (Not too surprising, because default parameter values are the idiomatic workaround today.)
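
The idiomatic workaround with a default parameter value, shown as a sketch for comparison with the proposed syntax:

```python
# i=i evaluates i when each lambda is created and stores the value on
# the function object; at call time it is copied into a local named i.
powers = [lambda x, i=i: x**i for i in range(10)]

results = [f(2) for f in powers]
```

The cost of this idiom is that i becomes part of the signature, so a caller can override it by accident with a second argument.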

A few alternatives to the parameter-like syntax borrowed from C++ were proposed, including "def power(x; i):" (notice the semicolon) and "def power(x)(i):". A few people also proposed a new declaration statement similar to "global" and "nonlocal"--which opens the question of what to call it; suggested names included "shared", "sharedlocal", and "capture".

People also suggested an optimization: store them like constants, instead of like default values, so they don't need to be copied into locals. (This is similar to the global->const optimizations being discussed in the FAT threads, but here it's optimizing the equivalent of default parameter values, not globals. Which means it's a much less important optimization, since defaults are only fetched once per call, after which they're looked up the same as locals, which are just as fast as consts. It _could_ potentially feed into further FAT-type optimizations, but that's getting pretty speculative.) The obvious downside here is that constants are stored in the code object, so instead of 10 (small) function objects all sharing the same (big) code object, you'd have 10 function objects with 10 separate (big) code objects.
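
The sharing described above is visible in CPython today with the default-value idiom. The names `make_powers`, `shared_code` and `defaults` are chosen for this sketch.

```python
def make_powers():
    return [lambda x, i=i: x**i for i in range(10)]


powers = make_powers()

# All ten function objects point at one code object; only the small
# per-function defaults tuple differs.
shared_code = len({id(f.__code__) for f in powers}) == 1
defaults = [f.__defaults__ for f in powers]
```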

Another alternative, which I don't think anyone seriously considered, is to flag the specified freevars so that, at function creation time, we copy the cell and bind that copy, instead of binding the original cell. (This alternative can't really be called "early binding" or "capture by value", but it has the same net effect.)
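
The cell-copying idea can be approximated by hand today, which may help show what it would mean. This is only a sketch under assumptions: snapshot_closure and make_powers are hypothetical helpers, not anything proposed in the thread, and types.CellType needs Python 3.8 or later.

```python
import types

def snapshot_closure(func):
    """Return a copy of func bound to fresh cells holding the current values."""
    if func.__closure__ is None:
        return func  # nothing captured, so nothing to copy
    # Build one new cell per captured variable, initialised to its current value.
    fresh_cells = tuple(types.CellType(cell.cell_contents)
                        for cell in func.__closure__)
    copy = types.FunctionType(func.__code__, func.__globals__, func.__name__,
                              func.__defaults__, fresh_cells)
    copy.__kwdefaults__ = func.__kwdefaults__
    return copy

def make_powers(snapshot):
    funcs = []
    for i in range(3):
        f = lambda x: x**i  # i is a cell variable shared by every lambda
        funcs.append(snapshot_closure(f) if snapshot else f)
    return funcs

shared = [f(2) for f in make_powers(snapshot=False)]  # every call sees i == 2
copied = [f(2) for f in make_powers(snapshot=True)]   # each call sees its own i
```

Unlike the default-value approach, the copied function keeps its original signature; the function still reads a closure cell, it is just no longer the shared one.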

Finally, Terry suggested a completely different solution to the problem: don't change closures; change for loops. Make them create a new variable each time through the loop, instead of reusing the same variable. When the variable isn't captured, this would make no difference, but when it is, closures from different iterations would capture different variables (and therefore different cells). For backward-compatibility reasons, this might have to be optional, which means new syntax; he proposed "for new i in range(10):".
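
There is no "for new" in Python, but the effect of a fresh variable per iteration can be emulated with an extra function scope. A rough sketch, where powers_with_fresh_binding and body are names invented for this example:

```python
def powers_with_fresh_binding():
    funcs = []
    for i in range(10):
        def body(i):
            # This i is a new local (and a new cell) on every call to body,
            # so each lambda captures a different variable.
            funcs.append(lambda x: x**i)
        body(i)
    return funcs

powers = powers_with_fresh_binding()
results = [f(2) for f in powers]
```

This is roughly what a compiler would have to do behind the scenes for captured loop variables, and it needs nothing extra when the variable is not captured.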

I don't know of any languages that use the C++-style solution that don't have lvalues to worry about. It's actually necessary for other reasons in C++ (capturing a variable doesn't extend its lifetime, so you need to be able to explicitly copy things or you end up with dangling references), but those reasons don't apply to Python (or C#, Swift, JavaScript, etc.). Still, it is a well-known solution to the problem.

Terry's solution, on the other hand, is used by Swift (from the start, even though it _does_ have lvalues), C# (since 5.0), and Ruby (since 1.9), among other languages. C#, in particular, decided to add it as a breaking change to a mature language, rather than adding new syntax, because Eric Lippert believed that almost any code that's relying on the old behavior is probably a bug rather than intentional.


> Do you want to extend the
> Python language to declare constants in a function? Maybe I'm completely
> off-topic, sorry.
> 2016-01-21 1:10 GMT+01:00 Steven D'Aprano <steve at>:
>> (2) If we limit this to only capturing the same name, then we can only
>> write (say) "static x", and that does look like a declaration. But maybe
>> we want to allow the local name to differ from the global name:
>>    static x = y
> 3 months ago, Serhiy Storchaka proposed a "const var = expr" syntax:
> With a shortcut "const len" which is like "const len = len".
> In the meanwhile, I implemented an optimization in my FAT Python
> project: "Copy builtins to constant". It's quite simple: replace the
> "LOAD_GLOBAL builtin" instruction with a "LOAD_CONST builtin"
> instruction and "patch" co_consts constants of a code object at
> runtime:
>   def hello(): print("hello world")
> is replaced with:
>   def hello(): "LOAD_GLOBAL print"("hello world")
>   hello.__code__ = fat.replace_consts(hello.__code__, {'LOAD_GLOBAL
> print': print})
> Where fat.replace_consts() is a helper to create a new code object
> replacing constants with the specified mapping:
> Replacing print(...) with "LOAD_GLOBAL"(...) is done in the
> fatoptimizer (an AST optimizer):
> We have to inject the builtin function at runtime. It cannot be done
> when the code object is created by "def ..." because a code object can
> only contain objects serializable by marshal (to be able to compile a
> .py file to a .pyc file).
>> I acknowledge that this goes beyond what the OP asked for, and I think
>> that YAGNI is a reasonable response to the static block idea. I'm not
>> going to champion it any further unless there's a bunch of interest from
>> others. (I'm saving my energy for Eiffel-like require/ensure blocks
>> *wink*).
> The difference between "def hello(print=print): ..." and Serhiy's
> const idea (or my optimization) is that "def hello(print=print): ..."
> changes the signature of the function which can be a serious issue in
> an API.
> Note: The other optimization "local_print = print" in the function is
> only useful for loops (when the builtin is loaded multiple times) and
> it still loads the builtin once per function call, whereas my
> optimization uses a constant and so no lookup is required anymore.
> Then guards are used to disable the optimization if builtins are
> modified. See the PEP 510 for an explanation on that part.
> Victor

From mike at  Fri Jan 22 23:58:02 2016
From: mike at (Michael Selik)
Date: Fri, 22 Jan 2016 23:58:02 -0500
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

> On Jan 22, 2016, at 11:50 PM, Andrew Barnert via Python-ideas <python-ideas at> wrote:
> On Jan 21, 2016, at 00:48, Victor Stinner <victor.stinner at> wrote:
>> Hi,
>> Sorry but I'm lost in this long thread. 
> I think the whole issue of const optimization is taking this discussion way off track, so let me try to summarize the actual issue.
> What the thread is ultimately looking for is a solution to the "closures capturing loop variables" problem. This problem has been in the official programming FAQ[1] for decades, as "Why do lambdas defined in a loop with different values all return the same result"?
>     powers = [lambda x: x**i for i in range(10)]
> This gives you ten functions that all return x**9, which is probably not what you wanted.

The original request could have also been solved with ``functools.partial``. Sure, this is a toy solution, but the problem as originally shared was a toy problem.

>>> from functools import partial
>>> a = 1
>>> f = partial(lambda a,x: a+x, a)
>>> f(10)
11
>>> a = 2
>>> f(10)
11

Seems to me quite similar to the original suggestion from haael:

   a = 1
   b = 2
   c = 3
   fun = lambda[a, b, c] x, y: a + b + c + x + y
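
The same trick carries over to the loop case from Andrew's summary. A small sketch, with power as a hypothetical helper name:

```python
from functools import partial

def power(i, x):
    return x**i

# partial() stores the current value of i in each partial object when it is
# created, so nothing is looked up later through a shared variable.
powers = [partial(power, i) for i in range(10)]
results = [f(2) for f in powers]
```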

From rosuav at  Sat Jan 23 00:06:36 2016
From: rosuav at (Chris Angelico)
Date: Sat, 23 Jan 2016 16:06:36 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Sat, Jan 23, 2016 at 3:50 PM, Andrew Barnert via Python-ideas
<python-ideas at> wrote:
> Finally, Terry suggested a completely different solution to the problem:
> don't change closures; change for loops. Make them create a new variable
> each time through the loop, instead of reusing the same variable. When the
> variable isn't captured, this would make no difference, but when it is,
> closures from different iterations would capture different variables (and
> therefore different cells). For backward-compatibility reasons, this might
> have to be optional, which means new syntax; he proposed "for new i in
> range(10):".

Not just for backward compatibility. Python's scoping and assignment
rules are currently very straight-forward: assignment creates a local
name unless told otherwise by a global/nonlocal declaration, and *all*
name binding follows the same rules as assignment. Off the top of my
head, I can think of two special cases, neither of which is truly a
change to the binding semantics: "except X as Y:" triggers an
unbinding at the end of the block, and comprehensions have a hidden
function boundary that means their iteration variables are more local
than you might think. Making for loops behave differently by default
would be a stark break from that tidiness.
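
For anyone who hasn't run into those two special cases, here is a quick sketch of both. The function names are invented for the example.

```python
def except_unbinds():
    try:
        raise ValueError("boom")
    except ValueError as err:
        pass
    # The name err is unbound again once the except block ends.
    try:
        err
    except NameError:  # UnboundLocalError is a subclass of NameError
        return True
    return False

def comprehension_scope():
    i = "outer"
    squares = [i * i for i in range(3)]
    # The comprehension's i never touches the enclosing i.
    return i, squares

unbound_after_block = except_unbinds()
outer_i, squares = comprehension_scope()
```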

It seems odd to change this on the loop, though. Is there any reason
to use "for new i in range(10):" if you're not making a series of
nested functions? Seems most logical to make this a special way of
creating functions, not of looping.


From tjreedy at  Sat Jan 23 00:09:37 2016
From: tjreedy at (Terry Reedy)
Date: Sat, 23 Jan 2016 00:09:37 -0500
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <n7v1ui$9f3$>

On 1/22/2016 11:50 PM, Andrew Barnert via Python-ideas wrote:

> Finally, Terry suggested a completely different solution to the problem:
> don't change closures; change for loops.

I remember that proposal, but it was someone other than me.

Terry Jan Reedy

From abarnert at  Sat Jan 23 00:36:30 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 22 Jan 2016 21:36:30 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Jan 22, 2016, at 21:06, Chris Angelico <rosuav at> wrote:
> On Sat, Jan 23, 2016 at 3:50 PM, Andrew Barnert via Python-ideas
> <python-ideas at> wrote:
>> Finally, Terry suggested a completely different solution to the problem:
>> don't change closures; change for loops. Make them create a new variable
>> each time through the loop, instead of reusing the same variable. When the
>> variable isn't captured, this would make no difference, but when it is,
>> closures from different iterations would capture different variables (and
>> therefore different cells). For backward-compatibility reasons, this might
>> have to be optional, which means new syntax; he proposed "for new i in
>> range(10):".
> Not just for backward compatibility. Python's scoping and assignment
> rules are currently very straight-forward: assignment creates a local
> name unless told otherwise by a global/nonlocal declaration, and *all*
> name binding follows the same rules as assignment. Off the top of my
> head, I can think of two special cases, neither of which is truly a
> change to the binding semantics: "except X as Y:" triggers an
> unbinding at the end of the block, and comprehensions have a hidden
> function boundary that means their iteration variables are more local
> than you might think. Making for loops behave differently by default
> would be a stark break from that tidiness.

As a side note, notice that if you don't capture the variable, there is no observable difference (which means CPython would be well within its rights to optimize it by reusing the same variable unless it's a cellvar).

Anyway, yes, it's still something that you have to learn--but the unexpected-on-first-encounter interaction between loop variables and closures is also something that everybody has to learn. And, even after you understand it, it still doesn't become obvious until you've been bitten by it enough times (and if you're going back and forth between Python and a language that's solved the problem, one way or the other, you may keep relearning it). So, theoretically, the status quo is certainly simpler, but in practice, I'm not sure it is.

> It seems odd to change this on the loop, though. Is there any reason
> to use "for new i in range(10):" if you're not making a series of
> nested functions?

Rarely if ever. But is there any reason to "def spam(x; i):" or "def [i](x):" or whatever syntax people like if you're not overwriting i with a different and unwanted value? And is there any reason to reuse a variable you've bound in that way if a loop isn't forcing you to do so?

This problem comes up all the time, in all kinds of languages, when loops and closures intersect. It almost never comes up with loops alone or closures alone.

> Seems most logical to make this a special way of
> creating functions, not of looping.

There are also some good theoretical motivations for changing loops, but I'm really hoping someone else (maybe the Swift or C# dev team blogs) has already written it up, so I can just post a link and a short "... and here's why it also applies to Python" (complicated by the fact that one of the motivations _doesn't_ apply to Python...).

Also, the idea of a closure "capturing by value" is pretty strange on the surface; you have to think through why that doesn't just mean "not capturing" in a language like Python. Nick Coghlan suggests calling it "capture at definition" vs. "capture at call", which helps, but it's still weird. Weirder than loops creating a new binding that has the same name as the old one in a let-less language? I don't know. They're both weird. And so is the existing behavior, despite the fact that it makes perfect sense once you work it through.

Anyway, for now, I'll just repeat that Ruby, Swift, C#, etc. all solved this by changing for loops, while only C++, which already needed to change closures because of its lifetime rules, solved it by changing closures. On the other hand, JavaScript and Java both explicitly rejected any change to fix the problem, and Python has lived with it for a long time, so...

From guido at  Sat Jan 23 00:43:12 2016
From: guido at (Guido van Rossum)
Date: Fri, 22 Jan 2016 21:43:12 -0800
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
 outside stub files
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 22, 2016 at 8:04 PM, Nick Coghlan <ncoghlan at> wrote:

> On 23 January 2016 at 06:00, Guido van Rossum <guido at> wrote:
> > Ben Darnell (Tornado lead) brought up a good use case for allowing
> > @overload in regular Python files.
> >
> > There's some discussion (some old, some new) here:
> >
> >
> > I now propose to allow @overload in non-stub (i.e. .py) files, but with
> > the following rule: a series of @overload-decorated functions must be
> > followed by an implementation function that's not @overload-decorated.
> > Calling an @overload-decorated function is still an error (I propose
> > NotImplemented). Due to the way repeated function definitions with the
> > same name replace each other, leaving only the last one active, this
> > should work. E.g. for Tornado's utf8() the full definition would look
> > like this:
> >
> > @overload
> > def utf8(value: None) -> None: ...
> > @overload
> > def utf8(value: bytes) -> bytes: ...
> > @overload
> > def utf8(value: str) -> bytes: ...  # or (unicode)->bytes, in PY2
> > def utf8(value):
> >     # Real implementation goes here.
> I share Andrew's concerns about the lack of integration between this
> and functools.singledispatch, so would it be feasible to apply the
> "magic comments" approach here, similar to the workarounds for
> variable annotations and Py2 compatible function annotations?
> That is, would it be possible to use a notation like the following?:
>     def utf8(value):
>         # type: (None) -> None
>         # type: (bytes) -> bytes
>         # type: (unicode) -> bytes
>         ...
> You're already going to have to allow this for single lines to handle
> Py2 compatible annotations, so it seems reasonable to also extend it
> to handle overloading while you're still figuring out a native syntax
> for that.

That's clever. I'm sure we could make it work if we wanted to. But it
doesn't map to anything else -- in stub files we do already have @overload,
and there's no way to translate this directly to Python 3 in-line
annotations.

Regarding the confusion with @functools.singledispatch, hopefully all
documentation for @overload (including StackOverflow :-) would quickly
point out how you are supposed to use it.

There's also a deep distinction between @overload in PEP 484 and
singledispatch, multidispatch or even the (ultimately deferred) approach
from PEP 3124, also called @overload.

PEP 484's @overload (whether in stubs or in this proposed form in .py
files) talks to the *type checker* and it can be used with generic types.
For example, suppose you have

@overload
def foo(a: Sequence[int]) -> int: ...
@overload
def foo(a: Sequence[str]) -> float: ...
def foo(a):
    return sum(float(x) for x in a)

(NOTE: Don't be fooled into thinking that the implementation is the last word on
the supported types and hence the list of overloads is "obviously"
incomplete. The type checker needs to take the overloads at their word and
reject calls to e.g. foo([3.14]). A future implementation that matches the
overloaded signatures given here might not work for float arguments.)

Here the implementation will have to somehow figure out whether its
argument is a list of integers or strings, e.g. by checking the type of the
first item -- that should be okay since passing the type check implies a
promise that the argument is homogeneous. But a purely runtime dispatcher
would not be able to make that distinction so easily, since PEP 484 assumes
type erasure -- at runtime the argument is just a Sequence. Of course,
functools.singledispatch sidesteps this by not supporting generic types at
all. But the example illustrates that the two are more different than you'd
think from the utf8() example (which just distinguishes between unicode,
bytes and None -- no generic types there).
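
As a rough, runnable sketch of what such an implementation might look like: the overloads speak only to the type checker, and the body inspects the first item at runtime. The branching logic and the annotations on the implementation are assumptions made for this illustration, not part of the proposal, and a checker may have its own opinions about how these two overloads interact.

```python
from typing import Sequence, Union, overload

@overload
def foo(a: Sequence[int]) -> int: ...
@overload
def foo(a: Sequence[str]) -> float: ...
def foo(a: Union[Sequence[int], Sequence[str]]) -> Union[int, float]:
    # Type erasure: at runtime a is just a sequence, so peek at the first
    # item and rely on the promise that the argument is homogeneous.
    if a and isinstance(a[0], str):
        return sum(float(x) for x in a)
    return sum(int(x) for x in a)

from_ints = foo([1, 2, 3])
from_strs = foo(["1.5", "2.5"])
```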

From a type-checking perspective, functools.singledispatch is not easy to
handle -- it is defined in terms of its runtime behavior, and it explicitly
supports dynamic registration. (Who uses it? There's only one use in the
stdlib, which is in, under the guise @simplegeneric.)
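
For contrast, here is a sketch of the utf8() example written with functools.singledispatch. It is purely illustrative and is not Tornado's actual implementation; the point is that the dispatch happens on the runtime class of the argument and that implementations are registered dynamically.

```python
from functools import singledispatch

@singledispatch
def utf8(value):
    # Fallback for any type without a registered implementation.
    raise TypeError("unsupported type: %r" % type(value))

@utf8.register(type(None))
def _(value):
    return None

@utf8.register(bytes)
def _(value):
    return value

@utf8.register(str)
def _(value):
    return value.encode("utf-8")

encoded = utf8("h\u00e9llo")
```

Nothing stops another module from calling utf8.register() later, which is the dynamic registration that makes this hard to check statically.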

Clearly both @overload and functools.singledispatch are stepping stones on
the way to an elusive better solution. Hopefully until that solution is
found they can live together?

--Guido van Rossum (

From guido at  Sat Jan 23 00:50:03 2016
From: guido at (Guido van Rossum)
Date: Fri, 22 Jan 2016 21:50:03 -0800
Subject: [Python-ideas] Typehinting repo moved to python/typing
Message-ID: <>

This is just a note that with Benjamin's help we've moved the
ambv/typehinting repo on GitHub into the python org, so its URL is now .

This repo was used most intensely for discussions during PEP 484's drafting
period. It also contains the code for, repackaged for earlier
releases on PyPI. The issue tracker is still open for proposals to change
PEP 484, which is not unheard of given its provisional status. If you find
a pointer to the original location of this repo in a file you can update,
please go ahead (though GitHub is pretty good at forwarding URLs from
renamed repos).

--Guido van Rossum (

From ncoghlan at  Sat Jan 23 00:54:09 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 23 Jan 2016 15:54:09 +1000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On 23 January 2016 at 14:50, Andrew Barnert via Python-ideas
<python-ideas at> wrote:
> What the thread is ultimately looking for is a solution to the "closures
> capturing loop variables" problem. This problem has been in the official
> programming FAQ[1] for decades, as "Why do lambdas defined in a loop with
> different values all return the same result"?
>     powers = [lambda x: x**i for i in range(10)]
> This gives you ten functions that all return x**9, which is probably not
> what you wanted.
> The reason this is a problem is that Python uses "late binding", which in
> this context means that each of those functions is a closure that captures
> the variable i in a way that looks up the value of i at call time. All ten
> functions capture the same variable, and when you later call them, that
> variable's value is 9.

Thanks for that summary, Andrew.

While I do share some further thoughts below, I'll also note explicitly
that I think the status quo in this area is entirely acceptable, and
we don't actually *need* to change anything. However, there have
already been some new ways of looking at the question that haven't
come up previously, so I think it's a worthwhile discussion, even
though the most likely outcome is still "No change".

> The OP proposed that we should add some syntax, borrowed from C++, to
> function definitions that specifies that some things get captured by value.
> You could instead describe this as early binding the specified names, or as
> not capturing at all, but however you describe it, the idea is pretty
> simple. The obvious way to implement it is to copy the values into the
> function object at function-creation time, then copy them into locals at
> call time--exactly like default parameter values. (Not too surprising,
> because default parameter values are the idiomatic workaround today.)

In an off-list discussion with Andrew, I noted that one reason the
"capture by value" terminology was confusing me was because it made me
think in terms of "pass by reference" and "pass by value" in C/C++,
neither of which is actually relevant to the discussion at hand.
However, he also pointed out that "early binding" vs "late binding"
was also confusing, since the compile-time/definition-time/call-time
distinction in Python is relatively unique, and in many other contexts
"early binding" refers to things that happen at compile time.

As a result (and as Andrew already noted in another email), I'm
currently thinking of the behaviour of nonlocal and global variables
as "capture at call", while the values of default parameters are
"capture at definition". (If "capture" sounds weird, "resolve at call"
and "resolve at definition" also work).

The subtlety of this distinction actually shows up in *two* entries in
the programming FAQ.

Andrew already mentioned the interaction of loops and closures, where
capture-at-call surprises people:
However, there are also mutable default arguments, where it is
capture-at-definition that is often surprising:

While nobody's proposing to change the latter, providing an explicit
syntax for "capture at definition" may still have a beneficial side
effect in making it easier to explain the way default arguments are
evaluated and stored on the function object at function definition
time rather than created anew each time the function runs.
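
A short sketch of that second FAQ entry, for reference. The function names are made up; the behaviour is the standard one for default arguments.

```python
def append_shared(item, bucket=[]):
    # The default list was created once, when the def statement ran,
    # and is stored on the function object.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # The usual workaround: create the list at call time instead.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

# The single shared default object is visible on the function itself.
stored_default = append_shared.__defaults__[0]
```

Calling append_shared(1) and then append_shared(2) returns the same list object both times, which by then holds both items.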


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Jan 23 01:17:55 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 23 Jan 2016 16:17:55 +1000
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
 outside stub files
In-Reply-To: <>
References: <>
Message-ID: <>

On 23 January 2016 at 15:43, Guido van Rossum <guido at> wrote:
> On Fri, Jan 22, 2016 at 8:04 PM, Nick Coghlan <ncoghlan at> wrote:
>> That is, would it be possible to use a notation like the following?:
>>     def utf8(value):
>>         # type: (None) -> None
>>         # type: (bytes) -> bytes
>>         # type: (unicode) -> bytes
>>         ...
>> You're already going to have to allow this for single lines to handle
>> Py2 compatible annotations, so it seems reasonable to also extend it
>> to handle overloading while you're still figuring out a native syntax
>> for that.
> That's clever. I'm sure we could make it work if we wanted to. But it
> doesn't map to anything else -- in stub files do already have @overload, and
> there's no way to translate this directly to Python 3 in-line annotations.

Right, my assumption is that it would eventually be translated to a
full multi-dispatch solution for Python 3, whatever that spelling
turns out to be - I'm just assuming that spelling *won't* involve
annotating empty functions the way that stub files currently do, but
rather annotating separate implementations for a multidispatch
algorithm, or perhaps gaining a way to more neatly compose multiple
sets of annotations.

> @overload
> def foo(a: Sequence[int]) -> int: ...
> @overload
> def foo(a: Sequence[str]) -> float: ...
> def foo(a):
>     return sum(float(x) for x in a)

While the disconnect with functools.singledispatch is one concern,
another is the sheer visual weight of this approach. The real function
has to go last to avoid getting clobbered, but the annotations for
multiple dispatch end up using a lot of space beforehand.

It gets worse if you need to combine it with Python 2 compatible type
hinting comments since you can't squeeze the function definitions onto
one line anymore:

    def foo(a):
        # type: (Sequence[int]) -> int
    def foo(a):
        # type: (Sequence[str]) -> float
    def foo(a):
        return sum(float(x) for x in a)

If you were to instead go with a Python 2 compatible comment based
inline solution for now, you'd then get to design the future official
spelling for multi-dispatch annotations based on your experience with
both that and with the decorator+annotations approach used in stub files.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From spencerb21 at  Sat Jan 23 03:14:58 2016
From: spencerb21 at (Spencer Brown)
Date: Sat, 23 Jan 2016 08:14:58 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

> On 23 Jan 2016, at 7:41 AM, Andrew Barnert via Python-ideas <python-ideas at> wrote:
> The runtime types are a little weird here as well.
> In 3.x, open returns different types depending on the value, rather than the type, of its inputs. Also, TextIOBase is a subclass of IOBase, even though it isn't a subtype in the LSP sense, so you have to test isinstance(IOBase) and not isinstance(TextIOBase) to know that read() is going to return bytes. That's all a little wonky, but not impossible to deal with.
> In 2.x, most file-like objects--including file itself, which open returns--don't satisfy either ABC, and most of them can return either type from read.
> Having a different function for open-binary instead of a mode flag would solve this, but it seems a little late to be adding that now. You'd have to go through all your 2.x code and change every open to one of the two new functions just to statically type your code, and then change it again for 3.x. Plus, you'd need to do the same thing not just for the builtin open, but for every library that provides an open-like method.
> Maybe this special case is special enough that static type checkers just have to deal with it specially? When the mode flag is a literal, process it; when it's forwarded from another function, it may be possible to get the type from there; otherwise, everything is just unicode|bytes and the type checker can't know any more unless you explicitly tell it (by annotating the variable the result of open is stored in).

Instead of special-casing open() specifically, adding a 'Literal' class would solve this issue (although only in a stub file): 

def open(mode: Literal['rb', 'wb', 'ab']) -> BufferedIOBase: ...
def open(mode: Literal['rt', 'wt', 'at']) -> TextIOBase: ...

Literal[a,b,c] == Union[Literal[a], Literal[b], Literal[c]] for convenience purposes. To avoid repetition,  func(arg: Literal='value') could be made equivalent to func(arg: Literal['value']='value').

Typecheckers should just treat this the same as the type of the value, but for cases where it knows the value (literals or aliases) check the value too. (Either by comparison for core types, or just by identity. That allows use of object() sentinel values or Enum members.)
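
A 'Literal' form along these lines was later added to the typing module (Python 3.8, PEP 586). The sketch below uses it on a made-up open_stream() helper rather than the real open(), so the function, its parameters and its return types are assumptions for illustration only.

```python
import io
from typing import Literal, Union, overload

@overload
def open_stream(data: bytes, mode: Literal["rb"]) -> io.BytesIO: ...
@overload
def open_stream(data: bytes, mode: Literal["rt"]) -> io.StringIO: ...
def open_stream(data: bytes, mode: str) -> Union[io.BytesIO, io.StringIO]:
    # The runtime still branches on the value; the overloads let a type
    # checker pick the return type when mode is a literal at the call site.
    if mode == "rb":
        return io.BytesIO(data)
    if mode == "rt":
        return io.StringIO(data.decode("utf-8"))
    raise ValueError("unsupported mode: %r" % mode)

binary = open_stream(b"abc", "rb").read()
text = open_stream(b"abc", "rt").read()
```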

From greg.ewing at  Sat Jan 23 06:01:13 2016
From: greg.ewing at (Greg Ewing)
Date: Sun, 24 Jan 2016 00:01:13 +1300
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <n7v1ui$9f3$>
References: <ypnydezpdjtvtxsvsohu@vlmj>
 <> <n7v1ui$9f3$>
Message-ID: <>

Terry Reedy wrote:

>> Finally, Terry suggested a completely different solution to the problem:
>> don't change closures; change for loops.
> I remember that proposal, but it was someone other than me.

If you're looking for the perpetrator of "for new i in ...",
I confess it was me.


From greg.ewing at  Sat Jan 23 06:11:29 2016
From: greg.ewing at (Greg Ewing)
Date: Sun, 24 Jan 2016 00:11:29 +1300
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Nick Coghlan wrote:
> As a result (and as Andrew already noted in another email), I'm
> currently thinking of the behaviour of nonlocal and global variables
> as "capture at call",

That's not right either, because if a free variable gets
reassigned between the time of the call and the time the
variable is used within the function, the new value is


From stephen at  Sat Jan 23 06:53:38 2016
From: stephen at (Stephen J. Turnbull)
Date: Sat, 23 Jan 2016 20:53:38 +0900
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Andrew Barnert via Python-ideas writes:

 >     powers = [lambda x: x**i for i in range(10)]

 > This gives you ten functions that all return x**9, which is
 > probably not what you wanted.

 > The reason this is a problem is that Python uses "late binding",
 > which in this context means that each of those functions is a
 > closure that captures the variable i in a way that looks up the
 > value of i at call time. All ten functions capture the same
 > variable, and when you later call them, that variable's value is
 > 9.

But this explanation is going to confuse people who understand the
concept of variable in Python to mean names that are bound and
re-bound to objects.  The comprehension's binding of i disappears
before any element of powers can be called.  So from their point of
view, either that expression is an error, or powers[i] closes over a
new binding of the name "i", specific to "the lambda's scope" (see
below), to the current value of i in the comprehension.

Of course the same phenomenon is observable with other scopes.  In
particular global scope behaves this way, as importing this file

    i = 0
    def f(x):
        return x + i
    i = 1

and calling f(0) will demonstrate.  But changing the value of a
global, used the way i is here, within a library module is a rather
unusual thing to do; I doubt people will observe it.

Also, once again the semantics of lambda (specifically, that unlike
def it doesn't create a scope) seem to be a source of confusion more
than anything else.  Maybe it's possible to exhibit the same issue
with def, but the def equivalent to the above lambda

    >>> def make_increment(i):
    ...  def _(x):
    ...   return x + i
    ...  return _
    >>> funcs = [make_increment(j) for j in range(3)]
    >>> [f(0) for f in funcs]
    [0, 1, 2]

closes over i in the expected way.  (Of course in practicality, it's
way more verbose, and in purity, it's not truly equivalent since
there's at least one extra nesting of scope involved.)  While

    >>> def make_increment():
    ...  def _(x):
    ...   return x + i
    ...  return _
    >>> funcs = [make_increment() for i in range(3)]
    >>> [f(0) for f in funcs]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in <listcomp>
      File "<stdin>", line 3, in _
    NameError: name 'i' is not defined
    >>> i = 6
    >>> [f(0) for f in funcs]
    [6, 6, 6]

doesn't make closures at all, but rather retains the global binding.

From skrah.temporarily at  Sat Jan 23 07:57:36 2016
From: skrah.temporarily at (Stefan Krah)
Date: Sat, 23 Jan 2016 12:57:36 +0000 (UTC)
Subject: [Python-ideas] Explicit variable capture list
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Nick Coghlan <ncoghlan at ...> writes:
> On 23 January 2016 at 14:50, Andrew Barnert via Python-ideas
> <python-ideas at ...> wrote:
> > What the thread is ultimately looking for is a solution to the "closures
> > capturing loop variables" problem. This problem has been in the official
> > programming FAQ[1] for decades, as "Why do lambdas defined in a loop with
> > different values all return the same result"?
> >
> >     powers = [lambda x: x**i for i in range(10)]
> >
> > This gives you ten functions that all return x**9, which is probably not
> > what you wanted.
> >
> > The reason this is a problem is that Python uses "late binding", which in
> > this context means that each of those functions is a closure that captures
> > the variable i in a way that looks up the value of i at call time. All ten
> > functions capture the same variable, and when you later call them, that
> > variable's value is 9.

I've never liked the use of "late binding" in this context. The
behavior is totally standard for closures that use mutable values.

Here's OCaml, using refs (mutable reference cells) instead of
the regular immutable values.  (BTW, no one would write OCaml
like in the following example, it's just for clarity):

let i = ref 0.0;;
# val i : float ref = {contents = 0.}

let rpow = ref [];;
# val rpow : '_a list ref = {contents = []}

while (!i < 10.0) do
  rpow := (fun x -> x**(!i)) :: !rpow;
  i := !i +. 1.0
done;;
- : unit = ()

let powers = List.rev !rpow;;

val powers : (float -> float) list =
  [<fun>; <fun>; <fun>; <fun>; <fun>; <fun>; <fun>; <fun>; <fun>; <fun>]

List.map (fun f -> f 10.0) powers;;
- : float list =
[10000000000.; 10000000000.; 10000000000.; 10000000000.; 10000000000.;
 10000000000.; 10000000000.; 10000000000.; 10000000000.; 10000000000.]

You see that "i" is a reference cell, i.e. it's compiled to a C struct
and lookups are just a pointer dereference.

Conceptually Python's dictionaries are really just the same as reference
cells, except they hold more than one value.

So, to me the entire question is more one of immutable vs. mutable
rather than late vs. early binding.
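
For comparison, a rough Python transliteration of the OCaml session above.
The `Ref` class and `build_powers` function are hypothetical helpers introduced
only for this sketch; the point is that the closures hold on to the cell, so
they all see whatever it contains when they are finally called.

```python
class Ref:
    """Minimal stand-in for an OCaml ref cell (hypothetical helper)."""

    def __init__(self, contents):
        self.contents = contents


def build_powers():
    i = Ref(0.0)
    powers = []
    while i.contents < 10.0:
        # Each function keeps the cell itself, not the float that was
        # stored in it when the function was created.
        powers.append(lambda x: x ** i.contents)
        i.contents += 1.0
    return powers


# By the time the functions are called, the cell holds 10.0.
results = [f(10.0) for f in build_powers()]
print(results)
```

Note that the name `i` is never rebound here; only the object it refers to is
mutated, and the outcome is the same as in the OCaml version.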

Stefan Krah

From guido at  Sat Jan 23 11:54:37 2016
From: guido at (Guido van Rossum)
Date: Sat, 23 Jan 2016 08:54:37 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Sat, Jan 23, 2016 at 3:53 AM, Stephen J. Turnbull <stephen at> wrote:

> Andrew Barnert via Python-ideas writes:
>  >     powers = [lambda x: x**i for i in range(10)]
>  > This gives you ten functions that all return x**9, which is
>  > probably not what you wanted.
>  > The reason this is a problem is that Python uses "late binding",
>  > which in this context means that each of those functions is a
>  > closure that captures the variable i in a way that looks up the
>  > value of i at call time. All ten functions capture the same
>  > variable, and when you later call them, that variable's value is
>  > 9.

Actually it doesn't look up the value at call time, but each time it's
used. This technicality matters if in between uses you call something that
has write access to the same variable (typically using nonlocal) and
modifies it.
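
A minimal sketch of that technicality. The names `make`, `bump` and `f` are
made up for the illustration; `f` reads the shared variable twice and sees two
different values, because another closure rebinds it in between.

```python
def make():
    i = 1

    def bump():
        nonlocal i
        i += 10

    def f():
        first = i    # i is looked up here ...
        bump()       # ... rebound by another function sharing the variable ...
        second = i   # ... and looked up again here
        return first, second

    return f


result = make()()
print(result)  # (1, 11)
```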

> But this explanation is going to confuse people who understand the
> concept of variable in Python to mean names that are bound and
> re-bound to objects.  The comprehension's binding of i disappears
> before any element of powers can be called.  So from their point of
> view, either that expression is an error, or powers[i] closes over a
> new binding of the name "i", specific to "the lambda's scope" (see
> below), to the current value of i in the comprehension.

But this seems to refer to a very specific definition of "binding" that
has no roots in Python's semantic model. I suppose it may come from
Lisp (which didn't influence Python quite as much as people think :-).

So I think what you're saying here comes down to this: it will confuse people
who misunderstand Python's variables. Given that the misunderstanding
you're supposing here is pretty specific (it's not just due to people
who've never thought much about variables) I'm not sure I care much.

> Of course the same phenomenon is observable with other scopes.  In
> particular global scope behaves this way, as importing this file
>     i = 0
>     def f(x):
>         return x + i
>     i = 1
> and calling f(0) will demonstrate.  But changing the value of a
> global, used the way i is here, within a library module is a rather
> unusual thing to do; I doubt people will observe it.

I disagree again: in interactive mode most of what you do is global and you
will see this quite often.

And all scopes in Python behave the same way.

> Also, once again the semantics of lambda (specifically, that unlike
> def it doesn't create a scope)

Uh, what? I can sort of guess what you are referring to here (namely, that
no syntactic construct permissible in a lambda can assign to a local
variable -- or any variable, for that matter) but it certainly has a scope
(to hold the arguments, which are just variables, as one quickly learns
from experimenting with the arguments to a function defined using def).
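
One way to run that experiment, as a small sketch (the names `scale` and
`scale_def` are invented here): the parameters of a lambda show up as local
variables of its code object, just as they do for the equivalent def.

```python
# A lambda and the def that corresponds to it.
scale = lambda x, factor=2: x * factor


def scale_def(x, factor=2):
    return x * factor


# Both code objects list the parameters as their local variable names.
lambda_locals = scale.__code__.co_varnames
def_locals = scale_def.__code__.co_varnames
print(lambda_locals, def_locals)
```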

> seem to be a source of confusion more
> than anything else.  Maybe it's possible to exhibit the same issue
> with def, but the def equivalent to the above lambda
>     >>> def make_increment(i):
>     ...  def _(x):
>     ...   return x + i
>     ...  return _
>     ...
>     >>> funcs = [make_increment(j) for j in range(3)]
>     >>> [f(0) for f in funcs]
>     [0, 1, 2]
> closes over i in the expected way.  (Of course in practicality, it's
> way more verbose, and in purity, it's not truly equivalent since
> there's at least one extra nesting of scope involved.)

It's such a strawman that I'm surprised you bring it up. Who would even
*think* of using that idiom as equivalent to the simple lambda?

If I were to deconstruct the original statement, I would start by replacing
the list comprehension with a plain old for loop. That would also not be
truly equivalent because the comprehension introduces a scope while the for
loop doesn't, but the difference only matters if it stomps on another
variable -- the semantics relative to the lambda are exactly the same. In
particular, this example exhibits the same phenomenon without using a
comprehension:

powers = []
for i in range(10):
    powers.append(lambda x: x**i)

This in turn can be rewritten without changing the semantics related to
scopes using a def that's equivalent (truly equivalent except for its
__name__ attribute!):

powers = []
for i in range(10):
    def f(x):
        return x**i
    powers.append(f)

(Note that the leakage of f here is irrelevant to the problem.)

This has the same problem, without being distracted by lambda or
comprehensions, and we can now explore its semantics through
experimentation. We could even unroll the for loop and get the same issue:

powers = []

i = 0
def f(x):
    return x**i
powers.append(f)

i = 1
def f(x):
    return x**i
powers.append(f)

# Etc.
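
To see the effect concretely, here is a runnable sketch of the loop form next
to the usual workaround from the programming FAQ (a default argument evaluated
when the function is defined). The names `fixed`, `g`, `late` and `early` are
made up for this illustration.

```python
powers = []
for i in range(10):
    def f(x):
        return x**i
    powers.append(f)

# Every f reads the same variable i, which is 9 once the loop has finished.
late = [func(2) for func in powers]

fixed = []
for i in range(10):
    def g(x, i=i):  # the default value is evaluated once, at definition time
        return x**i
    fixed.append(g)

early = [func(2) for func in fixed]

print(late)
print(early)
```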

> While
>     >>> def make_increment():
>     ...  def _(x):
>     ...   return x + i
>     ...  return _
>     ...
>     >>> funcs = [make_increment() for i in range(3)]
>     >>> [f(0) for f in funcs]
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>       File "<stdin>", line 1, in <listcomp>
>       File "<stdin>", line 3, in _
>     NameError: name 'i' is not defined
>     >>> i = 6
>     >>> [f(0) for f in funcs]
>     [6, 6, 6]
> doesn't make closures at all, but rather retains the global binding.

Totally different idiom again -- another strawman.

--Guido van Rossum (

From guido at  Sat Jan 23 12:08:15 2016
From: guido at (Guido van Rossum)
Date: Sat, 23 Jan 2016 09:08:15 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Sat, Jan 23, 2016 at 4:57 AM, Stefan Krah <skrah.temporarily at> wrote:

> I've never liked the use of "late binding" in this context. The
> behavior is totally standard for closures that use mutable values.

I wonder if the problem isn't that "binding" is a term imported from a
different language philosophy, and the idea there is just fundamentally
different from Python's philosophy about variables.

In Python, a variable is *conceptually* just a key in a dict (and often,
like for globals, builtins and instance or class variables, that really is
how it's implemented). The variable name is the key, and there are implicit
(and often dynamic) rules for deciding which dict to use. For local
variables this is a bit of a lie, but the language goes out of its way to
make it appear true (e.g. the existence of locals()).
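
A small sketch of the "variable is a key in a dict" view for globals. It runs
a snippet in an explicit namespace dict via exec() so that the dict is visible;
the names `namespace` and `source` are chosen only for this example.

```python
namespace = {}
source = """
i = 0
def f(x):
    return x + i
i = 1
"""
exec(source, namespace)  # namespace plays the role of the module's globals

f = namespace["f"]
first = f(0)             # looks up "i" in namespace when called

namespace["i"] = 6       # rebinding through the dict ...
second = f(0)            # ... is what the function sees next time

same_dict = f.__globals__ is namespace
print(first, second, same_dict)
```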

This concept is also valid for nonlocals (either the implicit PY2 kind, or
the explicit PY3 kind introduced by a nonlocal statement). The
implementation through "cells" is nearly unobservable (try getting a hold
of a cell object through introspection without using ctypes!) and is just
an optimization. Semantically (if we don't mind keeping other objects alive
longer), nonlocals can be implemented by just holding on to the stack frame
of the function call where they live, or, if locals hadn't been optimized,
holding on to the dict containing that frame's locals would also work.
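
For readers who want to look at the optimization anyway: function objects do
carry a `__closure__` attribute holding their cells, and each cell has a
`cell_contents` attribute. The sketch below (with a made-up `build` helper)
uses that to check that the closures created in one loop share a single cell
rather than each getting a snapshot.

```python
def build():
    funcs = []
    for i in range(3):
        funcs.append(lambda x: x**i)
    return funcs


funcs = build()
cells = [func.__closure__[0] for func in funcs]

# All three lambdas refer to the very same cell, which now holds the
# last value assigned to i.
shared = cells[0] is cells[1] is cells[2]
contents = cells[0].cell_contents
values = [func(2) for func in funcs]
print(shared, contents, values)
```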

So, I don't really want to introduce "for new x in ..." because it suddenly
introduces a completely different concept into the language, and it would
be really hard to explain what it does to someone who has correctly grasped
Python's concept of variables as keys in a dict. What dict holds x in "for
new x ..."? It would have to be considered a new dict created just to hold
x, but other variables assigned in the body of the for loop would still be
in the dict holding all the other locals of the function. Bah.

--Guido van Rossum (

From jlehtosalo at  Sat Jan 23 13:13:08 2016
From: jlehtosalo at (Jukka Lehtosalo)
Date: Sat, 23 Jan 2016 18:13:08 +0000
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
 outside stub files
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 23, 2016 at 6:17 AM, Nick Coghlan <ncoghlan at> wrote:

> On 23 January 2016 at 15:43, Guido van Rossum <guido at> wrote:
> > That's clever. I'm sure we could make it work if we wanted to. But it
> > doesn't map to anything else -- in stub files we do already have
> > @overload, and there's no way to translate this directly to Python 3
> > in-line annotations.
> Right, my assumption is that it would eventually be translated to a
> full multi-dispatch solution for Python 3, whatever that spelling
> turns out to be - I'm just assuming that spelling *won't* involve
> annotating empty functions the way that stub files currently do, but
> rather annotating separate implementations for a multidispatch
> algorithm, or perhaps gaining a way to more neatly compose multiple
> sets of annotations.

We don't have a proposal for multidispatch even though people have been
hoping for it to happen for a long time. It's a much harder problem than
providing multiple signatures for a function, and it's also arguably a
different problem, even if there is some overlap. @overload might still be
preferable to a multidispatch solution in a lot of cases even if both
existed since @overload is conceptually pretty simple. However, this is all
conjecture since we don't know what multidispatch would look like. I'd love
to see a multidispatch proposal.

> > @overload
> > def foo(a: Sequence[int]) -> int: ...
> > @overload
> > def foo(a: Sequence[str]) -> float: ...
> > def foo(a):
> >     return sum(float(x) for x in a)
> While the disconnect with functools.singledispatch is one concern,
> another is the sheer visual weight of this approach. The real function
> has to go last to avoid getting clobbered, but the annotations for
> multiple dispatch end up using a lot of space beforehand.

There is little evidence that @overload answers a *common* need, even if
the need is important -- originally we left out @overload in .py files
because we hadn't found a convincing use case. I don't consider the visual
weight to be a major problem, as this would only be used rarely, at least
based on our current understanding. But clearly the proposed syntax won't
win any prettiness awards.

Singledispatch solves a different problem and makes different tradeoffs --
for example, it adds more runtime overhead, and it doesn't lend itself to
specifying multiple return types for a single function body that depend on
argument types. It also lives in a different module. I don't worry too much
about the overlap.
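
A side-by-side sketch of the two tools, to make the difference concrete. The
function names `describe` and `double` are invented for the illustration, and
the @overload half assumes a version of typing that accepts @overload in a
regular module followed by a non-overloaded implementation, which is what this
thread proposes.

```python
from functools import singledispatch
from typing import overload


# singledispatch: several bodies, selected at run time by the type of
# the first argument.
@singledispatch
def describe(value):
    return "object"


@describe.register(int)
def _(value):
    return "int"


@describe.register(str)
def _(value):
    return "str"


# @overload: several signatures for a type checker, but a single body
# that runs for every call; nothing is dispatched at run time.
@overload
def double(value: int) -> int: ...
@overload
def double(value: str) -> str: ...
def double(value):
    return value * 2


print(describe(1), describe("a"), describe(1.5))
print(double(2), double("ab"))
```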

> It gets worse if you need to combine it with Python 2 compatible type
> hinting comments since you can't squeeze the function definitions onto
> one line anymore:
>     @overload
>     def foo(a):
>         # type: (Sequence[int]) -> int
>         ...
>     @overload
>     def foo(a):
>         # type: (Sequence[str]) -> float
>         ...
>     def foo(a):
>         return sum(float(x) for x in a)
> If you were to instead go with a Python 2 compatible comment based
> inline solution for now, you'd then get to design the future official
> spelling for multi-dispatch annotations based on your experience with
> both that and with the decorator+annotations approach used in stub
> files.

Your proposed comment based solution looks nicer in Python 2 code than
@overload. I'd prefer optimizing any syntax we choose for Python 3 as
that's where the future is. I'd rather not be forced to use comment-based
signatures in Python 3 only code.


From brett at  Sat Jan 23 14:18:42 2016
From: brett at (Brett Cannon)
Date: Sat, 23 Jan 2016 19:18:42 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, 22 Jan 2016 at 11:35 Andrew Barnert <abarnert at> wrote:

> On Jan 22, 2016, at 10:37, Brett Cannon <brett at> wrote:
> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum <guido at> wrote:
>> Yes, this is a useful thing to discuss.
>> Maybe we can standardize on the types defined by the 'six' package, which
>> is commonly used for 2-3 straddling code:
>> six.text_type (unicode in PY2, str in PY3)
>> six.binary_type (str in PY2, bytes in PY3)
>> Actually for the latter we might as well use bytes.
> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in
> Python 3.
> As for the textual type, I say either `text` or `unicode` since they are
> both unambiguous between Python 2 and 3 and get the point across.
> The only problem is that, while bytes is a builtin type in both 2.7 and
> 3.x, with similar behaviour (especially in 3.5, where simple %-formatting
> code works the same as in 2.7), unicode exists in 2.x but not 3.x, so that
> would require people writing something like "try: unicode except:
> unicode=str" at the top of every file (or monkeypatching builtins
> somewhere) for the annotations to actually be valid 3.x code.

But why do they have to be valid code? This is for Python 2/3 code which
means any typing information is going to be in a comment and so it isn't
important that it be valid code as-is as long as the tools involved realize
what `unicode` represents. IOW if mypy knows what the `unicode` type
represents in PY3 mode then what does it matter if `unicode` is not a
built-in type of Python 3?

> And, if you're going to do that, using something that's already
> wide-spread and as close to a de facto standard as possible, like the six
> type suggested by Guido, seems less disruptive than inventing a new
> standard (even if "text" or "unicode" is a little nicer than
> "six.text_type").
> (Or, of course, Guido could just get in his time machine and, along with
> restoring the u string literal prefix in 3.3, also restore the builtin name
> unicode as a synonym for str, and then this whole mail thread would fade
> out like Marty McFly.)

I long thought about that option, but I don't think it buys us enough to
bother to add the alias for `str` in Python 3. Considering all of the other
built-in tweaks you typically end up making, I don't think this one change
is worth it.


From brett at  Sat Jan 23 14:22:19 2016
From: brett at (Brett Cannon)
Date: Sat, 23 Jan 2016 19:22:19 +0000
Subject: [Python-ideas] Proposal to extend PEP 484 (gradual typing) to
 support Python 2.7
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, 23 Jan 2016 at 11:18 Brett Cannon <brett at> wrote:

> On Fri, 22 Jan 2016 at 11:35 Andrew Barnert <abarnert at> wrote:
>> On Jan 22, 2016, at 10:37, Brett Cannon <brett at> wrote:
>> On Thu, 21 Jan 2016 at 10:45 Guido van Rossum <guido at> wrote:
>>> Yes, this is a useful thing to discuss.
>>> Maybe we can standardize on the types defined by the 'six' package,
>>> which is commonly used for 2-3 straddling code:
>>> six.text_type (unicode in PY2, str in PY3)
>>> six.binary_type (str in PY2, bytes in PY3)
>>> Actually for the latter we might as well use bytes.
>> I agree that `bytes` should cover str/bytes in Python 2 and `bytes` in
>> Python 3.
>> As for the textual type, I say either `text` or `unicode` since they are
>> both unambiguous between Python 2 and 3 and get the point across.
>> The only problem is that, while bytes is a builtin type in both 2.7 and
>> 3.x, with similar behaviour (especially in 3.5, where simple %-formatting
>> code works the same as in 2.7), unicode exists in 2.x but not 3.x, so that
>> would require people writing something like "try: unicode except:
>> unicode=str" at the top of every file (or monkeypatching builtins
>> somewhere) for the annotations to actually be valid 3.x code.
> But why do they have to be valid code? This is for Python 2/3 code which
> means any typing information is going to be in a comment and so it isn't
> important that it be valid code as-is as long as the tools involved realize
> what `unicode` represents. IOW if mypy knows what the `unicode` type
> represents in PY3 mode then what does it matter if `unicode` is not a
> built-in type of Python 3?

I should also mention that Guido is suggesting typing.unicode come into
existence, so there is no special import guard necessary. And since you
will be importing `typing` anyway for type details then having
typing.unicode in both Python 2 and Python 3 is a very minor overhead.
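
As a sketch of what such a comment-based annotation could look like in
straddling code. The alias name is an assumption: the thread floats `unicode`,
`text` and six.text_type, while this example uses `typing.Text`, the alias the
typing module provides for the text type (str on Python 3). The function
`shout` is made up for the illustration.

```python
from typing import Text


def shout(message):
    # type: (Text) -> Text
    # The annotation lives in a comment, so the interpreter never has to
    # evaluate it; only the type checker needs to know what Text means.
    return message.upper() + u"!"


print(shout(u"hello"))
```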

From skrah.temporarily at  Sat Jan 23 15:03:35 2016
From: skrah.temporarily at (Stefan Krah)
Date: Sat, 23 Jan 2016 20:03:35 +0000 (UTC)
Subject: [Python-ideas] Explicit variable capture list
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Guido van Rossum <guido at ...> writes:
>> I've never liked the use of "late binding" in this context. The
>> behavior is totally standard for closures that use mutable values.
> I wonder if the problem isn't that "binding" is a term imported from a
> different language philosophy, and the idea there is just fundamentally
> different from Python's philosophy about variables.

I think my point is that even if "late binding" is the best term
for Python's symbol resolution scheme, it may not be optimal to
use it as an explanation for this particular closure behavior, since
all languages with mutable closures behave in the same manner (and
most of them would be classified as "early binding" languages).

Stefan Krah

From skrah.temporarily at  Sat Jan 23 16:12:26 2016
From: skrah.temporarily at (Stefan Krah)
Date: Sat, 23 Jan 2016 21:12:26 +0000 (UTC)
Subject: [Python-ideas] PEP 484 change proposal: Allowing <at> overload
 outside stub files
References: <>
Message-ID: <>

Nick Coghlan <ncoghlan at ...> writes:
> You're already going to have to allow this for single lines to handle
> Py2 compatible annotations, so it seems reasonable to also extend it
> to handle overloading while you're still figuring out a native syntax
> for that.

I find that looks quite
nice:

>>> from multipledispatch import dispatch
>>> @dispatch(int, int)
... def add(x, y):
...     return x + y
...
>>> @dispatch(float, float)
... def add(x, y):
...     return x + y
...
>>> add(1, 2)
3
>>> add(1.0, 2.0)
3.0
>>> add(1.0, 2)
Traceback (most recent call last):
  File
[cut because is inflexible]
line 155, in __call__
    func = self._cache[types]
KeyError: (<class 'float'>, <class 'int'>)

Stefan Krah

From greg.ewing at  Sat Jan 23 16:16:11 2016
From: greg.ewing at (Greg Ewing)
Date: Sun, 24 Jan 2016 10:16:11 +1300
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Guido van Rossum wrote:
> So, I don't really want to introduce "for new x in ..." because it 
> suddenly introduces a completely different concept into the language,
> What 
> dict hold x in "for new x ..."? It would have to be considered a new 
> dict created just to hold x, but other variables assigned in the body of 
> the for loop would still be in the dict holding all the other locals of 
> the function.

We could say that the body of a "for new" loop is a nested
scope in which all other referenced variables are implicitly
declared "nonlocal".


From ncoghlan at  Sat Jan 23 21:18:18 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 24 Jan 2016 12:18:18 +1000
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
 outside stub files
In-Reply-To: <>
References: <>
Message-ID: <>

On 24 January 2016 at 04:13, Jukka Lehtosalo <jlehtosalo at> wrote:
> On Sat, Jan 23, 2016 at 6:17 AM, Nick Coghlan <ncoghlan at> wrote:
>> If you were to instead go with a Python 2 compatible comment based
>> inline solution for now, you'd then get to design the future official
>> spelling for multi-dispatch annotations based on your experience with
>> both that and with the decorator+annotations approach used in stub
>> files.
> Your proposed comment based solution looks nicer in Python 2 code than
> @overload. I'd prefer optimizing any syntax we choose for Python 3 as that's
> where the future is. I'd rather not be forced to use comment-based
> signatures in Python 3 only code.

For the benefit of folks reading this thread, but not the linked
issue: Guido pointed out some cases with variable signatures (e.g.
annotating a range-style API)  & keyword args where the stacked
comments idea doesn't work, so I switched to being +0 on the
"@overload in .py files" interim solution.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Jan 23 21:45:05 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 24 Jan 2016 12:45:05 +1000
Subject: [Python-ideas] Multiple dispatch (was Re: PEP 484 change proposal:
 Allowing <at> overload outside stub files
Message-ID: <>

On 24 January 2016 at 07:12, Stefan Krah <skrah.temporarily at> wrote:
> Nick Coghlan <ncoghlan at ...> writes:
>> You're already going to have to allow this for single lines to handle
>> Py2 compatible annotations, so it seems reasonable to also extend it
>> to handle overloading while you're still figuring out a native syntax
>> for that.
> I find that looks quite
> nice:
> >>> from multipledispatch import dispatch
> >>> @dispatch(int, int)
> ... def add(x, y):
> ...     return x + y
> ...
> >>> @dispatch(float, float)
> ... def add(x, y):
> ...     return x + y
> ...
> >>> add(1, 2)
> 3
> >>> add(1.0, 2.0)
> 3.0
> >>> add(1.0, 2)
> Traceback (most recent call last):
>   File
> [cut because is inflexible]
> line 155, in __call__
>     func = self._cache[types]
> KeyError: (<class 'float'>, <class 'int'>)

Right, the Blaze folks have been doing some very nice work in that
area. One of the projects building on multipledispatch is the Odo
network of data conversion operations:

They do make the somewhat controversial design decision to make
dispatch operations process global by default [1], rather than scoping
by module. On the other hand, the design also makes it easy to define
your own dispatch namespace, so the default orthogonality with the
module system likely isn't a problem in practice, and the lack of
essential boilerplate does make it very easy to use in contexts like
an IPython notebook.

There is one aspect that still requires runtime stack introspection
[2], and that's getting access to the class scope in order to
implicitly make method dispatch specific to the class defining the
methods. It's the kind of thing that makes me wonder whether we should
be exposing a thread-local variable somewhere with a "class namespace
stack" that made it possible to:

- tell that you're currently running in the context of a class definition
- readily get access to the namespace of the innermost class currently
being defined



Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From ncoghlan at  Sat Jan 23 22:22:30 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 24 Jan 2016 13:22:30 +1000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On 24 January 2016 at 07:16, Greg Ewing <greg.ewing at> wrote:
> Guido van Rossum wrote:
>> So, I don't really want to introduce "for new x in ..." because it
>> suddenly introduces a completely different concept into the language,
>> What dict hold x in "for new x ..."? It would have to be considered a new
>> dict created just to hold x, but other variables assigned in the body of the
>> for loop would still be in the dict holding all the other locals of the
>> function.
> We could say that the body of a "for new" loop is a nested
> scope in which all other referenced variables are implicitly
> declared "nonlocal".

This actually ties into an idea your suggestion prompted: it would
likely suffice if we had a way to request "create a new scope per
iteration" behaviour in for loops and comprehensions, with no implicit
nonlocal behaviour at all.

Consider Guido's spelled out list comprehension equivalent:

    powers = []
    for i in range(10):
        def f(x):
            return x**i
        powers.append(f)

There's no rebinding of values in the current scope there - only
mutation of a list. Container comprehensions and generator expressions
have the same characteristic - no name rebinding occurs in the loop
body, so the default handling of rebinding of names other than the
iteration variables doesn't matter.

Accordingly, a statement like:

    powers = []
    for new i in range(10):
        def f(x):
            return x**i
        powers.append(f)

Could be semantically equivalent to:

    powers = []
    for i in range(10):
        def _for_loop_suite(i=i):
            def f(x):
                return x**i
            powers.append(f)
        _for_loop_suite()
        del _for_loop_suite

Capturing additional values on each iteration would be possible with a
generator expression:

    for new i, a, b, c in ((i, a, b, c) for i in range(10)):
        def f(x):
            return x**i, a, b, c

While nonlocal and global declarations would work the same way they do
in any other nested function.
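
A runnable sketch of that emulation in today's Python, showing both halves of
the claim: each iteration gets its own `i`, and rebinding a name from the
enclosing function inside the loop body needs an explicit nonlocal. The names
`collect` and `total` are made up for this example.

```python
def collect():
    total = 0
    powers = []
    for i in range(4):
        def _for_loop_suite(i=i):
            nonlocal total        # required to rebind total from in here
            total += i

            def f(x):
                return x**i       # closes over this call's own i
            powers.append(f)      # mutation only, so no declaration needed
        _for_loop_suite()
    return total, powers


total, powers = collect()
values = [f(2) for f in powers]
print(total, values)
```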

For a practical example of this, consider the ThreadPoolExecutor
example from the concurrent.futures docs:

A scope-per-iteration construct makes it much easier to use a closure
to define the operation submitted to the executor for each URL:

    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # Start the load operations and mark each future with its URL
        future_to_site = {}
        for new site_url in sites_to_load:
            def load_site():
                with urllib.request.urlopen(site_url, timeout=60) as conn:
                    return conn.read()
            future_to_site[executor.submit(load_site)] = site_url
        # Report results as they become available
        for future in concurrent.futures.as_completed(future_to_site):
            site_url = future_to_site[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%r generated an exception: %s' % (site_url, exc))
            else:
                print('%r page is %d bytes' % (site_url, len(data)))

If you try to write that code that way today (i.e. without the "new"
on the first for loop), you'll end up with a race condition between
the main thread changing the value of "site_url" and the executor
issuing the URL open request.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From guido at  Sun Jan 24 00:16:57 2016
From: guido at (Guido van Rossum)
Date: Sat, 23 Jan 2016 21:16:57 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Sat, Jan 23, 2016 at 7:22 PM, Nick Coghlan <ncoghlan at> wrote:
> [...]
> For a practical example of this, consider the ThreadPoolExecutor
> example from the concurrent.futures docs:
> A scope-per-iteration construct makes it much easier to use a closure
> to define the operation submitted to the executor for each URL:
>     with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
>         # Start the load operations and mark each future with its URL
>         future_to_site = {}
>         for new site_url in sites_to_load:
>             def load_site():
>                 with urllib.request.urlopen(site_url, timeout=60) as conn:
>                     return conn.read()
>             future_to_site[executor.submit(load_site)] = site_url
>         # Report results as they become available
>         for future in concurrent.futures.as_completed(future_to_site):
>             site_url = future_to_site[future]
>             try:
>                 data = future.result()
>             except Exception as exc:
>                 print('%r generated an exception: %s' % (site_url, exc))
>             else:
>                 print('%r page is %d bytes' % (site_url, len(data)))
> If you try to write that code that way today (i.e. without the "new"
> on the first for loop), you'll end up with a race condition between
> the main thread changing the value of "site_url" and the executor
> issuing the URL open request.

I wonder if kids today aren't too much in love with local function
definitions. :-) There's a reason why executor.submit() takes a
function *and arguments*. If you move the function out of the for loop
and pass the url as a parameter to submit(), problem solved, and you
waste fewer resources on function objects and cells to hold nonlocals.
A generation ago most people would have naturally used such a solution
(since most languages didn't support the alternative :-).
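
A sketch of that alternative. To keep it runnable without network access,
`load_site` below is a stand-in that fabricates a payload instead of calling
urllib.request.urlopen(), and the URLs are placeholders; both are assumptions
of this example, not part of the docs recipe.

```python
import concurrent.futures


def load_site(site_url):
    # Stand-in for urllib.request.urlopen(site_url, timeout=60).read()
    return ("payload for " + site_url).encode("utf-8")


sites_to_load = ["http://a.example/", "http://b.example/", "http://c.example/"]

results = {}
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # The URL is passed to submit() as an argument, so each task gets its
    # own value and there is no closure over the loop variable at all.
    future_to_site = {
        executor.submit(load_site, site_url): site_url
        for site_url in sites_to_load
    }
    for future in concurrent.futures.as_completed(future_to_site):
        site_url = future_to_site[future]
        results[site_url] = future.result()

for site_url in sites_to_load:
    print('%r page is %d bytes' % (site_url, len(results[site_url])))
```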

--Guido van Rossum (

From guido at  Sun Jan 24 00:37:17 2016
From: guido at (Guido van Rossum)
Date: Sat, 23 Jan 2016 21:37:17 -0800
Subject: [Python-ideas] Multiple dispatch (was Re: PEP 484 change
 proposal: Allowing <at> overload outside stub files
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 23, 2016 at 6:45 PM, Nick Coghlan <ncoghlan at> wrote:
> [...]
> There is one aspect that still requires runtime stack introspection
> [2], and that's getting access to the class scope in order to
> implicitly make method dispatch specific to the class defining the
> methods. It's the kind of thing that makes me wonder whether we should
> be exposing a thread-local variable somewhere with a "class namespace
> stack" that made it possible to:
> - tell that you're currently running in the context of a class definition
> - readily get access to the namespace of the innermost class currently
> being defined

I wonder if it wouldn't be acceptable to have a metaclass that takes
care of the dispatch registry. You'd have a metaclass whose
__prepare__ method produces a special kind of namespace that
collaborates with a @dispatch() decorator. In this design, @dispatch()
would not do the registration, it would just store its parameters on a
function attribute and mark the function (or return some other object
representing the dispatch parameters and the function). When the
namespace receives a __setitem__() call with such an object, it
registers it and if needed merges it with the object already there.

Admittedly, calling inspect.currentframe() and assuming it never
returns None is probably less code. (Hm, maybe sys._getframe() could
be guaranteed to work inside a class scope?)
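
A minimal sketch of the metaclass design described above, under several
assumptions: every name here (`Overloaded`, `dispatch`, `DispatchNamespace`,
`DispatchMeta`, `Adder`) is hypothetical, matching is on exact argument types
only, and none of this reflects how the multipledispatch package is actually
implemented.

```python
class Overloaded:
    """Collects the implementations registered under one method name."""

    def __init__(self, name):
        self.name = name
        self.registry = {}

    def add(self, types, func):
        self.registry[types] = func

    def __get__(self, instance, owner=None):
        if instance is None:
            return self

        def bound(*args):
            # Exact-type lookup; raises KeyError if nothing matches.
            func = self.registry[tuple(type(arg) for arg in args)]
            return func(instance, *args)
        return bound


def dispatch(*types):
    """Only marks the function; registration is left to the namespace."""
    def mark(func):
        func._dispatch_types = types
        return func
    return mark


class DispatchNamespace(dict):
    def __setitem__(self, key, value):
        types = getattr(value, "_dispatch_types", None)
        if types is not None:
            existing = self.get(key)
            if not isinstance(existing, Overloaded):
                existing = Overloaded(key)
            existing.add(types, value)   # merge with what is already there
            value = existing
        super().__setitem__(key, value)


class DispatchMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return DispatchNamespace()

    def __new__(mcls, name, bases, namespace, **kwargs):
        return super().__new__(mcls, name, bases, dict(namespace), **kwargs)


class Adder(metaclass=DispatchMeta):
    @dispatch(int, int)
    def add(self, x, y):
        return x + y

    @dispatch(str, str)
    def add(self, x, y):
        return x + " " + y


adder = Adder()
print(adder.add(1, 2), adder.add("a", "b"))
```

No frame introspection is involved: the class body's assignments go through
the namespace object returned by __prepare__, which is where the second
definition of `add` gets merged with the first instead of replacing it.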

> [1]
> [2]

--Guido van Rossum (

From stephen at  Sun Jan 24 01:27:52 2016
From: stephen at (Stephen J. Turnbull)
Date: Sun, 24 Jan 2016 15:27:52 +0900
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>


Guido,

Thank you for taking the trouble to address my rather confused post.

Guido van Rossum writes:

 > If I were to deconstruct the original statement, I would start by
 > replacing the list comprehension with a plain old for loop.

I did that.  But that actually doesn't bother me because the loop
index's identifier doesn't go out of scope.  I now see why that's a
red herring, but maybe documentation can be improved.

Anyway, I wrote that post before seeing your explanation that things
just aren't that difficult, they all follow from "variable reference
as dictionary lookup".  The clue I needed was the way to view a scope
as an object, and then realize that all free variable references are
the same, except for visibility of the relevant scope to the other
code at the call site.

For me it's now a documentation issue (I know why the comprehension of
lambdas works as it does, and I also know how to get the "expected",
more useful result).  I'll go take a look at the language reference,
and tutorial, and see if I think they can be improved.
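
A small sketch of the "variable reference as lookup" view, as I now understand it (the variable names are mine): the lambdas store no value, they look the name up in its scope each time they are called, whether the loop variable is still visible or not.

```python
# Plain for loop: the name i stays visible after the loop.
funcs = []
for i in range(3):
    funcs.append(lambda: i)
after_loop = [f() for f in funcs]

# Rebinding i afterwards changes what every lambda returns,
# because each call looks i up again.
i = 10
after_rebind = [f() for f in funcs]

# Comprehension: j is not visible out here, but the lambdas still
# share the one j in the comprehension's scope, left at its last value.
comp = [lambda: j for j in range(3)]
from_comprehension = [f() for f in comp]
```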

From mertz at  Sun Jan 24 01:45:24 2016
From: mertz at (David Mertz)
Date: Sat, 23 Jan 2016 22:45:24 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Sat, Jan 23, 2016 at 8:54 AM, Guido van Rossum <guido at> wrote:

> Also, once again the semantics of lambda (specifically, that unlike
> def it doesn't create a scope)
> Uh, what? I can sort of guess what you are referring to here (namely, that
> no syntactic construct permissible in a lambda can assign to a local
> variable -- or any variable, for that matter).

That's not even quite true, you can assign to global variables in a lambda:

>>> myglobal = 1
>>> f = lambda: globals().__setitem__('myglobal', 2) or 42
>>> f()
42
>>> myglobal
2

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From guido at  Sun Jan 24 01:53:56 2016
From: guido at (Guido van Rossum)
Date: Sat, 23 Jan 2016 22:53:56 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Sat, Jan 23, 2016 at 10:27 PM, Stephen J. Turnbull
<stephen at> wrote:
> Guido,
> Thank you for taking the trouble to address my rather confused post.

You're welcome. And thanks for taking it as constructive criticism.

> Guido van Rossum writes:
>  > If I were to deconstruct the original statement, I would start by
>  > replacing the list comprehension with a plain old for loop.
> I did that.  But that actually doesn't bother me because the loop
> index's identifier doesn't go out of scope.  I now see why that's a
> red herring, but maybe documentation can be improved.
> Anyway, I wrote that post before seeing your explanation that things
> just aren't that difficult, they all follow from "variable reference
> as dictionary lookup".  The clue I needed was the way to view a scope
> as an object, and then realize that all free variable references are
> the same, except for visibility of the relevant scope to the other
> code at the call site.
> For me it's now a documentation issue (I know why the comprehension of
> lambdas work as they do, and I also know how to get the "expected",
> more useful result).  I'll go take a look at the language reference,
> and tutorial, and see if I think they can be improved.

I expect that the tutorial just needs some touch-up or an extra
section on these issues. But the language reference... Well, it's a
mess, it is often confusing and not all that exact. I should take a
year off to rewrite it from scratch (what a book that would be!), but
I don't have the kind of discipline to finish long writing projects.

--Guido van Rossum (

From julien at  Sun Jan 24 05:35:59 2016
From: julien at (Julien Palard)
Date: Sun, 24 Jan 2016 11:35:59 +0100
Subject: [Python-ideas] Cross link documentation translations
Message-ID: <>


While translating the Python documentation into French [1][2], I
discovered that we're not the only country doing it: there are also Japan
[3][4] and Spain [5]. There may be others, but I didn't find
them (and that's the problem).

But there are only a few ways for users to find the translations (hearing
about them, or explicitly searching for them on a search engine, which
they won't do, as they obviously expect a link from the English version
if a translation exists).

So here is my idea: why not link to translations from the main
documentation?

I know that's not directly supported by Sphinx [6], but separate
Sphinx builds, blindly linking to each other (with hardcoded links), may
work (as ReadTheDocs is probably doing). The downside of those links is
that we'll sometimes link to untranslated parts, but those parts may be
marked as untranslated [7] to encourage new translators to help.

Thoughts?


Julien Palard

From ncoghlan at  Sun Jan 24 07:54:53 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 24 Jan 2016 22:54:53 +1000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On 24 January 2016 at 15:16, Guido van Rossum <guido at> wrote:
> I wonder if kids today aren't too much in love with local function
> definitions. :-) There's a reason why executor.submit() takes a
> function *and arguments*. If you move the function out of the for loop
> and pass the url as a parameter to submit(), problem solved, and you
> waste fewer resources on function objects and cells to hold nonlocals.

Aye, that's how the current example code in the docs handles it -
there's an up front definition of the page loading function, and then
the submission to the executor is with a dict comprehension.
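
Roughly this shape, as a sketch rather than the docs example itself: describe() and the urls list are stand-ins I made up so that nothing touches the network.

```python
from concurrent.futures import ThreadPoolExecutor


def describe(url, timeout):
    # Stand-in for the page loading function; defined once, up front.
    return "{} (timeout={})".format(url, timeout)


urls = ["a.example", "b.example", "c.example"]  # hypothetical inputs

with ThreadPoolExecutor(max_workers=2) as executor:
    # The url is passed as an argument to submit(), so no closure over
    # the iteration variable is involved.
    future_to_url = {executor.submit(describe, url, 60): url for url in urls}
    results = {url: future.result() for future, url in future_to_url.items()}
```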

The only thing "wrong" with it is that when reading the code, the
potentially single-use function is introduced first without any
context, and it's only later that you get to see what it's for.

> A generation ago most people would have naturally used such a solution
> (since most languages didn't support the alternative :-).

In programming we would have, but I don't think the same is true when
writing work instructions for other people to follow - for those,
we're more likely to use nested bullets to describe subtasks, and only
pull them out to a separate document or section if we need to
reference the same subtask from multiple places.

While my view is admittedly only based on intuition rather than hard
data, it seems to me that when folks are reaching for nested
functions, it's that "subtask as a nested bulleted list" idiom they're
aiming to express, and Python is otherwise so accommodating of English
structural idioms that it's jarring when it doesn't work properly. (I
also suspect that's why it's a question we keep returning to - as a
*programming language*, making closures play more nicely with
iteration variables doesn't add any real power to Python, but as
*executable pseudo-code*, it makes it a little bit easier to express
certain ideas in the same way we'd describe them to another person).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From tritium-list at  Sun Jan 24 09:51:57 2016
From: tritium-list at (Alexander Walters)
Date: Sun, 24 Jan 2016 09:51:57 -0500
Subject: [Python-ideas] Cross link documentation translations
In-Reply-To: <>
References: <>
Message-ID: <>

I am -1 for linking the official documentation to anything not on (or to things like the source listings link on the 
library docs, which is controlled by the same party as controls the 
documentation).  It irks me every time I see a link to activestate 
recipes in the official docs.  Activestate has already taken down 
documentation they have previously hosted, and the recipes will only 
exist as long as it is advantageous to them to continue hosting them*.  
The same can be said of translation projects that don't exist as special 
interest groups under the PSF.

But I do like the idea of linking to translations.  Would it not be a 
better solution to try and unify the translation efforts into one system 
as SIGs of the Doc-SIG?

Besides, linking only to documentation you generate (as hosting the docs 
as a sig under the doc-sig would allow) would make the technical 
implementation much easier.

Barring that, it's a better-than-nothing idea.

* not to mention the questionable quality of the recipes.

On 1/24/2016 05:35, Julien Palard wrote:
> o/
> While translating the Python documentation into French [1][2], I
> discovered that we're not the only country doing it: there are also
> Japan [3][4] and Spain [5]. There may be others, but I didn't
> find them (and that's the problem).
> But there are only a few ways for users to find the translations (hearing
> about them, or explicitly searching for them on a search engine, which
> they won't do, as they obviously expect a link from the English version
> if a translation exists).
> So here is my idea: why not link to translations from the main
> documentation?
> I know that's not directly supported by Sphinx [6], but separate
> Sphinx builds, blindly linking to each other (with hardcoded links), may
> work (as ReadTheDocs is probably doing). The downside of those links
> is that we'll sometimes link to untranslated parts, but those parts may
> be marked as untranslated [7] to encourage new translators to help.
> Thoughts?
> [1]
> [2]
> [3]
> [4]
> [5]
> [6]
> [7]

From wes.turner at  Sun Jan 24 14:04:12 2016
From: wes.turner at (Wes Turner)
Date: Sun, 24 Jan 2016 13:04:12 -0600
Subject: [Python-ideas] Cross link documentation translations
In-Reply-To: <>
References: <>
Message-ID: <>

ReadTheDocs supports hosting projects with multiple translations:

| Docs:

- [ ] There could be a dedicated Python Infrastructure ReadtheDocs Docker

ReadTheDocs CPython Docs

* | Docs:
* | Project:

   * [ ] All past revisions
   * [ ] All translations

On Jan 24, 2016 4:41 AM, "Julien Palard" <julien at> wrote:
> o/
> While translating the Python documentation into French [1][2], I
> discovered that we're not the only country doing it: there are also
> Japan [3][4] and Spain [5]. There may be others, but I didn't
> find them (and that's the problem).
> But there are only a few ways for users to find the translations (hearing
> about them, or explicitly searching for them on a search engine, which
> they won't do, as they obviously expect a link from the English version
> if a translation exists).
> So here is my idea: why not link to translations from the main
> documentation?
> I know that's not directly supported by Sphinx [6], but separate
> Sphinx builds, blindly linking to each other (with hardcoded links), may
> work (as ReadTheDocs is probably doing). The downside of those links
> is that we'll sometimes link to untranslated parts, but those parts may
> be marked as untranslated [7] to encourage new translators to help.
> Thoughts?
> [1]
> [2]
> [3]
> [4]
> [5]
> [6]
> [7]
> --
> Julien Palard

From brett at  Sun Jan 24 14:57:30 2016
From: brett at (Brett Cannon)
Date: Sun, 24 Jan 2016 19:57:30 +0000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Sun, Jan 24, 2016, 04:55 Nick Coghlan <ncoghlan at> wrote:

> On 24 January 2016 at 15:16, Guido van Rossum <guido at> wrote:
> > I wonder if kids today aren't too much in love with local function
> > definitions. :-) There's a reason why executor.submit() takes a
> > function *and arguments*. If you move the function out of the for loop
> > and pass the url as a parameter to submit(), problem solved, and you
> > waste fewer resources on function objects and cells to hold nonlocals.
> Aye, that's how the current example code in the docs handles it -
> there's an up front definition of the page loading function, and then
> the submission to the executor is with a dict comprehension.
> The only thing "wrong" with it is that when reading the code, the
> potentially single-use function is introduced first without any
> context, and it's only later that you get to see what it's for.

So the docs just need an added comment to help explain it. Want to file an
issue for that?

> > A generation ago most people would have naturally used such a solution
> > (since most languages didn't support the alternative :-).
> In programming we would have, but I don't think the same is true when
> writing work instructions for other people to follow - for those,
> we're more likely to use nested bullets to describe subtasks, and only
> pull them out to a separate document or section if we need to
> reference the same subtask from multiple places.
> While my view is admittedly only based on intuition rather than hard
> data, it seems to me that when folks are reaching for nested
> functions, it's that "subtask as a nested bulleted list" idiom they're
> aiming to express, and Python is otherwise so accommodating of English
> structural idioms that it's jarring when it doesn't work properly. (I
> also suspect that's why it's a question we keep returning to - as a
> *programming language*, making closures play more nicely with
> iteration variables doesn't add any real power to Python, but as
> *executable pseudo-code*, it makes it a little bit easier to express
> certain ideas in the same way we'd describe them to another person).

I personally like the semantics we currently have. I get why people bring
this up, but I'm voting for the programming language side over the
pseudo-code angle.


> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From brett at  Sun Jan 24 15:17:27 2016
From: brett at (Brett Cannon)
Date: Sun, 24 Jan 2016 20:17:27 +0000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Sat, Jan 23, 2016, 22:55 Guido van Rossum <guido at> wrote:

> On Sat, Jan 23, 2016 at 10:27 PM, Stephen J. Turnbull
> <stephen at> wrote:
> > Guido,
> >
> > Thank you for taking the trouble to address my rather confused post.
> You're welcome. And thanks for taking it as constructive criticism.
> > Guido van Rossum writes:
> >
> >  > If I were to deconstruct the original statement, I would start by
> >  > replacing the list comprehension with a plain old for loop.
> >
> > I did that.  But that actually doesn't bother me because the loop
> > index's identifier doesn't go out of scope.  I now see why that's a
> > red herring, but maybe documentation can be improved.
> >
> > Anyway, I wrote that post before seeing your explanation that things
> > just aren't that difficult, they all follow from "variable reference
> > as dictionary lookup".  The clue I needed was the way to view a scope
> > as an object, and then realize that all free variable references are
> > the same, except for visibility of the relevant scope to the other
> > code at the call site.
> >
> > For me it's now a documentation issue (I know why the comprehension of
> > lambdas work as they do, and I also know how to get the "expected",
> > more useful result).  I'll go take a look at the language reference,
> > and tutorial, and see if I think they can be improved.
> I expect that the tutorial just needs some touch-up or an extra
> section on these issues. But the language reference... Well, it's a
> mess, it is often confusing and not all that exact. I should take a
> year off to rewrite it from scratch (what a book that would be!), but
> I don't have the kind of discipline to finish long writing projects.
> :-(

Would doing something like the Ruby community where we write a spec using a
BDD-style so it's more a set of tests than verbiage be easier?


> --
> --Guido van Rossum (

From greg.ewing at  Sun Jan 24 15:42:52 2016
From: greg.ewing at (Greg Ewing)
Date: Mon, 25 Jan 2016 09:42:52 +1300
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Nick Coghlan wrote:
> Capturing additional values on each iteration would be possible with a
> generator expression:
>     for new i, a, b, c in (i, a, b, c for i range(10)):
>         def f(x):
>             return x**i, a, b, c

I'm not sure I see the point of this. If you're needing
to capture a, b and c from an outer scope, presumably
it's because there's some outer loop that's changing
them -- in which case you can just make *that* loop
a "new" loop as well.

BTW, should there be a "while new" loop too?


From cs at  Sun Jan 24 16:21:04 2016
From: cs at (Cameron Simpson)
Date: Mon, 25 Jan 2016 08:21:04 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <>
Message-ID: <>

On 21Jan2016 11:52, Steven D'Aprano <steve at> wrote:
>So a full function declaration looks like:
>(Bike-shedders: do you prefer () [] or {} for the list of captures?)

Just to this: I prefer () - this is very much like a special parameter list. []
and {} shout list and dict to me.

Cameron Simpson <cs at>

From python at  Sun Jan 24 16:22:22 2016
From: python at (Erik)
Date: Sun, 24 Jan 2016 21:22:22 +0000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On 24/01/16 20:42, Greg Ewing wrote:
> BTW, should there be a "while new" loop too?

And a "with foo() as new i:" ... and what about "func(new bar)"?

Removing tongue from cheek now ;)


From julien at  Sun Jan 24 17:16:49 2016
From: julien at (Julien Palard)
Date: Sun, 24 Jan 2016 23:16:49 +0100
Subject: [Python-ideas] Cross link documentation translations
In-Reply-To: <>
References: <> <>
Message-ID: <>

On 01/24/2016 03:51 PM, Alexander Walters wrote:
> I am -1 for linking the official documentation to anything not on 
My principal goal is not to cross-link outside of, but to 
cross-link efforts, and provide users a way to find the translations.

Being hosted on is probably the neatest way to do it: So 
I can only agree.

> But I do like the idea of linking to translations.  Would it not be a 
> better solution to try and unify the translation efforts into one 
> system as SIGs of the Doc-SIG?
I'm not very familiar with SIGs and their inner workings (the Doc-SIG mail
archive has not been updated since 2013), and cross-language unification
is probably not a good idea at first (each team has its own
organization, leaders, etc.), but why not? I'm open to any ideas.

> Besides, linking only to documentation you generate (as hosting the 
> docs as a sig under the doc-sig would allow) would make the technical 
> implementation much easier.

Actually, we generate the whole documentation, even the untranslated parts
(or rather, Sphinx does). Also, not knowing the inner workings of
special interest groups, I completely miss the "hosting the docs as a
sig under the doc-sig" part: do SIGs have hosting?

Julien Palard

From stephen at  Sun Jan 24 21:08:32 2016
From: stephen at (Stephen J. Turnbull)
Date: Mon, 25 Jan 2016 11:08:32 +0900
Subject: [Python-ideas] Cross link documentation translations
In-Reply-To: <>
References: <> <>
Message-ID: <>

Julien Palard writes:

 > I'm not very familiar with SIGs and their inner workings (the Doc-SIG
 > mail archive has not been updated since 2013), and cross-language
 > unification is probably not a good idea at first (each team has
 > its own organization, leaders, etc.), but why not? I'm open to
 > any ideas.

As you're probably aware, Debian[1] has #(languages) + 2 teams
translating the Debian-specific parts of packages.  One of the special
teams works on internationalizing the packaging software (mostly but
not entirely done, even after more than a decade), and another
provides infrastructure for accepting and distributing translations.
As with Debian, I don't think there will be unification of
organizations, and it certainly isn't needed.  On the other hand,
*somebody* will need to construct the web page and repository
structure and linkage, and there will be an on-going need for
integrating new versions.  The debian-i18n mailing list is also useful
for propagating best practices.

Quality of translation is an issue that Debian doesn't much have to
deal with (because the Debian teams are working on the same task --
installation and configuration -- they quickly develop idioms for
repetitive queries), but for manuals the issue is important.  IMHO, it
would be ideal if the integrators included a team of editor/reviewers
independent of the various language teams.  At least for Japanese,
translations (both of generic English and specifically English
software manuals) are often mechanical and not very educational, and
occasionally actively misleading.  Of course this is pie-in-the-sky;
surely people with such skills and the interest are likely to be
participating in the teams.  But I think it's a good idea to keep
quality of translation in mind, and possibly impose some sort of
formal review process or two-approvals requirement.

[1]  The example I'm familiar with, I suppose many other projects have
similar setups.

From ncoghlan at  Sun Jan 24 21:31:38 2016
From: ncoghlan at (Nick Coghlan)
Date: Mon, 25 Jan 2016 12:31:38 +1000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On 25 January 2016 at 05:57, Brett Cannon <brett at> wrote:
> On Sun, Jan 24, 2016, 04:55 Nick Coghlan <ncoghlan at> wrote:
>> On 24 January 2016 at 15:16, Guido van Rossum <guido at> wrote:
>> > I wonder if kids today aren't too much in love with local function
>> > definitions. :-) There's a reason why executor.submit() takes a
>> > function *and arguments*. If you move the function out of the for loop
>> > and pass the url as a parameter to submit(), problem solved, and you
>> > waste fewer resources on function objects and cells to hold nonlocals.
>> Aye, that's how the current example code in the docs handles it -
>> there's an up front definition of the page loading function, and then
>> the submission to the executor is with a dict comprehension.
>> The only thing "wrong" with it is that when reading the code, the
>> potentially single-use function is introduced first without any
>> context, and it's only later that you get to see what it's for.
> So the docs just need an added comment to help explain it. Want to file an
> issue for that?

There's nothing to comment on given the Python semantics we have today
- what's there is a sensible way to write that code, and the design
FAQ covers why the inline closure approach wouldn't work.

As noted, I suspect the only reason the topic keeps coming up is the
niggling sense that the closure based approach "should" work, and the
fact that it doesn't is a case where underlying technical details that
we generally aim to let people gloss over make themselves apparent.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From guido at  Sun Jan 24 23:40:26 2016
From: guido at (Guido van Rossum)
Date: Sun, 24 Jan 2016 20:40:26 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Sun, Jan 24, 2016 at 12:17 PM, Brett Cannon <brett at> wrote:
> Would doing something like the Ruby community where we write a spec using a
> BDD-style so it's more a set of tests than verbiage be easier?

I haven't seen that, but if it's anything like the typical way of
writing unit tests in Ruby, please no.

--Guido van Rossum (

From tjreedy at  Mon Jan 25 01:32:17 2016
From: tjreedy at (Terry Reedy)
Date: Mon, 25 Jan 2016 01:32:17 -0500
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <n84fhk$kt9$>

On 1/24/2016 7:54 AM, Nick Coghlan wrote:
> On 24 January 2016 at 15:16, Guido van Rossum <guido at> wrote:
>> I wonder if kids today aren't too much in love with local function
>> definitions. :-) There's a reason why executor.submit() takes a
>> function *and arguments*. If you move the function out of the for loop

What I've concluded from this thread is that function definitions (with 
direct use 'def' or 'lambda') do not fit well within loops, though I 
used them there myself.

When delayed function calls are needed, what belongs within loops is
packaging of a pre-defined function with one or more arguments within a 
callable.  Instance.method is an elegant syntax for doing so. 
functools.partial(func, args, ...) is a much clumsier generalized 
expression, which requires an import.  Note that 'partial' returns a 
function for delayed execution even when a complete, not partial, set of 
arguments is passed.
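
A minimal sketch of both kinds of packaging. The names greet and Greeter are invented for illustration, and types.MethodType is used only to show roughly what the instance.method syntax builds.

```python
import types
from functools import partial


def greet(greeting, name):
    return "{}, {}!".format(greeting, name)


class Greeter:
    def __init__(self, greeting):
        self.greeting = greeting

    def greet(self, name):
        return greet(self.greeting, name)


# A complete set of arguments still yields a callable, not a result;
# nothing runs until each partial object is called.
delayed = [partial(greet, "Hello", name) for name in ("Ann", "Bob")]
results = [call() for call in delayed]

# instance.method packages a pre-defined function with one argument.
hi = Greeter("Hi")
bound = hi.greet
rebuilt = types.MethodType(Greeter.__dict__["greet"], hi)
same = bound("Cy") == rebuilt("Cy") == "Hi, Cy!"
```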

A major attempted (and tempting) use for definitions within a loop is 
multiple callbacks for multiple gui widgets, where delayed execution is 
needed.  The three answers to multiple 'why doesn't this work' on both 
python-list and Stackoverflow are multiple definitions with variant 
'default args', a custom make_function function outside the loop called 
multiple times within the loop, and a direct function outside the loop 
called with partial within the loop.  I am going to start using partial 

Making partial a builtin would make it easier to use and more 
attractive.  Even more attractive would be syntax that abbreviates 
delayed calls with pre-bound arguments in the way that inst.meth 
abbreviates a much more complicated expression roughly equivalent to 
"bind(inst.__getattr__('meth'), inst)".

A possibility would be to make {} a delayed and possibly partial call 
operator, in parallel to the current use of () as an immediate and total
call operator.
would evaluate to a function, whether of type <function> or a special 
class similar to bound methods. The 'arguments' would be anything 
allowed within partial, which I believe is anything allowed in any 
function call.  I chose {} because expr{...} is currently illegal, just 
as expr(arguments) is for anything other than a function call.  On the 
other hand, expr[...] is currently legal, at least up to '[', as is 
expr<...> at least up to '<'.

>> and pass the url as a parameter to submit(), problem solved, and you
>> waste fewer resources on function objects and cells to hold nonlocals.

executor.submit appears to me to be a specialized version of partial, 
with all arguments required.  With the proposal above, I think 
submit(func{all args}) would work.

> Aye, that's how the current example code in the docs handles it -
> there's an up front definition of the page loading function, and then
> the submission to the executor is with a dict comprehension.

I presume you both are referring to the ThreadPoolExecutor Example.  The
load_url function, which I think should be 'get_page', has a comment that
is wrong (it does not 'report the url') and no docstring.  My suggestion:

# Define an example function for the executor.submit call below.
def get_page(url, timeout):
     "Return the page, as a string, retrieved from the url."
     with ...

> The only thing "wrong" with it is that when reading the code, the
> potentially single-use function is introduced first without any
> context, and it's only later that you get to see what it's for.

A proper comment would fix this I think.  That aside, if the main code 
were packaged within def main, as in the following ProcessPoolExecutor 
Example, so as to delay the lookup of 'load_url' or 'get_page', then the 
two function definitions could be in *either* order.  The general
convention in Pythonland seems to be to put main last (bottom up, define 
everything before use), but in a recent python-list thread, at least one 
person, and I think two, said they like to start with def main (top down 
style, which you seem to like).

I just checked and PEP8 seems to be silent on the placement of 'def 
main'.  So unless Guido says otherwise, I would not mind if you revised 
one of the examples to start with def main, just to show that that is a 
legitimate alternative.  It is a feature of Python that one can do this 
without having to add, before the first appearance of a function name 
within a function, a dummy 'forward declaration' giving the function 

>> A generation ago most people would have naturally used such a solution
>> (since most languages didn't support the alternative :-).
> In programming we would have, but I don't think the same is true when
> writing work instructions for other people to follow - for those,
> we're more likely to use nested bullets to describe subtasks, and only
> pull them out to a separate document or section if we need to
> reference the same subtask from multiple places.

People can and do jump around while reading code for understanding. 
They can do this without markers as explicit as needed for machines. 
Current compilers and interpreters initially read code linearly, with 
only one character or token lookahead.  For Python, a def header is 
needed for forward reference, to delay name resolution to call time, 
after the whole file has been read.

> While my view is admittedly only based on intuition rather than hard
> data, it seems to me that when folks are reaching for nested
> functions, it's that "subtask as a nested bulleted list" idiom they're
> aiming to express, and Python is otherwise so accommodating of English
> structural idioms that it's jarring when it doesn't work properly. (I
> also suspect that's why it's a question we keep returning to - as a
> *programming language*, making closures play more nicely with
> iteration variables doesn't add any real power to Python, but as
> *executable pseudo-code*, it makes it a little bit easier to express
> certain ideas in the same way we'd describe them to another person).

I thought about some explicit examples and it is not necessarily clear 
how to translate bullet points to code.  But in general, I do not 
believe that instructions to another person are meant to induce in the 
mind of a listener multiple functions that only differ in a default 
argument object.  In other words, I do not see

for i in it:
   def f(i=i): pass

as corresponding to natural language.  Hence my initial statement above.

Terry Jan Reedy

From marcel at  Mon Jan 25 10:11:56 2016
From: marcel at (Marcel O'Neil)
Date: Mon, 25 Jan 2016 10:11:56 -0500
Subject: [Python-ideas] intput()
Message-ID: <>

def intput():
    return int(input())

Life would be just marginally easier, with a punny function name as a bonus.

From rymg19 at  Mon Jan 25 10:38:39 2016
From: rymg19 at (Ryan Gonzalez)
Date: Mon, 25 Jan 2016 09:38:39 -0600
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

Me: *sees intput*

Huh, there's a typo here. Let me just change it back to input!

*program explodes*

Seriously, it's too easy to mistype to me.

On January 25, 2016 9:11:56 AM CST, Marcel O'Neil <marcel at> wrote:
>def intput():
>    return int(input())
>Life would be just marginally easier, with a punny function name as a
>bonus.

Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From geoffspear at  Mon Jan 25 10:40:44 2016
From: geoffspear at (Geoffrey Spear)
Date: Mon, 25 Jan 2016 10:40:44 -0500
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 25, 2016 at 10:11 AM, Marcel O'Neil <marcel at>

> def intput():
>     return int(input())
> Life would be just marginally easier, with a punny function name as a
> bonus.

Cute, and easy enough to do in your own code. Way too much of a trivial
special case to add to the core language, though, in my opinion.

From rob.cliffe at  Mon Jan 25 10:47:34 2016
From: rob.cliffe at (Rob Cliffe)
Date: Mon, 25 Jan 2016 15:47:34 +0000
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

On 25/01/2016 15:40, Geoffrey Spear wrote:
> On Mon, Jan 25, 2016 at 10:11 AM, Marcel O'Neil 
> <marcel at> wrote:
>     def intput():
>         return int(input())
>     Life would be just marginally easier, with a punny function name
>     as a bonus.
> Cute, and easy enough to do in your own code. Way too much of a 
> trivial special case to add to the core language, though, in my opinion.
+1.  In real life you would probably want validation, a prompt, and the
chance for the user to retry.
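
A minimal sketch of what such a "real life" version might look like. The
retries and read parameters are hypothetical, added here only to keep the
loop bounded and to let the function run without a terminal; nothing like
this was proposed for the core language in this thread.

```python
def intput(prompt="Enter an integer: ", retries=3, read=input):
    """Prompt for an int, asking again on invalid input up to `retries` times.

    `read` defaults to the builtin input(); it is a hypothetical hook so the
    function can be exercised without a terminal.
    """
    for _ in range(retries):
        text = read(prompt)
        try:
            return int(text)
        except ValueError:
            print(f"{text!r} is not a valid integer.")
    raise ValueError(f"no valid integer after {retries} attempts")
```

Calling intput() with no arguments behaves like the original one-liner, plus
a prompt and up to three attempts.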

From ian.g.kelly at  Mon Jan 25 10:56:55 2016
From: ian.g.kelly at (Ian Kelly)
Date: Mon, 25 Jan 2016 08:56:55 -0700
Subject: [Python-ideas] Documenting asyncio methods as returning awaitables
Message-ID: <>

The official asyncio documentation includes this note:

Note: In this documentation, some methods are documented as
coroutines, even if they are plain Python functions returning a
Future. This is intentional to have a freedom of tweaking the
implementation of these functions in the future. If such a function is
needed to be used in a callback-style code, wrap its result with
ensure_future().

Despite the note, this still causes confusion. See for example

As of Python 3.5, "awaitable" is a thing, and as of Python 3.5.1,
ensure_future is supposed to accept any awaitable. Would it be better
then to document these methods as returning awaitables rather than as
coroutines?

From guido at  Mon Jan 25 11:52:49 2016
From: guido at (Guido van Rossum)
Date: Mon, 25 Jan 2016 08:52:49 -0800
Subject: [Python-ideas] Documenting asyncio methods as returning
 awaitables
In-Reply-To: <>
References: <>
Message-ID: <>

I agree there's been considerable confusion. For example, quoting from
the conversation you linked, "While coroutines are the focus of the
library, they're based on futures". That's actually incorrect.

Until PEP 492 (async/await), there were two separate concepts: Future
and coroutine. A Future is an object with certain methods (e.g.
result(), cancel(), add_done_callback()). A coroutine is a
generator object -- it sports no such methods (though it has some of
its own, e.g. send() and throw()). But not every generator object is a
coroutine -- a coroutine is expected to have certain "behavior" that
makes it interact correctly with a scheduler. (Details of this
behavior don't matter for this explanation, but it involves yielding
zero or more Futures. Also, the @asyncio.coroutine decorator must be
used to mark the generator as supporting that "behavior".)

Coroutines are more efficient, because when a coroutine calls and
waits for another coroutine (using yield from, or in 3.5 also await)
no trip to the scheduler is required -- it's all taken care of by the
Python interpreter.

Now, the confusion typically occurs because when you use yield from,
it accepts either a coroutine or a Future. And in most cases you're
not really aware (and you don't care) whether a particular thing
you're waiting on is a coroutine or a Future -- you just want to wait
for it, letting the event loop do other things, until it has a result
for you, and either type supports that.

However sometimes you *do* care about the type -- and that's typically
because you want a Future, so you can call some of its methods like
cancel() or add_done_callback(). The correct way to do this, when
you're not sure whether something is a Future or a coroutine, is to
call ensure_future(). If what you've got is already a Future it will
just return that unchanged; if you've got a coroutine it wraps it in a
Task.

Many asyncio operations take either a Future or a coroutine -- they
all just call ensure_future() on that argument.
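
A small sketch of the ensure_future() behavior described above, written
against a current Python (it assumes asyncio.run() and
asyncio.get_running_loop(), which postdate this thread). The names compute
and main are made up for the illustration.

```python
import asyncio


async def compute():
    """A PEP 492 coroutine function; calling it returns a coroutine object."""
    await asyncio.sleep(0)
    return 42


async def main():
    loop = asyncio.get_running_loop()

    # Already a Future: ensure_future() hands back the very same object.
    fut = loop.create_future()
    fut.set_result("ready")
    same_object = asyncio.ensure_future(fut) is fut

    # A coroutine object: ensure_future() wraps it in a Task (a Future
    # subclass), so Future methods like add_done_callback() are available.
    task = asyncio.ensure_future(compute())
    seen = []
    task.add_done_callback(lambda t: seen.append(t.result()))
    value = await task
    await asyncio.sleep(0)  # give the loop a turn so the callback has run
    return same_object, isinstance(task, asyncio.Task), value, seen


outcome = asyncio.run(main())
```

Either way the caller ends up holding something with the Future interface,
which is the point of calling ensure_future() when the type is unknown.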

So how do things change in Python 3.5 with PEP 492? Not much -- the
same story applies, except there's a third type of object, confusingly
called a coroutine object (as opposed to the coroutine I was talking
about above, which is called a generator object). A coroutine object
is almost the same as a generator object, and supports mostly the same
interface (e.g. send(), throw()).

We can treat generator objects with coroutine "behavior" and proper
(PEP 492) coroutine objects as essentially interchangeable, because
that's how PEP 492 was designed. (Differences come out only when
you're making a mistake, such as trying to iterate over one. Iterating
over a pre-PEP-492 coroutine is invalid, but (because it's implemented
as a generator object) you can still call iter() on it. Calling
iter() on a PEP 492 coroutine object fails with a TypeError.)
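
The difference can be checked directly. In this sketch a plain generator
stands in for the pre-PEP-492 kind of coroutine (an assumption made for
convenience, since the @asyncio.coroutine decorator is gone from recent
Python releases); old_style and new_style are illustrative names.

```python
def old_style():
    """A plain generator, standing in for a pre-PEP-492 coroutine."""
    yield


async def new_style():
    """A PEP 492 coroutine function."""
    return None


gen = old_style()
generator_iterable = iter(gen) is gen  # generator objects are their own iterators

coro = new_style()
try:
    iter(coro)
    coroutine_iterable = True
except TypeError:
    coroutine_iterable = False
finally:
    coro.close()  # close it so Python does not warn that it was never awaited
```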

So what should the docs do?

IMO they should be very clear about the distinction between functions
that return Futures and functions that return coroutines (of either
kind). I think it's fine if they are fuzzy about whether the latter
return a PEP 492 style coroutine (i.e. defined with async def) or a
pre-PEP-492 coroutine (marked with @asyncio.coroutine), since those
are almost entirely interchangeable, and the plan is to eventually
make everything a PEP 492 coroutine.

Finally, what should you do if you have a Future but you need a
coroutine? This has come up a few times but it's probably an
indication that there's something you haven't understood yet. The only
API that requires a coroutine (and rejects a Future) is the Task()
constructor, but you should only call that with a coroutine you
defined yourself -- if it's something you received, you should be
using ensure_future(), which will do the right thing (wrapping a
coroutine in a Task).

Good luck!


On Mon, Jan 25, 2016 at 7:56 AM, Ian Kelly <ian.g.kelly at> wrote:
> The official asyncio documentation includes this note:
> """
> Note: In this documentation, some methods are documented as
> coroutines, even if they are plain Python functions returning a
> Future. This is intentional to have a freedom of tweaking the
> implementation of these functions in the future. If such a function is
> needed to be used in a callback-style code, wrap its result with
> ensure_future().
> """
> Despite the note, this still causes confusion. See for example
> As of Python 3.5, "awaitable" is a thing, and as of Python 3.5.1,
> ensure_future is supposed to accept any awaitable. Would it be better
> then to document these methods as returning awaitables rather than as
> coroutines?

--Guido van Rossum

From guido at  Mon Jan 25 13:52:26 2016
From: guido at (Guido van Rossum)
Date: Mon, 25 Jan 2016 10:52:26 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <n84fhk$kt9$>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Sun, Jan 24, 2016 at 10:32 PM, Terry Reedy <tjreedy at> wrote:
> What I've concluded from this thread is that function definitions (with
> direct use 'def' or 'lambda') do not fit well within loops, though I used
> them there myself.

Right. When you can avoid them, you avoid extra work in an inner loop,
which is often a good idea.

> When delayed function calls are needed, what belongs within loops is
> packaging of a pre-defined function with one or more arguments within a
> callable.  Instance.method is an elegant syntax for doing so.
> functools.partial(func, args, ...) is a much clumsier generalized
> expression, which requires an import.  Note that 'partial' returns a
> function for delayed execution even when a complete, not partial, set of
> arguments is passed.

Right. I've always hated partial() (which is why it's not a builtin)
because usually a lambda is clearer (with partial() it's difficult to
calculate in your head the signature of the thing it returns from the
arguments
passed), but this is one thing where partial() wins, since it captures

> A major attempted (and tempting) use for definitions within a loop is
> multiple callbacks for multiple gui widgets, where delayed execution is
> needed.  The three answers to multiple 'why doesn't this work' on both
> python-list and Stackoverflow are multiple definitions with variant 'default
> args', a custom make_function function outside the loop called multiple
> times within the loop, and a direct function outside the loop called with
> partial within the loop.  I am going to start using partial more.

Yes, the make_function() approach is just a custom partial().
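
For readers who have not hit the problem yet, here is a sketch of the loop
pitfall and the three answers Terry lists. The report function and the
widget names are invented for the example; no real GUI toolkit is involved.

```python
from functools import partial


def report(name):
    """Stand-in for a GUI callback body."""
    return "clicked " + name


names = ["ok", "cancel"]
broken, with_default, with_factory, with_partial = [], [], [], []


def make_function(name):
    """Custom factory defined outside the loop; each call gets its own `name`."""
    def callback():
        return report(name)
    return callback


for name in names:
    # Looks up `name` when called, so every callback sees the final value.
    broken.append(lambda: report(name))
    # Answer 1: capture the current value as a default argument.
    with_default.append(lambda name=name: report(name))
    # Answer 2: call the factory inside the loop.
    with_factory.append(make_function(name))
    # Answer 3: bind the argument with functools.partial.
    with_partial.append(partial(report, name))
```

Calling the functions in broken gives "clicked cancel" twice; each of the
other three lists gives "clicked ok" followed by "clicked cancel".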

> Making partial a builtin would make it easier to use and more attractive.
> Even more attractive would be syntax that abbreviates delayed calls with
> pre-bound arguments in the way that inst.meth abbreviates a much more
> complicated expression roughly equivalent to "bind(inst.__getattr__('meth'),
> inst)".

A recommended best practice / idiom is more useful, because it can be
applied to all Python versions.

> A possibility would be to make {} a delayed and possibly partial call
> operator, in parallel to the current use of () as an immediate and total call
> operator.
>   expr{arguments}
> would evaluate to a function, whether of type <function> or a special class
> similar to bound methods. The 'arguments' would be anything allowed within
> partial, which I believe is anything allowed in any function call.  I chose
> {} because expr{...} is currently illegal, just as expr(arguments) is for
> anything other than a function call.  On the other hand, expr[...] is
> currently legal, at least up to '[', as is expr<...> at least up to '<'.

-1 on expr{...}.

>>> and pass the url as a parameter to submit(), problem solved, and you
>>> waste fewer resources on function objects and cells to hold nonlocals.
> executor.submit appears to me to be a specialized version of partial, with
> all arguments required.  With the proposal above, I think submit(func{all
> args}) would work.

But not before 3.6.
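
A sketch of the "function plus arguments" style of executor.submit()
discussed above. To stay runnable offline, page_length is a made-up
stand-in that does no network access, and the URLs are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def page_length(url, timeout):
    """Hypothetical stand-in for a page fetch: just returns len(url)."""
    return len(url)


urls = ["http://example.com/", "http://example.org/index.html"]

with ThreadPoolExecutor(max_workers=2) as executor:
    # The function is defined once, outside any loop; submit() receives the
    # per-call arguments, so no closure over the loop variable is needed.
    future_to_url = {executor.submit(page_length, url, 60): url for url in urls}
    lengths = {future_to_url[f]: f.result() for f in as_completed(future_to_url)}
```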

>> Aye, that's how the current example code in the docs handles it -
>> there's an up front definition of the page loading function, and then
>> the submission to the executor is with a dict comprehension.
> I presume you both are referring to ThreadPoolExecutor Example.  The
> load_url function, which I think should be 'get_page' has a comment that is
> wrong (it does not 'report the url') and no docstring.  My suggestion:
> # Define an example function for the executor.submit call below.
> def get_page(url, timeout):
>     "Return the page, as a string, retrieved from the url."
>     with ...
>> The only thing "wrong" with it is that when reading the code, the
>> potentially single-use function is introduced first without any
>> context, and it's only later that you get to see what it's for.
> A proper comment would fix this I think.  That aside, if the main code were
> packaged within def main, as in the following ProcessPoolExecutor Example,
> so as to delay the lookup of 'load_url' or 'get_page', then the two
> functions definitions could be in *either* order.  The general convention in
> Pythonland seems to be to put main last (bottom up, define everything before
> use), but in a recent python-list thread, at least one person, and I think
> two, said they like to start with def main (top down style, which you seem
> to like).

I like both. :-)

--Guido van Rossum

From greg.ewing at  Mon Jan 25 15:04:08 2016
From: greg.ewing at (Greg Ewing)
Date: Tue, 26 Jan 2016 09:04:08 +1300
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

Marcel O'Neil wrote:
> def intput():
>     return int(input())

And also

   def flintput():
     return float(input())


From abarnert at  Mon Jan 25 16:03:49 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 25 Jan 2016 13:03:49 -0800
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
 outside stub files
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 23, 2016, at 13:12, Stefan Krah <skrah.temporarily at> wrote:
> Nick Coghlan <ncoghlan at ...> writes:
>> You're already going to have to allow this for single lines to handle
>> Py2 compatible annotations, so it seems reasonable to also extend it
>> to handle overloading while you're still figuring out a native syntax
>> for that.
> I find that looks quite
> nice:
> >>> from multipledispatch import dispatch
> >>> @dispatch(int, int)
> ... def add(x, y):
> ...     return x + y
> ...
> >>> @dispatch(float, float)
> ... def add(x, y):
> ...     return x + y
> ...
> >>> add(1, 2)
> 3
> >>> add(1.0, 2.0)
> 3.0
> >>> add(1.0, 2)
> Traceback (most recent call last):
>  File
> [cut because is inflexible]
> line 155, in __call__
>    func = self._cache[types]
> KeyError: (<class 'float'>, <class 'int'>)

Of course you still have to work out how that would fit with type annotations. Presumably you could just move the dispatched types from the decorator to the annotations, and add a return type on each overload. And you could make the dispatch algorithm ignore element types in generic types (so add(a: Sequence[T], b: Sequence[T]) gets called on any pair of Sequences).

But even then, it's hard to imagine how a type checker could understand your code unless it had special-case code for this special decorator.

Not to mention that you're not supposed to runtime-dispatch on typing.Sequence (isinstance and issubclass only work by accident), but you can't genericize

Plus, as either Guido or Jukka pointed out earlier, you may want to specify that Sequence[T] normally returns T but Sequence[Number] always returns float or something; at runtime, those are the same type, so they have to share a single implementation, which takes you right back to needing a way to specify overloads for the type checker.
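
For comparison, a sketch of the "overloads for the type checker, one
implementation at runtime" shape, using typing.overload as it exists in
current Python. The add function mirrors the multipledispatch example
quoted above; it is an illustration, not code from the proposal.

```python
from typing import overload


@overload
def add(x: int, y: int) -> int: ...
@overload
def add(x: float, y: float) -> float: ...


def add(x, y):
    # The single runtime implementation. The @overload stubs above exist only
    # for the type checker; this definition replaces them at runtime.
    return x + y
```

Unlike the @dispatch version, nothing is checked at runtime here, so
add(1.0, 2) simply returns 3.0 instead of raising.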

Still, I would love to see someone take that library and mypy and experiment with making them work together and solving all of these problems.

(As a side note, every time I look at this stuff, I start thinking I want type computation so I can specify that add(a: Sequence[T], b: Sequence[U]) -> Sequence[decltype(T + U)], until I spend a few minutes trying to find a way to write that that isn't as horrible as C++ without introducing all of Haskell into Python, and then appreciating again why maybe building the simple thing first was a good idea...)

From srkunze at  Mon Jan 25 16:03:50 2016
From: srkunze at (Sven R. Kunze)
Date: Mon, 25 Jan 2016 22:03:50 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On 24.01.2016 06:16, Guido van Rossum wrote:
> I wonder if kids today aren't too much in love with local function
> definitions. :-) There's a reason why executor.submit() takes a
> function *and arguments*. If you move the function out of the for loop
> and pass the url as a parameter to submit(), problem solved, and you
> waste fewer resources on function objects and cells to hold nonlocals.
> A generation ago most people would have naturally used such a solution
> (since most languages didn't support the alternative :-).

Well said.

I remember JS being a hatchery of this kind of programming. My main concern
always was "how can I test these inner functions?" Almost impossible, but
a good excuse not to. So, it's unprofessional from my point of view, but
things may change.

On-topic: I like the way Python allows me to bind early. It's simple, and
that's the main argument for it and against introducing yet another
syntax (like colons, braces, etc.); especially for solving such a side


From bzvi7919 at  Mon Jan 25 15:58:33 2016
From: bzvi7919 at (Bar Harel)
Date: Mon, 25 Jan 2016 20:58:33 +0000
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

def dictput():
  raise SyntaxError("You entered a dict in the wrong way")

Will probably raise a few lols.
btw flintput is float(int(input())) which rounds down. flinput is

-- Bar

On Mon, Jan 25, 2016 at 10:04 PM Greg Ewing <greg.ewing at> wrote:

> Marcel O'Neil wrote:
> > def intput():
> >     return int(input())
> And also
>    def flintput():
>      return float(input())
> Yabba-dabba-doo-ly,
> Greg

From ethan at  Mon Jan 25 16:28:18 2016
From: ethan at (Ethan Furman)
Date: Mon, 25 Jan 2016 13:28:18 -0800
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

Let's not forget

def dolphinput(message):
     "get fish order from our ocean-going mammalian friends"

flipper'nly yrs,

From abarnert at  Mon Jan 25 16:42:08 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 25 Jan 2016 13:42:08 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Jan 23, 2016, at 19:22, Nick Coghlan <ncoghlan at> wrote:
> Accordingly, a statement like:
>    powers = []
>    for new i in range(10):
>        def f(x):
>            return x**i
>        powers.append(f)
> Could be semantically equivalent to:
>    powers = []
>    for i in range(10):
>        def _for_loop_suite(i=i):
>            def f(x):
>                return x**i
>            powers.append(f)
>        _for_loop_suite()
>        del _for_loop_suite

A simpler translation of the Swift/C#/etc. behavior might be:

>    powers = []
>    for i in range(10):
>        def _for_loop_suite(i):
>            def f(x):
>                return x**i
>            powers.append(f)
>        _for_loop_suite(i)
>        del _for_loop_suite

This is, after all, how comprehensions work, and how you mechanically translate let bindings from other languages to Python (I believe MacroPy even has a let macro that does exactly this); it's slightly simpler to understand under the hood; it's even slightly more efficient (not that it will ever matter).
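
To make the effect concrete, here is a runnable sketch comparing today's
loop with the hand-written translation. The wrapper names build_naive and
build_translated are invented for the comparison; only the loop bodies come
from the thread.

```python
def build_naive(n):
    """Ordinary loop: every f closes over the same variable i."""
    powers = []
    for i in range(n):
        def f(x):
            return x**i
        powers.append(f)
    return powers


def build_translated(n):
    """Loop body wrapped in a helper, so each iteration gets a fresh i."""
    powers = []
    for i in range(n):
        def _for_loop_suite(i):
            def f(x):
                return x**i
            powers.append(f)
        _for_loop_suite(i)
        del _for_loop_suite
    return powers


naive_results = [f(2) for f in build_naive(4)]
translated_results = [f(2) for f in build_translated(4)]
```

The naive version yields 8 four times, because every f sees the final
i == 3; the translated version yields 1, 2, 4, 8.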

Of course that raises an important point: when you're _not_ mechanically translating, you rarely translate a let this way; instead, you translate it by rewriting the code at a higher level. (And the fact that this translation _is_ idiomatic in JavaScript is exactly why JS code is ugly in the way that Guido and others decry in this thread.) Do we want the compiler doing something under the hood that we wouldn't want to write ourselves? (Again, people in JS, and other languages like C#, don't consider that a problem--both languages define async as effectively a macro that transforms your code into something you wouldn't want to look at, and those kinds of macros are almost the whole point of Lisp, but I think part of why people like Python is that the semantics of most sugar can be described in terms that are just as readable as the sugared version, except for being longer.)

That's why I think I prefer not-Terry's (sorry for the misattribution) version: if something is going to act differently from the usual semantics, maybe it's better to describe it honestly as a new rule you have to learn, than to describe it as a translation to code that has familiar semantics but is nowhere near idiomatic.

From skrah.temporarily at  Mon Jan 25 16:45:11 2016
From: skrah.temporarily at (Stefan Krah)
Date: Mon, 25 Jan 2016 21:45:11 +0000 (UTC)
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
 outside stub files
References: <>
Message-ID: <>

Andrew Barnert via Python-ideas <python-ideas at ...> writes:
> Still, I would love to see someone take that library and mypy and
> experiment with making them work together

Exactly: I posted that link mainly in the hope of not having
a simple @overload now and perhaps a fully type-checked @dispatch
version later.

But apparently people really want the simple version right now.

Stefan Krah

From srkunze at  Mon Jan 25 16:57:32 2016
From: srkunze at (Sven R. Kunze)
Date: Mon, 25 Jan 2016 22:57:32 +0100
Subject: [Python-ideas] Making Python great again
Message-ID: <>


for all those who felt that something is wrong with Python. Here's the 



From rymg19 at  Mon Jan 25 16:58:00 2016
From: rymg19 at (Ryan Gonzalez)
Date: Mon, 25 Jan 2016 15:58:00 -0600
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>


Also:

def linput():
    'Reads a list. Completely, 100% secure and bulletproof.'
    return list(map(eval, input()[1:-1].split(',')))

def ninput():
    'Reads None.'
    assert input() == 'None'

def strinput():
    'Reads a string. Also 100% secure.'
    return eval("'" + input() + "'")

On January 25, 2016 2:04:08 PM CST, Greg Ewing <greg.ewing at> wrote:
>Marcel O'Neil wrote:
>> def intput():
>>     return int(input())
>And also
>   def flintput():
>     return float(input())

From abarnert at  Mon Jan 25 17:14:51 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 25 Jan 2016 14:14:51 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <n84fhk$kt9$>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Jan 24, 2016, at 22:32, Terry Reedy <tjreedy at> wrote:
> A possibility would be to make {} a delayed and possibly partial call operator, in parallel to the current use of () as an immediate and total call operator.
>  expr{arguments}
> would evaluate to a function, whether of type <function> or a special class similar to bound methods. The 'arguments' would be anything allowed within partial, which I believe is anything allowed in any function call.  I chose {} because expr{...} is currently illegal, just as expr(arguments) is for anything other than a function call.  On the other hand, expr[...] is currently legal, at least up to '[', as is expr<...> at least up to '<'.

I like the idea of "easy" partials, but I don't like this syntax. 

Many languages (Scala, C++ with boost::lambda, etc.) use a syntax something like this:

    hex = int(_, 16)
    binopen = open(_, "rb", *_, **_)
    setspam = setattr(spam, attr, _)

The equivalent functions are:

    lambda x: int(x, 16)
    lambda arg, *args, **kw: open(arg, "rb", *args, **kw)
    lambda arg, *, _spam=spam, _attr=attr: setattr(_spam, _attr, arg)

You can extend this to allow reordering arguments, similarly to the way %-formatting handles reordering:

    modexp = pow(_3, _1, _2)

Obviously '_' only works if that's not a valid identifier (or if you're implementing things with horrible template metaprogramming tricks and argument-dependent lookup rather than in the language), but some other symbol like ':', '%', or '$' might work.

I won't get into the ways you can extend this to expressions other than calls, like 2*_ or just (2*).

The first problem with this syntax is that it doesn't give you a way to specify _all_ of the arguments and return a nullary partial. But you can always work around that with dummy params with default values. And it really doesn't come up that often in practice anyway in languages with this syntax, except in the special case that Python already handles with bound methods.

The other big problem is that it just doesn't look like Python, no matter how much you squint. But going only half-way there, via an extended functools.partial that's more like boost bind than boost lambda isn't nearly as bad:

    hex = partial(int, _, 16)
    binopen = partial(open, _, "rb", *_, **_)
    setspam = partial(setattr, spam, attr, _)

Only the last one can be built with partial today, and even that one seems a lot more comprehensible with the explicit ', _' showing that the resulting function takes one argument, and you can see exactly where that argument will go, than with the current implicit version.
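
For reference, a sketch of what is expressible today without any
placeholder syntax. The spam object and "eggs" attribute are made up, and
the keyword form of the hex example works only because int() happens to
accept base as a keyword argument.

```python
from functools import partial
from types import SimpleNamespace

spam = SimpleNamespace()
attr = "eggs"

# Leading positional arguments can be pre-bound directly.
setspam = partial(setattr, spam, attr)
setspam(3)  # same as setattr(spam, "eggs", 3)

# Leaving the *first* argument open needs a lambda ...
parse_hex = lambda s: int(s, 16)
# ... or a keyword argument, where the callee accepts one.
parse_hex_kw = partial(int, base=16)
```

The lambda form shows the signature of the result explicitly, which is the
readability point made earlier in this thread.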

At any rate, I'm not sure I like either of these, but I definitely like them both better than:

    setspam = setattr{spam, attr}

From abarnert at  Mon Jan 25 17:24:44 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 25 Jan 2016 14:24:44 -0800
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

    def binput():
        return bytes(map(ord, input()))

This should make Python 3-haters happy: it works perfectly, without any need for thought, as long as all of your friends are American. If not, just throw in random calls to .encode and .decode all over the place until the errors go away.
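
In case the joke needs spelling out, this sketch shows where it breaks. The
function takes a string instead of calling input(), purely so it can be run
non-interactively; to_bytes_naive is an invented name.

```python
def to_bytes_naive(text):
    """The binput() conversion, applied to a string instead of live input."""
    return bytes(map(ord, text))


ascii_result = to_bytes_naive("spam")  # fine: every code point is below 256

try:
    to_bytes_naive("\u20ac")  # EURO SIGN, code point 8364
    euro_failed = False
except ValueError:
    euro_failed = True  # bytes() only accepts integers in range(0, 256)

# Choosing an encoding explicitly is what actually works.
euro_utf8 = "\u20ac".encode("utf-8")
```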

Sent from my iPhone

> On Jan 25, 2016, at 13:58, Ryan Gonzalez <rymg19 at> wrote:
> Also:
> def linput():
> 'Reads a list. Completely, 100% secure and bulletproof.'
> return map(eval, input[1:-1].split(',')))
> def ninput():
> 'Reads None.'
> assert input() == 'None'
> def strinput():
> 'Reads a string. Also 100% secure.'
> return eval("'" + input() + "'")
>> On January 25, 2016 2:04:08 PM CST, Greg Ewing <greg.ewing at> wrote:
>> Marcel O'Neil wrote:
>>>  def intput():
>>>      return int(input())
>> And also
>>    def flintput():
>>      return float(input())
>> Yabba-dabba-doo-ly,
>> Greg

From bzvi7919 at  Mon Jan 25 17:25:05 2016
From: bzvi7919 at (Bar Harel)
Date: Mon, 25 Jan 2016 22:25:05 +0000
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

For the ducks among us. Simple, Clean, Efficient and Secure. The 4 S/C/E/S.

def duckput():
  """Reads anything. 'Cause there's never too much ducktyping"""
  return eval(input()+";")  # ; makes sure there is only one line.

On Mon, Jan 25, 2016 at 11:58 PM Ryan Gonzalez <rymg19 at> wrote:

> Also:
> def linput():
> 'Reads a list. Completely, 100% secure and bulletproof.'
> return map(eval, input[1:-1].split(',')))
> def ninput():
> 'Reads None.'
> assert input() == 'None'
> def strinput():
> 'Reads a string. Also 100% secure.'
> return eval("'" + input() + "'")
> On January 25, 2016 2:04:08 PM CST, Greg Ewing <
> greg.ewing at> wrote:
>> Marcel O'Neil wrote:
>>>  def intput():
>>>      return int(input())
>> And also
>>    def flintput():
>>      return float(input())
>> Yabba-dabba-doo-ly,
>> Greg

From abarnert at  Mon Jan 25 17:30:40 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 25 Jan 2016 14:30:40 -0800
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 25, 2016, at 14:25, Bar Harel <bzvi7919 at> wrote:
> For the ducks among us. Simple, Clean, Efficient and Secure. The 4 S/C/E/S.
> def duckput():
>   """Reads anything. 'Cause there's never too much ducktyping"""
>   return eval(input()+";")  # ; makes sure there is only one line.

Isn't that a guaranteed syntax error? Expressions can't include semicolons. Although I suppose that makes it even more secure, I think it would be more efficient to just `raise SyntaxError`.
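
A quick check of that claim, plus (as an aside that goes beyond what was
said in the thread) ast.literal_eval() as the usual way to read a literal
without running arbitrary code. The helper name is_syntax_error is
invented.

```python
import ast


def is_syntax_error(source):
    """Return True if eval() rejects `source` with a SyntaxError."""
    try:
        eval(source)
    except SyntaxError:
        return True
    return False


semicolon_rejected = is_syntax_error("1 + 1;")  # eval() takes one expression only
plain_accepted = not is_syntax_error("1 + 1")

# literal_eval() parses Python literals but does not execute calls or names.
parsed = ast.literal_eval("[1, 2.5, 'three', None]")
```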

>> On Mon, Jan 25, 2016 at 11:58 PM Ryan Gonzalez <rymg19 at> wrote:
>> Also:
>> def linput():
>> 'Reads a list. Completely, 100% secure and bulletproof.'
>> return map(eval, input[1:-1].split(',')))
>> def ninput():
>> 'Reads None.'
>> assert input() == 'None'
>> def strinput():
>> 'Reads a string. Also 100% secure.'
>> return eval("'" + input() + "'")
>>> On January 25, 2016 2:04:08 PM CST, Greg Ewing <greg.ewing at> wrote:
>>> Marcel O'Neil wrote:
>>>>  def intput():
>>>>      return int(input())
>>> And also
>>>    def flintput():
>>>      return float(input())
>>> Yabba-dabba-doo-ly,
>>> Greg

From bzvi7919 at  Mon Jan 25 17:36:09 2016
From: bzvi7919 at (Bar Harel)
Date: Mon, 25 Jan 2016 22:36:09 +0000
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

Just decorate it with fuckit and
everything will be alright. Make sure to follow the module's guideline
though: "This module is like violence: if it doesn't work, you just need
more of it."

On Tue, Jan 26, 2016 at 12:30 AM Andrew Barnert <abarnert at> wrote:

> On Jan 25, 2016, at 14:25, Bar Harel <bzvi7919 at> wrote:
> For the ducks among us. Simple, Clean, Efficient and Secure. The 4 S/C/E/S.
> def duckput():
>   """Reads anything. 'Cause there's never too much ducktyping"""
>   return eval(input()+";")  # ; makes sure there is only one line.
> Isn't that a guaranteed syntax error? Expressions can't include
> semicolons. Although I suppose that makes it even more secure, I think it
> would be more efficient to just `raise SyntaxError`.
> On Mon, Jan 25, 2016 at 11:58 PM Ryan Gonzalez <rymg19 at> wrote:
>> Also:
>> def linput():
>> 'Reads a list. Completely, 100% secure and bulletproof.'
>> return map(eval, input[1:-1].split(',')))
>> def ninput():
>> 'Reads None.'
>> assert input() == 'None'
>> def strinput():
>> 'Reads a string. Also 100% secure.'
>> return eval("'" + input() + "'")
>> On January 25, 2016 2:04:08 PM CST, Greg Ewing <
>> greg.ewing at> wrote:
>>> Marcel O'Neil wrote:
>>>>  def intput():
>>>>      return int(input())
>>> And also
>>>    def flintput():
>>>      return float(input())
>>> Yabba-dabba-doo-ly,
>>> Greg

From abarnert at  Mon Jan 25 17:44:08 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 25 Jan 2016 14:44:08 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

I said I'd write something up over the weekend if I couldn't find a good writeup from the Swift, C#, or Scala communities. I couldn't, so I did:

Apologies for the formatting (which I blame on blogspot--my markdown-to-html-with-workarounds-for-blogspot-sucking toolchain is still not perfect), and for being not entirely focused on Python (which is a consequence of Ruby and C# people being vaguely interested in it), and for being overly verbose (which is entirely my fault, as usual).

Sent from my iPhone

> On Jan 22, 2016, at 21:36, Andrew Barnert via Python-ideas <python-ideas at> wrote:
>> On Jan 22, 2016, at 21:06, Chris Angelico <rosuav at> wrote:
>> On Sat, Jan 23, 2016 at 3:50 PM, Andrew Barnert via Python-ideas
>> <python-ideas at> wrote:
>>> Finally, Terry suggested a completely different solution to the problem:
>>> don't change closures; change for loops. Make them create a new variable
>>> each time through the loop, instead of reusing the same variable. When the
>>> variable isn't captured, this would make no difference, but when it is,
>>> closures from different iterations would capture different variables (and
>>> therefore different cells). For backward-compatibility reasons, this might
>>> have to be optional, which means new syntax; he proposed "for new i in
>>> range(10):".
>> Not just for backward compatibility. Python's scoping and assignment
>> rules are currently very straight-forward: assignment creates a local
>> name unless told otherwise by a global/nonlocal declaration, and *all*
>> name binding follows the same rules as assignment. Off the top of my
>> head, I can think of two special cases, neither of which is truly a
>> change to the binding semantics: "except X as Y:" triggers an
>> unbinding at the end of the block, and comprehensions have a hidden
>> function boundary that means their iteration variables are more local
>> than you might think. Making for loops behave differently by default
>> would be a stark break from that tidiness.
> As a side note, notice that if you don't capture the variable, there is no observable difference (which means CPython would be well within its rights to optimize it by reusing the same variable unless it's a cellvar).
> Anyway, yes, it's still something that you have to learn--but the unexpected-on-first-encounter interaction between loop variables and closures is also something that everybody has to learn. And, even after you understand it, it still doesn't become obvious until you've been bitten by it enough times (and if you're going back and forth between Python and a language that's solved the problem, one way or the other, you may keep relearning it). So, theoretically, the status quo is certainly simpler, but in practice, I'm not sure it is.
>> It seems odd to change this on the loop, though. Is there any reason
>> to use "for new i in range(10):" if you're not making a series of
>> nested functions?
> Rarely if ever. But is there any reason to "def spam(x; i):" or "def [i](x):" or whatever syntax people like if you're not overwriting i with a different and unwanted value? And is there any reason to reuse a variable you've bound in that way if a loop isn't forcing you to do so?
> This problem comes up all the time, in all kinds of languages, when loops and closures intersect. It almost never comes up with loops alone or closures alone.
>> Seems most logical to make this a special way of
>> creating functions, not of looping.
> There are also some good theoretical motivations for changing loops, but I'm really hoping someone else (maybe the Swift or C# dev team blogs) has already written it up, so I can just post a link and a short "... and here's why it also applies to Python" (complicated by the fact that one of the motivations _doesn't_ apply to Python...).
> Also, the idea of a closure "capturing by value" is pretty strange on the surface; you have to think through why that doesn't just mean "not capturing" in a language like Python. Nick Coghlan suggests calling it "capture at definition" vs. "capture at call", which helps, but it's still weird. Weirder than loops creating a new binding that has the same name as the old one in a let-less language? I don't know. They're both weird. And so is the existing behavior, despite the fact that it makes perfect sense once you work it through.
> Anyway, for now, I'll just repeat that Ruby, Swift, C#, etc. all solved this by changing for loops, while only C++, which already needed to change closures because of its lifetime rules, solved it by changing closures. On the other hand, JavaScript and Java both explicitly rejected any change to fix the problem, and Python has lived with it for a long time, so...

From mahmoud at  Mon Jan 25 18:01:54 2016
From: mahmoud at (Mahmoud Hashemi)
Date: Mon, 25 Jan 2016 15:01:54 -0800
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

I tried to have fun, but my joke ended up long and maybe useful.

Anyways, here's *ynput()*:

Get yourself a True/False from a y/n.



On Mon, Jan 25, 2016 at 2:36 PM, Bar Harel <bzvi7919 at> wrote:

> Just decorate it with fuckit <> and
> everything will be alright. Make sure to follow the module's guideline
> though: "This module is like violence: if it doesn't work, you just need
> more of it."
> On Tue, Jan 26, 2016 at 12:30 AM Andrew Barnert <abarnert at>
> wrote:
>> On Jan 25, 2016, at 14:25, Bar Harel <bzvi7919 at> wrote:
>> For the ducks among us. Simple, Clean, Efficient and Secure. The 4
>> S/C/E/S.
>> def duckput():
>>   """Reads anything. 'Cause there's never too much ducktyping"""
>>   return eval(input()+";")  # ; makes sure there is only one line.
>> Isn't that a guaranteed syntax error? Expressions can't include
>> semicolons. Although I suppose that makes it even more secure, I think it
>> would be more efficient to just `raise SyntaxError`.
>> On Mon, Jan 25, 2016 at 11:58 PM Ryan Gonzalez <rymg19 at> wrote:
>>> Also:
>>> def linput():
>>> 'Reads a list. Completely, 100% secure and bulletproof.'
>>> return map(eval, input[1:-1].split(',')))
>>> def ninput():
>>> 'Reads None.'
>>> assert input() == 'None'
>>> def strinput():
>>> 'Reads a string. Also 100% secure.'
>>> return eval("'" + input() + "'")
>>> On January 25, 2016 2:04:08 PM CST, Greg Ewing <
>>> greg.ewing at> wrote:
>>>> Marcel O'Neil wrote:
>>>>>  def intput():
>>>>>      return int(input())
>>>> And also
>>>>    def flintput():
>>>>      return float(input())
>>>> Yabba-dabba-doo-ly,
>>>> Greg
>>>> --
>>> Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From steve at  Mon Jan 25 18:21:36 2016
From: steve at (Steven D'Aprano)
Date: Tue, 26 Jan 2016 10:21:36 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Excellent summary, thank you, but I want to take exception to something 
you wrote. I fear that you have inadvertently derailed the thread into a 
considerably narrower focus than it should have.

On Fri, Jan 22, 2016 at 08:50:52PM -0800, Andrew Barnert wrote:

> What the thread is ultimately looking for is a solution to the 
> "closures capturing loop variables" problem. This problem has been in 
> the official programming FAQ[1] for decades, as "Why do lambdas 
> defined in a loop with different values all return the same result"?

The issue is not loop variables, or rather, it's not *only* loop 
variables, and so any solution which focuses on fixing loop variables is 
only half a solution. If we look back at Haael's original post, his 
example captures *three* variables, not one, and there is no suggestion 
that they are necessarily loop variables.

It's nice that since we have lambda and list comps we can 
occasionally write closures in a one-liner loop like so:

>     powers = [lambda x: x**i for i in range(10)]
> This gives you ten functions that all return x**9, which is probably 
> not what you wanted.

but in my opinion, that's really a toy example suitable only for 
demonstrating the nature of the issue and the difference between early 
and late binding. Outside of such toys, we often find ourselves closing 
over at least one variable which is derived from the loop variable, but 
not the loop variable itself:

# Still a toy, but perhaps a bit more of a realistic toy.
searchers = []
for provider in search_provider:
    key = API_KEYS[provider]
    url = SEARCH_URLS[provider]
    def lookup(*terms):
        terms = "/q=" + "+".join(escape(t) for t in terms)
        u = url + ("key=%s" % key) + terms
        return fetch(u) or []
    searchers.append(lookup)

> The OP proposed that we should add some syntax, borrowed from C++, to 
> function definitions that specifies that some things get captured by 
> value.

Regardless of the syntax chosen, this has a few things to recommend it:

- It's completely explicit. If you want a value captured, you 
have to say so explicitly, otherwise you will get the normal variable 
lookup behaviour that Python uses now.

- It's general. We can capture locals, nonlocals, globals or builtins, 
not just loop variables.

- It allows us to avoid the "default argument" idiom, in cases where we 
really don't want the argument, we just want to capture the value. There 
are a lot of functions which have their parameter list polluted by 
extraneous arguments that should never be used by the caller simply 
because that's the only way to get early binding/value capturing.
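
For reference, here is a minimal sketch of the "default argument" idiom
mentioned in the last point, and of how the extra parameter leaks into the
signature. The names make_late and make_early are invented for this
illustration and do not come from the thread.

```python
def make_late():
    # Each lambda looks up i when it is called, after the loop has finished.
    return [lambda x: x + i for i in range(3)]

def make_early():
    # i=i evaluates i at definition time and stores it as a default value.
    return [lambda x, i=i: x + i for i in range(3)]

late = [f(10) for f in make_late()]      # every function sees the final i
early = [f(10) for f in make_early()]    # each function keeps its own i

# The cost: i is now part of the signature, so a caller can override it.
overridden = make_early()[0](10, 100)
```

Here late is [12, 12, 12], early is [10, 11, 12], and overridden is 110,
which is the kind of accidental override the extra parameter allows.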

> Finally, Terry suggested a completely different solution to the 
> problem: don't change closures; change for loops. Make them create a 
> new variable each time through the loop, instead of reusing the same 
> variable. When the variable isn't captured, this would make no 
> difference, but when it is, closures from different iterations would 
> capture different variables (and therefore different cells).

It was actually Greg, not Terry.

I strongly dislike this suggestion (sorry Greg), and I am concerned that 
the thread seems to have been derailed into treating loop variables as 
special enough to break the rules. It does nothing to solve the general 
problem of capturing values. It doesn't work for my "searchers" example 
above, or even the toy example here:

funcs = []
for i in range(10):
    n = i**2
    funcs.append(lambda x: x + n)

This example can be easily re-written to close over the loop variable 
directly, that's not the point. The point is that we frequently need to 
capture more than just the loop variable. Coming up with a solution that 
only solves the issue for loop variables isn't enough, and it is a 
mistake to think that this is about "closures capturing loop variables".
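
A small sketch of why the derived variable behaves this way, wrapping the
loop in a function (build is a name made up here) so that n is a closure
variable rather than a global. All ten lambdas end up sharing one cell for
n, no matter how the loop variable i is handled.

```python
def build():
    funcs = []
    for i in range(10):
        n = i**2
        # n is looked up when the lambda is called, not when it is defined.
        funcs.append(lambda x: x + n)
    return funcs

funcs = build()
results = [f(1) for f in funcs]

# Every closure refers to the same cell object, which holds the last n.
cells = {id(f.__closure__[0]) for f in funcs}
last_n = funcs[0].__closure__[0].cell_contents
```

Every entry of results is 82, cells contains a single id, and last_n is 81.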

I won't speak for other languages, but in Python, where loops don't 
introduce a new scope, "closures capturing loop variables" shouldn't 
even be seen as a separate problem from the more general issue of 
capturing values early rather than late. It's just a common, easily 
stumbled across, manifestation of the same.

> For 
> backward-compatibility reasons, this might have to be optional, which 
> means new syntax; he proposed "for new i in range(10):".

I would not like to see "new" become a keyword. I have a lot of code 
using new (and old) as a variable.


From steve at  Mon Jan 25 18:34:55 2016
From: steve at (Steven D'Aprano)
Date: Tue, 26 Jan 2016 10:34:55 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Wed, Jan 20, 2016 at 05:04:21PM -0800, Guido van Rossum wrote:
> On Wed, Jan 20, 2016 at 4:10 PM, Steven D'Aprano <steve at>
> wrote:
> > (I'm saving my energy for Eiffel-like require/ensure blocks
> > *wink*).
> >
> Now you're making me curious.

Okay, just to satisfy your curiosity, and not as a concrete proposal at 
this time, here is a sketch of the sort of thing Eiffel uses for Design 
By Contract.

Each function or method has an (optional, but recommended) pre-condition 
and post-condition. Using a hybrid Eiffel/Python syntax, here is a toy 
example:

class Lunch:
    def __init__(self, arg):
        self.spam(arg)

    def spam(self, n:int=5):
        """Set the lunch meat to n servings of spam."""
        require:
            # Assert the pre-conditions of the method.
            assert n >= 1
        ensure:
            # Assert the post-conditions of the method.
            assert self.meat.startswith('Spam')
            if ' ' in self.meat:
                assert ' spam' in self.meat
        # main body of the method, as usual
        serves = ['spam']*n
        serves[0] = serves[0].title()
        self.meat = ' '.join(serves)

The require block runs before the body of the method, and the ensure 
block runs after the body, but before the method returns to the caller. 
If either fails its assertions, the method fails and raises an 
exception.


- The pre- and post-conditions make up (part of) the method's
  contract, which is part of the executable documentation of 
  the method. Documentation tools can extract the ensure 
  and require sections and present them as part of the API docs.

- The compiler can turn the contract checking on or off as
  needed, with the ensure/require sections handled independently.

- Testing pre- and post-conditions is logically separate from 
  the method's implementation. This allows the implementation 
  to vary while keeping the contract the same.

- But at the same time, the contract is right there with the 
  method, not separated in some potentially distant part of the
  code base.

I'm not going to go into detail about Design By Contract, if anyone 
wants to learn more you can start here:

I've just discovered there's an older PEP for something similar:

but that uses docstrings for the contracts. I don't like that.


From rymg19 at  Mon Jan 25 18:40:37 2016
From: rymg19 at (Ryan Gonzalez)
Date: Mon, 25 Jan 2016 17:40:37 -0600
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>


...that's actually pretty awesome. (Other than the "is True" and "is False"
stuff, which is making my OCD go haywire.)

On Mon, Jan 25, 2016 at 5:01 PM, Mahmoud Hashemi <mahmoud at>

> I tried to have fun, but my joke ended up long and maybe useful.
> Anyways, here's *ynput()*:
> Get yourself a True/False from a y/n.
> D[Yn]amically,
> Mahmoud
> On Mon, Jan 25, 2016 at 2:36 PM, Bar Harel <bzvi7919 at> wrote:
>> Just decorate it with fuckit <> and
>> everything will be alright. Make sure to follow the module's guideline
>> though: "This module is like violence: if it doesn't work, you just need
>> more of it."
>> On Tue, Jan 26, 2016 at 12:30 AM Andrew Barnert <abarnert at>
>> wrote:
>>> On Jan 25, 2016, at 14:25, Bar Harel <bzvi7919 at> wrote:
>>> For the ducks among us. Simple, Clean, Efficient and Secure. The 4
>>> S/C/E/S.
>>> def duckput():
>>>   """Reads anything. 'Cause there's never too much ducktyping"""
>>>   return eval(input()+";")  # ; makes sure there is only one line.
>>> Isn't that a guaranteed syntax error? Expressions can't include
>>> semicolons. Although I suppose that makes it even more secure, I think it
>>> would be more efficient to just `raise SyntaxError`.
>>> On Mon, Jan 25, 2016 at 11:58 PM Ryan Gonzalez <rymg19 at> wrote:
>>>> Also:
>>>> def linput():
>>>> 'Reads a list. Completely, 100% secure and bulletproof.'
>>>> return map(eval, input[1:-1].split(',')))
>>>> def ninput():
>>>> 'Reads None.'
>>>> assert input() == 'None'
>>>> def strinput():
>>>> 'Reads a string. Also 100% secure.'
>>>> return eval("'" + input() + "'")
>>>> On January 25, 2016 2:04:08 PM CST, Greg Ewing <
>>>> greg.ewing at> wrote:
>>>>> Marcel O'Neil wrote:
>>>>>>  def intput():
>>>>>>      return int(input())
>>>>> And also
>>>>>    def flintput():
>>>>>      return float(input())
>>>>> Yabba-dabba-doo-ly,
>>>>> Greg
>>>>> --
>>>> Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something's wrong.

From rosuav at  Mon Jan 25 18:52:59 2016
From: rosuav at (Chris Angelico)
Date: Tue, 26 Jan 2016 10:52:59 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Tue, Jan 26, 2016 at 10:21 AM, Steven D'Aprano <steve at> wrote:
> - It allows us to avoid the "default argument" idiom, in cases where we
> really don't want the argument, we just want to capture the value. There
> are a lot of functions which have their parameter list polluted by
> extraneous arguments that should never be used by the caller simply
> because that's the only way to get early binding/value capturing.

Can you actually name a few, please? I went digging earlier, and
couldn't find any really good examples in the stdlib - they're mostly
internal functions (underscore-prefixed) that shouldn't be being
called from outside their own module anyway. Maybe this isn't as
common an issue as I'd thought.


From abarnert at  Mon Jan 25 18:59:58 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 25 Jan 2016 15:59:58 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Jan 25, 2016, at 15:21, Steven D'Aprano <steve at> wrote:
> Excellent summary, thank you, but I want to take exception to something 
> you wrote. I fear that you have inadvertently derailed the thread into a 
> considerably narrower focus than it should have.
>> On Fri, Jan 22, 2016 at 08:50:52PM -0800, Andrew Barnert wrote:
>> What the thread is ultimately looking for is a solution to the 
>> "closures capturing loop variables" problem. This problem has been in 
>> the official programming FAQ[1] for decades, as "Why do lambdas 
>> defined in a loop with different values all return the same result"?
> The issue is not loop variables, or rather, it's not *only* loop 
> variables, and so any solution which focuses on fixing loop variables is 
> only half a solution.

I think it really _is_ only loop variables--or at least 95% loop variables.

> ... Outside of such toys, we often find ourselves closing 
> over at least one variable which is derived from the loop variable, but 
> not the loop variable itself:

But, depending on how you write that, either (a) it already works the way you'd naively expect, or (b) the only reason you'd expect it to work is if you don't understand Python scoping (that is, you think every block is a scope).

That's different from the case with loop variables: even people who know Python scoping still regularly make the mistake with loop variables, swear at themselves, and write the default-value trick on the first debug pass. (Novices, of course, swear at themselves, try 28 random changes, then post their code on StackOverflow titled "Why Python closures does suck this way?")

It's the loop variable problem that's in the FAQ. And it does in fact come up all the time in some kinds of programs, like Tkinter code that wants to create callbacks for each of 10 buttons. And again, looking at other languages, it's the loop variable problem that's in their FAQs, and the new-variable-per-instance solution would work across most of them, and is actually used in some.

Again, I definitely acknowledge that Python's non-granular scopes make the issue much less clear-cut than in those languages where "key = API_KEYS[provider]" would actually work. That's why I said that if there's one mainstream language that _shouldn't_ use my solution, it's Python.

And, ultimately, I'm still -0 about any change--the default-value solution has worked for decades, everyone who uses Python understands it, and there's no serious problem with it.

But I think "capture by value" or "capture early" would, outside the loop-variable case, be more often an attractive nuisance for code you shouldn't be writing than a help for code you should.

If you think we _should_ solve the problem with "loop-body-local" variables, that would definitely be an argument for Nick's "define and call a function" implementation over the new-cell implementation, because his version does actually define a new scope, and can easily be written to make those variables actually loop-body-local. 

However, I think that, if we wanted that, it would be better to have a more general solution--maybe a "scope" statement that defines a new scope for its suite, or even a "let" statement that defines a new variable only until the end of the current suite.

Or, of course, we could toss this on the large pile of "problems that would be solved by light-weight multi-line lambda" (and I think it doesn't add nearly enough weight to that pile to make the problem worth solving, either).

>> The OP proposed that we should add some syntax, borrowed from C++, to 
>> function definitions that specifies that some things get captured by 
>> value.
> [...]
> Regardless of the syntax chosen, this has a few things to recommend it:
> - It's completely explicit. If you want a value captured, you 
> have to say so explicitly, otherwise you will get the normal variable 
> lookup behaviour that Python uses now.

Surely "for new i" is just as explicit about the fact that the variable is "special" as "def f(x; i):" or "sharedlocal i"? The difference is only _where_ it's marked, not _whether_ it's marked.

> - It's general. We can capture locals, nonlocals, globals or builtins, 
> not just loop variables.

Sure, but it may be an overly-general solution to a very specific problem. If not, then great, but... Do you really have code that would be clearer if you could capture a global variable by value?

(Of course there's code that does that as an optimization--but that's not to make the code clearer; it's to make the code slightly faster despite being less clear.)

> - It allows us to avoid the "default argument" idiom, in cases where we 
> really don't want the argument, we just want to capture the value. There 
> are a lot of functions which have their parameter list polluted by 
> extraneous arguments that should never be used by the caller simply 
> because that's the only way to get early binding/value capturing.

It's not the _only_ way. When you really want a new scope, you can always define and call a local function. Or, usually better, refactor things so you're calling a global function, or using an object, or some other solution. The default-value idiom is just the most _concise_ way.
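
A minimal sketch of the "define and call a local function" alternative,
reusing the x + n toy from earlier in the thread. The name make_adder is
invented for this example.

```python
def make_adder(n):
    # n is a parameter, so each call to make_adder gets its own binding,
    # and the inner function closes over that binding only.
    def adder(x):
        return x + n
    return adder

funcs = []
for i in range(10):
    funcs.append(make_adder(i**2))

results = [f(1) for f in funcs]
```

Each function now adds its own square, and the returned adder has no extra
parameter for a caller to override.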

Meanwhile, have you ever actually had a bug where someone passed an override for the i=i or len=len parameter? I suspect if people really were worried about that, they would use "*, _spam=spam", but they never do. (The only place I've seen anything like that is in generated code--e.g., a currying macro.)
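
For anyone who has not seen the keyword-only variant, a short sketch. The
names make_callbacks, on_click and _label are hypothetical; the point is
only that the bare * stops a stray positional argument from replacing the
captured value.

```python
def make_callbacks(labels):
    callbacks = []
    for label in labels:
        # _label is keyword-only, so on_click(event, something) is an error
        # instead of a silent override of the captured label.
        def on_click(event, *, _label=label):
            return "%s clicked (%s)" % (_label, event)
        callbacks.append(on_click)
    return callbacks

ok, cancel = make_callbacks(["OK", "Cancel"])
```

Calling ok("e1") returns "OK clicked (e1)", while ok("e1", "oops") raises
TypeError. A caller can still pass _label="..." explicitly, so this narrows
the problem rather than removing it.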

>> For 
>> backward-compatibility reasons, this might have to be optional, which 
>> means new syntax; he proposed "for new i in range(10):".
> I would not like to see "new" become a keyword. I have a lot of code 
> using new (and old) as a variable.

I've even got some 2.5 code that runs in 3.3+ thanks to modernize, but still uses the "new" module. :)

Of course it could become a context-sensitive keyword, like async. But yeah, that seems more like a last-resort idea than something to emulate wherever possible...

From mike at  Mon Jan 25 19:18:53 2016
From: mike at (Michael Selik)
Date: Tue, 26 Jan 2016 00:18:53 +0000
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Mon, Jan 25, 2016 at 5:22 PM Steven D'Aprano <steve at> wrote:

> # Still a toy, but perhaps a bit more of a realistic toy.
> searchers = []
> for provider in search_provider:
>     key = API_KEYS[provider]
>     url = SEARCH_URLS[provider]
>     def lookup(*terms):
>         terms = "/q=" + "+".join(escape(t) for t in terms)
>         u = url + ("key=%s" % key) + terms
>         return fetch(u) or []
>     searchers.append(lookup)
I'd define the basic function outside the loop.

    def lookup(root_url, api_key, *terms):
        args = root_url, api_key, "+".join(escape(t) for t in terms)
        url = '%s?key=%s&q=%s' % args
        return fetch(url) or []

Then use ``functools.partial`` inside the loop to create the closure.

    searchers = []
    for provider in search_provider:
        key = API_KEYS[provider]
        url = SEARCH_URLS[provider]
        searchers.append(partial(lookup, url, key))

Or even more concisely, you could use a comprehension at that point.

    searchers = [partial(lookup, SEARCH_URLS[p], API_KEYS[p])
                 for p in search_provider]
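
A self-contained version of the same idea, as a sketch only. The thread
never defines escape or fetch, so urllib.parse.quote stands in for escape
and fetch is a stub; the provider names, URLs and keys are placeholders
invented for the example.

```python
from functools import partial
from urllib.parse import quote

# Placeholder data, not real services or credentials.
SEARCH_URLS = {"alpha": "https://alpha.invalid/search",
               "beta": "https://beta.invalid/find"}
API_KEYS = {"alpha": "key-a", "beta": "key-b"}

def fetch(url):
    # Stand-in for a real HTTP request; it just echoes the URL it was given.
    return [url]

def lookup(root_url, api_key, *terms):
    query = "+".join(quote(t) for t in terms)
    return fetch("%s?key=%s&q=%s" % (root_url, api_key, query)) or []

# partial binds root_url and api_key when each searcher is created.
searchers = {p: partial(lookup, SEARCH_URLS[p], API_KEYS[p])
             for p in SEARCH_URLS}

alpha_hits = searchers["alpha"]("spam", "eggs")
beta_hits = searchers["beta"]("ham")
```

Because partial stores the URL and key when it is built, each searcher keeps
its own provider regardless of what the loop variable is later rebound to.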

From mahmoud at  Mon Jan 25 19:37:50 2016
From: mahmoud at (Mahmoud Hashemi)
Date: Mon, 25 Jan 2016 16:37:50 -0800
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

If you look very closely, identity checks are actually intended. I want
_the_ True (or False). Otherwise, ValueError. :)

On Mon, Jan 25, 2016 at 3:40 PM, Ryan Gonzalez <rymg19 at> wrote:

> ...
> ...that's actually pretty awesome. (Other than the "is True" and "is
> False" stuff, which is making my OCD go haywire.)
> On Mon, Jan 25, 2016 at 5:01 PM, Mahmoud Hashemi <mahmoud at>
> wrote:
>> I tried to have fun, but my joke ended up long and maybe useful.
>> Anyways, here's *ynput()*:
>> Get yourself a True/False from a y/n.
>> D[Yn]amically,
>> Mahmoud
>> On Mon, Jan 25, 2016 at 2:36 PM, Bar Harel <bzvi7919 at> wrote:
>>> Just decorate it with fuckit <> and
>>> everything will be alright. Make sure to follow the module's guideline
>>> though: "This module is like violence: if it doesn't work, you just
>>> need more of it."
>>> On Tue, Jan 26, 2016 at 12:30 AM Andrew Barnert <abarnert at>
>>> wrote:
>>>> On Jan 25, 2016, at 14:25, Bar Harel <bzvi7919 at> wrote:
>>>> For the ducks among us. Simple, Clean, Efficient and Secure. The 4
>>>> S/C/E/S.
>>>> def duckput():
>>>>   """Reads anything. 'Cause there's never too much ducktyping"""
>>>>   return eval(input()+";")  # ; makes sure there is only one line.
>>>> Isn't that a guaranteed syntax error? Expressions can't include
>>>> semicolons. Although I suppose that makes it even more secure, I think it
>>>> would be more efficient to just `raise SyntaxError`.
>>>> On Mon, Jan 25, 2016 at 11:58 PM Ryan Gonzalez <rymg19 at>
>>>> wrote:
>>>>> Also:
>>>>> def linput():
>>>>> 'Reads a list. Completely, 100% secure and bulletproof.'
>>>>> return map(eval, input[1:-1].split(',')))
>>>>> def ninput():
>>>>> 'Reads None.'
>>>>> assert input() == 'None'
>>>>> def strinput():
>>>>> 'Reads a string. Also 100% secure.'
>>>>> return eval("'" + input() + "'")
>>>>> On January 25, 2016 2:04:08 PM CST, Greg Ewing <
>>>>> greg.ewing at> wrote:
>>>>>> Marcel O'Neil wrote:
>>>>>>>  def intput():
>>>>>>>      return int(input())
>>>>>> And also
>>>>>>    def flintput():
>>>>>>      return float(input())
>>>>>> Yabba-dabba-doo-ly,
>>>>>> Greg
>>>>>> --
>>>>> Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.
> --
> Ryan
> [ERROR]: Your autotools build scripts are 200 lines longer than your
> program. Something's wrong.

From abarnert at  Mon Jan 25 19:42:59 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 25 Jan 2016 16:42:59 -0800
Subject: [Python-ideas] DBC (Re:  Explicit variable capture list)
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Jan 25, 2016, at 15:34, Steven D'Aprano <steve at> wrote:
> Okay, just to satisfy your curiosity, and not as a concrete proposal at 
> this time, here is a sketch of the sort of thing Eiffel uses for Design 
> By Contract.

I think it's worth explaining why this has to be an actual language feature, not something you just do by writing functions named "requires" and "ensures". Many of the benefits you cited would work just fine with a PyPI-library solution, but there are some problems that are much harder to solve:

* You usually want ensure to be able to access the return value and exception state, and maybe even any locals, and requires to be able to access the parameters.

* Faking ensure usually means finally or with (which means indenting your entire function) or a wrapper function (which precludes many simple designs).

* Many contract assertions are slow (or even dangerous, when not upheld) to calculate, so just no-opping out the checker functions doesn't help.

* Class invariants should be automatically verified as ensures on all public methods except __del__ and (if it raises) __init__.

* Subclasses that override a method need to automatically inherit the base class's pre- and post-conditions (as well as possibly adding some of their own), even if they don't call the super method.

* Some contract assertions can be tested at compile time. (Eiffel doesn't have much experimentation here; C# does, and there are rumors about Swift with clang-static-analyzer.)

Some of these things can be shoehorned in with frame hacks and metaclasses and so on, but it's not fun. There's a lot of history of people trying to fake it in other languages and then giving up and saying "just use comments until we can build it into language version n+1". (See D 1.0, Core C++ Standard for C++14/17, C# 4, Swift 2...) There have been a few attempts for Python, but most of them seem to have run into similar problems, after a lot of messing around with metaclasses and so on.

From mike at  Mon Jan 25 20:01:27 2016
From: mike at (Michael Selik)
Date: Tue, 26 Jan 2016 01:01:27 +0000
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Mon, Jan 25, 2016 at 6:43 PM Andrew Barnert via Python-ideas <
python-ideas at> wrote:

> On Jan 25, 2016, at 15:34, Steven D'Aprano <steve at> wrote:
> >
> > Okay, just to satisfy your curiosity, and not as a concrete proposal at
> > this time, here is a sketch of the sort of thing Eiffel uses for Design
> > By Contract.
> I think it's worth explaining why this has to be an actual language
> feature, not something you just do by writing functions named "requires"
> and "ensures". Many of the benefits you cited would work just fine with a
> PyPI-library solution, but there are some problems that are much harder to
> solve:
> Some of these things can be shoehorned in with frame hacks and metaclasses
> and so on, but it's not fun. ... There have been a few attempts for Python,
> but most of them seem to have run into similar problems, after a lot of
> messing around with metaclasses and so on.

As you were writing this, I was sketching out an implementation using a
callable FunctionWithContract context manager as a decorator. As you say,
the trouble seems to be elegantly capturing the function output and passing
that to an ensure or __exit__ method. The requires side isn't so bad.

Still, I'm somewhat hopeful that someone more skilled than I might be able
to write an elegant ``Contract`` type using current Python syntax.
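
To make the discussion concrete, here is a deliberately small sketch of the
wrapper-function approach, not the FunctionWithContract implementation
mentioned above. The decorator name contract and its require/ensure
parameters are assumptions made for this example. It only passes the
arguments to the pre-condition and the return value plus arguments to the
post-condition; it does nothing about exception state, locals, class
invariants, inheritance or compile-time checking.

```python
import functools

def contract(require=None, ensure=None):
    """Attach optional pre- and post-condition callables to a function."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if require is not None:
                # The pre-condition sees the same arguments as the function.
                assert require(*args, **kwargs), "pre-condition failed"
            result = func(*args, **kwargs)
            if ensure is not None:
                # The post-condition also sees the return value.
                assert ensure(result, *args, **kwargs), "post-condition failed"
            return result
        return wrapper
    return decorate

@contract(require=lambda n=5: n >= 1,
          ensure=lambda result, n=5: result.startswith("Spam"))
def spam(n=5):
    serves = ["spam"] * n
    serves[0] = serves[0].title()
    return " ".join(serves)
```

With this, spam(2) returns "Spam spam" and spam(0) fails the pre-condition
with AssertionError. Since the checks are plain assert statements, running
Python with -O strips them, which only loosely resembles turning contract
checking off.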

From abarnert at  Mon Jan 25 20:02:26 2016
From: abarnert at (Andrew Barnert)
Date: Mon, 25 Jan 2016 17:02:26 -0800
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

One more, ported from VB code I found via Google:

    def nput(n, prompt):
        for i in range(n):
        if prompt:
            tx.immediate = True

I think this has something to do with stock derivatives? If you want to use this to get rich off high-frequency trading, you may want to cythonize or numpyize it.

From mertz at  Mon Jan 25 20:24:29 2016
From: mertz at (David Mertz)
Date: Mon, 25 Jan 2016 17:24:29 -0800
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Just curious, Michael, what would you like the Python syntax version to
look like if you *can* do whatever metaclass or stack hackery that's
needed?  I'm a little confused when you mention a decorator and a context
manager in the same sentence since those would seem like different
approaches.  E.g.:

@Contract(pre=my_pre, post=my_post)
def my_fun(...): ...


with contract(pre=my_pre, post=my_post):
    def my_fun(...): ...


def my_fun(...):
    with contract(pre=my_pre, post=my_post):
        ...

I'm sure lots of other variations are possible too (if any can be made
fully to work).

On Mon, Jan 25, 2016 at 5:01 PM, Michael Selik <mike at> wrote:

> On Mon, Jan 25, 2016 at 6:43 PM Andrew Barnert via Python-ideas <
> python-ideas at> wrote:
>> On Jan 25, 2016, at 15:34, Steven D'Aprano <steve at> wrote:
>> >
>> > Okay, just to satisfy your curiosity, and not as a concrete proposal at
>> > this time, here is a sketch of the sort of thing Eiffel uses for Design
>> > By Contract.
>> I think it's worth explaining why this has to be an actual language
>> feature, not something you just do by writing functions named "requires"
>> and "ensures". Many of the benefits you cited would work just fine with a
>> PyPI-library solution, but there are some problems that are much harder to
>> solve:
>> Some of these things can be shoehorned in with frame hacks and
>> metaclasses and so on, but it's not fun. ... There have been a few attempts
>> for Python, but most of them seem to have run into similar problems, after
>> a lot of messing around with metaclasses and so on.
> As you were writing this, I was sketching out an implementation using a
> callable FunctionWithContract context manager as a decorator. As you say,
> the trouble seems to be elegantly capturing the function output and passing
> that to an ensure or __exit__ method. The requires side isn't so bad.
> Still, I'm somewhat hopeful that someone more skilled than I might be able
> to write an elegant ``Contract`` type using current Python syntax.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.

From tritium-list at  Tue Jan 26 01:51:46 2016
From: tritium-list at (Alexander Walters)
Date: Tue, 26 Jan 2016 01:51:46 -0500
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

import ast

intput = ast.literal_eval
flput = ast.literal_eval

On 1/25/2016 10:11, Marcel O'Neil wrote:
> def intput():
>     return int(input())
> Life would be just marginally easier, with a punny function name as a 
> bonus.


From sjoerdjob at  Tue Jan 26 05:47:06 2016
From: sjoerdjob at (Sjoerd Job Postmus)
Date: Tue, 26 Jan 2016 11:47:06 +0100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Tue, Jan 26, 2016 at 10:34:55AM +1100, Steven D'Aprano wrote:
> On Wed, Jan 20, 2016 at 05:04:21PM -0800, Guido van Rossum wrote:
> > On Wed, Jan 20, 2016 at 4:10 PM, Steven D'Aprano <steve at>
> > wrote:
> [...]
> > > (I'm saving my energy for Eiffel-like require/ensure blocks
> > > *wink*).
> > >
> > 
> > Now you're making me curious.
> Okay, just to satisfy your curiosity, and not as a concrete proposal at 
> this time, here is a sketch of the sort of thing Eiffel uses for Design 
> By Contract.
> Each function or method has an (optional, but recommended) pre-condition 
> and post-condition. Using a hybrid Eiffel/Python syntax, here is a toy 
> example:
> class Lunch:
>     def __init__(self, arg):
>         self.meat = self.spam(arg)
>     def spam(self, n:int=5):
>         """Set the lunch meat to n servings of spam."""
>         require:
>             # Assert the pre-conditions of the method.
>             assert n >= 1
>         ensure:
>             # Assert the post-conditions of the method.
>             assert self.meat.startswith('Spam')
>             if ' ' in self.meat:
>                 assert ' spam' in self.meat
>         # main body of the method, as usual
>         serves = ['spam']*n
>         serves[0] = serves[0].title()
>         self.meat = ' '.join(serves)
> The require block runs before the body of the method, and the ensure 
> block runs after the body, but before the method returns to the caller. 
> If either fail their assertions, the method fails and raises an 
> exception.
> Benefits:
> - The pre- and post-conditions make up (part of) the method's
>   contract, which is part of the executable documentation of 
>   the method. Documentation tools can extract the ensure 
>   and require sections and present them as part of the API docs.
> - The compiler can turn the contract checking on or off as
>   needed, with the ensure/require sections handled independently.
> - Testing pre- and post-conditions is logically separate from 
>   the method's implementation. This allows the implementation 
>   to vary while keeping the contract the same.
> - But at the same time, the contract is right there with the 
>   method, not separated in some potentially distant part of the
>   code base.

One thing I immediately thought of was using decorators.

    def requires(*conditions):
        def decorator(func):
            # TODO: Do some hackery such that the signature of wrapper
            # matches the signature of `func`.
            def wrapper(*args, **kwargs):
                for condition in conditions:
                    assert eval(condition, {}, locals())
                return func(*args, **kwargs)
            return wrapper
        return decorator

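    def ensure(*conditions):
        def decorator(func):
            def wrapper(*args, **kwargs):
                # Run the function first, then check the post-conditions
                # before handing the result back to the caller.
                result = func(*args, **kwargs)
                for condition in conditions:
                    assert eval(condition, {}, locals())
                return result
            return wrapper
        return decorator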

Maybe do some checking for the optimization-level flag, and replace the
decorator function with `return func` instead of another wrapper?
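
A minimal sketch of that optimization-level check, under some assumptions
of mine: conditions are plain callables rather than strings passed to
eval(), `requires` conditions receive the call's arguments, `ensure`
conditions receive the return value, and `spam` is only a toy function.
`__debug__` is False under `python -O`, and functools.wraps covers the
signature TODO as far as inspect.signature() is concerned.

```python
import functools


def requires(*conditions):
    """Check each condition (a callable taking the call's arguments) first."""
    def decorator(func):
        if not __debug__:
            # Running under ``python -O``: hand back the undecorated function.
            return func

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for condition in conditions:
                assert condition(*args, **kwargs), "precondition failed"
            return func(*args, **kwargs)
        return wrapper
    return decorator


def ensure(*conditions):
    """Check each condition (a callable taking the result) after the call."""
    def decorator(func):
        if not __debug__:
            return func

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            for condition in conditions:
                assert condition(result), "postcondition failed"
            return result
        return wrapper
    return decorator


@requires(lambda n: n >= 1)
@ensure(lambda result: result.startswith('Spam'))
def spam(n=5):
    """Return n servings of spam."""
    serves = ['spam'] * n
    serves[0] = serves[0].title()
    return ' '.join(serves)
```

With checks enabled, spam(0) fails the precondition with an AssertionError.
The conditions only see the arguments actually passed, so defaults are not
filled in for them.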

The `ensure` part isn't quite to my liking yet, but I think that the
`ensure` should have no need to access internal variables of the
function, but only the externally visible state.

(This somewhat mimics what I'm trying to fiddle around with in my own
time: writing a decorator that does run-time checking of argument and
return types of functions.)

From steve at  Tue Jan 26 09:26:47 2016
From: steve at (Steven D'Aprano)
Date: Wed, 27 Jan 2016 01:26:47 +1100
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 25, 2016 at 05:24:29PM -0800, David Mertz wrote:
> Just curious, Michael, what would you like the Python syntax version to
> look like if you *can* do whatever metaclass or stack hackery that's
> needed?  I'm a little confused when you mention a decorator and a context
> manager in the same sentence since those would seem like different
> approaches.  E.g.:

I'm not Michael, but since I started this discussion, I'll give an answer.

I haven't got any working code, but I think something like this would be 
acceptable as a proof-of-concept. I'd use a class as a fake namespace, 
with either a decorator or metaclass:

class myfunction(metaclass=DBC):
    def myfunction(args):
        # function implementation
        ...
    def requires():
        ...
    def ensures():
        ...

The duplication of the name is a bit ugly, and it looks a bit funny for 
the decorator/metaclass to take a class as input and return a function, 
but we don't really have anything else that makes a good namespace. 
There's functions themselves, of course, but it's hard to get at the 

The point is to avoid having to pre-define the pre- and post-condition 
functions. We don't write this:

def __init__(self):
    ...
def method(self, arg):
    ...

class MyClass(init=__init__, method=method)

and nor should we have to do the same for require/ensure.
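
For what it's worth, a metaclass along these lines can be sketched in
current Python. This is only a proof-of-concept under assumptions that are
not part of the proposal above: requires() receives the same arguments as
the body, ensures() receives the result followed by those arguments, and
`halve` is a made-up example.

```python
import functools


class DBC(type):
    """Metaclass that turns a class-shaped namespace into a checked function."""

    def __new__(mcls, name, bases, namespace, **kwargs):
        # The body is the function that shares the class's name.
        body = namespace[name]
        requires = namespace.get('requires')
        ensures = namespace.get('ensures')

        @functools.wraps(body)
        def checked(*args, **kw):
            if requires is not None:
                requires(*args, **kw)
            result = body(*args, **kw)
            if ensures is not None:
                ensures(result, *args, **kw)
            return result

        # Returning a plain function means no class is ever created.
        return checked


class halve(metaclass=DBC):
    def halve(n):
        return n // 2

    def requires(n):
        assert n % 2 == 0, "n must be even"

    def ensures(result, n):
        assert result * 2 == n
```

After the class statement runs, `halve` is an ordinary function: halve(10)
returns 5 and halve(3) fails the precondition.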


From steve at  Tue Jan 26 09:40:23 2016
From: steve at (Steven D'Aprano)
Date: Wed, 27 Jan 2016 01:40:23 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Tue, Jan 26, 2016 at 10:52:59AM +1100, Chris Angelico wrote:
> On Tue, Jan 26, 2016 at 10:21 AM, Steven D'Aprano <steve at> wrote:
> > - It allows us to avoid the "default argument" idiom, in cases where we
> > really don't want the argument, we just want to capture the value. There
> > are a lot of functions which have their parameter list polluted by
> > extraneous arguments that should never be used by the caller simply
> > because that's the only way to get early binding/value capturing.
> >
> Can you actually name a few, please? 

The random module is the first example that comes to mind.

py> inspect.signature(random.randrange)
<Signature (start, stop=None, step=1, _int=<class 'int'>)>

Up until 3.3, the last argument was spelled "int" with no underscore.

random.shuffle also used to have an int=int argument, but it seems to be 
gone in 3.5.

> I went digging earlier, and
> couldn't find any really good examples in the stdlib - they're mostly
> internal functions (underscore-prefixed) that shouldn't be being
> called from outside their own module anyway. Maybe this isn't as
> common an issue as I'd thought.

Obviously you can get away with more junk in a private function than a 
public function, but it's still unpleasant. Even if it only affects the 
maintainer of the library, not the users of it, a polluted signature is 
still polluted.


From p.f.moore at  Tue Jan 26 10:06:48 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 26 Jan 2016 15:06:48 +0000
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <>
References: <>
Message-ID: <>

On 26 January 2016 at 14:26, Steven D'Aprano <steve at> wrote:
> class myfunction(metaclass=DBC):
>     def myfunction(args):
>         # function implementation
>         ...
>     def requires():
>         ...
>     def ensures():
>         ...
> The duplication of the name is a bit ugly, and it looks a bit funny for
> the decorator/metaclass to take a class as input and return a function,
> but we don't really have anything else that makes a good namespace

Well, classes can be callable already, so how about

@DBC
class myfunction:
    def __call__(self, args):
        ...
    @precondition
    def requires(self):
        ...
    @postcondition
    def ensures(self, result):
        ...

The DBC class decorator does something like

def DBC(cls):
    def wrapper(*args, **kw):
        fn = cls()
        fn.args = args
        fn.kw = kw
        for pre in fn.__preconditions__:
            pre()
        result = fn(*args, **kw)
        for post in fn.__postconditions__:
            post(result)
        return result
    return wrapper

Pre and post conditions can access the args via self.args and self.kw.
The method decorators would let you have multiple pre- and
post-conditions. Or you could use "magic" names and omit the
decorators.

From random832 at  Tue Jan 26 10:30:37 2016
From: random832 at (Random832)
Date: Tue, 26 Jan 2016 10:30:37 -0500
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Jan 25, 2016, at 18:01, Mahmoud Hashemi wrote:
> I tried to have fun, but my joke ended up long and maybe useful.
> Anyways, here's *ynput()*:
> Get yourself a True/False from a y/n.

If I were writing such a function I'd use
locale.nl_langinfo(locale.YESEXPR). (and NOEXPR) A survey of these on my
system indicates these all accept y/n, but additionally accept their own
language's term (or a related language - en_DK supports danish J/N, and
en_CA supports french O/N). Mostly these use syntax compatible with
python regex, though a few use (grouping|alternation) with no backslash.
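
A rough sketch of what that could look like. The helper names and the
fallback patterns are my own assumptions; locale.nl_langinfo() and the
YESEXPR/NOEXPR constants are not available on every platform, so the sketch
falls back to plain y/n there. It also assumes the locale's expression is
usable as a Python regex, which, as noted above, is mostly but not always
the case.

```python
import locale
import re


def _langinfo(item_name, fallback):
    """Return the locale's regex for item_name, or fallback if unavailable."""
    try:
        return locale.nl_langinfo(getattr(locale, item_name)) or fallback
    except (AttributeError, ValueError):
        # nl_langinfo or the constant is missing on this platform.
        return fallback


def parse_yes_no(text):
    """Return True for yes, False for no, None if the answer is unclear."""
    if re.search(_langinfo('YESEXPR', '^[yY]'), text):
        return True
    if re.search(_langinfo('NOEXPR', '^[nN]'), text):
        return False
    return None
```

A ynput() would then loop on input() until parse_yes_no() returns something
other than None. Call locale.setlocale(locale.LC_ALL, '') first if the
user's own language should be honoured.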

From rosuav at  Tue Jan 26 10:24:52 2016
From: rosuav at (Chris Angelico)
Date: Wed, 27 Jan 2016 02:24:52 +1100
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 2:06 AM, Paul Moore <p.f.moore at> wrote:
> Well, classes can be callable already, so how about
> @DBC
> class myfunction:
>     def __call__(self, args):
>         ...
>     @precondition
>     def requires(self):
>         ...
>     @postcondition
>     def ensures(self, result):
>         ...
> The DBC class decorator does something like
> def DBC(cls):
>     def wrapper(*args, **kw):
>         fn = cls()
>         fn.args = args
>         fn.kw = kw
>         for pre in fn.__preconditions__:
>             pre()
>         result = fn(*args, **kw)
>         for post in fn.__postconditions__:
>             post(result)
>     return wrapper
> Pre and post conditions can access the args via self.args and self.kw.
> The method decorators would let you have multiple pre- and
> post-conditions. Or you could use "magic" names and omit the
> decorators.

I'd rather use magic names - something like:

@DBC
class myfunction:
    def body(self, args):
        ...
    def requires(self):
        ...
    def ensures(self, result):
        ...

and then the DBC decorator can create a __call__ method. This still
has one nasty problem though: the requires and ensures functions can't
see function arguments. You could get around this by duplicating the
argument list onto the other two, but who wants to do that?


From p.f.moore at  Tue Jan 26 10:42:54 2016
From: p.f.moore at (Paul Moore)
Date: Tue, 26 Jan 2016 15:42:54 +0000
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <>
References: <>
Message-ID: <>

On 26 January 2016 at 15:24, Chris Angelico <rosuav at> wrote:
> This still
> has one nasty problem though: the requires and ensures functions can't
> see function arguments.

See my code - you can put the args onto the instance as attributes for
requires/ensures to inspect.

From rosuav at  Tue Jan 26 10:51:08 2016
From: rosuav at (Chris Angelico)
Date: Wed, 27 Jan 2016 02:51:08 +1100
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 2:42 AM, Paul Moore <p.f.moore at> wrote:
> On 26 January 2016 at 15:24, Chris Angelico <rosuav at> wrote:
>> This still
>> has one nasty problem though: the requires and ensures functions can't
>> see function arguments.
> See my code - you can put the args onto the instance as attributes for
> requires/ensures to inspect.

Except that there can be only one of those at any given time, so you
run into issues with recursion or threads/async/etc; plus, it's still
not properly clean - you have to check either args or kwargs,
depending on whether the argument was passed positionally or by
keyword. I don't see that as a solution.

(Maybe what we need is a "keyword-to-positional" functools feature -
anything in **kwargs that can be interpreted positionally gets removed
and added to *args. Or the other way - keywordify everything.)


From encukou at  Tue Jan 26 11:17:55 2016
From: encukou at (Petr Viktorin)
Date: Tue, 26 Jan 2016 17:17:55 +0100
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/26/2016 04:51 PM, Chris Angelico wrote:
> On Wed, Jan 27, 2016 at 2:42 AM, Paul Moore <p.f.moore at> wrote:
>> On 26 January 2016 at 15:24, Chris Angelico <rosuav at> wrote:
>>> This still
>>> has one nasty problem though: the requires and ensures functions can't
>>> see function arguments.
>> See my code - you can put the args onto the instance as attributes for
>> requires/ensures to inspect.
> Except that there can be only one of those at any given time, so you
> run into issues with recursion or threads/async/etc; plus, it's still
> not properly clean - you have to check either args or kwargs,
> depending on whether the argument was passed positionally or by
> keyword. I don't see that as a solution.
> (Maybe what we need is a "keyword-to-positional" functools feature -
> anything in **kwargs that can be interpreted positionally gets removed
> and added to *args. Or the other way - keywordify everything.)

Well, it's not in functools.

import inspect

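def keyword_to_positional(func, args, kwargs):
    # bind() returns a BoundArguments object; its .args holds everything
    # that can be passed positionally, .kwargs holds the rest.
    bound = inspect.signature(func).bind(*args, **kwargs)
    return bound.args, bound.kwargs

def keywordify_everything(func, args, kwargs):
    # .arguments maps every bound parameter name to its value.
    bound = inspect.signature(func).bind(*args, **kwargs)
    return bound.arguments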
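
To show what these return, here is a small self-contained example. The two
helpers are repeated so the snippet runs on its own; `spam` and its
parameters are made up for illustration.

```python
import inspect


def keyword_to_positional(func, args, kwargs):
    bound = inspect.signature(func).bind(*args, **kwargs)
    return bound.args, bound.kwargs


def keywordify_everything(func, args, kwargs):
    bound = inspect.signature(func).bind(*args, **kwargs)
    return bound.arguments


def spam(n, kind='plain', *, hot=False):
    return n, kind, hot


# 'kind' was passed by keyword but can be positional, so it moves to args;
# 'hot' is keyword-only, so it has to stay in kwargs.
positional, keywords = keyword_to_positional(
    spam, (3,), {'kind': 'fried', 'hot': True})

# Everything that was passed, keyed by parameter name.
by_name = dict(keywordify_everything(spam, (3, 'fried'), {}))
```

Parameters that were not passed at all (here `hot` in the second call) do
not show up unless defaults are applied separately.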

From ethan at  Tue Jan 26 11:55:04 2016
From: ethan at (Ethan Furman)
Date: Tue, 26 Jan 2016 08:55:04 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On 01/25/2016 03:34 PM, Steven D'Aprano wrote:
> On Wed, Jan 20, 2016 at 05:04:21PM -0800, Guido van Rossum wrote:
>> On Wed, Jan 20, 2016 at 4:10 PM, Steven D'Aprano wrote:
> [...]
>>> (I'm saving my energy for Eiffel-like require/ensure blocks
>>> *wink*).
>> Now you're making me curious.
> Okay, just to satisfy your curiosity, and not as a concrete proposal at
> this time, here is a sketch of the sort of thing Eiffel uses for Design
> By Contract.
> Each function or method has an (optional, but recommended) pre-condition
> and post-condition. Using a hybrid Eiffel/Python syntax, here is a toy
> example:
> class Lunch:
>      def __init__(self, arg):
>          self.meat = self.spam(arg)
>      def spam(self, n:int=5):
>          """Set the lunch meat to n servings of spam."""
>          require:
>              # Assert the pre-conditions of the method.
>              assert n >= 1
>          ensure:
>              # Assert the post-conditions of the method.
>              assert self.meat.startswith('Spam')
>              if ' ' in self.meat:
>                  assert ' spam' in self.meat
>          # main body of the method, as usual
>          serves = ['spam']*n
>          serves[0] = serves[0].title()
>          self.meat = ' '.join(serves)

I like that syntax.

Currently, something not too ugly would be to use descriptors -- 
something like:

from dbc import require, ensure

class Frobnigate(object):
     @require
     def spammer(self, desc):
         desc.assertInRange(arg1, 0, 99)

     @spammer
     def _spammer(self, arg1, arg2):
         return arg1 // arg2 + arg1

     @spammer.ensure
     def spammer(self, desc, res):
         if desc.arg2 % 2 == 1:
             desc.assertEqual(res % 2, 1)
         else:
             desc.assertEqual(res % 2, 0)

     @ensure
     def egger(self, desc, res):
         desc.assertIsType(res, str)

     @egger
     def _egger(self, egg_type):
         'scrambled, poached, boiled, etc'
         return egg_type

Where 'desc' in the above code is 'self' for the descriptor so saved 
arguments could be accessed, etc.

I put a leading underscore on the body so it could be kept separate and 
more easily subclassed without losing the DBC portions.

If 'require' is not needed, one can use 'ensure'; both create the DBC 
object which would take care of calling any/all requires, then the 
function, then any/all ensures, and also grabbing and saving the 
function signature and actual parameters.


From srkunze at  Tue Jan 26 12:44:09 2016
From: srkunze at (Sven R. Kunze)
Date: Tue, 26 Jan 2016 18:44:09 +0100
Subject: [Python-ideas] PEP 484 change proposal: Allowing @overload
 outside stub files
In-Reply-To: <>
References: <>
Message-ID: <>

Overall, that's an interesting idea although I share the concern about 
the visual heaviness of the proposal. Not sure if that can be resolved 
properly. I somehow like the comment idea but that doesn't fit into the 
remaining concept well.

On 22.01.2016 21:00, Guido van Rossum wrote:
> Calling an @overload-decorated function is still an error (I propose 
> NotImplemented).

Not sure if that applies here, but would that be rather NotImplementedError?


From ethan at  Tue Jan 26 13:55:31 2016
From: ethan at (Ethan Furman)
Date: Tue, 26 Jan 2016 10:55:31 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
 <> <>
Message-ID: <>

On 01/26/2016 08:55 AM, Ethan Furman wrote:

> Currently, something not too ugly would be to use descriptors --
> something like:
> from dbc import require, ensure
> class Frobnigate(object):
>      @require
>      def spammer(self, desc):
>          desc.assertInRange(arg1, 0, 99)
>      @spammer
>      def _spammer(self, arg1, arg2):
>          return arg1 // arg2 + arg1
>      @spammer.ensure
>      def spammer(self, desc, res):
>          if desc.arg2 % 2 == 1:
>              desc.assertEqual(res % 2, 1)
>          else:
>              desc.assertEqual(res % 2, 0)
>      @ensure
>      def egger(self, desc, res):
>          desc.assertIsType(res, str)
>      @egger
>      def _egger(self, egg_type):
>          'scrambled, poached, boiled, etc'
>           return egg_type
> Where 'desc' in the above code is 'self' for the descriptor so saved
> arguments could be accessed, etc.
> I put a leading underscore on the body so it could be kept separate and
> more easily subclassed without losing the DBC portions.
> If 'require' is not needed, one can use 'ensure'; both create the DBC
> object which would take care of calling any/all requires, then the
> function, then any/all ensures, and also grabbing and saving the
> function signature and actual parameters.

The descriptor itself might look like:

# untested
class require:

     def __init__(desc, func=None):
         # The condition lists get private names so they don't shadow
         # the require()/ensure() methods below.
         desc._requires = [] if func is None else [func]
         desc._ensures = []
         desc.name = None
         desc.func_name = None

     def __call__(desc, func):
         # Used as ``@spammer`` on the body: remember the body's name and
         # hand the function back unchanged.
         desc.func_name = name = func.__name__
         if name.startswith('_'):
             name = name[1:]
         desc.name = name
         return func

     def __get__(desc, self, cls):
         if self is None:
             return desc
         function = getattr(self, desc.func_name)
         def caller(*args, **kwds):
             for require in desc._requires:
                 require(self, desc, *args, **kwds)
             res = function(*args, **kwds)
             for ensure in desc._ensures:
                 ensure(self, desc, res, *args, **kwds)
             return res
         return caller

     def ensure(desc, func):
         desc._ensures.append(func)
         return desc

     def require(desc, func):
         desc._requires.append(func)
         return desc

I decided to pass args and kwds rather than save them to the descriptor 
instance, hoping threading would be easier that way.

The 'ensure' class would be very similar.

This style does require the programmer to have both names: 'spammer' and 
'_spammer' -- it would be a bit cleaner to have a metaclass with a 
custom __getattribute__, but a lot more work and possible metaclass 
conflicts when combining with other interesting metaclasses.


From bzvi7919 at  Tue Jan 26 14:13:55 2016
From: bzvi7919 at (Bar Harel)
Date: Tue, 26 Jan 2016 19:13:55 +0000
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

ynput can use distutils.util.strtobool instead of defining for itself (just
an added bonus)

On Tue, Jan 26, 2016 at 5:30 PM Random832 <random832 at> wrote:

> On Mon, Jan 25, 2016, at 18:01, Mahmoud Hashemi wrote:
> > I tried to have fun, but my joke ended up long and maybe useful.
> >
> > Anyways, here's *ynput()*:
> >
> >
> >
> > Get yourself a True/False from a y/n.
> If I were writing such a function I'd use
> locale.nl_langinfo(locale.YESEXPR). (and NOEXPR) A survey of these on my
> system indicates these all accept y/n, but additionally accept their own
> language's term (or a related language - en_DK supports danish J/N, and
> en_CA supports french O/N). Mostly these use syntax compatible with
> python regex, though a few use (grouping|alternation) with no backslash.

From jimjjewett at  Tue Jan 26 14:40:01 2016
From: jimjjewett at (Jim J. Jewett)
Date: Tue, 26 Jan 2016 14:40:01 -0500
Subject: [Python-ideas] several different needs [Explicit variable capture
 list]
Message-ID: <>

I think a small part of the confusion is that there are at least four
separate (albeit related) use cases.  They all use default arguments
for the current workarounds, but they don't have the same ideal
solution.

(1)  Auxiliary variables

    def f(x, _len=len): ...

This is often a micro-optimization; the _len keyword really shouldn't
be overridden.  Partly because it shouldn't be overridden, having it
in the signature is just ugly.

This could be solved with another separator in the signature, such as
; or a second () or a new keyword ...

    def f(x, aux _len=len): ...
    def f(x, once _len=len): ...

    def f(x; _len=len):...
    def f(x)(_len=len): ...
    def f(x){_len=len}: ...

But realistically, that _len isn't ugly *just* because it shouldn't be
overridden; it is also inherently ugly.  I would prefer that something
like Victor's FAT optimizer just make this idiom obsolete.

(2)  immutable bindings

once X
final Y
const Z

This is pretty similar to the auxiliary variables case, except that it
tends to be desired more outside of functions.  The immutability can
be worth documenting on its own, but it feels too much like a typing
declaration, which raises questions of "why *this* distinction for
*this* variable?"

So again, I think something like Victor's FAT optimizer (plus comments
when immutability really is important) is a better long-term solution,
but I'm not as sure as I was for case 1.

(3)  Persistent storage

    def f(x, _cached_results={}): ...

In the past, I've managed to convince myself that it is good to be
able to pass in a different cache ... or to turn the function into a
class, so that I could get to self, or even to attach attributes to
the function itself (so that rebinding the name to another function in
a naive manner would fail, rather than produce bugs).  Those
convincings don't stick very well, though.

This was clearly at least one of the motivations of some people who
asked about static variables.

I still think it might be nice to just have a way of easily opening a
new scope ... but then I have to explain why I can't just turn the
function into a class...

So in the end, I suspect this use case is better off ignored, but I am
pretty sure it will lead to some extra confusion if any of the others
are "solved" in a way that doesn't consider it.

(4)  Current Value Capture

This is the loop variable case that some have thought was the only
case under consideration.

I don't have anything to add to Andrew Barnert's

but do see Steven D'Aprano's
 for gotchas even within this use case.
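
For readers who have not run into case (4), a minimal illustration of the
loop-variable problem and the default-argument workaround. The function
names are made up for this example.

```python
def make_callbacks_late():
    callbacks = []
    for i in range(3):
        # Each lambda looks up i when it is called, after the loop is done.
        callbacks.append(lambda: i)
    return callbacks


def make_callbacks_captured():
    callbacks = []
    for i in range(3):
        # The default value is evaluated now, capturing the current i.
        callbacks.append(lambda i=i: i)
    return callbacks


late = [f() for f in make_callbacks_late()]          # [2, 2, 2]
captured = [f() for f in make_callbacks_captured()]  # [0, 1, 2]
```

The price of the workaround is the extra `i` parameter in each callback's
signature, which a caller could override by accident.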


From abarnert at  Tue Jan 26 15:59:07 2016
From: abarnert at (Andrew Barnert)
Date: Tue, 26 Jan 2016 12:59:07 -0800
Subject: [Python-ideas] several different needs [Explicit variable
 capture list]
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett at> wrote:
> (1)  Auxiliary variables
>    def f(x, _len=len): ...
> This is often a micro-optimization;

When _isn't_ it a micro-optimization? I think if it isn't, it's a very different case, e.g.:

    def len(iterable, _len=len):
        if something(iterable): special_case()
        else: return _len(iterable)

Obviously non-optimization use cases can't be solved by an optimizer. I think this is really more a special case of your #4, except that you're capturing a builtin instead of a nonlocal.

> But realistically, that _len isn't ugly *just* because it shouldn't be
> overridden; it is also inherently ugly.  I would prefer that something
> like Victor's FAT optimizer just make this idiom obsolete.

But, like most micro-optimizations, you should use this only when you really need it. Which means you probably can't count on a general-purpose optimizer that may do it for you, on some people's installations.

Also, marking that you're using an intentional micro-optimization is useful, even (or maybe especially) if it's ugly: it signals to any future maintainer that performance is particularly important here, and they should be careful with any changes.

Of course some people will abuse that (IIRC, a couple years ago, someone removed all the "register" declarations in the perl 5 source, which not only sped it up by a small amount, but also got people to look at some other micro-optimized code from 15 years ago that was actually pessimizing things on modern platforms...), but those people are the last ones who will stop micro-optimizing because you tell them the compiler can often do it better.

> (2)  immutable bindings
> once X
> final Y
> const Z

But a default value neither guarantees immutability, nor signals such an intent. Parameters can be rebound or mutated just like any other variables.
> So again, I think something like Victor's FAT optimizer (plus comments
> when immutability really is important) is a better long-term solution,
> but I'm not as sure as I was for case 1.

How could an optimizer enforce immutability, much less signal it? It only makes changes that are semantically transparent, and changing a mutable binding to immutable is definitely not transparent.

> (3)  Persistent storage
>    def f(x, _cached_results={}): ...

> I still think it might be nice to just have a way of easily opening a
> new scope ...

You mean to open a new scope _outside_ the function definition, so it can capture the cache in a closure, without leaving it accessible from outside the scope? But then f won't be accessible either, unless you have some way to "return" the value to the parent scope. And a scope that returns something--that's just a function, isn't it?

Meanwhile, a C-style function-static variable isn't really the same thing. Statics are just globals with names nobody else can see. So, for a nested function (or a method) that had a "static cache", any copies of the function would all share the same cache, while one with a closure over a cache defined in a new scope (or a default parameter value, or a class instance) would get a new cache for each copy. So, if you give people an easier way to write statics, they'd still have to use something else when they want the other.
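
A small sketch of that difference. A module-level dict stands in for a
C-style static here, which is only an approximation since Python has no
such feature; all names are made up for illustration.

```python
_shared_cache = {}  # stands in for a static: one for the whole program


def make_static_style():
    def square(x):
        if x not in _shared_cache:
            _shared_cache[x] = x * x
        return _shared_cache[x]
    return square


def make_default_style():
    def square(x, _cache={}):  # a new dict each time this def statement runs
        if x not in _cache:
            _cache[x] = x * x
        return _cache[x]
    return square


s1, s2 = make_static_style(), make_static_style()
d1, d2 = make_default_style(), make_default_style()

s1(3)
d1(3)

shared_seen_by_s2 = dict(_shared_cache)  # filled in by s1's call
d1_cache = dict(d1.__defaults__[0])      # d1 has its own entry
d2_cache = dict(d2.__defaults__[0])      # d2's cache is still empty
```

Both copies made by make_static_style() see the entry that s1 added, while
each copy made by make_default_style() keeps a cache of its own.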

From abarnert at  Tue Jan 26 17:10:41 2016
From: abarnert at (Andrew Barnert)
Date: Tue, 26 Jan 2016 14:10:41 -0800
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <>
References: <>
Message-ID: <>

There are probably a dozen DBC packages on PyPI, and dozens more that never even got that far. If this is doable without language changes, surely it'll be done on PyPI first, get traction there, and only then be considered for inclusion in the stdlib (so that it can be used to contractify parts of the stdlib), right?

But, since this is fun:

On Jan 26, 2016, at 07:24, Chris Angelico <rosuav at> wrote:
>> On Wed, Jan 27, 2016 at 2:06 AM, Paul Moore <p.f.moore at> wrote:
>> Well, classes can be callable already, so how about
>> @DBC
>> class myfunction:
>>    def __call__(self, args):
>>        ...
>>    @precondition
>>    def requires(self):
>>        ...
>>    @postcondition
>>    def ensures(self, result):
>>        ...
>> The DBC class decorator does something like
>> def DBC(cls):
>>    def wrapper(*args, **kw):
>>        fn = cls()
>>        fn.args = args
>>        fn.kw = kw
>>        for pre in fn.__preconditions__:
>>            pre()
>>        result = fn(*args, **kw)
>>        for post in fn.__postconditions__:
>>            post(result)
>>    return wrapper
>> Pre and post conditions can access the args via self.args and
>> The method decorators would let you have multiple pre- and
>> post-conditions. Or you could use "magic" names and omit the
>> decorators.
> I'd rather use magic names - something like:
> @DBC
> class myfunction:
>    def body(self, args):
>        ...
>    def requires(self):
>        ...
>    def ensures(self, result):
>        ...
> and then the DBC decorator can create a __call__ method. This still
> has one nasty problem though: the requires and ensures functions can't
> see function arguments. You could get around this by duplicating the
> argument list onto the other two, but who wants to do that?

You could do this pretty easily with a macro that returns (the AST for) something like this:

    def myfunction([func.body.params]):
        except Return as r:
            result, exc = r.args[0], None
            return result
        except Exception as exc:

(I deliberately didn't write this in MacroPy style, but obviously if you really wanted to implement this, that's how you'd do it.)

There are still a few things missing here. For example, many postconditions are specified in terms of the pre- and post- values of mutable parameters, with self as a very important special case. And fitting class invariant testing into this scheme should be extra fun. But I think it's all doable.
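
Without macros, a plain decorator can already let the conditions see the function's arguments by name, which was the problem raised in the quoted text. The following is a minimal sketch, not any existing DBC package: `contract`, `requires` and `ensures` are hypothetical names, and it assumes the decorated function has no `*args` or `**kwargs` parameters of its own.

```python
import functools
import inspect

def contract(requires=None, ensures=None):
    """Hypothetical helper: run checks that receive the call's named arguments."""
    def decorate(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments onto parameter names.
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            if requires is not None:
                assert requires(**bound.arguments), "precondition failed"
            result = func(*args, **kwargs)
            if ensures is not None:
                assert ensures(result, **bound.arguments), "postcondition failed"
            return result
        return wrapper
    return decorate

@contract(requires=lambda x, y: y != 0,
          ensures=lambda result, x, y: abs(result * y - x) < 1e-9)
def divide(x, y):
    return x / y
```

Because the checks use `assert`, they disappear under `python -O`. This sketch also does nothing about pre- and post-values of mutable parameters or class invariants.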

From greg.ewing at  Tue Jan 26 18:59:13 2016
From: greg.ewing at (Greg Ewing)
Date: Wed, 27 Jan 2016 12:59:13 +1300
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

Steven D'Aprano wrote:
> Outside of such toys, we often find ourselves closing 
> over at least one variable which is derived from the loop variable, but 
> not the loop variable itself:

Terry's idea of a variant of the for-loop whose body
is a nested scope (with everything that implies) would
address that, because any name assigned within the body
(and not declared nonlocal) would be part of the captured

> I would not like to see "new" become a keyword.

I'm open to alternatives. Would "foreach" be better
keyword material? We could say

   foreach i in things:

although the difference between "for" and "foreach"
would be far from obvious.

I'd like to do something with "let", which is familiar
from other languages as a binding-creation construct,
and it doesn't seem a likely choice for a variable

Maybe if we had a general statement for introducing
a new scope, independent of looping:


Then loops other than for-loops could be treated like

   i = 0
   while i < n:
       x = things[i]
       funcs.append(lambda: process(x))

The for-loop is a special case, because it assigns a
variable in a place where we can't capture it in a
let-block. So we introduce a variant:

   for let x in things:
     funcs.append(lambda: process(x))
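
For comparison, this is how current Python behaves without any such construct, together with the usual default-argument workaround. It is a small sketch; `things` and `process` are placeholder names taken from the examples above, given arbitrary definitions here.

```python
things = ["a", "b", "c"]

def process(x):
    return x.upper()

# Late binding: each lambda looks x up when it is called,
# by which time the loop has finished.
late = []
for x in things:
    late.append(lambda: process(x))

# Existing workaround: a default argument is evaluated when the
# lambda is created, so each one keeps its own value.
early = []
for x in things:
    early.append(lambda x=x: process(x))

late_results = [f() for f in late]    # ['C', 'C', 'C']
early_results = [f() for f in early]  # ['A', 'B', 'C']
```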


1) Other special cases could be provided, but I don't
think any other special cases are strictly needed. For
example, you might want:

   with open(filename) as let f:

but that could be written as

   with open(filename) as f:
       g = f

2) It may be desirable to allow assignments on the
same line as "let", e.g.

   with open(filename) as f:
     let g = f:

which seems marginally more readable. Also, the
RHS of the assignment would be evaluated outside
the scope being created, allowing

   with open(filename) as f:
     let f = f:

although I'm not sure that's a style that should be
encouraged. Code that apparently assigns something
to itself always looks a bit wanky to me. :-(


From mike at  Tue Jan 26 19:47:07 2016
From: mike at (Michael Selik)
Date: Wed, 27 Jan 2016 00:47:07 +0000
Subject: [Python-ideas] DBC (Re: Explicit variable capture list)
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Mon, Jan 25, 2016 at 7:24 PM David Mertz <mertz at> wrote:

> Just curious, Michael, what would you like the Python syntax version to
> look like if you *can* do whatever metaclass or stack hackery that's
> needed?  I'm a little confused when you mention a decorator and a context
> manager in the same sentence since those would seem like different
> approaches.

Now that you mention it, that does seem weird. Initially the pattern of
trying to factor out a setup/cleanup feels like a context manager. But we
also need to capture the function arguments and return value. So that feels
like a decorator.

I started by implementing an abstract base class Contract that sets up the
require/ensure behavior. One inherits and overrides to implement a
particular contract. The overridden require/ensure functions would receive
the arguments/result of a decorated function.

    class StayPositive(Contract):
        def require(self, *args, **kwargs):
            assert sum(list(args) + list(kwargs.values())) > 0
        def ensure(self, result, *args, **kwargs):
            # ensure receives not only the result, but also the same arguments
            assert result > 0

    @StayPositive
    def foo(i, am, happy):
        return i + am + happy

One thing I like here is that the require/ensure doesn't clutter the
function definition with awkward decorator parameters. The contract terms
are completely separate. This does put the burden on wisely naming the
contract subclass name.

The combination of decorator and context manager was unnecessary. The
internals of my Contract base class included an awkward ``with self:``. If
I were to refactor, I'd separate out a context manager helper from the
decorator object.
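
The internals of the Contract base class are not shown in this message. One way it might be written without the context manager is sketched below. This is an assumption, not the author's implementation: it uses a plain base class rather than a true ABC, and a subclass is applied directly as a decorator.

```python
import functools

class Contract:
    """Hypothetical base class: instantiating a subclass with a function wraps it."""

    def __init__(self, func):
        self.func = func
        functools.update_wrapper(self, func)  # copy __name__, __doc__, etc.

    def require(self, *args, **kwargs):
        pass  # override to check the arguments

    def ensure(self, result, *args, **kwargs):
        pass  # override to check the result (arguments are passed too)

    def __call__(self, *args, **kwargs):
        self.require(*args, **kwargs)
        result = self.func(*args, **kwargs)
        self.ensure(result, *args, **kwargs)
        return result

class StayPositive(Contract):
    def require(self, *args, **kwargs):
        assert sum(list(args) + list(kwargs.values())) > 0

    def ensure(self, result, *args, **kwargs):
        assert result > 0

@StayPositive
def foo(i, am, happy):
    return i + am + happy
```

Calling `foo(1, 2, 3)` passes both checks, while `foo(-1, -2, -3)` fails in `require` before the body runs.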

Seeing some of the stubs folks have written makes me think this ends with
exec-ing a template a la namedtuple.

From jimjjewett at  Tue Jan 26 20:23:11 2016
From: jimjjewett at (Jim J. Jewett)
Date: Tue, 26 Jan 2016 20:23:11 -0500
Subject: [Python-ideas] several different needs [Explicit variable
 capture list]
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert <abarnert at> wrote:
> On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett at> wrote:

>> (1)  Auxiliary variables

>>    def f(x, _len=len): ...

>> This is often a micro-optimization;

> When _isn't_ it a micro-optimization?

It can improve readability, usually by providing a useful rename.

I have a vague sense that there might be other cases I'm forgetting,
simply because I haven't had much use for them myself.

> I think if it isn't, it's a very different case, e.g.:
>     def len(iterable, _len=len):
>         if something(iterable): special_case()
>         else: return _len(iterable)

I would (perhaps wrongly) still have assumed that was at least
intended for optimization.

> Obviously non-optimization use cases can't be solved
> by an optimizer. I think this is really more a special case
> of your #4 ...

[#4 was current-value capture]

I almost never set out to capture a snapshot of the current
environment's values.  I get around to that solution after being
annoyed that something else didn't work, but it isn't the original
intent.  (That might be one reason I sometimes have to stop and think
about closures created in a loop.)

The "this shouldn't be in the signature" and "why is something being
assigned to itself" problems won't go away even if current-value
capture is resolved.  I suspect current-value capture would even
become an attractive nuisance that led to obscure bugs when the value
was captured too soon.

>> But realistically, that _len isn't ugly *just* because it shouldn't be
>> overridden; it is also inherently ugly.  I would prefer that something
>> like Victor's FAT optimizer just make this idiom obsolete.

> But, like most micro-optimizations, you should use this
> only when you really need it. Which means you probably
> can't count on a general-purpose optimizer that may do it
> for you, on some people's installations.

That still argues for not making any changes to the language; I think
the equivalent of (faster access to unchanged globals or builtins) is
a better portability bet than new language features.

> Also, marking that you're using an intentional
> micro-optimization is useful, even (or maybe especially)
> if it's ugly: it signals to any future maintainer that
> performance is particularly important here, and they
> should be careful with any changes.

Changing the language to formalize that signal takes away
some of the emphasis you get from ugliness.  I also wouldn't
assume that such speed assessments are likely to be valid
across the timescales needed for adoption of new syntax.

>> (2)  immutable bindings

>> once X
>> final Y
>> const Z

> But a default value neither guarantees immutability,
> nor signals such an intent. Parameters can be rebound
> or mutated just like any other variables.

It is difficult to signal "once set, this should not change"
in Python, largely because it is so difficult to enforce.

This case might actually be worth new syntax, or a keyword.

Alternatively, it might be like const contagion, that ends
up being applied too often and just adding visual noise.

>> So again, I think something like Victor's FAT optimizer (plus comments
>> when immutability really is important) is a better long-term solution,
>> but I'm not as sure as I was for case 1.

> How could an optimizer enforce immutability, much less signal it?

Victor's guards can "enforce" immutability by recognizing when it
fails in practice.  It can't signal, but comments can ... and
immutability being semantically important (as opposed to merely useful
for optimization) is rare enough that I think a comment is more likely
to be accurate than a type declaration.

>> (3)  Persistent storage

>>    def f(x, _cached_results={}): ...

>> I still think it might be nice to just have a way of easily opening a
>> new scope ...

> You mean to open a new scope _outside_ the function
> definition, so it can capture the cache in a closure, without
> leaving it accessible from outside the scope? But then f won't
> be accessible either, unless you have some way to "return"
> the value to the parent scope. And a scope that returns
> something--that's just a function, isn't it?

It is a function plus a function call, rather than just a function.
Getting that name (possibly several names) bound properly in the outer
scope is also beyond the abilities of a call.  But "opening a new
scope" can start to look a lot like creating a new class instance,

> Meanwhile, a C-style function-static variable isn't really
> the same thing. Statics are just globals with names nobody
> else can see. So, for a nested function (or a method) that
> had a "static cache", any copies of the function would all
> share the same cache, while one with a closure over a
> cache defined in a new scope  (or a default parameter value,
> or a class instance) would get a new cache for each copy.
> So, if you give people an easier way to write statics, they'd
> still have to use something else when they want the other.

And explaining when they want one instead of the other will still be
so difficult that whichever is easier to write will become an
attractive nuisance, that would only cause problems under load.


From abarnert at  Tue Jan 26 23:39:03 2016
From: abarnert at (Andrew Barnert)
Date: Tue, 26 Jan 2016 20:39:03 -0800
Subject: [Python-ideas] several different needs [Explicit variable
 capture list]
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 26, 2016, at 17:23, Jim J. Jewett <jimjjewett at> wrote:
>> On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert <abarnert at> wrote:
>> On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett at> wrote:
>>> (1)  Auxiliary variables
>>>   def f(x, _len=len): ...
>>> This is often a micro-optimization;
>> When _isn't_ it a micro-optimization?
> It can improve readability, usually by providing a useful rename.

OK, but then how could FAT, or any optimizer, help with that?

>> I think if it isn't, it's a very different case, e.g.:
>>    def len(iterable, _len=len):
>>        if something(iterable): special_case()
>>        else: return _len(iterable)
> I would (perhaps wrongly) still have assumed that was at least
> intended for optimization.

This is how you hook a global or builtin function with special behavior for a special case, when you can't use the normal protocol (e.g., because the special case is a C extension type so you can't monkeypatch it), or want to hook it at a smaller scope than builtin. That's usually nothing to do with optimization, but with adding functionality. But, either way, it's not something an optimizer can help with anyway.
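
A runnable sketch of that hook, at module scope, might look like this. `Countdown` is a hypothetical stand-in for a type whose length cannot be supplied through the normal `__len__` protocol.

```python
class Countdown:
    """Hypothetical iterable with a known size but no __len__ method."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))

def len(obj, _len=len):
    # The default is evaluated once, when this def statement runs, so _len
    # still refers to the builtin after the module-level name len is rebound.
    if isinstance(obj, Countdown):
        return obj.start      # added functionality for the special case
    return _len(obj)          # everything else goes to the real len
```

Only code in this module sees the new `len`; other modules keep using the builtin.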

>> Obviously non-optimization use cases can't be solved
>> by an optimizer. I think this is really more a special case
>> of your #4 ...
> [#4 was current-value capture]
> I almost never set out to capture a snapshot of the current
> environment's values.  I get around to that solution after being
> annoyed that something else didn't work, but it isn't the original
> intent.  (That might be one reason I sometimes have to stop and think
> about closures created in a loop.)
> The "this shouldn't be in the signature" and "why is something being
> assigned to itself" problems won't go away even if current-value
> capture is resolved.  I suspect current-value capture would even
> become an attractive nuisance that led to obscure bugs when the value
> was captured too soon.

You may be right here. The fact that current-value capture is currently ugly means you only use it when you need to explicitly signal something unusual, or when you have no other choice. Making it nicer could make it an attractive nuisance.

>> But, like most micro-optimizations, you should use this
>> only when you really need it. Which means you probably
>> can't count on a general-purpose optimizer that may do it
>> for you, on some people's installations.
> That still argues for not making any changes to the language; I think
> the equivalent of (faster access to unchanged globals or builtins) is
> a better portability bet than new language features.

Sure. I already said I don't think anything but maybe (and probably not) the loop-capture problem actually needs to be solved, so you don't have to convince me. :) When you really need the micro-optimization, which is very rare, you will continue to spell it with the default-value trick. The rest of the time, you don't need any way to spell it at all (and maybe FAT will sometimes optimize things for you, but that's just gravy).

> Alternatively, it might be like const contagion, that ends
> up being applied too often and just adding visual noise.

Const contagion is a C++-specific problem. (Actually, two problems--mixing up lvalue-const and rvalue-const incorrectly, and having half the stdlib and half the popular third-party libraries out there not being const-correct because they're actually C libs--but they're both unique to C++.) Play with D or Swift for a while to see how it can work.

>>> So again, I think something like Victor's FAT optimizer (plus comments
>>> when immutability really is important) is a better long-term solution,
>>> but I'm not as sure as I was for case 1.
>> How could an optimizer enforce immutability, much less signal it?
> Victor's guards can "enforce" immutability by recognizing when it
> fails in practice.

But that doesn't do _anything_ semantically--the code runs exactly the same way as if FAT hadn't done anything, except maybe a bit slower. If that's wrong, it's still just as wrong, and you still have no way of noticing that it's wrong, much less fixing it. So FAT is completely irrelevant here.

>  It can't signal, but comments can ... and
> immutability being semantically important (as opposed to merely useful
> for optimization) is rare enough that I think a comment is more likely
> to be accurate than a type declaration.

Here I disagree completely. Why do we have tuple, or frozenset? Why do dicts only take immutable keys? Why does the language make it easier to build mapped/filtered copies than to modify in place? Why can immutable objects be shared between threads or processes trivially, while mutable objects need locks for threads and heavy "manager" objects for processes? Mutability is a very big deal.
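
A few of these points can be seen directly in the interpreter. The snippet below is a small illustration with arbitrary values. Strictly speaking, dict keys must be hashable, which for the builtin containers means immutable.

```python
point = (1, 2)                  # tuple: immutable and hashable
tags = frozenset({"a", "b"})    # frozenset: immutable and hashable
index = {point: "a point", tags: "some tags"}

try:
    index[[1, 2]] = "list key"  # list: mutable, so not hashable
except TypeError as exc:
    error = str(exc)

# Building a mapped and filtered copy leaves the original untouched.
values = [1, 2, 3, 4]
doubled_evens = [v * 2 for v in values if v % 2 == 0]
```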

>>> (3)  Persistent storage
>>>   def f(x, _cached_results={}): ...
>>> I still think it might be nice to just have a way of easily opening a
>>> new scope ...
>> You mean to open a new scope _outside_ the function
>> definition, so it can capture the cache in a closure, without
>> leaving it accessible from outside the scope? But then f won't
>> be accessible either, unless you have some way to "return"
>> the value to the parent scope. And a scope that returns
>> something--that's just a function, isn't it?
> It is a function plus a function call, rather than just a function.
> Getting that name (possible several names) bound properly in the outer
> scope is also beyond the abilities of a call.  

It isn't at all beyond the abilities of defining and calling a function. Here's how you solve this kind of problem in JavaScript:

    var spam = function() {
        var cache = {};
        return function(n) {
            if (cache[n] === undefined) {
                cache[n] = slow_computation(n);
            }
            return cache[n];
        };
    }();

And the exact same thing works in Python:

    def _():
        cache = {}
        def spam(n):
            if n not in cache:
                cache[n] = slow_computation(n)
            return cache[n]
        return spam
    spam = _()

You just rarely do it in Python because we have better ways of doing everything this can do.
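
The message does not say which better ways are meant. For this particular caching example, one plausible candidate (an assumption here) is `functools.lru_cache` from the standard library. In the sketch below, `slow_computation` is a stand-in that records its calls so the caching is visible.

```python
import functools

calls = []

def slow_computation(n):
    calls.append(n)  # record real invocations for the demonstration
    return n * n

@functools.lru_cache(maxsize=None)
def spam(n):
    return slow_computation(n)

spam(4)
spam(4)  # served from the cache; slow_computation is not called again
```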

>> Meanwhile, a C-style function-static variable isn't really
>> the same thing. Statics are just globals with names nobody
>> else can see. So, for a nested function (or a method) that
>> had a "static cache", any copies of the function would all
>> share the same cache, while one with a closure over a
>> cache defined in a new scope  (or a default parameter value,
>> or a class instance) would get a new cache for each copy.
>> So, if you give people an easier way to write statics, they'd
>> still have to use something else when they want the other.
> And explaining when they want one instead of the other will still be
> so difficult that whichever is easier to write will become an
> attractive nuisance, that would only cause problems under load.

Yes, yet another strike against C-style static variables. But, again, I don't think this was a problem that needed solving in the first place.

From mojtaba.gharibi at  Wed Jan 27 00:25:05 2016
From: mojtaba.gharibi at (Mirmojtaba Gharibi)
Date: Wed, 27 Jan 2016 00:25:05 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
Message-ID: <>


I'm thinking of this idea that we have a pseudo-operator called
"Respectively" and shown maybe with ;

Some examples first:

a;b;c = x1;y1;z1 + x2;y2;z2
is equivalent to
a=x1+x2
b=y1+y2
c=z1+z2

So it means for each position in the statement, do something like
respectively. It's like what I call a vertical expansion, i.e. running
statements one by one.
Then there is another unpacking operator which maybe we can show with $
sign and it operates on lists and tuples and creates the "Respectively"
version of them.
So for instance,
vec=[]*10
$vec = $u + $v
will add two 10-dimensional vectors to each other and put the result in vec.

I think this is a syntax that can make many things more concise plus it
makes component wise operation on a list done one by one easy.

For example, we can calculate the inner product between two vectors like
follows (inner product is the sum of component wise multiplication of two
vectors):

innerProduct =0
innerProduct += $a * $b

which is equivalent to
innerProduct=0
for i in range(len(a)):
...innerProduct += a[i]*b[i]

For example, let's say we want to apply a function to all element in a
list, we can do:
f($a)

The $ and ; take precedence over anything except ().

Also, an important thing is that whenever, we don't have the respectively
operator, such as for example in the statement above on the left hand side,
we basically use the same variable or value or operator for each statement
or you can equivalently think we have repeated that whole thing with ;;;;.
Such as:
s=0
s;s;s += a;b;c * d;e;f
which results in s being a*d+b*e+c*f

Also, I didn't spot (at least for now) any ambiguity.
For example one might think what if we do this recursively, such as in:
x;y;z + (a;b;c);(d;e;f);(g;h;i)
using the formula above this is equivalent to
(x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i)
if we apply print on the statement above, the result will be:
x+a
x+b
x+c
y+d
y+e
y+f
z+g
z+h
z+i

Beware that in all of these ; or $ does not create a new list. Rather, they
are like creating new lines in the program and executing those lines one by
one( in the case of $, to be more accurate, we create for loops).

I'll appreciate your time and looking forward to hearing your thoughts.


From sjoerdjob at  Wed Jan 27 00:57:43 2016
From: sjoerdjob at (Sjoerd Job Postmus)
Date: Wed, 27 Jan 2016 06:57:43 +0100
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
> Hello,
> I'm thinking of this idea that we have a pseudo-operator called
> "Respectively" and shown maybe with ;

Hopefully, you're already aware of sequence unpacking? Search for
'unpacking' at .
Unfortunately, it does not have its own section I can directly link to.

    x, y = 3, 5

would give the same result as

    x = 3
    y = 5

But it's more robust, as it can also deal with things like

    x, y = y + 1, x + 4
> Some examples first:
> a;b;c = x1;y1;z1 + x2;y2;z2
> is equivalent to
> a=x1+x2
> b=y1+y2
> c=z1+z2

So what would happen with the following?

    a; b = x1;a + x2;5

> So it means for each position in the statement, do something like
> respectively. It's like what I call a vertical expansion, i.e. running
> statements one by one.
> Then there is another unpacking operator which maybe we can show with $
> sign and it operates on lists and tuples and creates the "Respectively"
> version of them.
> So for instance,
> vec=[]*10
> $vec = $u + $v
> will add two 10-dimensional vectors to each other and put the result in vec.
> I think this is a syntax that can make many things more concise plus it
> makes component wise operation on a list done one by one easy.
> For example, we can calculate the inner product between two vectors like
> follows (inner product is the sum of component wise multiplication of two
> vectors):
> innerProduct =0
> innerProduct += $a * $b
> which is equivalent to
> innerProduct=0
> for i in range(len(a)):
> ...innerProduct += a[i]+b[i]

From what I can see, it would be very beneficial for you to look into
numpy: . It already provides inner product, sums
of arrays and such. I myself am not very familiar with it, but I think
it provides what you need.
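
Even without numpy, both operations from the quoted examples are short in plain Python. A minimal sketch with arbitrary sample vectors:

```python
a = [1, 2, 3]
b = [4, 5, 6]

# Component-wise sum, the plain-Python counterpart of "$vec = $u + $v".
elementwise_sum = [x + y for x, y in zip(a, b)]

# Inner product: the sum of the component-wise products.
inner_product = sum(x * y for x, y in zip(a, b))
```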

> For example, let's say we want to apply a function to all element in a
> list, we can do:
> f($a)
> The $ and ; take precedence over anything except ().
> Also, an important thing is that whenever, we don't have the respectively
> operator, such as for example in the statement above on the left hand side,
> we basically use the same variable or value or operator for each statement
> or you can equivalently think we have repeated that whole thing with ;;;;.
> Such as:
> s=0
> s;s;s += a;b;c; * d;e;f
> which result in s being a*d+b,c*e+d*f
> Also, I didn't spot (at least for now any ambiguity).
> For example one might think what if we do this recursively, such as in:
> x;y;z + (a;b;c);(d;e;f);(g;h;i)
> using the formula above this is equivalent to
> (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i)
> if we apply print on the statement above, the result will be:
> x+a
> x+b
> x+c
> y+d
> y+e
> y+f
> z+g
> z+h
> z+i
> Beware that in all of these ; or $ does not create a new list. Rather, they
> are like creating new lines in the program and executing those lines one by
> one( in the case of $, to be more accurate, we create for loops).
> I'll appreciate your time and looking forward to hearing your thoughts.

Again, probably you should use numpy. I'm not really sure it warrants a
change to the language, because it seems like it would really only be
beneficial to those working with matrices. Numpy already supports it,
and I'm suspecting that the use case for `a;b = c;d + e;f` can already
be satisfied by `a, b = c + e, d + f`, and it already has clearly
documented semantics and still works fine when one of the names on the
left also appears on the right: First all the calculations on the right
are performed, then they are assigned to the names on the left.
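
The difference from running the statements one by one shows up as soon as a name appears on both sides. A small sketch using the earlier numbers:

```python
x, y = 3, 5
x, y = y + 1, x + 4   # both right-hand sides use the old x and y
simultaneous = (x, y)

x, y = 3, 5
x = y + 1             # x is rebound first...
y = x + 4             # ...so this line sees the new x
sequential = (x, y)
```

The tuple assignment yields `(6, 7)`, while the two separate statements yield `(6, 10)`.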

> Cheers,
> Moj

Kind regards,
Sjoerd Job

From mojtaba.gharibi at  Wed Jan 27 01:19:56 2016
From: mojtaba.gharibi at (Mirmojtaba Gharibi)
Date: Wed, 27 Jan 2016 01:19:56 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

Yes, I'm aware of sequence unpacking.
There is an overlap like you mentioned, but there are things that can't be
done with sequence unpacking, but can be done here.

For example, let's say you're given two lists that are not necessarily
numbers, so you can't use numpy, but you want to apply some component-wise
operator between each component. This is something you can't do with
sequence unpacking or with numpy. For example:

$StudentFullName = $FirstName + " " + $LastName

So, in effect, I think one big part of it is component-wise operations.

Another thing that can't be achieved with sequence unpacking is:
f($x)
i.e. applying f for each component of x.

About your question above, it's not ambiguous here either:
 a; b = x1;a + x2;5
is exactly "Equivalent" to
a = x1+x2
b = a + 5

Also, there is a difference in style in sequence unpacking, and here.
In sequence unpacking, you have to pair up the right variables and repeat
the operator, for example:
x,y,z = x1+x2 , y1+y2, z1+z2
Here you don't have to repeat it and pair up the right variables, i.e.
x;y;z = x1;y1;z1 + x2;y2;z2
It's I think good that you (kind of) don't break the encapsulation-ish
thing we have for the three values here. Also, you don't risk, making a
mistake in the operator for one of the values by centralizing the operator
use. For example you could make the mistake:
x,y,z = x1+x2, y1-y2, z1+z2

Also there are all sort of other things that are less of a motivation for
me but that cannot be done with sequence unpacking.
For instance:
add ; prod = a +;* y  (This one I'm not sure how can be achieved without
x;y = f;g (a;b)

On Wed, Jan 27, 2016 at 12:57 AM, Sjoerd Job Postmus <sjoerdjob at>

> On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
> > Hello,
> >
> > I'm thinking of this idea that we have a pseudo-operator called
> > "Respectively" and shown maybe with ;
> Hopefully, you're already aware of sequence unpacking? Search for
> 'unpacking' at .
> Unfortunately, it does not have its own section I can directly link to.
>     x, y = 3, 5
> would give the same result as
>     x = 3
>     y = 5
> But it's more robust, as it can also deal with things like
>     x, y = y + 1, x + 4
> >
> > Some examples first:
> >
> > a;b;c = x1;y1;z1 + x2;y2;z2
> > is equivalent to
> > a=x1+x2
> > b=y1+y2
> > c=z1+z2
> So what would happen with the following?
>     a; b = x1;a + x2;5
> >
> > So it means for each position in the statement, do something like
> > respectively. It's like what I call a vertical expansion, i.e. running
> > statements one by one.
> > Then there is another unpacking operator which maybe we can show with $
> > sign and it operates on lists and tuples and creates the "Respectively"
> > version of them.
> > So for instance,
> > vec=[]*10
> > $vec = $u + $v
> > will add two 10-dimensional vectors to each other and put the result in
> vec.
> >
> > I think this is a syntax that can make many things more concise plus it
> > makes component wise operation on a list done one by one easy.
> >
> > For example, we can calculate the inner product between two vectors like
> > follows (inner product is the sum of component wise multiplication of two
> > vectors):
> >
> > innerProduct =0
> > innerProduct += $a * $b
> >
> > which is equivalent to
> > innerProduct=0
> > for i in range(len(a)):
> > ...innerProduct += a[i]+b[i]
> >
> From what I can see, it would be very beneficial for you to look into
> numpy: . It already provides inner product, sums
> of arrays and such. I myself am not very familiar with it, but I think
> it provides what you need.
> >
> > For example, let's say we want to apply a function to all element in a
> > list, we can do:
> > f($a)
> >
> > The $ and ; take precedence over anything except ().
> >
> > Also, an important thing is that whenever, we don't have the respectively
> > operator, such as for example in the statement above on the left hand
> side,
> > we basically use the same variable or value or operator for each
> statement
> > or you can equivalently think we have repeated that whole thing with
> ;;;;.
> > Such as:
> > s=0
> > s;s;s += a;b;c; * d;e;f
> > which result in s being a*d+b,c*e+d*f
> >
> > Also, I didn't spot (at least for now any ambiguity).
> > For example one might think what if we do this recursively, such as in:
> > x;y;z + (a;b;c);(d;e;f);(g;h;i)
> > using the formula above this is equivalent to
> > (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i)
> > if we apply print on the statement above, the result will be:
> > x+a
> > x+b
> > x+c
> > y+d
> > y+e
> > y+f
> > z+g
> > z+h
> > z+i
> >
> > Beware that in all of these ; or $ does not create a new list. Rather,
> they
> > are like creating new lines in the program and executing those lines one
> by
> > one( in the case of $, to be more accurate, we create for loops).
> >
> > I'll appreciate your time and looking forward to hearing your thoughts.
> Again, probably you should use numpy. I'm not really sure it warrants a
> change to the language, because it seems like it would really only be
> beneficial to those working with matrices. Numpy already supports it,
> and I'm suspecting that the use case for `a;b = c;d + e;f` can already
> be satisfied by `a, b = c + e, d + f`, and it already has clearly
> documented semantics and still works fine when one of the names on the
> left also appears on the right: First all the calculations on the right
> are performed, then they are assigned to the names on the left.
> >
> > Cheers,
> > Moj
> Kind regards,
> Sjoerd Job

From abarnert at  Wed Jan 27 02:29:31 2016
From: abarnert at (Andrew Barnert)
Date: Tue, 26 Jan 2016 23:29:31 -0800
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi <mojtaba.gharibi at> wrote:
> Yes, I'm aware sequence unpacking.
> There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here.
> For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy.

Yes, you can do it with numpy.

Obviously you don't get the performance benefits when you aren't using "native" types (like int32) and operations that have vectorized implementations (like adding two arrays of int32 or taking the dot product of float64 matrices), but you do still get the same elementwise operators, and even a way to apply arbitrary callables over arrays, or even other collections:

    >>> import numpy as np
    >>> firsts = ['John', 'Jane']
    >>> lasts = ['Smith', 'Doe']
    >>> np.vectorize('{1}, {0}'.format)(firsts, lasts)
    array(['Smith, John', 'Doe, Jane'], dtype='<U11')

That's everything you're asking for, with even more flexibility, with no need for any new ugly perlesque syntax: just use at least one np.array type in an operator expression, call a method on an array type, or wrap a function in vectorize, and everything is elementwise.

And of course when you actually _are_ using numbers, as in every single one of your examples, using numpy also gives you around a 6:1 space and 20:1 time savings, which is a nice bonus.

> For example: 
> $StudentFullName = $FirstName + " " + $LastName
> So, in effect, I think one big part of is component wise operations.
> Another thing that can't be achieved with sequence unpacking is:
> f($x)
> i.e. applying f for each component of x.

That's a very different operation, which I think is more readably spelled map(f, x).
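
A minimal sketch of both quoted examples in current Python, without numpy. The sample data and the helper `shout` are made up for illustration and do not come from the proposal:

```python
# Sample data; the names are placeholders for illustration only.
first_names = ['John', 'Jane']
last_names = ['Smith', 'Doe']

# $StudentFullName = $FirstName + " " + $LastName
full_names = [first + " " + last
              for first, last in zip(first_names, last_names)]

# f($x): apply a function to each component of x.
def shout(s):
    return s.upper()

shouted = list(map(shout, full_names))
```

Both spellings build a new list, which is probably what you want in most cases anyway.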

> About your question above, it's not ambiguous here either:
>  a; b = x1;a + x2;5
> is exactly "Equivalent" to
> a = x1+x2
> b = a + 5
> Also, there is a difference in style in sequence unpacking, and here.
> In sequence unpacking, you have to pair up the right variables and repeat the operator, for example:
> x,y,z = x1+x2 , y1+y2, z1+z2
> Here you don't have to repeat it and pair up the right variables, i.e.
> x;y;z = x1;y1;z1 + x2;y2;z2

If you only have two or three of these, that isn't a problem. Although in this case, it sure looks like you're trying to add two 3D vectors, so maybe you should just be storing 3D vectors as instances of a class (with an __add__ method, of course), or as arrays, or as columns in a larger array, rather than as 3 separate variables. What could be more readable than this:

    v = v1 + v2

And if you have more than about three separate variables, you _definitely_ want some kind of array or iterable, not a bunch of separate variables. You're worried about accidentally typing "y1-y2" when you meant "+", but you're far more likely to screw up one of the letters or numbers than the operator. You also can't loop over separate variables, which means you can't factor out some logic and apply it to all three axes, or to both vectors.

Also consider how you'd do something like transposing or pivoting or anything even fancier. If you've got a 2D array or iterable of iterables, that's trivial: transpose or zip, etc. If you've got N*M separate variables, you have to write them all individually. Your syntax at best cuts the source length and opportunity for errors in half; using collections cuts it down to 1.
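
As a rough sketch of the "store it as a class" suggestion: `Vec3` below is a hypothetical class written only for this example, not something from the standard library or numpy.

```python
class Vec3:
    """A tiny 3D vector; hypothetical, for illustration only."""

    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def __iter__(self):
        return iter((self.x, self.y, self.z))

    def __add__(self, other):
        if not isinstance(other, Vec3):
            return NotImplemented
        # One operator, written once, applied to every axis.
        return Vec3(*(a + b for a, b in zip(self, other)))

    def __eq__(self, other):
        if not isinstance(other, Vec3):
            return NotImplemented
        return tuple(self) == tuple(other)

    def __repr__(self):
        return 'Vec3({}, {}, {})'.format(*self)


v1 = Vec3(1, 2, 3)
v2 = Vec3(10, 20, 30)
v = v1 + v2

# Transposing an iterable of iterables is a single zip call.
rows = [(1, 2, 3), (4, 5, 6)]
columns = list(zip(*rows))
```

With the operator defined in one place, there is no per-axis expression left in which to mistype "-" for "+".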


From sjoerdjob at  Wed Jan 27 02:30:04 2016
From: sjoerdjob at (Sjoerd Job Postmus)
Date: Wed, 27 Jan 2016 08:30:04 +0100
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 01:19:56AM -0500, Mirmojtaba Gharibi wrote:
> Yes, I'm aware sequence unpacking.
> There is an overlap like you mentioned, but there are things that can't be
> done with sequence unpacking, but can be done here.
> For example, let's say you're given two lists that are not necessarily
> numbers, so you can't use numpy, but you want to apply some component-wise
> operator between each component. This is something you can't do with
> sequence unpacking or with numpy. For example:
> $StudentFullName = $FirstName + " " + $LastName
> So, in effect, I think one big part of is component wise operations.
> Another thing that can't be achieved with sequence unpacking is:
> f($x)
> i.e. applying f for each component of x.

map(f, x)

> About your question above, it's not ambiguous here either:
>  a; b = x1;a + x2;5
> is exactly "Equivalent" to
> a = x1+x2
> b = a + 5

Now that's confusing, that it differs from sequence unpacking.
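
To make the difference concrete, here is a small sketch with arbitrary starting values (the numbers are assumptions chosen only to show the two results):

```python
x1, x2 = 1, 2

# Sequence unpacking: the whole right-hand side is evaluated first,
# so b is computed from the OLD value of a.
a = 100
a, b = x1 + x2, a + 5
unpacked = (a, b)

# The proposed reading: one statement after another,
# so b is computed from the NEW value of a.
a = 100
a = x1 + x2
b = a + 5
sequential = (a, b)
```

Here `unpacked` is `(3, 105)` while `sequential` is `(3, 8)`.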

> Also, there is a difference in style in sequence unpacking, and here.
> In sequence unpacking, you have to pair up the right variables and repeat
> the operator, for example:
> x,y,z = x1+x2 , y1+y2, z1+z2
> Here you don't have to repeat it and pair up the right variables, i.e.
> x;y;z = x1;y1;z1 + x2;y2;z2
> It's I think good that you (kind of) don't break the encapsulation-ish
> thing we have for the three values here. Also, you don't risk, making a
> mistake in the operator for one of the values by centralizing the operator
> use. For example you could make the mistake:
> x,y,z = x1+x2, y1-y2, z1+z2
> Also there are all sort of other things that are less of a motivation for
> me but that cannot be done with sequence unpacking.
> For instance:
> add ; prod = a +;* y  (This one I'm not sure how can be achieved without
> ambiguity)
> x;y = f;g (a;b)
> On Wed, Jan 27, 2016 at 12:57 AM, Sjoerd Job Postmus <sjoerdjob at>
> wrote:
> > On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
> > > Hello,
> > >
> > > I'm thinking of this idea that we have a pseudo-operator called
> > > "Respectively" and shown maybe with ;
> >
> > Hopefully, you're already aware of sequence unpacking? Search for
> > 'unpacking' at .
> > Unfortunately, it does not have its own section I can directly link to.
> >
> >     x, y = 3, 5
> >
> > would give the same result as
> >
> >     x = 3
> >     y = 5
> >
> > But it's more robust, as it can also deal with things like
> >
> >     x, y = y + 1, x + 4
> > >
> > > Some examples first:
> > >
> > > a;b;c = x1;y1;z1 + x2;y2;z2
> > > is equivalent to
> > > a=x1+x2
> > > b=y1+y2
> > > c=z1+z2
> >
> > So what would happen with the following?
> >
> >     a; b = x1;a + x2;5
> >
> > >
> > > So it means for each position in the statement, do something like
> > > respectively. It's like what I call a vertical expansion, i.e. running
> > > statements one by one.
> > > Then there is another unpacking operator which maybe we can show with $
> > > sign and it operates on lists and tuples and creates the "Respectively"
> > > version of them.
> > > So for instance,
> > > vec=[]*10
> > > $vec = $u + $v
> > > will add two 10-dimensional vectors to each other and put the result in
> > vec.
> > >
> > > I think this is a syntax that can make many things more concise plus it
> > > makes component wise operation on a list done one by one easy.
> > >
> > > For example, we can calculate the inner product between two vectors like
> > > follows (inner product is the sum of component wise multiplication of two
> > > vectors):
> > >
> > > innerProduct =0
> > > innerProduct += $a * $b
> > >
> > > which is equivalent to
> > > innerProduct=0
> > > for i in range(len(a)):
> > > ...innerProduct += a[i]*b[i]
> > >

Thinking about this some more:

How do you know if this is going to return a list of products, or the
sum of those products?

That is, why is `innerProduct += $a * $b` not equivalent to
`innerProduct = $innerProduct + $a * $b`? Or is it? Not quite sure.

A clearer solution would be

    innerProduct = sum(map(operator.mul, a, b))

But that's current-Python syntax.

To be honest, I still haven't seen an added benefit that the new syntax
would gain. Maybe you could expand on that?
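
For reference, a short sketch of how current Python keeps the two readings apart (the sample vectors are made up):

```python
import operator

a = [1, 2, 3]
b = [4, 5, 6]

# The list of component-wise products...
products = list(map(operator.mul, a, b))

# ...and the sum of those products, spelled explicitly.
inner_product = sum(map(operator.mul, a, b))
```

The explicit `sum` call is what tells the reader which of the two results is meant.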

> >
> > From what I can see, it would be very beneficial for you to look into
> > numpy: . It already provides inner product, sums
> > of arrays and such. I myself am not very familiar with it, but I think
> > it provides what you need.
> >
> > >
> > > For example, let's say we want to apply a function to all element in a
> > > list, we can do:
> > > f($a)
> > >
> > > The $ and ; take precedence over anything except ().
> > >
> > > Also, an important thing is that whenever, we don't have the respectively
> > > operator, such as for example in the statement above on the left hand
> > side,
> > > we basically use the same variable or value or operator for each
> > statement
> > > or you can equivalently think we have repeated that whole thing with
> > ;;;;.
> > > Such as:
> > > s=0
> > > s;s;s += a;b;c; * d;e;f
> > > which result in s being a*d+b,c*e+d*f
> > >
> > > Also, I didn't spot (at least for now any ambiguity).
> > > For example one might think what if we do this recursively, such as in:
> > > x;y;z + (a;b;c);(d;e;f);(g;h;i)
> > > using the formula above this is equivalent to
> > > (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i)
> > > if we apply print on the statement above, the result will be:
> > > x+a
> > > x+b
> > > x+c
> > > y+d
> > > y+e
> > > y+f
> > > z+g
> > > z+h
> > > z+i
> > >
> > > Beware that in all of these ; or $ does not create a new list. Rather,
> > they
> > > are like creating new lines in the program and executing those lines one
> > by
> > > one( in the case of $, to be more accurate, we create for loops).
> > >
> > > I'll appreciate your time and looking forward to hearing your thoughts.
> >
> > Again, probably you should use numpy. I'm not really sure it warrants a
> > change to the language, because it seems like it would really only be
> > beneficial to those working with matrices. Numpy already supports it,
> > and I'm suspecting that the use case for `a;b = c;d + e;f` can already
> > be satisfied by `a, b = c + e, d + f`, and it already has clearly
> > documented semantics and still works fine when one of the names on the
> > left also appears on the right: First all the calculations on the right
> > are performed, then they are assigned to the names on the left.
> >
> > >
> > > Cheers,
> > > Moj
> >
> > Kind regards,
> > Sjoerd Job
> >

From steve at  Wed Jan 27 07:41:03 2016
From: steve at (Steven D'Aprano)
Date: Wed, 27 Jan 2016 23:41:03 +1100
Subject: [Python-ideas] several different needs [Explicit variable
 capture list]
In-Reply-To: <>
References: <>
Message-ID: <>

On Tue, Jan 26, 2016 at 12:59:07PM -0800, Andrew Barnert via Python-ideas wrote:
> On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett at> wrote:
> > 
> > (1)  Auxiliary variables
> > 
> >    def f(x, _len=len): ...
> > 
> > This is often a micro-optimization;
> When _isn't_ it a micro-optimization? I think if it isn't, it's a very different case, e.g.:
>     def len(iterable, _len=len):
>         if something(iterable): special_case()
>         else: return _len(iterable)

I'm not sure why you call this "a very different case". It looks the 
same to me: both cases use the default argument trick to capture the 
value of a builtin name. The reasons why they do so are incidental.

I sometimes have code like this:

try:
    enumerate("", 1)
except TypeError:
    # Too old a version of Python.
    def enumerate(it, start=0, enumerate=enumerate):
        for a, b in enumerate(it):
            yield (a+start, b)

I don't really want an extra argument, but nor do I want a global:

_enumerate = enumerate
def enumerate(it, start=0):
    for a, b in _enumerate(it):
        yield (a+start, b)

This isn't a matter of micro-optimization, it's a matter of 
encapsulation. That my enumerate calls the built-in enumerate is an 
implementation detail, and what I'd like is to capture the value without 
either a global or an extra argument:

# capture the current builtin
def enumerate(it, start=0)(enumerate):
    for a, b in enumerate(it):
        yield (a+start, b)

Obviously I'm not going to be able to use hypothetical Python 3.6 syntax 
in code that needs to run in 2.5. But I might be able to use that syntax 
in Python 3.8 for code that needs to run in 3.6.

> > (2)  immutable bindings
> > 
> > once X
> > final Y
> > const Z
> But a default value neither guarantees immutability, nor signals such 
> an intent. Parameters can be rebound or mutated just like any other 
> variables.

I don't think this proposal has anything to say about either 
immutability or bind-once-only "constants".

> > (3)  Persistent storage
> > 
> >    def f(x, _cached_results={}): ...
> > I still think it might be nice to just have a way of easily opening a
> > new scope ...
> You mean to open a new scope _outside_ the function definition, so it 
> can capture the cache in a closure, without leaving it accessible from 
> outside the scope? But then f won't be accessible either, unless you 
> have some way to "return" the value to the parent scope. And a scope 
> that returns something--that's just a function, isn't it?

I'm not sure what point you think you are making here, or what Jim 
meant by his comment about the new scope, but in this case I don't 
think we would want an extra scope. We would want the cache to be in the 
function's local scope, but assigned once at function definition time.

When my caching needs are simple, I might write something like this:

def func(x, cache={}): ...

which is certainly better than having a global variable cache. For many 
applications (quick and dirty scripts) this is perfectly adequate.

For other applications where my caching needs are more sophisticated, I 
might invest the effort in writing a decorator (or use functools.lru_cache),
or a factory to hide the cache in a closure:

def factory():
    cache = {}
    def func(x):
        ...
    return func

func = factory()
del factory
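
A runnable sketch of the two "more effort" options, with `x * x` standing in for whatever expensive computation is being cached (that body and the name `square` are assumptions for illustration):

```python
import functools


def factory():
    cache = {}  # visible only to func, through the closure

    def func(x):
        if x not in cache:
            cache[x] = x * x  # stand-in for an expensive computation
        return cache[x]

    return func


func = factory()
del factory


# The functools alternative: the cache is managed by the decorator.
@functools.lru_cache(maxsize=None)
def square(x):
    return x * x
```

In both versions the cache is not a global and not an extra parameter, at the cost of some extra ceremony around the function.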

but there's a middle ground where I want something less quick'n'dirty 
than the first, but not going to all the effort of the second. For that, 
I think that being able to capture a value fits the use-case perfectly:

def func(x)(cache={}): ...

> Meanwhile, a C-style function-static variable isn't really the same 
> thing. Statics are just globals with names nobody else can see. So, 
> for a nested function (or a method) that had a "static cache", any 
> copies of the function would all share the same cache, 

Copying functions is, I think, a pretty rare and advanced thing to do. 
At least up to 3.4, copy.copy(func) simply returns func, so if you want 
to make an actual distinct copy, you probably need to build a new 
function by hand. In which case, you could copy the cache as part of the 


From jsbueno at  Wed Jan 27 07:44:57 2016
From: jsbueno at (Joao S. O. Bueno)
Date: Wed, 27 Jan 2016 10:44:57 -0200
Subject: [Python-ideas] intput()
In-Reply-To: <>
References: <>
Message-ID: <>

for name, obj in __builtins__.__dict__.items():
   if isinstance(obj, type) and not issubclass(obj, BaseException):
       globals()[name + "put"] = (lambda obj=obj, name=name:
           obj(input("Please type in a {}: ".format(name))))

On 26 January 2016 at 17:13, Bar Harel <bzvi7919 at> wrote:
> ynput can use distutils.util.strtobool instead of defining for itself (just
> an added bonus)
> On Tue, Jan 26, 2016 at 5:30 PM Random832 <random832 at> wrote:
>> On Mon, Jan 25, 2016, at 18:01, Mahmoud Hashemi wrote:
>> > I tried to have fun, but my joke ended up long and maybe useful.
>> >
>> > Anyways, here's *ynput()*:
>> >
>> >
>> >
>> > Get yourself a True/False from a y/n.
>> If I were writing such a function I'd use
>> locale.nl_langinfo(locale.YESEXPR). (and NOEXPR) A survey of these on my
>> system indicates these all accept y/n, but additionally accept their own
>> language's term (or a related language - en_DK supports danish J/N, and
>> en_CA supports french O/N). Mostly these use syntax compatible with
>> python regex, though a few use (grouping|alternation) with no backslash.
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at
>> Code of Conduct:

From victor.stinner at  Wed Jan 27 10:39:10 2016
From: victor.stinner at (Victor Stinner)
Date: Wed, 27 Jan 2016 16:39:10 +0100
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
 "language extension" code transformer should be used or not
Message-ID: <>


Hi,

Thank you for all the feedback on my PEP 511. It looks like the current
blocker point is the unclear status of "language extensions": code
transformers which deliberately change the Python semantics. I would
like to discuss how we should register them. I think that the PEP 511
must discuss "language extensions" even if it doesn't have to propose
a solution to make their usage easier. It's an obvious usage of code
transformers. If possible, I would like to find a compromise to
support them, but make it explicit that they change the Python
semantics.
By the way, I discussed with Joseph Jevnik who wrote codetransformer
(bytecode transformer) and lazy_python (AST transformer). He wrote me:

"One concern that I have though is that transformers are registered
globally. I think that the decorators in codetransformer do a good job
of signalling to reader the scope of some new code generation."

Currently, the PEP 511 doesn't provide a way to register a code
transformer but only use it under some conditions. For example, if
fatoptimizer is registered, all .pyc files will be called
file.cpython-36.fat-0.pyc even if fatoptimizer was disabled.

I propose to change the design of sys.set_code_transformers() to use
it more like a registry similar to the codecs registry
(codecs.register), but different (details below). A difference is that
the codecs registry uses a mapping (codec name => codec functions),
whereas sys.set_code_transformers() uses an ordered sequence (list) of
code transformers. A sequence is used because multiple code
transformers can be applied sequentially on a single .py file.
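
To illustrate why the order matters, here is a minimal sketch built on the standard ast module. `Rename`, `apply_transformers` and `run` are hypothetical helpers for this example only; they are not part of the PEP 511 API.

```python
import ast


class Rename(ast.NodeTransformer):
    """Hypothetical transformer: rename one variable to another."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node


def apply_transformers(source, transformers):
    """Apply each transformer in order to the same tree."""
    tree = ast.parse(source)
    for transformer in transformers:
        tree = transformer.visit(tree)
    ast.fix_missing_locations(tree)
    return tree


def run(source, transformers):
    """Compile and execute the transformed source, return its namespace."""
    namespace = {}
    tree = apply_transformers(source, transformers)
    exec(compile(tree, "<sketch>", "exec"), namespace)
    return namespace


# foo -> bar -> baz when applied in this order...
forward = run("foo = 1", [Rename("foo", "bar"), Rename("bar", "baz")])
# ...but only foo -> bar when the order is reversed.
reverse = run("foo = 1", [Rename("bar", "baz"), Rename("foo", "bar")])
```

A mapping keyed by name would not record which of the two transformers runs first, which is the reason for using a list.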

Petr Viktorin wrote that language extensions "target specific modules,
with which they're closely coupled: The modules won't run without the
transformer. And with other modules, the transformer either does
nothing (as with MacroPy, hopefully), or would fail altogether (as
with Hy). So, they would benefit from specific packages opting in. The
effects of enabling them globally range from inefficiency (MacroPy) to
failures or needing workarounds (Hy)."

Problem (A): solutions proposed below don't make code transformers
mandatory. If some code *requires* a code transformer and the code
transformer is not registered, Python doesn't complain. Do you think
that it is a real issue in practice? For MacroPy, it's not a problem
in practice since functions must be decorated using a decorator from
the macropy package. If importing macropy fails, the module cannot be
imported.

Problem (B): proposed solutions below add markers to ask to enable a
specific code transformer, but a code transformer can decide to always
modify the Python semantics without using such a marker. According to
Nick Coghlan, code transformers changing the Python semantics *must*
require a marker in the code using them. IMHO it's the responsibility
of the author of the code transformer to use markers, not the
responsibility of Python.

Code transformers should maybe return a flag telling if they changed
the code or not. I prefer a flag rather than comparing the output to
the input, since the comparison can be expensive, especially for a
deep AST tree. Example:

class Optimizer:
    def ast_optimizer(self, tree, context):
        # ...
        return modified, tree

*modified* must be True if tree was modified.
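
A fuller sketch of what such a transformer might look like, assuming the `(modified, tree)` return convention described above. The constant-folding pass `_FoldAdd` is a made-up example and the `context` argument is simply ignored here:

```python
import ast


class _FoldAdd(ast.NodeTransformer):
    """Fold `<int> + <int>` into a single constant (illustrative only)."""

    modified = False

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold nested additions first
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and type(node.left.value) is int
                and type(node.right.value) is int):
            self.modified = True
            folded = ast.Constant(value=node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


class Optimizer:
    """Sketch of the proposed protocol; not a real PEP 511 API."""

    def ast_optimizer(self, tree, context):
        folder = _FoldAdd()
        tree = folder.visit(tree)
        if folder.modified:
            ast.fix_missing_locations(tree)
        # The flag is tracked during the walk, so no tree comparison is needed.
        return folder.modified, tree
```

The transformer already knows whether it rewrote anything, so returning the flag costs nothing compared with comparing two trees afterwards.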

There are several options to decide if a code transformer must be used
on a specific source file.

(1) Add a check_code() and check_ast() functions to code transformers.
The code transformer is responsible to decide if it wants to transform
the code or not. Python doesn't use the code transformer if the check
method returns False.


* MacroPy can search for the "import macropy" statement (of "from
macropy import ...") in the AST tree
* fatoptimizer can search for "__fatoptimizer__ = {'enabled': False}"
in the code: if this variable is found, the optimizer is completely
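
The MacroPy check from option (1) could be sketched roughly like this. `uses_module` and `MacroTransformer` are hypothetical names, and the `check_ast(tree, context)` signature is an assumption modelled on `ast_optimizer(tree, context)`:

```python
import ast


def uses_module(tree, module_name):
    """Return True if the tree imports module_name or one of its submodules."""
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import macropy" or "import macropy.core"
            if any(alias.name.split(".")[0] == module_name
                   for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # "from macropy import ..." (absolute imports only)
            if (node.level == 0 and node.module
                    and node.module.split(".")[0] == module_name):
                return True
    return False


class MacroTransformer:
    """Hypothetical transformer following option (1)."""

    def check_ast(self, tree, context):
        # Only ask to transform files that opt in by importing macropy.
        return uses_module(tree, "macropy")
```

Files that never import the package are left alone, so the marker is simply the import statement itself.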

(2) Petr proposed to extend importlib to pass a code transformer when
importing a module.

    importlib.util.import_with_transformer(
        'mypackage.specialmodule', MyTransformer())

IMHO this option is too specific: it's restricted to importlib
(py_compile, compileall and interactive interpreter don't have the
feature). I also dislike the API.

(3) Petr also proposed "a special flag in packages":

    __transformers_for_submodules__ = [MyTransformer()]

I don't like having to get access to MyTransformer. The PEP 511
mentions a use case where the transformed code is run *without*
registering the transformer. But this issue can easily be fixed by
using a string to identify the transformer in the registry (ex:
"fat") rather than its class.

I'm not sure that putting a flag on the package (package/
is a good idea. I would prefer to enable language extensions on
individual files to restrict their scope.

(4) Sjoerd Job Postmus proposed something similar but using a comment
and not for packages, but any source file:

    #:Transformers modname.TransformerClassName,

The problem is that comments are not stored in the AST tree. I would
prefer to use AST to decide if an AST transformer should be used or
not.

Note: I'm not really motivated to extend the AST to start to include
comments, or even code formatting (spaces, newlines, etc.). can be used if you want to
transform a .py file without touching the format. But I don't think
that AST must go to this direction. I prefer to keep AST simple.

(5) Nick proposed (indirectly) to use a different filename (don't use
".py") for language extensions.

This option works with my option (2): the context contains the
filename which can be used to decide to enable or not the code
transformer.

I understand that the code transformer must also install an importlib
hook to search for other filenames than only .py files. Am I right?

(6) Nick proposed (indirectly) to use an encoding cookie "which are
visible as a comment in the module header".

Again, I dislike this option because comments are not stored in AST.


From abarnert at  Wed Jan 27 11:14:15 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 27 Jan 2016 08:14:15 -0800
Subject: [Python-ideas] several different needs [Explicit variable
 capture list]
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 27, 2016, at 04:41, Steven D'Aprano <steve at> wrote:

I think you're actually agreeing with me: there _aren't_ four different cases people actually want here, just the one we've all been talking about, and FAT is irrelevant to that case, so this sub thread is ultimately just a distraction. (We may still disagree about whether the one case needs a solution, or what the best solution would be, but we won't get anywhere by talking about different and unrelated things like this distraction.) But, in case I'm wrong about that, I'll answer your replies anyway:

>> On Tue, Jan 26, 2016 at 12:59:07PM -0800, Andrew Barnert via Python-ideas wrote:
>>> On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett at> wrote:
>>> (1)  Auxiliary variables
>>>   def f(x, _len=len): ...
>>> This is often a micro-optimization;
>> When _isn't_ it a micro-optimization? I think if it isn't, it's a very different case, e.g.:
> I'm not sure why you call this "a very different case".

Because Jim's point was that FAT could do this automatically for him, so we don't need any syntax for it at all. That works for the optimization case, but it doesn't work for your case. Therefore, they're different.

Put another way: Without the default-value trick, his function means the same thing, so if he could rely on FAT, he could just stop using len=len. Without the default value trick, your function means something very different (a RecursionError), so you can't stop using enumerate=enumerate, with or without FAT, unless there's some other equally explicit syntax you can use instead.
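
A small sketch of that difference. The two factory functions exist only to keep the shadowed name out of the module namespace, and are assumptions for illustration:

```python
import builtins


def make_broken():
    def enumerate(it, start=0):
        # "enumerate" here resolves to this very function, so iterating
        # the result recurses until RecursionError.
        for a, b in enumerate(it):
            yield (a + start, b)
    return enumerate


def make_working():
    # The default value captures the builtin at definition time.
    def enumerate(it, start=0, enumerate=builtins.enumerate):
        for a, b in enumerate(it):
            yield (a + start, b)
    return enumerate


working = list(make_working()("ab", 1))

try:
    list(make_broken()("ab"))
except RecursionError:
    broken_raises = True
else:
    broken_raises = False
```

Dropping `_len=len` changes nothing observable, while dropping `enumerate=enumerate` changes the meaning of the function.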

Moreover, your case is really no different from his case #4, the case everyone else has been talking about: you want to capture the value of enumerate at function definition time.

>>> (2)  immutable bindings
>>> once X
>>> final Y
>>> const Z
>> But a default value neither guarantees immutability, nor signals such 
>> an intent. Parameters can be rebound or mutated just like any other 
>> variables.
> I don't think this proposal has anything to say about either 
> immutability or bind-once-only "constants".

Jim insists that it's one of the four things people use default values for, and one of the things people want from this proposal, and that FAT can make that desire irrelevant. I think he's wrong on all three counts: you can't use default values for constness, nobody cares whether any of these new proposals can be used for constness, and FAT can't help anyone who does want constness.

>>> (3)  Persistent storage
>>>   def f(x, _cached_results={}): ...
>>> I still think it might be nice to just have a way of easily opening a
>>> new scope ...
>> You mean to open a new scope _outside_ the function definition, so it 
>> can capture the cache in a closure, without leaving it accessible from 
>> outside the scope? But then f won't be accessible either, unless you 
>> have some way to "return" the value to the parent scope. And a scope 
>> that returns something--that's just a function, isn't it?
> I'm not sure what point you think you are making here, or what Jim 
> meant by his comment about the new scope, but in this case I don't 
> think we would want an extra scope.

My point is that if you want to open a new scope to attach variables to a function, you can already do that by defining and calling a function. Which you very rarely actually need to do, so we don't need to make it any easier. So the fact that no variants of this proposal make it easier is irrelevant. 

>> Meanwhile, a C-style function-static variable isn't really the same 
>> thing. Statics are just globals with names nobody else can see. So, 
>> for a nested function (or a method) that had a "static cache", any 
>> copies of the function would all share the same cache,
> Copying functions is, I think, a pretty rare and advanced thing to do. 

I'm not talking about literally copying functions. I'm talking about nested functions using the same code object for each closure that gets created, and methods using the same code and function object for every bound method that gets created. Using a C-style static variable in these cases means that all your closures, or all your bound methods from different instances, etc., share the same variable, which is not the same behavior as the other alternatives he suggested were equivalent.
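
A sketch of the distinction, emulating a C-style static with a module-level dict (`_static_cache` is a stand-in, since Python has no such declaration):

```python
def make_func():
    cache = {}  # one cache per closure

    def func(x):
        cache.setdefault(x, x * 2)
        return cache

    return func


_static_cache = {}  # stands in for a C-style static: one per code object


def make_static_func():
    def func(x):
        _static_cache.setdefault(x, x * 2)
        return _static_cache

    return func


f1, f2 = make_func(), make_func()
g1, g2 = make_static_func(), make_static_func()

# Both closures are built from the same code object...
same_code = f1.__code__ is f2.__code__

f1(1)
g1(1)
# ...but each closure has its own cache, while the "static" one is shared.
closure_view = f2(2)
static_view = g2(2)
```

Here `closure_view` only contains the entry for 2, while `static_view` also contains the entry that `g1` stored.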

This one, unlike his other points, isn't completely irrelevant. A C-style static declaration could actually serve some of the cases that the proposal is meant to serve. But it can't serve others, and it confusingly looks like it can serve more than it can, which makes it a confusing side track to bring up.

From encukou at  Wed Jan 27 11:36:47 2016
From: encukou at (Petr Viktorin)
Date: Wed, 27 Jan 2016 17:36:47 +0100
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
 "language extension" code transformer should be used or not
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/27/2016 04:39 PM, Victor Stinner wrote:
> Hi,
> Thank you for all feedback on my PEP 511. It looks like the current
> blocker point is the unclear status of "language extensions": code
> transformers which deliberately change the Python semantics. I would
> like to discuss how we should register them. I think that the PEP 511
> must discuss "language extensions" even if it doesn't have to propose
> a solution to make their usage easier. It's an obvious usage of code
> transformers. If possible, I would like to find a compromise to
> support them, but make it explicit that they change the Python
> semantics.
> By the way, I discussed with Joseph Jevnik who wrote codetransformer
> (bytecode transformer) and lazy_python (AST transformer). He wrote me:
> "One concern that I have though is that transformers are registered
> globally. I think that the decorators in codetransformer do a good job
> of signalling to reader the scope of some new code generation."
> Currently, the PEP 511 doesn't provide a way to register a code
> transformer but only use it under some conditions. For example, if
> fatoptimizer is registered, all .pyc files will be called
> file.cpython-36.fat-0.pyc even if fatoptimizer was disabled.
> I propose to change the design of sys.set_code_transformers() to use
> it more like a registry similar to the codecs registry
> (codecs.register), but different (details below). A difference is that
> the codecs registry uses a mapping (codec name => codec functions),
> whereas sys.set_code_transformers() uses an ordered sequence (list) of
> code transformers. A sequence is used because multiple code
> transformers can be applied sequentially on a single .py file.
> Petr Viktorin wrote that language extensions "target specific modules,
> with which they're closely coupled: The modules won't run without the
> transformer. And with other modules, the transformer either does
> nothing (as with MacroPy, hopefully), or would fail altogether (as
> with Hy). So, they would benefit from specific packages opting in. The
> effects of enabling them globally range from inefficiency (MacroPy) to
> failures or needing workarounds (Hy)."
> Problem (A): solutions proposed below don't make code transformers
> mandatory. If a code *requires* a code transformer and the code
> transformer is not registered, Python doesn't complain. Do you think
> that it is a real issue in practice? For MacroPy, it's not a problem
> in practice since functions must be decorated using a decorator from
> the macropy package. If importing macropy fails, the module cannot be
> imported.
> Problem (B): proposed solutions below adds markers to ask to enable a
> specific code transformer, but a code transformer can decide to always
> modify the Python semantics without using such marker. According to
> Nick Coghlan, code transformers changing the Python semantics *must*
> require a marker in the code using them. IMHO it's the responsibility
> of the author of the code transformer to use markers, not the
> responsibility of Python.

I believe Nick meant that if a transformer modifies semantics of
un-marked code, it would be considered a badly written transformer that
doesn't play well with the rest of the language.
The responsibility of Python is just to make it easy to do the right thing.

> Code transformers should maybe return a flag telling if they changed
> the code or not. I prefer a flag rather than comparing the output to
> the input, since the comparison can be expensive, especially for a
> deep AST tree. Example:
> class Optimizer:
>     def ast_optimizer(self, tree, context):
>         # ...
>         return modified, tree
> *modified* must be True if tree was modified.

What would this flag be useful for?

> There are several options to decide if a code transformer must be used
> on a specific source file.
> (2) Petr proposed to extend importlib to pass a code transformer when
> importing a module.
>     importlib.util.import_with_transformer(
>         'mypackage.specialmodule', MyTransformer())
> (5) Nick proposed (indirectly) to use a different filename (don't use
> ".py") for language extensions.
> This option works with my option (2): the context contains the
> filename which can be used to decide to enable or not the code
> transformer.
> I understand that the code transformer must also install an importlib
> hook to search for other filenames than only .py files. Am I right?

Yes, you are. But once a custom import hook is in place, you can just
use a regular import, the hack in (2) isn't necessary.

Also, note that this would solve problem (A) -- without the hook
enabled, the source won't be found.

From victor.stinner at  Wed Jan 27 11:44:56 2016
From: victor.stinner at (Victor Stinner)
Date: Wed, 27 Jan 2016 17:44:56 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>


2016-01-16 17:56 GMT+01:00 Kevin Conway <kevinjacobconway at>:
> I'm a big fan of your motivation to build an optimizer for cPython code.
> What I'm struggling with is understanding why this requires a PEP and
> language modification. There are already several projects that manipulate
> the AST for performance gains such as [1] or even my own ham fisted attempt
> [2].

Oh cool, I didn't know PyCC [2]! I added it to the PEP 511 in the AST
optimizers section of Prior Art.

I wrote astoptimizer [1] and this project uses monkey-patching of the
compile() function, I mentioned this monkey-patching hack in the
rationale of the PEP:

I would like to avoid monkey-patching because it causes various issues.

The PEP 511 also makes transformations more visible: transformers are
explicitly registered in sys.set_code_transformers() and the .pyc
filename is modified when the code is transformed.

It also adds a new feature: it becomes possible to run transformed
code without having to register the transformer at runtime. This is
made possible with the addition of the -o command line option.


From abarnert at  Wed Jan 27 11:48:47 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 27 Jan 2016 08:48:47 -0800
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
 "language extension" code transformer should be used or not
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 27, 2016, at 07:39, Victor Stinner <victor.stinner at> wrote:
> Hi,
> Thank you for all feedback on my PEP 511. It looks like the current
> blocker point is the unclear status of "language extensions": code
> transformers which deliberately change the Python semantics. I would
> like to discuss how we should register them. I think that the PEP 511
> must discuss "language extensions" even if it doesn't have to propose
> a solution to make their usage easier. It's an obvious usage of code
> transformers. If possible, I would like to find a compromise to
> support them, but make it explicit that they change the Python
> semantics.

Is this really necessary?

If someone is testing a language change locally, and just wants to use your (original) API for his tests instead of the more complicated alternative of building an import hook, it works fine. If he can't deploy that way, that's fine.

If someone builds a transformer that adds a feature in a way that makes it a pure superset of Python, he should be fine with running it on all files, so your API works fine. And if some files that didn't use any of the new features get .pyc files that imply they did, so what?

If someone builds a transformer that only runs on files with a different extension, he already needs an import hook, so he might as well just call his transformer from the import hook, same as he does today.

So... What case is served by this new, more complicated API that wasn't already served by your original, simple one (remembering that import hooks are already there as a fallback)?

> By the way, I discussed with Joseph Jevnik who wrote codetransformer
> (bytecode transformer) and lazy_python (AST transformer). He wrote me:
> "One concern that I have though is that transformers are registered
> globally. I think that the decorators in codetransformer do a good job
> of signalling to reader the scope of some new code generation."
> Currently, the PEP 511 doesn't provide a way to register a code
> transformer but only use it under some conditions. For example, if
> fatoptimizer is registered, all .pyc files will be called
> file.cpython-36.fat-0.pyc even if fatoptimizer was disabled.

That doesn't really answer his question, unless you're trying to add some syntax that's like a decorator but for an entire module, to be used in addition to the existing more local class and function decorators?

> Petr Viktorin wrote that language extensions "target specific modules,
> with which they're closely coupled: The modules won't run without the
> transformer. And with other modules, the transformer either does
> nothing (as with MacroPy, hopefully), or would fail altogether (as
> with Hy). So, they would benefit from specific packages opting in. The
> effects of enabling them globally range from inefficiency (MacroPy) to
> failures or needing workarounds (Hy)."

It seems like you're trying to find a declarative alternative to every possible use for an imperative import hook. If you can pull that off, it would be cool--but is it really necessary for your proposal?

Does your solution have to make it possible for MacroPy and Hy to drop their complicated import hooks and just register transformers, for it to be a useful solution?

If the problem you're trying to solve is just making it easier for MacroPy and Hy to coexist with the new transformers, maybe just solve that. For example, if it's too hard for them to decorate .pyc names in a way that fits in with your system, maybe adding a function to get the pre-hook pyc name and to set the post-hook one (e.g., to insert "-pymacro-" in the middle of it) would be sufficient.
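
As a point of comparison, the standard library can already compute a tagged .pyc name through the "optimization" argument of importlib.util.cache_from_source() (added by PEP 488). The sketch below only shows that existing mechanism; it is not the naming scheme PEP 511 proposes, and the helper name, module path and "pymacro" tag are made up for illustration.

```python
import importlib.util


def tagged_cache_name(source_path, tag):
    """Return the .pyc path for source_path with an extra 'opt-<tag>' part.

    The tag must be alphanumeric; cache_from_source() raises ValueError
    for a tag such as "fat-0".
    """
    return importlib.util.cache_from_source(source_path, optimization=tag)


# Hypothetical module path and tag, used only to show the shape of the name.
example = tagged_cache_name("mypackage/specialmodule.py", "pymacro")
```

The result has the form __pycache__/specialmodule.<cache tag>.opt-pymacro.pyc, where the cache tag depends on the running interpreter.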

If there's something that can't be solved in a similar way--e.g., if you think your proposal has to make macropy.console (or whatever he calls the "macros in the REPL" feature) either automatic or at least a lot easier--then maybe that's a different story, but it would be nice to see the rationale for why we need to solve that today. (Couldn't it be added in 3.7, after people have gotten experience with using 3.6 transformers?)

From mojtaba.gharibi at  Wed Jan 27 12:12:07 2016
From: mojtaba.gharibi at (Mirmojtaba Gharibi)
Date: Wed, 27 Jan 2016 12:12:07 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

I think the component-wise operation is the biggest benefit and a more
compact and understandable syntax.
For example,

innerProduct = sum(map(operator.mul, a, b))
is much more complex than
innerProduct += $a * $b

MATLAB has a built-in easy way of achieving component-wise operation and I
think Python would benefit from that without use of libraries such as numpy.

Regarding your question about the difference between
innerProduct += $a * $b
and
innerProduct = $innerProduct + $a * $b

The second statement returns error. I mentioned in my initial email that $
applies to a list or a tuple.
Here I explicitly set my innerProduct=0 initially which you omitted in your example.

innerProduct += $a * $b
is equivalent to
for i in range(len(a)):
...innerProduct += a[i]*b[i]

On Wed, Jan 27, 2016 at 2:30 AM, Sjoerd Job Postmus <sjoerdjob at> wrote:

> On Wed, Jan 27, 2016 at 01:19:56AM -0500, Mirmojtaba Gharibi wrote:
> > Yes, I'm aware sequence unpacking.
> > There is an overlap like you mentioned, but there are things that can't
> be
> > done with sequence unpacking, but can be done here.
> >
> > For example, let's say you're given two lists that are not necessarily
> > numbers, so you can't use numpy, but you want to apply some
> component-wise
> > operator between each component. This is something you can't do with
> > sequence unpacking or with numpy. For example:
> >
> > $StudentFullName = $FirstName + " " + $LastName
> >
> > So, in effect, I think one big part of is component wise operations.
> >
> > Another thing that can't be achieved with sequence unpacking is:
> > f($x)
> > i.e. applying f for each component of x.
> map(f, x)
> >
> > About your question above, it's not ambiguous here either:
> >  a; b = x1;a + x2;5
> > is exactly "Equivalent" to
> > a = x1+x2
> > b = a + 5
> Now that's confusing, that it differs from sequence unpacking.
> >
> > Also, there is a difference in style in sequence unpacking, and here.
> > In sequence unpacking, you have to pair up the right variables and repeat
> > the operator, for example:
> > x,y,z = x1+x2 , y1+y2, z1+z2
> > Here you don't have to repeat it and pair up the right variables, i.e.
> > x;y;z = x1;y1;z1 + x2;y2;z2
> > It's I think good that you (kind of) don't break the encapsulation-ish
> > thing we have for the three values here. Also, you don't risk, making a
> > mistake in the operator for one of the values by centralizing the
> operator
> > use. For example you could make the mistake:
> > x,y,z = x1+x2, y1-y2, z1+z2
> >
> > Also there are all sort of other things that are less of a motivation for
> > me but that cannot be done with sequence unpacking.
> > For instance:
> > add ; prod = a +;* y  (This one I'm not sure how can be achieved without
> > ambiguity)
> > x;y = f;g (a;b)
> >
> >
> > On Wed, Jan 27, 2016 at 12:57 AM, Sjoerd Job Postmus <sjoerdjob at>
> > wrote:
> >
> > > On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
> > > > Hello,
> > > >
> > > > I'm thinking of this idea that we have a pseudo-operator called
> > > > "Respectively" and shown maybe with ;
> > >
> > > Hopefully, you're already aware of sequence unpacking? Search for
> > > 'unpacking' at
> .
> > > Unfortunately, it does not have its own section I can directly link to.
> > >
> > >     x, y = 3, 5
> > >
> > > would give the same result as
> > >
> > >     x = 3
> > >     y = 5
> > >
> > > But it's more robust, as it can also deal with things like
> > >
> > >     x, y = y + 1, x + 4
> > > >
> > > > Some examples first:
> > > >
> > > > a;b;c = x1;y1;z1 + x2;y2;z2
> > > > is equivalent to
> > > > a=x1+x2
> > > > b=y1+y2
> > > > c=z1+z2
> > >
> > > So what would happen with the following?
> > >
> > >     a; b = x1;a + x2;5
> > >
> > > >
> > > > So it means for each position in the statement, do something like
> > > > respectively. It's like what I call a vertical expansion, i.e.
> running
> > > > statements one by one.
> > > > Then there is another unpacking operator which maybe we can show
> with $
> > > > sign and it operates on lists and tuples and creates the
> "Respectively"
> > > > version of them.
> > > > So for instance,
> > > > vec=[]*10
> > > > $vec = $u + $v
> > > > will add two 10-dimensional vectors to each other and put the result
> in
> > > vec.
> > > >
> > > > I think this is a syntax that can make many things more concise plus
> it
> > > > makes component wise operation on a list done one by one easy.
> > > >
> > > > For example, we can calculate the inner product between two vectors
> like
> > > > follows (inner product is the sum of component wise multiplication
> of two
> > > > vectors):
> > > >
> > > > innerProduct =0
> > > > innerProduct += $a * $b
> > > >
> > > > which is equivalent to
> > > > innerProduct=0
> > > > for i in range(len(a)):
> > > > ...innerProduct += a[i]*b[i]
> > > >
> Thinking about this some more:
> How do you know if this is going to return a list of products, or the
> sum of those products?
> That is, why is `innerProduct += $a * $b` not equivalent to
> `innerProduct = $innerProduct + $a * $b`? Or is it? Not quite sure.
> A clearer solution would be
>     innerProduct = sum(map(operator.mul, a, b))
> But that's current-Python syntax.
> To be honest, I still haven't seen an added benefit that the new syntax
> would gain. Maybe you could expand on that?
> > >
> > > From what I can see, it would be very beneficial for you to look into
> > > numpy: . It already provides inner product, sums
> > > of arrays and such. I myself am not very familiar with it, but I think
> > > it provides what you need.
> > >
> > > >
> > > > For example, let's say we want to apply a function to all element in
> a
> > > > list, we can do:
> > > > f($a)
> > > >
> > > > The $ and ; take precedence over anything except ().
> > > >
> > > > Also, an important thing is that whenever, we don't have the
> respectively
> > > > operator, such as for example in the statement above on the left hand
> > > side,
> > > > we basically use the same variable or value or operator for each
> > > statement
> > > > or you can equivalently think we have repeated that whole thing with
> > > ;;;;.
> > > > Such as:
> > > > s=0
> > > > s;s;s += a;b;c * d;e;f
> > > > which results in s being a*d+b*e+c*f
> > > >
> > > > Also, I didn't spot (at least for now any ambiguity).
> > > > For example one might think what if we do this recursively, such as
> in:
> > > > x;y;z + (a;b;c);(d;e;f);(g;h;i)
> > > > using the formula above this is equivalent to
> > > > (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i)
> > > > if we apply print on the statement above, the result will be:
> > > > x+a
> > > > x+b
> > > > x+c
> > > > y+d
> > > > y+e
> > > > y+f
> > > > z+g
> > > > z+h
> > > > z+i
> > > >
> > > > Beware that in all of these ; or $ does not create a new list.
> Rather,
> > > they
> > > > are like creating new lines in the program and executing those lines
> one
> > > by
> > > > one( in the case of $, to be more accurate, we create for loops).
> > > >
> > > > I'll appreciate your time and looking forward to hearing your
> thoughts.
> > >
> > > Again, probably you should use numpy. I'm not really sure it warrants a
> > > change to the language, because it seems like it would really only be
> > > beneficial to those working with matrices. Numpy already supports it,
> > > and I'm suspecting that the use case for `a;b = c;d + e;f` can already
> > > be satisfied by `a, b = c + e, d + f`, and it already has clearly
> > > documented semantics and still works fine when one of the names on the
> > > left also appears on the right: First all the calculations on the right
> > > are performed, then they are assigned to the names on the left.
> > >
> > > >
> > > > Cheers,
> > > > Moj
> > >
> > > Kind regards,
> > > Sjoerd Job
> > >

From jimjjewett at  Wed Jan 27 12:27:55 2016
From: jimjjewett at (Jim J. Jewett)
Date: Wed, 27 Jan 2016 12:27:55 -0500
Subject: [Python-ideas] several different needs [Explicit variable
 capture list]
In-Reply-To: <>
References: <>
Message-ID: <>

An "extra" defaulted parameter is used for many slightly different
reasons ... even a perfect solution for one of them risks being an
attractive nuisance for the others.

On Tue, Jan 26, 2016 at 11:39 PM, Andrew Barnert <abarnert at> wrote:
> On Jan 26, 2016, at 17:23, Jim J. Jewett <jimjjewett at> wrote:

>>> On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert <abarnert at> wrote:
>>> On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett at> wrote:

>>>> (1)  Auxiliary variables

>>>>   def f(x, _len=len): ...
>> It can improve readability, usually by providing a useful rename.

> OK, but then how could FAT, or any optimizer, help with that?

It can't ... and that does argue for aux variables (or a let),
but ... would good usage be swamped by abuse?

You also brought up the case of augmenting a builtin or global,
but still delegating to the original ... I forgot that case, and didn't
even notice that you were rebinding the captured name.  In
those cases, the mechanical intent is "capture the old way",
but the higher level intent is to specialize it.  This should
probably look more like inheritance (or multimethods or
advice and dispatch) ... so even if it deserves a language
change, capture-current-value idioms wouldn't really be an improvement
over the current workaround.
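
For reference, the current workaround being discussed can be sketched as follows. The snippet contrasts late binding with default-argument capture, and then shows the "rebind a name but delegate to the original" case. The shout() function is a made-up stand-in for a builtin or global.

```python
# Late binding: every lambda looks up ``i`` when called, after the loop ended.
late = [lambda: i for i in range(3)]

# Default-argument capture: ``i=i`` stores the current value at definition time.
captured = [lambda i=i: i for i in range(3)]


def shout(text):
    """Hypothetical 'original' function that is about to be specialized."""
    return text.upper()


# Rebinding a name while delegating to the old object: the default
# parameter captures the original ``shout`` before the name is rebound.
def shout(text, _shout=shout):
    return _shout(text) + "!"
```

The extra parameter does the job, but it also becomes part of the public signature, which is one reason it reads more like a workaround than like inheritance or dispatch.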


>>>> So again, I think something like Victor's FAT optimizer (plus comments
>>>> when immutability really is important) ...

>>> How could an optimizer enforce immutability, much less signal it?

>> Victor's guards can "enforce" immutability by recognizing when it
>> fails in practice.

> But that doesn't do _anything_ semantically--the code runs
> exactly the same way as if FAT hadn't done anything,
> except maybe a bit slower. If that's wrong, it's still just as
> wrong, and you still have no way of noticing that it's wrong,
> much less fixing it. So FAT is completely irrelevant here.

Using the specific guards he proposes, yes.
But something like FAT could provide more active guards
that raise an exception, or swap the original value back
into place, or even actively prevent the modification.

Whether these should be triggered by a declaration in
front of the name, or by a module-level freeze statement,
or ... there are enough possibilities that I don't think a
specific solution should be enshrined in the language yet.
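
To make "more active guards" concrete, here is one purely hypothetical shape such a guard could take in today's Python: a namespace object that raises when a frozen name is rebound or deleted. It has nothing to do with how FAT's guards are implemented; the class and method names are invented for this sketch.

```python
class GuardedNamespace:
    """Namespace whose selected names refuse to be rebound or deleted."""

    def __init__(self, **bindings):
        # Bypass our own __setattr__ while setting up the instance.
        object.__setattr__(self, "_frozen", set())
        for name, value in bindings.items():
            object.__setattr__(self, name, value)

    def freeze(self, *names):
        """Mark names as frozen; later rebinding raises AttributeError."""
        self._frozen.update(names)

    def __setattr__(self, name, value):
        if name in self._frozen:
            raise AttributeError(f"{name!r} is frozen and cannot be rebound")
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if name in self._frozen:
            raise AttributeError(f"{name!r} is frozen and cannot be deleted")
        object.__delattr__(self, name)
```

This covers only the "raise an exception" variant. Swapping the original value back, or triggering the guard from a declaration or a module-level freeze statement, would need support that this sketch does not attempt.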

>>  It can't signal, but comments can ... and
>> immutability being semantically important (as opposed to merely useful
>> for optimization) is rare enough that I think a comment is more likely
>> to be accurate than a type declaration.

> Here I disagree completely. Why do we have tuple,
> or frozenset? Why do dicts only take immutable keys?
> Why does the language make it easier to build
> mapped/filtered copies in place? Why can immutable
> objects be shared between threads or processes trivially,
> while mutable objects need locks for threads and heavy
> "manager" objects for processes? Mutability is a very big deal.

Those are all "if you're living with these restrictions anyhow,
and you tell the compiler, the program can run faster."

None of those sound important in terms of "What does this program
(eventually) do?"

(Obviously, when immutability actually *is* important, and an
appropriate immutable data type exists, then *not* using it would send
a bad signal.)

>>> You mean to open a new scope _outside_ the function
>>> definition, so it can capture the cache in a closure, without
>>> leaving it accessible from outside the scope? But then f won't
>>> be accessible either, unless you have some way to "return"
>>> the value to the parent scope. And a scope that returns
>>> something--that's just a function, isn't it?

>> It is a function plus a function call, rather than just a function.
>> Getting that name (possibly several names) bound properly in the outer
>> scope is also beyond the abilities of a call.

> It isn't at all beyond the abilities of defining and calling a function. Here's how you solve this kind of problem in JavaScript:
>     var spam = function(n) {
>         var cache = {};
>         return function(n) {
>             if (cache[n] === undefined) {
>                 cache[n] = slow_computation(n);
>             }
>             return cache[n];
>         };
>     }();

That still doesn't bind n1, n2, n3 in the enclosing scope -- it only
binds spam, from which you can reach spam(n1), spam(n2), etc.

I guess I'm (occasionally) looking for something more like

    class _Scope:
        ...  # definitions that should end up in the enclosing namespace
    for attr in dir(_Scope):
        if not attr.startswith("_"):
            locals()[attr] = getattr(_Scope, attr)
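
One way to approximate this with a function plus a call, binding several names at once, is sketched below. It assumes module-level code (so that updating globals() is meaningful), and the names _build_scope, spam and eggs are placeholders.

```python
def _build_scope():
    cache = {}  # private: reachable only through the closures below

    def spam(n):
        if n not in cache:
            cache[n] = n * n  # stand-in for a slow computation
        return cache[n]

    def eggs(n):
        return spam(n) + 1

    # Explicitly choose which names escape into the enclosing namespace.
    return {"spam": spam, "eggs": eggs}


# Bind every exported name in the module namespace in one step.
globals().update(_build_scope())
```

The cache stays hidden while spam and eggs become ordinary module-level names. The cost is that the export list is written by hand and the binding happens through globals() instead of through syntax.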


From p.f.moore at  Wed Jan 27 12:54:06 2016
From: p.f.moore at (Paul Moore)
Date: Wed, 27 Jan 2016 17:54:06 +0000
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On 27 January 2016 at 17:12, Mirmojtaba Gharibi
<mojtaba.gharibi at> wrote:
> innerProduct = sum(map(operator.mul, a, b))
> is much more complex than
> innerProduct += $a * $b

Certainly the second is *shorter*. But it's full of weird "magic"
behaviour that I don't even begin to know how to explain in general
terms (i.e., without having to appeal to specific examples):

- Why does += magically initialise innerProduct to 0 before doing the
implied loop? Would it initialise to '' if a and b were lists of
strings?
- What would *= or -= or ... initialise to? Why?
- What does $a mean, in isolation from a larger expression?
- How do I generalise my understanding of this expression to work out
what innerProduct += $a * b means?
- Given that omitting the $ before one or both of the variables
totally changes the meaning, how bad of a bug magnet is this?
- What if a and b are different lengths? Why does the length of the
unrelated list b affect the meaning of the expression $a (i.e.,
there's a huge context sensitivity here).
- How do I pronounce $a? What is the name of the $ "operator". "*" is
called "multiply", to give an example of what I mean.

Oh, and your "standard Python" implementation of inner product is not
the most readable (which is a matter of opinion, certainly) approach,
so you're asking a loaded question. An alternative way of writing it
would be

innerProduct = sum(x*y for x, y in zip(a, b))

Variable names that aren't 1-character would probably help the
"normal" version. I can't be sure if they'd help or harm the proposed
version. Probably wouldn't make much difference.
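
For comparison, here are the current-Python spellings from this thread written with longer names and checked against each other. The function names and the sample data in the usage note are made up for illustration.

```python
import operator


def inner_product_map(left, right):
    """The sum(map(operator.mul, ...)) spelling."""
    return sum(map(operator.mul, left, right))


def inner_product_zip(left, right):
    """The generator-expression spelling with zip()."""
    return sum(x * y for x, y in zip(left, right))


def inner_product_loop(left, right):
    """The explicit loop with indexing."""
    total = 0
    for index in range(len(left)):
        total += left[index] * right[index]
    return total
```

With prices = [2, 3, 4] and quantities = [5, 6, 7], all three return 56.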

Sorry, but I see no particular value in this proposal, and many issues
with it. So -1 from me.

From mojtaba.gharibi at  Wed Jan 27 12:55:33 2016
From: mojtaba.gharibi at (Mirmojtaba Gharibi)
Date: Wed, 27 Jan 2016 12:55:33 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert <abarnert at> wrote:

> On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi <mojtaba.gharibi at>
> wrote:
> Yes, I'm aware sequence unpacking.
> There is an overlap like you mentioned, but there are things that can't be
> done with sequence unpacking, but can be done here.
> For example, let's say you're given two lists that are not necessarily
> numbers, so you can't use numpy, but you want to apply some component-wise
> operator between each component. This is something you can't do with
> sequence unpacking or with numpy.
> Yes, you can do it with numpy.
> Obviously you don't get the performance benefits when you aren't using
> "native" types (like int32) and operations that have vectorized
> implementations (like adding two arrays of int32 or taking the dot product
> of float64 matrices), but you do still get the same elementwise operators,
> and even a way to apply arbitrary callables over arrays, or even other
> collections:
>     >>> firsts = ['John', 'Jane']
>     >>> lasts = ['Smith', 'Doe']
>     >>> np.vectorize('{1}, {0}'.format)(firsts, lasts)
>     array(['Smith, John', 'Doe, Jane'], dtype='<U11')

I think the form I am suggesting is simpler and more readable. I'm happy
you brought vectorize to my attention though. I think as soon as you make the
statement just a bit complex, it would become really complicated with
vectorize.

For example, let's say you have

$r = str(len($y*$p)+$x)

It would be really complex to calculate such a thing with vectorize.
All I am saving on is basically a for-loop and the indexing. We don't
really have to use numpy,etc. I think it's much easier to just use for-loop
and indexing, if you don't like the syntax. So I think the question is,
does my syntax bring enough convenience to avoid for-loop and indexing.
For example the above could be equivalently written as
for i in range(0,len(r)):
...r[i] = str(len(y[i]*p[i])+x[i])
So that's the whole saving. Just a for-loop and indexing operator.
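
The same computation can also be written in current Python without indexing, using zip() and a list comprehension. This is a sketch with an invented function name; it builds a new list instead of assigning into an existing r.

```python
def stringify_lengths(y, p, x):
    """Elementwise str(len(y_i * p_i) + x_i) over three sequences."""
    return [str(len(y_i * p_i) + x_i) for y_i, p_i, x_i in zip(y, p, x)]
```

For y = ["ab", "c"], p = [2, 3] and x = [1, 10], this returns ["5", "13"].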

> That's everything you're asking for, with even more flexibility, with no
> need for any new ugly perlesque syntax: just use at least one np.array type
> in an operator expression, call a method on an array type, or wrap a
> function in vectorize, and everything is elementwise.
> And of course when you actually _are_ using numbers, as in every single
> one of your examples, using numpy also gives you around a 6:1 space and
> 20:1 time savings, which is a nice bonus.
> For example:
> $StudentFullName = $FirstName + " " + $LastName
> So, in effect, I think one big part of is component wise operations.
> Another thing that can't be achieved with sequence unpacking is:
> f($x)
> i.e. applying f for each component of x.
> That's a very different operation, which I think is more readably spelled
> map(f, x).
> About your question above, it's not ambiguous here either:
>  a; b = x1;a + x2;5
> is exactly "Equivalent" to
> a = x1+x2
> b = a + 5
> Also, there is a difference in style in sequence unpacking, and here.
> In sequence unpacking, you have to pair up the right variables and repeat
> the operator, for example:
> x,y,z = x1+x2 , y1+y2, z1+z2
> Here you don't have to repeat it and pair up the right variables, i.e.
> x;y;z = x1;y1;z1 + x2;y2;z2
> If you only have two or three of these, that isn't a problem. Although in
> this case, it sure looks like you're trying to add two 3D vectors, so
> maybe you should just be storing 3D vectors as instances of a class (with
> an __add__ method, of course), or as arrays, or as columns in a larger
> array, rather than as 3 separate variables. What could be more readable
> than this:
>     v = v1 + v2
> And if you have more than about three separate variables, you _definitely_
> want some kind of array or iterable, not a bunch of separate variables.
> You're worried about accidentally typing "y1-y2" when you meant "+", but
> you're far more likely to screw up one of the letters or numbers than the
> operator. You also can't loop over separate variables, which means you
> can't factor out some logic and apply it to all three axes, or to both
> vectors. Also consider how you'd do something like transposing or pivoting
> or anything even fancier. If you've got a 2D array or iterable of
> iterables, that's trivial: transpose or zip, etc. If you've got N*M
> separate variables, you have to write them all individually. Your syntax at
> best cuts the source length and opportunity for errors in half; using
> collections cuts it down to 1.
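
The quoted suggestion of storing a 3D vector as an instance of a class with an __add__ method could look like the sketch below. The class name and the use of a frozen dataclass are choices made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector3:
    x: float
    y: float
    z: float

    def __add__(self, other):
        # Componentwise addition, written once for all three axes.
        if not isinstance(other, Vector3):
            return NotImplemented
        return Vector3(self.x + other.x, self.y + other.y, self.z + other.z)

    def __iter__(self):
        # Allows unpacking: x, y, z = v
        return iter((self.x, self.y, self.z))
```

With this in place, v = v1 + v2 adds all three components, and the operator appears only once, so it cannot be mistyped for a single axis.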

From mojtaba.gharibi at  Wed Jan 27 13:02:40 2016
From: mojtaba.gharibi at (Mirmojtaba Gharibi)
Date: Wed, 27 Jan 2016 13:02:40 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

I think a lot of your questions are answered in my very first email,
such as the ones about initialization. I had initialized my variable, but
Sjoerd dropped it when giving his example. Please refer to the very first email.

Regarding how to explain the behaviour in simple terms, I also refer you to
my very first email. Basically it's a pair of (kind of) operators I called
Respectively and unpacking. You can read about it more extensively there.

In a pairwise operation like this, you are supposed to provide lists of
identical length. If a and b have different lengths, my idea is that we
either go only as far as the length of the first list in the operation, or
go up to the length of the longest list and then throw an exception, for instance.
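
For comparison, this is how the two behaviours look in current Python: zip() silently stops at the shorter input, while a small helper built on itertools.zip_longest() can raise once a length mismatch is reached. The helper names are invented for this sketch. On Python 3.10 and later, zip(a, b, strict=True) gives a similar check.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that cannot appear in user data


def strict_pairs(first, second):
    """Yield pairs, raising ValueError if one input runs out early."""
    for a, b in zip_longest(first, second, fillvalue=_MISSING):
        if a is _MISSING or b is _MISSING:
            raise ValueError("operands have different lengths")
        yield a, b


def inner_product_truncating(a, b):
    """Stops at the shorter input without any error."""
    return sum(x * y for x, y in zip(a, b))


def inner_product_strict(a, b):
    """Raises ValueError when the inputs differ in length."""
    return sum(x * y for x, y in strict_pairs(a, b))
```

Note that the strict version only fails after the common prefix has been processed, so code with side effects would already have run for the matching elements.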

On Wed, Jan 27, 2016 at 12:54 PM, Paul Moore <p.f.moore at> wrote:

> On 27 January 2016 at 17:12, Mirmojtaba Gharibi
> <mojtaba.gharibi at> wrote:
> > innerProduct = sum(map(operator.mul, a, b))
> > is much more complex than
> > innerProduct += $a * $b
> Certainly the second is *shorter*. But it's full of weird "magic"
> behaviour that I don't even begin to know how to explain in general
> terms (i.e., without having to appeal to specific examples):
> - Why does += magically initialise innerProduct to 0 before doing the
> implied loop? Would it initialise to '' if a and b were lists of
> strings?
> - What would *= or -= or ... initialise to? Why?
> - What does $a mean, in isolation from a larger expression?
> - How do I generalise my understanding of this expression to work out
> what innerProduct += $a * b means?
> - Given that omitting the $ before one or both of the variables
> totally changes the meaning, how bad of a bug magnet is this?
> - What if a and b are different lengths? Why does the length of the
> unrelated list b affect the meaning of the expression $a (i.e.,
> there's a huge context sensitivity here).
> - How do I pronounce $a? What is the name of the $ "operator". "*" is
> called "multiply", to give an example of what I mean.
> Oh, and your "standard Python" implementation of inner product is not
> the most readable (which is a matter of opinion, certainly) approach,
> so you're asking a loaded question. An alternative way of writing it
> would be
> innerProduct = sum(x*y for x, y in zip(a, b))
> Variable names that aren't 1-character would probably help the
> "normal" version. I can't be sure if they'd help or harm the proposed
> version. Probably wouldn't make much difference.
> Sorry, but I see no particular value in this proposal, and many issues
> with it. So -1 from me.
> Paul

From abarnert at  Wed Jan 27 13:13:18 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 27 Jan 2016 10:13:18 -0800
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 27, 2016, at 09:12, Mirmojtaba Gharibi <mojtaba.gharibi at> wrote:
> I think the component-wise operation is the biggest benefit and a more compact and understandable syntax. 
> For example, 
> innerProduct = sum(map(operator.mul, a, b))
> is much more complex than
> innerProduct += $a * $b
> MATLAB has a built-in easy way of achieving component-wise operation and I think Python would benefit from that without use of libraries such as numpy.

Why? What's wrong with using numpy?

It seems like the only problem in your initial post was that you thought numpy can't do what you want, when in fact it can, and trivially so. Adding the same amount of complexity to the base language wouldn't make it any more discoverable--it would just mean that _all_ Python users now have the potential to be confused, rather than only Python+numpy users, which sounds like a step backward.

Also, this is going to sound like a rhetorical, or even baited, question, but it's not intended that way: what's wrong with APL, or J, or MATLAB, and what makes you want to use Python instead? I'll bet that, directly or indirectly, the reason is the simplicity, consistency, and readability of Python. If you make Python more cryptic and dense, there's a very good chance it'll end up less readable than J rather than more, which would defeat the entire purpose.

Also, while we're at it, if you want the same features as APL and MATLAB, why invent a very different syntax instead of just using their syntax? Most proposals for adding elementwise computation to the base language suggest adding array operators like .+ that work the same way on all types, not adding object-wrapping operators that turn a list or a bunch of separate objects into some hidden type that overloads the normal + operator to be elementwise. What's the rationale for doing it your way instead of the usual way? (I can see one pretty good answer--consistency with numpy--but I don't think it's what you have in mind.)
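
To make the "object-wrapping" reading concrete, here is a rough sketch of a wrapper type whose + and * are elementwise, written in current Python. The Each class is hypothetical and only approximates what $a might mean under that reading; it is not part of the proposal and not how numpy is implemented.

```python
import operator


class Each:
    """Wrap a sequence so that + and * apply element by element."""

    def __init__(self, items):
        self.items = list(items)

    def _apply(self, op, other, reflected=False):
        if isinstance(other, Each):
            if len(other.items) != len(self.items):
                raise ValueError("operands have different lengths")
            pairs = zip(self.items, other.items)
        else:
            # A plain value is reused for every element.
            pairs = ((item, other) for item in self.items)
        if reflected:
            return Each(op(b, a) for a, b in pairs)
        return Each(op(a, b) for a, b in pairs)

    def __add__(self, other):
        return self._apply(operator.add, other)

    def __radd__(self, other):
        return self._apply(operator.add, other, reflected=True)

    def __mul__(self, other):
        return self._apply(operator.mul, other)

    def __rmul__(self, other):
        return self._apply(operator.mul, other, reflected=True)

    def __iter__(self):
        return iter(self.items)
```

With this wrapper, list(Each(firsts) + " " + Each(lasts)) builds full names and sum(Each(a) * Each(b)) is an inner product, all without new syntax. The wrapper has to be spelled out at each use, which is the same trade-off numpy arrays make.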

> Regarding your question about the difference between 
> innerProduct += $a * $b
> and
> innerProduct = $innerProduct + $a * $b
> The second statement returns error. I mentioned in my initial email that $ applies to a list or a tuple.
> Here I explicitly set my innerProduct=0 initially which you omitted in your example.
> innerProduct += $a * $b
> is equivalent to
> for i in range(len(a)):
> ...innerProduct += a[i]*b[i]
>> On Wed, Jan 27, 2016 at 2:30 AM, Sjoerd Job Postmus <sjoerdjob at> wrote:
>> On Wed, Jan 27, 2016 at 01:19:56AM -0500, Mirmojtaba Gharibi wrote:
>> > Yes, I'm aware sequence unpacking.
>> > There is an overlap like you mentioned, but there are things that can't be
>> > done with sequence unpacking, but can be done here.
>> >
>> > For example, let's say you're given two lists that are not necessarily
>> > numbers, so you can't use numpy, but you want to apply some component-wise
>> > operator between each component. This is something you can't do with
>> > sequence unpacking or with numpy. For example:
>> >
>> > $StudentFullName = $FirstName + " " + $LastName
>> >
>> > So, in effect, I think one big part of is component wise operations.
>> >
>> > Another thing that can't be achieved with sequence unpacking is:
>> > f($x)
>> > i.e. applying f for each component of x.
>> map(f, x)
>> >
>> > About your question above, it's not ambiguous here either:
>> >  a; b = x1;a + x2;5
>> > is exactly "Equivalent" to
>> > a = x1+x2
>> > b = a + 5
>> Now that's confusing, that it differs from sequence unpacking.
>> >
>> > Also, there is a difference in style in sequence unpacking, and here.
>> > In sequence unpacking, you have to pair up the right variables and repeat
>> > the operator, for example:
>> > x,y,z = x1+x2 , y1+y2, z1+z2
>> > Here you don't have to repeat it and pair up the right variables, i.e.
>> > x;y;z = x1;y1;z1 + x2;y2;z2
>> > It's I think good that you (kind of) don't break the encapsulation-ish
>> > thing we have for the three values here. Also, you don't risk, making a
>> > mistake in the operator for one of the values by centralizing the operator
>> > use. For example you could make the mistake:
>> > x,y,z = x1+x2, y1-y2, z1+z2
>> >
>> > Also there are all sort of other things that are less of a motivation for
>> > me but that cannot be done with sequence unpacking.
>> > For instance:
>> > add ; prod = a +;* y  (This one I'm not sure how can be achieved without
>> > ambiguity)
>> > x;y = f;g (a;b)
>> >
>> >
>> > On Wed, Jan 27, 2016 at 12:57 AM, Sjoerd Job Postmus <sjoerdjob at>
>> > wrote:
>> >
>> > > On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
>> > > > Hello,
>> > > >
>> > > > I'm thinking of this idea that we have a pseudo-operator called
>> > > > "Respectively" and shown maybe with ;
>> > >
>> > > Hopefully, you're already aware of sequence unpacking? Search for
>> > > 'unpacking' at .
>> > > Unfortunately, it does not have its own section I can directly link to.
>> > >
>> > >     x, y = 3, 5
>> > >
>> > > would give the same result as
>> > >
>> > >     x = 3
>> > >     y = 5
>> > >
>> > > But it's more robust, as it can also deal with things like
>> > >
>> > >     x, y = y + 1, x + 4
>> > > >
>> > > > Some examples first:
>> > > >
>> > > > a;b;c = x1;y1;z1 + x2;y2;z2
>> > > > is equivalent to
>> > > > a=x1+x2
>> > > > b=y1+y2
>> > > > c=z1+z2
>> > >
>> > > So what would happen with the following?
>> > >
>> > >     a; b = x1;a + x2;5
>> > >
>> > > >
>> > > > So it means for each position in the statement, do something like
>> > > > respectively. It's like what I call a vertical expansion, i.e. running
>> > > > statements one by one.
>> > > > Then there is another unpacking operator which maybe we can show with $
>> > > > sign and it operates on lists and tuples and creates the "Respectively"
>> > > > version of them.
>> > > > So for instance,
>> > > > vec=[None]*10
>> > > > $vec = $u + $v
>> > > > will add two 10-dimensional vectors to each other and put the result in
>> > > vec.
>> > > >
>> > > > I think this is a syntax that can make many things more concise plus it
>> > > > makes component wise operation on a list done one by one easy.
>> > > >
>> > > > For example, we can calculate the inner product between two vectors like
>> > > > follows (inner product is the sum of component wise multiplication of two
>> > > > vectors):
>> > > >
>> > > > innerProduct =0
>> > > > innerProduct += $a * $b
>> > > >
>> > > > which is equivalent to
>> > > > innerProduct=0
>> > > > for i in range(len(a)):
>> > > > ...innerProduct += a[i]*b[i]
>> > > >
>> Thinking about this some more:
>> How do you know if this is going to return a list of products, or the
>> sum of those products?
>> That is, why is `innerProduct += $a * $b` not equivalent to
>> `innerProduct = $innerProduct + $a * $b`? Or is it? Not quite sure.
>> A clearer solution would be
>>     innerProduct = sum(map(operator.mul, a, b))
>> But that's current-Python syntax.
>> To be honest, I still haven't seen an added benefit that the new syntax
>> would gain. Maybe you could expand on that?
>> > >
>> > > From what I can see, it would be very beneficial for you to look into
>> > > numpy: . It already provides inner product, sums
>> > > of arrays and such. I myself am not very familiar with it, but I think
>> > > it provides what you need.
>> > >
>> > > >
>> > > > For example, let's say we want to apply a function to all element in a
>> > > > list, we can do:
>> > > > f($a)
>> > > >
>> > > > The $ and ; take precedence over anything except ().
>> > > >
>> > > > Also, an important thing is that whenever, we don't have the respectively
>> > > > operator, such as for example in the statement above on the left hand
>> > > side,
>> > > > we basically use the same variable or value or operator for each
>> > > statement
>> > > > or you can equivalently think we have repeated that whole thing with
>> > > ;;;;.
>> > > > Such as:
>> > > > s=0
>> > > > s;s;s += a;b;c * d;e;f
>> > > > which results in s being a*d+b*e+c*f
>> > > >
>> > > > Also, I didn't spot (at least for now any ambiguity).
>> > > > For example one might think what if we do this recursively, such as in:
>> > > > x;y;z + (a;b;c);(d;e;f);(g;h;i)
>> > > > using the formula above this is equivalent to
>> > > > (x;x;x);(y;y;y);(z;z;z)+(a;b;c);(d;e;f);(g;h;i)
>> > > > if we apply print on the statement above, the result will be:
>> > > > x+a
>> > > > x+b
>> > > > x+c
>> > > > y+d
>> > > > y+e
>> > > > y+f
>> > > > z+g
>> > > > z+h
>> > > > z+i
>> > > >
>> > > > Beware that in all of these ; or $ does not create a new list. Rather,
>> > > they
>> > > > are like creating new lines in the program and executing those lines one
>> > > by
>> > > > one( in the case of $, to be more accurate, we create for loops).
>> > > >
>> > > > I'll appreciate your time and looking forward to hearing your thoughts.
>> > >
>> > > Again, probably you should use numpy. I'm not really sure it warrants a
>> > > change to the language, because it seems like it would really only be
>> > > beneficial to those working with matrices. Numpy already supports it,
>> > > and I'm suspecting that the use case for `a;b = c;d + e;f` can already
>> > > be satisfied by `a, b = c + e, d + f`, and it already has clearly
>> > > documented semantics and still works fine when one of the names on the
>> > > left also appears on the right: First all the calculations on the right
>> > > are performed, then they are assigned to the names on the left.
>> > >
>> > > >
>> > > > Cheers,
>> > > > Moj
>> > >
>> > > Kind regards,
>> > > Sjoerd Job
>> > >
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From sjoerdjob at  Wed Jan 27 13:58:22 2016
From: sjoerdjob at (Sjoerd Job Postmus)
Date: Wed, 27 Jan 2016 19:58:22 +0100
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 12:55:33PM -0500, Mirmojtaba Gharibi wrote:
> On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert <abarnert at> wrote:
> For example lets say you have
> x=[1,2,3,4,5,...]
> y=['A','BB','CCC',...]
> p=[2,3,4,6,6,...]
> r=[None]*n
> $r = str(len($y*$p)+$x)

I can already see several (current-Python) solutions:

    r = [str(len(yv * pv) + xv) for xv, yv, pv in zip(x, y, p)]
    r = map(lambda xv, yv, pv: str(len(yv * pv) + xv), x, y, p)

    # Assuming x, y, p are numpy arrays
    r = np.vectorize(lambda xv, yv, pv: str(len(yv * pv) + xv))(x, y, p)

Furthermore, the `str(len(y * p) + x)` is supposed to actually do
something, I presume. Why does that not have a name? Foobarize?

    r = [foobarize(xv, yv, pv) for xv, yv, pv in zip(x, y, p)]
    r = map(foobarize, x, y, p)
    r = np.vectorize(foobarize)(x, y, p)

or in your syntax

$r = foobarize($x, $y, $p)

I assume?
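
For concreteness, here is a small runnable sketch of the comprehension
and map spellings above. The sample data is made up (the thread only
gives the first few elements of each list), and `foobarize` is just the
placeholder name suggested above, not an existing function.

```python
# Hypothetical sample data; the original lists are elided with "...".
x = [1, 2, 3]
y = ['A', 'BB', 'CCC']
p = [2, 3, 4]


def foobarize(xv, yv, pv):
    """Repeat the string yv pv times, add xv to its length, return text."""
    return str(len(yv * pv) + xv)


# List comprehension over the zipped inputs.
r_comprehension = [foobarize(xv, yv, pv) for xv, yv, pv in zip(x, y, p)]

# map() accepts several iterables and passes one item from each per call.
# In Python 3 it returns a lazy iterator, so list() materializes it.
r_map = list(map(foobarize, x, y, p))

print(r_comprehension)  # ['3', '8', '15']
```

Both spellings give the same list; the only difference is whether the
per-element expression is written inline or given a name.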

Also, supposing `f` is a function of two arguments, then

    $r = f(x, $y)

would presumably mean

    r = [f(x, y_val) for y_val in y]

and

    $r = f($x, y)

would mean

    r = [f(x_val, y) for x_val in x]

Then what does

    $r = f($x, $y)

mean? I suppose you want it to mean

    r = [f(x_val, y_val) for x_val, y_val in zip(x, y)]
      = map(f, x, y)

which can be confusing if `x` and `y` have different lengths. But it
could just as plausibly mean either nested loop:

    r = [f(x_val, y_val) for x_val in x for y_val in y]

or

    r = [f(x_val, y_val) for y_val in y for x_val in x]
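
To make the difference between these readings concrete, here is a
minimal sketch. The function `f` and the sample lists are invented for
illustration; the point is only that the pairwise and the nested-loop
spellings give different results, and that zip() quietly drops the
extra element when the lengths differ.

```python
from itertools import product


def f(a, b):
    """Stand-in for an arbitrary two-argument function."""
    return (a, b)


x = [1, 2, 3]
y = ['a', 'b']

# Pairwise reading: zip() stops at the shorter input, so 3 is dropped.
pairwise = [f(xv, yv) for xv, yv in zip(x, y)]

# Nested-loop reading: every x against every y, with x varying slowest.
crossed = [f(xv, yv) for xv in x for yv in y]

print(pairwise)      # [(1, 'a'), (2, 'b')]
print(len(crossed))  # 6
```

The nested comprehension is the same thing as list(product(x, y)), so
all three meanings already have a short, unambiguous spelling.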


Besides the questionable benefit of shorter syntax, I don't think this
makes a good case. Numpy, list/generator comprehensions and the map/zip
builtins already provide more than enough ways to do it. Why add yet
another syntax?

No, you don't have to use numpy. If you don't need it, please don't use
it. But, do not forget that the standard set of builtins is already
powerful enough to give you what you want.

Python is a general-purpose programming language (though often used for
sciency stuff). Matlab is a 'matrix lab' language. If a language's only
purpose is working with matrices, then please, go ahead and build
matrix-specific syntax.

In my experience, Python has a lot more purposes than just matrix
manipulation. The codebases I've worked on had use for the `$` operator
you're suggesting in too few lines of code to make learning the extra
syntax worthwhile.

I'm definitely -1 on yet another syntax when there are already multiple
obvious ways to solve the same problem (numpy, comprehensions, and the
map/zip builtins).

(not sure if I even have the right to vote here, given that I'm not a
core developer, but just giving my opinion)

> It would be really complex to calculate such a thing with vectorize.
> All I am saving on is basically a for-loop and the indexing. We don't
> really have to use numpy,etc. I think it's much easier to just use for-loop
> and indexing, if you don't like the syntax. So I think the question is,
> does my syntax bring enough convenience to avoid for-loop and indexing.
> For example the above could be equivalently written as
> for i in range(0,len(r)):
> ...r[i] = str(len(y[i]*p[i])+x[i])
> So that's the whole saving. Just a for-loop and indexing operator.

And I listed some of the ways you can save the loop + indexing. That
doesn't need new syntax.

> > That's everything you're asking for, with even more flexibility, with no
> > need for any new ugly perlesque syntax: just use at least one np.array type
> > in an operator expression, call a method on an array type, or wrap a
> > function in vectorize, and everything is elementwise.
> >
> > And of course when you actually _are_ using numbers, as in every single
> > one of your examples, using numpy also gives you around a 6:1 space and
> > 20:1 time savings, which is a nice bonus.
> >
> > For example:
> >
> > $StudentFullName = $FirstName + " " + $LastName
> >
> > So, in effect, I think one big part of it is component-wise operations.
> >
> > Another thing that can't be achieved with sequence unpacking is:
> > f($x)
> > i.e. applying f for each component of x.
> >
> >
> > That's a very different operation, which I think is more readably spelled
> > map(f, x).
> >
> > About your question above, it's not ambiguous here either:
> >  a; b = x1;a + x2;5
> > is exactly "Equivalent" to
> > a = x1+x2
> > b = a + 5
> >
> > Also, there is a difference in style in sequence unpacking, and here.
> > In sequence unpacking, you have to pair up the right variables and repeat
> > the operator, for example:
> > x,y,z = x1+x2 , y1+y2, z1+z2
> > Here you don't have to repeat it and pair up the right variables, i.e.
> > x;y;z = x1;y1;z1 + x2;y2;z2
> >
> >
> > If you only have two or three of these, that isn't a problem. Although in
> > this case, it sure looks like you're trying to add two 3D vectors, so
> > maybe you should just be storing 3D vectors as instances of a class (with
> > an __add__ method, of course), or as arrays, or as columns in a larger
> > array, rather than as 3 separate variables. What could be more readable
> > than this:
> >
> >     v = v1 + v2
> >
> > And if you have more than about three separate variables, you _definitely_
> > want some kind of array or iterable, not a bunch of separate variables.
> > You're worried about accidentally typing "y1-y2" when you meant "+", but
> > you're far more likely to screw up one of the letters or numbers than the
> > operator. You also can't loop over separate variables, which means you
> > can't factor out some logic and apply it to all three axes, or to both
> > vectors. Also consider how you'd do something like transposing or pivoting
> > or anything even fancier. If you've got a 2D array or iterable of
> > iterables, that's trivial: transpose or zip, etc. If you've got N*M
> > separate variables, you have to write them all individually. Your syntax at
> > best cuts the source length and opportunity for errors in half; using
> > collections cuts it down to 1.
> >
> >

From srkunze at  Wed Jan 27 14:06:26 2016
From: srkunze at (Sven R. Kunze)
Date: Wed, 27 Jan 2016 20:06:26 +0100
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
 "language extension" code transformer should be used or not
In-Reply-To: <>
References: <>
Message-ID: <>


On 27.01.2016 16:39, Victor Stinner wrote:
> "One concern that I have though is that transformers are registered
> globally. I think that the decorators in codetransformer do a good job
> of signalling to reader the scope of some new code generation."

I share this concern but haven't a good solution right now. Admittedly, 
I already have a use-case where I would like to apply a transformation 
which is NOT an optimization but a global extension.

So, the discussion about allowing global extension really made me think 
about whether that is really a good idea. *BUT* it would allow me to 
experiment and find out if the risk is worth it.

(use-case: adding some hooks before entering and leaving all try blocks)

> Currently, the PEP 511 doesn't provide a way to register a code
> transformer but only use it under some conditions. For example, if
> fatoptimizer is registered, all .pyc files will be called
> file.cpython-36.fat-0.pyc even if fatoptimizer was disabled.
> I propose to change the design of sys.set_code_transformers() to use
> it more like a registry similar to the codecs registry
> (codecs.register), but different (details below). A difference is that
> the codecs registry uses a mapping (codec name => codec functions),
> whereas sys.set_code_transformers() uses an ordered sequence (list) of
> code transformers. A sequence is used because multiple code
> transformers can be applied sequentially on a single .py file.

How does it change the interface for the users? (I mean besides the 

I still like your idea of having the following three options:

1) global optimizers
2) local extensions --> via codec or import hook
3) global extension --> use with care

So, I assume we talk about specifying 2).

> Petr Viktorin wrote that language extensions "target specific modules,
> with which they're closely coupled: The modules won't run without the
> transformer. And with other modules, the transformer either does
> nothing (as with MacroPy, hopefully), or would fail altogether (as
> with Hy). So, they would benefit from specific packages opting in. The
> effects of enabling them globally range from inefficiency (MacroPy) to
> failures or needing workarounds (Hy)."
> Problem (A): solutions proposed below don't make code transformers
> mandatory. If a code *requires* a code transformer and the code
> transformer is not registered, Python doesn't complain. Do you think
> that it is a real issue in practice? For MacroPy, it's not a problem
> in practice since functions must be decorated using a decorator from
> the macropy package. If importing macropy fails, the module cannot be
> imported.

Sounds good.

> Problem (B): proposed solutions below add markers to ask to enable a
> specific code transformer, but a code transformer can decide to always
> modify the Python semantics without using such a marker. According to
> Nick Coghlan, code transformers changing the Python semantics *must*
> require a marker in the code using them. IMHO it's the responsibility
> of the author of the code transformer to use markers, not the
> responsibility of Python.

I agree with Nick. Be explicit.

> Code transformers should maybe return a flag telling if they changed
> the code or not. I prefer a flag rather than comparing the output to
> the input, since the comparison can be expensive, especially for a
> deep AST tree. Example:
> class Optimizer:
>      def ast_optimizer(self, tree, context):
>          # ...
>          return modified, tree
> *modified* must be True if tree was modified.

Not sure if that is needed. If we don't have an immediate use-case, 
simpler is better.

> There are several options to decide if a code transformer must be used
> on a specific source file.

The user should decide, otherwise there is too much magic involved: a 
marker (source file) or an option (cmdline).

I am indifferent whether the marker should be a codec-decl or an import 
hook. But it should be file-local (at least I would prefer that).

All of the options below seem to involve too much magic for my taste (or 
I didn't understand them correctly).

> (1) Add check_code() and check_ast() functions to code transformers.
> The code transformer is responsible to decide if it wants to transform
> the code or not. Python doesn't use the code transformer if the check
> method returns False.
> Examples:
> * MacroPy can search for the "import macropy" statement (or "from
> macropy import ...") in the AST tree
> * fatoptimizer can search for "__fatoptimizer__ = {'enabled': False}"
> in the code: if this variable is found, the optimizer is completely
> skipped
> (2) Petr proposed to extend importlib to pass a code transformer when
> importing a module.
>      importlib.util.import_with_transformer(
>          'mypackage.specialmodule', MyTransformer())
> IMHO this option is too specific: it's restricted to importlib
> (py_compile, compileall and interactive interpreter don't have the
> feature). I also dislike the API.
> (3) Petr also proposed "a special flag in packages":
>      __transformers_for_submodules__ = [MyTransformer()]
> I don't like having to get access to MyTransformer. The PEP 511
> mentions a use case where the transformed code is run *without*
> registering the transformer. But this issue can easily be fixed by
> using the string to identify the transformer in the registry (ex:
> "fat") rather than its class.
> I'm not sure that putting a flag on the package (package/
> is a good idea. I would prefer to enable language extensions on
> individual files to restrict their scope.
> (4) Sjoerd Job Postmus proposed something similar but using a comment
> and not for packages, but any source file:
>      #:Transformers modname.TransformerClassName,
> modname.OtherTransformerClassName
> The problem is that comments are not stored in the AST tree. I would
> prefer to use AST to decide if an AST transformer should be used or
> not.
> Note: I'm not really motivated to extend the AST to start to include
> comments, or even code formatting (spaces, newlines, etc.).
> can be used if you want to
> transform a .py file without touching the format. But I don't think
> that AST must go to this direction. I prefer to keep AST simple.
> (5) Nick proposed (indirectly) to use a different filename (don't use
> ".py") for language extensions.
> This option works with my option (2): the context contains the
> filename which can be used to decide to enable or not the code
> transformer.
> I understand that the code transformer must also install an importlib
> hook to search for other filenames than only .py files. Am I right?
> (6) Nick proposed (indirectly) to use an encoding cookie "which are
> visible as a comment in the module header".
> Again, I dislike this option because comments are not stored in AST.


From random832 at  Wed Jan 27 15:17:55 2016
From: random832 at (Random832)
Date: Wed, 27 Jan 2016 15:17:55 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

> On Jan 27, 2016, at 09:12, Mirmojtaba Gharibi <mojtaba.gharibi at>
> wrote:
> > I think the component-wise operation is the biggest benefit and a more compact and understandable syntax. 
> > For example, 
> > 
> > innerProduct = sum(map(operator.mul, a, b))
> > is much more complex than
> > innerProduct += $a * $b

Frankly, I'd prefer simply innerProduct = sum($a * $b) - I'm not sure
how you can reasonably define all the semantics of all operators in all
combinations in a way that makes your "+=" work.

Furthermore, I think your expressions could also get hairy.

a = [1, 2]
b = [3, 4]
c = 5
a * $b = [[1, 2]*3, [1, 2]*4] = [[1, 2, 1, 2, 1, 2], [1, 2, 1, 2, 1, 2,
1, 2]]
$a * b = [1*[3, 4], 2*[3, 4]] = [[3, 4], [3, 4, 3, 4]]
$a * $b = [1*3, 2*4] = [3, 8]
($a * $b) * c = [3, 8] * 5 = [3, 8, 3, 8, 3, 8, 3, 8, 3, 8] # and let's
ignore the associativity problems for the moment
$($a * $b) * c = $[3, 8] * 5 = [3*5, 8*5] = [15, 40]  # oh, look, we
have to put $ on an arbitrary expression, not just a name

Do you need multiple $ signs to operate on multiple dimensions? If not,
why not?

(Arguably, sequence repeating should be a different operator than
multiplication anyway, but that ship has long sailed)
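
For comparison, each of the expressions above can be spelled in current
Python. This is only a sketch of how I read the proposed `$` semantics,
which the proposal itself does not pin down; the variable names on the
left are invented for illustration.

```python
from operator import mul

a = [1, 2]
b = [3, 4]
c = 5

# "a * $b": the whole list a is repeated once per element of b.
whole_a_each_b = [a * bv for bv in b]

# "$a * b": each element of a repeats the whole list b.
each_a_whole_b = [av * b for av in a]

# "$a * $b": pairwise product.
pairwise = list(map(mul, a, b))

# "($a * $b) * c": plain list repetition of the pairwise result.
repeated = pairwise * c

# "$($a * $b) * c": elementwise scaling of the pairwise result.
scaled = [v * c for v in pairwise]

print(pairwise)  # [3, 8]
print(scaled)    # [15, 40]
```

Written this way, each line says explicitly which operand is being
iterated over, which is exactly the information the `$` placement has
to carry in the proposed syntax.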

> > MATLAB has a built-in easy way of achieving component-wise operation and I think Python would benefit from that without use of libraries such as numpy.

On Wed, Jan 27, 2016, at 13:13, Andrew Barnert via Python-ideas wrote:
> Why? What's wrong with using numpy?
> It seems like the only problem in your initial post was that you thought
> numpy can't do what you want, when in fact it can, and trivially so.
> Adding the same amount of complexity to the base language wouldn't make
> it any more discoverable--it would just mean that _all_ Python users now
> have the potential to be confused, rather than only Python+numpy users,
> which sounds like a step backward.

My impression is that the ultimate idea is to allow/require/recommend a
post-numpy library to use the same syntax for these semantics, so that
the base semantics with the plain operators are not different between
post-numpy and base python, in order to make post-numpy less confusing
than numpy.

I.e. that the semantics when operating on sequences of numbers ought to
be defined solely by the syntax (not confusing, even if it's more
complex than what we have now), rather than by what library the sequence
object comes from (confusing).

From brett at  Wed Jan 27 15:20:20 2016
From: brett at (Brett Cannon)
Date: Wed, 27 Jan 2016 20:20:20 +0000
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
 "language extension" code transformer should be used or not
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, 27 Jan 2016 at 08:49 Andrew Barnert via Python-ideas <
python-ideas at> wrote:

> On Jan 27, 2016, at 07:39, Victor Stinner <victor.stinner at>
> wrote:
> >
> > Hi,
> >
> > Thank you for all feedback on my PEP 511. It looks like the current
> > blocker point is the unclear status of "language extensions": code
> > transformers which deliberately change the Python semantics. I would
> > like to discuss how we should register them. I think that the PEP 511
> > must discuss "language extensions" even if it doesn't have to propose
> > a solution to make their usage easier. It's an obvious usage of code
> > transformers. If possible, I would like to find a compromise to
> > support them, but make it explicit that they change the Python
> > semantics.
> Is this really necessary?
> If someone is testing a language change locally, and just wants to use
> your (original) API for his tests instead of the more complicated
> alternative of building an import hook, it works fine. If he can't deploy
> that way, that's fine.
> If someone builds a transformer that adds a feature in a way that makes it
> a pure superset of Python, he should be fine with running it on all files,
> so your API works fine. And if some files that didn't use any of the new
> features get .pyc files that imply they did, so what?
> If someone builds a transformer that only runs on files with a different
> extension, he already needs an import hook, so he might as well just call
> his transformer from the import hook, same as he does today.

And the import hook is not that difficult. You can reuse everything from
importlib without modification except for needing to override a single
method in some loader to do your transformation (
Otherwise the only complication is instantiating the right classes and
setting the path hook in `sys.path_hooks`.
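
A rough sketch of what that can look like follows. It assumes the method
to override is `source_to_code()` on `SourceFileLoader` (the link above
did not survive, so that is my reading rather than a quote), and the
`.tpy` suffix, the `AddToMult` transformer and `install_hook()` are all
hypothetical names made up for the example.

```python
import ast
import importlib.machinery
import sys

# Hypothetical suffix for files that opt in to the transformer.
EXTENSION_SUFFIX = ".tpy"


class AddToMult(ast.NodeTransformer):
    """Toy language extension: rewrite every `a + b` into `a * b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


class TransformingLoader(importlib.machinery.SourceFileLoader):
    """Source loader that runs the AST transformer before compiling."""

    def source_to_code(self, data, path, *, _optimize=-1):
        tree = ast.parse(data, filename=path)
        tree = AddToMult().visit(tree)
        ast.fix_missing_locations(tree)
        return compile(tree, path, "exec", dont_inherit=True,
                       optimize=_optimize)


def install_hook():
    """Register a path hook that keeps the stock loaders and adds ours."""
    m = importlib.machinery
    loader_details = [
        (m.ExtensionFileLoader, m.EXTENSION_SUFFIXES),
        (m.SourceFileLoader, m.SOURCE_SUFFIXES),
        (m.SourcelessFileLoader, m.BYTECODE_SUFFIXES),
        (TransformingLoader, [EXTENSION_SUFFIX]),
    ]
    sys.path_hooks.insert(0, m.FileFinder.path_hook(*loader_details))
    # Drop cached finders so directories on sys.path pick up the new hook.
    sys.path_importer_cache.clear()
```

Ordinary `.py` files still go through the stock loader, so only files
with the opt-in suffix are transformed. One caveat: the loader caches
bytecode as usual, so later imports reuse the cached, already
transformed code without running the transformer again.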

> So... What case is served by this new, more complicated API that wasn't
> already served by your original, simple one (remembering that import hooks
> are already there as a fallback)?

As Victor pointed out, the discussion could end in "nothing changed, but we
at least discussed it". I think both you and I currently agree that's the
answer to his question. :)


From abarnert at  Wed Jan 27 15:49:12 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 27 Jan 2016 12:49:12 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <ypnydezpdjtvtxsvsohu@vlmj>
Message-ID: <>

On Jan 26, 2016, at 15:59, Greg Ewing <greg.ewing at> wrote:
> I'd like to do something with "let", which is familiar
> from other languages as a binding-creation construct,
> and it doesn't seem a likely choice for a variable
> name.
> Maybe if we had a general statement for introducing
> a new scope, independent of looping:
>  let:
>    ...

A few years ago, I played with using an import hook to add let statements to Python (by AST-translating them to a function definition and call). It's a neat idea, but I couldn't find any actual uses that made my code more readable. Or, rather, I found a small handful, but every time it was actually far _more_ readable to just refactor the let body out into a separate (non-nested) function or method.

I don't know if this would be true more universally than for my code. But I think it's worth trying to come up with non-toy examples of where you'd actually use this.

Put another way: flat is better than nested. When you actually need a closure, you have to go nested--but most of the time, you don't. And if you go flat most of the time, the few cases where you go nested now signal that something is special (you actually need a closure). So, unless there really are common cases where you need a closure over some variables, but early binding/value capture/whatever for others, I think this may harm readability more than it helps.

> The for-loop is a special case, because it assigns a
> variable in a place where we can't capture it in a
> let-block. So we introduce a variant:
>  for let x in things:
>    funcs.append(lambda: process(x))

This reads weird to me. I think it's because I've been spending too much time in Swift, but I also think Swift may have gotten things right here, so that's not totally irrelevant.

In Swift, almost anywhere you want to create a new binding--whether normal declaration statements, the equivalent of C99 "if (ch = getch())", or even pattern matching--you have to use the "let" keyword. But "for" statements are the one place you _don't_ use "let", because they _always_ create a new binding for the loop variable.

As I've mentioned before, both C# and Ruby made breaking changes from the Python behavior to the Swift behavior, because they couldn't find any legitimate code that would be broken by that change. And there have been few if any complaints since. If we really are considering adding something like "for let", we should seriously consider whether anyone would ever have a good reason to use "for" instead of "for let". If not, just change "for" instead.
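
For anyone who hasn't run into it, the current behavior under discussion
can be sketched as below. `process` and `things` are stand-ins invented
for the example; the default-argument trick is the usual workaround
today, not part of any proposal here.

```python
def process(value):
    """Stand-in for whatever work the callback does."""
    return value * 10


things = [1, 2, 3]

# Current behavior: every lambda closes over the same variable x, which
# is looked up when the lambda is called, i.e. after the loop finished.
late = []
for x in things:
    late.append(lambda: process(x))

# Common workaround: a default argument is evaluated when the lambda is
# created, so each lambda keeps its own value.
early = []
for x in things:
    early.append(lambda x=x: process(x))

late_results = [fn() for fn in late]
early_results = [fn() for fn in early]

print(late_results)   # [30, 30, 30]
print(early_results)  # [10, 20, 30]
```

A "for" that created a fresh binding per iteration would make the first
loop behave like the second one.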

> 2) It may be desirable to allow assignments on the
> same line as "let", e.g.
>  with open(filename) as f:
>    let g = f:
>      process(g)
> which seems marginally more readable.

It's also probably a lot more familiar to people who are used to let from functional languages. And I don't _think_ it's a misleading/false-cognate kind of familiarity, although I'm not positive about that.

From joejev at  Wed Jan 27 16:01:30 2016
From: joejev at (Joseph Jevnik)
Date: Wed, 27 Jan 2016 16:01:30 -0500
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
 "language extension" code transformer should be used or not
In-Reply-To: <>
References: <>
Message-ID: <>

My thought about decorators is that they allow obvious scoping of changes
for the reader. Anything that becomes module scope or is implied based on
system state that is set in another module will make debugging and reading
much harder. Both lazy_python and codetransformer use bytecode
manipulation; however, it is a purely opt-in system where the transformed
function is decorated. This keeps the transformations in view while you are
reading the code that is affected by them. I would find debugging a project
much more difficult if I needed to remember that the order my modules were
imported matters a lot because they setup a bunch of state. I am not sure
why people want the module to be the smallest unit that is transformed when
really it is the code object that should be the smallest unit. This means
class bodies and functions. If we treat the module as the most atomic unit
then you wouldn't be able to use something like `asconstants`

This is a really great local optimization when calling a function in a loop,
especially builtins that you know will most likely never change and you
don't want to change if they do. For example:

In [1]: from codetransformer.transformers.constants import asconstants

In [2]: @asconstants(a=1)
   ...: def f():
   ...:     return a

In [3]: a = 5

In [4]: f()
Out[4]: 1

In [5]: @asconstants('pow')  # string means use the built in for this name
   ...: def g(ns):
   ...:     for n in ns:
   ...:         yield pow(n, 2)

In [6]: list(g([1, 2, 3]))
Out[6]: [1, 4, 9]

In [7]: dis(g)
  3           0 SETUP_LOOP              28 (to 31)
              3 LOAD_FAST                0 (ns)
              6 GET_ITER
        >>    7 FOR_ITER                20 (to 30)
             10 STORE_FAST               1 (n)
             13 LOAD_CONST               0 (<built-in function pow>)
             16 LOAD_FAST                1 (n)
             19 LOAD_CONST               1 (2)
             22 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             25 YIELD_VALUE
             26 POP_TOP
             27 JUMP_ABSOLUTE            7
        >>   30 POP_BLOCK
        >>   31 LOAD_CONST               2 (None)
             34 RETURN_VALUE

This is a simple optimization that people emulate all the time with things
like `sum_ = sum` before the loop or `def g(ns, *, _sum=sum)`.  This cannot
be used at module scope because often you only think it is safe or worth it
to lock in the value for a small segment of code.
Hopefully this use case is being considered, as I think this is a very
simple and practical case of a transformation that does not preserve
semantics.
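
For reference, the manual emulation mentioned above looks roughly like
this. The function names are made up, and the module-level `pow` is
rebound on purpose to show that the keyword-only default keeps the
builtin that was current when the function was defined.

```python
def squares_global(ns):
    # pow is looked up in globals, then builtins, on every iteration.
    for n in ns:
        yield pow(n, 2)


def squares_bound(ns, *, _pow=pow):
    # The default was evaluated once, at definition time, so _pow stays
    # the builtin even if the name pow is rebound later.
    for n in ns:
        yield _pow(n, 2)


def pow(base, exp):
    """Deliberately shadow the builtin at module level."""
    return 0


shadowed = list(squares_global([1, 2, 3]))
locked_in = list(squares_bound([1, 2, 3]))

del pow  # remove the shadow; the name resolves to the builtin again

print(shadowed)   # [0, 0, 0]
print(locked_in)  # [1, 4, 9]
```

This is the sense in which the optimization does not preserve semantics:
the locked-in function no longer notices a later rebinding of the name.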

On Wed, Jan 27, 2016 at 3:20 PM, Brett Cannon <brett at> wrote:

> On Wed, 27 Jan 2016 at 08:49 Andrew Barnert via Python-ideas <
> python-ideas at> wrote:
>> On Jan 27, 2016, at 07:39, Victor Stinner <victor.stinner at>
>> wrote:
>> >
>> > Hi,
>> >
>> > Thank you for all feedback on my PEP 511. It looks like the current
>> > blocker point is the unclear status of "language extensions": code
>> > transformers which deliberately change the Python semantics. I would
>> > like to discuss how we should register them. I think that the PEP 511
>> > must discuss "language extensions" even if it doesn't have to propose
>> > a solution to make their usage easier. It's an obvious usage of code
>> > transformers. If possible, I would like to find a compromise to
>> > support them, but make it explicit that they change the Python
>> > semantics.
>> Is this really necessary?
>> If someone is testing a language change locally, and just wants to use
>> your (original) API for his tests instead of the more complicated
>> alternative of building an import hook, it works fine. If he can't deploy
>> that way, that's fine.
>> If someone builds a transformer that adds a feature in a way that makes
>> it a pure superset of Python, he should be fine with running it on all
>> files, so your API works fine. And if some files that didn't use any of the
>> new features get .pyc files that imply they did, so what?
>> If someone builds a transformer that only runs on files with a different
>> extension, he already needs an import hook, so he might as well just call
>> his transformer from the import hook, same as he does today.
> And the import hook is not that difficult. You can reuse everything from
> importlib without modification except for needing to override a single
> method in some loader to do your transformation (
> Otherwise the only complication is instantiating the right classes and
> setting the path hook in `sys.path_hooks`.
>> So... What case is served by this new, more complicated API that wasn't
>> already served by your original, simple one (remembering that import hooks
>> are already there as a fallback)?
> As Victor pointed out, the discussion could end in "nothing changed, but
> we at least discussed it". I think both you and I currently agree that's
> the answer to his question. :)
> -Brett

From abarnert at  Wed Jan 27 16:15:18 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 27 Jan 2016 13:15:18 -0800
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 27, 2016, at 09:55, Mirmojtaba Gharibi <mojtaba.gharibi at> wrote:
>> On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert <abarnert at> wrote:
>>> On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi <mojtaba.gharibi at> wrote:
>>> Yes, I'm aware of sequence unpacking.
>>> There is an overlap like you mentioned, but there are things that can't be done with sequence unpacking, but can be done here.
>>> For example, let's say you're given two lists that are not necessarily numbers, so you can't use numpy, but you want to apply some component-wise operator between each component. This is something you can't do with sequence unpacking or with numpy.
>> Yes, you can do it with numpy.
>> Obviously you don't get the performance benefits when you aren't using "native" types (like int32) and operations that have vectorized implementations (like adding two arrays of int32 or taking the dot product of float64 matrices), but you do still get the same elementwise operators, and even a way to apply arbitrary callables over arrays, or even other collections:
>>     >>> firsts = ['John', 'Jane']
>>     >>> lasts = ['Smith', 'Doe']
>>     >>> np.vectorize('{1}, {0}'.format)(firsts, lasts)
>>     array(['Smith, John', 'Doe, Jane'], dtype='<U11')
> I think the form I am suggesting is simpler and more readable.

But the form you're suggesting doesn't work for vectorizing arbitrary functions, only for operator expressions (including simple function calls, but that doesn't help for more general function calls). The fact that numpy is a little harder to read for cases that your syntax can't handle at all is hardly a strike against numpy.

And, as I already explained, for the cases where your form _does_ work, numpy already does it, without all the sigils:

    c = a + b

    c = a*a + 2*a*b + b*b

    c = (a * b).sum()

It also works nicely over multiple dimensions. For example, if a and b are both arrays of N 3-vectors instead of just being 3-vectors, you can still elementwise-add them just with +; you can sum all of the results with sum(axis=1); etc. How would you write any of those things with your $-syntax?

> I'm happy you brought vectorize to my attention though. I think as soon as you make the statement just a bit complex, it would become really complicated with vectorize.
> For example, let's say you have
> x=[1,2,3,4,5,...]
> y=['A','BB','CCC',...]
> p=[2,3,4,6,6,...]
> r=[]*n
> $r = str(len($y*$p)+$x)

As a side note, []*n is always just []. Maybe you meant [None for _ in range(n)] or [None]*n? Also, where does n come from? It doesn't seem to have anything to do with the lengths of x, y, and p. So, what happens if it's shorter than them? Or longer? With numpy, of course, that isn't a problem--there's no magic being attempted on the = operator (which is good, because = isn't an operator in Python, and I'm not sure how you'd even properly define your design, much less implement it); the operators just create arrays of the right length.
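
To make the []*n point concrete, here is a small sketch using only built-in lists. The value of n is arbitrary and chosen just for illustration:

```python
n = 3  # arbitrary length, for illustration only

empty = [] * n             # repeating an empty list is still an empty list
placeholders = [None] * n  # n slots, each holding None

# Caveat: repeating a list that contains a mutable object repeats the
# *reference*, so every slot refers to the same inner list.
shared = [[]] * n
shared[0].append("x")

print(empty, placeholders, shared)
```

The last case is why [None for _ in range(n)] style comprehensions are usually preferred once the placeholder is itself mutable.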

Anyway, that's still mostly just operators. You _could_ wrap up an operator expression in a function to vectorize, but you almost never want to. Just use the operators directly on the arrays. 

So, let's try a case that has even some minimal amount of logic, where translating to operators would be clumsy at best:

    def sillyslice(y, x, p):
        if x < p: return y[x:p]
        return y[p:x]

    r = np.vectorize(sillyslice)(y, x, p)

Being a separate function provides all the usual benefits: sillyslice is reusable, debuggable, unit-testable, usable as a first-class object, etc. But forget that; how would you do this at all with your $-syntax?

Since you didn't answer any of my other questions, I'll snip them and repost shorter versions:

* what's wrong with using numpy?
* what's wrong with APL or J or MATLAB?
* what's wrong with making the operators elementwise instead of wrapping the objects in some magic thing?
* what is the type of that magic thing anyway?


From abarnert at  Wed Jan 27 16:34:46 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 27 Jan 2016 13:34:46 -0800
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
 "language extension" code transformer should be used or not
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 27, 2016, at 12:20, Brett Cannon <brett at> wrote:
>> On Wed, 27 Jan 2016 at 08:49 Andrew Barnert via Python-ideas <python-ideas at> wrote:
>> If someone builds a transformer that only runs on files with a different extension, he already needs an import hook, so he might as well just call his transformer from the import hook, same as he does today.
> And the import hook is not that difficult.

Unless it has to work in 2.7 and 3.3 (or, worse, 2.6 and 3.2). :)

> You can reuse everything from importlib without modification except for needing to override a single method in some loader to do your transformation

Yes, as of 3.4, the design is amazing. In fact, hooking any level--lookup, source, AST, bytecode, or pyc--is about as easy as it could be.

My only complaint is that it's not easy enough to find out how easy import hooks are. When I tell people "you could write a simple import hook to play with that idea", they get a look of fear and panic that's completely unwarranted and just drop their cool idea. (I wonder if having complete examples of a simple global-transformer hook and a simple special-extension hook at the start of the docs would be enough to solve that problem?)
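
For illustration, a special-extension hook along the lines described above might look roughly like the sketch below. Everything specific in it is an assumption made up for the example: the ".pyt" extension, the toy DoublingTransformer, the install_hook() helper, and the demo module name. It also assumes Python 3.8 or later (for ast.Constant). The only real machinery is importlib's SourceFileLoader.source_to_code() and FileFinder.path_hook().

```python
import ast
import importlib
import importlib.machinery
import os
import sys
import tempfile


class DoublingTransformer(ast.NodeTransformer):
    """Toy transformation (hypothetical): double every integer literal."""

    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(value=node.value * 2), node)
        return node


class TransformingLoader(importlib.machinery.SourceFileLoader):
    """Source loader whose only override is the source-to-code step."""

    def source_to_code(self, data, path, *, _optimize=-1):
        tree = ast.parse(data, filename=path)
        tree = ast.fix_missing_locations(DoublingTransformer().visit(tree))
        return compile(tree, path, "exec", dont_inherit=True,
                       optimize=_optimize)


def install_hook():
    """Register a FileFinder that knows about the made-up '.pyt' suffix."""
    machinery = importlib.machinery
    loader_details = [
        (TransformingLoader, [".pyt"]),
        # Keep the standard loaders so ordinary imports keep working.
        (machinery.ExtensionFileLoader, machinery.EXTENSION_SUFFIXES),
        (machinery.SourceFileLoader, machinery.SOURCE_SUFFIXES),
        (machinery.SourcelessFileLoader, machinery.BYTECODE_SUFFIXES),
    ]
    sys.path_hooks.insert(0, machinery.FileFinder.path_hook(*loader_details))
    sys.path_importer_cache.clear()


# Demo: write a module with the made-up extension and import it.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "pep511_hook_demo.pyt"), "w") as f:
    f.write("VALUE = 21\n")

install_hook()
sys.path.insert(0, demo_dir)
importlib.invalidate_caches()
demo = importlib.import_module("pep511_hook_demo")
print(demo.VALUE)
```

A global-transformer variant would be the same sketch with TransformingLoader registered for the normal source suffixes instead of a private one.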

And I'm a bit worried that if Victor tries to make things like MacroPy and Hy easier, it still won't be enough for real-life cases, so all it'll do is discourage people from going right to writing import hooks and seeing how easy that already is.

From chris.barker at  Wed Jan 27 16:53:15 2016
From: chris.barker at (Chris Barker)
Date: Wed, 27 Jan 2016 13:53:15 -0800
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 9:12 AM, Mirmojtaba Gharibi <
mojtaba.gharibi at> wrote:

> MATLAB has a built-in easy way of achieving component-wise operation and I
> think Python would benefit from that without use of libraries such as numpy.

I've always thought there should be component-wise operations in Python.
The way to do it now is something like:

[i + j for i,j in zip(a,b)]

which is really pretty darn wordy, compared to:

a_numpy_array + another_numpy_array

(similar in matlab).

But maybe an operator is the way to do it. But it was long ago decided not
to introduce a full set of extra operators, a la matlab:

.+
.*
etc....

rather, it was realized that for numpy, which does element-wise operations
by default, matrix multiplication was really the only non-elementwise
operation widely used, so the new @ operator was added.

And we're kind of stuck --even if we added a full set, then in numpy, the
regular operators would be element wise, but for built-in Python sequences,
the special ones would be elementwise -- really confusing!

if you really want this, I'd make your own sequences that re-define the
operators.

Or just use Numpy... you can use object arrays if you want to handle
non-numeric values:

In [4]: a1 = np.array(["this", "that"], dtype=object)

In [5]: a2 = np.array(["some", "more"], dtype=object)

In [6]: a1 + a2

Out[6]: array(['thissome', 'thatmore'], dtype=object)


Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at

From rosuav at  Wed Jan 27 17:15:58 2016
From: rosuav at (Chris Angelico)
Date: Thu, 28 Jan 2016 09:15:58 +1100
Subject: [Python-ideas] several different needs [Explicit variable
 capture list]
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 28, 2016 at 4:27 AM, Jim J. Jewett <jimjjewett at> wrote:
>> Here I disagree completely. Why do we have tuple,
>> or frozenset? Why do dicts only take immutable keys?
>> Why does the language make it easier to build
>> mapped/filtered copies in place? Why can immutable
>> objects be shared between threads or processes trivially,
>> while mutable objects need locks for threads and heavy
>> "manager" objects for processes? Mutability is a very big deal.
> Those are all "if you're living with these restrictions anyhow,
> and you tell the compiler, the program can run faster."
> None of those sound important in terms of "What does this program
> (eventually) do?"

The nature of hash tables and equality is such that if an object's
value (defined by __eq__) changes between when it's used as a key and
when it's looked up, bad stuff happens. It's not just an optimization
- it's a way for the dict subsystem to protect us against craziness.
Yes, you can bypass that protection:

class HashableList(list):
    def __hash__(self): return hash(tuple(self))

but it's a great safety net. You won't unexpectedly get KeyError when
you iterate over a dictionary - you'll instead get TypeError when you
try to assign. Is that a semantic question or a performance one?


From random832 at  Wed Jan 27 17:42:15 2016
From: random832 at (Random832)
Date: Wed, 27 Jan 2016 17:42:15 -0500
Subject: [Python-ideas] several different needs [Explicit variable
 capture list]
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016, at 17:15, Chris Angelico wrote:
> The nature of hash tables and equality is such that if an object's
> value (defined by __eq__) changes between when it's used as a key and
> when it's looked up, bad stuff happens. It's not just an optimization
> - it's a way for the dict subsystem to protect us against craziness.

This stands alone against all the things that it *could* protect users
against but doesn't due to the "consenting adults" principle.

Java allows ArrayLists to be HashMap keys and the sky hasn't fallen,
despite that language otherwise having far more of a culture of
protecting users from themselves and each other (i.e. it has stuff like
private, final, etc) than Python does.

We won't even protect from redefining math.pi, yet you want to prevent a
user from using as a key in a dictionary a value which _might_ be
altered while the dictionary is in use? This prevents all kinds of
algorithms from being used which would benefit from using a short-lived
dict/set to keep track of things. I think this came up a month or so ago
when we were talking about comparison of dict values views (which could
benefit from being able to use all the values in the dict as keys in a
Counter). They're not going to change while the algorithm is executing
unless the user does some weird multithreaded stuff or something truly
bizarre in a callback (and if they do? consenting adults.), and the dict
is thrown away at the end.

> Yes, you can bypass that protection:
> class HashableList(list):
>     def __hash__(self): return hash(tuple(self))

That doesn't really work for my scenario described above, which requires
an alternate universe in which Python (like Java) requires *all*
objects, mutable or otherwise, to define __hash__ in a way consistent
with __eq__.

> but it's a great safety net. You won't unexpectedly get KeyError when
> you iterate over a dictionary - you'll instead get TypeError when you
> try to assign. Is that a semantic question or a performance one?

But I won't get either error if I don't mutate the list, or I only do it
in equality-conserving ways (e.g. converting between numeric types).

From abarnert at  Wed Jan 27 18:19:39 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 27 Jan 2016 15:19:39 -0800
Subject: [Python-ideas] several different needs [Explicit variable
 capture list]
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 27, 2016, at 14:42, Random832 <random832 at> wrote:
>> On Wed, Jan 27, 2016, at 17:15, Chris Angelico wrote:
>> The nature of hash tables and equality is such that if an object's
>> value (defined by __eq__) changes between when it's used as a key and
>> when it's looked up, bad stuff happens. It's not just an optimization
>> - it's a way for the dict subsystem to protect us against craziness.
> This stands alone against all the things that it *could* protect users
> against but doesn't due to the "consenting adults" principle.
> Java allows ArrayLists to be HashMap keys and the sky hasn't fallen,
> despite that language otherwise having far more of a culture of
> protecting users from themselves and each other (i.e. it has stuff like
> private, final, etc) than Python does.
> We won't even protect from redefining math.pi, yet you want to prevent a
> user from using as a key in a dictionary a value which _might_ be
> altered while the dictionary is in use? This prevents all kinds of
> algorithms from being used which would benefit from using a short-lived
> dict/set to keep track of things.

It's amazing how many people go for years using Python without noticing this restriction, then, as soon as it's pointed out to them, exclaim "That's horrible! It's way too restrictive! I can think of all kinds of useful code that this prevents!" And then you go back and try to think of code you were prevented from writing over the past five years before you learned this rule, and realize that there's little if any. And, over the next five years, you run into the rule very rarely (and more often, it's because you forgot to define an appropriate __hash__ for an immutable type than because you needed to put a mutable type in a dict or set). 

Similarly, everyone learns the tuple/frozenset trick, decries the fact that there's no way to do a "deep" equivalent, but eventually ends up using the trick once every couple years and never running into the shallowness problem.
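
As a rough sketch of the tuple/frozenset trick and its shallowness: the data below is invented, and deep_freeze() is a hypothetical helper written for this example, not something in the stdlib.

```python
from collections import Counter

inventory = {"a": [1, 2], "b": [1, 2], "c": [3]}

# The trick: freeze each mutable value into a hashable equivalent.
counts = Counter(tuple(v) for v in inventory.values())

# It is shallow, though: a tuple that contains lists is still unhashable.
nested = tuple([[1], [2]])
try:
    hash(nested)
    nested_is_hashable = True
except TypeError:
    nested_is_hashable = False


def deep_freeze(obj):
    """Hypothetical recursive version of the trick for nested containers."""
    if isinstance(obj, (list, tuple)):
        return tuple(deep_freeze(x) for x in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(deep_freeze(x) for x in obj)
    if isinstance(obj, dict):
        return frozenset((k, deep_freeze(v)) for k, v in obj.items())
    return obj


print(counts, nested_is_hashable, deep_freeze([[1], {2}]))
```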

From a pure design point of view, this looks like a case of hidebound purity over practice, exactly what Python is against. But from a practical use point of view, it actually works really well. I don't know if you could prove this fact a priori, or even argue very strongly for it, but it still seems to be true. People who use Python don't notice the limitation, people who rant against Python don't include it in their "why Python sucks" lists; only people who just discovered it care.

From steve at  Wed Jan 27 19:12:51 2016
From: steve at (Steven D'Aprano)
Date: Thu, 28 Jan 2016 11:12:51 +1100
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
> Hello,
> I'm thinking of this idea that we have a pseudo-operator called
> "Respectively" and shown maybe with ;
> Some examples first:
> a;b;c = x1;y1;z1 + x2;y2;z2

Regardless of the merits of this proposal, the suggested syntax cannot 
be used because that's already valid Python syntax equivalent to:

a
b
c = x1
y1
z1 + x2
y2
z2

So forget about using the ; as that would be ambiguous.
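
One way to check this is to ask the parser; the sketch below only parses the line and never executes it, so the names need not exist:

```python
import ast

source = "a;b;c = x1;y1;z1 + x2;y2;z2"

# Each ';' separates a simple statement, so this parses as seven statements.
statements = ast.parse(source).body
kinds = [type(node).__name__ for node in statements]

print(len(statements), kinds)
```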

> Then there is another unpacking operator which maybe we can show with $
> sign and it operates on lists and tuples and creates the "Respectively"
> version of them.
> So for instance,
> vec=[]*10
> $vec = $u + $v
> will add two 10-dimensional vectors to each other and put the result in vec.

[]*10 won't work, as that's just []. And it seems very unpythonic to 
need to pre-allocate a list just to do vectorized addition.

I think you would be better off trying to get better support for 
vectorized operations into Python:

vec = add(u, v)

is nearly as nice looking as u + v, and it need not even be a built-in. 
It could be a library.

In an earlier version of the statistics module, I experimented with 
vectorized functions for some of the operations. I wanted a way for 
the statistics functions to *automatically* generate either scalar or 
vector results without any extra programming effort.

E.g. writing mean([1, 2, 3]) would return the scalar 2, of course, while:

mean([(1, 10, 100), 
      (2, 20, 200),
      (3, 30, 300)])

would operate column-wise and return (2, 20, 200). To do that, I needed 
vectorized versions of sum, division, sqrt etc. I didn't mind if they 
were written as function calls instead of operators:

divide(6, 3)  # returns 2
divide((6, 60, 600), 3)  # returns (2, 20, 200)

which I got with a function:

divide = vectorize(operator.truediv)

where vectorize() took a scalar operator and returned a function that 
looped over two vectors and applied the operator to each argument in an 
elementwise fashion. I eventually abandoned this approach because the 
complexity and performance hit of my initial implementation was far too 
great, but maybe that was just my poor implementation.
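
For readers who want something concrete, a vectorize() of this kind might be sketched as follows. This is an assumption about the general shape, not the code that was actually in the statistics module; it handles only two arguments and treats lists and tuples as vectors.

```python
import operator


def vectorize(func):
    """Return an elementwise version of a two-argument scalar function."""

    def vectorized(a, b):
        a_is_vec = isinstance(a, (list, tuple))
        b_is_vec = isinstance(b, (list, tuple))
        if a_is_vec and b_is_vec:
            if len(a) != len(b):
                raise ValueError("vectors must have the same length")
            return tuple(func(x, y) for x, y in zip(a, b))
        if a_is_vec:  # vector op scalar
            return tuple(func(x, b) for x in a)
        if b_is_vec:  # scalar op vector
            return tuple(func(a, y) for y in b)
        return func(a, b)  # plain scalar call

    return vectorized


divide = vectorize(operator.truediv)
add = vectorize(operator.add)

print(divide(6, 3))
print(divide((6, 60, 600), 3))
print(add([1, 2, 3], [10, 20, 30]))
```

Note that with operator.truediv the results are floats (2.0 rather than 2).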

I think that vectorized functions would be very useful in Python. 
Performance need not be super-fast -- numpy, numba, and the other 
heavy-duty third-party tools would continue to dominate the 
high-performance scientific computing niche, but they should be at least 
*no worse* than the equivalent code using a loop.

If you had a vectorized add() function, your example:

a;b;c = x1;y1;z1 + x2;y2;z2

would become:

a, b, c = add([x1, y1, z1], [x2, y2, z2])

Okay, it's not as *nice looking* as the + operator, but it will do. Or 
you could subclass list to do this instead of concatenation.
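
The subclassing option could be sketched like this; ElementwiseList is a made-up name, and only + is overridden here:

```python
class ElementwiseList(list):
    """A list whose + adds element by element instead of concatenating."""

    def __add__(self, other):
        if len(self) != len(other):
            raise ValueError("operands must have the same length")
        return ElementwiseList(x + y for x, y in zip(self, other))


x1, y1, z1 = 1, 2, 3
x2, y2, z2 = 10, 20, 30

a, b, c = ElementwiseList([x1, y1, z1]) + ElementwiseList([x2, y2, z2])
print(a, b, c)
```

The obvious cost is that code expecting ordinary list concatenation from + will be surprised by such an object.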

I would support the addition of a vectorize() function which took an 
arbitrary scalar function, and returned a vectorized version:

func = vectorize(lambda x, y: 2*x + y**3 - x*y/3)
a, b, c = func(vector_x, vector_y)

being similar to:

f = lambda x, y: 2*x + y**3 - x*y/3
a, b, c = [f(x, y) for x, y in zip(vector_x, vector_y)]

> For example, we can calculate the inner product between two vectors like
> follows (inner product is the sum of component wise multiplication of two
> vectors):
> innerProduct =0
> innerProduct += $a * $b
> which is equivalent to
> innerProduct=0
> for i in range(len(a)):
> ...innerProduct += a[i]*b[i]

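from functools import reduce
import operator

def mult(*vectors):
    # Yield the product of the corresponding components of each vector.
    for t in zip(*vectors):
        yield reduce(operator.mul, t)

innerProduct = sum(mult(a, b))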


From steve at  Wed Jan 27 19:31:16 2016
From: steve at (Steven D'Aprano)
Date: Thu, 28 Jan 2016 11:31:16 +1100
Subject: [Python-ideas] several different needs [Explicit variable
 capture list]
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 08:14:15AM -0800, Andrew Barnert wrote:

> I think you're actually agreeing with me: there _aren't_ four 
> different cases people actually want here, just the one we've all been 
> talking about, and FAT is irrelevant to that case, so this sub thread 
> is ultimately just a distraction.

I think we do agree. Thanks for the extra detail, I have nothing more to 
say at this point :-)


From steve at  Wed Jan 27 20:06:03 2016
From: steve at (Steven D'Aprano)
Date: Thu, 28 Jan 2016 12:06:03 +1100
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 12:49:12PM -0800, Andrew Barnert via Python-ideas wrote:

> > The for-loop is a special case, because it assigns a
> > variable in a place where we can't capture it in a
> > let-block. So we introduce a variant:
> > 
> >  for let x in things:
> >    funcs.append(lambda: process(x))
> This reads weird to me. I think it's because I've been spending too 
> much time in Swift, but I also think Swift may have gotten things 
> right here, so that's not totally irrelevant.

It reads weird to me too, because "for let x in ..." is just weird. It's 
uncanny valley for English grammar: at first glance it looks like valid 
grammar, but it's not.

> As I've mentioned before, both C# and Ruby made breaking changes from 
> the Python behavior to the Swift behavior, because they couldn't find 
> any legitimate code that would be broken by that change.

I'm not sure if you intended this or not, but that sounds like "they 
found plenty of code that would break, but decided it wasn't legitimate 
so they didn't care".


From ethan at  Wed Jan 27 20:19:16 2016
From: ethan at (Ethan Furman)
Date: Wed, 27 Jan 2016 17:19:16 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/27/2016 05:06 PM, Steven D'Aprano wrote:
> On Wed, Jan 27, 2016 at 12:49:12PM -0800, Andrew Barnert wrote:

>> This reads weird to me. I think it's because I've been spending too
>> much time in Swift, but I also think Swift may have gotten things
>> right here, so that's not totally irrelevant.
> It reads weird to me too, because "for let x in ..." is just weird. It's
> uncanny valley for English grammar: at first glance it looks like valid
> grammar, but it's not.

>> As I've mentioned before, both C# and Ruby made breaking changes from
>> the Python behavior to the Swift behavior, because they couldn't find
>> any legitimate code that would be broken by that change.
> I'm not sure if you intended this or not, but that sounds like "they
> found plenty of code that would break, but decided it wasn't legitimate
> so they didn't care".

Or, "they found code that would break, because it was already broken but 
nobody had noticed yet."


From python at  Wed Jan 27 20:30:37 2016
From: python at (MRAB)
Date: Thu, 28 Jan 2016 01:30:37 +0000
Subject: [Python-ideas]  Respectively and its unpacking sentence
In-Reply-To: <>
Message-ID: <em194efdb0-5ccd-42c4-8ac6-6d29af35d028@andromeda>

On 2016-01-28 00:12:51, "Steven D'Aprano" <steve at> wrote:

>On Wed, Jan 27, 2016 at 12:25:05AM -0500, Mirmojtaba Gharibi wrote:
>>  Hello,
>>  I'm thinking of this idea that we have a pseudo-operator called
>>  "Respectively" and shown maybe with ;
>>  Some examples first:
>>  a;b;c = x1;y1;z1 + x2;y2;z2
>Regardless of the merits of this proposal, the suggested syntax cannot
>be used because that's already valid Python syntax equivalent to:
>a
>b
>c = x1
>y1
>z1 + x2
>y2
>z2
>So forget about using the ; as that would be ambiguous.
>>  Then there is another unpacking operator which maybe we can show with $
>>  sign and it operates on lists and tuples and creates the "Respectively"
>>  version of them.
>>  So for instance,
>>  vec=[]*10
>>  $vec = $u + $v
>>  will add two 10-dimensional vectors to each other and put the result
>>  in vec.
>[]*10 won't work, as that's just []. And it seems very unpythonic to
>need to pre-allocate a list just to do vectorized addition.
>I think you would be better off trying to get better support for
>vectorized operations into Python:
>vec = add(u, v)
>is nearly as nice looking as u + v, and it need not even be a built-in.
>It could be a library.

An alternative would be to add an element-wise class:

class Vector:
     def __init__(self, *args):
         self.args = args

     def __str__(self):
         return '<%s>' % ', '.join(repr(arg) for arg in self.args)

     def __add__(self, other):
         if isinstance(other, Vector):
             return Vector(*[left + right
                             for left, right in zip(self.args, other.args)])

         return Vector(*[left + other for left in self.args])

     def __iter__(self):
         return iter(self.args)

Then you could write:

     a, b, c = Vector(x1, y1, z1) + Vector(x2, y2, z2)

I wonder whether there's a suitable pair of delimiters that could be 
used to create a 'literal' for it.

From abarnert at  Wed Jan 27 20:51:46 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 27 Jan 2016 17:51:46 -0800
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 27, 2016, at 16:12, Steven D'Aprano <steve at> wrote:
> I think you would be better off trying to get better support for 
> vectorized operations into Python:

I really think, at least 90% of the time, and probably a lot more, people are better off just using numpy than reinventing it. Obviously, building a "statistics-without-numpy" module to be added to the stdlib is an exception. But otherwise, the fact that numpy already exists, and has had a couple decades of heavy use and expert attention and two predecessor libraries to work out the kinks in the design, means that it's likely to be better, even for your limited purposes, than any limited-purpose thing you come up with. 

There are a lot more edge cases than you think. For example, you thought far enough ahead that your sum works column-wise on 2D arrays. But what about when you need to work row-wise? What's the best interface: an axis parameter, or a transpose function (hey, you can even just use zip)? How do you then extend whichever choice you made to 3D? Or to when you want to get the total sum across both axes? For another example: should I be able to use vectorize to write a function of two arrays, and then apply it to a single N+1-D array, or is that going to cause more confusion than help? And so on. I wouldn't trust my own a priori intuition on those questions, so I'd go look at APL, J, MATLAB, R, and maybe Mathematica and see how their idioms best translate to Python in a variety of different kinds of problems. And I'd probably get some of it wrong, as numpy's ancestors did, and then have to agonize over compatibility-breaking changes.
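
To illustrate the axis question with nothing but the stdlib, here is a sketch of the "transpose with zip" option on a 2D case (the data is the same shape as the mean() example earlier in the thread; the variable names are made up):

```python
rows = [(1, 10, 100),
        (2, 20, 200),
        (3, 30, 300)]

# Column-wise: transpose with zip, then reduce each column.
column_sums = tuple(sum(col) for col in zip(*rows))

# Row-wise: reduce each row directly.
row_sums = tuple(sum(row) for row in rows)

# Across both axes.
total = sum(row_sums)

column_means = tuple(s / len(rows) for s in column_sums)

print(column_sums, row_sums, total, column_means)
```

This works for two dimensions, but it does not obviously generalize to 3D, which is the kind of design question being raised.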

And after all that, what would be the benefit? I no longer have to install numpy--but now I have to install pyvec instead. Which is just a less-featureful, less-tested, less-optimized, and less-refined numpy.

If there's something actually _wrong_ with numpy's design for your purposes (and you can't trivially wrap it away), that's different. Maybe you could do a whole lot lazily by sticking to the iterator protocol? (There's a nifty Haskell package for vectorizing lazily that might be worth looking at, as long as you can stand reading about everything in terms of monadic lifting where you'd say vectorize, etc.) But "I want the same as numpy but less good" doesn't seem like a good place to start, because at best, that's what you'll end up with.

From abarnert at  Wed Jan 27 21:08:13 2016
From: abarnert at (Andrew Barnert)
Date: Wed, 27 Jan 2016 18:08:13 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 27, 2016, at 17:06, Steven D'Aprano <steve at> wrote:
> On Wed, Jan 27, 2016 at 12:49:12PM -0800, Andrew Barnert via Python-ideas wrote:
>>> The for-loop is a special case, because it assigns a
>>> variable in a place where we can't capture it in a
>>> let-block. So we introduce a variant:
>>> for let x in things:
>>>   funcs.append(lambda: process(x))
>> This reads weird to me. I think it's because I've been spending too 
>> much time in Swift, but I also think Swift may have gotten things 
>> right here, so that's not totally irrelevant.
> It reads weird to me too, because "for let x in ..." is just weird. It's 
> uncanny valley for English grammar: at first glance it looks like valid 
> grammar, but it's not.

Ah, good point.

> [...]
>> As I've mentioned before, both C# and Ruby made breaking changes from 
>> the Python behavior to the Swift behavior, because they couldn't find 
>> any legitimate code that would be broken by that change.
> I'm not sure if you intended this or not, but that sounds like "they 
> found plenty of code that would break, but decided it wasn't legitimate 
> so they didn't care".


What I meant is they found a small number of examples of code that would be affected, but all of them were clearly bugs, and therefore not legitimate. Obviously that can be a judgment call, but usually it's a pretty easy one. Like the function that creates N callbacks that all use the last name, instead of creating one callback for each name, preceded by this comment:

   # Don't call this function! Ruby sucks but when I complain they tell me I'm too dumb to fix it so just don't use it!!!!

Whether the 1.9 change fixed that function or re-broke it differently scarcely matters; clearly no one was depending on the old behavior.

Maybe Python is different, and we would find code that really _does_ need 10 separate functions that all compute x**9 or that all disable the last button or... well, probably something more useful than that, which I can't guess in advance. I certainly wouldn't suggest just changing Python based on the results of a search of Ruby code! But I would definitely suggest doing a similar search of Python code before giving people two similar but different statements to hang themselves with.
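
For anyone who has not run into it, the current Python behavior under discussion, and the usual default-argument workaround, look like this:

```python
# Each lambda closes over the variable i, not over its value at creation.
funcs = []
for i in range(10):
    funcs.append(lambda x: x ** i)

late_results = [f(2) for f in funcs]  # every function sees the final i

# Common workaround today: bind the current value as a default argument.
bound = []
for i in range(10):
    bound.append(lambda x, i=i: x ** i)

bound_results = [f(2) for f in bound]

print(late_results)
print(bound_results)
```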

From kevinjacobconway at  Wed Jan 27 23:56:50 2016
From: kevinjacobconway at (Kevin Conway)
Date: Thu, 28 Jan 2016 04:56:50 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

I'm willing to take this conversation offline as it seems this thread has
cooled down quite a bit.

I would still like to hear more, though, about how adding this as a
facility in the language improves over the current, external
implementations of Python code optimizers. Python already has tools for
reading in source files, parsing them into AST, modifying that AST, and
writing the final bytecode to files as part of the standard library. I
don't see anything in PEP 511 that improves upon that.
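
As a reminder of what those existing tools look like, here is a minimal sketch of the parse, modify, compile pipeline using only the ast module and compile(). The SquareToMul rewrite is a toy invented for this example (it is not taken from PyCC or astoptimizer), and it assumes Python 3.8 or later for ast.Constant.

```python
import ast


class SquareToMul(ast.NodeTransformer):
    """Toy optimization: rewrite `name ** 2` as `name * name`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if (isinstance(node.op, ast.Pow)
                and isinstance(node.left, ast.Name)
                and isinstance(node.right, ast.Constant)
                and type(node.right.value) is int
                and node.right.value == 2):
            product = ast.BinOp(
                left=node.left,
                op=ast.Mult(),
                right=ast.Name(id=node.left.id, ctx=ast.Load()),
            )
            return ast.copy_location(product, node)
        return node


source = "def area(side):\n    return side ** 2\n"

tree = ast.parse(source)                                   # source -> AST
tree = ast.fix_missing_locations(SquareToMul().visit(tree))  # modify the AST
code = compile(tree, "<optimized>", "exec")                # AST -> bytecode

namespace = {}
exec(code, namespace)
print(namespace["area"](7))
```

Writing the resulting code object out as a .pyc is a further step that this sketch leaves out.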

Out of curiosity, do you consider this PEP as adding something to Python
that didn't previously exist or do you consider this PEP to be more aligned
with PEP 249 (DB-API 2.0) and PEP 484 (Type Hints) which are primarily designed
to marshal the community in a common direction? I understand that you have
other PEPs in flight that are designed to make certain optimizations easier
(or possible). Looking at this PEP in isolation, however, leaves me wanting
more explanation as to its value.

You mention the need for monkey-patching or hooking into the import process
as a part of the rationale. The PyCC project, while it may not be the best
example for optimizer design, does not need to patch or hook into anything
to function. Instead, it acts as an alternative bytecode compiler that
drops .pyc just like the standard compiler would. Other than the trade-off
of using a 3rd party library versus adding a -o flag, what significant
advantage does a sys.add_optimizer() call provide?

Again, I'm very much behind your motivation and hope you are incredibly
successful in making Python a faster place to live. I'm only trying to get
in your head and see what you see.

On Wed, Jan 27, 2016 at 10:45 AM Victor Stinner <victor.stinner at>
wrote:

> Hi,
> 2016-01-16 17:56 GMT+01:00 Kevin Conway <kevinjacobconway at>:
> > I'm a big fan of your motivation to build an optimizer for cPython code.
> > What I'm struggling with is understanding why this requires a PEP and
> > language modification. There are already several projects that manipulate
> > the AST for performance gains such as [1] or even my own ham fisted
> > attempt [2].
> Oh cool, I didn't know PyCC [2]! I added it to the PEP 511 in the AST
> optimizers section of Prior Art.
> I wrote astoptimizer [1] and this project uses monkey-patching of the
> compile() function, I mentioned this monkey-patching hack in the
> rationale of the PEP:
> I would like to avoid monkey-patching because it causes various issues.
> The PEP 511 also makes transformations more visible: transformers are
> explicitly registered in sys.set_code_transformers() and the .pyc
> filename is modified when the code is transformed.
> It also adds a new feature: it becomes possible to run transformed
> code without having to register the transformer at runtime. This is
> made possible with the addition of the -o command line option.
> Victor

From mojtaba.gharibi at  Thu Jan 28 03:29:17 2016
From: mojtaba.gharibi at (Mirmojtaba Gharibi)
Date: Thu, 28 Jan 2016 03:29:17 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 4:53 PM, Chris Barker <chris.barker at> wrote:

> On Wed, Jan 27, 2016 at 9:12 AM, Mirmojtaba Gharibi <
> mojtaba.gharibi at> wrote:
>> MATLAB has a built-in easy way of achieving component-wise operation and
>> I think Python would benefit from that without use of libraries such as
>> numpy.
> I've always thought there should be component-wise operations in Python.
> The way to do it now is something like:
> [i + j for i,j in zip(a,b)]
> which is really pretty darn wordy, compared to:
> a_numpy_array + another_numpy_array
> (similar in matlab).
> But maybe an operator is the way to do it. But it was long ago decided not
> to introduce a full set of extra operators, à la MATLAB:
> .+
> .*
> etc....
> rather, it was realized that for numpy, which does element-wise operations
> by default, matrix multiplication was really the only non-elementwise
> operation widely used, so the new @ operator was added.
> And we're kind of stuck --even if we added a full set, then in numpy, the
> regular operators would be element wise, but for built-in Python sequences,
> the special ones would be elementwise -- really confusing!
> if you really want this, I'd make your own sequences that re-define the
> operators.
Problem is you always forgo the hassle of subclassing at that exact moment
that you need element-wise and just use for loops. So it's almost always
not worth the hassle.
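
For reference, the "make your own sequences" suggestion quoted above can be sketched in a few lines. This is only an illustration: ElementwiseList is a hypothetical name, and overriding + and * this way deliberately gives up the usual list concatenation and repetition.

```python
class ElementwiseList(list):
    """Hypothetical list subclass whose + and * act element by element."""

    def _pairs(self, other):
        # Refuse mismatched lengths instead of silently truncating like zip().
        if len(self) != len(other):
            raise ValueError("operands must have the same length")
        return zip(self, other)

    def __add__(self, other):
        # Note: this replaces normal list concatenation.
        return ElementwiseList(a + b for a, b in self._pairs(other))

    def __mul__(self, other):
        # Note: this replaces normal list repetition (list * int).
        return ElementwiseList(a * b for a, b in self._pairs(other))


numbers = ElementwiseList([1, 2, 3]) + ElementwiseList([10, 20, 30])
words = ElementwiseList(["this", "that"]) + ElementwiseList(["some", "more"])
print(numbers)  # [11, 22, 33]
print(words)    # ['thissome', 'thatmore']
```

The second result matches the object-array example from numpy quoted below, without importing anything.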

> Or just use Numpy... you can use object arrays if you want to handle
> non-numeric values:
> In [4]: a1 = np.array(["this", "that"], dtype=object)
> In [5]: a2 = np.array(["some", "more"], dtype=object)
> In [6]: a1 + a2
> Out[6]: array(['thissome', 'thatmore'], dtype=object)
> -CHB
> --
> Christopher Barker, Ph.D.
> Oceanographer
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
> Chris.Barker at

From mojtaba.gharibi at  Thu Jan 28 03:29:38 2016
From: mojtaba.gharibi at (Mirmojtaba Gharibi)
Date: Thu, 28 Jan 2016 03:29:38 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 28, 2016 at 3:26 AM, Mirmojtaba Gharibi <
mojtaba.gharibi at> wrote:

> On Wed, Jan 27, 2016 at 4:15 PM, Andrew Barnert <abarnert at>
> wrote:
>> On Jan 27, 2016, at 09:55, Mirmojtaba Gharibi <mojtaba.gharibi at>
>> wrote:
>> On Wed, Jan 27, 2016 at 2:29 AM, Andrew Barnert <abarnert at>
>> wrote:
>>> On Jan 26, 2016, at 22:19, Mirmojtaba Gharibi <mojtaba.gharibi at>
>>> wrote:
>>> Yes, I'm aware of sequence unpacking.
>>> There is an overlap like you mentioned, but there are things that can't
>>> be done with sequence unpacking, but can be done here.
>>> For example, let's say you're given two lists that are not necessarily
>>> numbers, so you can't use numpy, but you want to apply some component-wise
>>> operator between each component. This is something you can't do with
>>> sequence unpacking or with numpy.
>>> Yes, you can do it with numpy.
>>> Obviously you don't get the performance benefits when you aren't using
>>> "native" types (like int32) and operations that have vectorized
>>> implementations (like adding two arrays of int32 or taking the dot product
>>> of float64 matrices), but you do still get the same elementwise operators,
>>> and even a way to apply arbitrary callables over arrays, or even other
>>> collections:
>>>     >>> firsts = ['John', 'Jane']
>>>     >>> lasts = ['Smith', 'Doe']
>>>     >>> np.vectorize('{1}, {0}'.format)(firsts, lasts)
>>>     array(['Smith, John', 'Doe, Jane'], dtype='<U11')
>>> I think the form I am suggesting is simpler and more readable.
>> But the form you're suggesting doesn't work for vectorizing arbitrary
>> functions, only for operator expressions (including simple function calls,
>> but that doesn't help for more general function calls). The fact that numpy
>> is a little harder to read for cases that your syntax can't handle at all
>> is hardly a strike against numpy.
> I don't need to vectorize the functions. It's already being done.
> Consider the ; example below:
> a;b = f(x;y)
> it is equivalent to
> a=f(x)
> b=f(y)
> So in effect, in your terminology, it is already vectorized.
> Similar example only with $:
> a=[0,0,0,0]
> x=[1,2,3,4]
> $a=f($x)
> is equivalent to
> a=[0,0,0,0]
> x=[1,2,3,4]
> for i in range(len(a)):
> ...a[i]=f(x[i])
>> And, as I already explained, for the cases where your form _does_ work,
>> numpy already does it, without all the sigils:
>>     c = a + b
>>     c = a*a + 2*a*b + b*b
>>     c = (a * b).sum()
>> It also works nicely over multiple dimensions. For example, if a and b
>> are both arrays of N 3-vectors instead of just being 3-vectors, you can
>> still elementwise-add them just with +; you can sum all of the results with
>> sum(axis=1); etc. How would you write any of those things with your
>> $-syntax?
>> I'm happy you brought vectorize to my attention though. I think as soon
>> as you make the statement just a bit complex, it would become really
>> complicated with vectorize.
>> For example let's say you have
>> x=[1,2,3,4,5,...]
>> y=['A','BB','CCC',...]
>> p=[2,3,4,6,6,...]
>> r=[]*n
>> $r = str(len($y*$p)+$x)
>> As a side note, []*n is always just []. Maybe you meant [None for _ in
>> range(n)] or [None]*n? Also, where does n come from? It doesn't seem to
>> have anything to do with the lengths of x, y, and p. So, what happens if
>> it's shorter than them? Or longer? With numpy, of course, that isn't a
>> problem--there's no magic being attempted on the = operator (which is good,
>> because = isn't an operator in Python, and I'm not sure how you'd even
>> properly define your design, much less implement it); the operators just
>> create arrays of the right length.
>> n I just meant symbolically to be len(x). So please replace n with
> len(x). I didn't mean to confuse you. sorry.
>> Anyway, that's still mostly just operators. You _could_ wrap up an
>> operator expression in a function to vectorize, but you almost never want
>> to. Just use the operators directly on the arrays.
>> So, let's try a case that has even some minimal amount of logic, where
>> translating to operators would be clumsy at best:
>>     @np.vectorize
>>     def sillyslice(y, x, p):
>>         if x < p: return y[x:p]
>>         return y[p:x]
>>     r = sillyslice(y, x, p)
>> Being a separate function provides all the usual benefits: sillyslice is
>> reusable, debuggable, unit-testable, usable as a first-class object, etc.
>> But forget that; how would you do this at all with your $-syntax?
>> Since you didn't answer any of my other questions, I'll snip them and
>> repost shorter versions:
>> * what's wrong with using numpy? Nothing. What's wrong even with for loop
>> or assembly for that matter? I didn't argue that it's not possible to
>> achieve these things with assembly.
>> * what's wrong with APL or J or MATLAB? Not sure how relevant it is to
>> our core of conversation. Skipping this.
>> * what's wrong with making the operators elementwise instead of wrapping
>> the objects in some magic thing? The fact that whenever you
>> * what is the type of that magic thing anyway? It has no type. I refer
>> you to my very first email. In that email I exactly explained what it
>> means. It's at best a pseudo macro or something like that. It exactly is
>> equivalent when you write
> a;b=f(x;y)
> to
> a=f(x)
> b=f(y)
> In other words, if I could interpret my code before the Python interpreter
> interprets it, I would convert the first to the latter.

From leewangzhong+python at  Thu Jan 28 06:11:26 2016
From: leewangzhong+python at (Franklin? Lee)
Date: Thu, 28 Jan 2016 06:11:26 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

For personal use, I wrote a class. (Numpy takes a while to load on my machine.)

    vec = Vector([1,2,3])
    vec2 = vec + 5
    lst = vec2.tolist()

I also add attribute access.

    # creates a Vector of methods, which is then __call__'d
    vec_of_lengths = vec2.bit_length()

And multi-level indexing, similar to Numpy, though I think I might
want to remove indexing of the Vector itself for consistency.

   vec = Vector([{1:2}, {1:3}, {1:4}])

   values = vec[:, 1] # Vector([2,3,4])

And multiple versions of `.map()` (which have different ways of
interpreting the additional arguments).

But this is for personal use. Dirty scripting.
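
A rough sketch of how such a class could work, for anyone curious. This is not the actual class described above: the method set and the _broadcast helper are assumptions for illustration, and only + and attribute access are covered.

```python
import operator


class Vector:
    """Hypothetical minimal stand-in for the Vector class described above."""

    def __init__(self, items):
        self._items = list(items)

    def _broadcast(self, op, other):
        # Pair up with another Vector, or apply a scalar to every element.
        if isinstance(other, Vector):
            return Vector(op(a, b) for a, b in zip(self._items, other._items))
        return Vector(op(a, other) for a in self._items)

    def __add__(self, other):
        return self._broadcast(operator.add, other)

    def __getattr__(self, name):
        # Only reached when normal lookup fails; private names are not
        # forwarded, which also avoids recursion if _items is missing.
        if name.startswith("_"):
            raise AttributeError(name)
        return Vector(getattr(item, name) for item in self._items)

    def __call__(self, *args, **kwargs):
        # A Vector of callables calls each element with the same arguments.
        return Vector(item(*args, **kwargs) for item in self._items)

    def tolist(self):
        return list(self._items)


vec = Vector([1, 2, 3])
vec2 = vec + 5
lengths = vec2.bit_length()  # Vector of bound methods, then called
print(vec2.tolist())     # [6, 7, 8]
print(lengths.tolist())  # [3, 3, 4]
```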

On Thu, Jan 28, 2016 at 3:26 AM, Mirmojtaba Gharibi
<mojtaba.gharibi at> wrote:
> On Wed, Jan 27, 2016 at 4:15 PM, Andrew Barnert <abarnert at>
> wrote:
>> But the form you're suggesting doesn't work for vectorizing arbitrary
>> functions, only for operator expressions (including simple function calls,
>> but that doesn't help for more general function calls). The fact that numpy
>> is a little harder to read for cases that your syntax can't handle at all is
>> hardly a strike against numpy.
> I don't need to vectorize the functions. It's already being done.
> Consider the ; example below:
> a;b = f(x;y)
> it is equivalent to
> a=f(x)
> b=f(y)
> So in effect, in your terminology, it is already vectorized.
> Similar example only with $:
> a=[0,0,0,0]
> x=[1,2,3,4]
> $a=f($x)
> is equivalent to
> a=[0,0,0,0]
> x=[1,2,3,4]
> for i in range(len(a)):
> ...a[i]=f(x[i])

Is that really a _benefit_ of this design? "Explicit is better than implicit."

>> And, as I already explained, for the cases where your form _does_ work,
>> numpy already does it, without all the sigils:
>>     c = a + b
>>     c = a*a + 2*a*b + b*b
>>     c = (a * b).sum()
>> It also works nicely over multiple dimensions. For example, if a and b
>> are both arrays of N 3-vectors instead of just being 3-vectors, you can
>> still elementwise-add them just with +; you can sum all of the results with
>> sum(axis=1); etc. How would you write any of those things with your
>> $-syntax?
> I'm happy you brought vectorize to my attention though. I think as soon
> as you make the statement just a bit complex, it would become really
> complicated with vectorize.
> For example let's say you have
> x=[1,2,3,4,5,...]
> y=['A','BB','CCC',...]
> p=[2,3,4,6,6,...]
> r=[]*n
> $r = str(len($y*$p)+$x)
> It would be really complex to calculate such a thing with vectorize.

I think you misunderstand the way to use np.vectorize. You would write
a function, and then np.vectorize it.

    def some_name(yi, pi, xi):
        return str(len(yi * pi) + xi)

    r = np.vectorize(some_name)(y, p, x)

In terms of readability, it's probably better: you're describing an
action you would take, and then (by vectorizing it) saying that you'll
want to repeat that action multiple times.
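
The same shape also works without numpy, using only builtins. A small sketch, with the sample lists shortened to three elements (the "..." in the quoted example is not filled in here):

```python
def some_name(yi, pi, xi):
    # One step of the quoted "$r = str(len($y*$p)+$x)" expression.
    return str(len(yi * pi) + xi)


# Shortened sample data; the values are assumptions for illustration.
x = [1, 2, 3]
y = ["A", "BB", "CCC"]
p = [2, 3, 4]

# map() with several iterables walks them in parallel, like zip().
r = list(map(some_name, y, p, x))
print(r)  # ['3', '8', '15']
```

The list comprehension `[some_name(yi, pi, xi) for yi, pi, xi in zip(y, p, x)]` gives the same result.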

>> * what's wrong with using numpy?
> Nothing. What's wrong even with for loop
> or assembly for that matter? I didn't argue that it's not possible to
> achieve these things with assembly.

He's asking what benefit it has over using Numpy. You are proposing a
change, and must justify the additional code, API, programmer head
room, and maintenance burden. Why do you want this feature over the
existing options?

>> * what is the type of that magic thing anyway?
> It has no type. I refer
> you to my very first email. In that email I exactly explained what it means.
> It's at best a pseudo macro or something like that. It exactly is equivalent
> when you write
> a;b=f(x;y)
> to
> a=f(x)
> b=f(y)
> In other words, if I could interpret my code before the Python interpreter
> interprets it, I would convert the first to the latter.

That's very magical. Magic is typically bad. Have you considered the
cost to people learning to read Python?

I also hate that it doesn't have a type. I don't see

    a;b = f(x;y)

as readable (semicolons look sort of like commas to my weak eyes) or
useful (unlike the "$x = f($y)" case). Compare with

    a, b = map(f, (x, y))

Any vectorization syntax allows us to write vectorized expressions
without the extra semicolon syntax.

    a, b = f($[x, y])
    $(a, b) = f($[x, y])

    a, b = f*(x, y)

    a, b, c = $(x1, y1, z1) + $(x2, y2, z2)

    two_dimensional_array = $(1, 2, 3) * $$(1, 2, 3)
    # == ((1, 2, 3),
    #     (2, 4, 6),
    #     (3, 6, 9))
    # (Though how would you vectorize over two dimensions? Syntax is hard.)

> innerProduct =0
> innerProduct += $a * $b

I don't like this at all. A single `=` or `+=` meaning an unbounded
number of assignments? This piece of code should really just use

With theoretical syntax:

    innerProduct = sum(^($a * $b))

where the ^ (placeholder symbol) will force collapse to a
list/tuple/iterator so that the vectorization doesn't "escape" and get
interpreted as

    innerProduct = [sum(ax * bx) for ax, bx in zip(a, b)]
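
For comparison, a sketch of the inner product in today's Python, with no new syntax. The sample values are made up for illustration:

```python
a = [1, 2, 3]
b = [4, 5, 6]

# One generator expression, one assignment, one reduction.
inner_product = sum(ax * bx for ax, bx in zip(a, b))
print(inner_product)  # 32
```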

Anyway, there's a lot to think about, and a lot of potential issues of
ambiguity to argue over. That's why I just put these kinds of ideas in
my notes about language designs, rather than try to think about how
they could fit into Python (or any existing language).

(Speaking of language design and syntax, vectorization syntax is
related to lambda literals: they need a way to make sure that
<something> doesn't escape. Arc Lisp uses square brackets instead of
the normal parentheses for lambdas, which bind the `_` parameter

P.S.: In the Gmail editor's bottom-right corner, click the triangle,
and set Plain Text Mode. You can also go to Settings -> General and
turn on "Reply All" as the default behavior, though this won't set it
for mobile.

From steve at  Thu Jan 28 06:20:48 2016
From: steve at (Steven D'Aprano)
Date: Thu, 28 Jan 2016 22:20:48 +1100
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, Jan 27, 2016 at 05:51:46PM -0800, Andrew Barnert wrote:
> On Jan 27, 2016, at 16:12, Steven D'Aprano <steve at> wrote:
> > 
> > I think you would be better off trying to get better support for 
> > vectorized operations into Python:
> I really think, at least 90% of the time, and probably a lot more, 
> people are better off just using numpy than reinventing it. 

Oh I agree.

> There are a lot more edge cases than you think. For example, you 
> thought far enough ahead that your sum that works column-wise on 2D 
> arrays. But what about when you need to work row-wise?

I thought of all those questions, and honestly I'm not sure what the 
right answer is. But the nice thing about writing code for the simple 
use-cases is that you don't have to worry about the hard use-cases :-)

I'm mostly influenced by the UI of calculators like the HP-48GX and the 
TI CAS calculators, and typically they don't give you the option. If you 
want to do an average across the row, transpose your data :-)

> And after all that, what would be the benefit? I no longer have to 
> install numpy--but now I have to install pyvec instead. Which is just 
> a less-featureful, less-tested, less-optimized, and less-refined 
> numpy.

True, true, and for many people that's probably a deal-breaker. But for 
others, more features == more things you don't understand and don't know 
why you would ever need them.

Anyway, I'm not proposing that any of this should end up in the stdlib, 
so while I could waffle on for hours, I should bring this to a close 
before it goes completely off-topic.


From srkunze at  Thu Jan 28 11:50:32 2016
From: srkunze at (Sven R. Kunze)
Date: Thu, 28 Jan 2016 17:50:32 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

Some feedback on:

Where do I put this specific piece of code (sys.set_code_transformers([]))?


From victor.stinner at  Thu Jan 28 11:53:35 2016
From: victor.stinner at (Victor Stinner)
Date: Thu, 28 Jan 2016 17:53:35 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

2016-01-28 17:50 GMT+01:00 Sven R. Kunze <srkunze at>:
> Some feedback on:
> Where do I put this specific piece of code (sys.set_code_transformers([]))?

It's better to use -o noopt command, but if you want to call directly
sys.set_code_transformers(), you have to call it before the first
import. Example of
--
sys.set_code_transformers([])
import module
module.main()
--


From srkunze at  Thu Jan 28 11:57:08 2016
From: srkunze at (Sven R. Kunze)
Date: Thu, 28 Jan 2016 17:57:08 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 28.01.2016 17:53, Victor Stinner wrote:
> 2016-01-28 17:50 GMT+01:00 Sven R. Kunze <srkunze at>:
>> Some feedback on:
>> Where do I put this specific piece of code (sys.set_code_transformers([]))?
> It's better to use -o noopt command, but if you want to call directly
> sys.set_code_transformers(), you have to call it before the first
> import. Example of
> --
> sys.set_code_transformers([])
> import module
> module.main()
> --

I suspected that. So, where is this place of "before the first" import?

From victor.stinner at  Thu Jan 28 12:03:58 2016
From: victor.stinner at (Victor Stinner)
Date: Thu, 28 Jan 2016 18:03:58 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

2016-01-28 17:57 GMT+01:00 Sven R. Kunze <srkunze at>:
> I suspected that. So, where is this place of "before the first" import?

I don't understand your question.

I guess that your real question is: are stdlib modules loaded with
peephole optimizer enabled or not?

If you use -o noopt, you are safe: the peephole optimizer is disabled
before the first Python import.

If you use sys.set_code_transformers([]) in your code, it's likely
that Python already imported 20 or 40 modules during its
initialization (especially in the site module).
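
A quick way to check this on a given interpreter is to look at sys.modules as the very first thing a script does. This is only a sketch: the count differs between Python versions, platforms and startup options, so no particular number is claimed here.

```python
import sys

# Everything listed here was imported before this line of user code ran.
already_loaded = sorted(sys.modules)
print(len(already_loaded), "modules were imported before user code ran")
```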

It's up to you to pick the best option. There are different usages for
each option. Maybe you just don't care about the stdlib, you only want to
debug your application code, so it doesn't matter how the stdlib is
optimized?


Or are you asking me to remove sys.set_code_transformers([]) from the
section "Usage 3: Disable all optimization"? I don't understand.


From srkunze at  Thu Jan 28 12:46:06 2016
From: srkunze at (Sven R. Kunze)
Date: Thu, 28 Jan 2016 18:46:06 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On 28.01.2016 18:03, Victor Stinner wrote:
> I don't understand your question.
> I guess that your real question is: are stdlib modules loaded with
> peephole optimizer enabled or not?
> If you use -o noopt, you are safe: the peephole optimizer is disabled
> before the first Python import.
> If you use sys.set_code_transformers([]) in your code, it's likely
> that Python already imported 20 or 40 modules during its
> initialization (especially in the site module).
> It's up to you to pick the best option. There are different usages for
> each option. Maybe you just don't care about the stdlib, you only want to
> debug your application code, so it doesn't matter how the stdlib is
> optimized?
> --
> Or are you asking me to remove sys.set_code_transformers([]) from the
> section "Usage 3: Disable all optimization"? I don't understand.
That is exactly the issue with setting a transformer at runtime which I 
don't understand. That is one weakness of the PEP; some people already 
proposed to make a difference between

- local transformation
- global transformation

I can understand the motivation to have the same API for both, but they are 
inherently different and it makes talking about it hard (as we can see 
now). I would like to have this clarified in the PEP (use consistent 
wording) or even split it up into two different parts of the PEP.

You said I would need to call the function before all imports. Why is 
that? Can I not call it twice in the same file? Or in a loop? What 
will happen? Will the file get recompiled each time? Some people 
proposed a "from __extensions__ import my_extension"; inspired by 
__future__ imports, i.e. it is forced to be at the top. Why? Because it 
somehow makes sense to perform all transformations the first time a file 
is loaded. I don't see that addressed in the PEP. I have to admit I 
would prefer this kind of usage over a function call.

- we already have import hooks. They can be used for local 
transformation. I don't see that addressed in the PEP.
- after re-reading the PEP I have some difficulty seeing how to 
activate, say, 2 custom transformers **globally** (via -o). Maybe 
adding an example would help here.

From jimjjewett at  Thu Jan 28 14:51:17 2016
From: jimjjewett at (Jim J. Jewett)
Date: Thu, 28 Jan 2016 11:51:17 -0800 (PST)
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
Message-ID: <>

On Wed Jan 27 15:49:12 EST 2016, Andrew Barnert wrote:

> both C# and Ruby made breaking changes from the Python behavior
> to the Swift behavior, because they couldn't find any legitimate code
> that would be broken by that change. And there have been few if any
> complaints since. If we really are considering adding something like
> "for let", we should seriously consider whether anyone would ever have
> a good reason to use "for" instead of "for let". If not, just change
> "for" instead.

The first few times I saw this, I figured Python had a stronger (and
longer) backwards compatibility guarantee.

But now that I consider the actual breakage, I'm not so sure...

    >>> for i in range(10):
            print (i)

i is explicitly changed, but it doesn't affect the flow control --
it gets reset to the next sequence item as if nothing had happened.

It would break things to hide the final value of i after the loop
is over, but that isn't needed.

I think the only way it even *could* matter is if the loop variable is
captured in a closure each time through the loop.  What would it
look like for the current behavior to be intentional?

    >>> for cache in (4, 5, 6, {}):
            def f():
                cache['haha!'] = "I know only the last will really get used!"
            funcs.append(f)
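
For readers who have not run into this, here is a small sketch of the capture behavior under discussion. The function names are made up for illustration. A closure looks up the loop variable when it is called, so every function sees the last value unless the value is bound at definition time.

```python
def make_late_bound():
    funcs = []
    for i in range(3):
        def f():
            return i  # looked up when f is called, not when it is defined
        funcs.append(f)
    return funcs


def make_early_bound():
    funcs = []
    for i in range(3):
        def f(i=i):  # default value is evaluated at definition time
            return i
        funcs.append(f)
    return funcs


late = [f() for f in make_late_bound()]
early = [f() for f in make_early_bound()]
print(late)   # [2, 2, 2]
print(early)  # [0, 1, 2]
```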



If there are still threading problems with my replies, please
email me with details, so that I can try to resolve them.  -jJ

From ethan at  Thu Jan 28 15:09:00 2016
From: ethan at (Ethan Furman)
Date: Thu, 28 Jan 2016 12:09:00 -0800
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/28/2016 11:51 AM, Jim J. Jewett wrote:

> I think the only way it even *could* matter is if the loop variable is
> captured in a closure each time through the loop.  What would it
> look like for the current behavior to be intentional?
>      >>> for cache in (4, 5, 6, {}):
>              def f():
>                  cache['haha!'] = "I know only the last will really get used!"
>              funcs.append(f)

I think that falls into the "not legitimate" category.  ;)


From brett at  Thu Jan 28 15:36:34 2016
From: brett at (Brett Cannon)
Date: Thu, 28 Jan 2016 20:36:34 +0000
Subject: [Python-ideas] PEP 511: Add a check function to decide if a
 "language extension" code transformer should be used or not
In-Reply-To: <>
References: <>
Message-ID: <>

On Wed, 27 Jan 2016 at 13:34 Andrew Barnert <abarnert at> wrote:

> On Jan 27, 2016, at 12:20, Brett Cannon <brett at> wrote:
> On Wed, 27 Jan 2016 at 08:49 Andrew Barnert via Python-ideas <
> python-ideas at> wrote:
>> If someone builds a transformer that only runs on files with a different
>> extension, he already needs an import hook, so he might as well just call
>> his transformer from the import hook, same as he does today.
> And the import hook is not that difficult.
> Unless it has to work in 2.7 and 3.3 (or, worse, 2.6 and 3.2). :)

Sure, but you're already asking for a lot of pain if you're trying to be
that compatible at the AST/bytecode level so I view this as the least of
your worries. :)

> You can reuse everything from importlib without modification except for
> needing to override a single method in some loader to do your
> transformation
> Yes, as of 3.4, the design is amazing. In fact, hooking any level--lookup,
> source, AST, bytecode, or pyc--is about as easy as it could be.
> My only complaint is that it's not easy enough to find out how easy import
> hooks are. When I tell people "you could write a simple import hook to play
> with that idea", they get a look of fear and panic that's completely
> unwarranted and just drop their cool idea. (I wonder if having complete
> examples of a simple global-transformer hook and a simple special-extension
> hook at the start of the docs would be enough to solve that problem?)

So two things. One is that there is an Examples section in the importlib
docs for 3.6: .
As of right now it only covers use-cases that the `imp` module provided
since that's the most common thing I get asked about.

Second, while it's much easier than it has ever been to do fancy stuff with
import, it's a balancing act of promoting it and discouraging it. :) Mess
up your import and it can be rather hard to debug. And this is especially
true if you hook in early enough such that you start to screw up stuff in
the stdlib and not just your own code. It can also lead to people going a
bit overboard with things (hence why I kept my life simple with the
LazyLoader and actively discourage its use unless you're sure you need it).
So it's a balance of "look at this shiny thing!" and "be careful because
you might come out screaming".

> And I'm a bit worried that if Victor tries to make things like MacroPy and
> Hy easier, it still won't be enough for real-life cases, so all it'll do is
> discourage people from going right to writing import hooks and seeing how
> easy that already is.

We don't need to empower every use-case as much as possible. While we're
consenting adults, we also try to prevent people from making their own
lives harder. All of this stuff is a tough balancing act to get right.

From brett at  Thu Jan 28 15:44:09 2016
From: brett at (Brett Cannon)
Date: Thu, 28 Jan 2016 20:44:09 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

On Wed, 27 Jan 2016 at 20:57 Kevin Conway <kevinjacobconway at>
wrote:

> I'm willing to take this conversation offline as it seems this thread has
> cooled down quite a bit.
> I would still like to hear more, though, about how adding this as a
> facility in the language improves over the current, external
> implementations of Python code optimizers. Python already has tools for
> reading in source files, parsing them into AST, modifying that AST, and
> writing the final bytecode to files as part of the standard library. I
> don't see anything in PEP0511 that improves upon that.
> Out of curiosity, do you consider this PEP as adding something to Python
> that didn't previously exist or do you consider this PEP to be more aligned
> with PEP0249 (DB2API) and PEP0484 (Type Hints) which are primarily designed
> to marshal the community in a common direction? I understand that you have
> other PEPs in flight that are designed to make certain optimizations easier
> (or possible). Looking at this PEP in isolation, however, leaves me wanting
> more explanation as to its value.

The PEP is about empowering people to write AST transformers without having
to use third-party tools to integrate it into their workflow. As you
pointed out, there is very little here that isn't possible today with some
toolchain that reads Python source code, translates it into an AST,
optimizes it, and then writes out the .pyc file. But that all does require
going to PyPI or writing your own solution.

But if Victor's PEP gets in, then there will be a standard hook point that
all Python code will go through which will make adding AST transformers
much easier. Whether this ease of use is beneficial is part of the
discussion around this PEP.

> You mention the need for monkey-patching or hooking into the import
> process as a part of the rationale. The PyCC project, while it may not be
> the best example for optimizer design, does not need to patch or hook into
> anything to function. Instead, it acts as an alternative bytecode compiler
> that drops .pyc just like the standard compiler would. Other than the
> trade-off of using a 3rd party library versus adding a -o flag, what
> significant advantage does a sys.add_optimizer() call provide?

The -o addition is probably the biggest thing the PEP is proposing. The
overwriting of .pyc files with optimizations that are not necessarily
expected is not the best, so -o would allow for stopping the abuse of .pyc
file naming. The AST registration parts is all just to make this stuff


> Again, I'm very much behind your motivation and hope you are incredibly
> successful in making Python a faster place to live. I'm only trying to get
> in your head and see what you see.
> On Wed, Jan 27, 2016 at 10:45 AM Victor Stinner <victor.stinner at>
> wrote:
>> Hi,
>> 2016-01-16 17:56 GMT+01:00 Kevin Conway <kevinjacobconway at>:
>> > I'm a big fan of your motivation to build an optimizer for cPython code.
>> > What I'm struggling with is understanding why this requires a PEP and
>> > language modification. There are already several projects that
>> > manipulate the AST for performance gains such as [1] or even my own
>> > ham fisted attempt [2].
>> Oh cool, I didn't know PyCC [2]! I added it to the PEP 511 in the AST
>> optimizers section of Prior Art.
>> I wrote astoptimizer [1] and this project uses monkey-patching of the
>> compile() function, I mentioned this monkey-patching hack in the
>> rationale of the PEP:
>> I would like to avoid monkey-patching because it causes various issues.
>> The PEP 511 also makes transformations more visible: transformers are
>> explicitly registered in sys.set_code_transformers() and the .pyc
>> filename is modified when the code is transformed.
>> It also adds a new feature: it becomes possible to run transformed
>> code without having to register the transformer at runtime. This is
>> made possible with the addition of the -o command line option.
>> Victor

From abarnert at  Thu Jan 28 16:13:08 2016
From: abarnert at (Andrew Barnert)
Date: Thu, 28 Jan 2016 21:13:08 +0000 (UTC)
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Thursday, January 28, 2016 12:44 PM, Brett Cannon <brett at> wrote:
>On Wed, 27 Jan 2016 at 20:57 Kevin Conway <kevinjacobconway at> wrote:

>>Out of curiosity, do you consider this PEP as adding something to Python that didn't previously exist or do you consider this PEP to be more aligned with PEP0249 (DB2API) and PEP0484 (Type Hints) which are primarily designed to marshal the community in a common direction? I understand that you have other PEPs in flight that are designed to make certain optimizations easier (or possible). Looking at this PEP in isolation, however, leaves me wanting more explanation as to its value.

>The PEP is about empowering people to write AST transformers without having to use third-party tools to integrate it into their workflow. As you pointed out, there is very little here that isn't possible today with some toolchain that reads Python source code, translates it into an AST, optimizes it, and then writes out the .pyc file. But that all does require going to PyPI or writing your own solution.

This kind of talk worries me. It's _already_ very easy to write AST transformers. There's no need for any third-party code from PyPI, and that "your own solution" that you have to write is a few lines of trivial code.

I think a lot of people don't realize this. Maybe because they tried it in 2.6 or 3.2, where it was a lot harder, or because they read the source to MacroPy (which is compatible with 2.6 and 3.2, or at least originally was), where it looks very hard, or maybe just because they didn't realize how much work has already been put in to make it easy. But whatever the reason, they're wrong. And so they're expecting this PEP to solve a problem that doesn't need to be solved.

>But if Victor's PEP gets in, then there will be a standard hook point that all Python code will go through which will make adding AST transformers much easier. Whether this ease of use is beneficial is part of the discussion around this PEP.

There already is a standard hook point that all Python code goes through. Writing an AST transformer is as simple as replacing the code that compiles source to bytecode with a 3-line function that compiles source to AST, calls your transformer, and compiles AST to bytecode. Processing source or bytecode instead of AST is just as easy (actually, one line shorter).
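
As a minimal sketch of that three-step function (an illustration added for clarity, not code from the PEP): parse the source to an AST, run a transformer over it, and compile the result. It assumes Python 3.8 or later, where string literals are ast.Constant nodes; NiTransformer and transform_and_compile are made-up names.

```python
import ast


class NiTransformer(ast.NodeTransformer):
    """Hypothetical transformer: replace every string literal."""

    def visit_Constant(self, node):
        # On Python 3.8+ every literal is an ast.Constant; only touch strings.
        if isinstance(node.value, str):
            node.value = "Ni! Ni! Ni!"
        return node


def transform_and_compile(source, filename="<demo>"):
    tree = ast.parse(source, filename, "exec")   # source -> AST
    tree = NiTransformer().visit(tree)           # AST -> transformed AST
    return compile(tree, filename, "exec")       # AST -> code object


namespace = {}
exec(transform_and_compile("greeting = 'hello'\nnumber = 42\n"), namespace)
print(namespace["greeting"], namespace["number"])
```

The string is rewritten and the integer is left alone. Nothing here touches the import system; wiring such a function into imports is the separate question of what to hook and how.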

Where it gets tricky is all the different variations on what you hook and how. Do you want to intercept all .py files? Or add a new extension, like .hy, instead? Or all source files, but only if they start with a magic marker line? How do you want to integrate with naming, finding, obsoleting, reading, and writing .pyc files? What about -O? And so on. And how do you want to work together with other libraries trying to do the same thing, which may have made slightly different decisions? Once you decide what you want, it's another few lines to write and install the hook that does that--the hard part is deciding what you want.

If this PEP can solve the hard part in a general way, so that the right thing to do for different kinds of transformers will be obvious and easy, that would be great.

If it can't do so, then it just shouldn't bother with anything that doesn't fit into its model of global semantic-free transformations. And that would also be great--making global semantic-free transformers easy is already a huge boon even if it doesn't do anything else, and keeping the design for that as simple as possible is better than making it more complex to partially solve other things in a way that only helps with the easiest parts.

From abarnert at  Thu Jan 28 17:01:53 2016
From: abarnert at (Andrew Barnert)
Date: Thu, 28 Jan 2016 22:01:53 +0000 (UTC)
Subject: [Python-ideas] Explicit variable capture list
In-Reply-To: <>
References: <>
Message-ID: <>

On Thursday, January 28, 2016 11:51 AM, Jim J. Jewett <jimjjewett at> wrote:

> On Wed Jan 27 15:49:12 EST 2016, Andrew Barnert wrote:
>>  both C# and Ruby made breaking changes from the Python behavior


> The first few times I saw this, I figured Python had a stronger (and
> longer) backwards compatibility guarantee.

Ruby, sure, but C#, I don't think so. Most of the worst warts in C# 6.0 are there for backward compatibility.[1]

> But now that I consider the actual breakage, I'm not so sure...
>     >>> for i in range(10):
>             print (i)
>             i=i+3
>             print(i)
> i is explicitly changed, but it doesn't affect the flow control --
> it gets reset to the next sequence item as if nothing had happened.

Yeah, that confusion is actually a separate issue. Explaining it in text is a bit difficult, but let's translate to the equivalent while loop:

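    _it = iter(range(10))
    try:
        while True:
            i = next(_it)  # rebinds i, discarding the old value
            print(i)
            i = i + 3      # changes i only until the next rebinding
    except StopIteration:
        pass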

Now it should be obvious why you aren't affecting the control flow.

And it should also be obvious why the "for let" change wouldn't make any difference here.

Could Python solve that confusion? Sure. Swift, Scala, and lots of other languages make the loop variable constant/read-only/l-immutable/whatever, so that "i=i+3" either fails to compile, or raises at runtime, with a "ConstError". The idea is that "i=i+3" is more often a confusing bug than intentional--and, when it is intentional, the workaround is trivial (just write "j=i+3" and use j). But in dynamic languages, const tends to be more annoying than useful, so the smart ones (like Python) don't bother with it.

[1] For example: Non-generic Task makes type inference for generic Task<T> much harder, and isn't used except by accident, but they added it anyway, in C# 5 in 2012, because it was needed for consistency with the non-generic collections, which have been deprecated since C# 2 in 2005 but can't be removed because some code might break.

From greg.ewing at  Thu Jan 28 19:27:26 2016
From: greg.ewing at (Greg Ewing)
Date: Fri, 29 Jan 2016 13:27:26 +1300
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
Message-ID: <>

Sven R. Kunze wrote:
> Some people 
> proposed a "from __extensions__ import my_extension"; inspired by 
> __future__ imports, i.e. it is forced to be at the top. Why? Because it 
> somehow makes sense to perform all transformations the first time a file 
> is loaded.

It occurs to me that a magic import for applying local
transformations could itself be implemented using a
global transformer.


From steve at  Thu Jan 28 20:01:58 2016
From: steve at (Steven D'Aprano)
Date: Fri, 29 Jan 2016 12:01:58 +1100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 28, 2016 at 09:13:08PM +0000, Andrew Barnert via Python-ideas wrote:

> This kind of talk worries me. It's _already_ very easy to write AST 
> transformers. There's no need for any third-party code from PyPI, and 
> that "your own solution" that you have to write is a few lines of 
> trivial code.
> I think a lot of people don't realize this.

I don't realise this.

Not that I don't believe you, but I'd like to see a tutorial that goes 
through this step by step and actually explains what this is all about. 
Or, if it really is just a matter of a few lines, even just a simple 
example might help.

For instance, the PEP includes a transformer that changes all string 
literals to "Ni! Ni! Ni!". Obviously it doesn't work as 
sys.set_code_transformers doesn't exist yet, but if I'm understanding 
you, we don't need that because it's already easy to apply that 
transformer. Can you show how? Something that works today?


From victor.stinner at  Thu Jan 28 19:57:02 2016
From: victor.stinner at (Victor Stinner)
Date: Fri, 29 Jan 2016 01:57:02 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

A local transformation requires registering a global code transformer,
but that doesn't mean that all files will be modified. The code
transformer can use various kinds of checks to decide whether a file must
be transformed and then which parts of the code should be transformed.
Decorators were suggested as a good granularity.
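
A rough sketch of that idea, assuming the "from __extensions__ import ..." marker discussed in this thread and Python 3.8+: the transformer sees every file, but only rewrites files that opt in. REGISTRY, is_marker and global_transformer are hypothetical names, and the sketch only looks at top-level statements instead of forcing the marker to the top of the file.

```python
import ast

MARKER = "__extensions__"  # hypothetical marker module name from this thread


class NiTransformer(ast.NodeTransformer):
    """Hypothetical local transformation: replace every string literal."""

    def visit_Constant(self, node):
        if isinstance(node.value, str):
            node.value = "Ni! Ni! Ni!"
        return node


REGISTRY = {"ni": NiTransformer}  # extension name -> transformer class


def is_marker(node):
    return isinstance(node, ast.ImportFrom) and node.module == MARKER


def global_transformer(source, filename="<demo>"):
    """Runs on every file, but only rewrites files that opt in."""
    tree = ast.parse(source, filename, "exec")
    wanted = [alias.name for node in tree.body if is_marker(node)
              for alias in node.names]
    # Drop the marker imports so they are never executed as real imports.
    tree.body = [node for node in tree.body if not is_marker(node)]
    for name in wanted:
        tree = REGISTRY[name]().visit(tree)
    return compile(tree, filename, "exec")


opted_in, untouched = {}, {}
exec(global_transformer("from __extensions__ import ni\nword = 'spam'\n"), opted_in)
exec(global_transformer("word = 'spam'\n"), untouched)
print(opted_in["word"], untouched["word"])
```

Only the file carrying the marker import has its string replaced; the other file compiles unchanged.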


2016-01-29 1:27 GMT+01:00 Greg Ewing <greg.ewing at>:
> Sven R. Kunze wrote:
>> Some people proposed a "from __extensions__ import my_extension"; inspired
>> by __future__ imports, i.e. it is forced to be at the top. Why? Because it
>> somehow makes sense to perform all transformations the first time a file is
>> loaded.
> It occurs to me that a magic import for applying local
> transformations could itself be implemented using a
> global transformer.
> --
> Greg

From abarnert at  Thu Jan 28 22:10:39 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 29 Jan 2016 03:10:39 +0000 (UTC)
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Thursday, January 28, 2016 5:07 PM, Steven D'Aprano <steve at> wrote:

> > On Thu, Jan 28, 2016 at 09:13:08PM +0000, Andrew Barnert via Python-ideas wrote:
>>  This kind of talk worries me. It's _already_ very easy to write AST 
>>  transformers. There's no need for any third-party code from PyPI, and 
>>  that "your own solution" that you have to write is a few lines of 
>>  trivial code.
>>  I think a lot of people don't realize this.
> I don't realise this.
> Not that I don't believe you, but I'd like to see a tutorial that goes 
> through this step by step and actually explains what this is all about. 
> Or, if it really is just a matter of a few lines, even just a simple 
> example might help.

I agree, but someone (Brett?) on one of these threads explained that they don't include such a tutorial in the docs because they don't want to encourage people to screw around with import hooks too much, so...

Anyway, I wrote a blog post about this last year, but I'll summarize it here. I'll show the simplest code for hooking in a source, AST, or bytecode transformer, not the most production-ready.

> For instance, the PEP includes a transformer that changes all string 
> literals to "Ni! Ni! Ni!". Obviously it doesn't work as 
> sys.set_code_transformers doesn't exist yet, but if I'm understanding 
> you, we don't need that because it's already easy to apply that 
> transformer. Can you show how? Something that works today?

Sure. Here's an AST transformer:

    import ast

    class NiTransformer(ast.NodeTransformer):
        # Called for every string literal (ast.Str) node.
        def visit_Str(self, node):
            node.s = 'Ni! Ni! Ni!'
            return node

Here's a complete loader implementation that uses the hook:

    import importlib.machinery
    import importlib.util

    class NiLoader(importlib.machinery.SourceFileLoader):
        def source_to_code(self, data, path, *, _optimize=-1):
            # importlib.util.decode_source is the public spelling (3.4+).
            source = importlib.util.decode_source(data)
            tree = NiTransformer().visit(ast.parse(source, path, 'exec'))
            return compile(tree, path, 'exec')

Now, how do you install the hook? That depends on what exactly you want to do. Let's say you want to make it globally hook all .py files, be transparent to .pyc generation, and ignore -O, and you'd prefer a monkeypatch hack that works on all versions 3.3+, rather than a clean spec-based finder that requires 3.5. Here goes:

    finder = sys.meta_path[-1]
    loader = finder.find_module(__file__)
    loader.source_to_code = NiLoader.source_to_code

Just put all this code in your top level script, or just put it in a module and import that in your top level script, either way before importing anything else. (And yes, "before importing anything else" means some bits of the stdlib end up processed and some don't, just as with PEP 511.) You can see it in action at
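
For comparison, here is a sketch that avoids the global monkeypatch and applies the loader to a single file through a module spec. It is an illustration under stated assumptions rather than the code from the blog post: it needs Python 3.8+ (for visit_Constant), and the module name "knights" and the temporary file are made up for the demo.

```python
import ast
import importlib.machinery
import importlib.util
import os
import tempfile


class NiTransformer(ast.NodeTransformer):
    def visit_Constant(self, node):  # Python 3.8+ spelling of visit_Str
        if isinstance(node.value, str):
            node.value = "Ni! Ni! Ni!"
        return node


class NiLoader(importlib.machinery.SourceFileLoader):
    def source_to_code(self, data, path, *, _optimize=-1):
        # ast.parse accepts the raw bytes and honours any coding declaration.
        tree = NiTransformer().visit(ast.parse(data, path, "exec"))
        return compile(tree, path, "exec", dont_inherit=True, optimize=_optimize)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "knights.py")  # hypothetical demo module
    with open(path, "w") as f:
        f.write("GREETING = 'hello'\n")
    spec = importlib.util.spec_from_file_location(
        "knights", path, loader=NiLoader("knights", path))
    knights = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(knights)

print(knights.GREETING)
```

Because only this one spec uses NiLoader, nothing else in the process is transformed. Note that the loader may still write the transformed code to a .pyc next to the source under the ordinary name, which is exactly the kind of decision discussed above.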

PEP 511 writes the NiLoader part for you, but, as you can see, that's the easiest part of the whole thing.

If you want all the exact same choices that the PEP makes (global, .py files only, insert name into .pyc files, integrate with -O and -o, promise to be semantically neutral, etc.), it also makes the last part trivial, which is a much bigger deal. If you want any different choices, it doesn't help with the last part at all. (And I think that's fine, as long as that's the intention. Right now, someone has to have some idea of what they're doing to use my hack, and that's probably a good thing, right? And if I want to clean it up and make it distributable, like MacroPy, I'd better know how to write a spec finder or I have no business distributing any such thing. But if people want to experiment with optimizers that don't actually change the behavior of their code, that's a lot safer, so it seems reasonable that we should focus on making that easier.)

From abarnert at  Thu Jan 28 22:30:12 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 29 Jan 2016 03:30:12 +0000 (UTC)
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Thursday, January 28, 2016 7:10 PM, Andrew Barnert <abarnert at> wrote:

Immediately after sending that, I realized that Victor's PEP uses a bytecode transform rather than an AST transform. That isn't much harder to do today. Here's a quick, untested version:

    import importlib.machinery
    import types

    def ni_transform(c):
        consts = []
        for const in c.co_consts:
            if isinstance(const, str):
                consts.append('Ni! Ni! Ni!')
            elif isinstance(const, types.CodeType):
                # Recurse into nested code objects (functions, classes).
                consts.append(ni_transform(const))
            else:
                consts.append(const)
        # Constructor signature for Python 3.3 to 3.7.
        return types.CodeType(
            c.co_argcount, c.co_kwonlyargcount, c.co_nlocals, c.co_stacksize,
            c.co_flags, c.co_code, tuple(consts), c.co_names, c.co_varnames,
            c.co_filename, c.co_name, c.co_firstlineno, c.co_lnotab,
            c.co_freevars, c.co_cellvars)

    class NiLoader(importlib.machinery.SourceFileLoader):
        def source_to_code(self, data, path, *, _optimize=-1):
            return ni_transform(compile(data, path, 'exec'))

You may still need the decode_source bit, at least on some of the Python versions; I can't remember. If so, add that one line from the AST version.

Installing the hook is the same as the AST version.

You may notice that I have that horrible 15-argument constructor, and the PEP doesn't. But that's because the PEP is basically cheating with this example. For some reason, it passes 3 of those arguments separately--consts, names, and lnotab. If you modify anything else, you'll need the same horrible constructor. And, in any realistic bytecode transformer, you will need to modify something else. For example, you may want to transform the bytecode.
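
On Python 3.8 and later, code objects have a replace() method that copies a code object with selected fields changed, so the long constructor can be avoided when only a few fields change. A minimal sketch of the same constants-only transform, assuming 3.8+ (the demo source and names are made up):

```python
import types


def ni_transform(code):
    """Return a copy of `code` with every string constant replaced."""
    new_consts = []
    for const in code.co_consts:
        if isinstance(const, str):
            new_consts.append("Ni! Ni! Ni!")
        elif isinstance(const, types.CodeType):
            # Recurse into nested code objects (functions, classes).
            new_consts.append(ni_transform(const))
        else:
            new_consts.append(const)
    return code.replace(co_consts=tuple(new_consts))


source = "def greet():\n    return 'hello'\nresult = greet()\n"
namespace = {}
exec(ni_transform(compile(source, "<demo>", "exec")), namespace)
print(namespace["result"])
```

This still only swaps constants. Changing the instructions themselves runs into the problems described in the next paragraph.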

And meanwhile, once you start actually transforming bytecode, that becomes the hard part, and PEP 511 won't help you there. If you just want to replace every LOAD_GLOBAL with a LOAD_CONST, you can do that in a pretty simple loop with a bit of help from the dis module. But if you want to insert and delete bytecodes like the existing peephole optimizer in C does, then you're also dealing with renumbering jump targets and rebuilding the lnotab and other fun things. And if you start dealing with opcodes that change the stack effect nonlocally, like with and finally handlers, you'd have to be an idiot or a masochist to not reach for a third-party library like byteplay. (I know this because I'm enough of an idiot to have done it once, but not enough of an idiot or a masochist to do it again...).
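
As an illustration of the "bit of help from the dis module" part only, this sketch finds the LOAD_GLOBAL instructions in a code object. It does not rewrite anything, and global_loads and sample are hypothetical names:

```python
import dis


def global_loads(code):
    """Return the names read by LOAD_GLOBAL in one code object."""
    return {ins.argval for ins in dis.get_instructions(code)
            if ins.opname == "LOAD_GLOBAL"}


def sample():
    return len(data)  # 'len' and 'data' are both looked up as globals


print(sorted(global_loads(sample.__code__)))
```

Locating the instructions is the simple loop. Replacing them means rebuilding co_code and co_consts, and anything that changes instruction sizes brings in the jump-target and line-table bookkeeping mentioned above.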

So, again, PEP 511 isn't helping with the hard part. But, again, I think that may be fine. (Someone who knows how to use byteplay well enough to build a semantically-neutral optimizer function decorator, I'll trust him to be able to turn that into a global optimizer with one line of code. But if he wants to hook things in transparently to .pyc files, or to provide actual language extensions, or something like that, I think it's OK to make him do a bit more work before he can give it to me as production-ready code.)

From mike at  Thu Jan 28 22:55:42 2016
From: mike at (Michael Selik)
Date: Thu, 28 Jan 2016 21:55:42 -0600
Subject: [Python-ideas] A bit meta
Message-ID: <>

One defect of a mailing list is the difficulty of viewing a weighted average of opinions. The benefit is that anyone can voice an opinion. This is more like the Senate than the House -- Rhode Island appears (on paper) to have as much influence as California. Luckily, we have a form of President. I'm guessing a House occurs in a more private mode of communication?

Perhaps as the community gets larger, a system like StackOverflow might be a better tool for handling things like Python-Ideas.

> On Jan 27, 2016, at 12:58 PM, Sjoerd Job Postmus <sjoerdjob at> wrote:
> (not sure if I even have the right to vote here, given that I'm not a
> core developer, but just giving my opinion)

From steve at  Fri Jan 29 03:03:50 2016
From: steve at (Steven D'Aprano)
Date: Fri, 29 Jan 2016 19:03:50 +1100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Thu, Jan 28, 2016 at 09:55:42PM -0600, Michael Selik wrote:

> One defect of a mailing list is the difficulty of viewing a weighted 
> average of opinions. The benefit is that anyone can voice an opinion. 
> This is more like the Senate than the House -- Rhode Island appears 
> (on paper) to have as much influence as California. Luckily, we have a 
> form of President. I'm guessing a House occurs in a more private mode 
> of communication?

The Python community is not a democracy. Voting +1, -1 etc. should not 
be interpreted as *actual* votes that need to be counted and averaged, but 
as personal opinions intended to give other members of the community an 
idea of whether or not you would like to see a proposed feature.


From encukou at  Fri Jan 29 04:11:26 2016
From: encukou at (Petr Viktorin)
Date: Fri, 29 Jan 2016 10:11:26 +0100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/29/2016 04:55 AM, Michael Selik wrote:
> One defect of a mailing list is the difficulty of viewing a weighted average of opinions. The benefit is that anyone can voice an opinion. This is more like the Senate than the House -- Rhode Island appears (on paper) to have as much influence as California. Luckily, we have a form of President. I'm guessing a House occurs in a more private mode of communication?

I've read up a bit on Wikipedia, so I'll try to start summarizing the
reference for the non-Americans who come after me.

One part of the US government is the "Congress", which is divided into
two "houses": the "Senate" and the House of Representatives (which, I
assume, is *the* "House").

Members of the House correspond to "districts", which are determined by
population (roughly, but the details seem irrelevant here) -- so each
member of the House corresponds roughly to some fixed number of people.
On the other hand, the Senate has two members for each "state", but
states aren't determined by population: "Rhode Island" has many fewer
people than "California". (Unsurprising, I might add: I never hear about
Rhode Island, but California makes it to local news here at times.)

There is also a "President", who doesn't seem to have as much power as
Python's BDFL: he/she can veto decisions of the Congress, but that veto
can in turn be overridden by the Congress.

Trying to hold all these details in my head while thinking how they
relate to mailing list discussions leaves me quite confused.

Would it be possible to make the argument clearer to people who need to
look these things up to understand it?

> Perhaps as the community gets larger, a system like StackOverflow might be a better tool for handling things like Python-Ideas.
>> On Jan 27, 2016, at 12:58 PM, Sjoerd Job Postmus <sjoerdjob at> wrote:
>> (not sure if I even have the right to vote here, given that I'm not a
>> core developer, but just giving my opinion)

From ncoghlan at  Fri Jan 29 09:10:02 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 30 Jan 2016 00:10:02 +1000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On 29 January 2016 at 13:30, Andrew Barnert via Python-ideas
<python-ideas at> wrote:
> So, again, PEP 511 isn't helping with the hard part. But, again, I think that may be fine. (Someone who knows how to use byteplay well enough to build a semantically-neutral optimizer function decorator, I'll trust him to be able to turn that into a global optimizer with one line of code. But if he wants to hook things in transparently to .pyc files, or to provide actual language extensions, or something like that, I think it's OK to make him do a bit more work before he can give it to me as production-ready code.)

Rather than trying to categorise things as "hard" or "easy", I find it
to be more helpful to categorise them as "inherent complexity" or
"incidental complexity".

With inherent complexity, you can never eliminate it, only move it
around, and perhaps make it easier to hide from people who don't care
about the topic (cf. the helper classes in importlib, which hide a lot
of the inherent complexity of the import system). With incidental
complexity though, you may be able to find ways to eliminate it
entirely.

For a lot of code transformations, determining a suitable scope of
application is *inherent* complexity: you need to care about where the
transformation is applied, as it actually matters for that particular
use case.

For semantically significant transforms, scope of application is
inherent complexity, as it affects code readability, and may even be
an error if applied inappropriately. This is why:
- the finer-grained control offered by decorators is often preferred
to metaclasses or import hooks
- custom file extensions or in-file markers are typically used to opt
in to import hook processing

In these cases, whether or not the standard library is processed
doesn't matter, since it will never use the relevant decorator, file
extension or in-file marker. You also don't need to worry about subtle
start-up bugs, since if the decorator isn't imported, or the relevant
import hook isn't installed appropriately, then the code that depends
on that happening simply won't run.

This means the only code transformation cases where determining scope
of applicability turns out to be *incidental* complexity are those
that are intended to be semantically neutral operations. Maybe you're
collecting statistics on opcode frequency, maybe you're actually
applying safe optimisations, maybe you're doing something else, but
the one thing you're promising is that if the transformation breaks
code that works without the transformation applied, then it's a *bug
in the transformer*, not the code being transformed.

In these cases, you *do* care about whether or not the standard
library is processed, so you want an easy way to say "I want to
process *all* the code, wherever it comes from". At the moment, that
easy way doesn't exist, so you either give up, or you mess about with
the hack.

PEP 511 erases that piece of incidental complexity and says, "If you
want to apply a genuinely global transformation, this is how you do
it". The fact we already have decorators and import hooks is why I
think PEP 511 can safely ignore the use cases that those handle.

However, I think it *would* make sense to make the creation of a "Code
Transformation" HOWTO guide part of the PEP - having a guide means we
can clearly present the hierarchy in terms of:

- decorators are strongly encouraged, since the maintainability harm
they can do is limited
- for import hooks, the use of custom file extensions and in-file
markers is strongly encouraged to limit unintended side effects
- global transformations are incredibly powerful, but also very hard to
do well. Transform responsibly, or future maintainers will not think
well of you :)


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rosuav at  Fri Jan 29 09:14:35 2016
From: rosuav at (Chris Angelico)
Date: Sat, 30 Jan 2016 01:14:35 +1100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 30, 2016 at 1:10 AM, Nick Coghlan <ncoghlan at> wrote:
> On 29 January 2016 at 13:30, Andrew Barnert via Python-ideas
> <python-ideas at> wrote:
>> So, again, PEP 511 isn't helping with the hard part. But, again, I think that may be fine. (Someone who knows how to use byteplay well enough to build a semantically-neutral optimizer function decorator, I'll trust him to be able to turn that into a global optimizer with one line of code. But if he wants to hook things in transparently to .pyc files, or to provide actual language extensions, or something like that, I think it's OK to make him do a bit more work before he can give it to me as production-ready code.)
> Rather than trying to categorise things as "hard" or "easy", I find it
> to be more helpful to categorise them as "inherent complexity" or
> "incidental complexity".
> With inherent complexity, you can never eliminate it, only move it
> around, and perhaps make it easier to hide from people who don't care
> about the topic (cf. the helper classes in importlib, which hide a lot
> of the inherent complexity of the import system). With incidental
> complexity though, you may be able to find ways to eliminate it
> entirely.
> For a lot of code transformations, determining a suitable scope of
> application is *inherent* complexity: you need to care about where the
> transformation is applied, as it actually matters for that particular
> use case.
> For semantically significant transforms, scope of application is
> inherent complexity, as it affects code readability, and may even be
> an error if applied inappropriately. This is why:
> - the finer-grained control offered by decorators is often preferred
> to metaclasses or import hooks
> - custom file extensions or in-file markers are typically used to opt
> in to import hook processing
> In these cases, whether or not the standard library is processed
> doesn't matter, since it will never use the relevant decorator, file
> extension or in-file marker. You also don't need to worry about subtle
> start-up bugs, since if the decorator isn't imported, or the relevant
> import hook isn't installed appropriately, then the code that depends
> on that happening simply won't run.
> This means the only code transformation cases where determining scope
> of applicability turns out to be *incidental* complexity are those
> that are intended to be semantically neutral operations. Maybe you're
> collecting statistics on opcode frequency, maybe you're actually
> applying safe optimisations, maybe you're doing something else, but
> the one thing you're promising is that if the transformation breaks
> code that works without the transformation applied, then it's a *bug
> in the transformer*, not the code being transformed.
> In these cases, you *do* care about whether or not the standard
> library is processed, so you want an easy way to say "I want to
> process *all* the code, wherever it comes from". At the moment, that
> easy way doesn't exist, so you either give up, or you mess about with
> the hack.
> PEP 511 erases that piece of incidental complexity and says, "If you
> want to apply a genuinely global transformation, this is how you do
> it". The fact we already have decorators and import hooks is why I
> think PEP 511 can safely ignore the use cases that those handle.

Thank you for the excellent explanation. Can words to this effect be
added to the PEP, please?


From ncoghlan at  Fri Jan 29 09:31:34 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 30 Jan 2016 00:31:34 +1000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On 29 January 2016 at 18:03, Steven D'Aprano <steve at> wrote:
> On Thu, Jan 28, 2016 at 09:55:42PM -0600, Michael Selik wrote:
>> One defect of a mailing list is the difficulty of viewing a weighted
>> average of opinions. The benefit is that anyone can voice an opinion.
>> This is more like the Senate than the House -- Rhode Island appears
>> (on paper) to have as much influence as California. Luckily, we have a
>> form of President. I'm guessing a House occurs in a more private mode
>> of communication?
> The Python community is not a democracy. Voting +1, -1 etc. should not
> be interpreted as *actual* votes that need to be counted and averaged, but
> as personal opinions intended to give other members of the community an
> idea of whether or not you would like to see a proposed feature.

Right, in terms of the language and standard library design, some of
the essential points to note are:

- individual core committers have the authority to make changes
(although we vary in how comfortable we are exercising that authority)
- one of the things we're responsible for is judging what topics can
be handled with just a tracker discussion, what would benefit from a
mailing list thread, and what would benefit from going through the
full PEP process (this is still an art rather than a science, which is
why it isn't documented very well)
- records the
areas we individually feel comfortable exerting authority over
- the PEP process itself is defined in
- one relatively common cause of escalation from tracker issues to
mailing list discussions is when consensus can't be reached in a
smaller forum, so perspectives are sought from a slightly wider
audience to see if that tips the balance one way or another
- when consensus still can't be reached (and nobody wants to escalate
to the full PEP process in order to request an authoritative
decision), then the status quo wins stalemates

The python-dev and python-ideas communities form a very important part
of that process, but the most valuable things folks bring are
additional perspectives (whether that's in the form of different use
cases, additional domains of expertise, knowledge of practices in
other programming language communities, etc).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From rosuav at  Fri Jan 29 09:38:34 2016
From: rosuav at (Chris Angelico)
Date: Sat, 30 Jan 2016 01:38:34 +1100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 30, 2016 at 1:31 AM, Nick Coghlan <ncoghlan at> wrote:
> The python-dev and python-ideas communities form a very important part
> of that process, but the most valuable things folks bring are
> additional perspectives (whether that's in the form of different use
> cases, additional domains of expertise, knowledge of practices in
> other programming language communities, etc)

This. As mentioned in PEP 10 [1], it's the explanations and
justifications, far more than the votes, that make the real
difference. That said, though, the votes are a great way of gauging
the support levels for a set of similar proposals (eg syntactic
options), where the proposer of the idea doesn't particularly care
which of the options is picked. It's still not in any way democratic,
as evidenced by the vote in PEP 308 [2], which had four options
clearly better than the others, but the one that's now in the language
was the last of those four in the votes.



From mojtaba.gharibi at  Fri Jan 29 10:30:54 2016
From: mojtaba.gharibi at (Mirmojtaba Gharibi)
Date: Fri, 29 Jan 2016 10:30:54 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

I support a Stack Exchange website. Quite often here a few members
overwhelm the email exchanges, and your ideas, no matter how clearly
you've explained them, get buried in your very first email, which you
have to repeat over and over. The discussion basically becomes
answering scattered criticisms from one particular member.
I mean the conversation can quickly become only marginally relevant to
the entirety of your idea. I think Stack Exchange can sort out that
chaos considerably, and if core developers aren't really looking for
consensus, that's okay; at least the conversation is sorted out. Every
new visitor has a chance of first seeing the idea proposed at the top
of the page, then the comments and answers.

On Fri, Jan 29, 2016 at 9:38 AM, Chris Angelico <rosuav at> wrote:
> On Sat, Jan 30, 2016 at 1:31 AM, Nick Coghlan <ncoghlan at> wrote:
>> The python-dev and python-ideas communities form a very important part
>> of that process, but the most valuable things folks bring are
>> additional perspectives (whether that's in the form of different use
>> cases, additional domains of expertise, knowledge of practices in
>> other programming language communities, etc)
> This. As mentioned in PEP 10 [1], it's the explanations and
> justifications, far more than the votes, that make the real
> difference. That said, though, the votes are a great way of gauging
> the support levels for a set of similar proposals (eg syntactic
> options), where the proposer of the idea doesn't particularly care
> which of the options is picked. It's still not in any way democratic,
> as evidenced by the vote in PEP 308 [2], which had four options
> clearly better than the others, but the one that's now in the language
> was the last of those four in the votes.
> ChrisA
> [1]
> [2]
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:

From ncoghlan at  Fri Jan 29 10:39:09 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 30 Jan 2016 01:39:09 +1000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On 30 January 2016 at 01:30, Mirmojtaba Gharibi
<mojtaba.gharibi at> wrote:
> I support a stack exchange website.

There are lots of things we could do to improve the communications
infrastructure the PSF provides the community, but the current
limiting factors are management capacity and (infrastructure)
contributor time, rather than ideas for potential improvement :)

There's also a vicious cycle where the limited management capacity
makes it difficult to use volunteer time effectively, which is why the
PSF is currently actively attempting to break that cycle by hiring an
Infrastructure Manager (applications already closed for that role,
though).


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From barry at  Fri Jan 29 10:48:09 2016
From: barry at (Barry Warsaw)
Date: Fri, 29 Jan 2016 10:48:09 -0500
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

On Jan 29, 2016, at 07:03 PM, Steven D'Aprano wrote:

>The Python community is not a democracy. Voting +1, -1 etc. should not 
>be interpreted as *actual* votes that need to counted and averaged, but 
>as personal opinions intended to give other members of the community an 
>idea of whether or not you would like to see a proposed feature.

I'll just mention that if folks are interested in exploring a SO-like voting
system for mailing list archives, you should get involved with the HyperKitty
project.  HK is the Django-based new archiver for Mailman 3, and the HK
subproject is led by the quite awesome Aurelien Bompard.

A feature like this is on our radar, but you know, resources.


From pavol.lisy at  Fri Jan 29 11:04:36 2016
From: pavol.lisy at (Pavol Lisy)
Date: Fri, 29 Jan 2016 17:04:36 +0100
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

I would really like

(a;b;c) in L

vs

a in L and b in L and c in L

or

all(i in L for i in (a,b,c))

because readability really matters.

But if I understand then this is not what we could get from your
proposal because a;b;c is not expression. Right?

So we have to write something like

vec=[None]*3
vec=(a;b;c) in L
all(vec) # which is now equivalent to (a in L and b in L and c in L)

> vec=[]*10
> $vec = $u + $v

First row is mistake (somebody wrote it yet) but why not simply? ->

vec = list($u +$v)

Because $u+$v is not expression. It is construct for "unpacking"
operations. It could be useful to have "operator" (calling it operator
is misleading because result is not object) to go back to python
variable. (but with which type? tuple?) Probably $(a;b) could
("return") be transformed to tuple(a,b)

a=[1,2]
print(a)
print($a)
print($$a)
----
[1,2]
1
2
(1,2)

So I could write

a in L and b in L and c in L

as

all($((a;b;c) in L))           # which is much less nice than "(a;b;c) in L"

and

all(i in L for i in (a,b,c))   # which is similarly readable and doesn't
                               # need language changes

> s=0
> s;s;s += a;b;c; * d;e;f
> which result in s being a*d+b,c*e+d*f

do you mean a*d+b*d+c*f ?


Your idea is interesting. If you like to test it then you could
improve implementation of next function and play with it (probably you
will find some caveats):

def respectively(statement):
  if statement!='a;b=b;a':
    raise SyntaxError("only supported statement is 'a;b=b;a'")
  exec('global a\nglobal b\na=b\nb=a')

a,b=1,2
respectively('a;b=b;a')
print(a,b)
2 2

Unfortunately this could work only in global context due to
limitations around 'exec' and 'locals' functions.
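
A rough sketch of the limitation described above, assuming CPython 3 (it
is not part of the original proposal): exec() cannot rebind the local
variables of the function that calls it, but it can update an explicit
dictionary passed as its namespace. The function names and the example
dictionary are made up for illustration.

```python
def try_rebind_local():
    a = 1
    # exec() only sees a copy of this function's locals, so the
    # assignment below never reaches the real local variable `a`.
    exec("a = 99")
    return a


def swap_in_namespace(ns):
    # With an explicit dict, exec() reads and writes that dict directly,
    # which works the same way in any calling context.
    exec("a, b = b, a", ns)
    return ns["a"], ns["b"]


print(try_rebind_local())
print(swap_in_namespace({"a": 1, "b": 2}))
```

Passing a dictionary keeps the experiment independent of module globals,
at the cost of the variables no longer being ordinary names in the caller.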

From guido at  Fri Jan 29 11:19:18 2016
From: guido at (Guido van Rossum)
Date: Fri, 29 Jan 2016 08:19:18 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016 at 7:39 AM, Nick Coghlan <ncoghlan at> wrote:
> On 30 January 2016 at 01:30, Mirmojtaba Gharibi
> <mojtaba.gharibi at> wrote:
>> I support a stack exchange website.
> There are lots of things we could do to improve the communications
> infrastructure the PSF provides the community, but the current
> limiting factors are management capacity and (infrastructure)
> contributor time, rather than ideas for potential improvement :)
> There's also a vicious cycle where the limited management capacity
> makes it difficult to use volunteer time effectively, which is why the
> PSF is currently actively attempting to break that cycle by hiring an
> Infrastructure Manager (applications already closed for that role,
> though).

I do have to say I find the idea of using a dedicated StackExchange
site intriguing. I have been a big fan of its cofounder Joel Spolsky
for many years. A StackExchange discussion has some advantages over a
thread in a mailing list -- it's got a clear URL that everyone can
easily find and reference (as opposed to the variety of archive sites
that are currently used), and there is a bit more structure to the
discussion (question, answers, comments). I believe there are some
good examples of other communities of experts that have really
benefited (e.g.

A downside may be that it's hard to read via an email client (although
you can set up notifications). That doesn't bother me personally (I
live in a web browser these days anyway) but I can imagine it will be
harder for some folks to participate.

I don't think it takes much effort to set up one of these -- if
someone feels particularly strong about this I encourage them to
figure out how to set up a StackExchange site to augment python-ideas.
(I think that's where we should start; leave python-dev alone.)

--Guido van Rossum

From mojtaba.gharibi at  Fri Jan 29 11:34:18 2016
From: mojtaba.gharibi at (Mirmojtaba Gharibi)
Date: Fri, 29 Jan 2016 11:34:18 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016 at 11:04 AM, Pavol Lisy <pavol.lisy at> wrote:
> I would really like
> (a;b;c) in L
> vs
> a in L and b in L and c in L
> or
> all(i in L for i in (a,b,c))
> because readability very matters.
> But if I understand then this is not what we could get from your
> proposal because a;b;c is not expression. Right?
> So we have to write something like
> vec=[None]*3
> vec=(a;b;c) in L
> all(vec) # which is now equivalent to (a in L and b in L and c in L)

That's right. Instead, we can get it this way:
a;b;c = $L
which is equivalent to

but as others suggested for this particular example, we can already
get it from unpacking syntax, i.e.
a,b,c = L

>> vec=[]*10
>> $vec = $u + $v
> First row is mistake (somebody wrote it yet) but why not simply? ->
> vec = list($u +$v)
> Because $u+$v is not expression. It is construct for "unpacking"
> operations. It could be useful to have "operator" (calling it operator
> is misleading because result is not object) to go back to python
> variable. (but with which type? tuple?) Probably $(a;b) could
> ("return") be transformed to tuple(a,b)
> a=[1,2]
> print(a)
> print($a)
> print($$a)
> ----
> [1,2]
> 1
> 2
> (1,2)
> So I could write
> a in L and b in L and c in L
> as
> all($((a;b;c) in L))           # which is much less nice as "(a;b;c) in L"
> and
> all(i in L for i in (a,b,c))   # which is similar readable and don't
> need language changes
>> s=0
>> s;s;s += a;b;c; * d;e;f
>> which result in s being a*d+b,c*e+d*f
> do you mean a*d+b*d+c*f ?

Yes, Oops, it was a typo.

> -------
> Your idea is interesting. If you like to test it then you could
> improve implementation of next function and play with it (probably you
> will find some caveats):
> def respectively(statement):
>   if statement!='a;b=b;a':
>     raise SyntaxError("only supported statement is 'a;b=b;a'")
>   exec('global a\nglobal b\na=b\nb=a')
> a,b=1,2
> respectively('a;b=b;a')
> print(a,b)
> 2 2
> Unfortunately this could work only in global context due to
> limitations around 'exec' and 'locals' functions.

Sounds good. I'd like to experiment with it actually.

From ethan at  Fri Jan 29 11:34:42 2016
From: ethan at (Ethan Furman)
Date: Fri, 29 Jan 2016 08:34:42 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/29/2016 08:19 AM, Guido van Rossum wrote:

> I do have to say I find the idea of using a dedicated StackExchange
> site intriguing. I have been a big fan of its cofounder Joel Spolsky
> for many years. A StackExchange discussion has some advantages over a
> thread in a mailing list -- it's got a clear URL that everyone can
> easily find and reference (as opposed to the variety of archive sites
> that are currently used), and there is a bit more structure to the
> discussion (question, answers, comments). I believe there are some
> good examples of other communities of experts that have really
> benefited (e.g.

I am also a big fan of StackExchange, but the StackExchange sites are 
about questions and answers, while Python-Ideas is about ideas and
discussion.

Given that extensive comments on a question or answer is discouraged, 
multiple answers trying to follow a thread of discussion would be 
confusing, and the person asking the question would be the one selecting 
the "approved" answer (which may have nothing to do with the actual 
outcome), I don't see this as being a good fit.


From geoffspear at  Fri Jan 29 11:45:15 2016
From: geoffspear at (Geoffrey Spear)
Date: Fri, 29 Jan 2016 11:45:15 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016 at 11:34 AM, Ethan Furman <ethan at> wrote:

> On 01/29/2016 08:19 AM, Guido van Rossum wrote:
>> I do have to say I find the idea of using a dedicated StackExchange
>> site intriguing. I have been a big fan of its cofounder Joel Spolsky
>> for many years. A StackExchange discussion has some advantages over a
>> thread in a mailing list -- it's got a clear URL that everyone can
>> easily find and reference (as opposed to the variety of archive sites
>> that are currently used), and there is a bit more structure to the
>> discussion (question, answers, comments). I believe there are some
>> good examples of other communities of experts that have really
>> benefited (e.g.
> I am also a big fan of StackExchange, but the StackExchange sites are
> about questions and answers, while Python-Ideas is about ideas and
> discussion.
> Given that extensive comments on a question or answer is discouraged,
> multiple answers trying to follow a thread of discussion would be
> confusing, and the person asking the question would be the one selecting
> the "approved" answer (which may have nothing to do with the actual
> outcome), I don't see this as being a good fit.
As a longtime follower of the SE site-creation process, I'd have to agree.
There's pretty much no way such a site would get past the existing
site-creation process. I suspect even a special arrangement with the Stack
Overflow upper management bypassing the regular process wouldn't happen.

In any event, a site that creates the illusion that "Create a Python 2.8!"
having a ton of upvotes means something seems like a Bad Idea.

From jsbueno at  Fri Jan 29 11:49:40 2016
From: jsbueno at (Joao S. O. Bueno)
Date: Fri, 29 Jan 2016 14:49:40 -0200
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On 29 January 2016 at 14:45, Geoffrey Spear <geoffspear at> wrote:
> On Fri, Jan 29, 2016 at 11:34 AM, Ethan Furman <ethan at> wrote:
>> On 01/29/2016 08:19 AM, Guido van Rossum wrote:
>>> I do have to say I find the idea of using a dedicated StackExchange
>>> site intriguing. I have been a big fan of its cofounder Joel Spolsky
>>> for many years. A StackExchange discussion has some advantages over a
>>> thread in a mailing list -- it's got a clear URL that everyone can
>>> easily find and reference (as opposed to the variety of archive sites
>>> that are currently used), and there is a bit more structure to the
>>> discussion (question, answers, comments). I believe there are some
>>> good examples of other communities of experts that have really
>>> benefited (e.g.
>> I am also a big fan of StackExchange, but the StackExchange sites are
>> about questions and answers, while Python-Ideas is about ideas and
>> discussion.
>> Given that extensive comments on a question or answer is discouraged,
>> multiple answers trying to follow a thread of discussion would be confusing,
>> and the person asking the question would be the one selecting the "approved"
>> answer (which may have nothing to do with the actual outcome), I don't see
>> this as being a good fit.
> As a longtime follower of the SE site-creation process, I'd have to agree.
> There's pretty much no way such a site would get past the existing
> site-creation process. I suspect even a special arrangement with the Stack
> Overflow upper management bypassing the regular process wouldn't happen.
> In any event, a site that creates the illusion that "Create a Python 2.8!"
> having a ton of upvotes means something seems like a Bad Idea.

Creating an instance of an S.O.-like site does not mean getting an official
Stack Exchange site; one can just deploy some open source product that
has the same look and feel (and responsiveness). I know that Ubuntu
people run something similar,
for example.


From random832 at  Fri Jan 29 11:55:45 2016
From: random832 at (Random832)
Date: Fri, 29 Jan 2016 11:55:45 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016, at 11:49, Joao S. O. Bueno wrote:
> Creating an instance of a S.O. like site, does not mean getting an
> official
> Stack Exchange site-  just instantiate some OpenSource product that
> have the same look and feel (and responsiveness)   I know that Ubuntu
> people run something similar,
> for example.

Ask Ubuntu is, in fact, a real Stack Exchange site (AIUI they did the
"special arrangement" thing). Stack Exchange's software is not itself
open source, though lists
some "clones" (other software packages that provide varying degrees of
the same look and feel).

From stephen at  Fri Jan 29 12:27:54 2016
From: stephen at (Stephen J. Turnbull)
Date: Sat, 30 Jan 2016 02:27:54 +0900
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum writes:

 > I don't think it takes much effort to set up one of these -- if
 > someone feels particularly strong about this I encourage them to
 > figure out how to set up a StackExchange site to augment python-ideas.
 > (I think that's where we should start; leave python-dev alone.)

I think an even better place to start would be core-mentorship.

From stephen at  Fri Jan 29 12:32:10 2016
From: stephen at (Stephen J. Turnbull)
Date: Sat, 30 Jan 2016 02:32:10 +0900
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

Pavol Lisy writes:

 > I would really like
 > (a;b;c) in L

Not well-specified (does order matter? how about repeated values? is
(a;b;c) an object? it sure looks like one, and if so, object in L
already has a meaning).  But for one obvious interpretation:

    {a, b, c} <= set(L)

and in this interpretation you should probably optimize to

    {a, b, c} <= L

by constructing L as a set in the first place.  Really this thread
probably belongs on python-list anyway.
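
As a small illustration of the two spellings discussed in this thread,
here is a sketch with made-up helper names (all_in and all_in_set are not
from any library). The set-based form assumes that every item is hashable
and that order and repetition do not matter.

```python
def all_in(items, container):
    # Works for any container supporting `in`; stops at the first miss.
    return all(item in container for item in items)


def all_in_set(items, container):
    # Subset test; builds sets first, so items must be hashable.
    return set(items) <= set(container)


L = [3, 1, 4, 1, 5]
print(all_in((1, 3, 5), L), all_in_set((1, 3, 5), L))
print(all_in((1, 2), L), all_in_set((1, 2), L))
```

If L is built as a set in the first place, the conversion inside
all_in_set becomes cheap and the membership tests in all_in become
constant time on average.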

From stephen at  Fri Jan 29 12:34:25 2016
From: stephen at (Stephen J. Turnbull)
Date: Sat, 30 Jan 2016 02:34:25 +0900
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Petr Viktorin writes:
 > On 01/29/2016 04:55 AM, Michael Selik wrote:

[a lot of en_US.legalese]

 > Would it be possible to make the argument clearer to people who need to
 > look these things up to understand it?

I would just skip to the chase[1]:

 > > Perhaps as the community gets larger, a system like StackOverflow
 > > might be a better tool for handling things like Python-Ideas.

I'm not sure what else is in s.o that he thinks would be helpful, the
references to the American political system weren't very specific.
Obviously a thumbs-up glyph for every post would make it simpler to
say "+1", though.

[1]  Another American idiom.

From brett at  Fri Jan 29 12:56:57 2016
From: brett at (Brett Cannon)
Date: Fri, 29 Jan 2016 17:56:57 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, 29 Jan 2016 at 08:35 Ethan Furman <ethan at> wrote:

> On 01/29/2016 08:19 AM, Guido van Rossum wrote:
> > I do have to say I find the idea of using a dedicated StackExchange
> > site intriguing. I have been a big fan of its cofounder Joel Spolsky
> > for many years. A StackExchange discussion has some advantages over a
> > thread in a mailing list -- it's got a clear URL that everyone can
> > easily find and reference (as opposed to the variety of archive sites
> > that are currently used), and there is a bit more structure to the
> > discussion (question, answers, comments). I believe there are some
> > good examples of other communities of experts that have really
> > benefited (e.g.
> I am also a big fan of StackExchange, but the StackExchange sites are
> about questions and answers, while Python-Ideas is about ideas and
> discussion.
> Given that extensive comments on a question or answer is discouraged,
> multiple answers trying to follow a thread of discussion would be
> confusing, and the person asking the question would be the one selecting
> the "approved" answer (which may have nothing to do with the actual
> outcome), I don't see this as being a good fit.

A better fit would be something like if people
wanted a focused "vote on ideas" solution, or something like for a more modern forum platform that has the
concept of likes for a thread. And then there's as Barry suggested to add the
equivalent of what Discourse has to Mailman 3.

From srkunze at  Fri Jan 29 13:09:47 2016
From: srkunze at (Sven R. Kunze)
Date: Fri, 29 Jan 2016 19:09:47 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

On 29.01.2016 01:27, Greg Ewing wrote:
> Sven R. Kunze wrote:
>> Some people proposed a "from __extensions__ import my_extension"; 
>> inspired by __future__ imports, i.e. it is forced to be at the top. 
>> Why? Because it somehow makes sense to perform all transformations 
>> the first time a file is loaded.
> It occurs to me that a magic import for applying local
> transformations could itself be implemented using a
> global transformer.

That is certainly true. :)

From srkunze at  Fri Jan 29 13:18:15 2016
From: srkunze at (Sven R. Kunze)
Date: Fri, 29 Jan 2016 19:18:15 +0100
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

On 29.01.2016 01:57, Victor Stinner wrote:
> A local transformation requires to register a global code transformer,
> but it doesn't mean that all files will be modified.

I think you should differentiate between "register" and "use".

"register" basically means "provide but don't use".
"use" basically means "apply the transformation".

(Same is already true for codecs.)
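
To illustrate the codecs comparison, here is a minimal sketch in which
registering a codec only makes it available, and nothing is affected
until some code asks for it by name. The codec name "demo_upper" and the
helper _search are invented for this example.

```python
import codecs


def _search(name):
    # Called by the codec machinery with the requested encoding name.
    if name != "demo_upper":
        return None

    def encode(text, errors="strict"):
        return text.upper().encode("ascii", errors), len(text)

    def decode(data, errors="strict"):
        return bytes(data).decode("ascii", errors), len(data)

    return codecs.CodecInfo(encode, decode, name="demo_upper")


# "register": the codec is now provided, but nothing uses it yet.
codecs.register(_search)

# "use": only explicit requests apply the transformation.
print("abc".encode("utf-8"))
print("abc".encode("demo_upper"))
```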

The PEP's "set_code_transformers" seems not to make that distinction.

> The code transformer can use various kinds of checks to decide if a file must
> be transformed and then which parts of the code should be transformed.
> Decorators was suggested as a good granularity.

As others pointed out, implicit transformations are not desirable. So, 
why would a transformer need to check if a file must be transformed? 
Either the author of a file explicitly wants the transformer or not. 
Same goes for the global option. Either it is there or it isn't.

Btw. I would really appreciate a reply to my prior post. ;)


From abarnert at  Fri Jan 29 14:57:11 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 29 Jan 2016 11:57:11 -0800
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 29, 2016, at 06:10, Nick Coghlan <ncoghlan at> wrote:
> PEP 511 erases that piece of incidental complexity and say, "If you
> want to apply a genuinely global transformation, this is how you do
> it". The fact we already have decorators and import hooks is why I
> think PEP 511 can safely ignore the use cases that those handle.

I think this is the conclusion I was hoping to reach, but wasn't sure how to get there. I'm happy with PEP 511 not trying to serve cases like MacroPy and Hy and the example from the byteplay docs, especially so if ignoring them makes PEP 511 simpler, as long as it can explain why it's ignoring them. And a shorter version of your argument should serve as such an explanation.

But the other half of my point was that too many people (even very experienced developers like most of the people on this list) think there's more incidental complexity than there is, and that's also a problem. For example, "I want to write a global processor for local experimentation purposes so I can play with my idea before posting it to Python-ideas" is not a bad desire. And, if people think it's way too hard to do with a quick&dirty import hook, they're naturally going to ask why PEP 511 doesn't help them out by adding a bunch of options to install/run the processors conditionally, handle files, skip the stdlib, etc. And I think the PEP is better without those options.

> However, I think it *would* make sense to make the creation of a "Code
> Transformation" HOWTO guide part of the PEP - having a guide means we
> can clearly present the hierarchy in terms of:

I like this idea.

Earlier I suggested that the import system documentation should have some simple examples of how to actually use the import system to write transforming hooks. Someone (Brett?) pointed out that it's a dangerous technique, and making it too easy for people to play with it without understanding it may be a bad idea. And they're probably right.

A HOWTO is a bit more "out-of-the-way" than library or reference docs--and, more importantly, it also has room to explain when you shouldn't do this or that, and why.

I'm not sure it has to be part of the PEP, but I can see the connection. The PEP helps by separating out the most important safe case (semantically-neutral, reflected in .pyc, globally consistent, etc.), but it also makes the question "how do I do something similar to PEP 511 transformers except ___" more likely to come up in the first place, making the HOWTO more important.

From donald at  Fri Jan 29 16:27:55 2016
From: donald at (Donald Stufft)
Date: Fri, 29 Jan 2016 16:27:55 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

> On Jan 29, 2016, at 12:56 PM, Brett Cannon <brett at> wrote:
> On Fri, 29 Jan 2016 at 08:35 Ethan Furman <ethan at> wrote:
> On 01/29/2016 08:19 AM, Guido van Rossum wrote:
> > I do have to say I find the idea of using a dedicated StackExchange
> > site intriguing. I have been a big fan of its cofounder Joel Spolsky
> > for many years. A StackExchange discussion has some advantages over a
> > thread in a mailing list -- it's got a clear URL that everyone can
> > easily find and reference (as opposed to the variety of archive sites
> > that are currently used), and there is a bit more structure to the
> > discussion (question, answers, comments). I believe there are some
> > good examples of other communities of experts that have really
> > benefited (e.g. <>).
> I am also a big fan of StackExchange, but the StackExchange sites are
> about questions and answers, while Python-Ideas is about ideas and
> discussion.
> Given that extensive comments on a question or answer is discouraged,
> multiple answers trying to follow a thread of discussion would be
> confusing, and the person asking the question would be the one selecting
> the "approved" answer (which may have nothing to do with the actual
> outcome), I don't see this as being a good fit.
> A better fit would be something like <> if people wanted a focused "vote on ideas" solution, or something like <> for a more modern forum platform that has the concept of likes for a thread. And then there's <> as Barry suggested to add the equivalent of what Discourse has to Mailman 3.
I've been thinking about trying to set up a discourse instance for the packaging stuff, for whatever it's worth.

Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


From greg.ewing at  Fri Jan 29 16:41:39 2016
From: greg.ewing at (Greg Ewing)
Date: Sat, 30 Jan 2016 10:41:39 +1300
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Petr Viktorin wrote:
> Trying to hold all these details in my head while thinking how they
> relate to mailing list discussions leaves me quite confused.

I think Guido is more like the king of England was in
the old days. His word is law, but if he pisses off
his subjects too much, he risks either losing his head
or being forced to sign a Magna Carta.


From ethan at  Fri Jan 29 16:48:59 2016
From: ethan at (Ethan Furman)
Date: Fri, 29 Jan 2016 13:48:59 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/29/2016 01:27 PM, Donald Stufft wrote:

 > I've been thinking about trying to set up a discourse instance for
 > the packaging stuff, for whatever it's worth.

Great!  That will be invaluable for evaluation if nothing else.  ;)


From greg.ewing at  Fri Jan 29 17:02:17 2016
From: greg.ewing at (Greg Ewing)
Date: Sat, 30 Jan 2016 11:02:17 +1300
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
 <> <>
 <> <>
Message-ID: <>

Sven R. Kunze wrote:
> On 29.01.2016 01:27, Greg Ewing wrote:
>> It occurs to me that a magic import for applying local
>> transformations could itself be implemented using a
>> global transformer.

To elaborate on that a bit, something like an __extensions__
magic import could first be prototyped as a global
transformer. If the idea caught on, that transformer
could be made "official", meaning it was incorporated
into the stdlib and applied by default.
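
A very rough prototype of that idea, using only the standard ast module
and not the API proposed in PEP 511: one "global" pass looks for
"from __extensions__ import X" at module level and applies the local
transformation registered under that name. The registry EXTENSIONS, the
toy transformer DoubleIntegers, and the helpers apply_extensions and run
are all hypothetical names chosen for this sketch.

```python
import ast


class DoubleIntegers(ast.NodeTransformer):
    """Toy local transformation: doubles every integer literal."""

    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(value=node.value * 2), node)
        return node


# Hypothetical registry mapping extension names to transformer classes.
EXTENSIONS = {"double_integers": DoubleIntegers}


def apply_extensions(tree):
    """Global pass: apply only the extensions a module explicitly imports."""
    requested, kept = [], []
    for stmt in tree.body:
        if isinstance(stmt, ast.ImportFrom) and stmt.module == "__extensions__":
            requested.extend(alias.name for alias in stmt.names)
        else:
            kept.append(stmt)
    tree.body = kept  # the magic import itself is removed
    for name in requested:
        tree = EXTENSIONS[name]().visit(tree)  # KeyError for unknown names
    return ast.fix_missing_locations(tree)


def run(source):
    tree = apply_extensions(ast.parse(source))
    namespace = {}
    exec(compile(tree, "<demo>", "exec"), namespace)
    return namespace


print(run("from __extensions__ import double_integers\nx = 21")["x"])
print(run("x = 21")["x"])
```

Modules that do not contain the magic import pass through unchanged,
which is what keeps the transformation local even though the pass itself
runs over every module.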


From ned at  Fri Jan 29 17:42:18 2016
From: ned at (Ned Batchelder)
Date: Fri, 29 Jan 2016 17:42:18 -0500
Subject: [Python-ideas] Prevent importing yourself?
Message-ID: <>


A common question we get in the #python IRC channel is, "I tried 
importing a module, but I get an AttributeError trying to use the things 
it said it provided."  Turns out the beginner named their own file the 
same as the module they were trying to use.

That is, they want to try (for example) the "azure" package.  So they 
make a file called, and start with "import azure". The import 
succeeds, but it has none of the contents the documentation claims, 
because they have imported themselves.  It's baffling, because they have 
used the exact statements shown in the examples, but it doesn't work.

Could we make this a more obvious failure?  Is there ever a valid reason 
for a file to import itself?  Is this situation detectable in the import 


From rymg19 at  Fri Jan 29 17:57:20 2016
From: rymg19 at (Ryan Gonzalez)
Date: Fri, 29 Jan 2016 16:57:20 -0600
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On January 29, 2016 4:42:18 PM CST, Ned Batchelder <ned at> wrote:
>A common question we get in the #python IRC channel is, "I tried 
>importing a module, but I get an AttributeError trying to use the
>things it said it provided."  Turns out the beginner named their own file the
>same as the module they were trying to use.
>That is, they want to try (for example) the "azure" package.  So they
>make a file called, and start with "import azure". The import
>succeeds, but it has none of the contents the documentation claims,
>because they have imported themselves.  It's baffling, because they
>have used the exact statements shown in the examples, but it doesn't work.
>Could we make this a more obvious failure?  Is there ever a valid
>reason for a file to import itself?  Is this situation detectable in the
>import machinery?

Haha, +1. This bit me a good 50 times when I started learning Python.


Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

From greg.ewing at  Fri Jan 29 18:09:30 2016
From: greg.ewing at (Greg Ewing)
Date: Sat, 30 Jan 2016 12:09:30 +1300
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

Ned Batchelder wrote:
> Could we make this a more obvious failure?  Is there ever a valid reason 
> for a file to import itself?

I've done it occasionally, but only when doing something
very unusual, and I probably wouldn't mind having to
pull it out of sys.modules in cases like that.
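
To make the question of detectability a bit more concrete, here is a
sketch of one possible check, assuming the script is run directly so that
it is registered as __main__. The helper name is_self_import and its
main_file parameter are made up for this illustration; the real import
machinery does nothing like this today.

```python
import importlib.util
import json
import os
import sys


def is_self_import(module_name, main_file=None):
    """Return True if importing `module_name` would load the running script."""
    if main_file is None:
        main_file = getattr(sys.modules.get("__main__"), "__file__", None)
    spec = importlib.util.find_spec(module_name)
    origin = getattr(spec, "origin", None)
    if not main_file or not origin:
        return False
    if not (os.path.exists(main_file) and os.path.exists(origin)):
        # Built-in and frozen modules have no real file on disk.
        return False
    return os.path.samefile(main_file, origin)


# Normal case: the running script is not the json module.
print(is_self_import("json"))
# Simulated case: pretend the running script is the file json resolves to.
print(is_self_import("json", main_file=json.__file__))
```

A file that genuinely needs a reference to itself can still use
sys.modules[__name__], as suggested above, so a check like this would not
remove that option.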


From sjoerdjob at  Fri Jan 29 18:42:26 2016
From: sjoerdjob at (Sjoerd Job Postmus)
Date: Sat, 30 Jan 2016 00:42:26 +0100
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016 at 05:42:18PM -0500, Ned Batchelder wrote:
> Hi,
> A common question we get in the #python IRC channel is, "I tried
> importing a module, but I get an AttributeError trying to use the
> things it said it provided."  Turns out the beginner named their own
> file the same as the module they were trying to use.
> That is, they want to try (for example) the "azure" package.  So
> they make a file called, and start with "import azure". The
> import succeeds, but it has none of the contents the documentation
> claims, because they have imported themselves.  It's baffling,
> because they have used the exact statements shown in the examples,
> but it doesn't work.
> Could we make this a more obvious failure?  Is there ever a valid
> reason for a file to import itself?  Is this situation detectable in
> the import machinery?
> --Ned.

I feel this is only a partial fix. I've been bitten by something like
this, but not precisely like this. The difference in how I experienced
this makes it enough for me to say: I don't think your suggestion is
that useful.

What I experienced was having collisions on the python-path, and modules
from my codebase colliding with libraries in the stdlib (or outside it).
For example, a library might import one of its dependencies which
coincidentally had the same name as one of the libraries I have.

Maybe a suggestion would be to add the path of the module to the error
message.

Currently the message is

    sjoerdjob$ cat
    import json
    TEST = '{"foo": "bar"}'

sjoerdjob$ python3.5 
Traceback (most recent call last):
  File "", line 1, in <module>
    import json
  File "/Users/sjoerdjob/Development/spikes/importself/", line 5, in <module>
AttributeError: module 'json' has no attribute 'loads'

But maybe the error could/should be

AttributeError: module 'json' (imported from /Users/sjoerdjob/Development/spikes/importself/ has no attribute 'loads'.

As another corner case, consider the following:
    JSON_DATA = '{"foo": "bar"}'
    import json
    def parse(blob):
        return json.loads(blob)
    from json import JSON_DATA
    from mod_a import parse

(Now, consider that instead of 'json' we chose a less common module
name.) You still get the error `module 'json' has no attribute
'loads'`. In this case, I think it's more helpful to know the filename
of the 'json' module. For me that'd sooner be a trigger to ask "What's
going on?", because I haven't been bitten by the 'import self' issue as
often as by 'name collision in dependency tree'.

(Of course, another option would be to look for other modules of the
same name when you get an attribute-error on a module to aid debugging,
but I think that's too heavy-weight.)
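
For illustration only, the "look for other modules of the same name"
idea can be sketched in a few lines of Python. This is not a proposal
for the import machinery: find_all_candidates is a made-up helper name,
and it only asks the path-based finder (importlib.machinery.PathFinder),
so built-in and frozen modules are not reported.

```python
import sys
from importlib.machinery import PathFinder


def find_all_candidates(name, search_path=None):
    """Return the file of every top-level module called `name` that the
    path-based finder can see, in the order the entries are searched."""
    origins = []
    for entry in (sys.path if search_path is None else search_path):
        # Ask the finder about one path entry at a time, so that later
        # entries are not hidden by an earlier match.
        spec = PathFinder.find_spec(name, [entry])
        if spec is not None and spec.origin is not None:
            if spec.origin not in origins:
                origins.append(spec.origin)
    return origins
```

If find_all_candidates("json") returns more than one entry, the first
one is the file that "import json" would pick up, and the rest are the
modules it shadows.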

Kind regards,
Sjoerd Job

From gokoproject at  Fri Jan 29 19:05:32 2016
From: gokoproject at (John Wong)
Date: Fri, 29 Jan 2016 19:05:32 -0500
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016 at 6:42 PM, Sjoerd Job Postmus <sjoerdjob at>
wrote:

> I feel this is only a partial fix. I've been bitten by something like
> this, but not precisely like this. The difference in how I experienced
> this makes it enough for me to say: I don't think your suggestion is
> that useful.
Yes, your example is actually more likely to happen, and it has happened
to me many times. One reason is that some of the stdlib module names are
rather common. Once I defined my own and then another time I had a which collided with the requests library.

I think the right solution is to assume every import error needs some
guidance, some hints. Don't just target a specific problem.

Ned is probably familiar with this: in the case of Ansible, if Ansible
cannot resolve and locate the role you specify in the playbook, it will
complain and give this error message:

ERROR: cannot find role in /current/path/roles/some-role or
/current/path/some-role or /etc/ansible/roles/some-role

So the import error should be more or less like this

AttributeError: module 'json' has no attribute 'loads'

Possible root causes:
   * json is not found in the current PYTHONPATH. Python tried
/current/python/site-packages/json, /current/python/site-packages/
For the full list of PYTHONPATH, please refer to THIS DOC ON PYTHON.ORG.
   * your current module has the same name as the module you intend to
import.

You can even simplify this to say possible cause please go to this doc on and we can go verbose there.


From oscar.j.benjamin at  Fri Jan 29 19:13:36 2016
From: oscar.j.benjamin at (Oscar Benjamin)
Date: Sat, 30 Jan 2016 00:13:36 +0000
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On 29 January 2016 at 23:42, Sjoerd Job Postmus <sjoerdjob at> wrote:
> On Fri, Jan 29, 2016 at 05:42:18PM -0500, Ned Batchelder wrote:
>> Hi,
>> A common question we get in the #python IRC channel is, "I tried
>> importing a module, but I get an AttributeError trying to use the
>> things it said it provided."  Turns out the beginner named their own
>> file the same as the module they were trying to use.
>> Could we make this a more obvious failure?  Is there ever a valid
>> reason for a file to import itself?  Is this situation detectable in
>> the import machinery?
> I feel this is only a partial fix. I've been bitten by something like
> this, but not precisely like this. The difference in how I experienced
> this makes it enough for me to say: I don't think your suggestion is
> that useful.
> What I experienced was having collisions on the python-path, and modules
> from my codebase colliding with libraries in the stdlib (or outside it).
> For example, a library might import one of its dependencies which
> coincidentally had the same name as one of the libraries I have.

Another way that the error can arrive is if your script has the same
name as an installed module that is indirectly imported. I commonly
see my students choose the name "" for a script which can
lead to this problem:

$ echo 'import urllib2' >
$ python
Traceback (most recent call last):
  File "", line 1, in <module>
    import urllib2
  File "/usr/lib/python2.7/", line 94, in <module>
    import httplib
  File "/usr/lib/python2.7/", line 80, in <module>
    import mimetools
  File "/usr/lib/python2.7/", line 6, in <module>
    import tempfile
  File "/usr/lib/python2.7/", line 35, in <module>
    from random import Random as _Random
ImportError: cannot import name Random

To fully avoid this error you need to know every possible top-level
module/package name and not use any of them as the name of your
script. This would be avoided if module namespaces had some nesting
e.g. 'import stdlib.random' but apparently flat is better than
nested.

> Maybe a suggestion would be to add the path of the module to the error
> message?
> Currently the message is
> AttributeError: module 'json' has no attribute 'loads'
> But maybe the error could/should be
> AttributeError: module 'json' (imported from /Users/sjoerdjob/Development/spikes/importself/ has no attribute 'loads'.

I think that would be an improvement. It would still be a problem for
absolute beginners but at least the error message gives enough
information to spot the problem.

In general though I think it's unfortunate that it's possible to be
able to override installed or even stdlib modules just by having a .py
file with the same name in the same directory as the running script. I
had a discussion with a student earlier today about why '.' is not
usually on PATH for precisely this reason: you basically never want
the ls (or whatever) command to run a program in the current
directory.


From abarnert at  Fri Jan 29 19:16:46 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 29 Jan 2016 16:16:46 -0800
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 29, 2016, at 15:42, Sjoerd Job Postmus <sjoerdjob at> wrote:
> What I experienced was having collisions on the python-path, and modules
> from my codebase colliding with libraries in the stdlib (or outside it).
> For example, a library might import one of its dependencies which
> coincidentally had the same name as one of the libraries I have.

Yes. The version of this I've seen most from novices is that they write a program named "" that imports and uses requests, which tries to use the stdlib module json, which gives them an AttributeError on json.loads.

(One of my favorite questions on StackOverflow came from a really smart novice who'd written a program called "", and he got an error about time.time on one machine, but not another. He figured out that obviously, requests wants him to define his own time function, which he was able to do by using the stuff in datetime. And he figured out the probable difference between the two machines--the working one had an older version of requests. He just wanted to know why requests didn't document this new requirement that they'd added. :))

> Maybe a suggestion would be to add the path of the module to the error
> message?

That would probably help, but think about what it entails:

Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc.

To make matters worse, AttributeError objects don't even carry the name of the object being attributed, so even if you wanted to make tracebacks do some magic if isinstance(obj, types.ModuleType), there's no way to do it.

So, that means you'd have to make ModuleType.__getattr__ do the special error message formatting. 
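
A rough Python-level picture of what that would mean, as a sketch only:
PathReportingModule is a hypothetical subclass invented for this
example (real modules are plain ModuleType instances and would need the
change in C), and the wording of the message is just the one suggested
earlier in the thread.

```python
import types


class PathReportingModule(types.ModuleType):
    """Module type whose missing-attribute errors mention __file__."""

    def __getattr__(self, name):
        # Only reached after the normal attribute lookup has failed.
        mod_name = self.__dict__.get("__name__", "?")
        mod_file = self.__dict__.get("__file__")
        if isinstance(mod_file, str):
            msg = "module %r (loaded from %s) has no attribute %r" % (
                mod_name, mod_file, name)
        else:
            # No usable __file__ (e.g. built-in modules): plain message.
            msg = "module %r has no attribute %r" % (mod_name, name)
        raise AttributeError(msg)
```

Because __getattr__ is only consulted on failure, successful attribute
access is unaffected; the extra work happens on the error path.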

> (Of course, another option would be to look for other modules of the
> same name when you get an attribute-error on a module to aid debugging,
> but I think that's too heavy-weight.)

If that could be done only when the exception escapes to top level and dumps a traceback, that might be reasonable. And it would _definitely_ be helpful. But I don't think it's possible without major changes.
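
To make the "only at top level" idea concrete, here is a hedged sketch
using sys.excepthook. It assumes an interpreter whose AttributeError
instances expose the object through an "obj" attribute; that was not
the case when this was written, so the code falls back to doing nothing
when the attribute is missing. module_hint and hinting_excepthook are
names made up for this example.

```python
import sys
import types


def module_hint(exc):
    """Return an extra hint line for AttributeErrors raised on modules,
    or None when the exception gives us nothing to work with."""
    obj = getattr(exc, "obj", None)  # assumption: only set on newer Pythons
    if isinstance(exc, AttributeError) and isinstance(obj, types.ModuleType):
        origin = getattr(obj, "__file__", None) or "<no __file__>"
        return "hint: module %r was loaded from %s" % (obj.__name__, origin)
    return None


def hinting_excepthook(exc_type, exc, tb):
    # Print the normal traceback first, then the optional hint.
    sys.__excepthook__(exc_type, exc, tb)
    hint = module_hint(exc)
    if hint:
        print(hint, file=sys.stderr)
```

It would be installed with "sys.excepthook = hinting_excepthook", so
the extra lookup costs nothing unless an exception actually escapes to
the top level.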

From oscar.j.benjamin at  Fri Jan 29 19:29:48 2016
From: oscar.j.benjamin at (Oscar Benjamin)
Date: Sat, 30 Jan 2016 00:29:48 +0000
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On 30 January 2016 at 00:16, Andrew Barnert via Python-ideas
<python-ideas at> wrote:
>> Maybe a suggestion would be to add the path of the module to the error
>> message?
> That would probably help, but think about what it entails:
> Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc.

Oh yeah, good point. Somehow I read the AttributeError as an ImportError e.g.

$ python
Traceback (most recent call last):
  File "", line 1, in <module>
    import urllib2
  File "/usr/lib/python2.7/", line 94, in <module>
    import httplib
  File "/usr/lib/python2.7/", line 80, in <module>
    import mimetools
  File "/usr/lib/python2.7/", line 6, in <module>
    import tempfile
  File "/usr/lib/python2.7/", line 35, in <module>
    from random import Random as _Random
ImportError: cannot import name Random

That error message could be changed to something like

ImportError: cannot import name Random from module 'random'

Attribute errors would be more problematic.
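
As a sketch of why the ImportError case is the easier one: ImportError
instances have "name" and "path" attributes that the import system can
fill in, so a richer message can be built from the exception alone.
describe_import_error is a hypothetical helper for illustration, and
both attributes may be None depending on where the error was raised.

```python
def describe_import_error(exc):
    """Build a one-line description of an ImportError that includes the
    module name and file, when the exception carries them."""
    name = getattr(exc, "name", None)
    path = getattr(exc, "path", None)
    details = []
    if name:
        details.append("module=%s" % name)
    if path:
        details.append("file=%s" % path)
    if details:
        return "%s [%s]" % (exc, ", ".join(details))
    return str(exc)
```

Wrapping a failing "from random import Random" in try/except and
printing describe_import_error(e) would then show which file was
actually imported, if the interpreter recorded it.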


From rosuav at  Fri Jan 29 22:11:13 2016
From: rosuav at (Chris Angelico)
Date: Sat, 30 Jan 2016 14:11:13 +1100
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 30, 2016 at 11:13 AM, Oscar Benjamin
<oscar.j.benjamin at> wrote:
> In general though I think it's unfortunate that it's possible to be
> able to override installed or even stdlib modules just by having a .py
> file with the same name in the same directory as the running script. I
> had a discussion with a student earlier today about why '.' is not
> usually on PATH for precisely this reason: you basically never want
> the ls (or whatever) command to run a program in the current
> directory.

One solution would be to always work in a package. As of Python 3,
implicit relative imports don't happen, so you should be safe. Maybe
there could be a flag like -m that means "run as if current directory
is a module"? You can change to a parent directory and run "python3 -m
dir.file" to run dir/; if "python3 -r file" could run
from the current directory (and assume the presence of an empty if one isn't found), that would prevent all accidental
imports - if you want to grab a file from right next to you, that's
"from . import otherfile", which makes perfect sense.

It'd be 100% backward compatible, as the new behaviour would take
effect only if the option is explicitly given. Doable?
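
For anyone who wants to see the difference between the two ways of
running a file, here is a small self-contained experiment (a sketch,
not part of the proposal). It builds a throwaway package whose
colorsys.py deliberately shadows the stdlib module of that name; the
"pkg" layout, the SHADOW marker and the run_both_ways name are made up
for the demonstration.

```python
import os
import subprocess
import sys
import tempfile


def run_both_ways():
    """Run pkg/main.py as a plain script and as `-m pkg.main`, and
    report whether each run imported the shadowing colorsys.py."""
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "pkg")
        os.mkdir(pkg)
        files = {
            "__init__.py": "",
            "colorsys.py": "SHADOW = True\n",  # same name as a stdlib module
            "main.py": "import colorsys\n"
                       "print(hasattr(colorsys, 'SHADOW'))\n",
        }
        for filename, text in files.items():
            with open(os.path.join(pkg, filename), "w") as f:
                f.write(text)
        env = dict(os.environ)
        env.pop("PYTHONSAFEPATH", None)  # keep the default sys.path[0]
        as_script = subprocess.run(
            [sys.executable, os.path.join(pkg, "main.py")],
            cwd=root, env=env, capture_output=True, text=True)
        as_module = subprocess.run(
            [sys.executable, "-m", "pkg.main"],
            cwd=root, env=env, capture_output=True, text=True)
        return as_script.stdout.strip(), as_module.stdout.strip()
```

Run as a script, the directory containing main.py goes first on
sys.path, so the local colorsys.py wins. Run with -m from the parent
directory, the sibling file is only reachable as pkg.colorsys, and the
plain "import colorsys" gets the stdlib module.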


From abarnert at  Fri Jan 29 22:23:01 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 29 Jan 2016 19:23:01 -0800
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 29, 2016, at 19:11, Chris Angelico <rosuav at> wrote:
> On Sat, Jan 30, 2016 at 11:13 AM, Oscar Benjamin
> <oscar.j.benjamin at> wrote:
>> In general though I think it's unfortunate that it's possible to be
>> able to override installed or even stdlib modules just by having a .py
>> file with the same name in the same directory as the running script. I
>> had a discussion with a student earlier today about why '.' is not
>> usually on PATH for precisely this reason: you basically never want
>> the ls (or whatever) command to run a program in the current
>> directory.
> One solution would be to always work in a package. As of Python 3,
> implicit relative imports don't happen, so you should be safe. Maybe
> there could be a flag like -m that means "run as if current directory
> is a module"? You can change to a parent directory and run "python3 -m
> dir.file" to run dir/; if "python3 -r file" could run
> from the current directory (and assume the presence of an empty
> if one isn't found), that would prevent all accidental
> imports - if you want to grab a file from right next to you, that's
> "from . import otherfile", which makes perfect sense.
> It'd be 100% backward compatible, as the new behaviour would take
> effect only if the option is explicitly given. Doable?

I like it.

The only problem is that people on platforms where you can add the -r on the shebang line will start doing that, and then their scripts won't be portable...

> ChrisA

From steve at  Fri Jan 29 23:10:02 2016
From: steve at (Steven D'Aprano)
Date: Sat, 30 Jan 2016 15:10:02 +1100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016 at 05:56:57PM +0000, Brett Cannon wrote:

> A better fit would be something like if people
> wanted a focused "vote on ideas" solution, 

I don't think treating language design as a participatory democracy 
would be a good idea, even if it were practical. (How could you get all 
Python users to vote? Do casual users who only use Python occasionally 
get fractional votes?) If it were, Python would probably look and behave 
a lot more like PHP.

And even representative democracy has practical problems. (Who speaks 
for the users of numpy? Sys admins? Teachers?)

I'm 100% in favour of community participation and would like to 
encourage people to participate and be heard, but I don't think we 
should have any illusions about the fundamentally non-democratic nature 
of language design. Nor do I think that's necessarily a bad thing. Not 
everything needs to be decided by voting.

I think it is far more honest to admit that language design is always 
going to be an authoritarian process where a small elite, possibly even 
a single person, decides what makes it into the language and what 
doesn't, than to try to claim democratic legitimacy via voting that 
cannot possibly be representative.

> or something like
> for a more modern forum platform that has the
> concept of likes for a thread.

Ah, "like" buttons. The way to feel good about yourself for 
participating without actually participating :-)

Well, I suppose it's a bit less disruptive than having hordes of 
"Me too!!!1!" posts.


From guido at  Fri Jan 29 23:35:30 2016
From: guido at (Guido van Rossum)
Date: Fri, 29 Jan 2016 20:35:30 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016 at 8:10 PM, Steven D'Aprano <steve at> wrote:
> On Fri, Jan 29, 2016 at 05:56:57PM +0000, Brett Cannon wrote:
>> A better fit would be something like if people
>> wanted a focused "vote on ideas" solution,
> I don't think treating language design as a participatory democracy
> would be a good idea, even if it were practical. (How could you get all
> Python users to vote? Do casual users who only use Python occasionally
> get fractional votes?) If it were, Python would probably look and behave
> a lot more like PHP.
> And even representative democracy has practical problems. (Who speaks
> for the users of numpy? Sys admins? Teachers?)
> I'm 100% in favour of community participation and would like to
> encourage people to participate and be heard, but I don't think we
> should have any illusions about the fundamentally non-democratic nature
> of language design. Nor do I think that's necessarily a bad thing. Not
> everything needs to be decided by voting.
> I think it is far more honest to admit that language design is always
> going to be an authoritarian process where a small elite, possibly even
> a single person, decides what makes it into the language and what
> doesn't, than to try to claim democratic legitimacy via voting that
> cannot possibly be representative.
>> or something like
>> for a more modern forum platform that has the
>> concept of likes for a thread.
> Ah, "like" buttons. The way to feel good about yourself for
> participating without actually participating :-)
> Well, I suppose it's a bit less disruptive than having hordes of
> "Me too!!!1!" posts.

Let me clarify why I like StackExchange. I don't care about the voting
for/against answers or even about the selection of the "best" answer
by the OP. I do like that the reputation system of the site
automatically recognizes users who should be given more
responsibilities (up to and including deleting inappropriate posts --
rarely). What I like most is that the site encourages the creation of
artifacts that are useful to reference later, e.g. when a related
issue comes up again later. And I think it will be easier for new
folks to participate than the current mailing list (where if you don't
sign up for it you're likely to miss most replies, while if you do
sign up, you'll be inundated with traffic -- not everybody is a wizard
at managing high volume mailing list traffic).

I don't understand the issues brought up about the SE site creation
process. 22 years ago we managed to create a Usenet newsgroup,
comp.lang.python. Surely today we can figure out how to create a SE
site?

--Guido van Rossum (

From ben+python at  Sat Jan 30 00:16:27 2016
From: ben+python at (Ben Finney)
Date: Sat, 30 Jan 2016 16:16:27 +1100
Subject: [Python-ideas] A collaborative Q&A site for Python (was: A bit meta)
References: <>
Message-ID: <>

Guido van Rossum <guido at> writes:

> I don't understand the issues brought up about the SE site creation
> process. 22 years ago we managed to create a Usenet newsgroup,
> comp.lang.python. Surely today we can figure out how to create a SE
> site?

We have done, several times. One popular option is Askbot
<URL:>. I'd be happy to see a
PSF-blessed instance of Askbot running at a ? domain.

That said, it would be wise to reflect that creating the software is not
the hard part; continually responding to community needs, and managing
the system so desirable behaviours are encouraged, is the hard part

 \         "If nature has made any one thing less susceptible than all |
  `\    others of exclusive property, it is the action of the thinking |
_o__)             power called an idea" --Thomas Jefferson, 1813-08-13 |
Ben Finney

From guido at  Sat Jan 30 00:24:18 2016
From: guido at (Guido van Rossum)
Date: Fri, 29 Jan 2016 21:24:18 -0800
Subject: [Python-ideas] A collaborative Q&A site for Python (was: A bit
 meta)
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016 at 9:16 PM, Ben Finney <ben+python at> wrote:
> Guido van Rossum <guido at> writes:
>> I don't understand the issues brought up about the SE site creation
>> process. 22 years ago we managed to create a Usenet newsgroup,
>> comp.lang.python. Surely today we can figure out how to create a SE
>> site?
> We have done, several times. One popular option is Askbot
> <URL:>. I'd be happy to see a
> PSF-blessed instance of Askbot running at a ? domain.
> That said, it would be wise to reflect that creating the software is not
> the hard part; continually responding to community needs, and managing
> the system so desirable behaviours are encouraged, is the hard part
> <URL:>.

Oh, I wasn't talking about creating more software. I was assuming we
could find a way to join the SE network. IOW let Jeff Atwood and co.
take care of that stuff, so we can focus on having meaningful
discussions. (Or were you trolling?)

--Guido van Rossum (

From stephen at  Sat Jan 30 00:52:11 2016
From: stephen at (Stephen J. Turnbull)
Date: Sat, 30 Jan 2016 14:52:11 +0900
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Steven D'Aprano writes:

 > > I don't think treating language design as a participatory democracy
 > > would be a good idea, even if it were practical.

Fred Brooks (The Mythical Man-Month, "The Surgical Team") agreed with
you 40 years ago.  In fact he argued that dictatorship was best for
software systems in general.

 > > Ah, "like" buttons. The way to feel good about yourself for
 > > participating without actually participating :-)

Guido van Rossum[1] responds:

 > And I think it will be easier for new folks to participate than the
 > current mailing list (where if you don't sign up for it you're
 > likely to miss most replies, while if you do sign up, you'll be
 > inundated with traffic -- not everybody is a wizard at managing
 > high volume mailing list traffic).

But as you'll recall Antoine not so long ago no-mail'ed python-ideas
and possibly python-dev because of the volume of participation by
people whose comments were unlikely in the extreme to have any effect
on the decision being discussed.[2]  I don't know how many other core
developers have taken that course, but there certainly was a lot of
sympathy for Antoine -- and IMO justifiably so.  Noblesse oblige can
go only so far, and in the face of "like" buttons....

I agree that reputation systems are very interesting, but in the case
of design channels that need (in the sense Steven described well) to
be dominated by an "elite", I suspect they could make it very hard to
achieve promotion to "elite" status as quickly as python-dev often
does.  I consider the openness of Python core to potential new
members[3] to be a distinguishing characteristic of this community.
It would be unfortunate if potential were obscured by initial low

On the other hand, one attribute that you have mentioned (the ease of
finding issues) has a useful effect.  To the extent that StackExchange
makes traffic management easy (specifically filtering, threading, and
linking), it might encourage users to follow links to other threads
where relevant discussion is posted.  In the thread where Antoine
spoke up, the fact that the discussion that led to the main decision
was on python-committers almost certainly had a lot to do with the
fact that most of the posts were unaware that the main decision was
final, and of the reasons for and against the decision that had
already been discussed.  And those reasons were rehashed endlessly!  A
forum that encourages retrieval of previous discussion before posting
would make a big difference, I suspect.  Eg, one with a check box "I
have read and understood the discussions cited and I still want to
post"[4] for comment entry and a "No! He didn't do his homework!"
button next to the posted comment.<wink/>

But an experiment, eg, with core-mentorship or a SIG, would be good.
As Ben says, designing systems involving people is *hard*, and you
frequently see unintended effects.  Unfortunately, those effects are
perverse far more often than not.[5]

[1]  The juxtaposition of Guido's words with Steven's is intentional,
though no insult is intended to either.

[2]  I'm sorry about the wording, but I don't have a better one.
Python channels do not ignore *people*.  However, new participants are
more likely to make comments that will have no effect, and thus their
comments are likely to be ignored or dismissed with a stock response.
Especially if to the experienced eye the comment has already been
responded to fully in the same thread.

[3]  Every core wants new members who can fit right in.  What makes
Python different from the typical project is effective mentoring of
those with mere potential.

[4]  Like the old Usenet newsreaders used to.

[5]  Which is why my field is justifiably known as "The Dismal Science."

From ben+python at  Sat Jan 30 01:29:37 2016
From: ben+python at (Ben Finney)
Date: Sat, 30 Jan 2016 17:29:37 +1100
Subject: [Python-ideas] A collaborative Q&A site for Python
References: <>
Message-ID: <>

Guido van Rossum <guido at> writes:

> Oh, I wasn't talking about creating more software. I was assuming we
> could find a way to join the SE network. IOW let Jeff Atwood and co.
> take care of that stuff, so we can focus on having meaningful
> discussions.

Ah. I guess I work from the assumption we'd want the PSF to keep control
of our own tools for collaboration, unless there's good reason

> (Or were you trolling?)

No, just didn't understand the differing priorities.

 \                        "I doubt, therefore I might be." --anonymous |
  `\                                                                   |
_o__)                                                                  |
Ben Finney

From sjoerdjob at  Sat Jan 30 01:42:48 2016
From: sjoerdjob at (Sjoerd Job Postmus)
Date: Sat, 30 Jan 2016 07:42:48 +0100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 30, 2016 at 02:52:11PM +0900, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
> ...
> On the other hand, one attribute that you have mentioned (the ease of
> finding issues) has a useful effect.  To the extent that StackExchange
> makes traffic management easy (specifically filtering, threading, and
> linking), it might encourage users to follow links to other threads
> where relevant discussion is posted. ... . A forum that encourages
> retrieval of previous discussion before posting would make a big
> difference, I suspect.  Eg, one with a check box "I have read and
> understood the discussions cited and I still want to post"[4] for
> comment entry and a "No! He didn't do his homework!" button next to
> the posted comment.<wink/>

To be honest, I don't think that would make that big of a difference,
less so than the difference caused by having the discussion area more
easily accessible. In the end, there's always going to be a group of
people who are likely to ignore best practices and add irrelevant
comments (some of whom will not really learn). In fact, I'd expect the
ease of using a website to make it more likely that people join who
at first do not follow best practices at all.

Think of using a mailing list as placing a filter on minimum intellect.

I'm also a visitor on some of the stack-exchange sites, and I see a lot
of topics that get closed quite soon on account of not fitting the
model of a Q&A site for [insert any of a thousand reasons here].

On the other hand, in (at least) python-ideas and python-dev, I don't
see any 'crap' coming by. Sometimes an idea I might think of as crap,
but at least the idea is (quite often) well-substantiated and argued for
in the initial postings.

I myself am inclined to credit the high quality not only to the core
community, but also to the somewhat unusual sign-up procedure[1], and
would be very (happily) surprised if the quality stayed the same after
switching to something web-based with an obvious UI.

[1] Unusual in the sense that it's so not 2016 to have a mailing list
instead of a web forum. Mailing lists are a lot less common now than
they were some time ago.

> ...
> [2]  I'm sorry about the wording, but I don't have a better one.
> Python channels do not ignore *people*.  However, new participants are
> more likely to make comments that will have no effect, and thus their
> comments are likely to be ignored or dismissed with a stock response.
> Especially if to the experienced eye the comment has already been
> responded to fully in the same thread.

Maybe there should be a document describing expected behaviour, instead
of expecting people to somehow 'get' it by observing. For instance, I
did not know if it was OK for me to say '-1' or '+0' or ... on a
suggestion. It would help to have some guidelines on that: "Everybody
can 'vote', but please keep in mind that ..." and "Voting happens by
<procedure>", as well as something along the lines of "It's not a
democracy, voting is just a way of showing your support for/against, but
there will not be a formal tally."

> [3]  Every core wants new members who can fit right in.  What makes
> Python different from the typical project is effective mentoring of
> those with mere potential.
On that topic, would it make sense to at the very least make a list of
some things you want to look for in members 'who can fit right in'?
Well, I'm not really sure that would be a good idea, but what I think
might be a good idea would be something to help people in drawing up
their opening post with an idea. That would help in people getting an
idea of what would be effective behaviour.

Things like:

- If you're proposing syntax changes, please document as fully as
  possible why what you want is not possible with the current syntax, or
  just too burdensome.
  Why what you want to do is common enough to justify the additional
  burden of the mental overhead the suggested syntax (naturally)
  imposes. Yes, your new syntax might reduce the mental overhead in the
  case you are considering, but please keep ... in mind.
- ... (Additional suggestions here)

(Now, I'm just brainstorming here, but suggestions that would help
people write better opening posts, or give more effective feedback,
would probably not hurt. However, I don't think I'm the proper person to
write down suggestions like that, as I'm still relatively new.)

From sjoerdjob at  Sat Jan 30 01:58:51 2016
From: sjoerdjob at (Sjoerd Job Postmus)
Date: Sat, 30 Jan 2016 07:58:51 +0100
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016 at 04:16:46PM -0800, Andrew Barnert wrote:
> On Jan 29, 2016, at 15:42, Sjoerd Job Postmus <sjoerdjob at> wrote:
> > 
> > What I experienced was having collisions on the python-path, and modules
> > from my codebase colliding with libraries in the stdlib (or outside it).
> > For example, a library might import one of its dependencies which
> > coincidentally had the same name as one of the libraries I have.
> Yes. The version of this I've seen most from novices is that they write a program named "" that imports and uses requests, which tries to use the stdlib module json, which gives them an AttributeError on json.loads.
> (One of my favorite questions on StackOverflow came from a really smart novice who'd written a program called "", and he got an error about time.time on one machine, but not another. He figured out that obviously, requests wants him to define his own time function, which he was able to do by using the stuff in datetime. And he figured out the probable difference between the two machines--the working one had an older version of requests. He just wanted to know why requests didn't document this new requirement that they'd added. :))
> > Maybe a suggestion would be to add the path of the module to the error
> > message?
> That would probably help, but think about what it entails:
> Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc.

True. Most AttributeErrors are on user-defined classes with a typo. But
that's not the case we're discussing here. Here we are discussing how a
user should debug the effects of module name collisions, and the
resulting AttributeError.

I would expect it to be quite unlikely that two modules with the same
name each have a class with the same name, and you accidentally
initialize the wrong one.

More likely (in my experience) is that you get an AttributeError on a
module (in the case of module-name collisions).

> To make matters worse, AttributeError objects don't even carry the name of the object being attributed, so even if you wanted to make tracebacks do some magic if isinstance(obj, types.ModuleType), there's no way to do it.
> So, that means you'd have to make ModuleType.__getattr__ do the special error message formatting. 

Yes, indeed. That's what I was thinking of. I decided to write up a quick hack that added the filename to the exception string.

    sjoerdjob$ ../python 
    Traceback (most recent call last):
      File "", line 4, in <module>
      File "/home/sjoerdjob/dev/cpython/tmp/", line 4, in parse
        return json.loads(blob)
    AttributeError: module 'json' (loaded from /home/sjoerdjob/dev/cpython/tmp/ has no attribute 'loads'

Here's the patch, in case anyone is interested.

    diff --git a/Objects/moduleobject.c b/Objects/moduleobject.c
    index 24c5f4c..5cc144a 100644
    --- a/Objects/moduleobject.c
    +++ b/Objects/moduleobject.c
    @@ -654,17 +654,25 @@ module_repr(PyModuleObject *m)
     static PyObject*
     module_getattro(PyModuleObject *m, PyObject *name)
    -    PyObject *attr, *mod_name;
    +    PyObject *attr, *mod_name, *mod_file;
         attr = PyObject_GenericGetAttr((PyObject *)m, name);
         if (attr || !PyErr_ExceptionMatches(PyExc_AttributeError))
             return attr;
         if (m->md_dict) {
             mod_name = _PyDict_GetItemId(m->md_dict, &PyId___name__);
             if (mod_name) {
    -            PyErr_Format(PyExc_AttributeError,
    +            _Py_IDENTIFIER(__file__);
    +            mod_file = _PyDict_GetItemId(m->md_dict, &PyId___file__);
    +            if (mod_file && PyUnicode_Check(mod_file)) {
    +                PyErr_Format(PyExc_AttributeError,
    +                        "module '%U' (loaded from %U) has no attribute '%U'", mod_name, mod_file, name);
    +            } else {
    +                PyErr_Format(PyExc_AttributeError,
                             "module '%U' has no attribute '%U'", mod_name, name);
    +            }
                 return NULL;
             else if (PyErr_Occurred()) {

Unfortunately, I do think this might impose **some** performance issue, but on
the other hand, I'd be inclined to think that attribute-errors on module
objects are not that likely to begin with, except for typos and issues
like these. (And of course the case that you have to support older
versions of Python with a slower implementation, but you most often see
those checks being done at the module-level, so it would only impact
load-time and not running-time.)

The added benefit would be quicker debugging once the user has finally
posted to a forum: "Ah, I see from the message that the path of the module
is not likely a standard-library path. Maybe you have a name collision?
Check for files or directories named '<module name here>(.py)' in your
working directory / project / ..."
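
The message format from the patch can be mimicked in pure Python to see
what a user would end up reading. This is only a sketch:
describe_missing_attr is a hypothetical helper (not part of CPython), and
the module below is a stand-in built with types.ModuleType with a made-up
path, not a real shadowing file.

```python
import types


def describe_missing_attr(module, attr):
    """Build a message in the style of the patched error (hypothetical helper)."""
    path = getattr(module, "__file__", None)
    if isinstance(path, str):
        # Mirrors the patched branch: say where the module was loaded from.
        return "module %r (loaded from %s) has no attribute %r" % (
            module.__name__, path, attr)
    # Mirrors the fallback branch for modules without a usable __file__.
    return "module %r has no attribute %r" % (module.__name__, attr)


# Stand-in for a user's own json.py that shadows the standard library module.
# The path is invented for illustration.
shadow = types.ModuleType("json")
shadow.__file__ = "/home/user/project/json.py"

message = describe_missing_attr(shadow, "loads")
print(message)
```

A path under the user's own project directory is the hint that the module
is probably not the one the documentation describes.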

> > (Of course, another option would be to look for other modules of the
> > same name when you get an attribute-error on a module to aid debugging,
> > but I think that's too heavy-weight.)
> If that could be done only when the exception escapes to top level and dumps a traceback, that might be reasonable. And it would _definitely_ be helpful. But I don't think it's possible without major changes.

No, indeed, that was also my expectation: helpful, but too big a hassle
to be worth it.
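
For reference, the heavier "look for other modules of the same name" idea
could be approximated from user code roughly like this. It is a sketch
under assumptions: find_all_candidates is a hypothetical name, it only
consults path-based finders via importlib.machinery.PathFinder, and it
ignores meta path hooks, built-in modules and namespace packages.

```python
import sys
from importlib.machinery import PathFinder


def find_all_candidates(module_name, search_path):
    """Return the origin of every file on search_path that could satisfy
    'import module_name' (hypothetical helper, path-based finders only)."""
    origins = []
    for entry in search_path:
        # Ask the path finder about one directory at a time, so that later
        # entries are not hidden by an earlier match.
        spec = PathFinder.find_spec(module_name, [entry])
        if spec is not None and spec.origin:
            origins.append(spec.origin)
    return origins


# More than one result means an earlier sys.path entry shadows a later one.
for origin in find_all_candidates("json", sys.path):
    print(origin)
```

Doing this on every module AttributeError means a filesystem scan per
sys.path entry, which is the cost that makes it look too heavy-weight.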

From abarnert at  Sat Jan 30 02:44:31 2016
From: abarnert at (Andrew Barnert)
Date: Fri, 29 Jan 2016 23:44:31 -0800
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 29, 2016, at 22:58, Sjoerd Job Postmus <sjoerdjob at> wrote:
>> On Fri, Jan 29, 2016 at 04:16:46PM -0800, Andrew Barnert wrote:
>>> On Jan 29, 2016, at 15:42, Sjoerd Job Postmus <sjoerdjob at> wrote:
>>> Maybe a suggestion would be to add the path of the module to the error
>>> message?
>> That would probably help, but think about what it entails:
>> Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc.
> True. Most AttributeErrors are on user-defined classes with a typo. But
> that's not the case we're discussing here. Here we are discussing how a
> user should debug the effects of module name collisions, and the
> resulting AttributeError.

Right. So my point is, either we have to do the extra work in module.__getattr__ when formatting the string, or we have to extend the interface of AttributeError to carry more information in general (the object and attr name, presumably). The latter may be better, but it's also clearly not going to happen any time soon. (People have been suggesting since before 3.0 that all the standard exceptions should have more useful info, but nobody's volunteered to change the hundreds of lines of C code, Python code, and docs to do it...)

So, the only argument against your idea I can see is the potential performance issues. Which should be pretty easy to dismiss with a microbenchmark showing it's pretty small even in the worst case and a macrobenchmark showing it's not even measurable in real code, right?
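
A microbenchmark along those lines might look like the sketch below. It
only measures the interpreter it runs on; comparing a patched and an
unpatched build means running it once under each. The helper names and
iteration counts are arbitrary choices, not anything from the patch.

```python
import math
import timeit


def failing_lookup():
    # The worst case for the patch: every lookup misses and formats a message.
    try:
        math.no_such_attribute
    except AttributeError:
        pass


def successful_lookup():
    # The common case, which the patch does not touch.
    math.pi


def per_call_seconds(func, number=20000, repeat=3):
    """Best-of-N average time for one call of func, in seconds."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


miss = per_call_seconds(failing_lookup)
hit = per_call_seconds(successful_lookup)
print("failed lookup:     %.3g s per call" % miss)
print("successful lookup: %.3g s per call" % hit)
```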

From skreft at  Sat Jan 30 03:05:23 2016
From: skreft at (Sebastian Kreft)
Date: Sat, 30 Jan 2016 19:05:23 +1100
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 30, 2016 6:45 PM, "Andrew Barnert via Python-ideas" <
python-ideas at> wrote:
> On Jan 29, 2016, at 22:58, Sjoerd Job Postmus <sjoerdjob at> wrote:
> >
> >> On Fri, Jan 29, 2016 at 04:16:46PM -0800, Andrew Barnert wrote:
> >>> On Jan 29, 2016, at 15:42, Sjoerd Job Postmus <sjoerdjob at>
> >>>
> >>> Maybe a suggestion would be to add the path of the module to the error
> >>> message?
> >>
> >> That would probably help, but think about what it entails:
> >>
> >> Most AttributeErrors aren't on module objects, they're on instances of
> >> user-defined classes with a typo, or on None because the user forgot a
> >> "return" somewhere, or on str because the user didn't realize the
> >> difference between the string representation of an object and the objects,
> >> etc.
> >
> > True. Most AttributeErrors are on user-defined classes with a typo. But
> > that's not the case we're discussing here. Here we are discussing how a
> > user should debug the effects of module name collisions, and the
> > resulting AttributeError.
> Right. So my point is, either we have to do the extra work in
> module.__getattr__ when formatting the string, or we have to extend the
> interface of AttributeError to carry more information in general (the
> object and attr name, presumably). The latter may be better, but it's also
> clearly not going to happen any time soon. (People have been suggesting
> since before 3.0 that all the standard exceptions should have more useful
> info, but nobody's volunteered to change the hundreds of lines of C code,
> Python code, and docs to do it...)

PEP 473 centralizes all of these requests.

I started adding support for NameError, the simplest change, and it
turned out to be much more complex than I had thought. I had a couple of
tests which were failing and didn't have the bandwidth to debug them.

> So, the only argument against your idea I can see is the potential
> performance issues. Which should be pretty easy to dismiss with a
> microbenchmark showing it's pretty small even in the worst case and a
> macrobenchmark showing it's not even measurable in real code, right?

From ianlee1521 at  Sat Jan 30 03:47:43 2016
From: ianlee1521 at (Ian Lee)
Date: Sat, 30 Jan 2016 00:47:43 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

So with the upcoming move to GitHub of the CPython repository, planned with PEP-512 [1], what about the idea of creating a Git repository on GitHub to serve as a replacement for a mailing list, for example python-ideas? Such a repo might be hosted off the "Python" GitHub organization:

> On Jan 29, 2016, at 20:35, Guido van Rossum <guido at> wrote:
> What I like most is that the site encourages the creation of
> artifacts that are useful to reference later, e.g. when a related
> issue comes up again later. And I think it will be easier for new
> folks to participate than the current mailing list (where if you don't
> sign up for it you're likely to miss most replies, while if you do
> sign up, you'll be inundated with traffic -- not everybody is a wizard
> at managing high volume mailing list traffic).

Such a repository could address a number of items brought up above, including providing a permanent link to artifacts: comments and threads (issues in the issue tracker). The ability to watch the entire "repo" (mailing list) and unsubscribe to "issues" (threads) that are no longer interesting to a watcher (similarly to the way "muting" works in Gmail). Or vice versa, have notifications off by default and able to opt into notifications if something interesting catches your eye.

Additionally, this would provide a straightforward and pretty easy way to link discussions to actual changes in other repos, easier than something like "CPython at commit XYZ123".

Stephen J. Turnbull writes:

> On the other hand, one attribute that you have mentioned (the ease of
> finding issues) has a useful effect.  To the extent that StackExchange
> makes traffic management easy (specifically filtering, threading, and
> linking), it might encourage users to follow links to other threads
> where relevant discussion is posted.  In the thread where Antoine
> spoke up, the fact that the discussion that led to the main decision
> was on python-committers almost certainly had a lot to do with the
> fact that most of the posts were unaware that the main decision was
> final, and of the reasons for and against the decision that had
> already been discussed.  And those reasons were rehashed endlessly!  A
> forum that encourages retrieval of previous discussion before posting
> would make a big difference, I suspect.  Eg, one with a check box "I
> have read and understood the discussions cited and I still want to
> post"[4] for comment entry and a "No! He didn't do his homework!"
> button next to the posted comment.<wink/>

A lot of the filtering, sorting, and other benefits that Stephen mentions would be available through GitHub's searching capabilities, and others such as tagging of "issues" (mail threads) with labels (peps, new feature, duplicate, change existing functionality, etc. come to mind).

Additionally, an issue / thread in the repo could be "closed" when it is off topic, and later issues on the same subject could be closed, marked as "duplicate", and linked to the old closed issue, providing that bit of history without anyone needing to take the time to rewrite the response.

Other benefits include syntax highlighting, Markdown formatting (which was announced this week [2]), and the ability to interact with the thread via email (replying to the email creates a comment on the issue) or through the browser (which is nice for the presumably small, but >= 1, population that have their personal email blocked by their corporate firewall).

I could also see there being a lot of benefit in making the actual code in the repository be things like contributing information, what is appropriate to say / ask on each list, etc. For lists like core-workflow I could even see this evolving to where the "Code" was a GitHub Pages [3] page that directly hosts something like the contributor guide (which could still live at whatever URL was desired, while letting GitHub do the actual hosting). An extra benefit is that it provides a very straightforward way to update some of the developer, contributor, and mentoring guides.

It doesn't "solve" some of the other issues such as voting, reputation of a user, etc. However, I'm not hearing a resounding desire for those anyway.

There is at least *some* precedent for this in the form of the Government GitHub [4][5] community and related agencies such as 18F [6]. The former has a "best practices" repository [7] which serves this same purpose of communicating and discussing ideas, without necessarily being a code repository. Unfortunately, that repository is private and requires a government email address and joining the "government" organization to access; see [8] for details on joining if you're interested.

[1] <>

[2] <>


[4] <>

[5] <>

[6] <>

[7] <>

[8] <>

~ Ian Lee | IanLee1521 at <mailto:IanLee1521 at>

From rosuav at  Sat Jan 30 03:50:48 2016
From: rosuav at (Chris Angelico)
Date: Sat, 30 Jan 2016 19:50:48 +1100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 30, 2016 at 7:47 PM, Ian Lee <ianlee1521 at> wrote:
> Such a repository could address a number of items brought up above,
> including providing a permanent link to artifacts: comments and threads
> (issues in the issue tracker). The ability to watch the entire "repo"
> (mailing list) and unsubscribe to "issues" (threads) that are no longer
> interesting to a watcher (similarly to the way "muting" works in Gmail). Or
> vice versa, have notifications off by default and able to opt into
> notifications if something interesting catches your eye.

How do you change the subject line to indicate that the topic has
drifted (or is a spin-off), while still appropriately quoting the
previous post?

Most web-based discussion systems are built around a concept of
"initial post" and "replies", where the replies always tie exactly to
one initial post. The branching of discussion threads never seems to
work as well as it does in netnews or email.


From ianlee1521 at  Sat Jan 30 04:01:31 2016
From: ianlee1521 at (Ian Lee)
Date: Sat, 30 Jan 2016 01:01:31 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

> On Jan 30, 2016, at 00:50, Chris Angelico <rosuav at> wrote:
> On Sat, Jan 30, 2016 at 7:47 PM, Ian Lee <ianlee1521 at> wrote:
>> Such a repository could address a number of items brought up above,
>> including providing a permanent link to artifacts: comments and threads
>> (issues in the issue tracker). The ability to watch the entire "repo"
>> (mailing list) and unsubscribe to "issues" (threads) that are no longer
>> interesting to a watcher (similarly to the way "muting" works in Gmail). Or
>> vice versa, have notifications off by default and able to opt into
>> notifications if something interesting catches your eye.
> How do you change the subject line to indicate that the topic has
> drifted (or is a spin-off), while still appropriately quoting the
> previous post?
> Most web-based discussion systems are built around a concept of
> "initial post" and "replies", where the replies always tie exactly to
> one initial post. The branching of discussion threads never seems to
> work as well as it does in netnews or email.

True, you don't get quite as nice forking of issues, though other solutions mentioned (e.g. StackOverflow) would have similar issues.

Off the cuff, I'd suggest that this linking could be handled by creating a new issue which linked to the old issue [1] with something like "continuing from #12345 ...". This would actually provide an improvement over the current email approach, which only really provides a link from the forked thread back to the original, by creating a reference / link to the forked issue in the original, e.g. how [2] and [3] are linked.

[1] <>

[2] <>

[3] <>
> ChrisA

~ Ian Lee | IanLee1521 at <mailto:IanLee1521 at>

From ncoghlan at  Sat Jan 30 04:18:01 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 30 Jan 2016 19:18:01 +1000
Subject: [Python-ideas] A collaborative Q&A site for Python (was: A bit
In-Reply-To: <>
References: <>
Message-ID: <>

On 30 January 2016 at 15:24, Guido van Rossum <guido at> wrote:
> On Fri, Jan 29, 2016 at 9:16 PM, Ben Finney <ben+python at> wrote:
>> Guido van Rossum <guido at> writes:
>>> I don't understand the issues brought up about the SE site creation
>>> process. 22 years ago we managed to create a Usenet newsgroup,
>>> comp.lang.python. Surely today we can figure out how to create a SE
>>> site?
>> We have done, several times. One popular option is Askbot
>> <URL:>. I'd be happy to see a
>> PSF-blessed instance of Askbot running at a ? domain.
>> That said, it would be wise to reflect that creating the software is not
>> the hard part; continually responding to community needs, and managing
>> the system so desirable behaviours are encouraged, is the hard part
>> <URL:>.
> Oh, I wasn't talking about creating more software. I was assuming we
> could find a way to join the SE network. IOW let Jeff Atwood and co.
> take care of that stuff, so we can focus on having meaningful
> discussions.

Area 51 is their process for doing that:

However, while Stack Exchange style sites can be good for "Why is this
existing thing the way it is?" Q&A, they're not really designed for
proposing *changes* to things, discussing the prospective merits of
those changes, and coming to a decision.

Loomio is a good example of a site that offers some much better tools
for collaborative discussion and decision making:

You still have the "critical mass" problem though, and for CPython,
the critical mass of eyeballs is on python-dev and python-ideas -
hence the inclination to try to update that infrastructure to Mailman
3 transparently (thus providing a much improved web gateway for
potential new participants and better list management tools for
existing subscribers), rather than trying to convince current list
members to switch to a different technology.


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From mojtaba.gharibi at  Sat Jan 30 04:24:52 2016
From: mojtaba.gharibi at (Mirmojtaba Gharibi)
Date: Sat, 30 Jan 2016 04:24:52 -0500
Subject: [Python-ideas] Respectively and its unpacking sentence
In-Reply-To: <>
References: <>
Message-ID: <>

Thanks everyone for your feedback.
I think I have a clearer look at it as a result.
It seems the most important feature is the vector operation aspect of it.
Also that magical behavior and the fact that $ or ;;; does not produce
types is troublesome.
Also, some of the other aspects such as
x,y = 1+2, 3+4
is already addressed by the above notation, so we're not gaining anything there.

I'll have some ideas to address the concerns and will post them later again.


On Fri, Jan 29, 2016 at 12:32 PM, Stephen J. Turnbull
<stephen at> wrote:
> Pavol Lisy writes:
>  > I would really like
>  >
>  > (a;b;c) in L
> Not well-specified (does order matter? how about repeated values? is
> (a;b;c) an object? it sure looks like one, and if so, object in L
> already has a meaning).  But for one obvious interpretation:
>     {a, b, c} <= set(L)
> and in this interpretation you should probably optimize to
>     {a, b, c} <= L
> by constructing L as a set in the first place.  Really this thread
> probably belongs on python-list anyway.
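
For reference, the set-based reading of "(a;b;c) in L" and the existing
tuple-unpacking form can be sketched as follows. The values are made up
for illustration, and this reading ignores order and repeated values.

```python
# Existing tuple unpacking already covers the "respectively" assignment.
x, y = 1 + 2, 3 + 4

# One reading of "(a;b;c) in L": every one of a, b, c occurs somewhere in L.
L = [3, 1, 4, 1, 5]
a, b, c = 1, 4, 5
all_present = {a, b, c} <= set(L)

# If membership tests are the main use, build L as a set from the start
# and the conversion disappears.
L_as_set = {3, 1, 4, 5}
all_present_again = {a, b, c} <= L_as_set

# A value that is missing from L makes the subset test fail.
some_missing = {a, b, 9} <= set(L)

print(x, y, all_present, all_present_again, some_missing)
```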

From ncoghlan at  Sat Jan 30 04:30:05 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 30 Jan 2016 19:30:05 +1000
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On 30 January 2016 at 08:42, Ned Batchelder <ned at> wrote:
> Hi,
> A common question we get in the #python IRC channel is, "I tried importing a
> module, but I get an AttributeError trying to use the things it said it
> provided."  Turns out the beginner named their own file the same as the
> module they were trying to use.
> That is, they want to try (for example) the "azure" package.  So they make a
> file called, and start with "import azure". The import succeeds,
> but it has none of the contents the documentation claims, because they have
> imported themselves.  It's baffling, because they have used the exact
> statements shown in the examples, but it doesn't work.
> Could we make this a more obvious failure?  Is there ever a valid reason for
> a file to import itself?  Is this situation detectable in the import
> machinery?

We could potentially detect when __main__ is being reimported under a
different name and issue a user visible warning when it happens, but
we can't readily detect a file importing itself in the general case
(since it may be an indirect circular reference rather than a direct).
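
A rough user-level approximation of the __main__ check might look like the
sketch below. It is not how the import system would implement it:
would_reimport_main is a hypothetical helper, and it relies on
importlib.util.find_spec and on comparing file paths, which only covers
modules loaded from ordinary source files.

```python
import importlib.util
import os
import sys


def would_reimport_main(module_name, main_file):
    """Return True if importing module_name would load the same file that
    is already running as __main__ (hypothetical helper)."""
    if not main_file:
        return False  # e.g. the interactive prompt has no __file__
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return False
    if spec is None or not spec.origin:
        return False
    try:
        return os.path.samefile(spec.origin, main_file)
    except OSError:
        # Origins such as "built-in" are not real paths.
        return False


main_file = getattr(sys.modules["__main__"], "__file__", None)
print(would_reimport_main("json", main_file))
```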


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From steve at  Sat Jan 30 04:44:56 2016
From: steve at (Steven D'Aprano)
Date: Sat, 30 Jan 2016 20:44:56 +1100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 30, 2016 at 12:47:43AM -0800, Ian Lee wrote:

> So with the upcoming move to GitHub of the CPython repository, planned 
> with PEP-512 [1], what about the idea of creating a Git repository on 
> GitHub to serve as a replacement for a mailing list, for example 
> python-ideas? Such a repo might be hosted off the "Python" GitHub 
> organization:

I think any talk of migrating away from email is greatly premature. 
The Mailman folks have done a lot of fantastic work with Mailman 3 and 
Hyperkitty, which will bring many of the benefits of a web forum to the 
mailing lists. We should at least look at Hyperkitty before planning any 
widespread move away from email.


From ned at  Sat Jan 30 06:19:35 2016
From: ned at (Ned Batchelder)
Date: Sat, 30 Jan 2016 06:19:35 -0500
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On 1/30/16 4:30 AM, Nick Coghlan wrote:
> On 30 January 2016 at 08:42, Ned Batchelder <ned at> wrote:
>> Hi,
>> A common question we get in the #python IRC channel is, "I tried importing a
>> module, but I get an AttributeError trying to use the things it said it
>> provided."  Turns out the beginner named their own file the same as the
>> module they were trying to use.
>> That is, they want to try (for example) the "azure" package.  So they make a
>> file called, and start with "import azure". The import succeeds,
>> but it has none of the contents the documentation claims, because they have
>> imported themselves.  It's baffling, because they have used the exact
>> statements shown in the examples, but it doesn't work.
>> Could we make this a more obvious failure?  Is there ever a valid reason for
>> a file to import itself?  Is this situation detectable in the import
>> machinery?
> We could potentially detect when __main__ is being reimported under a
> different name and issue a user visible warning when it happens, but
> we can't readily detect a file importing itself in the general case
> (since it may be an indirect circular reference rather than a direct).

I thought about the indirect case, and for the errors I'm trying to make 
clearer, the direct case is plenty.

While we're at it though, re-importing __main__ is a separate kind of 
behavior that is often a problem, since it means you'll have the same 
classes defined twice.
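
The "same classes defined twice" effect can be reproduced without running
a script at all, by executing one source file under two module names.
This is a sketch: load_as, the file name widgets_demo.py and the name
__main_demo__ are all made up for illustration.

```python
import importlib.util
import os
import tempfile


def load_as(name, path):
    """Execute the file at path as a fresh module called name."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


source = "class Widget:\n    pass\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "widgets_demo.py")
    with open(path, "w") as f:
        f.write(source)
    # Stand-ins for "the script running as __main__" and "the same file
    # imported again under its real name".
    first = load_as("__main_demo__", path)
    second = load_as("widgets_demo", path)

# Same source, but two distinct class objects.
same_class = first.Widget is second.Widget
cross_isinstance = isinstance(first.Widget(), second.Widget)
print(same_class, cross_isinstance)
```

An instance created through one copy fails isinstance checks against the
other, which is where the confusion usually shows up.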

> Cheers,
> Nick.

From stephen at  Sat Jan 30 06:21:14 2016
From: stephen at (Stephen J. Turnbull)
Date: Sat, 30 Jan 2016 20:21:14 +0900
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

>>>>> On Sat, Jan 30, 2016 at 12:47:43AM -0800, Ian Lee wrote:
 > So with the upcoming move to GitHub of the CPython repository, planned 
 > with PEP-512 [1], what about the idea of creating a Git repository on 
 > GitHub to serve as a replacement for a mailing list,

-1 in general.  For existing channels, parallel operation, probably
with a gateway, is essential.

 > for example python-ideas?

-1 in particular.  There are better candidates for experimentation.

>>>>> Steven D'Aprano writes:

 > I think any talk of migrating away from email is greatly premature.

+1 (but I'm a contributor to the GNU Mailman project, so take that
with a grain of self-interest).

From ncoghlan at  Sat Jan 30 06:57:05 2016
From: ncoghlan at (Nick Coghlan)
Date: Sat, 30 Jan 2016 21:57:05 +1000
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On 30 January 2016 at 21:19, Ned Batchelder <ned at> wrote:
> On 1/30/16 4:30 AM, Nick Coghlan wrote:
>> We could potentially detect when __main__ is being reimported under a
>> different name and issue a user visible warning when it happens, but
>> we can't readily detect a file importing itself in the general case
>> (since it may be an indirect circular reference rather than a direct).
> I thought about the indirect case, and for the errors I'm trying to make
> clearer, the direct case is plenty.

In that case, the only problem I see off the top of my head with
emitting a warning for direct self-imports is that it would rely on
import system behaviour we're currently trying to reduce/minimise: the
import machinery needing visibility into the globals for the module
initiating the import.

It's also possible that by the time we get hold of the __spec__ for
the module being imported, we've already dropped our reference to the
importing module's globals, so we can't check against __file__ any
more. However, I'd need to go read the code to remember how quickly we
get to extracting just the globals of potential interest.

> While we're at it though, re-importing __main__ is a separate kind of
> behavior that is often a problem, since it means you'll have the same
> classes defined twice.

Right, but it combines with the name shadowing behaviour to create a
*super* confusing combination when you write a *script* that shadows
the name of a standard library module:


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia

From oscar.j.benjamin at  Sat Jan 30 08:20:25 2016
From: oscar.j.benjamin at (Oscar Benjamin)
Date: Sat, 30 Jan 2016 13:20:25 +0000
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On 30 January 2016 at 11:57, Nick Coghlan <ncoghlan at> wrote:
> On 30 January 2016 at 21:19, Ned Batchelder <ned at> wrote:
>> On 1/30/16 4:30 AM, Nick Coghlan wrote:
>>> We could potentially detect when __main__ is being reimported under a
>>> different name and issue a user visible warning when it happens, but
>>> we can't readily detect a file importing itself in the general case
>>> (since it may be an indirect circular reference rather than a direct).
>> I thought about the indirect case, and for the errors I'm trying to make
>> clearer, the direct case is plenty.
> In that case, the only problem I see off the top of my head with
> emitting a warning for direct self-imports is that it would rely on
> import system behaviour we're currently trying to reduce/minimise: the
> import machinery needing visibility into the globals for the module
> initiating the import.
> It's also possible that by the time we get hold of the __spec__ for
> the module being imported, we've already dropped our reference to the
> importing module's globals, so we can't check against __file__ any
> more. However, I'd need to go read the code to remember how quickly we
> get to extracting just the globals of potential interest.

Maybe this is because I don't really understand how the import
machinery works but I would say that if I run

    $ python

Then the interpreter should be able to know that __main__ is called
"random" and know the path to that file. It should also be evident if
'' is at the front of sys.path then "import random" is going to import
that same module. Why is it difficult to detect that case?

I think it would be better to try and solve the problem a little more
generally though. Having yesterday created a file called in
my user directory (on Ubuntu 15.04) I get the following today:

$ cat
import urllib2
$ python3
Python 3.4.3 (default, Mar 26 2015, 22:03:40)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> *a = [1, 2]
  File "<stdin>", line 1
SyntaxError: starred assignment target must be in a list or tuple
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/", line 5, in <module>
    from import Report
  File "/usr/lib/python3/dist-packages/apport/", line 12, in <module>
    import subprocess, tempfile, os.path, re, pwd, grp, os, time
  File "/usr/lib/python3.4/", line 175, in <module>
    from random import Random as _Random
  File "/home/oscar/", line 1, in <module>
    import urllib2
ImportError: No module named 'urllib2'

Original exception was:
  File "<stdin>", line 1
SyntaxError: starred assignment target must be in a list or tuple


From random832 at  Sat Jan 30 10:43:06 2016
From: random832 at (Random832)
Date: Sat, 30 Jan 2016 10:43:06 -0500
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

Guido van Rossum  writes:
> Let me clarify why I like StackExchange. I don't care about the voting
> for/against answers or even about the selection of the "best" answer
> by the OP. I do like that the reputation system of the site
> automatically recognizes users who should be given more
> responsibilities (up to and including deleting inappropriate posts --
> rarely).

These can't be separated. Reputation is obtained by writing answers that
people vote for. The site would either have to be structured to allow
that, or an entirely different way of getting reputation... which would
still involve voting on _something_, if it's to be decentralized and
therefore "automatic" rather than requiring you personally to hand out
all reputation points.

From random832 at  Sat Jan 30 10:49:14 2016
From: random832 at (Random832)
Date: Sat, 30 Jan 2016 10:49:14 -0500
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

Chris Angelico <rosuav at> writes:
> How do you change the subject line to indicate that the topic has
> drifted (or is a spin-off), while still appropriately quoting the
> previous post?

You're free to quote any post anywhere, even if you make a new
thread. In general to do this you have to start your reply in the
original thread, then copy/paste the quote markup (which includes a
magic link to the post you are quoting) into the post new thread form.

It would be interesting to make a forum with a "spin-off thread"
feature, which would automate the placement of the reply in a new thread
and a note in the old thread with a link to the new one.

But in most cases this can't be automated because on better-managed
forums once a digression has grown large enough to need a separate
thread, the forum's moderators will move earlier posts about it
(originally made in the first thread) to the new thread. (it might be
interesting to make a forum that provides a way to have a post live in
two different threads at the same time)

From nicholas.chammas at  Sat Jan 30 11:48:18 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sat, 30 Jan 2016 16:48:18 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

To follow up on one of the suggestions Brett and Donald made, I think the
best solution today for a modern discussion forum is Discourse.

Discourse is built by some of the same people who built Stack Overflow,
including Jeff Atwood. Among the many excellent features it has is
full support for a "mailing list mode", where you can reply to and start
new conversations entirely via email. That may be important for people who
are not interested in using the web for their conversations.

Discourse doesn't currently have a voting plugin, but there is an
interesting discussion about adding one.
Just earlier today a member of the Discourse team followed up on that
discussion with a detailed proposal to make the plugin real.

As an example of the polish Discourse already has, consider this remark by
Random832:

> It would be interesting to make a forum with a "spin-off thread" feature,
> which would automate the placement of the reply in a new thread and a note
> in the old thread with a link to the new one.

If you look at the post I linked to about adding a voting plugin,
you can see just this kind of link offered by Discourse since the poster
spun that new thread from an existing one.

A large open source community using Discourse today is Docker. If Donald
sets up a Discourse instance for Packaging, that should serve as a good
trial for us to deploy it elsewhere. I suspect it will be a success.

As for hosting, there are many options
that range from free but self-managed, to fully managed for a monthly fee.


On Sat, Jan 30, 2016 at 10:50 AM Random832 <random832 at> wrote:

> Chris Angelico <rosuav at> writes:
> > How do you change the subject line to indicate that the topic has
> > drifted (or is a spin-off), while still appropriately quoting the
> > previous post?
> You're free to quote any post anywhere, even if you make a new
> thread. In general to do this you have to start your reply in the
> original thread, then copy/paste the quote markup (which includes a
> magic link to the post you are quoting) into the post new thread form.
> It would be interesting to make a forum with a "spin-off thread"
> feature, which would automate the placement of the reply in a new thread
> and a note in the old thread with a link to the new one.
> But in most cases this can't be automated because on better-managed
> forums once a digression has grown large enough to need a separate
> thread, the forum's moderators will move earlier posts about it
> (originally made in the first thread) to the new thread. (it might be
> interesting to make a forum that provides a way to have a post live in
> two different threads at the same time)

From guido at  Sat Jan 30 12:25:17 2016
From: guido at (Guido van Rossum)
Date: Sat, 30 Jan 2016 09:25:17 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Oooh, Discourse looks and sounds good. Hopefully we can opt out from
voting, everything else looks just right. I recommend requesting some
PSF money for a fully-hosted instance, so nobody has to suffer when
it's down, security upgrades will be taken care of, etc.

On Sat, Jan 30, 2016 at 8:48 AM, Nicholas Chammas
<nicholas.chammas at> wrote:
> To follow up on one of the suggestions Brett and Donald made, I think the
> best solution today for a modern discussion forum is Discourse.
> Discourse is built by some of the same people who built Stack Overflow,
> including Jeff Atwood. Among the many excellent features it has is full
> support for a "mailing list mode", where you can reply to and start new
> conversations entirely via email. That may be important for people who are
> not interested in using the web for their conversations.
> Discourse doesn't currently have a voting plugin, but here is an interesting
> discussion about adding one. Just earlier today a member of the Discourse
> team followed-up on that discussion with a detailed proposal to make the
> plugin real.
> As an example of the polish Discourse already has, consider this remark by
> Random832:
> It would be interesting to make a forum with a "spin-off thread" feature,
> which would automate the placement of the reply in a new thread and a note
> in the old thread with a link to the new one.
> If you look at the post I linked to about adding a voting plugin, you can
> see just this kind of link offered by Discourse since the poster spun that
> new thread from an existing one.
> A large open source community using Discourse today is Docker. If Donald
> sets up a Discourse instance for Packaging, that should serve as a good
> trial for us to deploy it elsewhere. I suspect it will be a success.
> As for hosting, there are many options that range from free but
> self-managed, to fully managed for a monthly fee.
> Nick

--Guido van Rossum

From brett at  Sat Jan 30 12:29:22 2016
From: brett at (Brett Cannon)
Date: Sat, 30 Jan 2016 17:29:22 +0000
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On Fri, Jan 29, 2016, 11:57 Andrew Barnert via Python-ideas <
python-ideas at> wrote:

> On Jan 29, 2016, at 06:10, Nick Coghlan <ncoghlan at> wrote:
> >
> > PEP 511 erases that piece of incidental complexity and says, "If you
> > want to apply a genuinely global transformation, this is how you do
> > it". The fact we already have decorators and import hooks is why I
> > think PEP 511 can safely ignore the use cases that those handle.
> I think this is the conclusion I was hoping to reach, but wasn't sure how
> to get there. I'm happy with PEP 511 not trying to serve cases like MacroPy
> and Hy and the example from the byteplay docs, especially so if ignoring
> them makes PEP 511 simpler, as long as it can explain why it's ignoring
> them. And a shorter version of your argument should serve as such an
> explanation.
> But the other half of my point was that too many people (even very
> experienced developers like most of the people on this list) think there's
> more incidental complexity than there is, and that's also a problem. For
> example, "I want to write a global processor for local experimentation
> purposes so I can play with my idea before posting it to Python-ideas" is
> not a bad desire. And, if people think it's way too hard to do with a
> quick&dirty import hook, they're naturally going to ask why PEP 511 doesn't
> help them out by adding a bunch of options to install/run the processors
> conditionally, handle files, skip the stdlib, etc. And I think the
> PEP is better without those options.
> > However, I think it *would* make sense to make the creation of a "Code
> > Transformation" HOWTO guide part of the PEP - having a guide means we
> > can clearly present the hierarchy in terms of:
> I like this idea.
> Earlier I suggested that the import system documentation should have some
> simple examples of how to actually use the import system to write
> transforming hooks. Someone (Brett?) pointed out that it's a dangerous
> technique, and making it too easy for people to play with it without
> understanding it may be a bad idea. And they're probably right.

If we added an appropriate warning to the example I would be fine adding
one that covers how to add a custom loader.


> A HOWTO is a bit more "out-of-the-way" than library or reference
> docs--and, more importantly, it also has room to explain when you shouldn't
> do this or that, and why.
> I'm not sure it has to be part of the PEP, but I can see the connection.
> While the PEP helps by separating out the most important safe case
> (semantically-neutral, reflected in .pyc, globally consistent, etc.),
> it also makes the question "how do I do something similar to PEP 511
> transformers except ___" more likely to come up in the first place, making
> the HOWTO more important.

From nicholas.chammas at  Sat Jan 30 12:30:37 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sat, 30 Jan 2016 17:30:37 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Yeah, the voting is just a plugin which I presume you can enable or disable
as desired.

As for hosting, I agree it's probably better to have someone else do that
so we can lessen the burden on our infra team. The Discourse team also
offers discounted hosting for open source projects.
Depending on the arrangement they offer, it may be really cheap for us even
with a fully managed instance.


On Sat, Jan 30, 2016 at 12:25 PM Guido van Rossum <guido at> wrote:

> Oooh, Discourse looks and sounds good. Hopefully we can opt out from
> voting, everything else looks just right. I recommend requesting some
> PSF money for a fully-hosted instance, so nobody has to suffer when
> it's down, security upgrades will be taken care of, etc.

From donald at  Sat Jan 30 12:58:53 2016
From: donald at (Donald Stufft)
Date: Sat, 30 Jan 2016 12:58:53 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Honestly, it's probably not super hard for us to get this running (and we might want to so we can piggyback on our Fastly CDN and such). Assuming it stores all of its persistent state inside of PostgreSQL then we're already running a central PostgreSQL server that we keep backed up. The biggest issue comes from software that wants to store persistent state on disk, since that makes it difficult to treat those machines as ephemeral.

> On Jan 30, 2016, at 12:30 PM, Nicholas Chammas <nicholas.chammas at> wrote:
> Yeah, the voting is just a plugin which I presume you can enable or disable as desired.
> As for hosting, I agree it's probably better to have someone else do that so we can lessen the burden on our infra team. The Discourse team also offers discounted hosting for open source projects. Depending on the arrangement they offer, it may be really cheap for us even with a fully managed instance.
> Nick

Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


From nicholas.chammas at  Sat Jan 30 13:06:47 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sat, 30 Jan 2016 18:06:47 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Agreed. If you read through that thread about Discourse hosting for open
source projects, you'll see that Jeff made a point of stressing just how
easy it is to manage a Discourse instance.

Still, my bias is to delegate where possible, even if the task is light, to
reduce the psychological burden of being responsible for something. Then
again, I'm not on the Python infra team, so I can't speak for them. If
they're (and I'm guessing you're part of the team, Donald?) OK with it,
then sure, it should be fine to manage the instance ourselves.


On Sat, Jan 30, 2016 at 12:59 PM Donald Stufft <donald at> wrote:

> Honestly, it's probably not super hard for us to get this running (and we
> might want to so we can piggyback on our Fastly CDN and such). Assuming it
> stores all of its persistent state inside of PostgreSQL then we're already
> running a central PostgreSQL server that we keep backed up. The biggest
> issue comes from software that wants to store persistent state on disk,
> since that makes it difficult to treat those machines as ephemeral.

From ethan at  Sat Jan 30 13:35:24 2016
From: ethan at (Ethan Furman)
Date: Sat, 30 Jan 2016 10:35:24 -0800
Subject: [Python-ideas] PEP 511: API for code transformers
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/30/2016 09:29 AM, Brett Cannon wrote:
 > On Fri, Jan 29, 2016, 11:57 Andrew Barnert wrote:

 >> Earlier I suggested that the import system documentation should have
 >> some simple examples of how to actually use the import system to
 >> write transforming hooks. Someone (Brett?) pointed out that it's a
 >> dangerous technique, and making it too easy for people to play with
 >> it without understanding it may be a bad idea. And they're probably
 >> right.

 > If we added an appropriate warning to the example I would be fine
 > adding one that covers how to add a custom loader.

That would be great!


From brett at  Sat Jan 30 14:03:11 2016
From: brett at (Brett Cannon)
Date: Sat, 30 Jan 2016 19:03:11 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

I've started a thread amongst python-ideas-owners to see if any of us can
lead the evaluation of HyperKitty versus Discourse and make sure PSF infra
is okay with hosting it (or just paying for hosting). If none of us has time
I will come back to the list to ask for someone to lead the evaluation.

On Sat, 30 Jan 2016 at 10:07 Nicholas Chammas <nicholas.chammas at> wrote:

> Agreed. If you read through that thread about Discourse hosting for open
> source projects, you'll see that Jeff made a point of stressing just how
> easy it is to manage a Discourse instance.
> Still, my bias is to delegate where possible, even if the task is light,
> to reduce the psychological burden of being responsible for something. Then
> again, I'm not on the Python infra team, so I can't speak for them. If
> they're (and I'm guessing you're part of the team, Donald?) OK with it,
> then sure, it should be fine to manage the instance ourselves.
> Nick
> On Sat, Jan 30, 2016 at 12:59 PM Donald Stufft <donald at> wrote:
>> Honestly, It?s probably not super hard for us to get this running (and we
>> might want to so we can piggyback on our Fastly CDN and such). Assuming it
>> stores all of it?s persistent state inside of PostgreSQL then we?re already
>> running a central PostgreSQL server that we keep backed up. The biggest
>> issue comes from software that wants to store persistent state on disk,
>> since that makes it difficult to treat those machines as empehereal.
>> On Jan 30, 2016, at 12:30 PM, Nicholas Chammas <
>> nicholas.chammas at> wrote:
>> Yeah, the voting is just a plugin which I presume you can enable or
>> disable as desired.
>> As for hosting, I agree it?s probably better to have someone else do that
>> so we can lessen the burden on our infra team. The Discourse team also
>> offers discounted hosting for open source projects
>> <>.
>> Depending on the arrangement they offer, it may be really cheap for us even
>> with a fully managed instance.
>> Nick
>> ?
>> On Sat, Jan 30, 2016 at 12:25 PM Guido van Rossum <guido at>
>> wrote:
>>> Oooh, Discourse looks and sounds good. Hopefully we can opt out from
>>> voting, everything else looks just right. I recommend requesting some
>>> PSF money for a fully-hosted instance, so nobody has to suffer when
>>> it's down, security upgrades will be taken care of, etc.
>>> On Sat, Jan 30, 2016 at 8:48 AM, Nicholas Chammas
>>> <nicholas.chammas at> wrote:
>>> > To follow up on one of the suggestions Brett and Donald made, I think
>>> the
>>> > best solution today for a modern discussion forum is Discourse.
>>> >
>>> > Discourse is built by some of the same people who built Stack Overflow,
>>> > including Jeff Atwood. Among the many excellent features it has is full
>>> > support for a ?mailing list mode?, where you can reply to and start new
>>> > conversations entirely via email. That may be important for people who
>>> are
>>> > not interested in using the web for their conversations.
>>> >
>>> > Discourse doesn?t currently have a voting plugin, but here is an
>>> interesting
>>> > discussion about adding one. Just earlier today a member of the
>>> Discourse
>>> > team followed-up on that discussion with a detailed proposal to make
>>> the
>>> > plugin real.
>>> >
>>> > As an example of the polish Discourse already has, consider this
>>> remark by
>>> > Random832:
>>> >
>>> > It would be interesting to make a forum with a ?spin-off thread?
>>> feature,
>>> > which would automate the placement of the reply in a new thread and a
>>> note
>>> > in the old thread with a link to the new one.
>>> >
>>> > If you look at the post I linked to about adding a voting plugin, you
>>> can
>>> > see just this kind of link offered by Discourse since the poster spun
>>> that
>>> > new thread from an existing one.
>>> >
>>> > A large open source community using Discourse today is Docker. If
>>> Donald
>>> > sets up a Discourse instance for Packaging, that should serve as a good
>>> > trial for us to deploy it elsewhere. I suspect it will be a success.
>>> >
>>> > As for hosting, there are many options that range from free but
>>> > self-managed, to fully managed for a monthly fee.
>>> >
>>> > Nick
>>> >
>>> >
>>> > On Sat, Jan 30, 2016 at 10:50 AM Random832 <random832 at>
>>> wrote:
>>> >>
>>> >> Chris Angelico <rosuav at> writes:
>>> >> > How do you change the subject line to indicate that the topic has
>>> >> > drifted (or is a spin-off), while still appropriately quoting the
>>> >> > previous post?
>>> >>
>>> >> You're free to quote any post anywhere, even if you make a new
>>> >> thread. In general to do this you have to start your reply in the
>>> >> original thread, then copy/paste the quote markup (which includes a
>>> >> magic link to the post you are quoting) into the post new thread form.
>>> >>
>>> >> It would be interesting to make a forum with a "spin-off thread"
>>> >> feature, which would automate the placement of the reply in a new
>>> thread
>>> >> and a note in the old thread with a link to the new one.
>>> >>
>>> >> But in most cases this can't be automated because on better-managed
>>> >> forums once a digression has grown large enough to need a separate
>>> >> thread, the forum's moderators will move earlier posts about it
>>> >> (originally made in the first thread) to the new thread. (it might be
>>> >> interesting to make a forum that provides a way to have a post live in
>>> >> two different threads at the same time)
>>> >>
>>> >> _______________________________________________
>>> >> Python-ideas mailing list
>>> >> Python-ideas at
>>> >>
>>> >> Code of Conduct:
>>> >
>>> >
>>> > _______________________________________________
>>> > Python-ideas mailing list
>>> > Python-ideas at
>>> >
>>> > Code of Conduct:
>>> --
>>> --Guido van Rossum (
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at
>> Code of Conduct:
>> -----------------
>> Donald Stufft
>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372
>> _______________________________________________
> Python-ideas mailing list
> Python-ideas at
> Code of Conduct:
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

From barry at  Sat Jan 30 16:02:23 2016
From: barry at (Barry Warsaw)
Date: Sat, 30 Jan 2016 16:02:23 -0500
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

On Jan 29, 2016, at 08:35 PM, Guido van Rossum wrote:

>not everybody is a wizard at managing high volume mailing list traffic).

Which is why for me, Gmane is an indispensable tool, along with a decent NNTP

I subscribe to python-ideas and python-dev so I can post to them, but I nomail
python-ideas (not yet python-dev) so my inbox doesn't get cluttered.  Then I
read the Gmane newsgroups and as this message shows, can easily post to threads
I care about.  I can kill-thread any I don't.  Plus, I read them when I have
time and can ignore them when I don't.

I appreciate this isn't a solution for everyone, but it allows me to stay
engaged on my own terms and not get overwhelmed by Python email traffic.


From barry at  Sat Jan 30 16:17:26 2016
From: barry at (Barry Warsaw)
Date: Sat, 30 Jan 2016 16:17:26 -0500
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

Two big problems with moving primary discussions off the mailing list are
discoverability and community fracture.

Any new forum will mean another login, a new work flow, another slice of the
ever diminishing attention pie, and discussions that occur both on the
traditional site and the new site.  Some people will miss the big announcement
about the new forum.  There will be lots of cross-posting because people won't
know for sure which ones the people who need to be involved frequent.

For example, many years ago I missed a discussion about something I cared
about and only accidentally took notice when I saw a commit message in my
inbox.  When I asked about why the issue had never been mentioned on
python-dev, I was told that everything was hashed out in great detail on the
tracker.  I didn't even realize that I wasn't getting email notifications of
new tracker issues, so I never saw it until it was too late.

I've seen other topics discussed primarily on G+, for which I have an account,
but rarely pay attention to.  I don't even know if it's still "a thing".
Maybe everyone's moved to Slack by now.  How many different channels do I have
to engage with to keep track of what's happening in core Python?

This isn't GOML and I'm all for experimentation, but I do urge caution.
Otherwise we might just wonder why we haven't heard from Uncle Timmy in a
while.


From donald at  Sat Jan 30 16:19:50 2016
From: donald at (Donald Stufft)
Date: Sat, 30 Jan 2016 16:19:50 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

> On Jan 30, 2016, at 4:17 PM, Barry Warsaw <barry at> wrote:
> Any new forum will mean another login, a new work flow, another slice of the
> ever diminishing attention pie, and discussions that occur both on the
> traditional site and the new site.  Some people will miss the big announcement
> about the new forum.  There will be lots of cross-posting because people won't
> know for sure which ones the people who need to be involved frequent.

For what it's worth, another thing I want to do is set up and
consolidate all the logins :)

Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


From phd at  Sat Jan 30 16:29:25 2016
From: phd at (Oleg Broytman)
Date: Sat, 30 Jan 2016 22:29:25 +0100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>


On Sat, Jan 30, 2016 at 04:17:26PM -0500, Barry Warsaw <barry at> wrote:
> Two big problems with moving primary discussions off the mailing list are
> discoverability and community fracture.
> Any new forum will mean another login, a new work flow, another slice of the
> ever diminishing attention pie, and discussions that occur both on the
> traditional site and the new site.  Some people will miss the big announcement
> about the new forum.  There will be lots of cross-posting because people won't
> know for sure which ones the people who need to be involved frequent.
> For example, many years ago I missed a discussion about something I cared
> about and only accidentally took notice when I saw a commit message in my
> inbox.  When I asked about why the issue had never been mentioned on
> python-dev, I was told that everything was hashed out in great detail on the
> tracker.  I didn't even realize that I wasn't getting email notifications of
> new tracker issues, so I never saw it until it was too late.
> I've seen other topics discussed primarily on G+, for which I have an account,
> but rarely pay attention to.  I don't even know if it's still "a thing".
> Maybe everyone's moved to Slack by now.

   Or to gitter...

> How many different channels do I have
> to engage with to keep track of what's happening in core Python?
> This isn't GOML

   GOML? Get Off my Mailing List? ;-)

> and I'm all for experimentation, but I do urge caution.
> Otherwise we might just wonder why we haven't heard from Uncle Timmy in a
> while.
> Cheers,
> -Barry

     Oleg Broytman              phd at
           Programmers don't die, they just GOSUB without RETURN.

From nicholas.chammas at  Sat Jan 30 17:02:18 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sat, 30 Jan 2016 22:02:18 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

> Two big problems with moving primary discussions off the mailing list are
> discoverability and community fracture.

Agreed, though consider: Community fracture is always a risk when changing
the discussion medium. On balance, we have to consider whether the risk is
outweighed by the benefits of a new medium.

As for discoverability, let me make a brief case for why Discourse is head
and shoulders above mailing lists.

Among many well-designed features <>,
Discourse has the following things going for it in the discoverability

   - You can mention people by @name from posts and they'll get notified,
   like on GitHub. No need to wonder if people will miss something because
   they haven't set up their email filters correctly.
   - We can unify the various lists under a single forum and separate
   discussions with categories. This would hopefully lend itself better to
   cross-pollination of discussions across different categories (e.g. ideas
   vs. dev), while still letting people narrow their focus to a single
   category if that's what they want. For examples of how categories can be
   used, see Discourse Meta <> and this category
   on the Docker forum <>.
   - People starting new posts on Discourse automatically get shown
   potentially related discussions, similar to what Stack Overflow does. It
   makes it much harder to miss or forget to look for prior discussions before
   starting a new one. Naturally, generalized search is also a first-class
   - Regarding the potential proliferation of logins, Discourse supports
   single sign-on
   so if we want we can let people login with Google, GitHub, or perhaps even
   a Python-owned identity provider.

These features (and others <>) are really
well-executed, as you would expect coming from Jeff Atwood and others who
left Stack Overflow to create Discourse.

Finally, as a web-based forum, Discourse takes the burden off of users
having to each independently come up with a toolchain that makes things
manageable for them. Solutions to common problems like notification,
finding prior discussions, and so forth, are implemented centrally, and all
users automatically benefit. It's really hard to offer that with a mailing
list.

And to top it all off, if for whatever reason you hate web forums,
Discourse has a "mailing list mode" which lets you respond to and start
discussions entirely via email, without affecting the web-based forum.


On Sat, Jan 30, 2016 at 4:22 PM Donald Stufft <donald at> wrote:

> > On Jan 30, 2016, at 4:17 PM, Barry Warsaw <barry at> wrote:
> >
> > Any new forum will mean another login, a new work flow, another slice of the
> > ever diminishing attention pie, and discussions that occur both on the
> > traditional site and the new site.  Some people will miss the big announcement
> > about the new forum.  There will be lots of cross-posting because people won't
> > know for sure which ones the people who need to be involved frequent.
>
> For what it's worth, another thing I want to do is set up and
> consolidate all the logins :)

From greg.ewing at  Sat Jan 30 17:09:45 2016
From: greg.ewing at (Greg Ewing)
Date: Sun, 31 Jan 2016 11:09:45 +1300
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Nicholas Chammas wrote:
> Discourse is built by some of the same people who built Stack Overflow, 
> including Jeff Atwood 
> <>. 
> Among the many excellent features <> it 
> has is full support for a "mailing list mode", where you can reply to
> and start new conversations entirely via email.

If such a move were made, some kind of email or usenet gateway
would be an *essential* feature for me to continue participating.
I don't have enough time or energy to chase down multiple web
forums every day and wrestle with their clunky interfaces.

One of the usenet groups I used to follow (
is now effectively dead since everyone abandoned it for a web
forum that I can't easily follow. I'd be very sad if anything
like that happened to the main Python groups.


From barry at  Sat Jan 30 17:19:33 2016
From: barry at (Barry Warsaw)
Date: Sat, 30 Jan 2016 17:19:33 -0500
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

On Jan 30, 2016, at 10:02 PM, Nicholas Chammas wrote:

>As for discoverability, let me make a brief case for why Discourse is head
>and shoulders above mailing lists.

To be clear, I'm a fan of Discourse, and would be happy to see an SSO'd instance of it for experimentation purposes.

However, I would be really upset if major decisions were made in some
Discourse thread.  There's a reason why PEP 1 requires posting to python-dev,
and specifies headers like Discussions-To, Post-History, and Resolution.

Some features, which I'd call "tangential" to core language design do indeed
happen elsewhere primarily.  asyncio and the distutils-stack come to mind.
And I think that's fine.  But it's also important to post to python-dev at
certain milestones or critical junctures because that's what *everyone* knows
as the central place for coordinating development.


From steve at  Sat Jan 30 17:47:19 2016
From: steve at (Steven D'Aprano)
Date: Sun, 31 Jan 2016 09:47:19 +1100
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 30, 2016 at 06:19:35AM -0500, Ned Batchelder wrote:

> While we're at it though, re-importing __main__ is a separate kind of 
> behavior that is often a problem, since it means you'll have the same 
> classes defined twice.

As far as I can tell, importing __main__ is fine. It's only when you 
import __main__ AND the main module under its real name at the same time 
that you can run into problems -- and even then, not always. The sort of 
errors I've seen involve something like this:

import myscript
import __main__  # this is actually myscript
a = myscript.TheClass()
# later
assert isinstance(a, __main__.TheClass)

which fails, because myscript and __main__ don't share state, despite 
actually coming from the same source file.

So I think it's pretty rare for something like this to actually happen. 
I've never seen it happen by accident, I've only seen it done 
deliberately as a counter-example to the "modules are singletons"
rule.


From steve at  Sat Jan 30 18:19:16 2016
From: steve at (Steven D'Aprano)
Date: Sun, 31 Jan 2016 10:19:16 +1100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 30, 2016 at 10:29:25PM +0100, Oleg Broytman wrote:

> > This isn't GOML
>    GOML? Get Off my Mailing List? ;-)

"Get off my lawn!", traditionally yelled by grumpy old men at kids 
playing. Figuratively means "this is new, therefore I hate it".


From njs at  Sat Jan 30 18:25:44 2016
From: njs at (Nathaniel Smith)
Date: Sat, 30 Jan 2016 15:25:44 -0800
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 30, 2016 at 2:47 PM, Steven D'Aprano <steve at> wrote:
> On Sat, Jan 30, 2016 at 06:19:35AM -0500, Ned Batchelder wrote:
>> While we're at it though, re-importing __main__ is a separate kind of
>> behavior that is often a problem, since it means you'll have the same
>> classes defined twice.
> As far as I can tell, importing __main__ is fine. It's only when you
> import __main__ AND the main module under its real name at the same time
> that you can run into problems -- and even then, not always. The sort of
> errors I've seen involve something like this:
> import myscript
> import __main__  # this is actually myscript
> a = myscript.TheClass()
> # later
> assert isinstance(a, __main__.TheClass)
> which fails, because myscript and __main__ don't share state, despite
> actually coming from the same source file.
> So I think it's pretty rare for something like this to actually happen.
> I've never seen it happen by accident, I've only seen it done
> deliberately as a counter-example to the "modules are singletons"
> rule.

Not only is importing __main__ fine, it's actually unavoidable...
__main__ is just an alias to the main script's namespace.

-- --
class Foo:
    pass

import __main__
# Prints "True"
print(Foo is __main__.Foo)
-- end --

So importing __main__ never creates any new copies of any singletons;
it's only importing the main script under its filesystem name that
creates the problem.
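
To make the distinction concrete, here is a minimal sketch that runs a
throwaway script and has it import both __main__ and itself under its
filesystem name. The names demo_script and run_demo are made up for
illustration, and the sketch assumes the default behaviour where the
script's directory is placed first on sys.path.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Source of a hypothetical script, demo_script.py, that imports itself twice:
# once as __main__ and once under its filesystem name.
SCRIPT = textwrap.dedent("""
    import sys

    class Foo:
        pass

    if __name__ == "__main__":
        import __main__
        import demo_script  # executes this file again as a second module
        print(Foo is __main__.Foo)
        print(Foo is demo_script.Foo)
        print(sys.modules["__main__"] is sys.modules["demo_script"])
""")


def run_demo():
    """Write the script to a temporary directory, run it, return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo_script.py"
        path.write_text(SCRIPT)
        result = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True, text=True, check=True,
        )
    return result.stdout.split()


if __name__ == "__main__":
    print(run_demo())  # ['True', 'False', 'False']
```

The first line of output is the "import __main__" case, which hands back
the module that is already running. The second and third show that the
import by file name produced a separate module object with its own copy of
Foo.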


Nathaniel J. Smith --

From nicholas.chammas at  Sat Jan 30 18:29:06 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sat, 30 Jan 2016 23:29:06 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

> To be clear, I'm a fan of Discourse, and would be happy to see an SSO'd
> instance of it for experimentation purposes.


Perhaps Donald's suggestion of starting a Discourse instance for Packaging
is the easiest way to evaluate it and give people time to kick the tires
and see what they think. I'm guessing that will be discussed on
distutils-sig? (This is part of the problem of mailing lists vs. a unified
forum. :-)


On Sat, Jan 30, 2016 at 6:19 PM Steven D'Aprano <steve at> wrote:

> On Sat, Jan 30, 2016 at 10:29:25PM +0100, Oleg Broytman wrote:
> > > This isn't GOML
> >
> >    GOML? Get Off my Mailing List? ;-)
> "Get off my lawn!", traditionally yelled by grumpy old men at kids
> playing. Figuratively means "this is new, therefore I hate it".
> --
> Steve

From guido at  Sat Jan 30 19:09:51 2016
From: guido at (Guido van Rossum)
Date: Sat, 30 Jan 2016 16:09:51 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sat, Jan 30, 2016 at 1:17 PM, Barry Warsaw <barry at> wrote:
> For example, many years ago I missed a discussion about something I cared
> about and only accidentally took notice when I saw a commit message in my
> inbox.  When I asked about why the issue had never been mentioned on
> python-dev, I was told that everything was hashed out in great detail on the
> tracker.  I didn't even realize that I wasn't getting email notifications of
> new tracker issues, so I never saw it until it was too late.

That might have been a lapse in judgement for that particular issue?
Without the tracker we'd be utterly inundated in minutiae on
python-dev. Occasionally I see folks redirecting a discussion from
python-ideas or python-dev to the tracker, and vice versa, and in
general I think the line is pretty clear there and people do the right
thing.

Honestly I wouldn't want to replace python-dev for decisions, but I
know several core devs left python-ideas because it was too noisy for
them, and I think plenty of stuff on python-ideas would be totally
appropriate for some other forum (I often mute threads myself). My
rule is that if something's PEP-worthy it needs to be mentioned on
python-dev, even if most of the discussion is elsewhere (whether it's
a dedicated SIG or a specific tracker on GitHub). It seems reasonable
that python-dev should be involved early on, when the discussion is
just starting, and again close to the end, before decisions are cast
in stone. But I'm glad we don't have to do everything there.

> I've seen other topics discussed primarily on G+, for which I have an account,
> but rarely pay attention to. I don't even know if it's still "a thing".

Fortunately, G+ is dead. "Social media" as it's now known just isn't a
good place for this type of discussion.

> Maybe everyone's moved to Slack by now.  How many different channels do I have
> to engage with to keep track of what's happening in core Python?

A lot of stuff used to (or still does) happen in IRC, which (as you
know) I utterly hate and can't stand. But chat systems still serve a
purpose, and if people want to use them we can't stop them. But we can
have a written standard for how to handle major decisions, and I see
nothing wrong with the standards we currently have written up in PEP
1. I don't think whatever is being proposed here is going against
those rules (remember you're reading this in python-ideas, not
python-dev :-).

> This isn't GOML and I'm all for experimentation, but I do urge caution.
> Otherwise we might just wonder why we haven't heard from Uncle Timmy in a
> while.

Tim seems to have great filters though -- whenever someone says
"float" or "datetime" (or "farmville" :-) he perks up his ears.

--Guido van Rossum (

From ben+python at  Sat Jan 30 19:53:31 2016
From: ben+python at (Ben Finney)
Date: Sun, 31 Jan 2016 11:53:31 +1100
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

Barry Warsaw <barry at> writes:

> But it's also important to post to python-dev at certain milestones or
> critical junctures because that's what *everyone* knows as the central
> place for coordinating development.

And importantly, with a PSF mailing list or PSF bug tracker or PSF code
review system, etc., collaboration with the rest of the group doesn't
require an account with some particular organisation not accountable to

 \          "The best way to get information on Usenet is not to ask a |
  `\               question, but to post the wrong information." -Aahz |
_o__)                                                                  |
Ben Finney

From ned at  Sat Jan 30 19:58:49 2016
From: ned at (Ned Batchelder)
Date: Sat, 30 Jan 2016 19:58:49 -0500
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On 1/30/16 5:47 PM, Steven D'Aprano wrote:
> On Sat, Jan 30, 2016 at 06:19:35AM -0500, Ned Batchelder wrote:
>> While we're at it though, re-importing __main__ is a separate kind of
>> behavior that is often a problem, since it means you'll have the same
>> classes defined twice.
> As far as I can tell, importing __main__ is fine. It's only when you
> import __main__ AND the main module under its real name at the same time
> that you can run into problems -- and even then, not always. The sort of
> errors I've seen involve something like this:
> import myscript
> import __main__  # this is actually myscript
> a = myscript.TheClass()
> # later
> assert isinstance(a, __main__.TheClass)
> which fails, because myscript and __main__ don't share state, despite
> actually coming from the same source file.
> So I think it's pretty rare for something like this to actually happen.
> I've never seen it happen by accident, I've only seen it done
> deliberately as a counter-example to the "modules are singletons"
> rule.

Something like this does happen in the real world.  A class is defined 
in the main module, and then the module is later imported with its real 
name.  Now you have __main__.Class and module.Class both defined.  You 
don't need to actually "import __main__" for it to happen. 
__main__.Class is used implicitly from the main module simply as Class.


From stephen at  Sat Jan 30 19:55:44 2016
From: stephen at (Stephen J. Turnbull)
Date: Sun, 31 Jan 2016 09:55:44 +0900
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Guido van Rossum writes:

 > Oooh, Discourse looks and sounds good. Hopefully we can opt out from
 > voting, everything else looks just right.

Random832 is right: you need some kind of voting to have forum-curated
reputations.

From ncoghlan at  Sun Jan 31 01:44:53 2016
From: ncoghlan at (Nick Coghlan)
Date: Sun, 31 Jan 2016 16:44:53 +1000
Subject: [Python-ideas] Prevent importing yourself?
In-Reply-To: <>
References: <>
Message-ID: <>

On 30 January 2016 at 23:20, Oscar Benjamin <oscar.j.benjamin at> wrote:
> On 30 January 2016 at 11:57, Nick Coghlan <ncoghlan at> wrote:
>> On 30 January 2016 at 21:19, Ned Batchelder <ned at> wrote:
>>> On 1/30/16 4:30 AM, Nick Coghlan wrote:
>>>> We could potentially detect when __main__ is being reimported under a
>>>> different name and issue a user visible warning when it happens, but
>>>> we can't readily detect a file importing itself in the general case
>>>> (since it may be an indirect circular reference rather than a direct).
>>> I thought about the indirect case, and for the errors I'm trying to make
>>> clearer, the direct case is plenty.
>> In that case, the only problem I see off the top of my head with
>> emitting a warning for direct self-imports is that it would rely on
>> import system behaviour we're currently trying to reduce/minimise: the
>> import machinery needing visibility into the globals for the module
>> initiating the import.
>> It's also possible that by the time we get hold of the __spec__ for
>> the module being imported, we've already dropped our reference to the
>> importing module's globals, so we can't check against __file__ any
>> more. However, I'd need to go read the code to remember how quickly we
>> get to extracting just the globals of potential interest.
> Maybe this is because I don't really understand how the import
> machinery works but I would say that if I run
>     $ python
> Then the interpreter should be able to know that __main__ is called
> "random" and know the path to that file. It should also be evident if
> '' is at the front of sys.path then "import random" is going to import
> that same module. Why is it difficult to detect that case?

Yes, this is the case I originally said we could definitely detect.
The case I don't know if we can readily detect is the one where a
module *other than __main__* is imported a second time under a
different name. However, I'm not sure that latter capability would be
at all useful, so it probably doesn't matter whether or not it's


Nick Coghlan   |   ncoghlan at   |   Brisbane, Australia
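
As a rough illustration of the "definitely detectable" case discussed
above (a script whose name matches a module it then imports), the following
sketch compares the file that an import would load against the file running
as __main__. It is a user-level approximation under stated assumptions, not
how the import machinery does or would implement such a warning. The
function name would_reimport_main and its main_file parameter are
hypothetical.

```python
import importlib
import importlib.util
import os
import sys


def would_reimport_main(module_name, main_file=None):
    """Return True if importing module_name would load the __main__ file.

    main_file is an illustrative override for testing; by default the path
    is taken from sys.modules["__main__"].__file__.
    """
    if main_file is None:
        main = sys.modules.get("__main__")
        main_file = getattr(main, "__file__", None)
    if not main_file or not os.path.exists(main_file):
        return False  # e.g. an interactive session or "python -c"
    importlib.invalidate_caches()
    try:
        # find_spec locates a top-level module without executing it.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return False
    origin = getattr(spec, "origin", None)
    if not origin or not os.path.exists(origin):
        return False  # built-in, frozen, or namespace package
    return os.path.samefile(origin, main_file)
```

Called near the top of a script named after a standard library module,
would_reimport_main with that module's name would return True whenever the
script's own directory shadows the standard library on sys.path. The sketch
only handles top-level names backed by a regular file; for dotted names,
find_spec imports the parent packages as a side effect.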

From nicholas.chammas at  Sun Jan 31 10:16:56 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sun, 31 Jan 2016 15:16:56 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

> And importantly, with a PSF mailing list or PSF bug tracker or PSF code
> review system, etc., collaboration with the rest of the group doesn't
> require an account with some particular organisation not accountable to

A quick comment on this, in case anyone thinks Discourse falls in this

Discourse the forum software is 100% open source
<>. We can run our own instance on
our own hardware, or we can have someone else run it for us (like the
Discourse team themselves). And identity is pluggable, so we can have
something like be the identity provider for our Discourse
instance, regardless of where it's hosted.

Discourse is not like Google Groups where a company we don't control can
decide to shut down our forum service, or where we are forced to create
accounts with a third-party in order to hold discussions. Every piece of
Discourse would be completely under our control.

> Random832 is right: you need some kind of voting to have forum-curated
> reputations.

A quick distinction here: Voting, at least on Discourse, will be a separate
intended to let people do things like vote on proposals.

Reputation -- or, as Discourse calls it, trust levels -- is its own thing,
and comes built in to Discourse
<>. Similar
to how Stack Overflow works, as you gain trust within the community, new
abilities become unlocked.

For example, users at trust level 0
(i.e. brand new users) cannot send private messages to other users and
cannot add attachments to their posts. These defaults are configurable by
the forum admin. As they participate in the community, their trust level
goes up and formerly-locked abilities become available

People who have been around for ages and who are already trusted can be
manually promoted to the highest trust level
which effectively makes them forum moderators.

If this sounds interesting to you, I recommend reading through the Discourse
trust levels
to get a good sense of how Discourse views community building. It's really
well thought out, IMO, and is informed by the authors' experience building
Stack Overflow.


On Sat, Jan 30, 2016 at 8:30 PM Stephen J. Turnbull <stephen at>

> Guido van Rossum writes:
>  > Oooh, Discourse looks and sounds good. Hopefully we can opt out from
>  > voting, everything else looks just right.
> Random832 is right: you need some kind of voting to have forum-curated
> reputations.

From skrah.temporarily at  Sun Jan 31 10:47:58 2016
From: skrah.temporarily at (Stefan Krah)
Date: Sun, 31 Jan 2016 15:47:58 +0000 (UTC)
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

Nicholas Chammas <nicholas.chammas at ...> writes:
> If this sounds interesting to you, I recommend reading through the
> Discourse trust levels to get a good sense of how Discourse views community
> building. It's really well thought out, IMO, and is informed by the authors'
> experience building Stack Overflow.

It does not sound interesting at all -- Python development is increasingly
turning into a circus, with fewer and fewer people actually writing code.

Stefan Krah 

From nicholas.chammas at  Sun Jan 31 11:19:12 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sun, 31 Jan 2016 16:19:12 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

To be clear, I'm not on python-dev and am not advocating we replace that
list with Discourse.

I'm just making the case for why Discourse would be a good candidate for
the other discussion venues we've been talking about in this thread (e.g.
packaging, python-ideas), where people are open to trying out a new medium.

The basic idea is that investing in a better medium and better tooling
fosters better discussions, which benefits Python the community and
ultimately also Python the code base. I wouldn't call that a circus
activity.

But then again, I'm relatively new to the Python community; perhaps most
people on here find this kind of meta-discussion unproductive.

On Sun, Jan 31, 2016 at 10:48 AM Stefan Krah <skrah.temporarily at>

> Nicholas Chammas <nicholas.chammas at ...> writes:
> > If this sounds interesting to you, I recommend reading through the
> > Discourse trust levels to get a good sense of how Discourse views community
> > building. It's really well thought out, IMO, and is informed by the
> > authors' experience building Stack Overflow.
> It does not sound interesting at all -- Python development is increasingly
> turning into a circus, with fewer and fewer people actually writing code.
> Stefan Krah

From barry at  Sun Jan 31 12:40:36 2016
From: barry at (Barry Warsaw)
Date: Sun, 31 Jan 2016 12:40:36 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 30, 2016, at 04:09 PM, Guido van Rossum wrote:

>On Sat, Jan 30, 2016 at 1:17 PM, Barry Warsaw <barry at> wrote:
>> For example, many years ago I missed a discussion about something I cared
>> about and only accidentally took notice when I saw a commit message in my
>> inbox.  When I asked about why the issue had never been mentioned on
>> python-dev, I was told that everything was hashed out in great detail on the
>> tracker.  I didn't even realize that I wasn't getting email notifications of
>> new tracker issues, so I never saw it until it was too late.  
>That might have been a lapse in judgement for that particular issue?

I actually don't remember the details, just that it happened.  But the fact
that a lot of smaller details are discussed primarily or solely on the tracker
is totally fine and all good!  I think we do a much better job of advertising
it now.  Hopefully everyone knows that to stay involved at that level of
detail, sign up for new-issue notifications and nosey yourself in on the
topics you care about.

Agreed with you about the rest of what you said, except perhaps for:

>A lot of stuff used to (or still does) happen in IRC, which (as you
>know) I utterly hate and can't stand. But chat systems still serve a
>purpose, and if people want to use them we can't stop them.

Yep.  We have similar discussions internally.

I actually don't mind IRC since I have a good client (bip + Emacs/ERC) and I
do live on dozens of channels for work.  It can get a little spammy at times,
but I find them relatively effective at getting or giving focused, short-term
help.  IRC doesn't work as well for bigger collaborations.  But IRC does have
the advantage of being totally open and accessible via numerous clients, so
information can't be too exclusive or owned.


From mal at  Sun Jan 31 13:22:38 2016
From: mal at (M.-A. Lemburg)
Date: Sun, 31 Jan 2016 19:22:38 +0100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Would it be possible to provide an integration of Mailman with
Discourse?

I know that Discourse already provides quite a few mailing-list-like
features, but there are still a few issues, which an
integration like the existing NNTP gateway of Mailman could
likely help resolve:

Esp. the inline reply style (problem 4) mentioned there seems
like a show stopper for the way we are used to working here
and on other Python MLs.

There already is a grant for improving Discourse for some of these

If both Discourse and Mailman can live side-by-side, with
Discourse being the "web interface" to the Mailman list,
I think we'd get the best of both worlds.

Marc-Andre Lemburg


From brett at  Sun Jan 31 13:27:22 2016
From: brett at (Brett Cannon)
Date: Sun, 31 Jan 2016 18:27:22 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, 31 Jan 2016 at 08:19 Nicholas Chammas <nicholas.chammas at>

> To be clear, I'm not on python-dev and am not advocating we replace that
> list with Discourse.
> I'm just making the case for why Discourse would be a good candidate for
> the other discussion venues we've been talking about in this thread (e.g.
> packaging, python-ideas), where people are open to trying out a new medium.
> The basic idea is that investing in a better medium and better tooling
> fosters better discussions, which benefits Python the community and
> ultimately also Python the code base. I wouldn't call that a circus
> activity.
> But then again, I'm relatively new to the Python community; perhaps most
> people on here find this kind of meta-discussion unproductive.

It should happen on occasion, just not regularly. :)

Keeping an open source project running is part technical, part social
(which makes it part political :). That social bit means having to
occasionally evaluate how we are managing our communication amongst not
just long-time participants but also new ones. This means we have to
sometimes look at what kids in university are using in order to entice
them to participate (heck, even high school at this rate). For instance,
Barry has mentioned NNTP as part of his solution to managing his mail
relating to Python. But go into any university around the world and ask
some CS student, "what is Usenet?" -- let alone NNTP -- and it's quite
possible you will get a blank stare. This is why I don't call it
comp.lang.python anymore but python-list at (same goes for IRC,
but it's probably known a lot more widely than Usenet). What this means is
we occasionally have to evaluate whether our ways of communicating are too
antiquated for new participants in open source and whether they are no
longer the most effective (because old does not mean bad, but it does not
mean better either), while balancing it with not having constant churn or
inadvertently making things worse. Toss in people's principled stances on
open source and it leads to a heated discussion.

For instance, people have said they don't want to set up another account.
But people forget that *every* mailing list on requires its
own account to post (I personally have near a bazillion at this point). And
while the archives and gmane give you anonymous access to read without an
account, so does Discourse or any of the other solutions being discussed
(no one wants to wall off the archives or make it so we can't keep a hold
of our data in case of another move).

It's the usual issue of having to get down to the root of the issue as to
why people would want to stay with the mailing list vs. why others would
want to switch to Discourse. Finding out the fundamental reasons and taking
out the emotion of the discussion is usually the key to helping solve this
sort of grounded discussion (at which point you can start ignoring those
who can't remove the emotion).

And in the case of people worrying about bifurcating the discussions, the
python-ideas mailing list would simply be shut down to new email and its
archive left up to prevent a split in audience if we do end up changing
things up.

> On Sun, Jan 31, 2016 at 10:48 AM Stefan Krah <skrah.temporarily at>
> wrote:
>> Nicholas Chammas <nicholas.chammas at ...> writes:
>> > If this sounds interesting to you, I recommend reading through the
>> > Discourse trust levels to get a good sense of how Discourse views
>> > community building. It's really well thought out, IMO, and is informed
>> > by the authors' experience building Stack Overflow.
>> It does not sound interesting at all -- Python development is increasingly
>> turning into a circus, with fewer and fewer people actually writing code.
>> Stefan Krah

From guido at  Sun Jan 31 13:35:45 2016
From: guido at (Guido van Rossum)
Date: Sun, 31 Jan 2016 10:35:45 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 31, 2016 at 9:40 AM, Barry Warsaw <barry at> wrote:
> But IRC does have
> the advantage of being totally open and accessible via numerous clients, so
> information can't be too exclusive or owned.

Maybe the software is totally open, but the community doesn't feel
that way. When I forayed into it briefly, it felt hostile to people who
don't have the right personality to be online 24/7.

--Guido van Rossum

From barry at  Sun Jan 31 13:36:18 2016
From: barry at (Barry Warsaw)
Date: Sun, 31 Jan 2016 13:36:18 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 31, 2016, at 07:22 PM, M.-A. Lemburg wrote:

>Would it be possible to provide an integration of Mailman with
>Discourse ?

Possible, I don't know, but on the wish list, yes!  None of the core Mailman
developers have time for this, but we would gladly help and work with anybody
who wanted to look into this.

>If both Discourse and Mailman can live side-by-side, with
>Discourse being the "web interface" to the Mailman list,
>I think we'd get the best of both worlds.

Definitely.  Also note that we'd like to build NNTP and IMAP support into
Mailman, though again it's a lack of resources.

If anybody wants to work on these areas, please contact us over in
mailman-developers at

From barry at  Sun Jan 31 13:39:41 2016
From: barry at (Barry Warsaw)
Date: Sun, 31 Jan 2016 13:39:41 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 31, 2016, at 10:35 AM, Guido van Rossum wrote:

>Maybe the software is totally open, but the community doesn't feel
>that way. When I forayed into it briefly, it felt hostile to people who
>don't have the right personality to be online 24/7.

It's probably a lot like FLOSS communities in general.  Some are very open,
patient, and accepting, and others aren't.  Maybe we're spoiled here in the
Pythonia.  :)

From donald at  Sun Jan 31 13:54:16 2016
From: donald at (Donald Stufft)
Date: Sun, 31 Jan 2016 13:54:16 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

> On Jan 31, 2016, at 1:39 PM, Barry Warsaw <barry at> wrote:
> On Jan 31, 2016, at 10:35 AM, Guido van Rossum wrote:
>> Maybe the software is totally open, but the community doesn't feel
>> that way. When I forayed into it briefly, it felt hostile to people who
>> don't have the right personality to be online 24/7.
> It's probably a lot like FLOSS communities in general.  Some are very open,
> patient, and accepting, and others aren't.  Maybe we're spoiled here in the
> Pythonia.  :)

Eh, I think IRC as a protocol tends to be hostile to people who can't have some
method of being online 24/7 (even if it's via a bouncer and they aren't
physically there). I think it's why you see more projects using things like
Slack or gitter instead of IRC. You can sort of recreate some of this using
log bots and/or bouncers and the like, but I think one of the things we're
seeing across all of F/OSS is that for the newer generation of developers, UX
matters, in many cases more than F/OSS does and they're less willing to put up
with bad UX. I think it is why you see so many people developing software on
OS X that they plan to deploy to Linux, why you see people preferring GitHub
over other solutions, why Slack over IRC, etc.

Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

From barry at  Sun Jan 31 14:02:59 2016
From: barry at (Barry Warsaw)
Date: Sun, 31 Jan 2016 14:02:59 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Jan 31, 2016, at 01:54 PM, Donald Stufft wrote:

>Eh, I think IRC as a protocol tends to be hostile to people who can't have
>some method of being online 24/7

I wouldn't say "hostile" but certainly not nearly as useful.  On the flip
side, I've heard complaints from Slack users (I'm not one myself) that they
can get overwhelmed by notifications when they want to be "off the clock".

From ben+python at  Sun Jan 31 14:20:54 2016
From: ben+python at (Ben Finney)
Date: Mon, 01 Feb 2016 06:20:54 +1100
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

Brett Cannon <brett at> writes:

> For instance, people have said they don't want to set up another
> account.

The complaint expressed (by me, at least; perhaps others agree) was not
against setting up an account. As you point out, PSF mailing lists
already require creating accounts. It's against being required to
maintain a trusted relationship with some non-PSF-accountable entity, in
order to participate in some aspect of Python community.

I agree with others that a Discourse instance entirely controlled by PSF
would avoid that problem.

 \        "Consider the daffodil. And while you're doing that, I'll be |
  `\              over here, looking through your stuff." -Jack Handey |
_o__)                                                                  |
Ben Finney

From nicholas.chammas at  Sun Jan 31 16:11:53 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sun, 31 Jan 2016 21:11:53 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

> If both Discourse and Mailman can live side-by-side, with
> Discourse being the "web interface" to the Mailman list,
> I think we'd get the best of both worlds.

Funny you ask that, since I wondered about exactly the same thing when I
looked into using Discourse for an Apache project. The Apache Software
Foundation has a strict policy about ASF-owned mailing lists being the
place of discussion, so the only way Discourse would have been able to play
a role was as an interface to an existing, ASF-owned mailing list.

Here is the discussion I started about this
on Discourse Meta around a year ago.

In short, I think the answer that came out from that discussion is (quoting
Jeff Atwood; emphasis his):

This really depends on the culture of the mailing list. Discourse has
fairly robust email support (for notifications, and if configured, for
replies and email-in to start new topics), but it is still fundamentally
web-centric in the way that it views the world. There will be clashes for
people who are 100% email-centric.

Do you have support from the "powers that be" at said mailing lists to make
such a change? Are they asking for such a change? We are very open to
working with a partner on migrating mailing lists and further enhancing the
mailing list support in Discourse, but it very much requires solid support
from the *leadership* and a significant part of the *community*.

There's a lot of friction involved in changes for groups!


From nicholas.chammas at  Sun Jan 31 16:53:13 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sun, 31 Jan 2016 21:53:13 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

Brett wrote:

> What this means is we occasionally have to evaluate whether our ways of
> communicating are too antiquated for new participants in open source and
> whether they are no longer the most effective (because old does not mean
> bad, but it does not mean better either), while balancing it with not
> having constant churn or inadvertently making things worse.

Discourse aside, I'm really glad to see that people understand that this is
important to the long-term health of Python -- as a community and otherwise
-- and are willing to give it priority. (And I totally agree that
significant workflow changes, or discussions thereof, should happen
infrequently and be evaluated carefully for their cost and benefit over time.)

Donald wrote:

> I think one of the things we're seeing across all of F/OSS is that for the
> newer generation of developers, UX matters, in many cases more than F/OSS
> does and they're less willing to put up with bad UX.

I can attest to this personally, and I'll also offer this conjecture:

I don't think older generations of developers are intrinsically any more
tolerant of bad UX than the younger generations are. They hate bad UX too,
and they had to figure out their own solutions to make things better --
their email filters, their clients, their homegrown scripts, etc. -- when
nothing better was available, and eventually settled into a flow that
worked for them.


From skrah.temporarily at  Sun Jan 31 17:03:59 2016
From: skrah.temporarily at (Stefan Krah)
Date: Sun, 31 Jan 2016 22:03:59 +0000 (UTC)
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

Nicholas Chammas <nicholas.chammas at ...> writes:
> I can attest to this personally, and I'll also offer this conjecture:
> I don't think older generations of developers are intrinsically any more
tolerant of bad UX than the younger generations are. They hate bad UX too,
and they had to figure out their own solutions to make things better -- their
email filters, their clients, their homegrown scripts, etc. -- when nothing
better was available, and eventually settled into a flow that worked for them.

You are really getting on a soapbox here while having no clue at all
about basic mailing list etiquette like

  a) not top posting

  b) not full quoting the entire thread

  c) properly quoting your predecessors.

I guess we'll see more of that once the move to discourse has happened.

Stefan Krah

From donald at  Sun Jan 31 17:17:46 2016
From: donald at (Donald Stufft)
Date: Sun, 31 Jan 2016 17:17:46 -0500
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

> On Jan 31, 2016, at 5:03 PM, Stefan Krah <skrah.temporarily at> wrote:
>  c) properly quoting your predecessors.

Which is ironic, given that you incorrectly quoted Nicholas and half of the
message isn't quoted at all though it should be. Perhaps it'd be a lot more
welcoming if we didn't scold people for "mailing list etiquette" when the
various email clients make it pretty easy to accidentally mess it up.

Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

From skrah.temporarily at  Sun Jan 31 17:29:19 2016
From: skrah.temporarily at (Stefan Krah)
Date: Sun, 31 Jan 2016 22:29:19 +0000 (UTC)
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

Donald Stufft <donald at ...> writes:
> >  c) properly quoting your predecessors.
>
> Which is ironic, given that you incorrectly quoted Nicholas and half of
> the message isn't quoted at all though it should be. Perhaps it'd be a
> lot more welcoming if we didn't scold people for "mailing list
> etiquette" when the various email clients make it pretty easy to
> accidentally mess it up.

How can I quote properly if it isn't clear at all who wrote what?

Stefan Krah

From nicholas.chammas at  Sun Jan 31 17:30:58 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Sun, 31 Jan 2016 22:30:58 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 31, 2016 at 5:04 PM Stefan Krah <skrah.temporarily at> wrote:

> a) not top posting
>
> b) not full quoting the entire thread

Sorry, by default Gmail hides the thread when replying so it's easy to
forget that you are re-mailing the whole thing out. So normally you would
not even notice that someone has top posted or quoted the entire thread if
you're reading on Gmail's web client. Chalk it up to my being a mailing
list n00b.

I hope you also recognize that this particular piece of mailing list
etiquette arose in a time where people did not have nice tooling to do the
work for them, which is part of the point of this discussion.

> c) properly quoting your predecessors.

OK, did I do it right this time?

> I guess we'll see more of that once the move to discourse has happened.

No such move has been agreed upon as far as I can tell, but if I may
continue on my "soap box" and repeat what I've said earlier:

I think Discourse will make etiquette *easier* to follow by taking care of
repetitive tasks like this for people, instead of requiring that everyone
independently remember to do X, Y, and Z every time they post.

If you disagree, it would be good to hear why so we can discuss the root

And for the record: I'm really taken aback by how cynical your comments
are, and I apologize for ticking you off. I'll do a better job of following
mailing list etiquette going forward.

From ethan at  Sun Jan 31 17:39:48 2016
From: ethan at (Ethan Furman)
Date: Sun, 31 Jan 2016 14:39:48 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On 01/31/2016 02:03 PM, Stefan Krah wrote:

>    b) not full quoting the entire thread

Do you mean something like quoting an entire PEP when responding to only 
one or two lines of it, or do you mean keeping everything from the first 
email through all the replies so we have 15 levels of indentation?

'Cause frankly, both those suck, and many long time users here are 
guilty of it.

So why don't you lay off the personality war, and have an honest 
discussion of the idea.


From guido at  Sun Jan 31 17:49:50 2016
From: guido at (Guido van Rossum)
Date: Sun, 31 Jan 2016 14:49:50 -0800
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 31, 2016 at 2:29 PM, Stefan Krah
<skrah.temporarily at> wrote:
> How can I quote properly if it isn't clear at all who wrote what?

Stefan, take this "etiquette" stuff off the thread.

--Guido van Rossum

From skrah.temporarily at  Sun Jan 31 18:06:05 2016
From: skrah.temporarily at (Stefan Krah)
Date: Sun, 31 Jan 2016 23:06:05 +0000 (UTC)
Subject: [Python-ideas] A bit meta
References: <>
Message-ID: <>

Nicholas Chammas <nicholas.chammas at ...> writes:
> Sorry, by default Gmail hides the thread when replying so it's easy to
> forget that you are re-mailing the whole thing out. So normally you would
> not even notice that someone has top posted or quoted the entire thread if
> you're reading on Gmail's web client. Chalk it up to my being a mailing
> list n00b.

That's okay, but perhaps your tools aren't as good as you think.

> I hope you also recognize that this particular piece of mailing list
> etiquette arose in a time where people did not have nice tooling to do the
> work for them, which is part of the point of this discussion.

Have you actually *used* Gnus, mutt, slrn or even You
are again stating things with great certainty while I don't think
you know the subject.

> OK, did I do it right this time?

No, try replying to one of your own posts on and you'll see.

> I think Discourse will make etiquette easier to follow by taking care of
> repetitive tasks like this for people, instead of requiring that everyone
> independently remember to do X, Y, and Z every time they post.

You don't need to: Any of the above options does it automatically.

> And for the record: I'm really taken aback by how cynical your comments
> are, and I apologize for ticking you off. I'll do a better job of following
> mailing list etiquette going forward.

It's not about the etiquette: If you come in here and tell us
that our tools are inferior, expect some pushback.

I for example think that looks cluttered
and distracting.

Stefan Krah

From rosuav at  Sun Jan 31 18:09:47 2016
From: rosuav at (Chris Angelico)
Date: Mon, 1 Feb 2016 10:09:47 +1100
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Mon, Feb 1, 2016 at 9:30 AM, Nicholas Chammas
<nicholas.chammas at> wrote:
> On Sun, Jan 31, 2016 at 5:04 PM Stefan Krah skrah.temporarily at
> wrote:
> a) not top posting
> b) not full quoting the entire thread
> Sorry, by default Gmail hides the thread when replying so it's easy to
> forget that you are re-mailing the whole thing out. So normally you would
> not even notice that someone has top posted or quoted the entire thread if
> you're reading on Gmail's web client. Chalk it up to my being a mailing list
> n00b.

Yeah, Gmail can be a pain. But it's good in so many other ways that I
keep using it. A couple of tips:

1) Turn off Rich Text by default, and if ever you see it active, turn
it off for that email. It's a lot easier to make sure you're quoting
properly etc when the email is in plain text mode.

2) If you're replying to just part of the message, you should be able
to highlight that part and click in the Reply box. (That might require
a config option - been ages since I set this up.) There'll be a couple
of blank lines at the top, but you can either delete or ignore them,
and just hit Ctrl-End to start typing underneath (or insert text in
between different blocks).

3) To reply to the whole message, hit R or click in the box - and then
press Ctrl-A to "select all". This instantly expands out the quoted
text, making it easy to see what's worth trimming.

And either because of Monty Python or because of it being one of the
two hardest problems in computing, I said "a couple" and gave three.
Whatever. :)


From nicholas.chammas at  Sun Jan 31 20:59:46 2016
From: nicholas.chammas at (Nicholas Chammas)
Date: Mon, 01 Feb 2016 01:59:46 +0000
Subject: [Python-ideas] A bit meta
In-Reply-To: <>
References: <>
Message-ID: <>

On Sun, Jan 31, 2016 at 6:06 PM Stefan Krah <skrah.temporarily at> wrote:

> > I hope you also recognize that this particular piece of mailing list
> > etiquette arose in a time where people did not have nice tooling to do the
> > work for them, which is part of the point of this discussion.
>
> Have you actually *used* Gnus, mutt, slrn or even You
> are again stating things with great certainty while I don't think
> you know the subject.

I haven't used those tools. It would be enlightening if you explained how
they can address the issues we've been discussing in this thread. That is
what we're discussing here, after all--improving how we discuss things.
I've done my part by explaining the potential benefits that Discourse can
offer us in great detail, because that's what I know.

Regarding "stating things with great certainty", I'm not sure what you're
referring to. I made some arguments, quoted people, and linked to stuff.
Not sure what my crime is there. And my quote about "the older generations
of developers" -- which you sneered at earlier with the "soap box" comment
-- I explicitly prefaced with: "I'll also offer this conjecture: ..."

> > OK, did I do it right this time?
>
> No, try replying to one of your own posts on and you'll see.

I'm not sure what I'm supposed to see on I can see that this
thread is on there, but I can't find the most recent messages.

The way I am quoting you now is: I am hitting "Reply" in Gmail, clearing
out older parts of the thread, and replying inline to what you wrote. It's
pretty simple.

If that's still not correct then I'm not sure how to satisfy you. All I can
say is that I think it would be better if we had a way to solve mundane
issues like this centrally, instead of pushing the responsibility onto each
list user to piece together their own toolchain or workflow for doing the
right thing.

> > I think Discourse will make etiquette easier to follow by taking care of
> > repetitive tasks like this for people, instead of requiring that everyone
> > independently remember to do X, Y, and Z every time they post.
>
> You don't need to: Any of the above options does it automatically.

Are Gnus, mutt, and slrn client-side tools? If they are, then we are
pushing this responsibility onto every list user to find and use these
tools correctly.

You also mentioned being able to respond to mail via Is that the
standard way everyone is expected to interact with the list? If not, then
you have the same problem.

Having a modern, web-based forum like Discourse which takes care of
repetitive tasks like this centrally means everyone on the forum
automatically has it taken care of. It's part of the interface of the
forum, and everyone is using the same interface.

Discourse's UX is good enough that in many cases the user *can't* or is
*extremely unlikely* to do the wrong thing when it comes to mundane,
routine things like quoting people, replying, etc. I think that's great.

> > And for the record: I'm really taken aback by how cynical your comments
> > are, and I apologize for ticking you off. I'll do a better job of following
> > mailing list etiquette going forward.
>
> It's not about the etiquette: If you come in here and tell us
> that our tools are inferior, expect some pushback.

I don't think I've bashed anyone's tools on here as "inferior". My
discussion has been limited to Discourse vs. mailing lists. As you yourself
stated, I clearly don't know about tools like Gnus, mutt, and so forth, and
I'm not going to bash something I don't know.

I *have* been arguing that a modern web-based forum solves common
discussion issues in a way that mailing lists cannot match. But I think my
arguments have been dispassionate and have not involved disparaging any
tools out there as "inferior".

As for "coming in here", I guess you're telling me that I'm an outsider.
Sure. And as for "pushback", I would make a distinction between pushback
that is substantive in nature and focused on the problem at hand, and
simple derision. They don't belong in the same category.

> I for example think that looks cluttered
> and distracting.

Finally! An actual discussion of Discourse. And in this case, I agree with

I've gotten more accustomed to the layout over time, but I do remember
being overwhelmed when I first discovered Discourse. I'd bet there are
options to change the layout and reduce visual noise, but I don't know.
