several different needs [Explicit variable capture list]

I think a small part of the confusion is that there are at least four separate (albeit related) use cases. They all use default arguments for the current workarounds, but they don't have the same ideal solution.

(1) Auxiliary variables

    def f(x, _len=len): ...

This is often a micro-optimization; the _len keyword really shouldn't be overridden. Partly because it shouldn't be overridden, having it in the signature is just ugly. This could be solved with another separator in the signature, such as ; or a second () or a new keyword ...

    def f(x, aux _len=len): ...
    def f(x, once _len=len): ...
    def f(x; _len=len): ...
    def f(x)(_len=len): ...
    def f(x){_len=len}: ...

But realistically, that _len isn't ugly *just* because it shouldn't be overridden; it is also inherently ugly. I would prefer that something like Victor's FAT optimizer just make this idiom obsolete.

(2) Immutable bindings

    once X
    final Y
    const Z

This is pretty similar to the auxiliary-variables case, except that it tends to be desired more outside of functions. The immutability can be worth documenting on its own, but it feels too much like a typing declaration, which raises the question of "why *this* distinction for *this* variable?" So again, I think something like Victor's FAT optimizer (plus comments when immutability really is important) is a better long-term solution, but I'm not as sure as I was for case 1.

(3) Persistent storage

    def f(x, _cached_results={}): ...

In the past, I've managed to convince myself that it is good to be able to pass in a different cache ... or to turn the function into a class, so that I could get to self, or even to attach attributes to the function itself (so that rebinding the name to another function in a naive manner would fail, rather than produce bugs). Those convincings don't stick very well, though. This was clearly at least one of the motivations of some people who asked about static variables.
I still think it might be nice to just have a way of easily opening a new scope ... but then I have to explain why I can't just turn the function into a class... So in the end, I suspect this use case is better off ignored, but I am pretty sure it will lead to some extra confusion if any of the others are "solved" in a way that doesn't consider it.

(4) Current-value capture

This is the loop-variable case that some have thought was the only case under consideration. I don't have anything to add to Andrew Barnert's
https://mail.python.org/pipermail/python-ideas/2016-January/037980.html
but do see Steven D'Aprano's
https://mail.python.org/pipermail/python-ideas/2016-January/038047.html
for gotchas even within this use case.

-jJ
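For readers who haven't been bitten by case (4), here is a minimal sketch of the late-binding gotcha and the default-value workaround it motivates (the names are illustrative, not from the thread):

```python
# Late binding: each closure looks up i when it is *called*,
# by which time the loop has finished and i is 2.
callbacks = [lambda: i for i in range(3)]
print([f() for f in callbacks])  # [2, 2, 2]

# Default-value trick: capture i's current value at definition time.
callbacks = [lambda i=i: i for i in range(3)]
print([f() for f in callbacks])  # [0, 1, 2]
```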

On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett@gmail.com> wrote:
When _isn't_ it a micro-optimization? I think if it isn't, it's a very different case, e.g.:

    def len(iterable, _len=len):
        if something(iterable):
            special_case()
        else:
            return _len(iterable)

Obviously non-optimization use cases can't be solved by an optimizer. I think this is really more a special case of your #4, except that you're capturing a builtin instead of a nonlocal.
But, like most micro-optimizations, you should use this only when you really need it. Which means you probably can't count on a general-purpose optimizer that may do it for you, on some people's installations. Also, marking that you're using an intentional micro-optimization is useful, even (or maybe especially) if it's ugly: it signals to any future maintainer that performance is particularly important here, and they should be careful with any changes. Of course some people will abuse that (IIRC, a couple years ago, someone removed all the "register" declarations in the perl 5 source, which not only sped it up by a small amount, but also got people to look at some other micro-optimized code from 15 years ago that was actually pessimizing things on modern platforms...), but those people are the last ones who will stop micro-optimizing because you tell them the compiler can often do it better.
But a default value neither guarantees immutability, nor signals such an intent. Parameters can be rebound or mutated just like any other variables.
How could an optimizer enforce immutability, much less signal it? It only makes changes that are semantically transparent, and changing a mutable binding to immutable is definitely not transparent.
(3) Persistent storage
def f(x, _cached_results={}): ...
I still think it might be nice to just have a way of easily opening a new scope ...
You mean to open a new scope _outside_ the function definition, so it can capture the cache in a closure, without leaving it accessible from outside the scope? But then f won't be accessible either, unless you have some way to "return" the value to the parent scope. And a scope that returns something--that's just a function, isn't it? Meanwhile, a C-style function-static variable isn't really the same thing. Statics are just globals with names nobody else can see. So, for a nested function (or a method) that had a "static cache", any copies of the function would all share the same cache, while one with a closure over a cache defined in a new scope (or a default parameter value, or a class instance) would get a new cache for each copy. So, if you give people an easier way to write statics, they'd still have to use something else when they want the other.
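The closure-vs-static distinction above is easy to see concretely: each execution of a def statement creates a new function object with its own default-value dict, so every "copy" produced by a factory gets a fresh cache, whereas a C-style static would be shared by all of them. A small sketch (make_cached is a hypothetical factory name, not anything from the thread):

```python
def make_cached(compute):
    # The {} default is evaluated anew on each def execution,
    # so every function returned here gets its own cache.
    def f(n, _cache={}):
        if n not in _cache:
            _cache[n] = compute(n)
        return _cache[n]
    return f

a = make_cached(lambda n: n * 2)
b = make_cached(lambda n: n * 3)
print(a(5))               # 10
print(a.__defaults__[0])  # {5: 10}
print(b.__defaults__[0])  # {} -- b's cache is untouched
```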

On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett@gmail.com> wrote:
(1) Auxiliary variables
def f(x, _len=len): ...
This is often a micro-optimization;
When _isn't_ it a micro-optimization?
It can improve readability, usually by providing a useful rename. I have a vague sense that there might be other cases I'm forgetting, simply because I haven't had much use for them myself.
I would (perhaps wrongly) still have assumed that was at least intended for optimization.
[#4 was current-value capture] I almost never set out to capture a snapshot of the current environment's values. I get around to that solution after being annoyed that something else didn't work, but it isn't the original intent. (That might be one reason I sometimes have to stop and think about closures created in a loop.) The "this shouldn't be in the signature" and "why is something being assigned to itself" problems won't go away even if current-value capture is resolved. I suspect current-value capture would even become an attractive nuisance that led to obscure bugs when the value was captured too soon.
That still argues for not making any changes to the language; I think the equivalent of (faster access to unchanged globals or builtins) is a better portability bet than new language features.
Changing the language to formalize that signal takes away some of the emphasis you get from ugliness. I also wouldn't assume that such speed assessments are likely to be valid across the timescales needed for adoption of new syntax.
(2) immutable bindings
It is difficult to signal "once set, this should not change" in Python, largely because it is so difficult to enforce. This case might actually be worth new syntax, or a keyword. Alternatively, it might be like const contagion, that ends up being applied too often and just adding visual noise.
How could an optimizer enforce immutability, much less signal it?
Victor's guards can "enforce" immutability by recognizing when it fails in practice. It can't signal, but comments can ... and immutability being semantically important (as opposed to merely useful for optimization) is rare enough that I think a comment is more likely to be accurate than a type declaration.
(3) Persistent storage
def f(x, _cached_results={}): ...
I still think it might be nice to just have a way of easily opening a new scope ...
It is a function plus a function call, rather than just a function. Getting that name (possibly several names) bound properly in the outer scope is also beyond the abilities of a call. But "opening a new scope" can start to look a lot like creating a new class instance, yes.
And explaining when they want one instead of the other will still be so difficult that whichever is easier to write will become an attractive nuisance that would only cause problems under load. -jJ

On Jan 26, 2016, at 17:23, Jim J. Jewett <jimjjewett@gmail.com> wrote:
OK, but then how could FAT, or any optimizer, help with that?
This is how you hook a global or builtin function with special behavior for a special case, when you can't use the normal protocol (e.g., because the special case is a C extension type so you can't monkeypatch it), or want to hook it at a smaller scope than builtin. That's usually nothing to do with optimization, but with adding functionality. But, either way, it's not something an optimizer can help with anyway.
You may be right here. The fact that current-value capture is currently ugly means you only use it when you need to explicitly signal something unusual, or when you have no other choice. Making it nicer could make it an attractive nuisance.
Sure. I already said I don't think anything but maybe (and probably not) the loop-capture problem actually needs to be solved, so you don't have to convince me. :) When you really need the micro-optimization, which is very rare, you will continue to spell it with the default-value trick. The rest of the time, you don't need any way to spell it at all (and maybe FAT will sometimes optimize things for you, but that's just gravy).
Alternatively, it might be like const contagion, that ends up being applied too often and just adding visual noise.
Const contagion is a C++-specific problem. (Actually, two problems--mixing up lvalue-const and rvalue-const incorrectly, and having half the stdlib and half the popular third-party libraries out there not being const-correct because they're actually C libs--but they're both unique to C++.) Play with D or Swift for a while to see how it can work.
But that doesn't do _anything_ semantically--the code runs exactly the same way as if FAT hadn't done anything, except maybe a bit slower. If that's wrong, it's still just as wrong, and you still have no way of noticing that it's wrong, much less fixing it. So FAT is completely irrelevant here.
Here I disagree completely. Why do we have tuple, or frozenset? Why do dicts only take immutable keys? Why does the language make it easier to build mapped/filtered copies in place? Why can immutable objects be shared between threads or processes trivially, while mutable objects need locks for threads and heavy "manager" objects for processes? Mutability is a very big deal.
It isn't at all beyond the abilities of defining and calling a function. Here's how you solve this kind of problem in JavaScript:

    var spam = function() {
        var cache = {};
        return function(n) {
            if (cache[n] === undefined) {
                cache[n] = slow_computation(n);
            }
            return cache[n];
        };
    }();

And the exact same thing works in Python:

    def _():
        cache = {}
        def spam(n):
            if n not in cache:
                cache[n] = slow_computation(n)
            return cache[n]
        return spam
    spam = _()

You just rarely do it in Python because we have better ways of doing everything this can do.
Yes, yet another strike against C-style static variables. But, again, I don't think this was a problem that needed solving in the first place.

TLDR: An "extra" defaulted parameter is used for many slightly different reasons ... even a perfect solution for one of them risks being an attractive nuisance for the others.

On Tue, Jan 26, 2016 at 11:39 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Jan 26, 2016, at 17:23, Jim J. Jewett <jimjjewett@gmail.com> wrote:
On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett@gmail.com> wrote:
(1) Auxiliary variables
OK, but then how could FAT, or any optimizer, help with that?
It can't ... and that does argue for aux variables (or a let), but ... would good usage be swamped by abuse? You also brought up the case of augmenting a builtin or global, but still delegating to the original ... I forgot that case, and didn't even notice that you were rebinding the captured name. In those cases, the mechanical intent is "capture the old way", but the higher level intent is to specialize it. This should probably look more like inheritance (or multimethods or advice and dispatch) ... so even if it deserves a language change, capture-current-value idioms wouldn't really be an improvement over the current workaround. ...
So again, I think something like Victor's FAT optimizer (plus comments when immutability really is important) ...
How could an optimizer enforce immutability, much less signal it?
Victor's guards can "enforce" immutability by recognizing when it fails in practice.
Using the specific guards he proposes, yes. But something like FAT could provide more active guards that raise an exception, or swap the original value back into place, or even actively prevent the modification. Whether these should be triggered by a declaration in front of the name, or by a module-level freeze statement, or ... there are enough possibilities that I don't think a specific solution should be enshrined in the language yet.
Those are all "if you're living with these restrictions anyhow, and you tell the compiler, the program can run faster." None of those sound important in terms of "What does this program (eventually) do?" (Obviously, when immutability actually *is* important, and an appropriate immutable data type exists, then *not* using it would send a bad signal.)
That still doesn't bind n1, n2, n3 in the enclosing scope -- it only binds spam, from which you can reach spam(n1), spam(n2), etc. I guess I'm (occasionally) looking for something more like:

    class _Scope:
        ...

    for attr in dir(_Scope):
        if not attr.startswith("_"):
            locals()[attr] = getattr(_Scope, attr)

-jJ
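For what it's worth, a sketch like that can actually be made to work at module level, where writing through the module's global namespace is well defined (writing to locals() inside a function is not). A minimal working version, with illustrative names (base, triple, result) that are not from the thread:

```python
class _Scope:
    # Throwaway namespace: helpers defined here don't leak as
    # loose module-level names unless we export them below.
    base = 10

    def triple(n):
        return n * 3

    result = triple(base)

# Export only the public names into the enclosing module scope.
for _attr in dir(_Scope):
    if not _attr.startswith("_"):
        globals()[_attr] = getattr(_Scope, _attr)

print(result)  # 30
```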

On Thu, Jan 28, 2016 at 4:27 AM, Jim J. Jewett <jimjjewett@gmail.com> wrote:
The nature of hash tables and equality is such that if an object's value (defined by __eq__) changes between when it's used as a key and when it's looked up, bad stuff happens. It's not just an optimization - it's a way for the dict subsystem to protect us against craziness. Yes, you can bypass that protection:

    class HashableList(list):
        def __hash__(self):
            return hash(tuple(self))

but it's a great safety net. You won't unexpectedly get KeyError when you iterate over a dictionary - you'll instead get TypeError when you try to assign. Is that a semantic question or a performance one?

ChrisA
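The "bad stuff" is easy to demonstrate with a bypass like HashableList above: mutate a key after insertion and the entry becomes unreachable by value, because it was stored under its old hash.

```python
class HashableList(list):
    def __hash__(self):
        return hash(tuple(self))

key = HashableList([1, 2])
d = {key: "value"}
key.append(3)  # the stored hash no longer matches the key's value

# The entry is now effectively lost: an equal key hashes to the
# wrong bucket, and the old value is no longer equal to the key.
print(HashableList([1, 2]) in d)     # False
print(HashableList([1, 2, 3]) in d)  # False
print(len(d))                        # 1 -- it still occupies a slot
```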

On Wed, Jan 27, 2016, at 17:15, Chris Angelico wrote:
This stands alone against all the things that it *could* protect users against but doesn't due to the "consenting adults" principle. Java allows ArrayLists to be HashMap keys and the sky hasn't fallen, despite that language otherwise having far more of a culture of protecting users from themselves and each other (i.e. it has stuff like private, final, etc) than Python does. We won't even protect from redefining math.pi, yet you want to prevent a user from using as a key in a dictionary a value which _might_ be altered while the dictionary is in use? This prevents all kinds of algorithms from being used which would benefit from using a short-lived dict/set to keep track of things. I think this came up a month or so ago when we were talking about comparison of dict values views (which could benefit from being able to use all the values in the dict as keys in a Counter). They're not going to change while the algorithm is executing unless the user does some weird multithreaded stuff or something truly bizarre in a callback (and if they do? consenting adults.), and the dict is thrown away at the end.
That doesn't really work for my scenario described above, which requires an alternate universe in which Python (like Java) requires *all* objects, mutable or otherwise, to define __hash__ in a way consistent with __eq__.
But I won't get either error if I don't mutate the list, or I only do it in equality-conserving ways (e.g. converting between numeric types).

On Jan 27, 2016, at 14:42, Random832 <random832@fastmail.com> wrote:
It's amazing how many people go for years using Python without noticing this restriction, then, as soon as it's pointed out to them, exclaim "That's horrible! It's way too restrictive! I can think of all kinds of useful code that this prevents!" And then you go back and try to think of code you were prevented from writing over the past five years before you learned this rule, and realize that there's little if any. And, over the next five years, you run into the rule very rarely (and more often, it's because you forgot to define an appropriate __hash__ for an immutable type than because you needed to put a mutable type in a dict or set). Similarly, everyone learns the tuple/frozenset trick, decries the fact that there's no way to do a "deep" equivalent, but eventually ends up using the trick once every couple years and never running into the shallowness problem. From a pure design point of view, this looks like a case of hidebound purity over practice, exactly what Python is against. But from a practical use point of view, it actually works really well. I don't know if you could prove this fact a priori, or even argue very strongly for it, but it still seems to be true. People who use Python don't notice the limitation, people who rant against Python don't include it in their "why Python sucks" lists; only people who just discovered it care.
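For readers who haven't seen it, the tuple/frozenset trick mentioned above, and the shallowness limit it hits, look like this:

```python
# Freeze mutables so they can serve as dict keys / set members:
d = {tuple([1, 2, 3]): "value"}
s = {frozenset(["a", "b"])}

print((1, 2, 3) in d)              # True
print(frozenset(["b", "a"]) in s)  # True (order doesn't matter)

# The trick is shallow: only the outer layer is converted, so a
# nested mutable element still makes the key unhashable.
try:
    hash(tuple([[1, 2], [3, 4]]))
except TypeError as e:
    print(e)  # unhashable type: 'list'
```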

On Tue, Jan 26, 2016 at 12:59:07PM -0800, Andrew Barnert via Python-ideas wrote:
I'm not sure why you call this "a very different case". It looks the same to me: both cases use the default argument trick to capture the value of a builtin name. The reasons why they do so are incidental. I sometimes have code like this:

    try:
        enumerate("", 1)
    except TypeError:
        # Too old a version of Python.
        def enumerate(it, start=0, enumerate=enumerate):
            for a, b in enumerate(it):
                yield (a+start, b)

I don't really want an extra argument, but nor do I want a global:

    _enumerate = enumerate
    def enumerate(it, start=0):
        for a, b in _enumerate(it):
            yield (a+start, b)

This isn't a matter of micro-optimization, it's a matter of encapsulation. That my enumerate calls the built-in enumerate is an implementation detail, and what I'd like is to capture the value without either a global or an extra argument:

    # capture the current builtin
    def enumerate(it, start=0)(enumerate):
        for a, b in enumerate(it):
            yield (a+start, b)

Obviously I'm not going to be able to use hypothetical Python 3.6 syntax in code that needs to run in 2.5. But I might be able to use that syntax in Python 3.8 for code that needs to run in 3.6.
I don't think this proposal has anything to say about either immutability or bind-once-only "constants".
I'm not sure what point you think you are making here, or what Jim meant by his comment about the new scope, but in this case I don't think we would want an extra scope. We would want the cache to be in the function's local scope, but assigned once at function definition time. When my caching needs are simple, I might write something like this:

    def func(x, cache={}):
        ...

which is certainly better than having a global variable cache. For many applications (quick and dirty scripts) this is perfectly adequate. For other applications where my caching needs are more sophisticated, I might invest the effort in writing a decorator (or use functools.lru_cache), or a factory to hide the cache in a closure:

    def factory():
        cache = {}
        def func(x):
            ...
        return func

    func = factory()
    del factory

but there's a middle ground where I want something less quick'n'dirty than the first, but not going to all the effort of the second. For that, I think that being able to capture a value fits the use-case perfectly:

    def func(x)(cache={}):
        ...
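For the sophisticated end of that spectrum, the stdlib decorator mentioned above already covers the common case; a minimal sketch (slow_computation is a stand-in name, not anything from the thread):

```python
import functools

@functools.lru_cache(maxsize=None)
def slow_computation(n):
    # Stand-in for something expensive; each distinct n runs once.
    print("computing", n)
    return n * n

print(slow_computation(4))  # computing 4, then 16
print(slow_computation(4))  # cache hit: just 16
print(slow_computation.cache_info().hits)  # 1
```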
Copying functions is, I think, a pretty rare and advanced thing to do. At least up to 3.4, copy.copy(func) simply returns func, so if you want to make an actual distinct copy, you probably need to build a new function by hand. In which case, you could copy the cache as part of the process. -- Steve

On Jan 27, 2016, at 04:41, Steven D'Aprano <steve@pearwood.info> wrote: I think you're actually agreeing with me: there _aren't_ four different cases people actually want here, just the one we've all been talking about, and FAT is irrelevant to that case, so this sub thread is ultimately just a distraction. (We may still disagree about whether the one case needs a solution, or what the best solution would be, but we won't get anywhere by talking about different and unrelated things like this distraction.) But, in case I'm wrong about that, I'll answer your replies anyway:
Because Jim's point was that FAT could do this automatically for him, so we don't need any syntax for it at all. That works for the optimization case, but it doesn't work for your case. Therefore, they're different. Put another way: Without the default-value trick, his function means the same thing, so if he could rely on FAT, he could just stop using len=len. Without the default value trick, your function means something very different (a RecursionError), so you can't stop using enumerate=enumerate, with or without FAT, unless there's some other equally explicit syntax you can use instead. Moreover, your case is really no different from his case #4, the case everyone else has been talking about: you want to capture the value of enumerate at function definition time.
Jim insists that it's one of the four things people use default values for, and one of the things people want from this proposal, and that FAT can make that desire irrelevant. I think he's wrong on all three counts: you can't use default values for constness, nobody cares whether any of these new proposals can be used for constness, and FAT can't help anyone who does want constness.
My point is that if you want to open a new scope to attach variables to a function, you can already do that by defining and calling a function. Which you very rarely actually need to do, so we don't need to make it any easier. So the fact that no variants of this proposal make it easier is irrelevant.
I'm not talking about literally copying functions. I'm talking about nested functions using the same code object for each closure that gets created, and methods using the same code and function object for every bound method that gets created. Using a C-style static variable in these cases means all your closures, or all your methods from different instances, etc., would share the same storage, which is not the same behavior as the other alternatives he suggested were equivalent. This one, unlike his other points, isn't completely irrelevant. A C-style static declaration could actually serve some of the cases that the proposal is meant to serve. But it can't serve others, and it confusingly looks like it can serve more than it can, which makes it a confusing side track to bring up.

On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett@gmail.com> wrote:
When _isn't_ it a micro-optimization? I think if it isn't, it's a very different case, e.g.: def len(iterable, _len=len): if something(iterable): special_case() else: return _len(iterable) Obviously non-optimization use cases can't be solved by an optimizer. I think this is really more a special case of your #4, except that you're capturing a builtin instead of a nonlocal.
But, like most micro-optimizations, you should use this only when you really need it. Which means you probably can't count on a general-purpose optimizer that may do it for you, on some people's installations. Also, marking that you're using an intentional micro-optimization is useful, even (or maybe especially) if it's ugly: it signals to any future maintainer that performance is particularly important here, and they should be careful with any changes. Of course some people will abuse that (IIRC, a couple years ago, someone removed all the "register" declarations in the perl 5 source, which not only sped it up by a small amount, but also got people to look at some other micro-optimized code from 15 years ago that was actually pessimizing things on modern platforms...), but those people are the last ones who will stop micro-optimizing because you tell them the compiler can often do it better.
But a default value neither guarantees immutability, nor signals such an intent. Parameters can be rebound or mutated just like any other variables.
How could an optimizer enforce immutability, much less signal it? It only makes changes that are semantically transparent, and changing a mutable binding to immutable is definitely not transparent.
(3) Persistent storage
def f(x, _cached_results={}): ...
I still think it might be nice to just have a way of easily opening a new scope ...
You mean to open a new scope _outside_ the function definition, so it can capture the cache in a closure, without leaving it accessible from outside the scope? But then f won't be accessible either, unless you have some way to "return" the value to the parent scope. And a scope that returns something--that's just a function, isn't it? Meanwhile, a C-style function-static variable isn't really the same thing. Statics are just globals with names nobody else can see. So, for a nested function (or a method) that had a "static cache", any copies of the function would all share the same cache, while one with a closure over a cache defined in a new scope (or a default parameter value, or a class instance) would get a new cache for each copy. So, if you give people an easier way to write statics, they'd still have to use something else when they want the other.

On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett@gmail.com> wrote:
(1) Auxiliary variables
def f(x, _len=len): ...
This is often a micro-optimization;
When _isn't_ it a micro-optimization?
It can improve readability, usually by providing a useful rename. I have a vague sense that there might be other cases I'm forgetting, simply because I haven't had much use for them myself.
I would (perhaps wrongly) still have assumed that was at least intended for optimization.
[#4 was current-value capture] I almost never set out to capture a snapshot of the current environment's values. I get around to that solution after being annoyed that something else didn't work, but it isn't the original intent. (That might be one reason I sometimes have to stop and think about closures created in a loop.) The "this shouldn't be in the signature" and "why is something being assigned to itself" problems won't go away even if current-value capture is resolved. I suspect current-value capture would even become an attractive nuisance that led to obscure bugs when the value was captured too soon.
That still argues for not making any changes to the language; I think the equivalent of (faster access to unchanged globals or builtins) is a better portability bet than new language features.
Changing the language to formalize that signal takes away some of the emphasis you get from ugliness. I also wouldn't assume that such speed assessments are likely to be valid across the timescales needed for adoption of new syntax.
(2) immutable bindings
It is difficult to signal "once set, this should not change" in Python, largely because it is so difficult to enforce. This case might actually be worth new syntax, or a keyword. Alternatively, it might be like const contagion, that ends up being applied too often and just adding visual noise.
How could an optimizer enforce immutability, much less signal it?
Victor's guards can "enforce" immutability by recognizing when it fails in practice. It can't signal, but comments can ... and immutability being semantically important (as opposed to merely useful for optimization) is rare enough that I think a comment is more likely to be accurate than a type declaration.
(3) Persistent storage
def f(x, _cached_results={}): ...
I still think it might be nice to just have a way of easily opening a new scope ...
It is a function plus a function call, rather than just a function. Getting that name (possible several names) bound properly in the outer scope is also beyond the abilities of a call. But "opening a new scope" can start to look a lot like creating a new class instance, yes.
And explaining when they want one instead of the other will still be so difficult that whichever is easier to write will become an attractive nuisance, that would only cause problems under load. -jJ

On Jan 26, 2016, at 17:23, Jim J. Jewett <jimjjewett@gmail.com> wrote:
OK, but then how could FAT, or any optimizer, help with that?
This is how you hook a global or builtin function with special behavior for a special case, when you can't use the normal protocol (e.g., because the special case is a C extension type so you can't monkeypatch it), or want to hook it at a smaller scope than builtin. That's usually nothing to do with optimization, but with adding functionality. But, either way, it's not something an optimizer can help with anyway.
You may be right here. The fact that current-value capture is currently ugly means you only use it when you need to explicitly signal something unusual, or when you have no other choice. Making it nicer could make it an attractive nuisance.
Sure. I already said I don't think anything but maybe (and probably not) the loop-capture problem actually needs to be solved, so you don't have to convince me. :) When you really need the micro-optimization, which is very rare, you will continue to spell it with the default-value trick. The rest of the time, you don't need any way to spell it at all (and maybe FAT will sometimes optimize things for you, but that's just gravy).
Alternatively, it might be like const contagion, that ends up being applied too often and just adding visual noise.
Const contagion is a C++-specific problem. (Actually, two problems--mixing up lvalue-const and rvalue-const incorrectly, and having half the stdlib and half the popular third-party libraries out there not being const-correct because they're actually C libs--but they're both unique to C++.) Play with D or Swift for a while to see how it can work.
But that doesn't do _anything_ semantically--the code runs exactly the same way as if FAT hadn't done anything, except maybe a bit slower. If that's wrong, it's still just as wrong, and you still have no way of noticing that it's wrong, much less fixing it. So FAT is completely irrelevant here.
Here I disagree completely. Why do we have tuple, or frozenset? Why do dicts only take immutable keys? Why does the language make it easier to build mapped/filtered copies in place? Why can immutable objects be shared between threads or processes trivially, while mutable objects need locks for threads and heavy "manager" objects for processes? Mutability is a very big deal.
It isn't at all beyond the abilities of defining and calling a function. Here's how you solve this kind of problem in JavaScript: var spam = function(n) { var cache = {}: return function(n) { if (cache[n] === undefined) { cache[n] = slow_computation(n); } return cache[n]; }; }(); And the exact same thing works in Python: def _(): cache = {} def spam(n): if n not in cache: cache[n] = slow_computation(n) return cache[n] return spam spam = _() You just rarely do it in Python because we have better ways of doing everything this can do.
Yes, yet another strike against C-style static variables. But, again, I don't think this was a problem that needed solving in the first place.

TLDR: An "extra" defaulted parameter is used for many slightly different reasons ... even a perfect solution for one of them risks being an attractive nuisance for the others. On Tue, Jan 26, 2016 at 11:39 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
On Jan 26, 2016, at 17:23, Jim J. Jewett <jimjjewett@gmail.com> wrote:
On Tue, Jan 26, 2016 at 3:59 PM, Andrew Barnert <abarnert@yahoo.com> wrote: On Jan 26, 2016, at 11:40, Jim J. Jewett <jimjjewett@gmail.com> wrote:
(1) Auxiliary variables
OK, but then how could FAT, or any optimizer, help with that?
It can't ... and that does argue for aux variables (or a let), but ... would good usage be swamped by abuse? You also brought up the case of augmenting a builtin or global, but still delegating to the original ... I forgot that case, and didn't even notice that you were rebinding the captured name. In those cases, the mechanical intent is "capture the old way", but the higher level intent is to specialize it. This should probably look more like inheritance (or multimethods or advice and dispatch) ... so even if it deserves a language change, capture-current-value idioms wouldn't really be an improvement over the current workaround. ...
So again, I think something like Victor's FAT optimizer (plus comments when immutability really is important) ...
How could an optimizer enforce immutability, much less signal it?
Victor's guards can "enforce" immutability by recognizing when it fails in practice.
Using the specific guards he proposes, yes. But something like FAT could provide more active guards that raise an exception, or swap the original value back into place, or even actively prevent the modification. Whether these should be triggered by a declaration in front of the name, or by a module-level freeze statement, or ... there are enough possibilities that I don't think a specific solution should be enshrined in the language yet.
Those are all "if you're living with these restrictions anyhow, and you tell the compiler, the program can run faster." None of those sound important in terms of "What does this program (eventually) do?" (Obviously, when immutability actually *is* important, and an appropriate immutable data type exists, then *not* using it would send a bad signal.)
That still doesn't bind n1, n2, n3 in the enclosing scope -- it only binds spam, from which you can reach spam(n1), spam(n2), etc. I guess I'm (occasionally) looking for something more like

    class _Scope:
        ...

    for attr in dir(_Scope):
        if not attr.startswith("_"):
            locals()[attr] = getattr(_Scope, attr)

-jJ
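A runnable version of that sketch, with my own placeholder bindings for n1, n2, n3: note that writing to locals() is only effective at module scope (where locals() aliases globals()), so this uses globals() explicitly.

    class _Scope:
        n1, n2, n3 = 1, 2, 3

    # Hoist the non-underscore attributes into the module namespace.
    for attr in dir(_Scope):
        if not attr.startswith("_"):
            globals()[attr] = getattr(_Scope, attr)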

On Thu, Jan 28, 2016 at 4:27 AM, Jim J. Jewett <jimjjewett@gmail.com> wrote:
The nature of hash tables and equality is such that if an object's value (defined by __eq__) changes between when it's used as a key and when it's looked up, bad stuff happens. It's not just an optimization - it's a way for the dict subsystem to protect us against craziness. Yes, you can bypass that protection:

    class HashableList(list):
        def __hash__(self):
            return hash(tuple(self))

but it's a great safety net. You won't unexpectedly get KeyError when you iterate over a dictionary - you'll instead get TypeError when you try to assign. Is that a semantic question or a performance one? ChrisA
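The "bad stuff happens" can be demonstrated with the quoted HashableList (the mutation scenario below is my own illustration): mutate a key after insertion and the dict can no longer find it, even though the entry is still inside.

    class HashableList(list):
        def __hash__(self):
            return hash(tuple(self))

    k = HashableList([1, 2])
    d = {k: "value"}
    k.append(3)        # mutate the key after insertion

    # k now hashes differently, so lookup by the same object fails,
    # yet the stale entry is still sitting in the table.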

On Wed, Jan 27, 2016, at 17:15, Chris Angelico wrote:
This stands alone against all the things that it *could* protect users against but doesn't due to the "consenting adults" principle. Java allows ArrayLists to be HashMap keys and the sky hasn't fallen, despite that language otherwise having far more of a culture of protecting users from themselves and each other (i.e. it has stuff like private, final, etc) than Python does. We won't even protect from redefining math.pi, yet you want to prevent a user from using as a key in a dictionary a value which _might_ be altered while the dictionary is in use? This prevents all kinds of algorithms from being used which would benefit from using a short-lived dict/set to keep track of things. I think this came up a month or so ago when we were talking about comparison of dict values views (which could benefit from being able to use all the values in the dict as keys in a Counter). They're not going to change while the algorithm is executing unless the user does some weird multithreaded stuff or something truly bizarre in a callback (and if they do? consenting adults.), and the dict is thrown away at the end.
That doesn't really work for my scenario described above, which requires an alternate universe in which Python (like Java) requires *all* objects, mutable or otherwise, to define __hash__ in a way consistent with __eq__.
But I won't get either error if I don't mutate the list, or I only do it in equality-conserving ways (e.g. converting between numeric types).

On Jan 27, 2016, at 14:42, Random832 <random832@fastmail.com> wrote:
It's amazing how many people go for years using Python without noticing this restriction, then, as soon as it's pointed out to them, exclaim "That's horrible! It's way too restrictive! I can think of all kinds of useful code that this prevents!" And then you go back and try to think of code you were prevented from writing over the past five years before you learned this rule, and realize that there's little if any. And, over the next five years, you run into the rule very rarely (and more often, it's because you forgot to define an appropriate __hash__ for an immutable type than because you needed to put a mutable type in a dict or set). Similarly, everyone learns the tuple/frozenset trick, decries the fact that there's no way to do a "deep" equivalent, but eventually ends up using the trick once every couple years and never running into the shallowness problem. From a pure design point of view, this looks like a case of hidebound purity over practice, exactly what Python is against. But from a practical use point of view, it actually works really well. I don't know if you could prove this fact a priori, or even argue very strongly for it, but it still seems to be true. People who use Python don't notice the limitation, people who rant against Python don't include it in their "why Python sucks" lists; only people who just discovered it care.

On Tue, Jan 26, 2016 at 12:59:07PM -0800, Andrew Barnert via Python-ideas wrote:
I'm not sure why you call this "a very different case". It looks the same to me: both cases use the default argument trick to capture the value of a builtin name. The reasons why they do so are incidental. I sometimes have code like this:

    try:
        enumerate("", 1)
    except TypeError:
        # Too old a version of Python.
        def enumerate(it, start=0, enumerate=enumerate):
            for a, b in enumerate(it):
                yield (a+start, b)

I don't really want an extra argument, but nor do I want a global:

    _enumerate = enumerate
    def enumerate(it, start=0):
        for a, b in _enumerate(it):
            yield (a+start, b)

This isn't a matter of micro-optimization, it's a matter of encapsulation. That my enumerate calls the built-in enumerate is an implementation detail, and what I'd like is to capture the value without either a global or an extra argument:

    # capture the current builtin
    def enumerate(it, start=0)(enumerate):
        for a, b in enumerate(it):
            yield (a+start, b)

Obviously I'm not going to be able to use hypothetical Python 3.6 syntax in code that needs to run in 2.5. But I might be able to use that syntax in Python 3.8 for code that needs to run in 3.6.
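The default-argument capture being defended here can be shown working today (renamed to my_enumerate in this sketch so it doesn't shadow the builtin):

    def my_enumerate(it, start=0, _enumerate=enumerate):
        # _enumerate was bound to the builtin at def time
        for a, b in _enumerate(it):
            yield (a + start, b)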
I don't think this proposal has anything to say about about either immutability or bind-once-only "constants".
I'm not sure what point you think you are making here, or what Jim meant by his comment about the new scope, but in this case I don't think we would want an extra scope. We would want the cache to be in the function's local scope, but assigned once at function definition time. When my caching needs are simple, I might write something like this:

    def func(x, cache={}):
        ...

which is certainly better than having a global variable cache. For many applications (quick and dirty scripts) this is perfectly adequate. For other applications where my caching needs are more sophisticated, I might invest the effort in writing a decorator (or use functools.lru_cache), or a factory to hide the cache in a closure:

    def factory():
        cache = {}
        def func(x):
            ...
        return func

    func = factory()
    del factory

but there's a middle ground where I want something less quick'n'dirty than the first, but not going to all the effort of the second. For that, I think that being able to capture a value fits the use-case perfectly:

    def func(x)(cache={}):
        ...
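The quick-and-dirty variant in action (the squaring body and the calls list are my own stand-ins for an expensive computation): the default dict is created once, at def time, so it persists across calls.

    calls = []

    def slow_square(x, _cache={}):   # _cache is bound once, at def time
        if x not in _cache:
            calls.append(x)          # record actual computations
            _cache[x] = x * x        # stand-in for the expensive part
        return _cache[x]

    slow_square(3)
    slow_square(3)   # second call is answered from _cache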
Copying functions is, I think, a pretty rare and advanced thing to do. At least up to 3.4, copy.copy(func) simply returns func, so if you want to make an actual distinct copy, you probably need to build a new function by hand. In which case, you could copy the cache as part of the process. -- Steve

On Jan 27, 2016, at 04:41, Steven D'Aprano <steve@pearwood.info> wrote: I think you're actually agreeing with me: there _aren't_ four different cases people actually want here, just the one we've all been talking about, and FAT is irrelevant to that case, so this sub thread is ultimately just a distraction. (We may still disagree about whether the one case needs a solution, or what the best solution would be, but we won't get anywhere by talking about different and unrelated things like this distraction.) But, in case I'm wrong about that, I'll answer your replies anyway:
Because Jim's point was that FAT could do this automatically for him, so we don't need any syntax for it at all. That works for the optimization case, but it doesn't work for your case. Therefore, they're different. Put another way: Without the default-value trick, his function means the same thing, so if he could rely on FAT, he could just stop using len=len. Without the default value trick, your function means something very different (a RecursionError), so you can't stop using enumerate=enumerate, with or without FAT, unless there's some other equally explicit syntax you can use instead. Moreover, your case is really no different from his case #4, the case everyone else has been talking about: you want to capture the value of enumerate at function definition time.
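The RecursionError claim is easy to verify (my own demonstration): defining the wrapper without capturing the builtin makes the name refer to itself.

    def enumerate(it, start=0):       # shadows the builtin...
        for a, b in enumerate(it):    # ...so this call is now self-recursive
            yield (a + start, b)

    outcome = None
    try:
        list(enumerate("ab"))         # each next() nests another generator
    except RecursionError:
        outcome = "RecursionError"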
Jim insists that it's one of the four things people use default values for, and one of the things people want from this proposal, and that FAT can make that desire irrelevant. I think he's wrong on all three counts: you can't use default values for constness, nobody cares whether any of these new proposals can be used for constness, and FAT can't help anyone who does want constness.
My point is that if you want to open a new scope to attach variables to a function, you can already do that by defining and calling a function. Which you very rarely actually need to do, so we don't need to make it any easier. So the fact that no variants of this proposal make it easier is irrelevant.
I'm not talking about literally copying functions. I'm talking about nested functions using the same code object for each closure that gets created, and methods using the same code and function object for every bound method that gets created. Using a C-style static variable in these cases means all your closures, or all your methods from different instances, etc., share a single piece of storage, which is not the same behavior as the other alternatives he suggested were equivalent. This one, unlike his other points, isn't completely irrelevant. A C-style static declaration could actually serve some of the cases that the proposal is meant to serve. But it can't serve others, and it confusingly looks like it can serve more than it can, which makes it a confusing side track to bring up.
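The per-closure versus shared-storage distinction can be sketched like this (my own illustration): each closure gets its own cell, while a default-argument "static" is one object shared by every caller.

    def make_counter():
        count = 0
        def counter():
            nonlocal count        # each closure has its own cell
            count += 1
            return count
        return counter

    a, b = make_counter(), make_counter()
    a(); a()                      # a advances independently of b

    def shared_counter(_state={"count": 0}):   # C-style "static": one dict
        _state["count"] += 1
        return _state["count"]

    shared_counter()              # every call sees the same _state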
participants (5)
- Andrew Barnert
- Chris Angelico
- Jim J. Jewett
- Random832
- Steven D'Aprano