Add static variable storage in functions
A lot of programming languages have something known as static variable storage in *functions*, not *classes*. Static variable storage means a variable limited to a function, yet the data it points to persists until the end of the program. Well, Python also kind of has that functionality: Python's default values provide the same type of functionality, but it's a *hack* and also *problematic*, because only mutable types that are mutated persist. Static should behave much like Python's for loop variables. This idea proposes to add a keyword (static, maybe?) that can create static variables that persist throughout the program yet are only accessible through the function they are declared and initialized in.

Thanking you,
With Regards
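The default-value behaviour being referred to can be seen in a short sketch (the name `counter` is illustrative): because the default list is evaluated once, at function definition time, mutations to it survive between calls.

```python
# The mutable-default "hack": _state is created once, when the def
# statement runs, so mutating it persists across calls.
def counter(_state=[0]):
    _state[0] += 1
    return _state[0]
```

An immutable default (say `_state=0`) would be rebound rather than mutated, and so would not persist between calls; that is the "only mutable types that are mutated persist" complaint.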
On 27 May 2021, at 09:56, Shreyan Avigyan <pythonshreyan09@gmail.com> wrote:
A lot of programming languages have something known as static variable storage in *functions*, not *classes*. Static variable storage means a variable limited to a function, yet the data it points to persists until the end of the program. Well, Python also kind of has that functionality: Python's default values provide the same type of functionality, but it's a *hack* and also *problematic*, because only mutable types that are mutated persist. Static should behave much like Python's for loop variables. This idea proposes to add a keyword (static, maybe?) that can create static variables that persist throughout the program yet are only accessible through the function they are declared and initialized in.
How experienced are you with Python? At first glance your recent proposals appear to be for features in languages you know about and can’t find in Python, without necessarily a good understanding of Python.

For this particular question/proposal: “static” variables in functions in C-like languages are basically hidden global variables, and global variables are generally a bad idea. In Python you can get the same result with a global variable and the use of the “global” keyword in a function (or cooperating set of functions) when you want to update the global variable from that function. Closures or instances of classes with an ``__call__`` method can be used as well and can hide state (with the “consenting adults” caveat: the state is hidden, not inaccessible).

Ronald
—
Twitter / micro.blog: @ronaldoussoren
Blog: https://blog.ronaldoussoren.net/
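The three alternatives mentioned here can each be sketched in a few lines (the names `bump_global`, `make_bump` and `Bump` are illustrative, not from the thread):

```python
# 1. A module-level global, updated via the "global" keyword.
_count = 0

def bump_global():
    global _count
    _count += 1
    return _count

# 2. A closure: the enclosing scope holds the persistent state.
def make_bump():
    count = 0
    def bump():
        nonlocal count
        count += 1
        return count
    return bump

# 3. An instance of a class with __call__: state lives on the instance.
class Bump:
    def __init__(self):
        self.count = 0
    def __call__(self):
        self.count += 1
        return self.count
```

In the closure case the state is hidden from callers; in the class case it is merely tucked away as an attribute, which is the "hidden, not inaccessible" distinction.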
Well, sometimes we don't want to pollute the module namespace. Two functions can each have a variable with the same name but a different value that we want to be static. And this functionality already exists in Python, but as a *hack*. This idea proposes to add a new dunder member and a keyword that give us variables which persist like globals but are limited to local scope. But since it's Python, anyone can still access them using the dunder member.
On Thu, May 27, 2021 at 7:20 PM Ronald Oussoren via Python-ideas <python-ideas@python.org> wrote:
On 27 May 2021, at 09:56, Shreyan Avigyan <pythonshreyan09@gmail.com> wrote:
Static should behave much like Python's for loop variables.
I have no idea what this means.
For this particular question/proposal: “static” variables in functions in C-like languages are basically hidden global variables, and global variables are generally a bad idea.
Hmm, I'd distinguish a couple of things here. Global constants are most certainly not a problem, and objects that last the entire run of the program are not a problem either. The usual problem with "global variables" is that they become hidden state, action-at-a-distance. You can change the global in one function and it notably affects some other function. Part of the point of statics is that they are *not* global; they might live for the entire run of the program, but they can't be changed by any other function, so there's no AaaD that will mess with your expectations.
In Python you can get the same result with a global variable and the use of the “global” keyword in a function (or cooperating set of functions) when you want to update the global variable from that function.
That's globals, with all the risks thereof.
Closures or instances of classes with an ``__call__`` method can be used as well and can hide state (with the “consenting adults” caveat: the state is hidden, not inaccessible).
This would be the easiest way to manage it. But both of them provide a way to have multiple independent, yet equivalent, states. If that's what you want, great! But it can also be an unnecessary level of confusion ("why would I ever make a second one of these? Why is there a factory function for something that I'll only ever need one of?"), where static variables wouldn't do that. There is one namespace that would very aptly handle this kind of thing: the function object itself.
>>> def count():
...     count.cur += 1
...     return count.cur
...
>>> count.cur = 0
>>> count()
1
>>> count()
2
>>> count()
3
>>> count()
4
As long as you can reference your own function reliably, this will work. There may be room for a keyword like this_function, but for the most part, it's not necessary, and you can happily work with the function by its name. It's a little clunkier than being able to say "static cur = 0;" to initialize it (the initializer has to go *after* the function, which feels backwards), but the functionality is all there. ChrisA
On 2021-05-27 10:39, Shreyan Avigyan wrote:
Well, sometimes we don't want to pollute the module namespace. Two functions can each have a variable with the same name but a different value that we want to be static. And this functionality already exists in Python, but as a *hack*. This idea proposes to add a new dunder member and a keyword that give us variables which persist like globals but are limited to local scope. But since it's Python, anyone can still access them using the dunder member.
This was discussed some years ago. IIRC, one of the questions was about how and when such variables would be initialised.
Reply to Chris: I'm proposing a way to do this officially in Python. For example, I know another hack:

def count(cur={"cur": 0}):
    cur["cur"] += 1
    return cur["cur"]
Static should behave much like Python's for loop variables.

I have no idea what this means.
That's a bad example. I was just trying to make it clear. But you have got the idea. So don't worry about that.
On Thu, May 27, 2021 at 07:56:16AM -0000, Shreyan Avigyan wrote:
A lot of programming languages have something known as static variable storage in *functions*, not *classes*. Static variable storage means a variable limited to a function, yet the data it points to persists until the end of the program.
+1 on this idea.

One common use for function defaults is to optimize name lookups, turning global or builtin lookups into local ones:

def func(arg, len=len):
    # now len is a fast local lookup instead of a slow name lookup
    ...

Benchmarking shows that this actually does make a significant difference to performance, but it's a technique under-used because of the horribleness of a len=len parameter. (Raymond Hettinger is, I think, a proponent of this optimization trick. At least I learned it from his code.)

If functions had static storage that didn't need to be declared in the function parameter list, then we could use that for this trick. As you correctly point out:
Well, Python also kind of has that functionality. Python's default values provide the same type of functionality
Indeed:

def func(static_storage=[0]):
    static_storage[0] += 1
    print(static_storage[0])

But that's kinda yucky for the same reason as above: we have to expose our static storage in the parameter list to the caller, and if the value we care about is immutable, we have to stuff it inside a mutable container.
but it's a *hack*
I disagree that it is a hack. At least, the implementation is not a hack. The fact that the only way we can take advantage of this static storage is to declare a parameter and give it a default value is hacky. So I guess I agree with you :-)
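The difference the len=len trick makes can be measured roughly; here is a minimal benchmark sketch (absolute numbers vary by machine and interpreter, so no figures are claimed here):

```python
import timeit

def global_lookup(data):
    total = 0
    for x in data:
        total += len(x)  # len is looked up in builtins on every iteration
    return total

def local_lookup(data, len=len):
    total = 0
    for x in data:
        total += len(x)  # len is a fast local lookup
    return total

data = ["abc"] * 10_000
t_global = timeit.timeit(lambda: global_lookup(data), number=200)
t_local = timeit.timeit(lambda: local_lookup(data), number=200)
print(f"global: {t_global:.3f}s  local: {t_local:.3f}s")
```

Both functions compute the same result; only the lookup path for `len` differs.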
and also *problematic* because only mutable types that are mutated persist. Static should behave much like Python's for loop variables.
I have no idea what that comment about loop variables means.
This idea proposes to add a keyword (static, maybe?) that can create static variables that persist throughout the program yet are only accessible through the function they are declared and initialized in.
Here is a sketch of how this could work, given a function like this:

def func(arg):
    static spam, eggs
    static cheese = expression
    ...

At function declaration time, the two static statements tell the compiler to:

* treat spam, eggs and cheese as local variables (use LOAD_FAST instead of LOAD_GLOBAL for lookups);

* allocate static storage for them using the same (or similar) mechanism used for function default values;

* spam and eggs get initialised as None;

* cheese gets initialised to the value of `expression`, evaluated at function declaration time just as default arguments are.

When the function is called:

* the interpreter automatically initialises the static variables with the stored values;

* when the function exits (whether by return or by raising an exception) the static storage will be updated with the current values of the variables.

As a sketch of one possible implementation, the body of the function represented by ellipsis `...` might be transformed to this:

    # initialise statics
    spam = LOAD_STATIC(0)
    eggs = LOAD_STATIC(1)
    cheese = LOAD_STATIC(2)
    try:
        # body of the function
        ...
    finally:
        STORE_STATIC(spam, 0)
        STORE_STATIC(eggs, 1)
        STORE_STATIC(cheese, 2)

One subtlety: what if the body of the function executes `del spam`? No problem: the spam variable will become undefined on the next function call, which means that subsequent attempts to get its value will raise UnboundLocalError:

    try:
        x = spam + 1
    except UnboundLocalError:
        spam = 0
        x = 1

I would use this static feature if it existed. +1

-- 
Steve
On Thu, May 27, 2021 at 11:17:18AM +0200, Ronald Oussoren via Python-ideas wrote:
For this particular question/proposal: “static” variables in functions in C-like languages are basically hidden global variables, and global variables are generally a bad idea.
Python is not required to use the same design mistakes as C :-)

Shreyan already said that static variables in a function should be local to that function. The semantic difference compared to regular locals is that they should persist from one call to the next.

Aside from globals, which we agree are Considered Harmful, you've suggested two alternative implementations:

- something with closures;

- hidden state in an object with a `__call__` method.

Closures are cool, but the hidden state really is inaccessible from outside the function. (At least I've never worked out how to get to it.) So the callable object is better for introspection and debugging.

Functions are objects with a `__call__` method, and they already have persistent state!

>>> def spam(arg="Hello world"):
...     pass
...
>>> spam.__defaults__
('Hello world',)

We could, I think, leverage this functionality to get this.

-- 
Steve
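One hedged sketch of what "leveraging" that persistent state could look like: a function can rewrite its own `__defaults__` tuple to carry a value from one call to the next (illustrative only; the name `count` is made up, and the state is still exposed in the signature):

```python
# Store the running count in the function's own __defaults__ tuple,
# replacing that tuple on every call.
def count(n=0):
    count.__defaults__ = (n + 1,)
    return n + 1
```

Note that an explicit argument (e.g. `count(10)`) resets the stored value, which illustrates why exposing statics in the parameter list is considered yucky.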
On 2021-05-27 at 22:33:25 +1000, Steven D'Aprano <steve@pearwood.info> wrote:
Aside from globals, which we agree are Considered Harmful, you've suggested two alternative implementations:
- something with closures;
- hidden state in an object with a `__call__` method.
Closures are cool, but the hidden state really is inaccessible from outside the function. (At least I've never worked out how to get to it.) So the callable object is better for introspection and debugging.
Then fix your debugger. ;-) As I recall, you're a proponent of fixing what's broken rather than creating workarounds. (I also recall many discussions regarding failed sandboxes because of Python's nearly infinite capacity for introspection. Maybe closures are the path to a true sandbox.) Globals, persistent locals, closures, statics, class variables, instance variables, etc. are all just different ways to hide state. IMO, we don't need another one.
On Thu, May 27, 2021 at 8:19 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 27, 2021 at 07:56:16AM -0000, Shreyan Avigyan wrote:
This idea proposes to add a keyword (static, maybe?) that can create static variables that persist throughout the program yet are only accessible through the function they are declared and initialized in.
Here is a sketch of how this could work, given a function like this:
def func(arg):
    static spam, eggs
    static cheese = expression
    ...
At function declaration time, the two static statements tell the compiler to:
* treat spam, eggs and cheese as local variables (use LOAD_FAST instead of LOAD_GLOBAL for lookups);
* allocate static storage for them using the same (or similar) mechanism used for function default values;
* spam and eggs get initialised as None;
* cheese gets initialised to the value of `expression`, evaluated at function declaration time just as default arguments are.
When the function is called:
* the interpreter automatically initialises the static variables with the stored values;
* when the function exits (whether by return or by raising an exception) the static storage will be updated with the current values of the variables.
As a sketch of one possible implementation, the body of the function represented by ellipsis `...` might be transformed to this:
# initialise statics
spam = LOAD_STATIC(0)
eggs = LOAD_STATIC(1)
cheese = LOAD_STATIC(2)
try:
    # body of the function
    ...
finally:
    STORE_STATIC(spam, 0)
    STORE_STATIC(eggs, 1)
    STORE_STATIC(cheese, 2)
Couldn't you already get pretty close to this by attaching your static values to the function __dict__? Example:

def func():
    print(func.a)

func.a = 1

Usage:
>>> func()
1
Of course that is slower because there is an attribute lookup. But could there be a decorator that links the function __dict__ to locals(), so they are intertwined?

@staticify({'a': 1})
def func():
    print(a)
    print(b)

func.b = 2

Usage:
>>> func()
1
2
>>> func.a = 3  # dynamic update of func.__dict__
>>> func()
3
2
The locals dict in the function body would look something like this:

ChainMap(locals(), {'a': 1})

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
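For reference, this is what collections.ChainMap actually does with that lookup order (this only illustrates the dict behaviour; wiring it into a function's real locals would need interpreter support, as discussed later in the thread):

```python
from collections import ChainMap

statics = {'a': 1}
local_ns = {}                      # stands in for the function's true locals
ns = ChainMap(local_ns, statics)   # locals searched first, statics second

print(ns['a'])       # falls back to the statics map
local_ns['a'] = 3
print(ns['a'])       # real locals now shadow the statics
```

Writes through the ChainMap itself (`ns['b'] = ...`) also go to the first map, which matches the "locals shadow statics" semantics sketched above.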
One subtlety: what if the body of the function executes `del spam`? No problem: the spam variable will become undefined on the next function call, which means that subsequent attempts to get its value will raise UnboundLocalError:
try:
    x = spam + 1
except UnboundLocalError:
    spam = 0
    x = 1
I would use this static feature if it existed. +1
-- Steve
Same thing would happen with my idea: del a would delete a from the func.__dict__ (just like with ChainMap). But if you add it back again later, it would not be static anymore. Example:

@staticify({'a': 1})
def func():
    print(a)  # fast static lookup
    del a     # static is deleted
    a = 2     # this is local now

func.b = 3    # but this is a static
For the implementation I had the same idea as Steven. And I don't think static variables should be stored in __dict__ or __defaults__. Instead, to increase efficiency (more importantly, not to decrease current efficiency), they should be stored as a dict in __static__ or some other dunder member.
On Thu, May 27, 2021 at 10:20 PM Steven D'Aprano <steve@pearwood.info> wrote:
Here is a sketch of how this could work, given a function like this:
def func(arg):
    static spam, eggs
    static cheese = expression
    ...
At function declaration time, the two static statements tell the compiler to:
* treat spam, eggs and cheese as local variables (use LOAD_FAST instead of LOAD_GLOBAL for lookups);
I don't think LOAD_FAST would be suitable here - isn't it always going to look in the stack frame?
* allocate static storage for them using the same (or similar) mechanism used for function default values;
Default values are attached to the function object (in either the __defaults__ tuple or the __kwdefaults__ dict).
* spam and eggs get initialised as None;
* cheese gets initialised to the value of `expression`, evaluated at function declaration time just as default arguments are.
When the function is called:
* the interpreter automatically initialises the static variables with the stored values;
* when the function exits (whether by return or by raising an exception) the static storage will be updated with the current values of the variables.
Hmm, I see what you mean. Not sure that this is really necessary though - and it could cause extremely confusing results with threading.
As a sketch of one possible implementation, the body of the function represented by ellipsis `...` might be transformed to this:
# initialise statics
spam = LOAD_STATIC(0)
eggs = LOAD_STATIC(1)
cheese = LOAD_STATIC(2)
try:
    # body of the function
    ...
finally:
    STORE_STATIC(spam, 0)
    STORE_STATIC(eggs, 1)
    STORE_STATIC(cheese, 2)
One subtlety: what if the body of the function executes `del spam`? No problem: the spam variable will become undefined on the next function call, which means that subsequent attempts to get its value will raise UnboundLocalError:
try:
    x = spam + 1
except UnboundLocalError:
    spam = 0
    x = 1
I would use this static feature if it existed. +1
Agreed, I'd use it too. But I'd define the semantics slightly differently:

* If there's an expression given, evaluate that when the 'def' statement is executed, same as default args
* Otherwise it'll be uninitialized, or None; bikeshedding opportunity, have fun
* Usage of this name uses a dedicated LOAD_STATIC or STORE_STATIC bytecode
* The values of the statics are stored in some sort of high-performance cell collection, indexed numerically

It would be acceptable to store statics in a dict, too, but I suspect that that would negate some or all of the performance advantages. Whichever way, though, it should ideally be introspectable via a dunder attribute on the function.

Semantically, this would be very similar to writing code like this:

def count():
    THIS_FUNCTION.__statics__["n"] += 1
    return THIS_FUNCTION.__statics__["n"]

count.__statics__ = {"n": 1}

except that it'd be more optimized (and wouldn't require magic to get a function self-reference).

Note that the statics *must* be defined on the function, NOT on the code object. Just like function defaults, they need to be associated with individual instances of a function.
>>> f = []
>>> for n in range(10):
...     def spam(n=n):
...         # static n=n  # Same semantics
...         print(n)
...     f.append(spam)
...
Each spam() should print out its particular number, even though they all share the same code object. This has been proposed a few times, never really got a lot of support though. ChrisA
On Thu, 27 May 2021 at 14:22, Chris Angelico <rosuav@gmail.com> wrote:
Note that the statics *must* be defined on the function, NOT on the code object. Just like function defaults, they need to be associated with individual instances of a function.
>>> f = []
>>> for n in range(10):
...     def spam(n=n):
...         # static n=n  # Same semantics
...         print(n)
...     f.append(spam)
...
Each spam() should print out its particular number, even though they all share the same code object.
This reminds me, if we ignore the performance aspect, function attributes provide this functionality, but there's a significant problem with using them because you can't access them other than by referencing the *name* of the function being defined.
>>> def f():
...     print(f.i)
...
>>> f.i = 1
>>> g = f
>>> del f
>>> g()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in f
NameError: name 'f' is not defined
OK, you can, if you're willing to mess around with sys._getframe and make some dodgy assumptions:
>>> def me():
...     parent = sys._getframe(1)
...     for obj in parent.f_globals.values():
...         if getattr(obj, "__code__", None) == parent.f_code:
...             return obj
...
>>> def f():
...     print(me().i)
...
>>> f.i = 1
>>> g = f
>>> del f
>>> g()
1
It would be nice to have a better way to reference function attributes from within a function. (This would also help write recursive functions that could be safely renamed, but I'm not sure many people would necessarily think that's a good thing ;-))

Paul
On Thu, May 27, 2021 at 11:38 PM Paul Moore <p.f.moore@gmail.com> wrote:
This reminds me, if we ignore the performance aspect, function attributes provide this functionality, but there's a significant problem with using them because you can't access them other than by referencing the *name* of the function being defined.
Yeah, I mentioned that earlier, but just as one of the wide variety of variously-clunky ways to achieve the same thing. I think it's semantically the closest, but defining the initial value *after* the function is pretty unexciting.
It would be nice to have a better way to reference function attributes from within a function. (This would also help write recursive functions that could be safely renamed, but I'm not sure many people would necessarily think that's a good thing ;-))
The interaction with renaming isn't particularly significant, but the interaction with decoration is notable. Inside the execution of a function, you'd have a reference to the innermost function, NOT the one that would be identified externally. Whether that's a good thing or a bad thing remains to be seen... Hmm.

import functools
import types

def statics(**kw):
    def deco(func):
        ns = types.SimpleNamespace(**kw)
        @functools.wraps(func)
        def f(*a, **kw):
            return func(*a, **kw, _statics=ns)
        return f
    return deco

@statics(n=0)
def count(*, _statics):
    _statics.n += 1
    return _statics.n

Add it to the pile of clunky options, but it's semantically viable. Unfortunately, it's as introspectable as a closure (that is: not at all).

ChrisA
On Thu, 27 May 2021 at 15:04, Chris Angelico <rosuav@gmail.com> wrote:
Hmm.
def statics(**kw):
    def deco(func):
        ns = types.SimpleNamespace(**kw)
        @functools.wraps(func)
        def f(*a, **kw):
            return func(*a, **kw, _statics=ns)
        return f
    return deco
@statics(n=0)
def count(*, _statics):
    _statics.n += 1
    return _statics.n
Add it to the pile of clunky options, but it's semantically viable. Unfortunately, it's as introspectable as a closure (that is: not at all).
Still arguably clunky, still doesn't have any performance benefits, but possibly a better interface to function attributes than just using them in their raw form.

def static(**statics):
    def deco(func):
        for name, value in statics.items():
            setattr(func, name, value)
        func.__globals__["__me__"] = func
        return func
    return deco

@static(a=1)
def f():
    print(__me__.a)

Paul
On 27 May 2021, at 14:18, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 27, 2021 at 07:56:16AM -0000, Shreyan Avigyan wrote:
A lot of programming languages have something known as static variable storage in *functions*, not *classes*. Static variable storage means a variable limited to a function, yet the data it points to persists until the end of the program.
+1 on this idea.
One common use for function defaults is to optimize function lookups to local variables instead of global or builtins:
def func(arg, len=len):
    # now len is a fast local lookup instead of a slow name lookup
    ...
That’s a CPython performance hack, and “static” would just introduce a different performance hack. IIRC there has been work in recent versions of CPython to reduce the need for that hack by caching values in the VM.

Ronald
—
Twitter / micro.blog: @ronaldoussoren
Blog: https://blog.ronaldoussoren.net/
On Fri, May 28, 2021 at 12:25 AM Paul Moore <p.f.moore@gmail.com> wrote:
On Thu, 27 May 2021 at 15:04, Chris Angelico <rosuav@gmail.com> wrote:
Hmm.
def statics(**kw):
    def deco(func):
        ns = types.SimpleNamespace(**kw)
        @functools.wraps(func)
        def f(*a, **kw):
            return func(*a, **kw, _statics=ns)
        return f
    return deco
@statics(n=0)
def count(*, _statics):
    _statics.n += 1
    return _statics.n
Add it to the pile of clunky options, but it's semantically viable. Unfortunately, it's as introspectable as a closure (that is: not at all).
Still arguably clunky, still doesn't have any performance benefits, but possibly a better interface to function attributes than just using them in their raw form.
def static(**statics):
    def deco(func):
        for name, value in statics.items():
            setattr(func, name, value)
        func.__globals__["__me__"] = func
        return func
    return deco
@static(a=1)
def f():
    print(__me__.a)
Can't use globals like that, since there's only one globals() dict per module. It'd require some compiler magic to make __me__ work the way you want. But on the plus side, this doesn't require a run-time trampoline - all the work is done on the original function object. So, yeah, add it to the pile. ChrisA
On 27 May 2021, at 11:42, Chris Angelico <rosuav@gmail.com> wrote:
On Thu, May 27, 2021 at 7:20 PM Ronald Oussoren via Python-ideas <python-ideas@python.org> wrote:
On 27 May 2021, at 09:56, Shreyan Avigyan <pythonshreyan09@gmail.com> wrote:
Static should behave much like Python's for loop variables.
I have no idea what this means.
For this particular question/proposal: “static” variables in functions in C-like languages are basically hidden global variables, and global variables are generally a bad idea.
Hmm, I'd distinguish a couple of things here. Global constants are most certainly not a problem, and objects that last the entire run of the program are not a problem either. The usual problem with "global variables" is that they become hidden state, action-at-a-distance. You can change the global in one function and it notably affects some other function. Part of the point of statics is that they are *not* global; they might live for the entire run of the program, but they can't be changed by any other function, so there's no AaaD that will mess with your expectations.
Statics are still hidden global state, and those can be problematic regardless of being function local or module global. Having global state like this affects testability and can affect threading as well.
In Python you can get the same result with a global variable and the use of the “global” keyword in a function (or cooperating set of functions) when you want to update the global variable from that function.
That's globals, with all the risks thereof.
Closures or instances of classes with an ``__call__`` method can be used as well and can hide state (with the “consenting adults” caveat: the state is hidden, not inaccessible).
This would be the easiest way to manage it. But both of them provide a way to have multiple independent, yet equivalent, states. If that's what you want, great! But it can also be an unnecessary level of confusion ("why would I ever make a second one of these? Why is there a factory function for something that I'll only ever need one of?"), where static variables wouldn't do that.
The factory function doesn’t need to be part of the public API of a module. I’ve used a pattern like this to create APIs with some hidden state:

```
def make_api():
    state = ...

    def api1(...):
        ...

    def api2(...):
        ...

    return api1, api2

api1, api2 = make_api()
```

I’m not saying that this is a particularly good way to structure code; in general just using a private module global is better (assuming the design calls for some kind of global state).
There is one namespace that would very aptly handle this kind of thing: the function object itself.
>>> def count():
...     count.cur += 1
...     return count.cur
...
>>> count.cur = 0
>>> count()
1
>>> count()
2
>>> count()
3
>>> count()
4
As long as you can reference your own function reliably, this will work. There may be room for a keyword like this_function, but for the most part, it's not necessary, and you can happily work with the function by its name. It's a little clunkier than being able to say "static cur = 0;" to initialize it (the initializer has to go *after* the function, which feels backwards), but the functionality is all there.
I generally dislike functions with internal state like this.

Ronald
—
Twitter / micro.blog: @ronaldoussoren
Blog: https://blog.ronaldoussoren.net/
On Thu, May 27, 2021 at 08:46:03AM -0400, Ricky Teachey wrote:
Couldn't you already get pretty close to this by attaching your static values to the function __dict__?
Sure, we can use function attributes as a form of static storage, and I have done that. It is sometimes quite handy. But it's not really the same as proper static variables.

(1) It will be much less convenient. We have to write `func.variable` when we want `variable`, and if the function gets rebound to a new name, the call will fail.

(2) The performance will be worse. The name look-up to get `func` is a relatively slow LOAD_GLOBAL instead of LOAD_FAST, and then on top of that we need to do a second name look-up, using attribute access. And if the func happens to be a method rather than a top-level function, then we end up with three lookups: `self.method.variable`. And that could involve a deep inheritance chain.

(3) And the semantics are completely different:

- Function attributes are intended to be visible to the caller; they should be part of the function's API, just like any other attribute.

- Static variables are intended to be internal to the function; they are not part of the function's API. They are conceptually private to the function. (At least as "private" as Python allows, given its dynamic nature.)

[...]
But could there be a decorator that links the function __dict__ to locals(), so they are intertwined?
I doubt that this would be possible, without large changes to the internal workings of functions.
The locals dict in the function body would look something like this:
ChainMap(locals(), {'a':1})
Are you aware that the locals() dictionary inside functions is a snapshot of the actual, true, hidden locals?

Outside of a function, locals() returns the true global dict which is used to hold variables, so this works fine:

x = 1
locals()['x'] = 99  # locals() here returns globals()
print(x)  # 99

But inside a function, well, try it for yourself :-)

def test():
    x = 1
    locals()['x'] = 99
    print(x)

I'm not an expert on the internals of functions, but I think that my earlier proposal could be added to the existing implementation, while your linking-values proposal would require a full re-implementation of how functions operate.

-- 
Steve
Statics are still hidden global state, and those can be problematic regardless of being function local or module global. Having global state like this affects testability and can affect threading as well.
I think this is a very good point. I'm no expert, but I know a HUGE amount of old C code isn't thread-safe -- and static has something to do with that?

Not that people shouldn't be allowed to write non-thread-safe code in Python, but it shouldn't be encouraged. An awful lot of code is written with no idea that it will be run in multi-threaded code later on.

Personally, I can't think of any times when I would have used this -- maybe because it wasn't there, so I didn't think about it.

-CHB

-- 
Christopher Barker, PhD (Chris)
Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
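The thread-safety concern can be sketched concretely: any per-function persistent state shared across threads needs the same locking discipline as a global (the names `count` and `_lock` here are illustrative):

```python
import threading

_lock = threading.Lock()

def count():
    # Function-attribute "static"; without the lock, concurrent
    # `count.n += 1` operations could interleave and lose updates.
    with _lock:
        count.n += 1
        return count.n

count.n = 0

threads = [threading.Thread(target=lambda: [count() for _ in range(10_000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(count.n)  # 40000 with the lock; possibly less without it
```

A hypothetical `static` keyword would face exactly the same issue: the language feature would hide the state, not make it thread-safe.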
I'll try to implement the idea roughly and observe how much performance improvement (or the opposite) occurs.
On Thu, May 27, 2021 at 04:37:10PM +0200, Ronald Oussoren wrote:
One common use for function defaults is to optimize function lookups to local variables instead of global or builtins:
def func(arg, len=len): # now len is a fast local lookup instead of a slow name lookup
That’s a CPython performance hack,
No it isn't. I think we can assume that any non-crappy implementation will have faster access to locals than to globals and builtins, or at least no worse. So it is a fair expectation that any time you can turn a global lookup into a local lookup, you should get some performance benefit.

Function parameters are guaranteed to be local variables. Default values are guaranteed to be evaluated once, at function definition time. These are language guarantees, not CPython implementation details.

The precise performance benefit will, of course, vary from VM to VM, but we should expect that any serious implementation will give some performance benefit. (All the usual optimization caveats still apply: measure, don't guess, etc. Optimizations that work in theory may not always work in practice, yadda yadda yadda.)
and “static” would just introduce a different performance hack. IIRC there has been work in recent versions of CPython to reduce the need for that hack by caching values in the VM.
This trick has worked all the way back to Python 1.5, maybe even longer, so I think it's pretty stable against changes in the interpreter. But for the sake of argument I'll accept that some future performance improvement may reduce the benefit of the `len=len` trick to negligible amounts. Static storage in functions will still be useful. Any time you need data to persist from one function call to another, you can either:

- expose your data in a global variable;
- expose it as a function attribute;
- use a mutable function default;
- rewrite your function as a callable instance;
- or obfuscate your function by wrapping it in a closure;

all of which are merely work-arounds for the lack of proper static locals.

-- Steve
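[Editor's note: three of the work-arounds listed above, sketched side by side as counters. The names are illustrative only; each keeps per-function state alive across calls without a `static` keyword.]

```python
# 1. Function attribute: state hangs off the function object itself.
def count_attr():
    count_attr.n += 1
    return count_attr.n
count_attr.n = 0

# 2. Mutable default: the classic (and much-criticised) hack.
def count_default(_state=[0]):
    _state[0] += 1
    return _state[0]

# 3. Closure: a factory hides the state in a cell variable.
def make_counter():
    n = 0
    def count():
        nonlocal n
        n += 1
        return n
    return count
count_closure = make_counter()

assert [count_attr(), count_attr()] == [1, 2]
assert [count_default(), count_default()] == [1, 2]
assert [count_closure(), count_closure()] == [1, 2]
```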
On Thu, 27 May 2021 at 15:49, Chris Angelico <rosuav@gmail.com> wrote:
On Fri, May 28, 2021 at 12:25 AM Paul Moore <p.f.moore@gmail.com> wrote:
On Thu, 27 May 2021 at 15:04, Chris Angelico <rosuav@gmail.com> wrote:
    def static(**statics):
        def deco(func):
            for name, value in statics.items():
                setattr(func, name, value)
            func.__globals__["__me__"] = func
            return func
        return deco
    @static(a=1)
    def f():
        print(__me__.a)
Can't use globals like that, since there's only one globals() dict per module. It'd require some compiler magic to make __me__ work the way you want. But on the plus side, this doesn't require a run-time trampoline - all the work is done on the original function object.
Rats, you're right. Hacking globals felt like a code smell. I considered trying to get really abusive by injecting some sort of local but that's not going to work because the code's already compiled by the time the decorator runs:
    >>> def f():
    ...     print(__me__.a)
    ...
    >>> dis.dis(f)
      2           0 LOAD_GLOBAL              0 (print)
                  2 LOAD_GLOBAL              1 (__me__)
                  4 LOAD_ATTR                2 (a)
                  6 CALL_FUNCTION            1
                  8 POP_TOP
                 10 LOAD_CONST               0 (None)
                 12 RETURN_VALUE
So yes, without compiler support you can only go so far (but you could always use the _getframe approach instead).
So, yeah, add it to the pile.
Yep. It's an interesting exercise, is all, and honestly, I don't think I'd use static much anyway, so something "good enough" that works now is probably more than enough for me personally. I do think that having a compiler supported way of referring efficiently to the current function (without relying on looking it up by name) would be an interesting alternative proposal, if we *are* looking at actual language changes - it allows for something functionally equivalent to statics, without the performance advantage but in compensation it has additional uses (recursion, and more general access to function attributes). I'm not going to push for it though, as I say, I don't have enough use for it to want to invest the time in it. Paul
On Fri, May 28, 2021 at 1:17 AM Christopher Barker <pythonchb@gmail.com> wrote:
Statics are still hidden global state, and those can be problematic regardless of being function local or module global. Having global state like this affects testability and can affect threading as well.
I think this is a very good point. I'm no expert, but I know a HUGE amount of old C code isn't thread-safe -- and static has something to do with that?
Not that people shouldn't be allowed to write non thread-safe code in Python, but it shouldn't be encouraged. An awful lot of code is written with no idea that it will be run in multi-threaded code later on.
Personally, I can't think of any times when I would have used this -- maybe because it wasn't there, so I didn't think about it.
Maybe, but it's worth noting that Python code has an inherent level of thread-safety as guaranteed by the language itself; you can't, for instance, have threads trample over each other's reference counts by trying to incref or decref the same object at the same time. Fundamentally, ANY form of mutable state will bring with it considerations for multithreading. Generally, the solution is to keep the state mutation to the tightest part possible, preferably some atomic action (if necessary, by guarding it with a lock), and then you have less to worry about. Hence my preference for direct manipulation of the state, rather than snapshotting it at function start and reapplying it at function exit. Multithreading isn't very hard in Python, and even when you get something wrong, it's not too difficult to figure out what happened. It's not nearly as hard as in C, where you have more possible things to go wrong. ChrisA
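[Editor's note: the advice above -- keep the state mutation to the tightest section possible, guarded by a lock -- can be shown with a small threaded counter. The names are illustrative; with the lock held only around the read-modify-write, the final count is deterministic.]

```python
import threading

counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        # Keep the critical section as small as possible: only the
        # read-modify-write of the shared state is guarded.
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter == 40_000   # no lost updates
```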
On Thu, May 27, 2021 at 11:21:26PM +1000, Chris Angelico wrote:
On Thu, May 27, 2021 at 10:20 PM Steven D'Aprano <steve@pearwood.info> wrote:
Here is a sketch of how this could work, given a function like this:
    def func(arg):
        static spam, eggs
        static cheese = expression
        ...
At function declaration time, the two static statements tell the compiler to:
* treat spam, eggs and cheese as local variables (use LOAD_FAST instead of LOAD_GLOBAL for lookups);
I don't think LOAD_FAST would be suitable here - isn't it always going to look in the stack frame?
Where else should it look? I'll admit I'm not an expert on the various LOAD_* bytecodes, but I'm pretty sure LOAD_FAST is used for local variables. Am I wrong? My idea is that the static variable should be a local variable that gets saved on function exit and restored on function entry. Is there another concept of static variables that I should know about?
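[Editor's note: the local-vs-global distinction discussed here can be inspected with the `dis` module. A quick sketch; the exact opcode names vary across CPython versions, so the check is deliberately loose.]

```python
import dis

x = 1

def use_global():
    return x  # name never assigned in the function -> global lookup

def use_local():
    x = 1
    return x  # name assigned in the function -> fast, array-indexed local

global_ops = {i.opname for i in dis.get_instructions(use_global)}
local_ops = {i.opname for i in dis.get_instructions(use_local)}
print(sorted(global_ops), sorted(local_ops))
```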
* allocate static storage for them using the same (or similar) mechanism used for function default values;
Default values are attached to the function object (in either the __defaults__ tuple or the __kwdefaults__ dict).
Right. In principle we could just shove the static values in the __defaults__ tuple, but it's probably better to use a distinct __statics__ dunder.
* spam and eggs get initialised as None;
* cheese gets initialised to the value of `expression`, evaluated at function declaration time just as default arguments are.
When the function is called:
* the interpreter automatically initialises the static variables with the stored values;
* when the function exits (whether by return or by raising an exception) the static storage will be updated with the current values of the variables.
Hmm, I see what you mean. Not sure that this is really necessary though -
If you don't store the values away somewhere on function exit, how do you expect them to persist from one call to the next? Remember that they are fundamentally variables -- they should be able to vary from one call to the next.
and it could cause extremely confusing results with threading.
Don't we have the same issues with globals, function attributes, and instance attributes? I'm okay with saying that if you use static *variables* (i.e. they change their value from one call to the next) they won't be thread-safe. As opposed to static "constants" that are only read, never written. But if you have a suggestion for a thread-safe way for functions to keep variables alive from one call to the next, that doesn't suffer a big performance hit, I'm all ears :-)
Agreed, I'd use it too. But I'd define the semantics slightly differently:
* If there's an expression given, evaluate that when the 'def' statement is executed, same as default args
That's what I said, except I called it function definition time :-)
* Otherwise it'll be uninitialized, or None, bikeshedding opportunity, have fun
I decided on initialising them to None because it is more convenient to write:

    if my_static_var is None:
        # First time, expensive computation.
        ...

than the alternative with catching an exception. YMMV.
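[Editor's note: the None-sentinel pattern described here can be emulated today with the mutable-default hack; a rough sketch with illustrative names, where the first call pays for the setup and later calls reuse the cached result.]

```python
def expensive_setup():
    # Stand-in for a genuinely costly computation.
    return {"table": list(range(10))}

def lookup(i, _cache=[None]):
    # Emulates ``static cache = None``: initialise once, on first use.
    if _cache[0] is None:
        _cache[0] = expensive_setup()
    return _cache[0]["table"][i]

assert lookup(3) == 3   # first call triggers the setup
assert lookup(7) == 7   # later calls reuse the cached table
```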
* Usage of this name uses a dedicated LOAD_STATIC or STORE_STATIC bytecode * The values of the statics are stored in some sort of high-performance cell collection, indexed numerically
Isn't that what LOAD_FAST already does?
It would be acceptable to store statics in a dict, too, but I suspect that that would negate some or all of the performance advantages. Whichever way, though, it should ideally be introspectable via a dunder attribute on the function.
Maybe I'm misunderstanding you, or you me. Let me explain further what I think can happen.

When the function is being executed, I think that static variables should be treated as local variables. Why build a whole second implementation for fast cell-based variables, with a separate set of bytecodes, to do exactly what locals and LOAD_FAST do? We should reuse the existing fast local variable machinery, not duplicate it.

The only part that is different is that those static locals have to be automatically initialised on function entry (like parameters are), and then on function exit their values have to be stored away in a dunder so they aren't lost (as plain local variables are lost when the function exits). That bit is new.

We already have a mechanism to initialise locals: it's used for default values. The persistent data is retrieved from the appropriate dunder on the function and bound to the local variable (parameter). We can do the same thing. We will probably use a different dunder.

That doesn't mean that every single access to the local static variable needs to be retrieved from the dunder; that would likely be slow. Only on function entry.

The difference between function default arguments and statics is that if you rebind the parameter, that new value doesn't get written out to the __defaults__ dunder on function exit. But for statics, it should be.

Are we on the same page here?
Semantically, this would be very similar to writing code like this:
    def count():
        THIS_FUNCTION.__statics__["n"] += 1
        return THIS_FUNCTION.__statics__["n"]

    count.__statics__ = {"n": 1}
except that it'd be more optimized (and wouldn't require magic to get a function self-reference).
The most obvious optimization is that you only write the static value out to the dunder on function exit, not on every rebinding operation.
Note that the statics *must* be defined on the function, NOT on the code object. Just like function defaults, they need to be associated with individual instances of a function.
Absolutely. -- Steve
On Thu, May 27, 2021 at 04:53:17PM +0200, Ronald Oussoren via Python-ideas wrote:
Statics are still hidden global state
How are they *global* state when they are specific to an individual function? We can already get the basic behaviour of statics right now, only with an ugly hack that pollutes the function parameter list and is inconvenient to use.

    static = [0]

    def spam(arg, static=[0]):
        static[0] += arg
        return static[0]

    def eggs(arg, static=[0]):
        static[0] -= arg
        return static[0]

Is it your argument that all three `static` variables are using shared global state? If not, then I have no idea what you mean by insisting that statics are "hidden global state". They are hidden state, but not global. Just like closures and instance attributes.
and those can be problematic regardless of being function local or module global. Having global state like this affects testability and can affect threading as well.
It sounds like your problem is with *mutable state* in general. Global variables, instance attributes, class attributes, they all have exactly the same issues. So don't use mutable state. Nobody is forcing you to use this feature if you prefer to write in a functional style with no mutable state.
The factory function doesn’t need to be part of the public API of a module, I’ve used a pattern like this to create APIs with some hidden state:
```
def make_api():
    state = ...

    def api1(…):
        …

    def api2(…):
        …

    return api1, api2

api1, api2 = make_api()
```
Congratulations, you've just used static local variables. You just used closures for the implementation.
I’m not saying that this is a particularly good way to structure code, in general just using a private module global is better (assuming the design calls for some kind of global state).
You: "Global state is bad! Don't use global state!" Also you: "Don't use local state (closures)! Use global state, it's better!" *wink* -- Steve
On Fri, May 28, 2021 at 2:03 AM Steven D'Aprano <steve@pearwood.info> wrote:
I'll admit I'm not an expert on the various LOAD_* bytecodes, but I'm pretty sure LOAD_FAST is used for local variables. Am I wrong?
You're correct, but I dispute that that's the best way to do things.
Right. In principle we could just shove the static values in the __defaults__ tuple, but it's probably better to use a distinct __statics__ dunder.
Right, agreed.
If you don't store the values away somewhere on function exit, how do you expect them to persist from one call to the next? Remember that they are fundamentally variables -- they should be able to vary from one call to the next.
and it could cause extremely confusing results with threading.
Don't we have the same issues with globals, function attributes, and instance attributes?
Your proposal requires that every static involve a load at function start and a store-back at function end. My proposal requires that they get loaded directly from their one true storage location (a dict, or some kind of cell system, or whatever) attached to the function, and stored directly back there when assigned to. Huge difference. With your proposal, two threads that are *anywhere inside the function* can trample over each other's statics. Consider:

    def f():
        static n = 0
        n += 1
        ...
        ...
        ...
        ...
        # end of function

Your proposal requires that the "real" value of n not be updated until the function exits. What if that takes a long time to happen - should the static value remain at its old value until then? What if it never exits at all - if it's a generator function and never gets fully pumped? Mutating a static needs to happen immediately.
I'm okay with saying that if you use static *variables* (i.e. they change their value from one call to the next) they won't be thread-safe. As opposed to static "constants" that are only read, never written.
They'll never be fully thread-safe, but it should be possible to have a short-lived lock around the mutation site itself, followed by an actual stack-local. By your proposal, the *entire function* becomes non-thread-safe, *no matter what you do with locks*. By my proposal, this kind of code becomes entirely sane:

    def f():
        static lock = Lock()
        static count = 0
        with lock:
            my_count = count + 1
            count = my_count
        ...
        ...
        ...
        print(my_count)

There's a language guarantee with every other form of assignment that it will happen immediately in its actual target location. There's no write-back caching anywhere else in the language. Why have it here?
But if you have a suggestion for a thread-safe way for functions to keep variables alive from one call to the next, that doesn't suffer a big performance hit, I'm all ears :-)
The exact way that I described: a function attribute and a dedicated opcode pair :)
Agreed, I'd use it too. But I'd define the semantics slightly differently:
* If there's an expression given, evaluate that when the 'def' statement is executed, same as default args
That's what I said, except I called it function definition time :-)
Yep, that part we agree on.
* Otherwise it'll be uninitialized, or None, bikeshedding opportunity, have fun
I decided on initialising them to None because it is more convenient to write:
    if my_static_var is None:
        # First time, expensive computation.
        ...
than the alternative with catching an exception. YMMV.
Not hugely important either way, I'd be happy with either.
* Usage of this name uses a dedicated LOAD_STATIC or STORE_STATIC bytecode * The values of the statics are stored in some sort of high-performance cell collection, indexed numerically
Isn't that what LOAD_FAST already does?
This would be a separate cell collection. The LOAD_FAST cells are in the stack frame, the LOAD_STATIC cells are on the function object. But yes, the code could be pretty much identical.
It would be acceptable to store statics in a dict, too, but I suspect that that would negate some or all of the performance advantages. Whichever way, though, it should ideally be introspectable via a dunder attribute on the function.
Maybe I'm misunderstanding you, or you me. Let me explain further what I think can happen.
When the function is being executed, I think that static variables should be treated as local variables. Why build a whole second implementation for fast cell-based variables, with a separate set of bytecodes, to do exactly what locals and LOAD_FAST does? We should reuse the existing fast local variable machinery, not duplicate it.
Because locals are local to the invocation, not the function. They're TOO local.
The only part that is different is that those static locals have to be automatically initialised on function entry (like parameters are), and then on function exit their values have to be stored away in a dunder so they aren't lost (as plain local variables are lost when the function exists). That bit is new.
Right, except that the mutations have to happen immediately.
We already have a mechanism to initialise locals: it's used for default values. The persistent data is retrieved from the appropriate dunder on the function and bound to the local variable (parameter). We can do the same thing. We will probably use a different dunder.
That doesn't mean that every single access to the local static variable needs to be retrieved from the dunder, that would likely be slow. Only on function entry.
I'm not sure that it would have to be all that slow; global lookup has to do a lot more work than static lookup would. But that would be something to measure.
The difference between function default arguments and statics is that if you rebind the parameter, that new value doesn't get written out to the __defaults__ dunder on function exit. But for statics, it should be.
Are we on the same page here?
No, because function exit is too late.
Semantically, this would be very similar to writing code like this:
def count(): THIS_FUNCTION.__statics__["n"] += 1 return THIS_FUNCTION.__statics__["n"] count.__statics__ = {"n": 1}
except that it'd be more optimized (and wouldn't require magic to get a function self-reference).
The most obvious optimization is that you only write the static value out to the dunder on function exit, not on ever rebinding operation.
That's too aggressive an optimization. Consider this function:

    def walk(tree):
        if tree is None:
            return
        static idx = 0
        walk(tree.left)
        idx += 1
        print(idx, tree.data)
        walk(tree.right)

Suppose that, during the recursive call down the left tree, idx gets incremented five times. Now we return out of there and come back to the original invocation. What should the value of idx be? By your semantics, it's been loaded at the very start of the function, so it'll still be zero! And at the end of the function, regardless of the recursive calls to either left or right, a 1 will be written out to the static. You've effectively made statics utterly useless for recursion.

In contrast, directly loading and writing the statics has the exact semantics of every other load/store in the language - it happens immediately at its correct location, atomically, and can be relied upon.

ChrisA
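[Editor's note: for comparison, the same traversal written with a function attribute -- one of today's work-arounds -- mutates its state immediately, so recursion counts correctly. The `Node` structure and names are illustrative.]

```python
from collections import namedtuple

Node = namedtuple("Node", "left data right")

def walk(tree):
    # The counter lives on the function object and is updated
    # immediately, so recursive calls see each other's increments.
    if tree is None:
        return
    walk(tree.left)
    walk.idx += 1
    print(walk.idx, tree.data)
    walk(tree.right)
walk.idx = 0

tree = Node(Node(None, "a", None), "b", Node(None, "c", None))
walk(tree)
assert walk.idx == 3   # all three nodes counted, despite recursion
```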
My proposal is somewhat the sum of all of your ideas. I propose there should be a STORE_STATIC_FAST opcode that stores a static variable. Static variables will be declared only once and will be initialized to None (the statement syntax will be similar to that of global). They will be initialized in MAKE_FUNCTION and then set by STORE_STATIC_FAST. Where will the variables be stored? They will have references in locals and __statics__, therefore LOAD_FAST can find them. So I hope there won't be a performance decrease, but a performance increase is also not guaranteed. :-) And if these are thread-unsafe, then is __defaults__ also thread-unsafe?
On Fri, May 28, 2021 at 2:44 AM Shreyan Avigyan <pythonshreyan09@gmail.com> wrote:
My proposal is somewhat the sum of all of your ideas. Well I propose there should a STORE_STATIC_FAST opcode that stores a static variable. Static variable will be declared only once and will be initialized to None (statement syntax will be similar to that of global). It will be initialized in MAKE_FUNCTION. Now it will be set by STORE_STATIC_FAST. Where will the variables be stored? It will have references in locals and __statics__. Therefore LOAD_FAST can find it. So I don't hope there will be performance decrease but performance increase is also not guaranteed. :-)
The duplicated store fixes half the problem, but it still fails on the recursion example that I posted in reply to Steve. It would be a nice optimization, but it may or may not be sufficient.
And if these are thread unsafe then is __defaults__ also thread unsafe?
Thread safety isn't a problem with constants. Python guarantees that internal details (like CPython's reference counts) aren't going to be trampled on, and inside your code, nothing is going to change __defaults__ (unless you're doing something bizarre, in which case it isn't about __defaults__ any more). Thread safety only becomes an issue when you have something like this:

    counter = 0

    def get_next():
        global counter
        counter += 1
        return counter

This disassembles to:

      6           0 LOAD_GLOBAL              0 (counter)
                  2 LOAD_CONST               1 (1)
                  4 INPLACE_ADD
                  6 STORE_GLOBAL             0 (counter)

      7           8 LOAD_GLOBAL              0 (counter)
                 10 RETURN_VALUE

A context switch can happen between any two of those instructions. That means one thread could load the global, then another thread could load the same value, resulting in both of them writing back the same incremented value. Or, between opcodes 6 and 8 (between the lines of Python code), you could store the value, then fetch back a different value.

None of this is a problem if you're using constants. The only reason to use statics instead of global constants is performance - the "len=len" trick is specific to this performance advantage - but you don't have to worry about thread safety.

ChrisA
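[Editor's note: the lost-update interleaving described above can be simulated deterministically, without real threads, by using generators to pause each "thread" at the point where a context switch could occur. The structure below is illustrative only.]

```python
counter = 0

def incrementer():
    # Each step mirrors one region of the disassembly above.
    global counter
    value = counter          # LOAD_GLOBAL: read the shared value
    yield                    # <- a context switch could happen here
    counter = value + 1      # STORE_GLOBAL: write back the result

a, b = incrementer(), incrementer()
next(a)          # "thread" A loads 0
next(b)          # "thread" B also loads 0
for gen in (a, b):
    try:
        next(gen)            # each writes back 0 + 1
    except StopIteration:
        pass

# Two increments ran, but one update was lost.
assert counter == 1
```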
On 2021-05-27 17:44, Shreyan Avigyan wrote:
My proposal is somewhat the sum of all of your ideas. Well I propose there should a STORE_STATIC_FAST opcode that stores a static variable. Static variable will be declared only once and will be initialized to None (statement syntax will be similar to that of global). It will be initialized in MAKE_FUNCTION. Now it will be set by STORE_STATIC_FAST. Where will the variables be stored? It will have references in locals and __statics__. Therefore LOAD_FAST can find it. So I don't hope there will be performance decrease but performance increase is also not guaranteed. :-)
And if these are thread unsafe then is __defaults__ also thread unsafe?
Why initialise them to None? Other variables don't behave like that. My preference would be to have something like this:

    def foo():
        static x = 0

that would bind 'x' the first time it met that statement, and if it tried to use 'x' before it had met that statement for the first time, it would raise something like UnboundLocalError (perhaps UnboundStaticError in this case?), as happens currently for local variables.
You can just use nonlocal variables:

    def stator():
        static_var_1 = 0
        def myfunc(n):
            nonlocal static_var_1
            static_var_1 += n
            return static_var_1
        return myfunc

    myfunc = stator()
    del stator

Or you can attach any variable to the function itself:

    def myfunc(n):
        if not hasattr(myfunc, "static_var_1"):
            myfunc.static_var_1 = 0
        myfunc.static_var_1 += n
        return myfunc.static_var_1

There are several options to get the effect that static variables would have, not to mention that state of this type is better held as class attributes anyway. BTW:

    # callable singleton - AKA "function":
    class myfunc(metaclass=lambda *args: type(*args)()):
        def __init__(self):
            self.static_var_1 = 0
        def __call__(self, n):
            self.static_var_1 += n
            return self.static_var_1

Or, as you put it in the first e-mail, the static var could be built into a data structure in the default arguments of the function. (There are also contextvars, threading.local, etc...)

I can't see a separate "static" declaration being of any use. Beginners needing the functionality should just resort to either globals or plain classes to keep state. As you get the way Python works, there are plenty of ways to keep state, without making the language more complicated.

On Thu, 27 May 2021 at 14:06, Chris Angelico <rosuav@gmail.com> wrote:
On Fri, May 28, 2021 at 2:44 AM Shreyan Avigyan <pythonshreyan09@gmail.com> wrote:
My proposal is somewhat the sum of all of your ideas. Well I propose
there should a STORE_STATIC_FAST opcode that stores a static variable. Static variable will be declared only once and will be initialized to None (statement syntax will be similar to that of global). It will be initialized in MAKE_FUNCTION. Now it will be set by STORE_STATIC_FAST. Where will the variables be stored? It will have references in locals and __statics__. Therefore LOAD_FAST can find it. So I don't hope there will be performance decrease but performance increase is also not guaranteed. :-)
The duplicated store fixes half the problem, but it still fails on the recursion example that I posted in reply to Steve. It would be a nice optimization, but it may or may not be sufficient.
And if these are thread unsafe then is __defaults__ also thread unsafe?
Thread safety isn't a problem with constants. Python guarantees that internal details (like CPython's reference counts) aren't going to be trampled on, and inside your code, nothing is going to change __defaults__ (unless you're doing something bizarre, in which case it isn't about __defaults__ any more). Thread safety only becomes an issue when you have something like this:
counter = 0 def get_next(): global counter counter += 1 return counter
This disassembles to:
6 0 LOAD_GLOBAL 0 (counter) 2 LOAD_CONST 1 (1) 4 INPLACE_ADD 6 STORE_GLOBAL 0 (counter)
7 8 LOAD_GLOBAL 0 (counter) 10 RETURN_VALUE
A context switch can happen between any two of those instructions. That means one thread could load the global, then another thread could load the same value, resulting in both of them writing back the same incremented value. Or, between opcodes 6 and 8 (between the lines of Python code), you could store the value, then fetch back a different value.
None of this is a problem if you're using constants. The only reason to use statics instead of global constants is performance - the "len=len" trick is specific to this performance advantage - but you don't have to worry about thread safety.
ChrisA
On Thu, 27 May 2021 at 10:39, Paul Moore <p.f.moore@gmail.com> wrote: [...]
the performance aspect, function attributes provide this functionality, but there's a significant problem with using them because you can't access them other than by referencing the *name* of the function being defined. [...]
It would be nice to have a better way to reference function attributes from within a function. (This would also help write recursive functions that could be safely renamed, but I'm not sure many people would necessarily think that's a good thing ;-))
Now, yes, being able to reference a function from inside itself is a feature I had missed over the years.
Paul
A context switch can happen between any two of those instructions. That means one thread could load the global, then another thread could load the same value, resulting in both of them writing back the same incremented value. Or, between opcodes 6 and 8 (between the lines of Python code), you could store the value, then fetch back a different value.
I see now. Then we can go with Steven's idea. Let's keep the changes in locals temporarily, and when the function yields or returns, modify the __statics__ member. Even if it's a generator, it will stop iterating at some point, and if it doesn't, then the member was never meant to be modified.
On Fri, May 28, 2021 at 4:14 AM Shreyan Avigyan <pythonshreyan09@gmail.com> wrote:
A context switch can happen between any two of those instructions. That means one thread could load the global, then another thread could load the same value, resulting in both of them writing back the same incremented value. Or, between opcodes 6 and 8 (between the lines of Python code), you could store the value, then fetch back a different value.
I see now. Then we can go with Steven's idea. Let's keep the changes in locals temporarily and when it yields or returns then modify the __statics__ member. And even if it's a generator it will stop iteration some time and if it doesn't then the member wasn't meant to be modified.
So you're saying that...

    def f():
        static x = 0
        x += 1
        yield x

    next(f())
    next(f())
    next(f())

will yield 1 every time? According to the "write-back later" semantics, this is only going to actually update the static once it gets nexted the second time. This would be utterly unique in all of Python: that assignment doesn't happen when you say it should happen, it instead happens arbitrarily later. I would have to call that behaviour "buggy". If you use literally any other form of persistent state, ANY other, it would increment.

Even worse: I don't see how your conclusion relates to my explanation of threading, which you describe. Instead of having a minor risk of a context switch in the middle of a statement (which can be controlled with a lock), you have a major risk of a context switch ANYWHERE in the function - and now there's no way you can use a lock to protect it, because the lock would (by definition) be released before the function epilogue.

ChrisA
Reply to Chris: The only problem with that approach is that we can't tell whether that's the last yield statement. To achieve that we need to keep going until we encounter a StopIteration. And the value of x would be 3, because we're not iterating over one particular generator; we're creating multiple instances, which actually would increase x. Also, is there another way we can make it thread-safe? Steven's idea is actually the only solution we've encountered till now. I'd be really happy if someone could come up with an even better idea.
On Fri, May 28, 2021 at 4:49 AM Shreyan Avigyan <pythonshreyan09@gmail.com> wrote:
Reply to Chris:
The only problem is that with that approach that we can't understand if that's the last yield statement. To achieve that we need to keep going until we encounter a StopIteration. And the value of x would 3. Because we're not iterating over a particular generator. We're creating multiple instances which actually would increase x.
And also is there another way we can make it thread safe? Steven's idea is actually the only solution we've encountered till now. I'd be really happy if someone could come up with even a better idea.
This is thread-safe:

    from threading import Lock

    lock = Lock()
    counter = 0

    def get_next():
        with lock:
            global counter
            counter += 1
            my_counter = counter
        ...
        ...
        ...

The equivalent with a static counter and a static Lock object would also be thread-safe under my proposed semantics. This is guaranteed, because ALL mutation happens while the lock is held, and then there's a stack-local variable for the value you want to use. The language promises that the assignment back to the global happens immediately, not at some later point after the lock has been released. This is, in fact, the normal expectation of locks and assignment, and it works whether you're using a global, a closure (and a nonlocal assignment), a mutable function default argument, an attribute on the function object, or in fact, *any other assignment in the language*. They all happen immediately. You would have to ensure that you don't have a yield inside the locking context, but anywhere else would be fine.

The biggest downside of this sort of system is the overhead of the locking. A high-performance thread-aware static system should be able to avoid some or even most of that overhead. But mainly, the semantics have to be such that locks behave sanely, and can be a solution to other problems.

ChrisA
On Fri, May 28, 2021 at 4:49 AM Shreyan Avigyan <pythonshreyan09@gmail.com> wrote:
Reply to Chris:
The only problem with that approach is that we can't tell whether we've reached the last yield statement; to find that out we would have to keep going until we encounter a StopIteration. And the value of x would be 3, because we're not iterating over one particular generator -- we're creating multiple instances, each of which would increment x.
Also, is there another way to make it thread-safe? Steven's idea is the only solution we've encountered so far. I'd be really happy if someone could come up with an even better one.
Also - Steven's idea is NOT a solution; it worsens the problem. I don't see how it is a solution at all.

ChrisA
On 2021-05-27 05:18, Steven D'Aprano wrote:
On Thu, May 27, 2021 at 07:56:16AM -0000, Shreyan Avigyan wrote:
Lot of programming languages have something known as static variable storage in *functions* not *classes*. Static variable storage means a variable limited to a function yet the data it points to persists until the end of the program.
+1 on this idea.
One common use for function defaults is to optimize function lookups to local variables instead of global or builtins:
def func(arg, len=len): # now len is a fast local lookup instead of a slow name lookup
Benchmarking shows that this actually does make a significant difference to performance, but it's a technique under-used because of the horribleness of a len=len parameter.
(Raymond Hettinger is, I think, a proponent of this optimization trick. At least I learned it from his code.)
I don't see this as a great motivation for this feature. If the goal is to make things faster, I think that would be better served by making the interpreter smarter or adding other global-level optimizations. As it is, you're just trading one "manual" optimization (len=len) for another (static len). Yes, the new one is perhaps slightly less ugly, but it still puts the onus on the user to manually "declare" variables as local, not because they are semantically local in any way, but just because we want a faster lookup. I see that as still a hack.

A non-hack would be some kind of JIT or optimizing interpreter that actually reasons about how the variables are used, so that the programmer doesn't have to waste time worrying about hand-tuning optimizations like this. So basically for me anything that involves the programmer saying "Please make this part faster" is a hack. :-) We all want everything to be as fast as possible all the time, and insofar as we're concerned about speed we should focus on making the entire interpreter smarter so everything is faster, rather than adding new ways for the programmer to do extra work to make just a few things faster.

Even something like a way of specifying constants (which has been proposed in another thread) would be better to my eye. That would let certain variables be marked as "safe" so that they could always be looked up fast, because we'd be sure they're never going to change.

As to the original proposal, I'm not in favor of it. It's fairly uncommon for me to want to do this, and in the cases where I do, Python classes are simple enough that I can just make a class with a method (or a __call__ if I want to be really cool) that stores the data in a way that's more transparent and more clearly connected to the normal ways of storing state in Python. It just isn't worth adding yet another complexity to the language for this minor use case.

-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
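For concreteness, Brendan's class-with-__call__ alternative might look something like this minimal sketch (the Counter name and its API are invented for illustration); the state persists across calls, but lives in plain sight on the instance rather than behind a hypothetical `static` keyword:

```python
class Counter:
    """Callable object whose state persists across calls -- the
    class-based alternative to a function-local static variable."""

    def __init__(self):
        self.count = 0  # state lives on the instance, transparently

    def __call__(self):
        self.count += 1
        return self.count

next_id = Counter()
print(next_id(), next_id(), next_id())  # 1 2 3
```

Each Counter instance carries its own independent state, which is exactly the refactoring flexibility a single hidden static would lack.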
Chris wrote:
This is thread-safe:
from threading import Lock
lock = Lock()
counter = 0

def get_next():
    with lock:
        global counter
        counter += 1
        my_counter = counter
This is a great workaround, and I can try to improve on it. But first of all, should we depend on the user to do this locking? I don't think so. So is it possible to implement this behind the scenes without affecting current performance?
On Fri, May 28, 2021 at 5:19 AM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-05-27 05:18, Steven D'Aprano wrote:
On Thu, May 27, 2021 at 07:56:16AM -0000, Shreyan Avigyan wrote:
Lot of programming languages have something known as static variable storage in *functions* not *classes*. Static variable storage means a variable limited to a function yet the data it points to persists until the end of the program.
+1 on this idea.
One common use for function defaults is to optimize function lookups to local variables instead of global or builtins:
def func(arg, len=len): # now len is a fast local lookup instead of a slow name lookup
Benchmarking shows that this actually does make a significant difference to performance, but it's a technique under-used because of the horribleness of a len=len parameter.
(Raymond Hettinger is, I think, a proponent of this optimization trick. At least I learned it from his code.)
I don't see this as a great motivation for this feature. If the goal is to make things faster I think that would be better served by making the interpreter smarter or adding other global-level optimizations. As it is, you're just trading one "manual" optimization (len=len) for another (static len).
Yes, the new one is perhaps slightly less ugly, but it still puts the onus on the user to manually "declare" variables as local, not because they are semantically local in any way, but just because we want a faster lookup. I see that as still a hack. A non-hack would be some kind of JIT or optimizing interpreter that actually reasons about how the variables are used so that the programmer doesn't have to waste time worrying about hand-tuning optimizations like this.
If you're doing a lot of length checks, the standard semantics of Python demand that the name 'len' be looked up every time it's called. That's expensive - first you check the module globals, then you check the builtins. With some sort of local reference, the semantics change: now the name 'len' is looked up once, and the result is cached. That means that creating globals()["len"] in the middle of the function will no longer affect its behaviour. An optimizing compiler that did this would be a nightmare. Explicitly choosing which names to retain means that the programmer is in control.

The biggest problem with the default argument trick is that it makes it look as if those arguments are part of the function's API, where they're really just an optimization. Consider:

    def shuffle(things, *, randrange=random.randrange): ...
    def merge_shortest(things, *, len=len): ...

Is it reasonable to pass a different randrange function to shuffle()? Absolutely! You might have a dedicated random.Random instance (maybe a seeded PRNG for reproducible results). Is it reasonable to pass a different len function to merge_shortest()? Probably not - it looks like it's probably an optimization. Yes, you could say "_len=len", but now your optimization infects the entire body of the function, instead of being a simple change in the function header.

With statics, you could write it like this:

    def merge_shortest(things):
        static len=len
        ...

Simple. Easy. Reliable. (And this usage would work with pretty much any of the defined semantics.) There's no more confusion.
So basically for me anything that involves the programmer saying "Please make this part faster" is a hack. :-) We all want everything to be as fast as possible all the time, and insofar as we're concerned about speed we should focus on making the entire interpreter smarter so everything is faster, rather than adding new ways for the programmer to do extra work to make just a few things faster.
It's never a bad thing to make the interpreter smarter and faster, if it can be done without semantic changes. (Mark Shannon has some current plans that, I believe, fit that description.) This is different, though - the behaviour WILL change, so it MUST be under programmer control.
Even something like a way of specifying constants (which has been proposed in another thread) would be better to my eye. That would let certain variables be marked as "safe" so that they could always be looked up fast because we'd be sure they're never going to change.
Question: When does this constant get looked up?

    def merge_shortest(things):
        constant len=len
        ...

Is it looked up as the function begins execution, or when the function is defined? How much are you going to assume that it won't change?
As to the original proposal, I'm not in favor of it. It's fairly uncommon for me to want to do this, and in the cases where I do, Python classes are simple enough that I can just make a class with a method (or a __call__ if I want to be really cool) that stores the data in a way that's more transparent and more clearly connected to the normal ways of storing state in Python. It just isn't worth adding yet another complexity to the language for this minor use case.
Yes, static state can always be implemented with a class or closure instead. You may notice that the optimization technique still exists, and for very good reason :) Plus, it's often just unnecessary overhead to lay out your code that way. It should be trivially easy to convert something from a global to a function-scoped static, but reworking it to a closure/class is an actual refactor with notable changes.

Python could have been defined with no classes. Instead, we could have all just used factory functions, with a bunch of local variables in the constructor for private state, and a bunch of things packaged up into a dict as the public API. Why do we have classes? Because they are better at expressing the things we need to express.

ChrisA
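The closure/factory-function refactor mentioned above can be sketched like this (names invented for illustration); the nonlocal variable plays the role of the static, visible only to the returned function:

```python
def make_get_next():
    count = 0  # private "static" state, captured by the closure

    def get_next():
        nonlocal count  # persists between calls, invisible outside
        count += 1
        return count

    return get_next

get_next = make_get_next()
print(get_next(), get_next())  # 1 2
```

Note that turning an existing module-level function into this shape is exactly the "actual refactor with notable changes" Chris describes: every caller now needs the factory to have been invoked first.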
On Fri, May 28, 2021 at 5:25 AM Shreyan Avigyan <pythonshreyan09@gmail.com> wrote:
Chris wrote:
This is thread-safe:
from threading import Lock
lock = Lock()
counter = 0

def get_next():
    with lock:
        global counter
        counter += 1
        my_counter = counter
This is a great workaround, and I can try to improve on it. But first of all, should we depend on the user to do this locking? I don't think so. So is it possible to implement this behind the scenes without affecting current performance?
No, you can't, because it's impossible to know when you're done mutating. However, if the mutation is inherently atomic - or if subsequent lookups don't require atomicity - then the lock becomes unnecessary, and your code will be thread-safe already.

(If you use async functions or recursion but no threads, then every yield point becomes explicit in the code, and you effectively have a lock that governs every block of code between those points. But that has many many other implications. Point is, statics should be compatible with ALL use-cases, and that shouldn't be difficult.)

ChrisA
Reply to Chris: Also, it's rare for code to suddenly land in a thread-unsafe state -- the odds are something like 1 in 10**something. I've repeatedly run thread-unsafe code and have never yet encountered an inconsistent state; the GIL protects the code to a very good extent. And is it even hypothetically possible for thread-unsafe state to affect functions? The locals of different functions are distinct: since the static variable would be loaded by LOAD_FAST, each function would look into its own locals for it. The only dangerous code is an augmented assignment (op=), because it depends on the current value of the static variable and can leave the function in an undefined state.
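For what it's worth, the danger of op= on shared state can be made concrete: `counter += 1` on a global is a read-modify-write that the GIL does not make atomic across threads. The sketch below (a variation on Chris's earlier example, with invented names) guards it with a lock so the final count is deterministic:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:       # guard the read-modify-write sequence
            counter += 1

threads = [threading.Thread(target=increment, args=(50_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 200000 -- guaranteed only because of the lock
```

Without the lock, the same program can (on some interpreter versions and machines) lose increments, because two threads may read the same old value before either writes back.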
On 2021-05-27 12:33, Chris Angelico wrote:
With statics, you could write it like this:
def merge_shortest(things): static len=len ...
Simple. Easy. Reliable. (And this usage would work with pretty much any of the defined semantics.) There's no more confusion.
You can already do that:

    def merge_shortest(things):
        len=len
        ...

Yes, it does require a single global lookup on each function call, but if that's really a bottleneck for you I don't think there's much hope. :-)
Even something like a way of specifying constants (which has been proposed in another thread) would be better to my eye. That would let certain variables be marked as "safe" so that they could always be looked up fast because we'd be sure they're never going to change.
Question: When does this constant get looked up?
def merge_shortest(things): constant len=len ...
Is it looked up as the function begins execution, or when the function is defined? How much are you going to assume that it won't change?
Sorry, I was a bit vague there. What I was envisioning is that you would specify len as a constant at the GLOBAL level, meaning that all functions in the module could always assume it referred to the same thing. (It's true this might require something different from what was proposed in the other thread about constants.)

-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Fri, May 28, 2021 at 6:04 AM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-05-27 12:33, Chris Angelico wrote:
With statics, you could write it like this:
def merge_shortest(things): static len=len ...
Simple. Easy. Reliable. (And this usage would work with pretty much any of the defined semantics.) There's no more confusion.
You can already do that:
def merge_shortest(things): len=len ...
Yes, it does require a single global lookup on each function call, but if that's really a bottleneck for you I don't think there's much hope. :-)
Hmmmmmmmm.... let's see.
    >>> def merge_shortest(things):
    ...     len=len
    ...     ...
    ...
    >>> merge_shortest([])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in merge_shortest
    UnboundLocalError: local variable 'len' referenced before assignment
There are languages in which you're allowed to do this (using a name in an initializer to fetch from a parent scope), but Python isn't one of them. At best, you could write "_len=len", but then you have to rewrite the function body to use _len, leaving the question of "why is this _len and not len?" for every future maintainer. Since a static declaration is evaluated at function definition time (just like a default argument is), this problem doesn't come up, because the local name "len" won't exist at that point.
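A quick sketch of why the default-argument form sidesteps this: the default expression is evaluated once, at function definition time, before the local name exists, so it captures the builtin then and keeps it (variable names here are invented for illustration):

```python
import builtins

def spam(x, len=len):  # default bound at definition time
    return len(x)

# Rebinding the builtin afterwards has no effect on spam:
orig = builtins.len
builtins.len = lambda x: -1
try:
    result = spam([1, 2, 3])
finally:
    builtins.len = orig

print(result)  # 3 -- spam kept its snapshot of the real len
```

A `static len=len` declaration evaluated at definition time would behave the same way, without exposing len in the function's signature.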
Even something like a way of specifying constants (which has been proposed in another thread) would be better to my eye. That would let certain variables be marked as "safe" so that they could always be looked up fast because we'd be sure they're never going to change.
Question: When does this constant get looked up?
def merge_shortest(things): constant len=len ...
Is it looked up as the function begins execution, or when the function is defined? How much are you going to assume that it won't change?
Sorry, I was a bit vague there. What I was envisioning is that you would specify len as a constant at the GLOBAL level, meaning that all functions in the module could always assume it referred to the same thing. (It's true this might require something different from what was proposed in the other thread about constants.)
Gotcha, gotcha. I think module-level constants could *also* be useful, but they're orthogonal to this proposal. Unless it's a compile-time constant (so, as the module gets imported, all references to "len" become LOAD_CONST of whatever object was in the builtins at that point), I doubt it would have the same performance benefits, and it obviously couldn't handle the mutable statics use-case. I think there are very good use-cases for module-level constants, but the trouble is, there are so many variants of the idea and so many not-quite-overlapping purposes that they can be put to :) ChrisA
My concern about thread safety is about how easy it would be to make it thread unsafe accidentally.

Sure, global is not thread safe, but it is well known that use of global is, to newbies, "bad", and to more experienced programmers, "to be used with caution, understanding the risks". But particularly if static provides a performance boost, people will be very tempted to use it without considering the implications.

If people want a high-performance local constant -- that sounds something like the constant proposal the OP brought up earlier.

-CHB

On Thu, May 27, 2021 at 12:18 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-05-27 05:18, Steven D'Aprano wrote:
On Thu, May 27, 2021 at 07:56:16AM -0000, Shreyan Avigyan wrote:
Lot of programming languages have something known as static variable storage in *functions* not *classes*. Static variable storage means a variable limited to a function yet the data it points to persists until the end of the program.
+1 on this idea.
One common use for function defaults is to optimize function lookups to local variables instead of global or builtins:
def func(arg, len=len): # now len is a fast local lookup instead of a slow name lookup
Benchmarking shows that this actually does make a significant difference to performance, but it's a technique under-used because of the horribleness of a len=len parameter.
(Raymond Hettinger is, I think, a proponent of this optimization trick. At least I learned it from his code.)
I don't see this as a great motivation for this feature. If the goal is to make things faster I think that would be better served by making the interpreter smarter or adding other global-level optimizations. As it is, you're just trading one "manual" optimization (len=len) for another (static len).
Yes, the new one is perhaps slightly less ugly, but it still puts the onus on the user to manually "declare" variables as local, not because they are semantically local in any way, but just because we want a faster lookup. I see that as still a hack. A non-hack would be some kind of JIT or optimizing interpreter that actually reasons about how the variables are used, so that the programmer doesn't have to waste time worrying about hand-tuning optimizations like this. So basically for me anything that involves the programmer saying "Please make this part faster" is a hack. :-) We all want everything to be as fast as possible all the time, and insofar as we're concerned about speed we should focus on making the entire interpreter smarter so everything is faster, rather than adding new ways for the programmer to do extra work to make just a few things faster.
Even something like a way of specifying constants (which has been proposed in another thread) would be better to my eye. That would let certain variables be marked as "safe" so that they could always be looked up fast because we'd be sure they're never going to change.
As to the original proposal, I'm not in favor of it. It's fairly uncommon for me to want to do this, and in the cases where I do, Python classes are simple enough that I can just make a class with a method (or a __call__ if I want to be really cool) that stores the data in a way that's more transparent and more clearly connected to the normal ways of storing state in Python. It just isn't worth adding yet another complexity to the language for this minor use case.
-- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Thu, May 27, 2021 at 01:02:15PM -0700, Brendan Barnwell wrote:
You can already do that:
def merge_shortest(things): len=len ...
Yes, it does require a single global lookup on each function call, but if that's really a bottleneck for you I don't think there's much hope. :-)
It's not so much the single global lookup on each function call as the fact that you can't do that at all :-(

    UnboundLocalError: local variable 'len' referenced before assignment

[...]
Sorry, I was a bit vague there. What I was envisioning is that you would specify len as a constant at the GLOBAL level, meaning that all functions in the module could always assume it referred to the same thing. (It's true this might require something different from what was proposed in the other thread about constants.)
Getting actual constants that the interpreter can trust will not change is likely to be a much bigger language change than taking advantage of mechanisms that already exist in functions, with perhaps a little extra work, to get per-function local static storage.

But even if we did have actual constants, how does that help get static *variables* -- you know, things that aren't constant but can vary?

-- Steve
On Fri, May 28, 2021 at 7:04 AM Christopher Barker <pythonchb@gmail.com> wrote:
My concern about thread safety is about how easy it would be to make it thread unsafe accidentally.
Sure, global is not thread safe, but it is well known that use of global is, to newbies, “bad”, and to more experienced programmers, “to be used with caution, understanding the risks”.
But particularly if static provides a performance boost, people will be very tempted to use it without considering the implications.
If people want a high performance local constant— that sounds something like the constant proposal the OP brought up earlier.
Variable statics are no less thread safe than globals are (nor any more thread safe). They behave virtually identically. Constant statics are completely thread safe. If you're doing it just for the performance improvement, there's no way that threading can possibly affect it. ChrisA
On 2021-05-27 21:02, Brendan Barnwell wrote:
On 2021-05-27 12:33, Chris Angelico wrote:
With statics, you could write it like this:
def merge_shortest(things): static len=len ...
Simple. Easy. Reliable. (And this usage would work with pretty much any of the defined semantics.) There's no more confusion.
You can already do that:
def merge_shortest(things): len=len ...
No, that raises an UnboundLocalError exception.
On Thu, May 27, 2021 at 02:02:11PM -0700, Christopher Barker wrote:
My concern about thread safety is about how easy it would be to make it thread unsafe accidentally.
I'm intrigued what gives you the impression that Python functions and classes are, by default, thread safe. The FAQ is a little bit misleading:

https://docs.python.org/3/faq/library.html#what-kinds-of-global-value-mutati...

While it is true that builtin operations like list.append are thread safe, as soon as you have two of them, the compound operation is no longer thread safe unless guarded with a lock:

    L.append(x)  # thread-safe
    L.append(y)  # L may have already been modified by another thread
    assert L[-2:] == [x, y]  # may fail

And if L is a subclass of list or duck-typed, all bets are off. Even L.append may not be safe, if L is something other than an actual list.

Of course, there is a simple solution to that (apart from locks): don't use threads, or don't use shared mutable data. It is only the combination of concurrency with shared mutable state that is problematic. Remove either of those, and you're cool.

Function local static variables would be no worse in this regard than existing features: globals, mutable default values, classes with attributes, etc. I'm not sure about closures and thread-safety, but closures come with their own pitfalls:

https://docs.python-guide.org/writing/gotchas/#late-binding-closures
Sure, global is not thread safe, but it is well known that use of global is, to newbies, “bad”, and to more experienced programmers, “to be used with caution, understanding the risks”.
Is it well-known that writing classes is "bad", "to be used with caution, understanding the risks"?

https://stackoverflow.com/questions/8309902/are-python-instance-variables-th...

One of the more disheartening things about the culture of this mailing list is the way new proposals are held to significantly higher standards that existing language and stdlib features do not meet. This is a perfect example: it's not like regular Python functions and classes are thread-safe by default and this is introducing a new problem that is almost unique to static variables. Regular Python functions and classes are almost never thread-safe unless carefully written to be, which most people don't bother to do unless they specifically care about thread safety, which most people don't.

Any time you have a function with state, it requires care to make it thread-safe. It doesn't matter whether that storage is a global, mutable defaults, instance or class attributes, or these hypothetical static variables. Pure functions with no state or side-effects are thread-safe, but beyond that, every non-builtin, and some builtins, should be assumed to be unsafe unless carefully designed for concurrent use.

It's not always obvious either:

    print(x)

Not thread-safe. Two threads can write to stdout simultaneously, interleaving their output.

-- Steve
On 2021-05-27 13:15, Chris Angelico wrote:
Hmmmmmmmm.... let's see.
    >>> def merge_shortest(things):
    ...     len=len
    ...     ...
    ...
    >>> merge_shortest([])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in merge_shortest
    UnboundLocalError: local variable 'len' referenced before assignment
Okay, yeah, mea culpa. As several people pointed out, that doesn't work. But `len_ = len` does work. However, that doesn't change the calculus at all for me. My point wasn't about using the exact same variable name; it's that ANY ability to create a local variable that is a fast-lookup shortcut for a global one is enough. Manually creating fast-lookup local-variable shortcuts is inherently a performance hack, and there's no real use in making it slightly nicer-looking.

-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
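The size of the lookup effect behind this whole debate is easy to measure. A rough sketch with timeit (function names and data are invented; exact numbers vary by interpreter and machine, and the gap is typically modest):

```python
import timeit

def with_global(items):
    total = 0
    for x in items:
        total += len(x)  # global/builtin lookup on every iteration
    return total

def with_local(items, len=len):  # the default-argument hack
    total = 0
    for x in items:
        total += len(x)  # fast local (LOAD_FAST) lookup
    return total

data = [[0]] * 10_000
t_global = timeit.timeit(lambda: with_global(data), number=200)
t_local = timeit.timeit(lambda: with_local(data), number=200)
print(f"global: {t_global:.3f}s  local: {t_local:.3f}s")
```

Both functions compute the same result; only the name-resolution path differs, which is why Brendan regards the whole exercise as a hand-tuned optimization rather than a semantic feature.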
On 2021-05-27 14:33, Steven D'Aprano wrote:
But even if we did have actual constants, how does that help get static *variables*, you know, things that aren't constant but can vary?
All of those use cases can already be handled with a class that stores its data in an attribute.

-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Fri, May 28, 2021 at 12:13 PM Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2021-05-27 13:15, Chris Angelico wrote:
Hmmmmmmmm.... let's see.
    >>> def merge_shortest(things):
    ...     len=len
    ...     ...
    ...
    >>> merge_shortest([])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in merge_shortest
    UnboundLocalError: local variable 'len' referenced before assignment
Okay, yeah, mea culpa. As several people pointed out that doesn't work. But `len_ = len` does work. However, that doesn't change the calculus at all for me. My point wasn't about using the exact same variable name. It's that ANY ability to create a local variable that is a fast-lookup shortcut for a global one is enough. My point is that manually creating fast-lookup local-variable shortcuts is inherently a performance hack and there's no real use in making it slightly nicer-looking.
The change of variable name is significant, though. It means that this is no longer a minor change to the function's header; in order to use this optimization, you have to replace every use of the name "len" with "len_" (or "_len"). That isn't necessarily a deal-breaker; I've seen code that optimizes method lookups away by retaining the callable (eg "ap = some_list.append"), so there's uses for that kind of rename; but refactoring becomes harder if you have to be aware of whether you're using the optimized version or not.

Why should a performance-improving hoist look ugly? It's a perfectly normal thing to do - calculate something once and reuse the value, because you know that it won't change (or don't care if it changes). It's not a "hack" - it's a legit method of saving effort.

(Some day I'll learn how to do this in real life. Why can't I buy just one egg, and then reuse the same egg for every meal?)

ChrisA
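The "ap = some_list.append" pattern Chris mentions looks like this in practice (the squares function is invented for illustration); the bound method is looked up once, outside the loop, instead of resolving `result.append` on every iteration:

```python
def squares(items):
    result = []
    append = result.append  # hoist the attribute lookup out of the loop
    for x in items:
        append(x * x)
    return result

print(squares([1, 2, 3]))  # [1, 4, 9]
```

This is the same kind of hoist as len=len, but applied to an attribute lookup rather than a global one, and it comes with the same readability trade-off.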
On 2021-05-28 at 12:22:22 +1000, Chris Angelico <rosuav@gmail.com> wrote:
[...] calculate something once and reuse the value, because you know that it won't change (or don't care if it changes) [...]
(Some day I'll learn how to do this in real life. Why can't I buy just one egg, and then reuse the same egg for every meal?)
Those are mutable eggs. Try immutable eggs instead. Or obtain a hen, aka an egg factory (and a rooster, too, but that's off topic, even for Python Ideas).

ObPython:

    >>> egg = Egg()
    >>> egg.scramble()
    >>> egg.fry()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    EggStateError: cannot fry a scrambled egg

Mutability is the root of all evil.
My concern about thread safety is about how easy it would be to make it thread unsafe accidentally.
I'm intrigued what gives you the impression that Python functions and classes are, by default, thread safe.
Well, there is thread safe, and there is thread dangerous. I have an enormous amount of code that is not strictly thread safe, but works fine when run under a multi-threaded web server, because there are no cases where the same instances of objects are running in different threads. Well, not quite true: there are many shared function objects (and class objects). I very much assume that those function objects are not changing at run time. But having static variables would totally break that assumption. That would be like mutating class attributes, which, of course, is perfectly possible, but a little harder to do without thinking about it.

or don't use shared mutable data.

Exactly -- and functions are always shared if you are using threads at all. But not mutable if used in the usual way. (Now that I think about it, the suggestions on this thread about putting things in the function namespace make them mutable -- but I at least have never done that.)

Function local static variables would be no worse in this regard than existing features: globals, mutable default values, classes with attributes, etc.

Mutable default values are a notable "gotcha".
Yes, I think this is most like class attributes, which are probably less well known as a "gotcha", but also not that widely used by accident. I don't think I've ever seen a student use them, and I sure have seen mutable default values mistakenly used in students' code.

Another point: whether there is a static variable in a function becomes a non-obvious part of its API. That's probably the biggest issue -- some library author uses static, its users may not know that, and then they use the code in a multi-threaded application, or frankly in a single-threaded application that isn't expecting functions to be mutated.

Anyway, agreed -- it is very easy to write non-thread-safe code in Python now, so maybe this wouldn't make it significantly more likely to happen accidentally.

One of the more disheartening things about the culture of this mailing list is the way new proposals are held to significantly higher standards that existing language and stdlib features do not meet.
A lot of people (I thought you included) hold the view that new features SHOULD be held to a higher standard.
This is a perfect example: it's not like regular Python functions and classes are thread-safe by default and this is introducing a new problem that is almost unique to static variables.
No -- but as above, functions themselves are immutable by default in normal usage; that would change. And since the entire point of classes is to hold state and the functions that work with that state in one place, it's expected behaviour that that state changes.

Which makes me realize why I never wanted a function static variable: if I wanted changeable state associated with a function, I used a class. And most commonly there was more than one function associated with that state, so a class was the right solution anyway. I know that Jack Diederich says "if a class has only two functions, and one of them is __init__ -- it's not a class", and I totally agree with him, but if you do have a case where you have some state and only one function associated with it, maybe a class IS the right solution.

-CHB
On Thu, May 27, 2021 at 7:24 PM Chris Angelico <rosuav@gmail.com> wrote:
But `len_ = len` does work. However, that doesn't change the
calculus at all for me. My point wasn't about using the exact same variable name. It's that ANY ability to create a local variable that is a fast-lookup shortcut for a global one is enough.
The change of variable name is significant, though. It means that this is no longer a minor change to the function's header; in order to use this optimization, you have to replace every use of the name "len" with "len_" (or "_len").
Sure. But this is only worth doing if you are using that name in a tight loop somewhere -- so you only need the special name in one or two places, and I've always put that hack right next to that loop anyway. If you are calling, e.g., len() in hundreds of separate places in one function -- you've got a much larger problem.

-CHB

--
Christopher Barker, PhD (Chris)
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
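As a runnable sketch of the hack being discussed (function and parameter names here are illustrative, not from the thread), binding the builtin to a default-argument local avoids the repeated global/builtin lookup inside the loop:

```python
def count_plain(items):
    total = 0
    for item in items:
        total += len(item)        # global/builtin name lookup on every iteration
    return total

def count_fast(items, _len=len):  # the hack: _len is a fast local lookup
    total = 0
    for item in items:
        total += _len(item)
    return total

data = [list(range(n)) for n in range(50)]
assert count_plain(data) == count_fast(data)
```

Both functions compute the same result; only the lookup mechanism in the loop differs.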
On 27 May 2021, at 17:24, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 27, 2021 at 04:37:10PM +0200, Ronald Oussoren wrote:
One common use for function defaults is to optimize function lookups to local variables instead of global or builtins:
def func(arg, len=len):
    # now len is a fast local lookup instead of a slow name lookup
That’s a CPython performance hack,
No it isn't. I think we can assume that any non-crappy implementation will have faster access to locals than globals and builtins, or at least no worse. So it is a fair expectation that any time you can turn a global lookup into a local lookup, you should get some performance benefit.
It is a performance hack regardless. There are other solutions possible, including bytecode rewriting (look for “Binding Constants at Compile Time” in the Python Cookbook). That particular implementation likely doesn’t work anymore due to changes in the bytecode representation, but the general idea is still valid (and I’ve used it in the past because it leads to somewhat cleaner code). This could even be an opt-in language feature (“from __future__ import no_builtins_override”, or a better thought out alternative).

That said, we’re sailing off-topic. There should be much more enticing reasons for having this “static” feature than “rebinding globals as locals” before anyone will consider adding it to the language.

Ronald — Twitter / micro.blog: @ronaldoussoren Blog: https://blog.ronaldoussoren.net/
On 27 May 2021, at 18:15, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 27, 2021 at 04:53:17PM +0200, Ronald Oussoren via Python-ideas wrote:
Statics are still hidden global state
How are they *global* state when they are specific to an individual function?
We can already get the basic behaviour of statics right now, only with an ugly hack that pollutes the function parameter list and is inconvenient to use.
static = [0]

def spam(arg, static=[0]):
    static[0] += arg
    return static[0]

def eggs(arg, static=[0]):
    static[0] -= arg
    return static[0]
Is it your argument that all three `static` variables are using shared global state? If not, then I have no idea what you mean by insisting that statics are "hidden global state". They are hidden state, but not global. Just like closures and instance attributes.
I honestly don’t see the difference between:

def spam(arg, static=[0]):
    …

and

_static = 0

def spam(arg):
    global _static
    …

The difference is where the state is stored. State in the latter example is less tightly coupled to the function and is easier to access externally, but for both cases there is a documented way to access them (spam.__defaults__[0] for the first example, since __defaults__ is a positional tuple). In both cases there’s effectively a singleton object that stores data, which makes testing harder because it is harder to ensure the right preconditions for testing (e.g. some other code might have called the function and affected the function state in an unexpected way).
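To make the comparison concrete, here is a runnable sketch of both forms (the names `_static` and `_count` are illustrative):

```python
# Default-argument "static": the state rides on the function object itself.
def spam(arg, _static=[0]):
    _static[0] += arg
    return _static[0]

# Module-global equivalent: the state lives in the enclosing module.
_count = 0
def eggs(arg):
    global _count
    _count += arg
    return _count

assert spam(1) == 1 and spam(2) == 3
assert eggs(1) == 1 and eggs(2) == 3

# Both are reachable from outside; the default-arg state is introspectable
# via the function object (note __defaults__ is a positional tuple):
assert spam.__defaults__ == ([3],)
```

In both versions the state is effectively a singleton, which is the testability concern raised above.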
and those can be problematic regardless of being function local or module global. Having global state like this affects testability and can affect threading as well.
It sounds like your problem is with *mutable state* in general. Global variables, instance attributes, class attributes, they all have exactly the same issues.
So don't use mutable state. Nobody is forcing you to use this feature if you prefer to write in a functional style with no mutable state.
I don’t think anyone has accused me of advocating functional programming before ;-) I don’t have a problem with mutable state in general, just with singletons. That includes globals and class attributes, but not necessarily instance variables or other data structures.
The factory function doesn’t need to be part of the public API of a module, I’ve used a pattern like this to create APIs with some hidden state:
```
def make_api():
    state = ...

    def api1(…):
        …

    def api2(…):
        …

    return api1, api2

api1, api2 = make_api()
```
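A runnable version of that pattern might look like this (the state dict and the `increment`/`current` names are made up for illustration):

```python
def make_api():
    state = {"count": 0}     # hidden from module scope, shared by both closures

    def increment(n=1):
        state["count"] += n
        return state["count"]

    def current():
        return state["count"]

    return increment, current

increment, current = make_api()
increment()
increment(5)
assert current() == 6
```

Each call to make_api() produces an independent pair of functions with its own hidden state.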
Congratulations, you've just used static local variables. You just used closures for the implementation.
I’m not saying that this is a particularly good way to structure code, in general just using a private module global is better (assuming the design calls for some kind of global state).
You: "Global state is bad! Don't use global state!"
Also you: "Don't use local state (closures)! Use global state, it's better!"
No, I wrote that *if* you want to use global state there are already multiple ways to do this. I’m just as flawed as anyone and do use global state when that’s convenient. But this has a tendency of causing problems later on, especially in larger projects.
*wink*
Slightly more seriously, I also wrote that the OP hasn’t provided a better reason for adding this feature than “other languages have this”. Ronald — Twitter / micro.blog: @ronaldoussoren Blog: https://blog.ronaldoussoren.net/
On Fri, May 28, 2021 at 7:28 PM Ronald Oussoren via Python-ideas <python-ideas@python.org> wrote:
I honestly don’t see the difference between:
def spam(arg, static=[0]): …
and
_static = 0

def spam(arg):
    global _static
    …
The difference is where the state is stored. State in the latter example is less tightly coupled to the function and is easier to access externally, but for both cases there is a documented way to access them (spam.__defaults__[0] for the first example, since __defaults__ is a positional tuple). In both cases there’s effectively a singleton object that stores data, which makes testing harder because it is harder to ensure the right preconditions for testing (e.g. some other code might have called the function and affected the function state in an unexpected way).
They're not very different if the function is itself global; you'd have to make sure you don't have a name collision with any other function that also uses global state, but other than that, they're basically going to behave the same way. But if spam() is defined in any other context, it's no longer equivalent. The default argument is tied to the function object, not to its surrounding context. You could create five spam() functions in a loop, and each one has its own static value. You can't do that with global or nonlocal. ChrisA
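Chris's point can be demonstrated directly: defaults are evaluated at each `def`, so functions created in a loop each carry their own state, which a single global cannot replicate (the `_static` parameter name is illustrative):

```python
# Five function objects, five independent "static" values.
funcs = []
for _ in range(5):
    def f(_static=[0]):      # a fresh [0] is evaluated at each def
        _static[0] += 1
        return _static[0]
    funcs.append(f)

assert funcs[0]() == 1
assert funcs[0]() == 2       # funcs[0] keeps its own counter
assert funcs[1]() == 1       # funcs[1] is untouched
```

With a module global, all five functions would share one counter instead.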
On Thu, May 27, 2021 at 08:06:22PM -0700, Christopher Barker wrote:
Well, there is thread safe, and there is thread dangerous. I have an enormous amount of code that is not strictly thread safe, but works fine when run under a multi-threaded web server because there are no cases where the same instances of objects are running in different threads.
Cool. The easiest way to write safe concurrent code is to avoid shared mutable data.
Well, not true, there are many shared function objects (and class objects). I very much assume that those function objects are not changing at run time. But having static variables would totally break that assumption.
That assumption is already broken, and has been forever. Functions already have potential shared state. Static variables would not add any risk that isn't already there. Functions can already use globals, or mutable defaults, or function attributes. If any of them are written in C, they can already use static variables.
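One of the existing mechanisms mentioned above, function attributes, already gives per-function persistent state today (the `tally` name is illustrative):

```python
def tally(n):
    tally.total += n          # a function attribute used as persistent state
    return tally.total

tally.total = 0               # initialise the attribute on the function object

assert tally(3) == 3
assert tally(4) == 7          # state survived between calls
```

This carries exactly the same shared-state risks that the proposed `static` keyword would.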
That would be like mutating class attributes, which, of course, is perfectly possible, but a little harder to do without thinking about it.
self.items.append(x)

Is that a class attribute or instance attribute being modified? [...]
Exactly— and functions are always shared if you are using threads at all.
But not mutable if used in the usual way.
They are mutable, always. What you mean to say is that they aren't mutated, because nobody actually mutates them. They could, but they don't. And that won't change. [...]
Mutable default values are a notable “gotcha”.
They're a gotcha for people who expect that function defaults are evaluated on each function call. Just like assignment is a notable gotcha for people who expect that assignment copies:

a = [1, 2, 3]
b = a
b.append(5)  # why does `a` magically change when I modified the copy???

For those who understand this, it is a feature, not a bug, that defaults are eagerly evaluated once and binding shares references and doesn't copy.
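The gotcha in runnable form, alongside the conventional `None`-sentinel fix (function names are illustrative):

```python
def append_to(item, target=[]):      # default list created once, at def time
    target.append(item)
    return target

assert append_to(1) == [1]
assert append_to(2) == [1, 2]        # surprises those expecting a fresh list

def append_to_fixed(item, target=None):
    if target is None:
        target = []                  # a fresh list per call
    target.append(item)
    return target

assert append_to_fixed(1) == [1]
assert append_to_fixed(2) == [2]
```

The first version is precisely the "default value as static storage" behaviour the thread is debating.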
classes with attributes, etc.
Yes, I think this is most like class attributes, which are probably less well known as a “gotcha”.
There are plenty of people who get confused why class attributes are shared between all instances. Especially for those whose only exposure to OOP is with Java:

class MyClass:
    # this is a shared class attribute
    # not a per instance attribute
    items = []

https://stackoverflow.com/questions/1680528/how-to-avoid-having-class-data-s...
https://stackoverflow.com/questions/11040438/class-variables-is-shared-acros...
https://stackoverflow.com/questions/45284838/are-the-attributes-in-a-python-...
https://www.geeksforgeeks.org/python-avoiding-class-data-shared-among-the-in...

among others. This is a genuine gotcha that trips up beginners who haven't yet mastered Python's object model.
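The class-attribute gotcha from the links above, in runnable form (class names are illustrative):

```python
class Shared:
    items = []                   # one list, shared by every instance

class PerInstance:
    def __init__(self):
        self.items = []          # a fresh list per instance

a, b = Shared(), Shared()
a.items.append(1)
assert b.items == [1]            # mutation through a is visible via b

c, d = PerInstance(), PerInstance()
c.items.append(1)
assert d.items == []             # instances are independent
```

The shared class attribute behaves much like the mutable-default "static" storage discussed earlier.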
But also not that widely used by accident. I don’t think I’ve even seen a student use them. And I sure have seen mutable default values mistakenly used in students' code.
Oh come on Christopher, you can't have it both ways -- mutable default values are either a gotcha that trip beginners up, or beginners never use it. It can't be both.
Another point -- whether there is a static variable in the function becomes a not obvious part of its API. That's probably the biggest issue -- some library author used static, and all its users may not know that, and then use the code in a multi-threaded application, or frankly in a single threaded application that isn't expecting functions to be mutated.
"whether there is a **global variable** in the function becomes a not obvious part of its API. That's probably the biggest issue -- some library author used global, and all its users may not know that..." "whether there is a **mutable default** used for static storage in the function becomes a not obvious part of its API. That's probably the biggest issue -- some library author used the default value trick, and all its users may not know that..." "whether there is a **shared class (or instance) attribute** used in the function becomes a not obvious part of its API. That's probably the biggest issue -- some library author used shared class attributes, and all its users may not know that..." You've been a Python user long enough that you know that there are many ways for functions to store data that will persists from one call to another. We have globals, we have classes, we have generators, we have closures, we can write data to files and read it back, there are probably others I haven't thought of. None of them are *nice to use*, they are slow and awkward, or have scope problems (globals), hard to debug, expose things that shouldn't be exposed, etc. But they already do everything you are afraid static variables will do. Static variables won't be a problem is practice because all those other functionally equivalent ways are not problems in practice. If people need persistent data, they're already using it. This proposal just makes it less painful. And if the function *doesn't* need persistent data, then adding static variables isn't going to make people suddenly use it when there is no need for it. "Oh, I was going to write a pure function that calculates the number of seconds between two dates, but now that we have static I'm going to make it impure just to screw up threaded code, mwahahahaha!!!" This is a usability, and maybe performance, improvement over the status quo, not a fundamentally new idea that nobody has ever had before. [...]
And since the entire point of classes is to hold state and the functions that work with that state in one place, it's expected behaviour that that state changes.
And yet, in this very post, you said that you use shared classes in threaded code: "there are many shared function objects (and class objects)" It is perfectly safe to use mutable objects in concurrent code if you don't actually mutate them. You have been doing that for many years, and nothing will change for you if we get statics.
Which makes me realize why I never wanted a function static variable -- if I wanted changable state associated with a function, I used a class.
Okay. Let's compare a trivial counter function with a class.

# Make a callable that counts the number of times it is called.
class Counter:
    def __init__(self):
        self.count = 0
    def __call__(self):
        self.count += 1
        return self.count

counter = Counter()

Seven lines, two objects with two methods. Versus hypothetical:

def counter():
    static count = 0
    count += 1
    return count

Four lines, only a single object. And with the new walrus operator:

def counter():
    static count = 0
    return (count := count + 1)

if saving one line is important *wink*

There is nothing that either closures or generators can do that can't be done with a class. But we have both of those because for many cases they make it easier, more convenient and faster (faster to write, faster to read, faster to run) than a class.
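For comparison, the same counter can be written today as a closure with `nonlocal`, which is roughly the mechanism the proposed `static` would sugar over (the `make_counter` name is illustrative):

```python
def make_counter():
    count = 0
    def counter():
        nonlocal count       # rebind the enclosing variable
        count += 1
        return count
    return counter

counter = make_counter()
assert counter() == 1
assert counter() == 2        # state persists between calls
```

The cost relative to the hypothetical `static` version is one extra function and an explicit factory call.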
And most commonly there was more than one function associated with that state, so a class was the rigth solution anyway.
Great. And that won't go away.
I know that Jack Diederich says: "if a class has only two functions, and one of them is __init__ -- it's not a class", and I totally agree with him, but if you do have a case where you have some state and only one function associated with it, maybe a class IS the right solution.
Only because we don't have static variables, and the other alternatives (globals etc) are maybe worse than a class. But Jack is still right: for something that needs only a single method (plus init) it probably shouldn't be a class. -- Steve
On Thu, May 27, 2021 at 07:13:56PM -0700, Brendan Barnwell wrote:
On 2021-05-27 14:33, Steven D'Aprano wrote:
But even if we did have actual constants, how does that help get static *variables*, you know, things that aren't constant but can vary?
All of those use cases can already be handled with a class that stores its data in an attribute.
And with one sentence you have just explained why Python doesn't have closures or generators. There is nothing that they do that can't be handled by a class.

Oh wait, Python does have closures (since version 1.5) and generators (since 2.2). Maybe your argument "just use a class" isn't quite as convincing as you hoped.

https://pyvideo.org/pycon-us-2012/stop-writing-classes.html

(I actually love classes. I just don't think that every function should be a class.)

We're not using Java. We have first class (pun intended) functions, and they are, perhaps, even more important than classes. They're certainly easier to write and more efficient for many purposes. Instead of writing a minimal bundle of:

* a class
* with an init method
* and a call method
* and then create an instance

this proposal will allow us to encapsulate that in a single function, just as closures and generators do.

-- Steve
On Fri, May 28, 2021 at 04:20:15AM +1000, Chris Angelico wrote:
def f():
    static x = 0
    x += 1
    yield x

next(f())
next(f())
next(f())
will yield 1 every time?
I think that this example has just about convinced me that Chris' approach is correct. I wasn't thinking about generators or recursion.

I think that closure nonlocals are almost as fast as locals, so we might be able to use the closure mechanism to get this. Something vaguely like this:

def func():
    static var = initial
    body

is transformed into:

def factory():
    var = initial
    def func():
        nonlocal var
        body
    return func

func = factory()

except that the factory is never actually exposed to Python code. It would be nice if there was some way to introspect the value of `var` but if there is a way to do it I don't know it.

We might not even need new syntax if we could do that transformation using a decorator.

@static(var=initial)
def func():
    body

-- Steve
On Fri, 28 May 2021 at 13:11, Steven D'Aprano <steve@pearwood.info> wrote:
We might not even need new syntax if we could do that transformation using a decorator.
@static(var=initial)
def func():
    body
The problem here is injecting the "nonlocal var" statement and adjusting all of the references to the variable in body. I don't think that can be done short of bytecode manipulation. Paul
On Fri, May 28, 2021 at 10:11 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, May 28, 2021 at 04:20:15AM +1000, Chris Angelico wrote:
def f():
    static x = 0
    x += 1
    yield x

next(f())
next(f())
next(f())
will yield 1 every time?
I think that this example has just about convinced me that Chris' approach is correct. I wasn't thinking about generators or recursion.
I think that closure nonlocals are almost as fast as locals, so we might be able to use the closure mechanism to get this. Something vaguely like this:
def func():
    static var = initial
    body
is transformed into:
def factory():
    var = initial
    def func():
        nonlocal var
        body
    return func

func = factory()
except that the factory is never actually exposed to Python code.
I think that would probably work, but even better would be if the outer function didn't actually exist. A bit of playing around suggests that LOAD_DEREF could just work here. If you have multiple levels of nonlocals, the function flattens them out into a tuple in f.__closure__, identifying them by index. Statics could be another level of nonlocals that doesn't actually require a function as such. In terms of describing the semantics, I think this is probably the cleanest way to give a pure-Python equivalent.
It would be nice if there was some way to introspect the value of `var` but if there is a way to do it I don't know it.
No idea about other implementations, but in CPython, you can look at f.__closure__[*].cell_contents, but you'd need to know the mapping from static name to lookup index. I think the corresponding names are in f.__code__.co_freevars, but I'm not sure if there are any other things that go in there.
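A small CPython sketch of the introspection Chris describes, using the closure-based transformation from earlier in the thread: the closed-over names live in `__code__.co_freevars` and the values in the matching `__closure__` cells.

```python
def factory():
    var = 0
    def func():
        nonlocal var
        var += 1
        return var
    return func

func = factory()
func()
func()

# Map the static/closure name to its cell by index, then read the cell.
idx = func.__code__.co_freevars.index("var")
assert func.__closure__[idx].cell_contents == 2
```

This is CPython-specific behaviour; other implementations may expose closures differently.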
We might not even need new syntax if we could do that transformation using a decorator.
@static(var=initial)
def func():
    body
Hmm, there'd need to be some transformations, since the code generator is always going to use lexical scope. You can't magically change a LOAD_GLOBAL into a LOAD_DEREF with a decorator - the only way would be to do some fairly hairy rewriting, and you might lose a lot of the efficiency. This really needs proper compiler support, and if it gets compiler support, it may as well have dedicated syntax. ChrisA
I was thinking about introducing new opcodes for implementing static variables. Not sure though. All of the ideas actually do the same thing. The difference is approach.
On 2021-05-28 04:53, Steven D'Aprano wrote:
On Thu, May 27, 2021 at 07:13:56PM -0700, Brendan Barnwell wrote:
On 2021-05-27 14:33, Steven D'Aprano wrote:
But even if we did have actual constants, how does that help get static *variables*, you know, things that aren't constant but can vary?
All of those use cases can already be handled with a class that stores its data in an attribute.
And with one sentence you have just explained why Python doesn't have closures or generators. There is nothing that they do that can't be handled by a class.
Oh wait, Python does have closures (since version 1.5) and generators (since 2.2). Maybe your argument "just use a class" isn't quite as convincing as you hoped.
I see your point, but I don't agree that static function variables are parallel to either closures or generators.

Closures are, to my mind, just an outgrowth of Python's ability to define functions inside other functions. If you're going to allow such nested functions, you have to have some well-defined behavior for variables that are defined in the outer function and used in the inner one, and closures are just a reasonable way to do that.

As for generators, they are tied to iteration, which is a pre-existing concept in Python. Generators provide a way to make your own functions/objects that work in a `for` loop in a manner that naturally extends the existing iteration behavior of lists, tuples, etc.

The proposal for static function variables is quite different. It does not hook into or extend any existing control flow structure as generators do, nor is it needed to give meaning to already-allowed syntax as closures are. It's just a new proposal to add a totally new behavior to functions. And the behavior that it adds --- storing state --- is exactly what attributes are already used for across all kinds of contexts in Python.

So it's not just that you can theoretically contort your code to use classes and attributes instead of using static function variables. It's that classes and attributes are already designed to do and already do exactly what static function variables are supposedly going to do, namely store state across multiple calls.

Also, although I haven't used dataclasses extensively, it seems that dataclasses would provide a pretty simple way of doing this, since they provide a concise way of specifying the necessary attributes so you don't need to write an `__init__`.
It's true there is a small additional cost in that you'd have to instantiate the class and then access your "static variables" as `self.blah` instead of as a local variable `blah`, but that minor increase in verbosity doesn't, to me, justify the creation of static function variables as a whole new language construct. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Fri, May 28, 2021 at 07:37:57PM -0700, Brendan Barnwell wrote:
I see your point, but I don't agree that static function variables are parallel to either closures or generators.
Okay, this is an important point, I think. I argue that some sort of sugar for static storage in functions is exactly analogous to closures and generators. The history of Python demonstrates a pattern of using classes for complex data structures with state and *multiple* behaviours (methods), while using functions for the simple cases where you have only a single behaviour and a small amount of state, without the anti-pattern of global variables. Closures, generators and coroutines are all ways of doing this.
Closures are, to my mind, just an outgrowth of Python's ability to define functions inside other functions. If you're going to allow such nested functions, you have to have some well-defined behavior for variables that are defined in the other function and used in the inner one, and closures are just a reasonable way to do that.
When closures and lexical scoping were first introduced, they were intentionally and explicitly limited because classes could do everything that closures can. Here is the history of the feature.

Python didn't gain closures until version 2.1 (with a future import):

https://www.python.org/dev/peps/pep-0227/

Before then, nested functions behaved quite differently. This is Python 1.5:

>>> x = "outside"
>>> def function():
...     x = "inside"
...     def inner():
...         print x
...     inner()
...
>>> function()
outside

That's a simple, reasonable behaviour. Inner functions were allowed by the language but they didn't add much functionality and consequently were hardly ever used. Lexical scoping changed that, and made inner functions much more useful. At the time people wanted a way to rebind nonlocal variables, but that was explicitly rejected because:

"this would encourage the use of local variables to hold state that is better stored in a class instance"

https://www.python.org/dev/peps/pep-0227/#id15

That comment has aged like milk. By the time Python 2.3 and 2.4 came along and people were talking about some future "Python 3000", it had already become clear that it would be useful to rebind nonlocal names and hold state encapsulated in a function without going to all the trouble of creating a class. And so in Python 3 we gained the nonlocal keyword and the ability to store and modify state in a closure.

The proposed static statement would be sugar for functionality already possible: storing per-function data in the function without needing to write a class. Classes are great, but not everything is a nail that needs to be hammered with a class.
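For contrast with the Python 1.5 session above, under modern lexical scoping the inner function resolves the name in the enclosing function (the example uses a `results` list instead of `print` so the behaviour can be checked):

```python
results = []

x = "outside"
def function():
    x = "inside"
    def inner():
        results.append(x)    # lexical scope finds the enclosing x
    inner()

function()
assert results == ["inside"]   # Python 1.5 would have seen "outside" here
```

The same nested-function syntax, with the opposite resolution of `x`, is what PEP 227 changed.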
As for generators, they are tied to iteration, which is a pre-existing concept in Python. Generators provide a way to make your own functions/objects that work in a `for` loop in a manner that naturally extends the existing iteration behavior of lists, tuples, etc.
We already had a way to create our own iterable values: classes using the sequence or iterator protocols. Iteration using `__getitem__` and IndexError was possible all the way back to Python 1.x. The iterator protocol with `__iter__` and `__next__` wasn't introduced until 2.2 (and originally the second dunder was spelled `next`).

https://www.python.org/dev/peps/pep-0234/

Generators were also introduced in 2.2, explicitly as a way for functions to *hold state from one call to the next*:

https://www.python.org/dev/peps/pep-0255/

The motivation section of the PEP starts with this:

"When a producer function has a hard enough job that it requires maintaining state between values produced, most programming languages offer no pleasant and efficient solution ..."

and goes on to discuss alternatives such as functions with global state. (Global variables.) One alternative left out is to write a class, possibly because everyone acknowledged that writing a single function with its own state is so obviously superior to a class for solving this sort of problem that nobody bothered to list it as an alternative. (You *can* spread butter on bread using a surf board, but why would you even try when you have a butterknife?)

The next step in the evolution of function-local state was to make generators two-way coroutines. (Alas, the name "coroutine" has been hijacked by async for a related but different concept, so there is some unavoidable terminology confusion here.)

https://www.python.org/dev/peps/pep-0342/

Generators now have send and throw methods, even when you can't use them for anything useful!

>>> g = (x+1 for x in range(10))
>>> g.send
<built-in method send of generator object at 0x7f38c8545b30>
>>> g.throw
<built-in method throw of generator object at 0x7f38c8545b30>

So that's yet another unobvious way to get static storage in a function: use a PEP 342 enhanced generator coroutine.
http://www.dabeaz.com/coroutines/index.html

The downside of this is that the body of the function has to use yield in an infinite loop, and the caller has to use a less familiar `func.send(arg)` syntax instead of using the familiar `func(arg)` syntax.

Again, there is nothing that coroutines or generators can do which cannot be done with classes. But people prefer generators, because having to write an entire class with multiple methods just to store a bit of state from one function call to the next sucks.

Here's an example of a coroutine that implements a simple counter:

>>> def counter():
...     n = 0
...     while True:
...         n += 1
...         yield n
...
>>> func = counter()
>>> func.send(None)
1
>>> func.send(None)
2

Too much boilerplate to make this a compelling alternative, although with a bit of jiggery-pokery we can make it nicer:

>>> from functools import partial
>>> func = partial(func.send, None)
>>> func()
3
>>> func()
4

So there we go: a pocket tour of the history of per-function state in Python. None of the alternatives are really smooth, and classes least of all.

That's where this proposal for a static keyword comes into it: syntactic sugar for what we can and already do, to make it smoother and easier for functions to keep state alive from one call to the next. Just like closures, generators and coroutines.

-- Steve
participants (11)

- 2QdxY4RzWzUUiLuE@potatochowder.com
- Brendan Barnwell
- Chris Angelico
- Christopher Barker
- Joao S. O. Bueno
- MRAB
- Paul Moore
- Ricky Teachey
- Ronald Oussoren
- Shreyan Avigyan
- Steven D'Aprano