Live variable analysis -> earlier release?

Look at the following code:

    def foo(a, b):
        x = a + b
        if not x:
            return None
        sleep(1)  # A calculation that does not use x
        return a*b

This code DECREFs x when the frame is exited (at the return statement). But (assuming) we can clearly see that x is not needed during the sleep (representing a big calculation), we could insert a "del x" statement before the sleep.

I think our compiler is smart enough to find out *some* cases where it could safely insert such del instructions. And this would potentially save some memory. (For example, in the above example, if a and b are large lists, x would be an even larger list, and its memory could be freed earlier.)

For sure, we could manually insert del statements in code where it matters to us. But if the compiler could do it in all code, regardless of whether it matters to us or not, it would probably catch some useful places where we wouldn't even have thought of this idea, and we might see a modest memory saving for most programs.

Can anyone tear this idea apart?

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
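To make the "find the last use" part concrete, here is a rough sketch (an illustration only, not the proposal itself; it ignores loops and other control flow, exact opcode names vary across CPython versions, and the helper name last_local_loads is invented here) of how the final load of each local can be read off the bytecode with the dis module. A hypothetical optimizer pass could treat a name as dead after that offset:

    import dis

    def last_local_loads(func):
        """Map each local name to the offset of its final LOAD_FAST (a crude liveness proxy)."""
        last = {}
        for instr in dis.get_instructions(func):
            if instr.opname == "LOAD_FAST":
                last[instr.argval] = instr.offset
        return last

    def foo(a, b):
        x = a + b
        if not x:
            return None
        # sleep(1)  # the big calculation that does not use x
        return a * b

    print(last_local_loads(foo))  # x's last load comes well before the final loads of a and b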

This would break uses of locals(), e.g.

    def foo(a, b):
        x = a + b
        if not x:
            return None
        del x
        print('{x}, {a}, {b}'.format(**locals()))
        return a * b

    foo(1, 2)

Plus if the calculation raises an exception and I'm looking at the report on Sentry, I'd like to see the values of all variables. In particular I might have expected the function to return early and I want to see what `x` was.

On Wed, Apr 8, 2020 at 10:05 AM Alex Hall <alex.mojaki@gmail.com> wrote:
This would break uses of locals(), e.g.
Hm, okay, so suppose the code analysis was good enough to recognize most un-obfuscated uses of locals(), exec() and eval() (and presumably sys._getframe() -- IIUC there's already a Python implementation that generates less optimal code for functions where it detects usage of sys._getframe(), maybe IronPython).
That's a very valid objection. For simpletons like myself who just use pdb it could also be problematic. So at the very least there would have to be a way to turn it off, and probably it should have to be requested explicitly (maybe just with -O).

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
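A minimal sketch of what the "good enough" analysis mentioned above might look like for the easy cases (my own illustration, covering only direct, un-obfuscated calls; the function name uses_frame_introspection is invented here): flag any function that calls locals(), eval(), exec(), vars() or sys._getframe(), and have the optimizer skip those functions entirely.

    import ast

    FRAME_ESCAPES = {"locals", "eval", "exec", "vars"}

    def uses_frame_introspection(source):
        """Return True if any call in the source looks like it can observe local variables."""
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Call):
                f = node.func
                if isinstance(f, ast.Name) and f.id in FRAME_ESCAPES:
                    return True
                if (isinstance(f, ast.Attribute) and f.attr == "_getframe"
                        and isinstance(f.value, ast.Name) and f.value.id == "sys"):
                    return True
        return False

    print(uses_frame_introspection("def f():\n    return locals()"))   # True
    print(uses_frame_introspection("def f(a, b):\n    return a * b"))  # False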

On Wed, Apr 8, 2020 at 10:21 AM Guido van Rossum <guido@python.org> wrote:
Even though it seems like a pedantically correct behavior to toss something after no future direct references to it are detected, it would just break so much existing code that the upgrade to an interpreter doing this would be a nightmare. Yes, people should in theory be using a context manager or explicit reference to things that need to live, but what _is_ an explicit reference if not the local scope? In practice code all over the place assumes "still in local scope" means the reference lives on. Including wrapped C/C++ code where the destructor frees underlying resources that are needed by other things outside of CPython's view.

Different behavior when debugging or adding a debug print vs when running normally is bad. Seeing surprising behavior changes due to the addition or removal of code later in a scope that inadvertently changes when something's destructor is called is action at a distance and hard to debug.

In my experience, people understand scope-based lifetime. It is effectively the same as modern C++ STL based pointer management semantics; destruction only happens after going out of scope or when explicitly called for.

Making it optional? It'd need to be at least at a per-file level (from __future__ style... but called something else as this isn't likely to be our future default). A behavior change of this magnitude globally for a program with -O would just reinforce the existing practice of nobody practically using -O. (does anyone actually run their tests under -O let alone deploy with -O? I expect that to be a single digit % or lower minority)

-gps

[Guido]
My guess: it would overwhelmingly free tiny objects, giving a literally unmeasurable (just theoretically provable) memory savings, at the cost of adding extra trips around the eval loop. So not really attractive to me. But when I leave "large" temp objects hanging and give a rip, I already stick in "del" statements anyway. Very rarely, but it happens. Which is addressing it at a higher level than any other feedback you're going to get ;-) Of course there can be visible consequences when people are playing with introspection gimmicks.

On Wed, Apr 8, 2020 at 10:13 AM Tim Peters <tim.peters@gmail.com> wrote:
Yeah, the extra opcodes might well kill the idea for good.
I recall that in the development of asyncio there were a few places where we had to insert del statements, not so much to free a chunk of memory, but to cause some destructor or finalizer to run early enough. (IIRC not right at that moment, but at some later moment even if some Futures are still alive.) Those issues took considerable effort to find, and would conceivably have been prevented by this proposal.
Plus think about all the debugging that would ensue because destructors/finalizers are running *earlier* than expected. ;-)

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Wed, Apr 8, 2020, 10:37 AM Antoine Pitrou <solipsis@pitrou.net> wrote:
As far as I know they all do? The existence of locals() as an API cements this behavior. If you want something to disappear from locals it requires an explicit del. (explicit is better than implicit and all...) I'd actually accept this optimization in something like micropython where bending rules to fit in mere kilobytes makes sense. But in CPython I want to see serious demonstrated practical benefit before changing this behavior in any file by default. (it could be implemented per file based on a declaration; this would be a bytecode optimization pass) -gps

On Thu, 9 Apr 2020 20:56:56 -0700 "Gregory P. Smith" <greg@krypto.org> wrote:
I mean all Python implementations would have to implement the exact same variant of live variable analysis. Right now there is none: variables are deleted when the frame dies (or when `del x` is issued explicitly, which does not imply any analysis :-)). Regards Antoine.

On Wed, 8 Apr 2020 09:53:41 -0700 Guido van Rossum <guido@python.org> wrote:
The problem is if variable `x` has a side-effect destructor. It's certainly not a common idiom to keep a resource alive simply by keeping a Python object around (you should probably use `with` instead), but I'm sure some people do it. FWIW, Numba does something similar to try and release memory earlier. But Numba certainly doesn't claim to support general-purpose Python code :-) Regards Antoine.
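A small runnable illustration of that hazard, with the assumptions labelled (the Handle class and do_queries are invented for the example; real code would more likely hold a socket or file): the local name is the only thing keeping the resource alive, so releasing it right after its last textual use would run the side-effecting destructor too early.

    class Handle:
        def __init__(self, name):
            self.name = name
            print("open", name)

        def __del__(self):              # side-effecting destructor
            print("close", self.name)

    def do_queries():
        print("querying")               # implicitly needs the handle to still be open

    def work():
        h = Handle("db")                # h is never loaded again below...
        # (an automatic "del h" would be tempting right here)
        do_queries()                    # ...but the handle must stay open during this call
        return "done"

    work()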

On Apr 8, 2020, at 09:57, Guido van Rossum <guido@python.org> wrote:
It depends on how much you’re willing to break and still call it “safely”.

    def sleep(n):
        global store
        store = inspect.currentframe().f_back.f_locals['x']

This is a ridiculous example, but it shows that you can’t have all of Python’s dynamic functionality and still know when locals are dead. And there are less ridiculous examples with different code. If foo actually calls eval, exec, locals, vars, etc., or if it has a nested function that nonlocals x, etc., how can we spot that at compile time and keep x alive?

Maybe that’s ok. After all, that code doesn’t work in a Python implementation that doesn’t have stack frame support. Some of the other possibilities might be more portable, but I don’t know without digging in further. Or maybe you can add new restrictions to what locals and eval and so on guarantee that will make it ok? Some code will break, but only rare “expert” code, where the authors will know how to work around it.

Or, if not, it’s definitely fine as an opt-in optimization: decorate the function with @deadlocals and that decorator scans the bytecode and finds any locals that are dead assuming there’s no use of locals/eval/cells/etc. and, because you told it to assume that by opting in to the decorator, it can insert a DELETE_FAST safely. People already do similar things today—e.g., I’ve (only once in live code, but that’s still more than zero) used a @fastconst decorator that turns globals into consts on functions that I know are safe and are bottlenecks, and this would be no different. And of course you can add a recursive class decorator, or an import hook (or maybe even a command line flag or something) that enables it everywhere (maybe with a @nodeadlocals decorator for people who want it _almost_ everywhere but need to opt out one or two functions).

Did Victor Stinner explore this as one of the optimizations for FAT Python/PEP 511/etc.? Maybe not, since it’s not something you can insert a guard, speculatively do, and then undo if the guard triggers, which was I think his key idea.

Could something just heuristically add del statements with an AST transformation that we could review with source control before committing?

When the gc pause occurs is something I don't fully understand. For example:

FWIW, this segfaults CPython in 2 lines:

    import ctypes
    ctypes.cast(1, ctypes.py_object)

Interestingly, this (tends to?) work; even when there are ah scope closures?:

    import ctypes, gc

    x = 22
    _id = id(x)
    del x
    gc.collect()
    y = ctypes.cast(_id, ctypes.py_object).value
    assert y == 22

Adding explicit calls to del with e.g. redbaron or similar would likely be less surprising.
https://redbaron.readthedocs.io/en/latest/

Or something like a @jit decorator that pros would be aware of (who could just add del statements to free as necessary)

On Wed, Apr 8, 2020, 10:29 PM Caleb Donovick <donovick@cs.stanford.edu> wrote:
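A rough sketch of the "add del statements with an AST transformation, then review the diff" idea above (assumptions: it only looks at the top-level statements of simple functions, ignores loops, closures, locals() and eval(), needs Python 3.9+ for ast.unparse, and the class name InsertDels is invented here):

    import ast

    class InsertDels(ast.NodeTransformer):
        """Insert `del` after the statement containing the last load of each assigned local."""

        def visit_FunctionDef(self, node):
            assigned = {n.id for stmt in node.body for n in ast.walk(stmt)
                        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}
            last_use = {}
            for i, stmt in enumerate(node.body):
                for n in ast.walk(stmt):
                    if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load) and n.id in assigned:
                        last_use[n.id] = i
            new_body = []
            for i, stmt in enumerate(node.body):
                new_body.append(stmt)
                dead = sorted(name for name, j in last_use.items() if j == i)
                if dead and i < len(node.body) - 1:   # deleting at the very end is pointless
                    new_body.append(ast.Delete(
                        targets=[ast.Name(id=name, ctx=ast.Del()) for name in dead]))
            node.body = new_body
            return ast.fix_missing_locations(node)

    src = (
        "def foo(a, b):\n"
        "    x = a + b\n"
        "    if not x:\n"
        "        return None\n"
        "    sleep(1)\n"
        "    return a * b\n"
    )
    print(ast.unparse(InsertDels().visit(ast.parse(src))))  # the output now contains "del x"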

On Apr 8, 2020, at 23:53, Wes Turner <wes.turner@gmail.com> wrote:
Could something just heuristically add del statements with an AST transformation that we could review with source control before committing?
When the gc pause occurs is something I don't fully understand. For example:
Your examples don’t have anything to do with gc pause.
Yes, because this is ultimately trying to print the repr of (PyObject*)1, which means calling some function that tries to dereference some member of a struct at address 1, which means trying to access an int or pointer or whatever at address 1 or 9 or 17 or whatever. On most platforms, those addresses are going to be unmapped (and, on some, illegally aligned to boot), so you’ll get a segfault. This has nothing to do with the GC, or with Python objects at all.
The gc.collect isn’t doing anything here. First, the 22 object, like other small integers and a few other special cases, is immortal. Even after you del x, the object is still alive, so of course everything works. Even if you used a normal object that does get deleted, it would get deleted immediately when the last reference to the value goes away, in that del x statement. The collect isn’t needed and doesn’t do anything relevant here. (It’s there to detect reference cycles, like `a.b=b; b.a=a; del a; del b`. Assuming a and b were the only references to their objects at the start, a.b and b.a are the only references at the end. They won’t be deleted by refcounting because there’s still one reference to each, but they are garbage because they’re not accessible. The gc.collect is a cycle detector that handles exactly this case.)

But your code may well still often work on most platforms. Deleting an object rarely unmaps its memory; it just returns that memory to the object allocator’s store. Eventually that memory will be reused for another object, but until it is, it will often still look like a perfectly valid value if you cheat and look at it (as you’re doing). (And even after it’s reused, it will often end up getting reused by some object of the same shape, so you won’t crash, you’ll just get odd results.)

Anyway, getting off this side track and back to the main point: releasing the locals reference to an object that’s no longer being used locally isn’t guaranteed to destroy the object—but in CPython, if locals is the only reference, the object will be destroyed immediately. That’s why Guido’s optimization makes sense.

The only way gc pause is relevant is for other implementations. For example, if CPython stops guaranteeing that x is alive until the end of the scope under certain conditions, PyPy could decide to do the same thing, and in PyPy, there is no refcount; garbage is deleted when it’s detected by the GC. So it wouldn’t be deterministic when x goes away, and the question of how much earlier it goes away and how much benefit there is becomes more complicated than in CPython. But the PyPy guys seem to be really good at figuring out how to test such questions empirically.
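A short runnable sketch of the two mechanisms being separated here (the Node class is invented for the example; the weakref is used purely as an observer): refcounting frees an object the moment its last reference disappears, while gc.collect() only matters for reference cycles.

    import gc
    import weakref

    class Node:
        pass

    # Plain refcounting: gone as soon as the only strong reference is dropped.
    obj = Node()
    watch = weakref.ref(obj)
    del obj
    print(watch() is None)      # True -- no gc.collect() needed

    # A reference cycle: refcounts never reach zero, so only the collector frees it.
    a, b = Node(), Node()
    a.b, b.a = b, a
    watch = weakref.ref(a)
    del a, b
    print(watch() is None)      # False -- the cycle keeps both objects alive
    gc.collect()
    print(watch() is None)      # True -- the cycle detector found and freed them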

Thanks for removing the mystery.

FWIW, here are some of the docs and resources for memory management in Python; I share these not to be obnoxious or to atone, but to point to the docs that would need updating to explain what is going on if this is not explicit.

- https://docs.python.org/3/reference/datamodel.html#object.__del__
- https://docs.python.org/3/extending/extending.html?highlight=__del__#thin-ic...
- https://docs.python.org/3/c-api/memory.html
- https://docs.python.org/3/library/gc.html
- https://docs.python.org/3/library/tracemalloc.html
- https://devguide.python.org/gdb/
- https://devguide.python.org/garbage_collector/
- https://devguide.python.org/garbage_collector/#optimization-reusing-fields-t...
- https://doc.pypy.org/en/latest/gc_info.html
- https://github.com/jythontools/jython/blob/master/src/org/python/modules/gc....
  https://javadoc.io/doc/org.python/jython-standalone/2.7.2/org/python/modules/gc.html
- https://github.com/IronLanguages/ironpython2/blob/master/Src/IronPython.Modu...
  https://github.com/IronLanguages/ironpython2/blob/master/Src/StdLib/Lib/test...
  https://github.com/IronLanguages/ironpython2/blob/master/Src/StdLib/Lib/test...
  https://github.com/IronLanguages/ironpython3/blob/master/Src/IronPython.Modu...
- "[Python-Dev] Re: Mixed Python/C debugging" https://mail.python.org/archives/list/python-dev@python.org/message/Z3S2RAXR...
- @anthonypjshaw's CPython Internals book has a (will have a) memory management chapter.
- > And then take a look at how @ApacheArrow "supports zero-copy reads for lightning-fast data access without serialization overhead."
- .@blazingsql … #cuDF … @ApacheArrow https://docs.blazingdb.com/docs/blazingsql … New #DataFrame Interface and when that makes a copy for 2x+ memory use
- "A dataframe protocol for the PyData ecosystem" https://discuss.ossdata.org/t/a-dataframe-protocol-for-the-pydata-ecosystem/...

Presumably, nothing about magic del statements would affect C extensions, Cython, zero-copy reads, or data that's copied to the GPU for faster processing; but I don't understand this or how weakrefs and c-extensions share memory that could be unlinked by a del.

Would be interested to see the real performance impact of this potential optimization:

- 10%: https://instagram-engineering.com/dismissing-python-garbage-collection-at-in...

On Thu, Apr 9, 2020 at 2:48 PM Andrew Barnert <abarnert@yahoo.com> wrote:

On Apr 9, 2020, at 15:13, Wes Turner <wes.turner@gmail.com> wrote:
This isn’t relevant here at all. How objects get constructed and manage their internal storage is completely orthogonal to the how Python manages object lifetimes.
Same here.
Presumably, nothing about magic del statements would affect C extensions, Cython, zero-copy reads, or data that's copied to the GPU for faster processing; but I don't understand this or how weakrefs and c-extensions share memory that could be unlinked by a del.
And same for some of this—but not all.

C extensions can do the same kind of frame hacking, etc., as Python code, so they will have the same problems already raised in this thread. But I don’t think they add anything new. (There are special rules allowing you to cheat with objects that haven’t been shared with Python code yet, which sounds like it would make things more complicated—until you realize that objects that haven’t been shared with Python code obviously can’t be affected by when Python code releases references.)

But weakrefs would be affected, and that might be a problem with the proposal that I don’t think anyone else has noticed before you. Consider this toy example:

    spam = make_giant_spam()
    weakspam = weakref.ref(spam)
    with ThreadPoolExecutor() as e:
        for _ in range(1000):
            e.submit(dostuff, weakspam)

Today, the spam variable lives until the end of the scope, which doesn’t happen until the with statement ends, which doesn’t happen until all 1000 tasks complete. So, the object in that variable is still alive for all of the tasks.

With Guido’s proposed change, the spam variable is deleted after the last statement that uses it, which is before the with statement is even entered. Assuming it’s the only (non-weak) reference to the object, which is probably true, it will get destroyed, releasing all the memory (or other expensive resources) used by that giant spam object. That’s the whole point of the proposal, after all. But that means weakspam is now a dead weakref. So all those dostuff tasks are now doing stuff with a dead weakref. Presumably dostuff is designed to handle that safely, so you won’t crash or anything—but it can’t do the actual stuff you wanted it to do with that spam object.

And, while this is obviously a toy example, perfectly reasonable real code will do similar things. It’s pretty common to use weakrefs for cases where 99% of the time the object is there but occasionally it’s dead (e.g., during graceful shutdown), and changing that 99% to 0% or 1% will make the entire process useless. It’s also common to use weakrefs for cases where 80% of the time the object is there but 20% of the time it’s been ejected from some cache and has to be regenerated; changing that 80% to 1% will mean the process still functions, but the cache is no longer doing anything, so it functions a lot slower. And so on.

So, unless you could introduce some compiler magic to detect weakref.ref and weakref.weakdict.__setitem__ and so on (which might not be feasible, especially since it’s often buried inside some wrapper code), this proposal might well break many, maybe even most, good uses of weakrefs.
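A reduced, runnable version of that hazard (GiantSpam stands in for the expensive object; no executor is needed to see the effect): once the only strong reference is released early, every later dereference of the weakref finds it already dead.

    import weakref

    class GiantSpam:
        pass

    spam = GiantSpam()
    weakspam = weakref.ref(spam)
    print(weakspam() is spam)   # True: the local name keeps the object alive
    del spam                    # roughly what an automatic early release would amount to
    print(weakspam())           # None: the "workers" would now see a dead weakref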
Would be interested to see the real performance impact of this potential optimization: - 10%: https://instagram-engineering.com/dismissing-python-garbage-collection-at-in...
Skimming this, it looks like this one is not just orthogonal to Guido’s proposal, it’s almost directly counter to it. Their goal is to have relatively short-lived killable children that defer refcount twiddling and destruction as much as possible so that fork-inherited objects don’t have to be copied and temporary objects don’t have to be cleaned up, they can just be abandoned. Guido’s goal is to get things decref’d and therefore hopefully destroyed as early as possible.

Anyway, their optimization is definitely useful for a special class of programs that meet some requirements that sound unusual until you realize a lot of web servers/middlewares are designed around nearly the same requirements. People have done similar (in fact, even more radical, akin to building CPython and all of your extensions with refcounting completely disabled) in C and other languages, and there’s no reason (if you’re really careful) it couldn’t work in Python. But it’s certainly not the behavior you’d want from a general-purpose Python implementation.

participants (12)
- Alex Hall
- Andrew Barnert
- Antoine Pitrou
- Brandt Bucher
- Brett Cannon
- Caleb Donovick
- Gregory P. Smith
- Guido van Rossum
- MRAB
- Tim Peters
- Tolo Palmer
- Wes Turner