Isolating user allocations

Hello,

The API provided by PEP 445 makes it possible to intercept allocation requests through hooks, but it seems that both user allocations and interpreter allocations are sent to the hooks. Here, user allocations refer to those triggered explicitly by user code (e.g. the memory allocated to hold the integer created by x = 1), and interpreter allocations refer to everything else (e.g. memory allocated for internal state).

I've poked around a bit in the interpreter source code, and I don't think this differentiation is being made at all; all allocations go through the same set of APIs. If that's indeed the case, why is the interpreter implemented this way? Would it make sense to implement the differentiation?

Thanks
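For readers who haven't used the API in question, here is a minimal sketch of the PEP 445 hook mechanism, written against the current PyMemAllocatorEx form of the interface (PEP 445 originally defined it without the calloc slot). The counting logic and function names are purely illustrative, not part of any existing tool; the point is simply that the hook receives every request in the hooked domain, with nothing in the call identifying whether it came from user code or from interpreter internals.

    #include <Python.h>

    /* Saved copy of the allocator we are wrapping. */
    static PyMemAllocatorEx orig_alloc;
    /* Illustrative bookkeeping: every allocation, user or internal, lands here. */
    static size_t alloc_count = 0;

    static void *hook_malloc(void *ctx, size_t size)
    {
        alloc_count++;
        return orig_alloc.malloc(orig_alloc.ctx, size);
    }

    static void *hook_calloc(void *ctx, size_t nelem, size_t elsize)
    {
        alloc_count++;
        return orig_alloc.calloc(orig_alloc.ctx, nelem, elsize);
    }

    static void *hook_realloc(void *ctx, void *ptr, size_t new_size)
    {
        alloc_count++;
        return orig_alloc.realloc(orig_alloc.ctx, ptr, new_size);
    }

    static void hook_free(void *ctx, void *ptr)
    {
        orig_alloc.free(orig_alloc.ctx, ptr);
    }

    /* Install the hook for the object domain (PyObject_Malloc and friends). */
    void install_counting_hook(void)
    {
        PyMemAllocatorEx hook = {NULL, hook_malloc, hook_calloc,
                                 hook_realloc, hook_free};
        PyMem_GetAllocator(PYMEM_DOMAIN_OBJ, &orig_alloc);
        PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &hook);
    }

The same pattern applies to the PYMEM_DOMAIN_RAW and PYMEM_DOMAIN_MEM domains; tracemalloc in the standard library is built on exactly this interception point.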

What purpose do you have in mind for making this distinction? Even if it could be done easily (which I doubt), why would this be useful?
--Guido (mobile)

Hi Guido,

It's great to hear from you directly :)

Sorry for not mentioning this earlier. The use case here is profiling, specifically Scalene: https://github.com/emeryberger/scalene. At the moment, Scalene does stack inspection to decide whether an allocation comes from user code. If the hooks themselves made that distinction, it would arguably be more efficient. Additionally, users might want to know about memory allocated for their own objects rather than by the interpreter internals.

If the interpreter doesn't have this functionality already, it would probably be difficult to add, as the existing allocation API is called throughout the interpreter code base.

Thanks
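To make the stack-inspection idea concrete, here is a hedged sketch of the general approach (this is not Scalene's actual implementation, and the helper name and user_dir parameter are made up for illustration): when an allocation arrives, look at the Python frame currently executing on this thread and attribute the allocation to user code if its file lives under the user's source tree. Direct access to f_code assumes the CPython versions current at the time of this thread (3.8/3.9); newer versions would use PyFrame_GetCode() instead.

    #include <string.h>
    #include <Python.h>
    #include <frameobject.h>

    static int classifying = 0;  /* re-entrancy guard, see comment below */

    /* Hypothetical helper: returns 1 if the Python frame currently executing
       on this thread comes from a file under user_dir, 0 otherwise.  It is
       meant to be called from an OBJ- or MEM-domain hook, which always runs
       with the GIL held.  Calling back into the C API can itself allocate
       (e.g. building the UTF-8 cache of the filename), so the guard treats
       any allocation made during classification as internal rather than
       recursing into itself. */
    static int allocation_is_from_user(const char *user_dir)
    {
        if (classifying)
            return 0;
        classifying = 1;

        int is_user = 0;
        PyFrameObject *frame = PyEval_GetFrame();  /* borrowed reference */
        if (frame != NULL) {  /* NULL during startup, finalization, C threads */
            const char *path = PyUnicode_AsUTF8(frame->f_code->co_filename);
            if (path != NULL)
                is_user = strncmp(path, user_dir, strlen(user_dir)) == 0;
            else
                PyErr_Clear();  /* a hook must not leave an exception set */
        }

        classifying = 0;
        return is_user;
    }

Even with a helper like this called directly from a C-level hook, walking frames and decoding filenames on every allocation is not free, which is presumably where the efficiency argument above comes from.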

On 20 Jul 2020, at 18:51, Wenjun Huang <wenjunhuang@umass.edu> wrote:
At the moment, Scalene does stack inspection to decide if an allocation is from the user. If there are hooks helping with the differentiation, it's arguably more efficient. Additionally, users might want to know about memory allocated explicitly by their objects instead of by the interpreter internals.
In my day job and open source projects I care about memory leaks and fixing them.

If my code creates a list, you call that user-allocated? And if memory increases as part of Python's bookkeeping, that is counted as something else? Why is that interesting?

When looking for memory leaks I have not cared about that distinction. Should I be caring? Am I missing something?

Barry

Hi Barry,

It's not just about leaks. You might want to know whether certain objects are occupying a lot of memory by themselves, so that you can optimize the memory usage of those objects.

Another possibility is to do binary instrumentation and see how the user code interacts with objects. If we can't tell which objects are created by the interpreter internals, then interpreter accesses and user accesses are mixed together. Some of those accesses are of course related, but I don't think the distinction should be dismissed as useless outright.

Also, I'm not saying "we must implement this because it's so useful." My original intention was to understand: (1) is the differentiation being done at all? (2) if not, why not? (3) does it make sense to implement it? So far I think I have the answers to (1) and (2): it's not being done because people haven't found it useful. The answer to (3) is most likely "no" due to the costs, but it would be nice if someone could weigh in on that part. Maybe there's some workaround.

Thanks

On Mon, Jul 20, 2020 at 4:09 PM Wenjun Huang <wenjunhuang@umass.edu> wrote:
Hi Barry,
It's not just about leaks. You might want to know if certain objects are occupying a lot of memory by themselves. Then you can optimize the memory usage of these objects.
Another possibility is to do binary instrumentation and see how the user code is interacting with objects. If we can't tell which objects are created by the interpreter internals, then interpreter accesses and user accesses would be mixed together. It's likely that some accesses would be connected of course, but I don't think this should be outright labeled as useless.
I have to side with Barry -- I don't understand why the difference between "interpreter internals" and "user objects" matters. Can you give some examples of interpreter internals that aren't being allocated in direct response to user code? For example, you might call stack frames internals. But a stack frame is only created when a user calls a function, so maybe that's a user object too? Or take dictionaries. These contain hash tables with empty spaces in them. Are the empty spaces internals? Or strings. These cache the hash value. Are the 8 bytes for the hash value interpreter internals?

So, here's my request -- can you clarify your need for the differentiation, other than just pointing to Scalene? If Scalene has a reason for making this differentiation, can you explain what Scalene users get out of it? Suppose Scalene tells me "your objects take 84.3% of the memory and interpreter internals take the other 15.7%" -- what can I as a user do with that information?
Also, I'm not saying "we must implement this because it's so useful." My original intention is to understand: (1) is the differentiation being done at all?
It's not. We're not being mean here. If it were being done, someone would have told you after your first message.
(2) if it's not being done, why?
Because nobody saw a need for it. In fact, apart from you, there still isn't anyone who sees the need for it, since you haven't explained your need. (This, too, should have been obvious to you given the responses you've gotten so far. :-)
(3) does it make sense to implement it?
Probably not. I certainly don't expect it to be easy. So it won't "make sense" unless you have actually explained your reason for wanting this and convinced some folks that it is a good reason. See the answers to (1) and (2) above.
So far I think I've got the answers to 1 & 2--it's not being done because people don't find it useful. The answer to 3 is most likely "no" due to the costs, but it would be nice if someone could weigh in on this part. Maybe there's some workaround.
If you were asking me to weigh in *now*, I'd say "no", if only because you haven't explained the reason why this is needed. And if you have an implementation idea in mind, please don't be shy.

--
Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)

Hi Guido,

Thank you for bearing with me. I wasn't trying to say you guys are mean, by the way.

I thought that the interpreter might allocate some memory purely for its own use. Perhaps I was wrong, but I'll work with your examples here just to be sure. Stack frames would be considered interpreter objects here, as they aren't created because a user object is created; they are the result of function calls. Following that, the empty spaces in hash tables and the cached string hashes would be considered user allocations, as they exist because of explicitly created objects. I think a transitive relation would work here (i.e. if an explicit object allocation triggers an implicit allocation, then the latter is also considered a user allocation).

Now, maybe getting this to work doesn't benefit profiler users that much, but there are other potential uses as well; hopefully they are more compelling. I didn't bring these up earlier because I thought the profiling case was easier to discuss.

For example, the provenance of data can be tracked through taint analysis, but if all objects are lumped together then we have to taint the entire interpreter.

Another example would be partially sidestepping the GIL. The approach would be to split threads into processes and allocate all user objects in shared memory (with synchronized accesses). That way we get parallel execution with threading semantics. However, this is not possible if we can't isolate user objects, as there is no sensible default for synchronizing interpreter state. This design has been done before for C/C++ (https://people.cs.umass.edu/~emery/pubs/dthreads-sosp11.pdf), though for different reasons.

Wenjun,

I feel we're just not communicating. Your suggestion seems to be a solution in search of a problem, and now you're making even more speculative suggestions. How much do you actually know about Python's internals? It's not at all like C++, where I could see the distinction between user allocations and system allocations making sense.

--Guido

Hi Guido,

I guess I didn't think it through. Thanks for all your comments!

Regards
participants (4)
- Barry
- Guido van Rossum
- Wenjun Huang
- wenjunhuang@umass.edu