Wenjun,

I feel we're just not communicating. Your suggestion seems to be a solution in search of a problem. And now you're making more super speculative suggestions. How much do you actually know about Python's internals? It's not at all like C++, where I could see the distinction between user allocations and system allocations making sense.

--Guido

On Mon, Jul 20, 2020 at 7:25 PM Wenjun Huang <wenjunhuang@umass.edu> wrote:
Hi Guido,

Thank you for bearing with me. I wasn't trying to say you guys are mean btw.

I thought that the interpreter might allocate some memory for its own use. Perhaps I was wrong, but I'll work with your examples here just to be sure.

Stack frames would be considered interpreter objects here, since they aren't created because a user object is created; they are byproducts of function calls. Following that, empty spaces in hash tables and cached string hashes would be considered user allocations, since they exist because of explicitly created objects. I think a transitive rule would work here (i.e., if an explicit object allocation triggers an implicit allocation, then the latter is considered a user allocation).
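A toy sketch of that transitive rule (everything here is hypothetical; CPython exposes no such allocation hook today, so the "allocation tree" is simulated):

```python
from dataclasses import dataclass, field

@dataclass
class Alloc:
    name: str
    explicit: bool = False                        # directly requested by user code?
    children: list = field(default_factory=list)  # allocations this one triggered

def classify(alloc, inherited=False):
    # "user" if explicit, or transitively triggered by an explicit allocation
    is_user = alloc.explicit or inherited
    labels = {alloc.name: "user" if is_user else "interpreter"}
    for child in alloc.children:
        labels.update(classify(child, inherited=is_user))
    return labels

# A user-created dict drags its internal hash table along as "user":
d = Alloc("dict object", explicit=True, children=[Alloc("hash table slots")])
# A stack frame comes from the call machinery, not from an object creation:
frame = Alloc("stack frame")

print(classify(d))      # {'dict object': 'user', 'hash table slots': 'user'}
print(classify(frame))  # {'stack frame': 'interpreter'}
```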

Now, maybe getting this to work doesn't benefit profiler users so much, but there are other potential uses as well. Hopefully they can be more compelling. I didn't bring these up earlier because I thought the profiling case was easier to discuss.

For example, provenance of data can be tracked through taint analysis, but if all objects are lumped together then we have to taint the entire interpreter.
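To make the object-level idea concrete, here's a minimal taint-propagation sketch (a hypothetical wrapper, not a real taint framework) -- the point is that marks live on individual user objects, which only works if user objects are distinguishable:

```python
class Tainted(str):
    """A string whose provenance is an untrusted source."""

def taint(value):
    return Tainted(value)

def concat(a, b):
    # Propagate the mark: any output derived from a tainted input is tainted.
    out = str(a) + str(b)
    return Tainted(out) if isinstance(a, Tainted) or isinstance(b, Tainted) else out

user_input = taint("'; DROP TABLE users; --")
query = concat("SELECT * FROM t WHERE name = ", user_input)
print(isinstance(query, Tainted))  # True
```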

Another example would be partially sidestepping the GIL. The approach would be to promote threads into processes and allocate all user objects in shared memory (with accesses synchronized). That way we get parallel execution while keeping threading semantics. However, this isn't possible if we can't isolate user objects, because there's no sensible default for synchronizing interpreter state. A similar design has been done before for C/C++ (https://people.cs.umass.edu/~emery/pubs/dthreads-sosp11.pdf), though for different reasons.
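For the shared-memory half, Python 3.8's multiprocessing.shared_memory already provides the primitive. A single-process demo below (a real design would attach from a child process by name; synchronization is omitted for brevity):

```python
from multiprocessing import shared_memory

# Create a named shared block; another process could attach via its name.
shm = shared_memory.SharedMemory(create=True, size=16)
shm.buf[:5] = b"hello"

# Simulate the "other process" attaching (same process here, for brevity):
view = shared_memory.SharedMemory(name=shm.name)
payload = bytes(view.buf[:5])
print(payload)  # b'hello'

view.close()
shm.close()
shm.unlink()
```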

On Mon, Jul 20, 2020 at 8:16 PM Guido van Rossum <guido@python.org> wrote:
On Mon, Jul 20, 2020 at 4:09 PM Wenjun Huang <wenjunhuang@umass.edu> wrote:
Hi Barry,

It's not just about leaks. You might want to know if certain objects are occupying a lot of memory by themselves. Then you can optimize the memory usage of these objects.
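For reference, the stdlib's tracemalloc already attributes allocations to Python source lines, which is roughly the "user object" half of the picture I'm describing:

```python
import tracemalloc

tracemalloc.start()
data = [bytes(1000) for _ in range(100)]  # ~100 KB of user allocations
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

stats = snapshot.statistics("lineno")
for stat in stats[:3]:
    print(stat)                           # the `data` line should dominate
total = sum(stat.size for stat in stats)
```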

Another possibility is to use binary instrumentation to see how user code interacts with objects. If we can't tell which objects are created by the interpreter internals, then interpreter accesses and user accesses get mixed together. Some accesses would inevitably be intertwined, of course, but I don't think the idea should be dismissed as useless outright.

I have to side with Barry -- I don't understand why the difference between "interpreter internals" and "user objects" matters. Can you give some examples of interpreter internals that aren't being allocated in direct response to user code? For example, you might call stack frames internals. But a stack frame is only created when a user calls a function, so maybe that's a user object too? Or take dictionaries. These contain hash tables with empty spaces in them. Are the empty spaces internals? Or strings. These cache the hash value. Are the 8 bytes for the hash value interpreter internals?
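For concreteness, sys.getsizeof makes the blurriness visible: a dict's reported size is its hash table (empty slots included), not the objects it references:

```python
import sys

payload = "x" * 10_000
d = {"k": payload}
print(sys.getsizeof(d))        # small: just the hash table, empty slots and all
print(sys.getsizeof(payload))  # ~10 KB: the string, cached-hash field included

grown = {i: None for i in range(100)}
print(sys.getsizeof(grown))    # a bigger table, part of it empty slots
```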

So, here's my request -- can you clarify your need for the differentiation? Other than just pointing to Scalene. If Scalene has a reason for making this differentiation can you explain what Scalene users get out of this? Suppose Scalene tells me "your objects take 84.3% of the memory and interpreter internals take the other 15.7%" what can I as a user do with that information?
 
Also, I'm not saying "we must implement this because it's so useful."
My original intention is to understand:
(1) is the differentiation being done at all?

It's not. We're not being mean here. If it were being done someone would have told you after your first message.
 
(2) if it's not being done, why?

Because nobody saw a need for it. In fact, apart from you, there still isn't anyone who sees the need for it, since you haven't explained your need. (This, too, should have been obvious to you given the responses you've gotten so far. :-)
 
(3) does it make sense to implement it?

Probably not. I certainly don't expect it to be easy. So it won't "make sense" unless you have actually explained your reason for wanting this and convinced some folks that this is a good reason. See the answers to (1) and (2) above.
 
So far I think I've got the answers to 1 & 2--it's not being done because people don't find it useful. The answer to 3 is most likely "no" due to the costs, but it would be nice if someone could weigh in on this part. Maybe there's some workaround.

If you were asking me to weigh in *now* I'd say "no", if only because you haven't explained the reason why this is needed. And if you have an implementation idea in mind, please don't be shy.
 
--
--Guido van Rossum (python.org/~guido)


--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)