On Thu, Dec 23, 2010 at 20:30, Dima Tisnek <dimaqq@gmail.com> wrote:
Basically collecting this is hard:
dict(a=range(9**9))
large list is referenced, the object that holds the only reference is small no matter how you look at it.

First, usually (in most GC-ed languages) you can collect the list before the dict. In PyPy, if finalizers are involved (is this the case here? That'd be surprising), this is no longer true.
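To make the quoted example concrete, here is a minimal sketch of the size asymmetry between the dict and the list it references. It assumes CPython 3 (where `range()` is lazy, hence the explicit `list()`), and it uses 10**6 elements as a stand-in for the 9**9 of the original. `sys.getsizeof` reports shallow sizes only and may not be available on other interpreters.

```python
import sys

# Assumption: CPython 3. 10**6 is a smaller stand-in for 9**9.
big = dict(a=list(range(10**6)))

# Shallow size of the dict itself: it holds a single key/value slot.
dict_size = sys.getsizeof(big)

# Shallow size of the list: its array of element pointers,
# not counting the integer objects it points to.
list_size = sys.getsizeof(big["a"])

print("dict:", dict_size, "bytes; list:", list_size, "bytes")
```

The dict stays tiny however long the list grows, which is why looking at the size of the referencing object says nothing about how much memory becomes free once it dies.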
However, object size is not the point. For standard algorithms, the size of an object does not matter at all in deciding when it is collected. I already discussed this in my other email in this thread, where I noted what could actually happen in the examples described by Armin, and your examples show that this is a good property. A large object in the same heap can still fill it up and trigger an earlier garbage collection.

In general, if the GC ran in the background (it usually doesn't, and it doesn't in PyPy), it could make sense to free objects sooner or later depending not on object size but on "how much memory would be 'indirectly freed' by freeing this object". However, because of sharing, answering this question is too complex (it requires collecting data from the whole heap). Moreover, the whole idea makes no sense with the usual stop-the-world collectors: the app is stopped, then the whole young generation, or the whole heap, is collected, then the app is resumed.

When separate heaps are involved (such as with ctypes, or with Large Object Spaces, which avoid using a copying collector for large objects), it is more complicated to ensure that the same property holds: you need to consider the stats of all heaps to decide whether to trigger a GC.
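The point about separate heaps can be sketched with a toy model. This is not PyPy's GC: the class `ToyHeap`, its fields, and the 4 MB threshold are all hypothetical, and `collect()` simply assumes that everything allocated so far is garbage. It only illustrates that a trigger which ignores memory held outside the main heap never fires, while one that sums the stats of all heaps does.

```python
MB = 1024 * 1024


class ToyHeap:
    """Toy trigger policy for a stop-the-world collector (hypothetical)."""

    def __init__(self, threshold_bytes, count_external):
        self.threshold_bytes = threshold_bytes
        self.count_external = count_external
        self.managed_bytes = 0   # memory inside the GC-managed heap
        self.external_bytes = 0  # memory outside it, e.g. raw C buffers
        self.collections = 0

    def allocate(self, nbytes, external=False):
        # Record the allocation in the heap it belongs to.
        if external:
            self.external_bytes += nbytes
        else:
            self.managed_bytes += nbytes
        # Decide whether to collect, using either one heap or all of them.
        pressure = self.managed_bytes
        if self.count_external:
            pressure += self.external_bytes
        if pressure >= self.threshold_bytes:
            self.collect()

    def collect(self):
        # Simplifying assumption: everything allocated so far is dead.
        self.collections += 1
        self.managed_bytes = 0
        self.external_bytes = 0


combined = ToyHeap(threshold_bytes=4 * MB, count_external=True)
managed_only = ToyHeap(threshold_bytes=4 * MB, count_external=False)

# Ten 1 MB buffers allocated outside the managed heap, each referenced
# by a small wrapper object whose own size we ignore here.
for _ in range(10):
    combined.allocate(1 * MB, external=True)
    managed_only.allocate(1 * MB, external=True)

print("combined stats:", combined.collections, "collections")
print("managed only:", managed_only.collections, "collections,",
      managed_only.external_bytes // MB, "MB still held")
```

In this sketch the policy that sums both heaps collects every time 4 MB accumulate, while the one that looks only at the managed heap lets the external memory grow without bound.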
I guess it gets harder still if there are many small live objects, as getting to this dict takes a while (easier in this simple case with a generational collector, O(n) in the general case)
Not sure what you mean; I can make sense of it (and only partly) only with an incremental collector, and those are still seldom used (in particular, not in PyPy).

Best regards
On 23 December 2010 06:38, Armin Rigo <arigo@tunes.org> wrote:
Hi René,
On Thu, Dec 23, 2010 at 2:33 PM, René Dudfield <renesd@gmail.com> wrote:
I think this is a case where the object returned by ctypes.create_string_buffer() could use a correct __sizeof__ method return value. If pypy supported that, then the GCs could support extensions and 'opaque' data structures in C a little more nicely, too.
I think you are confusing levels. There is no way the GC can call some app-level Python method to get information about the objects it frees (and when would it even call it?). Remember that our GC is written at a level where it works for any interpreter for any language, not just Python.
A bientôt,
Armin.
_______________________________________________
pypy-dev@codespeak.net
http://codespeak.net/mailman/listinfo/pypy-dev
--
Paolo Giarrusso - Ph.D. Student
http://www.informatik.uni-marburg.de/~pgiarrusso/