[Python-Dev] Proposal: Run GC less often
Leif Walsh
adlaiff6 at gmail.com
Sat Jun 21 23:33:15 CEST 2008
If you can get me a version of the interpreter with this change made
(I wouldn't know what to change), I can run a very
allocation/deallocation-heavy application I have lying around, and get
you some good benchmarks.
On Sat, Jun 21, 2008 at 1:23 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
- Show quoted text -
> Here is my proposal for making the GC run less often.
> The objective is to avoid the quadratic-time behavior
> if many objects are created and none of them is garbage.
>
> The youngest generation remains collected after 700
> allocations, the middle generation after 10 young collections.
> Therefore, full GC remains considered every 7000 allocations.
> The two youngest generations don't cause quadratic behavior,
> as the number of objects in each generation is bounded.
>
> Currently, full GC is invoked every 10 middle collections.
> Under my proposal, 10 middle collections must have passed,
> PLUS the number of survivor objects from the middle generation
> must exceed 10% of the number of objects in the oldest
> generation.
>
> As a consequence, garbage collection becomes less frequent
> as the number of objects on the heap grows, resulting in
> an amortized O(N) overhead for garbage collection (under
> the access pattern stated above).
>
> To determine the number of survivor objects, counts are
> taken during the collection. Objects deallocated through
> reference counting still remain accounted for as survivors
> (likewise for determining the size of the oldest generation).
>
> I made a simulation assuming an initially-empty heap,
> and invocations of the collector every M=7000 objects.
> The table below is in units of M and shows the number
> of objects allocated, and the number of objects inspected
> since the start of the simulation, for the old and the
> new scheme (the asterisk indicates that a collection
> took place; lines with no collection are dropped):
>
> total old_inspected new_inspected
> 10 10* 10*
> 20 30* 30*
> 30 60* 60*
> 40 100* 100*
> 50 150* 150*
> 60 210* 210*
> 70 280* 280*
> 80 360* 360*
> 90 450* 450*
> 100 550* 450
> 102 550 552*
> 110 660* 552
> 115 660 667*
> 120 780* 667
> 129 780 796*
> 130 910* 796
> 140 1050* 796
> ...
> 940 44650* 7724
> 942 44650 8666*
> 950 45600* 8666
> 960 46560* 8666
> 970 47530* 8666
> 980 48510* 8666
> 990 49500* 8666
> 1000 50500* 8666
> ...
> 9990 4995000* 95887
>
> As you can see, the old and the new scheme would start
> of equally until 100M objects have been allocated. The
> table shows how the old scheme grows quadratically, and
> the new scheme eventually approaches roughly a factor
> of ten between the number of objects and the number of
> times that GC considers an object.
>
> Applications with a small number of objects will see no
> change in behavior, also, applications that create
> lots of small cycles will continue to see them collected
> in the youngest or middle generations.
>
> Please let me know what you think.
>
> Regards,
> Martin
>
> P.S. I don't plan to implement this myself, so if you like
> the idea, go ahead.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/adlaiff6%40gmail.com
>
>
--
Cheers,
Leif
More information about the Python-Dev
mailing list