[pypy-dev] Pypy garbage collection
Martin Koch
mak at issuu.com
Mon Mar 17 11:46:17 CET 2014
Based on Maciej's suggestion, I tried the following:
PYPYLOG=- pypy mem.py 10000000 > out
This generates a logfile which looks something like this
start-->
[2b99f1981b527e] {gc-minor
[2b99f1981ba680] {gc-minor-walkroots
[2b99f1981c2e02] gc-minor-walkroots}
[2b99f19890d750] gc-minor}
[snip]
...
<--stop
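As an aside, the gc-only filter Maciej mentions further down should cut the
log to just these sections; I haven't verified that it produces exactly the
same lines as the unfiltered run above:
PYPYLOG=gc:- pypy mem.py 10000000 > out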
It turns out that the culprit is a lot of MINOR collections.
I base this on the following observations:
- I can't understand the format of the timestamp on each log line (the
"[2b99f1981b527e]"). From what I can see in the code, this should be output
from time.clock(), but that doesn't return a number like that when I run
pypy interactively.
- Instead, I count the number of debug lines between start--> and the
corresponding <--stop (see the sketch below).
- Most runs have a few hundred lines of output between start/stop.
- All slow runs have very close to 57800 lines of output between
start/stop.
- One such sample does 9609 gc-collect-step operations, 9647 gc-minor
operations, and 9647 gc-minor-walkroots operations.
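For reference, the counting is roughly the following (a quick sketch, not the
exact script I ran; it assumes the start-->/<--stop markers are printed by my
own instrumentation around each traversal and end up interleaved with the
PYPYLOG output, and the 10000-line cutoff for "slow" is arbitrary). It reads
the log on stdin:

import re
import sys
from collections import Counter

in_query = False
nlines = 0
sections = Counter()

for line in sys.stdin:
    if line.startswith("start-->"):
        in_query, nlines, sections = True, 0, Counter()
    elif line.startswith("<--stop"):
        if in_query and nlines > 10000:   # only report the slow queries
            print nlines, dict(sections)
        in_query = False
    elif in_query:
        nlines += 1                       # every debug line between the markers
        m = re.search(r"\{(gc-[a-z-]+)", line)
        if m:                             # a section opening, e.g. "{gc-minor"
            sections[m.group(1)] += 1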
Thanks,
/Martin
On Mon, Mar 17, 2014 at 8:21 AM, Maciej Fijalkowski <fijall at gmail.com> wrote:
> there is an environment variable PYPYLOG=gc:- (where - is stdout)
> which will do that for you btw.
>
> maybe you can find out what that is using profiling or valgrind?
>
> On Sun, Mar 16, 2014 at 11:34 PM, Martin Koch <mak at issuu.com> wrote:
> > I have tried getting the pypy source and building my own version of
> > pypy. I have modified
> > rpython/memory/gc/incminimark.py:major_collection_step() to print out
> > when it starts and when it stops. Apparently, the slow queries do NOT
> > occur during major_collection_step; at least, I have not observed major
> > step output during a query execution. So, apparently, something else is
> > blocking. This could be another aspect of the GC, but it could also be
> > anything else.
> >
> > Just to be sure, I have tried running the same application in python
> > with garbage collection disabled. I don't see the problem there, so it
> > is somehow related to either the GC or the runtime.
> >
> > Cheers,
> > /Martin
> >
> >
> > On Fri, Mar 14, 2014 at 4:19 PM, Martin Koch <mak at issuu.com> wrote:
> >>
> >> We have hacked up a small sample that seems to exhibit the same issue.
> >>
> >> We basically generate a linked list of objects. To increase
> >> connectedness, elements in the list hold references (dummy_links) to 10
> >> randomly chosen previous elements in the list.
> >>
> >> We then time a function that traverses 50000 elements from the list
> >> from a random start point. If the traversal reaches the end of the
> >> list, we instead traverse one of the dummy links. Thus, exactly 50K
> >> elements are traversed every time. To generate some garbage, we build a
> >> list holding the traversed elements and a dummy list of characters.
> >>
> >> Timings for the last 100 runs are stored in a circular buffer. If the
> >> elapsed time for the last run is more than twice the average time, we
> >> print out a line with the elapsed time, the threshold, and the 90%
> >> runtime (we would like to see that the mean runtime does not increase
> >> with the number of elements in the list, but that the max time does
> >> increase (linearly with the number of objects, I guess); traversing 50K
> >> elements should be independent of the memory size).
> >>
> >> We have tried monitoring memory consumption by external inspection, but
> >> cannot consistently verify that memory is deallocated at the same time
> >> that we see slow requests. Perhaps the pypy runtime doesn't always
> >> return freed pages back to the OS?
> >>
> >> Using top, we observe that 10M elements take around 17GB after
> >> building, 20M elements 26GB, 30M elements 28GB (growing to 35GB shortly
> >> after building).
> >>
> >> Here is output from a few runs with different numbers of elements:
> >>
> >> pypy mem.py 10000000
> >> start build
> >> end build 84.142424
> >> that took a long time elapsed: 13.230586 slow_threshold: 1.495401 90th_quantile_runtime: 0.421558
> >> that took a long time elapsed: 13.016531 slow_threshold: 1.488160 90th_quantile_runtime: 0.423441
> >> that took a long time elapsed: 13.032537 slow_threshold: 1.474563 90th_quantile_runtime: 0.419817
> >>
> >> pypy mem.py 20000000
> >> start build
> >> end build 180.823105
> >> that took a long time elapsed: 27.346064 slow_threshold: 2.295146 90th_quantile_runtime: 0.434726
> >> that took a long time elapsed: 26.028852 slow_threshold: 2.283927 90th_quantile_runtime: 0.374190
> >> that took a long time elapsed: 25.432279 slow_threshold: 2.279631 90th_quantile_runtime: 0.371502
> >>
> >> pypy mem.py 30000000
> >> start build
> >> end build 276.217811
> >> that took a long time elapsed: 40.993855 slow_threshold: 3.188464 90th_quantile_runtime: 0.459891
> >> that took a long time elapsed: 41.693553 slow_threshold: 3.183003 90th_quantile_runtime: 0.393654
> >> that took a long time elapsed: 39.679769 slow_threshold: 3.190782 90th_quantile_runtime: 0.393677
> >> that took a long time elapsed: 43.573411 slow_threshold: 3.239637 90th_quantile_runtime: 0.393654
> >>
> >> Code below
> >> --------------------------------------------------------------
> >> import time
> >> from random import randint, choice
> >> import sys
> >>
> >>
> >> allElems = {}
> >>
> >> class Node:
> >>     def __init__(self, v_):
> >>         self.v = v_
> >>         self.next = None
> >>         self.dummy_data = [randint(0, 100)
> >>                            for _ in xrange(randint(50, 100))]
> >>         allElems[self.v] = self
> >>         if self.v > 0:
> >>             self.dummy_links = [allElems[randint(0, self.v - 1)]
> >>                                 for _ in xrange(10)]
> >>         else:
> >>             self.dummy_links = [self]
> >>
> >>     def set_next(self, l):
> >>         self.next = l
> >>
> >>
> >> def follow(node):
> >>     acc = []
> >>     count = 0
> >>     cur = node
> >>     assert node.v is not None
> >>     assert cur is not None
> >>     while count < 50000:
> >>         # return a value; generate some garbage
> >>         acc.append((cur.v, [choice("abcdefghijklmnopqrstuvwxyz")
> >>                             for x in xrange(100)]))
> >>
> >>         # if we have reached the end, choose a random link
> >>         cur = choice(cur.dummy_links) if cur.next is None else cur.next
> >>         count += 1
> >>
> >>     return acc
> >>
> >>
> >> def build(num_elems):
> >>     start = time.time()
> >>     print "start build"
> >>     root = Node(0)
> >>     cur = root
> >>     for x in xrange(1, num_elems):
> >>         e = Node(x)
> >>         cur.next = e
> >>         cur = e
> >>     print "end build %f" % (time.time() - start)
> >>     return root
> >>
> >>
> >> num_timings = 100
> >> if __name__ == "__main__":
> >>     num_elems = int(sys.argv[1])
> >>     build(num_elems)
> >>     total = 0
> >>     timings = [0.0] * num_timings  # run times for the last num_timings runs
> >>     i = 0
> >>     beginning = time.time()
> >>     while time.time() - beginning < 600:
> >>         start = time.time()
> >>         elem = allElems[randint(0, num_elems - 1)]
> >>         assert(elem is not None)
> >>
> >>         lst = follow(elem)
> >>
> >>         total += choice(lst)[0]  # use the return value for something
> >>
> >>         end = time.time()
> >>
> >>         elapsed = end - start
> >>         timings[i % num_timings] = elapsed
> >>         if (i > num_timings):
> >>             # slow defined as > 2*avg run time
> >>             slow_time = 2 * sum(timings) / num_timings
> >>             if (elapsed > slow_time):
> >>                 print "that took a long time elapsed: %f " \
> >>                       "slow_threshold: %f 90th_quantile_runtime: %f" % \
> >>                       (elapsed, slow_time,
> >>                        sorted(timings)[int(num_timings * .9)])
> >>         i += 1
> >>     print total
> >>
> >> On Thu, Mar 13, 2014 at 7:45 PM, Maciej Fijalkowski <fijall at gmail.com>
> >> wrote:
> >>>
> >>> On Thu, Mar 13, 2014 at 1:45 PM, Martin Koch <mak at issuu.com> wrote:
> >>> > Hi Armin, Maciej
> >>> >
> >>> > Thanks for responding.
> >>> >
> >>> > I'm in the process of trying to determine what (if any) of the code
> >>> > I'm in a position to share, and I'll get back to you.
> >>> >
> >>> > Allowing hinting to the GC would be good. Even better would be a
> >>> > means to allow me to (transparently) allocate objects in unmanaged
> >>> > memory, but I would expect that to be a tall order :)
> >>> >
> >>> > Thanks,
> >>> > /Martin
> >>>
> >>> Hi Martin.
> >>>
> >>> Note that in case you want us to do the work of isolating the problem,
> >>> we do offer paid support to do that (then we can sign NDAs and stuff).
> >>> Otherwise we would be more than happy to fix bugs once you isolate a
> >>> part you can share freely :)
> >>
> >>
> >
>