[pypy-dev] Pypy garbage collection

Martin Koch mak at issuu.com
Mon Mar 17 20:04:47 CET 2014


Well, it would appear that we have the problem because we're generating a
lot of garbage in the young generation, just as in the example we've been
studying here. I'm unsure how we can avoid that in our real implementation.
Can we force a GC of the young generation, either explicitly via
gc.collect() or implicitly somehow (does the GC e.g. kick in across
function calls?)?

Thanks,
/Martin
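[Editorial sketch] One thing that can be tried is to trigger a collection explicitly at a point where a pause is acceptable, e.g. between requests. Note the caveat: gc.collect() on PyPy runs a full (major) collection, not just a young-generation one, so this trades throughput for predictable pause placement. The names handle_request/run_query below are hypothetical, purely for illustration:

```python
import gc

def handle_request(query, run_query):
    # Run the query, then explicitly trigger a collection at a point
    # where a pause is acceptable, instead of letting GC work interleave
    # with query execution. Caveat: on PyPy, gc.collect() performs a
    # full (major) collection, so this trades throughput for
    # predictability of *when* the pause happens.
    result = run_query(query)
    gc.collect()
    return result

# Hypothetical usage: the lambda stands in for the real query function.
result = handle_request("q1", lambda q: q.upper())
```

As far as I know there is no application-level call that forces only a minor collection; the nursery size (and hence minor-collection frequency) can however be tuned via the PYPY_GC_NURSERY environment variable, if I read incminimark.py correctly.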


On Mon, Mar 17, 2014 at 5:39 PM, Maciej Fijalkowski <fijall at gmail.com> wrote:

> not sure how much more we can help without looking into the code
>
> On Mon, Mar 17, 2014 at 6:05 PM, Martin Koch <mak at issuu.com> wrote:
> > Thanks :)
> >
> > /Martin
> >
> >
> >> On 17/03/2014, at 16.41, Maciej Fijalkowski <fijall at gmail.com> wrote:
> >>
> >> ok.
> >>
> >> so as you can probably see, the max is not that big, which means the
> >> GC is really incremental. What happens is that you get tons of garbage
> >> that survives minor collection every now and then. I don't exactly know
> >> why, but you should look at what objects can potentially survive for
> >> too long.
> >>
> >>> On Mon, Mar 17, 2014 at 5:37 PM, Martin Koch <mak at issuu.com> wrote:
> >>> Ah - it just occurred to me that the first runs may be slow anyway:
> >>> Since we take the average of the last 100 runs as the benchmark, the
> >>> first 100 runs are not classified as slow. Indeed, the first three runs
> >>> with many collections are in the first 100 runs.
> >>>
> >>>
> >>>> On Mon, Mar 17, 2014 at 4:35 PM, Martin Koch <mak at issuu.com> wrote:
> >>>>
> >>>> Here are the total and max times in millions of units; 30000 units is
> >>>> approximately 13 seconds. I have extracted the runs where there are
> >>>> many gc-collect-steps. These are in execution order, so the first runs
> >>>> with many gc-collect-steps aren't slow.
> >>>>
> >>>> Totals: gc-minor:418 gc-minor-walkroots:0 gc-collect-step:28797 Max: gc-minor:10 gc-collect-step:247
> >>>> Totals: gc-minor:562 gc-minor-walkroots:0 gc-collect-step:30282 Max: gc-minor:10 gc-collect-step:245
> >>>> Totals: gc-minor:434 gc-minor-walkroots:0 gc-collect-step:31040 Max: gc-minor:11 gc-collect-step:244
> >>>> Totals: gc-minor:417 slow:1 gc-minor-walkroots:0 gc-collect-step:31270 Max: gc-minor:17 gc-collect-step:244
> >>>> Totals: gc-minor:435 slow:1 gc-minor-walkroots:0 gc-collect-step:30365 Max: gc-minor:11 gc-collect-step:248
> >>>> Totals: gc-minor:389 slow:1 gc-minor-walkroots:0 gc-collect-step:31235 Max: gc-minor:8 gc-collect-step:299
> >>>> Totals: gc-minor:434 slow:1 gc-minor-walkroots:0 gc-collect-step:31124 Max: gc-minor:11 gc-collect-step:246
> >>>> Totals: gc-minor:386 slow:1 gc-minor-walkroots:0 gc-collect-step:30541 Max: gc-minor:8 gc-collect-step:244
> >>>> Totals: gc-minor:410 slow:1 gc-minor-walkroots:0 gc-collect-step:31427 Max: gc-minor:36 gc-collect-step:248
> >>>> Totals: gc-minor:390 slow:1 gc-minor-walkroots:0 gc-collect-step:30743 Max: gc-minor:8 gc-collect-step:244
> >>>> Totals: gc-minor:380 slow:1 gc-minor-walkroots:0 gc-collect-step:30207 Max: gc-minor:8 gc-collect-step:245
> >>>> Totals: gc-minor:387 slow:1 gc-minor-walkroots:0 gc-collect-step:30837 Max: gc-minor:8 gc-collect-step:244
> >>>> Totals: gc-minor:412 slow:1 gc-minor-walkroots:0 gc-collect-step:30898 Max: gc-minor:38 gc-collect-step:244
> >>>> Totals: gc-minor:415 slow:1 gc-minor-walkroots:0 gc-collect-step:30407 Max: gc-minor:23 gc-collect-step:245
> >>>> Totals: gc-minor:380 slow:1 gc-minor-walkroots:0 gc-collect-step:30591 Max: gc-minor:8 gc-collect-step:246
> >>>> Totals: gc-minor:387 slow:1 gc-minor-walkroots:0 gc-collect-step:31193 Max: gc-minor:9 gc-collect-step:244
> >>>> Totals: gc-minor:379 slow:1 gc-minor-walkroots:0 gc-collect-step:30026 Max: gc-minor:8 gc-collect-step:246
> >>>> Totals: gc-minor:388 slow:1 gc-minor-walkroots:0 gc-collect-step:31179 Max: gc-minor:8 gc-collect-step:248
> >>>> Totals: gc-minor:378 slow:1 gc-minor-walkroots:0 gc-collect-step:30674 Max: gc-minor:8 gc-collect-step:250
> >>>> Totals: gc-minor:385 slow:1 gc-minor-walkroots:0 gc-collect-step:30413 Max: gc-minor:8 gc-collect-step:245
> >>>> Totals: gc-minor:915 slow:1 gc-minor-walkroots:0 gc-collect-step:30830 Max: gc-minor:543 gc-collect-step:244
> >>>> Totals: gc-minor:405 slow:1 gc-minor-walkroots:0 gc-collect-step:31153 Max: gc-minor:20 gc-collect-step:246
> >>>> Totals: gc-minor:408 slow:1 gc-minor-walkroots:0 gc-collect-step:29815 Max: gc-minor:25 gc-collect-step:245
> >>>>
> >>>> Thanks,
> >>>> /Martin
> >>>>
> >>>>
> >>>>> On Mon, Mar 17, 2014 at 3:24 PM, Martin Koch <mak at issuu.com> wrote:
> >>>>>
> >>>>> Ah. I had misunderstood. I'll get back to you on that :) thanks
> >>>>>
> >>>>> /Martin
> >>>>>
> >>>>>
> >>>>>> On 17/03/2014, at 15.21, Maciej Fijalkowski <fijall at gmail.com> wrote:
> >>>>>>
> >>>>>> eh, this is not what I need
> >>>>>>
> >>>>>> I need a max of TIME it took for a gc-minor and the TOTAL time it
> >>>>>> took for a gc-minor (per query) (ideally the same for gc-walkroots
> >>>>>> and gc-collect-step)
> >>>>>>
> >>>>>>> On Mon, Mar 17, 2014 at 4:19 PM, Martin Koch <mak at issuu.com> wrote:
> >>>>>>> Here are the collated results of running each query. For each run,
> >>>>>>> I count how many of each of the pypy debug lines I get. I.e. there
> >>>>>>> were 668 runs that printed 58 loglines that contain "{gc-minor"
> >>>>>>> which was eventually followed by "gc-minor}". I have also counted
> >>>>>>> whether the query was slow; interestingly, not all the queries with
> >>>>>>> many gc-minors were slow (but all slow queries had a gc-minor).
> >>>>>>>
> >>>>>>> Please let me know if this is unclear :)
> >>>>>>>
> >>>>>>>   668 gc-minor:58 gc-minor-walkroots:58
> >>>>>>>    10 gc-minor:58 gc-minor-walkroots:58 gc-collect-step:5
> >>>>>>>   140 gc-minor:59 gc-minor-walkroots:59
> >>>>>>>     1 gc-minor:8441 gc-minor-walkroots:8441 gc-collect-step:8403
> >>>>>>>     1 gc-minor:9300 gc-minor-walkroots:9300 gc-collect-step:9249
> >>>>>>>     9 gc-minor:9643 slow:1 gc-minor-walkroots:9643 gc-collect-step:9589
> >>>>>>>     1 gc-minor:9644 slow:1 gc-minor-walkroots:9644 gc-collect-step:9590
> >>>>>>>    10 gc-minor:9647 slow:1 gc-minor-walkroots:9647 gc-collect-step:9609
> >>>>>>>     1 gc-minor:9663 gc-minor-walkroots:9663 gc-collect-step:9614
> >>>>>>>     1 jit-backend-dump:5 gc-minor:58 gc-minor-walkroots:58
> >>>>>>>     1 jit-log-compiling-loop:1 gc-collect-step:8991 jit-backend-dump:78 jit-backend:3 jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:9030 jit-tracing:3 gc-minor-walkroots:9030 jit-optimize:6 jit-log-short-preamble:2 jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:84
> >>>>>>>     1 jit-log-compiling-loop:1 jit-backend-dump:13 jit-backend:1 jit-log-noopt-loop:2 gc-minor:60 jit-tracing:1 gc-minor-walkroots:60 jit-optimize:2 jit-log-short-preamble:1 jit-backend-addr:1 jit-log-opt-loop:1 jit-mem-looptoken-alloc:1 jit-log-rewritten-loop:1 jit-resume:14
> >>>>>>>     1 jit-log-compiling-loop:1 jit-backend-dump:73 jit-backend:3 jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:60 jit-tracing:3 gc-minor-walkroots:60 jit-optimize:6 jit-log-short-preamble:2 jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:84
> >>>>>>>     2 jit-log-compiling-loop:1 jit-backend-dump:78 jit-backend:3 jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:61 jit-tracing:3 gc-minor-walkroots:61 jit-optimize:6 jit-log-short-preamble:2 jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:84
> >>>>>>>     1 jit-log-short-preamble:2 jit-log-compiling-loop:2 jit-backend-dump:92 jit-log-noopt-loop:7 jit-log-virtualstate:3 gc-minor:61 jit-tracing:4 gc-minor-walkroots:61 jit-optimize:7 jit-backend:4 jit-backend-addr:4 jit-log-opt-loop:2 jit-mem-looptoken-alloc:4 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:2 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:104
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> /Martin
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Mar 17, 2014 at 2:23 PM, Maciej Fijalkowski <fijall at gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> On Mon, Mar 17, 2014 at 3:20 PM, Maciej Fijalkowski
> >>>>>>>> <fijall at gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>> are you *sure* it's the walkroots that take that long and not
> >>>>>>>>> something else (like gc-minor)? More of those mean that you
> >>>>>>>>> allocate a lot more surviving objects. Can you do two things:
> >>>>>>>>>
> >>>>>>>>> a) take a max of gc-minor (and gc-minor-stackwalk), per request
> >>>>>>>>> b) take the sum of those
> >>>>>>>>>
> >>>>>>>>> and plot them
> >>>>>>>>
> >>>>>>>> ^^^ or just paste the results actually
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> On Mon, Mar 17, 2014 at 3:18 PM, Martin Koch <mak at issuu.com> wrote:
> >>>>>>>>>> Well, then it works out to around 2.5GHz, which seems
> >>>>>>>>>> reasonable. But it doesn't alter the conclusion from the
> >>>>>>>>>> previous email: The slow queries all have a duration of around
> >>>>>>>>>> 34*10^9 units, 'normal' queries 1*10^9 units, or .4 seconds at
> >>>>>>>>>> this conversion. Also, the log shows that a slow query performs
> >>>>>>>>>> many more gc-minor operations than a 'normal' one: 9600
> >>>>>>>>>> gc-collect-step/gc-minor/gc-minor-walkroots operations vs 58.
> >>>>>>>>>>
> >>>>>>>>>> So the question becomes: Why do we get this large spike in
> >>>>>>>>>> gc-minor-walkroots, and, in particular, is there any way to
> >>>>>>>>>> avoid it :) ?
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> /Martin
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Mar 17, 2014 at 1:53 PM, Maciej Fijalkowski
> >>>>>>>>>> <fijall at gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> I think it's the cycles of your CPU
> >>>>>>>>>>>
> >>>>>>>>>>>> On Mon, Mar 17, 2014 at 2:48 PM, Martin Koch <mak at issuu.com>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>> What is the unit? Perhaps I'm being thick here, but I can't
> >>>>>>>>>>>> correlate it with seconds (which the program does print out).
> >>>>>>>>>>>> Slow runs are around 13 seconds, but are around 34*10^9 (dec),
> >>>>>>>>>>>> 0x800000000 timestamp units (e.g. from 0x2b994c9d31889c to
> >>>>>>>>>>>> 0x2b9944ab8c4f49).
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mon, Mar 17, 2014 at 12:09 PM, Maciej Fijalkowski
> >>>>>>>>>>>> <fijall at gmail.com>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The number of lines is nonsense. This is a timestamp in hex.
> >>>>>>>>>>>>>
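[Editorial sketch] Decoding the bracketed hex timestamps and differencing them is straightforward; converting the tick count to seconds requires the tick frequency, and the ~2.5 GHz figure below is only the estimate discussed in this thread, not a measured value:

```python
# Decode two bracketed PYPYLOG values (hex timestamps, taken from the
# log excerpt elsewhere in this thread) and difference them to get
# the elapsed tick count.
start = int("2b99f1981b527e", 16)
end = int("2b99f19890d750", 16)
elapsed_ticks = end - start

# Ticks are CPU timestamp-counter units; assuming roughly 2.5e9 ticks
# per second (the frequency estimated in this thread):
elapsed_seconds = elapsed_ticks / 2.5e9
```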
> >>>>>>>>>>>>> On Mon, Mar 17, 2014 at 12:46 PM, Martin Koch <mak at issuu.com> wrote:
> >>>>>>>>>>>>>> Based On Maciej's suggestion, I tried the following
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> PYPYLOG=- pypy mem.py 10000000 > out
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This generates a logfile which looks something like this
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> start-->
> >>>>>>>>>>>>>> [2b99f1981b527e] {gc-minor
> >>>>>>>>>>>>>> [2b99f1981ba680] {gc-minor-walkroots
> >>>>>>>>>>>>>> [2b99f1981c2e02] gc-minor-walkroots}
> >>>>>>>>>>>>>> [2b99f19890d750] gc-minor}
> >>>>>>>>>>>>>> [snip]
> >>>>>>>>>>>>>> ...
> >>>>>>>>>>>>>> <--stop
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> It turns out that the culprit is a lot of MINOR collections.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I base this on the following observations:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> - I can't understand the format of the timestamp on each
> >>>>>>>>>>>>>>   logline (the "[2b99f1981b527e]"). From what I can see in
> >>>>>>>>>>>>>>   the code, this should be output from time.clock(), but
> >>>>>>>>>>>>>>   that doesn't return a number like that when I run pypy
> >>>>>>>>>>>>>>   interactively. Instead, I count the number of debug lines
> >>>>>>>>>>>>>>   between start--> and the corresponding <--stop.
> >>>>>>>>>>>>>> - Most runs have a few hundred lines of output between
> >>>>>>>>>>>>>>   start/stop.
> >>>>>>>>>>>>>> - All slow runs have very close to 57800 lines of output
> >>>>>>>>>>>>>>   between start/stop.
> >>>>>>>>>>>>>> - One such sample does 9609 gc-collect-step operations, 9647
> >>>>>>>>>>>>>>   gc-minor operations, and 9647 gc-minor-walkroots
> >>>>>>>>>>>>>>   operations.
> >>>>>>>>>>>>>>
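[Editorial sketch] The per-run counting described above can be done mechanically from the log; a minimal version, assuming the line format shown in the excerpt (an open marker `[hex] {section` per debug section):

```python
import re
from collections import Counter

# Count how many times each debug section opens in a PYPYLOG dump.
# Assumes section starts look like "[2b99f1981b527e] {gc-minor",
# as in the log excerpt above.
OPEN = re.compile(r"^\[[0-9a-f]+\] \{([\w-]+)")

def count_sections(lines):
    counts = Counter()
    for line in lines:
        m = OPEN.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

sample = [
    "[2b99f1981b527e] {gc-minor",
    "[2b99f1981ba680] {gc-minor-walkroots",
    "[2b99f1981c2e02] gc-minor-walkroots}",
    "[2b99f19890d750] gc-minor}",
]
counts = count_sections(sample)
```

Run over the lines between one `start-->`/`<--stop` pair, this yields the per-query gc-minor / gc-minor-walkroots / gc-collect-step counts discussed in this thread.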
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> /Martin
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Mar 17, 2014 at 8:21 AM, Maciej Fijalkowski
> >>>>>>>>>>>>>> <fijall at gmail.com>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> there is an environment variable PYPYLOG=gc:- (where - is
> >>>>>>>>>>>>>>> stdout) which will do that for you btw.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> maybe you can find out what that is using profiling or
> >>>>>>>>>>>>>>> valgrind?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Sun, Mar 16, 2014 at 11:34 PM, Martin Koch <mak at issuu.com> wrote:
> >>>>>>>>>>>>>>>> I have tried getting the pypy source and building my own
> >>>>>>>>>>>>>>>> version of pypy. I have modified
> >>>>>>>>>>>>>>>> rpython/memory/gc/incminimark.py:major_collection_step()
> >>>>>>>>>>>>>>>> to print out when it starts and when it stops. Apparently,
> >>>>>>>>>>>>>>>> the slow queries do NOT occur during
> >>>>>>>>>>>>>>>> major_collection_step; at least, I have not observed major
> >>>>>>>>>>>>>>>> step output during a query execution. So, apparently,
> >>>>>>>>>>>>>>>> something else is blocking. This could be another aspect
> >>>>>>>>>>>>>>>> of the GC, but it could also be anything else.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Just to be sure, I have tried running the same application
> >>>>>>>>>>>>>>>> in python with garbage collection disabled. I don't see
> >>>>>>>>>>>>>>>> the problem there, so it is somehow related to either the
> >>>>>>>>>>>>>>>> GC or the runtime.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>> /Martin
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Fri, Mar 14, 2014 at 4:19 PM, Martin Koch <mak at issuu.com> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We have hacked up a small sample that seems to exhibit
> >>>>>>>>>>>>>>>>> the same issue.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We basically generate a linked list of objects. To
> >>>>>>>>>>>>>>>>> increase connectedness, elements in the list hold
> >>>>>>>>>>>>>>>>> references (dummy_links) to 10 randomly chosen previous
> >>>>>>>>>>>>>>>>> elements in the list.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We then time a function that traverses 50000 elements
> >>>>>>>>>>>>>>>>> from the list from a random start point. If the traversal
> >>>>>>>>>>>>>>>>> reaches the end of the list, we instead traverse one of
> >>>>>>>>>>>>>>>>> the dummy links. Thus, exactly 50K elements are traversed
> >>>>>>>>>>>>>>>>> every time. To generate some garbage, we build a list
> >>>>>>>>>>>>>>>>> holding the traversed elements and a dummy list of
> >>>>>>>>>>>>>>>>> characters.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Timings for the last 100 runs are stored in a circular
> >>>>>>>>>>>>>>>>> buffer. If the elapsed time for the last run is more than
> >>>>>>>>>>>>>>>>> twice the average time, we print out a line with the
> >>>>>>>>>>>>>>>>> elapsed time, the threshold, and the 90% runtime (we
> >>>>>>>>>>>>>>>>> would like to see that the mean runtime does not increase
> >>>>>>>>>>>>>>>>> with the number of elements in the list, but that the max
> >>>>>>>>>>>>>>>>> time does increase, linearly with the number of objects,
> >>>>>>>>>>>>>>>>> I guess; traversing 50K elements should be independent of
> >>>>>>>>>>>>>>>>> the memory size).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We have tried monitoring memory consumption by external
> >>>>>>>>>>>>>>>>> inspection, but cannot consistently verify that memory is
> >>>>>>>>>>>>>>>>> deallocated at the same time that we see slow requests.
> >>>>>>>>>>>>>>>>> Perhaps the pypy runtime doesn't always return freed
> >>>>>>>>>>>>>>>>> pages back to the OS?
> >>>>>>>>>>>>>>>>>
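[Editorial sketch] One way to watch memory from inside the process, rather than via top, is to sample the peak resident set size around each request with the stdlib resource module (Unix only; on Linux ru_maxrss is reported in kilobytes, on OS X in bytes):

```python
import resource

def peak_rss_kb():
    # Peak resident set size of this process so far.
    # On Linux ru_maxrss is in kilobytes (on OS X it is in bytes).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss_kb()
junk = [list(range(1000)) for _ in range(1000)]  # allocate something
after = peak_rss_kb()
# Note: peak RSS is monotonic, so freeing 'junk' never lowers it.
# This shows growth of the high-water mark, not whether freed pages
# were actually returned to the OS (for that, /proc/self/status or
# smaps would need to be inspected).
```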
> >>>>>>>>>>>>>>>>> Using top, we observe that 10M elements allocate around
> >>>>>>>>>>>>>>>>> 17GB after building, 20M elements 26GB, and 30M elements
> >>>>>>>>>>>>>>>>> 28GB (which grows to 35GB shortly after building).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Here is output from a few runs with different numbers of elements:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> pypy mem.py 10000000
> >>>>>>>>>>>>>>>>> start build
> >>>>>>>>>>>>>>>>> end build 84.142424
> >>>>>>>>>>>>>>>>> that took a long time elapsed: 13.230586  slow_threshold: 1.495401  90th_quantile_runtime: 0.421558
> >>>>>>>>>>>>>>>>> that took a long time elapsed: 13.016531  slow_threshold: 1.488160  90th_quantile_runtime: 0.423441
> >>>>>>>>>>>>>>>>> that took a long time elapsed: 13.032537  slow_threshold: 1.474563  90th_quantile_runtime: 0.419817
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> pypy mem.py 20000000
> >>>>>>>>>>>>>>>>> start build
> >>>>>>>>>>>>>>>>> end build 180.823105
> >>>>>>>>>>>>>>>>> that took a long time elapsed: 27.346064  slow_threshold: 2.295146  90th_quantile_runtime: 0.434726
> >>>>>>>>>>>>>>>>> that took a long time elapsed: 26.028852  slow_threshold: 2.283927  90th_quantile_runtime: 0.374190
> >>>>>>>>>>>>>>>>> that took a long time elapsed: 25.432279  slow_threshold: 2.279631  90th_quantile_runtime: 0.371502
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> pypy mem.py 30000000
> >>>>>>>>>>>>>>>>> start build
> >>>>>>>>>>>>>>>>> end build 276.217811
> >>>>>>>>>>>>>>>>> that took a long time elapsed: 40.993855  slow_threshold: 3.188464  90th_quantile_runtime: 0.459891
> >>>>>>>>>>>>>>>>> that took a long time elapsed: 41.693553  slow_threshold: 3.183003  90th_quantile_runtime: 0.393654
> >>>>>>>>>>>>>>>>> that took a long time elapsed: 39.679769  slow_threshold: 3.190782  90th_quantile_runtime: 0.393677
> >>>>>>>>>>>>>>>>> that took a long time elapsed: 43.573411  slow_threshold: 3.239637  90th_quantile_runtime: 0.393654
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Code below
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> --------------------------------------------------------------
> >>>>>>>>>>>>>>>>> import time
> >>>>>>>>>>>>>>>>> from random import randint, choice
> >>>>>>>>>>>>>>>>> import sys
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> allElems = {}
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> class Node:
> >>>>>>>>>>>>>>>>>   def __init__(self, v_):
> >>>>>>>>>>>>>>>>>       self.v = v_
> >>>>>>>>>>>>>>>>>       self.next = None
> >>>>>>>>>>>>>>>>>       self.dummy_data = [randint(0,100)
> >>>>>>>>>>>>>>>>>                          for _ in xrange(randint(50,100))]
> >>>>>>>>>>>>>>>>>       allElems[self.v] = self
> >>>>>>>>>>>>>>>>>       if self.v > 0:
> >>>>>>>>>>>>>>>>>           self.dummy_links = [allElems[randint(0, self.v-1)]
> >>>>>>>>>>>>>>>>>                               for _ in xrange(10)]
> >>>>>>>>>>>>>>>>>       else:
> >>>>>>>>>>>>>>>>>           self.dummy_links = [self]
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>   def set_next(self, l):
> >>>>>>>>>>>>>>>>>       self.next = l
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> def follow(node):
> >>>>>>>>>>>>>>>>>   acc = []
> >>>>>>>>>>>>>>>>>   count = 0
> >>>>>>>>>>>>>>>>>   cur = node
> >>>>>>>>>>>>>>>>>   assert node.v is not None
> >>>>>>>>>>>>>>>>>   assert cur is not None
> >>>>>>>>>>>>>>>>>   while count < 50000:
> >>>>>>>>>>>>>>>>>       # return a value; generate some garbage
> >>>>>>>>>>>>>>>>>       acc.append((cur.v,
> >>>>>>>>>>>>>>>>>                   [choice("abcdefghijklmnopqrstuvwxyz")
> >>>>>>>>>>>>>>>>>                    for x in xrange(100)]))
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>       # if we have reached the end, choose a random link
> >>>>>>>>>>>>>>>>>       cur = choice(cur.dummy_links) if cur.next is None else cur.next
> >>>>>>>>>>>>>>>>>       count += 1
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>   return acc
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> def build(num_elems):
> >>>>>>>>>>>>>>>>>   start = time.time()
> >>>>>>>>>>>>>>>>>   print "start build"
> >>>>>>>>>>>>>>>>>   root = Node(0)
> >>>>>>>>>>>>>>>>>   cur = root
> >>>>>>>>>>>>>>>>>   for x in xrange(1, num_elems):
> >>>>>>>>>>>>>>>>>       e = Node(x)
> >>>>>>>>>>>>>>>>>       cur.next = e
> >>>>>>>>>>>>>>>>>       cur = e
> >>>>>>>>>>>>>>>>>   print "end build %f" % (time.time() - start)
> >>>>>>>>>>>>>>>>>   return root
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> num_timings = 100
> >>>>>>>>>>>>>>>>> if __name__ == "__main__":
> >>>>>>>>>>>>>>>>>   num_elems = int(sys.argv[1])
> >>>>>>>>>>>>>>>>>   build(num_elems)
> >>>>>>>>>>>>>>>>>   total = 0
> >>>>>>>>>>>>>>>>>   # run times for the last num_timings runs
> >>>>>>>>>>>>>>>>>   timings = [0.0] * num_timings
> >>>>>>>>>>>>>>>>>   i = 0
> >>>>>>>>>>>>>>>>>   beginning = time.time()
> >>>>>>>>>>>>>>>>>   while time.time() - beginning < 600:
> >>>>>>>>>>>>>>>>>       start = time.time()
> >>>>>>>>>>>>>>>>>       elem = allElems[randint(0, num_elems - 1)]
> >>>>>>>>>>>>>>>>>       assert elem is not None
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>       lst = follow(elem)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>       total += choice(lst)[0] # use the return value for something
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>       end = time.time()
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>       elapsed = end - start
> >>>>>>>>>>>>>>>>>       timings[i % num_timings] = elapsed
> >>>>>>>>>>>>>>>>>       if i > num_timings:
> >>>>>>>>>>>>>>>>>           # slow defined as > 2*avg run time
> >>>>>>>>>>>>>>>>>           slow_time = 2 * sum(timings) / num_timings
> >>>>>>>>>>>>>>>>>           if elapsed > slow_time:
> >>>>>>>>>>>>>>>>>               print "that took a long time elapsed: %f  slow_threshold: %f  90th_quantile_runtime: %f" % \
> >>>>>>>>>>>>>>>>>                   (elapsed, slow_time, sorted(timings)[int(num_timings*.9)])
> >>>>>>>>>>>>>>>>>       i += 1
> >>>>>>>>>>>>>>>>>   print total
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Thu, Mar 13, 2014 at 7:45 PM, Maciej Fijalkowski
> >>>>>>>>>>>>>>>>> <fijall at gmail.com>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Thu, Mar 13, 2014 at 1:45 PM, Martin Koch
> >>>>>>>>>>>>>>>>>> <mak at issuu.com>
> >>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>> Hi Armin, Maciej
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thanks for responding.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I'm in the process of trying to determine what (if
> >>>>>>>>>>>>>>>>>>> any) of the code I'm in a position to share, and I'll
> >>>>>>>>>>>>>>>>>>> get back to you.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Allowing hinting to the GC would be good. Even better
> >>>>>>>>>>>>>>>>>>> would be a means to allow me to (transparently)
> >>>>>>>>>>>>>>>>>>> allocate objects in unmanaged memory, but I would
> >>>>>>>>>>>>>>>>>>> expect that to be a tall order :)
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>> /Martin
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hi Martin.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Note that in case you want us to do the work of
> >>>>>>>>>>>>>>>>>> isolating the problem, we do offer paid support to do
> >>>>>>>>>>>>>>>>>> that (then we can sign NDAs and stuff). Otherwise we
> >>>>>>>>>>>>>>>>>> would be more than happy to fix bugs once you isolate a
> >>>>>>>>>>>>>>>>>> part you can share freely :)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>
> >>>>
> >>>
>

