[pypy-dev] Pypy garbage collection

Martin Koch mak at issuu.com
Mon Mar 17 16:35:33 CET 2014


Here are the total and max times, in millions of timestamp units; a total
of 30000 (i.e. 3*10^10 raw units) is approximately 13 seconds. I have
extracted the runs with many gc-collect-steps. These are listed in
execution order; note that the first few runs with many gc-collect-steps
are not marked slow.

*Totals*: gc-minor:418 gc-minor-walkroots:0 gc-collect-step:28797  *Max*: gc-minor:10 gc-collect-step:247
*Totals*: gc-minor:562 gc-minor-walkroots:0 gc-collect-step:30282  *Max*: gc-minor:10 gc-collect-step:245
*Totals*: gc-minor:434 gc-minor-walkroots:0 gc-collect-step:31040  *Max*: gc-minor:11 gc-collect-step:244
*Totals*: gc-minor:417 slow:1 gc-minor-walkroots:0 gc-collect-step:31270  *Max*: gc-minor:17 gc-collect-step:244
*Totals*: gc-minor:435 slow:1 gc-minor-walkroots:0 gc-collect-step:30365  *Max*: gc-minor:11 gc-collect-step:248
*Totals*: gc-minor:389 slow:1 gc-minor-walkroots:0 gc-collect-step:31235  *Max*: gc-minor:8 gc-collect-step:299
*Totals*: gc-minor:434 slow:1 gc-minor-walkroots:0 gc-collect-step:31124  *Max*: gc-minor:11 gc-collect-step:246
*Totals*: gc-minor:386 slow:1 gc-minor-walkroots:0 gc-collect-step:30541  *Max*: gc-minor:8 gc-collect-step:244
*Totals*: gc-minor:410 slow:1 gc-minor-walkroots:0 gc-collect-step:31427  *Max*: gc-minor:36 gc-collect-step:248
*Totals*: gc-minor:390 slow:1 gc-minor-walkroots:0 gc-collect-step:30743  *Max*: gc-minor:8 gc-collect-step:244
*Totals*: gc-minor:380 slow:1 gc-minor-walkroots:0 gc-collect-step:30207  *Max*: gc-minor:8 gc-collect-step:245
*Totals*: gc-minor:387 slow:1 gc-minor-walkroots:0 gc-collect-step:30837  *Max*: gc-minor:8 gc-collect-step:244
*Totals*: gc-minor:412 slow:1 gc-minor-walkroots:0 gc-collect-step:30898  *Max*: gc-minor:38 gc-collect-step:244
*Totals*: gc-minor:415 slow:1 gc-minor-walkroots:0 gc-collect-step:30407  *Max*: gc-minor:23 gc-collect-step:245
*Totals*: gc-minor:380 slow:1 gc-minor-walkroots:0 gc-collect-step:30591  *Max*: gc-minor:8 gc-collect-step:246
*Totals*: gc-minor:387 slow:1 gc-minor-walkroots:0 gc-collect-step:31193  *Max*: gc-minor:9 gc-collect-step:244
*Totals*: gc-minor:379 slow:1 gc-minor-walkroots:0 gc-collect-step:30026  *Max*: gc-minor:8 gc-collect-step:246
*Totals*: gc-minor:388 slow:1 gc-minor-walkroots:0 gc-collect-step:31179  *Max*: gc-minor:8 gc-collect-step:248
*Totals*: gc-minor:378 slow:1 gc-minor-walkroots:0 gc-collect-step:30674  *Max*: gc-minor:8 gc-collect-step:250
*Totals*: gc-minor:385 slow:1 gc-minor-walkroots:0 gc-collect-step:30413  *Max*: gc-minor:8 gc-collect-step:245
*Totals*: gc-minor:915 slow:1 gc-minor-walkroots:0 gc-collect-step:30830  *Max*: gc-minor:543 gc-collect-step:244
*Totals*: gc-minor:405 slow:1 gc-minor-walkroots:0 gc-collect-step:31153  *Max*: gc-minor:20 gc-collect-step:246
*Totals*: gc-minor:408 slow:1 gc-minor-walkroots:0 gc-collect-step:29815  *Max*: gc-minor:25 gc-collect-step:245

Thanks,
/Martin


On Mon, Mar 17, 2014 at 3:24 PM, Martin Koch <mak at issuu.com> wrote:

> Ah. I had misunderstood. I'll get back to you on that :) thanks
>
> /Martin
>
>
> > On 17/03/2014, at 15.21, Maciej Fijalkowski <fijall at gmail.com> wrote:
> >
> > eh, this is not what I need
> >
> > I need the max of the TIME it took for a gc-minor and the TOTAL time it
> > took for gc-minors (per query), and ideally the same for gc-walkroots
> > and gc-collect-step.
> >
> >> On Mon, Mar 17, 2014 at 4:19 PM, Martin Koch <mak at issuu.com> wrote:
> >> Here are the collated results of running each query. For each run, I
> >> count how many of each of the pypy debug lines I get. For example,
> >> there were 668 runs that printed 58 loglines containing "{gc-minor",
> >> each eventually followed by a matching "gc-minor}". I have also
> >> counted whether the query was slow; interestingly, not all the queries
> >> with many gc-minors were slow (but all slow queries had many
> >> gc-minors).
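> >>
> >> Roughly, the collation can be done like this (a sketch only;
> >> per_query_counts is an illustrative name, and it assumes the
> >> start--> / <--stop markers my program prints around each query):
> >>
> >> from collections import Counter
> >>
> >> def per_query_counts(lines):
> >>     counts = None
> >>     for line in lines:
> >>         if 'start-->' in line:
> >>             counts = Counter()  # a new query begins
> >>         elif '<--stop' in line and counts is not None:
> >>             yield counts        # one Counter of section names per query
> >>             counts = None
> >>         elif counts is not None and '] {' in line:
> >>             # count opening debug lines such as "[2b99...] {gc-minor"
> >>             counts[line.split('{', 1)[1].strip()] += 1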
> >>
> >> Please let me know if this is unclear :)
> >>
> >>    668 gc-minor:58 gc-minor-walkroots:58
> >>     10 gc-minor:58 gc-minor-walkroots:58 gc-collect-step:5
> >>    140 gc-minor:59 gc-minor-walkroots:59
> >>      1 gc-minor:8441 gc-minor-walkroots:8441 gc-collect-step:8403
> >>      1 gc-minor:9300 gc-minor-walkroots:9300 gc-collect-step:9249
> >>      9 gc-minor:9643 slow:1 gc-minor-walkroots:9643 gc-collect-step:9589
> >>      1 gc-minor:9644 slow:1 gc-minor-walkroots:9644 gc-collect-step:9590
> >>     10 gc-minor:9647 slow:1 gc-minor-walkroots:9647 gc-collect-step:9609
> >>      1 gc-minor:9663 gc-minor-walkroots:9663 gc-collect-step:9614
> >>      1 jit-backend-dump:5 gc-minor:58 gc-minor-walkroots:58
> >>      1 jit-log-compiling-loop:1 gc-collect-step:8991 jit-backend-dump:78
> >> jit-backend:3 jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:9030
> >> jit-tracing:3 gc-minor-walkroots:9030 jit-optimize:6
> >> jit-log-short-preamble:2 jit-backend-addr:3 jit-log-opt-loop:1
> >> jit-mem-looptoken-alloc:3 jit-abort:3 jit-log-rewritten-bridge:2
> >> jit-log-rewritten-loop:1 jit-log-opt-bridge:2 jit-log-compiling-bridge:2
> >> jit-resume:84
> >>      1 jit-log-compiling-loop:1 jit-backend-dump:13 jit-backend:1
> >> jit-log-noopt-loop:2 gc-minor:60 jit-tracing:1 gc-minor-walkroots:60
> >> jit-optimize:2 jit-log-short-preamble:1 jit-backend-addr:1
> >> jit-log-opt-loop:1 jit-mem-looptoken-alloc:1 jit-log-rewritten-loop:1
> >> jit-resume:14
> >>      1 jit-log-compiling-loop:1 jit-backend-dump:73 jit-backend:3
> >> jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:60 jit-tracing:3
> >> gc-minor-walkroots:60 jit-optimize:6 jit-log-short-preamble:2
> >> jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3
> >> jit-abort:3
> >> jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2
> >> jit-log-compiling-bridge:2 jit-resume:84
> >>      2 jit-log-compiling-loop:1 jit-backend-dump:78 jit-backend:3
> >> jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:61 jit-tracing:3
> >> gc-minor-walkroots:61 jit-optimize:6 jit-log-short-preamble:2
> >> jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3
> >> jit-abort:3
> >> jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2
> >> jit-log-compiling-bridge:2 jit-resume:84
> >>      1 jit-log-short-preamble:2 jit-log-compiling-loop:2
> >> jit-backend-dump:92 jit-log-noopt-loop:7 jit-log-virtualstate:3
> >> gc-minor:61
> >> jit-tracing:4 gc-minor-walkroots:61 jit-optimize:7 jit-backend:4
> >> jit-backend-addr:4 jit-log-opt-loop:2 jit-mem-looptoken-alloc:4
> >> jit-abort:3
> >> jit-log-rewritten-bridge:2 jit-log-rewritten-loop:2 jit-log-opt-bridge:2
> >> jit-log-compiling-bridge:2 jit-resume:104
> >>
> >>
> >> Thanks,
> >> /Martin
> >>
> >>
> >>
> >> On Mon, Mar 17, 2014 at 2:23 PM, Maciej Fijalkowski <fijall at gmail.com>
> >> wrote:
> >>>
> >>> On Mon, Mar 17, 2014 at 3:20 PM, Maciej Fijalkowski <fijall at gmail.com>
> >>> wrote:
> >>>> are you *sure* it's the walkroots that take that long and not
> >>>> something else (like gc-minor)? More of those mean that you allocate a
> >>>> lot more surviving objects. Can you do two things:
> >>>>
> >>>> a) take a max of gc-minor (and gc-minor-stackwalk), per request
> >>>> b) take the sum of those
> >>>>
> >>>> and plot them
> >>>
> >>> ^^^ or just paste the results actually
> >>>
> >>>>
> >>>>> On Mon, Mar 17, 2014 at 3:18 PM, Martin Koch <mak at issuu.com> wrote:
> >>>>> Well, then it works out to around 2.5GHz, which seems reasonable.
> >>>>> But it doesn't alter the conclusion from the previous email: the
> >>>>> slow queries all have a duration of around 34*10^9 units, while
> >>>>> 'normal' queries take about 1*10^9 units, or 0.4 seconds at this
> >>>>> conversion (and 34*10^9 / 2.5*10^9 ~= 13.6 seconds, which matches
> >>>>> the ~13 second slow runs). Also, the log shows that a slow query
> >>>>> performs many more gc-minor operations than a 'normal' one: 9600
> >>>>> gc-collect-step/gc-minor/gc-minor-walkroots operations vs 58.
> >>>>>
> >>>>> So the question becomes: why do we get this large spike in
> >>>>> gc-minor-walkroots, and, in particular, is there any way to avoid
> >>>>> it :) ?
> >>>>>
> >>>>> Thanks,
> >>>>> /Martin
> >>>>>
> >>>>>
> >>>>> On Mon, Mar 17, 2014 at 1:53 PM, Maciej Fijalkowski
> >>>>> <fijall at gmail.com> wrote:
> >>>>>>
> >>>>>> I think it's the cycles of your CPU
> >>>>>>
> >>>>>>> On Mon, Mar 17, 2014 at 2:48 PM, Martin Koch <mak at issuu.com>
> >>>>>>> wrote:
> >>>>>>> What is the unit? Perhaps I'm being thick here, but I can't
> >>>>>>> correlate it with seconds (which the program does print out).
> >>>>>>> Slow runs are around 13 seconds, but span around 34*10^9 (dec),
> >>>>>>> i.e. roughly 0x800000000, timestamp units (e.g. from
> >>>>>>> 0x2b994c9d31889c to 0x2b9944ab8c4f49).
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Mar 17, 2014 at 12:09 PM, Maciej Fijalkowski
> >>>>>>> <fijall at gmail.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> The number of lines is nonsense. This is a timestamp in hex.
> >>>>>>>>
> >>>>>>>> On Mon, Mar 17, 2014 at 12:46 PM, Martin Koch <mak at issuu.com>
> >>>>>>>> wrote:
> >>>>>>>>> Based On Maciej's suggestion, I tried the following
> >>>>>>>>>
> >>>>>>>>> PYPYLOG=- pypy mem.py 10000000 > out
> >>>>>>>>>
> >>>>>>>>> This generates a logfile which looks something like this
> >>>>>>>>>
> >>>>>>>>> start-->
> >>>>>>>>> [2b99f1981b527e] {gc-minor
> >>>>>>>>> [2b99f1981ba680] {gc-minor-walkroots
> >>>>>>>>> [2b99f1981c2e02] gc-minor-walkroots}
> >>>>>>>>> [2b99f19890d750] gc-minor}
> >>>>>>>>> [snip]
> >>>>>>>>> ...
> >>>>>>>>> <--stop
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> It turns out that the culprit is a lot of MINOR collections.
> >>>>>>>>>
> >>>>>>>>> I base this on the following observations:
> >>>>>>>>>
> >>>>>>>>> - I can't understand the format of the timestamp on each logline
> >>>>>>>>>   (the "[2b99f1981b527e]"). From what I can see in the code, this
> >>>>>>>>>   should be output from time.clock(), but that doesn't return a
> >>>>>>>>>   number like that when I run pypy interactively.
> >>>>>>>>> - Instead, I count the number of debug lines between start--> and
> >>>>>>>>>   the corresponding <--stop.
> >>>>>>>>> - Most runs have a few hundred lines of output between start/stop.
> >>>>>>>>> - All slow runs have very close to 57800 lines of output between
> >>>>>>>>>   start/stop.
> >>>>>>>>> - One such sample does 9609 gc-collect-step operations, 9647
> >>>>>>>>>   gc-minor operations, and 9647 gc-minor-walkroots operations.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> /Martin
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Mar 17, 2014 at 8:21 AM, Maciej Fijalkowski
> >>>>>>>>> <fijall at gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> there is an environment variable PYPYLOG=gc:- (where - is
> >>>>>>>>>> stdout)
> >>>>>>>>>> which will do that for you btw.
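> >>>>>>>>>>
> >>>>>>>>>> e.g., combined with the invocation used elsewhere in this
> >>>>>>>>>> thread, that would be:
> >>>>>>>>>>
> >>>>>>>>>>     PYPYLOG=gc:- pypy mem.py 10000000 > out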
> >>>>>>>>>>
> >>>>>>>>>> maybe you can find out what that is using profiling or valgrind?
> >>>>>>>>>>
> >>>>>>>>>> On Sun, Mar 16, 2014 at 11:34 PM, Martin Koch <mak at issuu.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>> I have tried getting the pypy source and building my own
> >>>>>>>>>>> version of pypy. I have modified
> >>>>>>>>>>> rpython/memory/gc/incminimark.py:major_collection_step() to
> >>>>>>>>>>> print out when it starts and when it stops. Apparently, the
> >>>>>>>>>>> slow queries do NOT occur during major_collection_step; at
> >>>>>>>>>>> least, I have not observed major step output during a query
> >>>>>>>>>>> execution. So, apparently, something else is blocking. This
> >>>>>>>>>>> could be another aspect of the GC, but it could also be
> >>>>>>>>>>> anything else.
> >>>>>>>>>>>
> >>>>>>>>>>> Just to be sure, I have tried running the same application in
> >>>>>>>>>>> python with garbage collection disabled. I don't see the
> >>>>>>>>>>> problem there, so it is somehow related to either the GC or
> >>>>>>>>>>> the runtime.
> >>>>>>>>>>>
> >>>>>>>>>>> Cheers,
> >>>>>>>>>>> /Martin
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Mar 14, 2014 at 4:19 PM, Martin Koch <mak at issuu.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> We have hacked up a small sample that seems to exhibit the
> >>>>>>>>>>>> same issue.
> >>>>>>>>>>>>
> >>>>>>>>>>>> We basically generate a linked list of objects. To increase
> >>>>>>>>>>>> connectedness, elements in the list hold references
> >>>>>>>>>>>> (dummy_links) to 10 randomly chosen previous elements in the
> >>>>>>>>>>>> list.
> >>>>>>>>>>>>
> >>>>>>>>>>>> We then time a function that traverses 50000 elements from the
> >>>>>>>>>>>> list from a random start point. If the traversal reaches the
> >>>>>>>>>>>> end of the list, we instead traverse one of the dummy links.
> >>>>>>>>>>>> Thus, exactly 50K elements are traversed every time. To
> >>>>>>>>>>>> generate some garbage, we build a list holding the traversed
> >>>>>>>>>>>> elements and a dummy list of characters.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Timings for the last 100 runs are stored in a circular buffer.
> >>>>>>>>>>>> If the elapsed time for the last run is more than twice the
> >>>>>>>>>>>> average time, we print out a line with the elapsed time, the
> >>>>>>>>>>>> threshold, and the 90% runtime. (We would like to see that the
> >>>>>>>>>>>> mean runtime does not increase with the number of elements in
> >>>>>>>>>>>> the list, but that the max time does increase, linearly with
> >>>>>>>>>>>> the number of objects, I guess; traversing 50K elements should
> >>>>>>>>>>>> be independent of the memory size.)
> >>>>>>>>>>>>
> >>>>>>>>>>>> We have tried monitoring memory consumption by external
> >>>>>>>>>>>> inspection, but cannot consistently verify that memory is
> >>>>>>>>>>>> deallocated at the same time that we see slow requests.
> >>>>>>>>>>>> Perhaps the pypy runtime doesn't always return freed pages
> >>>>>>>>>>>> back to the OS?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Using top, we observe that 10M elements allocate around 17GB
> >>>>>>>>>>>> after building, 20M elements 26GB, and 30M elements 28GB
> >>>>>>>>>>>> (growing to 35GB shortly after building).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Here is output from a few runs with different numbers of
> >>>>>>>>>>>> elements:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> pypy mem.py 10000000
> >>>>>>>>>>>> start build
> >>>>>>>>>>>> end build 84.142424
> >>>>>>>>>>>> that took a long time elapsed: 13.230586  slow_threshold: 1.495401  90th_quantile_runtime: 0.421558
> >>>>>>>>>>>> that took a long time elapsed: 13.016531  slow_threshold: 1.488160  90th_quantile_runtime: 0.423441
> >>>>>>>>>>>> that took a long time elapsed: 13.032537  slow_threshold: 1.474563  90th_quantile_runtime: 0.419817
> >>>>>>>>>>>>
> >>>>>>>>>>>> pypy mem.py 20000000
> >>>>>>>>>>>> start build
> >>>>>>>>>>>> end build 180.823105
> >>>>>>>>>>>> that took a long time elapsed: 27.346064  slow_threshold: 2.295146  90th_quantile_runtime: 0.434726
> >>>>>>>>>>>> that took a long time elapsed: 26.028852  slow_threshold: 2.283927  90th_quantile_runtime: 0.374190
> >>>>>>>>>>>> that took a long time elapsed: 25.432279  slow_threshold: 2.279631  90th_quantile_runtime: 0.371502
> >>>>>>>>>>>>
> >>>>>>>>>>>> pypy mem.py 30000000
> >>>>>>>>>>>> start build
> >>>>>>>>>>>> end build 276.217811
> >>>>>>>>>>>> that took a long time elapsed: 40.993855  slow_threshold: 3.188464  90th_quantile_runtime: 0.459891
> >>>>>>>>>>>> that took a long time elapsed: 41.693553  slow_threshold: 3.183003  90th_quantile_runtime: 0.393654
> >>>>>>>>>>>> that took a long time elapsed: 39.679769  slow_threshold: 3.190782  90th_quantile_runtime: 0.393677
> >>>>>>>>>>>> that took a long time elapsed: 43.573411  slow_threshold: 3.239637  90th_quantile_runtime: 0.393654
> >>>>>>>>>>>>
> >>>>>>>>>>>> Code below
> >>>>>>>>>>>>
> >>>>>>>>>>>> --------------------------------------------------------------
> >>>>>>>>>>>> import time
> >>>>>>>>>>>> from random import randint, choice
> >>>>>>>>>>>> import sys
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> allElems = {}
> >>>>>>>>>>>>
> >>>>>>>>>>>> class Node:
> >>>>>>>>>>>>     def __init__(self, v_):
> >>>>>>>>>>>>         self.v = v_
> >>>>>>>>>>>>         self.next = None
> >>>>>>>>>>>>         self.dummy_data = [randint(0, 100)
> >>>>>>>>>>>>                            for _ in xrange(randint(50, 100))]
> >>>>>>>>>>>>         allElems[self.v] = self
> >>>>>>>>>>>>         if self.v > 0:
> >>>>>>>>>>>>             self.dummy_links = [allElems[randint(0, self.v - 1)]
> >>>>>>>>>>>>                                 for _ in xrange(10)]
> >>>>>>>>>>>>         else:
> >>>>>>>>>>>>             self.dummy_links = [self]
> >>>>>>>>>>>>
> >>>>>>>>>>>>     def set_next(self, l):
> >>>>>>>>>>>>         self.next = l
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> def follow(node):
> >>>>>>>>>>>>     acc = []
> >>>>>>>>>>>>     count = 0
> >>>>>>>>>>>>     cur = node
> >>>>>>>>>>>>     assert node.v is not None
> >>>>>>>>>>>>     assert cur is not None
> >>>>>>>>>>>>     while count < 50000:
> >>>>>>>>>>>>         # return a value; generate some garbage
> >>>>>>>>>>>>         acc.append((cur.v, [choice("abcdefghijklmnopqrstuvwxyz")
> >>>>>>>>>>>>                             for x in xrange(100)]))
> >>>>>>>>>>>>
> >>>>>>>>>>>>         # if we have reached the end, choose a random link
> >>>>>>>>>>>>         cur = choice(cur.dummy_links) if cur.next is None else cur.next
> >>>>>>>>>>>>         count += 1
> >>>>>>>>>>>>
> >>>>>>>>>>>>     return acc
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> def build(num_elems):
> >>>>>>>>>>>>     start = time.time()
> >>>>>>>>>>>>     print "start build"
> >>>>>>>>>>>>     root = Node(0)
> >>>>>>>>>>>>     cur = root
> >>>>>>>>>>>>     for x in xrange(1, num_elems):
> >>>>>>>>>>>>         e = Node(x)
> >>>>>>>>>>>>         cur.next = e
> >>>>>>>>>>>>         cur = e
> >>>>>>>>>>>>     print "end build %f" % (time.time() - start)
> >>>>>>>>>>>>     return root
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> num_timings = 100
> >>>>>>>>>>>> if __name__ == "__main__":
> >>>>>>>>>>>>     num_elems = int(sys.argv[1])
> >>>>>>>>>>>>     build(num_elems)
> >>>>>>>>>>>>     total = 0
> >>>>>>>>>>>>     # run times for the last num_timings runs
> >>>>>>>>>>>>     timings = [0.0] * num_timings
> >>>>>>>>>>>>     i = 0
> >>>>>>>>>>>>     beginning = time.time()
> >>>>>>>>>>>>     while time.time() - beginning < 600:
> >>>>>>>>>>>>         start = time.time()
> >>>>>>>>>>>>         elem = allElems[randint(0, num_elems - 1)]
> >>>>>>>>>>>>         assert(elem is not None)
> >>>>>>>>>>>>
> >>>>>>>>>>>>         lst = follow(elem)
> >>>>>>>>>>>>
> >>>>>>>>>>>>         total += choice(lst)[0]  # use the return value for something
> >>>>>>>>>>>>
> >>>>>>>>>>>>         end = time.time()
> >>>>>>>>>>>>
> >>>>>>>>>>>>         elapsed = end - start
> >>>>>>>>>>>>         timings[i % num_timings] = elapsed
> >>>>>>>>>>>>         if (i > num_timings):
> >>>>>>>>>>>>             # slow defined as > 2*avg run time
> >>>>>>>>>>>>             slow_time = 2 * sum(timings) / num_timings
> >>>>>>>>>>>>             if (elapsed > slow_time):
> >>>>>>>>>>>>                 print "that took a long time elapsed: %f  slow_threshold: %f  90th_quantile_runtime: %f" % \
> >>>>>>>>>>>>                     (elapsed, slow_time, sorted(timings)[int(num_timings * .9)])
> >>>>>>>>>>>>         i += 1
> >>>>>>>>>>>>     print total
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Mar 13, 2014 at 7:45 PM, Maciej Fijalkowski
> >>>>>>>>>>>> <fijall at gmail.com>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Mar 13, 2014 at 1:45 PM, Martin Koch <mak at issuu.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>> Hi Armin, Maciej
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks for responding.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I'm in the process of trying to determine what (if any) of
> >>>>>>>>>>>>>> the code I'm in a position to share, and I'll get back to
> >>>>>>>>>>>>>> you.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Allowing hinting to the GC would be good. Even better would
> >>>>>>>>>>>>>> be a means to allow me to (transparently) allocate objects
> >>>>>>>>>>>>>> in unmanaged memory, but I would expect that to be a tall
> >>>>>>>>>>>>>> order :)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> /Martin
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Martin.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Note that in case you want us to do the work of isolating
> >>>>>>>>>>>>> the problem, we do offer paid support to do that (then we
> >>>>>>>>>>>>> can sign NDAs and stuff). Otherwise we would be more than
> >>>>>>>>>>>>> happy to fix bugs once you isolate a part you can share
> >>>>>>>>>>>>> freely :)
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>
> >>
>