We have hacked up a small sample that seems to exhibit the same issue.We basically generate a linked list of objects. To increase connectedness, elements in the list hold references (dummy_links) to 10 randomly chosen previous elements in the list.We then time a function that traverses 50000 elements from the list from a random start point. If the traversal reaches the end of the list, we instead traverse one of the dummy links. Thus, exactly 50K elements are traversed every time. To generate some garbage, we build a list holding the traversed elements and a dummy list of characters.Timings for the last 100 runs are stored in a circular buffer. If the elapsed time for the last run is more than twice the average time, we print out a line with the elapsed time, the threshold, and the 90% runtime (we would like to see that the mean runtime does not increase with the number of elements in the list, but that the max time does increase (linearly with the number of object, i guess); traversing 50K elements should be independent of the memory size).We have tried monitoring memory consumption by external inspection, but cannot consistently verify that memory is deallocated at the same time that we see slow requests. Perhaps the pypy runtime doesn't always return freed pages back to the OS?Using top, we observe that 10M elements allocates around 17GB after building, 20M elements 26GB, 30M elements 28GB (and grows to 35GB shortly after building).Here is output from a few runs with different number of elements:pypy mem.py 10000000start buildend build 84.142424that took a long time elapsed: 13.230586 slow_threshold: 1.495401 90th_quantile_runtime: 0.421558that took a long time elapsed: 13.016531 slow_threshold: 1.488160 90th_quantile_runtime: 0.423441that took a long time elapsed: 13.032537 slow_threshold: 1.474563 90th_quantile_runtime: 0.419817pypy mem.py 20000000start buildend build 180.823105that took a long time elapsed: 27.346064 slow_threshold: 2.295146 90th_quantile_runtime: 0.434726that took a long time elapsed: 26.028852 slow_threshold: 2.283927 90th_quantile_runtime: 0.374190that took a long time elapsed: 25.432279 slow_threshold: 2.279631 90th_quantile_runtime: 0.371502pypy mem.py 30000000start buildend build 276.217811that took a long time elapsed: 40.993855 slow_threshold: 3.188464 90th_quantile_runtime: 0.459891that took a long time elapsed: 41.693553 slow_threshold: 3.183003 90th_quantile_runtime: 0.393654that took a long time elapsed: 39.679769 slow_threshold: 3.190782 90th_quantile_runtime: 0.393677that took a long time elapsed: 43.573411 slow_threshold: 3.239637 90th_quantile_runtime: 0.393654Code below--------------------------------------------------------------import timefrom random import randint, choiceimport sysallElems = {}class Node:def __init__(self, v_):self.v = v_self.next = Noneself.dummy_data = [randint(0,100)for _ in xrange(randint(50,100))]allElems[self.v] = selfif self.v > 0:self.dummy_links = [allElems[randint(0, self.v-1)] for _ in xrange(10)]else:self.dummy_links = [self]def set_next(self, l):self.next = ldef follow(node):acc = []count = 0cur = nodeassert node.v is not Noneassert cur is not Nonewhile count < 50000:# return a value; generate some garbageacc.append((cur.v, [choice("abcdefghijklmnopqrstuvwxyz") for x in xrange(100)]))# if we have reached the end, chose a random linkcur = choice(cur.dummy_links) if cur.next is None else cur.nextcount += 1return accdef build(num_elems):start = time.time()print "start build"root = Node(0)cur = rootfor x in xrange(1, num_elems):e = Node(x)cur.next = ecur = eprint "end build %f" % (time.time() - start)return rootnum_timings = 100if __name__ == "__main__":num_elems = int(sys.argv[1])build(num_elems)total = 0timings = [0.0] * num_timings # run times for the last num_timings runsi = 0beginning = time.time()while time.time() - beginning < 600:start = time.time()elem = allElems[randint(0, num_elems - 1)]assert(elem is not None)lst = follow(elem)total += choice(lst)[0] # use the return value for somethingend = time.time()elapsed = end-starttimings[i % num_timings] = elapsedif (i > num_timings):slow_time = 2 * sum(timings)/num_timings # slow defined as > 2*avg run timeif (elapsed > slow_time):print "that took a long time elapsed: %f slow_threshold: %f 90th_quantile_runtime: %f" % \(elapsed, slow_time, sorted(timings)[int(num_timings*.9)])i += 1print totalOn Thu, Mar 13, 2014 at 7:45 PM, Maciej Fijalkowski <fijall@gmail.com> wrote:On Thu, Mar 13, 2014 at 1:45 PM, Martin Koch <mak@issuu.com> wrote:Hi Martin.
> Hi Armin, Maciej
>
> Thanks for responding.
>
> I'm in the process of trying to determine what (if any) of the code I'm in a
> position to share, and I'll get back to you.
>
> Allowing hinting to the GC would be good. Even better would be a means to
> allow me to (transparently) allocate objects in unmanaged memory, but I
> would expect that to be a tall order :)
>
> Thanks,
> /Martin
Note that in case you want us to do the work of isolating the problem,
we do offer paid support to do that (then we can sign NDAs and stuff).
Otherwise we would be more than happy to fix bugs once you isolate a
part you can share freely :)