Tremendous slowdown due to garbage collection

andreas.eisele at
Sat Apr 12 16:01:48 EDT 2008

> Martin said that the default settings for the cyclic gc works for most
> people.

I agree.

> Your test case has found a pathologic corner case which is *not*
> typical for common application but typical for an artificial benchmark.

I agree that my "corner" is not typical, but I strongly disagree with
the classification as pathological. The only untypical feature of my
test case is the huge number of distinct objects that are allocated.
I admit that 1E7 objects is still fairly unusual today, but there is
nothing pathological about it; it is just bigger. It is about as
pathological as a file size >2GB, which a few years ago seemed so
outrageous that no OS bothered to support it, but is fairly common
nowadays, so that a lack of support would now appear as an arbitrary
and unmotivated limitation. We all enjoy seeing Python adopted on a
large scale and used by a broad community, so we should not accept
arbitrary size limits. You could call a string of more than 2GB
pathological, but I very much appreciate the fact that Python supports
such strings for the few cases where they are needed (on a 64-bit
architecture). An O(N*N) effort for large numbers of objects is not
such a hard limit, but in practice it has the same effect: people
cannot use Python in such circumstances. I would very much prefer it
if such "soft limits" could be avoided as well.
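
To make the shape of the problem concrete, here is a minimal sketch of
the kind of allocation pattern I have in mind (an illustration only,
not my actual benchmark):

    import time

    def build(n):
        data = []
        t0 = time.time()
        for i in xrange(n):
            # small container objects, all tracked by the cyclic GC
            data.append((i, [i]))
            if (i + 1) % 1000000 == 0:
                elapsed = time.time() - t0
                print "%8d objects, %6.2f s" % (i + 1, elapsed)
        return data

    if __name__ == "__main__":
        build(10000000)   # ~1E7 objects, roughly the size I mean

With the collector enabled, the time per million objects keeps growing,
because the cyclic GC repeatedly traverses everything already alive.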

Given that there is a fairly simple workaround (thanks again to
Amaury!), the issue is not urgent, but I still think it is important
in the long run.
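
For the record, the workaround amounts to switching the cyclic
collector off around the bulk allocation (this is my summary of the
idea, not Amaury's exact code):

    import gc

    gc.disable()        # suspend the cyclic collector while allocating
    try:
        data = [(i, [i]) for i in xrange(10000000)]
    finally:
        gc.enable()     # switch it back on afterwards
        gc.collect()    # one explicit pass for any cycles made meanwhile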

> Python is optimized for regular apps, not for benchmark (like some video
> drivers).

I still think it would be worthwhile to support very large numbers of
objects in such a way that they can simply be used, without knowledge
of special tricks, and I am fairly optimistic that those who designed
the current GC scheme could generalize it slightly so that these
marginal cases work better without imposing a penalty on the more
typical cases.
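
In the meantime, as far as I understand the gc module, one can already
make the collector much less eager by raising its thresholds; the
numbers below are only my own guess at a middle ground, not something
proposed in this thread:

    import gc

    print gc.get_threshold()          # (700, 10, 10) by default
    gc.set_threshold(100000, 10, 10)  # collect far less often while
                                      # many objects are being created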

> By the way you shouldn't use range for large ranges of more than a
> thousand items. xrange() should be faster and it will definitely use
> much less memory - and memory Python 2.5 and older will never release
> again. I'm going to fix the issue for Python 2.6 and 3.0.

Thanks for this hint, and for the work on the newer versions. This is
very much appreciated.
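
For anyone who runs into the same thing, the difference boils down to
this (a trivial sketch, of course):

    # range() materializes the whole list up front; xrange() yields the
    # numbers lazily, so the second loop needs only constant memory.
    for i in range(10000000):     # builds a list of 1E7 ints first
        pass
    for i in xrange(10000000):    # no big list, just a lazy counter
        pass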

