[Python-checkins] python/dist/src/Lib/test test_sort.py,NONE,1.1

Tim Peters tim.one@comcast.net
Fri, 02 Aug 2002 00:31:34 -0400


[Andrew MacIntyre, on the test_longexp + test_sort slowdown mystery]

> Hmmm, perhaps this was the sort of dragon you had in mind when you
> suggested I persist with PyMalloc'ing the parser on top of your
> overallocation strategy?

It is, but turns out it was a different dragon (a semantically invisible but
pragmatically deadly glitch in PyFrame_New()).

> It'd be worth a test with either my experimental or "final" patch(es)?

Although I didn't mention it, system malloc/free were pretty much ruled out
early because the slowdown persisted in a debug build (pymalloc takes over
the PyMem_xyz API too in the debug build, but not in the release build).

> Its mind boggling how many small mallocs in PyNode_AddChild()
> test_longexp generates (853016 total PyNode_AddChild() requests
> according to my logging harness, all bar 92(!) of which are for the
> n=1 case).

For a truly mind boggling experience, try writing down the source code it's
compiling on a sheet of paper (OK, mayber a ream <wink>), and compile it by
hand.  I expect this kind of node abuse to find relief as Jeremy moves the
compiler toward a more abstract syntax tree; right now we've got a tree as
concrete as they get.  For each of the int literals in the test_longexp
generated list, we dutifully record the whole path thru the grammar:

test ->
 and_test ->
  not_test ->
   comparison ->
    expr ->
     xor_expr ->
      and_expr ->
       shift_expr ->
        arith_expr ->
         term ->
          factor ->
           power ->
            atom ->
             NUMBER

That's 13 1-child nodes per "2", and there are REPS=65580 instances of this.
Multiply to get 852540, 99.94% of the total you counted.  An abstract syntax
tree can think of a "2" as being more a number than an exercise in linear
linked lists <wink>.