
Hi all, Some results seem to show that GCC is more happy optimizing "for" and "while" loops than the equivalent loops written with a bunch of goto's like we produce in our C backend. Interestingly, it seems that what makes GCC unhappy is not the bunch of goto's, but only the absence of the explicit C-level loop. Example: For this piece of C code as someone would write by hand: while (n > 0) { ...body... } we generate the following instead, which defeats GCC's loop-specific optimizations: block1: cond = n > 0; if (cond) goto block2; goto block3; block2: ...body... goto block1; The smallest change I could find that make GCC produce good code again is to modify block1 and block2 like this: block1: cond = n > 0; while (cond) { /* trick */ goto block2; block1_back: cond = n > 0; /* the operations in block1 are repeated */ } goto block3; block2: ...body... goto block1_back; /* changed */ GCC doesn't seem to mind that the real body of the loop is not in the loop at all but somewhere else; I guess it just follows the chain of goto's and treat the result as equivalent to having the source code directly within the loop's { }. So I wrote a quick hack that detects (for now only the innermost) loops in graphs. More specifically it finds "loop head" blocks like block1, and "loop-closing links" like the link from block2 to block1. It then produces code like above. It's in the loops-for-gcc branch. It's quite obscure but seems to give a consistent 5% speed-up on pypy-c running richards... A bientot, Armin.
participants (1)
-
Armin Rigo