I suspect that stating and loading the .pyc files is responsible for most of the overhead.
PyRun starts up quite a lot faster thanks to embedding all the modules in the executable: http://www.egenix.com/products/python/PyRun/
Freezing all the core modules into the executable should reduce start up time.
That suggests a test to me that the Cython guys might be interested in (or may well have performed in the past). How much of the stdlib could be compiled with Cython and used during the startup process? How much of an effect would it have on startup times and these benchmarks if Cython-compiled extensions were used?
I'm thinking here of elimination of .pyc interpretation and execution (stat calls would be similar, probably slightly higher).
To be clear - I'm *not* suggesting Cython become part of the required build toolchain. But *if* the Cython-compiled extensions prove to be significantly faster I'm thinking maybe it could become a semi-supported option (e.g. a HOWTO with the caveat "it worked on this particular system").