Since we're all talking about making Python faster, I thought I'd drop
some previous ideas I've had here in case (1) someone wants to actually
do them, and (2) they really are new ideas that haven't failed in the
past. Mostly I was thinking about startup time.
Here is the list of modules imported on a clean startup on my Windows,
US-English machine (taken from -v output and cleaned up a bit):
Obviously the easiest first thing is to remove or delay unnecessary
imports. But a while ago I used a native profiler to trace through this
and the most impactful modules were the encodings:
While I don't doubt that we need all of these for *some* reason,
aliases, cp437 and cp1252 are relatively expensive modules to import,
mostly because they have large static dictionaries or data structures
that are generated on startup.
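As a hedged aside, not part of the original post: the per-module import
cost described above can now be reproduced without a native profiler,
since Python 3.7 added -X importtime (this post predates it). A minimal
sketch, run in a fresh subprocess so startup imports are included:

```python
import subprocess
import sys

# -X importtime prints a per-module import-time report to stderr,
# covering both interpreter-startup imports and the explicit one below.
result = subprocess.run(
    [sys.executable, "-X", "importtime", "-c", "import encodings.cp437"],
    capture_output=True,
    text=True,
)

# Keep only the lines that mention the encodings modules.
for line in result.stderr.splitlines():
    if "encodings" in line:
        print(line)
```

Sorting that output by the cumulative column is a quick way to spot the
expensive codec modules on any given platform.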
Given this is static and mostly read-only information, I see no
reason why we couldn't either generate completely static versions of
them or, better yet, compile the resulting data structures into the
core. (If being able to write to some of the encoding data is used by
some people, I vote for breaking that for 3.6 and making it read-only.)
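To make the "static versions" idea concrete, here is a rough Python-level
sketch (the post itself is proposing this at the C level, so this is an
illustration, not the actual proposal): the finished aliases dict can be
serialized once with marshal and loaded back directly, instead of being
rebuilt from a large dict literal on every cold import.

```python
import marshal
from encodings.aliases import aliases

# Precompute once at build time: a shippable bytes blob of the final dict.
blob = marshal.dumps(aliases)

# At startup, loading the blob replaces constructing the dict literal.
restored = marshal.loads(blob)

print(len(aliases), "aliases;", len(blob), "bytes marshalled")
```

Whether this wins in practice depends on marshal-load cost versus
bytecode-execution cost, which is exactly the tradeoff discussed below.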
This is probably the code snippet that bothered me the most:
    ### Encoding table
    encoding_table=codecs.charmap_build(decoding_table)
It shows up in many of the encodings modules, and while it is not a bad
function in itself, we are obviously generating a known data structure
on every startup. Storing these in static data is a tradeoff between
disk space and startup performance, and one I think is likely to be
worth it.
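A hedged sketch of that tradeoff, not from the original post: time
rebuilding cp437's encoding table with codecs.charmap_build (an
undocumented internal helper, but the one the codec modules actually
call) against loading a premarshalled copy of the source table. Note
this only marshals the decoding_table string, not the built EncodingMap,
so it is a rough proxy rather than a complete static replacement.

```python
import codecs
import marshal
import timeit
import encodings.cp437 as cp437

# What the module does at import time today: build the map from scratch.
build = lambda: codecs.charmap_build(cp437.decoding_table)

# A possible static alternative: the table serialized once, ahead of time.
blob = marshal.dumps(cp437.decoding_table)
load = lambda: marshal.loads(blob)

n = 10_000
print("charmap_build:", timeit.timeit(build, number=n))
print("marshal.loads:", timeit.timeit(load, number=n))
```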
Anyway, it's just an idea if someone wants to try it and see what
improvements we can get. I'd love to do it myself, but when it comes
to actually finding the time, I keep coming up short.
P.S. If you just want to discuss optimisation techniques or benchmarking
in general, without specific application to CPython 3.6, there's a whole
internet out there. Please don't make me the cause of a pointless
thread.