[Python-Dev] More optimisation ideas

Fri Jan 29 12:05:18 EST 2016

Since we're all talking about making Python faster, I thought I'd drop 
some previous ideas I've had here in case (1) someone wants to actually 
do them, and (2) they really are new ideas that haven't failed in the 
past. Mostly I was thinking about startup time.

Here is the list of modules imported on a clean startup on my Windows, 
US-English machine (from -v and cleaned up a bit):

import _frozen_importlib
import _imp
import sys
import '_warnings'
import '_thread'
import '_weakref'
import '_frozen_importlib_external'
import '_io'
import 'marshal'
import 'nt'
import '_thread'
import '_weakref'
import 'winreg'
import 'zipimport'
import '_codecs'
import 'codecs'
import 'encodings.aliases'
import 'encodings'
import 'encodings.mbcs'
import '_signal'
import 'encodings.utf_8'
import 'encodings.latin_1'
import '_weakrefset'
import 'abc'
import 'io'
import 'encodings.cp437'
import 'errno'
import '_stat'
import 'stat'
import 'genericpath'
import 'ntpath'
import '_collections_abc'
import 'os'
import '_sitebuiltins'
import 'sysconfig'
import '_locale'
import '_bootlocale'
import 'encodings.cp1252'
import 'site'
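
To get a comparable list on another machine, one option is to ask a fresh 
interpreter which modules it has loaded by the time user code runs. This is 
only a sketch: the helper name startup_modules is made up for illustration, 
and it reads sys.modules instead of parsing -v output, so the result will 
differ by platform, locale and Python version (it also assumes Python 3.7 
or later for capture_output).

```python
import subprocess
import sys


def startup_modules():
    """Return the sorted names of modules a fresh interpreter has loaded."""
    # Run a child interpreter so modules imported by this script don't count.
    code = "import sys; print('\\n'.join(sorted(sys.modules)))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for name in startup_modules():
        print(name)
```

The list includes built-in and frozen modules as well as ones loaded from 
disk, so it will not match the -v trace line for line.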

Obviously the easiest first thing is to remove or delay unnecessary 
imports. But a while ago I used a native profiler to trace through this 
and the most impactful modules were the encodings:

import 'encodings.mbcs'
import 'encodings.utf_8'
import 'encodings.latin_1'
import 'encodings.cp437'
import 'encodings.cp1252'

While I don't doubt that we need all of these for *some* reason, 
aliases, cp437 and cp1252 are relatively expensive modules to import, 
mostly due to having large static dictionaries or data structures 
generated on startup.
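
A rough way to check this without a native profiler is to time a forced 
re-import of each module. The sketch below is an assumption-laden 
approximation, not the measurement described above: time_import is a 
hypothetical helper, each figure comes from a single run, and it includes 
the cost of loading cached bytecode from disk.

```python
import importlib
import sys
import time


def time_import(name):
    """Return the seconds taken to import the named module from scratch."""
    # Drop any cached copy so the module body really executes again.
    sys.modules.pop(name, None)
    start = time.perf_counter()
    importlib.import_module(name)
    return time.perf_counter() - start


if __name__ == "__main__":
    for name in ("encodings.aliases", "encodings.cp437", "encodings.cp1252"):
        print(f"{name}: {time_import(name) * 1e6:.0f} us")
```

encodings.mbcs is left out because it only imports on Windows.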

Given this is static and mostly read-only information[1], I see no 
reason why we couldn't either generate completely static versions of 
them, or better yet compile the resulting data structures into the core 
binary.

([1]: If being able to write to some of the encoding data is used by 
some people, I vote for breaking that for 3.6 and making it read-only.)
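
As a sketch of what "compiling into the core binary" might start from, the 
following renders one decoding table as C source text. Everything here is 
hypothetical: the helper c_table, the array name and the choice of 
unsigned short are illustrative assumptions, it covers only the decoding 
table and not the derived encoding map, and it is not an existing CPython 
build step.

```python
import encodings.cp1252


def c_table(name, decoding_table):
    """Render a decoding table (one character per byte value) as a C array."""
    rows = []
    # Emit eight code points per line to keep the generated source readable.
    for i in range(0, len(decoding_table), 8):
        chunk = decoding_table[i:i + 8]
        rows.append("    " + ", ".join(f"0x{ord(ch):04X}" for ch in chunk) + ",")
    body = "\n".join(rows)
    size = len(decoding_table)
    return f"static const unsigned short {name}[{size}] = {{\n{body}\n}};\n"


if __name__ == "__main__":
    # "cp1252_decoding_table" is a made-up identifier for this example.
    print(c_table("cp1252_decoding_table", encodings.cp1252.decoding_table))
```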

This is probably the code snippet that bothered me the most:

     ### Encoding table
     encoding_table=codecs.charmap_build(decoding_table)

It shows up in many of the encodings modules, and while it is not a bad 
function in itself, we are obviously generating a known data structure 
on every startup. Storing these in static data is a tradeoff between 
disk space and startup performance, and one I think is likely to be 
worthwhile.
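
To make the cost concrete, the call can be repeated in isolation against 
one of the real tables. This is a minimal sketch using cp1252 as the 
example; the numbers it prints depend entirely on the machine and Python 
build, and it says nothing about how much a static table would save.

```python
import codecs
import timeit
from encodings import cp1252

# Rebuild the encoding map the same way the module does at import time.
table = codecs.charmap_build(cp1252.decoding_table)

# The rebuilt map encodes text the same way as the registered codec.
encoded, consumed = codecs.charmap_encode("café", "strict", table)
assert encoded == "café".encode("cp1252")

# Time the table construction on its own.
runs = 10_000
seconds = timeit.timeit(
    lambda: codecs.charmap_build(cp1252.decoding_table), number=runs
)
print(f"charmap_build: {seconds / runs * 1e6:.1f} us per call")
```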

Anyway, just an idea if someone wants to try it and see what 
improvements we can get. I'd love to do it myself, but when it actually 
comes to finding time I keep coming up short.

Cheers,
Steve

P.S. If you just want to discuss optimisation techniques or benchmarking 
in general, without specific application to CPython 3.6, there's a whole 
internet out there. Please don't make me the cause of a pointless 
centithread. :)