[Timbot]
FYI, for years the dict code had some #ifdef'ed preprocessor gimmick to force cache alignment. I ripped that out a while back because nobody ever reported an improvement when using it.
Gee, you mean we're not the first ones to have ever thought up dictionary optimizations that didn't pan out? I've tried square wheels, pentagonal wheels, and gotten even better results with octagonal wheels. Each further subdivision seems to have less and less payoff, so I'm confident that octagonal is close to optimum ;-)

I'm going to write up an informational PEP to summarize the results of research to date. After the first draft, I'm sure the other experimenters will each have lessons to share. In addition, I'll attach a benchmarking suite and a dictionary simulator (fully instrumented). That way, future generations can reproduce the results and pick up where we left off.

I've decided that this new process should have a name, something pithy yet magical sounding, so it shall be dubbed SCIENCE.

Raymond Hettinger