
Thanks for this email, Victor. It illustrates a lot of the pain points that some core devs have with the CPython development process.
I think that we are lucky to have you and Serhiy, who spend so much time pushing so many improvements to the CPython internals. I think that while we are working on a new major version of CPython (3.7 now), it's acceptable to push performance optimizations without a lengthy discussion on python-dev and a thorough review by 3+ core developers. An issue on the bug tracker explaining the change and showing some benchmarks should be enough.
Those who want to see the results of Serhiy's and Victor's work can look at https://speed.python.org/comparison/ and see for themselves that 3.7 is already faster than 3.6 and 2.7 in most cases.
To reflect on my own experience: I had a patch to speed up LOAD_GLOBAL, LOAD_ATTR and LOAD_METHOD early in the 3.6 development cycle. The patch promised to improve performance by 5-20% on some benchmarks. I sent a few emails to python-dev explaining the change, the memory usage changes, etc. What I saw was that only one or two people were interested in the change, and almost nobody wanted to actually review it. I became less motivated, and in the end I decided to focus on other areas and postpone my work on that optimization until later. Later I regretted that: I should have pushed the change, and we would have had a few months to improve and test it, and we would have had an even faster 3.6. (I'll continue my work on the patch soon.)
I think that we need to become less conservative about the development of CPython internals. At this point it's impossible to make CPython any faster without invasive refactorings, and I think it's OK to trust our core developers to make them.
Any public API changes or enhancements should still be discussed on python-dev and thoroughly reviewed, though. API design is something that a single person usually cannot do well.
Thank you, Yury