On Sat, 5 Dec 2020 12:02:52 -0800 Christopher Barker email@example.com wrote:
just one more note:
things like what you are proposing, with an eye to performance, are not really where the Python community wants to go.
I never met a Python user who said something like "I want Python to be slow" or "I want Python to keep being slow", so we'll see how that goes.
But many that might say "I don't want to make Python less flexible in order to gain performance"
And I'd shake hands with them, because I add "strict mode" as an additional optional mode beyond the standard Python mode. (I'd however expect that I personally would use it often, because it's just a small notch above how I write programs in Python anyway.)
Of course no one is going to reject an enhancement that improves performance if it has no costs.
My thought on your idea is this:
Yes, a more restricted (strict) version of Python that had substantially better performance could be very nice. But the trick here is that you are proposing a spec, hoping that it could be used to enhance performance. I suspect you aren't going to get very far (with community support) without an implementation that shows what the performance benefits really are.
As I mentioned in previous replies, I fully agree that it would be nice to see performance figures. But sadly, as directly related to the strict mode, those aren't available yet. However, if the question is to explicate the idea further, that can be done on synthetic examples right away.
Suppose we have some pretty typical-looking Python code like this (condensed to save on vertical space):
---
def foo():
    a = 1; b = 2
    for _ in range(10000000):
        c = min(a, b)
foo()
---
The problem with executing that code is that "min", per the standard Python semantics, is looked up by name (and beyond that, the lookup is two-level, aka "pretty complex": first the module globals, then the builtins). Done 10 million times in a loop, that's gotta be slow.
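To make the two-level lookup concrete, here is a minimal sketch in plain Python. The names call_min and lookup_demo are made up for this illustration; the point is only that a module-level "min" shadows the builtin one at call time, which is why an implementation can't assume in advance what "min" refers to.

```python
def call_min(a, b):
    # "min" is not a local here, so it is resolved by name on every call:
    # first in the module globals, then in the builtins namespace.
    return min(a, b)


def lookup_demo():
    results = [call_min(1, 2)]            # found in builtins
    # Rebind "min" at module level at runtime; the first lookup level now hits.
    globals()["min"] = lambda a, b: "shadowed"
    try:
        results.append(call_min(1, 2))    # module global wins over the builtin
    finally:
        del globals()["min"]              # restore the normal state
    results.append(call_min(1, 2))        # falls through to builtins again
    return results
```

Calling lookup_demo() shows the same function body giving different answers depending on what the module namespace holds at the moment of the call.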
Let's run it in a Python implementation which doesn't have existing means to optimize those "pretty complex" lookups, e.g. my Pycopy (btw, am I the only one who finds it weird that you can't pass a script to timeit?):
$ pycopy -m timeit -n1 -r1 "import case1"
1 loops, best of 1: 2.41 sec per loop
A common way to optimize global lookups (which are usually done by name in overdynamic languages) is to cache the looked-up value in a local variable (locals aren't part of the external interface, and thus are usually already optimized to be accessed by "stack slot"):
---
def foo():
    from builtins import min
    a = 1; b = 2
    for _ in range(10000000):
        c = min(a, b)
foo()
---
$ pycopy -m timeit -n1 -r1 "import case3"
1 loops, best of 1: 551 msec per loop
4 times faster.
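For those who want to see where the difference comes from, the sketch below inspects the bytecode with the standard "dis" module. It assumes CPython (Pycopy's bytecode is different, so this only illustrates the principle), and the function names are made up for the example. In the first function "min" is fetched with a by-name LOAD_GLOBAL instruction; in the second it is an ordinary local.

```python
import dis


def global_lookup(a, b):
    # "min" is resolved by name at run time (globals, then builtins).
    return min(a, b)


def local_cache(a, b):
    # The import binds "min" as a local variable of this function,
    # so later uses read it from a local slot instead of looking it up by name.
    from builtins import min
    return min(a, b)


def loads_min_globally(func):
    """Return True if func's bytecode fetches "min" via LOAD_GLOBAL."""
    return any(
        ins.opname == "LOAD_GLOBAL" and ins.argval == "min"
        for ins in dis.get_instructions(func)
    )


if __name__ == "__main__":
    print("global_lookup:", loads_min_globally(global_lookup))
    print("local_cache:  ", loads_min_globally(local_cache))
```

Both functions return the same result; only the way "min" is reached differs.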
So, the idea behind the strict mode is to be able to perform such an optimization automatically, without manual patching like "from builtins import min" above. And the example above shows just the surface of it, for the bytecode interpretation case. But the strict mode reaches straight to JITted machine code, where it allows generating the same code for function calls as would be generated for C.
The "code for function calls" is the key phrase here. Of course, Python differs from C in more things than just name lookups. And most of these things are necessarily slower (and much harder to optimize). But the name lookups don't have to be, and the strict mode (so far) tries to improve just this one aspect. It does that because it's simple to do, for very modest losses in Python expressivity (adjusted for real-world code sanity and maintainability). And it does that in order to put a checkmark against this aspect and move on to the other things to optimize (or not).
I'm just one random guy on this list, but my response is:
"interesting, but show me how it works before you make anything official"
It's nothing "official", it's a completely grass-roots proposal for whoever may be interested in it. But I have to admit that I like it very much (after converting a few of my apps to it), and already treat it as an inalienable part of the semantics of my Python dialect, Pycopy.