[Python-ideas] Re: foo.setParseAction(lambda a, b, c: raise FuckPython(":("))

31 Oct 2019

      On Oct 31, 2019, at 16:24, Steven D'Aprano  wrote:
...
...
On Thu, Oct 31, 2019 at 02:47:35PM +0100, Andrew Barnert wrote:
...
On Oct 31, 2019, at 13:56, Steven D'Aprano  wrote:
I disagree. I think it's a pretty small breaking change,
From my test earlier on the thread, using a function to raise makes a 
function that does nothing but raise and handle an exception take more 
than twice as long. Which implies that the actual raise part (not 
counting the outer function call, handling the exception, and 
returning) is far more than twice as long.
I don't recall seeing your test code, but I don't follow your reasoning 
here.
You replied to a message from Ben that was a reply to my message with the performance test in it, which Ben was referring to in the chunk you quoted.
...
Unless you are using a custom Python build where the raise 
statement has been replaced by a raise function, I don't see how you can 
conclude that the raise statement inside a function takes "far more than 
twice as long" as a bare raise statement.
def throw(e): raise e

Test functions using throw(ValueError("something")) vs. raise ValueError("something").
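
The original test code is not reproduced in this message, so what follows is only a minimal sketch of what such a comparison might look like. The names use_statement, use_function and compare are hypothetical, and the ratio it prints will vary by machine and Python build, which is why no figure is given.

```python
import timeit


def throw(e):
    # The one-liner from the thread: a function whose only job is to raise.
    raise e


def use_statement():
    # Raise with the statement and handle the exception immediately.
    try:
        raise ValueError("something")
    except ValueError:
        pass


def use_function():
    # Same work, but the raise happens one function call deeper.
    try:
        throw(ValueError("something"))
    except ValueError:
        pass


def compare(number=200_000):
    # Hypothetical helper: returns total seconds for each variant.
    t_stmt = timeit.timeit(use_statement, number=number)
    t_func = timeit.timeit(use_function, number=number)
    return t_stmt, t_func


if __name__ == "__main__":
    t_stmt, t_func = compare()
    print(f"statement: {t_stmt:.3f}s  function: {t_func:.3f}s  "
          f"ratio: {t_func / t_stmt:.2f}x")
```

Note that both variants include the cost of the outer call and of handling the exception, so the measured ratio understates how much slower the raise step itself is.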
...
So on my machine, raise-in-a-function is about 70% slower than raise, 
not 100% or more.
OK, and it’s possible that on a different machine those functions would be only 50% slower instead of 70% or more than 100%. So what? You can’t answer “it’s too slow” with “it’s only too slow on some machines”, much less “it’s only too slow on some machines; on others it’s arguably maybe on the border of acceptable”. It’s still significantly slower.
...
That's a pure-Python function. It's possible that a 
builtin function in C will have less overhead.
Sure, which I already mentioned in the previous message. Other things you didn’t address include the cost of looking up a builtin (which I don’t think is a big deal, but the OP was worried about it, and your test doesn’t include it)… but better if you go back and read the thread you’re replying to rather than me trying to summarize it badly.
...
But to paraphrase the prevailing opinion expressed in the "word list" 
thread, you cannot possibly be seriously worried about a few 
microseconds difference. (Only slightly serious.)
But this is different. Parsing a global constant is something you do once, at module compile time, so who cares about a few us. Printing is already slow, and you already can and do work around that when performance matters, so who cares about a few us. Raising exceptions is something you’re forced to do all over the place. Maybe thousands of times per second. So you may need to care about a few us.

For example, if I need to iterate through a ton of tiny sequences, each one of them has to raise StopIteration. A ton of times in the middle of my code is not the same as one time at module compilation.
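
To make that concrete, here is a small sketch with a hand-written iterator that counts how often it raises StopIteration. The class CountingIter and the function total are made up for illustration; the built-in iterators signal exhaustion in C and are cheaper, so this only shows how often the raise happens, not what it costs.

```python
class CountingIter:
    """Hypothetical hand-written iterator that counts its StopIteration raises."""

    raises = 0  # class-wide counter, for illustration only

    def __init__(self, seq):
        self._seq = seq
        self._i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._i >= len(self._seq):
            CountingIter.raises += 1
            raise StopIteration  # one raise per sequence, however short it is
        value = self._seq[self._i]
        self._i += 1
        return value


def total(sequences):
    # Sum every element of every sequence, going through CountingIter.
    result = 0
    for seq in sequences:
        for x in CountingIter(seq):
            result += x
    return result


if __name__ == "__main__":
    tiny = [(1, 2)] * 10_000
    print(total(tiny), CountingIter.raises)
```

With two-element sequences, one step in three of the inner loop is a raise, so any per-raise overhead is multiplied by the number of sequences.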
...
In this case, I'm not that worried about micro-benchmarks unless they 
can be shown to add up to a significant performance degradation in 
macro-benchmarks too.
I think the onus is on people who want this feature to prove that it doesn’t affect the performance of real-life code, not on people who don’t think it’s necessary to prove that it does.
...
...
Also, there are many places in the Python data model where you have to 
raise an exception, such as StopIteration;
Does the Python data model not have function calls? *wink*
Yes, it does. So if you proposed something that made every function call 100% slower on my machine (even if it was only 70% slower on yours, even if it was only a matter of a few microseconds), then I would be raising the exact same objection.

And if it also broke backward compatibility, then I’d be raising that objection as well.
...
...
And what’s the advantage?
If we need raise inside an expression without extra overhead, this 
doesn’t solve the problem; we’d need a raise expression (similar to 
the yield expression), not a raise function.
Yield expressions are a red herring here.
No, they’re a model for how you can turn a statement into an expression between Python versions without breaking backward compatibility. 

That doesn’t quite prove that the same thing can be done here, but it does alleviate the concern (raised earlier in the thread) that someone thought it was unlikely that you could possibly do such a thing.
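
As a reminder of what that precedent looks like (yield became an expression in Python 2.5, per PEP 342), here is a short sketch. The generator names old_style and running_total are made up for illustration. Code that uses yield as a bare statement keeps working unchanged, while newer code can use the value of the expression.

```python
def old_style():
    # yield used as a plain statement: still valid after the change.
    yield 1
    yield 2


def running_total():
    # yield used as an expression: its value is whatever send() passes in.
    total = 0
    while True:
        value = yield total
        if value is not None:
            total += value


if __name__ == "__main__":
    print(list(old_style()))
    gen = running_total()
    next(gen)            # advance to the first yield
    print(gen.send(5))
    print(gen.send(3))
```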
...
Yield expressions are a weird 
syntactic form that allow you to feed data into a running generator. 
raise doesn't need to do that. The raise function will never return, so 
it doesn't need to return a value. It will be kind of like os._exit, and 
sys.exit.
A function that never returns is no less weird than an expression that never has a value. (Especially given that the only way to use a function is with a call expression anyway.) They’re both equally weird. But exceptions are one of the handful of cases where you expect it to be weird—along with things like exit and abort and break and return. The only way to avoid that weirdness is to do what Python (and C++ and some other languages) do with about half of them (including raise), and make it a statement.

I think the status quo is fine, but if there really is a need for raising in the middle of a lambda or a comprehension or whatever, it has to be an expression, and I think it makes more sense for that expression to still be custom syntax rather than a magic function.
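
For reference, the status quo workaround for raising inside a lambda is the one-liner helper from earlier in the thread. The sketch below assumes a helper named throw and a made-up parse_positive callback; it works today, at the cost of the extra function call discussed above.

```python
def throw(exc):
    # Wrap the raise statement so it can be used where an expression is required.
    raise exc


# Hypothetical callback: a conditional expression that either returns or raises.
parse_positive = lambda text: (
    int(text) if text.isdigit() else throw(ValueError(f"not a number: {text!r}"))
)


if __name__ == "__main__":
    print(parse_positive("42"))
    try:
        parse_positive("forty-two")
    except ValueError as err:
        print("rejected:", err)
```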
...
...
If we don’t care about the overhead, there is no problem to solve; you 
can already get this behavior with a one-liner. If that’s still too 
inconvenient or undiscoverable, we can add the one-liner to builtins 
without removing the statement at the same time. So, what’s the 
benefit of removing the statement?
"Only One Way To Do It" *wink*
...
We could have added a print_ function and kept the print statement too.
Yes, but that case was different, for the reasons already discussed (print doesn’t affect flow control and doesn’t need any magic to implement as a function—you can even write it in pure Python; performance wasn’t an issue; there’s sometimes a useful benefit to shadowing a print builtin; breaking backward compatibility was less serious of an issue in 3.0 than in 3.9; …).
...
If, back in the early days of Python, Guido had made raise a function, 
we'd all accept it as normal and desirable. I doubt anyone would say "I 
really wish it was impossible to call raise from an expression without 
having to write an unnecessary boilerplate wrapper function".
If he’d made raise a syntactic expression rather than a statement, would anyone be saying “I really wish it was slower to raise an exception—and, even more, I wish it used a magic function that I can’t write myself, and that it was less recognizable as flow control (especially since half the IDEs and colorizers color it like a normal function)?”

If raise had been a function, we probably wouldn’t complain (except maybe for some people asking why it’s gratuitously different from all the familiar languages). But maybe we’d have been much more apologetic or much more evasive in all those “why Python isn’t too slow” blog posts when it came to the exceptions section. Plus, every time some new mechanism was designed to use exceptions, like StopIteration, there would have been a more serious objection to overcome, so we might have ended up with a different and clunkier design.
...
As I said above, I'll grant you the *possibility* that performance may 
tilt the balance, but honestly, I see no other reason why we should 
prefer a *less flexible, less useful* raise statement over a more 
flexible, more useful function.
More flexible isn’t always more useful. I’m not convinced that people really do frequently need to raise in the middle of an expression.

And performance is far from the only consideration—there’s backward compatibility, and looking like flow control, and so on. None of which are an issue for a raise expression—which solves the same problem, if it really is a problem that needs to be solved.