On Tue, Jul 1, 2014 at 6:04 PM, Andrew Barnert firstname.lastname@example.org wrote:
On Monday, June 30, 2014 5:39 PM, Chris Angelico email@example.com wrote:
That would be interesting, but it raises the possibility of mucking up the stack. (Imagine if you put BUILD_SET 1 in there instead. What's it going to make a set of? What's going to happen to the rest of the stack? Do you REALLY want to debug that?)
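For reference, here is a minimal sketch of where BUILD_SET normally comes from, using the standard dis module. The function name make_singleton is made up for illustration, and the surrounding opcodes differ between CPython versions, so only the BUILD_SET argument is inspected.

```python
import dis

def make_singleton(x):
    # A one-element set display: the compiler pushes x, then emits BUILD_SET 1.
    return {x}

# Collect (opname, argument) pairs for the compiled function.
ops = [(ins.opname, ins.arg) for ins in dis.get_instructions(make_singleton)]

# BUILD_SET n pops n items off the stack, so its argument must match
# what the preceding instructions actually pushed.
build_set_args = [arg for name, arg in ops if name == "BUILD_SET"]
print(build_set_args)
```

The compiler guarantees that one value has been pushed before BUILD_SET 1 runs. A hand-inserted BUILD_SET 1 has no such guarantee, which is the stack problem described above.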
The same thing that happens if you use bad inline assembly in C, or a bad C extension module in Python—bad things that you can't debug at source level. And yet, inline assembly in C and C extension modules in Python are still quite useful.
Right, useful but it adds another set of problems. (Just out of curiosity, what protection _is_ there for a smashed stack? I just tried fiddling with it and didn't manage to crash stuff.)
I'll ignore the second case for the moment, because I think it's rarely if ever appropriate to Python, and just focus on the first. Those cases did not go away because CPUID got replaced with library functions. Those library functions, which are compiled with the same compiler you use for your code, have inline assembly in them. (Or, if you're on Linux, those library functions read from a device file, but the device driver, which is compiled with the same compiler you use, has inline assembly in it.) So, the compiler still needs to be able to compile it.
Or those library functions are written in assembly language directly. It's entirely possible to write something that uses CPUID and doesn't use inline assembly in a C source file. The equivalent here, I suppose, would be hand-rolling a .pyc file.
Some people would use it to create an empty set, others would use it to replace variable swapping with a marginally faster and *almost* identical stack-based swap:
Do you really think anyone would do the latter? Seriously, what kind of code can you imagine that's too slow in CPython, not worth rewriting in C or running in PyPy or whatever, but fast enough with the rot opcode removed? And if someone really _did_ need that, I doubt they'd care much that Python 3.8 makes it unnecessary; they obviously have a specific deployment platform that has to work and that needed that last 3% speedup under 3.6.2, and they're going to need that to keep working for years.
Hang on, you're asking two different questions there. I'll split it out:
1) Do I really think anyone *should* do this? Your subsequent comments support this question, and the answer is resoundingly NO. CPython is not the sort of platform on which that kind of thing is ever worth doing. You'll get far more performance by using Cython for parts, or in some other way improving your code, than you will by hand-tweaking the Python bytecode.
2) Do I think anyone would, if given the ability to tweak the bytecode, go "Ah ha!" and proudly improve on what the compiler has done, and then brag about the performance improvement? Definitely. Someone will. It'll make some marginal difference to a microbenchmark, and if you don't believe that would cause people to warp their code into utter unreadability, you clearly don't hang out on python-list enough :)
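The kind of microbenchmark in question might look like the sketch below, using the standard timeit module. The function names swap_tuple, swap_temp and time_swaps are hypothetical. No timings are shown because they depend on the machine and the CPython version.

```python
import timeit

def swap_tuple(a, b):
    # The idiomatic swap; the compiler is free to optimise this.
    a, b = b, a
    return a, b

def swap_temp(a, b):
    # The same swap spelled out with a temporary variable.
    tmp = a
    a = b
    b = tmp
    return a, b

def time_swaps(number=100_000):
    # Time each variant; most of the cost is the function call itself,
    # so any difference between the two is likely to be small.
    return {
        "tuple": timeit.timeit(lambda: swap_tuple(1, 2), number=number),
        "temp": timeit.timeit(lambda: swap_temp(1, 2), number=number),
    }
```

Calling time_swaps() returns a dict of elapsed seconds for each variant. A gap between the two in a loop like this says very little about real code.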
So while an inline bytecode assembler might have some uses, I suspect it'd be an attractive nuisance more than anything else.
I honestly don't see it becoming an attractive nuisance.
I can easily see it just not getting used for anything at all, beyond people playing with the interpreter.
The "attractive nuisance" part is with microbenchmarking. With such a feature, code won't materially improve, and it'll be markedly worse in readability/maintainability and portability. (The latter probably doesn't matter all that much; a lot of people's code is already suboptimal on Pythons other than CPython, if only for lack of 'with' statements around files and such.)
What I did was put in a literal string… It uses "∅ is set()" as a marker … and the resulting function has an unnecessary const in it.
I assumed that leaving the unnecessary const behind was unacceptable. After all, we're talking about (hypothetical?) people who find the cost of LOAD_GLOBAL set; CALL_FUNCTION 0 to be unacceptable… But you're right that fixing up all the other LOAD_CONST bytecodes' args is a feasible way to solve that.
I'm not sure whether the problem is the cost of LOAD_GLOBAL followed by CALL_FUNCTION (and, by the way, one unnecessary constant in the function won't have anything like that cost - a bit of wasted RAM, but not a function call), or the fact that such a style is vulnerable to shadowing of the name 'set', which admittedly is a very useful name. But in any case, it's quite solvable.
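Both concerns can be seen in a small sketch. The function name empty and the namespace dict are made up for illustration. Only the presence of LOAD_GLOBAL is checked, since the call opcode has been renamed across CPython versions.

```python
import dis

def empty():
    # 'set' is looked up by name each time this runs, then called.
    return set()

opnames = [ins.opname for ins in dis.get_instructions(empty)]
uses_global_lookup = "LOAD_GLOBAL" in opnames

# Shadowing: the same source compiled in a namespace where 'set'
# has been rebound to something else.
namespace = {"set": list}
exec("def empty():\n    return set()", namespace)
shadowed_result = namespace["empty"]()  # a list, not a set
```

A literal compiled straight to BUILD_SET 0 would not go through the name lookup at all, so rebinding 'set' could not affect it.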
So, if the function is a closure, how do you do that?
Ah, that part I've no idea about. But it wouldn't be impossible for someone to develop that a bit further.
Not impossible, but very hard, much harder than what you've done so far.
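As a rough illustration of one piece of the closure problem (an assumption about which part is meant, not a reconstruction of the code under discussion): a function rebuilt from a code object must be handed closure cells that match the code's co_freevars. The names make_counter and rebuild are hypothetical.

```python
import types

def make_counter(start):
    def counter():
        # 'start' is a free variable, stored in a closure cell.
        return start + 1
    return counter

def rebuild(func, new_code=None):
    # Build a new function object around a (possibly modified) code object.
    # The closure tuple must have one cell per name in co_freevars,
    # otherwise FunctionType raises ValueError.
    code = func.__code__ if new_code is None else new_code
    return types.FunctionType(
        code,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )

original = make_counter(41)
clone = rebuild(original)
print(clone())
```

Reusing the existing cells is the easy case. Recompiling the inner function from source and still ending up with the right free variables and cells is where it gets hard.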
Ultimately, I think that just backs up your larger point: This is doable, but it's going to be a lot of work, and the benefit isn't even nearly worth the cost. My point is that there are other ways to do it that would be less work and/or that would have more side benefits… but the benefit still isn't even nearly worth the cost, so who cares? :)
Yep. Maybe someone (great, that probably means me) should write this up into a PEP for immediate rejection or withdrawal, just to be a document to point to - if you want an empty set literal, answer these objections.