On Tue, Jul 1, 2014 at 9:48 AM, Andrew Barnert firstname.lastname@example.org wrote:
First, two quick side notes:
It might be nice if the compiler were as easy to hook as the importer. Alternatively, it might be nice if there were a way to do "inline bytecode assembly" in CPython, similar to the way you do inline assembly in many C compilers, so the answer to random's question is just "asm [('BUILD_SET', 0)]" or something similar. Either of those would make this problem trivial.
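The importer side really is that easy to hook already. A minimal sketch (class name is my own invention) of a meta-path finder that just observes imports; a .pyu translator would instead return a module spec whose loader compiles the munged source:

```python
import sys

class TracingFinder:
    """Minimal meta-path hook: reports each import request, then
    defers to the normal finders.  A real source translator would
    return an importlib.machinery.ModuleSpec whose loader compiles
    the translated source instead of returning None."""
    def find_spec(self, name, path, target=None):
        print("import requested:", name)
        return None  # None means "not mine, keep looking"

sys.meta_path.insert(0, TracingFinder())
import json  # fires the hook unless json is already in sys.modules
sys.meta_path.pop(0)
```

There is no comparable hook point for the bytecode compiler itself, which is the asymmetry being complained about.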
That would be interesting, but it raises the possibility of mucking up the stack. (Imagine if you put BUILD_SET 1 in there instead. What's it going to make a set of? What's going to happen to the rest of the stack? Do you REALLY want to debug that?)
Back when I did a lot of C and C++ programming, I used to make good use of a "drop to assembly" feature. There were two broad areas where I'd use it: either to access a CPU feature that the compiler and library didn't offer me (like CPUID, in its early days), or to hand-optimize some code. Then compilers got better and better, and the first set of cases got replaced with library functions... and the second lot ended up being no better than the compiler's output, and potentially a lot worse - particularly because they're non-portable. Allowing a "drop to bytecode" in CPython would have the exact same effects, I think. Some people would use it to create an empty set, others would use it to replace variable swapping with a marginally faster and *almost* identical stack-based swap:
x, y = y, x   compiles to:

    LOAD_GLOBAL y
    LOAD_GLOBAL x
    ROT_TWO
    STORE_GLOBAL x
    STORE_GLOBAL y

while the hand-rolled stack-based swap would be:

    LOAD_GLOBAL x
    LOAD_GLOBAL y
    STORE_GLOBAL x
    STORE_GLOBAL y
Seems fine, right? But it's a subtle change to semantics (evaluation order), and not much benefit anyway. Plus, if it's decided that this semantic change is safe (if it's provably not going to have any significance), a future version of CPython would be able to make the exact same optimization, while leaving the code readable, and portable to other Python implementations.
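As it happens, CPython's compiler already does the tuple-elimination for a two-element swap, which is easy to verify with dis (the exact opcodes vary by version: a rotation on older interpreters, SWAP on 3.11+), so the hand-written version buys nothing:

```python
import dis

def swap():
    global x, y
    x, y = y, x

# No BUILD_TUPLE / UNPACK_SEQUENCE pair appears: the compiler
# rewrites the two-element swap into stack manipulation itself,
# while still loading y before x to preserve evaluation order.
dis.dis(swap)
ops = [ins.opname for ins in dis.get_instructions(swap)]
print("BUILD_TUPLE" in ops)  # False
```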
So while an inline bytecode assembler might have some uses, I suspect it'd be an attractive nuisance more than anything else.
On Monday, June 30, 2014 3:12 PM, Chris Angelico email@example.com wrote:
On Tue, Jul 1, 2014 at 3:18 AM, firstname.lastname@example.org wrote:
On Sat, Jun 28, 2014, at 01:28, Chris Angelico wrote:
empty_set_literal = type(lambda:0)(type((lambda:0).__code__)(0,0,0,3,67,b't\x00\x00d\x01\x00h\x00\x00\x83\x02\x00\x01d\x00\x00S',(None,"I'm
I think it makes more sense to use types.FunctionType and types.CodeType here than to generate two extra functions for each function, even if that means you have to put an import types at the top of every munged source file.
Sure. This is just a proof-of-concept anyway, and it's not meant to be good code. Either way works, I just tried to minimize name usage (and potential name collisions).
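For the record, the types-module spelling is short enough. A sketch of cloning a function through the public names rather than through throwaway lambdas:

```python
import types

def f():
    return 42

# types.FunctionType and types.CodeType are the very same objects
# that type(lambda: 0) and type((lambda: 0).__code__) evaluate to;
# the named versions just read better in generated source.
assert types.FunctionType is type(lambda: 0)
assert types.CodeType is type((lambda: 0).__code__)

clone = types.FunctionType(f.__code__, f.__globals__, f.__name__,
                           f.__defaults__, f.__closure__)
print(clone())  # 42
```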
But I think what he was suggesting is something like this: Let py_compile.compile generate the .pyc file as normal, then munge the bytecode in that file, instead of compiling each function, munging its bytecode, and emitting source that creates the munged functions.
Besides being a lot less work, his version works for ∅ at top level, in class definitions, in lambda expressions, etc., not just for def statements. And it doesn't require finding and identifying all of the things to munge in a source file (which I assume you'd do bottom-up based on the ast.parse tree or something).
Sure. But all I was doing was responding to the implied statement that it's not possible to write a .py file that makes a function with BUILD_SET 0 in it. Translating a .pyu directly into a .pyc is still possible, but was not the proposal.
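A .pyu-to-.pyc translator would start from something like this sketch (the 16-byte header assumes CPython 3.7+, earlier versions used shorter headers, and the file names here are made up):

```python
import marshal, os, py_compile, tempfile

# Compile a source file to a .pyc, then unmarshal the module's code
# object so its bytecode (and nested code objects in co_consts)
# could be munged before writing it back out.
srcdir = tempfile.mkdtemp()
src = os.path.join(srcdir, "demo.py")
with open(src, "w") as fp:
    fp.write("EMPTY = set()\n")

pyc = py_compile.compile(src, doraise=True)
with open(pyc, "rb") as fp:
    header = fp.read(16)            # magic, flags, mtime, size (3.7+)
    code = marshal.loads(fp.read()) # the module's top-level code object

print(code.co_filename)
# After munging: open(pyc, "wb") and write header + marshal.dumps(new_code)
```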
But either way, this still doesn't solve the big problem. Compiling a function by hand and then tweaking the bytecode is easy; doing it programmatically is more painful. You obviously need the function to compile, so you have to replace the ∅ with something else whose bytecode you can search-and-replace. But what? That something else has to be valid in an expression context (so it compiles), has to compile to a 3-byte opcode (otherwise, replacing it will screw up any jump targets that point after it), can't add any globals/constants/etc. to the list (otherwise, removing it will screw up any LOAD_FOO statements that refer to a higher-numbered foo), and can't appear anywhere in the code being compiled.
What I did was put in a literal string.
It uses "∅ is set()" as a marker, depending on that string not existing anywhere in the source. (I could compile the function twice: once with that string, and a second time with a different one; the first compilation would show which consts the function uses, and the program could then generate an arbitrary marker that doesn't collide.) The opcode is the right length (assuming it doesn't need EXTENDED_ARG, which I've never used; it seems to be required only when a function has more than 65,535 consts/globals/locals), and the resulting function carries one unnecessary const. It wouldn't be hard to drop it (the code already parses through everything, so for each LOAD_CONST there are three options: if it's the marker, switch in a BUILD_SET; if its index is below the marker's, leave it alone; if it's above, decrement it), but an extra const in there doesn't seem to cause any problem.
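On today's CPython the same marker trick is even simpler, because since 3.6 every instruction is a fixed two bytes, so swapping a LOAD_CONST for BUILD_SET 0 can't disturb jump targets at all. A sketch (needs 3.8+ for code.replace; bytecode details are version-specific, so treat this as illustrative rather than robust):

```python
import dis, types

def f():
    s = "<<EMPTY SET MARKER>>"   # placeholder literal, assumed unique
    return s

code = f.__code__
idx = code.co_consts.index("<<EMPTY SET MARKER>>")
raw = bytearray(code.co_code)
for i in range(0, len(raw), 2):          # 3.6+: fixed 2-byte instructions
    if raw[i] == dis.opmap["LOAD_CONST"] and raw[i + 1] == idx:
        raw[i] = dis.opmap["BUILD_SET"]  # same length, jumps unaffected
        raw[i + 1] = 0                   # BUILD_SET 0 -> empty set
g = types.FunctionType(code.replace(co_code=bytes(raw)), f.__globals__)
print(g())  # set()
```

The marker const is left in co_consts here, exactly as described above; nothing refers to it any more, so it's just dead weight.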
One more thing that I'm sure you thought of, but may not have thought through all the way: To make this generally useful, you can't just hardcode creating a zero-arg top-level function; you need to copy all of the code and function constructor arguments from the compiled function.
It handles arguments and stuff. All the attributes of the original function object get passed through unchanged to the resulting function, with the exception of the bytecode, obviously.
So, if the function is a closure, how do you do that? You need to pass a list of closure cell objects that bind to the appropriate co_cellvars from the current frame, and I don't think there's a way to do that from Python. So, you need to do that by bytecode-hacking the outer function in the same way, just so it can build the inner function. And, even if you could build closure cells, once you've replaced the inner function definition with a function constructor from bytecode, when the resulting code gets compiled, it won't have any cellvars anymore.
Ah, that part I've no idea about. But it wouldn't be impossible for someone to develop that a bit further.
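For the archives: this did get developed further, though in the interpreter rather than in user code. Since CPython 3.8, types.CellType can construct closure cells from pure Python, so the closure argument to FunctionType can be filled in by hand; at the time of this thread it genuinely couldn't be done without C-level help. A sketch:

```python
import types

def make_adder(n):
    def add(x):
        return x + n
    return add

adder = make_adder(10)
print(adder.__closure__)   # a 1-tuple containing the cell holding n

# Build an equivalent closure by hand: a fresh cell holding 99,
# bound to the same compiled code object (which has n as a freevar).
cell = types.CellType(99)
rebuilt = types.FunctionType(adder.__code__, adder.__globals__,
                             "rebuilt", None, (cell,))
print(rebuilt(1))  # 100
```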
And going back to the top: all of these problems are why I think random's solution would be a lot easier than yours, why my solution (first build compiler hooks or inline assembly, then use that to implement the empty set trivially) would be no harder than either and a lot more generally useful, and also why I think this really isn't worth doing.
Right. I absolutely agree with your conclusion (not worth doing), and always have had that view. This is proof that it's kinda possible, but still a bad idea. Now, if someone comes up with a really compelling use-case for an empty set literal, then maybe it'd be more important; but if that happens, CPython will probably grow an empty set literal in ASCII somehow, and then the .pyu translation can just turn ∅ into that.