Re: [Python-Dev] Parrot -- should life imitate satire?
Hi.
>>> The point is to put the commonly called things in the vtable in a way that you can avoid as much conditional code as possible, while less common things get dealt with in ways you'd generally expect. (Dynamic lookups with caching and suchlike things)
>> If I'm right, you're designing an object-based VM.
> More or less, yep, with a semi-generic two-arg dispatch for the binary methods. The vtable associated with each variable has a fixed set of required functions--add with second arg an int, add with second arg a bigint, add with second arg a generic variable, for example. Basically all the things one would commonly do on data (add, subtract, multiply, divide, modulus, some string ops, get integer/real/string representation, copy, destroy, etc) have entries, with a catchall "generic method call" to handle, well, generic method calls.

A question: when you say variable, do you mean variable (in the Perl sense) or object? It has already been pointed out, but it's really confusing from the point of view of Python terminology. Will Perl 6 have only variables which contain objects, truly references to objects like Python, or ...?

I should repeat that your explanation that assignment is somehow performed by calling a method on the "variable" is quite a strange notion in general. I can imagine having a slot called on assignment that eventually does a copy or returns just the object, but assignment as an operation on the lvalue is something very peculiar. I know that Perl 5 assignment is an operator returning an lvalue; is this related?
> It's all designed with high-speed dispatch and minimal conditional branch requirements in mind, as well as encapsulating all the "what do I do on data" functions. Basically the opcodes generally handle control flow, register/stack ops, and VM management, while actual operations on variables are left to the vtable methods attached to the variables.
> I expect things to get mildly incestuous for speed reasons, but I'm OK with that. :)
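For illustration, here is a toy Python sketch of the fixed-slot vtable design described above. All names here are hypothetical, not Parrot's actual structures: the point is that the dispatcher picks a type-specialized slot once, so the slot bodies themselves need no conditional code.

```python
# Illustrative sketch only: a per-variable vtable with fixed slots for
# common operations, specialized by the type of the second argument,
# plus a catch-all "generic method call" slot.

class VTable:
    def __init__(self, add_int, add_bigint, add_generic, generic_call):
        self.add_int = add_int            # add where arg 2 is a native int
        self.add_bigint = add_bigint      # add where arg 2 is a bigint
        self.add_generic = add_generic    # add where arg 2 is any variable
        self.generic_call = generic_call  # catch-all generic method call

class Var:
    def __init__(self, vtable, value):
        self.vtable = vtable
        self.value = value

# vtable for plain integer variables: no type checks inside the fast
# paths, because the dispatcher has already picked the right slot
INT_VTABLE = VTable(
    add_int=lambda v, n: Var(INT_VTABLE, v.value + n),
    add_bigint=lambda v, n: Var(INT_VTABLE, v.value + n),
    add_generic=lambda v, w: Var(INT_VTABLE, v.value + w.value),
    generic_call=lambda v, name, *args: getattr(v.value, name)(*args),
)

def op_add(a, b):
    # The opcode only routes; the variable's vtable does the work.
    if isinstance(b, int):
        return a.vtable.add_int(a, b)
    return a.vtable.add_generic(a, b)

x = Var(INT_VTABLE, 40)
print(op_add(x, 2).value)                    # 42
print(op_add(x, Var(INT_VTABLE, 5)).value)   # 45
```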
>> I don't know how typical OO programming is in Perl, but in Python it plays a central role. If your long-run goal is to compile to native code, you should have a "hard-wired" concept of classes and the like,
> Yep. There's a fast "get name of object's class" and "get pointer to object's class stash/variable table/methods/subs/<insert generic class terminology here>".
>> because an operation like getting the class of an instance should be as direct and fast as possible if you want to use any of the "nice" optimizations for a VM for an OO dynamic language: inline polymorphic caches
> Yup. I'm leaning towards a class-based cache for inherited methods and suchlike things, but I'm not sure that's sufficient if we're going to have objects whose inheritance tree is handled on a per-object basis.

That makes sense for the interpreted version, but for speed, call-site caches are far more promising when compiling natively, though also more complicated. I imagine you already know e.g. the Self project literature.
>> customization

> Details? I'm not sure what you're talking about here.

Compiling different versions of the same method, for example with respect to the receiver type in the single-dispatch case. See below.

>> or if/when possible: direct inlining.
> Yep, though the potential to mess with a class' methods at runtime tends to shoot this one down. (Perl's got similar issues with this, plus a few extra real nasty optimizer killers) I'm pondering a "check_if_changed" branch opcode that'll check a method/function/sub's definition to see if it's changed since the code was generated, and do the inline stuff if it hasn't. Though that still limits what you can do with code motion and other optimizations.
>> If any of the high-level OO systems cannot be used, you have to choose a lower-level one and map these to it; a vtable bytecode mapping model is too underspecified.
> Oh, sure, saying "vtable bytecode mapping model" is an awful lot like saying "procedural language"--it's informative without actually being useful... :)
I imagine you already know: especially when compiling to native code, it is all a matter of optimizing for the common case and being prepared to be quite slow otherwise, especially when dealing with dynamic changes. In both Perl and Python there is no clear distinction between normal ops and dynamic changes to methods... but in any case they are rare compared to the rest.

Threading adds complexity to the problem.

For the rest it is just speed vs. memory; customization is a clear example of that.

Regards,
Samuele Pedroni
At 04:32 PM 8/2/2001 +0200, Samuele Pedroni wrote:
> Hi.
>>>> The point is to put the commonly called things in the vtable in a way that you can avoid as much conditional code as possible, while less common things get dealt with in ways you'd generally expect. (Dynamic lookups with caching and suchlike things)
>>> If I'm right, you're designing an object-based VM.
>> More or less, yep, with a semi-generic two-arg dispatch for the binary methods. The vtable associated with each variable has a fixed set of required functions--add with second arg an int, add with second arg a bigint, add with second arg a generic variable, for example. Basically all the things one would commonly do on data (add, subtract, multiply, divide, modulus, some string ops, get integer/real/string representation, copy, destroy, etc) have entries, with a catchall "generic method call" to handle, well, generic method calls.

> A question: when you say variable, do you mean variable (in the Perl sense) or object? It has already been pointed out, but it's really confusing from the point of view of Python terminology. Will Perl 6 have only variables which contain objects, truly references to objects like Python, or ...?
Well, it sort of looks like:

   name--->variable--->value

The name is a symbol table entry that points to the variable. The variable is a data structure that has the vtable info, some pointer data, some flags, and general whatnots. (The docs refer to this as a PMC, or Parrot Magic Cookie, FWIW. Which isn't much) The value is the actual contents.

We also have to track properties on both a variable and a value basis. For example:

   declare c, b
   declare a is const = 0 but true
   b = a
   c = \a
   dereference(c) = 1

In this case, the variable represented by a has the property 'const', and a value of 0 which has the property 'true'. The assignment to b copies the value, and the properties on the value (true in this case), but not the properties on the variable, so b isn't const. c is a reference to the variable a represents, so the attempt to assign to the dereference of c will fail, since the variable c points to has the const property.

We can't depend on the properties to be attached to the *name* a, since (in my crude attempt to use python-style blocking) the name a has gone out of scope by the time we dereference c, and so would any properties attached to the name as such.

I think the variable here would correspond to a Python object, but I'll need to go digging into Python semantics more to be sure. Currently there's not a lot planned around the actual symbol table entry, since it's not required for variable existence, and there's not much I can think of to attach to it anyway. (References mean we can't use it for much, really)

My bet, and this is just an off-hand thing, is that the parser would spit out bytecode tying the symbol table entry and the PMC structure more closely than it might for other languages. At least do a lot more symbolic fetch/store ops, if need be, which we might not.
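The property rules in that example can be sketched in Python. This is a toy model of the name/variable/value split, not Parrot's real structures; the class names and the assign slot are illustrative assumptions.

```python
# Hypothetical sketch of the name ---> variable ---> value split, with
# properties tracked separately on the variable and on the value.

class Value:
    def __init__(self, data, props=()):
        self.data = data
        self.props = set(props)        # properties on the value, e.g. 'true'

class Variable:                        # the "PMC" in the discussion above
    def __init__(self, value, props=()):
        self.value = value
        self.props = set(props)        # properties on the variable, e.g. 'const'

    def assign(self, other):
        if 'const' in self.props:
            raise TypeError("assignment to const variable")
        # copy the value and the value's properties,
        # but NOT the source variable's properties
        self.value = Value(other.value.data, other.value.props)

# declare a is const = 0 but true
a = Variable(Value(0, props={'true'}), props={'const'})
b = Variable(Value(None))
b.assign(a)        # b gets the value 0 and its 'true' property, but not 'const'
c = a              # c = \a : a second handle on the same variable
assert 'true' in b.value.props and 'const' not in b.props
try:
    c.assign(Variable(Value(1)))       # dereference(c) = 1
except TypeError as e:
    print(e)       # fails: the variable c points to is const
```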
> I should repeat that your explanation that assignment is somehow performed by calling a method on the "variable" is quite a strange notion in general. I can imagine having a slot called on assignment that eventually does a copy or returns just the object, but assignment as an operation on the lvalue is something very peculiar. I know that Perl 5 assignment is an operator returning an lvalue; is this related?
I don't think so, but I'm betting we're talking terminologically cross-wise. Assume, for example, the following pseudoish code:

   declare kitchen isa Thermostat
   kitchen = 20

In this case the kitchen variable is active, and assigning to it changes the temperature the thermostat in the kitchen is set to. The magic properties aren't attached to the name per se, at least in perl--it's just a handy way to get the PMC that underlies the thing. I can get an indirect handle on the data that the name 'kitchen' refers to and update it that way, or if I have that handle the 'kitchen' name can go completely out of scope and the only way to refer to the object representing (and allowing control over) my kitchen temperature is via the handle I've got on what's now an anonymous object. (Or, if I'm feeling really wacky, have two or more names that look like separate things but actually all point to the same PMC structure)

I think it's the potential existence of references (or pointers, or whatever you want to call them) that necessitate the three-level split. It sounds like the things I'm calling 'name' and 'variable' are much more tightly bound in Python.

There are several reasons to call a PMC assignment operator, rather than just unconditionally slam a new PMC into the name slot:

1) In the kitchen example above, assigning 20 to kitchen doesn't suddenly make kitchen an integer variable--it needs to stay a Thermostat, and just do something because we assigned to it. (And I understand this might not be the way Python would do things, in which case Python variable vtables might just have an "overwrite myself with the incoming info" function, rather than allow it to be overridden)

2) Since we can't really assign properties like constant-ness to a name (in which case we could get a reference to a variable and bypass the properties on the name by referring to it by reference) that needs to be enforced in other ways. In this case with a custom assignment vtable function which pitches a fit if you try to alter the variable.

3) Some other thing or other I apparently can't think of. It's a really good reason, though.
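Reason 1 can be made concrete with a small Python sketch: assignment is dispatched through the variable's own slot, so an active variable like the Thermostat intercepts it instead of being overwritten. The class names and the assign method are hypothetical illustrations, not real Parrot code.

```python
# Sketch: assignment as a vtable method on the variable (the "PMC"),
# so an active variable can do something when assigned to.

class PMC:
    def assign(self, value):
        # default behavior: just overwrite yourself with the incoming info
        self.value = value

class Thermostat(PMC):
    def __init__(self):
        self.setting = None

    def assign(self, value):
        # 'kitchen = 20' doesn't turn kitchen into an integer variable:
        # it stays a Thermostat and changes the temperature setting
        self.setting = value
        print(f"thermostat set to {value}")

kitchen = Thermostat()
kitchen.assign(20)       # what 'kitchen = 20' would compile down to
assert isinstance(kitchen, Thermostat) and kitchen.setting == 20
```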
>>> because an operation like getting the class of an instance should be as direct and fast as possible if you want to use any of the "nice" optimizations for a VM for an OO dynamic language: inline polymorphic caches
>> Yup. I'm leaning towards a class-based cache for inherited methods and suchlike things, but I'm not sure that's sufficient if we're going to have objects whose inheritance tree is handled on a per-object basis.

> That makes sense for the interpreted version, but for speed, call-site caches are far more promising when compiling natively, though also more complicated. I imagine you already know e.g. the Self project literature.
Self I've not dug into. Despite all the "I haven't seen that" that keeps popping up in my conversation, I *have* done a lot of research. (The pile of digested books at one point outweighed my kids. They've grown, but the new research pile's about to grow too...)
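One way the class-based cache mentioned above could work, sketched in Python under stated assumptions: inherited lookups walk the parent chain once and are then cached per class, and redefining a method bumps a version counter and flushes that class's cache. Everything here is hypothetical, not Parrot's design.

```python
# Sketch: a per-class method cache for inherited methods, invalidated
# when the class's definitions change.

class Klass:
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent
        self.methods = {}
        self.cache = {}       # method name -> resolved function
        self.version = 0

    def define(self, name, fn):
        self.methods[name] = fn
        self.version += 1     # definitions changed...
        self.cache.clear()    # ...so cached lookups are stale
        # (a real VM must also invalidate subclass caches here)

    def lookup(self, name):
        if name in self.cache:
            return self.cache[name]        # fast path: cached lookup
        k = self
        while k is not None:               # slow path: walk the parents
            if name in k.methods:
                self.cache[name] = k.methods[name]
                return self.cache[name]
            k = k.parent
        raise AttributeError(name)

Animal = Klass("Animal")
Animal.define("speak", lambda self: "...")
Dog = Klass("Dog", parent=Animal)
print(Dog.lookup("speak") is Animal.methods["speak"])   # True, now cached
Dog.define("speak", lambda self: "woof")                # cache flushed
print(Dog.lookup("speak") is Dog.methods["speak"])      # True
```

A call-site (inline) cache would instead store the last receiver class and resolved method at each call instruction, which is why it pays off most when compiling natively.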
>>> customization

>> Details? I'm not sure what you're talking about here.

> Compiling different versions of the same method, for example with respect to the receiver type in the single-dispatch case. See below.
Ah, aggressive inlining of a sort. I think. Yep, on the list of things to do with the optimizer. (Not on the first version, probably, but certainly later)
>>> or if/when possible: direct inlining.
>> Yep, though the potential to mess with a class' methods at runtime tends to shoot this one down. (Perl's got similar issues with this, plus a few extra real nasty optimizer killers) I'm pondering a "check_if_changed" branch opcode that'll check a method/function/sub's definition to see if it's changed since the code was generated, and do the inline stuff if it hasn't. Though that still limits what you can do with code motion and other optimizations.
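The "check_if_changed" idea can be sketched as a guard on a version stamp: code compiled with an inlined copy of a sub first checks whether the definition has changed since compile time, and falls back to a normal dynamic call if it has. This is an illustrative Python model, not the actual opcode.

```python
# Sketch: guard an inlined sub body with a version check, so runtime
# redefinition of the sub safely disables the inlined copy.

class Sub:
    def __init__(self, fn):
        self.fn, self.version = fn, 0

    def redefine(self, fn):
        self.fn, self.version = fn, self.version + 1

def compile_call(sub):
    seen = sub.version        # version stamp taken at "compile" time
    inlined = sub.fn          # the body we inlined into the caller
    def call(x):
        if sub.version == seen:   # the check_if_changed branch
            return inlined(x)     # fast path: use the inlined copy
        return sub.fn(x)          # definition changed: dynamic call
    return call

double = Sub(lambda x: x * 2)
f = compile_call(double)
print(f(21))                      # 42, via the inlined copy
double.redefine(lambda x: x * 3)
print(f(21))                      # 63, guard failed, dynamic dispatch
```

As Dan notes, the guard keeps the inlined call correct, but its mere presence blocks code motion past it.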
>>> If any of the high-level OO systems cannot be used, you have to choose a lower-level one and map these to it; a vtable bytecode mapping model is too underspecified.
>> Oh, sure, saying "vtable bytecode mapping model" is an awful lot like saying "procedural language"--it's informative without actually being useful... :)
> I imagine you already know: especially when compiling to native code, it is all a matter of optimizing for the common case and being prepared to be quite slow otherwise, especially when dealing with dynamic changes. In both Perl and Python there is no clear distinction between normal ops and dynamic changes to methods... but in any case they are rare compared to the rest.
Yup. I'm profoundly tempted (and I think it will go in) to have a "stable code" switch that can be given to the optimizer, guaranteeing that once the compilation phase is done and we finally get to run for real, code will not change. And then we can apply all the advanced optimization techniques at that point.
> Threading adds complexity to the problem.
Nah, not too badly. A little synchronization issue, but that's reasonably trivial.
> For the rest it is just speed vs. memory; customization is a clear example of that.
Sure. The one nice thing that both perl and python have going for them is the ultimate presence of all the source (or a high-enough level representation of it to feed to the optimizer) at run time. We can tell with more surety what side-effects various methods and functions in external code libraries will have, and we can get more aggressive (maybe) because of it. Unlike, say, C, where passing a pointer to a routine outside what the compiler can see will hobble the optimizer a whole lot in many cases...

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk