[Python-Dev] Parrot -- should life imitate satire?
Dan Sugalski
dan@sidhe.org
Thu, 02 Aug 2001 17:54:46 -0400
At 04:32 PM 8/2/2001 +0200, Samuele Pedroni wrote:
>Hi.
>
> > > >
> > > > The point is to put the commonly called things in the vtable in a
> way that
> > > > you can avoid as much conditional code as possible, while less common
> > > > things get dealt with in ways you'd generally expect. (Dynamic lookups
> > > with
> > > > caching and suchlike things)
> > >
> > >If I'm right, you're designing an object-based VM.
> >
> > More or less, yep, with a semi-generic two-arg dispatch for the binary
> > methods. The vtable associated with each variable has a fixed set of
> > required functions--add with 2nd arg an int, add with second arg a bigint,
> > add with second arg a generic variable, for example. Basically all the
> > things one would commonly do on data (add, subtract, multiply, divide,
> > modulus, some string ops, get integer/real/string representation, copy,
> > destroy, etc) have entries, with a catchall "generic method call" to
> > handle, well, generic method calls.
>A question: when you say variable, do you mean variable (in the Perl sense)
>or object? It has already been pointed out, but it's really confusing
>from the point of view of Python terminology. Will perl6 have only
>variables which contain objects, or true references to objects like Python
>has, or ...?
Well, it sort of looks like:
name--->variable--->value
The name is a symbol table entry that points to the variable. The variable
is a data structure that has the vtable info, some pointer data, some
flags, and general whatnots. (The docs refer to this as a PMC, or Parrot
Magic Cookie, FWIW. Which isn't much) The value is the actual contents.
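In rough Python terms (and the class and field names here are just mine for
illustration, not anything Parrot actually defines), the chain looks
something like:

class Value:
    """The actual contents, plus any properties attached to the value."""
    def __init__(self, data, properties=None):
        self.data = data
        self.properties = dict(properties or {})

class Variable:
    """The PMC-ish middle layer: vtable, flags, a pointer to the value,
    and properties attached to the variable itself."""
    def __init__(self, vtable, value, properties=None):
        self.vtable = vtable
        self.value = value
        self.flags = 0
        self.properties = dict(properties or {})

# The name is just a symbol table entry pointing at a Variable.
symtab = {}
symtab['a'] = Variable(vtable={}, value=Value(0))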
We also have to track properties on both a variable and a value basis. For
example:
declare c, b
declare a is const = 0 but true
b = a
c = \a
dereference(c) = 1
In this case, the variable represented by a has the property 'const', and a
value of 0 which has the property 'true'. The assignment to b copies the
value, and the properties on the value (true in this case), but not the
properties on the variable, so b isn't const. c is a reference to the
variable a represents, so the attempt to assign to the dereference of c
will fail, since the variable c points to has the const property. We can't
depend on the properties to be attached to the *name* a, since (in my crude
attempt to use python-style blocking) the name a has gone out of scope by
the time we dereference c, and so would any properties attached to the name
as such.
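Continuing that sketch (still just my own illustrative Python, not real
Parrot code), the a/b/c example works out roughly like this:

def assign_value(target, source):
    """b = a: copy the value and the value's properties, but not the
    variable's properties, refusing if the target variable is const."""
    if target.properties.get('const'):
        raise TypeError("can't assign to a const variable")
    target.value = Value(source.value.data, source.value.properties)

a = Variable(vtable={}, value=Value(0, {'true': True}),
             properties={'const': True})
b = Variable(vtable={}, value=Value(None))
assign_value(b, a)   # b gets 0 and the 'true' property, but isn't const
c = a                # c = \a: another handle on the same Variable
assign_value(c, Variable(vtable={}, value=Value(1)))  # raises: c's target is const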
I think the variable here would correspond to a Python object, but I'll
need to go digging into Python semantics more to be sure. Currently there's
not a lot planned around the actual symbol table entry, since it's not
required for variable existence, and there's not much I can think of to
attach to it anyway. (References mean we can't use it for much, really)
My bet, and this is just an off-hand thing, is that the parser would spit
out bytecode tying the symbol table entry and the PMC structure more
closely than it might for other languages. At the least it would emit a lot
more symbolic fetch/store ops, if that turns out to be needed, which it might not.
>I should repeat that your explanation that assignment is somehow performed
>by calling a method on the "variable" is quite a strange notion in general.
>I can imagine having a slot called on assignment that eventually does a copy
>or just returns the object, but assignment as an operation on the lvalue
>is something very peculiar. I know that perl5 assignment is an operator
>returning an lvalue; is this related?
I don't think so, but I'm betting we're talking terminologically
cross-wise. Assume, for example, the following pseudoish code:
declare kitchen isa Thermostat
kitchen = 20
In this case the kitchen variable is active, and assigning to it changes
the temperature the thermostat in the kitchen is set to.
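In the same illustrative Python (the slot name and set_setpoint are mine,
standing in for whatever would actually talk to the hardware), assignment
dispatches through the variable's vtable, so a Thermostat variable gets to
intercept it:

def set_setpoint(n):
    """Stand-in for the code that actually talks to the thermostat."""
    print("setting the kitchen thermostat to", n)

def thermostat_set_integer(var, n):
    """Assigning an integer to a Thermostat variable changes the setpoint."""
    set_setpoint(n)
    var.value = Value(n)

thermostat_vtable = {'set_integer': thermostat_set_integer}
kitchen = Variable(thermostat_vtable, Value(18))

# "kitchen = 20" compiles down to something along the lines of:
kitchen.vtable['set_integer'](kitchen, 20)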
The magic properties aren't attached to the name per se, at least in
perl--it's just a handy way to get the PMC that underlies the thing. I can
get an indirect handle on the data that the name 'kitchen' refers to and
update it that way, or if I have that handle the 'kitchen' name can go
completely out of scope and the only way to refer to the object
representing (and allowing control over) my kitchen temperature is via the
handle I've got on what's now an anonymous object. (Or, if I'm feeling
really wacky, have two or more names that look like separate things but
actually all point to the same PMC structure)
I think it's the potential existence of references (or pointers, or
whatever you want to call them) that necessitates the three-level split. It
sounds like the things I'm calling 'name' and 'variable' are much more
tightly bound in Python.
There are several reasons to call a PMC assignment operator, rather than
just unconditionally slamming a new PMC into the name slot:
1) In the kitchen example above, assigning 20 to kitchen doesn't suddenly
make kitchen an integer variable--it needs to stay a Thermostat, and just
do something because we assigned to it. (And I understand this might not be
the way Python would do things, in which case Python variable vtables might
just have an "overwrite myself with the incoming info" function, rather
than allow it to be overridden)
2) Since we can't really attach properties like constant-ness to a name (if
we could, you could still get a reference to the variable and bypass the
properties on the name by referring to it through the reference), they need
to be enforced in other ways--in this case with a custom assignment vtable
function which pitches a fit if you try to alter the variable. (There's a
sketch of that after this list.)
3) Some other thing or other I apparently can't think of. It's a really
good reason, though.
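Sticking with the same Python sketch, the const case in (2) is just a
different vtable entry, rather than conditional code at every assignment
site:

def plain_set_integer(var, n):
    var.value = Value(n)    # the ordinary "overwrite myself" behaviour

def const_set_integer(var, n):
    raise TypeError("can't modify a const variable")   # pitch a fit

plain_vtable = {'set_integer': plain_set_integer}
const_vtable = {'set_integer': const_set_integer}

b = Variable(plain_vtable, Value(0))
a = Variable(const_vtable, Value(0), properties={'const': True})
b.vtable['set_integer'](b, 5)   # fine
a.vtable['set_integer'](a, 5)   # raises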
> > > because an operation
> > >like getting the class of an instance should be as direct and fast as
>possible
> > >if you want to use any of the "nice" optimizations for a VM for an OO
> dynamic
> > >language:
> > > inline polymorphic caches
> >
> > Yup. I'm leaning towards a class based cache for inherited methods and
> > suchlike things, but I'm not sure that's sufficient if we're going to have
> > objects whose inheritance tree is handled on a per-object basis.
>Makes sense for the interpreted version, but for speed, call-site caches
>are far more promising when compiling to native code, though they're also
>more complicated. I imagine you already know e.g. the Self project
>literature.
Self I've not dug into. Despite all the "I haven't seen that" that keeps
popping up in my conversation, I *have* done a lot of research. (The pile
of digested books at one point outweighed my kids. They've grown, but the
new research pile's about to grow too...)
> > > customization
> >
> > Details? I'm not sure what you're talking about here.
>Compiling different versions of the same method, for example with respect
>to the receiver type in the single-dispatch case. See below
Ah, aggressive inlining of a sort. I think. Yep, on the list of things to
do with the optimizer. (Not on the first version, probably, but certainly
later)
> > > or if/when possible: direct inlining.
> >
> > Yep, though the potential to mess with a class' methods at runtime
> tends to
> > shoot this one down. (Perl's got similar issues with this, plus a few
> extra
> > real nasty optimizer killers) I'm pondering a "check_if_changed" branch
> > opcode that'll check a method/function/sub's definition to see if it's
> > changed since the code was generated, and do the inline stuff if it
> hasn't.
> > Though that still limits what you can do with code motion and other
> > optimizations.
> >
> > >If any of the high-level OO systems cannot be used, you have to choose
> > >a lower-level one and map them to it;
> > >a vtable bytecode mapping model is too underspecified.
> >
> > Oh, sure, saying "vtable bytecode mapping model" is an awful lot like
> > saying "procedural language"--it's informative without actually being
> > useful... :)
>
>I imagine you already know: especially when compiling to native code,
>it is all a matter of optimizing for the common case and being prepared
>to be quite slow otherwise, especially when dealing with dynamic changes.
>In both Perl and Python there is no clear distinction between normal ops
>and dynamic changes to methods... but in any case they are rare compared
>to the rest.
Yup. I am profoundly tempted, and I think it will go in, to have a "stable
code" switch that can be given to the optimizer which guarantees that, once
the compilation phase is done and we finally get to run for real, code will
not change. And then apply all the advanced optimization techniques at that
point.
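To make the check_if_changed idea above a bit more concrete, here's a rough
Python model (the names are mine, not actual Parrot opcodes): each sub
carries a version stamp, redefining it bumps the stamp, and the generated
fast path is guarded by a comparison against the stamp it was compiled for.

class Sub:
    """A function-ish thing whose definition can be replaced at runtime."""
    def __init__(self, code):
        self.code = code
        self.version = 0

    def redefine(self, code):
        self.code = code
        self.version += 1

def call_with_guard(sub, compiled_for_version, inlined_body, *args):
    """Run the inlined code only if the sub hasn't changed since we
    generated it; otherwise fall back to the current definition."""
    if sub.version == compiled_for_version:
        return inlined_body(*args)
    return sub.code(*args)

square = Sub(lambda x: x * x)
call_with_guard(square, 0, lambda x: x * x, 4)   # guard passes: fast path
square.redefine(lambda x: x + 1)
call_with_guard(square, 0, lambda x: x * x, 4)   # stamp mismatch: slow path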
>Threading adds complexity to the problem.
Nah, not too badly. A little synchronization issue, but that's reasonably
trivial.
>For the rest it's just speed vs. memory; customization is a clear example
>of that.
Sure. The one nice thing that both perl and python have going for them is
the ultimate presence of all the source (or a high-enough level
representation of it to feed to the optimizer) at run time. We can tell
with more surety what side-effects various methods and functions in
external code libraries will have, and we can get more aggressive (maybe)
because of it. Unlike, say, C, where passing a pointer to a routine outside
what the compiler can see will hobble the optimizer a whole lot in many
cases...
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
dan@sidhe.org have teddy bears and even
teddy bears get drunk