[Python-Dev] Parrot -- should life imitate satire?

Thu, 02 Aug 2001 17:54:46 -0400

At 04:32 PM 8/2/2001 +0200, Samuele Pedroni wrote:
>Hi.
>
> > > >
> > > > The point is to put the commonly called things in the vtable in a way
> > > > that you can avoid as much conditional code as possible, while less
> > > > common things get dealt with in ways you'd generally expect. (Dynamic
> > > > lookups with caching and suchlike things)
> > >
> > >If I'm right, you're designing an object-based VM.
> >
> > More or less, yep, with a semi-generic two-arg dispatch for the binary
> > methods. The vtable associated with each variable has a fixed set of
> > required functions--add with 2nd arg an int, add with second arg a bigint,
> > add with second arg a generic variable, for example. Basically all the
> > things one would commonly do on data (add, subtract, multiply, divide,
> > modulus, some string ops, get integer/real/string representation, copy,
> > destroy, etc) have entries, with a catchall "generic method call" to
> > handle, well, generic method calls.
>A question: when you say variable, do you mean variable (in the Perl sense)
>or object? This has already been pointed out, but it's really confusing
>from the point of view of Python terminology. Will perl6 have only
>variables which contain objects, truly references to objects like Python
>or ...?

Well, it sort of looks like:

   name--->variable--->value

The name is a symbol table entry that points to the variable. The variable 
is a data structure that has the vtable info, some pointer data, some 
flags, and general whatnots. (The docs refer to this as a PMC, or Parrot 
Magic Cookie, FWIW. Which isn't much) The value is the actual contents.
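
A minimal sketch of that three-level split, written in Python purely for illustration. Every name in it (PMC, IntVtable, add_int, add_generic, get_integer, symbols) is hypothetical and not taken from the Parrot docs; it only shows a symbol table entry pointing at a variable structure that carries a vtable, flags, and the value.

```python
# Hypothetical sketch: none of these names come from the Parrot docs.

class IntVtable:
    """Fixed table of operations for integer-valued variables."""

    @staticmethod
    def add_int(pmc, other):
        # Fast path: the second argument is already a native int.
        return pmc.value + other

    @staticmethod
    def add_generic(pmc, other):
        # Slower path: ask the other variable for its integer representation.
        return pmc.value + other.vtable.get_integer(other)

    @staticmethod
    def get_integer(pmc):
        return int(pmc.value)


class PMC:
    """The 'variable': a vtable pointer, some flags, and the value."""

    def __init__(self, vtable, value, flags=0):
        self.vtable = vtable
        self.flags = flags
        self.value = value


# The 'name' level: a symbol table mapping names to variables.
symbols = {"x": PMC(IntVtable, 40), "y": PMC(IntVtable, 2)}

x, y = symbols["x"], symbols["y"]
total_fast = x.vtable.add_int(x, 2)         # second arg is an int
total_generic = x.vtable.add_generic(x, y)  # second arg is another variable
```

The caller picks the vtable entry that matches the second argument, so the add itself needs no type test on the fast path.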

We also have to track properties on both a variable and a value basis. For 
example:

    declare c, b
      declare a is const = 0 but true
      b = a
      c = \a
    dereference(c) = 1

In this case, the variable represented by a has the property 'const', and a 
value of 0 which has the property 'true'. The assignment to b copies the 
value, and the properties on the value (true in this case), but not the 
properties on the variable, so b isn't const. c is a reference to the 
variable a represents, so the attempt to assign to the dereference of c 
will fail, since the variable c points to has the const property. We can't 
depend on the properties to be attached to the *name* a, since (in my crude 
attempt to use python-style blocking) the name a has gone out of scope by 
the time we dereference c, and so would any properties attached to the name 
as such.
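
The same example can be roughed out in Python, under loud assumptions: Variable, Value, ConstError, the props sets and the scope dict are all invented for this sketch, and deleting a dict key stands in for a name going out of scope.

```python
# Hypothetical model of variable properties versus value properties.

class ConstError(Exception):
    pass


class Value:
    def __init__(self, data, props=()):
        self.data = data
        self.props = frozenset(props)     # properties on the value


class Variable:
    def __init__(self, value, props=()):
        self.value = value
        self.props = frozenset(props)     # properties on the variable

    def assign(self, value):
        if "const" in self.props:
            raise ConstError("cannot assign to a const variable")
        # Copy the value and the value's properties, never the
        # properties of the variable it came from.
        self.value = Value(value.data, value.props)


scope = {}
scope["b"] = Variable(Value(None))
scope["a"] = Variable(Value(0, {"true"}), props={"const"})  # a is const = 0 but true
scope["b"].assign(scope["a"].value)                         # b = a
scope["c"] = Variable(Value(scope["a"]))                    # c = \a
del scope["a"]                                              # the name a goes out of scope

try:
    scope["c"].value.data.assign(Value(1))                  # dereference(c) = 1
    failed = False
except ConstError:
    failed = True
```

Here b ends up holding 0 with the 'true' property but is not const, and the assignment through c still fails after the name a is gone, because 'const' lives on the variable rather than on the name.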

I think the variable here would correspond to a Python object, but I'll 
need to go digging into Python semantics more to be sure. Currently there's 
not a lot planned around the actual symbol table entry, since it's not 
required for variable existence, and there's not much I can think of to 
attach to it anyway. (References mean we can't use it for much, really)

My bet, and this is just an off-hand thing, is that the parser would spit 
out bytecode tying the symbol table entry and the PMC structure more 
closely than it might for other languages. At least do a lot more symbolic 
fetch/store ops, if need be, which we might not.

>I should repeat that your explanation that assignment is somehow performed
>by calling a method on the "variable" is quite a strange notion in general.
>I can imagine having a slot called on assignment that eventually does a copy
>or just returns the object, but assignment as an operation on the lvalue
>is something very peculiar. I know that perl5 assignment is an operator
>returning an lvalue, is this related?

I don't think so, but I'm betting we're talking terminologically 
cross-wise. Assume, for example, the following pseudoish code:

    declare kitchen isa Thermostat
    kitchen = 20

In this case the kitchen variable is active, and assigning to it changes 
the temperature the thermostat in the kitchen is set to.

The magic properties aren't attached to the name per se, at least in 
perl--it's just a handy way to get the PMC that underlies the thing. I can 
get an indirect handle on the data that the name 'kitchen' refers to and 
update it that way, or if I have that handle the 'kitchen' name can go 
completely out of scope and the only way to refer to the object 
representing (and allowing control over) my kitchen temperature is via the 
handle I've got on what's now an anonymous object. (Or, if I'm feeling 
really wacky, have two or more names that look like separate things but 
actually all point to the same PMC structure)
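
A rough Python sketch of the kitchen example, assuming a made-up Thermostat class whose "device" is just a list of the temperatures it was told to set, and a made-up store() helper standing in for whatever the compiler would emit for an assignment:

```python
# Hypothetical sketch of an active variable; nothing here is real Parrot API.

class Thermostat:
    """Active variable; the hardware is simulated by a history list."""

    def __init__(self):
        self.history = []   # temperatures sent to the simulated device

    def assign(self, degrees):
        # Assignment is an operation on the variable: it stays a
        # Thermostat and reacts to the new value instead of being replaced.
        self.history.append(degrees)


def store(symbols, name, new_value):
    """What 'name = new_value' might compile to: call the variable's assign."""
    symbols[name].assign(new_value)


symbols = {"kitchen": Thermostat()}
store(symbols, "kitchen", 20)      # kitchen = 20

handle = symbols["kitchen"]        # indirect handle on the variable
symbols["galley"] = handle         # a second name for the same variable
del symbols["kitchen"]             # the original name goes out of scope

store(symbols, "galley", 18)       # update through the other name
handle.assign(22)                  # update through the bare handle
```

All three assignments reach the same underlying object, whichever name or handle was used to get at it.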

I think it's the potential existence of references (or pointers, or 
whatever you want to call them) that necessitates the three-level split. It 
sounds like the things I'm calling 'name' and 'variable' are much more 
tightly bound in Python.

There are several reasons to call a PMC assignment operator, rather than 
just unconditionally slam a new PMC into the name slot:

1) In the kitchen example above, assigning 20 to kitchen doesn't suddenly 
make kitchen an integer variable--it needs to stay a Thermostat, and just 
do something because we assigned to it. (And I understand this might not be 
the way Python would do things, in which case Python variable vtables might 
just have an "overwrite myself with the incoming info" function, rather 
than allow it to be overridden)

2) We can't really assign properties like constant-ness to a name (if we 
did, we could get a reference to the variable and bypass the properties on 
the name by going through the reference), so that needs to be enforced in 
other ways: in this case, with a custom assignment vtable function which 
pitches a fit if you try to alter the variable.

3) Some other thing or other I apparently can't think of. It's a really 
good reason, though.

> > >  because an operation
> > >like getting the class of an instance should be as direct and fast as
> > >possible if you want to use any of the "nice" optimizations for a VM for
> > >an OO dynamic language:
> > >   inline polymorphic caches
> >
> > Yup. I'm leaning towards a class based cache for inherited methods and
> > suchlike things, but I'm not sure that's sufficient if we're going to have
> > objects whose inheritance tree is handled on a per-object basis.
>Makes sense for the interpreted version, but for speed, call-site
>caches are far more promising when compiling to native code, but also more
>complicated. I imagine you already know e.g. the Self project literature.

Self I've not dug into. Despite all the "I haven't seen that" that keeps 
popping up in my conversation, I *have* done a lot of research. (The pile 
of digested books at one point outweighed my kids. They've grown, but the 
new research pile's about to grow too...)

> > >   customization
> >
> > Details? I'm not sure what you're talking about here.
>Compiling different versions of the same method, for example with respect
>to the receiver type in the single-dispatch case. See below

Ah, aggressive inlining of a sort. I think. Yep, on the list of things to 
do with the optimizer. (Not on the first version, probably, but certainly 
later)

> > >   or if/when possible: direct inlining.
> >
> > Yep, though the potential to mess with a class' methods at runtime tends
> > to shoot this one down. (Perl's got similar issues with this, plus a few
> > extra real nasty optimizer killers) I'm pondering a "check_if_changed"
> > branch opcode that'll check a method/function/sub's definition to see if
> > it's changed since the code was generated, and do the inline stuff if it
> > hasn't. Though that still limits what you can do with code motion and
> > other optimizations.
> >
> > >If any of the high-level OO system can not be used, you have to choose
> > >a lower level one and map these to it,
> > >a vtable bytecode mapping model is too underspecified.
> >
> > Oh, sure, saying "vtable bytecode mapping model" is an awful lot like
> > saying "procedural language"--it's informative without actually being
> > useful... :)
>
>I imagine you already know: especially when compiling to native code,
>it is all a matter of optimizing for the common case, and being prepared
>to be quite slow otherwise, especially when dealing with dynamic changes.
>In both Perl and Python there is no clear distinction between normal ops and
>dynamic changes to methods... but in any case they are rare compared to the
>rest.

Yup. I am profoundly tempted, and I think it will go in, to have a "stable 
code" switch that can be given to the optimizer which guarantees that, once 
the compilation phase is done and we finally get to run for real, code will 
not change. And then apply all the advanced optimization techniques at that 
point.
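
To make the trade-off concrete, here is a small Python sketch of a guarded inline (the "check_if_changed" idea quoted above) next to the unguarded form a "stable code" promise would allow. The Method class, its version counter, and compile_call() are assumptions made up for this illustration, not a description of how the optimizer will actually work.

```python
# Hypothetical sketch of guarded versus unguarded inlining.

class Method:
    """A redefinable method with a version stamp bumped on each change."""

    def __init__(self, func):
        self.func = func
        self.version = 0

    def redefine(self, func):
        self.func = func
        self.version += 1


def compile_call(method, stable=False):
    """Return a callable that 'inlines' method.func as of compile time."""
    inlined = method.func
    seen = method.version

    if stable:
        # Stable-code promise: the definition will not change, so no guard.
        return inlined

    def guarded(*args):
        # check_if_changed: use the inlined body only if nothing was redefined.
        if method.version == seen:
            return inlined(*args)
        return method.func(*args)   # fall back to a normal lookup

    return guarded


double = Method(lambda n: n * 2)
guarded_call = compile_call(double)
stable_call = compile_call(double, stable=True)

before = guarded_call(21)            # inlined body runs
double.redefine(lambda n: n * 3)     # someone changes the method at runtime
after = guarded_call(21)             # guard fails, new definition runs
stale = stable_call(21)              # promise was broken: old body still runs
```

The guarded version stays correct at the cost of a check on every call; the stable version drops the check and is only right if the promise holds.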

>Threading adds complexity to the problem.

Nah, not too badly. A little synchronization issue, but that's reasonably 
trivial.

>For the rest is just speed vs. memory, customization is a clear example
>for that.

Sure. The one nice thing that both perl and python have going for them is 
the ultimate presence of all the source (or a high-enough level 
representation of it to feed to the optimizer) at run time. We can tell 
with more surety what side-effects various methods and functions in 
external code libraries will have, and we can get more aggressive (maybe) 
because of it. Unlike, say, C, where passing a pointer to a routine outside 
what the compiler can see will hobble the optimizer a whole lot in many 
cases...

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk