COM/CORBA/DCOP (was: Hello people. I have some questions)

Alex Martelli aleax at aleax.it
Wed Sep 5 09:59:41 EDT 2001


"Neil Hodgson" <nhodgson at bigpond.net.au> wrote in message
news:_Yol7.16034$bY5.92018 at news-server.bigpond.net.au...
    ...
> Alex Martelli:
>
> > >    At the risk of being pedantic (and with Alex, it is *such* a
> > > temptation :-) ), you can write your automation code (both client
> > > and server) to deal
> >
> > If you have control of both client and server, yes, but then there's
> > no point using Automation -- its interpretive nature is meant to help
> > when either end is NOT under your control:-).
>
>    No, it is symmetric. As a client I can keep all my data around in
> variants, even in preallocated argument arrays of VARIANTs - think of
> making many calls to particular methods. As an independently developed
> server, I
> can reach into the VARIANT array directly, there is nothing particularly
> difficult in subscripting an array of VARIANTs (each is the same size) as
> long as the type is the expected one.

I didn't mean to imply it had to be symmetric -- as long as the
Automation protocols are respected from both sides, you may
indeed be able to save some part of the marshaling overhead by
this means.  But you did say you can write automation code
for BOTH client and server (check above, it's still quoted),
and I'm still saying you can only do that if you write both
sides -- in which case it WOULD be faster to use a custom interface.

With Automation, the marshaling overhead will still be there,
and substantial, in any case: for example, if you keep your data
("natively" a few hundred integers) as an array of VARIANTs
(16 bytes each), you're wasting twelve bytes per integer of your
processor's cache, which hurts (when you access a given
sequence of such integers, your processor will be pulling
into cache 4 times the lines it would otherwise need,
etc, etc).  The marshaling overhead will dominate the
dispatching overhead when both are flat-out optimized,
except in very particular cases (methods with no args
or just one arg, for example).


> > But it's still a cost of marshaling, as opposed to one of dispatching.
> > Most Automation clients cache dispatch-ID's, and so the Invoke call
> > reduces to one test of the usual idiomatic form (one-test index-in-
> > range checking):
> >     ((unsigned)dispid)<maxdispid
> > i.e. 2 machine instructions (unsigned-comparison, branch-if-above-equal
> > with the latter mostly-not-taken),
>
>    I wouldn't do it myself, but refraining from checking that a dispid is
> valid could be considered a compliant implementation of Invoke.

No way, the Automation reference is adamant about it: Invoke
*MUST* return DISP_E_MEMBERNOTFOUND if it's called with a
dispatch-id that is not valid for this object -- it's NOT
allowed to crash or return other random HRESULT's.


> // Apologies if this doesn't compile, it's from memory but I did check
> // the documentation:
> HRESULT X::Invoke(DISPID dispIdMember,
>   REFIID, LCID, WORD, DISPPARAMS FAR* params,
>   VARIANT *result,
>   EXCEPINFO *, unsigned int *) {
>   VARIANTARG *rg = params->rgvarg;
>   switch (dispIdMember) {
>     case X_SET_FLAVOUR:
>       if (params->cArgs == 1 && // More paranoia if warranted
>           rg[0].vt == VT_R8) {
>         result->vt = VT_R8;
>         result->dblVal = m_flavour;
>         m_flavour = rg[0].dblVal;
>         return S_OK;
>       } else {
>         // Coercion time possibly using DispGetParam
>         return DISP_E_TYPEMISMATCH; // if coercion fails
>       }
>   }
>   return DISP_E_MEMBERNOTFOUND;
> }
>
>    Yes, my chosen operation is unfairly short but a lot of the Automation
> servers I've worked on publish a lot of simple accessor methods which are
> also short.

It's a one-argument, one-return-value function -- fair enough,
except it violates the Automation specs by not being paranoid
enough.  It should also check that result is non-null -- a
caller that discards the return value may legitimately pass
NULL there, and Invoke is NOT allowed to crash in that case.
It should also check that the argument array it receives
contains no named arguments, as it can't support them.
Appropriate HRESULTs must be returned for such errors,
rather than ignoring them or crashing the process.


>    With this code, I expect that the biggest cost is in the signature
> guard conditions with pushing the arguments to Invoke also being
> significant.
> There is argument access overhead here but I would not call it marshalling
> as I would define marshalling as the transformation of one call format to
> another instead of as the implementation of a function.

The "marshaling overhead" is the overhead of dealing with
arguments (and/or return values) in a specified format as
opposed to the natural format you have or would wish to
have.  For example,

    result->vt = VT_R8;
    result->dblVal = m_flavour;
    ...
    return S_OK;

this is (hardcoded) marshaling code, compared to the
'natural' (in the C world:-)
    double save = m_flavour;
    ...
    return save;

Here, it's admittedly a tiny amount.  The natural
approach presumably operates in a register-to-register
way only -- depending on the details of machine
architecture, something like:
    Load-to-float-stack this->m_flavour
    ...
    Return
becoming:
    Stack-to-register-A result
    Load-const-to-register-B VT_R8
    Store-indirect-integer A->vt, B
    Load-to-float-stack this->m_flavour
    Store-indirect-float A->dblVal
    ...
    Load-const-to-register-A S_OK
    Return
roughly an overhead of 4 or 5 machine cycles (plus
potential unfavourable cache effects) -- still, by
itself, more substantial than the needed check on the
Disp-ID, which is all the minimal dispatching overhead
amounts to (it may be higher or lower if you use a
case statement for dispatching, but normally one would
use a vtable for either kind of dispatching, custom
or Invoke).

Feel free not to call it "marshaling overhead" but
rather "overhead due to unsuitable argument format,
imposed by call-conventions, needing extra machine
effort for each parameter or result-value access": as
it's such a frequent theme, and it makes no conceptual
difference whether one pays the cost at each use of
a parm/return-value, or all at once at the start/end
to put arguments in more suitable form, I'll still call
it "marshaling" anyway:-).


>    Why do I feel justified in special casing one incoming type signature?
> Because client programming languages and programmers are fairly
> predictable with some always wanting to use one particular numeric type
> (often VT_R8) or responding to available typeinfo by using the declared
> argument type.

It's indeed quite likely that such a squeeze-cycles-out
approach will help you shave machine cycles out of
Automation's marshaling-overhead, when the client program
is indeed written in the programming-language you expect
(it may become just-as-slightly counterproductive in
performance terms when your expectations are not fully
met, of course -- e.g., if the client is in Python:-).

You still won't get the marshaling-overhead (or however-
you-name-it-overhead-relating-to-argument-formats:-)
down to anywhere close to the tiny cost of the _dispatch_
overhead intrinsic to Automation, of course.


Alex





