COM/CORBA/DCOP (was: Hello people. I have some questions)

Sun Sep 9 05:39:09 EDT 2001

Neil Hodgson wrote:
        ...
>> It's not an interpretation, it's the literal text of the COM specs.
>    You are interpreting the text. Nothing is ever read with zero context.

Ah, deconstructionists in our midst -- well, go talk it up with Derrida, 
I've got a girlfriend who's a deconstructionist in her spare time and 
that's quite sufficient to fill my quota of deconst. exposure, you know.

>> > is referring to C++ exceptions rather than SEH or crashing. No COM
        ...
> return error status. It says that error statuses should be returned by
> using HRESULTs rather than throwing exceptions because of the lack of
> portability of exceptions between languages and environments.

It says that, in any language and operating system, if your interface hands 
its caller an exception your interface is buggy, period.  In particular 
this applies to Win32 environments (by far the dominant platform for COM 
deployment as it happens) and exceptions of the SEH kind (which are known 
as "signals" on some other platforms, and aren't quite as structured nor as 
handy).

>    There is much source code and quite a few books available which
> implement COM. Can you point to an example that implements COM in a way
> that is compliant with your interpretation of this section?

Visual Basic, no doubt the most widely deployed implementation of COM, 
seems to do fine in this regard (despite having quite a few bugs of its 
own).  Re books, I don't have them at hand to check, but I think Don Box's 
excellent books (Essential COM and Effective COM, the latter with many 
co-authors) may address the issue.

>    There are several problems with using SEH here - it adds to code bulk,
> obfuscates the code and is not portable. 

Your source code will not be portable to platforms not supporting SEH, 
where you'll have to use signals instead (or whatever the platform gives 
you for graceful handling of errors), but that's not really an issue for 
COM servers.  There is no obfuscation at all in wrapping your code in a 
needed try/except (or __try &c:-), and the "bulk" is about 2 extra lines.

Of course one may use alternative approaches for many typical problems 
(checking a dispid to ensure it's in a range we can handle is, as I 
mentioned, typically 2 machine instructions -- cheap!), but SEH (or other 
ways to handle exceptions and avoid propagating them to the caller) is an 
important safety net for issues we're not specifically checking.
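
A minimal sketch of the two layers just described, in plain Python standing in for a COM method boundary (no real COM is involved). The HRESULT values are the standard ones; the names returns_hresult, invoke, FIRST_DISPID and LAST_DISPID are hypothetical, made up for illustration:

```python
import functools

S_OK = 0x00000000
E_FAIL = 0x80004005
DISP_E_MEMBERNOTFOUND = 0x80020003

# Hypothetical range of dispids this toy object knows how to handle.
FIRST_DISPID, LAST_DISPID = 1, 2

def returns_hresult(method):
    """Safety net: turn any escaping exception into a failure HRESULT."""
    @functools.wraps(method)
    def wrapper(*args, **kwargs):
        try:
            return method(*args, **kwargs)
        except Exception:
            # Nothing propagates to the caller; it only ever sees a status code.
            return E_FAIL
    return wrapper

@returns_hresult
def invoke(dispid, target):
    # Cheap explicit check for the problem we do anticipate.
    if not (FIRST_DISPID <= dispid <= LAST_DISPID):
        return DISP_E_MEMBERNOTFOUND
    # Anything we did not anticipate (e.g. target is None) raises here
    # and is converted to E_FAIL by the decorator.
    target.append(dispid)
    return S_OK
```

The explicit range check handles the expected bad input cheaply, and the wrapper plays the role that __try/__except would play in a C++ server: it catches whatever the explicit checks missed.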

> There is also the question of how
> far do you go? Just having one __try around the method that returns
> E_POINTER is not as explanatory as producing an error info object that

Sure, but the specs don't mandate "being as explanatory as feasible", as 
long as you do diagnose all errors via HRESULTs rather than exceptions.  
So, the tradeoff between speed, convenience, and detailed error info to 
allow client code debugging is a completely different issue from sticking 
to the specs.

>    The only explanation I can find for "restricted" is that it is not for
> use by macro programmers or end users.

That's the design intent, yes.  Stopping the name->dispid translation step 
is one good way to implement that design intent, and it's exactly how the 
system-provided APIs implement it.  Make a toy ATL project called resem 
with just one COM object in it, with an interface (sorry if I don't copy 
and paste, but I'm composing this response on a Linux box and the Win box 
next to it isn't networked right now):

        [object, uuid(...), dual, etc etc]
        interface IPeep: IDispatch
        {
                [id(1)] HRESULT Avail();
                [id(2), restricted] HRESULT Restr();
        }

and here's the result when you use it from Python:

>>> pp=win32com.client.Dispatch('resem.Peep')
>>> pp.Avail()
>>> pp.Restr()
gets error HRESULT -2147352573, aka 0x80020003, aka
DISP_E_MEMBERNOTFOUND
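
The negative decimal and the hex form are the same 32-bit value, one read as signed and the other as unsigned. A small sketch of the conversion and of the standard HRESULT field layout (the helper names are made up for illustration):

```python
def hresult_to_unsigned(hr):
    """Reinterpret a signed 32-bit HRESULT as unsigned."""
    return hr & 0xFFFFFFFF

def hresult_to_signed(hr):
    """Reinterpret an unsigned 32-bit HRESULT as signed."""
    hr &= 0xFFFFFFFF
    return hr - 0x100000000 if hr & 0x80000000 else hr

def hresult_fields(hr):
    """Split an HRESULT into (failed, facility, code)."""
    hr &= 0xFFFFFFFF
    failed = bool(hr >> 31)          # severity bit
    facility = (hr >> 16) & 0x7FF    # 11-bit facility field
    code = hr & 0xFFFF               # low 16 bits
    return failed, facility, code

print(hex(hresult_to_unsigned(-2147352573)))   # 0x80020003
print(hresult_fields(-2147352573))             # (True, 2, 3); facility 2 is FACILITY_DISPATCH
```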

If you implement your own IDispatch, you don't *have* to give member not 
found for restricted attributes (other HRESULTs might be OK too, by the 
letter of the specs), but it's by far simplest and handiest for client code 
to do so.  In any case, the point is that client code needing to access a 
predefined restricted attribute *cannot* rely on doing so just as it would 
for an unrestricted one -- so your implementation must return an HRESULT 
diagnostic to avoid crashing perfectly-correct client code that is probing 
to check if your interface is enumerable, has a default property, etc etc.
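
As a toy model only (plain Python, not the real typelib-driven IDispatch machinery), "stopping the name->dispid translation step" for restricted members could look like the sketch below. The table mirrors the IPeep IDL above; get_id_of_name and MEMBERS are hypothetical names, and returning DISP_E_MEMBERNOTFOUND for the restricted case is chosen to match the transcript above rather than taken from any spec text:

```python
S_OK = 0x00000000
DISP_E_MEMBERNOTFOUND = 0x80020003
DISP_E_UNKNOWNNAME = 0x80020006

# name -> (dispid, restricted), mirroring the toy IPeep interface.
MEMBERS = {
    "avail": (1, False),
    "restr": (2, True),
}

def get_id_of_name(name):
    """Return (hresult, dispid); restricted members are never translated."""
    entry = MEMBERS.get(name.lower())   # Automation names are case-insensitive
    if entry is None:
        return DISP_E_UNKNOWNNAME, None
    dispid, restricted = entry
    if restricted:
        # Assumption: same diagnostic the transcript shows for pp.Restr().
        return DISP_E_MEMBERNOTFOUND, None
    return S_OK, dispid
```

Either way the caller gets a status code back, never an exception, and never learns the dispid of the restricted member through the name lookup.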

>>  Microsoft has not designated any
>> specific *names* duplicating the purposes of the various DISPID_NEWENUM,
>> DISPID_VALUE (the default-property of an Automation object, if any),
>> and so on, which it *has* reserved and designated.
> 
>    So you are prepared to not follow the letter of the specification for
> the dispatch identifier? (Of course I can argue both sides :-) )

Absolutely not!  Where are you reading this?!  Follow the letter of the 
specification: e.g. if you have a default property, call it whatever you 
wish, but make sure DISPID_VALUE is the dispid for that property, and avoid 
using DISPID_VALUE for any other attribute; client code wishing to access 
your default property if any *has* to directly use DISPID_VALUE, rather 
than trying to guess at what name you might have chosen.  (Using name Value 
is typical, but nowhere near universal -- e.g. a textlabel control is 
likelier to name its default-property Text or Caption -- it's NOT up to 
client code to try and guess about this, rather, DISPID_VALUE should be 
used directly -- it's as easy as this!).  This implies that perfectly 
correct client code may perfectly well call your Invoke with dispids you do 
not know about, so that having Invoke crash when called with unknown 
dispids is not just an obvious violation of the letter of the COM specs, 
it's also a particularly nasty bug that's quite likely to crash *perfectly 
correct* client code.  How perverse can you get?!
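
To make the client side concrete, here is a toy sketch in plain Python (no real COM). DISPID_VALUE and DISPID_NEWENUM carry their standard values; the Label class, its invoke_get method, and the two probing helpers are hypothetical:

```python
S_OK = 0x00000000
DISP_E_MEMBERNOTFOUND = 0x80020003

DISPID_VALUE = 0      # reserved: the default property, if any
DISPID_NEWENUM = -4   # reserved: the enumerator, if any

class Label:
    """Toy object whose default property would be named Caption in its IDL."""
    def __init__(self, caption):
        # The property's *name* is the author's choice; its dispid is not.
        self._props = {DISPID_VALUE: caption}

    def invoke_get(self, dispid):
        """Return (hresult, value); unknown dispids are diagnosed, not fatal."""
        if dispid not in self._props:
            return DISP_E_MEMBERNOTFOUND, None
        return S_OK, self._props[dispid]

def default_value(obj):
    """Client code: fetch the default property without guessing its name."""
    hr, value = obj.invoke_get(DISPID_VALUE)
    return value if hr == S_OK else None

def is_enumerable(obj):
    """Client code: probe with a dispid the object may know nothing about."""
    hr, _ = obj.invoke_get(DISPID_NEWENUM)
    return hr == S_OK
```

The probe in is_enumerable is perfectly correct client code, and it only works because invoke_get answers an unknown dispid with a status code.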

>> In the abstract, you could make a case that a function must not
>> waste effort trying to validate its preconditions -- that's not
>> the function's job.  Meyer is as hot as usual in defending this
>> stance in his PbC materials.
> 
>    I was expecting to be the one that first brought up PbC ;)

Dijkstra actually introduced the notion (in "A Discipline of 
Programming"), but not having a marketing knack he didn't give it a catchy 
name (he called it "the weakest precondition approach") nor did he bother 
to popularize it.  But it's far more reasonable in Dijkstra's context (an 
excellent mathematical approach to provably-correct programming) than in 
everyday practice *except where you control both sides of an interface*, 
which is rarely the case these days (more often than not, you're coding 
your components to the specs of some framework, for reuse and for 
interchangeability, so you are constrained by the framework's specs).

For example, network servers may very well specify preconditions such as 
"this message is no longer than 256 octets" -- but since the clients for 
that interface may be coded separately (and may well have hostile intent), 
the "PbC" approach of "it's not my job to check that my preconditions are 
in fact met" can only lead to security weaknesses and exploits.  For any 
interface that is exposed on the outside of any I-control-it-all subsystem, 
one test the component SHOULD pass is surviving an 'attack' with totally 
random data (and, ideally, also one with data specifically designed to 
probe its weaknesses) -- that IS the world we live in, after all.  Of 
course, we all know server-programmers are not as thorough as they should 
be (and we've all been guilty of imperfect security at some point, I'm 
sure), but that's no reason to _encourage_ such laxity at exposed 
interfaces:-).  One day your component WILL be running with high privileges 
in a very hostile environment (if the component is any good...:-), so, why 
not minimize the vulnerable interfaces it exposes to the outside and be 
properly paranoid about what happens on those interfaces?-)
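
A rough sketch of that test, assuming a made-up handler whose documented precondition is "no longer than 256 octets". The names handle_message and survives_random_attack, and the ASCII-uppercasing behaviour, are hypothetical; only the 256-octet limit comes from the example above:

```python
import random

MAX_OCTETS = 256

def handle_message(data):
    """Return a reply, or None for rejected input; never raise on bad input."""
    # Check the precondition instead of trusting the client to honour it.
    if not isinstance(data, (bytes, bytearray)) or len(data) > MAX_OCTETS:
        return None
    try:
        text = bytes(data).decode("ascii")
    except UnicodeDecodeError:
        return None
    return text.upper()

def survives_random_attack(handler, rounds=200, seed=0):
    """Feed the handler random blobs, some over the limit; True if none crash it."""
    rng = random.Random(seed)
    for _ in range(rounds):
        size = rng.randrange(0, 4 * MAX_OCTETS)
        blob = bytes(rng.getrandbits(8) for _ in range(size))
        try:
            handler(blob)
        except Exception:
            return False
    return True

print(survives_random_attack(handle_message))   # True
```

Random data is only the first hurdle; input crafted to probe specific weaknesses needs its own targeted cases.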

>    I no longer hang out on COM mailing lists. Is the current consensus in
> agreement with you?

I wouldn't know, really, since my "free" time these days goes to Python, 
.NET, OpenBSD, and Linux, roughly in this order of interest.  COM (and 
Win32 APIs, Visual C++, ATL, and other MS'ish stuff, as well as C++ 
language lawyering and so on) is what I do, teach, consult on, and mentor 
on, for a living (while trying to squeeze as much of my free-time interest 
stuff as I can into the job:-) -- and so much other extremely interesting 
stuff (Haskell and other FP languages, for example) has to stay on the 
shelf for lack of time, that, like you, I've also shelved most active 
participation in public debate of these job-only issues:-).

Alex