[XML-SIG] Economics of RPC

Fri, 15 Feb 2002 01:24:02 -0800

"Martin v. Loewis" wrote:
> 
>...
> 
> For all practical purposes, the method can be only "GET" or
> "POST". Those are really synonymous; it is merely a protocol variant
> (i.e. POST can be taken to mean "do not cache", so it is part of the
> protocol header).

No, GET and POST could not be more different. Caching is irrelevant.
What happens when you hit "reload" in a browser on a POSTed page? The
reload specifically disables the cache, but a browser that is any good
behaves very differently when the page came from a GET versus a POST.

Plus PUT and DELETE also exist and should be used more than they are.
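
A minimal sketch of that reload distinction, assuming the safe and
idempotent method classes from the HTTP/1.1 specification. The function
name reload_action and the three return labels are hypothetical, chosen
only for illustration; no real browser is claimed to work exactly this
way.

```python
# Method classes as described in the HTTP/1.1 specification.
SAFE_METHODS = {"GET", "HEAD"}                        # no side effects expected
IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE"}  # repeating is harmless


def reload_action(method):
    """Return what a careful client might do when asked to repeat a request.

    The labels are illustrative only:
      "refetch" - silently fetch again, bypassing the cache
      "resend"  - repeat the request; the end state is the same
      "confirm" - ask the user first, since side effects may repeat
    """
    method = method.upper()
    if method in SAFE_METHODS:
        return "refetch"
    if method in IDEMPOTENT_METHODS:
        return "resend"
    return "confirm"


if __name__ == "__main__":
    for m in ("GET", "PUT", "DELETE", "POST"):
        print(m, "->", reload_action(m))
```

The decision depends on the method name alone, which is the point: the
client needs no knowledge of the application behind the URI.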

> ...
> In RPC, you can also know when you invoke the same operation on the
> same object twice, and cache the result (consider the NFS case).

NFS is an application protocol. You KNOW because you are using NFS, not
because you are using RPC. RPC is just another layer you are ignoring.

> > A firewall that doesn't understand the data it is passing back and forth
> > isn't much of a firewall.
> 
> So it should not let HTTP through then, either? After all, a POST
> could include arbitrary information (such as a SOAP message, or a GIOP
> message).

I would strongly advise system administrators to disallow POSTs of SOAP
or GIOP messages. They should be able to *always know* the safety and
semantics of a particular POST by looking at the URI and the method
name. The rest of the message should be mostly irrelevant for
firewalling purposes.

>...
> The caching is done on the NFS level. NFS is an RPC application, so it
> is both: RPC and application. In this terminology, could you agree
> that HTTP is an RPC application (with a fixed set of operation names:
> GET, POST, and a few that aren't in use)?

I could agree that HTTP *could have been* an RPC application if it were
defined that way. But the underlying RPC "language" for HTTP was never
defined separately from HTTP. Similarly, I posit that NFS could have
been defined without an RPC basis and have been none the worse for wear.
For widely deployed protocols the underlying RPC bit becomes irrelevant.
It's only useful for not-very-widely deployed protocols.

> > Google will likely never do anything useful with an
> > XML-RPC site. It could only ever make progress if something ON TOP of
> > XML-RPC defined some semantics it understood.
> 
> Exactly. So what is the point? It seems what you are saying is "RPC
> cannot (in general) be used without knowledge of the specific
> interface"; I could not agree more. Your implication "it does not
> scale" then becomes "RPC does not scale, but specific RPC applications
> can".

But those applications would scale just as well (or poorly) if they
didn't use RPC.

> For example, Google could collect information from CORBA Naming
> Service installations right away (or better CORBA bootstrap
> installations, since they even have a well-known port). So it would
> seem that you can agree that the CORBA Naming Service does scale, and
> so could my own CORBA application. If so, we are in violent agreement.

Let's put it this way:

1. if you wanted to deploy a widely used application protocol on the
Internet, the CORBA layer would be a net loss, not a gain.

2. "generic" RPC, unconstrained to a particular application protocol,
will never get through corporate firewalls for very long.

It is the application protocol that matters. The RPC layer does not help
and may hurt.

>...
> Because they are applications with widespread applicability. There is
> (unfortunately, IMO) a tradition to define text protocolls in the
> internet.

That's because internet protocols are adopted bottom up, by developers,
and tend not to be mandated from the top down, by managers and
standardizers. That's another reason that RPC just gets in the way.

>...
> Those infrastructures are only used if people use them for
> applications. They will do so if they help them to do their work (such
> as CORBA and COM do), instead of being in the way of solving the problem
> (as is the case with HTTP-ng, SOAP, and DCOM).

Okay, but CORBA gives me tremendous bang for my buck if you and I are
going to build a protocol to talk to each other. If everyone on XML-DEV
gets interested the bang for buck starts to drop because now we're
talking about a lot of ORB installations (or SOAP implementation
installations, or XML-RPC implementation installations). If everyone in
the computing world gets interested then it gets downright expensive. My
initial savings in not defining the wire format (which is relatively
easy with XML) have been sucked up in the extra deployment cost of
deploying two layers, the RPC layer and the application protocol layer.

Now if the Internet had standardized on CORBA twenty years ago then
maybe by now we'd have enough application protocols to recoup the cost of
deploying all of those ORBs. But there is a strong argument to be made
that application protocols are going to come along much more slowly in
the future. Part of that is that it is harder to deploy them because
firewalls are so popular. Part of that is because now that protocols
(especially HTTP) work on a universal namespace model, a lot of the
differences between protocols evaporate. POP and IMAP are a lot like
HTTP except that they had mail-specific namespace models. NNTP is the
same. I'll bet today they could be HTTP variants with little extra
complexity and a huge savings in duplicated infrastructure.

Also, part of what I mean by "RPC doesn't scale" is about common
usage. If we consider NFS or HTTP as "applications" of RPC, they are
incredibly careful, well-thought-out applications. Many person-years
went into each. RPCs do not encourage you to put in these person-years
because they give the illusion that the network component is a library
that you call like a local library. Anybody sane realizes that this is
not a good way to program. So why think of the network as a library-like
component at all? Why think of network interactions as method calls at
all? Why think of the data being sent across the network as a
"parameter"?

>...
> It seems that out main problem is a terminology one. What means "it
> cannot scale" to you? To me, it means "you cannot build an
> infrastructure where many user in vastly remote locations
> participate". I still think this not true, for RPC systems. To you,
> it seems to mean "it is unlikely that as many users will ever use
> it as they use HTTP today to get documents". If so, we agree.

Yes, I'm mostly talking about multi-vendor, multi-client, multi-server,
multi-intermediary interoperability. You won't get that until you pin
down a fixed set of methods. 

> I'm still a bit unclear what to make of the fire-wall compatibility:
> it seems that for any specific application, you'll either have to
> explain it to the firewall, or cheat it; whether the interaction
> is REST-style or RPC.

The nice thing about REST-style is that it forces you to dumb it down to
the point that the firewall can almost understand it "automatically."
Where it cannot, a few rules like "POST to URIs of this form are okay"
will help a lot. In REST the method name and the URI are central to the
application, and they are also front-and-center to the firewall. The
"important bit" of an RPC message could be any parameter
and might in fact involve state that's already been set on one side or
another!
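
A rough sketch of such a rule table, assuming a filter that sees only
the method name and the request URI. The RULES entries, the /orders/
path, and the function name allowed are all hypothetical examples, not
taken from any real firewall product.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical rule table: (method, path pattern, verdict).
# The first matching rule wins; anything unmatched is denied.
RULES = [
    ("GET", "/*", True),           # reads are allowed everywhere
    ("POST", "/orders/*", True),   # "POST to URIs of this form are okay"
    ("POST", "/*", False),         # every other POST is refused
]


def allowed(method, uri, rules=RULES):
    """Decide from the method and URI alone; the body is never inspected."""
    path = urlsplit(uri).path or "/"
    for rule_method, pattern, verdict in rules:
        if method.upper() == rule_method and fnmatchcase(path, pattern):
            return verdict
    return False  # default deny, including methods with no rule at all


if __name__ == "__main__":
    print(allowed("GET", "http://example.com/index.html"))
    print(allowed("POST", "http://example.com/orders/42"))
    print(allowed("POST", "http://example.com/rpc"))
```

An equivalent filter for a generic RPC endpoint would have to parse the
message body to find the operation and its parameters, because every
call arrives as a POST to the same URI.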

 Paul Prescod