On Aug 18, 2014, at 8:31 PM, Daniel Sank <sank.daniel@gmail.com> wrote:

Dustin,

> Adding what amounts to a use-after-free bug to the protocol seems like a really bad idea to me.

Oh goody, a sarcastic comment which doesn't actually bother to explain the bug :)

Oddly this is _exactly_ how a co-worker characterized the problem as well, with (as far as I can see) no communication of this idea ;).

<http://cwe.mitre.org/data/definitions/416.html> describes the disastrous consequences of this in C; the Python equivalent is mild, but it's still annoying to get 'None has no attribute "frob"' when you do self.frobber.frob().

Since what I describe is basically WeakReferenceable, it's not obvious to me that there's a bug. I tell you when the thing to which your Remote(Weak)Reference points is destroyed, just like weak references invoke finalization callbacks. If you try to invoke remote methods after that happens I just return you an error. What's the problem? This is exactly how weak references work.
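
For reference, the local weak-reference behavior being compared to here can be sketched with the standard library's weakref module. The Frobber class and the "frobber gone" marker are made up for illustration; nothing here is PB code.

```python
import gc
import weakref


class Frobber:
    """Hypothetical stand-in for the object behind self.frobber."""

    def frob(self):
        return "frobbed"


finalized = []

target = Frobber()
# The callback is passed the weak reference object once the referent dies.
ref = weakref.ref(target, lambda dead_ref: finalized.append("frobber gone"))

# While a strong reference exists, dereferencing works normally.
assert ref().frob() == "frobbed"

del target    # drop the only strong reference
gc.collect()  # redundant on CPython; other runtimes may finalize later

frobber = ref()  # a dead weak reference dereferences to None
try:
    frobber.frob()
except AttributeError as exc:
    # The generic "None has no attribute" style of failure.
    error = str(exc)
```

The callback fires and the error is raised, but the error itself says nothing about why the object went away.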

The thing about weak references is that they are almost always to objects that are shared, that have a well-defined lifecycle. Your PB objects generally should _not_ be shared; a well-structured PB application will almost always create thin wrapper objects, factoring the remote-access-control logic out of the core model abstractions. Exposing weak references across a network is also very, very tricky: distributed reference counting is hard enough without trying to introduce real distributed GC that involves reference tracing.

> If your server-side app is sensitive to when objects are destroyed for any reason aside
> from management of its memory consumption, I'd argue your app is broken.

That's a really good point. Only CPython destroys objects deterministically when the ref count hits zero.

Yes. And even the CPython developers have said it's effectively a bug to depend on this behavior, because you can't really know when it's going to happen. A debugger might be holding on to your stack frames for a little while. The profiler might be keeping your locals around for a moment longer than you expect. This is why we have idioms like 'with open(...) as f' now.
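
A small sketch of that idiom, using a throwaway temp file so it runs anywhere: the file is closed when the block exits, whether or not anything else still refers to it, and the object itself stays alive.

```python
import os
import tempfile

# Create a scratch file so the example is self-contained.
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "w") as f:
    f.write("hello")

# Leaving the block called f.close(); no reference count had to hit zero.
closed_after_block = f.closed

# The name 'f' is still bound to the (closed) file object.
try:
    f.write("more")
except ValueError as exc:
    # Using it now fails with an explicit "closed file" error.
    error = str(exc)

os.remove(path)
```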

> Certainly it opens you to a denial of service from a malicious client, which might hold references
> to objects you'd rather it didn't

So my instinct to keep strong references only where they're needed is just bad?

Set limits on things. PB isn't great about this, but this is an area where it could get better, and where all the fixes are really straightforward (find the place where PB does a thing, set a limit, raise an exception if the thing is over that limit). In this case, don't allow clients to hold unlimited numbers of simultaneous references. Start throwing errors when too many live references exist on one connection. A reasonable application should not need that many at once - if you set a limit at around 1024 and allow servers to tune it for particular workloads, it should be fine. (Set it per-type maybe?)
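
A minimal sketch of the "set a limit, raise an exception" idea. ReferenceTable and TooManyReferences are hypothetical names, not part of PB; a real fix would hook into wherever PB tracks the references it hands out on a connection.

```python
class TooManyReferences(Exception):
    """Raised when one connection holds too many live references."""


class ReferenceTable:
    """Hypothetical per-connection table of objects handed to the peer."""

    def __init__(self, max_live=1024):
        self.max_live = max_live  # tunable per server or workload
        self._live = {}
        self._next_id = 0

    def register(self, obj):
        # Refuse new references once the connection is at its limit.
        if len(self._live) >= self.max_live:
            raise TooManyReferences(
                "connection already holds %d live references" % len(self._live)
            )
        ref_id = self._next_id
        self._next_id += 1
        self._live[ref_id] = obj
        return ref_id

    def release(self, ref_id):
        # The peer dropped its reference; free the slot.
        self._live.pop(ref_id, None)

    def lookup(self, ref_id):
        return self._live[ref_id]
```

A per-type limit would be the same thing with one counter per class instead of a single table size.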
Don't give out references to objects you can't revoke logically, at an application level. If you have a chess piece that has been taken, that is not a NULL pointer or None. There is a small, fixed number of chess pieces per game, so you don't have to worry about denials of service. Therefore your ChessPiece class should have a 'taken' state associated with it; in fact, you could remember which move of the game the piece was taken on, and produce an error message which specifically reminds the player when it was taken. Much like 'with open(...) as f' explicitly invokes 'f.close()' at the end of the block but does not force 'f' to get garbage collected, you should still be able to have a PB protocol-level reference to an application-level revoked object. Debugging distributed systems is hard enough without translating every revoked-permission error into some common "the distributed GC happened, I don't know what happened to your object, life is hard".
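
Roughly, the chess example could look like this. It's plain Python with no PB involved, and the method and exception names are invented for the sketch.

```python
class PieceTaken(Exception):
    """Raised when a player tries to use a piece that has been captured."""


class ChessPiece:
    def __init__(self, name, square):
        self.name = name
        self.square = square
        self.taken_on_move = None  # None while the piece is still in play

    @property
    def taken(self):
        return self.taken_on_move is not None

    def take(self, move_number):
        # Revoke the piece at the application level; the object lives on.
        self.taken_on_move = move_number

    def move_to(self, square):
        if self.taken:
            # A specific error, instead of "None has no attribute".
            raise PieceTaken(
                "%s was taken on move %d" % (self.name, self.taken_on_move)
            )
        self.square = square


rook = ChessPiece("white rook", "a1")
rook.move_to("a4")
rook.take(move_number=17)
try:
    rook.move_to("a8")
except PieceTaken as exc:
    message = str(exc)
```

The reference to the rook stays valid the whole time; only what you're allowed to do with it changes.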

Is this making sense?

Should a GUI or a logger keep a strong reference to the things they observe?

As my other message indicated - yes :).

-glyph