[Twisted-Python] Is there pb documentation somewhere?
I've been trying to address ticket 7274
https://twistedmatrix.com/trac/ticket/7274
To do this, I am trying to understand the PB protocol. While I found a spec for banana in twisted-daniel/docs/core/specifications/banana.rst, I have not found anything similar for pb. I've been piecing it together by writing little test scripts, but it is slow going. In particular, it is very difficult to understand the meaning of verbs like "cook" and "preserve" and nouns like "persistent store" without some global picture of what's going on.
1. Is there some kind of narrative documentation on how pb works under the hood?
2. Is there a specification for the pb dialect of banana?
3. Is there anyone else out there interested enough in pb to want to work with me to figure things out and produce documentation if there isn't any currently?
Sincerely,
Daniel Sank
On Jul 27, 2014, at 7:26 PM, Daniel Sank <sank.daniel@gmail.com> wrote:
I've been trying to address ticket 7274
https://twistedmatrix.com/trac/ticket/7274
To do this, I am trying to understand the PB protocol. While I found a spec for banana in twisted-daniel/docs/core/specifications/banana.rst, I have not found anything similar for pb. I've been piecing it together by writing little test scripts, but it is slow going. In particular, it is very difficult to understand the meaning of verbs like "cook" and "preserve" and nouns like "persistent store" without some global picture of what's going on.
1. Is there some kind of narrative documentation on how pb works under the hood?
I don't believe there is.
2. Is there a specification for the pb dialect of banana?
Beyond the code, no.
3. Is there anyone else out there interested enough in pb to want to work with me to figure things out and produce documentation if there isn't any currently?
I would be happy to answer questions, but obviously I'm not super responsive :). Let me know what you need. -glyph
glyph,
2. Is there a specification for the pb dialect of banana?
Beyond the code, no.
Ok.
I would be happy to answer questions, but obviously I'm not super responsive :). Let me know what you need.
For two personal projects, I would like to have a reasonable remote objects library in python. I need something which can announce state changes to clients, and receive state change requests from clients. My solution:
1. Make a server side class which can spawn Cacheables when it wants to tell clients of its existence.
2. Give RemoteCaches to clients and use observe_* methods as appropriate.
3. Stuff a Viewable into the RemoteCaches so that clients can request state changes.
Question #1: Is this a reasonable use of pb?
This all worked great until I ran into a bug. In trying to fix the bug, I found that
1. pb code is really hard to understand.
2. exarkun thinks pb is bad and that I should implement what I need in AMP.
3. exarkun thinks banana and jelly are reasonable.
Question #2: Would you recommend implementing a simplified replacement for pb on top of banana/jelly, or starting over from AMP? I favor the banana/jelly route because the protocol seems intrinsically flexible, but I read your blog explaining why protocols like banana are bad, so I'm confused about what I "should" do.
Daniel
glyph
I would be happy to answer questions, but obviously I'm not super responsive :). Let me know what you need.
I am trying to understand jelly's serialization strategy:
1. In t.s.jelly._Jellier, what is the meaning of persistentStore?
2. In t.s.jelly._Jellier, what is the meaning of cooked? The comment here doesn't make sense to me yet.
3. In t.s.jelly._Jellier, what is the meaning of cooker?
A short, narrative explanation of what _Jellier does would be very useful, and if you provide it I will submit a patch to the documentation.
Daniel
On Aug 4, 2014, at 10:07 PM, Daniel Sank <sank.daniel@gmail.com> wrote:
glyph
I would be happy to answer questions, but obviously I'm not super responsive :). Let me know what you need.
I am trying to understand jelly's serialization strategy:
1. In t.s.jelly._Jellier, what is the meaning of persistentStore?
From the perspective of PB, you can ignore this completely. It's effectively an unused feature. There are two entry-point call-sites for jelly in PB: Broker.serialize and Broker.unserialize. Both explicitly pass "None" for the "persistent" argument, "persistentStore" and "persistentLoad" respectively.
Reaching back into my dim and distant memory of the ancient past, I believe that the purpose of these callables was to allow you to use Jelly (and perhaps PB) to refer to objects in some kind of pluggable long-term storage. The reason they're called "persistent" was that "ephemeral" storage was local to the connection, and therefore short-lived enough that we could trust that an in-memory Python dictionary would be both large enough and long-lived enough to serve it. But if you have your objects in a database, you might want a different database backend with an application-provided callable for loading objects by ID.
Again, this was never really used, so you can probably ignore it. (I think there might have been a 4X massively multiplayer video game which used it in 2002 or so, but nothing since then that I'm aware of, especially since PB doesn't even have a way to pass in your own without subclassing and overriding 'serialize'.)
2. In t.s.jelly._Jellier, what is the meaning of cooked? The comment here doesn't make sense to me yet.
I just read the comment in _cook, and I hate my younger self right now. Seriously. Screw that guy.
When you make a jelly, you have to cook the fruit first. So part of the metaphor here is that you are "cooking" the objects as you serialize them. The "cooked" attribute maps object IDs (integers representing pointers, at least in CPython) to "dereference" jelly expressions. It is said to be "cooked" at that point because you no longer need to put in the energy (I guess heat, in this metaphor?) to serialize the internal state. A "dereference" expression is one that points at an object within the same Jelly, so this is not like something pointing at a remote reference.
It uses object IDs for keys and not the objects themselves because these objects are (since they can participate in circular references) implicitly mutable, and mutable objects often don't have a working __hash__ implementation, so we can't rely on that.
This happens in a weird order because an object may circularly refer to itself, so we prepare it and put it in the "preserved" map before actually beginning the serialization process of its initial state. We also don't want to pollute the jelly output with reference IDs for every single object that _might_ be referenced more than once; we only want to add the ['reference'] expression if we actually refer to it twice. If you look at this example:
>>> from twisted.spread.jelly import jelly
>>> circular = [1, 2]
>>> circular.append(circular)
>>> jelly(circular)
['reference', 1, ['list', 1, 2, ['dereference', 1]]]
>>> acyclic = [1, 2]
>>> jelly(acyclic)
['list', 1, 2]
You can see that the circular list allocates a reference ID '1' for the circular list. The output list there would have been the thing that went into the _Jellier's "cooked" list, keyed by the 'id' for the serialized list, and then 'reference 1' would have been inserted into the beginning and its body appended. So the steps are:
1. Here's a mutable object. Let me remember that I've seen it, just in case I see it again.
2. Now I'm going to recursively serialize it.
3. Oh, here it is again, I know it's the same object because it has the same ID. Instead of serializing it, I'll change the ['list'] into a ['reference', 1] and stick in a ['dereference', 1] here.
If we never get to step 3, we never see the ['reference'] at all, and it's as if this functionality didn't exist.
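The three steps above can be sketched in a few lines of plain Python. This is only a toy, assuming nothing but nested lists of integers: toy_jelly and its local names are made up for illustration and are not the code in twisted.spread.jelly, although the dictionaries are named after the _Jellier attributes discussed here.

```python
import itertools


def toy_jelly(obj):
    """Serialize nested lists of ints, adding reference/dereference
    expressions only for lists that are actually seen twice."""
    preserved = {}  # id(list) -> output expression for lists seen so far
    cooked = {}     # id(list) -> ['dereference', n] for lists seen twice
    cooker = {}     # id(list) -> list, kept alive so ids stay unique
    ref_ids = itertools.count(1)

    def walk(o):
        if not isinstance(o, list):
            return o  # atoms pass through unchanged
        key = id(o)
        if key in cooked:
            return list(cooked[key])
        if key in preserved:
            # Step 3: seen before. Rewrite the earlier output in place
            # into a reference, and emit a dereference here.
            earlier = preserved[key]
            refid = next(ref_ids)
            earlier[:] = ['reference', refid, list(earlier)]
            cooked[key] = ['dereference', refid]
            return list(cooked[key])
        # Step 1: remember the object before looking inside it.
        preserved[key] = []
        cooker[key] = o
        # Step 2: recursively serialize the contents.
        sexp = ['list'] + [walk(item) for item in o]
        if key in cooked:
            # The object referred to itself: fill in the reference body.
            preserved[key][2] = sexp
            return preserved[key]
        preserved[key] = sexp
        return sexp

    return walk(obj)


circular = [1, 2]
circular.append(circular)
print(toy_jelly(circular))
print(toy_jelly([1, 2]))
```

For the two inputs from the example, this toy produces the same nested lists as the jelly output shown above. A list that is shared but not circular, such as [s, s], gets its reference wrapped around the first occurrence after the fact, which is why the earlier output has to be modified in place.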
3. In t.s.jelly._Jellier, what is the meaning of cooker?
The "cooker" attribute is a hack related to the use of "id" for the unique IDs. If we used the object itself as the key (which we shouldn't do, for reasons I mentioned above), then we could just rely on it sticking around until the end of the 'jelly' call. But instead, we use its 'id', which is its pointer address, so we need to make sure that it lives on until the end of the _Jellier's lifetime, so we just stick it into the "cooker" map as the value. You'll notice that there's no store of the object itself anywhere else: in "cooked" the key is the ID, and the value is the serialized output value that Jelly is going to write out. If we didn't make sure the object stuck around, a different object might get the same ID, and that would produce spurious back-references (like, we might get a ['dereference'] where something harmless like a string should go).
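The reason for keeping the objects around can be seen without Jelly at all. The sketch below is a hypothetical helper (count_distinct_ids is not part of Twisted) that creates short-lived lists and records their ids, optionally parking each one in a "cooker"-style dictionary.

```python
def count_distinct_ids(n, keep_alive):
    """Create n temporary lists and count how many distinct ids were seen."""
    cooker = {}   # id(obj) -> obj, only used when keep_alive is true
    seen = set()
    for i in range(n):
        temporary = [i]  # a fresh object, e.g. state built on the fly
        seen.add(id(temporary))
        if keep_alive:
            # Holding the object guarantees nothing else can take its id.
            cooker[id(temporary)] = temporary
    return len(seen)


print(count_distinct_ids(1000, keep_alive=True))
print(count_distinct_ids(1000, keep_alive=False))
```

With keep_alive=True every object is still alive at the end, so all 1000 ids are distinct. With keep_alive=False the count may be far smaller, because the interpreter is free to hand a freed object's id to the next one; the exact number depends on the Python implementation.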
A short, narrative explanation of what _Jellier does would be very useful, and if you provide it I will submit a patch to the documentation.
A _Jellier jellies objects of course, isn't it obvious ;-). Hopefully you can make sense out of the explanations above and your own existing knowledge. Are there any other phases of the process which are confusing? -glyph
glyph,
I really wish we would stop calling things "bad" and "good".
My wording of exarkun's wording. He gave a much more detailed description of what he thinks is "crazy" about pb.
make your own decisions about how to write your own code.
Indeed, but gathering information from wiser folks is always a good idea, and usually best done _often_ during development :)
I'm happy to trade 2-for-1 - if you do two code reviews, I will regard it as an immediate obligation for me to review a ticket you direct me to ;).
Deal. However, rather than direct your attention to tickets, at this stage I would rather trade reviews for discussion. I'll do two reviews and then post a few questions to this mailing list thread. Once I start actually writing patches/new code we can trade reviews for attention to tickets. Ok?
These would also be easier to land, and a couple of decades in open source has taught me that nothing motivates development activity like successful development activity ;).
Indeed. There are one or two architectural issues I want to understand before moving on to real coding. I will try to get through that asap by reviewing tickets and trading for discussion of those architectural issues.
Hopefully you can make sense out of the explanations above and your own existing knowledge. Are there any other phases of the process which are confusing?
This all makes sense now. I hadn't understood the point of the cooker, but now that you've explained it, I understand what's going on. I will transform your mailing list explanation to documentation shortly. Daniel
On Aug 7, 2014, at 10:42 AM, Daniel Sank <sank.daniel@gmail.com> wrote:
glyph,
I really wish we would stop calling things "bad" and "good".
My wording of exarkun's wording. He gave a much more detailed description of what he thinks is "crazy" about pb.
This was a complaint about a general trend, not about specific words. Clearly exarkun gave you the impression that it is "bad", whether he specifically said so or not. We're all intimately familiar with everything that's terrible about all of our code, and we aren't shy about sharing. I just would like it if we could really lead with the details and refrain from value judgements :).
make your own decisions about how to write your own code.
Indeed, but gathering information from wiser folks is always a good idea, and usually best done _often_ during development :)
I might quibble with "wiser" but okay. I'm happy to provide feedback earlier so I don't have to say "what is this disaster" later ;-).
I'm happy to trade 2-for-1 - if you do two code reviews, I will regard it as an immediate obligation for me to review a ticket you direct me to ;).
Deal. However, rather than direct your attention to tickets, at this stage I would rather trade reviews for discussion. I'll do two reviews and then post a few questions to this mailing list thread. Once I start actually writing patches/new code we can trade reviews for attention to tickets. Ok?
I'm happy to do that.
These would also be easier to land, and a couple of decades in open source has taught me that nothing motivates development activity like successful development activity ;).
Indeed. There are one or two architectural issues I want to understand before moving on to real coding. I will try to get through that asap by reviewing tickets and trading for discussion of those architectural issues.
I'll try to respond to these questions regardless. I would like to help. It's just that the reviews will create a more tangible sense of commitment :).
Hopefully you can make sense out of the explanations above and your own existing knowledge. Are there any other phases of the process which are confusing?
This all makes sense now. I hadn't understood the point of the cooker, but now that you've explained it, I understand what's going on. I will transform your mailing list explanation to documentation shortly.
Great, glad that helped. -glyph
On 05:57 pm, glyph@twistedmatrix.com wrote:
On Aug 7, 2014, at 10:42 AM, Daniel Sank <sank.daniel@gmail.com> wrote:
glyph,
I really wish we would stop calling things "bad" and "good".
My wording of exarkun's wording. He gave a much more detailed description of what he thinks is "crazy" about pb.
This was a complaint about a general trend, not about specific words. Clearly exarkun gave you the impression that it is "bad", whether he specifically said so or not.
I don't understand what you're saying here. Do you want people to not describe the shortcomings of certain pieces of software? Or do you want people not to conclude from such descriptions that those pieces of software are not the most well suited for certain applications? Or do you want people to write two pages of description every time they want to refer to the idea that a certain piece of software isn't the best choice for a certain application? Could you clarify what you think the problem here actually is?
We're all intimately familiar with everything that's terrible about all of our code, and we aren't shy about sharing. I just would like it if we could really lead with the details and refrain from value judgements :).
In this case, it seems like that's exactly what happened. I led with detail. The value judgement of PB being "bad" (which is a gross over-simplification, but a convenient shorthand) came afterwards.
Jean-Paul
On Aug 8, 2014, at 11:26 AM, exarkun@twistedmatrix.com wrote:
This was a complaint about a general trend, not about specific words. Clearly exarkun gave you the impression that it is "bad", whether he specifically said so or not.
Could you clarify what you think the problem here actually is?
I concede I was leaving some stuff out, so that wasn't the clearest description.
In the past six months or so I've been going to lots of events where I talk to people about Twisted, and why they might or might not want to use it. I've participated in this discussion several times:
Hypothetical Amalgam of Median Interlocutors Speaking Here: "I'm using Tulip because I really like its style of coroutines."
Glyph: "That's interesting. Did you know that Twisted has an equivalent style of coroutines, called inlineCallbacks, that's been around for years?"
HAMISH: "I saw that, and I asked about that a while ago and I heard it was bad. I haven't heard that Tulip has the same problems, though."
Glyph: "Really? What problems does inlineCallbacks have that Tulip's coroutines don't?"
HAMISH: "When I asked about it everybody told me I have to use Deferreds instead, but Deferreds are really confusing and they make your code look all gross, so I didn't want to do that. With Tulip I don't have to!"
Glyph: <facepalm>
Of course the problems that we describe with inlineCallbacks are the exact same problems that you will have with Tulip-style coroutines, and in fact in one of the conversations that was averaged out to produce the above composite, my interlocutor specifically mentioned that they'd already had the kind of bug that explicit-yield coroutines can sometimes encourage (thoughtlessly putting in too many 'yield's and not considering their consequences) and were wondering how Twisted dealt with that sort of thing.
I don't object to people using Tulip, or for that matter any of Twisted's event-driven competitors - I'm much happier if they're writing event-driven code of any stripe than just spawning a thread and writing until they block - but it does bother me if they select a different project to use or contribute to because of a perceived issue created only by our collective habit of being tersely self-critical.
When anyone directly involved with producing a thing describes that thing as "good", new observers tend to take it with a grain of salt. "Of course they think X is good, they work on X." When someone involved with a project describes it as "bad", though, even if it's a convenient shorthand for many people in the conversation for a well-understood set of complex issues, those new observers tend to think, "Wow, if even they describe X as bad, it must be really bad, they work on X!".
What I am asking everyone reading here to do is just avoid calling stuff "bad" or "gross" or "complicated". Even a stock stand-in phrase that more or less just means "bad" would be better. Even an unexplained "inappropriate for my use-case", for example, at least implies that the user might want to consider the system under discussion's appropriateness for their particular use-case.
We're all intimately familiar with everything that's terrible about all of our code, and we aren't shy about sharing. I just would like it if we could really lead with the details and refrain from value judgements :).
In this case, it seems like that's exactly what happened. I led with detail. The value judgement of PB being "bad" (which is a gross over-simplification, but a convenient shorthand) came afterwards.
Keep in mind that my introduction to this interaction was Daniel saying:
exarkun thinks pb is bad and that I should implement what I need in AMP.
You can see how I might have interpreted this to mean that you just said you think PB is bad :-). Nevertheless, Daniel didn't lead with the details and refrain from a value judgement, so the advice applies equally well to him. Which is why I filled out all those details, so other readers of the thread will know what "reasonable" and "bad" mean in this context. -glyph
Twisted dev people dudes,
Nevertheless, *Daniel* didn't lead with the details and refrain from a value judgement, so the advice applies equally well to him.
Lesson learned. Thanks. I agree that this is important.
Now to bring the thread back on-topic, I'd like to ask what pb should do *in principle*. In other words, what is the specification for the flavors? I think a discussion of each pb flavor would be helpful and would provide me material from which I can generate missing docstrings [1]. I'd rather do it this way instead of backing out what the standing implementation currently does so that I don't waste time working on something which is a fundamentally bad idea. This discussion should be a small investment at the present time.
Copyable: The functionality provided by Copyable is simple. The sender of the Copyable just sends it and *forgets*. Therefore, sending a Copyable is basically just sending atomic data in a particular format. I don't think we need to discuss this any further.
Referenceable: When I send you a Referenceable, I send a GUID so that you can later refer to that object. For example, I send you a message with argument (pseudo code)
"referenceable-'Joe'"
This is a declaration that I am keeping hold of an object called "Joe" upon which you may call methods remotely. Specifically, you can send me
"'Joe-foo-4"
which tells me to call Joe.foo(4) and send you the result.
1. How long should the GUID for Joe survive? If Joe is deleted can I reuse the name "Joe" for an object created later?
2. Do I notify you if Joe disappears on my side?
Let's stop here for now. I owe glyph some reviews [2].
Yours sincerely,
Daniel
[1] I already submitted a patch to the pb documentation and improved the submission based on review. I hope this provides some indication of my commitment to make material contributions. I mention this because glyph made a comment suggesting that showing real work would be valuable.
[2] Is this the beginning of a process which will lead me in the end to complete servitude and loss of ownership of my own soul?
On Aug 8, 2014, at 6:31 PM, Daniel Sank <sank.daniel@gmail.com> wrote:
Now to bring the thread back on-topic,
Yes, let's get back to it, shall we?
I'd like to ask what pb should do in principle. In other words, what is the specification for the flavors? I think a discussion of each pb flavor would be helpful and would provide me material from which I can generate missing docstrings [1]. I'd rather do it this way instead of backing out what the standing implementation currently does so that I don't waste time working on something which is a fundamentally bad idea. This discussion should be a small investment at the present time.
That sounds like a good idea.
Copyable: The functionality provided by Copyable is simple. The sender of the Copyable just sends it and *forgets*. Therefore, sending a Copyable is basically just sending atomic data in a particular format. I don't think we need to discuss this any further.
Yes. There's an important corollary to this: a Copyable ought to be immutable. A Copyable really represents a "value" in the functional programming sense, and not an "object" in the OO sense.
Referenceable: When I send you a Referenceable, I send a GUID so that you can later refer to that object.
It's not really a GUID. The "G" in GUID stands for "global", and the IDs in Referenceable specifically draw a distinction: <https://github.com/twisted/twisted/blob/a8227e5562a4f9074bb0d5faf6a10e910697...>. They're named LUIDs throughout. The ID is connection-local. When the PB connection goes away, so does the reference to that object.
For example, I send you a message with argument (pseudo code)
"referenceable-'Joe'"
This is a declaration that I am keeping hold of an object called "Joe" upon which you may call methods remotely. Specifically, you can send me
"'Joe-foo-4"
which tells me to call Joe.foo(4) and send you the result.
1. How long should the GUID for Joe survive? If Joe is deleted can I reuse the name "Joe" for an object created later?
Right now these IDs survive until the end of the connection. We might want to have other ways to address objects, but that should be something higher-level; a naming service that lets you request an object by some identifier. The ID is a counter, and since it's a Python integer, it'll never even wrap around, so it won't be re-used within the scope of the same connection.
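A minimal sketch of that ID scheme, under stated assumptions: ToyBroker, register and forget are hypothetical names for illustration, not Twisted's Broker API, and the real bookkeeping is more involved. The sketch only shows the two properties described above: the ID is local to one connection, and a counter means an ID is never handed out twice on that connection.

```python
import itertools


class ToyBroker:
    """Hypothetical stand-in for one connection's bookkeeping."""

    def __init__(self):
        self._counter = itertools.count(1)  # Python ints never wrap around
        self.local_objects = {}             # luid -> object (strong reference)
        self._luids = {}                    # id(object) -> luid

    def register(self, obj):
        """Return the connection-local ID for obj, allocating one if needed."""
        luid = self._luids.get(id(obj))
        if luid is None:
            luid = next(self._counter)
            self.local_objects[luid] = obj
            self._luids[id(obj)] = luid
        return luid

    def forget(self, luid):
        """Drop the object once the other side says it is done with it."""
        obj = self.local_objects.pop(luid)
        del self._luids[id(obj)]


joe = object()
connection_a = ToyBroker()
connection_b = ToyBroker()
print(connection_a.register(joe), connection_b.register(joe))
connection_a.forget(1)
print(connection_a.register(joe))
```

The same object gets ID 1 on both toy connections, since each connection counts independently. After it is forgotten and sent again on the first connection it gets ID 2, not a recycled 1.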
2. Do I notify you if Joe disappears on my side?
Yes. <https://github.com/twisted/twisted/blob/a8227e5562a4f9074bb0d5faf6a10e910697...>.
Let's stop here for now. I owe glyph some reviews [2].
Yours sincerely, Daniel
[1] I already submitted a patch to the pb documentation and improved the submission based on review. I hope this provides some indication of my commitment to make material contributions. I mention this because glyph made a comment suggesting that showing real work would be valuable.
Thanks for pointing that out. Stuff is happening, everybody ;-).
[2] Is this the beginning of a process which will lead me in the end to complete servitude and loss of ownership of my own soul?
Oh, don't worry. That's not the end. That is merely the beginning. -glyph
glyph,
2. Do I notify you if Joe disappears on my side?
Yes. <https://github.com/twisted/twisted/blob/a8227e5562a4f9074bb0d5faf6a10e910697...>.
That's the recipient announcing deletion, not the sender. And anyway, my question is how _should_ this work, not how does it work right now.
Daniel
On Aug 17, 2014, at 8:05 PM, Daniel Sank <sank.daniel@gmail.com> wrote:
glyph,
2. Do I notify you if Joe disappears on my side?
Yes. <https://github.com/twisted/twisted/blob/a8227e5562a4f9074bb0d5faf6a10e910697...>.
That's the recipient announcing deletion, not the sender.
On the sender's side, the object can't disappear unless the recipient sends the deletion. The recipient is holding a reference to it. This is by design - the sender can synthesize a restricted capability which has no use except for mediating the recipient's access to a particular resource (in fact, this is practically the only recommended way to use PB) and the reference to that object is held only by the server.
And anyway, my question is how _should_ this work, not how does it work right now.
It works this way now, and that part of the design is basically correct, I think, unless you're asking about some other aspect of it that I don't get ;). -glyph
glyph,
On the sender's side, the object *can't* disappear unless the recipient sends the deletion.
Surely a resource can disappear on the server. When that happens, any Referenceables being used to mediate access to that resource should go away... or something, right? I must not be thinking about this correctly. Daniel
On Aug 18, 2014, at 11:49 AM, Daniel Sank <sank.daniel@gmail.com> wrote:
glyph,
On the sender's side, the object can't disappear unless the recipient sends the deletion.
Surely a resource can disappear on the server. When that happens, any Referenceables being used to mediate access to that resource should go away... or something, right? I must not be thinking about this correctly.
What do you mean by "disappear"? A "resource" - i.e. a Referenceable - is just a Python object in memory. One could of course write an intentionally malicious PB server that made it appear that an object had "disappeared" by responding with errors to all method calls sent over the wire, but in normal operation, Python objects don't spontaneously ascend to a different plane of existence - as long as there are pointers to them in memory (in the case of Referenceables that are currently in use, a reference from a dictionary on the Broker instance for the client which is using them) they will remain alive indefinitely. -glyph
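The point about pointers keeping objects alive can be checked with a few lines of plain Python. This is a sketch under assumptions: the local_objects dictionary merely stands in for the dictionary on the Broker instance, and Resource and is_alive_after_del are hypothetical names.

```python
import gc
import weakref


class Resource:
    """Hypothetical object that a remote peer has been given access to."""


def is_alive_after_del(held_by_broker):
    """Drop the application's own name for a Resource and report
    whether the object still exists afterwards."""
    local_objects = {}                 # stands in for the Broker's dictionary
    resource = Resource()
    probe = weakref.ref(resource)      # lets us observe without keeping it alive
    if held_by_broker:
        local_objects[1] = resource    # strong reference held for the peer
    del resource                       # the application lets go of its name
    gc.collect()                       # not needed on CPython, harmless elsewhere
    return probe() is not None


print(is_alive_after_del(held_by_broker=True))
print(is_alive_after_del(held_by_broker=False))
```

While the dictionary entry exists the object survives the del; without it, the object is gone as soon as the last name for it is removed.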
glyph,
A "resource" - i.e. a Referenceable - is just a Python object in memory.
Indeed.
but in normal operation, Python objects don't spontaneously ascend to a different plane of existence - as long as there are pointers to them in memory
Of course.
in the case of Referenceables that are currently in use, a reference from a dictionary on the Broker instance for the client which is using them
Suppose I have a Thingy:
myThingy = Thingy()
I want to give you some amount of access to manipulate myThingy, so I make a Referenceable which has some connection to it:
myReferenceable.thingy = weakref.proxy(myThingy)
and I send you the Referenceable. Now suppose I do
del myThingy
Now myThingy will be garbage collected. Then, if you invoke methods on myReferenceable, they'll fail. Is this what we want, or should I tell you that your RemoteReference should be considered stale?
If I'm not thinking about this correctly please advise. I realize that I could have done
myReferenceable.thingy = myThingy
so that myThingy lives as long as myReferenceable, but this doesn't actually seem like what I would normally want.
Daniel
On Aug 18, 2014, at 12:37 PM, Daniel Sank <sank.daniel@gmail.com> wrote:
glyph,
A "resource" - i.e. a Referenceable - is just a Python object in memory.
Indeed.
but in normal operation, Python objects don't spontaneously ascend to a different plane of existence - as long as there are pointers to them in memory
Of course.
in the case of Referenceables that are currently in use, a reference from a dictionary on the Broker instance for the client which is using them
Suppose I have a Thingy:
myThingy = Thingy()
I want to give you some amount of access to manipulate myThingy, so I make a Referenceable which has some connection to it:
myReferenceable.thingy = weakref.proxy(myThingy)
and I send you the Referenceable. Now suppose I do
del myThingy
Now myThingy will be garbage collected. Then, if you invoke methods on myReferenceable, they'll fail. Is this what we want, or should I tell you that your RemoteReference should be considered stale?
If I'm not thinking about this correctly please advise. I realize that I could have done
myReferenceable.thingy = myThingy
so that myThingy lives as long as myReferenceable, but this doesn't actually seem like what I would normally want.
Why would you not normally want that? What you're saying here is that MyReferenceable requires a thingy in the 'thingy' attribute to do its job. MyReferenceable is a Python class in your application - its clients will call its methods, and it should take care that its methods do something sensible. The fact that its clients are remote via the Broker class is almost irrelevant. If you pass a type MyReferenceable doesn't expect - a weakref.proxy that suddenly becomes invalid when the inner object goes away - you'll get nonsense behavior. But this isn't specific to Referenceable or remote access - if you just had an A and a B, and A expects a 'b' attribute that's a B, and you set 'b' to something that isn't a B, you get the same kind of nonsense behavior. (Also, if you require a 'thingy' attribute it should probably be a constructor argument rather than an externally-set attribute, so that the instance is initially in a valid state.) -glyph
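Both variants can be compared side by side in a small sketch. The classes here are hypothetical plain-Python stand-ins (in a real application MyReferenceable would subclass twisted.spread.pb.Referenceable), and the constructor argument follows the suggestion above.

```python
import gc
import weakref


class Thingy:
    def poke(self):
        return "poked"


class MyReferenceable:
    """Hypothetical stand-in for an application's Referenceable subclass."""

    def __init__(self, thingy):
        # Required collaborator passed in up front, so the instance
        # is in a valid state from the start.
        self.thingy = thingy

    def remote_poke(self):
        return self.thingy.poke()


def poke_after_del(wrap):
    """Build a MyReferenceable around wrap(thingy), drop the caller's
    own reference to the Thingy, then try to use it."""
    thingy = Thingy()
    referenceable = MyReferenceable(wrap(thingy))
    del thingy
    gc.collect()
    try:
        return referenceable.remote_poke()
    except ReferenceError:
        return "ReferenceError"


print(poke_after_del(lambda t: t))    # strong reference
print(poke_after_del(weakref.proxy))  # weak proxy
```

With the strong reference the method keeps working after the del. With weakref.proxy the attribute access raises ReferenceError once the Thingy is gone, which is the "nonsense behavior" a remote caller would run into.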
On Mon, Aug 18, 2014 at 3:37 PM, Daniel Sank <sank.daniel@gmail.com> wrote:
and I send you the Referenceable. Now suppose I do
del myThingy
Now myThingy will be garbage collected.
No, this is simply incorrect. 'del myThingy' simply removes a reference to the object to which myThingy refers. If and only if that's the last reference (as determined by Python's reference counting), it is deleted. As glyph said, as long as there is an outstanding remote reference, the Broker keeps a Python reference to the object internally, preventing the reference count from reaching zero, preventing the object from being deleted. Dustin
Dustin,
No, this is simply incorrect. 'del myThingy' simply removes a reference to the object to which myThingy refers.
Argh. I'm assuming, as in the example, that the only strong reference to myThingy is the one I own.
the Broker keeps a Python reference to the object internally, preventing the reference count from reaching zero, preventing the object from being deleted.
I understand that. I'm trying to ask if that's how it _should_ work. If I have a resource and make a Referenceable to give you access to it, it doesn't really make sense to me that my resource should be kept alive just because you have that access. It seems more reasonable to me that your access object should reference my resource _weakly_ and that you should receive some kind of notification if and when the resource expires.
It's just like the case of a GUI and a business logic object. The GUI probably gets a reference to the business logic object so that eg. button pushes can invoke methods on the object. However, that reference should probably be weak so that the business logic object can be garbage collected when it's finished with its business. There's no sense (to me) in keeping an object alive because a GUI, logger, or other observer is observing it. Am I just wrong?
Daniel
Indeed, what you're expecting is very much against the design of PB.
Dustin
Dustin and glyph,
Indeed, what you're expecting is very much against the design of PB.
I see. The existing Referenceable code now makes sense to me, since I now understand the intent. Thank you.
Would having something like Referenceable but which is not reference counted across the network, and which receives a notification when the server-side object dies, be useful [1]?
I'm thinking of the case where I use pb to play chess over the network. If a piece is captured, the server's reference to that object will be deleted. There is no sense, in this case, for the object representing the piece to persist, and certainly not because the clients happen to have knowledge that the piece existed once upon a time.
Daniel
[1] I haven't delved into the Viewable code yet, so that might be what I'm describing.
Adding what amounts to a use-after-free bug to the protocol seems like a really bad idea to me. Perhaps you see a more compelling use-case than the chess example.
In just about any case I can think of, I'm not at all concerned about when garbage collection takes place. Certainly it opens you to a denial of service from a malicious client, which might hold references to objects you'd rather it didn't, but PB's not made for use in hostile situations, and anyway it has an upper limit (hard-coded to 4096) on the number of references a client can hold. Trust me, Buildbot users run up against that limit all the time.
If your server-side app is sensitive to when objects are destroyed for any reason aside from management of its memory consumption, I'd argue your app is broken.
It's worth noting that Buildbot also ran into a nasty bug in an older version of Twisted that caused the broker to not unreference objects correctly, which led to quite a bit of memory usage.
Dustin
Dustin,
Adding what amounts to a use-after-free bug to the protocol seems like a really bad idea to me.
Oh goody, a sarcastic comment which doesn't actually bother to explain the bug :) Since what I describe is basically WeakReferenceable, it's not obvious to me that there's a bug. I tell you when the thing to which your Remote(Weak)Reference points is destroyed, just like weak references invoke finalization callbacks. If you try to invoke remote methods after that happens I just return you an error. What's the problem? This is exactly how weak references work.
If your server-side app is sensitive to when objects are destroyed for any reason aside from management of its memory consumption, I'd argue your app is broken.
That's a really good point. Only CPython destroys objects deterministically when the ref count hits zero.
Certainly it opens you to a denial of service from a malicious client, which might hold references to objects you'd rather it didn't
So my instinct to keep strong references only where they're needed is just bad? Should a GUI or a logger keep a strong reference to the things they observe? Daniel
On Aug 18, 2014, at 8:31 PM, Daniel Sank <sank.daniel@gmail.com> wrote:
Dustin,
Adding what amounts to a use-after-free bug to the protocol seems like a really bad idea to me.
Oh goody, a sarcastic comment which doesn't actually bother to explain the bug :)
Oddly this is _exactly_ how a co-worker characterized the problem as well, with (as far as I can see) no communication of this idea ;). <http://cwe.mitre.org/data/definitions/416.html> describes the disastrous consequences of this in C; the Python equivalent is mild, but it's still annoying to get 'None has no attribute "frob"' when you do self.frobber.frob().
Since what I describe is basically WeakReferenceable, it's not obvious to me that there's a bug. I tell you when the thing to which your Remote(Weak)Reference points is destroyed, just like weak references invoke finalization callbacks. If you try to invoke remote methods after that happens I just return you an error. What's the problem? This is exactly how weak references work.
The thing about weak references is that they are almost always to objects that are shared, that have a well-defined lifecycle. Your PB objects generally should _not_ be shared; a well-structured PB application will almost always create thin wrapper objects, factoring the remote-access-control logic out of the core model abstractions. Exposing weak references across a network is also very, very tricky: distributed reference counting is hard enough without trying to introduce real distributed GC that involves reference tracing.
If your server-side app is sensitive to when objects are destroyed for any reason aside from management of its memory consumption, I'd argue your app is broken.
That's a really good point. Only CPython destroys objects deterministically when the ref count hits zero.
Yes. And even the CPython developers have said it's effectively a bug to depend on this behavior, because you can't really know when it's going to happen. A debugger might be holding on to your stack frames for a little while. The profiler might be keeping your locals around for a moment longer than you expect. This is why we have idioms like 'with open(...) as f' now.
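As a small illustration of that idiom, the sketch below (the helper name `write_and_inspect` is made up for this example) shows that the `with` block closes the file deterministically while the name `f` still refers to the file object afterwards; nothing depends on when the object is garbage collected.

```python
import os
import tempfile


def write_and_inspect():
    # Create a scratch file so the example is self-contained.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w") as f:
            f.write("hello")
            closed_inside = f.closed
        # f still names the file object here; it was closed, not collected.
        closed_after = f.closed
    finally:
        os.remove(path)
    return closed_inside, closed_after
```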
Certainly it opens you to a denial of service from a malicious client, which might hold references to objects you'd rather it didn't
So my instinct to keep strong references only where they're needed is just bad?
Here are a couple of ways to solve this problem without getting weak references involved:
Set limits on things. PB isn't great about this, but this is an area where it could get better, and where all the fixes are really straightforward (find the place where PB does a thing, set a limit, raise an exception if the thing is over that limit). In this case, don't allow clients to hold unlimited numbers of simultaneous references. Start throwing errors when too many live references exist on one connection. A reasonable application should not need that many at once - if you set a limit at around 1024 and allow servers to tune it for particular workloads, it should be fine. (Set it per-type maybe?)
Don't give out references to objects you can't revoke logically, at an application level. If you have a chess piece that has been taken, that is not a NULL pointer or None. There is a small, fixed number of chess pieces per game, so you don't have to worry about denials of service. Therefore your ChessPiece class should have a 'taken' state associated with it; in fact, you could remember which move of the game the piece was taken on, and produce an error message which specifically reminds the player when it was taken. Much like 'with open(...) as f' explicitly invokes 'f.close()' at the end of the block but does not force 'f' to get garbage collected, you should still be able to have a PB protocol-level reference to an application-level revoked object.
Debugging distributed systems is hard enough without translating every revoked-permission error into some common "the distributed GC happened, I don't know what happened to your object, life is hard".
Is this making sense?
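Both suggestions can be sketched in a few lines of plain Python. Everything here is hypothetical: `ChessPiece`, `PieceTaken`, `ReferenceTable` and `TooManyReferences` are illustrative names, not PB classes, and the default cap of 1024 is just the figure floated above. In a PB application the piece would presumably sit behind a thin Referenceable or Viewable wrapper that delegates to it.

```python
class PieceTaken(Exception):
    """Raised when a move is requested for a piece that is off the board."""


class ChessPiece:
    """Hypothetical model object with an explicit application-level 'taken' state."""

    def __init__(self, name):
        self.name = name
        self.taken_on_move = None  # None means the piece is still in play

    def take(self, move_number):
        self.taken_on_move = move_number

    def move_to(self, square):
        if self.taken_on_move is not None:
            # A specific, debuggable error instead of a vanished object.
            raise PieceTaken(
                "%s was taken on move %d" % (self.name, self.taken_on_move)
            )
        return "%s to %s" % (self.name, square)


class TooManyReferences(Exception):
    """Raised when one connection tries to hold too many live references."""


class ReferenceTable:
    """Hypothetical per-connection table of live references with a tunable cap."""

    def __init__(self, limit=1024):
        self.limit = limit
        self._live = {}

    def add(self, object_id, obj):
        if object_id not in self._live and len(self._live) >= self.limit:
            raise TooManyReferences("limit of %d reached" % self.limit)
        self._live[object_id] = obj

    def release(self, object_id):
        self._live.pop(object_id, None)
```

A client that keeps a reference to a captured piece can still call it; it simply gets told when the piece was taken, which is far easier to debug than a generic "object is gone" error.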
Should a GUI or a logger keep a strong reference to the things they observe?
As my other message indicated - yes :). -glyph
On 03:31 am, sank.daniel@gmail.com wrote:
Dustin,
Adding what amounts to a use-after-free bug to the protocol seems like a really bad idea to me.
Oh goody, a sarcastic comment which doesn't actually bother to explain the bug :)
Sarcastic? What? Dustin *was* explaining the problem. Without sarcasm, so far as I can tell. Jean-Paul
My response may have come across as a little ambiguous, so let me also say - I think use-after-free is a pretty accurate description of the issue as well, I don't believe this was intended sarcastically. -glyph
And since it's come up, no, I didn't mean that either. It was a (somewhat colorful) description of my perspective on the question.
Dustin
On Aug 18, 2014, at 2:25 PM, Daniel Sank <sank.daniel@gmail.com> wrote:
It's just like the case of a GUI and a business logic object. The GUI probably gets a reference to the business logic object so that eg. button pushes can invoke methods on the object. However, that reference should probably be weak so that the business logic object can be garbage collected when it's finished with its business. There's no sense (to me) in keeping an object alive because a GUI, logger, or other observer is observing it. Am I just wrong?
When it comes to GUI toolkits, there are two philosophies on this.
One, embodied in toolkits like OS X's Cocoa and (if you squint at it just right) Qt, is that this reference should always be weak (or, you know, __unsafe_unretained which is like "weak" with a bit of a speech impediment) because something else (a window management layer, for example, or a data-access layer updating some data) will probably be holding the reference. This is popular in C-style toolkits with an object model and reference counting because there's often an implicit circular reference between a view and its controller, and cleaning that up in C or C++ can be messy.
Another, embodied in toolkits like GTK+ and the JavaScript DOM, is that this reference should always be strong, because the GUI can logically manipulate the model object it refers to, and so it should have a strong reference - otherwise GUI actions might spontaneously start causing crashes when something unrelated forgets about that object.
I am a big fan of the latter style. Although there is often something to hold that strong reference, sometimes there is actually nothing else to hold it, and so you have to create bizarre lifecycle shenanigans to replicate the fairly straightforward behavior of "the user's eyeballs are looking at the screen, there's a window on the screen, the window refers to my model object, therefore the user's eyeballs have a strong reference to my model object". Some things that present GUIs are observers, some things are manipulators; the former model works for observers, the latter model works well for both.
So I'm inclined to say you're wrong. However, according to the efficient-market hypothesis, Cocoa must be better than any of those other things, so I may be in a minority there.
Nevertheless in PB the distinction is even more stark: if your example is that you have a model object with a GUI observer, it is the GUI that would expose the Referenceable, because the model would need to call methods on the view to update it. So this isn't about whether your model stays memory-resident while the GUI is up, but rather, whether the GUI itself stays memory-resident while the model is alive! Obviously you wouldn't want your GUI or your logger to disappear while the model is still active. -glyph
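The direction of the reference described here can be sketched as follows. `LabelView` and `CounterModel` are hypothetical names; the model calls a plain method where a real PB application would presumably hold a RemoteReference to the view and use `callRemote("update", value)`.

```python
class LabelView:
    """Hypothetical GUI-side object; in PB this is the side that would be Referenceable."""

    def __init__(self):
        self.text = ""

    def remote_update(self, value):
        self.text = "count = %d" % value


class CounterModel:
    """Hypothetical model that pushes updates to whatever is observing it."""

    def __init__(self):
        self.value = 0
        self._observers = []  # strong references: views live as long as the model

    def add_observer(self, view):
        self._observers.append(view)

    def increment(self):
        self.value += 1
        for view in self._observers:
            # Stand-in for view_reference.callRemote("update", self.value).
            view.remote_update(self.value)
```

Because the model holds the views strongly, a view registered with `add_observer` keeps receiving updates even if nothing else in the program refers to it.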
On 8 Aug 2014, at 23:59, Glyph Lefkowitz wrote:
I've participated in this discussion several times:
Hypothetical Amalgam of Median Interlocutors Speaking Here: "I'm using Tulip because I really like its style of coroutines."
Glyph: "That's interesting. Did you know that Twisted has an equivalent style of coroutines, called inlineCallbacks, that's been around for years?"
HAMISH: "I saw that, and I asked about that a while ago and I heard it was bad. I haven't heard that Tulip has the same problems, though."
Glyph: "Really? What problems does inlineCallbacks have that Tulip's coroutines don't?"
HAMISH: "When I asked about it everybody told me I have to use Deferreds instead, but Deferreds are really confusing and they make your code look all gross, so I didn't want to do that. With Tulip I don't have to!"
Glyph: <facepalm>
That btw is something I’ve been trying to fight on IRC whenever I can for months now. @inlineCallbacks may be worse than pure Deferreds in some ways, but they are amazing to get people to give Twisted a chance and start appreciating it (most people still have no clue what Twisted actually can do for them; hence the “who needs Twisted when we have tulip!?” questions). And FWIW I have had a mid-sized Twisted application running on top of @inlineCallbacks for years now and it works just fine.
People finally stopped knee-jerking at async/event-based programming and we’re keeping them out by being perfectionist smart-asses. Next time someone asks about them, keep your “ugh inlineCallbacks” to yourself; a future contributor may come out of it.
+1 I stopped trying to use @inlineCallbacks because I was told it was bad (without much explanation why) anytime I posted a code snippet and asked for help with something that was using it. End result was that I didn’t really want to write much Twisted code because I don’t like the style of code where you’re working with pure Deferreds. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
This discussion about inlineCallbacks has nothing to do with the title of this thread. Someone already created a spin-off thread talking about inlineCallbacks. Let's use that.
On Aug 4, 2014, at 9:47 PM, Daniel Sank <sank.daniel@gmail.com> wrote:
glyph,
2. Is there a specification for the pb dialect of banana?
Beyond the code, no.
Ok.
I would be happy to answer questions, but obviously I'm not super responsive :). Let me know what you need.
For two personal projects, I would like to have a reasonable remote objects library in python. I need something which can announce state changes to clients, and receive state change requests from clients. My solution:
1. Make server side class which can spawn Cacheables when it wants to tell clients of its existence. 2. Give RemoteCaches to clients and use observe_* methods as appropriate. 3. Stuff a Viewable into the RemoteCaches so that clients can request state changes.
Question #1: Is this a reasonable use of pb?
Yes.
This all worked great until I ran into a bug. In trying to fix the bug, I found that
1. pb code is really hard to understand
Sorry about that.
2. exarkun thinks pb is bad and that I should implement what I need in AMP.
I really wish we would stop calling things "bad" and "good". This isn't a helpful classification. PB is adequate for a particular set of requirements. Those requirements are somewhat unusual, and AMP is better for a lot of use-cases. It sounds to me like you are a lot more interested in
3. exarkun thinks banana and jelly are reasonable.
Again, what does "reasonable" mean in this context? Let me explain my own opinion about this. Banana is a perfectly serviceable low-level marshaling format. It's pretty efficient when compared to something like JSON, and has compression mechanisms which can make it even more efficient (the "dialect" support you referred to). The only thing about it that isn't very general is that its implementation (although not the protocol specification) hard-codes the PB abbreviated-string dialect. Jelly is higher level, but more language-specific. Its specification implicitly encodes numerous Python implementation details, like the distinction between "tuple" and "list". It also couples very tightly to your program's structure. This can be a real benefit to getting a protocol up and running quickly, but it still allows you to create protocols where you don't really know what the wire format is, where you develop hidden dependencies. In more complex protocols (where the "ease of getting up and running quickly" thing really starts to shine) this attribute of Jelly can cause real difficulty in any kind of cross-system communication: communicating with a peer from a different language, or even in Python without access to all the protocol class definitions from the original system, is hard because it requires reverse-engineering. This is where it becomes "bad". Still, it isn't as big of a disaster security- and maintenance-wise as Pickle. The information you need is recorded in the code, it's just spread out, you don't need to work backwards from protocol dumps. If I were going to spend some time maintaining PB, this is where I'd focus: if the schemas were a bit more explicit, could be collected into one place more easily, and were all validated in advance (before passing deserialized objects to the application code, or serializing them across the wire), then these problems could be addressed without changing the API too much. 
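The remark about Jelly encoding Python details such as the tuple/list distinction can be illustrated with the standard library. The sketch below first shows that a language-neutral format like JSON drops that distinction, then uses a toy tagged encoding to preserve it. The tagged scheme is an assumption made up for this example and is not Jelly's actual wire format.

```python
import json


def json_round_trip(value):
    """Serialize to JSON text and back."""
    return json.loads(json.dumps(value))


def tagged_encode(value):
    """Toy encoder (not Jelly's real format) that records the container type."""
    if isinstance(value, tuple):
        return ["tuple"] + [tagged_encode(v) for v in value]
    if isinstance(value, list):
        return ["list"] + [tagged_encode(v) for v in value]
    return value


def tagged_decode(data):
    """Inverse of tagged_encode: rebuild tuples and lists from their tags."""
    if isinstance(data, list):
        tag = data[0]
        items = [tagged_decode(v) for v in data[1:]]
        return tuple(items) if tag == "tuple" else items
    return data
```

Plain JSON turns `(1, 2)` into `[1, 2]`, whereas the tagged form survives the same round trip as a tuple. The price is the one described above: the wire format now carries Python-specific type information that a peer in another language has to understand.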
PB basically just inherits all of the benefits and caveats of Jelly. It's a trivial serialization of remote references to objects.
Question #2: Would you recommend implementing a simplified replacement for pb on top of banana/jelly, or starting over from AMP? I favor the banana/jelly route because the protocol seems intrinsically flexible, but I read your blog explaining why protocols like banana are bad, so I'm confused about what I "should" do.
First of all, don't take my development advice as gospel. When I write an article and publish it, I'm just trying to make people aware of issues they may not have considered; make your own decisions about how to write your own code. (Unless your decision is to write it yourself in PHP, of course, in which case you are a danger to yourself and others and should be remanded to compulsory treatment.) It seems like PB fits your style, and the problems with it are all tractable and fixable. I am sad that you're not getting the development support you need to maintain it (most of all I'm sad you're not getting it from me!) but let's see if we can fix that. I'll start by replying to your other email. One thing that might speed things along is if you can help out with some code reviews. We've got a _really_ long queue right now and that's making it hard for me to spend any focused effort in one particular area. I'm happy to trade 2-for-1 - if you do two code reviews, I will regard it as an immediate obligation for me to review a ticket you direct me to ;). It might also help to write more small, simple patches for PB. Especially adding docstrings to make the nature of your other, more complex changes easier for reviewers to understand. These would also be easier to land, and a couple of decades in open source has taught me that nothing motivates development activity like successful development activity ;). Good luck, -glyph
This discussion seems relevant to a design pattern, "Yearbook Lifecycle", that I'm using for athleets.com and junkeet.com.
Design Goal:
1) Maintain yearbooks across state, allowing students to delegate access control to classmates as a series of transformations (recursive web renders): "here, sign my yearbook", "here, read what so-and-so wrote", "can I take your yearbook to PE? We might skip class."
2) Persist inside a context indexer that answers yearbook queries, with generic access to the t0 and viewable access to t1+ transforms, with request controls represented as e0, e1 being another recursive web template kept within the master at t-1.
So a complex use case would be a yearbook message like:
"OMG - did you read {s1.p3.t4}? cause {e4.s3} said f_td(after|before) {e2.s6} was kissing behind {p3.t0.a4}. See you this summer good luck at {owner.ext['college']}! BFF, {s3}"
Fake Model:
YearbookPage(Element):
SignedYearbookPage(YearbookPage):
Yearbook(pb.Copyable)
  -pages
  -accessControlLog
StudentYearbook(pb.Viewable)
  --transformation
YearbookServiceTransformer(??)
Publish(pb.Root)
  remote_issue_yearbook(studentid=None): // none can read but not signed
Athleets.com is ready(ing) for yearbook-ish events, i.e. when a player gets traded or hurt. Other events act across the yearbook motif, regenerating the t0 when athlete data changes, which republishes all non-t0 copies that exist.
The Sender/Receiver model from the example is difficult to follow because I tend to think of the Originator only, the originator being a service that waits for clients in need of a working copy. A sender doesn't exist; it would be more of a challenger that, if recognized as more authoritative, replaces the pb.Root service.
Outside of the copyable, I'm trying to upstream, via remotes, pieces within the copyable that need updating. I'm delving into viewable to bring caller identity data to the web templating process, not necessarily to restrict access, but also to build a system that maintains audit history within the yearbook model.
Getting to areas of non-working code, so let me report back any success.
Cheers
Kevin
On Tue, Aug 5, 2014 at 11:55 AM, Glyph Lefkowitz <glyph@twistedmatrix.com> wrote:
On Aug 4, 2014, at 9:47 PM, Daniel Sank <sank.daniel@gmail.com> wrote:
glyph,
2. Is there a specification for the pb dialect of banana?
Beyond the code, no.
Ok.
I would be happy to answer questions, but obviously I'm not super responsive :). Let me know what you need.
For two personal projects, I would like to have a reasonable remote objects library in python. I need something which can announce state changes to clients, and receive state change requests from clients. My solution:
1. Make server side class which can spawn Cacheables when it wants to tell clients of its existence.
2. Give RemoteCaches to clients and use observe_* methods as appropriate.
3. Stuff a Viewable into the RemoteCaches so that clients can request state changes.
Question #1: Is this a reasonable use of pb?
Yes.
This all worked great until I ran into a bug. In trying to fix the bug, I found that
1. pb code is really hard to understand
Sorry about that.
2. exarkun thinks pb is bad and that I should implement what I need in AMP.
I really wish we would stop calling things "bad" and "good". This isn't a helpful classification. PB is adequate for a particular set of requirements. Those requirements are somewhat unusual, and AMP is better for a lot of use-cases.
It sounds to me like you are a lot more interested in
3. exarkun thinks banana and jelly are reasonable.
Again, what does "reasonable" mean in this context?
Let me explain my own opinion about this.
Banana is a perfectly serviceable low-level marshaling format. It's pretty efficient when compared to something like JSON, and has compression mechanisms which can make it even more efficient (the "dialect" support you referred to). The only thing about it that isn't very general is that its implementation (although not the protocol specification) hard-codes the PB abbreviated-string dialect.
Jelly is higher level, but more language-specific. Its specification implicitly encodes numerous Python implementation details, like the distinction between "tuple" and "list". It also couples very tightly to your program's structure. This can be a real benefit to getting a protocol up and running quickly, but it still allows you to create protocols where you don't really know what the wire format is, where you develop hidden dependencies. In more complex protocols (where the "ease of getting up and running quickly" thing really starts to shine) this attribute of Jelly can cause real difficulty in any kind of cross-system communication: communicating with a peer from a different language, or even in Python without access to all the protocol class definitions from the original system, is hard because it requires reverse-engineering. This is where it becomes "bad". Still, it isn't as big of a disaster security- and maintenance-wise as Pickle. The information you need *is* recorded in the code, it's just spread out, you don't need to work backwards from protocol dumps. If I were going to spend some time maintaining PB, this is where I'd focus: if the schemas were a bit more explicit, could be collected into one place more easily, and were all validated in advance (before passing deserialized objects to the application code, or serializing them across the wire), then these problems could be addressed without changing the API too much.
PB basically just inherits all of the benefits and caveats of Jelly. It's a trivial serialization of remote references to objects.
Question #2: Would you recommend implementing a simplified replacement for pb on top of banana/jelly, or starting over from AMP? I favor the banana/jelly route because the protocol seems intrinsically flexible, but I read your blog explaining why protocols like banana are bad, so I'm confused about what I "should" do.
First of all, don't take my development advice as gospel. When I write an article and publish it, I'm just trying to make people aware of issues they may not have considered; make your own decisions about how to write your own code.
(Unless your decision is to write it yourself in PHP, of course, in which case you are a danger to yourself and others and should be remanded to compulsory treatment.)
It seems like PB fits your style, and the problems with it are all tractable and fixable. I am sad that you're not getting the development support you need to maintain it (most of all I'm sad you're not getting it from me!) but let's see if we can fix that. I'll start by replying to your other email.
One thing that might speed things along is if you can help out with some code reviews. We've got a _really_ long queue right now and that's making it hard for me to spend any focused effort in one particular area. I'm happy to trade 2-for-1 - if you do two code reviews, I will regard it as an immediate obligation for me to review a ticket you direct me to ;).
It might also help to write more small, simple patches for PB. Especially adding docstrings to make the nature of your other, more complex changes easier for reviewers to understand. These would also be easier to land, and a couple of decades in open source has taught me that nothing motivates development activity like successful development activity ;).
Good luck,
-glyph
participants (8): Daniel Sank, Donald Stufft, Dustin J. Mitchell, exarkun@twistedmatrix.com, Glyph, Glyph Lefkowitz, Hynek Schlawack, Kevin Mcintyre