[Twisted-Python] Deferred documentation rewrite
![](https://secure.gravatar.com/avatar/434aee9ad675384a9e745c7217ac4abe.jpg?s=120&d=mm&r=g)
Hello all, I have been prodded by the members of #twisted into rewriting the Deferred documentation. You can check out the plan at this ticket: http://twistedmatrix.com/trac/ticket/3943 Comments would be appreciated. Cheers, Edward P.S. If you reply on the mailing list, please CC me.
![](https://secure.gravatar.com/avatar/9ba6ae09ad47f1dd0dce031fa052185a.jpg?s=120&d=mm&r=g)
Hi Edward, On Thu, Jul 30, 2009 at 11:24 AM, Edward Z. Yang<ezyang@mit.edu> wrote:
Your outline looks nice. Something that *really* helped me a lot with Deferreds was seeing how they are modelled after standard Python flow control behaviour. I guess that's what the first section you're proposing is about. Jono Lange gave a presentation recently (can't remember what it was called... maybe something about being an evil hacker or about how your code sucks and he hates you) where he presented step-by-step slides that show some normal Python code and then the asynchronous Deferred-using equivalent. Even though I understood the principles reasonably well before attending his talk, the way he presented them in his slides was very effective and helped me clarify that understanding. If he's willing, which I suspect he will be, I recommend you look at the slides and steal his good ideas. :) Thanks, J.
![](https://secure.gravatar.com/avatar/95658668bebdc20db10bb5ccc1603017.jpg?s=120&d=mm&r=g)
2009/7/31 ezyang <ezyang@mit.edu>:
The accompanying paper is here: http://mumak.net/stuff/twisted-intro.html Cheers, mwh
![](https://secure.gravatar.com/avatar/53cf3ca941974336efe1921cdef8e83b.jpg?s=120&d=mm&r=g)
----- Original Message ----- From: "Michael Hudson" <micahel@gmail.com> To: "Twisted general discussion" <twisted-python@twistedmatrix.com> The accompanying paper is here: http://mumak.net/stuff/twisted-intro.html Thanks for this incredibly helpful link! -Dave
![](https://secure.gravatar.com/avatar/434aee9ad675384a9e745c7217ac4abe.jpg?s=120&d=mm&r=g)
You can view an initial draft of the rewrite here: http://ezyang.com/twisted/defer2.html

For reference, this is the planned outline (X means done, ? means almost done):

X Synchronous to Asynchronous: The Method to the Madness
  X Convert synchronous code to asynchronous code
  X Why asynchronous?
- Deferred
  X Basic operation
  - Convenience primitives (succeed, fail, execute, maybeDeferred)
  ? Callback/Errback chaining
  - Timeouts
- Composing deferreds
  - DeferredList/gatherResults
  - chainDeferred
- Advanced topics
  - Deferred asynchronous primitives
  - Sugar syntax

Cheers, Edward

P.S. Please CC me in your replies! Thanks.
![](https://secure.gravatar.com/avatar/d7875f8cfd8ba9262bfff2bf6f6f9b35.jpg?s=120&d=mm&r=g)
On Fri, 2009-07-31 at 18:40 -0400, Edward Z. Yang wrote:
It's great you're working on this! The docs on deferreds certainly need help. The problem with this is that it perpetuates the misunderstanding that Deferreds *make* things asynchronous, even with the intro that says otherwise. I think it's better to assume already asynchronous code, handling the transition from synchronous to async in an intro event loop howto. A better comparative exposition might be with normal callbacks, e.g.: "def foo(x, gotResultCallback): pass" vs. "def foo(x): # return Deferred". At the very least having that async but callbacky version in the middle helps understanding. It also omits half the story: how you *create* Deferreds. There should be a section on that as well. An example involving a parser, where you just wave your hands about who pushes data in to the parser exactly (so no need to go into event loop details), may work well. In particular, the object that wants the result of the parsing wants to get parse errors, *not* whoever pushes data in. Often it's the same object, but not always. Deferreds help with that.
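The comparison suggested above can be sketched in a few lines. This is only an illustration under stated assumptions: `Placeholder` is a hypothetical toy class, not Twisted's Deferred, and the `pending` list stands in for whatever event loop would complete the work later. The names `foo_callback_style`, `foo_placeholder_style` and `run_pending` are made up for this example.

```python
pending = []  # work that a pretend event loop will complete later


def foo_callback_style(x, gotResultCallback):
    """Async but callbacky: the caller hands the callback in as a parameter."""
    pending.append(lambda: gotResultCallback(x * 2))


class Placeholder:
    """Toy stand-in for a Deferred (hypothetical, much simpler than the real one)."""

    def __init__(self):
        self.callbacks = []

    def addCallback(self, fn):
        self.callbacks.append(fn)

    def fire(self, result):
        # Simplified: every callback receives the same result, no chaining.
        for fn in self.callbacks:
            fn(result)


def foo_placeholder_style(x):
    """Same asynchronous work, but the callback is attached to a returned object."""
    d = Placeholder()
    pending.append(lambda: d.fire(x * 2))
    return d


def run_pending():
    """Complete all outstanding work, as an event loop eventually would."""
    while pending:
        pending.pop(0)()


results = []
foo_callback_style(3, results.append)
foo_placeholder_style(4).addCallback(results.append)
assert results == []  # neither call has produced a result yet
run_pending()
print(results)  # [6, 8]
```

In both styles the work is already asynchronous before any placeholder is involved; the returned object only changes where the callback gets attached.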
![](https://secure.gravatar.com/avatar/e1554622707bedd9202884900430b838.jpg?s=120&d=mm&r=g)
On Fri, Jul 31, 2009 at 6:40 PM, Edward Z. Yang <ezyang@mit.edu> wrote:
You can view an initial draft of the rewrite here:
This is a great first draft! Very substantial. I really appreciate you working on this. Now I will proceed to rip it to shreds by way of giving you some feedback, but please try to take this as a constructive review. I'm happy with what you've got but given the large amount of dissatisfaction in the community with the existing Deferred docs, and the widespread confusion that they cause, I think that these docs have to be totally awesome.
X Synchronous to Asynchronous: The Method to the Madness
I really strongly object to this section title. Reading through the section itself, I don't find that it's that objectionable, but one of the misconceptions that we frequently need to dispel is that Deferreds are "crazy" or "complex" or "magic". I think it's very important to reinforce this for the reader, that this is just an idiom we use for some python functions to call some other functions in a particular order. So please get rid of the "madness". Beyond that, you spend a lot of time talking about *synchronous* and *asynchronous* code in this section. You go so far as to *boldface the words for emphasis*. Okay, great, these are important terms, but you're clearly explaining them as if the user doesn't really know what they mean. I think that starting with a definition of "synchronous" and "asynchronous" would be helpful. Better yet, have an explanation that invokes some code. The tone also suggests that the user may not quite understand what callbacks are or how they work. A brief explanation of higher-order functions in Python may be in order. (Or the tone could change to assume that the user *does* know about this sort of thing, but a little redundancy might not be amiss here.) If someone comes to this document with a set of ideas about how network programming works - for example, that "read()" reads some bytes off of a socket and blocks until they're ready, it won't be clear how select() and friends get involved to make this asynchronous programming deal worthwhile. So it would be useful to explain, at least briefly, how this kind of work gets done behind the scenes. You don't want to actually spin up the real reactor early on in the examples, though. I think Jonno Lange's document did a reasonable job explaining how Deferreds interact with the reactor. It's important to get across that there's no magical interaction, since that's a considerable source of confusion.
I also wrote an answer on stackoverflow which addressed this, which might be helpful to you as a resource: http://stackoverflow.com/questions/80617/asychronous-programming-in-python-t... More minor things: "tutorial-ish"? Is this a tutorial or not? I don't mind some informality and humor in the documentation, but this is just sloppy. (Not necessarily the wording: reading through it, I really can't tell if it's intended to be a tutorial or not.) "set of code": this should be "function", or possibly "callback" or "callable". It's important to be precise with terminology because later in the documentation we're going to expect users to know what those terms mean, and if we've been inconsistent they may be confused. Throughout Twisted, "Deferred" is used as a noun. In this document they are universally referred to as a "Deferred object". Please drop the "object". The bulleted lists seem to be a distraction. Most of them aren't really enumerating anything, they're just jumping from topic to topic without finishing a sentence. You use the word "simple" a lot. Don't tell me it is or isn't simple: demonstrate its simplicity. In one case — "Simple and well defined." — there isn't even a sentence. "Asynchronous programming is centered around this notion that:" This whole section is very confused. If it's centered around something, shouldn't it be one thing? "this notion"? Which notion, you've got a list of 3 bullet points that talk about maybe 5 notions, none of which is an antecedent which could satisfy "this". You are throwing lots and lots of examples at the reader, but I find that users understand better with one thoroughly-explained toy example that they can pick apart and play with than a whole bunch of abstract stuff. For example, "Sometimes I want code to happen during an event, but the event firing is distinct from my program flow". A user reading that (if they understand it) is likely to say "why not start a thread?". 
If it is instead presented in terms of a matter-of-fact "here is what happens" not "here is why you want this" then the user is more likely to focus on what is happening (and thus, on understanding Deferreds, which is really the whole point here) than on whether they *really* actually want it or not. Hopefully by the time they thoroughly understand it they will know that they do want it ;-). - Deferred
X Basic operation - Convenience primitives (succeed, fail, execute, maybeDeferred)
This should be covered later on. "fail" doesn't make any sense unless you already know about errbacks and chaining. ? Callback/Errback chaining These examples are really weak on explanation. I won't belabor that point though, because it seems like you're not really done writing them yet.
- Timeouts
Your bare-bones Deferred implementation should really be called something else. In Jonathan Lange's example, it was called a "Placeholder". I can see readers getting confused about whether Deferreds are something you're supposed to (or allowed to) implement yourself, or whether they're something that's a part of Twisted, because you move from talking about your toy implementation to the real thing without skipping a beat.
- Advanced topics
- Deferred asynchronous primitives - Sugar syntax
I feel like this is throwing too much at the user at once. It's absolutely fantastic if you want to address this stuff as well (its docs are even weaker than Deferred itself), but let's put in a break so that they know they should go off and try to understand the more basic aspects of Deferreds before trying to understand gatherResults, inlineCallbacks and DeferredSemaphore. More importantly I think you should really focus on getting an extremely lucid and readable explanation of the core concepts of event-driven programming and Deferreds before you start adding in these extra bits of documentation. Just keep coming back to that. Pretend you don't understand why asynchronous programming is useful at all, how select() or non-blocking I/O works, and read through the document. Consider whether you understand what's going on, why these ideas are useful. Deliberately try to misunderstand them in a way which is wrong, but consistent with the wording, and see if you can get to the bottom of your document without being corrected :). I have more feedback, but I assume this is more than enough to get you started :). P.S. Please CC me in your replies! Thanks. I'll try to remember, but I'm sure somebody's going to forget - you should subscribe to the mailing list so you get their messages even if you don't :).
![](https://secure.gravatar.com/avatar/434aee9ad675384a9e745c7217ac4abe.jpg?s=120&d=mm&r=g)
I have updated my draft here: http://ezyang.com/twisted/defer2.html

The most notable change is I've removed the section "From Synchronous to Asynchronous". I believe (and I think other people agree with me) that this is an important topic to cover, but it's really *hard* to teach asynchronous programming and I'd like to think a bit more about how I'd like to frame the subject. There are at least two issues that we have to deal with:

* Why asynchronous?
  - Define synchronous and asynchronous
  - Multiplexing IO
  - Introduce a simple reactor based on select()
* Why callbacks?
  - Asynchronous interaction to synchronous interaction
  - Delocalized execution (the parser example)
  - High level functions in Python review

Quite frankly, I'm stumped on "defining synchronous and asynchronous." Asynchronous had always made sense to me, coming from JavaScript, since it was "you click this button and something should happen!" But that is a very different use-case of asynchronous programming than Twisted's. And Glyph raised some very salient concerns about what we were trying to teach people. I just don't know what direction people are coming from.

As such, the document now is targeted to "people who know the basics of asynchronous programming and grok callbacks", and I've incorporated Itamar's excellent suggestion of comparing explicit callback parameters and the Deferred object, which I hope dispels the notion of Deferred being magical fairly well (my assertion is Deferred is merely an abstraction over said callback parameters.) I've also fully fleshed out the Deferreds reference; any omissions are my fault.

The plan next is to discuss composing deferreds (which will also touch on when you should and how to create your own deferreds, as well as deferredlist) and the convenience primitives.

Cheers, Edward
![](https://secure.gravatar.com/avatar/53cf3ca941974336efe1921cdef8e83b.jpg?s=120&d=mm&r=g)
----- Original Message -----
From: "Edward Z. Yang" <ezyang@MIT.EDU>
To: "Twisted general discussion" <twisted-python@twistedmatrix.com>
Sent: Monday, August 03, 2009 6:00 PM
Subject: Re: [Twisted-Python] Deferred documentation rewrite

================

I like the side-by-side regular and twisted versions, that's helpful. You are approaching the complicated stuff toward the end - please don't stop there. Show some more examples of more intricate cases of side-by-side logic, eg how to use deferreds for this if all foo's are asynchronous:

    x=foo1(y)
    if x > 0:
        z=foo2(x)
        y = foo3(z)
    else:
        z=foo4(x,y)
    alldone=foo5(z)
    # now add error handling for all the deferreds
    # how would I debug things if foo3 had an unanticipated error?

================

My latest real question for the docs to answer is, Are these different?

== example 1 ====

    d=asynchronousprocess()
    d.addCallback(b)
    d.addCallback(c)
    d.addErrback(errbc)

=== example 2 ====

    d=asynchronousprocess().addCallback(b).d.addCallback(c).d.addErrback(errd)

=============

Why or why not? When might one use one form over the other? -Dave
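One way to explore the "are these different?" question is with a toy model. The sketch below assumes a hypothetical `Placeholder` class (not Twisted's Deferred) whose `addCallback` and `addErrback` return the object they were called on, which is also how Twisted's Deferred methods behave. The chained form is written without the `.d` between calls, and `b`, `c` and `errbc` are made-up handlers.

```python
class Placeholder:
    """Toy stand-in for a Deferred (hypothetical; not Twisted's class)."""

    def __init__(self):
        self.steps = []  # list of (callback, errback) pairs, in order

    def addCallback(self, fn):
        self.steps.append((fn, None))
        return self  # returning self is what makes chaining possible

    def addErrback(self, fn):
        self.steps.append((None, fn))
        return self

    def fire(self, result):
        # Walk the chain: errors skip callbacks, successes skip errbacks.
        for callback, errback in self.steps:
            handler = errback if isinstance(result, Exception) else callback
            if handler is None:
                continue
            try:
                result = handler(result)
            except Exception as exc:
                result = exc
        return result


def b(value):
    return value + 1


def c(value):
    return value * 2


def errbc(error):
    return "handled: %s" % error


def statement_form():
    d = Placeholder()
    d.addCallback(b)
    d.addCallback(c)
    d.addErrback(errbc)
    return d


def chained_form():
    return Placeholder().addCallback(b).addCallback(c).addErrback(errbc)


print(statement_form().fire(1))  # 4
print(chained_form().fire(1))    # 4
```

Under these assumptions the two forms build the same chain, so the choice between them is a matter of style. Passing a value that makes `b` raise (for example a string) sends both forms down the errback path in the same way.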
![](https://secure.gravatar.com/avatar/307ebad5fc7824b9f223fbad5698d278.jpg?s=120&d=mm&r=g)
In the Callbacks and errbacks section: "Notice that in the synchronous version, process is inside the try..except block. This translates over to the asynchronous code: if process throws an exception, handle_twisted will get a Failure object..." : I think you may mean "handle_twisted_error", not "handle_twisted" On Mon, Aug 3, 2009 at 6:00 PM, Edward Z. Yang<ezyang@mit.edu> wrote:
I have updated my draft here:
![](https://secure.gravatar.com/avatar/e1554622707bedd9202884900430b838.jpg?s=120&d=mm&r=g)
On Mon, Aug 3, 2009 at 6:00 PM, Edward Z. Yang <ezyang@mit.edu> wrote:
I have updated my draft here:
Thanks. Looks like it's improving. I've got more points to critique now, but that's only because there's more meat to the tutorial now :).

1. The coding standard in this document is PEP8, not the Twisted coding standard. Have a look here: http://twistedmatrix.com/trac/browser/trunk/doc/core/development/policy/codi...
2. "Callbacks are the lingua franca of asynchronous programming" strikes me as an odd turn of phrase, especially one to open the document with.
3. "This document addresses ... Deferred. It..." - "It" has an ambiguous antecedent. Are you talking about the document or the Deferred class? Of course it becomes obvious, but it should be phrased so you don't need to.
4. It's far from obvious what "nonblocking_call" is supposed to be, given that its definition is "pass". On my first skim through I thought it was a callback, then had to stop, go back and read again when I realized that didn't make sense. Brevity is good in examples, but this is too brief.
   1. "input" is a builtin function. You might want to avoid using it for a parameter name.
5. "You might be tempted to define it like this": you're switching back and forth from second to third person; at first referring to the reader, then an anonymous different programmer. It might be useful to give these roles different names; "Alice and Bob" are popular.
6. If you must use a third-person pronoun (as you do the one time you refer to the API's anonymous user); you should stick to a gender-neutral one wherever possible, unless of course you're referring to a specific character.
7. "The Deferred doesn't do anything that you couldn't have done with the two callback parameters." This isn't strictly true; chaining callbacks, and dealing with errors that arise in different layers of an asynchronous callback chain, aren't strictly possible without some additional mechanism.
8. Deferred is mentioned as an API link, Failure isn't.
9. Your explanations of the examples seem backwards. "At its very simplest, Deferred has a single callback attached to it". I think you should be explaining the problem being solved by a single callback, since the synchronous example isn't addressed. The synchronous example obviously doesn't have a single callback attached to it :). In other words, document "here's what you might want to do, here's how you can do it" rather than "here's a thing you can do! by the way, you might want to do it because...". You've addressed the general why-you-want-to-do-it in the section above, but it would be helpful to do it in the small for each specific example.
10. The DeferredList docs seem wonky in several ways.
   1. The opening is hard to follow.
We are now ready to consider our original problem
what original problem?
a Deferred that would only fire
"fire"? what does "fire" mean? The term hasn't yet been introduced.
after some other number of Deferreds fired
Yeah, I'm still not sure what you're referring to. Why would I want to do this, again?

   2. Users really shouldn't be subclassing Deferred themselves, so it's bad to have an example that does that. Especially one which . The fact that this is what DeferredList is is an implementation detail, and an ugly one at that. Try talking about gatherResults instead, and implementing a function which does the same thing without a subclass. Or, perhaps, a class of your own which just delegates to Deferred for Deferred behavior, rather than inheriting it.
   3. Users *definitely* shouldn't be subclassing Deferred without upcalling to its __init__. I haven't tested them, but I'm pretty sure these examples will just blow up with tracebacks.
   4. The examples are never invoked. It's semi-obvious how to use them, but semi-obvious things are often invoked semi-correctly. Better to have examples which can be run, or at least ones with a 0-argument entry point named something like 'start'.
   5. "Consider the following interaction of two Deferreds:". You're setting this up as if it's going to be very formal, but then your language is sloppy; you don't name the different deferreds. One of them is "one deferred", the other is "a Deferred". You don't describe them independently, the relationship is implicit in the description. Given that you're describing a fairly complex constellation of objects with which the user isn't necessarily familiar yet, you should be clearly labeling the Deferreds in question in the code sample with variable names (something as simple as "a" and "b" would probably do fine) and then consistently using those names to refer to them in the prose as well, so it's easy for the reader to follow exactly which thing you're talking about. A big problem with technical documentation, *especially* documentation of Deferreds, is that it's very easy for a reader to start confusing which thing is which.
Once again, it would be good to set up some kind of concrete problem first: *why* are we waiting on multiple Deferreds?

11. "Fluent Interface"? This is more new terminology (terminology that I am not familiar with, I might add) that isn't defined anywhere in the document. I think it's more of an appendix than something important to the main narrative; composing Deferreds, returning a Deferred from another Deferred, firing a Deferred from another callback, etc, should be covered first.
   1. "Batons" looks like it's going to be more fancy ad-hoc terminology - I would recommend keeping the language simpler and consistent with other Twisted documentation :).
12. Still a lot of enumerated lists. Obviously a bad habit to which I am prone ;-), but when one uses an enumerated list, there should be an expectation that the numbers will be useful: either, as in this document review or in code reviews, the numbers can be used to refer to points in subsequent discussion, or there's a clear separation of steps. It's not really clear what the "two possible scenarios" lists are enumerations *of*. Are they different things that can happen?
13. You should try eliminating the word "consider" from the document. You seem to have the rhetorical habit, which I've seen from other people (myself included), of having a sentence which is missing a clear subject/verb/object relationship, and working around it by saying "consider" or "let's say". For example, you want to communicate that there's a Deferred somewhere with some callbacks. You can't just say "A Deferred with some callbacks.", so you say "Consider: a Deferred with some callbacks", and now the sentence *seems* complete, but it doesn't really communicate a full thought.

Okay, I think that's enough feedback for now. I'll have to do more with your next round of edits, or my feedback is going to be longer than the document itself :).

* Why asynchronous?
You might want to start with this one, since callbacks are even more generally useful than asynchronous programming. Your suggestion of a parser example makes this clear: even if you're parsing synchronously, you'll still probably have callbacks for different parse rule matches.
I'd start with the words themselves. synchronous means "at the same time". This refers to the timing of the function call and its effects. In a synchronous program, if I say "read()", then at that same time that "read()" is called, the reading happens and the data is returned. But, in an *a*synchronous ("not at the same time") program, "read()" is called, but its effects happen later. This can obviously be fleshed out quite a bit, but I think that core concept is what's important to communicate.
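The distinction described above can be made concrete with a small sketch. Everything here is hypothetical and for illustration only: `read_sync`, `read_async` and `data_arrives` are made-up names, no real socket is involved, and `data_arrives` stands in for an event loop noticing that data is ready.

```python
log = []


def read_sync():
    """Synchronous: the effect has happened by the time the call returns."""
    log.append("read_sync called")
    return "some bytes"


waiting = []  # callbacks registered by read_async, not yet satisfied


def read_async(on_data):
    """Asynchronous: the call only records interest; the effect comes later."""
    log.append("read_async called")
    waiting.append(on_data)


def data_arrives(data):
    """Stand-in for the event loop delivering data some time afterwards."""
    while waiting:
        waiting.pop(0)(data)


data = read_sync()
log.append("sync got %r" % data)

read_async(lambda data: log.append("async got %r" % data))
log.append("read_async returned, no data yet")
data_arrives("some bytes")

for line in log:
    print(line)
```

The log shows the difference in timing: the synchronous result is in hand immediately after the call, while the asynchronous callback only runs once `data_arrives` is invoked.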
Your experience with JavaScript (or at least, with GUI programming, since JavaScript itself is terrible) might actually be a good way to explain the problem here. One example I like to use to explain why sometimes you just can't block is this:

    button1 = Button()
    button2 = Button()
    # I need to wait for the user to click on this button
    button1.waitForClick()
    # okay now they've clicked it.
    message("Hooray you clicked button 1")
    button2.waitForClick()
    # oh dang, but what if they want to click button 2 first!?!

although you can probably devise a more lucid variant of that :). One of these days I really want to write a combined Twisted / GTK tutorial that shows how to ask questions in dialog boxes without blocking and sub-main-loops and other nasty tricks that GTK programs often get up to in order to have a question-and-answer UI. Unfortunately, although these examples are easy for programmers learning Twisted to identify with, it's not always immediately clear how this corresponds to networking data, and the extra complexity of GUI libraries makes it more difficult to run the examples.

And Glyph raised some very salient concerns about what we were trying to teach people. I just don't know what direction people are coming from.
I think the best assumption of background for such an introductory tutorial is to assume that the user doesn't really understand what problem Deferreds solve, and has thus never done any substantial work in an asynchronous environment. More experienced users will skim some parts, but that's fine: more experienced users are easily able to figure out what Deferreds are even with just the current documentation :). We shouldn't treat this as a Python tutorial, but it should at least touch briefly on callable objects and nested variables. As such, the document now is targeted to "people who know the basics
Again, I think this might be assuming a bit too much. At the very least, you should find a very, very good tutorial on callbacks and higher-order functions in Python to point people to as a dependency, so that users who *don't* have that experience can go read about it somewhere else. (Actually, every dependency of every document *really* ought to have hyperlinks to other resources that teach that dependency, so that a user who doesn't know python but needs to dive into a Twisted codebase will be put on their way quickly.) Even people who have some Python experience, but use callbacks rarely, will often discover there are things they don't know when they start programming with Twisted and nesting 5 or 6 callbacks in a function. For example, many people don't know all the fiddly rules of scope nesting. Take a poll of some potential targets for this intro documentation and ask them if they can explain why this produces an error:

    def f(x=1):
        def t():
            if x > 3:
                x = 2
            else:
                return x
        return t

... but adding 'x=x' to the parameter list of 't' makes it work (although not like they would expect if they manipulated 'x' in f).

The plan next is to discuss composing deferreds (which will also touch on when you should and how to create your own deferreds, as well as deferredlist) and the convenience primitives.
I think you need to start talking about creating your own Deferreds, at least implicitly, very early on in the document. For example, rather than having "nonblocking_call" be a dummy function, have it maintain a list of yet-to-complete calls, like this:

    pending = []

    def process(data):
        return "Processed: <" + data + ">"

    def nonblockingCall(data, whenSucceeded, whenFailed):
        pending.append((data, whenSucceeded, whenFailed))

    def completeOneCall(succeeded=True):
        data, whenSucceeded, whenFailed = pending.pop(0)
        if succeeded:
            whenSucceeded(process(data))
        else:
            whenFailed(RuntimeError("It failed, for some reason."))

then (A) you can demonstrate how the callbacks actually get called in a tiny little system that the reader can play around with and get comfortable in before understanding Deferred, and (B) you can illustrate the same example again with some Deferred logic involved.
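To show point (A) in action, here is one way a reader might drive that tiny system by hand. The four definitions are repeated from the message so the block runs on its own; `outcomes`, `gotResult` and `gotError` are hypothetical names added only for this illustration.

```python
pending = []


def process(data):
    return "Processed: <" + data + ">"


def nonblockingCall(data, whenSucceeded, whenFailed):
    # Nothing happens yet: the call and its callbacks are only recorded.
    pending.append((data, whenSucceeded, whenFailed))


def completeOneCall(succeeded=True):
    # Finish the oldest outstanding call and invoke the matching callback.
    data, whenSucceeded, whenFailed = pending.pop(0)
    if succeeded:
        whenSucceeded(process(data))
    else:
        whenFailed(RuntimeError("It failed, for some reason."))


outcomes = []  # hypothetical: records what the callbacks were given


def gotResult(result):
    outcomes.append(("ok", result))


def gotError(error):
    outcomes.append(("error", str(error)))


nonblockingCall("first", gotResult, gotError)
nonblockingCall("second", gotResult, gotError)
assert outcomes == []  # no callback has run so far

completeOneCall()                  # "first" succeeds
completeOneCall(succeeded=False)   # "second" fails
print(outcomes)
# [('ok', 'Processed: <first>'), ('error', 'It failed, for some reason.')]
```

Calling `completeOneCall` by hand plays the role of the event loop, which makes it visible that the callbacks run only when something outside the caller decides the work is finished.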
![](https://secure.gravatar.com/avatar/9ba6ae09ad47f1dd0dce031fa052185a.jpg?s=120&d=mm&r=g)
Hi Edward, On Thu, Jul 30, 2009 at 11:24 AM, Edward Z. Yang<ezyang@mit.edu> wrote:
Your outline looks nice. Something that *really* helped me a lot with Deferreds was seeing how they are modelled after standard Python flow control behaviour. I guess that's what the first section your proposing is about. Jono Lange gave a presentation recently (can't remember what it was called... maybe something about being an evil hacker or about how your code sucks and he hates you) where he presented step-by-step slides that shows some normal Python code and then the asynchronous Deferred-using equivalent. Even though I understood the principles reasonably well before attending his talk, the way he presented them in his slides was very effective and helped me clarify that understanding. If he's willing, which I suspect he will be, I recommend you look at the slides and steal his good ideas. :) Thanks, J.
![](https://secure.gravatar.com/avatar/95658668bebdc20db10bb5ccc1603017.jpg?s=120&d=mm&r=g)
2009/7/31 ezyang <ezyang@mit.edu>:
The accompanying paper is here: http://mumak.net/stuff/twisted-intro.html Cheers, mwh
![](https://secure.gravatar.com/avatar/53cf3ca941974336efe1921cdef8e83b.jpg?s=120&d=mm&r=g)
----- Original Message ----- From: "Michael Hudson" <micahel@gmail.com> To: "Twisted general discussion" <twisted-python@twistedmatrix.com> The accompanying paper is here: http://mumak.net/stuff/twisted-intro.html Thanks for this incredibly helpful link! -Dave
![](https://secure.gravatar.com/avatar/434aee9ad675384a9e745c7217ac4abe.jpg?s=120&d=mm&r=g)
You can view an initial draft of the rewrite here: http://ezyang.com/twisted/defer2.html For reference, this is the planned outline (X means done, ? means almost done): X Synchronous to Asynchronous: The Method to the Madness X Convert synchronous code to asynchronous code X Why asynchronous? - Deferred X Basic operation - Convenience primitives (succeed, fail, execute, maybeDeferred) ? Callback/Errback chaining - Timeouts - Composing deferreds - DeferredList/gatherResults - chainDeferred - Advanced topics - Deferred asynchronous primitives - Sugar syntax Cheers, Edward P.S. Please CC me in your replies! Thanks.
![](https://secure.gravatar.com/avatar/d7875f8cfd8ba9262bfff2bf6f6f9b35.jpg?s=120&d=mm&r=g)
On Fri, 2009-07-31 at 18:40 -0400, Edward Z. Yang wrote:
It's great you're working on this! The docs on deferreds certainly need help. The problem with this is that it perpetuates the misunderstanding the Deferreds *make* things asynchronous, even with the intro that says otherwise. I think it's better to assume already asynchronous code, handling the transition from synchronous to async in an intro event loop howto. A better comparative exposition might be with normal callbacks, e.g.: "def foo(x, gotResultCallback): pass" vs. "def foo(x): # return Deferred". At the very least having that async but callbacky version in the middle helps understanding. It also omits half the story: how you *create* Deferreds. There should be a section on that as well. An example involving a parser, where you just wave your hands about who pushes data in to the parser exactly (so no need to go into event loop details), may work well. In particular, the object that wants the result of the parsing wants to get parse errors, *not* whoever pushes data in. Often it's same object, but not always. Deferreds help with that.
![](https://secure.gravatar.com/avatar/e1554622707bedd9202884900430b838.jpg?s=120&d=mm&r=g)
On Fri, Jul 31, 2009 at 6:40 PM, Edward Z. Yang <ezyang@mit.edu> wrote:
You can view an initial draft of the rewrite here:
This is a great first draft! Very substantial. I really appreciate you working on this. Now I will proceed to rip it to shreds by way of giving you some feedback, but please try to take this as a constructive review. I'm happy with what you've got but given the large amount of dissatisfaction in the community with the existing Deferred docs, and the widespread confusion that they cause, I think that these docs have to be totally awesome.
X Synchronous to Asynchronous: The Method to the Madness
I really strongly object to this section title. Reading through the section itself, I don't find that it's that objectionable, but one of the misconceptions that we frequently need to dispel is that Deferreds are "crazy" or "complex" or "magic". I think it's very important to reinforce this for the reader, that this is just an idiom we use for some python functions to call some other functions in a particular order. So please get rid of the "madness". Beyond that, you spend a lot of time talking about *synchronous* and * asynchronous** *code in this section. You go so far as to *boldface the words for emphasis*. Okay, great, these are important terms, but you're clearly explaining them as if the user doesn't really know what they mean. I think that starting with a definition of "synchronous" and "asynchronous" would be helpful. Better yet, have an explanation that invokes some code. The tone also suggests that the user may not quite understand what callbacks are or how they work. A brief explanation of higher-order functions in Python may be in order. (Or the tone could change to assume that the user * does* know about this sort of thing, but a little redundancy might not be amiss here.) If someone comes to this document with a set of ideas about how network programming works - for example, that "read()" reads some bytes off of a socket and blocks until they're ready, it won't be clear how select() and friends get involved to make this asynchronous programming deal worthwhile. So it would be useful to explain, at least briefly, how this kind of work gets done behind the scenes. You don't want to actually spin up the real reactor early on in the examples, though. I think Jonno Lange's document did a reasonable job explaining how Deferreds interact with the reactor. It's important to get across that there's no magical interaction, since that's a considerable source of confusion. 
I also wrote an answer on stackoverflow which addressed this, which might be helpful to you as a resource: http://stackoverflow.com/questions/80617/asychronous-programming-in-python-t...

More minor things:

"tutorial-ish"? Is this a tutorial or not? I don't mind some informality and humor in the documentation, but this is just sloppy. (Not necessarily the wording: reading through it, I really can't tell if it's intended to be a tutorial or not.)

"set of code": this should be "function", or possibly "callback" or "callable". It's important to be precise with terminology because later in the documentation we're going to expect users to know what those terms mean, and if we've been inconsistent they may be confused.

Throughout Twisted, "Deferred" is used as a noun. In this document they are universally referred to as a "Deferred object". Please drop the "object".

The bulleted lists seem to be a distraction. Most of them aren't really enumerating anything, they're just jumping from topic to topic without finishing a sentence.

You use the word "simple" a lot. Don't tell me it is or isn't simple: demonstrate its simplicity. In one case ("Simple and well defined.") there isn't even a sentence.

"Asynchronous programming is centered around this notion that:" This whole section is very confused. If it's centered around something, shouldn't it be one thing? "this notion"? Which notion? You've got a list of 3 bullet points that talk about maybe 5 notions, none of which is an antecedent which could satisfy "this".

You are throwing lots and lots of examples at the reader, but I find that users understand better with one thoroughly-explained toy example that they can pick apart and play with than a whole bunch of abstract stuff. For example, "Sometimes I want code to happen during an event, but the event firing is distinct from my program flow". A user reading that (if they understand it) is likely to say "why not start a thread?". If it is instead presented in terms of a matter-of-fact "here is what happens" not "here is why you want this" then the user is more likely to focus on what is happening (and thus, on understanding Deferreds, which is really the whole point here) than on whether they *really* actually want it or not. Hopefully by the time they thoroughly understand it they will know that they do want it ;-).

- Deferred
X Basic operation - Convenience primitives (succeed, fail, execute, maybeDeferred)
This should be covered later on. "fail" doesn't make any sense unless you already know about errbacks and chaining.

? Callback/Errback chaining

These examples are really weak on explanation. I won't belabor that point though, because it seems like you're not really done writing them yet.
- Timeouts
Your bare-bones Deferred implementation should really be called something else. In Jonathan Lange's example, it was called a "Placeholder". I can see readers getting confused about whether Deferreds are something you're supposed to (or allowed to) implement yourself, or whether they're something that's a part of Twisted, because you move from talking about your toy implementation to the real thing without skipping a beat.
- Advanced topics
- Deferred asynchronous primitives - Sugar syntax
I feel like this is throwing too much at the user at once. It's absolutely fantastic if you want to address this stuff as well (its docs are even weaker than Deferred itself), but let's put in a break so that they know they should go off and try to understand the more basic aspects of Deferreds before trying to understand gatherResults, inlineCallbacks and DeferredSemaphore.

More importantly I think you should really focus on getting an extremely lucid and readable explanation of the core concepts of event-driven programming and Deferreds before you start adding in these extra bits of documentation. Just keep coming back to that. Pretend you don't understand why asynchronous programming is useful at all, how select() or non-blocking I/O works, and read through the document. Consider whether you understand what's going on, why these ideas are useful. Deliberately try to misunderstand them in a way which is wrong, but consistent with the wording, and see if you can get to the bottom of your document without being corrected :).

I have more feedback, but I assume this is more than enough to get you started :).

P.S. Please CC me in your replies! Thanks. I'll try to remember, but I'm sure somebody's going to forget - you should subscribe to the mailing list so you get their messages even if you don't :).
![](https://secure.gravatar.com/avatar/434aee9ad675384a9e745c7217ac4abe.jpg?s=120&d=mm&r=g)
I have updated my draft here: http://ezyang.com/twisted/defer2.html

The most notable change is I've removed the section "From Synchronous to Asynchronous". I believe (and I think other people agree with me) that this is an important topic to cover, but it's really *hard* to teach asynchronous programming and I'd like to think a bit more about how I'd like to frame the subject. There are at least two issues that we have to deal with:

* Why asynchronous?
  - Define synchronous and asynchronous
  - Multiplexing IO
  - Introduce a simple reactor based on select()
* Why callbacks?
  - Asynchronous interaction to synchronous interaction
  - Delocalized execution (the parser example)
  - High level functions in Python review

Quite frankly, I'm stumped on "defining synchronous and asynchronous." Asynchronous had always made sense to me, coming from JavaScript, since it was "you click this button and something should happen!" But that is a very different use case for asynchronous programming than Twisted's. And Glyph raised some very salient concerns about what we were trying to teach people. I just don't know what direction people are coming from.

As such, the document now is targeted to "people who know the basics of asynchronous programming and grok callbacks", and I've incorporated Itamar's excellent suggestion of comparing explicit callback parameters and the Deferred object, which I hope dispels the notion of Deferred being magical fairly well (my assertion is that Deferred is merely an abstraction over said callback parameters).

I've also fully fleshed out the Deferreds reference; any omissions are my fault. The plan next is to discuss composing deferreds (which will also touch on when you should and how to create your own deferreds, as well as deferredlist) and the convenience primitives.

Cheers, Edward
![](https://secure.gravatar.com/avatar/53cf3ca941974336efe1921cdef8e83b.jpg?s=120&d=mm&r=g)
----- Original Message -----
From: "Edward Z. Yang" <ezyang@MIT.EDU>
To: "Twisted general discussion" <twisted-python@twistedmatrix.com>
Sent: Monday, August 03, 2009 6:00 PM
Subject: Re: [Twisted-Python] Deferred documentation rewrite

================

I like the side-by-side regular and twisted versions, that's helpful. You are approaching the complicated stuff toward the end - please don't stop there. Show some more examples of more intricate cases of side-by-side logic, eg how to use deferreds for this if all foo's are asynchronous:

    x = foo1(y)
    if x > 0:
        z = foo2(x)
        y = foo3(z)
    else:
        z = foo4(x, y)
    alldone = foo5(z)
    # now add error handling for all the deferreds
    # how would I debug things if foo3 had an unanticipated error?

================

My latest real question for the docs to answer is, Are these different?

== example 1 ====

    d=asynchronousprocess()
    d.addCallback(b)
    d.addCallback(c)
    d.addErrback(errbc)

=== example 2 ====

    d=asynchronousprocess().addCallback(b).d.addCallback(c).d.addErrback(errd)

=============

Why or why not? When might one use one form over the other?

-Dave
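On the question of whether the two forms differ: in Twisted, addCallback and addErrback return the Deferred they were called on, so a chained expression registers the same callbacks in the same order as separate statements. The sketch below shows only those mechanics. It uses a hypothetical stand-in class, MiniDeferred, which is not Twisted's implementation and has much cruder error handling, and it assumes example 2 was meant as a plain chain without the stray `.d`. The names b, c and onError are placeholders.

```python
class MiniDeferred:
    """Hypothetical stand-in for illustration; not Twisted's Deferred."""

    def __init__(self):
        self.steps = []  # (kind, function) pairs in registration order

    def addCallback(self, fn):
        self.steps.append(("callback", fn))
        return self  # returning self is what makes chaining possible

    def addErrback(self, fn):
        self.steps.append(("errback", fn))
        return self

    def fire(self, result):
        """Run the chain: callbacks see results, errbacks see exceptions."""
        for kind, fn in self.steps:
            failed = isinstance(result, Exception)
            if failed != (kind == "errback"):
                continue  # skip steps that do not match the current state
            try:
                result = fn(result)
            except Exception as exc:
                result = exc
        return result


def b(value):
    return value + 1


def c(value):
    return value * 2


def onError(exc):
    return "handled: %s" % exc


# Form 1: separate statements.
d1 = MiniDeferred()
d1.addCallback(b)
d1.addCallback(c)
d1.addErrback(onError)

# Form 2: one chained expression.
d2 = MiniDeferred().addCallback(b).addCallback(c).addErrback(onError)

sameSteps = d1.steps == d2.steps
result1 = d1.fire(1)
result2 = d2.fire(1)
```

Under these assumptions the choice between the two forms is a matter of style: the chained form is compact, while separate statements leave room for comments and are easier to step through.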
![](https://secure.gravatar.com/avatar/307ebad5fc7824b9f223fbad5698d278.jpg?s=120&d=mm&r=g)
In the Callbacks and errbacks section: "Notice that in the synchronous version, process is inside the try..except block. This translates over to the asynchronous code: if process throws an exception, handle_twisted will get a Failure object..." : I think you may mean "handle_twisted_error", not "handle_twisted" On Mon, Aug 3, 2009 at 6:00 PM, Edward Z. Yang<ezyang@mit.edu> wrote:
I have updated my draft here:
![](https://secure.gravatar.com/avatar/e1554622707bedd9202884900430b838.jpg?s=120&d=mm&r=g)
On Mon, Aug 3, 2009 at 6:00 PM, Edward Z. Yang <ezyang@mit.edu> wrote:
I have updated my draft here:
Thanks. Looks like it's improving. I've got more points to critique now, but that's only because there's more meat to the tutorial now :).

1. The coding standard in this document is PEP8, not the Twisted coding standard. Have a look here: http://twistedmatrix.com/trac/browser/trunk/doc/core/development/policy/codi...
2. "Callbacks are the lingua franca of asynchronous programming" strikes me as an odd turn of phrase, especially one to open the document with.
3. "This document addresses ... Deferred. It..." - "It" has an ambiguous antecedent. Are you talking about the document or the Deferred class? Of course it becomes obvious, but it should be phrased so that you don't need to work it out.
4. It's far from obvious what "nonblocking_call" is supposed to be, given that its definition is "pass". On my first skim through I thought it was a callback, then had to stop, go back and read again when I realized that didn't make sense. Brevity is good in examples, but this is too brief.
   1. "input" is a builtin function. You might want to avoid using it for a parameter name.
5. "You might be tempted to define it like this": you're switching back and forth from second to third person; at first referring to the reader, then an anonymous different programmer. It might be useful to give these roles different names; "Alice and Bob" are popular.
6. If you must use a third-person pronoun (as you do the one time you refer to the API's anonymous user), you should stick to a gender-neutral one wherever possible, unless of course you're referring to a specific character.
7. "The Deferred doesn't do anything that you couldn't have done with the two callback parameters." This isn't strictly true; chaining callbacks, and dealing with errors that arise in different layers of an asynchronous callback chain, aren't strictly possible without some additional mechanism.
8. Deferred is mentioned as an API link, Failure isn't.
9. Your explanations of the examples seem backwards. "At its very simplest, Deferred has a single callback attached to it". I think you should be explaining the problem being solved by a single callback, since the synchronous example isn't addressed. The synchronous example obviously doesn't have a single callback attached to it :). In other words, document "here's what you might want to do, here's how you can do it" rather than "here's a thing you can do! by the way, you might want to do it because...". You've addressed the general why-you-want-to-do-it in the section above, but it would be helpful to do it in the small for each specific example.
10. The DeferredList docs seem wonky in several ways.
    1. The opening is hard to follow.
We are now ready to consider our original problem
what original problem?
a Deferred that would only fire
"fire"? what does "fire" mean? The term hasn't yet been introduced.
after some other number of Deferreds fired
Yeah, I'm still not sure what you're referring to. Why would I want to do this, again?

   2. Users really shouldn't be subclassing Deferred themselves, so it's bad to have an example that does that. Especially one which . The fact that this is what DeferredList is is an implementation detail, and an ugly one at that. Try talking about gatherResults instead, and implementing a function which does the same thing without a subclass. Or, perhaps, a class of your own which just delegates to Deferred for Deferred behavior, rather than inheriting it.
   3. Users *definitely* shouldn't be subclassing Deferred without upcalling to its __init__. I haven't tested them, but I'm pretty sure these examples will just blow up with tracebacks.
   4. The examples are never invoked. It's semi-obvious how to use them, but semi-obvious things are often invoked semi-correctly. Better to have examples which can be run, or at least ones with a 0-argument entry point named something like 'start'.
   5. "Consider the following interaction of two Deferreds:". You're setting this up as if it's going to be very formal, but then your language is sloppy; you don't name the different deferreds. One of them is "one deferred", the other is "a Deferred". You don't describe them independently, the relationship is implicit in the description. Given that you're describing a fairly complex constellation of objects with which the user isn't necessarily familiar yet, you should be clearly labeling the Deferreds in question in the code sample with variable names (something as simple as "a" and "b" would probably do fine) and then consistently using those names to refer to them in the prose as well, so it's easy for the reader to follow exactly which thing you're talking about. A big problem with technical documentation, *especially* documentation of Deferreds, is that it's very easy for a reader to start confusing which thing is which.
   Once again, it would be good to set up some kind of concrete problem first: *why* are we waiting on multiple Deferreds?

11. "Fluent Interface"? This is more new terminology (terminology that I am not familiar with, I might add) that isn't defined anywhere in the document. I think it's more of an appendix than something important to the main narrative; composing Deferreds, returning a Deferred from another Deferred, firing a Deferred from another callback, etc, should be covered first.
    1. "Batons" looks like it's going to be more fancy ad-hoc terminology - I would recommend keeping the language simpler and consistent with other Twisted documentation :).
12. Still a lot of enumerated lists. Obviously a bad habit to which I am prone ;-), but when one uses an enumerated list, there should be an expectation that the numbers will be useful: either, as in this document review or in code reviews, the numbers can be used to refer to points in subsequent discussion, or there's a clear separation of steps. It's not really clear what the "two possible scenarios" lists are enumerations *of*. Are they different things that can happen?
13. You should try eliminating the word "consider" from the document. You seem to have the rhetorical habit, which I've seen from other people (myself included), of having a sentence which is missing a clear subject/verb/object relationship, and working around it by saying "consider" or "let's say". For example, you want to communicate that there's a Deferred somewhere with some callbacks. You can't just say "A Deferred with some callbacks.", so you say "Consider: a Deferred with some callbacks", and now the sentence *seems* complete, but it doesn't really communicate a full thought.

Okay, I think that's enough feedback for now. I'll have to do more with your next round of edits, or my feedback is going to be longer than the document itself :).

* Why asynchronous?
You might want to start with this one, since callbacks are even more generally useful than asynchronous programming. Your suggestion of a parser example makes this clear: even if you're parsing synchronously, you'll still probably have callbacks for different parse rule matches.
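A tiny sketch of that point, with nothing asynchronous in it at all: a synchronous parser that reports each rule match through a callback. The parse function, its two rules and the sample input are all made up for illustration.

```python
def parse(text, onNumber, onWord):
    # Entirely synchronous: every callback runs before parse() returns.
    for token in text.split():
        if token.isdigit():
            onNumber(int(token))
        else:
            onWord(token)


numbers = []
words = []
parse("buy 3 apples and 12 pears", numbers.append, words.append)
```

The caller decides what happens on each match without the parser knowing anything about it, which is the same division of labour that callbacks provide in asynchronous code.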
I'd start with the words themselves. synchronous means "at the same time". This refers to the timing of the function call and its effects. In a synchronous program, if I say "read()", then at that same time that "read()" is called, the reading happens and the data is returned. But, in an *a*synchronous ("not at the same time") program, "read()" is called, but its effects happen later. This can obviously be fleshed out quite a bit, but I think that core concept is what's important to communicate.
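That distinction can be sketched in a few lines. The names here (syncRead, asyncRead, runPending) are invented for illustration, and the list of pending calls is only a stand-in for whatever event loop would really deliver the data later; no real I/O takes place.

```python
pending = []  # calls whose effects have not happened yet


def syncRead():
    # Synchronous: the effect happens during the call, and the
    # data comes back as the return value.
    return "some bytes"


def asyncRead(whenDone):
    # Asynchronous: the call only records what to do later and
    # returns immediately, before any data exists.
    pending.append(whenDone)


def runPending():
    # Stand-in for an event loop: deliver data to each waiting callback.
    while pending:
        whenDone = pending.pop(0)
        whenDone("some bytes")


received = []

data = syncRead()            # data is available right here
asyncRead(received.append)   # nothing has been received yet
before = list(received)
runPending()                 # the effect of asyncRead happens now
after = list(received)
```

Comparing `before` and `after` shows the gap between making the call and seeing its effect, which is the part a synchronous reader has to get used to.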
Your experience with JavaScript (or at least with GUI programming, since JavaScript itself is terrible) might actually be a good way to explain the problem here. One example I like to use to explain why sometimes you just can't block is this:

    button1 = Button()
    button2 = Button()
    # I need to wait for the user to click on this button
    button1.waitForClick()
    # okay now they've clicked it.
    message("Hooray you clicked button 1")
    button2.waitForClick()
    # oh dang, but what if they want to click button 2 first!?!

although you can probably devise a more lucid variant of that :).

One of these days I really want to write a combined Twisted / GTK tutorial that shows how to ask questions in dialog boxes without blocking and sub-main-loops and other nasty tricks that GTK programs often get up to in order to have a question-and-answer UI. Unfortunately, although these examples are easy for programmers learning Twisted to identify with, it's not always immediately clear how this corresponds to networking data, and the extra complexity of GUI libraries makes it more difficult to run the examples.

And Glyph raised some very salient concerns about what we were trying to teach people. I just don't know what direction people are coming from.
I think the best assumption of background for such an introductory tutorial is to assume that the user doesn't really understand what problem Deferreds solve, and has thus never done any substantial work in an asynchronous environment. More experienced users will skim some parts, but that's fine: more experienced users are easily able to figure out what Deferreds are even with just the current documentation :). We shouldn't treat this as a Python tutorial, but it should at least touch briefly on callable objects and nested variables.

As such, the document now is targeted to "people who know the basics
Again, I think this might be assuming a bit too much. At the very least, you should find a very, very good tutorial on callbacks and higher-order functions in Python to point people to as a dependency, so that users who *don't* have that experience can go read about it somewhere else. (Actually, every dependency of every document *really* ought to have hyperlinks to other resources that teach that dependency, so that a user who doesn't know python but needs to dive into a Twisted codebase will be put on their way quickly.)

Even people who have some Python experience, but use callbacks rarely, will often discover there are things they don't know when they start programming with Twisted and nesting 5 or 6 callbacks in a function. For example, many people don't know all the fiddly rules of scope nesting. Take a poll of some potential targets for this intro documentation and ask them if they can explain why this produces an error:

    def f(x=1):
        def t():
            if x > 3:
                x = 2
            else:
                return x
        return t

... but adding 'x=x' to the parameter list of 't' makes it work (although not like they would expect if they manipulated 'x' in f).

The plan next is to discuss composing deferreds (which will also touch on when you should and how to create your own deferreds, as well as deferredlist) and the convenience primitives.
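The scope-nesting poll question above can be checked directly. This is a minimal sketch; the name fFixed is made up here for the 'x=x' variant. Because t assigns to x, Python treats x as local to t throughout its body, so the comparison reads a local that has no value yet.

```python
def f(x=1):
    def t():
        # The assignment below makes x local to t for the whole body,
        # so this comparison reads an unassigned local variable.
        if x > 3:
            x = 2
        else:
            return x
    return t


def fFixed(x=1):
    def t(x=x):
        # x is now a parameter of t, bound once when t is defined.
        if x > 3:
            x = 2
        else:
            return x
    return t


try:
    f()()
    error = None
except UnboundLocalError as exc:
    error = exc

fixedResult = fFixed()()
```

The default value in fFixed is captured when t is defined, so rebinding x inside fFixed after that point would not be seen by t, which is the surprise hinted at in the parenthetical.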
I think you need to start talking about creating your own Deferreds, at least implicitly, very early on in the document. For example, rather than having "nonblocking_call" be a dummy function, have it maintain a list of yet-to-complete calls, like this:

    pending = []

    def process(data):
        return "Processed: <" + data + ">"

    def nonblockingCall(data, whenSucceeded, whenFailed):
        pending.append((data, whenSucceeded, whenFailed))

    def completeOneCall(succeeded=True):
        data, whenSucceeded, whenFailed = pending.pop(0)
        if succeeded:
            whenSucceeded(process(data))
        else:
            whenFailed(RuntimeError("It failed, for some reason."))

then (A) you can demonstrate how the callbacks actually get called in a tiny little system that the reader can play around with and get comfortable in before understanding Deferred, and (B) you can illustrate the same example again with some Deferred logic involved.
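A short session with that toy system might look like the sketch below, which covers point (A) only. The definitions are repeated so the block runs on its own; the results list and the gotResult and gotError callbacks are additions for illustration.

```python
pending = []


def process(data):
    return "Processed: <" + data + ">"


def nonblockingCall(data, whenSucceeded, whenFailed):
    # Nothing is processed yet; the request is only queued.
    pending.append((data, whenSucceeded, whenFailed))


def completeOneCall(succeeded=True):
    # Finish the oldest queued call and notify exactly one of its callbacks.
    data, whenSucceeded, whenFailed = pending.pop(0)
    if succeeded:
        whenSucceeded(process(data))
    else:
        whenFailed(RuntimeError("It failed, for some reason."))


results = []


def gotResult(value):
    results.append(("ok", value))


def gotError(error):
    results.append(("error", str(error)))


nonblockingCall("first", gotResult, gotError)
nonblockingCall("second", gotResult, gotError)
queuedBefore = len(pending)   # both calls are waiting; no callback has run

completeOneCall()                  # "first" succeeds
completeOneCall(succeeded=False)   # "second" fails
```

Making the reader call completeOneCall by hand keeps the moment at which a callback runs visible, with no reactor involved.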
participants (9)

- Dave Britton
- Edward Z. Yang
- ezyang
- Glyph Lefkowitz
- Itamar Shtull-Trauring
- Jamu Kakar
- Kevin Horn
- Michael Hudson
- Ying Li