
Prompted by exarkun, I have put together some simple documentation for beginners starting with Twisted. The points are made up of things I wish I had known from the start as a complete beginner.

It still needs lots of work. The layout needs to change, there are duplications, and the points need to be made more succinctly. But before I spend more time on honing the document, I thought it would be a good idea to get some feedback.

The information in the document may already exist and I may have just overlooked it. People may feel it is not appropriate for the Twisted documentation and should go elsewhere. Some of the information needs to be checked for accuracy so that it does not mislead readers.

Anyway, I would appreciate any feedback, positive or negative.

John Aherne

Here is the text:

Basic Information for anyone starting with Twisted

If you don't know much about TCP, then bear this in mind.

BASIC TCP

TCP is a stream of data. Once the connection is open, it stays open until closed. The stream does not have a beginning or an end, and it does not know about your messages.

You cannot wait until TCP sends your message, since it will not tell you that the message has been sent. If you want to know whether your message reached the other end, you need a protocol in which each end responds that it has received the data. You will then need to implement a timeout for when there are problems, otherwise you may wait a long time for a response. TCP will wait forever, but intermediate routers may time you out after 2 - 30 minutes if there is no traffic on the connection.

Because TCP is just a stream of data, you have to process the stream looking for a marker that your application has placed to signify the end of a message. Twisted provides the LineReceiver class and its sendLine method to help in the common case of using CR/LF as a message terminator, especially for chat-type protocols and HTTP.
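To make the point about end-of-message markers concrete, here is a minimal sketch in plain Python of the kind of buffering that has to happen on top of a TCP stream. It is not Twisted's LineReceiver implementation; the LineBuffer class and its feed method are hypothetical names used only for illustration.

```python
class LineBuffer:
    """Collects raw stream chunks and returns complete CR/LF-terminated messages."""

    delimiter = b"\r\n"

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add a chunk as it arrived from the socket; return any complete messages."""
        self._buffer += chunk
        # Everything before the last delimiter is complete; the remainder
        # (possibly empty) stays buffered until more data arrives.
        *messages, self._buffer = self._buffer.split(self.delimiter)
        return messages


framer = LineBuffer()
received = []

# TCP is free to deliver the sender's two messages in arbitrary pieces.
for chunk in (b"hel", b"lo\r\nwor", b"ld\r", b"\n"):
    received.extend(framer.feed(chunk))

print(received)
```

The chunks deliberately do not line up with the messages, and one chunk even splits the CR/LF pair, which is why the partial data has to be kept between calls.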
The reactor and the select call process the outgoing and incoming buffers without blocking. The reactor uses select (or, depending on the platform and the reactor chosen, an equivalent such as poll or epoll). Each time the reactor cycles around, it uses select to check which sockets are ready to read or write. It processes those that are ready and ignores any that are not yet ready.

Anyone familiar with networking and select will probably already understand this. Anyone who is not familiar will not realise it, and needs to become familiar with how select works. If you want to know how Twisted processes network traffic, you should read up on select.

TWISTED - DIRECT SEND DATA CALLS

For simple network activity you do not need to use deferreds. They are not necessary, and you can get a lot done without deferreds just by using transport.write or sendLine. A simple chat server shows this well; John Goerzen has a very simple chat server example in his Apress book Foundations of Python Network Programming.

Provided you are dealing in small amounts of data, you will not block the reactor. If you are sending megabytes of data from a file, that is a different matter. Using sendLine directly is faster than using a deferred.

WHAT IS BLOCKING CODE

Blocking code is code that blocks, or may potentially block, the continued execution of the main reactor thread. For the most part, think of long-running operations, or operations that may become long-running: file or network I/O, CPU-intensive calculations, operations that may time out such as a remote call to another process or host machine, and database operations, which are a usual culprit because the database may be flooded with work or may have crashed. The examples go on, but they are mainly about I/O and CPU-intensive operations.

When these things happen on the reactor (main) thread, they block the server from doing anything else. It can't accept new connections, and it can't do anything else until the blocking activity has completed and returned control to the reactor thread.
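The reactor avoids blocking by asking select which sockets are ready before it touches them. The following is a minimal sketch of that readiness check using only the Python standard library; it is not Twisted code, and the socket pair simply stands in for a real network connection.

```python
import select
import socket

# A connected pair of sockets stands in for the two ends of a connection.
left, right = socket.socketpair()

# Nothing has been sent yet, so `right` is not ready to read.
# A timeout of 0 makes select return immediately instead of waiting.
readable, writable, _ = select.select([right], [left], [], 0)
ready_before = right in readable
can_write = left in writable

left.sendall(b"ping\r\n")

# Now data is waiting, so select reports `right` as readable
# and recv returns at once instead of blocking.
readable, _, _ = select.select([right], [], [], 1.0)
ready_after = right in readable
data = right.recv(1024) if ready_after else b""

left.close()
right.close()

print(ready_before, can_write, ready_after, data)
```

A reactor does essentially this in a loop over all of its connections, which is why one slow piece of code in a handler holds up every other connection.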
WHAT ARE DEFERREDS

By and large they seem very similar to callbacks. They aren't, but they seem to perform the same sort of function. Please refer to the other documentation on deferreds for a more detailed explanation.

As everyone hears interminably on the Twisted list, deferreds do not make blocking code non-blocking. We all try it, but you shouldn't.

If you have blocking code, then first think about passing it to deferToThread, which will run the code in its own thread. It's not the only thing you can do, but it is a good start. deferToThread returns a deferred; add appropriate callbacks and errbacks to it. You should not call transport.write or sendLine directly from the thread, since this is not thread-safe. The function running in the thread simply returns its result or raises an exception. The callback or errback then runs back in the reactor thread, and you send any data from there.

You can handle this without deferToThread by breaking the blocking code up into smaller pieces. For example, when you need to transfer a large file to a socket, do not try to send it all at once. Send 10KB at a time, yield back to the reactor, and reschedule the next 10KB until finished. This will work, but it might not be the fastest way, and even 10KB may block for an unacceptable amount of time, depending on how heavily taxed the I/O system is at the moment. Usually deferToThread is just easier to implement.

DATABASE PROCESSING TENDS TO BE BLOCKING

The adbapi module seems to be a good example of using deferreds and threads. The adbapi module returns a deferred it has created, and you add your callbacks to it. Your callback is then called when the result from the thread is ready. It does seem like the exemplar for doing deferreds. The db stuff will normally block, so put it in a thread and use deferreds to await the result or failure.

THREADS

Twisted is meant to avoid the problems of using threads for network processing. So why are we using threads?
It's a way of moving potentially blocking code out of the way so that it does not hang the reactor.

THREADS WON'T NECESSARILY PREVENT BLOCKING

A point about the db calls is that they can be very intensive. If you need to run some db function every 30 or 60 seconds and the db work takes 50% or more of that time to generate the results, you won't have much time left to service incoming requests that want the results. The remote connections will be failing big time.

So then I suppose you should break the code into two programs: one that does the db stuff, and another that handles the remote connections. When the db program has a result, it connects to the other program and passes its results across. There may be better ways of doing this, of course, depending on circumstances.

WHEN TO USE DEFERREDS

If you have a CPU-intensive process, then in all probability it will block the reactor, since it will take 100% CPU time while running, whether in the main thread or in a separate thread. Such processes are not good for running in Twisted.

If you have I/O activity, such as reading lines of text from a disk file, this seems a good candidate for deferreds. This is what the adbapi module does, so it seems like a good example to follow.

As a general rule, it is simplest to use deferreds with threads. This is not always true, so circumstances may indicate a better way of running a deferred. You still need to make sure that the bulk of the time is available for handling connections, otherwise you will start to have failing connections.

Using sendLine directly is faster than putting a deferred in between.

BEWARE WHEN USING DEFERREDS IN THREADS

Since deferToThread runs the function you pass to it in a non-reactor thread, you may not use any non-thread-safe Twisted APIs in that function. Beware of using shared data, such as lists and dictionaries, when running in the thread.
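The hand-off between a worker thread and the main loop can be sketched in plain Python. This is only an illustration of the pattern, using the standard library's threading and queue modules rather than Twisted's deferToThread, which takes care of the hand-off for you. The names slow_query, worker and sent are hypothetical, and the list `sent` merely stands in for a connection that only the main thread may write to.

```python
import queue
import threading
import time

results = queue.Queue()  # thread-safe hand-off from the worker to the main thread
sent = []                # stands in for the connection; only the main thread touches it


def slow_query(key):
    """Hypothetical blocking call, such as a database lookup."""
    time.sleep(0.1)
    return key.upper()


def worker(key):
    # Runs in its own thread: do the blocking work, then hand the outcome back.
    # It never touches `sent` directly.
    try:
        results.put(("ok", slow_query(key)))
    except Exception as exc:
        results.put(("error", exc))


threading.Thread(target=worker, args=("abc",)).start()

# Main loop: stays free to do other work while the worker is blocked.
ticks = 0
while True:
    try:
        status, value = results.get_nowait()
    except queue.Empty:
        ticks += 1        # other connections would be serviced here
        time.sleep(0.01)
        continue
    # Back in the main thread, it is safe to "send" the result.
    sent.append(value if status == "ok" else "lookup failed")
    break

print(sent, ticks)
```

The worker shares nothing with the main thread except the queue, which is the simplest way to stay clear of the shared list and dictionary problems mentioned above.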

On Mon, Aug 10, 2009 at 2:56 AM, John Aherne <johnaherne@rocs.co.uk> wrote:
Prompted by exarkun, I have put together some simple documentation for beginners starting with Twisted.
Thanks for doing this. All documentation help is useful :).
But before I spend more time on honing the document, I thought it would be a good idea to get some feedback.
I've added a review of this document to my personal to-do list, but that might take another couple of days. In the meantime, I think the stuff you're trying to communicate is valuable, but some of it seems pretty vague, and the ordering is a little confusing. For example:
For simple network activity you do not need to use deferreds
What constitutes "simple" network activity? Does this mean that there are some types of network activity that do require deferreds? For that matter, is "network activity" everything Twisted does, or just sending/receiving bytes? etc, etc. I think it would be better to clearly and simply lay out how to do "simple" network operations like sending and receiving data before talking about Deferreds at all. It may still be useful to say "you don't need Deferreds" at some point, to make sure this is clear to the new user, but that should come later, when you can illustrate more clearly *why* they don't need Deferreds. You also use the word "seem" a lot. You should be more assertive, and just say what things are or aren't, not what they seem like. Don't worry about being wrong. If you write something wrong, we will correct you before it goes into the docs :).

On Tue, Aug 11, 2009 at 10:03 PM, Glyph Lefkowitz <glyph@twistedmatrix.com> wrote:
I appreciate it was a bit confused, but I wanted to get it out sooner rather than later and just get the basic facts confirmed. So I will start putting my thoughts into better and more coherent shape. This always takes a couple of rewrites, so I hope you can add some more thoughts before I get too far. John Aherne

On Wed, Aug 12, 2009 at 6:41 AM, John Aherne <johnaherne@rocs.co.uk> wrote:
I appreciate it was a bit confused, but I wanted to get it out sooner rather than later and just get the basic facts confirmed.
Yes, it's always good to share these early drafts.
Actually I think I don't have too much more to say about your original draft. If you have a new one, would you like to share it?

Participants (3): Glyph Lefkowitz, Jarrod Roberson, John Aherne