[Twisted-Python] Twisted scalability question

Hi,

I've got a task to write a DHCP server for a local cable operator and I'm considering different frameworks to write it in. I've looked at Twisted a couple of times and thought I'd do a little experimenting to see how well it behaves with the task at hand.

The DHCP server needs to do some database interaction to support the business logic. So to model the problem I wrote two simple QOTD servers over UDP, using a PostgreSQL backend for the quotes. One server is written using a Twisted DatagramProtocol and uses Deferreds for the database query via adbapi. The other is a dead-simple single-threaded block-and-wait UDP socket server with a single database connection that just reads from the UDP socket, queries the database and writes the result back. So far so good.

I wanted to see how well these servers scale with a lot of requests. First I tried making a lot of sequential requests from a single-threaded client, but that showed that the Twisted version was slower by a factor of ~4.

After a little consideration I thought that Twisted must be better for a lot of concurrent requests, because of the threaded database access. Twisted should be able to accept more than one client at a time and keep all the database connections (5 by default, I guess) busy. So I forked off 20 clients using a simple bash script with a for loop. The result was that the Twisted version is still slower, by the same factor of 4.

"top" shows that while the Twisted server is running, the 5 PostgreSQL connections are mostly in the "idle" state (they change the name of the process to show it). However, while the single-threaded server is running, its only PostgreSQL connection is mostly in the "select" state.

So, after the rather lengthy introduction, I'd like to know what I am doing wrong. How can I show the better scalability of the Twisted platform?

-- Matti Jagula
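For readers who want something concrete, here is a rough sketch of what the block-and-wait variant described above might look like. It is not the code from the original experiment: the names `QUOTES`, `fetch_quote`, `serve` and `demo` are made up for illustration, and the PostgreSQL query is replaced by an in-memory stand-in so the sketch runs without a database.

```python
import socket
import threading

# Assumption: an in-memory list stands in for the quotes table in PostgreSQL.
QUOTES = ["Premature optimization is the root of all evil."]


def fetch_quote():
    """Hypothetical stand-in for the blocking database query."""
    return QUOTES[0]


def serve(sock, max_requests=None):
    """Block-and-wait loop: read one datagram, query, reply, repeat."""
    handled = 0
    while max_requests is None or handled < max_requests:
        _data, addr = sock.recvfrom(1024)         # blocks until a request arrives
        quote = fetch_quote()                     # blocks until the "database" answers
        sock.sendto(quote.encode("utf-8"), addr)  # UDP send does not wait for the peer
        handled += 1


def demo():
    """Run the server for one request on the loopback interface."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))                 # port 0: let the OS pick a free port
    port = server.getsockname()[1]
    worker = threading.Thread(target=serve, args=(server, 1))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(5)
    client.sendto(b"quote?", ("127.0.0.1", port))
    reply, _addr = client.recvfrom(1024)

    worker.join()
    client.close()
    server.close()
    return reply.decode("utf-8")


if __name__ == "__main__":
    print(demo())
```

Note that the loop does nothing per request beyond the receive, the query and the send, which is the baseline the Twisted version is being measured against.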

At 11:58 PM 9/22/03 +0300, Matti Jagula wrote:
> So, after the rather lengthy introduction, I'd like to know what am I doing wrong? How can I show the better scalability of Twisted platform?

This is just a guess, but you probably aren't doing anything complicated enough to benefit. :)

Seriously, if all you're doing is sending a single datagram in response to a simple DB query, framework overhead will continue to dominate your processing time. Think about the two programs: both perform the query and send the datagram. But the non-Twisted program *does nothing else*. So, what you're seeing is that the per-invocation framework overhead for the Twisted version is about 3x what's required for just the db query and datagram send. For something so simple, that actually sounds pretty reasonable!

What Twisted (and the "reactor" pattern in general) excels at is multiplexing *concurrent and ongoing* I/O operations. However, your application has *no* ongoing I/O!

In general, Twisted scales well because it lets you do other things while you're waiting for I/O. With TCP streams, you may have to wait before you write data, because the receiver may be slow. Likewise, if you're receiving data, the sender may be slow. While it's waiting, Twisted can let you do something else, like send or receive data from someone else that's ready at that moment, or do some actual computation.

However, in your setup, the *only* I/O that you ever wait for is the database! You're receiving a single UDP packet (that you don't "wait" for), and sending a UDP packet (that you don't need to wait to send!). Even if you made it take 10 times longer for the DB query to run, you'd *still* be able to send that data out as soon as you receive it.

In short, the problem you're solving is so darn simple that there's no way using Twisted can really improve on it, except in terms of issues like keeping your DB connection open. There's simply nothing here for Twisted to multiplex!
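To make the "about 3x" figure above concrete, here is a small back-of-the-envelope sketch. It only restates the arithmetic: a server that is 4x slower per request spends 1 unit on the query and send plus 3 units on everything else. The function names and the 1000-microsecond baseline are assumptions for illustration, not measurements from this thread.

```python
def overhead_ratio(slowdown_factor):
    """Extra cost per request, in units of the baseline (query + send) cost."""
    return slowdown_factor - 1


def requests_per_second(baseline_us, slowdown_factor=1):
    """Sequential throughput when each request costs baseline_us * slowdown_factor."""
    return 1_000_000 / (baseline_us * slowdown_factor)


if __name__ == "__main__":
    # Assumption: 1000 microseconds per request for the plain blocking server.
    baseline_us = 1000
    print("overhead per request:", overhead_ratio(4), "x baseline")
    print("blocking server:", requests_per_second(baseline_us), "req/s")
    print("4x slower server:", requests_per_second(baseline_us, 4), "req/s")
```

If that per-request cost is CPU time in one process rather than time spent waiting, adding more concurrent clients would not be expected to change the ratio, which matches the same factor of 4 showing up with 20 clients.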
participants (2)
- Matti Jagula
- Phillip J. Eby