I've recently written a threaded client-server application using a
custom message protocol. Each message is 49 bytes long and needs to
arrive at the client as soon as possible.

In the current app I send() each message as soon as all 49 bytes are
available, and I recv(49) bytes at the client side (buffering as
necessary). The threaded application thus always reads 49 bytes from the
network buffers as soon as it is possible to do so. This application
consumes about 9% CPU time at full throttle.
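
For reference, the "recv(49), buffering as necessary" approach can be sketched roughly as below. This is a minimal Python 3 sketch under stated assumptions, not the actual application code: it assumes a blocking socket, and the names recv_exact and FRAME_SIZE are made up for illustration.

```python
import socket

FRAME_SIZE = 49  # fixed message length described in the post


def recv_exact(sock, nbytes=FRAME_SIZE):
    """Block until exactly nbytes have been read from sock.

    Hypothetical helper: recv() may return fewer bytes than requested,
    so keep reading until the whole frame has arrived.
    """
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection.
            raise ConnectionError("peer closed the connection mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Each call returns one complete 49-byte message or raises if the connection closes first, which is why the threaded client never sees a partial or merged frame.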

Because I already use the twisted.enterprise adbapi on the client side, I
decided to get rid of all the client threads and use Twisted for the TCP
stuff as well. I've written a small test script to determine basic loads:
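
#! /usr/bin/env python
import array
from twisted.internet.protocol import Protocol, ReconnectingClientFactory
from twisted.internet import reactor
f = open("wakka", "w")

class TRAUClient(Protocol):  # class name assumed; lost from the listing
    def dataReceived(self, data):
        print len(data)
    def connectionMade(self):
        """Connection was made, send signon."""
        print "Signing on...",
        signon = array.array('B', [ord('T'), 1, 255, 1, 255])
        self.transport.write(signon.tostring())

class TRAUClientFactory(ReconnectingClientFactory):
    def startedConnecting(self, connector):
        print 'Started to connect.'
    def buildProtocol(self, addr):
        print 'Resetting reconnection delay'
        self.resetDelay()
        return TRAUClient()
    def clientConnectionLost(self, connector, reason):
        print 'Lost connection. Reason:', reason
        ReconnectingClientFactory.clientConnectionLost(self, connector, reason)
    def clientConnectionFailed(self, connector, reason):
        print 'Connection failed. Reason:', reason
        ReconnectingClientFactory.clientConnectionFailed(self, connector, reason)

reactor.connectTCP('localhost', 55555, TRAUClientFactory())
reactor.run()
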
Connecting the above to the server yielded somewhat surprising results:
the length of the data passed to dataReceived() varies between runs, from
49 bytes to multiples of 49 bytes. I understand this (I think), as the
Ethernet packet length sweet spot is about 1.3 kB and Protocol is probably
optimized around that. This does imply that I need a more elaborate
frame caching scheme on the client side than the one I currently have,
as another socket connection signals whether messages from the current
connection must be stored or discarded.
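
To make the frame caching idea concrete, here is a minimal sketch of a buffer that accepts whatever chunk sizes arrive and hands back only complete 49-byte frames. It is plain Python 3 with no Twisted dependency; FrameBuffer, feed and FRAME_SIZE are hypothetical names chosen for illustration, not part of Twisted or of my application.

```python
FRAME_SIZE = 49  # fixed message length described in the post


class FrameBuffer:
    """Accumulates stream data and returns complete fixed-size frames.

    Hypothetical helper, not a Twisted class.
    """

    def __init__(self, frame_size=FRAME_SIZE):
        self.frame_size = frame_size
        self._pending = b""

    def feed(self, data):
        """Add newly received bytes and return a list of whole frames."""
        self._pending += data
        # Number of bytes that make up whole frames right now.
        end = (len(self._pending) // self.frame_size) * self.frame_size
        frames = [self._pending[i:i + self.frame_size]
                  for i in range(0, end, self.frame_size)]
        # Keep any trailing partial frame for the next call.
        self._pending = self._pending[end:]
        return frames
```

Something like this could presumably be called from dataReceived(), looping over the frames that feed(data) returns, so the rest of the client sees the same 49-byte messages however the stream happens to be chunked.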

Surprisingly, the script reports a continuous stream of 49-byte reads on
at least 4 out of 10 runs. On other runs the output of the print len(data)
line is staggered: every second dataReceived() callback is 49 bytes, the
rest are multiples of 49.

When the script reports all frames as 49 bytes, CPU consumption is ~33%;
when staggered it's ~4%. The dataReceived() call seems overly expensive
compared with the 'raw' synchronous recv().

Paradoxically, padding the 49-byte message with 1000 pad bytes improves
the script's performance 8 times, due to the decrease in dataReceived()
calls.

1) Is there a way to have dataReceived() called only once a certain data
length has been received?
2) Why is dataReceived() so expensive (if it is)?
3) Is Protocol the correct tree, or are there other ways to handle small
time-sensitive messages in Twisted?

Apologies for the longish post.