Problem with writing fast UDP server

Krzysztof Retel Krzysztof.Retel at
Tue Nov 25 22:12:14 CET 2008

On Nov 21, 6:55 pm, Greg Copeland <gtcopel... at> wrote:
> On Nov 21, 11:05 am, Krzysztof Retel <Krzysztof.Re... at>
> wrote:
> > On Nov 21, 4:48 pm, Peter Pearson <ppear... at nowhere.invalid> wrote:
> > > On Fri, 21 Nov 2008 08:14:19 -0800 (PST), Krzysztof Retel wrote:
> > > > I am not sure what you mean by CPU-bound? How can I find out
> > > > whether it is CPU-bound?
> > > CPU-bound is the state in which performance is limited by the
> > > availability of processor cycles.  On a Unix box, you might
> > > run the "top" utility and look to see whether the "%CPU" figure
> > > indicates 100% CPU use.  Alternatively, you might have a
> > > tool for plotting use of system resources.
> > > --
> > > To email me, substitute nowhere->spamcop, invalid->net.
> > Thanks. I ran it; it is not CPU-bound.
> With clearer eyes, I did confirm my math above is correct. I don't
> have a networking reference to provide. You'll likely have some good
> results via Google. :)
> If you are not CPU-bound, you are likely IO-bound. That means your
> computer is waiting for IO to complete - likely on the sending side.
> In this case, it likely means you have reached your ethernet bandwidth
> limits available to your computer. Since you didn't correct me when I
> assumed you're running 10Mb ethernet, I'll treat that as a safe
> assumption. So, on 10Mb ethernet, try
> converting your application to use TCP. I'd bet, unless you have
> requirements which prevent its use, you'll suddenly have enough
> bandwidth (in this case, frames) to achieve your desired results.
> This is untested and off the top of my head but it should get you
> pointed in the right direction pretty quickly. Make the following
> changes to the server:
> sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>  to
> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> Make this:
> print "Waiting for first packet to arrive...",
> sock.recvfrom(BUFSIZE)
> look like:
> print "Waiting for first packet to arrive...",
> cliSock = sock.accept()
> Change your calls to sock.recvfrom(BUFSIZE) to cliSock.recv(BUFSIZE).
> Notice the change to "cliSock".
> Keep in mind TCP is stream based, not datagram based so you may need
> to add additional logic to determine data boundaries for re-assembly
> of your data on the receiving end. There are several strategies to
> address that, but for now I'll gloss it over.
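
One common strategy for the data boundaries mentioned above is
length-prefix framing; a minimal sketch (the function names and the
4-byte big-endian header are illustrative assumptions, not from the
original post):

```python
import socket
import struct

def send_msg(sock, payload):
    # Prefix each message with a 4-byte big-endian length header,
    # then ship header + payload down the TCP stream in one call.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n):
    # TCP is a stream: recv() may return fewer bytes than asked for,
    # so loop until exactly n bytes have been collected.
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_msg(sock):
    # Read the 4-byte length header, then read exactly that many bytes.
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

With this in place, each 90-byte record written by the client comes
back out as one `recv_msg()` result on the server, regardless of how
the stream was split into segments on the wire.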
> As someone else pointed out above, change your calls to time.clock()
> to time.time().
> On your client, make the following changes.
> sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>  to
> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> sock.connect( (remotehost,port) )
> nbytes = sock.sendto(data, (remotehost,port))
>  to
> nbytes = sock.send(data)
> Now, rerun your tests on your network. I expect you'll be faster now
> because TCP can be pretty smart about buffering. Let's say you write
> 16, 90B blocks to the socket. If they are timely enough, it is
> possible all of those will be shipped across ethernet as a single
> frame. So what took 16 frames via UDP can now *potentially* be done in
> a single ethernet frame (assuming a 1500-byte MTU). I say potentially
> because
> the exact behaviour is OS/stack and NIC-driver specific and is often
> tunable to boot. Likewise, on the client end, what previously required
> 16 calls to recvfrom, each returning 90B, can *potentially* be
> completed in a single call to recv, returning 1440B. Remember, fewer
> frames means less protocol overhead which makes more bandwidth
> available to your applications. When sending 90B datagrams, you're
> wasting over 48% of your available bandwidth because of protocol
> overhead (actually a lot more because I'm not accounting for UDP
> headers).
> Because of the differences between UDP and TCP, unlike your original
> UDP implementation which can receive from multiple clients, the TCP
> implementation can only receive from a single client. If you need to
> receive from multiple clients concurrently, look at python's select
> module to take up the slack.
> Hopefully you'll be up and running. Please report back your findings.
> I'm curious as to your results.

I've been offline for a while. Anyway, clarifying a few things:
- the process is IO-bound, not CPU-bound
- we have 10/100/1000 Mb ethernet, and 10/100 Mb switches and routers
- the MTU is the default 1500

I know what you are saying regarding TCP. I was using it in another
project. However, this project has to use UDP and that can't be
changed :(
I was testing multiple client approaches today. I kept one similar to
the above, rewrote another using threads, and found a new issue. The
speed is acceptable, but there is a problem with sending 1,000,000
packets per client (it does so in around 1.5 minutes). The test runs
from a client machine to the server machine, both on the same network.
When sending a million packets, only around 50%-70% make it across. On
the client machine it looks like all the packets were transmitted, but
tcpdump running on the server shows that only 50-70% arrived. I doubt
that many packets were genuinely lost. My impression is that I am
filling up the socket buffers on either the client or the server
machine, though I don't know yet how to test that. 'netstat -s -u'
shows normal entries, and similarly for 'ifconfig'.
Any ideas what could go wrong here? I started reading about socket
buffer sizes and the related system variables, but I still can't
explain it.
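
As a starting point for that investigation, the per-socket buffer can
be inspected and raised with getsockopt/setsockopt; a minimal sketch
(the 4 MiB figure is an arbitrary assumption, and the kernel may cap
the effective size, e.g. via net.core.rmem_max on Linux):

```python
import socket

# Illustrative request; the kernel decides what it actually grants.
REQUESTED_RCVBUF = 4 * 1024 * 1024  # ask for 4 MiB

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
default_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, REQUESTED_RCVBUF)
# Read the option back: the kernel may clamp (or, on Linux, double)
# the requested value, so always check the effective size.
effective_size = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("default:", default_size, "effective:", effective_size)
```

If `effective_size` stays far below the request, the system-wide limit
probably needs raising before a bigger per-socket buffer can help.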
We might also try a crossover approach, connecting one machine
directly to the other, and experiment with that.
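
Another way to test the buffer-overrun theory before rewiring anything
is to pace the sender, pausing briefly between bursts so the kernel
and NIC can drain their queues; a rough sketch (the batch size and
pause length are guesses to tune, not measured values):

```python
import socket
import time

PACKET = b"x" * 90   # 90-byte payload, matching the datagrams above
BATCH = 1000         # packets per burst (illustrative value)
PAUSE = 0.005        # seconds to pause between bursts (illustrative)

def paced_send(sock, remote, total_packets):
    """Send total_packets datagrams to remote, pausing between bursts."""
    sent = 0
    for i in range(total_packets):
        sock.sendto(PACKET, remote)
        sent += 1
        if (i + 1) % BATCH == 0:
            # Brief pause so the send buffer can drain instead of
            # silently dropping datagrams when it overflows.
            time.sleep(PAUSE)
    return sent
```

If the loss rate drops sharply with even a small pause, that points at
buffer overrun rather than genuine network loss.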


More information about the Python-list mailing list