[Tutor] multipart socket streaming problem: the socket

richard kappler richkappler at gmail.com
Tue Jan 5 10:31:28 EST 2016


This is a continuation of the thread 'reading an input stream', which I had
to walk away from for a few days due to the holidays and then other work
considerations. I figured it best to break my confusion into separate
chunks; I hope that's appropriate. In short, my script needs to read a
stream of XML data from a socket (port 2008). The data comes in from as
many as 30 different machines, but usually 4 or fewer, at up to 3 messages
per second from each machine. Messages arrive in block format, delimited by
STX (\x02) and ETX (\x03). The script should send the data in those blocks
to a parser (already built using lxml and an XSLT file) and then send it out
to Splunk using a native 'event writer'.
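
For illustration, the STX/ETX block format described above could be split out of the incoming bytes roughly as follows. This is only a minimal sketch and not part of the existing script: the name extract_frames and its return shape are hypothetical, and it assumes that STX and ETX never occur inside a message body.

```python
STX = b"\x02"  # start-of-text marker, as described in the post
ETX = b"\x03"  # end-of-text marker


def extract_frames(buffer):
    """Pull every complete STX...ETX block out of buffer.

    Returns (frames, remainder). remainder holds the start of a block
    that has not fully arrived yet, so the caller can prepend it to the
    next chunk of received data.
    """
    frames = []
    while True:
        start = buffer.find(STX)
        if start == -1:
            # No start marker left: nothing worth keeping.
            return frames, b""
        end = buffer.find(ETX, start + 1)
        if end == -1:
            # Block has started but not finished; keep it for later.
            return frames, buffer[start:]
        frames.append(buffer[start + 1:end])
        buffer = buffer[end + 1:]
```

A caller would keep one remainder per connection and feed each newly received chunk through extract_frames(remainder + chunk), passing every returned frame on to the parser.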

Here's what I have thus far for the socket. It works, though it has not
been tested with multiple connections yet. How could it be improved to be
more Pythonic, robust and efficient?

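#!/usr/bin/env python

import socket
import sys  # needed for sys.exit() below

# receive socket
sockrx = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sockrx_address = ('', 2008)  # '' binds to all local interfaces
print 'opening sockrx on %s port %s' % sockrx_address
try:
    sockrx.bind(sockrx_address)
except socket.error as msg:
    print 'Bind failed. Error code: ' + str(msg)
    sys.exit(1)  # non-zero exit status signals the failure

# 5 is the backlog: pending connections queued before accept() takes them
sockrx.listen(5)
print 'listening'

# wait for a connection (blocks until one client connects)
connection, client_address = sockrx.accept()
# read at most 8192 bytes; this may be only part of a message
data = connection.recv(8192)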

I have questions including:
- Should I increase the 5 connection limit to 30+ as it may be listening to
that many machines?
- Buffer size is set at 8192, yet the messages may be much larger, can I
safely increase that? Should I?
- I'm presuming that the OS handles assembly of a message that comes in
more than one packet using the TCP/IP protocols, but is that true?
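
On the last two questions, a small sketch may help. TCP delivers an ordered stream of bytes, not messages, so a single recv() call can return part of a block, exactly one block, or the end of one block plus the start of the next, whatever the buffer size. The function below is a hypothetical helper (recv_block is not a socket-library call) that keeps reading until an ETX byte has been seen. It assumes one block per read cycle and simply returns any bytes that follow the ETX along with it.

```python
import socket

ETX = b"\x03"  # end-of-text marker from the post


def recv_block(conn, bufsize=8192):
    """Call recv() repeatedly until an ETX byte arrives or the peer closes.

    bufsize only caps how much one recv() call may return; it does not
    need to be as large as the largest message.
    """
    chunks = []
    while True:
        chunk = conn.recv(bufsize)
        if not chunk:
            # Empty result: the other side closed the connection.
            break
        chunks.append(chunk)
        if ETX in chunk:
            break
    return b"".join(chunks)
```

Used as data = recv_block(connection) in place of the single recv() call, this would collect a whole block even when it arrives in several pieces.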

regards, Richard

