[Tutor] reading an input stream
Cameron Simpson
cs at zip.com.au
Thu Jan 7 17:07:58 EST 2016
On 08Jan2016 08:52, Cameron Simpson <cs at zip.com.au> wrote:
[...]
>Instead, gather the data progressively and emit XML chunks. You've got a TCP
>stream - the TCPServer class will do an accept and hand you an _unbuffered_
>binary stream file from which you can just .read(), ignoring any arbitrary
>"packet" sizes. For example (totally untested) using a generator:
[...]
Just a few followup remarks:
This is all Python 3, where bytes and strings are cleanly separated. You've got
a binary stream with binary delimiters, so we're reading binary data and
returning the binary XML in between. We separately decode this into a string
for handing to your XML parser. Just avoid Python 2 altogether; this can all be
done in Python 2, but it is less clean and more confusing.
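To make the bytes/str split concrete, here is a minimal sketch. The chunk
content and the choice of UTF-8 are assumptions for illustration only; use
whatever encoding the sending side really uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical chunk, as it would come off a binary stream: bytes, not str.
raw_chunk = b"<msg>caf\xc3\xa9</msg>"

# Decode explicitly before handing text to the parser.
# UTF-8 is an assumption here, not something known about the real data.
text = raw_chunk.decode("utf-8")

# The parser now receives an ordinary Python 3 string.
root = ET.fromstring(text)
print(type(raw_chunk).__name__, type(text).__name__, root.tag)
```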
The socketserver module is... annoyingly vague about what the .rfile property
gets you. It says "a file-like object". That should be a nice buffered binary
stream (an io.BufferedIOBase subclass such as io.BufferedReader, rather than
io.BytesIO, which is an in-memory buffer) with a .read1() method, but
conceivably it is not. I'm mentioning this because I've noticed that the code
I lifted the TCPServer setup from seems to make a binary file object from
whole cloth by doing:
    fp = os.fdopen(os.dup(request.fileno()),"rb")
You'd hope that isn't necessary here, and that request.rfile is a nice
buffered binary stream already.
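Rather than hoping, you can check what .rfile actually is. The sketch below
is not the code from the earlier message; InspectHandler, probe and the seen
dict are hypothetical names. It starts a throwaway TCPServer on a free local
port, sends it a few bytes, and records what the handler sees. Note that
inside a StreamRequestHandler the stream is self.rfile and self.request is
the underlying socket.

```python
import io
import socket
import socketserver
import threading

seen = {}  # filled in by the handler so it can be inspected afterwards


class InspectHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Record what kind of object rfile is, then do one read1() call.
        seen["type"] = type(self.rfile).__name__
        seen["buffered"] = isinstance(self.rfile, io.BufferedIOBase)
        seen["has_read1"] = hasattr(self.rfile, "read1")
        seen["data"] = self.rfile.read1(8192)


def probe():
    # Port 0 asks the OS for any free port.
    with socketserver.TCPServer(("127.0.0.1", 0), InspectHandler) as server:
        worker = threading.Thread(target=server.handle_request)
        worker.start()
        with socket.create_connection(server.server_address) as conn:
            conn.sendall(b"hello")
        worker.join()
    return seen


if __name__ == "__main__":
    print(probe())
```

If "buffered" and "has_read1" both come back true on your Python, the
os.fdopen(os.dup(...)) step should not be needed.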
In xml_extractor, the "# locate start of XML chunk" loop could be improved by
using .find exactly as the "# gather XML chunk" loop does; I started with
.read(1) instead of .read1(8192), which is why it does things byte by byte.
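As a rough sketch of that idea (not the original xml_extractor; the name
extract_chunks and the start/end markers are hypothetical stand-ins for your
real binary delimiters), both the start search and the end search can use
.find on a buffer that is topped up with .read1():

```python
import io


def extract_chunks(fp, start, end, bufsize=8192):
    """Yield each bytes chunk running from `start` through `end` inclusive."""
    buf = b""
    while True:
        s = buf.find(start)
        if s >= 0:
            # Start marker found: look for the end marker after it.
            e = buf.find(end, s + len(start))
            if e >= 0:
                stop = e + len(end)
                yield buf[s:stop]
                buf = buf[stop:]
                continue
            # No end marker yet: drop any junk before the start marker.
            buf = buf[s:]
        else:
            # No start marker: keep only a tail that might hold part of one,
            # in case the marker straddles two reads.
            keep = len(start) - 1
            buf = buf[-keep:] if keep else b""
        data = fp.read1(bufsize)
        if not data:
            return  # EOF; any incomplete chunk is discarded
        buf += data


if __name__ == "__main__":
    # io.BytesIO stands in for the socket stream; it also has .read1().
    demo = io.BytesIO(b"junk<a>one</a>noise<a>two</a>tail<a>partial")
    for chunk in extract_chunks(demo, b"<a>", b"</a>", bufsize=3):
        print(chunk.decode("utf-8"))
```

The tiny bufsize in the demo is only there to force the markers to straddle
reads; in real use the default is fine. Each yielded chunk is still bytes and
needs decoding before it goes to the XML parser.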
Cheers,
Cameron Simpson <cs at zip.com.au>