Need a specific sort of string modification. Can someone help?

Ian Kelly ian.g.kelly at gmail.com
Sat Jan 5 21:04:56 CET 2013


On Sat, Jan 5, 2013 at 8:57 AM, Chris Angelico <rosuav at gmail.com> wrote:
> You miss my point, though. I went for simple Pythonic code, and never
> measured its performance, on the expectation that it's "good enough".
> Written in C, the state machine is probably WAY faster than splitting
> and then iterating. My C++ MUD client uses code similar to that to
> parse TELNET and ANSI codes from a stream of bytes in a socket (and
> one of its "states" is that there's no more data available, so wait on
> the socket); the rewrite in a high level language divides the string
> on "\xFF" for TELNET and "\x1B" for ANSI, working them separately, and
> then afterward splits on "\n" to divide into lines. The code's much
> less convoluted, it's easier to test different parts (because I can
> simply call the ANSI parser with a block of text), and on a modern
> computer, you can't see the performance difference (since you spend
> most of your time waiting for socket data anyway).
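
For anyone following along, the split-based approach described above
might look roughly like the sketch below.  This is only my guess at the
shape of it, not Chris's actual code: the name split_ansi and the event
tuples are made up for illustration, it handles only CSI sequences
(ESC followed by "["), and it silently drops an incomplete sequence at
the end of the buffer.

```python
def split_ansi(data: bytes):
    """Split a buffer on ESC and classify the pieces (hypothetical helper)."""
    # Everything before the first ESC is plain text; every later chunk
    # starts immediately after an ESC byte.
    first, *rest = data.split(b"\x1b")
    events = []
    if first:
        events.append(("text", first))
    for chunk in rest:
        if chunk.startswith(b"["):
            # CSI sequence: parameter bytes run up to a final byte
            # in the range 0x40-0x7E.
            for i in range(1, len(chunk)):
                if 0x40 <= chunk[i] <= 0x7E:
                    events.append(("csi", chunk[1:i], chr(chunk[i])))
                    tail = chunk[i + 1:]
                    break
            else:
                # Assumption: an unterminated sequence is simply dropped.
                tail = b""
        else:
            # Not a CSI sequence; this sketch treats the rest as text.
            tail = chunk
        if tail:
            events.append(("text", tail))
    return events
```

Because it takes a whole block of bytes and returns a list, it can be
called directly with test data, which is the testability benefit
mentioned above.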

Anecdotally and somewhat off-topic, when I wrote my own MUD client in
Python, I implemented both TELNET and ANSI parsing in Python using a
state machine processing one byte at a time (actually two state
machines - one at the protocol layer and one at the client layer; the
telnet module is a modified version of the twisted.conch.telnet
module).  I was worried that the processing would be too slow in
Python.  When I got it running, there was indeed a noticeable lag
between input being received and displayed.  But when I profiled it,
the culprit turned out to be the rich text control I was using, which
was written in C++ and wasn't able to handle a large buffer
gracefully.  The parsing code in Python did just fine and didn't
contribute to the lag at all.
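
To give an idea of what "a state machine processing one byte at a time"
means here, below is a minimal sketch for the ANSI side only.  It is
not the code from my client: the names State and parse_ansi are
invented for this example, only CSI sequences (ESC "[" ... final byte)
are recognized, and state is not carried over between calls, which a
real client reading from a socket would need.

```python
from enum import Enum, auto


class State(Enum):
    TEXT = auto()  # ordinary text
    ESC = auto()   # just saw ESC (0x1B)
    CSI = auto()   # inside ESC [ ... collecting parameter bytes


def parse_ansi(data: bytes):
    """Walk the buffer one byte at a time (illustrative sketch)."""
    events = []
    state = State.TEXT
    text = bytearray()
    params = bytearray()
    for byte in data:
        if state is State.TEXT:
            if byte == 0x1B:
                # Flush pending text before starting an escape sequence.
                if text:
                    events.append(("text", bytes(text)))
                    text.clear()
                state = State.ESC
            else:
                text.append(byte)
        elif state is State.ESC:
            if byte == ord("["):
                params.clear()
                state = State.CSI
            else:
                # Assumption: non-CSI escapes are not handled; the byte
                # is treated as text and the ESC itself is dropped.
                text.append(byte)
                state = State.TEXT
        else:  # State.CSI
            if 0x40 <= byte <= 0x7E:
                # Final byte ends the sequence.
                events.append(("csi", bytes(params), chr(byte)))
                state = State.TEXT
            else:
                params.append(byte)
    if text:
        events.append(("text", bytes(text)))
    return events
```

Even with a Python-level loop over every byte like this, the parsing
was not where the time went.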


