asyncore/asynchat and terminator string

Hendrik van Rooyen mail at microcorp.co.za
Wed Jan 17 02:57:33 EST 2007


 "David Hirschfield" <davidh at ilm.com> wrote:

8< --------------- problems on syncing up in serial comms -----------------

I have seen people address this with success by using stuff like:

"XXHEADERXX" as a "here starts the lesson" identifier, with no 
trouble, on a high volume newsfeed.

If you assume that any binary value is possible and equally likely, 
then the problem looks hopeless, but in practice if you follow the 
identifier with a length, which automatically points to the next 
"instance" of the "here starts the lesson" in the stream, then a 
false match becomes extremely unlikely in the wild.  If you also 
repeat the length at the end, then you can scan backwards through 
the stream. And if it's not like that, it's a "false sync"...
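
A minimal sketch of the identifier-plus-length idea, in Python 3 on 
bytes.  The 4-byte big-endian length field and the names pack_record 
and find_records are assumptions made up for illustration, not part 
of any particular newsfeed format:

```python
import struct

HEADER = b"XXHEADERXX"        # the "here starts the lesson" identifier
LEN = struct.Struct(">I")     # assumed: 4-byte big-endian length field


def pack_record(payload: bytes) -> bytes:
    """Identifier, length, payload, then the length repeated at the end."""
    n = LEN.pack(len(payload))
    return HEADER + n + payload + n


def find_records(stream: bytes):
    """Yield payloads from a buffer, skipping anything that is a false sync."""
    pos = 0
    while True:
        start = stream.find(HEADER, pos)
        if start < 0:
            return
        len_at = start + len(HEADER)
        body = len_at + LEN.size
        if body > len(stream):
            return                      # not enough bytes left for a length
        (n,) = LEN.unpack_from(stream, len_at)
        end = body + n
        tail_end = end + LEN.size
        # A real record has the same length repeated after the payload.
        if tail_end <= len(stream) and stream[end:tail_end] == stream[len_at:body]:
            yield stream[body:end]
            pos = tail_end              # the length points at the next record
        else:
            pos = start + 1             # false sync: keep scanning
```

The trailing copy of the length is what lets a reader step backwards 
from the end of one record to its start; the sketch only uses it as a 
consistency check.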

The other way is to take a leaf out of the bit oriented protocols'
book, and to "frame" the "packets" between (possibly repetitious)
occurrences of a character that is guaranteed not to occur in the
data, known as a "flag" character.

You do this by a process that is called "escaping" the occurrences 
of the flag character in the data with yet another "escape char", 
that also needs special treatment if it happens in the data...

Escaping can be accomplished by replacing each instance of the
"Poison Characters" by an instance of the escape char, followed 
by the bitwise inversion of the poison char.  Unescaping has to
do the reverse.

Using a tilde "~" as a flag char, and the ordinary slash as an 
escape char works well.

So a packet with both poison chars in it will look like this:

"~~ordinary chars followed by tilde" + "/\x81"+
"ordinary followed by slash" +"/\xd0" + "somestuff~~"

So you sync up by looking for a tilde followed by a 
non-tilde, and the end is when you hit a tilde again.

To unescape, you look for slashes and invert the chars
following them.
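
A rough sketch of the flag-and-escape scheme described above, in 
Python 3 on bytes.  The function names are hypothetical, and treating 
everything before the first flag and after the last flag as noise or 
an incomplete packet is my assumption:

```python
FLAG = 0x7E   # "~" frames the packets
ESC = 0x2F    # "/" marks an escaped poison char


def escape(data: bytes) -> bytes:
    """Replace each poison char by ESC followed by its bitwise inversion."""
    out = bytearray()
    for b in data:
        if b in (FLAG, ESC):
            out.append(ESC)
            out.append(b ^ 0xFF)
        else:
            out.append(b)
    return bytes(out)


def unescape(data: bytes) -> bytes:
    """Look for slashes and invert the chars following them."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b == ESC:
            nxt = next(it, None)
            if nxt is None:
                raise ValueError("escape char at end of packet")
            b = nxt ^ 0xFF
        out.append(b)
    return bytes(out)


def frame(payload: bytes, flags: int = 2) -> bytes:
    """Put the escaped payload between (possibly repeated) flag chars."""
    edge = bytes([FLAG]) * flags
    return edge + escape(payload) + edge


def extract_frames(stream: bytes):
    """Split on flags; a packet is a non-empty run between two flags."""
    parts = stream.split(bytes([FLAG]))
    # parts[0] precedes the first flag, parts[-1] follows the last one.
    return [unescape(p) for p in parts[1:-1] if p]
```

For example, escape(b"a~b/c") gives b"a/\x81b/\xd0c", matching the 
inverted values in the packet shown above.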

Also Yahoo for "netstrings"
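
For reference, a netstring is the payload length in ASCII decimal, a 
colon, the payload, and a trailing comma, so b"hello" goes out as 
b"5:hello,".  A small sketch in Python 3; the function names are 
mine, and it does not enforce any maximum length:

```python
def encode_netstring(payload: bytes) -> bytes:
    """Length in ASCII decimal, colon, payload, trailing comma."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(buf: bytes):
    """Return (payload, rest of buffer); ValueError if malformed or short."""
    head, sep, tail = buf.partition(b":")
    if not sep or not head.isdigit():
        raise ValueError("missing or bad length prefix")
    n = int(head.decode("ascii"))
    if len(tail) < n + 1 or tail[n:n + 1] != b",":
        raise ValueError("payload incomplete or missing trailing comma")
    return tail[:n], tail[n + 1:]
```

Because the length comes first, the payload can contain any byte 
values, and nothing needs escaping.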

If you want to be Paranoid, then you also implement message
numbers, to make sure you don't lose packets, and for hyper
paranoia, put in a BCC (block check character) or LRC
(longitudinal redundancy check) to make sure that what you 
send off is what is received...
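
A sketch of the paranoid version in Python 3, assuming a one-byte 
message number in front of the payload and an LRC computed as the XOR 
of all bytes (one common definition; others exist).  The names seal 
and unseal are hypothetical:

```python
from functools import reduce
from operator import xor


def lrc(data: bytes) -> int:
    """XOR of all bytes; assumed definition of the check character."""
    return reduce(xor, data, 0)


def seal(seq: int, payload: bytes) -> bytes:
    """Prefix a one-byte message number, append the check character."""
    body = bytes([seq & 0xFF]) + payload
    return body + bytes([lrc(body)])


def unseal(packet: bytes):
    """Return (message number, payload); ValueError if the check fails."""
    # XOR over body plus its own check character comes out as zero.
    if len(packet) < 2 or lrc(packet) != 0:
        raise ValueError("bad check character")
    return packet[0], packet[1:-1]
```

The receiver compares each message number with the previous one to 
spot lost packets.  The sealing would be done before escaping and 
framing, so that the check character itself gets escaped if it 
happens to be a poison char.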

If you are running over Ethernet, the last lot is not warranted,
as it's done anyway - but on a serial port, where you are on 
your own, it makes sense.

HTH - Hendrik



