[Tutor] [Fwd: Re: Consistent Overhead Byte Stuffing (COBS) algorithm help]

Sat Oct 8 13:55:15 CEST 2005

Michael Cotherman wrote:
> The c code seems to be walking through the list moving
> bytes from src to dst, but the python code below seems
> to take one byte from src, start counting up to the
> value from 1 and appending each and every value along
> the way to dst, no?

Ah, right you are. You were commenting on Alan's code and I replied as if you were commenting on mine. Doh! Sorry about that!

Alan was just trying to sketch out the Python equivalent to the C code and his translation has a bug - the line
      dst.append(i)
should be
      dst.append(src[i])

But...I posted complete, working code that duplicates your results. Did you see it? Does it work for you?

In my code, instead of a loop that copies the fragment from src to dst one byte at a time, I just copy the complete chunk in one go with a slice:
       dst.append(src[current+1:current+count])
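
To make the equivalence concrete, here is a small sketch comparing the byte-by-byte loop with the slice. It assumes Python 3 bytes (where indexing gives an int, so no ord() is needed), and the sample buffer is made up for illustration, not taken from Michael's data.

```python
# Sketch only: Python 3 bytes assumed, sample values invented for illustration.
src = bytes([0x04, 0xDB, 0x20, 0x3F, 0x01])
current = 0
count = src[current]      # 4: a count byte followed by count-1 data bytes

# Byte-by-byte copy, in the style of the C loop (i runs from 1 to count-1)
looped = bytearray()
for i in range(1, count):
    looped.append(src[current + i])

# The same bytes taken as a single slice
sliced = src[current + 1:current + count]

print(bytes(looped) == sliced)   # True
```

Note that the loop has to index relative to current; using i alone only works for the first chunk.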

I'll repeat my original reply below in case you lost it.

Kent

Michael Cotherman wrote:

> I am a noob to converting pointers in C++ to arrays in
> python, although the first time I see it done, I will
> have no problem. Can you help converting the below
> (what I think is the 'decoder' section) to python?

OK I'll bite. Your code is copying from a src buffer to a dst buffer. The src buffer has a count byte followed by count-1 bytes of data. If the count is less than 0xFF, a zero byte has to be inserted into the dst. So the decode loop gets a count, copies count-1 bytes to the dst buffer, then appends a 0 if the count is less than 0xFF.
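
To see that layout on real data, here is a rough sketch that only splits a buffer into its (count, data) groups without building the output. It assumes Python 3 bytes, and split_groups is a hypothetical helper name of mine, not something from the C code.

```python
def split_groups(src):
    """Return the (count, data) pairs found in a COBS-style buffer.

    Hypothetical helper for illustration; assumes src is Python 3 bytes.
    """
    groups = []
    current = 0
    while current < len(src):
        count = src[current]                      # the count byte
        data = src[current + 1:current + count]   # the count-1 data bytes
        groups.append((count, data))
        current += max(count, 1)                  # a count of 0 must still advance
    return groups

encoded = bytes.fromhex('0002860104DB203F0100')
for count, data in split_groups(encoded):
    note = 'then a zero byte' if count < 0xFF else ''
    print(count, data.hex(), note)
```

With the sample data every count is below 0xFF, so each group is followed by a zero byte in the output.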

Python doesn't have a direct equivalent to this sort of manipulation of memory pointers and raw memory buffers. They are commonly replaced by strings or lists. My function takes a string as an input argument, builds the output as string fragments in a list, then builds a string to return.

Note that Python strings include an implicit length so there is no need to pass and return a length argument.
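
As a minimal sketch of both points (building from fragments, and the length travelling with the object), assuming Python 3 bytes rather than the Python 2 strings used in this thread, and with made-up fragment values:

```python
# Sketch only: Python 3 bytes assumed, fragment values invented for illustration.
fragments = []                    # plays the role of the dst buffer
fragments.append(b'\x86')
fragments.append(b'\x00')
fragments.append(b'\xdb\x20\x3f')
fragments.append(b'\x00')

result = b''.join(fragments)      # one bytes object built from the pieces
print(len(result))                # 6: no separate length argument needed

# An alternative that is closer to a writable C buffer: a bytearray
buf = bytearray()
for piece in fragments:
    buf += piece
print(bytes(buf) == result)       # True
```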

I included a test case with the data from your previous email. Your data is evidently hex-encoded as well as COBS encoded - in other words your strings are hex values where each two chars represent a single byte. I have converted to and from byte strings to feed this to the decoder.
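
For comparison, if you are on Python 3 (an assumption; the code in this message is Python 2), the same hex round trip can be sketched with the built-in bytes.fromhex() and bytes.hex() (the latter needs Python 3.5 or later):

```python
hex_text = '0002860104DB203F0100'

raw = bytes.fromhex(hex_text)     # each two hex chars become one byte
print(len(hex_text), len(raw))    # 20 10

print(raw.hex())                  # 0002860104db203f0100 (lowercase)
print(raw.hex().upper() == hex_text)   # True
```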

Kent

def unstuff(src):
   # src is a COBS-encoded byte string
   current = 0     # index into src
   dst = []        # a list that will build the result

   while current < len(src):
       # Get the count and convert it to an integer
       count = ord(src[current])

       # Append count-1 chars from src to dst
       dst.append(src[current+1:current+count])

       # Do we need to add a zero byte?
       if count < 0xFF:
           dst.append('\x00')

       # Advance the index; a count of 0 must still move forward one byte
       if count>0:
           current += count
       else:
           current += 1

   # dst is a list of string fragments; this converts it to a single string
   return ''.join(dst)

def hexToString(hexStr):
   ''' Convert a string of two-digit hex values to a string of bytes with those values '''
   return ''.join([chr(int(hexStr[i:i+2], 16)) for i in range(0, len(hexStr), 2)])

def stringToHex(src):
   ''' Convert a byte string to a string of two-digit hex values '''
   return ''.join([ '%02x' % ord(s) for s in src ])

if __name__ == '__main__':
   data = '0002860104DB203F0100'
   print data
   data = hexToString(data)
   print
   newData = unstuff(data)
   print stringToHex(newData)
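
The code above is Python 2. If you need it on Python 3, a rough port might look like the sketch below. The name unstuff_bytes and the use of bytes/bytearray are my assumptions here; the logic is meant to mirror unstuff() line for line.

```python
def unstuff_bytes(src):
    """Python 3 sketch of unstuff(); src is a bytes object.

    Hypothetical port for illustration, mirroring the loop above.
    """
    dst = bytearray()
    current = 0
    while current < len(src):
        count = src[current]                     # indexing bytes gives an int
        dst += src[current + 1:current + count]  # copy count-1 data bytes
        if count < 0xFF:
            dst.append(0)                        # restore the stripped zero byte
        current += count if count > 0 else 1     # never stall on a 0 count
    return bytes(dst)

decoded = unstuff_bytes(bytes.fromhex('0002860104DB203F0100'))
print(decoded.hex())   # 00860000db203f000000
```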

> 
> -mike
> 
> 
> --- Alan Gauld <alan.gauld at freenet.co.uk> wrote:
> 
> 
>>>I am a noob to converting pointers in C++ to arrays in
>>>python, although the first time I see it done, I will
>>>have no problem. Can you help converting the below
>>>(what I think is the 'decoder' section) to python?
>>
>>It won't be working code but I think this is what's
>>happening...
>>
>>
>>>UINT CCobsPackets::UnStuffData(unsigned char *src,
>>>unsigned char *dst, UINT length)
>>
>>def UnStuffData(src,dst,len):
>>
>>
>>>{
>>>unsigned char *dstStart = dst;
>>>unsigned char *end = src + length;
>>
>># I don't think these are needed for Python.
>>
>>
>>>while (src < end)
>>
>>for code in src:
>>
>>
>>>{
>>>int code = *src++;
>>>for (int i=1; i<code; i++) 
>>>{
>>>*dst++ = *src++;
>>>}
>>
>>    for i in range(1,code):
>>       dst.append(i)
>>
>>
>>>if (code < 0xFF) 
>>>{
>>>*dst++ = 0;
>>>}
>>
>>    if code < 0xff:
>>       dst.append('\0')   # may not be needed in python...
>>
>>
>>>}
>>>return (UINT)(dst - dstStart);
>>>}