Extension of struct to handle non byte aligned values?
I'm attempting to write a Packet class, and a few other classes, for use in writing protocol conformance tests. For the most part this is going well, except that I'd like to be able to pack and unpack byte strings containing values whose widths are not multiples of 8 bits. As an example, I'd like to be able to grab just a single bit from a byte string, and I'd also like to modify, for example, a 13-bit field. These are all reasonable quantities in an IPv4 packet. I have looked at doing this all in Python within my own classes, but I believe this is a general extension that would be good for the struct module. I could also write a new module, bitstruct, to do this, but that seems silly. I did not find anything out there that handles this case, so if I missed something, please let me know.
My proposal would be for a new format character, 'z', which is followed by a position in bits from 0 to 31, so that we get either a byte, halfword, or longword based byte string back, and then an optional 'r' (for run length, and because 'l' and 's' are already used) followed by a number of bits. The default length is 1 bit. I believe this is sufficient for most packet protocols I know of because, for the most part, protocols try to be 32- or 64-bit aligned. This would ALWAYS unpack into an int type. So, you would see this:
bytestring = pack("z0r3z3r13", flags, fragment)
This would pack the flags and fragment offset of a packet into bits 0-2 and 3-15 respectively (3 bits and 13 bits) and return a 2-byte byte string.
header_length = unpack("z4r4", packet.bytes)
would retrieve the header length from the packet, which occupies the 4 bits from bit 4 through bit 7.
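For comparison, the same fields can be handled today in pure Python with shifts and masks. The following is only a sketch, not the proposed struct extension: `pack_bits` and `unpack_bits` are hypothetical helper names, and the sketch assumes bit 0 is the most significant bit of the first byte, as in the usual IPv4 header diagrams.

```python
def unpack_bits(data, pos, run=1):
    """Return `run` bits starting at bit `pos` as an int.

    Assumption: bit 0 is the most significant bit of data[0].
    """
    total = len(data) * 8
    if pos < 0 or run < 1 or pos + run > total:
        raise ValueError("bit field out of range")
    value = int.from_bytes(data, "big")
    shift = total - (pos + run)          # distance from the field to the low end
    return (value >> shift) & ((1 << run) - 1)


def pack_bits(nbytes, fields):
    """Build an nbytes-long byte string from (pos, run, value) triples."""
    total = nbytes * 8
    result = 0
    for pos, run, value in fields:
        if pos < 0 or run < 1 or pos + run > total:
            raise ValueError("bit field out of range")
        if not 0 <= value < (1 << run):
            raise ValueError("value does not fit in field")
        result |= value << (total - (pos + run))
    return result.to_bytes(nbytes, "big")


# Flags (3 bits) and fragment offset (13 bits) share two bytes of an IPv4 header.
flags, fragment = 0b010, 185
packed = pack_bits(2, [(0, 3, flags), (3, 13, fragment)])
assert unpack_bits(packed, 0, 3) == flags
assert unpack_bits(packed, 3, 13) == fragment

# The header length is the low nibble of the first header byte (bits 4 to 7).
first_byte = bytes([0x45])
assert unpack_bits(first_byte, 4, 4) == 5
```

The (position, run) pairs here play the role of the 'z' and 'r' arguments in the proposed format string. The fields are passed as tuples only to keep the sketch short.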
"George V. Neville-Neil" firstname.lastname@example.org writes:
I'm attempting to write a Packet class, and a few other classes for use in writing protocol conformance tests. For the most part this is going well except that I'd like to be able to pack and unpack byte strings with values that are not 8 bit based quantities.
Well, the main thing that comes to mind is that I wouldn't regard the struct interface as being something totally wonderful and perfect.
I am aware of a few attempts to make up a better interface, such as ctypes and Bob's rather similar looking ptypes from macholib:
and various silly unreleased things I've done. They all work on the basic idea of a class schema that describes the binary structure, e.g.:
    class Sound(Message):
        code = 0x06
        layout = [
            ('mask', BYTE()),
            ('vol', CDI(1, SDI(BYTE(), 1/255.0), 1.0)),
            ('attenuation', CDI(2, SDI(BYTE(), 1/64.0), 1.0)),
            ('entitychan', SHORT()),
            ('soundnum', BYTE()),
            ('origin', COORD()*3),
        ]
You may want to do something similar (presumably the struct module or some other c stuff would be under the hood somewhere).
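As a rough illustration of that layering, here is a minimal sketch of a schema class with the struct module under the hood. It is not the unreleased code referred to above: the `Message` base class, its `pack` and `unpack` methods, and the use of plain struct format codes in `layout` are all assumptions made for the example, and the conditional and scaled fields (CDI, SDI, COORD) are left out.

```python
import struct


class Message:
    """Hypothetical base class: subclasses list (field name, struct code) pairs."""

    byte_order = "!"  # network byte order, an assumption for this sketch
    layout = []

    @classmethod
    def _format(cls):
        # Join the per-field codes into one struct format string, e.g. "!BHB".
        return cls.byte_order + "".join(code for _, code in cls.layout)

    @classmethod
    def unpack(cls, data):
        values = struct.unpack(cls._format(), data)
        return {name: value for (name, _), value in zip(cls.layout, values)}

    @classmethod
    def pack(cls, **fields):
        ordered = (fields[name] for name, _ in cls.layout)
        return struct.pack(cls._format(), *ordered)


class Sound(Message):
    # Simplified stand-in for the schema above: fixed-width integers only.
    layout = [("mask", "B"), ("entitychan", "H"), ("soundnum", "B")]


raw = Sound.pack(mask=1, entitychan=0x0203, soundnum=7)
assert raw == b"\x01\x02\x03\x07"
assert Sound.unpack(raw) == {"mask": 1, "entitychan": 0x0203, "soundnum": 7}
```

Sub-byte fields could be added to such a schema by giving a field a bit position and width instead of a struct code, without any change to the struct module itself.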
I don't really see a need to change CPython here, unless some general binary parsing scheme becomes best-of-breed and a candidate for stdlib inclusion.
PS: This is probably more comp.lang.python material.
George V. Neville-Neil