Parsing Binary Structures; Is there a better way / What is your way?

Hendrik van Rooyen hendrik at microcorp.co.za
Thu Aug 6 03:39:33 EDT 2009


On Wednesday 05 August 2009 16:46:13 Martin P. Hellwig wrote:
> Hi List,
>
> On several occasions I have needed (and built) a parser that reads a
> binary piece of data with custom structure. For example (bogus one):
>
> BE
> +---------+---------+-------------+-------------+------+--------+
> | Version | Command | Instruction | Data Length | Data | Filler |
> +---------+---------+-------------+-------------+------+--------+
> Version: 6 bits
> Command: 4 bits
> Instruction: 5 bits
> Data Length: 5 bits
> Data: 0-31 bits
> Filler: 0 bits padding the packet to a length divisible by 8
>
> what I usually do is read the packet in binary mode, convert the output
> to a concatenated 'binary string' (e.g. '0101011000110') and then use
> slice indices to get the right data portions.
> Depending on what I need to do with these portions I convert them to
> whatever is handy (usually an integer).
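
For reference, the slicing approach described in the quote might look
roughly like the sketch below (Python 3 syntax). The function name is
made up for illustration, and it assumes that Data Length counts the
number of data bits, which the bogus example does not actually state.

```python
def parse_packet_bitstring(packet):
    """Parse the bogus example packet by slicing a string of '0'/'1' characters.

    Assumption: the 5-bit Data Length field gives the number of data bits.
    """
    # Eight characters per byte, most significant bit first (big-endian).
    bits = "".join(format(byte, "08b") for byte in packet)

    version = int(bits[0:6], 2)        # 6 bits
    command = int(bits[6:10], 2)       # 4 bits
    instruction = int(bits[10:15], 2)  # 5 bits
    length = int(bits[15:20], 2)       # 5 bits

    # 0-31 data bits follow the 20-bit header; the rest is zero filler.
    data_bits = bits[20:20 + length]
    data = int(data_bits, 2) if data_bits else 0

    return {
        "version": version,
        "command": command,
        "instruction": instruction,
        "length": length,
        "data": data,
    }
```

With the three bytes 0x0D 0x62 0x4A this yields version 3, command 5,
instruction 17, length 4 and data 10, with no filler needed.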

This bit banging stuff is a PITA, no matter what you do.
Python itself does not have bit fields like C (ctypes.Structure aside).
And C bit fields are implementation dependent.
Write an extension?

Some time ago I asked a similar question, and
Castironpi came up with what was essentially an
indexed integer, with named bits.

It stores the bits natively, but I suspect that the
price paid is access time.

I enclose a module that you can adapt.
It talks about bytes but they are integers really.
It is different from what you are doing, as it
was aimed at reading and writing bits in 
a hardware context.

If you get your head around the concept,
then it may give you some ideas.  It should
be possible to extend the concept to
pass (name, length) tuples at construction time
instead of just a name with an implied length
of one bit, and it may make sense to change
the underlying type from an integer to an
array of one-byte integers.
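
A rough sketch of what that extension to (name, length) tuples could
look like, keeping a plain integer as the underlying store. This is not
taken from the attached module; the class name BitFields, its get and
set methods and the HEADER_LAYOUT list are all invented for illustration.

```python
class BitFields:
    """Hypothetical sketch: named multi-bit fields packed into one integer."""

    def __init__(self, fields, value=0):
        # fields is a sequence of (name, length) tuples, most significant
        # field first, matching a big-endian packet layout.
        self.total_bits = sum(length for _, length in fields)
        self._layout = {}
        offset = self.total_bits
        for name, length in fields:
            offset -= length
            # Remember each field's shift and its unshifted mask.
            self._layout[name] = (offset, (1 << length) - 1)
        self.value = value

    def get(self, name):
        shift, mask = self._layout[name]
        return (self.value >> shift) & mask

    def set(self, name, field_value):
        shift, mask = self._layout[name]
        if not 0 <= field_value <= mask:
            raise ValueError("value does not fit in field %r" % name)
        # Clear the old field bits, then OR in the new ones.
        self.value = (self.value & ~(mask << shift)) | (field_value << shift)


# The 20-bit header of the bogus packet from the original question.
HEADER_LAYOUT = [
    ("version", 6),
    ("command", 4),
    ("instruction", 5),
    ("length", 5),
]
```

With value=0x0D624 (the top 20 bits of the bytes 0x0D 0x62 0x4A),
get("version") gives 3 and get("command") gives 5. In Python 3 the
integer can be built from the raw bytes with int.from_bytes(data, "big").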

It is nice to be able to say:
val = bitname() 
to read, and 
bitname(1)
or
bitname(0)
to write.

I can also write:
outputbits[3] = 1
or
val = inputbits[5]
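
The attachment is not reproduced here, so the following is only a
minimal sketch of how such an interface could be built, not the
contents of bits.py. The class name Bits and the method named are
assumptions made for the example.

```python
class Bits:
    """Hypothetical sketch: an integer whose bits are addressed by index."""

    def __init__(self, value=0):
        self.value = value

    def __getitem__(self, index):
        # Read one bit; index 0 is the least significant bit.
        return (self.value >> index) & 1

    def __setitem__(self, index, bit):
        # Set or clear one bit, leaving the others untouched.
        if bit:
            self.value |= 1 << index
        else:
            self.value &= ~(1 << index)

    def named(self, index):
        """Return a callable bound to one bit: call it with no argument
        to read the bit, or with 0 or 1 to write it."""
        def accessor(bit=None):
            if bit is None:
                return self[index]
            self[index] = bit
        return accessor
```

Given outputbits = Bits() and motor = outputbits.named(0), the calls
motor(1) and outputbits[3] = 1 leave outputbits.value at 9, and motor()
then returns 1.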

If you can successfully generalise it
to field names, it should be very useful.

I cannot think of a way though, to not
have the "in" and "out" split, but you can
program your way around that - you do not
have to update "in place".

- Hendrik

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bits.py
Type: application/x-python
Size: 4379 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20090806/c056e243/attachment.bin>