Library for parsing binary structures
dieter
dieter at handshake.de
Thu Mar 28 04:12:46 EDT 2019
Paul Moore <p.f.moore at gmail.com> writes:
> I'm looking for a library that lets me parse binary data structures.
> The stdlib struct module is fine for simple structures, but when it
> gets to more complicated cases, you end up doing a lot of the work by
> hand (which isn't that hard, and is generally perfectly viable, but
> I'm feeling lazy ;-))
>
> I know of Construct, which is a nice declarative language, but it's
> either weak, or very badly documented, when it comes to recursive
> structures. (I really like Construct, and if I could only understand
> the docs better I may well not need to look any further, but as it is,
> I can't see anything showing how to do recursive structures...) I am
> specifically trying to parse a structure that looks something like the
> following:
>
> Multiple instances of:
> - a type byte
> - a chunk of data structured based on the type
>   types include primitives like byte, integer, etc, as well as
>   (type byte, count, data) - data is "count" occurrences of data of
>   the given type.
What you have is a generalized deserialization problem.
It can be solved with a set of deserializers.
    def deserialize(file):
        """read the beginning of file and return the corresponding object."""
In the above case, you have a mapping "type byte --> deserializer",
called "TYPE" and (obviously) "(" is one such "type byte".
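As a rough sketch of what such a mapping could look like for the primitive types: the type bytes b"B" and b"I", the little-endian layout and the "Primitive" class below are assumptions made for illustration; your format will define its own.

```python
import io
import struct


class Primitive:
    """Deserializer for one fixed-size value described by a struct format."""

    def __init__(self, fmt):
        self._struct = struct.Struct(fmt)

    def deserialize(self, file):
        data = file.read(self._struct.size)
        if len(data) != self._struct.size:
            # A short read means the stream ended in the middle of a value.
            raise EOFError("truncated value")
        return self._struct.unpack(data)[0]


# Hypothetical type bytes; a real format defines its own.
BYTE = b"B"
INT = b"I"

# The mapping "type byte --> deserializer".
TYPE = {
    BYTE: Primitive("<B"),  # unsigned 8-bit integer
    INT: Primitive("<I"),   # unsigned 32-bit little-endian integer
}

stream = io.BytesIO(b"\x07\x2a\x00\x00\x00")
print(TYPE[BYTE].deserialize(stream))  # → 7
print(TYPE[INT].deserialize(stream))   # → 42
```

Every entry exposes the same "deserialize(file)" method, so composite types can be added to the mapping later without changing the callers.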
The deserializer corresponding to "(" is:
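    def sequence_deserialize(file):
        type_byte = file.read(1)
        if not type_byte:
            raise EOFError()
        # "elem_type" avoids shadowing the builtin "type"
        elem_type = TYPE[type_byte]
        count = TYPE[INT].deserialize(file)
        seq = [elem_type.deserialize(file) for _ in range(count)]
        # a file opened in binary mode returns bytes, hence b")"
        if file.read(1) != b")":
            raise ValueError("missing closing ')'")
        return seq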
The top level "deserialize" could look like:
    def top_deserialize(file):
        """generates all values found in *file*."""
        while True:
            type_byte = file.read(1)
            if not type_byte:
                return
            yield TYPE[type_byte].deserialize(file)
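Put together, the pieces above might be assembled as in the following sketch. It is not a definitive implementation: the wire layout (a 4-byte little-endian count, b"I" as the integer type byte, a closing b")") and the helper names "Deserializer", "read_int", "read_sequence" and "u32" are assumptions for illustration. The point is that recursion comes for free: a sequence whose element type is itself "(" just looks up the sequence deserializer again.

```python
import io
import struct


class Deserializer:
    """Wraps a function so that it can be called as obj.deserialize(file)."""

    def __init__(self, func):
        self.deserialize = func


def read_int(file):
    data = file.read(4)
    if len(data) != 4:
        raise EOFError("truncated integer")
    return struct.unpack("<I", data)[0]


def read_sequence(file):
    type_byte = file.read(1)
    if not type_byte:
        raise EOFError("missing element type")
    elem = TYPE[type_byte]  # may be the sequence deserializer itself
    count = TYPE[INT].deserialize(file)
    seq = [elem.deserialize(file) for _ in range(count)]
    if file.read(1) != b")":
        raise ValueError("missing closing ')'")
    return seq


INT = b"I"  # hypothetical type byte for a 32-bit integer
SEQ = b"("  # type byte for a sequence, as in the text above
TYPE = {INT: Deserializer(read_int), SEQ: Deserializer(read_sequence)}


def top_deserialize(file):
    """Generate all values found in *file*."""
    while True:
        type_byte = file.read(1)
        if not type_byte:
            return
        yield TYPE[type_byte].deserialize(file)


def u32(n):
    """Encode n the way read_int expects it (test helper)."""
    return struct.pack("<I", n)


data = (
    b"I" + u32(5)                                    # a single integer
    + b"(" + b"I" + u32(2) + u32(1) + u32(2) + b")"  # sequence of two integers
    + b"(" + b"(" + u32(1)                           # sequence of one sequence
    + b"I" + u32(1) + u32(9) + b")"                  #   inner sequence [9]
    + b")"
)
print(list(top_deserialize(io.BytesIO(data))))  # → [5, [1, 2], [[9]]]
```

An unknown type byte surfaces as a KeyError from the "TYPE" lookup; wrap that lookup if you want a friendlier error message.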