[Python-Dev] an idea for improving struct.unpack api

Thomas Heller theller at python.net
Thu Jan 6 13:22:52 CET 2005


Paul Moore <p.f.moore at gmail.com> writes:

> On Thu, 6 Jan 2005 21:28:26 +1100, Anthony Baxter
> <anthony at interlink.com.au> wrote:
>> My take on this:
>> 
>>     struct.pack/struct.unpack is already one of my least-favourite parts
>>     of the stdlib. Of the modules I use regularly, I pretty much only ever
>>     have to go back and re-read the struct (and re) documentation because
>>     they just won't fit in my brain. Adding additional complexity to them
>>     seems like a net loss to me.
>
> Have you looked at Thomas Heller's ctypes? Ignoring the FFI stuff, it
> has a fairly comprehensive interface for defining and using C
> structure types. A simple example:
>
> >>> class POINT(Structure):
> ...    _fields_ = [('x', c_int), ('y', c_int)]
> ...
> >>> p = POINT(1,2)
> >>> p.x, p.y
> (1, 2)
> >>> str(buffer(p))
> '\x01\x00\x00\x00\x02\x00\x00\x00'
>
> To convert *from* a byte string is messier, but not too bad:
[...]

For reading structures from files, the undocumented (*) readinto
method is very nice. An example:

class IMAGE_DOS_HEADER(Structure):
    ....
class IMAGE_NT_HEADERS(Structure):
    ....

class PEReader(object):
    def read_image(self, pathname):
        ################
        # the MSDOS header
        image = open(pathname, "rb")
        self.dos_header = IMAGE_DOS_HEADER()
        image.readinto(self.dos_header)

        ################
        # The PE header
        image.seek(self.dos_header.e_lfanew)
        self.nt_headers = IMAGE_NT_HEADERS()
        image.readinto(self.nt_headers)

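A self-contained sketch of the same readinto pattern, for anyone who
wants to try it without a PE file at hand. It assumes Python 3, where
ctypes structures expose a writable buffer, and it uses io.BytesIO in
place of a real file. Header and Record are made-up layouts for
illustration only; they are not the IMAGE_* structures above.

```python
import ctypes
import io
import struct


class Header(ctypes.LittleEndianStructure):
    # Hypothetical header: a magic number and the offset of the record.
    _fields_ = [("magic", ctypes.c_uint32), ("offset", ctypes.c_uint32)]


class Record(ctypes.LittleEndianStructure):
    # Hypothetical payload located at Header.offset.
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]


def read_into(stream, obj):
    """Fill a ctypes object from a binary stream, failing on a short read."""
    if stream.readinto(obj) != ctypes.sizeof(obj):
        raise EOFError("stream ended inside %s" % type(obj).__name__)
    return obj


def read_record(stream):
    # Read the fixed header first, then follow its offset field.
    header = read_into(stream, Header())
    stream.seek(header.offset)
    record = read_into(stream, Record())
    return header, record


# Build sample bytes: 8-byte header, 4 bytes of padding, 8-byte record.
data = struct.pack("<II", 0x5A4D, 12) + b"\x00" * 4 + struct.pack("<ii", 1, 2)
header, record = read_record(io.BytesIO(data))
print(hex(header.magic), record.x, record.y)
```

The read_into helper only adds a length check; the actual work is still
done by readinto filling the structure in place.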

> It might even be possible to get Thomas to add a small helper
> classmethod to ctypes types, something like
>
>     POINT.unpack(str, offset=0, length=None)

Maybe, but I would prefer the unbeloved buffer object (*) as argument,
because it has builtin offset and length.

> which does the equivalent of
>
>     def unpack(cls, str, offset=0, length=None):
>         if length is None:
>             length=sizeof(cls)
>         b = buffer(str, offset, length)
>         new = cls()
>         ctypes.memmove(new, b, length)
>         return new
>
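
For comparison, a rough sketch of what such a helper might look like
under Python 3, where buffer() no longer exists. It relies on the
from_buffer_copy classmethod that ctypes types provide there; the
module-level unpack function is a hypothetical name taken from the
proposal above, not an existing ctypes API.

```python
import ctypes


class POINT(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


def unpack(cls, data, offset=0):
    """Hypothetical helper: build a new cls instance from a copy of the
    bytes in data, starting at offset.

    from_buffer_copy raises ValueError if fewer than sizeof(cls) bytes
    are available from offset onwards.
    """
    return cls.from_buffer_copy(data, offset)


# Three junk bytes followed by the raw bytes of POINT(1, 2).
raw = b"\xff" * 3 + bytes(POINT(1, 2))
p = unpack(POINT, raw, offset=3)
print(p.x, p.y)
```

Because the bytes are copied, the returned object does not keep the
source buffer alive and does not change when the source changes.
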
>>     I'd _love_ to find the time to write a sane replacement for struct - as
>>     well as the current use case, I'd also like it to handle things like
>>     attribute-length-value 3-tuples nicely (where you get a fixed field
>>     which identifies the attribute, a fixed field which specifies the value
>>     length, and a value of 'length' bytes). Almost all sane network protocols
>>     (i.e. those written before the plague of pointy brackets) use this in
>>     some way.
>
> I'm not sure ctypes handles that, mainly because I don't think C does
> (without the usual trick of defining the last field as fixed length)

Correct.
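
Since a fixed C layout cannot describe the variable-length value, such
records are usually read in two steps: unpack the fixed part, then
slice the value. A minimal sketch with the struct module follows. The
wire format is an assumption made up for this example (16-bit attribute
id and 16-bit length in network byte order), and iter_tlv is a
hypothetical name.

```python
import struct

# Assumed header layout: unsigned 16-bit attribute id, unsigned 16-bit length.
TLV_HEADER = struct.Struct("!HH")


def iter_tlv(data):
    """Yield (attribute, length, value) tuples from a bytes object."""
    pos = 0
    while pos < len(data):
        if len(data) - pos < TLV_HEADER.size:
            raise ValueError("truncated header at offset %d" % pos)
        attr, length = TLV_HEADER.unpack_from(data, pos)
        pos += TLV_HEADER.size
        value = data[pos:pos + length]
        if len(value) != length:
            raise ValueError("truncated value for attribute %d" % attr)
        pos += length
        yield attr, length, value


# Two records: attribute 1 with a 3-byte value, attribute 2 with an empty one.
packet = TLV_HEADER.pack(1, 3) + b"abc" + TLV_HEADER.pack(2, 0)
records = list(iter_tlv(packet))
print(records)
```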

(*) Which brings me to the questions I have had in mind for quite some
time: Why is readinto undocumented, and what about the status of the
buffer object: do the recent fixes to the buffer object change its
status?

Thomas


