[Python-ideas] Ideas for improving the struct module

Cameron Simpson cs at zip.com.au
Thu Jan 19 21:54:31 EST 2017


On 19Jan2017 16:04, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>This is a neat idea, but this will only work for parsing framed
>binary protocols.  For example, if your protocol prefixes all packets
>with a length field, you can write an efficient read buffer and
>use your proposal to decode all of a message's fields in one shot.
>Which is good.
>
>Not all protocols use framing though.  For instance, your proposal
>won't help to write Thrift or Postgres protocol parsers.

Sure, but a lot of things fit the proposal. Seems a win: both simple and 
useful.

>Overall, I'm not sure that this is worth the hassle.  With proposal:
>
>   data, = struct.unpack('!H$', buf)
>   buf = buf[2+len(data):]
>
>with the current struct module:
>
>   length, = struct.unpack_from('!H', buf)
>   data = buf[2:2+length]
>   buf = buf[2+length:]
>
>Another thing: struct.calcsize won't work with structs that use
>variable length fields.

True, but it would be enough for it to raise an exception of some kind. That 
won't break any existing code, and it will prevent accidents for users of the 
new variable-size formats.
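
Something like the sketch below is the behaviour I mean. It assumes the 
proposal's '$' code marks a variable-length field (that is not current struct 
syntax), and checked_calcsize is a made-up name for illustration, not an 
existing function:

```python
import struct

# Assumption: '$' is the proposal's variable-length marker; the current
# struct module does not accept it.
VARIABLE_LENGTH_CODE = "$"

def checked_calcsize(fmt):
    """Like struct.calcsize, but refuse formats with a variable-length field."""
    if VARIABLE_LENGTH_CODE in fmt:
        raise struct.error(
            "calcsize: format %r contains a variable-length field" % (fmt,)
        )
    # Fixed-size formats behave exactly as they do today.
    return struct.calcsize(fmt)
```

Fixed-size formats are unaffected, so code that calls calcsize today keeps 
working; only the new formats hit the exception.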

We've all got things we wish struct might cover (I have a few, but strangely 
the top of the list is nonsemantic: I wish it let me put meaningless whitespace 
inside the format for readability).

+1 on the proposal from me.

Oh, subject to one proviso: reading a struct will need to return how many bytes 
of input data were consumed, not merely the decoded values, so that the caller 
knows where the next record starts.
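
To make that concrete, here is a minimal sketch using only today's struct 
module. It assumes a single field with a 2-byte big-endian length prefix 
(the '!H' case from the example above); unpack_prefixed is a hypothetical 
helper name, not a proposed API:

```python
import struct

def unpack_prefixed(buf, offset=0):
    """Decode one '!H' length-prefixed field from buf, starting at offset.

    Returns (data, consumed), where consumed counts the prefix plus payload.
    """
    # unpack_from reads only the 2-byte prefix, so buf may be longer.
    (length,) = struct.unpack_from("!H", buf, offset)
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise struct.error(
            "short buffer: need %d bytes, have %d"
            % (end - offset, len(buf) - offset)
        )
    return bytes(buf[start:end]), end - offset

# Two records back to back: the byte count tells us where the second starts.
buf = b"\x00\x03abc\x00\x02hi"
data, used = unpack_prefixed(buf)
rest = buf[used:]
```

With the count returned alongside the value, the caller never has to 
recompute 2+len(data) by hand to advance the buffer.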

Cheers,
Cameron Simpson <cs at zip.com.au>

