
On 20/01/17 10:47, Elizabeth Myers wrote:
On 19/01/17 20:54, Cameron Simpson wrote:
On 19Jan2017 16:04, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
This is a neat idea, but it will only work for parsing framed binary protocols. For example, if your protocol prefixes all packets with a length field, you can write an efficient read buffer and use your proposal to decode all of a message's fields in one shot. Which is good.
Not all protocols use framing though. For instance, your proposal won't help with writing Thrift or Postgres protocol parsers.
Sure, but a lot of things fit the proposal. Seems a win: both simple and useful.
Overall, I'm not sure that this is worth the hassle. With the proposal:
data, = struct.unpack('!H$', buf)
buf = buf[2+len(data):]
with the current struct module:
len, = struct.unpack('!H', buf)
data = buf[2:2+len]
buf = buf[2+len:]
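For reference, the current-module approach above can be written as a small runnable sketch. The helper name `read_length_prefixed` is made up for illustration and is not part of the struct module; it assumes a 2-byte big-endian length prefix, as in the '!H' example.

```python
import struct

def read_length_prefixed(buf, offset=0):
    """Decode one '!H'-prefixed byte string starting at offset.

    Hypothetical helper: returns (data, new_offset) so the caller
    can continue parsing without slicing the buffer each time.
    """
    (length,) = struct.unpack_from('!H', buf, offset)
    start = offset + struct.calcsize('!H')  # skip the 2-byte prefix
    end = start + length
    if end > len(buf):
        raise ValueError("buffer too short for declared length")
    return buf[start:end], end

# Two length-prefixed fields back to back.
buf = struct.pack('!H', 5) + b'hello' + struct.pack('!H', 2) + b'hi'
first, pos = read_length_prefixed(buf)
second, pos = read_length_prefixed(buf, pos)
```

Tracking an offset instead of re-slicing `buf` avoids copying the remainder of the buffer after every field.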
Another thing: struct.calcsize won't work with structs that use variable length fields.
True, but it would be enough for it to raise an exception of some kind. It won't break any existing code, and it will prevent accidents for users of the new variable-size formats.
We've all got things we wish struct might cover (I have a few, but strangely the top of the list is nonsemantic: I wish it let me put meaningless whitespace inside the format for readability).
+1 on the proposal from me.
Oh: subject to one proviso: reading a struct will need to return how many bytes of input data were scanned, not merely returning the decoded values.
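A rough sketch of what "return how many bytes were scanned" could look like for a NUL-terminated field. The function `unpack_cstring` is hypothetical and only mimics the proposed behaviour on top of today's struct module.

```python
import struct

def unpack_cstring(buf, offset=0):
    """Return (value, consumed) for a NUL-terminated byte string.

    Hypothetical helper. 'consumed' counts the terminator too, so the
    caller knows where the next field starts.
    """
    end = buf.index(b'\x00', offset)  # raises ValueError if no terminator
    return buf[offset:end], end - offset + 1

# A NUL-terminated string followed by a fixed-size field.
buf = b'test\x00' + struct.pack('!H', 7)
value, consumed = unpack_cstring(buf)
(number,) = struct.unpack_from('!H', buf, consumed)
```

Without the consumed count, the caller would have to recompute `len(value) + 1` to find the start of the following field.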
This is a little difficult without breaking backwards compatibility, but it is not difficult to compute the lengths yourself. That said, calcsize could take an extra parameter when given a format string with variable-length specifiers in it, e.g.:
struct.calcsize("z", (b'test',))
That would return 5 (the four data bytes plus the zero terminator), so you don't have to compute it yourself.
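Until something like that exists, the same number is easy to get by hand. A minimal sketch, assuming the "z" format would mean a NUL-terminated byte string; `calcsize_z` is a made-up name, not a struct function.

```python
def calcsize_z(value):
    """Encoded size of value as a NUL-terminated byte string.

    Hypothetical stand-in for the proposed calcsize extension.
    """
    if b'\x00' in value:
        # An embedded NUL would end the string early when decoding.
        raise ValueError("embedded NUL byte in value")
    return len(value) + 1  # data bytes plus the terminator

size = calcsize_z(b'test')  # 5
```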
Also, I filed a bug, and proposed use of Z and z.
Should I write up a PEP about this? I am not sure if it's justified or not. It's 3 changes (calcsize and two format specifiers), but it might be useful to codify it.