[Python-ideas] Ideas for improving the struct module
Steven D'Aprano
steve at pearwood.info
Wed Jan 18 20:27:20 EST 2017
On Wed, Jan 18, 2017 at 04:24:39AM -0600, Elizabeth Myers wrote:
> Hello,
>
> I've noticed a lot of binary protocols require variable length
> bytestrings (with or without a null terminator), but it is not easy to
> unpack these in Python without first reading the desired length, or
> reading bytes until a null terminator is reached.
This sounds like a fairly straightforward feature request for the
struct module, which probably could go straight to the bug tracker.
Unfortunately I can't *quite* work out what the feature request is :-)
If you're asking for struct to support Pascal strings, with a single
byte (0...255) for the length, it already does with format code "p".
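For reference, here is a minimal sketch of how "p" behaves today. The count in front of "p" is the total field width including the length byte, so the field itself has a fixed size and only the string stored inside it varies in length:

```python
import struct

# "10p" always occupies 10 bytes: one length byte plus up to 9 data bytes.
packed = struct.pack("10p", b"hello")
print(packed)   # the length byte (5), the data, then zero padding

# Unpacking uses the length byte to strip the padding again.
(value,) = struct.unpack("10p", packed)
print(value)
```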
I was going to suggest P for "large" Pascal string, with the length
given by *two* bytes rather than one (0...65535), but P is already in
use. Are you proposing the "$" format code from netstruct? That would be
interesting, as it would allow format codes:
B$ standard Pascal string, like p
H$ Pascal string with a two-byte length
L$ Pascal string with a four-byte length
4294967295 bytes should be enough for anyone :-)
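Until something like "$" exists, the same effect can be had by hand. The sketch below is only an illustration of what such a format code might do: unpack_prefixed and pack_prefixed are hypothetical helper names, not part of struct or netstruct, and I am assuming network byte order ("!") so that B, H and L have their standard sizes of 1, 2 and 4 bytes.

```python
import struct


def pack_prefixed(length_code, payload):
    """Hypothetical helper: prepend the payload's length using length_code."""
    return struct.pack("!" + length_code, len(payload)) + payload


def unpack_prefixed(length_code, data, offset=0):
    """Hypothetical helper: read one length-prefixed bytestring.

    Returns the payload and the offset of the first byte after it.
    """
    prefix = struct.Struct("!" + length_code)
    (length,) = prefix.unpack_from(data, offset)
    start = offset + prefix.size
    end = start + length
    if end > len(data):
        raise ValueError("buffer too short for declared length")
    return data[start:end], end


# Two strings back to back, with a two-byte and a one-byte length prefix.
wire = pack_prefixed("H", b"spam") + pack_prefixed("B", b"eggs")
first, pos = unpack_prefixed("H", wire)
second, pos = unpack_prefixed("B", wire, pos)
```

The point of returning the new offset is that the caller cannot know where the next field starts until the length has been read, which is exactly why this is awkward to express in a single struct format string today.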
Another common format is "ASCIIZ", or a one-byte Pascal string including
a null terminator. People actually use this:
http://stackoverflow.com/questions/11850950/unpacking-a-struct-ending-with-an-asciiz-string
Which just leaves C-style null terminated strings. c/n/N are all already
in use; I guess that C (for C-string) or S (for c-String) are
possibilities.
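For comparison, this is roughly what people have to write now for the null-terminated case. Again just a sketch: unpack_cstring is a made-up helper name, and the record layout (a two-byte number followed by a C string) is invented for the example.

```python
import struct


def unpack_cstring(data, offset=0):
    """Hypothetical helper: read bytes up to, but not including, the next null.

    Returns the string and the offset of the first byte after the terminator.
    """
    end = data.find(b"\x00", offset)
    if end == -1:
        raise ValueError("no null terminator found")
    return data[offset:end], end + 1


# A fixed-width field, then a C string, then whatever follows.
record = struct.pack("!H", 7) + b"hello\x00" + b"rest"
(number,) = struct.unpack_from("!H", record)
name, pos = unpack_cstring(record, struct.calcsize("!H"))
remainder = record[pos:]
```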
All of these seem like perfectly reasonable formats for the struct
module to support. They're all in use. struct already supports
variable-width formats. I think it's just a matter of raising one or more
feature requests, and then doing the work.
I guess this is just my long-winded way of saying +1.
--
Steve