[Python-ideas] Ideas for improving the struct module
Cameron Simpson
cs at zip.com.au
Fri Jan 20 17:46:38 EST 2017
On 20Jan2017 14:47, Elizabeth Myers <elizabeth at interlinked.me> wrote:
>1) struct.unpack and struct.unpack_from should remain
>backwards-compatible. I don't want to return extra values from it like
>(length unpacked, (data...)) for that reason.
Fully agree with this.
>If the calcsize solution
>feels a bit weird (it isn't much less efficient, because strings store
>their length with them, so it's constant-time), there could also be new
>functions that *do* return the length if you need it. To me though, this
>feels like a use case for struct.iter_unpack.
Often, maybe, but there are still going to be protocols that the new format
doesn't support, where the performant thing to do (in pure Python) is to scan
what you can with struct and "hand scan" the special bits with special code.
Consider, for example, a format like MP4/ISO14496, where there's a regular
block structure (which is somewhat struct-parsable) that can contain embedded
arbitrarily weird information. Or the flip side, where struct-parsable data are
embedded in a format not supported by struct.
The mixed situation is where you need to know where the parse got up to.
Calling calcsize or its variable size equivalent after a parse seems to
needlessly repeat the parse work.
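
As a rough sketch of the mixed case with today's struct module: the box layout
below is invented for illustration (only loosely MP4-like), and parse_box is a
made-up name, not anything in the standard library. The struct parts are
unpacked with unpack_from, the NUL-terminated name is hand scanned, and
calcsize is called after each unpack purely to move the offset along.

```python
import struct

# Hypothetical layout: 32-bit big-endian length, 4-byte type code,
# a NUL-terminated name (no struct format code covers this),
# then a 16-bit big-endian flags field.
BOX_HEADER = ">L4s"
BOX_FLAGS = ">H"

def parse_box(data, offset=0):
    """Parse one box starting at offset; return (fields, new_offset)."""
    length, box_type = struct.unpack_from(BOX_HEADER, data, offset)
    # Extra call whose only purpose is to learn how far the parse got.
    offset += struct.calcsize(BOX_HEADER)

    # Hand scan the part that struct cannot describe.
    end = data.index(b"\0", offset)
    name = data[offset:end]
    offset = end + 1

    (flags,) = struct.unpack_from(BOX_FLAGS, data, offset)
    offset += struct.calcsize(BOX_FLAGS)
    return (length, box_type, name, flags), offset

sample = struct.pack(BOX_HEADER, 15, b"name") + b"abcd\0" + struct.pack(BOX_FLAGS, 7)
fields, end_offset = parse_box(sample)
```

For fixed size formats the calcsize call is cheap, but a variable size format
would presumably have to look at the data again to answer the same question.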
For myself, I would want there to be some kind of call that returned the parse
and the length scanned, with the historic interface preserved for the fixed
size formats or for users not needing the length.
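
To make the shape of such a call concrete, here is a minimal sketch for the
existing fixed size formats. The name unpack_from_with_length is hypothetical
and nothing like it exists in the struct module; a real version for variable
size formats would need support inside struct itself.

```python
import struct

def unpack_from_with_length(fmt, buffer, offset=0):
    """Hypothetical helper: return (values, number_of_bytes_scanned).

    struct.unpack_from itself is untouched, so existing callers are
    unaffected.
    """
    s = struct.Struct(fmt)
    return s.unpack_from(buffer, offset), s.size

data = b"\x00\x01\x02rest"
values, scanned = unpack_from_with_length(">HB", data)
remainder = data[scanned:]
```

Callers who don't care about the length keep using struct.unpack_from exactly
as before.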
>2) I want to avoid making a weird incongruity, where only
>variable-length strings return the length actually parsed.
Fully agree. I'm arguing for two API calls: the current one, and one that also
returns the scan length.
Cheers,
Cameron Simpson <cs at zip.com.au>