[Python-ideas] Ideas for improving the struct module

Cameron Simpson cs at zip.com.au
Fri Jan 20 17:46:38 EST 2017


On 20Jan2017 14:47, Elizabeth Myers <elizabeth at interlinked.me> wrote:
>1) struct.unpack and struct.unpack_from should remain
>backwards-compatible. I don't want to return extra values from it like
>(length unpacked, (data...)) for that reason.

Fully agree with this.

>If the calcsize solution
>feels a bit weird (it isn't much less efficient, because strings store
>their length with them, so it's constant-time), there could also be new
>functions that *do* return the length if you need it. To me though, this
>feels like a use case for struct.iter_unpack.

Often, maybe, but there are still going to be protocols that the new format 
doesn't support, where the performant thing to do (in pure Python) is to scan 
what you can with struct and "hand scan" the special bits with special code.  

Consider, for example, a format like MP4/ISO14496, where there's a regular 
block structure (which is somewhat struct parsable) that can contain embedded 
arbitrarily weird information. Or the flip side, where struct parsable data are 
embedded in a format not supported by struct.
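
To illustrate the mixed case, here is a minimal sketch of walking MP4-style 
boxes: struct handles the regular header, and hand-written code handles the 
size special cases and slices out the payload. The names BOX_HEADER and 
iter_boxes are made up for this example, and it assumes only the basic box 
header layout (32-bit big-endian size, 4-byte type, optional 64-bit size).

```python
import struct

# Regular part: 32-bit big-endian size followed by a 4-byte box type.
BOX_HEADER = struct.Struct(">I4s")


def iter_boxes(data, offset=0):
    """Yield (box_type, payload) for each box found in data (illustrative)."""
    end = len(data)
    while offset + BOX_HEADER.size <= end:
        size, box_type = BOX_HEADER.unpack_from(data, offset)
        header_len = BOX_HEADER.size
        if size == 1:
            # Special bit: a 64-bit size follows the type field.
            (size,) = struct.unpack_from(">Q", data, offset + header_len)
            header_len += 8
        elif size == 0:
            # Special bit: the box runs to the end of the data.
            size = end - offset
        if size < header_len or offset + size > end:
            raise ValueError("bad box size at offset %d" % offset)
        # The payload is opaque here; a real parser would hand scan it.
        yield box_type, data[offset + header_len:offset + size]
        offset += size


sample = (struct.pack(">I4s4s", 12, b"ftyp", b"isom")
          + struct.pack(">I4s", 8, b"free"))
boxes = list(iter_boxes(sample))
```

Note that the loop has to track the offset itself at every step, which is 
exactly the bookkeeping under discussion.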

The mixed situation is where you need to know where the parse got up to.  
Calling calcsize or its variable size equivalent after a parse seems needlessly 
repetitive of the parse work.
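
For reference, this is roughly what the current idiom looks like with a fixed 
size format: parse, then ask calcsize how far the parse went. The format 
string and data are invented for the example. With a fixed size format the 
second call is cheap; the concern is a variable size equivalent, which would 
presumably have to walk the data again.

```python
import struct

fmt = "<HI"  # example format: 2-byte and 4-byte little-endian integers
data = struct.pack(fmt, 7, 1000) + b"trailing special bytes"

# Parse the struct-friendly prefix...
values = struct.unpack_from(fmt, data, 0)
# ...then separately work out where the parse got up to.
offset = struct.calcsize(fmt)
# Everything from offset onward is left for hand-written code.
rest = data[offset:]
```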

For myself, I would want there to be some kind of call that returned the parse 
and the length scanned, with the historic interface preserved for the fixed 
size formats or for users not needing the length.
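
As a rough sketch of the shape I mean, using only what struct offers today: 
unpack_with_length is a hypothetical name, not an existing or proposed struct 
function, and this version only works for fixed size formats because it takes 
the length from Struct.size. A variable size version would need to report the 
length found during the parse itself.

```python
import struct


def unpack_with_length(fmt, buffer, offset=0):
    """Hypothetical helper: return (values, number_of_bytes_scanned)."""
    s = struct.Struct(fmt)
    return s.unpack_from(buffer, offset), s.size


data = struct.pack(">BH", 1, 515) + b"xyz"

# New style call: the parse and the length scanned.
values, used = unpack_with_length(">BH", data)

# The historic interface is untouched for callers not needing the length.
same_values = struct.unpack_from(">BH", data)
```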

>2) I want to avoid making a weird incongruity, where only
>variable-length strings return the length actually parsed.

Fully agree. I'm arguing for two API calls: the current one and one that also 
returns the scan length.

Cheers,
Cameron Simpson <cs at zip.com.au>

