[Python-ideas] struct.unpack should support open files

Steven D'Aprano steve at pearwood.info
Wed Dec 26 06:42:30 EST 2018

On Wed, Dec 26, 2018 at 12:18:23PM +0200, Andrew Svetlov wrote:

> json is correct: if `read()` is called without an argument it reads the whole
> content until EOF.
> But with a size argument the behavior is different for interactive and
> non-interactive streams.
> RawIOBase and BufferedIOBase also have slightly different behavior for
> `.read()`.

This is complexity that isn't the unpack() function's responsibility to 
care about. All it wants is to call read(N) and get back N bytes. If it 
gets back anything else, that's an error.
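
As a rough sketch of that contract (read_exactly is a hypothetical helper 
name, not anything in the struct module, and io.BytesIO merely stands in 
for an open binary file):

```python
import io

def read_exactly(fp, n):
    """Hypothetical helper: one read(n) call; anything but n bytes is an error."""
    data = fp.read(n)
    if len(data) != n:
        raise ValueError('expected %d bytes but only got %d' % (n, len(data)))
    return data

stream = io.BytesIO(b'abcdef')
first = read_exactly(stream, 4)      # enough data: returns exactly 4 bytes
try:
    read_exactly(stream, 4)          # only 2 bytes remain before EOF
except ValueError as err:
    message = str(err)
```

The caller never has to know why the read came up short (EOF, an 
interactive stream, a raw stream); a short read is simply reported.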

> Restricting fp to BufferedIOBase looks viable, but it is not a
> file-like object.

There is no need to restrict it to BufferedIOBase. In hindsight, I am 
not even sure we should do an isinstance check at all. Surely all we 
care about is that the object has a read() method which takes a single 
argument, and returns that number of bytes?

Here's another proof-of-concept implementation which doesn't require any 
isinstance checks on the argument. The only type checking it does is to 
verify that the read returns bytes, and even that is only a convenience 
so it can provide a friendly error message.

import struct

def unpackStruct(fmt, frm):
    try:
        # Duck typing: anything with a read() method is treated as a file.
        read = frm.read
    except AttributeError:
        # No read() method, so assume a bytes-like object.
        return struct.unpack(fmt, frm)
    n = struct.calcsize(fmt)
    value = read(n)
    if not isinstance(value, bytes):
        raise TypeError('read method must return bytes')
    if len(value) != n:
        raise ValueError('expected %d bytes but only got %d' % (n, len(value)))
    return struct.unpack(fmt, value)
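
A short usage sketch, for illustration only. The function is restated so 
the snippet runs on its own; the '<HI' format, the sample values, and the 
use of io.BytesIO and io.StringIO as stand-ins for open files are all 
assumptions made for the example.

```python
import io
import struct

def unpackStruct(fmt, frm):
    try:
        read = frm.read
    except AttributeError:
        return struct.unpack(fmt, frm)
    n = struct.calcsize(fmt)
    value = read(n)
    if not isinstance(value, bytes):
        raise TypeError('read method must return bytes')
    if len(value) != n:
        raise ValueError('expected %d bytes but only got %d' % (n, len(value)))
    return struct.unpack(fmt, value)

# '<HI' is a little-endian unsigned short plus unsigned int: 6 bytes, no padding.
record = struct.pack('<HI', 1, 2)

# A plain bytes object has no read() method, so it goes straight to struct.unpack.
from_bytes = unpackStruct('<HI', record)

# A binary file-like object is consumed one record per call.
fp = io.BytesIO(record + struct.pack('<HI', 3, 4))
first = unpackStruct('<HI', fp)
second = unpackStruct('<HI', fp)

# A text-mode stream returns str from read(), which triggers the friendly error.
try:
    unpackStruct('<HI', io.StringIO('abcdef'))
except TypeError as err:
    text_mode_error = str(err)
```

Because each call reads only calcsize(fmt) bytes, consecutive calls walk 
through consecutive records without any seeking or offset bookkeeping.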

> What is the behavior of unpack_from(fp, offset=120)?

I don't know. What does the "offset" parameter do, and who requested it? 
I didn't, and neither did the OP Drew Warwick.

James Edwards wrote that he too uses a similar function in production, 
one which originally did support file seeking, but they took it out.

If you are suggesting an offset parameter to the unpack() function, it 
is up to you to propose what meaning it will have and justify why it 
should be part of unpack's API. Until then, YAGNI.

> Should iter_unpack() read the whole buffer from the file into memory before
> emitting the first value?

Nobody has requested any changes to iter_unpack().

