[Python-ideas] struct.unpack should support open files

Andrew Svetlov andrew.svetlov at gmail.com
Wed Dec 26 05:18:23 EST 2018

On Wed, Dec 26, 2018 at 11:26 AM Steven D'Aprano <steve at pearwood.info> wrote:

> On Wed, Dec 26, 2018 at 09:48:15AM +0200, Andrew Svetlov wrote:
> > The perfect demonstration of io objects complexity.
> > `stream.read(N)` can return None by spec if the file is non-blocking
> > and has no data ready.
> >
> > Confusing but still possible and documented behavior.
> https://docs.python.org/3/library/io.html#io.RawIOBase.read
> Regardless, my point doesn't change. That has nothing to do with the
> behaviour of unpack. If you pass a non-blocking file-like object which
> returns None, you get exactly the same exception as if you wrote
>     unpack(fmt, f.read(size))
> and the call to f.read returned None. Why is it unpack's responsibility
> to educate the caller that f.read can return None?
> Let's see what other functions with similar APIs do.
> py> class FakeFile:
> ...     def read(self, n=-1):
> ...             return None
> ...     def readline(self):
> ...             return None
> ...
> py> pickle.load(FakeFile())
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: a bytes-like object is required, not 'NoneType'
> py> json.load(FakeFile())
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/local/lib/python3.5/json/__init__.py", line 268, in load
>     parse_constant=parse_constant, object_pairs_hook=object_pairs_hook,
> **kw)
>   File "/usr/local/lib/python3.5/json/__init__.py", line 312, in loads
>     s.__class__.__name__))
> TypeError: the JSON object must be str, not 'NoneType'
> If it is good enough for pickle and json load() functions to report a
> TypeError like this, it is good enough for unpack().
> Not every exception needs a custom error message.
> > You need to repeat reads until collecting the value of enough size.
> That's not what the OP has asked for, it isn't what the OP's code does,
> and it's not what I've suggested.
> Do pickle and json block and repeat the read until they have a complete
> object? I'm pretty sure they don't -- the source for json.load() that I
> have says:
>     return loads(fp.read(), ... )
> so it definitely doesn't repeat the read. I think it is so unlikely that
> pickle blocks waiting for extra input that I haven't even bothered to
> look. Looping and repeating the read is a clear case of YAGNI.
json is correct: if `read()` is called without an argument, it reads the whole
content until EOF.
But with a size argument the behavior is different for interactive and
non-interactive streams.
RawIOBase and BufferedIOBase also have slightly different behavior for
`.read()`.
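
To make the "repeat reads" point concrete, here is a minimal sketch of such a
loop. `read_exact` and `ShortReads` are hypothetical names made up for this
illustration; neither is part of struct or io, and raising on None is only one
possible policy for the non-blocking case.

```python
import io
import struct


def read_exact(fp, size):
    """Collect exactly `size` bytes from `fp` (hypothetical helper)."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = fp.read(remaining)
        if chunk is None:
            # Non-blocking raw stream with no data ready right now.
            raise BlockingIOError("no data available yet")
        if not chunk:
            # Empty bytes means EOF arrived before the value was complete.
            raise EOFError(f"expected {size} bytes, got {size - remaining}")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


class ShortReads:
    """Test double: returns at most `step` bytes per read() call."""

    def __init__(self, data, step=3):
        self._buf = io.BytesIO(data)
        self._step = step

    def read(self, n):
        return self._buf.read(min(n, self._step))


fmt = "<IH"  # 4-byte unsigned int plus 2-byte unsigned short, no padding
payload = struct.pack(fmt, 1, 2)
values = struct.unpack(fmt, read_exact(ShortReads(payload), struct.calcsize(fmt)))
print(values)
```

A plain `unpack(fmt, fp.read(size))` on the same `ShortReads` object would fail
with struct.error, because the first read returns only 3 of the 6 bytes.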

Restricting fp to BufferedIOBase looks viable, but that is not the same as
accepting any file-like object.

Also I'm thinking about type annotations in typeshed.
Now the type is Union[array[int], bytes, bytearray, memoryview]
Should it be Union[io.BinaryIO, array[int], bytes, bytearray, memoryview] ?

What is the behavior of unpack_from(fp, offset=120)?
Should iter_unpack() read the whole buffer from the file into memory before
emitting the first value?
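
For comparison, a lazy variant could read one record at a time. This is only a
sketch: `iter_unpack_file` is a hypothetical name, struct.iter_unpack itself
accepts buffers only, and the code assumes `fp.read(n)` returns exactly n bytes
unless EOF is reached (true for io.BytesIO, not guaranteed for interactive or
raw streams).

```python
import io
import struct


def iter_unpack_file(fmt, fp):
    """Yield one tuple per record, reading a single record per step.

    Hypothetical helper, not part of the struct module.
    """
    s = struct.Struct(fmt)
    while True:
        record = fp.read(s.size)
        if record is None:
            # Non-blocking stream without ready data; not handled here.
            raise BlockingIOError("no data available yet")
        if not record:
            return  # clean EOF on a record boundary
        if len(record) != s.size:
            raise struct.error(
                f"truncated record: expected {s.size} bytes, got {len(record)}"
            )
        yield s.unpack(record)


data = struct.pack("<3H", 1, 2, 3)  # three 2-byte records
records = list(iter_unpack_file("<H", io.BytesIO(data)))
print(records)
```

Whether a real iter_unpack() should behave this way, or read everything up
front, is exactly the open question.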

> Don't over-engineer the function, and then complain that the over-
> engineered function is too complex. There is no need for unpack() to
> handle streaming input which can output anything less than a complete
> struct per read.
> > `.read(N)` can return less bytes by definition,
> Yes, we know that. And if it returns fewer bytes, then you get a nice,
> clear exception.
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/

Andrew Svetlov