[Python-ideas] struct.unpack should support open files

Steven D'Aprano steve at pearwood.info
Wed Dec 26 04:25:19 EST 2018


On Wed, Dec 26, 2018 at 09:48:15AM +0200, Andrew Svetlov wrote:

> The perfect demonstration of io objects complexity.
> `stream.read(N)` can return None by spec if the file is non-blocking
> and have no ready data.
> 
> Confusing but still possible and documented behavior.

https://docs.python.org/3/library/io.html#io.RawIOBase.read

Regardless, my point doesn't change. That has nothing to do with the 
behaviour of unpack. If you pass a non-blocking file-like object which 
returns None, you get exactly the same exception as if you wrote

    unpack(fmt, f.read(size))

and the call to f.read returned None. Why is it unpack's responsibility 
to educate the caller that f.read can return None?
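For comparison, here is a minimal sketch of that manual equivalent. The NonBlockingFile class is hypothetical, standing in for a non-blocking file object whose read() returns None, as io.RawIOBase.read documents it may; the exact exception message may vary by Python version:

```python
import struct

class NonBlockingFile:
    # Hypothetical stand-in for a non-blocking file with no data
    # ready: read() returns None, as io.RawIOBase.read allows.
    def read(self, n=-1):
        return None

f = NonBlockingFile()
try:
    struct.unpack("<i", f.read(4))
except TypeError as e:
    print(e)
```

The TypeError here comes from unpack() itself, because None is not a bytes-like object; it is the same exception whether or not unpack() knows anything about files.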

Let's see what other functions with similar APIs do.


py> class FakeFile:
...     def read(self, n=-1):
...             return None
...     def readline(self):
...             return None
...
py> pickle.load(FakeFile())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: a bytes-like object is required, not 'NoneType'
py> json.load(FakeFile())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/json/__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/local/lib/python3.5/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'NoneType'


If it is good enough for the pickle and json load() functions to report 
a TypeError like this, it is good enough for unpack().

Not every exception needs a custom error message.



> You need to repeat reads until collecting the value of enough size.

That's not what the OP has asked for, it isn't what the OP's code does, 
and it's not what I've suggested.

Do pickle and json block and repeat the read until they have a complete 
object? I'm pretty sure they don't -- the source for json.load() that I 
have says:

    return loads(fp.read(), ... )

so it definitely doesn't repeat the read. I think it is so unlikely that 
pickle blocks waiting for extra input that I haven't even bothered to 
look. Looping and repeating the read is a clear case of YAGNI.

Don't over-engineer the function, and then complain that the 
over-engineered function is too complex. There is no need for unpack() 
to 
handle streaming input which can output anything less than a complete 
struct per read.
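The whole proposal amounts to something like this sketch (the helper name unpack_from_file is mine, not part of any proposal): one read of exactly struct.calcsize(fmt) bytes, with no looping and no special-casing of short reads:

```python
import io
import struct

def unpack_from_file(fmt, f):
    # Sketch of the proposed behaviour: read exactly the number of
    # bytes the format requires, in a single call, and unpack them.
    size = struct.calcsize(fmt)
    data = f.read(size)
    # If the read is short, or a non-blocking file returns None,
    # struct.unpack raises its usual exception -- no extra handling.
    return struct.unpack(fmt, data)

print(unpack_from_file("<hh", io.BytesIO(b"\x01\x00\x02\x00")))  # (1, 2)
```

Any file-like object that can yield less than a complete struct per read simply gets the normal unpack() error, exactly as with json and pickle above.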



> `.read(N)` can return less bytes by definition,

Yes, we know that. And if it returns fewer bytes, then you get a nice, 
clear exception.
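To illustrate what that exception looks like, here is a short read fed straight to unpack(); the exact wording of the message may differ between Python versions:

```python
import struct

# A short read: 2 bytes where the "<i" format requires 4.
try:
    struct.unpack("<i", b"\x01\x02")
except struct.error as e:
    print(e)
```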



-- 
Steve
