<div dir="ltr"><div dir="ltr">The proposal can generate cryptic messages like<br>`a bytes-like object is required, not 'NoneType'`<br><br>To produce more informative exception text, all of the cases mentioned would have to be handled:<br><br>> - read partial structs from non-blocking files without failing<br>> - deal with file system errors without failing<br>> - support reading from text files when bytes are required without failing<br>> - if an exception occurs, the state of the file shouldn't change<br>I could add a couple more cases, but the list is long enough for demonstration purposes.<br><br>When a user calls <br>    unpack(fmt, f.read(calcsize(fmt)))<br>the user is responsible for handling all the edge cases (or, most likely, ignores them).<br><br>If it is part of a library, robustness is the library's responsibility.</div></div><br><div class="gmail_quote"><div dir="ltr">On Mon, Dec 24, 2018 at 11:23 PM Steven D'Aprano <<a href="mailto:steve@pearwood.info">steve@pearwood.info</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, Dec 24, 2018 at 03:36:07PM +0000, Paul Moore wrote:<br>

<br>

> > There should be no difference whether the text comes from a literal, a<br>

> > variable, or is read from a file.<br>

> <br>

> One difference is that with a file, it's (as far as I can see)<br>

> impossible to determine whether or not you're going to get bytes or<br>

> text without reading some data (and so potentially affecting the state<br>

> of the file object).<br>

<br>

Here are two ways: look at the type of the file object, or look at the <br>

mode of the file object:<br>

<br>

py> f = open('/tmp/spam.binary', 'wb')<br>

py> g = open('/tmp/spam.text', 'w')<br>

py> type(f), type(g)<br>

(<class '_io.BufferedWriter'>, <class '_io.TextIOWrapper'>)<br>

<br>

py> f.mode, g.mode<br>

('wb', 'w')<br>
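<br>
The same check can be made without reading any data. A minimal sketch, assuming a hypothetical helper named reads_bytes() (not part of the proposal) and treating the result as a best-effort guess only:<br>

```python
import io

def reads_bytes(f):
    """Best-effort guess, made without reading, whether f.read() returns bytes."""
    # Text-mode files and io.StringIO derive from io.TextIOBase.
    if isinstance(f, io.TextIOBase):
        return False
    # Binary files and io.BytesIO derive from one of these two bases.
    if isinstance(f, (io.RawIOBase, io.BufferedIOBase)):
        return True
    # Assumption: objects outside the io hierarchy expose a mode string.
    mode = getattr(f, "mode", "")
    return isinstance(mode, str) and "b" in mode

print(reads_bytes(io.BytesIO(b"spam")), reads_bytes(io.StringIO("spam")))
```

File-like objects that neither inherit from the io base classes nor carry a mode attribute are reported as non-binary here, so the answer is a hint, not a guarantee.<br>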

<br>

<br>

> This might be considered irrelevant <br>

<br>

Indeed :-)<br>

<br>

<br>

> (personally,<br>

> I don't see a problem with a function definition that says "parameter<br>

> fd must be an object that has a read(length) method that returns<br>

> bytes" - that's basically what duck typing is all about) but it *is* a<br>

> distinguishing feature of files over in-memory data.<br>

<br>

But it's not a distinguishing feature between the proposal, and writing:<br>

<br>

unpack(fmt, f.read(size))<br>

<br>

which will also read from the file and affect the file state before <br>

failing. So it's a difference that makes no difference.<br>

<br>

<br>

> There is also the fact that read() is only defined to return *at most*<br>

> the requested number of bytes. Non-blocking reads and objects like<br>

> pipes that can return additional data over time add extra complexity.<br>

<br>

How do they add extra complexity?<br>

<br>

According to the proposal, unpack() attempts the read. If it returns the <br>

correct number of bytes, the unpacking succeeds. If it doesn't, you get <br>

an exception, precisely the same way you would get an exception if you <br>

manually did the read and passed it to unpack().<br>

<br>

It's the caller's responsibility to provide a valid file object. If your <br>

struct needs 10 bytes, and you provide a file that returns 6 bytes, you <br>

get an exception. There's no promise made that unpack() should repeat <br>

the read over and over again, hoping that it's a pipe and more data <br>

becomes available. It either works with a single read, or it fails.<br>
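<br>
The single-read behaviour described above can be sketched as follows. The name unpack_from_file() is hypothetical and used only for illustration; it is not an existing struct function:<br>

```python
import io
import struct

def unpack_from_file(fmt, f):
    """Hypothetical helper: one read, then a plain struct.unpack()."""
    # A single read; no retry if the file returns fewer bytes than requested.
    data = f.read(struct.calcsize(fmt))
    # struct.unpack() raises struct.error when the bytes object has the wrong length.
    return struct.unpack(fmt, data)

# Enough data available: the struct is unpacked.
print(unpack_from_file("3s", io.BytesIO(b"abcdef")))

# Only 6 of the 10 required bytes available: struct.error, no second read.
try:
    unpack_from_file("10s", io.BytesIO(b"abcdef"))
except struct.error as exc:
    print("struct.error:", exc)
```

Note that a read() returning None or str reaches struct.unpack() unchanged in this sketch, so those cases surface as TypeError, not struct.error.<br>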

<br>

Just like the similar APIs provided by pickle, json, etc., which <br>

provide load() and loads() functions.<br>

<br>

In hindsight, the precedent set by pickle, json, etc suggests that we <br>

ought to have an unpack() function that reads from files and an <br>

unpacks() function that takes a string, but that ship has sailed.<br>

<br>

<br>

> Again, not insoluble, and potentially simple enough to handle with<br>

> "read N bytes, if you got something other than bytes or fewer than N<br>

> of them, raise an error", but still enough that the special cases<br>

> start to accumulate.<br>

<br>

I can understand the argument that the benefit of this is trivial over <br>

<br>

    unpack(fmt, f.read(calcsize(fmt)))<br>

<br>

Unlike reading from a pickle or json record, it's pretty easy to know how <br>

much to read, so there is an argument that this convenience method <br>

doesn't gain us much convenience.<br>

<br>

But I'm just not seeing where all the extra complexity and special case <br>

handling is supposed to be, except by having unpack make promises that <br>

the OP didn't request:<br>

<br>

- read partial structs from non-blocking files without failing<br>

- deal with file system errors without failing<br>

- support reading from text files when bytes are required without failing<br>

- if an exception occurs, the state of the file shouldn't change<br>

<br>

Those promises *would* add enormous amounts of complexity, but I don't <br>

think we need to make those promises. I don't think the OP wants them, <br>

I don't want them, and I don't think they are reasonable promises to <br>

make.<br>

<br>

<br>

> The suggestion is a nice convenience method, and probably a useful<br>

> addition for the majority of cases where it would do exactly what was<br>

> needed, but still not completely trivial to actually implement and<br>

> document (if I were doing it, I'd go with the naive approach, and just<br>

> raise a ValueError when read(N) returns anything other than N bytes,<br>

> for what it's worth).<br>

<br>

Indeed. Except that we should raise precisely the same exception type <br>

that struct.unpack() currently raises in the same circumstances:<br>

<br>

py> struct.unpack("ddd", b"a")<br>

Traceback (most recent call last):<br>

  File "<stdin>", line 1, in <module><br>

struct.error: unpack requires a bytes object of length 24<br>

<br>

rather than ValueError.<br>
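<br>
One way the naive check could be combined with struct.error is sketched below. The helper name checked_unpack() and the message wording are assumptions made for illustration, not part of the proposal:<br>

```python
import io
import struct

def checked_unpack(fmt, f):
    """Hypothetical helper: report every bad read as struct.error."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    # Non-blocking raw files may return None; text files return str.
    if not isinstance(data, (bytes, bytearray, memoryview)):
        raise struct.error(
            "read() returned %s, expected bytes" % type(data).__name__
        )
    # A short read is an error; the read is not repeated.
    if len(data) != size:
        raise struct.error(
            "unpack requires %d bytes, read() returned %d" % (size, len(data))
        )
    return struct.unpack(fmt, data)

try:
    checked_unpack("ddd", io.BytesIO(b"a"))
except struct.error as exc:
    print("struct.error:", exc)
```

The explicit type check is what replaces the cryptic TypeError with a message that names the actual problem; the file position still changes before the exception is raised.<br>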

<br>

<br>

<br>

-- <br>

Steve<br>


</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature">Thanks,<br>Andrew Svetlov</div>