
On 20 January 2017 at 15:13, Nathaniel Smith <njs@pobox.com> wrote:
On Jan 20, 2017 09:00, "Paul Moore" <p.f.moore@gmail.com> wrote:
On 20 January 2017 at 16:51, Elizabeth Myers <elizabeth@interlinked.me> wrote:
Should I write up a PEP about this? I am not sure if it's justified or not. It's 3 changes (calcsize and two format specifiers), but it might be useful to codify it.
It feels a bit minor to need a PEP, but having said that, did you pick up on the comment about needing to return the number of bytes consumed?
str = struct.unpack('z', b'test\0xxx')
How do we know where the unpack got to, so that we can continue parsing from there? It seems a bit wasteful to have to scan the string twice to use calcsize for this...
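To make the question concrete: the 'z' format above is only a proposal and does not exist in the struct module today, so anyone parsing a NUL-terminated string currently has to track the position by hand. A minimal sketch of that, where unpack_cstring is a hypothetical helper name (not part of struct) and the input is assumed to be bytes or bytearray:

```python
def unpack_cstring(buffer, offset=0):
    """Hypothetical helper: read a NUL-terminated string starting at offset.

    Returns (string_bytes, new_offset), where new_offset points just past
    the terminating NUL so the caller can continue parsing from there.
    """
    end = buffer.index(b"\0", offset)  # raises ValueError if no NUL is found
    return bytes(buffer[offset:end]), end + 1


s, pos = unpack_cstring(b"test\0xxx")
rest = b"test\0xxx"[pos:]
```

The buffer is scanned only once here, because the position of the terminator is also what determines the new offset.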
unpack() is OK, because it already has the rule that it raises an error if it doesn't exactly consume the buffer. But I agree that if we do this then we'd really want versions of unpack_from and pack_into that return the new offset. (Further arguments that calcsize is insufficient: it doesn't work for potential other variable-length items, e.g. if we added uleb128 support; it quickly becomes awkward if you have multiple strings; in practice I think everyone who needs this would just end up writing a wrapper that calls calcsize and returns the new offset anyway, so we should just provide that up front.)
For pack_into this is also easy, since currently it always returns None, so if it started returning an integer no one would notice (and it'd be kinda handy in its own right, honestly).
unpack_from is the tricky one, because it already has a return value and this isn't it. Ideally it would have worked this way from the beginning, but too late for that now... I guess the obvious solution would be to come up with a new function that's otherwise identical to unpack_from but returns a (values, offset) tuple. What to call this, though, I don't know :-). unpack_at? unpack_next? (Hinting that this is the natural primitive you'd use to implement unpack_iter.)
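For illustration, the kind of wrapper described above might look like the following sketch. It only handles today's fixed-size formats, where the new offset is simply the old offset plus the struct size; the name unpack_next is one of the candidates floated above, not an existing struct function.

```python
import struct


def unpack_next(fmt, buffer, offset=0):
    """Hypothetical helper: like struct.unpack_from, but returns (values, new_offset)."""
    s = struct.Struct(fmt)
    values = s.unpack_from(buffer, offset)
    # For fixed-size formats the consumed length is just the struct size.
    return values, offset + s.size


# Two records packed back to back, parsed by threading the offset through.
data = struct.pack("<IB", 7, 2) + struct.pack("<H", 513)
(a, b), pos = unpack_next("<IB", data)
(c,), pos = unpack_next("<H", data, pos)
```

With variable-length items the size could no longer be computed from the format alone, which is why the offset would have to come back from the unpacking call itself.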
Yes, maybe a PEP. Then we could also, for example, add the suggestion of allowing whitespace in the struct description string, which is nice. And we could think of having the unpack methods return a specialized object rather than a plain tuple, with attributes carrying the extra information. So, instead of

    a, str = struct.unpack("IB$", data)

people who want the length can do:

    tmp = struct.unpack("IB$", data)
    do_things_with_len(tmp.tell)
    a, str = tmp

The struct "object" could allow other things as well. Since we are at it, maybe a zero-copy version that would return items from their in-place buffer positions. But, OK, maybe most of this should just go in a third-party package. Anyway, a PEP could be open to more improvements than the proposed variable-length fields. (I think having attributes with extra information, such as the size, is better than having: size, (a, str) = struct.unpack2(...) )

js -><-
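A rough sketch of the "tuple with attributes" idea, under stated assumptions: UnpackResult and unpack_with_tell are hypothetical names, the `tell` attribute follows the naming used above, and a fixed-size format stands in for the proposed "IB$" since variable-length codes do not exist in struct.

```python
import struct


class UnpackResult(tuple):
    """Hypothetical result type: behaves like a tuple, plus a `tell` attribute."""

    def __new__(cls, values, tell):
        self = super().__new__(cls, values)
        self.tell = tell  # offset just past the last byte consumed
        return self


def unpack_with_tell(fmt, buffer, offset=0):
    """Hypothetical wrapper around Struct.unpack_from that records the end offset."""
    s = struct.Struct(fmt)
    return UnpackResult(s.unpack_from(buffer, offset), offset + s.size)


tmp = unpack_with_tell("<IB", b"\x01\x00\x00\x00\x02rest")
a, b = tmp          # existing tuple-unpacking code keeps working
consumed = tmp.tell  # extra information for callers that want it
```

Because the result is still a tuple, code that ignores the attribute would not need to change.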
-n