[issue16518] add "buffer protocol" to glossary
New submission from Chris Jerdonek: This issue is to add "buffer protocol" (or perhaps "buffer object") to the glossary. The concept is currently described here: http://docs.python.org/dev/c-api/buffer.html#buffer-protocol Éric initially suggested doing this in the comments to issue 13538. Such a glossary entry would be useful because the buffer protocol (or buffer object) should likely be cited, for example, wherever a function accepts a bytes object, bytearray object, or any object that supports the buffer protocol. The str() constructor is one example where this is done: http://hg.python.org/cpython/file/59acd5cac8b5/Doc/library/functions.rst#l12... "Buffer object" might be the more useful term to add to the glossary because it would help to have a briefer way of saying "any object that supports the buffer protocol." (I'm assuming this is what "buffer object" actually means.) The patch for this issue should also do a comprehensive review of occurrences of buffer object/protocol throughout the docs and add or update links and index entries where appropriate. ---------- assignee: docs@python components: Documentation messages: 176042 nosy: chris.jerdonek, docs@python, eric.araujo, ezio.melotti, pitrou priority: normal severity: normal status: open title: add "buffer protocol" to glossary type: enhancement versions: Python 3.2, Python 3.3, Python 3.4 _______________________________________ Python tracker <report@bugs.python.org> <http://bugs.python.org/issue16518> _______________________________________
Changes by Ezio Melotti <ezio.melotti@gmail.com>: ---------- stage: -> needs patch
Terry J. Reedy added the comment: I would use the term that is currently used in some error messages. ---------- nosy: +terry.reedy
Antoine Pitrou added the comment: "Buffer protocol" is the right term. "Buffer object" doesn't mean anything in Python 3 and, furthermore, it might be mixed up with the Python 2 `buffer` type. As for the error messages, they are generally very bad on this topic, so I would vote to change them :-)
Chris Jerdonek added the comment: Do we have a recommended (and preferably briefer) way of saying, "any object that supports the buffer protocol"?
Antoine Pitrou added the comment:
> Do we have a recommended (and preferably briefer) way of saying, "any object that supports the buffer protocol"?
It depends where. There's no recommended way yet, but I would vote for "bytes-like object" in error messages that are targeted at the average developer. The docs (glossary?) could explain that "bytes-like object" is the same as "buffer-providing object" or "object implementing the buffer protocol".
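For readers following the terminology, here is a minimal sketch of what "object implementing the buffer protocol" can mean from the Python side. It assumes that a successful `memoryview(obj)` call is an acceptable stand-in for "provides a buffer"; the helper name `supports_buffer_protocol` is hypothetical and not part of any patch in this issue.

```python
import array


def supports_buffer_protocol(obj):
    """Hypothetical helper: True if memoryview() can obtain a buffer from obj."""
    try:
        # memoryview() raises TypeError for objects that export no buffer.
        with memoryview(obj):
            return True
    except TypeError:
        return False


candidates = [b"abc", bytearray(b"abc"), array.array("B", b"abc"), "abc", 42]

# Map each candidate's type name to whether it exports a buffer.
results = {type(c).__name__: supports_buffer_protocol(c) for c in candidates}
print(results)
```

Note that this check only shows that a buffer is exported; it says nothing about the item format, which is the distinction raised later in the thread.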
Chris Jerdonek added the comment: s/any//
Ezio Melotti added the comment:
> "Buffer object" doesn't mean anything in Python 3 and, furthermore, it might be mixed up with the Python 2 `buffer` type.
Agreed.
> As for the error messages, they are generally very bad on this topic, so I would vote to change them :-)
I would say that they are verbose maybe, but not necessarily bad. Using "any object that supports the buffer protocol" without explicitly mentioning bytes (and bytearray) might end up being even more confusing (if that's what is being proposed).
Ezio Melotti added the comment:
> I would vote for "bytes-like object"
Sounds like a good compromise between brevity and clarity to me.
Stefan Krah added the comment: I wouldn't use "bytes-like object". One can certainly argue that *memoryview* should be bytes-like as a matter of preference, but the buffer protocol specifies strongly (or even statically) typed multi-dimensional arrays. PEP-3118 Py_buffer structs are essentially how NumPy works internally. ---------- nosy: +skrah
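The point about typed, multi-dimensional buffers can be illustrated with the standard library alone. This is only a sketch: the variable names are made up, and it assumes `memoryview.cast()` as available since Python 3.3.

```python
import array

# A buffer exporter whose items are C doubles, not bytes.
values = array.array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
view = memoryview(values)

# The exported buffer carries an item format and item size.
print(view.format, view.itemsize, view.ndim, view.shape)

# Casting goes through a byte view first, then to a 2 x 3 array of doubles.
flat = view.cast("B")
grid = flat.cast("d", shape=[2, 3])
print(grid.ndim, grid.shape, grid.tolist())
```

Here `grid` is a two-dimensional view over the same memory, which is why a "bytes-like" label arguably undersells what the protocol can describe.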
Antoine Pitrou added the comment:
> I wouldn't use "bytes-like object". One can certainly argue that *memoryview* should be bytes-like as a matter of preference, but the buffer protocol specifies strongly (or even statically) typed multi-dimensional arrays.
Ach :-(
> PEP-3118 Py_buffer structs are essentially how NumPy works internally.
Well, we should still write a Python documentation, not a NumPy documentation (on this tracker anyway). Outside of NumPy, there's little use for multi-dimensional objects.
Stefan Krah added the comment: Antoine Pitrou <report@bugs.python.org> wrote:
>> PEP-3118 Py_buffer structs are essentially how NumPy works internally.
> Well, we should still write a Python documentation, not a NumPy documentation (on this tracker anyway). Outside of NumPy, there's little use for multi-dimensional objects.
Ok, but people should not be surprised if their (Python) array.array() of double or their array of ctypes structs is silently accepted by some byte consuming function. How about "object does not provide a byte buffer" for error messages and "(byte) buffer provider" as a shorthand for "any buffer provider that exposes its memory as a sequence of unsigned bytes in response to a PyBUF_SIMPLE request"?
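As a rough illustration of the "silently accepted" case, the sketch below passes an `array.array` of doubles to a few byte-consuming callables. The choice of `bytes()`, `bytes.join` and `hashlib.sha256` is an assumption made for this example; the thread does not name specific functions.

```python
import array
import hashlib
import struct

# Two C doubles; each item occupies doubles.itemsize bytes of memory.
doubles = array.array("d", [1.0, 2.5])

# Each call consumes the raw memory of the array rather than its values.
raw = bytes(doubles)
joined = b"".join([doubles])
digest = hashlib.sha256(doubles).hexdigest()

print(len(raw), joined == raw, digest == hashlib.sha256(raw).hexdigest())
```

No error is raised in any of the three calls, which is the behaviour a reader might not expect from a function documented as taking "bytes".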
Antoine Pitrou added the comment:
>> Well, we should still write a Python documentation, not a NumPy documentation (on this tracker anyway). Outside of NumPy, there's little use for multi-dimensional objects.
> Ok, but people should not be surprised if their (Python) array.array() of double or their array of ctypes structs is silently accepted by some byte consuming function.
Probably. My own (humble :-)) opinion is that array.array() is a historical artifact, and its use doesn't seem to be warranted in modern Python code. ctypes is obviously a very special library, and not for the faint of heart.
> How about "object does not provide a byte buffer" for error messages and "(byte) buffer provider" as a shorthand for "any buffer provider that exposes its memory as a sequence of unsigned bytes in response to a PyBUF_SIMPLE request"?
It's not too bad, I think. However, what I think is important is that the average (non-expert) Python developer understand that the function really accepts a bytes object, and other similar types (because, really, bytes is the only bytes-like type most developers will ever face). That's why I'm proposing "bytes-like object".
Stefan Krah added the comment: Antoine Pitrou <report@bugs.python.org> wrote:
>> How about "object does not provide a byte buffer" for error messages and "(byte) buffer provider" as a shorthand for "any buffer provider that exposes its memory as a sequence of unsigned bytes in response to a PyBUF_SIMPLE request"?
> It's not too bad, I think. However, what I think is important is that the average (non-expert) Python developer understand that the function really accepts a bytes object, and other similar types (because, really, bytes is the only bytes-like type most developers will ever face). That's why I'm proposing "bytes-like object".
If it is somehow possible to establish the term as a shorthand for the real meaning, then I guess it's the most economical option for documenting Python methods (I don't think it should be used in the C-API docs though). help(b''.join) for example would sound better with "bytes-like object" than with "(byte) buffer provider".
Chris Jerdonek added the comment:
> I wouldn't use "bytes-like object".
What about "buffer-like object"?
Antoine Pitrou added the comment:
>> I wouldn't use "bytes-like object".
> What about "buffer-like object"?
"buffer-like" means "like a buffer" which is wrong on two points:
- "buffer" is not defined at this point, so the user doesn't understand what it means
- we are not talking about an object which is "like a buffer", but which "provides a buffer"
Chris Jerdonek added the comment:
>> That's why I'm proposing "bytes-like object".
> If it is somehow possible to establish the term as a shorthand for the real meaning,
This can be established via the glossary. We can still use "buffer provider" for the general case, if we find that it is useful in certain circumstances.
Chris Jerdonek added the comment: After this issue is resolved, the binascii docs can be updated as suggested in issue 16724.
Changes by Florent Xicluna <florent.xicluna@gmail.com>: ---------- nosy: +flox
Ezio Melotti added the comment: Here's a patch that adds "bytes-like object" to the glossary, links to the buffer protocol docs[0] and provides bytes and bytearray as examples. [0]: http://docs.python.org/dev/c-api/buffer.html#buffer-protocol ---------- keywords: +patch stage: needs patch -> patch review versions: +Python 2.7 -Python 3.2 Added file: http://bugs.python.org/file30065/issue16518.diff
Roundup Robot added the comment: New changeset 474f28bf67b3 by Ezio Melotti in branch '3.3': #16518: add "bytes-like object" to the glossary. http://hg.python.org/cpython/rev/474f28bf67b3 New changeset 747cede24367 by Ezio Melotti in branch 'default': #16518: merge with 3.3. http://hg.python.org/cpython/rev/747cede24367 New changeset 1b92a0112f5d by Ezio Melotti in branch '2.7': #16518: add "bytes-like object" to the glossary. http://hg.python.org/cpython/rev/1b92a0112f5d ---------- nosy: +python-dev
Roundup Robot added the comment: New changeset d1aa8a9eba44 by Ezio Melotti in branch '2.7': #16518: fix links in glossary entry. http://hg.python.org/cpython/rev/d1aa8a9eba44
Ezio Melotti added the comment: The attached patch replaces things like "object that support the buffer protocol/interface/API" with "bytes-like objects" throughout the docs. The patch doesn't change error messages/docstrings. I also noticed that on 2.7[0], the section about the buffer protocol in Doc/c-api/buffer.rst is called "Buffers and Memoryview Objects" and it's not as clear as the one on 3.x[1]. Should this section be backported? [0]: http://docs.python.org/2.7/c-api/buffer.html#bufferobjects [1]: http://docs.python.org/dev/c-api/buffer.html#bufferobjects ---------- Added file: http://bugs.python.org/file30089/issue16518-2.diff
Antoine Pitrou added the comment:
> I also noticed that on 2.7[0], the section about the buffer protocol in Doc/c-api/buffer.rst is called "Buffers and Memoryview Objects" and it's not as clear as the one on 3.x[1]. Should this section be backported?
The "buffer protocol" situation is different on 2.x, please let's concentrate on 3.x :-)
Roundup Robot added the comment: New changeset 003e4eb92683 by Ezio Melotti in branch '3.3': #16518: use "bytes-like object" throughout the docs. http://hg.python.org/cpython/rev/003e4eb92683 New changeset d4912244cce6 by Ezio Melotti in branch 'default': #16518: merge with 3.3. http://hg.python.org/cpython/rev/d4912244cce6
Ezio Melotti added the comment: The attached patch uses "bytes-like objects" in the error messages. ---------- Added file: http://bugs.python.org/file30124/issue16518-3.diff
Antoine Pitrou added the comment:
> The attached patch uses "bytes-like objects" in the error messages.
I'm surprised your patch doesn't touch Python/getargs.c.
Ezio Melotti added the comment: FWIW I was grepping for buffer protocol/interface/api, and then double-checking for "buffer" in the resulting files. Python/getargs.c doesn't seem to mention the buffer protocol/interface/api at all.
Ezio Melotti added the comment: Updated patch to include getargs.c too. ---------- stage: patch review -> commit review Added file: http://bugs.python.org/file30138/issue16518-4.diff
Raymond Hettinger added the comment: At first reading, it looks like matters were made more confusing with "bytes-like object" as a defined term. ---------- nosy: +rhettinger
Ezio Melotti added the comment: Can you elaborate?
Roundup Robot added the comment: New changeset e7e8a218737a by R David Murray in branch 'default': #16518: Bring error messages in harmony with docs ("bytes-like object") https://hg.python.org/cpython/rev/e7e8a218737a
R. David Murray added the comment: Committed the message changes to 3.5 only, since it will probably cause tests to fail in various projects, despite messages not being a formal part of the python API. Per IRC conversation with Ezio and Antoine, I posted a note to python-dev to let people know we now have a consistent terminology in the docs and error messages, and to provide a last opportunity for objections (it is easy enough to back the patch out if there is an outcry, but I don't expect one). ---------- nosy: +r.david.murray resolution: -> fixed stage: commit review -> resolved status: open -> closed
Serhiy Storchaka added the comment: There are other unfixed messages (maybe introduced after 3.3):
>>> b''.join([''])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected bytes, bytearray, or an object with the buffer interface, str found
>>> str(42, 'utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: coercing to str: need bytes, bytearray or buffer-like object, int found
>>> import array; array.array('B').frombytes(array.array('I'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string/buffer of bytes required.
>>> import socket; print(socket.socket.sendmsg.__doc__)
sendmsg(buffers[, ancdata[, flags[, address]]]) -> count

Send normal and ancillary data to the socket, gathering the non-ancillary data from a series of buffers and concatenating it into a single message. The buffers argument specifies the non-ancillary data as an iterable of buffer-compatible objects (e.g. bytes objects). The ancdata argument specifies the ancillary data (control messages) as an iterable of zero or more tuples (cmsg_level, cmsg_type, cmsg_data), where cmsg_level and cmsg_type are integers specifying the protocol level and protocol-specific type respectively, and cmsg_data is a buffer-compatible object holding the associated data. The flags argument defaults to 0 and has the same meaning as for send(). If address is supplied and not None, it sets a destination address for the message. The return value is the number of bytes of non-ancillary data sent.

And there are several mentions of "buffer-like" or "buffer-compatible" in the documentation. ---------- nosy: +serhiy.storchaka
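To check how two of the messages quoted above read on whatever interpreter is at hand, a small sketch along these lines may help. The helper `error_message` is hypothetical, and the exact wording printed will depend on the Python version, so no expected text is shown.

```python
def error_message(call):
    """Hypothetical helper: run call() and return the TypeError text, if any."""
    try:
        call()
    except TypeError as exc:
        return str(exc)
    return None


# Two of the calls quoted in the message above.
messages = {
    "b''.join([''])": error_message(lambda: b"".join([""])),
    "str(42, 'utf8')": error_message(lambda: str(42, "utf8")),
}

for expression, text in messages.items():
    print(f"{expression}: {text}")
```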
Georg Brandl added the comment: Please open a new issue for those. ---------- nosy: +georg.brandl
Ezio Melotti added the comment: See #22581.
participants (11)
- Antoine Pitrou
- Chris Jerdonek
- Ezio Melotti
- Florent Xicluna
- Georg Brandl
- R. David Murray
- Raymond Hettinger
- Roundup Robot
- Serhiy Storchaka
- Stefan Krah
- Terry J. Reedy