[issue27000] improve document of filter
New submission from Xiang Zhang:

I think filter's doc can be improved[1]:

1. It doesn't mention ``bool``. ``bool`` is treated the same way as ``None``. It is not called. But this is not mentioned.

2. 'the identity function is assumed' is confusing, at least for me. It looks like when ``None`` is passed, *function* is set to a default func, ``lambda x: x``. Then *function* is called and we identify the return value True or False. But this is not the truth. There is no default value and no function is applied. I think this should be deleted.

[1] https://docs.python.org/3/library/functions.html#filter

----------
assignee: docs@python
components: Documentation
files: filter_doc.patch
keywords: patch
messages: 265325
nosy: docs@python, xiang.zhang
priority: normal
severity: normal
status: open
title: improve document of filter
versions: Python 3.5, Python 3.6
Added file: http://bugs.python.org/file42818/filter_doc.patch

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue27000>
_______________________________________
Franklin? Lee added the comment:

Aren't these both implementation details? As in, they only affect efficiency, not effect, right?

----------
nosy: +leewz
Josh Rosenberg added the comment:

bool is not enough of a special case to call it out without confusing the issue. No, the bool constructor is not actually called. But it will still behave as if it were called for all intents and purposes; it just skips the reference counting shenanigans for the actual True/False singleton objects. Drawing a distinction might make people worry that it wouldn't invoke __len__ or __bool__ as normal.

Similarly, for all intents and purposes, your mental model of the identity function is mostly correct (I suspect the wording meant to use "function" in the mathematical sense, but it works either way). Yes, it never actually calls a function, but that's irrelevant to observed behavior. Your only mistake is in assuming the function actually returns the specific values True or False; no filter function needs to return True or False, they are simply evaluated for truth or falsehood (that's why filter's docs use "true" and "false" to describe it, not "True" and "False"). filter(str.strip, list_of_strings) is perfectly legal, and yields those strings that contain non-whitespace characters.

----------
nosy: +josh.r
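A minimal sketch of the point above, added for illustration and not part of the original message. The sample list is made up; the only claim is that ``filter`` tests the truth value of whatever the function returns, so the function need not return ``True`` or ``False``:

```python
# Illustrative input; the contents of list_of_strings are an assumption.
list_of_strings = ["spam", "   ", "", " eggs ", "\t\n"]

# str.strip returns a (possibly empty) string, never True or False.
stripped = [str.strip(s) for s in list_of_strings]

# filter() keeps the ORIGINAL item whenever the function's result is truthy.
non_blank = list(filter(str.strip, list_of_strings))

print(stripped)   # ['spam', '', '', 'eggs', '']
print(non_blank)  # ['spam', ' eggs ']
```

Note that the kept strings are unmodified: `' eggs '` still carries its surrounding spaces, because the stripped value is used only for the truth test.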
Josh Rosenberg added the comment: Franklin said it better: The only difference between documentation and behavior is invisible implementation details, which have no business being documented in any event (since they needlessly tie the hands of maintainers of CPython and other Python interpreters, while providing no useful benefit).
Xiang Zhang added the comment: First I have to clarify that my mistake is not in understanding but in writing. What I meant by 'identify the return value True or False' is actually what you say, 'evaluate for truth or falsehood'. I also noticed the lowercase false and true in the doc. I know they are deliberate. Sorry about this. For ``bool``, I almost agree with you now, although I still think the second part gives readers incorrect info. For ``bool``, it is not equivalent to ``(item for item in iterable if function(item))`` but to ``(item for item in iterable if item)``. For CPython, you are not telling the truth. And on the identity function, I insist. I don't see that this sentence adds anything other than confusion. I don't think removing it would affect other implementations either.
Raymond Hettinger added the comment:
> bool is not enough of a special case to call it out without confusing the issue.
I concur. It would be easy to make the docs less usable by elaborating on this special case. For the most part, a user should use None if they just want to test the truth value of the input. The docs for filter() have been through a number of revisions and much discussion. Let's not undo previous efforts. For the most part, these docs have been successful in communicating what filter() does.

----------
nosy: +rhettinger
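For readers following along, a small illustration of the ``None`` usage recommended above (an added sketch, not from the original message; the sample values are arbitrary):

```python
# Arbitrary mix of truthy and falsy objects, chosen for illustration.
values = [0, 1, "", "a", [], [0], None, 0.0, (), ("x",)]

# Passing None keeps exactly the items that are true in a boolean context.
truthy = list(filter(None, values))

# The same selection written as a comprehension.
same = [v for v in values if v]

print(truthy)  # [1, 'a', [0], ('x',)]
```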
Franklin? Lee added the comment:
> Although I still think it's telling readers incorrect info in the second part. For ``bool``, it is not equivalent to ``(item for item in iterable if function(item))`` but ``(item for item in iterable if item)``. For CPython, you are not telling the truth.
What do you mean by, "it is not equivalent"? Are you saying that the first one will give a different result from the second?

In general, when interpreting an object in a boolean context, Python will do the "equivalent" of calling ``bool`` on it, where "equivalent" in the docs means "has the same result as". See, for example, the ``itertools`` docs: https://docs.python.org/3/library/itertools.html#itertools.accumulate

--------

In this case:

If ``filter`` is passed ``None`` or ``bool``, it will call "PyObject_IsTrue" on the object. (https://github.com/python/cpython/blob/c750281ef5d8fa89d13990792163605302e97...)

"PyObject_IsTrue" is defined here: https://github.com/python/cpython/blob/6aea3c26a22c5d7e3ffa3d725d8d75dac0e1b...

On the other hand, ``bool`` is defined here, as "PyBool_Type": https://github.com/python/cpython/blob/c750281ef5d8fa89d13990792163605302e97...

"PyBool_Type" is defined here, with the ``bool.__new__`` function defined as "bool_new": https://github.com/python/cpython/blob/2d264235f6e066611b412f7c2e1603866e0f7...

"bool_new" is defined here, using "PyObject_IsTrue": https://github.com/python/cpython/blob/2d264235f6e066611b412f7c2e1603866e0f7...

Both "filter_next" and "bool_new" call "PyObject_IsTrue" and take 0 as False, positive as True, and negative as an error. So it's equivalent to calling ``bool``, but the "bool_new" call is sort of inlined. Does that clear things up?
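The "has the same result as" reading can be checked from pure Python. The sketch below is an added illustration, not part of the original message: ``Flag`` and ``run`` are hypothetical helpers, and the check only shows that the four spellings keep the same items and test each item's truth value once. It says nothing about whether ``bool`` itself is called internally, which is the implementation detail under discussion.

```python
class Flag:
    """Hypothetical helper that counts how often its truth value is tested."""

    tests = 0  # class-wide counter, reset before each run

    def __init__(self, value):
        self.value = value

    def __bool__(self):
        Flag.tests += 1
        return self.value


items = [Flag(True), Flag(False), Flag(True)]


def run(select):
    """Return (kept items, number of __bool__ calls) for one selection strategy."""
    Flag.tests = 0
    kept = list(select(items))
    return kept, Flag.tests


results = {
    "filter(None, ...)": run(lambda it: filter(None, it)),
    "filter(bool, ...)": run(lambda it: filter(bool, it)),
    "if item": run(lambda it: (item for item in it if item)),
    "if bool(item)": run(lambda it: (item for item in it if bool(item))),
}

# Same two objects kept, and __bool__ invoked once per item, in every case.
expected = ([items[0], items[2]], 3)
for name, outcome in results.items():
    print(name, outcome == expected)
```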
R. David Murray added the comment:

Thanks for wanting to improve the docs, but the docs are a specification of syntax and behavior, not of implementation. So, the existing docs are correct, and changing them would over-specify the function. Since Raymond has also voted for rejection, I'm closing this.

----------
nosy: +r.david.murray
resolution: -> not a bug
stage: -> resolved
status: open -> closed
Xiang Zhang added the comment: It's OK. Thanks for all your info; I did learn from it. BTW, Franklin, I knew what would happen when ``bool`` is passed.
Franklin? Lee added the comment: In that case, I'm still wondering what you mean by "not equivalent". Are you saying there is code which will work only if the ``bool`` function is really called?
Xiang Zhang added the comment: Not about code, just the doc. In my opinion, if ``bool`` is not called, it is definitely not equivalent to ``(item for item in iterable if function(item))``, which actually calls the function, even if there is no difference in the result. But this is rather subjective and not important now. I am OK with all your opinions. And considering other interpreters, leaving it untouched is a good idea.
participants (5)
- Franklin? Lee
- Josh Rosenberg
- R. David Murray
- Raymond Hettinger
- Xiang Zhang