[issue23787] sum() function docstring lists arguments incorrectly
New submission from Valentine Sinitsyn:
The sum() function docstring describes the expected arguments as follows (Python 2.7.6):
sum(...)
    sum(sequence[, start]) -> value
    ...
This implies sum() should accept str, unicode, list, tuple, bytearray, buffer, and xrange. However, you clearly can't use this function to sum strings (which is also mentioned in the docstring):
sum('abc')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'
I'd suggest describing the first argument as an iterable, which is actually what sum() expects there.
----------
assignee: docs@python
components: Documentation
messages: 239388
nosy: docs@python, vsinitsyn
priority: normal
severity: normal
status: open
title: sum() function docstring lists arguments incorrectly
type: enhancement
versions: Python 2.7
_______________________________________
Python tracker
R. David Murray added the comment:
In python3 the docstring does say iterable. It wouldn't be a bad thing to change it in 2.7, but it is not much of a priority. iterable vs sequence makes no difference to the str question: a string is an iterable. The docstring explicitly says strings are excepted, as you mentioned, so there's nothing to do about that.
I note that python3 also does not support iterables of byte-like objects. I'm not sure if this would actually be helpful to add to the docstring, though, since sum(b'abc') works and a docstring is probably not an appropriate place to go into detail as to why.
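The distinction drawn above can be checked with a minimal Python 3 sketch. The variable names are illustrative only and are not part of the issue:

```python
# A bytes object iterates as ints, so it sums like any iterable of numbers.
total = sum(b'abc')  # 97 + 98 + 99

# An iterable *of* bytes objects is rejected, much like an iterable of strings.
try:
    sum([b'a', b'b'], b'')
except TypeError as exc:
    bytes_error = str(exc)
else:
    bytes_error = None

# Joining is the usual way to concatenate bytes objects.
joined = b''.join([b'a', b'b'])
```

Here `total` is 294, the `sum()` call over a list of bytes objects raises TypeError, and `joined` is `b'ab'`.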
----------
nosy: +r.david.murray
_______________________________________
Valentine Sinitsyn added the comment:
Yes, strings aren't an issue. I only used them as an example.
I came across this issue during code review, while discussing whether it is okay to pass a generator expression to sum() (like sum(x*2 for x in xrange(5))) or whether it is better to convert it to a list first (sum([x*2 for x in xrange(5)])). Both variants work, so the docstring is a sort of specification here.
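As a small illustration of the two variants, here is a sketch written for Python 3, where range() stands in for the xrange() used in the 2.7 examples above. The variable names are made up for the example:

```python
values = range(5)  # the thread uses xrange(5) on Python 2.7

# Generator expression: items are produced one at a time, no list is built.
from_generator = sum(x * 2 for x in values)

# List comprehension: the full list is built first, then summed.
from_list = sum([x * 2 for x in values])
```

Both expressions give the same total (20); the generator form simply avoids materializing the intermediate list.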
Surely, it's not a high priority task anyways.
----------
_______________________________________
Changes by Raymond Hettinger
Wolfgang Maier added the comment:
> This implies sum() should accept str, unicode, list, tuple, bytearray, buffer, and xrange.
and in fact it *does* accept all these as input. It just refuses to add the elements of the sequence if these elements are of certain types. Of course, the elements of a string are strings themselves so this does not work:
sum('abc', '')
Traceback (most recent call last):
  File "", line 1, in <module>
    sum('abc', '')
TypeError: sum() can't sum strings [use ''.join(seq) instead]
compare with a bytes sequence in Python3, where the elements are ints:
sum(b'abc', 0)
294
but strings are also perfectly acceptable as input if you do not try to add their str elements, but something else:
class X(int):
    def __add__(self, other):
        return X(ord(other) + self)
sum('abc', X(0))
294
=> the docs are right and there is no issue here.
----------
nosy: +wolma
_______________________________________
Valentine Sinitsyn added the comment:
Seems like mentioning string was really a bad idea. They were only used as (poor) example, forget them if they are confusing in any way.
In my understanding, any sequence in Python is iterable, but not all iterables are sequences (correct me if I'm wrong). So the purpose of my suggestion is to say explicitly that sum() accepts iterables. In its current form, the docstring makes it seem like it doesn't, which is why I considered it [subtly] wrong.
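The sequence/iterable relationship described above can be checked with the abstract base classes in collections.abc. This is a Python 3 sketch with illustrative variable names, not something taken from the issue:

```python
from collections.abc import Iterable, Sequence

as_list = [1, 2, 3]
as_generator = (n for n in as_list)

# A list is both a sequence and an iterable.
list_is_sequence = isinstance(as_list, Sequence)
list_is_iterable = isinstance(as_list, Iterable)

# A generator is iterable but not a sequence (no indexing, no len()).
gen_is_iterable = isinstance(as_generator, Iterable)
gen_is_sequence = isinstance(as_generator, Sequence)

# sum() only needs iteration, so the generator is accepted.
total = sum(as_generator)
```

The generator fails the Sequence check yet sums without trouble, which is the gap between the wording "sequence" and what sum() actually requires.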
----------
_______________________________________
Serhiy Storchaka added the comment:
Raymond, could you open a pull request?
----------
keywords: +easy
nosy: +serhiy.storchaka
priority: normal -> low
status: open -> pending
_______________________________________
Raymond Hettinger added the comment:
[Serhiy]
> Raymond, could you open a pull request?
Perhaps you could do it for me. I still haven't had time to wrestle with the github switchover, so I'm effectively crippled for a while.
[Valentine]
> Seems like mentioning string was really a bad idea .... that's why I considered the docstring [subtly] wrong.
Not really wrong in a way that confuses typical users. That docstring has been successfully communicating the basic API for over a decade.
Over time, the docs have slowly converted the old "sequence" references to "iterable". The docs were never really wrong; instead, we just got more precise about what we meant by sequence versus iterable (i.e. before the ABCs were introduced, the term "sequence" was used in a somewhat generic way to mean "a succession of data values").
Also note, it is an interesting paradox that the docstrings that are the most helpful to most people most of the time are brief and a little loose with terminology. In general, they reward those who are doing quick lookups for API reminders, but do not reward pedantic close readings.
We'll go ahead and change "sequence" to "iterable" for sum(), but I think that is only a minor win. The change makes it more technically correct but less friendly to some users (i.e. people need to be taught what "iterable" means while they tend to get the notion of "sequence of values" without any training).
As far as the exclusion of strings goes, there was plenty of debate about whether to allow them or to more broadly disallow many data types where summing works quadratically. The final decision was made by the BDFL and it seems to have been the right decision for just about everyone. You can take issue with his decision, but that would be pointless.
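To make the "summing works quadratically" point concrete, here is a Python 3 sketch using lists, which sum() still accepts. The names are illustrative, and itertools.chain is shown as one common linear-time alternative, not as something proposed in this issue:

```python
from itertools import chain

chunks = [[1, 2], [3], [4, 5]]

# Allowed, but every + builds a brand-new list, so the total amount of
# copying grows quadratically with the number of chunks.
flat_with_sum = sum(chunks, [])

# Linear-time alternative: chain the chunks and build one list at the end.
flat_with_chain = list(chain.from_iterable(chunks))

# Strings are rejected outright; str.join is the suggested replacement.
words = ['a', 'b', 'c']
try:
    sum(words, '')
except TypeError:
    str_rejected = True
else:
    str_rejected = False
joined = ''.join(words)
```

Both flattening approaches produce the same list; the difference only matters as the input grows.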
----------
nosy: +rhettinger
status: pending -> open
_______________________________________
Changes by Mariatta Wijaya
Raymond Hettinger added the comment:
I believe this is just a 2.7 issue.
----------
versions: -Python 3.5, Python 3.6, Python 3.7
_______________________________________
Changes by Mariatta Wijaya
Mariatta Wijaya added the comment:
New changeset 536209ef92f16ea8823209a3c4b8763c0ec5d4bc by Mariatta in branch '2.7':
bpo-23787: Change sum() docstring from sequence to iterable (GH-1859)
https://github.com/python/cpython/commit/536209ef92f16ea8823209a3c4b8763c0ec...
----------
_______________________________________
Mariatta Wijaya added the comment:
Raymond's patch has been applied to 2.7 branch.
Thanks :)
----------
resolution: -> fixed
stage: patch review -> resolved
status: open -> closed
_______________________________________
participants (6)
- Mariatta Wijaya
- R. David Murray
- Raymond Hettinger
- Serhiy Storchaka
- Valentine Sinitsyn
- Wolfgang Maier