Social problems of Python doc [was Re: Python docs disappointing]
python at rcn.com
Wed Aug 12 20:27:42 CEST 2009
On Aug 12, 3:32 am, Paul Boddie <p... at boddie.org.uk> wrote:
> Maybe the problem is that although everyone welcomes contributions and
> changes (or says that they do), the mechanisms remain largely beyond
FWIW, I support the idea of the regular docs incorporating links to
freely editable wiki pages. That will at least make it easier for
people to make changes or add notes.
That being said, I would like to add a few thoughts about the
current process. ISTM that important corrections (when the
docs are clearly in error) tend to get made right away. What
is more interesting are the doc requests that get rejected:
* Many doc requests come from people just learning the language
(that makes sense because the learning process involves reading
the docs). Unfortunately, a fair number of those requests are
flat-out wrong or represent a profound misunderstanding of the
feature in question. That may be an indicator that the docs
need to be improved, but the specific suggestion can be inane.
* Some doc requests come from people who simply do not like the
feature in question. It is natural for tastes, styles, and
preferences to vary; however, we do have a firm rule that Guido's
tastes, styles, and preferences are the ones that go into the
language. So the doc writers need to try to channel Guido instead
of fighting him. So, if you think eval() is evil (I don't but many
do), we're not going to document that eval() should *never* be used.
If you hate super(), that's too bad -- the docs need to describe
what it does and how it was intended to be used -- the docs are no
place for diatribes on why inheritance is over-used and why you
think the world would be a better place without mixins or
* Then, there is a matter of where to put a particular piece of
documentation (how many times do you repeat a concept that pervades
the language). Hashing is a good example. The docs can discuss how
some objects hash to their object id and that object ids can change
from run-to-run, but if someone gets tripped-up by the idea (hey,
my set reprs changed between runs, wtf!), they want the docs updated
in the specific place that tripped them up (i.e. you must put big
red warnings in the set documentation, and the dict documentation,
and everywhere else a hash gets used ...). The core problem is that
the docs are interrelated -- the one place you're looking for
documentation of a specific builtin or function can't contain
every concept in the language.
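
To make the hashing point concrete, here is a minimal sketch (the Token
class and variable names are made up for illustration). It only shows that
a plain class inherits identity-based hashing and that code should not lean
on set iteration order; the exact hash values and ordering are
implementation details.

```python
class Token:
    """Hypothetical plain class: inherits identity-based __eq__/__hash__ from object."""

    def __init__(self, name):
        self.name = name


a, b = Token("a"), Token("b")
tokens = {a, b}

# Set equality and membership do not depend on iteration order.
same_members = tokens == {b, a}

# The iteration order (and hence the repr) of a set is not something to
# rely on. For reproducible output, sort by an explicit key instead.
stable_names = [t.name for t in sorted(tokens, key=lambda t: t.name)]
```

Sorting by an explicit key is one way to get output that looks the same on
every run, whatever the underlying hashes happen to be.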
* Some behaviors are intentionally left unspecified. For the longest
time, Tim did not want to guarantee sort stability. This left him
free to continue to search for algorithmic improvements that did not
have stability. Later, the property was deemed so important that it
did become a guaranteed behavior. Also, some things are unspecified
to make things easier for other implementations (IronPython, PyPy,
Jython, etc.). We need to make sure that someone doesn't casually
go through the docs making doc promises that are hard to keep.
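
For readers who have not met the term, this is roughly what the stability
guarantee buys you. It is a small sketch with made-up data, not text from
the docs: records that compare equal under the sort key keep their original
relative order.

```python
# Hypothetical (name, rank) records, for illustration only.
records = [("bob", 2), ("alice", 1), ("carol", 2), ("dave", 1)]

# Sort on rank alone. Because the sort is stable, ties keep their input
# order: alice stays ahead of dave, and bob stays ahead of carol.
by_rank = sorted(records, key=lambda r: r[1])
```

Once that behavior is promised in the docs, every implementation has to
honor it, which is why such promises should not be added casually.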
* Some things are just plain difficult to fully explain in English
and not misrepresent the actual behavior. For example, the str.split
docs have been continuously tweaked over the years. Even now, I think
there are corner cases that are not fully captured by the docs.
Edits to str.split() docs are more likely than not to take them
away from the truth.
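
A few of the cases that make str.split hard to describe in one sentence are
sketched below. The variable names are just labels for this example; the
behavior shown is that of the builtin str.split.

```python
# With no separator, runs of whitespace count as one delimiter and
# leading/trailing whitespace is dropped.
no_sep = "a  b".split()

# With an explicit separator, every occurrence delimits, so adjacent
# separators produce empty strings.
explicit_sep = "a  b".split(" ")

# The empty string behaves differently in the two modes.
empty_no_sep = "".split()
empty_with_sep = "".split(",")

# With maxsplit, the remainder is returned as-is, trailing whitespace included.
limited = " a b ".split(None, 1)
```

A one-line summary such as "splits on whitespace" is true of the first case
and misleading for the others.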
* Then, there is the problem of talking too much. A book could be
written about super(), but that shouldn't all go into the docs for
the super builtin. Beginners often want to master all the builtins
and they try to read the doc page on builtin functions. It used to be
that you could read through the builtin descriptions in a few minutes.
Now, it takes a concerted effort to get through. It is hard to take
a sip of water from a firehose. Too much information can make a
function harder to understand.
* My biggest pet peeve is requests to fill the docs with big red
warnings. I do not want the docs to look like a minefield. Warnings
should be reserved for a handful of security or data corruption risks.
For the most part, the docs should be matter-of-fact, explaining what
a function or method does and how it was intended to be used.
Preferred: "The value string.letters is locale dependent"
Not preferred: "Caution, the string.letters value can be adversely
affected by the locale setting (it could even change length!); use
only when you are certain the locale setting will not violate any of
your program invariants; consider using a string literal instead; I
string.letters and think Guido was smoking crack when it was
* Another category of rejected doc requests comes from people looking
for absolution from one of their programming bugs. It typically takes the
form of, "I made an assumption that the language did X, but it did Y
and my program didn't do what I wanted; therefore, the docs must be
to blame and they must change ...". The suggestion is "I was
implies "the docs are hosed". The fact is that people with different
backgrounds are going to have different expectations and someone is
going to get "surprised". The docs need to say what functions do, but
they don't need to be changed every time someone writes a buggy
In short, most doc requests that get rejected are requests that didn't
actually improve the documentation.
I do support links from the regular docs to an external wiki but the
main docs should continue to go through the regular process using the