[Numpy-discussion] Documentation roles in the numpy/scipy documentation editor

Wed May 9 17:36:17 EDT 2012

We considered lowering the review standard near the end of my direct
involvement in the doc project but decided not to.  You didn't mention
any benefit to the proposed changes, so while I'm not active in the doc
project anymore, let me relate our decision.

It's often the case that docstrings get written fast, and it's usually
the case that they're written by a single person, who has a single
perspective.  We wanted to make docs that were professional, that could
be placed next to the manuals for IDL, Matlab, etc. without
embarrassment.  So, we set up a system similar to academic publishing.
Every docstring would be seen by two sets of critical eyes, and for
major X.0 releases we'd pay a proofreader to spend a few days to polish
off the English and get the style totally consistent.

At the same time, we needed to get something decent in every docstring
fast, so we made that the priority.  About the time we achieved that,
money ran out.  So, lots of docstrings are in "needs review" or even
"being edited" status.  But that doesn't mean money will never come
again.  Indeed, there are now several companies basing their services
around this software.  If someone does want to make the docs
professional, say for numpy 2.0 or 3.0 or whatever, or as part of a
larger system for sale, then they have a system in place that can do it.

The purpose of the review statuses is to identify how close a docstring
is to publishable.  However, there is no consequence to the statuses: a
docstring gets included in the release no matter its status.  But, you
do know which docstrings need what kind of work.  So, what's the benefit
of changing what the statuses mean, or eliminating them?  I think it may
only be that the writers feel better.  The users don't even see the
statuses as they're not listed in the release.

Tim felt that docs should be continually edited, not "finished".  I
agree, especially if the underlying routine or surrounding docs get
changed.  But the system is designed to encourage this!  Here's how:

Say most/all routines get genuine "proofed" status.  That's great, but
it's not the end of the line by any means.  If someone comes along and
edits a "proofed" docstring, that docstring then automatically "needs
review" once again, to ensure that a mistake was not inserted.  Now you
know what to look at when checking things over before a release (since
there can't be unit tests for docs).  From the history, you also know it
was once proofed, so reviewing and proofing it is very easy just by
looking at the diffs.
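
As a rough illustration of the rule just described, here is a minimal
Python sketch.  It is not the doc editor's actual code: the class name
`Docstring`, its method names, and the status strings are all made up
for the example.  Only the demotion from "proofed" to "needs review" on
edit, and the use of the history to see that a docstring was once
proofed, come from the description above.

```python
from dataclasses import dataclass, field

# Status labels are illustrative; the real doc editor may spell them differently.
NEEDS_REVIEW = "needs review"
PROOFED = "proofed"


@dataclass
class Docstring:
    """Hypothetical record of one docstring in a doc wiki."""
    name: str
    text: str
    status: str = NEEDS_REVIEW
    history: list = field(default_factory=list)  # (old_text, old_status) pairs

    def edit(self, new_text):
        """Store an edit; a proofed docstring drops back to needs review."""
        self.history.append((self.text, self.status))
        self.text = new_text
        if self.status == PROOFED:
            self.status = NEEDS_REVIEW

    def was_once(self, status):
        """Return True if the docstring held `status` at some earlier point."""
        return any(old_status == status for _, old_status in self.history)


doc = Docstring("example.routine", "Return the sum.", status=PROOFED)
doc.edit("Return the sum of the inputs.")
print(doc.status)             # needs review
print(doc.was_once(PROOFED))  # True
```

The history list is what makes re-proofing cheap in this sketch: a
reviewer only needs to compare the last proofed text with the current
one.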

So, the system encourages and accounts for continual edits while
allowing a professional product to be produced for a particular release.

The way to move forward is to declare that the goal is to get all docs
to some status, say "needs review" (that was our initial goal, and the
only one we achieved, more or less).  Then, go after the docs that don't
have that, like the new polynomial docs.  If someone wants to publish a
manual, the goal becomes "proofed", and there's more work to do.
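
To make "go after the docs that don't have that" concrete, here is a
small Python sketch that lists the docstrings still below a chosen
target status.  The ordering of the statuses, the function name
`below_target`, and the routine names are assumptions for illustration,
not something taken from the doc editor.

```python
# Assumed ordering from least to most finished; labels are illustrative.
STATUS_ORDER = [
    "needs editing",
    "being edited",
    "needs review",
    "reviewed",
    "proofed",
]


def below_target(statuses, target):
    """Return the sorted names whose status ranks below `target`."""
    rank = {status: i for i, status in enumerate(STATUS_ORDER)}
    return sorted(
        name for name, status in statuses.items()
        if rank[status] < rank[target]
    )


# Hypothetical routine names and statuses.
statuses = {
    "example.poly_fit": "needs editing",
    "example.add": "needs review",
    "example.mean": "proofed",
}

print(below_target(statuses, "needs review"))  # ['example.poly_fit']
print(below_target(statuses, "proofed"))       # ['example.add', 'example.poly_fit']
```

Raising the target from "needs review" to "proofed" simply lengthens
the to-do list, which is the "more work to do" mentioned above.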

It DOES make sense to give the reviewer role to more people.  Just make
sure they take care in their reviews, so the statuses continue to have
meaning.  Otherwise what's the point?

--jh--

On Mon, 7 May 2012 22:14:56, Ralf Gommers <ralf.gommers at googlemail.com> wrote:
On Mon, May 7, 2012 at 7:37 PM, Tim Cera <tim at cerazone.net> wrote:

>> I think we should change the roles established for the Numpy/Scipy
>> documentation editors because they do not work as intended.
>>
>> For reference they are described here:
>> http://docs.scipy.org/numpy/Front%20Page/
>>
>> Basically there aren't enough active people to support a split into the
>> roles as described, which has led to a backlog of 'Needs review'
>> docstrings and only one 'Proofed' docstring.  I think that many of these
>> docstrings are good enough; it's just that not enough people have put
>> themselves forward as knowledgeable enough about a certain topic to label
>> docstrings as 'Reviewed' or 'Proofed'.
>>
>You're right. I think at some point the goal shifted from getting
>everything to "proofed" to getting everything to "needs review".
>
>
>> Here are the current statistics for numpy docstrings:
>> Status                    Current %   Count
>> Needs editing                    17     279
>> Being written / Changed           4      62
>> Needs review                     76    1235
>> Needs review (revised)            2      35
>> Needs work (reviewed)             0       3
>> Reviewed (needs proof)            0       0
>> Proofed                           0       1
>> Unimportant                       ?    1793
>>
>The "needs editing" category actually contains mostly docstrings that are
>quite good, but were recently created and never edited in the doc wiki. The
>% keeps on growing. Bumping all polynomial docstrings up to "needs review"
>would be a good start here to make the % reflect the actual status.
>
>>
>> I have thought about some solutions in no particular order:
>>
>> * Get rid of the 'Reviewer' and 'Proofer' roles.
>> * Assign all 'Editors' the 'Reviewer' and 'Proofer' privileges.
>> * People start out as 'Editors', and then become 'Reviewers', and
>> 'Proofers' based on some editing metric.
>>
>> For full disclosure, I would be generous with a 'Reviewed' label if given
>> the authority because philosophically I think there should be a point where
>> the docstring is 'Good enough' and it should be expected to have a life of
>> continual small improvements rather than a point when it is 'Done'.
>>
>
>This makes sense to me.
>
>
>> Regardless of what decision is made, the single 'Proofed' docstring should
>> be available for editing.  I can't even find what it is.  I imagine that it
>> should be on the docstring page at http://docs.scipy.org/numpy/docs/
>>
>It used to be there - maybe the stats got confused.