[Matplotlib-devel] Proposal to amend the PR merge rules

Antony Lee antony.lee at institutoptique.fr
Wed Jan 16 06:10:46 EST 2019


Replying to Nelle's points first:


> It seems to me that this "uncontroversial but uninteresting" rule is
> currently not based on objective criteria. I'm personally not in favor of
> removing two core contributors approval for anything that modifies the
> API.  We have had too many close calls on PR almost merged that had
> backward incompatible changes: I realize that some core developers believe
> that we are not moving fast enough, on a project with hundreds of thousands
> of users, changes to the API should not be considered lightly.
>

To be clear (and even though it is probably common knowledge that I would
like to evolve the API faster), there is no "secret agenda" of using this
to sneak API breaks through the system.


> Here's an attempt at better defining an "uncontroversial but
> uninteresting" rule based on objective criteria:
> - Has not been commented on by more than one core developer;
>

Sounds reasonable (with a reasonably flexible interpretation -- similarly
to the "no activity since two weeks, excluding trivial chat"), basically
"no other core dev expressed interest in reviewing it".


> - Doesn't change the public API;
> - Doesn't add or remove a functionality;
> - Doesn't add or remove a dependency;
>

I expect all changes of API to go through the normal route of deprecation
and API change note, which even the first reviewer should make sure are
present before sponsoring the PR for single-review-merge (as an aside, I
guess the first reviewer could even say, "approving, but I don't want to
take the responsibility for being the single reviewer"...).  So now you
have a two week period where any other core dev passing by can just see
whether there is an API change note and make their own judgment, including
deciding to block the PR if necessary, based on that change note.
I guess it is reasonable to say "API deprecations or changes and new
functionality should be clearly documented" to help that "skim-through"
review.
As for *adding* functionality, I guess you would have rejected the ttc-font
PR on that premise?  (Actually I now see that the additional functionality
was actually not documented :))  Note that this is not some obscure edge
case that no one cares about; it's a feature that has been requested for
years on the tracker.

- Is fully tested;
>

Also proposed by Jody and now by David, sounds good to me.


> - All parameters are fully documented;
>

(sounds contradictory with "doesn't add a functionality"... if there is no
new functionality it's unlikely there's a new parameter)


> - Doesn't fix a release-critical bug;
>

These tend to all be merged just before the next release is cut anyways
(review dynamics are a bit different in these periods), so I wouldn't worry
about it, but sure.


> - Is less than XX lines of code. (Long PR are harder to review, and thus
> the likelihood of one core dev not noticing a problem is higher). Threshold
> needs to be defined.
>

I don't actually think this is a good rule; the pdf/ps merge PR is +138
lines, -182 lines but as explained above the change is fairly trivial;
obviously I could have split it into smaller chunks but if anything that
would have made it harder to review.
Conversely (to use an example I have already raised elsewhere)
https://github.com/matplotlib/matplotlib/pull/11530 (making sure that
LocationEvent.x and LocationEvent.y are integers (as they refer to pixel
locations) rather than floats) is 2 lines of changes in the codebase and 20
lines of test, was reviewed by 2.5 core devs (I left a comment in passing)
and we all missed that there was "problem" with it (it broke mplcursors and
some other pieces of code I have that assume that it is possible to
synthetize LocationEvents pointing to arbitrary places in the canvas).
(Not to derail this discussion into how to evaluate API breaks once
again... but I think these kinds of changes are much more insidious than,
say, removing a function, as the latter would just trigger an
AttributeError that is easily noticed and at worst you just go back to the
previous Matplotlib release, whereas the LocationEvent change was only
caught because it was covered by the mplcursors test suite; otherwise I'd
have been silently generating wrong results due to it...)


> To be honest, it also doesn't help that some people have 50+ pull requests
> open (Antony? :p).


I guess if the way to fix Matplotlib is to not contribute to Matplotlib
anymore, we have a bigger problem.  In fact, I have some pretty big patches
fixing long-standing issues (e.g. for #8550 I have a pure-Python
reimplementation of the fontconfig font weight deduction algorithm,
translated from fontconfig's C implementation) that I don't even bother
turning into PRs, as I know they have no chance of being reviewed and
merged under the current system at least, because no one would care (see
also the bus-factor point below).


> It also doesn't help that some PR contain the same code (13117, 13128) and
> are being hold off until the other one is merged (also, why is 13117 being
> hold off for a PR that has been opened afterwards?). IMO, if a PR can be
> merged, it should be merged.


That's discussed in
https://github.com/matplotlib/matplotlib/pull/13117#discussion_r245503693
(FTR I actually agreed on merging 13117 directly, but apparently that's an
API break...)


> But we shouldn't make too many exceptions to the two core contributors
> rules. It's easier to have two core contributor review a pull request, than
> to try to fix API issues on code that has been merged and released. Having
> two pairs of eyes on a PR is also a good way to avoid having a bus factor
> of 1 on some areas of the library.


There are already huge and central chunks of the library (Agg internals,
PDF/PS) that have a bus factor close to zero.  If anything, I hope that
this "drive-by review" system will encourage reviewers to start looking at
places they are less familiar with ("oh yes, I guess this change makes
sense" -- see Eric's reply above) and decrease that.

And then to David's (the code-coverage point is already covered above):

I think it is good remember that (as far as I know) everyone submitting PRs
> and everyone reviewing them isn't paid and does so in their spare time;
> therefore is no obligation for anyone to review anything on any particular
> timescale. One thing that hasn't been discussed is how to increase the
> reviewing capacity of the community, which I guess would involve working
> out how to attract and retain more regular contributors.


Given the premises "no one is paid to review PRs" and "the reviewing
capacity is too small" (which I agree with), I think there are two
reasonable solutions:
- Make PRs easier to merge (this proposal),
- Increasing to number of core devs (your proposal, but it's not clear to
me whether this number can increase significantly in a short period of
time...).

Antony

On Wed, Jan 16, 2019 at 10:34 AM David Stansby <dstansby at gmail.com> wrote:

> I think it is good remember that (as far as I know) everyone submitting
> PRs and everyone reviewing them isn't paid and does so in their spare time;
> therefore is no obligation for anyone to review anything on any particular
> timescale. One thing that hasn't been discussed is how to increase the
> reviewing capacity of the community, which I guess would involve working
> out how to attract and retain more regular contributors.
>
> re. the proposed changes, I'm +1 modulo objective criteria being agreed
> upon. Personally I agree with Nelle's list, with the addition "has 100%
> code coverage".
>
> David
>
>
>
> On Tue, 15 Jan 2019, 23:46 Nelle Varoquaux <nelle.varoquaux at gmail.com
> wrote:
>
>> For the way GitHub sorts pull requests, it is indeed a problem, but
>> there's a way to sort by "most recently updated." I sometimes use it to
>> make sure that I review recently updated pull request that don't appear on
>> the first page.
>>
>> It seems to me that this "uncontroversial but uninteresting" rule is
>> currently not based on objective criteria. I'm personally not in favor of
>> removing two core contributors approval for anything that modifies the
>> API.  We have had too many close calls on PR almost merged that had
>> backward incompatible changes: I realize that some core developers believe
>> that we are not moving fast enough, on a project with hundreds of thousands
>> of users, changes to the API should not be considered lightly.
>>
>> Here's an attempt at better defining an "uncontroversial but
>> uninteresting" rule based on objective criteria:
>> - Has not been commented on by more than one core developer;
>> - Doesn't change the *public* API;
>> - Doesn't add or remove a functionality;
>> - Doesn't add or remove a dependency;
>> - Is fully tested;
>> - All parameters are fully documented;
>> - Doesn't fix a release-critical bug;
>> - Is less than XX lines of code. (Long PR are harder to review, and thus
>> the likelihood of one core dev not noticing a problem is higher). Threshold
>> needs to be defined.
>>
>> To be honest, it also doesn't help that some people have 50+ pull
>> requests open (Antony? :p). It also doesn't help that some PR contain the
>> same code (13117, 13128) and are being hold off until the other one is
>> merged (also, why is 13117 being hold off for a PR that has been opened
>> afterwards?). IMO, if a PR can be merged, it should be merged. But we
>> shouldn't make too many exceptions to the two core contributors rules. It's
>> easier to have two core contributor review a pull request, than to try to
>> fix API issues on code that has been merged and released. Having two pairs
>> of eyes on a PR is also a good way to avoid having a bus factor of 1 on
>> some areas of the library.
>>
>>
>> On Tue, 15 Jan 2019 at 15:19, Thomas Caswell <tcaswell at gmail.com> wrote:
>>
>>> I think a distinction between "no one cares" and "no one has bandwidth"
>>> needs to be made.
>>>
>>> If you look at the 'pulse' page, we merging 100+ PRs a month (
>>> https://github.com/matplotlib/matplotlib/pulse/monthly) so reviews are
>>> happening, the issue is just that some are falling through the cracks.
>>>
>>> Tom
>>>
>>> On Tue, Jan 15, 2019 at 4:45 PM Antony Lee <
>>> antony.lee at institutoptique.fr> wrote:
>>>
>>>> Eric,
>>>> Neither the ttc font, nor the pdf/ps common code PRs actually touch any
>>>> "complex" point regarding fonts or pdf/ps.  I have described the ttc font
>>>> in my previous message; the pdf/ps PR is really just, look, these two
>>>> classes share 80 lines worth of code that's literally copy-pasted, let's
>>>> just put that in a common parent class and inherit from it.
>>>> Conversely, one font-related PR for which I actually have low hopes of
>>>> getting properly reviewed (except on an "sure, I trust you" basis) is
>>>> https://github.com/matplotlib/matplotlib/pull/12928, which actually
>>>> touches a rather complex point about font encodings (some of which I had to
>>>> clarify on the FreeType mailing list).  (If anyone wants to have a look at
>>>> it, though... :p)
>>>> Antony
>>>>
>>>> On Tue, Jan 15, 2019 at 10:36 PM Eric Firing <efiring at hawaii.edu>
>>>> wrote:
>>>>
>>>>> Antony,
>>>>>
>>>>> Your examples illustrate that the problem is often that we don't have
>>>>> enough people who understand some areas, like fonts and pdf/ps file
>>>>> formats, to get 2 thorough and knowledgeable reviews in a reasonable
>>>>> time.
>>>>>
>>>>> I'm fine with your proposed rule.  It should help at least a little
>>>>> bit.
>>>>>
>>>>> Eric
>>>>>
>>>>>
>>>>> On 2019/01/15 9:11 AM, Antony Lee wrote:
>>>>> > I'm obviously more aware of my own PRs, so here are a few I'd have
>>>>> put
>>>>> > up for single-review-merge (note that it's quite possible that
>>>>> they'd
>>>>> > have attracted more attention and gotten merged faster under that
>>>>> > system; that's also the goal...).  Obviously other devs may think
>>>>> about
>>>>> > other PRs that could have benefitted from the same process.
>>>>> >
>>>>> > - https://github.com/matplotlib/matplotlib/pull/9787 adds support
>>>>> for
>>>>> > the ttc font format (requested since 2014) for png/svg output
>>>>> (that's
>>>>> > provided basically for free by FreeType) but not for pdf/ps (that
>>>>> would
>>>>> > require more difficult changes).  The entire PR comes down to:
>>>>> >    * adding "ttc" as an extension that should be treated like
>>>>> "ttf"/"otf",
>>>>> >    * adding error handling to pdf/ps to signal these formats are not
>>>>> > supported there,
>>>>> >    * tests.
>>>>> >    The PR sat open for 10 months before a first positive review
>>>>> (because
>>>>> > no one cares about fonts, I guess) and took another 3 months to
>>>>> attract
>>>>> > a second positive review.
>>>>> >
>>>>> > - https://github.com/matplotlib/matplotlib/pull/12472 does a fairly
>>>>> > trivial fix to fontList.json to make it reusable for Matplotlibs
>>>>> > installed in different venvs, got a positive review in one day and
>>>>> > waited another 2.5 months for a second positive review.
>>>>> >
>>>>> > - https://github.com/matplotlib/matplotlib/pull/9867 deduplicates
>>>>> > completely duplicated code between the pdf and ps backends, mostly
>>>>> for
>>>>> > maintainability's sake; it took 7.5 months (and 5 rebases due to
>>>>> > conflicts) to get a first positive review and another 6 (and around
>>>>> as
>>>>> > many rebases) to get a second one.
>>>>> >
>>>>> > -----
>>>>> >
>>>>> > As for the specifics (as I see it):
>>>>> > - Someone opens a PR, it gets a positive review, no negative review,
>>>>> and
>>>>> > no activity occurs for two weeks (significant activity -- excluding
>>>>> > trivial chat on the thread).
>>>>> > - Sponsor (first reviewer, or author if a committer) *pings* all
>>>>> devs by
>>>>> > mentioning @matplotlib/developers on the PR thread with the intent
>>>>> to
>>>>> > single-review-merge, and labels the PR accordingly.
>>>>> > - The PR can still be reviewed/merged/rejected by the normal review
>>>>> > process.  However, if no one explicitly opposes the merge within two
>>>>> > weeks, anyone can merge it at that point (including by self-merge).
>>>>> >
>>>>> > I would not exclude all API changes from the process.  For example,
>>>>> >
>>>>> https://github.com/matplotlib/matplotlib/pull/13173/commits/e90d264f3e5a263ef57243a2af86b46ab74ccc16
>>>>> > deprecates (and prepares for deletion) two parameters to imshow()
>>>>> that
>>>>> > have had no effect since 2006.  Let's pretend for a second this
>>>>> commit
>>>>> > was a single independent PR (and not an example to showcase the
>>>>> > parameter-deprecating decorator...); if it started being forgotten I
>>>>> > would have proposed it for single-review-merge (again, note that if
>>>>> > you're worried about the change, you don't need to reject the PR;
>>>>> you
>>>>> > can just say, I think this API change needs to be approved by a
>>>>> second
>>>>> > reviewer and that'll block the single-review-merge just as well).
>>>>> >
>>>>> > Antony
>>>>> >
>>>>> > On Tue, Jan 15, 2019 at 7:46 PM Nelle Varoquaux
>>>>> > <nelle.varoquaux at gmail.com <mailto:nelle.varoquaux at gmail.com>>
>>>>> wrote:
>>>>> >
>>>>> >     The proposed scheme by Paul doesn't seem reasonable to me. Core
>>>>> >     contributor A or committers needs to actively reach out to other
>>>>> >     core contributors: labeling is not enough IMO.
>>>>> >
>>>>> >
>>>>> >     On Tue, 15 Jan 2019 at 10:37, Jody Klymak <jklymak at uvic.ca
>>>>> >     <mailto:jklymak at uvic.ca>> wrote:
>>>>> >
>>>>> >
>>>>> >
>>>>> >>         On 15 Jan 2019, at 10:20, Paul Hobson <pmhobson at gmail.com
>>>>> >>         <mailto:pmhobson at gmail.com>> wrote:
>>>>> >>
>>>>> >>         I think this would be a good policy as well. Can I get some
>>>>> >>         clarification on the flow? The way I understand it:
>>>>> >>
>>>>> >>         1) Someone (committer, contributor, or first-timer) opens a
>>>>> >>         simple PR
>>>>> >>         2) Committer A reviews it, adds a "single-review-merge"
>>>>> label,
>>>>> >>         then digitally walks away
>>>>> >>         3) No less than two weeks later, Committer B sees the PR,
>>>>> >>         notices the label, opens it up, and merges without
>>>>> reviewing.
>>>>> >
>>>>> >         If there is a Committer B, I think they should at least look
>>>>> at
>>>>> >         the PR to make sure it doesn’t have any hidden API changes,
>>>>> or
>>>>> >         introduce anything hostile.
>>>>> >
>>>>> >         The point here is that getting the third person to look at a
>>>>> PR
>>>>> >         is proving quite difficult, and if the rest of the folks with
>>>>> >         the commit bit haven’t looked at a PR for a month, even after
>>>>> >         being warned, then silence indicates consent.
>>>>> >
>>>>> >         Note that only folks with the commit bit can label PRs, so
>>>>> >         anyone flagging a PR like this is implicitly trusted to not
>>>>> be
>>>>> >         wreaking havoc, and not doing this for major changes.  Since
>>>>> we
>>>>> >         all curate our own PRs, I rather expect that the PR
>>>>> contributor
>>>>> >         will often be the person who adds the flag.  Whether we want
>>>>> to
>>>>> >         allow a self-merge at that point is up for debate, but if
>>>>> not,
>>>>> >         then folks need to get in the habit of looking at old flagged
>>>>> >         PRs and merging them.
>>>>> >
>>>>> >         I think a huge problem we have is the LIFO default sort of
>>>>> the
>>>>> >         github PR queue, and that any PR not on the first page might
>>>>> as
>>>>> >         well be closed for all the attention it will get without
>>>>> >         constant nagging.
>>>>> >
>>>>> >         Cheers,   Jody
>>>>> >
>>>>> >>         Does that capture a successful "single-review-merge"
>>>>> lifecycle?
>>>>> >>         -paul
>>>>> >>
>>>>> >>         On Tue, Jan 15, 2019 at 9:57 AM Nelle Varoquaux
>>>>> >>         <nelle.varoquaux at gmail.com <mailto:
>>>>> nelle.varoquaux at gmail.com>>
>>>>> >>         wrote:
>>>>> >>
>>>>> >>             Hello,
>>>>> >>
>>>>> >>             Overall, I think this is a good idea, but I would like
>>>>> >>             specifics on what "uncontroversial but uninteresting" PR
>>>>> >>             mean. IMO, that should exclude any PR with API changes.
>>>>> >>
>>>>> >>             Cheers,
>>>>> >>             N
>>>>> >>
>>>>> >>             On Tue, 15 Jan 2019 at 06:00, Antony Lee
>>>>> >>             <anntzer.lee at gmail.com <mailto:anntzer.lee at gmail.com>>
>>>>> wrote:
>>>>> >>
>>>>> >>                 Hi all,
>>>>> >>
>>>>> >>                 During the weekly dev call, I proposed to amend the
>>>>> PR
>>>>> >>                 merge rules, following an initial comment on Github
>>>>> >>                 [
>>>>> https://github.com/matplotlib/matplotlib/pull/13173#issuecomment-453921220].
>>>>>
>>>>> >>                 There was general agreement among the devs present
>>>>> >>                 (Tom, Hannah, Jody, and myself), so I'm putting it
>>>>> >>                 here for discussion.  The objective of this change
>>>>> is
>>>>> >>                 to prevent "uncontroversial but uninteresting" PRs
>>>>> >>                 from falling into oblivion, and try to decrease the
>>>>> >>                 size of the open PR stack.
>>>>> >>
>>>>> >>                 The current rule is that a PR needs positive reviews
>>>>> >>                 (from two committers, and excluding the author if a
>>>>> >>                 committer) to be merged, except that doc-only PRs
>>>>> >>                 (docstrings, rst) only needs a single positive
>>>>> review.
>>>>> >>
>>>>> >>                 I am proposing that, if a (non-doc) PR already has
>>>>> one
>>>>> >>                 positive review, but no activity on that PR has
>>>>> >>                 occurred for two weeks (exact time interval up to
>>>>> >>                 bikeshedding) [Jody suggests: and the PR has 100%
>>>>> code
>>>>> >>                 coverage], then a committer (either the first
>>>>> >>                 reviewer, or the author if a committer) can suggest
>>>>> >>                 that it be merged on the basis of that single
>>>>> review.
>>>>> >>                 To do so, the "sponsor" should ping all developers
>>>>> >>                 (@matplotlib/developers) on that issue indicating
>>>>> that
>>>>> >>                 intent, and add a "single-review-merge" label on the
>>>>> >>                 PR (so that these PRs can easily be found).
>>>>> >>                 Committers are encouraged to review the PR to accept
>>>>> >>                 and merge it or reject it or request changes on it;
>>>>> >>                 but they can also just indicate that they consider
>>>>> the
>>>>> >>                 PR sufficiently complex that a proper second review
>>>>> is
>>>>> >>                 needed before merging, or request an extension,
>>>>> etc.
>>>>> >>                 To do so they should still leave a "reject" review,
>>>>> >>                 even if just saying "objecting to single-review
>>>>> >>                 merged; anyone can dismiss after a second review".
>>>>> >>                 However, if within another two weeks, no committer
>>>>> >>                 voiced any objection (explicitly, i.e. by
>>>>> rejecting),
>>>>> >>                 then the PR can indeed be merged on the basis of
>>>>> that
>>>>> >>                 single review.
>>>>> >>
>>>>> >>                 To avoid overwhelming the system, any committer can
>>>>> >>                 only "sponsor" a single PR at a time.
>>>>> >>
>>>>> >>                 Thoughts?
>>>>> >>
>>>>> >>                 Antony
>>>>> >>                 _______________________________________________
>>>>> >>                 Matplotlib-devel mailing list
>>>>> >>                 Matplotlib-devel at python.org
>>>>> >>                 <mailto:Matplotlib-devel at python.org>
>>>>> >>
>>>>> https://mail.python.org/mailman/listinfo/matplotlib-devel
>>>>> >>
>>>>> >>             _______________________________________________
>>>>> >>             Matplotlib-devel mailing list
>>>>> >>             Matplotlib-devel at python.org
>>>>> >>             <mailto:Matplotlib-devel at python.org>
>>>>> >>
>>>>> https://mail.python.org/mailman/listinfo/matplotlib-devel
>>>>> >>
>>>>> >>         _______________________________________________
>>>>> >>         Matplotlib-devel mailing list
>>>>> >>         Matplotlib-devel at python.org <mailto:
>>>>> Matplotlib-devel at python.org>
>>>>> >>         https://mail.python.org/mailman/listinfo/matplotlib-devel
>>>>> >
>>>>> >         --
>>>>> >         Jody Klymak
>>>>> >         http://web.uvic.ca/~jklymak/
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >     _______________________________________________
>>>>> >     Matplotlib-devel mailing list
>>>>> >     Matplotlib-devel at python.org <mailto:Matplotlib-devel at python.org>
>>>>> >     https://mail.python.org/mailman/listinfo/matplotlib-devel
>>>>> >
>>>>> >
>>>>> > _______________________________________________
>>>>> > Matplotlib-devel mailing list
>>>>> > Matplotlib-devel at python.org
>>>>> > https://mail.python.org/mailman/listinfo/matplotlib-devel
>>>>> >
>>>>>
>>>>> _______________________________________________
>>>>> Matplotlib-devel mailing list
>>>>> Matplotlib-devel at python.org
>>>>> https://mail.python.org/mailman/listinfo/matplotlib-devel
>>>>>
>>>> _______________________________________________
>>>> Matplotlib-devel mailing list
>>>> Matplotlib-devel at python.org
>>>> https://mail.python.org/mailman/listinfo/matplotlib-devel
>>>>
>>>
>>>
>>> --
>>> Thomas Caswell
>>> tcaswell at gmail.com
>>> _______________________________________________
>>> Matplotlib-devel mailing list
>>> Matplotlib-devel at python.org
>>> https://mail.python.org/mailman/listinfo/matplotlib-devel
>>>
>> _______________________________________________
>> Matplotlib-devel mailing list
>> Matplotlib-devel at python.org
>> https://mail.python.org/mailman/listinfo/matplotlib-devel
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/matplotlib-devel/attachments/20190116/410273fe/attachment-0001.html>


More information about the Matplotlib-devel mailing list