I've been on the sidelines for a while myself for a number of reasons, but the shift to GitHub will pull me back in for sure, at least in terms of code review. I look forward to actually contributing code again soon, but easier review tooling (or rather, a shiny new tool, since I'm aware of Rietveld) is enough of a carrot to bring me back in.
On Sun, Jan 22, 2017 at 12:08 PM, Tal Einat email@example.com wrote:
Dormant core dev here. Not contributing at all due to severe lack of time in the past year and a half, not likely to have more time in the near future. Also no longer working with Python at all except as a hobby :(
I could pull off a review once a month if it would actually help!
On Sat, Jan 21, 2017 at 9:51 PM, Brett Cannon firstname.lastname@example.org wrote:
What I'm picking up from this is (as a gross oversimplification):
- Victor _wants_ code reviews
- Raymond thinks we _need_ code reviews
So the common theme here, regardless of whether you agree with Raymond's or Victor's approach to development, is that we are not getting enough code reviews to go around. To me, that is the systemic issue this email is bringing up.
Now I think most of us don't believe the solution to the lack of reviews is to lower our standard of what it takes to become a core developer (this doesn't mean we shouldn't do a better job of identifying potential candidates, just that we shouldn't give people commit privileges after a single patch like some projects do). To me that means we need to address why, out of 79 core developers, only 39 have made even a single commit over the past year, only 30/79 have more than 12 commits over that same time scale, only 15/79 have more than 52 commits, and just 2/79 have over 365 commits (https://github.com/python/cpython/graphs/contributors?from=2016-01-22&to... for the stats).
Some of you have said you're waiting for the GitHub migration before you start contributing again, which I can understand (I'm going to be sending an email with an update on that after this email to python-dev & core-workflow). But for those of you who haven't told me that, I don't know what it will take to get you involved again. For instance, not to pick on Andrew, but he hasn't committed anything recently, yet he obviously still cares about the project. So what would it take to get Andrew to help review patches again, so that the next time something involving random comes through he feels like taking a quick look?
As I have said before, the reason I took on the GitHub migration is for us core developers. I want our workflow to be as easy as possible so that we can be as productive as possible. But my unspoken long-term goal is to get to the point where even dormant core devs want to contribute again, where everyone reviews a patch a month, and where more people review a patch a week (although I'll take a patch a year to start). I want every person with commit privileges to take 30 minutes a month to help with reviews, and the majority of folks to take 30 minutes a week (please don't read this as a hard rule where privileges go away if you don't; view it as an aspirational goal). Even people who don't have time to review the kind of patches Victor is producing, which triggered this thread, can review documentation patches without deep knowledge and without much time. That way the people who do have time to review the bigger, more difficult patches can actually spend their time on those reviews rather than on patches fixing a spelling mistake or adding a new test to raise test coverage.
All of this is so that, I hope, one day we get to the point where all patches require a review, no matter who proposed the code change. I think we're quite a ways off from being there, but that's my moonshot goal for our workflow: enough quality reviews coming in that even patches from fellow core developers feel worth the extra code check and dissemination of knowledge, without feeling like a terrible drag on productivity.
Once the GitHub migration has occurred I'm planning to tackle our Misc/NEWS problem and then automate Misc/ACKS. After that, though, I hope we can take the time to have a hard look at what in our workflow prevents people from making even occasional code reviews so that everyone wants to help out again (and if any of this interests you then please subscribe to core-workflow).
On Fri, 20 Jan 2017 at 02:46 Victor Stinner email@example.com wrote:
Raymond Hettinger used a regression that I introduced in the builtin sorted() function (in Python 3.6.0) to give me his feedback on my FASTCALL work, but also on Argument Clinic.
Since the reported issues are wider than just FASTCALL, including how I contribute to CPython, I decided to discuss the topic with a wider audience. I'm continuing the discussion on python-committers to get the opinion of the other core developers.
Sorry for my very long answer! I tried to reply to each issue reported by Raymond.
Inaccurate summary: I'm a strong supporter of "it's better to ask forgiveness than permission", whereas Raymond considers that I introduced too many regressions with my workflow.
Raymond Hettinger added the comment:
A few random thoughts that may or may not be helpful:
- We now have two seasoned developers and one new core developer that
collectively are creating many non-trivial patches to core parts of Python at an unprecedented rate of change. The patches are coming in much faster than they can reasonably be reviewed and carefully considered, especially by devs such as myself who have very limited time available. IMO, taken as whole, these changes are destabilizing the language. Python is so successful and widely adopted that we can't afford a "shit happens" attitude. Perhaps that works in corners of the language, infrequently used modules, but it makes less sense when touching the critical paths that have had slow and careful evolution over 26 years.
- Besides the volume of patches, one other reason that reviews are hard
to come by is that they apply new APIs that I don't fully understand yet. There are perhaps two people on the planet who could currently give thoughtful, correct, and critical evaluation of all those patches. Everyone else is just watching them all fly by and hoping that something good is happening.
For one or maybe even two years now, I have noticed that many of my issues were blocked by the lack of reviews. As you wrote, only a few developers have the knowledge and background to provide a good review (not just "tests pass, so LGTM") of my changes to the Python core.
I also wanted to discuss this topic, but I didn't really know what to propose. Let's take this opportunity to explain how I contribute to CPython, especially how I decide to wait for a review or not.
For each patch that I write, I estimate the risk of regression. You may know that any regression is something unexpected, so such estimation is tricky. Here is my heuristic:
(*) If the patch is trivial (short, non-controversial), I push it immediately.
(*) If I'm less confident, I open an issue and attach the patch. I wait at least one day before pushing.
It's strange, but the process of opening an issue and attaching the patch usually helps to review the code myself (find bugs, or more generally enhance the patch). Maybe because it forces me to review the change one more time?
If the change is not part of a larger patch series, and so doesn't block me from moving further, I try to keep the issue open for around one week.
The truth is that too few of my patches get a review :-/ Maybe I should wait longer, but then it becomes harder for me to handle many patches.
Maybe it's a tooling issue. Recently, I started to use local branches in a Git repository. It helps a lot to work in parallel on large changes. Before, I only worked in a single directory (default/, the default Mercurial branch) and applied/reverted patches every time. That's painful, especially when I have to use multiple computers, download already-published patches again, etc. Maybe it will run more smoothly once CPython moves to Git and GitHub.
By the way, it's painful to squash a long patch series into a giant patch, which is much harder to review, where the changes don't make sense at all at first glance. Again, a better reviewing tool supporting patch series (GitHub) will help here too.
Not supporting patch series in our reviewing tool also explains why I prefer to push rather than wait for a review. Rebasing long patch series stored as giant .patch files by hand is complicated.
(*) If the patch changes an API or a core component, I wait for at least one review from a core reviewer. Sometimes, I even send an email to python-dev. Again, sometimes I don't get any feedback on the patch or the email after two weeks :-/ At least, I tried :-) Usually, I get feedback in less than one week, or no feedback at all. I understand that nobody understands my change or nobody cares :-)
I totally understand that most core developers have little time available to contribute to Python. I'm trying to find a compromise between the risk of introducing regressions and being stuck in my work. This email might help me adjust my workflow.
By the way, I'm trying to always run the full test suite (./python -m test -rW -j0) before pushing any change. If I suspect that I may have introduced reference leaks, I also run "./python -m test -R 3:3 ..." on the tests related to the modified code to check for memory/reference leaks.
- One other reason for the lack of review comments is the enthusiasm and
fervor surrounding the patches. I feel like there is a cost to questioning whether the patches should be done or how they are done, like I am burning a little karma every time. Sometimes it feels safest and most cordial to just say nothing and let you make hundreds of semi-reviewed changes to just about every critical part of the language.
"semi-reviewed". Let me be more accurate: yeah, I do push a lot of changes which were not reviewed by anyone (see above).
- Historically, if there was creator or maintainer of the code who was
still active, that person would always be consulted and have a final say on whether a change should be applied. Now, we have code constantly being changed without consulting the original author (for example, the recent and catastrophic random initialization bug was due to application of a patch without consulting the author of _randommodule.c and the maintainer of random.py, or this change to sorted(), or the changes to decimal, etc).
What do you mean by "author"? As you wrote, Python is now 26 years old, so it has a very long history, and each file has a very long list of "authors". I guess you mean more of a "maintainer".
My problem is that I'm not aware of any explicit list of maintainers. I didn't know that you were the maintainer of the random module before you told me so at the Facebook sprint last September. I didn't expect the random module to have a maintainer; I thought that any core developer was allowed to modify the code.
Moreover, since I open an issue for most of my changes, it gives an opportunity to maintainers to review changes. Maybe we need more components in the bug tracker to notify maintainers of pending changes?
You mentioned 3 different changes; let me reply to each.
(1) The random change: http://bugs.python.org/issue29085
I introduced a regression in random.Random.seed(): a typo in the C code meant that the current time and process identifier were used, instead of os.urandom(16), to initialize the Mersenne Twister RNG.
IMHO the regression is not "catastrophic". Only a few developers instantiate random.Random themselves, random.Random must not be used for security, etc. I'll let others decide whether this bug was catastrophic or not.
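To illustrate why the seeding source matters, here is a pure-Python sketch (the real bug was a typo in the C code of _randommodule.c, not this Python; the helper names are hypothetical). Seeding from predictable values like the clock and the PID makes the generator's output reproducible, while os.urandom() does not:

```python
import os
import random
import time

def weak_seed():
    # Predictable: derived from the current time and the process id,
    # roughly what the buggy fallback path amounted to.
    return int(time.time()) ^ os.getpid()

def strong_seed():
    # Unpredictable: 128 bits from the OS entropy pool, the intended
    # os.urandom(16) behavior.
    return int.from_bytes(os.urandom(16), "big")

# Two generators seeded with the same predictable value produce
# identical streams -- this is what made the regression a concern:
a = random.Random(12345)
b = random.Random(12345)
assert [a.random() for _ in range(3)] == [b.random() for _ in range(3)]
```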
Since we are talking about the development process, let me see how the change was made.
Context: PEP 524 has a long and painful history... Something like more than 500 messages were sent on the bug tracker and python-dev, nobody was listening to each other, and two security experts "rage-quit" Python because of this mess... I decided to try to fix the issue in a constructive way, so I wrote a PEP. Nick wrote a different PEP, since it was clear that security could be handled in two different, incompatible ways. A mailing list was even created just to discuss this bug! A mailing list just for one bug gives you an idea of the size of the mess :-)
Well, about the change itself, it was done in http://bugs.python.org/issue27776
The patch was available for review for 19 days (2016-08-18 to 2016-09-06) and was reviewed by Nick Coghlan. Since Nick wrote a similar PEP, I trusted him to be able to review my change. (Well, I already trust all core developers anyway, but I mean that I trusted him even more than usual :-))
Since the change has a big impact on security, I would have preferred to get reviews from more developers, especially our security experts... but as I wrote, two security experts "rage-quit". Again, this PEP has a long and sad story :-/
Note: you say that you are the maintainer of the random module, but I don't recall seeing you in any recent discussions and issues related to os.urandom(), whereas a lot of enhancements and changes were made over the last 2 years. I made many changes to support new OS functions like getentropy() and getrandom().
Oooookay, let's see the second change, "this change to sorted()", http://bugs.python.org/issue29327
(2) I introduced a bug in sorted(), last August: https://hg.python.org/cpython/rev/15eab21bf934/
Calling sorted() with iterable passed by keyword crashed. To be honest, I didn't imagine that anyone would pass the iterable by keyword, but Serhiy is very good at spotting bugs in corner cases :-)
IMHO the regression is subtle.
When I optimized the code to use FASTCALL, I replaced PyTuple_GetSlice(args, 1, argc) with &PyTuple_GET_ITEM(args, 1). All tests passed, so it looked OK to me.
I didn't imagine that anyone would call sorted(iterable=), so I didn't notice that PyTuple_GetSlice() can create an empty tuple.
The previous code was wrong since sorted() accepted iterable as a keyword, whereas list.sort() doesn't.
So, well, I'll let you guess whether a review would have spotted this bug in such a large change.
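The slice-versus-pointer difference can be modeled in pure Python (the real code is C; this only models the kind of boundary condition involved): slicing a tuple past its end yields a valid empty tuple, while indexing past the end is out of bounds, which is what the unchecked C equivalent ran into.

```python
# Model a call where nothing follows the first tuple slot.
args = ("dummy-first-item",)

# Old code, like PyTuple_GetSlice(args, 1, n): always a valid
# (possibly empty) tuple.
rest = args[1:]
assert rest == ()

# The FASTCALL rewrite instead took the address of item 1, like
# &PyTuple_GET_ITEM(args, 1). In Python that's args[1], which raises;
# in C there is no bounds check, hence the crash.
try:
    args[1]
    raised = False
except IndexError:
    raised = True
assert raised
```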
(3) Recently, I ran sed to replace code patterns with faster ways to call functions: https://hg.python.org/cpython/rev/54a89144ee1d
"Replace PyObject_CallObject(callable, NULL) with _PyObject_CallNoArg(callable)"
I recalled that I had modified the _decimal module and that Stefan Krah complained, because he wants to keep the same code base on Python 3.5, 3.6 and 3.7. He also mentioned an external test suite which was broken by recent _decimal changes (I'm not sure whether my specific change was the cause), but I wasn't aware of it.
To be honest, I didn't even notice that I had modified _decimal when I ran sed on all .c files. Since the change was straightforward and (IMHO) made the code more readable, I didn't wait for a review, if I recall correctly.
Stefan and I handled this issue privately (he reverted my change); I'm not sure it's worth saying more about this "issue" (or even "non-issue").
To be clear, I don't consider that my change introduced a regression.
- In general, Guido has been opposed to sweeping changes across the code
base for only tiny benefits. Of late, that rule seems to have been lost.
- The benefits of FASTCALL mainly apply to fine grained functions which
only do a little work and tend to be called frequently in loops. For functions such as sorted(), the calling overhead is dominated by the cost of actually doing the sort. For sorted(), FASTCALL is truly irrelevant and likely wasn't worth the complexity, or the actual bug, or any of the time we've now put in it. There was no actual problem being solved, just a desire to broadly apply new optimizations.
Ok, first, you qualify my FASTCALL changes as code churn. So let me show an example with sorted(): https://hg.python.org/cpython/rev/b34d2ef5c412
Can you elaborate how such change increases the complexity?
Second, "no actual problem being solved"
Since the goal of FASTCALL is to optimize Python, I guess you consider that the speedup doesn't justify the change. I gave numbers in issue #29327:
Microbenchmark on sorted() on Python 3.7 compared to 3.5 (before FASTCALL):
haypo@smithers$ ./python -m perf timeit 'seq=list(range(10))' 'sorted(seq)' --compare-to=../3.5/python -v
Median +- std dev: [3.5] 1.07 us +- 0.06 us -> [3.7] 958 ns +- 15 ns: 1.12x faster (-11%)

haypo@smithers$ ./python -m perf timeit 'seq=list(range(10)); k=lambda x:x' 'sorted(seq, key=k)' --compare-to=../3.5/python -v
Median +- std dev: [3.5] 3.34 us +- 0.07 us -> [3.7] 2.66 us +- 0.05 us: 1.26x faster (-21%)
IMHO such a speedup is significant even on a microbenchmark. Can you elaborate on your criteria for deciding whether an optimization is worth it?
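For anyone without the perf module installed, a stdlib-only sketch of a similar microbenchmark (absolute numbers will differ from the perf results above, and perf's repeated runs and statistics are more reliable than a single timeit call):

```python
import timeit

seq = list(range(10))

# Time sorted() on a small list, mirroring the first benchmark above.
t_plain = timeit.timeit("sorted(seq)", globals={"seq": seq}, number=100000)

# And with a key function, mirroring the second benchmark.
t_key = timeit.timeit(
    "sorted(seq, key=k)",
    globals={"seq": seq, "k": lambda x: x},
    number=100000,
)

print(f"plain: {t_plain:.3f}s  key: {t_key:.3f}s")
```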
- Historically, we've relied on core developers showing restraint. Not
every idea that pops into their head is immediately turned into a patch accompanied by pressure to apply it. Devs tended to restrict themselves to parts of the code they knew best through long and careful study, rather than sweeping through modules and altering other people's carefully crafted code.
Should I understand that I should restrict myself to some files? Or not touch some specific parts of Python, like... "your" code like random, itertools and collections modules?
I replied to the 3 issues you mentioned previously and explained how I contribute to Python.
- FWIW, I applaud your efforts to reduce call overhead -- that has long
been a sore spot for the language.
- Guido has long opposed optimizations that increase risk of bugs,
introduce complexity, or that affect long-term maintainability. In some places, it looks like FASTCALL is increasing the complexity (replacing something simple and well-understood with a wordier, more intricate API that I don't yet fully understand and will affect my ability to maintain the surrounding code).
I'm sorry, I haven't spent much time explaining the FASTCALL design or documenting my changes. It's partially deliberate that everything related to FASTCALL is private. Since it's a huge project modifying a lot of code, I wanted to wait until the APIs and the code stopped moving so fast before taking the time to explain and document my work.
If you have specific questions, please go ahead.
- FASTCALL replaces (args: tuple, kwargs: optional dict) with (args: C
array, nargs: int, kwnames: tuple of keyword names). It's a new calling convention which avoids a temporary tuple for positional arguments and a temporary dictionary for keyword arguments.
- To use FASTCALL, C functions should be converted to the new
METH_FASTCALL calling convention
- PyObject_Call() can be replaced with _PyObject_FastCallKeywords() or
_PyObject_FastCallDict() (when we still get kwargs as a dict) in such a conversion.
- Many existing C functions were optimized internally to use FASTCALL,
so even if you don't modify your code, you will benefit from the speedup. Typical example: PyObject_CallFunctionObjArgs().
The most massive changes were purely internal and don't affect the best-known C APIs at all. In some cases, to fully benefit from FASTCALL, code should be modified. I'm trying to restrict such changes to Python internals, especially the most used functions.
I expected the required changes to be straightforward enough; it looks like I was wrong. But I don't recall anyone, before you recently, asking for an explanation.
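The difference between the two conventions can be sketched as a pure-Python toy model (hypothetical helper names; the real implementation is C, and the bullets above are the authoritative description). The old convention materializes a tuple and a dict per call; the FASTCALL-style convention passes a flat array of values plus a tuple naming the trailing keyword values:

```python
def call_old(func, args_tuple, kwargs_dict):
    # Old convention: positional arguments packed into a tuple,
    # keyword arguments into a dict (both allocated for every call).
    return func(*args_tuple, **kwargs_dict)

def call_fast(func, stack, kwnames):
    # FASTCALL-style: `stack` is a flat sequence of values; the last
    # len(kwnames) entries are keyword values, named by `kwnames`.
    # The caller never builds a temporary tuple or dict.
    nkw = len(kwnames)
    nargs = len(stack) - nkw
    return func(*stack[:nargs], **dict(zip(kwnames, stack[nargs:])))

def example(a, b, flag=False):
    return (a, b, flag)

assert call_old(example, (1, 2), {"flag": True}) == (1, 2, True)
assert call_fast(example, [1, 2, True], ("flag",)) == (1, 2, True)
```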
- It was no long ago that you fought tooth-and-nail against a single
line patch optimization I submitted. The code was clearly correct and had a simple disassembly to prove its benefit. Your opposition was based on "it increases the complexity of the code, introduces a maintenance cost, and increases the risk of bugs". In the end, your opposition killed the patch. But now, the AC and FASTCALL patches don't seem to mind any of these considerations.
It seems like we need more _explicit_ rules to decide whether an optimization is worth it. For me, the de facto standard for an optimization request is to prove it with a benchmark. I requested a benchmark, but you refused to provide one.
So I ran my own benchmark and saw that your change made the modified code (PyList_Append()) 6% slower. I'm not sure that my benchmark was correct, but it was a first step towards a decision.
To come back to FASTCALL, your point is that it doesn't provide any speedup.
In most FASTCALL issues that I opened, I provided a script to reproduce my benchmark along with the benchmark results. The speedup is usually between 10% and 20%.
Should I understand that 6% slower is ok, whereas 10-20% faster is not good? Can you please elaborate?
- AC is supposed to be a CPython-only concept. But along the way APIs
are being changed without discussion. I don't mind that sorted() now exposes *iterable* as a keyword argument, but it was originally left out on purpose (Tim opined that code would look worse with iterable as a keyword argument). That decision was reversed unilaterally, without consulting the author and without a test. Also, as AC is being applied, the variable names are being changed. I never really liked the "mp" that is used in dicts and prefer the use of "self", but it is a gratuitous change that unilaterally reverses the decisions of the authors and makes the code not match any of the surrounding code that uses the prior conventions.
Ah, at least I concur with you on one point :-) Changes converting functions to AC must not change the API (type of arguments: positional-only/keyword/..., default values, etc.) nor produce a worse docstring.
There is an active on-going work to enhance AC to fix issues that you reported, like the default value of positional-only parameters which should not be rendered in the function signature (I created the issue #29299 with a patch). Serhiy is also working on implementing the last major missing feature of AC: support *args and **kwargs parameters (issue #20291).
FYI I wasn't involved in the AC changes; I only started to look at AC recently (1 or 2 months ago). Again, I agree that these changes should be carefully reviewed, which is a hard task since the required changes are usually large and move a lot of code. We need more eyes on these changes!
For the specific case of sorted(), the name of the first parameter has been documented in the docstring and documentation since Python 2.7: "iterable". So I guess you mean that it is now possible to pass it as a keyword argument. Well, see issue #29327 for the long story. That issue is a regression; it was already fixed, and I didn't introduce the API change.
Oh, by the way, when I read your comment, I understand that I'm responsible for all regressions. It's true that I introduced regressions; that's where I said "shit happens" (or, more politically correct: "it's better to ask forgiveness than permission" ;-)). Since I'm one of the most active contributors to CPython, I'm not surprised to be the one who introduces many (most?) regressions :-) I try to review my changes multiple times, test corner cases, etc. But I'm not perfect.
Sadly, to show its full power, FASTCALL requires changes at many levels of the code. It requires changing a lot of code, but I understood that core developers approved the whole project. Maybe I was wrong? At least, I asked for permission for multiple changes, especially at the start.
- FWIW, the claim that the help is much better is specious. AFAICT,
there has never been the slightest problem with "sorted(iterable, key=None, reverse=False) --> new sorted list", which has been clear since the day it was released. It is some of the new strings that are causing problems for users (my students are frequently tripped up by the / notation, for example; no one seems able to intuit what it means without it being explained first).
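For readers puzzled by the "/" marker mentioned above: in a signature it means that every parameter before it is positional-only. A minimal sketch with a made-up function (plain Python 3.8+ syntax, not tied to any particular builtin):

```python
def clamp(value, low, high, /, strict=False):
    # value, low and high appear before the "/" marker, so they are
    # positional-only and cannot be passed by keyword.
    if strict and not (low <= value <= high):
        raise ValueError("out of range")
    return min(max(value, low), high)

assert clamp(15, 0, 10) == 10          # positional: fine

try:
    clamp(value=15, low=0, high=10)    # keyword: rejected
    raised = False
except TypeError:
    raised = True
assert raised
```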
Good news: it seems you have solid experience in API design, documentation, etc. Join the "Argument Clinic" project to help us enhance docstrings, function signatures and documentation ;-)
See the good part of the AC on-going work: it's a nice opportunity to also enhance documentation, not only provide a signature.
By the way, to be honest, the main advantage of converting functions to AC is to get a signature. The signature is visible in docstrings, which is nice, but it is also very useful to a wide range of tools (IDEs, static checkers, etc.).
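The tooling benefit is easy to see from pure Python: once a builtin has a signature (as AC-converted functions do), inspect can introspect it.

```python
import inspect

# sorted() has a signature thanks to Argument Clinic; many C functions
# that have not been converted raise ValueError here instead.
sig = inspect.signature(sorted)
print(sig)

assert "iterable" in sig.parameters
assert "key" in sig.parameters
assert "reverse" in sig.parameters
```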
The conversion to FASTCALL is more of a nice side effect. At least, it is a good motivation for me to convert more and more code to AC :-)
AC moves the docstring closer to the list of parameters. IMHO it makes the C code simpler to read and understand. It also removes the boring code responsible for parsing arguments, making the code shorter. But well, this is just my opinion.
- FWIW, I'm trying to be constructive and contribute where I can, but
frankly I can't keep up with the volume of churn. Having seen bugs being introduced, it is not inappropriate to ask another dev to please be careful, especially when that dev has been prolific to an unprecedented degree, altering core parts of the language for function calls, new opcodes, the memory allocators, etc. Very few people on the planet are competent to review these changes and make reasonable assessments about whether the complexity and churn are worth it. And fewer still have the time to keep up with the volume of changes.
Hum, I wasn't involved in bytecode changes.
Well, I reviewed the very good work of Demur Rumed. I recall that you worked on a similar area, trying to fetch bytecode in 16-bit units instead of 8-bit. Demur proposed a good design, and I recall that the design was approved.
I helped a little bit on the implementation and I pushed the final change, but all credit goes to Demur and Serhiy Storchaka! By the way, Serhiy made further efficiency enhancements in the bytecode of CALL_FUNCTION instructions.
About memory allocators, I guess you are referring to my change to the PyMem_Malloc() allocator. I discussed the issue on python-dev and waited for the approval of my peers before pushing anything, since I know well that it's a critical part of Python: https://mail.python.org/pipermail/python-dev/2016-March/143467.html
I provided all the data requested by Marc-Andre Lemburg (testing the change with common projects: Django, Pillow, numpy) and made further changes (the PYTHONMALLOC=debug tool) to help handle this backward-incompatible change (the GIL is now required to call PyMem_Malloc).
Hopefully, it seems like nobody noticed this subtle change (GIL now required): I didn't see any bug report. By the way, I fixed a misuse of PyMem_Malloc() in numpy.
- Please do continue your efforts to improve the language, but also
please moderate the rate of change, mitigate the additional complexity, value stability over micro-optimizations, consult the authors and maintainers of code, take special care with code that hasn't been reviewed because it lacks a safety net, and remember that newer devs may be taking cues from you (do you want them making extensive changes to long-existing stable code without consulting the authors and with weak LGTM reviews?)
Ok, I will do it.
Thank you for your feedback, Raymond. I hope that my email helps you understand how I work and how I make my decisions.
Victor

_______________________________________________
python-committers mailing list
firstname.lastname@example.org
https://mail.python.org/mailman/listinfo/python-committers
Code of Conduct: https://www.python.org/psf/codeofconduct/