Statistics: growth of core dev number vs growth of the code size/complexity
Hi,
I wrote a quick & dirty parser to compute statistics on *new* CPython core developers per year, using the following page as data: https://devguide.python.org/developers/
2007: 15
2008: 19
2009: 11
2010: 20
2011: 12
2012: 9
2013: 4
2014: 10
2015: 2
2016: 5
2017: 2
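For readers who want to reproduce this kind of count, here is a minimal sketch. It is not the parser mentioned above: it assumes the developer entries have already been extracted from the page into (name, date) pairs, and the function name `new_devs_per_year` and the sample entries are made up for illustration.

```python
from collections import Counter
from datetime import date


def new_devs_per_year(entries):
    """Count how many developers gained commit access in each year.

    `entries` is an iterable of (name, date_granted) pairs.
    """
    counts = Counter(granted.year for _name, granted in entries)
    # Return a plain dict ordered by year for easy printing.
    return dict(sorted(counts.items()))


# Hypothetical sample data, only to show the expected input shape.
sample = [
    ("Dev A", date(2015, 3, 1)),
    ("Dev B", date(2016, 5, 20)),
    ("Dev C", date(2016, 11, 2)),
]

for year, count in new_devs_per_year(sample).items():
    print(f"{year}: {count}")
```

Extracting the pairs from the actual page is deliberately left out, since it depends on how that page is formatted at the time.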
Compare these numbers to Stéphane Wirtel's statistics on pull requests: https://speakerdeck.com/matrixise/cpython-loves-your-pull-requests
=> Number of active core developers on GitHub pull requests: 27 (stats from February 2017 to October 2017). (I'm not sure what this number means exactly: it's the number of core developers who authored pull requests; I don't think it counts core developers who only made reviews.)
If you look at the size of the source code, it has been growing constantly since 1990: https://www.openhub.net/p/python/
2007: around 783k lines
2010: around 683k lines
2013: around 800k lines
2015: around 875k lines
2017: around 973k lines
The number of bugs is also constantly growing. Statistics on bugs since 2011: https://bugs.python.org/issue?@template=stats
2011: around 2500 open issues
2013: around 4000 open issues
2015: around 5000 open issues
2017: around 6200 open issues
The size of the CPython project is constantly growing, as is its complexity (technical debt? what is this? :-)), but the growth in the number of core developers is slowing down.
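To put rough numbers on that comparison, here is a small sketch that uses only the figures quoted in this message. The helper names `mean_per_year` and `growth_factor` and the choice of five-year windows are assumptions made for illustration, not part of the original analysis.

```python
# Figures copied from the message above (line counts in thousands of lines).
new_devs = {2007: 15, 2008: 19, 2009: 11, 2010: 20, 2011: 12, 2012: 9,
            2013: 4, 2014: 10, 2015: 2, 2016: 5, 2017: 2}
kloc = {2007: 783, 2010: 683, 2013: 800, 2015: 875, 2017: 973}
open_issues = {2011: 2500, 2013: 4000, 2015: 5000, 2017: 6200}


def mean_per_year(series, first, last):
    """Average value of `series` over the years first..last inclusive."""
    years = range(first, last + 1)
    return sum(series[y] for y in years) / len(years)


def growth_factor(series, start, end):
    """Ratio of the value in year `end` to the value in year `start`."""
    return series[end] / series[start]


print("new core devs/year, 2007-2011:", mean_per_year(new_devs, 2007, 2011))
print("new core devs/year, 2013-2017:", mean_per_year(new_devs, 2013, 2017))
print("source size factor, 2010-2017:", round(growth_factor(kloc, 2010, 2017), 2))
print("open issues factor, 2011-2017:", round(growth_factor(open_issues, 2011, 2017), 2))
```

With these inputs the yearly average of new core developers drops from about 15 to under 5, while the open-issue count grows by a factor of roughly 2.5 over a similar span.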
I do consider that we need more people to handle the growing number of issues and pull requests, so the question is now how to find and "hire" (sorry, promote) them ;-)
Maybe we have a problem with mentoring. Maybe the CPython code base became too hard to train newcomers? Maybe we are too conservative? I don't know.
Victor
On 07/12/2017 at 00:17, Victor Stinner wrote:
Maybe we have a problem with mentoring. Maybe the CPython code base became too hard to train newcomers? Maybe we are too conservative? I don't know.
The language moved at a faster pace back then (especially with Python 3), which made it easier to find ways to contribute without getting into boring or overly tedious topics.
Also, I think its image is simply changing. 10 years ago, Python still stood out as something cool and a bit special. Now it's regarded as established and mature, and that certainly changes its attractiveness to the kind of people who are keen to make volunteer contributions to free / open source software.
Can this be reversed? I don't know.
Regards
Antoine.
On Thu, Dec 07, 2017 at 12:17:04AM +0100, Victor Stinner wrote:
If you look at the size of the source code, it has been growing constantly since 1990: https://www.openhub.net/p/python/
2007: around 783k lines 2010: around 683k lines
What happened between 2007 and 2010 that the source shrank by nearly 13%?
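The "nearly 13%" figure can be checked directly from the two quoted line counts. This is a minimal sketch; the helper name `percent_change` is made up for illustration.

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return 100.0 * (new - old) / old


# Line counts (in thousands) quoted above for 2007 and 2010.
change = percent_change(783, 683)
print(f"{change:.1f}%")  # prints -12.8%
```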
-- Steve
On 7 December 2017 at 10:19, Steven D'Aprano steve@pearwood.info wrote:
On Thu, Dec 07, 2017 at 12:17:04AM +0100, Victor Stinner wrote:
If you look at the size of the source code, it has been growing constantly since 1990: https://www.openhub.net/p/python/
2007: around 783k lines 2010: around 683k lines
What happened between 2007 and 2010 that the source shrank by nearly 13%?
My guess would be that it's a consequence of Python 3 becoming the default branch.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Dec 6, 2017, at 18:17, Victor Stinner victor.stinner@gmail.com wrote:
I wrote a quick & dirty parser to compute statistics on *new* CPython core developers per year, using the following page as data: https://devguide.python.org/developers/
2007: 15 2008: 19 2009: 11 2010: 20 2011: 12 2012: 9 2013: 4 2014: 10 2015: 2 2016: 5 2017: 2
Thanks for the analysis. Let’s do everything we can to get some of these newer core devs to the Language Summit in 2018!
-Barry
On Thu, Dec 7, 2017 at 2:17 AM, Victor Stinner victor.stinner@gmail.com wrote:
Hi,
I wrote a quick & dirty parser to compute statistics on *new* CPython core developers per year, using the following page as data: https://devguide.python.org/developers/
2007: 15 2008: 19 2009: 11 2010: 20 2011: 12 2012: 9 2013: 4 2014: 10 2015: 2 2016: 5 2017: 2
Compare these numbers to Stéphane Wirtel's statistics on pull requests: https://speakerdeck.com/matrixise/cpython-loves-your-pull-requests
=> Number of active core developers on GitHub pull requests: 27 (stats from February 2017 to October 2017). (I'm not sure what this number means exactly: it's the number of core developers who authored pull requests; I don't think it counts core developers who only made reviews.)
Is there an easy way to merge the first two stats? It would be interesting to see a table like this:
Number of active core developers | Year of their first commit
4 | 2012
3 | 2014
...
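One way such a table could be produced is sketched below. It assumes two inputs that are not available in this thread: a mapping from each core developer to the year they joined, and the set of names that authored pull requests in the period. All names and values in the sample are hypothetical.

```python
from collections import Counter


def active_by_year(join_year, active_names):
    """Group active core developers by the year they joined.

    `join_year` maps developer name -> year; `active_names` is the set of
    names seen as pull request authors. Names missing from `join_year`
    (non-core contributors) are ignored.
    """
    counts = Counter(join_year[name] for name in active_names
                     if name in join_year)
    return sorted(counts.items())


# Hypothetical sample data.
join_year = {"dev_a": 2012, "dev_b": 2012, "dev_c": 2014, "dev_d": 2016}
active = {"dev_a", "dev_c", "dev_d", "contributor_x"}

print("Number of active core developers | Year")
for year, count in active_by_year(join_year, active):
    print(f"{count} | {year}")
```

Matching on names is fragile in practice (GitHub logins rarely equal the names on the developer list), so a real merge would need an explicit mapping between the two.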
--Berker
On Wed, 6 Dec 2017 at 15:17 Victor Stinner victor.stinner@gmail.com wrote:
Hi,
I wrote a quick & dirty parser to compute statistics on *new* CPython core developers per year, using the following page as data: https://devguide.python.org/developers/
2007: 15 2008: 19 2009: 11 2010: 20 2011: 12 2012: 9 2013: 4 2014: 10 2015: 2 2016: 5 2017: 2
Compare these numbers to Stéphane Wirtel's statistics on pull requests: https://speakerdeck.com/matrixise/cpython-loves-your-pull-requests
=> Number of active core developers on GitHub pull requests: 27 (stats from February 2017 to October 2017). (I'm not sure what this number means exactly: it's the number of core developers who authored pull requests; I don't think it counts core developers who only made reviews.)
If you look at the size of the source code, it has been growing constantly since 1990: https://www.openhub.net/p/python/
2007: around 783k lines 2010: around 683k lines 2013: around 800k lines 2015: around 875k lines 2017: around 973k lines
The number of bugs is also constantly growing. Statistics on bugs since 2011: https://bugs.python.org/issue?@template=stats
2011: around 2500 open issues 2013: around 4000 open issues 2015: around 5000 open issues 2017: around 6200 open issues
Do realize that the number of open issues is a really misleading statistic, as it includes enhancement requests, which we historically never close unless there's zero chance we will accept such a change.
The size of the CPython project is constantly growing, as is its complexity (technical debt? what is this? :-)), but the growth in the number of core developers is slowing down.
Well, you added code to speed up Unicode encoding/decoding, right? So it's just adding stuff to keep things performant as well as new things. It's just what happens when you're willing to improve things.
I do consider that we need more people to handle the growing number of issues and pull requests, so the question is now how to find and "hire" (sorry, promote) them ;-)
Maybe we have a problem with mentoring. Maybe the CPython code base became too hard to train newcomers? Maybe we are too conservative? I don't know.
I think it's partially the fact that Python's popularity has increased the size of the contributor pool, so lots of people are grabbing individual things. This leads to less of a chance to make sustained contributions. E.g. when I became a core dev it was because I was able to grab a new issue to work on that was easy at a very regular cadence, but I don't know if I could rectify that at this point.
On Thu, Dec 7, 2017 at 8:43 PM, Brett Cannon brett@python.org wrote:
On Wed, 6 Dec 2017 at 15:17 Victor Stinner victor.stinner@gmail.com wrote:
Hi,
I wrote a quick & dirty parser to compute statistics on *new* CPython core developers per year, using the following page as data: https://devguide.python.org/developers/
2007: 15 2008: 19 2009: 11 2010: 20 2011: 12 2012: 9 2013: 4 2014: 10 2015: 2 2016: 5 2017: 2
Compare these numbers to Stéphane Wirtel's statistics on pull requests: https://speakerdeck.com/matrixise/cpython-loves-your-pull-requests
=> Number of active core developers on GitHub pull requests: 27 (stats from February 2017 to October 2017). (I'm not sure what this number means exactly: it's the number of core developers who authored pull requests; I don't think it counts core developers who only made reviews.)
If you look at the size of the source code, it has been growing constantly since 1990: https://www.openhub.net/p/python/
2007: around 783k lines 2010: around 683k lines 2013: around 800k lines 2015: around 875k lines 2017: around 973k lines
The number of bugs is also constantly growing. Statistics on bugs since 2011: https://bugs.python.org/issue?@template=stats
2011: around 2500 open issues 2013: around 4000 open issues 2015: around 5000 open issues 2017: around 6200 open issues
Do realize that the number of open issues is a really misleading statistic, as it includes enhancement requests, which we historically never close unless there's zero chance we will accept such a change.
Here is a breakdown:
2443 behavior
2124 enhancements
224 crash
170 compile error
125 resource usage
104 performance
44 security
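To see how much of the total the enhancement requests account for, the breakdown above can be turned into percentages. This is a small sketch over the listed figures only; the variable names are made up for illustration.

```python
# Counts copied from the breakdown above.
breakdown = {
    "behavior": 2443,
    "enhancements": 2124,
    "crash": 224,
    "compile error": 170,
    "resource usage": 125,
    "performance": 104,
    "security": 44,
}

total = sum(breakdown.values())
# Share of each issue type, as a percentage of the listed total.
shares = {kind: 100.0 * count / total for kind, count in breakdown.items()}

print(f"total listed: {total}")
for kind, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{kind:15} {share:5.1f}%")
```

The listed categories sum to 5234, and enhancements make up about 40.6% of that, which supports the point that the raw open-issue count overstates the number of actual bugs.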
The size of the CPython project is constantly growing, as is its complexity (technical debt? what is this? :-)), but the growth in the number of core developers is slowing down.
Well, you added code to speed up Unicode encoding/decoding, right? So it's just adding stuff to keep things performant as well as new things. It's just what happens when you're willing to improve things.
I do consider that we need more people to handle the growing number of issues and pull requests, so the question is now how to find and "hire" (sorry, promote) them ;-)
Maybe we have a problem with mentoring. Maybe the CPython code base became too hard to train newcomers? Maybe we are too conservative? I don't know.
I think it's partially the fact that Python's popularity has increased the size of the contributor pool, so lots of people are grabbing individual things. This leads to less of a chance to make sustained contributions. E.g. when I became a core dev it was because I was able to grab a new issue to work on that was easy at a very regular cadence, but I don't know if I could rectify that at this point.
I think it also has to do with the maturity of the project.
When I started, I remember I wanted to fix some issue, and when I ran the whole test suite I had at least a dozen tests failing on my machine, so I went and fixed those as well. But to fix those I needed to add more stuff to test.support and to regrtest, and so I did. Then those changes needed to be documented, but the documentation was missing. So I started improving the documentation, and even after all that was done there was still plenty of low-hanging fruit on the bug tracker and other issues to work on. Then Python 3 came, and there was more work to be done.
Nowadays the situation is much better: Python is more stable and mature, and what's left is more difficult, obscure, or controversial. There are still new modules and features being added, and ISTM that most of the new core devs are working on those (e.g. asyncio/typing/etc), but otherwise finding new easy issues is becoming increasingly difficult.
Best Regards, Ezio Melotti
Ezio:
Nowadays the situation is much better: Python is more stable and mature, and what's left is more difficult, obscure, or controversial. There are still new modules and features being added, and ISTM that most of the new core devs are working on those (e.g. asyncio/typing/etc), but otherwise finding new easy issues is becoming increasingly difficult.
Hey, I really like your very positive view on CPython ;-) You made my day!
Python is moving closer to perfection!
"Perfection is Achieved Not When There Is Nothing More to Add, But When There Is Nothing Left to Take Away"
Victor
participants (8)
- Antoine Pitrou
- Barry Warsaw
- Berker Peksağ
- Brett Cannon
- Ezio Melotti
- Nick Coghlan
- Steven D'Aprano
- Victor Stinner