Hi Thomas,
I have to assume that this isn't a rejection of my proposal, since I haven't actually made a proposal to the SC yet :) Thanks for the feedback though, it's very valuable to know the SC's thinking on this matter.
I have a few comments inline below.
On 04/11/2020 12:27 pm, Thomas Wouters wrote:
(For the record, I’m not replying as a PSF Director in this; I haven’t discussed this with the rest of the Board yet. This just comes from the Steering Council.)
The Steering Council discussed this proposal in our weekly meeting, last week. It's a complicated subject with a lot of different facets to consider. First of all, though, we want to thank you, Mark, for bringing this to the table. The Steering Council and the PSF have been looking for these kinds of proposals for spending money on CPython development. We need ideas like this to have something to spend money on that we might collect (e.g. via the new GitHub sponsors page), and also to have a good story to potential (corporate) sponsors.
That said, we do have a number of things to consider here.
For background, funding comes in a variety of flavours. Most donations to the PSF are general fund donations; the foundation is free to use them for whatever purpose it deems necessary (within its non-profit mission). The PSF Board and staff decide where this money has the biggest impact, as there are a lot of things the PSF could spend it on.
Funds can also be earmarked for a specific purpose. Donations to PyPI (donate.pypi.org <http://donate.pypi.org>) work this way, for example. The donations go to the PSF, but are set aside specifically for PyPI expenses and development. Fiscal sponsorship (https://www.python.org/psf/fiscal-sponsorees/) is similar, but even more firmly restricted (and the fiscal sponsorees, not the PSF, decide what to spend the money on).
A third way of handling funding is more targeted donations: sponsors donate for a specific program. For example, GitHub donated money specifically for the PSF to hire a project manager to handle the migration from bugs.python.org <http://bugs.python.org> to GitHub Issues. Ezio Melotti was contracted by the PSF for this job, not GitHub, even though the funds are entirely donated by GitHub. Similar to such targeted donations are grant requests, like the several grants PyPI received and the CZI grant request for CPython that was recently rejected (https://github.com/python/steering-council/issues/26). The mechanics are a little different, but the end result is the same: the PSF receives funds to achieve very specific goals.
I really don't want to take money away from the PSF. Ideally I would like the PSF to have more money.
Regarding donations to CPython development (as earmarked donations, or from the PSF's general fund), the SC drew up a plan for investment that is centered around maintenance: reducing the maintenance burden, easing the load on volunteers where desired, working through our bug and PR backlog. (The COVID-19 impact on PyCon and PSF funds put a damper on our plans, but we used much of the original plan for the CZI grant request, for example. Since that, too, fell through, we're hoping to collect funds for a reduced version of the plan through the PSF, which is looking to add it as a separate track in the sponsorship program.) Speeding up pure-Python programs is not something we consider a priority at this point, at least not until we can address the larger maintenance issues.
I too think we should improve the maintenance story. But maintenance doesn't get anyone excited. Performance does. By allocating part of the budget to maintenance we get performance *and* a better maintenance story. That's my goal anyway. I think it is a lot easier to say to corporations, give us X dollars to speed up Python and you save Y dollars, than give us X dollars to improve maintenance with no quantifiable benefit to them.
And it may not be immediately obvious from Mark's plans, but as far as we can tell, the proposal is for speeding up pure-Python code. It will do little for code that is hampered, speed-wise, by CPython's object model, or threading model, or the C API. We have no idea how much this will actually matter to users. Making pure-Python code execution faster is always welcome, but it depends on the price. It may not be a good place to spend $500k or more, and it may not even be considered worth the implementation complexity.
I'll elaborate:
1. There will be a large total diff, but not that large an increase in code size; less than 1% of the current size of the C code base. There would be an increase in the conceptual complexity of the interpreter, but I'm hoping to largely offset that with better code organization. It is perfectly possible to *improve* code quality, if not necessarily size, while increasing performance. Simpler code is often faster and better algorithms do not make worse code.
2. The object model and C-API are an inherent part of CPython. It's not really meaningful to say that some piece of code's performance is hampered by the C-API or object model. What matters is how much faster it goes.
3. Regarding threading, all CPU-bound code will be sped up. Whether code is limited by being single-threaded or not, it will still be sped up. The speedup of a single interpreter is (largely) independent of the number of threads running. Eric, Petr and Victor's work will still be relevant for concurrent performance.
Please, just ask me if you need more details on any of these points.
Thinking specifically of corporate sponsorship, it's very much a question whether pure-Python code speedup is something companies would be willing to invest serious funds in. Google's Unladen Swallow was such an investment, and though it did deliver speedups (which were included in Python 2.7) and even though Google has a lot of Python code, there was not enough interest to keep it going. This may be different now, but finding out what "customers" (in the broadest sense) actually want is an important first step in asking for funding for a project like this. It's the kind of thing normally done by a product manager, at least in the corporate world, and we need that same effort and care put into it.
It makes sense that a single corporate sponsor would be unwilling to fund this. But why not several corporations? It keeps their costs down and they get the same benefit. I have no idea how to go about organizing that, however.
If we can potentially find the funds for this project, via the PSF's general fund, earmarked funds or a direct corporate sponsor, we also have to consider what we are actually delivering. Which performance metrics are we improving? How are we measuring them, what benchmarks? What if the sponsor has their own benchmarks they want to use? What about effects on other performance metrics, ones the project isn't seeking to improve, are they allowed to worsen? To what extent? How will that be measured? How will we measure progress as the project continues? What milestones will we set? What happens when there's disagreement about the result between the sponsor and the people doing the work? What if the Steering Council or the core developers -- as a body -- declines to merge the work even if it does produce the desired result for the sponsor and the people doing the work?
We already have a standard benchmark suite. I would propose using that as a start. If corporate sponsors want to add their own benchmarks that's a double win. They get more confidence that they will see performance improvements and we get a more comprehensive benchmark suite. I wouldn't worry about anything getting slower. But, if a sponsor only sees a 20% speedup on their code, despite a general speed up of 50%, then what happens? I guess that's up to the sponsor, although they probably should state their conditions up front.
And this is about more than just agreements between the sponsor and the people doing the work. What is the position of the Steering Council in this? Are they managing the people doing the work or not? Are they evaluating the end result or not? What about the rest of the core developers? And how will development take place? Will the design or implementation of the performance improvements go through the PEP process? Will the SC or other core developers have input in the design or implementation? Who will do code review of the changes? Will the work be merged in small increments, or will it happen in a separate branch until the project is complete? All of these questions, and more, will need to be answered in some way, and it really requires a project manager to take this on. We've seen how much impact good management can have on a project with the PyPI work overseen by Sumana. A project of this scale really can't do without it.
I don't think that the SC or PSF should be managing the work. How do you price and allocate research work? That is why I am offering to subcontract. I am willing to take on the risk and, having done the research, know that I can deliver. As for reviewing and merging, I would expect to pay someone for reviewing and some other maintenance tasks. Note that the payment would be for the review, not for a favorable review. Obviously reviews from other core devs would be most welcome, but I don't want to rely on using up other people's spare time. I can merge the code myself. Merges would be in small units and as often as is practical. There is no need for long-lived branches, at least not for stage 1.
I don't doubt all of these questions can be answered, but it's going to take time and effort -- and probably concessions -- to get to a good proposal to put before interested corporations, and then more adjustments to accommodate them. The PSF and the SC can't fund the work at this time. If we can find a sponsor willing to just shell out the $2M (or just $500k) for the current plan, the SC is not against it -- but without the product management and project management work mentioned above, I doubt this will happen. If we want the SC or the PSF to go shopping for sponsors, soliciting donations for this project, we need more of the product/project management work done as well.
Just the $500k, or thereabouts. The first stage should not rely on later stages ever happening. As for project management, that's why I suggested a cash-on-delivery contract. Obviously whoever gets hired by the PSF for maintenance will need managing, but that needs to happen anyway.
If people want to work on the product and project management part of the proposal, that’d be great. We'd be happy to provide guidance. We also can -- and will! -- certainly mention this proposal as the kind of work we would want to fund when talking to potential sponsors. We can gauge interest, to see how worthwhile it would be to flesh out the proposal. Who knows, maybe someone will be willing to outright fund this as-is. But as it is, the SC doesn't think we can fund this directly, even if we had the money available.
Again, I really don't want to take money away from the PSF.
Cheers,
Mark.