Using CLA-assistant for Python
![](https://secure.gravatar.com/avatar/6b6e72d297aa0270654a0d4575f1287e.jpg?s=120&d=mm&r=g)
Following up from the past thread about improving the CLA signing process ( https://mail.python.org/mm3/archives/list/core-workflow@python.org/thread/2O... )

Over in Zulip <https://python.zulipchat.com/#narrow/stream/130206-pep581/subject/CLA/near/1...>, Yury suggested using cla-assistant <https://github.com/cla-assistant/cla-assistant>, which he used in the EdgeDB project.

I've checked with Van, and from a legal point of view he has no problem with using cla-assistant, and he said that he can do what is needed to support it. Brett and Zach both sound supportive too.

*question*: Is there any objection if we start using cla-assistant instead of the-knights-who-say-ni? (I'm going to take silence as "no objection" :) )

What we will need to do in order to start using cla-assistant:

- Export the currently signed CLA info from bpo to cla-assistant. We need to come up with a .csv containing a list of GitHub usernames of people who have signed Python's CLA.
- Create a GitHub gist that represents Python's CLA and any additional info we need (some instructions here <https://github.com/cla-assistant/cla-assistant#request-more-information-from...> )
- Install cla-assistant in the Python organization on GitHub, and enable it for the various repositories: cpython, devguide, peps, core-workflow, bedevere, miss-islington. Perhaps we start with just devguide at first, and see how it goes.
- Update the documentation surrounding the CLA signing process.
- Make the CLA status check required.
- We can remove the CLA signing page <https://www.python.org/psf/contrib/contrib-form/>, or keep it up only for those who submit patches to bpo.
- check-python-cla.herokuapp.com can be shut down; there will be no more use for it.
- I think we can stop maintaining the-knights-who-say-ni?

Things that we will lose if we go with this new workflow:

- Since cla-assistant works with GitHub usernames, people who have signed the CLA but did not associate their GitHub username will need to either associate their GitHub username before the move, or sign the CLA again after the move.
- Since the CLA status check will be required, we may not be able to "ignore"
- Since the status check will be made required, every contribution, no matter how trivial, requires a CLA. Without it, we can't merge the pull request. (Maybe only the admins can still merge.) Sounds like this is a good thing anyway (for Python).
- It can't check the CLA for people who contribute by submitting a patch in bpo. I guess in this case we will fall back to the CLA record in bpo. This situation should be quite rare; most of us prefer GitHub PRs over bpo patches.
- Any other issue I didn't think of?

If nobody has strong objections or raises any issues, I plan to get this started and set up during the Core sprint in a couple of weeks. I'm hoping we don't have to wait until we have the next BDFL/COP/TOP*/voting committee to decide on this.

*acronyms:
BDFL: Benevolent Dictator For Life
COP: Council of Pythonistas
TOP: Trio of Pythonistas
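For illustration only, the export step described in this message (a .csv of GitHub usernames of people who have signed) could look roughly like the sketch below. The record layout (`github_username`, `cla_signed`) and the header row are assumptions made for the example; the thread does not describe the actual bpo schema or the exact file layout cla-assistant expects.

```python
import csv
import io


def export_signed_usernames(records, out):
    """Write a one-column CSV of GitHub usernames for records marked as signed.

    `records` is an iterable of dicts with the hypothetical keys
    "github_username" and "cla_signed". Returns the number of usernames written.
    """
    seen = set()
    writer = csv.writer(out)
    writer.writerow(["github_username"])  # assumed header, not confirmed by the thread
    for rec in records:
        username = (rec.get("github_username") or "").strip()
        # Skip people who have not signed or never linked a GitHub username.
        if not rec.get("cla_signed") or not username:
            continue
        key = username.lower()  # GitHub usernames are case-insensitive
        if key in seen:
            continue
        seen.add(key)
        writer.writerow([username])
    return len(seen)


sample = [
    {"github_username": "alice", "cla_signed": True},
    {"github_username": "Alice", "cla_signed": True},   # duplicate, different case
    {"github_username": "", "cla_signed": True},        # signed, no GitHub username
    {"github_username": "carol", "cla_signed": False},  # not signed
    {"github_username": "bob", "cla_signed": True},
]
buf = io.StringIO()
count = export_signed_usernames(sample, buf)
print(buf.getvalue())
```

Records that are signed but have no GitHub username are skipped here, which matches the point above that those contributors would need to link a username or sign again.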
![](https://secure.gravatar.com/avatar/6a7c690adf3a7cff20e3c7c52f0160ad.jpg?s=120&d=mm&r=g)
On 08/29, Mariatta Wijaya wrote:
> Following up from the past thread about improving the CLA signing process ( https://mail.python.org/mm3/archives/list/core-workflow@python.org/thread/2O... )
>
> Over in Zulip <https://python.zulipchat.com/#narrow/stream/130206-pep581/subject/CLA/near/1...>, Yury suggested using cla-assistant <https://github.com/cla-assistant/cla-assistant>, which he used in the EdgeDB project.
>
> I've checked with Van, and from a legal point of view he has no problem with using cla-assistant, and he said that he can do what is needed to support it. Brett and Zach both sound supportive too.
>
> *question*: Is there any objection if we start using cla-assistant instead of the-knights-who-say-ni? (I'm going to take silence as "no objection" :) )

Just one question: this tool is developed by the GitHub team at SAP. For me, SAP is a big company with a big closed-source ERP for enterprises, so I am just surprised by this initiative from them. Good for us.

1. Do we need to use their platform?
2. If they close the platform for any reason, there is the source code, so we could host it as well.

> What we will need to do in order to start using cla-assistant:
>
> - Export the currently signed CLA info from bpo to cla-assistant. We need to come up with a .csv containing a list of GitHub usernames of people who have signed Python's CLA.
> - Create a GitHub gist that represents Python's CLA and any additional info we need (some instructions here <https://github.com/cla-assistant/cla-assistant#request-more-information-from...> )
> - Install cla-assistant in the Python organization on GitHub, and enable it for the various repositories: cpython, devguide, peps, core-workflow, bedevere, miss-islington. Perhaps we start with just devguide at first, and see how it goes.
> - Update the documentation surrounding the CLA signing process.
> - Make the CLA status check required.
> - We can remove the CLA signing page <https://www.python.org/psf/contrib/contrib-form/>, or keep it up only for those who submit patches to bpo.
> - check-python-cla.herokuapp.com can be shut down; there will be no more use for it.
> - I think we can stop maintaining the-knights-who-say-ni?

:/ sad, I love this bot ;-)

> Things that we will lose if we go with this new workflow:
>
> - Since cla-assistant works with GitHub usernames, people who have signed the CLA but did not associate their GitHub username will need to either associate their GitHub username before the move, or sign the CLA again after the move.

From that, you will get the list of active core developers.

> - Since the CLA status check will be required, we may not be able to "ignore"
> - Since the status check will be made required, every contribution, no matter how trivial, requires a CLA. Without it, we can't merge the pull request. (Maybe only the admins can still merge.) Sounds like this is a good thing anyway (for Python).
> - It can't check the CLA for people who contribute by submitting a patch in bpo. I guess in this case we will fall back to the CLA record in bpo. This situation should be quite rare; most of us prefer GitHub PRs over bpo patches.
> - Any other issue I didn't think of?
>
> If nobody has strong objections or raises any issues, I plan to get this started and set up during the Core sprint in a couple of weeks. I'm hoping we don't have to wait until we have the next BDFL/COP/TOP*/voting committee to decide on this.
>
> *acronyms: BDFL: Benevolent Dictator For Life COP: Council of Pythonistas TOP: Trio of Pythonistas

I am fine with that, because when we organize sprints we have to ask the participants to sign the CLA, and with the manual validation it's boring for newcomers.

+1

-- Stéphane Wirtel - https://wirtel.be - @matrixise
![](https://secure.gravatar.com/avatar/df8e51d7618d5ed7ccbbc8dea9a9afee.jpg?s=120&d=mm&r=g)
On Thu, Aug 30, 2018 at 1:15 AM Mariatta Wijaya <mariatta.wijaya@gmail.com> wrote:
> - Since the status check will be made required, it means every contribution no matter how trivial, requires CLA. Without it, we can't merge the pull request. (Maybe only the admins can still merge). Sounds like this is a good thing anyway (for Python).

I don't think this is a good idea. We are getting a good number of PRs that fix typos or markup errors, and requiring a CLA would make people refrain from contributing to Python. Personally, I wouldn't bother contributing if I was asked to sign a CLA just to get a simple documentation fix merged.

Also, the amount of work required just to make it usable on python/cpython seems like a good indication that it's not worth the trouble :)

--Berker
![](https://secure.gravatar.com/avatar/f3ba3ecffd20251d73749afbfa636786.jpg?s=120&d=mm&r=g)
On Thu, 30 Aug 2018 at 08:15, Mariatta Wijaya <mariatta.wijaya@gmail.com> wrote:
> - Any other issue I didn't think of?
>
> If nobody has strong objections or raised any issues, I plan to get this started and set up during Core sprint in a couple weeks.

While I wouldn't expect them to object (since the proposal will save them a currently manual step), it would be worth checking directly with Ewa and Betsy on the PSF staff (as I believe they're the ones that handle the eSign -> bugs.python.org step in the current process).

Beyond that, my main concern would be the one Berker raised: the fact that we allow reviewers to waive the CLA requirement for contributions that don't meet the standard of being copyrightable (most notably, typo fixes) is a feature, not a bug.

That said, a usability regression for more casual fixes may be worth the trade-off when the pay-off is a major usability improvement for absolutely everyone involved in bringing new contributors up to the level where we can accept more substantial contributions from them.

You'll also want to talk to Ernest (PSF Infrastructure director) about either automating the periodic export of the csv file with all the CLA signatories, or else running the PSF's own instance of the service (which may also provide some more freedom in making the check advisory rather than strictly enforced).

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
![](https://secure.gravatar.com/avatar/6b6e72d297aa0270654a0d4575f1287e.jpg?s=120&d=mm&r=g)
> 1. Do we need to use their platform?
> 2. If they close the platform for any reason, there is the source code, we could host it as well.

> You'll also want to talk to Ernest (PSF Infrastructure director) about either automating the periodic export of the csv file with all the CLA signatories, or else running the PSF's own instance of the service (which may also provide some more freedom in making the check advisory rather than strictly enforced).

I was thinking of just going with their platform, so we don't have to maintain or pay for it. I will check with Ernest whether we can get this set up on our own infrastructure.

> While I wouldn't expect them to object (since the proposal will save them a currently manual step), it would be worth checking directly with Ewa and Betsy on the PSF staff (as I believe they're the ones that handle the eSign -> bugs.python.org step in the current process).

I've asked Ewa and Betsy just now. They're both OK with a new process (using cla-assistant), and they also think it will be an improvement.

> > - Since the status check will be made required, it means every contribution no matter how trivial, requires CLA. Without it, we can't merge the pull request. (Maybe only the admins can still merge). Sounds like this is a good thing anyway (for Python).
>
> I don't think this is a good idea. We are getting a good amount of PRs that fix typos or markup errors and requiring CLA would make people refrain from contributing to Python. Personally, I wouldn't bother contributing if I was asked to sign a CLA just to get a simple documentation fix merged.

> Beyond that, my main concern would be the one Berker raised: the fact that we allow reviewers to waive the CLA requirement for contributions that don't meet the standard of being copyrightable (most notably, typo fixes), is a feature, not a bug.

Fair point. We actually have several options here:

1. Always "require" the CLA check to pass. But this could "block" trivial contributions like typo fixes. (FWIW, these days I still wait for contributors to sign the CLA anyway, even for typo fixes.)
2. "Not require" the CLA check to pass. This will allow any core dev to merge the PR even if the CLA was not signed.
3. "Require" the CLA check to pass, but allow only admins to merge the PR.

I've considered (3), but I kinda don't want to bother the admins to merge trivial PRs. I've thought of (2) as well, but in Zulip, Brett said, "we're talking about legal stuff here, so paranoia typically wins out". And I think it makes sense. His suggestion was to start with (1), and re-evaluate the policy as we go. Changing this from "required" to "not required" is a matter of several clicks in GitHub (by an admin).

> Also, the amount of work required just to make it usable on python/cpython seems like a good indication that it's not worth the trouble :)

There is indeed some work involved to get this set up, probably mostly my time (and a little bit of Brett's time), and if we host it ourselves, Ernest's time.

If we use their platform and don't host it ourselves, most of the work is done by clicking things on GitHub and cla-assistant (trivial effort). The biggest tasks will be writing up the gist and writing the code to export the current status to a csv. I think these shouldn't take too long, perhaps one afternoon during the sprint.

If we host it ourselves, it will take time from Ernest and the PSF Infrastructure team to do the initial setup and then the day-to-day maintenance. I doubt the day-to-day maintenance will be any more than what we've been doing for ni/bpo, but perhaps Ernest will know more about this.

Once it is all set up, this will simplify the CLA signing process:

- New contributors don't have to wait at least one US business day for this process to complete. It will be done in a matter of several clicks.
- New contributors don't have to create a bpo account if they're contributing to devguide or peps, or if they're fixing trivial typos.
- It saves time for Ewa and Betsy, since they don't need to manually check the CLA anymore.

I think overall it is worth spending the time and effort to get the initial setup going. At least I'm willing to do it myself.

Mariatta
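As a rough way to compare the three status-check options from this message, the toy model below encodes each one as a merge rule. It is only a sketch of the policy being discussed: the names `ClaPolicy` and `can_merge` are invented for the example, and it does not call or describe GitHub's branch-protection API.

```python
from enum import Enum


class ClaPolicy(Enum):
    REQUIRED = 1                 # option 1: the check must always pass
    NOT_REQUIRED = 2             # option 2: the check is advisory only
    REQUIRED_ADMIN_OVERRIDE = 3  # option 3: required, but admins may still merge


def can_merge(policy, cla_signed, is_admin):
    """Return True if a PR could be merged under the given (modelled) policy."""
    if cla_signed:
        return True
    if policy is ClaPolicy.NOT_REQUIRED:
        return True
    if policy is ClaPolicy.REQUIRED_ADMIN_OVERRIDE:
        return is_admin
    return False  # ClaPolicy.REQUIRED and the CLA is not signed


# An unsigned typo fix reviewed by a non-admin core developer:
for policy in ClaPolicy:
    print(policy.name, can_merge(policy, cla_signed=False, is_admin=False))
```

Under this model only option 2 lets a non-admin merge an unsigned typo fix, which is the trade-off raised earlier in the thread.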
![](https://secure.gravatar.com/avatar/4dc045274504f02e7a0b6264e96da643.jpg?s=120&d=mm&r=g)
On Thu, Aug 30, 2018 at 00:15, Mariatta Wijaya <mariatta.wijaya@gmail.com> wrote:
> Over in Zulip, Yury suggested using cla-assistant, which he used in EdgeDB project.

If you choose to use that, I would prefer to use our own instance so that we manage the database ourselves, especially to make backups. It would be bad from a legal point of view if the https://cla-assistant.io/ instance suddenly goes away and loses all its data.

Victor
![](https://secure.gravatar.com/avatar/8dc73ffbbd0e3882d1f67bf8f1c60dcc.jpg?s=120&d=mm&r=g)
On Wednesday, August 29, 2018 6:14:52 PM EDT Mariatta Wijaya wrote:
> - Since the status check will be made required, it means every contribution no matter how trivial, requires CLA. Without it, we can't merge the pull request. (Maybe only the admins can still merge). Sounds like this is a good thing anyway (for Python)

It is possible to set the minimum number of lines changed required to trigger a CLA check. There's also a minimum number of files, although that seems less useful.

Elvis
![](https://secure.gravatar.com/avatar/6b6e72d297aa0270654a0d4575f1287e.jpg?s=120&d=mm&r=g)
> It is possible to set the minimum number of lines changed required to trigger a CLA check. There's also a minimum number of files, although that seems less useful.

I find this tricky with CPython. A one-line change in any .rst file is perhaps trivial. A one-line change in a *.py or *.c file might not be trivial, and may require a CLA, an issue number, a news entry and so on. I guess if we do it this way, we need better guidelines on what requires a CLA and what doesn't.

To look at this from a different perspective: if someone proposes a trivial change and then refuses to sign the CLA, I think it won't be hard for us to find another contributor who has signed the CLA to make that change instead. So I'm inclined to make this a requirement.

> If you choose to use that, I would prefer to use our own instance so that we manage the database ourselves, especially to make backups. It would be bad from a legal point of view if the https://cla-assistant.io/ instance suddenly goes away and loses all its data.

I've pinged Ernest about this. He said it is reasonable for the PSF to host an instance of cla-assistant, and he will look into it.

I also wonder though: instead of hosting it ourselves, can't we just keep daily backups of the signed CLA? It's basically a list of GitHub usernames. Perhaps that would be an easier task than hosting and maintaining it.

Mariatta
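To make the concern about thresholds concrete, here is a hypothetical guideline written as code: documentation-only changes below a line threshold are treated as trivial, while any change to other files always needs a CLA. The suffix set, the threshold value, and the function name are assumptions for illustration; this is not cla-assistant's configuration format, which (per Elvis) offers minimum-lines and minimum-files settings without regard to file type.

```python
from pathlib import PurePosixPath

# Assumed set of "documentation" suffixes; the thread only mentions .rst.
DOC_SUFFIXES = {".rst"}


def needs_cla(changed_files, min_doc_lines=2):
    """Decide whether a change set should trigger a CLA check.

    `changed_files` maps a repository path to its number of changed lines.
    """
    for path, lines in changed_files.items():
        suffix = PurePosixPath(path).suffix.lower()
        if suffix not in DOC_SUFFIXES:
            return True  # any change outside the docs (e.g. *.py, *.c) needs a CLA
        if lines >= min_doc_lines:
            return True  # a documentation change at or above the threshold
    return False


print(needs_cla({"Doc/library/os.rst": 1}))  # one-line doc fix
print(needs_cla({"Lib/os.py": 1}))           # one-line code change
```

A plain line-count threshold cannot express this distinction, which is one reason a blanket requirement may be simpler to apply.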
![](https://secure.gravatar.com/avatar/e8600d16ba667cc8d7f00ddc9f254340.jpg?s=120&d=mm&r=g)
On Fri, 31 Aug 2018 at 09:36 Mariatta Wijaya <mariatta.wijaya@gmail.com> wrote:
> > It is possible to set the minimum number of lines changed required to trigger a CLA check. There's also a minimum number of files, although that seems less useful.
>
> I find it is tricky with CPython. one line change in any .rst file, perhaps trivial. One line change in *.py or *.c file, might not be trivial and require CLA, issue number, news entry and so on. I guess if we do it this way, we need better guidelines of what requires CLA and what's not.
>
> To see this in different perspective, if someone wants to propose a trivial change and then refused to sign the CLA, I think it won't be hard for us to find another contributor who has signed the CLA to make that change instead. So I'm inclined to make this a requirement.
>
> > If you chose to use that, I would prefer to use our own instance to manage the database ourself, especially make backup. It would be bad for a legal point of view if suddently https://cla-assistant.io/ instance goes away and loose all its data.
>
> I've pinged Ernest about this. He said it is reasonable for The PSF to host an instance of cla-assistant. And he will look into it.
>
> I also wonder though, instead of hosting it ourselves, can't we just keep daily backups of the signed CLA? It's basically a list of GitHub usernames?
>
> Perhaps that would be an easier task than hosting and maintaining it.

I think the key question is what sort of resiliency we would have in making this potential change. If we can back up the data regularly, so that we can quickly turn around and either stand up our own instance of cla-assistant or tweak our CLA bot if we have to, then I would assume this would take care of the biggest concerns people have. Basically, we need to have a plan in place in case the hosted cla-assistant disappeared today without notice.

-Brett
_______________________________________________ core-workflow mailing list -- core-workflow@python.org To unsubscribe send an email to core-workflow-leave@python.org https://mail.python.org/mm3/mailman3/lists/core-workflow.python.org/ This list is governed by the PSF Code of Conduct: https://www.python.org/psf/codeofconduct
![](https://secure.gravatar.com/avatar/f3ba3ecffd20251d73749afbfa636786.jpg?s=120&d=mm&r=g)
On Sat, 1 Sep 2018 at 02:53, Brett Cannon <brett@python.org> wrote:
On Fri, 31 Aug 2018 at 09:36 Mariatta Wijaya <mariatta.wijaya@gmail.com> wrote:
> > I also wonder though, instead of hosting it ourselves, can't we just keep daily backups of the signed CLA? It's basically a list of GitHub usernames?
> >
> > Perhaps that would be an easier task than hosting and maintaining it.
>
> I think the key question is what sort of resiliency would we have in making this potential change? If we can backup the data regularly so that if we have to quickly turn around and either stand up our own instance of cla-assistant or tweak our CLA bot then I would assume this would take care of the biggest concerns people have. Basically we need to have a plan in place if the hosted cla-assistant disappeared today without notice.

Given the infrastructure that the PSF has already set up to handle the modern incarnation of PyPI, I suspect adding our own instance of cla-assistant to that (and hence being able to easily integrate it with the PSF's existing monitoring and management tools, like DataDog) will actually be easier than devising and managing a custom cla-assistant-specific disaster recovery plan.

'tis the beauty and wonder of combining traditional three-tier open source web applications with an automated app management platform like Kubernetes :)

Cheers,
Nick.

P.S. See https://p.datadoghq.com/sb/7dc8b3250-85dcf667bd for the current PyPI/PSF metrics dashboard.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
![](https://secure.gravatar.com/avatar/6b6e72d297aa0270654a0d4575f1287e.jpg?s=120&d=mm&r=g)
Thanks for all the input. I've chatted with Brett at the sprint about this, and this is our current thinking:

- We will go with cla-assistant, hosted by them.
- We will keep daily backups of the signed CLA (a list of GitHub usernames of people who have signed the CLA).
- We will keep daily backups of cla-assistant's source code.

I understand that people are worried about the possibility that the service suddenly goes away without notice. I'm personally not too worried about it, but having daily backups of both data and source code should keep us covered. With the backups, if the service goes away suddenly one day, we can spin up our own version of it. In addition, the code for the-knights-who-say-ni will be made read-only and not deleted, and it can be re-activated.

The reason I don't want to host our own cla-assistant is to reduce our burden of having to maintain it: keeping the server alive, and updating the running service and the code every once in a while. And by "us", I actually mean the PSF infrastructure team, Brett, and myself, who are volunteering to keep the CLA bot service up and running. For me, the sooner we can get cla-assistant set up, the easier all our lives are going to be.

So these are the things I plan to do today:

- contact Ernest about getting daily backups in place
- create the gist of the CLA, and have Van L review it
- obtain an export of signed CLA from bpo as of today

Mariatta
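The daily backup of signed usernames planned in this message could be as small as the sketch below, which writes one dated snapshot file per day. How the username list is obtained from the hosted cla-assistant is deliberately left out, since the thread does not say; the file naming scheme and the function name are assumptions for the example.

```python
import datetime
import pathlib
import tempfile


def write_daily_backup(usernames, backup_dir, today=None):
    """Write a dated snapshot of signed usernames and return its path."""
    today = today or datetime.date.today()
    backup_dir = pathlib.Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: one file per day, e.g. cla-signers-2018-09-10.txt
    path = backup_dir / f"cla-signers-{today.isoformat()}.txt"
    # De-duplicate and sort so that successive snapshots are easy to diff.
    lines = sorted(set(usernames), key=str.lower)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


# Demo in a throwaway directory; the usernames are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    snapshot = write_daily_backup(["bob", "alice", "bob"], tmp)
    print(snapshot.name)
    print(snapshot.read_text(encoding="utf-8"))
```

Keeping one plain-text file per day means a self-hosted instance or a revived the-knights-who-say-ni could be seeded from the latest snapshot.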
![](https://secure.gravatar.com/avatar/6b6e72d297aa0270654a0d4575f1287e.jpg?s=120&d=mm&r=g)
Update: please see my post on Discourse: https://discuss.python.org/t/using-cla-assistant-for-python/990

Thanks.
participants (8)
- Berker Peksağ
- Brett Cannon
- Elvis Pranskevichus
- Mariatta
- Mariatta Wijaya
- Nick Coghlan
- Stephane Wirtel
- Victor Stinner