Thanks for all the input.

I've chatted with Brett at the sprint about this, and this is our current thinking:
- We will go with cla-assistant, hosted by them.
- We will keep daily backups of the signed CLA records (a list of the GitHub usernames of people who have signed the CLA).
- We will keep daily backups of cla-assistant's source code.

I understand that people are worried that the service might suddenly go away without notice.
I'm personally not too worried about it, but having daily backups of both the data and the source code should keep us covered.
With the backups, if the service suddenly goes away one day, we can spin up our own instance of it.
In addition, the code for the-knights-who-say-ni will be made read-only and not deleted, and it can be re-activated.

The reason I don't want to host our own cla-assistant is to spare us the burden of maintaining it: keeping the server alive, and updating the running service and the code every once in a while.
And by "us", I actually mean the PSF infrastructure team, Brett, and myself, who are volunteering to keep the CLA bot service up and running.

For me, the sooner we can get cla-assistant set up, the easier all our lives are going to be.

So these are the things I plan to do today:
- contact Ernest about getting daily backups in place
- create the gist of the CLA, and have Van L review it
- obtain an export of the signed CLAs from bpo as of today

Mariatta


On Sat, Sep 1, 2018 at 9:25 PM Nick Coghlan <ncoghlan@gmail.com> wrote:
On Sat, 1 Sep 2018 at 02:53, Brett Cannon <brett@python.org> wrote:
On Fri, 31 Aug 2018 at 09:36 Mariatta Wijaya <mariatta.wijaya@gmail.com> wrote:
I also wonder though, instead of hosting it ourselves, can't we just keep daily backups of the signed CLA? It's basically a list of GitHub usernames?
 
Perhaps that would be an easier task than hosting and maintaining it.

I think the key question is what sort of resiliency we would have in making this potential change. If we can back up the data regularly, so that we can quickly turn around and either stand up our own instance of cla-assistant or tweak our CLA bot, then I would assume this would take care of the biggest concerns people have. Basically, we need to have a plan in place in case the hosted cla-assistant disappears without notice.

Given the infrastructure that the PSF has already set up to handle the modern incarnation of PyPI, I suspect adding our own instance of cla-assistant to that (and hence being able to easily integrate it with the PSF's existing monitoring and management tools, like DataDog) will actually be easier than devising and managing a custom cla-assistant-specific disaster recovery plan.

'tis the beauty and wonder of combining traditional three-tier open source web applications with an automated app management platform like Kubernetes :)

Cheers,
Nick.

P.S. See https://p.datadoghq.com/sb/7dc8b3250-85dcf667bd for the current PyPI/PSF metrics dashboard.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia