On Sun Nov 30 2014 at 12:00:20 PM Donald Stufft <donald@stufft.io> wrote:

On Nov 30, 2014, at 11:44 AM, Brett Cannon <brett@python.org> wrote:



On Sun Nov 30 2014 at 10:55:26 AM Ian Cordasco <graffatcolmingov@gmail.com> wrote:
On Sun, Nov 30, 2014 at 7:01 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
> On Sun, 30 Nov 2014 16:23:08 +1100
> Chris Angelico <rosuav@gmail.com> wrote:
>>
>> Yes, GitHub is proprietary. But all of your actual code is stored in
>> git, which is free, and it's easy to push that to a new host somewhere
>> else, or create your own host. This proposal is for repositories that
>> don't need much in the way of issue trackers etc, so shifting away
>> from GitHub shouldn't demand anything beyond moving the repos
>> themselves.
>
> I hope we're not proposing to move the issue trackers to github,
> otherwise I'm -1 on this PEP.
>
> Regards
>
> Antoine.

So I usually choose not to weigh in on discussions like these but
there seems to be a lot of misdirection in some of these arguments.

To start, I'm generally neutral about this proposal or Nick's proposal
that spurred this one. I've found the most frustrating part of
contributing to anything involving CPython to be the lack of reviewer
time. I have had very small (2-line) patches take months (close to a
year in reality) to get through in spite of periodically pinging the
appropriate people. Moving to git/GitHub will not alleviate this at
all.

To be clear, the main reasoning behind Nick's proposal was being able to submit
changes without ever having to have a local copy of the repository in
question on your machine. Having a complete web workflow for editing
and contributing makes the barrier to entry far lower than switching
VCS or anything else. BitBucket (apparently, although I've never used
this) and GitHub both have this capability and *both* are
free-as-in-beer systems.

No one as I understand it is proposing that we use the per-distro
proprietary interface to these websites.

All data can be removed from GitHub using its API and can generally
be converted to another platform. The same goes for BitBucket although
it's arguably easier to retrieve issue data from BitBucket than
GitHub. That said, *the issue tracker is not covered by these
proposals* so this is a moot point. Drop it already.

If we're seriously considering moving to git as a DVCS, we should
consider the real free-as-in-freedom alternatives that come very close
to GitHub and can be improved by us (even though they're not written
in Python). One of those is GitLab. We can self-host a GitLab instance
easily or we can rely on gitlab.com. GitLab aims to provide a very
similar user experience to GitHub and it's slowly approaching feature
parity and experience parity. GitLab is also what a lot of people
chose to switch to after the incident Steven mentioned, which I don't
think is something we should dismiss or ignore.

We should refocus the discussion with the following in mind:

- Migrating "data" from GitHub is easy. There are free-as-in-freedom
tools to do it and the only cost is the time it would take to monitor
the process

- GitHub has a toxic company culture that we should be aware of before
moving to it. They have a couple blog posts about attempting to change
it but employees became eerily silent after the incident and have
remained so from what I've personally seen.

- GitHub may be popular but there are popular FOSS solutions that
exist that we can also self-host at something like forge.python.org

- bugs.python.org is not covered by any of these proposals

- The main benefit this proposal (and the proposal to move to
BitBucket) is seeking to achieve is an online editing experience
allowing for *anyone with a browser and an account* to contribute.
This to me is the only reason I would be +1 for either of these
proposals (if we can reach consensus).

But that's not just it. As you pointed out, Ian, getting patch submissions committed faster would be a huge improvement over what we have today. GitHub/Bitbucket/whatever could help with this by giving core devs basic CI to know that a patch is sound to some extent as well as push-button commits of patches.

For me personally, if I knew a simple patch integrated cleanly and passed on at least one buildbot -- when it wasn't a platform-specific fix -- then I could easily push a "Commit" button and be done with it (although this assumes single branch committing; doing this across branches makes all of this difficult unless we finally resolve our Misc/NEWS conflict issues so that in some instances it can be automated). Instead I have to wait until I have a clone I can push from, download a patch, apply it, run the unit tests myself, do the commit, and then repeat a subset of that to whatever branches make sense. It's a lot of work for which some things could be automated.

Well there’s two sides to the contribution process.

There’s making things better/easier for people who *aren’t* committers and
there is making things better/easier for people who *are* committers. Tacking
extra things on to what we already have to improve the life of committers is
easier in many ways. As committers they’ve likely already taken the time to
learn the bespoke workflow that the Python project uses and have already gotten
through that particular hurdle. Looking to standardize around popular tools is
mostly about making it easier for *new* people and making it so that if they
learn this set of tools they can go and immediately apply that to most of the
other Python projects out there, or that if they are already contributing to
those other Python projects they are probably aware of this particular
toolchain and workflow and can apply that knowledge directly to the Python
project.

Moving to some of these tools happens to bring with it features like really nice
CI integration and a nice "Merge" button that also make it a lot nicer for the
committer side of things.

All very true, but if we can't improve both sides then we are simply going to end up with even more patches that we take a while to get around to. I want to end up with a solution that advances the situation for *both* committers and non-committers and I feel like that is being lost in the discussion somewhat. As the person who pushed for a migration to DVCS for non-committers I totally support improving the workflow for non-committers, but not at the cost of ignoring the latter half of the contribution workflow of committers which is a chronic problem.

As the PEP points out, the devguide, devinabox, and the PEPs have such a shallow development process that hosting them on Bitbucket wouldn't be a big thing. But if we don't view this as a long-term step towards moving cpython development somehow, we are bifurcating our code contributors between git and hg, which will be annoying. Now it could be argued that it doesn't matter for the peps and devguide since they are purely text and can be done easily through a web UI and a simple CI in Travis can be set up to make sure that the docs compile cleanly. But moving devinabox where there is going to be a code checkout in order to execute code for testing, etc. will be an issue.

So I guess my view is +0 for doc-only repos on GitHub as long as we make it clear we are doing it with the expectation that people will do everything through the web UI and never have to know git. But I can't advocate moving code over without moving ALL repos over to git -- hosting location doesn't matter to me -- to prevent having to know both DVCSs in order to do coding work related to Python; the cpython repo shouldn't become this vaunted repo that is special and so it's in hg long-term but everything else is on git.

-Brett
 

I think it's also hard to get a representation of the people for whom the
bespoke workflow and less popular tooling are a problem in a discussion
on python-dev. My guess is most of those people would not have signed up for
python-dev unless they were willing to take the time to learn that,
so there is an amount of selection bias at play here as well.