[just an FYI to everyone: replying without trimming the PEP will lead to moderation, so please cut out stuff that doesn't matter when replying]

On Sun, 17 Jan 2016 at 21:37 Ezio Melotti <ezio.melotti@gmail.com> wrote:
On Mon, Jan 18, 2016 at 4:33 AM, Brett Cannon <brett@python.org> wrote:
On Sun, 17 Jan 2016 at 16:30 Ezio Melotti <ezio.melotti@gmail.com> wrote:
[SNIP]
Adding GitHub username support to bugs.python.org
+++++++++++++++++++++++++++++++++++++++++++++++++

To keep tracking of CLA signing under the direct control of the PSF, tracking who has signed the PSF CLA will be continued by marking that fact as part of someone's bugs.python.org user profile. What this means is that an association will be needed between a person's bugs.python.org [#b.p.o]_ account and their GitHub account, which will be done through a new field in a user's profile.
We have to decide how to deal with users that don't have a b.p.o account. The two options that we discussed in the previous mails are: 1) require them to create a b.p.o account; 2) allow them to log in to b.p.o using their github account; (see also next section)
Both require creating an account, it just varies whether they log in using GitHub or not. I don't see how we can avoid that if we are going to continue to own the CLA dataset. Honestly I don't see this as a big issue as we have not seemed to have any issues with people creating accounts up to this point.
It's not a big issue, but there will be people that only have a github account, and we will need to explain to them that in order to accept their PRs we need them to sign the CLA, and in order to sign the CLA we need them to create a b.p.o account and link it to their github username.
I expect this to be the common case. I wish there was a way to avoid it, but if we want people to participate on the issue tracker they will need the account anyway. Plus it's no worse than it is today as I expect it's already the case most people have a GitHub account but not a b.p.o one.
Linking a pull request to an issue
++++++++++++++++++++++++++++++++++

An association between a pull request and an issue is needed to track when a fix has been proposed. The association needs to be many-to-one as it can take multiple pull requests to solve a single issue (technically it should be a many-to-many association for when a single fix solves multiple issues, but this is fairly rare and issues can be merged into one using the ``Superceder`` field on the issue tracker).
Association between a pull request and an issue will be done based on detecting the regular expression ``[Ii]ssue #(?P<bpo_id>\d+)``. If this is specified in either the title or in the body of a message on a pull request then a connection will be made on bugs.python.org [#b.p.o]_. A label will also be added to the pull request to signify that the connection was made successfully. This could lead to incorrect associations if the wrong issue number is given or if another issue is merely being referenced, but these are rare occasions.
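A minimal sketch of the detection step, assuming the bot simply receives the pull request title and the comment bodies as plain strings. Only the regular expression comes from the text above; the function name `find_issue_ids` is hypothetical.

```python
import re

# Pattern quoted from the PEP text; the named group captures the issue number.
ISSUE_RE = re.compile(r"[Ii]ssue #(?P<bpo_id>\d+)")


def find_issue_ids(*texts):
    """Return the sorted, de-duplicated issue numbers mentioned in the texts."""
    ids = set()
    for text in texts:
        for match in ISSUE_RE.finditer(text):
            ids.add(int(match.group("bpo_id")))
    return sorted(ids)


title = "Issue #12345: fix typo in tutorial"
body = "This also touches issue #67890, but is not meant to close it."
print(find_issue_ids(title, body))  # [12345, 67890]
```

Getting more than one number back is exactly the "referencing another issue" case, so a bot would probably need some rule for it (link all of them, or only the one in the title).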
Is there a way to associate the PR to an issue (in case the user forgot) or change association (in case it got the wrong number) after the creation of the PR?
You tell me. :) I assume any bot we write to handle this will monitor PR-level comments since GitHub doesn't notify on title changes. But we can choose any workflow we want, so if we want it to be an explicit command like `/bot issue 12345` then we can do that instead.
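A rough sketch of how such an explicit command could be picked out of a PR comment. The `/bot issue` syntax is only the example given above, and `parse_issue_command` is a made-up name; nothing here is a decided design.

```python
import re

# Hypothetical command syntax, taken from the "/bot issue 12345" example.
# The command has to sit on a line of its own.
COMMAND_RE = re.compile(r"^/bot issue (?P<bpo_id>\d+)\s*$", re.MULTILINE)


def parse_issue_command(comment):
    """Return the issue number from the last command in a comment, or None."""
    matches = COMMAND_RE.findall(comment)
    return int(matches[-1]) if matches else None
```

Taking the last command rather than the first would let a later line correct an earlier wrong number.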
I was asking about the github side. On b.p.o it's not a problem adding PRs either automatically or manually, but I don't know if the same can be done from github (it already happens when a commit message includes the wrong issue number -- the commit can not be changed and the notification is sent to the wrong issue on b.p.o). Even if it's not possible, I guess we could still add a comment to github with the correct issue number, and add the PR to the issue and the people to the nosy list manually.
So are you asking about a b.p.o -> GH association, so that you can make the association on b.p.o and have it show up somehow? If that's your question then we can add a comment like we do on b.p.o.
[SNIP]
Backup of pull request data
'''''''''''''''''''''''''''

Since GitHub [#github]_ is going to be used for code hosting and code review, those two things need to be backed up. In the case of code hosting, the backup is implicit as all non-shallow Git [#git]_ clones contain the full history of the repository, hence there will be many backups of the repository.
If possible I would still prefer an "official" backup. I don't think we want to go around asking who has the latest copy of the repo in case something happens to GitHub.
I would be shocked if a core developer doesn't have an up-to-date repository (then again I don't expect to have to flee GitHub overnight, giving us time to update). But if you want to make this an optional feature then I'm fine with that (I don't think the PSF infrastructure team will have issue with a regularly updated clone of the repository).
Yes, this is not a particularly realistic issue -- even without an official backup we won't lose any code. It's mostly to have an official backup and procedure to restore the repo instead of having to figure out what to do when it happens (even if it probably never will). The infra team can decide if this is a reasonable request or if it's not worth the extra effort.
+1
[SNIP]
Test coverage report
''''''''''''''''''''

Getting an up-to-date test coverage report for Python's standard library would be extremely beneficial as such a report can take quite a while to generate.
There are a couple pre-existing services that provide free test coverage for open source projects. Which option is best is an open issue: `Choosing a test coverage service`_.
Do we want to eventually request that all new code introduced is fully covered by tests?
I have always told sprinters that 80% is a good guideline. I wouldn't ever want a rule, though. Obviously lowering coverage would not be great and should be avoided, but if
Maybe we could use red/yellow/green labels to indicate the coverage level. That alone might lead contributors to aim for the green label without having to create and enforce any rule.
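A small sketch of what such a labelling rule could look like. The thresholds are assumptions: 80% for green is borrowed from the guideline mentioned above, and 50% for yellow is made up purely for illustration, as is the name `coverage_label`.

```python
def coverage_label(percent_covered, green_at=80.0, yellow_at=50.0):
    """Map a coverage percentage to a label colour (thresholds are assumptions)."""
    if not 0.0 <= percent_covered <= 100.0:
        raise ValueError("coverage must be between 0 and 100")
    if percent_covered >= green_at:
        return "green"
    if percent_covered >= yellow_at:
        return "yellow"
    return "red"
```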
To give you an idea of at least how Coveralls integration looks, https://github.com/python-modernize/python-modernize/pull/117
I think having an indication of how much code in a PR is covered by tests would be useful regardless of the answer to the previous question.
I know Coveralls already does this; don't know about Codecov.
Link web content back to files that it is generated from
''''''''''''''''''''''''''''''''''''''''''''''''''''''''

It would be helpful for people who find issues with any of the documentation that is generated from a file to have a link on each page which points back to the file on GitHub [#github]_ that stores the content of the page. That would allow for quick pull requests to fix simple things such as spelling mistakes.
Here you are talking about PEPs/devguide/docs.p.o, right?
Yes.
FWIW the docs.python.org pages already have a "report a bug" link in the sidebar and also in the footer, but they both just redirect to https://docs.python.org/3/bugs.html .
Yep, which is what made me think that it would be nice if we could direct people directly to the actual page instead of having to figure it out. But then again, if we think a lot of drive-by PRs will be doc-based and from people who have never contributed before, then we will have to wait for them to sign the CLA anyway, so maybe it isn't worth it? I guess the direct link to the underlying content is only useful if people who have signed the CLA will use it the most.
FWIW this problem was supposed to be fixed with pootle, but that project seems dead (not sure if due to technical reasons, or simply because no one had time to integrate it).
Splitting out parts of the documentation into their own repositories
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

While certain parts of the documentation at https://docs.python.org change with the code, other parts are fairly static and are not tightly bound to the CPython code itself. The following sections of the documentation fit this category of slow-changing, loosely-coupled:
* `Tutorial <https://docs.python.org/3/tutorial/index.html>`__
* `Python Setup and Usage <https://docs.python.org/3/using/index.html>`__
* `HOWTOs <https://docs.python.org/3/howto/index.html>`__
* `Installing Python Modules <https://docs.python.org/3/installing/index.html>`__
* `Distributing Python Modules <https://docs.python.org/3/distributing/index.html>`__
* `Extending and Embedding <https://docs.python.org/3/extending/index.html>`__
* `FAQs <https://docs.python.org/3/faq/index.html>`__
These parts of the documentation could be broken out into their own repositories to simplify their maintenance and to expand who has commit rights to them to ease in their maintenance.
I would still consider these somewhat dynamic (especially between Py 2 and Py 3). There are other documents that are more version-independent, such as the whatsnew pages,
The What's New docs get updated with changes so that can't be pulled out from the cpython repo (especially if my dream ever comes true of making people be better about making sure that document gets updated with changes that warrant being mentioned there).
The reason to keep them in a separate repo is that they are identical copies of the same document. If you fix a typo in the whatsnew/2.7, you have to fix the same typo in all branches, and in theory the page should always be identical in each branch.
Sure, but I try to make it a habit to update What's New at the same time as committing something that warrants a mention. Pulling that out will become a bigger chore. Having said that, if we want core developers to be the ones that author those kinds of changes then having it be a separate repo won't necessarily be as critical. We would just probably need to add a "needs What's New mention" label or something to keep track of what needs to be added so people didn't forget.
Howtos, FAQs, etc. might differ between major and even minor versions as new features are added and old features deprecated, so it makes sense to have (slightly) different versions in each branch (unless you want to move them outside the cpython repo and still keep separate branches). For example https://docs.python.org/3/faq/programming.html#is-there-an-equivalent-of-c-s... had a different answer before 2.5 :)
I suspect that when that kind of situation occurred, a version-specific branch would be created and then merged in once a release went out.
Also moving them might make things more complicated (from simply having to find/clone another repo to building docs for docs.python.org from different sources).
It's possible, but I doubt it. We could have a docs.python.org repo that contains everything but the docs carried in the cpython repo. Or we can have these other doc repos be Git submodules of cpython that you can check out if you want, but it isn't necessary to. What I do know is this idea has come up many times in the past from the perspective of being able to give out commit rights to the docs much more readily than with the cpython repo and I think that's a good idea.
[SNIP]
Git CLI commands for committing a pull request to cpython
---------------------------------------------------------

Because Git [#git]_ may be a new version control system for core developers, the commands people are expected to run will need to be written down. These commands also need to keep a linear history while giving proper attribution to the pull request author.
Another set of commands will also be necessary for when working with a patch file uploaded to bugs.python.org [#b.p.o]_. Here the linear history will be kept implicitly, but it will need to make sure to keep/add attribution.
Nick Coghlan, Pierre-Yves David (a Mercurial dev), and Shiyao Ma (one of our GSoC students) have been working on an HG extension that simplifies interaction with the bug tracker (see the list of patches, download/apply them, upload new patches): https://bitbucket.org/introom/hg-cpydev In a previous email, someone mentioned an alias that allows an easier interaction with PRs. Would it make sense to write and distribute an official git extension that provides extra commands/aliases for this set of commands? (I guess the answer depends on how many tasks we have and how straightforward it is to do with plain git commands.)
It quite possibly might be. Otherwise shell commands could be written and kept in Tools/. A major perk IMO with Git over Mercurial is Git Bash comes with Git and that gives you Bash on Windows. That makes writing cross-platform shell scripts to help with this sort of thing easy without leaving Windows users stranded.
Isn't that also possible with HG, since extensions are written in Python?
I mean .sh files, not extensions, e.g., we have a shell script in Tools that calls Git directly.
(BTW, is it possible/reasonable to write git extensions/hooks in Python?)
I believe so. From my understanding, Git doesn't go with an API solution like Mercurial and instead prefers a way to simply specify a naming scheme that will call commands on your $PATH. See https://www.atlassian.com/git/articles/extending-git/ as an example.
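To make the naming scheme concrete: Git runs an executable called `git-<name>` found on $PATH when you type `git <name>`, so such a command can be a plain Python script. Below is a minimal sketch of a hypothetical `git pr` command that fetches a GitHub pull request into a local branch. The command name, the `origin` remote and the `pr-<N>` branch name are all assumptions for illustration; the `pull/<N>/head` ref is what GitHub exposes for each pull request.

```python
#!/usr/bin/env python3
"""Hypothetical `git pr` subcommand: save as an executable file named
`git-pr` somewhere on $PATH, then run `git pr 123`."""
import subprocess
import sys


def fetch_command(pr_number, remote="origin"):
    """Build the git invocation that fetches a GitHub PR into a local branch."""
    # GitHub exposes each pull request under the ref pull/<N>/head.
    refspec = f"pull/{pr_number}/head:pr-{pr_number}"
    return ["git", "fetch", remote, refspec]


def main(args):
    """Run the fetch for the PR number given as the only argument."""
    if len(args) != 1 or not args[0].isdigit():
        print("usage: git pr <number>", file=sys.stderr)
        return
    # Raises CalledProcessError if git itself fails.
    subprocess.check_call(fetch_command(int(args[0])))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Since the script only shells out to `git`, the same file should work wherever both Python and Git are installed.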
How to handle the Misc/NEWS file
--------------------------------

There are two competing approaches to handling ``Misc/NEWS`` [#news-file]_. One is to add a news entry for issues on bugs.python.org [#b.p.o]_. This would mean an issue that is marked as "resolved" could not be closed until a news entry is added in the "news" field in the issue tracker. The benefit of tying the news entry to the issue is it makes sure that all changes worthy of a news entry have an accompanying issue. It also makes classifying a news entry automatic thanks to the Component field of the issue. The Versions field of the issue also ties the news entry to which Python releases were affected. A script would be written to query bugs.python.org for relevant news entries for a release and to produce the output needed to be checked into the code repository. This approach is agnostic to whether a commit was done by CLI or bot.
The competing approach is to use an individual file per news entry, containg the text for the entry. In this scenario each feature
Typo: containing
release would have its own directory for news entries and a separate file would be created in that directory that was either named after the issue it closed or a timestamp value (which prevents collisions). Merges across branches would have no issue as the news entry file would still be uniqeuely named and in the directory of the latest
Typo: uniquely
version that contained the fix. A script would collect all news entry files no matter what directory they reside in and create an appropriate news file (the release directory can be ignored as the mere fact that the file exists is enough to represent that the entry belongs to the release). Classification can either be done by keyword in the news entry file itself or by using subdirectories representing each news entry classification in each release directory (or classification of news entries could be dropped since critical information is captured by the "What's New" documents which are organized). The benefit of this approach is that it keeps the changes with the code that was actually changed. It also ties the message to being part of the commit which introduced the change. For a commit made through the CLI, a script will be provided to help generate the file. In a bot-driven scenario, the merge bot will have a way to specify a specific news entry and create the file as part of its flattened commit (while most likely also supporting using the first line of the commit message if no specific news entry was specified). Code for this approach has been written previously for the Mercurial workflow at http://bugs.python.org/issue18967. There is also a tool from the community at https://pypi.python.org/pypi/towncrier.
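The collection script for the file-per-entry approach could be sketched roughly like this. The directory layout, the file names and the function name `collect_news` are all assumptions made up for the example; this is not the code from issue18967 or towncrier, and it ignores classification entirely.

```python
import tempfile
from pathlib import Path


def collect_news(root):
    """Return one bullet per news entry file found anywhere under root."""
    entries = []
    # Walk every release directory; sorting keeps the output deterministic.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            if text:
                entries.append(f"- {text}")
    if not entries:
        return ""
    return "\n\n".join(entries) + "\n"


# Demo against a throwaway directory with one file per (made-up) entry.
with tempfile.TemporaryDirectory() as tmp:
    release_dir = Path(tmp) / "3.6"
    release_dir.mkdir()
    (release_dir / "issue-11111.txt").write_text(
        "Issue #11111: first made-up entry.\n", encoding="utf-8")
    (release_dir / "issue-22222.txt").write_text(
        "Issue #22222: second made-up entry.\n", encoding="utf-8")
    news_text = collect_news(tmp)

print(news_text)
```

Because each entry lives in its own uniquely named file, two branches adding entries never touch the same lines, which is the point of this approach.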
Does using git (and fast-forward merges, rebases, etc.) still create conflicts? (I guess the answer is yes, but perhaps we should double-check.)
I believe so, but I guess I could be wrong since I have not explicitly tested it.
If it does, there's also a third option: writing a merge script. I wrote a basic one for hg and it seemed to work decently, perhaps with git it's even easier.
I'll mention it, but I suspect the file-based solution will win out based on the feeling I have gotten from people when this topic has come up before.
I was re-reading the issue, and found two interesting links:
1) https://mail.python.org/pipermail/python-dev/2014-December/137393.html (Pierre-Yves David actually convinced me to write the merge script and helped me)
2) https://github.com/twisted/newsbuilder (this can be another option that can be added to the PEP)
Thanks, I'll add them.
A few additional points that you might want to add to the PEP:
* A new workflow for releasing Python should also be defined, and PEP 101 updated accordingly.
I view this as implicit.
* The devguide also needs to be updated.
Implicit, but I can call this out. I assume it will end up with a github-migration branch for a while to store updates until the migration occurs.
* We should decide what to do with all the other repos at hg.python.org
They are all personal repos, so people can do what they want with them.

-Brett