Proposal for creating a GitHub spellcheck bot

Hi,

Recently, tirkarthi (xtreak) has been working on finding and fixing typos, and I (18z) came up with the idea of creating a spellcheck bot to improve the quality of CPython.

We discussed this in https://github.com/python/cpython/pull/13749, and the discussion is summarized below:

Consensus:
- Reduce typos in a constructive way.

To be solved:
- Even after reducing the amount of code checked, most reports will still be false positives.
- dictionary.txt would need frequent updates (for false-positive typos).

Next step:
- Interface the filter with aspell to get some numbers.

Misc:
- The report could be made optional with a spellcheck label.
- Code for reducing .py files to strings and comments.
- A command line for spellcheck.

There are problems to be solved, but I think we can always find a better solution. So I'll be working on interfacing the filter with aspell to get some numbers.

Looking forward to hearing more opinions. :D

Thanks,
KunYuChen (18z)
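The "reducing .py files to strings and comments" item above could be sketched roughly as follows. This is only an illustration using the standard library's tokenize module, not the filter from the linked pull request; the function name comments_and_strings is made up for this example.

```python
import io
import tokenize


def comments_and_strings(source: str) -> list[tuple[int, str]]:
    """Return (line_number, text) pairs for each comment and string literal.

    Hypothetical helper: only these spans would be handed to a spell checker,
    so identifiers and keywords in the code itself are never checked.
    """
    found = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.COMMENT:
            # Drop the leading '#' so only the prose remains.
            found.append((tok.start[0], tok.string.lstrip("#").strip()))
        elif tok.type == tokenize.STRING:
            # Kept as written in the source, including quotes and prefixes.
            # f-strings are not handled specially in this sketch.
            found.append((tok.start[0], tok.string))
    return found


if __name__ == "__main__":
    sample = 'x = 1  # recieve the value\nprint("hello wrold")\n'
    for lineno, text in comments_and_strings(sample):
        print(lineno, text)
```

The extracted text could then be fed to a checker such as `aspell list`, which reads text on standard input and prints the words it does not recognize.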

Hi Brett,

YES! That's exactly what we want to do. :D

Chen KunYu (18z)
http://kunyu-chens-notes.rtfd.io/

On Fri, Jun 7, 2019 at 1:58 AM Brett Cannon <brett@python.org> wrote:

I also agree with that, and I would like to work on this with KunYu.

Jun-wei Song (krnick)

On Thu, 6 Jun 2019 at 13:38, Chen KunYu <xspiritualx@gmail.com> wrote:
Even after reducing the code checked, there will still be mostly false positives. High frequency of updating dictionary.txt (false positive typo).
If you can work out a way to integrate the Sphinx object inventory and glossary that the Docs build process emits, you may be able to dramatically cut down on the false positives without having to interactively update an additional data file.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
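One way the object inventory suggestion might look in practice is sketched below. It assumes the version 2 objects.inv layout that a Sphinx HTML build writes (four plain-text header lines followed by a zlib-compressed body) and that glossary terms appear in it as std:term entries. The names inventory_words and drop_known are hypothetical, and nothing here is taken from the bot itself.

```python
import re
import zlib

# One entry per line: name, domain:role, priority, uri, display name.
# Names may contain spaces (glossary terms), hence the lazy first group.
_ENTRY = re.compile(r"(.+?)\s+(\S+)\s+(-?\d+)\s+?(\S*)\s+(.*)")


def inventory_words(raw: bytes) -> set[str]:
    """Collect lowercase identifier fragments from raw objects.inv bytes."""
    parts = raw.split(b"\n", 4)  # four header lines, then the compressed body
    if len(parts) < 5 or not parts[0].startswith(b"# Sphinx inventory version 2"):
        raise ValueError("not a version 2 Sphinx inventory")
    body = zlib.decompress(parts[4]).decode("utf-8")

    words = set()
    for line in body.splitlines():
        match = _ENTRY.match(line.rstrip())
        if not match:
            continue
        name = match.group(1)
        # "str.casefold" contributes "str" and "casefold".
        for piece in re.split(r"[^A-Za-z0-9_]+", name):
            if piece:
                words.add(piece.lower())
    return words


def drop_known(flagged: list[str], allowed: set[str]) -> list[str]:
    """Remove flagged words that are documented names in the inventory."""
    return [word for word in flagged if word.lower() not in allowed]
```

With this, a word such as "casefold" flagged by the spell checker would be dropped whenever the built documentation already lists `str.casefold`, so no hand-maintained dictionary entry would be needed for it.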

Hi,

Sorry for not replying earlier; I was offline for the last two months. Regarding spell checking: in October 2018 I used sphinxcontrib-spelling (https://github.com/sphinx-contrib/spelling). You can use it, but I think the code needs an update for the new Sphinx 2.x.

Have a nice day,
Stéphane

On 06/06, Chen KunYu wrote:
-- Stéphane Wirtel - https://wirtel.be - @matrixise
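
For reference, enabling the Sphinx spelling extension linked above (published as sphinxcontrib-spelling) is mostly a matter of a few conf.py settings. The fragment below is a sketch, not CPython's actual configuration; the word-list filename is an assumption chosen for illustration, and in a real conf.py the extension would be appended to the existing extensions list.

```python
# conf.py fragment (sketch): enable the spelling builder.
extensions = ["sphinxcontrib.spelling"]

# Dictionary language used by the checker.
spelling_lang = "en_US"

# Project-specific accepted words, one per line.
# The filename here is an assumption for illustration.
spelling_word_list_filename = "spelling_wordlist.txt"

# Also print suggested corrections next to each flagged word.
spelling_show_suggestions = True
```

The check is then run with the spelling builder, for example `sphinx-build -b spelling <sourcedir> <outdir>`.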

Hi guys,

We would like to thank all of you for your suggestions for the bot. Based on them, we created a GitHub bot that checks pull requests for spelling typos, built on pyspellchecker (https://pypi.org/project/pyspellchecker).

https://github.com/krnick/Gwalstat

Example: https://github.com/krnick/Gwalstat-test/pull/4

The idea came to us when we noticed that the CPython project seems to receive a lot of pull requests for typos, so we thought we could build a bot that checks for typos before a change is merged, to reduce the number of typo-fix pull requests.

However, I found that the bot has a high false-positive rate on CPython Doc files, due to reStructuredText syntax and function names, as tirkarthi mentioned. The bot only works well on ordinary documents. It is still useful as a confirmation step before merging.

We will keep making progress on this. Any comments or suggestions would be greatly appreciated.

Thank you!
JunWei Song (krnick)

On Fri, Jul 19, 2019 at 7:20 AM, Stéphane Wirtel <stephane@wirtel.be> wrote:
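
The reStructuredText false positives described above might be reduced by stripping markup before the words reach the checker. The sketch below is not Gwalstat's code: prose_words and unknown_words are hypothetical names, the regular expressions cover only a few common constructs, and a plain set of known words stands in for the real dictionary lookup (with pyspellchecker, `SpellChecker().unknown(words)` could fill that role).

```python
import re

# Rough patterns for markup that tends to look like misspellings.
_LITERAL = re.compile(r"``.+?``", re.DOTALL)             # ``inline code``
_ROLE = re.compile(r":[\w:+-]+:`[^`]*`")                  # :func:`os.path.join`
_DIRECTIVE = re.compile(r"^\s*\.\.\s.*$", re.MULTILINE)   # ".. directive::" lines
_WORD = re.compile(r"[A-Za-z]+(?:'[a-z]+)?")


def prose_words(rst: str) -> list[str]:
    """Return lowercase words from rst with common markup removed."""
    text = _LITERAL.sub(" ", rst)
    text = _ROLE.sub(" ", text)
    # Only the directive line itself is removed; indented directive content
    # (for example the body of a code-block) is not handled in this sketch.
    text = _DIRECTIVE.sub(" ", text)
    return [word.lower() for word in _WORD.findall(text)]


def unknown_words(rst: str, known: set[str]) -> list[str]:
    """Return the sorted words of rst that are not in the known set."""
    return sorted({word for word in prose_words(rst) if word not in known})


if __name__ == "__main__":
    sample = "Use :func:`os.path.join` or ``pathlib.Path`` to recieve a path.\n"
    print(unknown_words(sample, {"use", "or", "to", "a", "path"}))
```

Function names inside roles and inline literals never reach the checker this way, so only the surrounding prose is reported.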

participants (6)
- Brett Cannon
- Chen KunYu
- JunWei Song
- nick
- Nick Coghlan
- Stéphane Wirtel