[issue30004] in regex-howto, improve example on grouping
New submission from Cristian Barbarosie: In the Regular Expression HOWTO https://docs.python.org/3.6/howto/regex.html#regex-howto the last example in the "Grouping" section has a bug. The code is supposed to find repeated words, but it catches false repetitions.
>>> p = re.compile(r'(\b\w+)\s+\1')
>>> p.search('Paris in the the spring').group()
'the the'
>>> p.search('k is the thermal coefficient').group()
'the the'
I propose adding a \b after \1, which solves the problem:
>>> p = re.compile(r'(\b\w+)\s+\1\b')
>>> p.search('Paris in the the spring').group()
'the the'
>>> print(p.search('k is the thermal coefficient'))
None
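The difference between the two patterns can be checked side by side. The following is a minimal sketch, not the text of the HOWTO; the names `loose`, `strict`, and `samples` are chosen here for illustration only.

```python
import re

# Pattern quoted in the report: nothing anchors the end of the backreference,
# so \1 can match the first letters of a longer word.
loose = re.compile(r'(\b\w+)\s+\1')

# Proposed pattern: the trailing \b requires the repeated text to end at a
# word boundary.
strict = re.compile(r'(\b\w+)\s+\1\b')

samples = ['Paris in the the spring', 'k is the thermal coefficient']

for text in samples:
    loose_match = loose.search(text)
    strict_match = strict.search(text)
    print(text)
    print('  without trailing \\b:', loose_match.group() if loose_match else None)
    print('  with trailing \\b:   ', strict_match.group() if strict_match else None)
```

Only the second sample distinguishes the two patterns: the pattern without the trailing \b reports 'the the' from "the thermal", while the pattern with it finds no match.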
----------
assignee: docs@python
components: Documentation
messages: 291209
nosy: Cristian Barbarosie, docs@python
priority: normal
severity: normal
status: open
title: in regex-howto, improve example on grouping
type: enhancement
versions: Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6, Python 3.7

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue30004>
_______________________________________
Changes by Serhiy Storchaka <storchaka+cpython@gmail.com>: ---------- nosy: +akuchling
Cristian Barbarosie added the comment: I just discovered that a nearly identical example is presented at the end of the section "Non-capturing and Named Groups". My proposal applies to this other example, too. And, by the way, reading this HOWTO has been very useful to me.
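The same fix carries over to a named-group version of the pattern. The sketch below assumes the second example has the form `(?P<word>\b\w+)\s+(?P=word)`; that exact spelling is an assumption, not a quotation from this thread, and the names `before` and `after` are illustrative.

```python
import re

# Assumed form of the named-group example, without and with the trailing \b.
before = re.compile(r'(?P<word>\b\w+)\s+(?P=word)')
after = re.compile(r'(?P<word>\b\w+)\s+(?P=word)\b')

text = 'k is the thermal coefficient'

# Without \b, the backreference matches the start of "thermal".
print(before.search(text).group())

# With \b, there is no repeated word, so search() returns None.
print(after.search(text))

# A real repetition is still found, and the group is reachable by name.
print(after.search('Paris in the the spring').group('word'))
```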
Mandeep Bhutani added the comment: Looks like both examples need a closing \b. Is this being worked on or should I submit a PR? ---------- nosy: +mandeepb
Cristian Barbarosie added the comment: This topic seems stuck. Is there anything else I should do?
Serhiy Storchaka <storchaka+cpython@gmail.com> added the comment: Would you mind creating a pull request on GitHub, Cristian? ---------- components: +Regular Expressions nosy: +ezio.melotti, mrabarnett, serhiy.storchaka stage: -> needs patch versions: -Python 3.3, Python 3.4, Python 3.5
Cristian Barbarosie <cristian.barbarosie@gmail.com> added the comment: I'm sorry, I have no experience at all with Git. Could you please do it for me? The bug appears in two places; see my first two messages. Thank you.
Mandeep Bhutani <mandeep@keemail.me> added the comment: Serhiy, Christian: I'll submit a PR for this later today.
Change by Mandeep Bhutani <mandeep@keemail.me>: ---------- keywords: +patch pull_requests: +4384 stage: needs patch -> patch review
Mandeep Bhutani <mandeep@keemail.me> added the comment: Cristian, Serhiy: I've submitted a PR for this bug. Cristian: I apologize for misspelling your name in a prior post.
Mariatta Wijaya <mariatta.wijaya@gmail.com> added the comment: New changeset 610e5afdcbe3eca906ef32f4e0364e20e1b1ad23 by Mariatta (Mandeep Bhutani) in branch 'master': bpo-30004: Fix the code example of using group in Regex Howto Docs (GH-4443) https://github.com/python/cpython/commit/610e5afdcbe3eca906ef32f4e0364e20e1b... ---------- nosy: +Mariatta
Change by Roundup Robot <devnull@psf.upfronthosting.co.za>: ---------- pull_requests: +4485
Change by Roundup Robot <devnull@psf.upfronthosting.co.za>: ---------- pull_requests: +4486
Mariatta Wijaya <mariatta.wijaya@gmail.com> added the comment: New changeset c02037d62284f4d4ca6b22f2ed05165ce2014951 by Mariatta (Miss Islington (bot)) in branch '2.7': bpo-30004: Fix the code example of using group in Regex Howto Docs (GH-4443) (GH-4555) https://github.com/python/cpython/commit/c02037d62284f4d4ca6b22f2ed05165ce20...
Mariatta Wijaya <mariatta.wijaya@gmail.com> added the comment: New changeset 3e60747025edc34b503397ab8211be59cfdd05cd by Mariatta (Miss Islington (bot)) in branch '3.6': bpo-30004: Fix the code example of using group in Regex Howto Docs (GH-4443) (GH-4554) https://github.com/python/cpython/commit/3e60747025edc34b503397ab8211be59cfd...
Mariatta Wijaya <mariatta.wijaya@gmail.com> added the comment: Thanks everyone. I merged the PR, and it's been backported to 3.6 and 2.7 ---------- resolution: -> fixed stage: patch review -> resolved status: open -> closed
Serhiy Storchaka <storchaka+cpython@gmail.com> added the comment: Thank you Cristian for reporting this issue. Thank you Mandeep for your patch. Thank you Mariatta for merging.
participants (5)
- Cristian Barbarosie
- Mandeep Bhutani
- Mariatta Wijaya
- Roundup Robot
- Serhiy Storchaka