[Python-ideas] Improving Clarity of re Module

Ned Batchelder ned at nedbatchelder.com
Wed Nov 27 11:06:47 CET 2013


On 11/26/13 5:31 PM, Alex Seewald wrote:
> For a match object, m, m.group(0) is how you access the entire span 
> of the match. For newcomers to regular expressions who are not 
> familiar with the concept of a 'group', the name group(0) is 
> counter-intuitive. A more natural-language-esque alias to group(0), 
> perhaps 'matchSpan', could reduce the time novices spend getting from 
> idea to working code. Of course, this convenience would introduce a 
> bit of complexity to the codebase, so it may or may not be worth it to 
> add an alias to group(0). What do people think?
>
I like the idea of a better attribute for accessing the matched text. I 
would go for either "m.matched" or "m.text". There are convenience 
methods on match objects that I've almost never used: why do we need 
both .span() and .start()+.end(), for example? And yet I use .group() 
all the time, even when my pattern has no groups in it, and have to 
accept saying "group" when I mean "matched text". Yes, I understand 
groups, group 0, etc., but for such a common need, why not have 
a common name?
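
To make the overlap concrete, here is a small sketch using only methods 
that match objects already have. The variable names are just for 
illustration; nothing here is a proposed API.

```python
import re

m = re.search("[ab]", "xay")

# With no argument, group() defaults to group 0: the whole matched text.
matched_text = m.group()
same_text = m.group(0)

# span() returns the same pair that start() and end() give separately.
span = m.span()
start_end = (m.start(), m.end())

# Slicing the searched string with those offsets recovers the matched text.
sliced = m.string[m.start():m.end()]

print(matched_text, span)
```

All three spellings name the same thing, which is why a single obvious 
name for "the matched text" seems worth having.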

While we're at it, how can it be that we haven't improved the __repr__ 
after all these years?

    >>> m = re.search("[ab]", "xay")
    >>> m
    <_sre.SRE_Match object at 0x10a2ce9f0>

_sre? SRE_Match? huh? :)
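
As a rough idea of what a more helpful summary could show, here is a 
sketch of a helper that reports the span and the matched text. The name 
describe_match and its output format are hypothetical, made up for this 
example, and are not what the re module does.

```python
import re


def describe_match(m):
    """Hypothetical helper: summarize a match object in one line."""
    if m is None:
        # re.search() returns None when the pattern does not match.
        return "<no match>"
    return "<match span={!r} text={!r}>".format(m.span(), m.group(0))


m = re.search("[ab]", "xay")
summary = describe_match(m)
print(summary)
```

Showing the span and the text is only one possible choice, but those are 
the two things I usually want to see when poking at a match interactively.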

--Ned.
> -- 
> Alex Seewald
