[Tutor] Query regarding Regular Expression
Alan Gauld
alan.gauld at yahoo.co.uk
Wed Jun 28 19:03:39 EDT 2017
On 28/06/17 21:27, cookiestar227 - Cookie Productions wrote:
> So far have understood everything except for the following example:
>
>>>> t = "A fat cat doesn't eat oat but a rat eats bats."
>>>> mo = re.findall("[force]at", t)
> What I don't understand is the [force] part of the Regular Expression.
A sequence of characters inside square brackets means match any one of
the characters. So [force]at matches:
fat, oat, rat, cat, eat
It does not match bat because there is no b inside the brackets.
The fact that force spells a real word is misleading; it could just as
well be written
[ocfre]at
and it would do the same.
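If you want to check that for yourself, here is a minimal sketch using the
string from your example. The variable names force and ocfre are just mine
for illustration; they are not from the tutorial you are following.

```python
import re

t = "A fat cat doesn't eat oat but a rat eats bats."

# The characters inside [] form a set, so their order does not matter.
force = re.findall("[force]at", t)
ocfre = re.findall("[ocfre]at", t)

print(force)           # ['fat', 'cat', 'eat', 'oat', 'rat', 'eat']
print(force == ocfre)  # True
```

Note that eat shows up twice: once from "eat" and once from the start of
"eats". The "bat" in "bats" is skipped because b is not in the class.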
> I would prefer to use the following RE as it achieves my desired result:
>
>>>> mo = re.findall("[A-Za-z]at", t)
>>>> print(mo)
> ['fat', 'cat', 'eat', 'oat', 'rat', 'eat', 'bat']
Fine, but it does a different job, as you discovered.
The problem with regex is that very minor changes in a
pattern can make big differences in the output. Or, as
you've shown, a big difference in the pattern can make
only a very subtle difference in the output.
That's what makes regex so powerful and so very difficult
to get right.
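One way to see exactly what changes between the two patterns is to run
both and compare the results. This is only a rough sketch; the names
narrow, wide and extra are my own and purely illustrative.

```python
import re

t = "A fat cat doesn't eat oat but a rat eats bats."

narrow = re.findall("[force]at", t)   # only f, o, r, c or e before "at"
wide = re.findall("[A-Za-z]at", t)    # any ASCII letter before "at"

# Matches the wider class picks up that the narrow one misses.
extra = [m for m in wide if m not in narrow]

print(narrow)  # ['fat', 'cat', 'eat', 'oat', 'rat', 'eat']
print(wide)    # ['fat', 'cat', 'eat', 'oat', 'rat', 'eat', 'bat']
print(extra)   # ['bat']
```

On this particular string the only difference is bat, but on other text
[A-Za-z]at would also pick up things like hat, sat or Cat, which
[force]at never will.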
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos