[Tutor] Query regarding Regular Expression

Alan Gauld alan.gauld at yahoo.co.uk
Wed Jun 28 19:03:39 EDT 2017


On 28/06/17 21:27, cookiestar227 - Cookie Productions wrote:

>  So far have understood everything except for the following example:
> 
>>>>  t = "A fat cat doesn't eat oat but a rat eats bats."
>>>>  mo = re.findall("[force]at", t)

> What I don't understand is the [force] part of the Regular Expression.  

A sequence of characters inside square brackets means match any one of
the characters. So [force]at matches:

fat, oat, rat, cat, eat

It does not match bat because there is no b inside the brackets.

The fact that force spells a real word is misleading; it could just as
well be written

[ocfre]at

and it would do the same.
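
If you want to check this for yourself, here is a small sketch using
your sentence. The variable names matches and reordered are just ones
I picked for illustration:

```python
import re

# Sample sentence from the original question.
t = "A fat cat doesn't eat oat but a rat eats bats."

# [force] is a character class: exactly one of f, o, r, c or e,
# followed by the literal characters "at".
matches = re.findall("[force]at", t)
print(matches)  # ['fat', 'cat', 'eat', 'oat', 'rat', 'eat']

# The order of characters inside the brackets makes no difference.
reordered = re.findall("[ocfre]at", t)
print(matches == reordered)  # True
```

Note that eat shows up twice: once for "eat" and once inside "eats".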

> I would prefer to use the following RE as it achieves my desired result:
> 
>>>> mo = re.findall("[A-Za-z]at", t)
>>>> print(mo)
> ['fat', 'cat', 'eat', 'oat', 'rat', 'eat',  'bat']
Fine, but it does a different job, as you discovered.
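
To see exactly where the two patterns differ on your sentence, you
could compare the results side by side. This is only a rough sketch;
narrow, broad and extra are names I made up for the comparison:

```python
import re

t = "A fat cat doesn't eat oat but a rat eats bats."

# The original pattern: one of f, o, r, c, e before "at".
narrow = re.findall("[force]at", t)

# Your pattern: any ASCII letter before "at".
broad = re.findall("[A-Za-z]at", t)

# Matches found only by the broader pattern.
extra = [m for m in broad if m not in narrow]
print(extra)  # ['bat']
```

On this particular sentence the only extra match is bat, but on other
text [A-Za-z]at would also pick up things like hat, mat or Sat.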

The problem with regex is that very minor changes in
pattern can cause big differences in output. Or, as
you've shown, a big difference in pattern can make only
a very subtle difference in output.

That's what makes regex so powerful and so very difficult
to get right.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



