Regular expression issue

Chris Rebert clp2 at rebertia.com
Sun Aug 8 18:55:03 EDT 2010


On Sun, Aug 8, 2010 at 3:32 PM, Thomas Jollans <thomas at jollybox.de> wrote:
> On Monday 09 August 2010, it occurred to genxtech to exclaim:
>> I am trying to learn regular expressions in python3 and have an issue
>> with one of the examples I'm working with.
>> The code is:
>>
>> #! /usr/bin/env python3
>>
>> import re
>>
>> search_string = "[^aeiou]y$"
>
> To translate this expression to English:
>
> a character that is not a, e, i, o, or u, followed by the character 'y', at
> the end of the line.
>
> "vacancy" matches. It ends with "c" (not one of aeiou), followed by "y".
>
> "pita" does not match: it does not end with "y".

Or in other words, the regex will not match when:
- the string ends in "ay", "ey", "iy", "oy", or "uy"
- the string doesn't end in "y"
- the string is fewer than 2 characters long
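
To see those cases concretely, here is a quick sketch. It assumes the
program calls re.search() with that pattern (the rest of the original
code isn't quoted above); the helper name `matches` and the sample
words are made up for illustration.

```python
import re

# The pattern from the original post: a non-vowel, then "y", at the end.
pattern = re.compile(r"[^aeiou]y$")

def matches(word):
    """Return True if the pattern matches somewhere in `word`."""
    return pattern.search(word) is not None

# One sample per case; the last three are the non-matching cases listed above.
for word in ["vacancy", "boy", "pita", "y"]:
    print(word, matches(word))
# vacancy True   (ends in consonant + "y")
# boy False      (ends in "oy")
# pita False     (doesn't end in "y")
# y False        (fewer than 2 characters)
```

Only the "boy" case is a vowel-plus-"y" ending, yet all three come back
as non-matches.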

So, the program has a logic error in its assumptions. A non-match
*doesn't* imply that a string ends in one of the aforementioned pairs;
the other possibilities have been overlooked.

May I suggest instead using the much more straightforward
`search_string = "[aeiou]y$"` and then swapping your conditions
around? The double-negative style the program currently uses is (as
you've just experienced) harder to reason about and thus more
error-prone.
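
A minimal sketch of what I mean, again assuming re.search(); the
function name `ends_in_vowel_y` and the sample words are hypothetical,
not taken from your program:

```python
import re

# Positive test: a vowel followed by "y" at the end of the string.
vowel_y = re.compile(r"[aeiou]y$")

def ends_in_vowel_y(word):
    """Return True only when `word` ends in ay, ey, iy, oy, or uy."""
    return vowel_y.search(word) is not None

for word in ["boy", "vacancy", "pita", "y"]:
    if ends_in_vowel_y(word):
        print(word, "ends in a vowel followed by 'y'")
    else:
        print(word, "does not end in a vowel followed by 'y'")
```

Here a match means exactly one thing, so the "else" branch no longer
has to guess which of several reasons caused the non-match.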

Cheers,
Chris
--
http://blog.rebertia.com


