Python 2.7 re.IGNORECASE broken in re.sub?

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Mon Aug 16 19:10:41 EDT 2010


On Mon, 16 Aug 2010 05:46:17 -0700, Alex Willmer wrote:

> On Aug 16, 12:23 pm, Steven D'Aprano <st... at REMOVE-THIS-
> cybersource.com.au> wrote:
>> On Sun, 15 Aug 2010 17:36:07 -0700, Alex Willmer wrote:
>> > On Aug 16, 1:07 am, Steven D'Aprano <st... at REMOVE-THIS-
>> > cybersource.com.au> wrote:
>> >> You're passing re.IGNORECASE (which happens to equal 2) as a count
>> >> argument, not as a flag. Try this instead:
>>
>> >> >>> re.sub(r"python\d\d" + '(?i)', "Python27", t)
>> >> 'Python27'
>>
>> > Basically right, but in-line flags must be placed at the start of a
>> > pattern, or the result is undefined.
>>
>> Pardon me, but that's clearly not correct, as proven by the fact that
>> the above example works.
> 
> Undefined includes 'might work sometimes'. I refer you to the Python
> documentation:
> 
> "Note that the (?x) flag changes how the expression is parsed. It should
> be used first in the expression string, or after one or more whitespace
> characters. If there are non-whitespace characters before the flag, the
> results are undefined."
> http://docs.python.org/library/re.html#regular-expression-syntax


Well, so it does. I stand corrected.

I note, though, that even the docs say "should" rather than "must". I
wonder whether the documentation author is just being cautious, because
I've seen comments on the python-dev list implying that the current
behaviour of flags (that their effect is global to the regex) is
supported. E.g.:

http://code.activestate.com/lists/python-dev/98681/

At the point that people are seriously considering changing the behaviour 
of a replacement re engine in order to support the current "undefined" 
behaviour, perhaps that behaviour isn't quite so undefined and the docs 
need to be re-written?
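
For the archive, here is a minimal sketch of the original pitfall and three ways around it that don't rely on where an in-line flag sits. The sample string t is my own assumption, since the value used earlier in the thread isn't shown; the rest is ordinary re module usage (the flags keyword to re.sub needs Python 2.7 or later).

```python
import re

# Hypothetical sample input; the real value of t is not given in the thread.
t = "Python26"

# The fourth parameter of re.sub is count, not flags. re.IGNORECASE is the
# integer 2, so passing it in that slot is the same as asking for at most
# two case-SENSITIVE substitutions. Spelled with the keyword here to make
# the mistake explicit.
wrong = re.sub(r"python\d\d", "Python27", t, count=int(re.IGNORECASE))

# Option 1: pass the flag by keyword.
right_kw = re.sub(r"python\d\d", "Python27", t, flags=re.IGNORECASE)

# Option 2: in-line flag at the very start of the pattern, as the docs advise.
right_inline = re.sub(r"(?i)python\d\d", "Python27", t)

# Option 3: compile first; re.compile takes flags as its second argument.
right_compiled = re.compile(r"python\d\d", re.IGNORECASE).sub("Python27", t)

print(wrong)           # unchanged: lowercase "python" never matches "Python26"
print(right_kw)
print(right_inline)
print(right_compiled)
```

With this sample input the first call leaves the string untouched, while the other three all replace it with "Python27".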



-- 
Steven




