Python 2.7 re.IGNORECASE broken in re.sub?
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Mon Aug 16 19:10:41 EDT 2010
On Mon, 16 Aug 2010 05:46:17 -0700, Alex Willmer wrote:
> On Aug 16, 12:23 pm, Steven D'Aprano <st... at REMOVE-THIS-
> cybersource.com.au> wrote:
>> On Sun, 15 Aug 2010 17:36:07 -0700, Alex Willmer wrote:
>> > On Aug 16, 1:07 am, Steven D'Aprano <st... at REMOVE-THIS-
>> > cybersource.com.au> wrote:
>> >> You're passing re.IGNORECASE (which happens to equal 2) as a count
>> >> argument, not as a flag. Try this instead:
>>
>> >> >>> re.sub(r"python\d\d" + '(?i)', "Python27", t)
>> >> 'Python27'
>>
>> > Basically right, but in-line flags must be placed at the start of a
>> > pattern, or the result is undefined.
>>
>> Pardon me, but that's clearly not correct, as proven by the fact that
>> the above example works.
>
> Undefined includes 'might work sometimes'. I refer you to the Python
> documentation:
>
> "Note that the (?x) flag changes how the expression is parsed. It should
> be used first in the expression string, or after one or more whitespace
> characters. If there are non-whitespace characters before the flag, the
> results are undefined."
> http://docs.python.org/library/re.html#regular-expression-syntax

Well so it does. I stand corrected.

I note though that even the docs say "should" rather than "must". I
wonder whether the documentation author is just being cautious, because
I've seen comments on the python-dev list that imply that the current
behaviour of flags (that their effect is global to the regex) is
supported. E.g.:

http://code.activestate.com/lists/python-dev/98681/

At the point that people are seriously considering changing the behaviour
of a replacement re engine in order to support the current "undefined"
behaviour, perhaps that behaviour isn't quite so undefined and the docs
need to be re-written?
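
In the meantime, here is a minimal sketch of the spellings that don't
lean on any undefined behaviour. The sample string t is my own
assumption (the original value isn't shown in this thread), and the
variable names are purely illustrative:

```python
import re

# Hypothetical sample input; the thread does not show the real value of t.
t = "Python26"

# re.sub's fourth parameter is count, not flags, and re.IGNORECASE has the
# integer value 2. Passing the flag in that slot therefore behaves like
# count=2 with a case-sensitive pattern, so nothing matches here.
as_count = re.sub(r"python\d\d", "Python27", t, count=re.IGNORECASE)

# Option 1: pass the flag by keyword (re.sub accepts flags= from 2.7 on).
by_keyword = re.sub(r"python\d\d", "Python27", t, flags=re.IGNORECASE)

# Option 2: put the in-line flag at the very start of the pattern,
# which is the placement the documentation recommends.
inline_first = re.sub(r"(?i)python\d\d", "Python27", t)

# Option 3: compile the pattern with the flag, then call its sub() method.
compiled = re.compile(r"python\d\d", re.IGNORECASE).sub("Python27", t)

print(as_count, by_keyword, inline_first, compiled)
```

The keyword form is probably the least surprising of the three, since
it can't be confused with the count argument.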
--
Steven