On 16.07.2010 17:08, Vlastimil Brom wrote:
2010/7/9 Georg Brandl <g.brandl@gmx.net>:
On 09.07.2010 02:35, MRAB wrote:
1. Some of the inline flags are scoped; for example, putting "(?i)" at the end of a regex will now have no effect because it's no longer a global, all-or-nothing, flag.
That is problematic. I've often seen people put these flags at the end of a regex, probably for readability purposes. IMHO it would be better to limit flag scoping to the explicit (?flags-flags: ) groups.
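To make the distinction concrete, here is a small sketch using the stdlib re module. It assumes Python 3.6 or later, where re accepts the explicit scoped form "(?i:...)"; the pattern and test strings are made up for illustration and are not taken from this thread.

```python
import re

# Global inline flag at the start: the whole pattern is case-insensitive.
whole = re.compile(r"(?i)spam eggs")

# Explicit scoped group: only "spam" is case-insensitive,
# " eggs" must still match exactly as written.
scoped = re.compile(r"(?i:spam) eggs")

whole_matches = bool(whole.fullmatch("SPAM EGGS"))
scoped_matches_upper_tail = bool(scoped.fullmatch("SPAM EGGS"))
scoped_matches_lower_tail = bool(scoped.fullmatch("SPAM eggs"))

print(whole_matches, scoped_matches_upper_tail, scoped_matches_lower_tail)
```

With the explicit group the scope is visible in the pattern itself, which is the property the proposal above relies on.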
I just noticed the formulation on the reference page regular-expressions.info on this kind of flags: "(?i) Turn on case insensitivity for the remainder of the regular expression. (Older regex flavors may turn it on for the entire regex.)" and likewise for other flags.
http://www.regular-expressions.info/refadv.html
I am not sure how "authoritative" this page by Jan Goyvaerts is for various implementations, but it looks like a very comprehensive reference.
Nevertheless, the authoritative reference for our regex engine is its docs, i.e. http://docs.python.org/library/re.html -- and that states clearly that inline flags apply to the whole regex.
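A rough way to check what a given interpreter does with a trailing flag is sketched below. The helper name trailing_flag_result is hypothetical. The sketch assumes a current Python 3: versions 3.6 to 3.10 still apply a trailing "(?i)" to the whole pattern but emit a DeprecationWarning, while 3.11 and later reject the pattern with re.error.

```python
import re
import warnings


def trailing_flag_result(pattern, text):
    """Return True/False for a match, or None if the pattern is rejected."""
    try:
        with warnings.catch_warnings():
            # Older Python 3 versions warn about flags not at the start.
            warnings.simplefilter("ignore", DeprecationWarning)
            return bool(re.match(pattern, text))
    except re.error:
        # Python 3.11+ refuses global flags that are not at the start.
        return None


leading = trailing_flag_result(r"(?i)abc", "ABC")
trailing = trailing_flag_result(r"abc(?i)", "ABC")
print(leading, trailing)
```

A result of True for the trailing form means the flag was applied to the whole regex, as the docs quoted above describe; None means the interpreter no longer accepts that spelling at all.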
I think that with a new regex implementation, not all of these "historical" semantics need to be copied, unless there are major real use cases that would be affected by this.
As I already said, I *have* seen this in real code. As MRAB indicated, this was the only silent change in semantics compared to the old regex engine. If we replace re by regex, which I think is the only way to get the new features into the stdlib, changing this one aspect is a) not backwards compatible and b) incompatible in a subtle way that forces everyone to review his/her regular expressions. That's definitely not acceptable.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.