[Python-Dev] Behavior of matching backreferences
Gustavo Niemeyer
niemeyer@conectiva.com
Mon, 24 Jun 2002 00:04:58 -0300
> > I still think it should, because otherwise the "^(a)?b\1$" can never be
> > used, and this expression will become "^((a)?)b\1$" if more than one
> > character is needed.
>
> Is that a real concern? I mean that in the sense of whether you have an
> actual application requiring that some multi-character bracketing string
> either does or doesn't appear on both ends of a thing, and typing another
> set of parens is a burden. Both parts of that seem strained.
No, it isn't, both because there is a way to implement this, as Barry
and you have shown, and because *I* know it doesn't work as I'd
expect. :-))
Indeed, I found it while implementing another feature which in my
opinion is really useful and can't easily be achieved. But that's
something for another thread, another day.
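
For the record, here is a minimal sketch of the behavior discussed above,
assuming a current CPython and its standard `re` module. The helper
`matches` and the sample strings are only illustrative. It compares the
two patterns quoted at the top of this message: with "^(a)?b\1$" the
backreference fails whenever group 1 did not participate in the match,
while with "^((a)?)b\1$" group 1 always participates (possibly matching
the empty string), so the "absent on both ends" case is accepted.

```python
import re

# The two patterns quoted in this thread.
optional_group = re.compile(r"^(a)?b\1$")    # group 1 may not participate
wrapped_group = re.compile(r"^((a)?)b\1$")   # group 1 always participates


def matches(pattern, text):
    """Return True if the compiled pattern matches the whole text."""
    return pattern.match(text) is not None


# For each sample, record (result with "(a)?", result with "((a)?)").
results = {
    text: (matches(optional_group, text), matches(wrapped_group, text))
    for text in ("aba", "b", "ab", "ba")
}

for text, (plain, wrapped) in results.items():
    print(f"{text!r}: (a)? -> {plain}, ((a)?) -> {wrapped}")
```

Only "b" separates the two patterns: the extra set of parens turns "group
did not match" into "group matched the empty string", which is the
workaround mentioned above.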
[...]
> Your example is hiding in there, on the "third iteration of the outer
> loop". The official POSIX interpretation was that it should match just the
> first 6 characters, and not the trailing #,
>
> because in a third iteration of the outer subexpression, . would match
> nothing (as distinct from matching a null string) and hence \2 would
> match nothing.
[...]
Thanks for giving me a strong and detailed reason. I understand that
small issues can end up in endless discussions and different
implementations. I'm happy that the POSIX people thought about that
before me <2.0 wink>.
> > Could you please reject the patch at SF?
>
> I'm not sure which one you mean, so on your authority I'm going to reject
> all patches at SF. Whew! This makes our job much easier <wink>.
That's good! You'll get back the time you wasted on me. ;-))
--
Gustavo Niemeyer
[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]