Match First Sequence in Regular Expression?

Tim Chase python.list at
Thu Jan 26 20:23:47 CET 2006

>> "xyz123aaabbaaabab"
>> where you have "aaab" in there twice.
> Good suggestion.

I assumed that this would be a valid case.  If not, the
expression would need tweaking.

>> ^([^b]|((?<!a)b))*aaab+[ab]*$
> Looks good, although I've been unable to find a good
> explanation of the "negative lookbehind" construct "(?<".  How
> does it work?

The beginning part of the expression

	([^b]|((?<!a)b))*

breaks down as

	[^b]        anything that isn't a "b"
	|           or
	(...)       this other thing

where "this other thing" is

	(?<!a)b     a "b" as long as it isn't immediately
	            preceded by an "a"
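
As a rough sketch, the same breakdown can be written with Python's
re.VERBOSE flag, which lets the comments live inside the pattern.  The
name "prefix" and the sample strings are just mine for illustration.

```python
import re

# The leading group of the expression, spelled out piece by piece.
prefix = re.compile(r"""
    (
        [^b]        # anything that isn't a "b"
      |             # or
        (?<!a)b     # a "b" not immediately preceded by an "a"
    )*
""", re.VERBOSE)

# The group consumes characters until it would have to take a "b"
# that directly follows an "a".
print(prefix.match("xyz123aaabb").group(0))   # xyz123aaa
print(prefix.match("bxab").group(0))          # bxa
```

A leading "b" is accepted because there is nothing before it, so the
lookbehind has no "a" to object to.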

The "(?<!...)" construct means that the "..." portion can't come 
immediately before the following token; in this case, an "a" can't 
come before a "b".
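
To see the lookbehind on its own, here is a small sketch using
Python's re module.  The sample string "abxbbab" and the variable
names are made up for the demonstration; the full pattern is the one
quoted earlier in the thread.

```python
import re

# (?<!a)b matches a "b" only when the character just before it
# is not an "a".  The lookbehind itself consumes nothing.
lookbehind = re.compile(r"(?<!a)b")

# Indexes:  a=0 b=1 x=2 b=3 b=4 a=5 b=6
positions = [m.start() for m in lookbehind.finditer("abxbbab")]
print(positions)   # [3, 4]  (the "b"s at 1 and 6 follow an "a")

# The full expression from the thread, tried on the test string.
full = re.compile(r"^([^b]|((?<!a)b))*aaab+[ab]*$")
print(full.match("xyz123aaabbaaabab") is not None)   # True
print(full.match("abaaab") is not None)              # False
```

The second string fails because an "ab" shows up before the first
"aaab", which is exactly what the leading group is there to forbid.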

There's also a "negative lookahead" (rather than "lookbehind") 
which prevents items from following.  This should be usable in 
this scenario as well and works with the aforementioned tests, using

	^([^a]|(a(?!b)))*aaab+[ab]*$


which would be "anything that's not an 'a'; or an 'a' as long as 
it's not followed by a 'b'"
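
A quick sketch of the lookahead flavour in Python.  Note the
assumption: the full lookahead pattern below is built by swapping the
leading group of the lookbehind expression for the "not an 'a', or an
'a' not followed by a 'b'" form described above; the sample strings
are also mine.

```python
import re

# a(?!b) matches an "a" only when the next character is not a "b".
lookahead = re.compile(r"a(?!b)")

# Indexes:  a=0 b=1 x=2 a=3 a=4 a=5 b=6
positions = [m.start() for m in lookahead.finditer("abxaaab")]
print(positions)   # [3, 4]  (the "a"s at 0 and 5 are followed by "b")

# Lookbehind version from the thread, and the assumed lookahead twin.
behind = re.compile(r"^([^b]|((?<!a)b))*aaab+[ab]*$")
ahead = re.compile(r"^([^a]|(a(?!b)))*aaab+[ab]*$")

# Both should agree on each sample.
samples = ["xyz123aaabbaaabab", "aaab", "abaaab", "xyzaaabx"]
results = [(s, behind.match(s) is not None, ahead.match(s) is not None)
           for s in samples]
for row in results:
    print(row)
```

On these samples the two versions give the same answer: the first two
strings match, the last two don't.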

The gospel is at:

but is a bit terse.  O'Reilly has a fairly good book on regexps if 
you want to dig a bit deeper.

