Using Groups inside Braces with Regular Expressions

MRAB google at mrabarnett.plus.com
Mon Jul 14 02:14:23 CEST 2008


On Jul 14, 12:05 am, Chris <chriss... at gmail.com> wrote:
> I'm trying to delimit sentences in a block of text by defining the
> end-of-sentence marker as a period followed by a space followed by an
> uppercase letter or end-of-string.
>
> I'd imagine the regex for that would look something like:
> [^(?:[A-Z]|$)]\.\s+(?=[A-Z]|$)
>
> However, Python keeps giving me an "unbalanced parenthesis" error for
> the [^] part. If this isn't valid regex syntax, how else would I match
> a block of text that doesn't contain the delimiter pattern?
>
What is the [^(?:[A-Z]|$)] part meant to be doing? Is it meant to be
matching everything up to the end of the sentence?

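[...] is a character class, and it ends at the first ']', so Python is
parsing the character class as:

[^(?:[A-Z]|$)]
^^^^^^^^^^

That leaves '|$)' outside the class, and its ')' has no matching '(',
hence the "unbalanced parenthesis" error.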
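
A minimal sketch, assuming Python's standard re module, that first
reproduces the error and then shows one possible way to get the split.
The second pattern is only a guess at the intent (split at whitespace
that follows a period and precedes an uppercase letter or the end of
the string); the names sentence_break and text are made up for
illustration.

```python
import re

# The pattern from the original question, unchanged.
pattern = r"[^(?:[A-Z]|$)]\.\s+(?=[A-Z]|$)"

# The character class ends at the first "]", so it is just "[^(?:[A-Z]".
# That leaves "|$)" outside the class, and its ")" has no opening "(".
try:
    re.compile(pattern)
    error_message = None
except re.error as exc:
    error_message = str(exc)
print(error_message)

# Assumption: the goal is to split at whitespace that follows a period
# and precedes an uppercase letter or the end of the string.
sentence_break = re.compile(r"(?<=\.)\s+(?=[A-Z]|$)")

text = "It cost approx. five dollars. Then we left."
sentences = sentence_break.split(text)
print(sentences)
```

The lookbehind and lookahead match without consuming the period or the
uppercase letter, so each sentence keeps its period and the next one
keeps its first letter. A period followed by a lowercase word (as in
"approx. five") is not treated as a sentence break.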


