Using Groups inside Braces with Regular Expressions
MRAB
google at mrabarnett.plus.com
Sun Jul 13 20:14:23 EDT 2008
On Jul 14, 12:05 am, Chris <chriss... at gmail.com> wrote:
> I'm trying to delimit sentences in a block of text by defining the
> end-of-sentence marker as a period followed by a space followed by an
> uppercase letter or end-of-string.
>
> I'd imagine the regex for that would look something like:
> [^(?:[A-Z]|$)]\.\s+(?=[A-Z]|$)
>
> However, Python keeps giving me an "unbalanced parenthesis" error for
> the [^] part. If this isn't valid regex syntax, how else would I match
> a block of text that doesn't match the delimiter pattern?
>
What is the [^(?:[A-Z]|$)] part meant to be doing? Is it meant to be
matching everything up to the end of the sentence?
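[...] is a character class, so Python is parsing the character class
as:
[^(?:[A-Z]|$)]
^^^^^^^^^^
The class ends at the first "]", so everything after it, starting with
"|$)]", is outside the class. The ")" there has no matching "(", hence
the "unbalanced parenthesis" error.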
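
If it helps, here is a small sketch that shows the failure and one
possible way to get what I think you're after. The names BROKEN,
compiles, SENTENCE_END and split_sentences are mine, and the lookbehind
approach is only an assumption about what you want (splitting on the
whitespace after a period), not the only way to do it:

```python
import re

# The pattern from the question. The character class closes at the
# first "]", which leaves an unmatched ")" later in the pattern.
BROKEN = r"[^(?:[A-Z]|$)]\.\s+(?=[A-Z]|$)"


def compiles(pattern):
    """Return True if the pattern compiles, False if re rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Assumed alternative: split on whitespace that follows a period and
# precedes an uppercase letter or the end of the string.
SENTENCE_END = re.compile(r"(?<=\.)\s+(?=[A-Z]|$)")


def split_sentences(text):
    """Split text into sentences, keeping each period with its sentence."""
    return SENTENCE_END.split(text.strip())


print(compiles(BROKEN))  # False
print(split_sentences("The cat sat. It was 3.5 kg. Then it left."))
# ['The cat sat.', 'It was 3.5 kg.', 'Then it left.']
```

The lookbehind keeps the period attached to its sentence. Abbreviations
such as "Dr. Smith" will still be split, which the rule as you stated it
can't avoid.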