python regex character group matches

Steven D'Aprano steve at
Wed Sep 17 16:55:35 CEST 2008

On Wed, 17 Sep 2008 15:56:31 +0200, Fredrik Lundh wrote:

> Assuming that you want to find runs of \uXXXX escapes, simply use
> non-capturing parentheses:
>     pat = re.compile(u"(?:\\\u[0-9A-F]{4})")

Doesn't work for me:

>>> pat = re.compile(u"(?:\\\u[0-9A-F]{4})")
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 
5-7: truncated \uXXXX escape

Assuming that the OP is searching byte strings, I came up with this:

>>> pat = re.compile('(\\\u[0-9A-F]{4})+')


More information about the Python-list mailing list