re.compile.match() results in unicode strings - why?

Kent Johnson kent3737 at yahoo.com
Thu Nov 11 10:42:50 EST 2004


Axel Bock wrote:
> Hi,
> 
> I am doing matches with the re module, and I am experiencing a strange 
> problem. I match a string with
>     exp = re.compile(blah)
>     m = exp.match(string)
>     a,b,c,d = m.groups()
> now a,b,c,d are all string variables, and they all come out as unicode 
> strings (u"xxx").

Apparently, if the string being matched is unicode, then the groups will be unicode as well:
 >>> import re
 >>> r=re.compile('(ab)')
 >>> r.match('abc').groups()
('ab',)
 >>> r.match(u'abc').groups()
(u'ab',)

Are you sure that the string you pass to exp.match() is not a unicode string?
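
One way to check is to compare the type of each group with the type of the string that was matched. This is only a minimal sketch: the pattern and the subject string below are hypothetical stand-ins for the ones in the original post, and the explicit encode() step assumes that byte strings are what is actually wanted.

```python
import re

# Hypothetical pattern and subject; substitute your own.
exp = re.compile(r'(\w+)-(\w+)')
subject = u'foo-bar'  # a unicode subject, like u'abc' in the session above

m = exp.match(subject)
groups = m.groups()

# Each group is a substring of the subject, so it has the subject's type.
same_type = all(type(g) is type(subject) for g in groups)
print(type(subject).__name__, same_type)

# Assumption: byte strings are wanted, so encode each group explicitly.
encoded = tuple(g.encode('utf-8') for g in groups)
```

If same_type is true and the subject turns out to be unicode, the place to look is wherever that string was created (for example a file, an XML parser, or a GUI toolkit), not the re module.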

Kent


