[Python-bugs-list] [ python-Bugs-610299 ] unicode alphanumeric regexp bug
SourceForge.net
noreply@sourceforge.net
Sun, 23 Feb 2003 17:29:20 -0800
Bugs item #610299, was opened at 2002-09-16 21:18
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=610299&group_id=5470
Category: Regular Expressions
Group: Python 2.3
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Florent Guillaume (efge)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: unicode alphanumeric regexp bug
Initial Comment:
I've got the following problem in Python 2.1, 2.2 and
2.3a0 (Debian):
>>> import re
>>> re.compile(r'\w+', re.U).sub('X', u'hello caf\xe9')
u'X X'
>>> re.compile(r'\w{1}', re.U).sub('X', u'hello caf\xe9')
u'XXXXX XXXX'
>>> re.compile(r'\w', re.U).sub('X', u'hello caf\xe9')
u'XXXXX XXX\xe9'
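The first two results are correct, but the third is not: the
trailing \xe9 is left unreplaced, whereas the expected result is
u'XXXXX XXXX', the same as for \w{1}.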
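
For anyone wanting to check an interpreter for this behaviour, the three cases above can be written as a small script. This is only a sketch and assumes a modern Python 3 interpreter, where str is already Unicode and the u'' prefix is unnecessary; the helper name sub_word_chars is made up for illustration and is not part of the re module.

```python
import re

# Same input as in the report: "hello" followed by "caf" plus U+00E9 (e with acute).
text = "hello caf\xe9"


def sub_word_chars(pattern, s):
    """Replace every match of `pattern` in `s` with 'X', using Unicode-aware \\w."""
    return re.compile(pattern, re.UNICODE).sub("X", s)


# The three patterns from the report, keyed by pattern string.
results = {p: sub_word_chars(p, text) for p in (r"\w+", r"\w{1}", r"\w")}

for pattern, replaced in results.items():
    print(pattern, "->", ascii(replaced))

# \w and \w{1} match one word character at a time, so on a correct
# interpreter they must produce identical output.
assert results[r"\w"] == results[r"\w{1}"]
```

If the final assertion fails, the single-character \w case is not honouring the Unicode flag, which is the symptom described above.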
----------------------------------------------------------------------
>Comment By: Guido van Rossum (gvanrossum)
Date: 2003-02-23 20:29
Message:
Logged In: YES
user_id=6380
Fixed in 2.3 CVS using Greg's patch. Will backport to 2.2 as
well.
----------------------------------------------------------------------
Comment By: Greg Chapman (glchapman)
Date: 2002-11-04 11:51
Message:
Logged In: YES
user_id=86307
I just posted a small patch to sre_compile.py which should fix this:
http://sourceforge.net/tracker/?func=detail&aid=633359&group_id=5470&atid=305470
----------------------------------------------------------------------