[Python-Dev] Unicode regexp problem

Aahz aahz@pythoncraft.com
Mon, 16 Sep 2002 21:05:19 -0400


On Tue, Sep 17, 2002, Florent Guillaume wrote:
>
> I've got the following problem, in python 2.1, 2.2 and 2.3a0 (Debian):
> 
> >>> import re
> >>> re.compile(r'\w+',   re.U).sub('X', u'hello caf\xe9')
> u'X X'
> >>> re.compile(r'\w{1}', re.U).sub('X', u'hello caf\xe9')
> u'XXXXX XXXX'
> >>> re.compile(r'\w',    re.U).sub('X', u'hello caf\xe9')
> u'XXXXX XXX\xe9'
> 
> The first two results are ok, but the third is not.
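
For anyone filing the report, the three cases above can be folded into one
self-contained check. This is only a sketch, written for a current Python 3
interpreter (an assumption; the report itself concerns 2.1 through 2.3a0,
where the u'' prefix and re.U were needed for Unicode matching). The helper
names substitute_all and single_char_forms_agree are made up for
illustration and are not part of the re module.

```python
import re

# Same sample string as in the report; \xe9 is "e" with an acute accent.
TEXT = "hello caf\xe9"


def substitute_all(text=TEXT):
    """Replace every match of each pattern with 'X' and collect the results."""
    patterns = [r"\w+", r"\w{1}", r"\w"]
    return {p: re.compile(p, re.UNICODE).sub("X", text) for p in patterns}


def single_char_forms_agree(text=TEXT):
    """\\w{1} and \\w describe the same matches, so the results should be equal."""
    results = substitute_all(text)
    return results[r"\w{1}"] == results[r"\w"]


if __name__ == "__main__":
    for pattern, result in substitute_all().items():
        print(pattern, repr(result))
    print("single-character forms agree:", single_char_forms_agree())
```

If the last line prints False on some interpreter, that interpreter shows
the inconsistency described in the report.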

python-dev is the wrong forum for bug reports, unless

a) it's *only* in the CVS tree

and

b) you know you need advice on fixing it (and are planning to help fix it)

In any case, you should file a bug report on SourceForge first (unless
you're posting to comp.lang.python to check whether it is a bug).
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/