Unicode regexp problem
I've got the following problem, in python 2.1, 2.2 and 2.3a0 (Debian):
>>> import re
>>> re.compile(r'\w+', re.U).sub('X', u'hello caf\xe9')
u'X X'
>>> re.compile(r'\w{1}', re.U).sub('X', u'hello caf\xe9')
u'XXXXX XXXX'
>>> re.compile(r'\w', re.U).sub('X', u'hello caf\xe9')
u'XXXXX XXX\xe9'
The first two results are ok, but the third is not.

Thanks,
Florent

PS: I'd appreciate a Cc on answers.

-- 
Florent Guillaume, Nuxeo (Paris, France)
+33 1 40 33 79 87 http://nuxeo.com mailto:fg@nuxeo.com
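For readers who want to check the three substitutions from the report on a current interpreter, here is a minimal sketch. It assumes Python 3, where `str` patterns are Unicode-aware by default; the helper name `replace_words` and the variable names are hypothetical and not part of the original report. The last call uses `re.ASCII` only to illustrate what a `\w` that does not treat `é` as a word character produces; it says nothing about the actual cause of the behaviour reported for Python 2.1 to 2.3a0.

```python
import re

# Same string as in the report; '\xe9' is 'é'.
text = 'hello caf\xe9'


def replace_words(pattern, flags=re.UNICODE):
    """Replace every match of `pattern` in `text` with 'X' (hypothetical helper)."""
    return re.compile(pattern, flags).sub('X', text)


# The three patterns from the report, with Unicode-aware matching.
results = {p: replace_words(p) for p in (r'\w+', r'\w{1}', r'\w')}
for pattern, result in results.items():
    print(repr(pattern), '->', repr(result))

# For contrast only: with ASCII-only matching, 'é' is not matched by \w,
# so the last character is left untouched.
ascii_result = replace_words(r'\w', re.ASCII)
print('ASCII \\w ->', repr(ascii_result))
```

With Unicode-aware matching, `\w` and `\w{1}` should give the same result, which is the behaviour the report expects from the third call.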
On Tue, Sep 17, 2002, Florent Guillaume wrote:
> I've got the following problem, in python 2.1, 2.2 and 2.3a0 (Debian):
>
> >>> import re
> >>> re.compile(r'\w+', re.U).sub('X', u'hello caf\xe9')
> u'X X'
> >>> re.compile(r'\w{1}', re.U).sub('X', u'hello caf\xe9')
> u'XXXXX XXXX'
> >>> re.compile(r'\w', re.U).sub('X', u'hello caf\xe9')
> u'XXXXX XXX\xe9'
>
> The first two results are ok, but the third is not.
python-dev is the wrong forum for bug reports, unless

a) it's *only* in the CVS tree and
b) you know you need advice for fixing it (and are planning to help fix)

In any case, you should write a bug report on SourceForge first (unless you're posting to c.l.python to check whether it is a bug).

-- 
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/