[Python-bugs-list] [ python-Bugs-599377 ] re searches don't work with 4-byte unico
noreply@sourceforge.net
Tue, 27 Aug 2002 09:49:27 -0700
Bugs item #599377, was opened at 2002-08-23 19:16
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=599377&group_id=5470
Category: Python Library
Group: Python 2.2.1
Status: Open
Resolution: None
Priority: 5
Submitted By: Jim Fulton (dcjim)
Assigned to: Fredrik Lundh (effbot)
Summary: re searches don't work with 4-byte unico
Initial Comment:
With Python 2.2.1 or the CVS head as of this posting,
when Python is configured for 4-byte Unicode
(--enable-unicode=ucs4),
searches with Unicode regular expressions that use
characters above \xff don't seem to work.
Here's an example:
import re
invalid_xml_char = re.compile(u'[\ud800-\udfff]')
invalid_xml_char.search(u'\ud800')
The search returns None rather than a match.
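A minimal sketch for checking the report on a given interpreter. It assumes sys.maxunicode is used to tell a UCS-4 build from a UCS-2 one; the helper names is_wide_build and surrogate_search_matches are hypothetical and do not come from the report:

```python
import re
import sys


def is_wide_build():
    # sys.maxunicode is 0x10FFFF on a UCS-4 build and 0xFFFF on a UCS-2 build.
    return sys.maxunicode > 0xFFFF


def surrogate_search_matches():
    # Same pattern and subject string as in the report above.
    invalid_xml_char = re.compile(u'[\ud800-\udfff]')
    return invalid_xml_char.search(u'\ud800') is not None
```

On an interpreter showing the reported behaviour, is_wide_build() would be true while surrogate_search_matches() would be false.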
----------------------------------------------------------------------
>Comment By: Peter Schneider-Kamp (nowonder)
Date: 2002-08-27 16:49
Message:
I could reproduce this behaviour exactly. No idea what is
causing it, though.
----------------------------------------------------------------------