[Python-Dev] re with Unicode broken?
Fredrik Lundh
fredrik@pythonware.com
Fri, 13 Jul 2001 16:44:22 +0200
sjoerd wrote:
> This is not for the faint of heart.
>
> My validating XML parser doesn't work anymore, even though I didn't
> change a thing (except update Python from CVS).

when did you last update without problems?

the likely cause for this is MvL's "big char set" patch, which
I checked in on July 6.

here's a workaround: tweak sre_compile.py so it doesn't generate
BIGCHARSET op codes.  in _optimize_charset, change this:
    except IndexError:
        # character set contains unicode characters
        return _optimize_unicode(charset, fixup)
    # compress character map
to
    except IndexError:
        # character set contains unicode characters
        return charset  # WORKAROUND: no compression
    # compress character map
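
to see whether the workaround makes a difference, a small check along these lines might help. this is only a sketch: the pattern and the helper name unicode_charset_works are made up for illustration (they are not taken from sjoerd's parser), and the string literals assume a Python where "\u...." escapes work in plain strings. the idea is just to compile a character class that contains code points above 255, which is the case the except branch above handles, and confirm that matching still behaves.

```python
import re


def unicode_charset_works():
    """Return True if a character class with non-Latin-1 members matches sanely.

    The pattern is hypothetical: it mixes an ASCII range with two ranges
    whose code points are above 255, so the charset cannot be handled as a
    plain 256-entry character map.
    """
    pattern = re.compile("[a-z\u0100-\u017f\u0400-\u04ff]+")

    # Should match the lowercase ASCII and the two wide characters,
    # and stop at the first uppercase ASCII letter.
    hit = pattern.match("ab\u0101\u0410XYZ")
    if hit is None or hit.group() != "ab\u0101\u0410":
        return False

    # Should not match anything outside the class.
    return pattern.match("XYZ") is None


if __name__ == "__main__":
    print(unicode_charset_works())
```

if this reports a failure before the change and success after it, the charset compression is the likely culprit; if it passes both times, the problem is probably triggered by something more specific in the parser's own patterns.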
I'll look into this over the weekend.
Cheers /F