[XML-SIG] New DOM code checked in
Dieter Maurer
dieter@handshake.de
Mon, 22 Mar 1999 18:01:10 +0000 (/etc/localtime)
Lars Marius Garshol writes:
> | [Unicode] Now I have no idea what to do; switch to using Fredrik's
> | code and adapt PyExpat to it, stick with Martin's module, or what?
>
> I'm in a similar quandary. I'd very much like to add Unicode support
> to xmlproc, but to do that I need support in RE. Any feedback on what
> Guido thinks and what the others here think would be welcome.
I have started to extend "pcre" to handle wide-character strings
(a string consists uniformly of 1-, 2- or 4-byte units; thus this
is a fixed-width (UCS-2) rather than a multibyte (UTF-8)
approach).
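
To illustrate the fixed-width versus multibyte distinction above, here is a
minimal Python sketch. It is not part of the "pcre" extension; the helper
name unit_at is hypothetical, and the codecs "utf-16-le" and "utf-32-le"
merely stand in for 2-byte and 4-byte unit strings (for the characters used
here, UTF-16 coincides with UCS-2).

```python
text = "a\u00e9\u20ac"  # 'a', e-acute, euro sign

utf8 = text.encode("utf-8")       # multibyte: 1 to 4 bytes per character
ucs2 = text.encode("utf-16-le")   # 2-byte units (same as UCS-2 for these characters)
ucs4 = text.encode("utf-32-le")   # 4-byte units

# In UTF-8 the width depends on the character.
utf8_widths = [len(ch.encode("utf-8")) for ch in text]


def unit_at(buf, index, width):
    """Return the code point of the index-th unit in a fixed-width,
    little-endian buffer (hypothetical helper for illustration)."""
    start = index * width
    return int.from_bytes(buf[start:start + width], "little")


if __name__ == "__main__":
    print(utf8_widths)                  # per-character byte counts in UTF-8
    print(hex(unit_at(ucs2, 2, 2)))     # third character, found by plain indexing
```

With fixed-width units the n-th character is found by multiplying an index,
which is what lets a matcher step through the subject one unit at a time; in
UTF-8 it would have to decode a variable number of bytes per character.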
Currently, I require that all RE metacharacters are still
ASCII; e.g. in "{n,m}" I would recognize only ASCII
digits, not ARABIC-INDIC digits. There is, of course, no
restriction on the characters that match themselves.
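
The ASCII-only rule for metacharacters could look roughly like the sketch
below. This is an assumption about the behaviour described above, written in
Python rather than taken from the "pcre" sources; parse_bounds and
ASCII_DIGITS are hypothetical names.

```python
ASCII_DIGITS = "0123456789"


def parse_bounds(spec):
    """Parse a "{n,m}" repeat specification, accepting ASCII digits only.

    Returns (n, m) on success, or None if the text is not a valid
    specification under the ASCII-only rule.
    """
    if not (spec.startswith("{") and spec.endswith("}")):
        return None
    parts = spec[1:-1].split(",")
    if len(parts) != 2:
        return None
    for part in parts:
        # str.isdigit() and int() would also accept ARABIC-INDIC digits,
        # so membership in ASCII_DIGITS is tested explicitly.
        if not part or any(ch not in ASCII_DIGITS for ch in part):
            return None
    return int(parts[0]), int(parts[1])


if __name__ == "__main__":
    print(parse_bounds("{2,5}"))
    print(parse_bounds("{\u0662,\u0665}"))  # ARABIC-INDIC TWO and FIVE
```

Only the syntax of the pattern is restricted this way; an ARABIC-INDIC digit
appearing as an ordinary character in a pattern would still match itself in
the subject string.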
Things like canonical mapping and canonical writing direction
are not handled (it is more a wide-character PCRE than one with
real Unicode support).
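
As a rough illustration of what missing canonical mapping means in practice,
the Python sketch below compares a precomposed character with its decomposed
form. It assumes "canonical mapping" refers to Unicode canonical equivalence,
and it uses Python's own re and unicodedata modules, not the extended "pcre".

```python
import re
import unicodedata

composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# Matching code point by code point: the two spellings do not match.
plain = re.fullmatch(composed, decomposed)

# Normalizing both pattern and subject to NFC first makes them match.
normalized = re.fullmatch(
    unicodedata.normalize("NFC", composed),
    unicodedata.normalize("NFC", decomposed),
)

if __name__ == "__main__":
    print(plain is None)
    print(normalized is not None)
```

A matcher that works purely on wide characters behaves like the first call;
callers who need the second behaviour would have to normalize their strings
themselves before matching.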
Of course, I will announce the module as soon as it reaches
alpha. However, I am making only slow progress.
- Dieter