[XML-SIG] Re: Issues with Unicode type

Eric van der Vlist vdv@dyomedea.com
26 Sep 2002 09:08:53 +0200


On Wed, 2002-09-25 at 22:39, M.-A. Lemburg wrote:
>
> I book all this under FUD. It'll take a bit of time, but we'll
> eventually move there. For now, I think the issues around
> surrogates and the need for non-BMP code points in real life
> applications are a bit overhyped.

I think that depends on what we call real life, and more precisely on
whether you consider full conformance to standards and W3C
recommendations to be part of real life.

Having never needed them myself, I can't consider non-BMP code points
an absolute requirement in themselves.

OTOH, working on implementations of standards (or recs) without aiming
for complete conformance is something I consider dangerous, and I am
reaching a point where Python doesn't look like an adequate platform
for implementing W3C XML Schema datatypes (and hardly an adequate
platform for implementing Relax NG) because of its lack of support for
non-BMP code points.

1) For Relax NG:

The issue can be solved by using other mechanisms to test "NCName"s, but
the regular expression which I am using right now doesn't work when the
Python interpreter has been compiled with ucs4 support.
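
As an illustration of such an "other mechanism", here is a minimal
sketch of an NCName test that avoids the regexp engine entirely and
compares code points against range tables. It rests on assumptions which
are not part of this message: it uses the Name productions of XML 1.0
Fifth Edition (which include non-BMP ranges), it assumes a Python 3
interpreter in which a string is a sequence of code points, and the
names `is_ncname`, `_START_RANGES` and `_EXTRA_RANGES` are hypothetical.

```python
# Assumption: ranges follow XML 1.0 Fifth Edition NameStartChar, minus ":"
# (an NCName is a Name without colons).
_START_RANGES = [
    (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Additional characters allowed after the first one (NameChar):
# "-", ".", digits, U+00B7, combining marks, U+203F-U+2040.
_EXTRA_RANGES = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(code_point, ranges):
    """Return True if code_point falls inside any (low, high) range."""
    return any(low <= code_point <= high for low, high in ranges)


def is_ncname(text):
    """Test whether text is an NCName, one code point at a time."""
    if not text:
        return False
    if not _in_ranges(ord(text[0]), _START_RANGES):
        return False
    return all(
        _in_ranges(ord(ch), _START_RANGES) or _in_ranges(ord(ch), _EXTRA_RANGES)
        for ch in text[1:]
    )
```

On an interpreter whose strings are made of UTF-16 code units, the
surrogate pairs would have to be combined before calling `ord()`.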

2) For W3C XML Schema Datatypes:

The two issues which I am currently aware of are the length of strings,
which can be solved by implementing an application-level length
algorithm, and, more seriously, the support of the regular expressions
required for the "pattern" facet. For the latter, I don't see how we
could rely on the Python regexp features, which are buggy when compiled
as ucs4 and will not produce the expected result when compiled as ucs2.
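
To make the length part concrete, an application-level length algorithm
could look like the sketch below. It is only an illustration under
stated assumptions: the input is a sequence of UTF-16 code units given
as integers (what a ucs2 build effectively stores), a lone surrogate is
counted as one item, and the names `code_point_length` and
`utf16_units` are hypothetical helpers, not existing APIs.

```python
def code_point_length(code_units):
    """Count code points in a sequence of UTF-16 code units (integers)."""
    count = 0
    i = 0
    n = len(code_units)
    while i < n:
        unit = code_units[i]
        # A high surrogate followed by a low surrogate is one code point.
        if (0xD800 <= unit <= 0xDBFF and i + 1 < n
                and 0xDC00 <= code_units[i + 1] <= 0xDFFF):
            i += 2
        else:
            # BMP character, or a lone surrogate (assumption: counted as one).
            i += 1
        count += 1
    return count


def utf16_units(text):
    """Hypothetical helper: return the UTF-16 code units of a string."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[j:j + 2], "big") for j in range(0, len(data), 2)]
```

For example, `code_point_length(utf16_units("a\U00010000b"))` gives 3
although the string occupies four UTF-16 code units. The "pattern"
facet cannot be fixed the same way, since it depends on the regexp
engine itself.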

Unless we rely on external C extensions such as the ones developed by
Daniel for libxml, I just see no way to be "natively conformant"!

Again, we can say that it won't matter for "real life applications" and
that we don't care about conformance, but that's a dangerous path.

Thanks,

Eric
--
Rendez-vous à Paris.
                          http://www.technoforum.fr/integ2002/index.html
------------------------------------------------------------------------
Eric van der Vlist       http://xmlfr.org            http://dyomedea.com
(W3C) XML Schema ISBN:0-596-00252-1 http://oreilly.com/catalog/xmlschema
------------------------------------------------------------------------