[XML-SIG] DC DOM tests (Was: Roadmap document - finally!)

Uche Ogbuji uche.ogbuji@fourthought.com
Tue, 20 Feb 2001 11:54:34 -0700


> > > | - DOMString and text manipulating interface methods are not tested beyond
> > > |   ASCII text due to an implementation limitation of ParsedXML.DOM. So,
> > > |   implementations will not be tested if text is correctly treated when
> > > |   multi-byte UTF-16 characters are involved.
> > > 
> > > By "multi-byte UTF-16 characters" I assume you mean Unicode characters
> > > outside the BMP that are represented using two surrogates?
> > 
> > I wonder if that's what Martijn means.  I've read that most Java 
> > implementations have trouble with characters outside the BMP.  I wonder if 
> > Python handles these properly.
> 
> Depends on what you call properly.  Can you elaborate on what you
> would call proper treatment here?

Sure.  I admit it's hearsay, but I thought I'd read that, because Java's 
Unicode handling is or was underspecified, the high surrogate and the low 
surrogate could end up transposed between Java implementations or 
platforms.

Now I don't exactly write XML dissertations on "Hello Kitty" <g>, so I'm not 
likely to run into this myself, but I was wondering whether Python handles 
surrogate blocks appropriately across platforms and implementations (I guess 
including CPython -> JPython).
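
To make concrete what I mean by ordering, here is a minimal sketch of the 
UTF-16 surrogate arithmetic.  The helper names to_surrogates and 
from_surrogates are my own, made up for illustration and not part of any 
DOM package or of Python itself, and the sketch assumes a Python where 
plain strings are Unicode.  "Proper" treatment, as I'd understand it, 
means the high surrogate always comes first and the pair round-trips to 
the same code point on every implementation.

```python
def to_surrogates(code_point):
    """Split a code point outside the BMP into a (high, low) UTF-16 pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


def from_surrogates(high, low):
    """Recombine a pair; reject anything not in high-then-low order."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("expected a high surrogate followed by a low surrogate")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    high, low = to_surrogates(0x10400)
    print(hex(high), hex(low))
    print(hex(from_surrogates(high, low)))
```

A transposed pair, that is from_surrogates(low, high), raises ValueError 
here, which is the kind of check I'd hope a test suite could make.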


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python