From: email@example.com Jeff Hobbs writes:
Can someone explain to me why moving to UCS-4 is a good thing?
Because it simplifies processing of non-BMP characters, as it restores the property that you get one Unicode character per string index.
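To illustrate the point, here is a minimal Python sketch (not taken from Tcl or from any patch discussed here) that compares the number of code units for a string containing one non-BMP character. The helper name `code_units` and the sample character U+1D11E are my own choices for illustration.

```python
# Assumption: U+1D11E (MUSICAL SYMBOL G CLEF) is used only as a sample
# character outside the BMP; any code point above U+FFFF behaves the same.
text = "a\U0001D11Eb"


def code_units(s, encoding, width):
    """Return the fixed-width code units of s in the given encoding (hypothetical helper)."""
    data = s.encode(encoding)
    return [int.from_bytes(data[i:i + width], "big")
            for i in range(0, len(data), width)]


# 16-bit units: the non-BMP character occupies two indices (a surrogate pair).
utf16_units = code_units(text, "utf-16-be", 2)

# 32-bit units: every character occupies exactly one index.
ucs4_units = code_units(text, "utf-32-be", 4)

print([hex(u) for u in utf16_units])
print([hex(u) for u in ucs4_units])
```

With 16-bit units, index 1 holds only half of the character, so indexing, slicing, and length all need surrogate-aware code; with 32-bit units they do not.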
While Tcl is agnostic about non-BMP chars (all 2 of them ... ha ha), it does have correct UCS-4 support (though not completely, given how Red Hat patched it). This has been discussed before briefly here:
Which of the follow-up messages in this report do you consider reliable information? davygrvy's comments appear to be irrelevant, as they talk about Unicode 3.0; keithp's likewise. Your own comment appears to talk about possible future changes, instead of the current code.
BTW, I mentioned this because I'm not sure that the reasoning behind moving to a 32-bit integral type was due to RH's desire to support the extra chars in Unicode 4 (after all, without shipping fonts to display them ... what's the point?). Keith Packard, who submitted the bug report (an RFE, really), is one of the major XFree maintainers (err ... I guess that's xwin now). In any case, he wanted to allow 32-bit in X in part for ease of processing, the advantages of word alignment, and other things.
IOW, I'm not really sure that this was all done to support UCS-4 specifically, although that may have been a consideration.
Jeff Hobbs The Tcl Guy Senior Developer http://www.ActiveState.com/