<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><br><div><div>On Aug 26, 2011, at 8:51 PM, Terry Reedy wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div><br><br>On 8/26/2011 8:42 PM, Guido van Rossum wrote:<br><blockquote type="cite">On Fri, Aug 26, 2011 at 3:57 PM, Terry Reedy &lt;<a href="mailto:tjreedy@udel.edu">tjreedy@udel.edu</a>&gt; wrote:<br></blockquote><br><blockquote type="cite"><blockquote type="cite">My impression is that a UTF-16 implementation, to be properly called such,<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">must do len and [] in terms of code points, which is why Python's narrow<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">builds are called UCS-2 and not UTF-16.<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">I don't think anyone else has that impression. Please cite chapter and<br></blockquote><blockquote type="cite">verse if you really think this is important. IIUC, UCS-2 does not<br></blockquote><blockquote type="cite">allow surrogate pairs, whereas Python (and Java, and .NET, and<br></blockquote><blockquote type="cite">Windows) 16-bit strings all do support surrogate pairs. And they all<br></blockquote><br>For that reason, I think UTF-16 is a better term than UCS-2 for narrow builds (whether or not the above impression is true).<br></div></blockquote><br></div><div>I agree. 
It's weird to call something UCS-2 if code points above 65535 are representable.</div><div>The naming convention for codecs is that the UTF prefix is used for lossless encodings that cover the entire range of Unicode.</div><div><br></div><div>"<span class="Apple-style-span" style="font-family: sans-serif; font-size: 13px; line-height: 19px; background-color: rgb(255, 255, 255); ">The first amendment to the original edition of the UCS defined</span><span class="Apple-style-span" style="font-family: sans-serif; font-size: 13px; line-height: 19px; background-color: rgb(255, 255, 255); "> </span><span class="Apple-style-span" style="font-family: sans-serif; font-size: 13px; line-height: 19px; background-color: rgb(255, 255, 255); "><a href="http://en.wikipedia.org/wiki/UTF-16" title="UTF-16" class="mw-redirect" style="text-decoration: none; color: rgb(6, 69, 173); background-image: none; background-attachment: initial; background-origin: initial; background-clip: initial; background-color: initial; background-position: initial initial; background-repeat: initial initial; ">UTF-16</a></span><span class="Apple-style-span" style="font-family: sans-serif; font-size: 13px; line-height: 19px; background-color: rgb(255, 255, 255); ">, an extension of UCS-2, to represent code points outside the BMP."</span></div><div><span class="Apple-style-span" style="font-family: sans-serif; font-size: 13px; line-height: 19px; background-color: rgb(255, 255, 255); "><br></span></div><div><span class="Apple-style-span" style="font-family: sans-serif; font-size: 13px; line-height: 19px; background-color: rgb(255, 255, 255); ">Raymond</span></div><br></body></html>