
Why would PEP 393 apply to implementations other than CPython?

Regards

Antoine.

On Fri, 26 Aug 2011 00:01:42 +0000
Dino Viehland <dinov@microsoft.com> wrote:
Guido wrote:
Which reminds me. The PEP does not say what other Python implementations besides CPython should do. Presumably Jython and IronPython will continue to use UTF-16, so presumably the language reference will still have to document that strings contain code units (not code points) and the objections Tom Christiansen raised against this will remain true for those versions of Python. (I don't know about PyPy, they can presumably decide when they start their Py3k port.)
OTOH perhaps IronPython 3.3 and Jython 3.3 can use a similar approach and we can lay the narrow build issues to rest? Can someone here speak for them?
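To make the code unit versus code point distinction concrete, here is a minimal sketch in Python. The helper name utf16_code_units is made up for illustration. It only approximates what a UTF-16 based runtime would report as the string length, by encoding to UTF-16 and counting 16-bit units.

```python
def utf16_code_units(s: str) -> int:
    """Count the 16-bit code units needed to store s as UTF-16.

    Hypothetical helper: it stands in for len() on a runtime whose
    strings are UTF-16 internally.
    """
    # "utf-16-le" emits no byte order mark, so every 2 bytes is one code unit.
    return len(s.encode("utf-16-le")) // 2


# One BMP character followed by one character outside the BMP.
sample = "a\U0001F600"

code_points = len(sample)               # 2 on CPython 3.3 or later
code_units = utf16_code_units(sample)   # 3: the non-BMP character needs a surrogate pair

print(code_points, code_units)
```

The two counts differ only when the string contains characters above U+FFFF, which is exactly the case where indexing and slicing by code unit can split a surrogate pair.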
The biggest difficulty for IronPython here would be dealing w/ .NET interop. We can certainly introduce either an IronPython-specific string class which is similar to CPython's PyUnicodeObject or we could have multiple distinct .NET types (IronPython.Runtime.AsciiString, System.String, and IronPython.Runtime.Ucs4String) which all appear as the same type to Python.
But when Python is calling a .NET API it's always going to return a System.String, which is UTF-16. If we had to check and convert all of those strings when they cross into Python it would be very bad for performance. Presumably we could have a 4th type of "interop" string which lazily computes this but if we start wrapping .NET strings we could also get into object identity issues.
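The lazy "interop" string idea can be sketched roughly as follows. This is an illustration in Python, not IronPython code: the class LazyInteropString, its kind() method, and the three kind labels are hypothetical names, and a bytes object of UTF-16 data stands in for a System.String coming back from a .NET API.

```python
class LazyInteropString:
    """Hypothetical wrapper around UTF-16 data handed over by a host runtime."""

    def __init__(self, utf16_bytes: bytes):
        self._raw = utf16_bytes   # kept as received, nothing is converted yet
        self._decoded = None      # filled in on first Python-side access

    def _text(self) -> str:
        # The conversion cost is paid once, and only if Python code looks inside.
        if self._decoded is None:
            self._decoded = self._raw.decode("utf-16-le")
        return self._decoded

    def __len__(self) -> int:
        return len(self._text())  # length in code points, not code units

    def __str__(self) -> str:
        return self._text()

    def __eq__(self, other) -> bool:
        if isinstance(other, LazyInteropString):
            return self._text() == other._text()
        return NotImplemented

    def __hash__(self) -> int:
        return hash(self._text())

    def kind(self) -> str:
        """Narrowest of the three representations that could hold the text."""
        top = max(map(ord, self._text()), default=0)
        if top < 0x80:
            return "ascii"
        if top < 0x10000:
            return "utf16"        # BMP only: one code unit per code point
        return "ucs4"


raw = "h\u00e9\U0001F600".encode("utf-16-le")
first = LazyInteropString(raw)
second = LazyInteropString(raw)

print(len(first), first.kind())
print(first == second, first is second)
```

Wrapping the same underlying data twice yields two wrappers that compare equal but are not the same object, which is a small-scale version of the object identity concern.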
We could stop using System.String in IronPython altogether and say when working w/ .NET strings you get the .NET behavior and when working w/ Python strings you get the Python behavior. I'm not sure how weird and confusing that would be but conversion from an Ipy string to a .NET string could remain cheap if both were UTF-16, and conversions from .NET strings to Ipy strings would only happen if the user did so explicitly.
But it's a huge change - it'll almost certainly touch every single source file in IronPython. I would think we'd get 3.2 done first and then think about what to do here.