[Python-Dev] PEP 393 Summer of Code Project
Dino Viehland
dinov at microsoft.com
Fri Aug 26 02:01:42 CEST 2011
Guido wrote:
> Which reminds me. The PEP does not say what other Python
> implementations besides CPython should do. Presumably Jython and
> IronPython will continue to use UTF-16, so presumably the language
> reference will still have to document that strings contain code units (not code
> points) and the objections Tom Christiansen raised against this will remain
> true for those versions of Python. (I don't know about PyPy, they can
> presumably decide when they start their Py3k port.)
>
> OTOH perhaps IronPython 3.3 and Jython 3.3 can use a similar approach and
> we can lay the narrow build issues to rest? Can someone here speak for
> them?
The biggest difficulty for IronPython here would be dealing w/ .NET interop.
We can certainly either introduce an IronPython-specific string class which
is similar to CPython's PyUnicodeObject or have multiple distinct
.NET types (IronPython.Runtime.AsciiString, System.String, and
IronPython.Runtime.Ucs4String) which all appear as the same type to Python.
But when Python calls a .NET API, that API is always going to return a
System.String, which is UTF-16. If we had to check and convert all of those
strings when they cross into Python it would be very bad for performance.
Presumably we could have a 4th type of "interop" string which does the check
and conversion lazily, but if we start wrapping .NET strings we could also get
into object identity issues.
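
To make the cost concrete, here is a rough Python sketch of the check described above. It is not IronPython code: utf16_units, classify_utf16 and LazyInteropString are hypothetical names, and a list of integer code units stands in for a System.String. It only illustrates that picking a PEP 393 style representation needs a pass over the string, which a lazy wrapper can defer but not avoid.

```python
def utf16_units(s):
    """Return the UTF-16 code units of a Python str as a list of ints."""
    data = s.encode("utf-16-le", "surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def classify_utf16(units):
    """Pick the narrowest representation that could hold the string.

    Every code unit may have to be inspected, so this is O(n) per string.
    The three labels loosely mirror the AsciiString / System.String /
    Ucs4String split; they are illustrative only.
    """
    kind = "ascii"
    for u in units:
        if 0xD800 <= u <= 0xDBFF:
            # A high surrogate: this sketch treats it as a non-BMP character.
            return "ucs4"
        if u > 0x7F:
            kind = "utf16"
    return kind


class LazyInteropString:
    """Hypothetical wrapper that defers the scan until it is needed."""

    def __init__(self, units):
        self._units = units
        self._kind = None  # not computed yet

    @property
    def kind(self):
        if self._kind is None:
            self._kind = classify_utf16(self._units)
        return self._kind
```

Note that the wrapper is a new object around the original string, which is where the object identity concern comes from.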
We could stop using System.String in IronPython altogether and say when
working w/ .NET strings you get the .NET behavior and when working w/ Python
strings you get the Python behavior. I'm not sure how weird and confusing that
would be but conversion from an Ipy string to a .NET string could remain cheap if
both were UTF-16, and conversions from .NET strings to Ipy strings would only
happen if the user did so explicitly.
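
The difference between the two behaviors shows up for any character outside the BMP. A small Python sketch, assuming a CPython 3.3 or later interpreter where str is indexed by code point (utf16_length is a made-up helper that stands in for the length a UTF-16 string would report):

```python
def utf16_length(s):
    """Number of UTF-16 code units needed to store s."""
    return len(s.encode("utf-16-le")) // 2


text = "a\U00010000b"              # one character outside the BMP

code_points = len(text)            # "Python behavior": counts code points
code_units = utf16_length(text)    # "UTF-16 behavior": counts code units

print(code_points, code_units)
```

Here the code point count is 3 and the code unit count is 4, because U+10000 is stored as a surrogate pair in UTF-16.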
But it's a huge change - it'll almost certainly touch every single source file in
IronPython. I would think we'd get 3.2 done first and then think about what to
do here.