[Python-ideas] Alternative Unicode implementations (NSString/NSMutableString)

Nick Coghlan ncoghlan at gmail.com
Tue Jul 18 22:34:19 EDT 2017

On 19 July 2017 at 09:40, Victor Stinner <victor.stinner at gmail.com> wrote:
> Supporting a new kind of string storage would require a lot of effort.
> There is a lot of C code specialized for each Unicode kind

If I understand the requested flag correctly, it would ask CPython to
do one of the following:

1. *Never* use any of CPython's fast paths, and instead be permanently slow; or
2. Indicate that it's a "lazily rendered" subclass that should hold
off on calling PyUnicode_READY for as long as possible, but still do
so when necessary (akin to creating strings via the old Py_UNICODE
APIs and then calling PyUnicode_READY on them)
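
To make option 2 a bit more concrete, here is a rough pure-Python
analogy of "hold off on rendering until something actually needs the
canonical form". Everything in it (LazyText, ready(), render_count) is
a made-up name for illustration; it is not a CPython API, and the real
mechanism would live in C inside the unicode object itself.

```python
class LazyText:
    """Hypothetical stand-in for a lazily rendered string (not a CPython API)."""

    def __init__(self, render):
        self._render = render    # callable that produces the real str
        self._value = None       # filled in on first use
        self.render_count = 0    # only here to make the laziness visible

    def ready(self):
        """Render at most once, loosely analogous to PyUnicode_READY in C."""
        if self._value is None:
            self._value = self._render()
            self.render_count += 1
        return self._value

    def __str__(self):
        return self.ready()

    def __len__(self):
        return len(self.ready())


# Pretend this list is storage owned by a platform-native string object.
native = ["h", "e", "l", "l", "o"]
text = LazyText(lambda: "".join(native))
```

Nothing is rendered when the object is created; the first call to
len() or str() triggers the conversion, and later calls reuse the
cached result.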

Neither of those is exactly straightforward, but I think the idea has
the potential to tie in well with a Rust concept that Armin Ronacher
recently pointed out, which is that in addition to their native String
type, they also define a *separate* CString type as part of their C
FFI layer: https://doc.rust-lang.org/std/ffi/struct.CString.html

The Rust example does prompt me to ask whether this might be better
modeled as a "PlatformString" data type (essentially a str subclass
with an extra void * entry for a pointer to the native object), while
the operator.index() precedent prompts me to ask whether or not this
might be better handled with a "__platformstr__" protocol, but the
basic *idea* of having a clearly defined way of modeling
platform-native text strings at least somewhat independently of the
core Python data types seems reasonable to me.
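
As a minimal sketch of how those two ideas might fit together, assuming
one possible reading of them: PlatformString is a str subclass that
keeps a reference to the native object (standing in for the void *
slot), and platformstr() is a helper that looks up "__platformstr__" on
the type, much as operator.index() does with "__index__". The names
platformstr, native and FakeNSString are invented for this example, and
none of this exists in CPython.

```python
class PlatformString(str):
    """Hypothetical str subclass that also remembers the native object."""

    def __new__(cls, value, native=None):
        self = super().__new__(cls, value)
        self.native = native  # stands in for the extra "void *" entry
        return self


def platformstr(obj):
    """Hypothetical helper, modeled on how operator.index() uses __index__."""
    if isinstance(obj, str):
        return obj
    # Look the method up on the type, as special method lookup does.
    method = getattr(type(obj), "__platformstr__", None)
    if method is None:
        raise TypeError(
            f"{type(obj).__name__!r} object cannot be interpreted as a string"
        )
    result = method(obj)
    if not isinstance(result, str):
        raise TypeError("__platformstr__ returned a non-str object")
    return result


class FakeNSString:
    """Toy stand-in for a platform-native string such as NSString."""

    def __init__(self, chars):
        self.chars = chars

    def __platformstr__(self):
        return PlatformString("".join(self.chars), native=self)
```

With this, platformstr(FakeNSString(list("abc"))) gives back an object
that compares equal to the ordinary str while still carrying a
reference to the native object in its native attribute.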

(If we do go with the "flag bit" option, then it may actually be
possible to steal the existing "Py_UNICODE *" pointer at the same time
- that way an externally defined string would automatically be handled
the same way as any other unready string, and "Py_UNICODE *" would
just be a particular example of a platform string type)


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
