[Python-Dev] Unicode-like objects

Just van Rossum just@letterror.com
Wed, 5 Feb 2003 00:46:13 +0100


Unless I misunderstand, it's currently impossible to create an object
that behaves just like a unicode object unless the backing store is
compatible with a "real" unicode object (in which case the buffer
interface can be used). While browsing through the implementation I came
across this interesting XXX comment in unicodeobject.c:

PyObject *PyUnicode_FromObject(register PyObject *obj)
{
    /* XXX Perhaps we should make this API an alias of
           PyObject_Unicode() instead ?! */

If this were done, making unicode-like objects would become possible:
PyObject_Unicode() actually calls __unicode__ whereas
PyUnicode_FromObject() currently requires a buffer. Is there any reason
why implementing what the comment suggests would be a bad idea?
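
To make the difference concrete, here is a rough pure-Python model of
the two behaviours. It is only a sketch, not the C code: WrappedString,
from_object_buffer_only() and from_object_via_hook() are made-up names,
the ASCII decoding step is a stand-in for the default encoding, and
text_type is defined so that the snippet also runs on interpreters
where the text type is not called unicode.

```python
# Assumption: type(u"") is the interpreter's text type, whatever its name.
text_type = type(u"")


class WrappedString(object):
    """Hypothetical wrapper around a foreign string (think NSString proxy)."""

    def __init__(self, payload):
        self._payload = payload

    def __unicode__(self):
        # The only hook this object offers; it exposes no buffer.
        return self._payload


def from_object_buffer_only(obj):
    """Model of the current behaviour: real text or byte data only."""
    if isinstance(obj, text_type):
        return obj
    if isinstance(obj, bytes):
        # Assumption: ASCII stands in for the default encoding.
        return obj.decode("ascii")
    raise TypeError("coercing to Unicode: need string or buffer, %s found"
                    % type(obj).__name__)


def from_object_via_hook(obj):
    """Model of the suggested behaviour: honour __unicode__ when present."""
    hook = getattr(type(obj), "__unicode__", None)
    if hook is not None:
        return hook(obj)
    # No hook: fall back to the buffer-only path.
    return from_object_buffer_only(obj)
```

In this model a wrapper has no way into the first function, while for
the second one defining __unicode__ is enough.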

(The use case is this. The PyObjC project marries Objective-C with
Python. This is cool as it gives us direct access to almost all of
Cocoa, the native OS X GUI framework. However, Cocoa defines its own
string type and for reasons that are waaay beyond the scope of this post
(check the archives of the pyobjc-dev list if you're really really
interested; see a recent thread called "NSString & mutability") it
appears to be a bad idea to _convert_ these strings to Python unicode
strings.
So we need to wrap them. Yet they should work as much like unicode
strings as possible...)

Just