[Python-Dev] More Unicode support
M.-A. Lemburg
mal@lemburg.com
Mon, 06 Nov 2000 19:15:27 +0100
Guido van Rossum wrote:
>
> [Guido]
> > > Adding unistr() and StreamRecoder isn't enough. The problem is that
> > > when you set sys.stdout to a StreamRecoder, the print statement
> > > doesn't do the right thing! Try it. print u"foo" will work, but
> > > print u"\u1234" will fail because print always applies the default
> > > encoding.
>
> [MAL]
> > Hmm, that's due to PyFile_WriteObject() calling PyObject_Str().
> > Perhaps we ought to let it call PyObject_Unicode() (which you
> > find in the patch on SF) instead for Unicode objects. That way
> > the file-like .write() method will be given a Unicode object
> > and StreamRecoder could then do the trick.
>
> That's still not enough. Classes and types should be able to have a
> __str__ (or tp_str) that yields Unicode too.
Instances are allowed to return Unicode from their __str__
method, and PyObject_Unicode() will pass it along unchanged.
PyObject_Str() will still convert it to an 8-bit string, though,
because too much code out there expects a string object (without
checking!), even in the Python core.

So with that change, if you print an instance which returns Unicode
through __str__, the stream wrapper should receive a real Unicode
object in its .write() method... at least I think we're getting
closer ;-)
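
To make the difference concrete, here is a rough sketch in modern
Python 3 terms (where str is already Unicode). It only models the two
write paths discussed above and is not the actual C code. The names
EncodingWriter, write_coerced and write_text are made up for
illustration, and "ascii" stands in for the default encoding.

```python
import io


class EncodingWriter:
    """Hypothetical stand-in for a recoding stream wrapper:
    accepts text and encodes it for the underlying byte stream."""

    def __init__(self, raw, encoding):
        self.raw = raw
        self.encoding = encoding

    def write(self, text):
        self.raw.write(text.encode(self.encoding))


def write_coerced(obj, stream, default="ascii"):
    # Models the problematic path: the object is forced into an
    # 8-bit string with the default encoding before the wrapper
    # gets a chance to encode it itself.
    data = str(obj).encode(default)  # fails for non-ASCII text
    stream.raw.write(data)


def write_text(obj, stream):
    # Models the proposed path: the wrapper's write() receives
    # the text object and applies its own encoding.
    stream.write(str(obj))


buf = io.BytesIO()
out = EncodingWriter(buf, "utf-8")

write_text("\u1234", out)          # wrapper encodes to UTF-8
try:
    write_coerced("\u1234", out)   # default encoding gets in the way
except UnicodeEncodeError:
    pass
```

Plain ASCII text such as "foo" goes through both paths, which matches
the observation that only the non-ASCII case breaks.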
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/