[Python-Dev] why doesn't print pass unicode strings on to the file object?

M.-A. Lemburg mal@lemburg.com
Mon, 17 Sep 2001 19:50:08 +0200


Guido van Rossum wrote:
> 
> > Martin von Loewis wrote:
> > >
> > > > is there any reason why "print" cannot pass unicode
> > > > strings on to the underlying write method?
> > >
> > > Mostly because there is no guarantee that every .write method will
> > > support Unicode objects. I see two options: either a stream might
> > > declare itself as supporting unicode on output (by, say, providing a
> > > unicode attribute), or all streams are required by BDFL pronouncement
> > > to accept Unicode objects.
> >
> > I think the latter option would go a long way: many file-like
> > objects are written in C and will use the C parser markers. These
> > can handle Unicode without problem (issuing an exception in case
> > the conversion to ASCII fails).
> 
> Agreed, but BDFL pronouncement doesn't make it so: individual modules
> still have to be modified if they don't do the right thing (especially
> 3rd party modules -- we have no control there).

True, but we are only talking about file objects which are used
for sys.stdout -- I don't think that allowing Unicode to be
passed to their .write() methods will break a whole lot of code.
 
> And then, what's the point of handling Unicode if we only accept
> Unicode-encoded ASCII strings?

I was under the impression that Fredrik wants to let Unicode
pass through from the print statement to the .write method
of sys.stdout. 

If the sys.stdout object knows about Unicode, then things will
work just fine; if not, either the internal Python machinery will
try to convert the Unicode object to an ASCII string (e.g. if the
file object parses its argument with "s#"), or the file object
will raise a TypeError (this is what cStringIO does).
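
As a rough illustration of the three outcomes described above, here is a
minimal sketch written in modern Python 3 rather than the Python 2 of this
thread. All class and function names are hypothetical, and the ASCII-only
class merely imitates what an "s#"-style conversion would do; it is not
how real file objects or cStringIO are implemented.

```python
import io


class UnicodeAwareStream:
    """Hypothetical stream whose write() accepts text as-is."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


class AsciiOnlyStream:
    """Hypothetical stream imitating an "s#"-style conversion to ASCII."""

    def __init__(self):
        self.buffer = io.BytesIO()

    def write(self, text):
        # Raises UnicodeEncodeError if text contains non-ASCII characters.
        self.buffer.write(text.encode("ascii"))


class BytesOnlyStream:
    """Hypothetical stream that rejects text outright (assumed to mirror
    the TypeError behaviour attributed to cStringIO in this thread)."""

    def __init__(self):
        self.buffer = io.BytesIO()

    def write(self, data):
        if not isinstance(data, bytes):
            raise TypeError("expected bytes, got %s" % type(data).__name__)
        self.buffer.write(data)


def passthrough_print(stream, obj):
    """Hand obj to stream.write() without forcing a conversion first."""
    text = obj if isinstance(obj, str) else str(obj)
    stream.write(text)
    stream.write("\n")


def try_print(stream, obj):
    """Report which of the three outcomes a given stream produces."""
    try:
        passthrough_print(stream, obj)
        return "ok"
    except UnicodeError:
        return "UnicodeError"
    except TypeError:
        return "TypeError"
```

Calling try_print() with each stream and a non-ASCII string such as
"caf\u00e9" exercises the three cases: the Unicode-aware stream accepts
it, the ASCII-only stream fails during conversion, and the bytes-only
stream rejects the argument type.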

Currently, Python forces conversion to 8-bit strings for all
printed objects (at least that is what it did when I last looked
into this problem, a long while ago).

> > The only notable exception is
> > the cStringIO module -- but this could probably be changed to
> > be buffer interface compliant too.
> 
> Sure, just submit a patch.

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Consulting & Company:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/