[issue9198] Should repr() print unicode characters outside the BMP?

Ezio Melotti report at bugs.python.org
Thu Jul 8 17:35:53 CEST 2010

Ezio Melotti <ezio.melotti at gmail.com> added the comment:

Yes, but as I said in the message I linked, that's *not* what I want to do.
I want to change only the behavior of the interactive interpreter and only when the string sent to stdout is not encodable (so only when the encoding is not UTF-*).

This can be done by changing sys.displayhook, but I haven't yet figured out how. The default displayhook (Python/sysmodule.c:71) calls PyFile_WriteObject (Objects/fileobject.c:139), passing it the object as is and stdout. PyFile_WriteObject then does the equivalent of sys.stdout.write(repr(obj)).
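
To illustrate the failure, here is a rough Python sketch of that default behaviour. It is only an approximation of what the C code does; default_display and fails_to_display are hypothetical names, and an io.TextIOWrapper stands in for a sys.stdout with a limited encoding:

```python
import io


def default_display(value, stream):
    """Approximate the default hook: write repr(value) to the stream as text."""
    if value is not None:
        stream.write(repr(value) + "\n")
        stream.flush()


def fails_to_display(value, encoding):
    """Return True if displaying value on a stream with this encoding raises."""
    # Stand-in for a sys.stdout whose encoding is `encoding`.
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        default_display(value, stream)
    except UnicodeEncodeError:
        return True
    return False
```

With this sketch, fails_to_display("\U00010000", "ascii") returns True, while the same call with "utf-8" returns False.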
The whole chain passes unicode strings around. Ideally, displayhook should encode the repr of the object using sys.stdout.encoding and 'backslashreplace', but then:
  1) we would have to decode the resulting byte string again before passing it to PyFile_WriteObject;
  2) we would have to find a way to write a byte string to sys.stdout, but I don't think that's possible (keep in mind that sys.stdout could also be some other object).
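
A minimal sketch of option 1 in pure Python, assuming the stream exposes an encoding attribute (falling back to ASCII when it doesn't is my assumption). escape_unencodable and the optional stream parameter are hypothetical, and this is not the actual C implementation:

```python
import builtins
import sys


def escape_unencodable(text, encoding):
    """Round-trip text through encoding, escaping what it cannot represent."""
    return text.encode(encoding, "backslashreplace").decode(encoding)


def displayhook(value, stream=None):
    """Hypothetical replacement for sys.displayhook (stream is for testing)."""
    if value is None:
        return
    if stream is None:
        stream = sys.stdout
    # Assumption: streams without a usable encoding are treated as ASCII.
    encoding = getattr(stream, "encoding", None) or "ascii"
    builtins._ = None
    # Still writes a unicode string, so it works with any file-like stdout.
    stream.write(escape_unencodable(repr(value), encoding) + "\n")
    builtins._ = value
```

It could be installed with sys.displayhook = displayhook. Since the output is decoded back to a unicode string before it is written, this avoids the problem in point 2 at the cost of the extra encode/decode step.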

OTOH, even if the intermediate encoding/decoding step looks redundant, it shouldn't affect performance too much: it's not that common to print a lot of text in the interactive interpreter, and even then performance is probably not that important. It would still be better to find another way to do it.


Python tracker <report at bugs.python.org>
