Unicode encoding - ignoring errors

Michal Ludvig mludvig at logix.net.nz
Mon Dec 29 13:06:09 CET 2008


In my script I have sys.stdout and sys.stderr redefined to output
unicode strings in the current system encoding:

	import codecs, locale, sys
	encoding = locale.getpreferredencoding()
	sys.stdout = codecs.getwriter(encoding)(sys.stdout)

However, on some systems the locale's encoding cannot represent every
unicode character, and I eventually end up with a UnicodeEncodeError
exception.

I know I could explicitly "sanitize" all output with:

	whatever.encode(encoding, "replace")

but it's quite inconvenient. I'd much prefer to embed this "replace"
operation into the sys.stdout writer.

Is there any way to set a conversion error handler in codecs.getwriter()
or perhaps chain it with some other filter somehow? I would rather have
question marks in the output than crashes with
UnicodeEncodeErrors ;-)
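
For illustration, here is a minimal sketch of one way this might be done. The
class returned by codecs.getwriter() is a codecs.StreamWriter subclass, and
its constructor takes an optional errors argument after the stream. The helper
name make_replacing_writer, the in-memory buffer and the "ascii" encoding are
assumptions made only for the demonstration; they stand in for sys.stdout and
for a restrictive locale encoding.

```python
import codecs
import io


def make_replacing_writer(stream, encoding):
    """Wrap a byte stream so unencodable characters become '?'.

    Hypothetical helper for illustration; not part of the original script.
    """
    writer_class = codecs.getwriter(encoding)
    # StreamWriter's constructor accepts (stream, errors='strict');
    # passing "replace" substitutes '?' instead of raising.
    return writer_class(stream, errors="replace")


# Demonstration with an in-memory byte buffer instead of sys.stdout.
# "ascii" is assumed here to mimic a locale that cannot encode the text.
buf = io.BytesIO()
out = make_replacing_writer(buf, "ascii")
out.write(u"na\u00efve caf\u00e9\n")
result = buf.getvalue()
```

Under these assumptions, the same call applied to the real stream would look
like codecs.getwriter(encoding)(sys.stdout, errors="replace"). On Python 3 the
wrapped object would need to be the binary stream sys.stdout.buffer rather
than sys.stdout itself.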


