Unicode & Pythonwin / win32 / console?

Wed Jan 11 15:48:54 EST 2006

Robert wrote:
> is in a PythonWin Interactive session - ok results for cyrillic chars
> (tolerant mbcs/utf-8 encoding!).
> But if I do this on Win console (as you probably mean), I get also
> encoding Errors - no matter if chcp1251, because cyrillic chars raise
> the encoding errors also.

If you do "chcp 1251" (not "chcp1251") in the console, and then
run python.exe in the same console, what is the value of
sys.stdout.encoding?
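
A minimal sketch of that check, assuming nothing beyond the standard
library. The helper name describe_stream_encoding is hypothetical and
only there to make the "no encoding" case visible:

```python
import sys

def describe_stream_encoding(stream):
    """Return the encoding a stream reports, or a note if it reports none.

    Hypothetical helper for illustration; not part of any library.
    """
    encoding = getattr(stream, 'encoding', None)
    if encoding is None:
        return 'no encoding reported'
    return encoding

# Run this in the same console, after "chcp 1251".
print(describe_stream_encoding(sys.stdout))
```

What it prints depends on the console and the Python version, so no
expected output is given here.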

> I think this is not a good behaviour of python to be so picky. 

I think it is good.

   Errors should never pass silently.
   Unless explicitly silenced.

> In
> 1136925967.990106.299760 at g44g2000cwa.googlegroups.com I showed, how I
> solved this so far. Any better/portable idea?

Not sure why you aren't using sys.stdout.encoding on Linux. I would do

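import codecs
import sys

try:
  c = codecs.getwriter(sys.stdout.encoding)
except (TypeError, LookupError):
  # encoding is None or unknown: fall back to ASCII
  c = codecs.getwriter('ascii')
# write '?' for unencodable characters instead of raising
sys.stdout = c(sys.stdout, 'replace')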

Also, I wouldn't edit site.py, but instead add sitecustomize.py.
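
As a rough sketch of what could go into sitecustomize.py, the wrapping
can be put into a small function. The name tolerant_writer is
hypothetical, and the example writes to an in-memory byte stream
rather than the real sys.stdout so that it can be run without side
effects:

```python
import codecs
import io

def tolerant_writer(byte_stream, encoding):
    """Wrap a byte stream so unencodable characters are written as '?'.

    Hypothetical helper for illustration; falls back to ASCII when the
    encoding is None or not known to Python.
    """
    try:
        writer_class = codecs.getwriter(encoding)
    except (TypeError, LookupError):
        writer_class = codecs.getwriter('ascii')
    return writer_class(byte_stream, 'replace')

# Demonstration: no usable encoding, so ASCII with 'replace' is used.
buf = io.BytesIO()
out = tolerant_writer(buf, None)
out.write(u'file-\u0416.txt\n')   # contains one Cyrillic letter
result = buf.getvalue()
```

In sitecustomize.py the last step would presumably be an assignment
along the lines of sys.stdout = tolerant_writer(sys.stdout,
sys.stdout.encoding), which assumes that sys.stdout accepts encoded
byte strings.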

> Yes. But the original problem is, that occasionally unicode strings
> (filenames in my case) arise which are not defined on the local
> platform, but have to be displayed (in 'replace' - encoding-mode)
> without breaking the app flow. Thats the pain of the default behaviour
> of current python - and there is no simple switch. Why should "print
> xy" not print something _always_ as good and as far as possible?

Because the author of the application wouldn't learn that there
is a bug in the application: the information would be silently
discarded. Users might find out only much later that they have
question marks in places where they originally entered data,
and they would have no way of retrieving the original data.

If you can accept that data loss: fine, but you should silence
the errors explicitly.
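
To illustrate the difference, here is a small sketch using a made-up
Cyrillic file name. With the default error handling the problem is
reported. With 'replace' it is silenced, and the original characters
cannot be recovered from the result:

```python
# A Cyrillic file name, chosen only for illustration.
name = u'\u0436\u0443\u043a.txt'

# Default (strict) error handling: the problem is reported.
try:
    name.encode('ascii')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Explicitly silenced: each unencodable character becomes '?'.
lossy = name.encode('ascii', 'replace')

# Decoding the result does not bring the original characters back.
round_trip = lossy.decode('ascii')
data_lost = (round_trip != name)
```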

Regards,
Martin