codecs latin1 unicode standard output file

Mon Dec 15 06:38:50 EST 2003

Marko Faldix wrote:

> I try to describe. It's a Windows machine with Python 2.3.2 installed. Using
> command line (cmd). Put these lines of code in a file called klotentest1.py:
>
> # -*- coding: iso-8859-1 -*-
>
> print unicode("My umlauts are ä, ö, ü", "latin-1")
> print "My umlauts are ä, ö, ü"
>
> Calling this on command line:
>
> klotentest1.py
>
> Indeed, result of first print is as desired, result of second print delivers
> strange letters but no error.

your console device doesn't use iso-8859-1; it probably uses cp850.
if you print an 8-bit string to the console, Python assumes that you
know what you're doing...
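
a minimal sketch of what probably happens to the second print, assuming
the console really is cp850 (that part is a guess). the bytes stored in
the iso-8859-1 source file are passed through unchanged, and the console
interprets them with its own code page. the variable names are only for
illustration:

```python
# The three umlauts from the example, written as Unicode escapes.
text = u"\xe4, \xf6, \xfc"

# The bytes an iso-8859-1 source file holds for them.
latin1_bytes = text.encode("latin-1")

# What a cp850 console displays when it receives those bytes as-is.
misread = latin1_bytes.decode("cp850")

# misread is now u"\xf5, \xf7, \xb3": o-tilde, division sign and
# superscript three, the same "strange letters" that show up in the
# traceback quoted below.
```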

> Now I call this on command line:
>
> klotentest1.py > klotentest1.txt
>
> This fails:
> Traceback (most recent call last):
>   File "C:\home\marko\moeller_port\moeller_port_exec_svn\klotentest1.py", line 3, in ?
>     print unicode("My umlauts are õ, ÷, ³", "latin-1")
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 15: ordinal not in range(128)
>
> In my point of view python shouldn't act in different ways whether result is
> piped to file or not.

when you print to a console with a known encoding, Python 2.3 auto-
magically converts Unicode strings to 8-bit strings using the console
encoding.
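
roughly, the conversion looks like this. it's a sketch, not what the
interpreter literally runs, and cp850 is again an assumption about the
console:

```python
import sys

text = u"My umlauts are \xe4, \xf6, \xfc"

# sys.stdout.encoding names the console encoding when Python knows it.
# It can be None or missing, for example under Python 2 when output is
# redirected, or when sys.stdout has been replaced by another object.
console_encoding = getattr(sys.stdout, "encoding", None)

# Assuming a cp850 console, these are the bytes it would be sent.
cp850_bytes = text.encode("cp850")

# For comparison, the same text as it is stored in the latin-1 source file.
latin1_bytes = text.encode("latin-1")
```

the two byte strings differ for the umlauts, which is why the Unicode
print displays correctly while the raw 8-bit print doesn't.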

files don't have an encoding, so when output is redirected Python falls
back to the default ASCII codec, which is why the second case fails.
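
one way to get the same result in both cases is to pick the encoding
yourself instead of leaving it to print. the sketch below first
reproduces the ASCII failure, then writes the text through a latin-1
stream writer. the temporary file is a hypothetical stand-in for
klotentest1.txt, and latin-1 is just one possible choice:

```python
import codecs
import os
import tempfile

text = u"My umlauts are \xe4, \xf6, \xfc"

# With no encoding to go by, the fallback is ASCII, which cannot
# represent the umlauts.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Wrap a binary file in a writer that encodes Unicode as latin-1.
fd, path = tempfile.mkstemp(suffix=".txt")
raw = os.fdopen(fd, "wb")
out = codecs.getwriter("latin-1")(raw)
try:
    out.write(text + u"\n")
finally:
    out.close()
```

after this, ascii_ok is False and the file holds the latin-1 bytes,
no matter where standard output points.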

also note that in 2.2 and earlier, your example always failed.

</F>