[Python-Dev] Non-ASCII characters in test_pep277.py in 2.3

07 Oct 2002 04:41:09 +0200

Guido van Rossum <guido@python.org> writes:

> The file test_pep277.py uses an encoding cookie that specifies UTF-8.
> Unfortunately my toolchain doesn't know about this, and displays it as
> Latin-1.  

What do you mean by "toolchain"? At the end of the chain is Python, and
it should know about this well enough.

Did you try to open the file in IDLE? What other tools are you using?

> Since the only UTF-8 is in 8-bit string literals (not Unicode
> literals), wouldn't it make more sense to drop the encoding cookie
> and use \xXX escapes in those literals?

Having the original byte strings allows one to verify the correctness
of the test visually: the files created should look the same in the
operating system (using ls, or a file explorer) as they do in the
source code.
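
To illustrate the trade-off, here is a minimal sketch, assuming a
modern Python 3 interpreter (where 8-bit strings are bytes objects)
and an illustrative file name that is not claimed to be copied from
the test. Both spellings yield the same bytes; only the first one can
be compared by eye with what ls shows.

```python
# Illustrative file name; an assumption, not copied from the test file.
readable = "Grüß-Gott"

# The bytes an 8-bit literal would hold in a UTF-8 encoded source file.
utf8_bytes = readable.encode("utf-8")

# The same bytes spelled with \xXX escapes, as proposed in the quote.
escaped = b"Gr\xc3\xbc\xc3\x9f-Gott"

# Both spellings produce identical byte strings.
same_bytes = utf8_bytes == escaped

# Decoding the escaped form recovers the readable name.
round_trip = escaped.decode("utf-8")
```

The escaped form is safe for tools that assume Latin-1, but a reader
can no longer tell at a glance which name the test is creating.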

> I'm not even sure this use is legal in phase 2 of PEP 263.

It is: the file has an encoding declared, and decodes properly under
that encoding. String literals are preserved in their original
encoding, so they will come out at run-time the same as they are on
disk.
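
The "decodes properly" condition can be checked roughly as follows.
This is a sketch under assumptions: it uses Python 3's
tokenize.detect_encoding, which did not exist in 2.3, and a made-up
two-line source file. Note also that in Python 3 string literals are
Unicode, so the sketch only demonstrates the decoding check, not the
preservation of 8-bit literals described above.

```python
import io
import tokenize

# Made-up source file contents, stored on disk as UTF-8 bytes.
source = '# -*- coding: utf-8 -*-\nname = "Grüß-Gott"\n'.encode("utf-8")

# Read the declared encoding from the cookie on the first line.
declared, _ = tokenize.detect_encoding(io.BytesIO(source).readline)

# The file passes if it decodes without error under that encoding.
text = source.decode(declared)

# Decoding the same bytes as Latin-1 never fails; it only displays
# wrongly, which is what a tool unaware of the cookie would show.
as_latin1 = source.decode("latin-1")
```

A file whose bytes are not valid in the declared encoding would raise
UnicodeDecodeError at the decode step instead.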

Regards,
Martin