Unicode & Pythonwin / win32 / console?

"Martin v. Löwis" martin at v.loewis.de
Fri Jan 13 00:52:18 CET 2006


Robert wrote:
> * Webbrowsers for example have to display defective HTML as good as
> possible, unknown unicode chars as "?" and so on... Users got very
> angry in the beginning of browsers when 'strict' programmers displayed
> their exception error boxes ...

Right. If you were developing a web browser in Python, you should do the
same.
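
As a rough illustration of that kind of tolerance, here is a minimal
sketch of lenient decoding for display purposes. The byte string and the
helper name decode_for_display are made up for this example; the only
real API used is bytes.decode with its errors argument.

```python
# Hypothetical bytes from a page that claims UTF-8 but contains a stray 0xFF.
raw = b"caf\xc3\xa9 \xff menu"


def decode_for_display(data, encoding="utf-8"):
    """Decode leniently: malformed bytes become U+FFFD instead of raising."""
    return data.decode(encoding, errors="replace")


text = decode_for_display(raw)

# The strict default raises on the same input.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(repr(text), strict_failed)
```

This is reasonable for text that is only shown to a person; it says
nothing about data that is later used as an identifier.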

> No one is really angry when
> occasionally chinese chars are displayed cryptically on non-chinese
> computers.

That is not true. Japanese users are *frequently* upset when their
characters don't render correctly. They even have a word for that:
moji-bake. I assume it is similar for Chinese.

> * anything is nice-printable in python by default, why not
> unicode-strings!? If the decision for default 'strict' encoding on
> stdout stands, we have at least to discuss about print-repr for
> unicode.

If you want to see this change really badly, you need to write a PEP.

> * on Windows for example the (good) mbcs_encode is anyway tolerant as
> it: unknown chars are mapped to '?' . I never had any objection to this.

Apparently, you haven't been dealing with character sets long enough.
I have seen *a lot* of objections to the way the CP_ACP encoding
deals with errors, e.g.

http://groups.google.com/group/comp.lang.python/msg/dea84298cb2673ef?dmode=source&hl=en

When Windows converts these file names to CP_ACP, the
file names in a directory are no longer round-trippable. This is
a source of permanent pain.
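
A small sketch of the round-trip problem, under some assumptions: the
cp1252 codec with errors="replace" stands in for the Windows CP_ACP
conversion (the real conversion depends on the system code page and may
use best-fit mappings), and the file names and the helper to_ansi are
invented for the example.

```python
def to_ansi(name, codepage="cp1252"):
    """Lossy conversion: unencodable characters become '?' (assumed stand-in
    for the CP_ACP conversion, not the actual Windows routine)."""
    return name.encode(codepage, errors="replace")


# Two distinct, hypothetical file names that cp1252 cannot represent.
names = ["\u65e5\u672c.txt", "\u4e2d\u6587.txt"]

encoded = [to_ansi(n) for n in names]
decoded = [b.decode("cp1252") for b in encoded]

# Both names collapse to the same bytes, and neither can be recovered.
print(encoded)
print(decoded == names)
```

Once two different names map to the same byte string, nothing downstream
can tell which file was meant.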

> * I would also live perfectly with .encode(enc) to run 'replace' by
> default, and 'strict' on demand. None of my apps and scripts would
> break because of this, but win. A programmer is naturally very aware
> when he wants 'strict'. Can you name realistic cases where 'replace'
> behavior would be so critical that a program damages something?

File names. Replace an unencodable filename with a question mark,
and you get a pattern that matches multiple files. For example, do

get_deletable_files.py | xargs rm

and you remove many more files than you want to. Pretty catastrophic.
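
To make the failure mode concrete, here is a hedged sketch. It assumes
the printed name is later interpreted as a shell-style wildcard pattern,
and it uses fnmatch on an invented directory listing rather than touching
real files. ASCII with errors="replace" stands in for whatever encoding
stdout happens to use.

```python
import fnmatch

# Hypothetical directory listing; only the third file was meant to be deleted.
directory = ["a.txt", "b.txt", "\u00e9.txt", "notes.txt"]
intended = "\u00e9.txt"

# 'replace' turns the unencodable character into '?'.
lossy = intended.encode("ascii", errors="replace").decode("ascii")

# Treated as a pattern, '?' matches any single character.
matched = fnmatch.filter(directory, lossy)

print(lossy)
print(matched)
```

With 'strict', the script would have stopped with a UnicodeEncodeError
before printing anything, which is the safer outcome here.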

Regards,
Martin


