[Python-Dev] what Windows and Linux really do Re: PEP 383 (again)

Stephen Hansen apt.shansen at gmail.com
Thu Apr 30 16:05:29 CEST 2009

> You can't even print them without getting an error from Python.  In fact,
> you also can't print strings containing the proposed half-surrogate
> encodings either: in both cases, the output encoder rejects them with a
> UnicodeEncodeError.   (If not even Python, with its generally lenient
> attitude, can print those things, some other libraries probably will fail,
> too.)

I think you may be confusing two completely separate things; it's a
long-known issue that the Windows console is simply not a Unicode-aware
display device out of the box. You have to manually set the codepage (by
typing 'chcp 65001' -- that's UTF-8) *and* manually make sure you have a
Unicode-enabled font chosen for it (the choice of console fonts ranges from
extremely limited to none, and last I looked the default font didn't support
Unicode) before you can even try to successfully print valid Unicode. The
default codepage is 437 (for me at least; I think it depends on which
language of Windows you're using) which is ASCII-/ish/.

You have to do your test in an environment which actually supports
displaying Unicode at all, or it's meaningless.
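
To illustrate the distinction, here is a minimal sketch (the helper name
can_encode is made up for this example) showing that a perfectly valid
Unicode string is rejected by a cp437 output encoder and accepted by a UTF-8
one. It only calls str.encode; it does not touch a real console, so it
models the output encoder rather than the display device itself.

```python
def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` in strict mode."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


valid_text = "\u4e2d\u6587"  # two ordinary CJK characters, valid Unicode

# Codepage 437 has no slot for these characters, so strict encoding fails.
print("cp437:", can_encode(valid_text, "cp437"))
# UTF-8 (what 'chcp 65001' selects) can represent any valid Unicode text.
print("utf-8:", can_encode(valid_text, "utf-8"))
```

A failure under cp437 therefore says nothing about whether the string itself
is well-formed.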

Personally, and for all the use cases I have to deal with at work, I would
/love/ to see this PEP succeed. Being able to query a list of files in a
directory and get them -all-, display them all to a user
(which necessitates converting them to Unicode one way or the other; I
don't care if certain characters don't display, as long as any arbitrary
file name always ends up looking like a distinct series of readable and
unreadable glyphs so the user can select it clearly), and then perform
operations on any selected file regardless of whatever nonsense may be going
on underneath with confused users and encodings... in a cross-platform way,
would be a tremendous boon to future py3k porting efforts. I ramble.
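
A rough sketch of that workflow, assuming the behaviour the PEP proposes is
available as the "surrogateescape" error handler (the name it shipped under
in later Python 3 releases). The helper names and the sample file name are
hypothetical, and the example works on an in-memory byte string rather than
a real directory.

```python
def decode_name(raw):
    """Decode a raw file name; undecodable bytes become lone surrogates."""
    return raw.decode("utf-8", "surrogateescape")


def display_name(name):
    """Make a name safe to show: escaped bytes are replaced with '?'."""
    return name.encode("utf-8", "replace").decode("utf-8")


def original_bytes(name):
    """Recover the exact bytes the file system reported."""
    return name.encode("utf-8", "surrogateescape")


raw = b"caf\xe9.txt"          # Latin-1 bytes, invalid as UTF-8 (made-up name)
name = decode_name(raw)       # a str the program can pass around
print(display_name(name))     # something the user can see and select
assert original_bytes(name) == raw  # the file can still be opened afterwards
```

The displayed form is lossy on purpose; only the escaped str (or the bytes
recovered from it) should be handed back to the operating system.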

If there are inconsistent encodings used by users on a POSIX system so that
they can only make sense of half of what the names really are... that's for
other programs to deal with. I just want to be able to access the files they
tell me they want.

For anyone who is doing something low-level, they can use the bytes API.
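
As a small illustration of the bytes API: passing a bytes path to os.listdir
returns the names as bytes, with no decoding step at all. The helper name
list_raw_names is made up for this sketch, and os.fsencode is a convenience
from later Python 3 releases, used here only to turn a str path into bytes.

```python
import os
import tempfile


def list_raw_names(directory):
    """List a directory's entries as raw bytes, skipping any decoding."""
    return os.listdir(os.fsencode(directory))


# Demonstrate on a throwaway directory containing one file.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()
    print(list_raw_names(tmp))
```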
