[Pythonmac-SIG] Filename encodings on the Mac

Jack Jansen jack@oratrix.nl
Sun, 08 Jul 2001 22:56:53 +0200


Mark,
thanks, that was very helpful!

I think this means that the solution I've just implemented (when
you pass a unicode string where an 8-bit string is expected, we convert
it to the encoding of the current system font) is probably the best
we can do. Unicode characters that are representable in the current
encoding are passed through unscathed, and the APIs underlying
Python open() and friends (libc fopen(), in this case) don't allow us
to do any better.
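
To make the conversion concrete, here is a rough sketch of the idea. It
is not the actual MacPython code: to_system_bytes() is a hypothetical
helper, the system encoding is simply assumed to be MacRoman, and it is
written against current Python 3.

```python
def to_system_bytes(name, system_encoding="mac_roman"):
    """Return name as 8-bit bytes for a byte-oriented file API.

    Hypothetical helper; "mac_roman" stands in for whatever the
    encoding of the current system font happens to be.
    """
    if isinstance(name, bytes):
        # Already an 8-bit string: pass it through untouched.
        return name
    # Raises UnicodeEncodeError if a character has no representation
    # in the target encoding.
    return name.encode(system_encoding)


encoded = to_system_bytes("caf\u00e9")  # e-acute is byte 0x8E in MacRoman
```

Anything the target encoding cannot represent raises an error instead
of being silently mangled, which is about all a byte-oriented API lets
us offer.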

But there's still a problem with the multibyte system fonts, I think.
If MacPython knows there's no Python unicode codec for the current
encoding, it pretends that 8-bit characters are MacRoman. So passing a
correct unicode Japanese filename to open() will cause it to fail if
there are non-ASCII characters in there: the Python unicode->macroman
converter will complain that the characters are not available in the
MacRoman set. Returning MacRoman is my best guess; the alternative is
returning "ascii", which will only allow 7-bit characters. If people
using multibyte systems (or single-byte systems with an encoding for
which no Python unicode codec exists yet) feel that returning ascii
would be a better idea, let me know. Or better, let's discuss this on
the mailing list.
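
A small sketch of that fallback and of the failure it leads to. The
function names and the bogus encoding name are made up for
illustration, and this is current Python 3 rather than the MacPython
internals.

```python
import codecs


def pick_filename_encoding(system_encoding):
    """Use the system encoding if Python has a codec for it, else MacRoman."""
    try:
        codecs.lookup(system_encoding)
        return system_encoding
    except LookupError:
        # No codec available: pretend 8-bit characters are MacRoman.
        return "mac_roman"


def try_encode(name, encoding):
    """Return the encoded filename, or None if it cannot be represented."""
    try:
        return name.encode(encoding)
    except UnicodeEncodeError:
        return None


# Hypothetical name standing in for a system font encoding without a codec.
fallback = pick_filename_encoding("no_such_codec_for_this_font")

japanese = "\u65e5\u672c\u8a9e.txt"
roman_result = try_encode(japanese, fallback)      # fails: not in MacRoman
ascii_result = try_encode("caf\u00e9", "ascii")    # fails: not 7-bit
accent_result = try_encode("caf\u00e9", fallback)  # fine in MacRoman
```

With the MacRoman fallback, accented Western filenames still work and
Japanese ones fail; with an "ascii" fallback both fail.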

It must be possible to create a multibyte MacJapanese <-> Unicode
codec with the Python unicode infrastructure: after all, there's a
utf-8 codec too, and that's also a multibyte encoding. But I'm
completely out of my depth here. If someone wants to create one and
contribute it, I'll gladly try to have it incorporated in the standard
distribution, and I can put people in contact with the Python
unicode gurus, but that's about as much as I can promise.
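
For anyone tempted, a very rough sketch of how a multibyte codec can be
plugged in through the codecs module, in current Python 3. Everything
here is illustrative: the codec name "toymacjapanese" is invented, and
the two-entry table is a toy (the byte values are the Shift-JIS ones
for two hiragana), not the real MacJapanese mapping.

```python
import codecs

CODEC_NAME = "toymacjapanese"  # invented name, not a real codec

# Toy table only: hiragana A and I with their Shift-JIS byte pairs.
_ENCODE_TABLE = {"\u3042": b"\x82\xa0", "\u3044": b"\x82\xa2"}
_DECODE_TABLE = {v: k for k, v in _ENCODE_TABLE.items()}


def toy_encode(text, errors="strict"):
    """Encode ASCII as single bytes, table entries as two bytes."""
    out = bytearray()
    for i, ch in enumerate(text):
        if ord(ch) < 0x80:
            out.append(ord(ch))
        elif ch in _ENCODE_TABLE:
            out += _ENCODE_TABLE[ch]
        else:
            raise UnicodeEncodeError(CODEC_NAME, text, i, i + 1,
                                     "character not in toy table")
    return bytes(out), len(text)


def toy_decode(data, errors="strict"):
    """Decode bytes below 0x80 as ASCII, anything else as a byte pair."""
    data = bytes(data)
    out = []
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            out.append(chr(data[i]))
            i += 1
        else:
            pair = data[i:i + 2]
            if pair not in _DECODE_TABLE:
                raise UnicodeDecodeError(CODEC_NAME, data, i, i + len(pair),
                                         "byte sequence not in toy table")
            out.append(_DECODE_TABLE[pair])
            i += 2
    return "".join(out), len(data)


def _search(name):
    """Codec search function: answer only for our invented name."""
    if name == CODEC_NAME:
        return codecs.CodecInfo(toy_encode, toy_decode, name=CODEC_NAME)
    return None


codecs.register(_search)

encoded = "a\u3042\u3044".encode(CODEC_NAME)
decoded = encoded.decode(CODEC_NAME)
```

A real codec would need the full mapping table plus stream and
incremental variants, but the registration mechanism is the same.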
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm