unicode filenames

Erik Max Francis max at alcyone.com
Mon Feb 3 02:44:12 EST 2003


Alex Martelli wrote:

> ALMOST entirely -- for example, none of the bytes is allowed to have
> the value 47 (since that is the code for "slash" in ASCII).

I thought we would all be reasonable enough to implicitly understand
that was a condition.  I thought of explicitly mentioning it, but
thought it too obvious.  Just goes to show.

> As long as the encoding never needs to use a byte whose value is
> 47.  I think that rules out UTF-8 and most other popular
> multi-byte encodings, doesn't it?

UTF-8 not including a slash.  Or UTF-16 not including a slash.  Or
Latin-1 not including a slash.  And so on.
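
To make the condition concrete, here is a minimal sketch in Python 3.
The helper name contains_slash_byte is made up for illustration; it is
not part of any library.  It only checks whether the encoded form of a
name contains a byte with the value 47, which is the condition under
discussion, and assumes nothing about what a given filesystem accepts.

```python
SLASH = 47  # the ASCII code for "/"


def contains_slash_byte(name, encoding):
    """Return True if encoding `name` yields a byte with the value 47."""
    # Iterating over (or testing membership in) a bytes object works on ints.
    return SLASH in name.encode(encoding)


# A name with no slash character is fine in UTF-8 and Latin-1.
print(contains_slash_byte("r\u00e9sum\u00e9.txt", "utf-8"))
print(contains_slash_byte("r\u00e9sum\u00e9.txt", "latin-1"))

# A name with an actual slash fails the check in either encoding.
print(contains_slash_byte("a/b", "utf-8"))

# The check is on bytes, not characters: U+262F encodes in UTF-16
# as the byte pair 0x2F 0x26, so it trips the same condition.
print(contains_slash_byte("\u262f", "utf-16-le"))
```

Note that the test is on the encoded bytes, so whether a given name
passes depends on the encoding chosen as well as on the name.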

The context is Unicode filenames, and Unicode filenames on Windows
certainly have similar restrictions: you can't put _any_ character in
there and expect it to work (for precisely the same reasons; I suspect
Windows would restrict them more, in fact).  The same goes for a UNIX
filesystem, so in context that limitation should already have been
apparent.

-- 
 Erik Max Francis / max at alcyone.com / http://www.alcyone.com/max/
 __ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE
/  \ ... Not merely peace in our time, but peace for all time.
\__/ John F. Kennedy
    Python chess module / http://www.alcyone.com/pyos/chess/
 A chess game adjudicator in Python.



