Question about working with html entities in python 2 to use them as filenames
Steve D'Aprano
steve+python at pearwood.info
Tue Nov 22 20:32:52 EST 2016
On Wed, 23 Nov 2016 09:00 am, Lew Pitcher wrote:
> 2) Apparently os.mkdir() (at least) defaults to requiring an ASCII
> pathname.
No, you have misinterpreted what you have seen.
Even in Python 2, os.mkdir will accept a Unicode argument. You just have to
make sure it is given as unicode:
os.mkdir(u'/tmp/für')
Notice the u prefix before the opening quote? That tells Python to create a
unicode (text) string instead of an ordinary byte-string.
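A minimal sketch of the same idea, written so that it leaves nothing behind
in /tmp. The helper name make_unicode_dir and the scratch directory are
assumptions for illustration only, and the sketch assumes the filesystem
encoding can represent the name (UTF-8 on most modern systems):

```python
# -*- coding: utf-8 -*-
import os
import shutil
import tempfile


def make_unicode_dir(parent, name):
    """Create a directory called `name` (a text string) inside `parent`."""
    path = os.path.join(parent, name)
    os.mkdir(path)
    return path


# Hypothetical scratch location, removed again once we are done.
scratch = tempfile.mkdtemp()
try:
    # u'f\xfcr' is u'für' spelled with an escape, so the result does not
    # depend on the encoding of this source file.
    created = make_unicode_dir(scratch, u'f\xfcr')
    exists = os.path.isdir(created)
finally:
    shutil.rmtree(scratch)
```

The u prefix is also accepted by Python 3.3 and later, so the same code
runs unchanged there.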
If you forget the u prefix and write an ordinary byte-string literal, then
the result you get will depend on some combination of your operating
system, the source code encoding, and Python's best guess of what you mean.
os.mkdir('/tmp/für') # don't do this!
*might* work, if all the factors align correctly, but often won't. And when
it doesn't, the failure can be extremely mysterious, usually involving a
spurious
UnicodeDecodeError: 'ascii' codec
error.
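One common way that error arises, sketched under the assumption that the
path was written as UTF-8 bytes: whenever Python 2 has to combine a
byte-string with unicode text, it decodes the bytes as ASCII behind your
back. The explicit decode below just spells that hidden step out; the
variable names are for illustration only.

```python
# -*- coding: utf-8 -*-
# The UTF-8 bytes for '/tmp/für', written as an explicit byte-string.
path_bytes = b'/tmp/f\xc3\xbcr'

try:
    # Python 2 does the equivalent of this implicitly when a byte-string
    # meets unicode text; the non-ASCII byte makes it fail.
    path_bytes.decode('ascii')
    message = None
except UnicodeDecodeError as err:
    message = str(err)

# Decoding with the encoding the bytes were actually written in works.
path_text = path_bytes.decode('utf-8')
```

Decoding explicitly, with the right encoding, at the point where bytes
enter your program avoids relying on that implicit step.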
Dealing with Unicode text is much simpler in Python 3. Dealing with
*unknown* encodings is never easy, but so long as you can stick with
Unicode and UTF-8, Python 3 makes it easy.
--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.