On Wed, Mar 17, 2021 at 1:11 AM Michał Górny <mgorny@gentoo.org> wrote:
> On Wed, 2021-03-17 at 13:55 +0900, Inada Naoki wrote:
> > OK. setuptools doesn't specify encoding at all. So locale-specific
> > encoding is used.
> > We can not fix it in short term.
>
> How about writing paths as bytestrings in the long term?  I think this
> should eliminate the necessity of knowing the correct encoding for
> the filesystem.

On Linux and many Unixes, there is no "correct" filesystem encoding.  ASCII and UTF-8 are probably the most common encodings for individual files, maybe even large collections of files, but nevertheless, paths are bytestrings.  Treating paths as UTF-8 works fine for most files, but once in a while there'll be a filename that fails to convert, and that's not the fault of the filename.

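For example, what happens if you need to handle a file created like this?

    touch "Ma$(echo | tr '\012' '\361')ana"

That name is "Mañana" with the ñ stored as the single byte 0xF1 (octal 361), as in Latin-1, and the resulting byte sequence is not valid UTF-8.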
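To make that concrete, here is a minimal Python sketch of what happens to those bytes. It works purely in memory and does not touch the filesystem. The name `raw_name` and the helper `is_valid_utf8` are mine, for illustration only; the byte value 0xF1 is taken from the octal 361 in the command above.

```python
# The filename bytes produced by the touch command: "Ma", byte 0xF1, "ana".
raw_name = b"Ma\xf1ana"


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (illustrative helper)."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A strict UTF-8 decode rejects the name.
print(is_valid_utf8(raw_name))

# The surrogateescape error handler (PEP 383) maps each undecodable byte to a
# lone surrogate code point, so the original bytes can be recovered exactly.
as_text = raw_name.decode("utf-8", errors="surrogateescape")
round_tripped = as_text.encode("utf-8", errors="surrogateescape")
print(round_tripped == raw_name)
```

On POSIX systems, os.fsdecode() and os.fsencode() use this same error handler, which is how Python can carry such names around as str. The resulting string still contains a lone surrogate, though, so writing it to a UTF-8 text file with the default strict error handling will fail.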

For a presentation application (for example), assuming UTF-8 is probably fine, maybe even a good thing.  But for a filesystem backup tool, it's important not to assume an encoding, so you can back up and restore all filenames irrespective of what the files' creators intended encoding-wise.
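As a rough sketch of the backup-tool approach, Python's os functions return bytes when given a bytes path, so a tool can enumerate names without ever decoding them. The function name `list_names_as_bytes` is hypothetical, and this is only one way to do it, not a description of how any particular tool works.

```python
import os


def list_names_as_bytes(directory):
    """Return the entries of `directory` as raw bytes, with no decoding step.

    Passing a bytes path to os.listdir() makes it return bytes names,
    so a name that is not valid in any particular encoding is preserved as-is.
    """
    return sorted(os.listdir(os.fsencode(directory)))


if __name__ == "__main__":
    # List the current directory; each entry prints as a bytes literal.
    for name in list_names_as_bytes("."):
        print(name)
```

The names can then be joined with os.path.join() on bytes and passed back to open(), os.stat() and friends unchanged.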