On Wed, Mar 17, 2021 at 6:37 PM Steve Dower steve.dower@python.org wrote:
On 3/17/2021 8:00 AM, Michał Górny wrote:
How about writing paths as bytestrings in the long term? I think this should eliminate the necessity of knowing the correct encoding for the filesystem.
That's what we're trying to do, the problem is that they start as strings, and so we need to convert them to a bytestring.
That conversion is the encoding ;)
And yeah, for reading, I'd use a UTF-8 reader that falls back to locale on failure (and restarts reading the file). But for writing, we need the tools that create these files (including Notepad!) to use the encoding we want.
A somewhat radical idea carrying this to the extreme would be to use UTF-16 (LE) on Windows. After all, this _is_ the native file system encoding, and Notepad will happily read and write it.