On Mon, Feb 8, 2016 at 6:32 AM, Victor Stinner <victor.stinner@gmail.com> wrote:
> Windows native type for filenames is
> Unicode, and the Windows has a weird behaviour when you use bytes.

Just to clarify -- what does it currently do for bytes? IIUC, Windows uses UTF-16, so can you pass in UTF-16 bytes? Or, when using bytes, is it assuming some Windows ANSI-compatible encoding? (And what does it return?)

> Are we brave enough to force users to use the "right" type for filenames?

I think so :-)

> On Python 2, it wasn't possible to use Unicode for filenames, many
> functions fail badly with Unicode,

I've had fine success using Unicode filenames with py2 on Windows -- in fact, as soon as my users have non-ANSI characters in their names, I'm pretty sure I have no choice...

> especially when you mix bytes and
> Unicode.

Well, yes, that sure does get ugly!
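
For what it's worth, here is a minimal Python 3 sketch of the mixing problem. It only shows that os.listdir() returns names in the same type it was given, and that os.path.join() refuses to combine str and bytes. It says nothing about which encoding Windows assumes for bytes. The helper names (listing_types, join_mixed) and the file name are made up for illustration.

```python
import os
import tempfile


def listing_types(directory):
    """Return the element types os.listdir() yields for str and bytes input."""
    as_text = os.listdir(directory)                # str in -> str names out
    as_bytes = os.listdir(os.fsencode(directory))  # bytes in -> bytes names out
    return {type(n) for n in as_text}, {type(n) for n in as_bytes}


def join_mixed(directory, name_bytes):
    """Try to join a str directory with a bytes name and report the outcome."""
    try:
        os.path.join(directory, name_bytes)
    except TypeError:
        # Python 3 rejects mixed str/bytes path components.
        return "TypeError"
    return "ok"


# Hypothetical scratch directory holding a single file.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "example.txt"), "w").close()
    text_types, bytes_types = listing_types(tmp)
    mixed_result = join_mixed(tmp, b"example.txt")
```

So sticking to one type throughout (str, if we go the "right type" route) avoids the whole mess, and os.fsdecode() is there for any bytes names that come in from elsewhere.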

-CHB


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov