On 2 September 2016 at 08:31, Steve Dower wrote:
This proposal would remove all use of the *A APIs and only ever call the *W APIs. When Windows returns paths to Python as str, they will be decoded from utf-16-le and returned as text (in whatever the minimal representation is). When Windows returns paths to Python as bytes, they will be decoded from utf-16-le to utf-8 using surrogatepass (Windows does not validate surrogate pairs, so it is possible to have invalid surrogates in filenames). Equally, when paths are provided as bytes, they are decoded from utf-8 into utf-16-le and passed to the *W APIs.
The overall proposal looks good to me, there's just a terminology glitch here: utf-8 <-> utf-16-le should either be described as transcoding, or else as decoding and then re-encoding. As they're both text codecs, there's no "decoding" operation that switches between them.

As far as the timing of this particular change goes, I think you make a good case that all of the cases that will see a behaviour change with this PEP have already been receiving deprecation warnings since 3.3, which would make it acceptable to change the default behaviour in 3.6.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
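To make the "decode and then re-encode" wording concrete, here is a minimal pure-Python sketch of the round trip between utf-16-le and utf-8 with surrogatepass. The helper names utf16le_to_utf8 and utf8_to_utf16le are hypothetical and only for illustration; this is not how CPython implements the conversion internally, and it assumes Python 3.4 or later, where surrogatepass is supported by the utf-16 codecs.

```python
def utf16le_to_utf8(raw: bytes) -> bytes:
    """Transcode UTF-16-LE bytes to UTF-8 (hypothetical helper).

    There is no single "decode" step between two text codecs:
    we decode to str first, then re-encode.
    """
    text = raw.decode("utf-16-le", errors="surrogatepass")
    return text.encode("utf-8", errors="surrogatepass")


def utf8_to_utf16le(raw: bytes) -> bytes:
    """Transcode UTF-8 bytes back to UTF-16-LE (hypothetical helper)."""
    text = raw.decode("utf-8", errors="surrogatepass")
    return text.encode("utf-16-le", errors="surrogatepass")


# A name containing a lone high surrogate, which Windows does not reject.
name = "a\ud800b"
wide = name.encode("utf-16-le", errors="surrogatepass")

as_utf8 = utf16le_to_utf8(wide)
round_tripped = utf8_to_utf16le(as_utf8)
```

With surrogatepass on both steps, the lone surrogate survives the round trip; with the default strict error handler, either step would raise on such a name.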