[Python-ideas] Processing surrogates in

Andrew Barnert abarnert at yahoo.com
Thu May 14 21:48:16 CEST 2015


On May 14, 2015, at 07:49, random832 at fastmail.us wrote:
> 
>> On Thu, May 14, 2015, at 04:48, Andrew Barnert via Python-ideas wrote:
>> As far as I can tell, all of your extra cases are just examples of the
>> surrogateescape error handler, which Nick already mentioned.
> 
> Technically filesystem names (and other similar boundary APIs like
> environ, anything ctypes, etc) on Windows can contain arbitrary
> surrogates

Are you sure? I thought that, unless you're using Win95 or NT 3.1 or something, the Win32 *W APIs are explicitly for Unicode characters (not code units), minus nulls and any relevant reserved characters (e.g., no slashes in filenames, no control characters in filenames except for substream names, etc.). That's what the Naming Files doc seems to imply. (Then again, other parts of that doc seem confusing or misleading, e.g., where it tells you not to worry about normalization because once the string gets through Win32 and to the filesystem it's just a string of WCHARs, which sounds to me like exactly the reason you _should_ worry about normalization...)

> and have nothing to do with surrogateescape.
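
For anyone following along, here is a minimal sketch of the distinction being argued over, using only the standard codecs. It makes no claim about what the Win32 APIs actually accept; the byte strings and variable names are made up for illustration. It shows that surrogateescape only maps undecodable bytes into the range U+DC80..U+DCFF, so an arbitrary lone surrogate such as U+D800 (as might arrive as a stray UTF-16 code unit) is a different case and needs surrogatepass instead.

```python
# Case 1: surrogateescape smuggles an undecodable byte through str.
raw = b"caf\xff"  # hypothetical bytes; 0xFF is never valid in UTF-8
text = raw.decode("utf-8", errors="surrogateescape")
# The bad byte 0xFF becomes the lone low surrogate U+DCFF.
roundtrip = text.encode("utf-8", errors="surrogateescape")

# Case 2: a lone high surrogate as a raw UTF-16-LE code unit.
units = b"\x00\xd8"  # hypothetical input: the single code unit 0xD800
lone = units.decode("utf-16-le", errors="surrogatepass")

# surrogateescape cannot encode U+D800, because it is outside U+DC80..U+DCFF.
try:
    lone.encode("utf-8", errors="surrogateescape")
    escape_handles_lone = True
except UnicodeEncodeError:
    escape_handles_lone = False

# surrogatepass round-trips the code unit unchanged.
back = lone.encode("utf-16-le", errors="surrogatepass")
```

If that reading is right, a str holding such a code unit would have to be handled with surrogatepass (or rejected), not surrogateescape, which is the point of the "nothing to do with surrogateescape" remark.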

