<div class="gmail_quote">On Tue, Apr 28, 2009 at 20:45, &quot;Martin v. Löwis&quot; <span dir="ltr">&lt;<a href="mailto:martin@v.loewis.de">martin@v.loewis.de</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">


<div class="im">&gt; Furthermore, I don&#39;t believe that PEP 383 works consistently on Windows,<br>

<br>

</div>What makes you say that? PEP 383 will have no effect on Windows,<br>

compared to the status quo, whatsoever.<br>

</blockquote><div><br>That&#39;s what you believe, but it&#39;s not clear to me that it follows from your proposal.<br><br>Your proposal says that utf-8b would be used for file systems, but you also say that it might be used for command line arguments and environment variables. So which specific APIs will it be used with on Windows and on POSIX systems? Or will utf-8b simply not be available on Windows at all? What happens if I write a Python version of tar, utf-8b strings slip in, and I then try to use them on Windows?<br>


<br>You also assume that all Windows file system functions strictly conform to UTF-16 in practice (not just on paper).  Have you verified that?  It certainly isn&#39;t true across all versions of Windows (since NT originally used UCS-2).   What&#39;s the situation on Windows CE?<br>


<br>

Another question on Linux: what happens when I decode a file system path with utf-8b and then pass the resulting Unicode string to GNOME? To Qt? To Windows.Forms? To Java? To a Unicode regular expression library? To wprintf? AFAIK, the behavior of most libraries is undefined for the kinds of Unicode strings you construct, and it may be undefined in a bad way (crash, buffer overflow, whatever).<br>
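<br>To make the concern concrete, here is a minimal sketch. It assumes that utf-8b behaves like the <code>surrogateescape</code> error handler that PEP 383 describes; the file name is a made-up example, and the strict encoder only stands in for whatever another library might do with such a string:<br>

```python
import unicodedata

# Assumption: "utf-8b" is modelled by the "surrogateescape" error handler
# from PEP 383. The byte string is a hypothetical Latin-1 file name.
raw = b"caf\xe9.txt"  # 0xE9 is not valid UTF-8 in this position

decoded = raw.decode("utf-8", errors="surrogateescape")

# The undecodable byte shows up as a lone surrogate (category "Cs").
lone = [hex(ord(ch)) for ch in decoded if unicodedata.category(ch) == "Cs"]

# Encoding with the same handler restores the original bytes.
round_trip = decoded.encode("utf-8", errors="surrogateescape")

# A strict UTF-8 encoder refuses the string because of the lone surrogate.
try:
    decoded.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(lone, round_trip == raw, strict_ok)
```

<br>Python itself round-trips the string, but the strict encode fails. A library that does not check for lone surrogates at all would not fail this cleanly, which is the case I am asking about.<br>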


<br>Tom<br><br></div></div>