<br><div><span class="gmail_quote">On 10/11/07, <b class="gmail_sendername">Guido van Rossum</b> <<a href="mailto:guido@python.org">guido@python.org</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On 10/11/07, Christian Heimes <<a href="mailto:lists@cheimes.de">lists@cheimes.de</a>> wrote:<br>> Guido van Rossum wrote:<br>> > Um, where does the filename object in that expression come from? It<br>> > appears to be a PyString object. Who created it? That code should be
<br>> > changed to create a PyUnicode instead (using the filesystem encoding).<br>><br>> Python/compile.c:makecode()<br>> filename = PyString_FromString(c->c_filename);<br>><br>> Modules/pyexpat.c:getcode()
<br>> filename = PyString_FromString(__FILE__);<br>><br>> Objects/codeobject.c:code_new()<br>> PyArg_ParseTuple(args, "iiiiiSO!O!O!SSiS|O!O!:code"<br>><br>> As I tried to explain earlier that may be a problem. PyUnicode_Decode()
<br>> doesn't work so early. The codecs package isn't initialized yet.<br><br>But some codecs are "built-in" and have custom APIs. I wonder if we<br>could do something that figures out the default fs encoding, and see
<br>if it is one of the supported ones, and then uses that; otherwise<br>tries UTF-8 with the "replace" error handling option (so it won't fail<br>if the data is non-UTF-8).<br></blockquote><div><br>That's pretty much what Christian pondered at the start of this thread, but with a defined "failure" mode.
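<br><br>As a rough illustration of that lookup-then-fallback idea, here is a minimal Python sketch (the real change would live in C, in places like makecode()). The names BUILTIN_ENCODINGS, ALIASES and decode_filename are hypothetical, the set of "built-in" codecs is an assumption, and falling back to UTF-8 with "replace" when the fs encoding itself fails to decode is also an assumption rather than something spelled out above.<br>
<pre>
```python
import sys

# Assumption: encodings taken to be usable before the codecs package is ready.
BUILTIN_ENCODINGS = {"utf-8", "utf-16", "utf-32", "latin-1", "ascii", "mbcs"}

# Hypothetical alias table so common spellings map onto the names above.
ALIASES = {"utf8": "utf-8", "utf16": "utf-16", "utf32": "utf-32",
           "latin1": "latin-1", "iso-8859-1": "latin-1"}


def decode_filename(raw, fs_encoding=None):
    """Decode a bytes filename using the fs encoding when it is 'built-in',
    otherwise (or on failure) fall back to UTF-8 with 'replace'."""
    if fs_encoding is None:
        fs_encoding = sys.getfilesystemencoding()
    name = (fs_encoding or "").lower().replace("_", "-")
    name = ALIASES.get(name, name)
    if name in BUILTIN_ENCODINGS:
        try:
            return raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            # LookupError covers e.g. "mbcs" on non-Windows platforms.
            pass
    # Defined failure mode: never raises, bad bytes become U+FFFD.
    return raw.decode("utf-8", "replace")
```
</pre>
With this sketch, decode_filename(b"caf\xe9.py", "latin-1") decodes normally, while an encoding outside the set goes straight to the UTF-8 "replace" path instead of raising.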
<br><br>+1 from me, give it a try and see what 3.0a2 testers say. Are there OSes and filesystems out there that'd store filenames in anything other than one of the popular codecs (UTF-8, UTF-16, UTF-32, latin1, mbcs)? That seems like a bad idea to me, but obviously I don't run the world.
<br><br>-gps<br><br></div></div>