New subject: r87518 - python/branches/py3k/Parser/tokenizer.c

27 Dec 2010

On 27.12.2010 21:12, victor.stinner wrote:
...
Author: victor.stinner
Date: Mon Dec 27 21:12:13 2010
New Revision: 87518
Log:
Issue #10778: decoding_fgets() decodes the filename from the filesystem
encoding instead of UTF-8.
Modified:
   python/branches/py3k/Parser/tokenizer.c
Modified: python/branches/py3k/Parser/tokenizer.c
==============================================================================

--- python/branches/py3k/Parser/tokenizer.c	(original)
+++ python/branches/py3k/Parser/tokenizer.c	Mon Dec 27 21:12:13 2010
@@ -545,6 +545,7 @@
 {
     char *line = NULL;
     int badchar = 0;
+    PyObject *filename;
     for (;;) {
         if (tok->decoding_state == STATE_NORMAL) {
             /* We already have a codec associated with
@@ -585,12 +586,16 @@
     if (badchar) {
         /* Need to add 1 to the line number, since this line
            has not been counted, yet.  */
-        PyErr_Format(PyExc_SyntaxError,
-            "Non-UTF-8 code starting with '\\x%.2x' "
-            "in file %.200s on line %i, "
-            "but no encoding declared; "
-            "see http://python.org/dev/peps/pep-0263/ for details",
-            badchar, tok->filename, tok->lineno + 1);
+        filename = PyUnicode_DecodeFSDefault(tok->filename);
+        if (filename != NULL) {
+            PyErr_Format(PyExc_SyntaxError,
+                    "Non-UTF-8 code starting with '\\x%.2x' "
+                    "in file %.200U on line %i, "
+                    "but no encoding declared; "
+                    "see http://python.org/dev/peps/pep-0263/ for details",
+                    badchar, filename, tok->lineno + 1);
+            Py_DECREF(filename);
+        }

Hmm, and in case decoding fails, we return a Unicode error (without context)
instead of a syntax error?  Doesn't seem like a good trade-off when the file
name is just displayed in a message.

Georg
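
To make the concern above concrete: since the file name is only shown in the
message, one conceivable alternative is a lossy conversion that cannot fail, so
the syntax error is always reported. The following is a minimal standalone
sketch of that idea, not what r87518 or any later revision does. The helpers
escape_filename() and format_badchar_error() are hypothetical names invented
for illustration; they use plain C strings instead of the Python C API, and the
message text is shortened.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of CPython): copy `name` into `out`,
   replacing every byte outside printable ASCII with a \xNN escape.
   It cannot fail, so the caller can always build its error message. */
static void
escape_filename(const char *name, char *out, size_t out_size)
{
    size_t used = 0;

    assert(out_size > 0);
    for (; *name != '\0'; name++) {
        unsigned char c = (unsigned char)*name;
        /* A plain character needs 1 byte, an escape needs 4. */
        size_t need = (c >= 0x20 && c < 0x7f) ? 1 : 4;

        if (used + need >= out_size)
            break;  /* truncate instead of overflowing; keeps room for NUL */
        if (need == 1)
            out[used++] = (char)c;
        else
            used += (size_t)snprintf(out + used, out_size - used,
                                     "\\x%02x", (unsigned int)c);
    }
    out[used] = '\0';
}

/* Hypothetical stand-in for the PyErr_Format() call in the diff: the
   message is always produced, whatever bytes the file name contains. */
static void
format_badchar_error(char *msg, size_t msg_size, int badchar,
                     const char *filename, int lineno)
{
    char safe[201];  /* mirrors the 200-character limit in the original */

    escape_filename(filename, safe, sizeof(safe));
    snprintf(msg, msg_size,
             "Non-UTF-8 code starting with '\\x%.2x' in file %s on line %i, "
             "but no encoding declared",
             (unsigned int)badchar, safe, lineno);
}
```

The trade-off in this sketch is that a non-ASCII file name is displayed in
escaped form rather than decoded, in exchange for never replacing the syntax
error with a decoding error.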

    
