[Python-checkins] r73699 - in python/branches/release31-maint: Lib/test/test_codecs.py Misc/NEWS Objects/unicodeobject.c

amaury.forgeotdarc python-checkins at python.org
Tue Jun 30 00:38:54 CEST 2009


Author: amaury.forgeotdarc
Date: Tue Jun 30 00:38:54 2009
New Revision: 73699

Log:
Merged revisions 73698 via svnmerge from 
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r73698 | amaury.forgeotdarc | 2009-06-30 00:36:49 +0200 (Tue, 30 Jun 2009) | 7 lines
  
  #6373: SystemError in str.encode('latin1', 'surrogateescape')
  if the string contains unpaired surrogates.
  (In debug build, crash in assert())
  
  This can happen during normal processing, if Python starts with utf-8 as
  the file system encoding, then calls sys.setfilesystemencoding('latin-1')
........
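
For readers unfamiliar with the handler, the scenario in the log can be sketched
in a few lines of Python. This is only an illustration on a fixed interpreter,
not part of the commit: the byte string is borrowed from the new test, and the
variable names (raw, text, restored) are made up for the example.

```python
# Bytes that are not valid UTF-8, e.g. a Latin-1 encoded file name
# seen by a process that started with utf-8 as its file system encoding.
raw = b"\xe4\xeb\xef\xf6\xfc"

# Decoding with 'surrogateescape' maps each undecodable byte 0xNN to the
# lone surrogate U+DCNN instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler turns each lone surrogate back into its
# original byte. With the latin-1 codec this is the call that failed
# before the fix described in the log.
restored = text.encode("latin-1", "surrogateescape")

print(ascii(text))
print(restored)
```

Every surrogate here is replaced by exactly one byte, so the round trip returns
the original bytes unchanged.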


Modified:
   python/branches/release31-maint/   (props changed)
   python/branches/release31-maint/Lib/test/test_codecs.py
   python/branches/release31-maint/Misc/NEWS
   python/branches/release31-maint/Objects/unicodeobject.c

Modified: python/branches/release31-maint/Lib/test/test_codecs.py
==============================================================================
--- python/branches/release31-maint/Lib/test/test_codecs.py	(original)
+++ python/branches/release31-maint/Lib/test/test_codecs.py	Tue Jun 30 00:38:54 2009
@@ -1549,6 +1549,11 @@
         self.assertEqual("foo\udca5bar".encode("iso-8859-3", "surrogateescape"),
                          b"foo\xa5bar")
 
+    def test_latin1(self):
+        # Issue6373
+        self.assertEqual("\udce4\udceb\udcef\udcf6\udcfc".encode("latin1", "surrogateescape"),
+                         b"\xe4\xeb\xef\xf6\xfc")
+
 
 def test_main():
     support.run_unittest(

Modified: python/branches/release31-maint/Misc/NEWS
==============================================================================
--- python/branches/release31-maint/Misc/NEWS	(original)
+++ python/branches/release31-maint/Misc/NEWS	Tue Jun 30 00:38:54 2009
@@ -12,6 +12,10 @@
 Core and Builtins
 -----------------
 
+- Issue #6373: Fixed a SystemError when encoding a string that contains
+  unpaired surrogates with the latin-1 codec and the 'surrogateescape'
+  error handler.
+
 Library
 -------
 

Modified: python/branches/release31-maint/Objects/unicodeobject.c
==============================================================================
--- python/branches/release31-maint/Objects/unicodeobject.c	(original)
+++ python/branches/release31-maint/Objects/unicodeobject.c	Tue Jun 30 00:38:54 2009
@@ -4201,10 +4201,12 @@
                     repsize = PyBytes_Size(repunicode);
                     if (repsize > 1) {
                         /* Make room for all additional bytes. */
+                        respos = str - PyBytes_AS_STRING(res);
                         if (_PyBytes_Resize(&res, ressize+repsize-1)) {
                             Py_DECREF(repunicode);
                             goto onError;
                         }
+                        str = PyBytes_AS_STRING(res) + respos;
                         ressize += repsize-1;
                     }
                     memcpy(str, PyBytes_AsString(repunicode), repsize);

