[Python-checkins] CVS: python/dist/src/Python import.c,2.193,2.194

M.-A. Lemburg lemburg@users.sourceforge.net
Thu, 07 Feb 2002 03:33:51 -0800


Update of /cvsroot/python/python/dist/src/Python
In directory usw-pr-cvs1:/tmp/cvs-serv8617/Python

Modified Files:
	import.c 
Log Message:
Fix to the UTF-8 encoder: it failed on 0-length input strings.

Fix for the UTF-8 decoder: it now accepts isolated surrogates
(previously it raised an exception, which caused round-trips to
fail).

Added new tests for UTF-8 round-trip safety (we rely on UTF-8 for
marshalling Unicode objects, so we had better make sure it works for
all Unicode code points, including isolated surrogates).
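
The round-trip requirement can be illustrated with a small sketch. This
is not the code from the actual codec; the function names
utf8_encode_bmp and utf8_decode_bmp are hypothetical, and the sketch
only covers code points up to U+FFFF. It encodes a code point without
rejecting surrogates and decodes it again just as leniently, so an
isolated surrogate such as U+D800 survives the trip:

```c
#include <stddef.h>

/* Hypothetical helper: encode one code point in U+0000..U+FFFF as UTF-8
   without rejecting surrogates. The caller must ensure cp <= 0xFFFF.
   Returns the number of bytes written (1 to 3). */
static size_t utf8_encode_bmp(unsigned int cp, unsigned char out[3])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Hypothetical helper: decode one 1- to 3-byte sequence. Surrogates are
   accepted on purpose; overlong and malformed sequences yield -1. */
static long utf8_decode_bmp(const unsigned char *s, size_t len)
{
    if (len == 1 && s[0] < 0x80)
        return s[0];
    if (len == 2 && (s[0] & 0xE0) == 0xC0 && (s[1] & 0xC0) == 0x80) {
        long cp = ((long)(s[0] & 0x1F) << 6) | (long)(s[1] & 0x3F);
        return cp >= 0x80 ? cp : -1;      /* reject overlong forms */
    }
    if (len == 3 && (s[0] & 0xF0) == 0xE0 &&
        (s[1] & 0xC0) == 0x80 && (s[2] & 0xC0) == 0x80) {
        long cp = ((long)(s[0] & 0x0F) << 12) |
                  ((long)(s[1] & 0x3F) << 6) |
                  (long)(s[2] & 0x3F);
        return cp >= 0x800 ? cp : -1;     /* reject overlong forms */
    }
    return -1;
}
```

With these helpers, U+D800 encodes to the three bytes ED A0 80, and
decoding those bytes gives U+D800 back. A decoder that rejects
surrogates would fail at the second step, which is the round-trip
failure described above.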

Bumped the PYC magic in a non-standard way -- please review. This
was needed because the old PYC format used illegal UTF-8 sequences
for isolated high surrogates, which now raise an exception.



Index: import.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/import.c,v
retrieving revision 2.193
retrieving revision 2.194
diff -C2 -d -r2.193 -r2.194
*** import.c	12 Jan 2002 11:05:10 -0000	2.193
--- import.c	7 Feb 2002 11:33:49 -0000	2.194
***************
*** 42,47 ****
         (quite apart from that the -U option doesn't work so isn't used
         anyway).
  */
! #define MAGIC (60717 | ((long)'\r'<<16) | ((long)'\n'<<24))
  
  /* Magic word as global; note that _PyImport_Init() can change the
--- 42,66 ----
         (quite apart from that the -U option doesn't work so isn't used
         anyway).
+ 
+    XXX MAL, 2002-02-07: I had to modify the MAGIC due to a fix of the
+        UTF-8 encoder (it previously produced invalid UTF-8 for unpaired
+        high surrogates), so I simply bumped the month value to 20 (invalid
+        month) and set the day to 1.  This should be recognizable by any
+        algorithm relying on the above scheme. Perhaps we should simply
+        start counting in increments of 10 from now on ?!
+ 
+    Known values:
+        Python 1.5:   20121
+        Python 1.5.1: 20121
+        Python 1.5.2: 20121
+        Python 2.0:   50823
+        Python 2.0.1: 50823
+        Python 2.1:   60202
+        Python 2.1.1: 60202
+        Python 2.1.2: 60202
+        Python 2.2:   60717
+        Python 2.3a0: 62001
  */
! #define MAGIC (62001 | ((long)'\r'<<16) | ((long)'\n'<<24))
  
  /* Magic word as global; note that _PyImport_Init() can change the
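
For readers unfamiliar with how the MAGIC word is put together, here is
a minimal sketch. make_magic repeats the expression from the macro in
the diff above; magic_bytes and split_magic_number are hypothetical
helpers. The sketch assumes that the magic word is written to the .pyc
file as four little-endian bytes, and that the scheme the comment
refers to packs (YEAR-1995), MONTH, DAY into decimal digits; that
scheme is not shown in this diff, so treat it as an assumption.

```c
/* Same expression as the MAGIC macro, with the 16-bit number passed in. */
static long make_magic(long number)
{
    return number | ((long)'\r' << 16) | ((long)'\n' << 24);
}

/* Hypothetical helper: lay the magic word out as four little-endian
   bytes, low byte first (assumed on-disk order). */
static void magic_bytes(long magic, unsigned char out[4])
{
    int i;
    for (i = 0; i < 4; i++)
        out[i] = (unsigned char)((magic >> (8 * i)) & 0xFF);
}

struct magic_date {
    int year;
    int month;
    int day;
};

/* Hypothetical helper. ASSUMPTION: the number is
   (year - 1995) * 10000 + month * 100 + day. */
static struct magic_date split_magic_number(long number)
{
    struct magic_date d;
    d.year = 1995 + (int)(number / 10000);
    d.month = (int)(number / 100 % 100);
    d.day = (int)(number % 100);
    return d;
}
```

Under these assumptions, make_magic(62001) yields the bytes 31 F2 0D 0A,
and split_magic_number(62001) gives year 2001, month 20, day 1, the
deliberately invalid month mentioned in the comment. For comparison,
60717 (Python 2.2) splits into 2001, 7, 17.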