[Python-Dev] Dict suppressing exceptions

Guido van Rossum guido at python.org
Wed Aug 9 23:30:04 CEST 2006


I've been happily ignoring python-dev for the last three weeks or so,
and Neal just pointed me to some thorny issues that are close to
resolution but not yet resolved, and that need to be settled before
beta 3 on August 18 (Friday next week).

Here's my take on the dict-suppressing-exceptions issue (I'll send out
separate messages for each issue where Neal has asked me to weigh in).

It wasn't my idea to stop ignoring exceptions in dict lookups; I would
gladly have put this off until Py3k, where the main problem
(str-unicode __eq__ raising UnicodeError) will go away.
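As a rough illustration of what "not ignoring exceptions in dict
lookups" means, here is a minimal sketch written for Python 3, where
the str-unicode case no longer exists. The RaisingKey class is
hypothetical and exists only to force a failing comparison; it is an
assumption of this sketch, not code from the discussion.

```python
class RaisingKey:
    """Hypothetical key whose equality test always raises."""

    def __init__(self, hash_value):
        self._hash = hash_value

    def __hash__(self):
        # Reuse the hash of an existing key so the lookup must compare.
        return self._hash

    def __eq__(self, other):
        raise ValueError("comparison failed")


table = {"1": 1}
probe = RaisingKey(hash("1"))  # same hash as the stored key "1"

try:
    result = probe in table
except ValueError as exc:
    # The error raised inside __eq__ reaches the caller of "in".
    result = "raised: %s" % exc

print(result)
```

With lookups that suppress exceptions, the same membership test would
quietly evaluate to false instead of raising.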

But since people are adamant that they want this in sooner, I suggest
that to minimize breakage we could make an exception for str-unicode
comparisons.

I came up with the following code to reproduce the issue; this prints
0 in 2.2, False in 2.3 and 2.4, but raises UnicodeDecodeError in 2.5
(head):

  a = {u"1": 1}
  x = hash(u"1")
  class C(str):
      def __hash__(s): return x
  print C("\xff") in a

The following patch makes this print False in 2.5 again.

Notes about the patch:

- this also fixes an out-of-date comment that should be fixed even if
the rest of the idea is rejected (lookdict_string() can return NULL
when it calls lookdict)

- the exception could be narrowed even further by only suppressing the
exception when startkey and key are both either str or unicode
instances.
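The policy described in these notes (suppress one specific exception
during key comparison, let everything else propagate) can be sketched
in Python 3 as follows. This models only the policy, not the layout
or probing of dictobject.c: the entries list, the contains() function
and the RaisingKey class are all hypothetical names made up for the
sketch, and the further narrowing to str/unicode operands is not
modeled.

```python
class RaisingKey:
    """Hypothetical key whose equality test raises a chosen error."""

    def __init__(self, hash_value, error):
        self._hash = hash_value
        self._error = error

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        raise self._error


def contains(entries, key, suppressed=(UnicodeDecodeError,)):
    """Membership test over (hash, key, value) tuples.

    Exceptions listed in `suppressed` that occur while comparing keys
    are treated as "not equal"; any other exception propagates.
    Note that a Python except clause also matches subclasses.
    """
    key_hash = hash(key)
    for entry_hash, entry_key, _value in entries:
        if entry_hash != key_hash:
            continue
        if entry_key is key:
            return True
        try:
            equal = bool(entry_key == key)
        except suppressed:
            equal = False
        if equal:
            return True
    return False


entries = [(hash("1"), "1", 1)]

decode_failure = RaisingKey(
    hash("1"),
    UnicodeDecodeError("ascii", b"\xff", 0, 1, "ordinal not in range(128)"),
)
other_failure = RaisingKey(hash("1"), RuntimeError("unrelated failure"))

print(contains(entries, "1"))             # ordinary hit
print(contains(entries, decode_failure))  # decode error is swallowed
try:
    contains(entries, other_failure)
except RuntimeError as exc:
    print("propagated:", exc)
```

Keeping the suppressed set as small as possible is the point of the
second note: every exception type added there is one more class of
bug that a lookup can hide.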

What do people think?

--- Objects/dictobject.c	(revision 51180)
+++ Objects/dictobject.c	(working copy)
@@ -230,7 +230,8 @@
 lookdict() is general-purpose, and may return NULL if (and only if) a
 comparison raises an exception (this was new in Python 2.5).
 lookdict_string() below is specialized to string keys, comparison of which can
-never raise an exception; that function can never return NULL.  For both, when
+never raise an exception; that function can never return NULL (except when it
+decides to replace itself with the more general lookdict()).  For both, when
 the key isn't found a dictentry* is returned for which the me_value field is
 NULL; this is the slot in the dict at which the key would have been found, and
 the caller can (if it wishes) add the <key, value> pair to the returned
@@ -259,8 +260,13 @@
 		if (ep->me_hash == hash) {
 			startkey = ep->me_key;
 			cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
-			if (cmp < 0)
-				return NULL;
+			if (cmp < 0) {
+				if (PyErr_Occurred()==PyExc_UnicodeDecodeError) {
+					PyErr_Clear();
+				}
+				else
+					return NULL;
+			}
 			if (ep0 == mp->ma_table && ep->me_key == startkey) {
 				if (cmp > 0)
 					return ep;
@@ -289,8 +295,13 @@
 		if (ep->me_hash == hash && ep->me_key != dummy) {
 			startkey = ep->me_key;
 			cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
-			if (cmp < 0)
-				return NULL;
+			if (cmp < 0) {
+				if (PyErr_Occurred()==PyExc_UnicodeDecodeError) {
+					PyErr_Clear();
+				}
+				else
+					return NULL;
+			}
 			if (ep0 == mp->ma_table && ep->me_key == startkey) {
 				if (cmp > 0)
 					return ep;


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)