[issue19512] Avoid most calls to PyUnicode_DecodeUTF8Stateful() in Python interactive mode
STINNER Victor
report at bugs.python.org
Wed Nov 6 23:43:12 CET 2013
STINNER Victor added the comment:
Another problem is that PyUnicode_FromString() failure is not handled correctly in some cases. PyUnicode_FromString() can fail because an decoder error, but also because of a MemoryError. For example, PyDict_GetItemString() returns NULL as if the entry does not exist if PyUnicode_FromString() failed :-(
---
PyObject *
PyDict_GetItemString(PyObject *v, const char *key)
{
PyObject *kv, *rv;
kv = PyUnicode_FromString(key);
if (kv == NULL) {
PyErr_Clear();
return NULL;
}
rv = PyDict_GetItem(v, kv);
Py_DECREF(kv);
return rv;
}
---
While working on failmalloc issues (#18048, #19437), I found some places where MemoryError caused tricky bugs because of this. Example of such issue:
---
changeset: 84684:af18829a7754
user: Victor Stinner <victor.stinner at gmail.com>
date: Wed Jul 17 01:22:45 2013 +0200
files: Objects/structseq.c Python/pythonrun.c
description:
Close #18469: Replace PyDict_GetItemString() with _PyDict_GetItemId() in structseq.c
_PyDict_GetItemId() is more efficient: it only builds the Unicode string once.
Identifiers (dictionary keys) are now created at Python initialization, and if
the creation failed, Python does exit with a fatal error.
Before, PyDict_GetItemString() failure was not handled: structseq_new() could
call PyObject_GC_NewVar() with a negative size, and structseq_dealloc() could
also crash.
---
So moving from PyDict_GetItemString() to _PyDict_GetItemId() is for perfomances, but the main motivation is to handle better errors. I hope that the identifier will be initialized quickly at startup, and if its initialization failed, the failure is handled better...
There is also a _PyDict_GetItemIdWithError() function. But it is not used currently (it was in changeset 2dd046be2c88).
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19512>
_______________________________________
More information about the Python-bugs-list
mailing list