Re: [Python-checkins] cpython: pyexpat uses the new Unicode API

changeset:   72548:a1be34457ccf
user:        Victor Stinner <victor.stinner at haypocalc.com>
date:        Sat Oct 01 01:05:40 2011 +0200
summary:
  pyexpat uses the new Unicode API

files:
  Modules/pyexpat.c |  12 +++++++-----
  1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c
--- a/Modules/pyexpat.c
+++ b/Modules/pyexpat.c
@@ -1234,11 +1234,13 @@
 static PyObject *
 xmlparse_getattro(xmlparseobject *self, PyObject *nameobj)
 {
-    const Py_UNICODE *name;
+    Py_UCS4 first_char;
     int handlernum = -1;

     if (!PyUnicode_Check(nameobj))
         goto generic;
+    if (PyUnicode_READY(nameobj))
+        return NULL;
Why is this PyUnicode_READY necessary? Can tp_getattro pass unfinished unicode objects?
I hope we don't have to update all extension modules?

--
Amaury Forgeot d'Arc

> Why is this PyUnicode_READY necessary? Can tp_getattro pass unfinished unicode objects?
There is no guarantee that it passes READY strings. Giving such a guarantee would be fairly error-prone, since people would then have to remember in which places they need to READY. So the guideline really is that you always need to READY whenever you access the internal representation.
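The "READY before touching the internals" guideline can be illustrated with a small standalone toy model. This is only a sketch: toy_str, toy_ready and toy_first_char are hypothetical names invented for illustration, not CPython's structures or API, and the toy supports only code points up to 0xFF. The point is that the accessor never assumes its caller has already made the string ready.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy string, NOT CPython's PyUnicodeObject. It is created
   with a legacy buffer only; the canonical form is built lazily. */
typedef struct {
    const uint32_t *legacy;  /* buffer supplied by the creator */
    size_t length;           /* number of code points */
    uint8_t *compact;        /* canonical 1-byte form, NULL until ready */
} toy_str;

/* Toy analogue of a READY call: returns 0 on success, -1 on failure.
   Calling it on a string that is already ready is a cheap no-op. */
static int toy_ready(toy_str *s)
{
    if (s->compact != NULL)
        return 0;
    uint8_t *buf = malloc(s->length ? s->length : 1);
    if (buf == NULL)
        return -1;
    for (size_t i = 0; i < s->length; i++) {
        if (s->legacy[i] > 0xFF) {   /* toy limitation: 1-byte form only */
            free(buf);
            return -1;
        }
        buf[i] = (uint8_t)s->legacy[i];
    }
    s->compact = buf;
    return 0;
}

/* Reads the internal representation, so it makes the string ready first
   instead of relying on the caller. Returns -1 on error or empty string. */
static long toy_first_char(toy_str *s)
{
    if (toy_ready(s) < 0)
        return -1;
    if (s->length == 0)
        return -1;
    assert(s->compact != NULL);
    return s->compact[0];
}
```

Because the ready step is idempotent, calling it in every accessor costs little, and nobody has to track which call sites are guaranteed to receive ready strings.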
> I hope we don't have to update all extension modules?
Of course not. Existing code will continue to work fine, with some really minor limitations (see the PEP for details). In fact, pyexpat worked just fine before Victor changed it (and continues to do so after the change, with potentially reduced memory consumption).

Such code will not use the new API to access the flexible representation. If somebody changes the module to do so, they will have to add PyUnicode_READY calls. Since Victor changed pyexpat to use the new API, he also added the READY calls (which are necessary precisely to continue supporting the old API, so they can go away in Python 4).

Regards,
Martin
participants (2)
- Amaury Forgeot d'Arc
- martin@v.loewis.de