[Python-Dev] [Python-checkins] cpython: pyexpat uses the new Unicode API
Victor Stinner
victor.stinner at haypocalc.com
Tue Oct 4 11:48:42 CEST 2011
On 03/10/2011 11:10, Amaury Forgeot d'Arc wrote:
>> changeset: 72548:a1be34457ccf
>> user: Victor Stinner<victor.stinner at haypocalc.com>
>> date: Sat Oct 01 01:05:40 2011 +0200
>> summary:
>> pyexat uses the new Unicode API
>>
>> files:
>> Modules/pyexpat.c | 12 +++++++-----
>> 1 files changed, 7 insertions(+), 5 deletions(-)
>>
>>
>> diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c
>> --- a/Modules/pyexpat.c
>> +++ b/Modules/pyexpat.c
>> @@ -1234,11 +1234,13 @@
>> static PyObject *
>> xmlparse_getattro(xmlparseobject *self, PyObject *nameobj)
>> {
>> - const Py_UNICODE *name;
>> + Py_UCS4 first_char;
>> int handlernum = -1;
>>
>> if (!PyUnicode_Check(nameobj))
>> goto generic;
>> + if (PyUnicode_READY(nameobj))
>> + return NULL;
>
> Why is this PyUnicode_READY necessary?
> Can tp_getattro pass unfinished unicode objects?
> I hope we don't have to update all extension modules?
The Unicode API is supposed to deliver only ready strings. But all
extensions written for Python 3.2 use the "legacy" API
(PyUnicode_FromUnicode and PyUnicode_FromStringAndSize(NULL, size)), so
none of the strings they create are ready.
But *no*, you don't have to update an extension that only reads strings
just to add a call to PyUnicode_READY. You only have to call
PyUnicode_READY if you use the new API (e.g. PyUnicode_READ_CHAR), that
is, only if you modify your code.
Another extract of my commit (on pyexpat):
- name = PyUnicode_AS_UNICODE(nameobj);
+ first_char = PyUnicode_READ_CHAR(nameobj, 0);
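
To illustrate what "ready" means here, below is a toy model in plain C.
It is only a sketch and not CPython's code: toy_str, toy_new_legacy,
toy_ready, toy_read_char and toy_free are hypothetical names invented
for this example. The assumption it encodes is that a legacy-style
string holds only a wide buffer until a "ready" step builds the compact
form, and that the new-style read is valid only after that step.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model only: every name here is hypothetical, not CPython's. */
typedef struct {
    size_t length;     /* number of characters */
    uint32_t *legacy;  /* wide buffer filled in by "legacy"-style creators */
    void *data;        /* compact buffer; NULL until the string is ready */
    int kind;          /* bytes per character in data (1, 2 or 4); 0 = not ready */
} toy_str;

/* Legacy-style creation: the caller fills s->legacy afterwards, so the
   widest character is unknown and the string starts out not ready. */
static toy_str *toy_new_legacy(size_t length)
{
    toy_str *s = calloc(1, sizeof *s);
    if (s == NULL)
        return NULL;
    s->legacy = calloc(length ? length : 1, sizeof *s->legacy);
    if (s->legacy == NULL) {
        free(s);
        return NULL;
    }
    s->length = length;
    return s;
}

/* Build the compact form. Returns 0 on success and -1 on failure,
   the same convention as the PyUnicode_READY check in the diff. */
static int toy_ready(toy_str *s)
{
    uint32_t max_char = 0;
    size_t i;
    int kind;
    void *data;

    if (s->kind != 0)
        return 0;  /* already ready: nothing to do */
    for (i = 0; i < s->length; i++)
        if (s->legacy[i] > max_char)
            max_char = s->legacy[i];
    /* Pick the narrowest width that can hold every character. */
    kind = max_char < 0x100 ? 1 : (max_char < 0x10000 ? 2 : 4);
    data = malloc((s->length ? s->length : 1) * (size_t)kind);
    if (data == NULL)
        return -1;
    for (i = 0; i < s->length; i++) {
        if (kind == 1)
            ((uint8_t *)data)[i] = (uint8_t)s->legacy[i];
        else if (kind == 2)
            ((uint16_t *)data)[i] = (uint16_t)s->legacy[i];
        else
            ((uint32_t *)data)[i] = s->legacy[i];
    }
    free(s->legacy);
    s->legacy = NULL;
    s->data = data;
    s->kind = kind;
    return 0;
}

/* New-style read: only valid on a ready string. */
static uint32_t toy_read_char(const toy_str *s, size_t i)
{
    assert(s->kind != 0 && i < s->length);
    if (s->kind == 1)
        return ((const uint8_t *)s->data)[i];
    if (s->kind == 2)
        return ((const uint16_t *)s->data)[i];
    return ((const uint32_t *)s->data)[i];
}

static void toy_free(toy_str *s)
{
    free(s->legacy);
    free(s->data);
    free(s);
}
```

In this model, code that only touches the wide buffer never needs
toy_ready, while any caller of toy_read_char must call it first. That
is the same split as between PyUnicode_AS_UNICODE and
PyUnicode_READ_CHAR in the extract above.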
Victor