[Python-Dev] [Python-checkins] cpython: Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8"
Victor Stinner
victor.stinner at gmail.com
Thu Nov 7 22:38:32 CET 2013
2013/11/7 Benjamin Peterson <benjamin at python.org>:
> 2013/11/7 victor.stinner <python-checkins at python.org>:
>> http://hg.python.org/cpython/rev/99afa4c74436
>> changeset: 86995:99afa4c74436
>> user: Victor Stinner <victor.stinner at gmail.com>
>> date: Thu Nov 07 13:33:36 2013 +0100
>> summary:
>> Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8"
>> if the input string is NULL
>>
>> files:
>> Objects/unicodeobject.c | 2 ++
>> 1 files changed, 2 insertions(+), 0 deletions(-)
>>
>>
>> diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
>> --- a/Objects/unicodeobject.c
>> +++ b/Objects/unicodeobject.c
>> @@ -2983,6 +2983,8 @@
>> char *l_end;
>>
>> if (encoding == NULL) {
>> + if (lower_len < 6)
>
> How about doing something like strlen("utf-8") rather than hardcoding that?
Full code:
    if (encoding == NULL) {
        if (lower_len < 6)
            return 0;
        strcpy(lower, "utf-8");
        return 1;
    }
In my opinion, it is easy to guess that 6 is len("utf-8") + 1 byte for NUL.
Calling strlen() at runtime may slow down a function in the fast path
of PyUnicode_Decode() and PyUnicode_AsEncodedString() which are
important functions. I know that some compilers can evaluate strlen()
at compile time, but I don't see the need to replace 6 with
strlen("utf-8")+1.
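
For reference, one way to avoid both the hardcoded 6 and a runtime strlen() call is sizeof on the string literal, which is a compile-time constant that already includes the NUL byte. The following is only a sketch under assumptions, not what the changeset does: DEFAULT_ENCODING and copy_default_encoding() are hypothetical names made up for illustration and do not exist in unicodeobject.c.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <string.h>   /* memcpy, strcmp */

/* Hypothetical macro name, not a constant used in CPython. */
#define DEFAULT_ENCODING "utf-8"

/* sizeof on a string literal counts the trailing NUL: 5 chars + 1 = 6.
   The check is done by the compiler, so it costs nothing at runtime. */
static_assert(sizeof(DEFAULT_ENCODING) == 6,
              "\"utf-8\" plus its NUL terminator is 6 bytes");

/* Hypothetical helper mirroring the NULL branch quoted above.
   Returns 1 on success, 0 if the buffer is too small. */
int
copy_default_encoding(char *lower, size_t lower_len)
{
    if (lower_len < sizeof(DEFAULT_ENCODING))
        return 0;
    /* Copy the 5 characters and the NUL terminator in one call. */
    memcpy(lower, DEFAULT_ENCODING, sizeof(DEFAULT_ENCODING));
    return 1;
}
```

With a 6-byte buffer the helper succeeds and writes "utf-8"; with a 5-byte buffer it returns 0 and leaves the buffer untouched. Whether this reads better than a plain 6 is a matter of taste.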
Victor