[issue9200] Make str methods work with non-BMP chars on narrow builds

Marc-Andre Lemburg report at bugs.python.org
Thu Aug 18 18:38:12 CEST 2011


Marc-Andre Lemburg <mal at egenix.com> added the comment:

STINNER Victor wrote:
> 
> STINNER Victor <victor.stinner at haypocalc.com> added the comment:
> 
> I don't think that macros specific to unicodeobject.c should get the _PY_UNICODE_ prefix. "_Py_" prefix is reserved to exported symbols, but symbols reserved for the Python interprefer itself. For _Py_UNICODE_NEXT, you can call it NEXT_CHARACTER().

Can we please stick to the things we discussed on issue10542.

The macros are intended to be public starting with 3.3, not private
and they are meant to be used outside the interpreter as well.
They will only be private for patch level release patches. For those
you don't need the Py-prefix, but it also doesn't hurt having it
there.

> _Py_UNICODE_ISHIGHSURROGATE,_Py_UNICODE_ISLOWSURROGATE and _Py_UNICODE_JOIN_SURROGATES are only used once, I would prefer to see them inlined in _Py_UNICODE_NEXT.
> 
> The first cast to Py_UCS4 in _Py_UNICODE_JOIN_SURROGATES is useless.
> 
> It looks like the macro can be simplified to something like:
> 
> #define _Py_UNICODE_NEXT(ptr, end)                                      \
>     (_Py_UNICODE_ISHIGHSURROGATE(*(ptr)) && (ptr) < (end)) && _Py_UNICODE_ISLOWSURROGATE((ptr)[1] ?           \
>      ((ptr) += 2,_Py_UNICODE_JOIN_SURROGATES((ptr)[-2], (ptr)[-1])) :  \
>      (Py_UCS4)*(ptr)++)
> 
> (you don't need two "a?b:c")
> 
> I would prefer to see _Py_UNICODE_NEXT as a function instead of a macro because it has border effect (ptr++ or ptr += 2). You cannot write Py_UNICODE_ISALNUM(_Py_UNICODE_NEXT(p, e)) for example, because Py_UNICODE_ISALNUM is defined as:

> #define Py_UNICODE_ISALNUM(ch) (Py_UNICODE_ISALPHA(ch) || Py_UNICODE_ISDECIMAL(ch) || Py_UNICODE_ISDIGIT(ch) || Py_UNICODE_ISNUMERIC(ch))
> 
> And so _Py_UNICODE_NEXT is expanded 4 times. It is also horrible to debug a program having such long macro. If you really want to keep it as a macro, please add a least a big warning.

Having it as function would kill the performance advantage, so that's not
really an option. The use case you mention is also not really realistic.

Adding an extra warning to the macro version is still a good idea, though.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9200>
_______________________________________


More information about the Python-bugs-list mailing list