[Python-ideas] Add "has_surrogates" flags to string object
Masklinn
masklinn at masklinn.net
Tue Oct 8 13:38:19 CEST 2013
On 2013-10-08, at 13:17 , Serhiy Storchaka wrote:
> Here is an idea about adding a mark to the PyUnicode object which allows a fast answer to the question of whether a string contains surrogate code points. This mark has one of three possible states:
>
> * String doesn't contain surrogates.
> * String contains surrogates.
> * It is still unknown.
>
> We can combine this with the "is_ascii" flag in a 2-bit value:
>
> * String is ASCII-only (and doesn't contain surrogates).
> * String is not ASCII-only and doesn't contain surrogates.
> * String is not ASCII-only and contains surrogates.
> * String is not ASCII-only and it is still unknown whether it contains surrogates.
Isn't that redundant with the kind under the shortest-form representation?
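
For reference, the four combined states from the quoted proposal could be sketched in Python roughly as follows. This is only an illustration under assumptions: the names StrState, has_surrogates, initial_state and resolve_state are hypothetical, and a real implementation would keep the two bits inside the PyUnicode object in C rather than in a separate value.

```python
from enum import IntEnum


class StrState(IntEnum):
    """Hypothetical 2-bit encoding of the four states in the proposal."""
    ASCII = 0           # ASCII-only, hence no surrogates
    NO_SURROGATES = 1   # not ASCII-only, known to contain no surrogates
    HAS_SURROGATES = 2  # not ASCII-only, known to contain surrogates
    UNKNOWN = 3         # not ASCII-only, not scanned yet


def has_surrogates(s: str) -> bool:
    # Surrogate code points occupy the range U+D800..U+DFFF.
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


def initial_state(s: str) -> StrState:
    # Assumption: ASCII-ness is known when the string is created,
    # while surrogate presence is left undetermined until needed.
    return StrState.ASCII if s.isascii() else StrState.UNKNOWN


def resolve_state(state: StrState, s: str) -> StrState:
    # Scan the string only when the answer is still unknown.
    if state is not StrState.UNKNOWN:
        return state
    if has_surrogates(s):
        return StrState.HAS_SURROGATES
    return StrState.NO_SURROGATES
```

The point of the UNKNOWN state is that the scan is paid for at most once, and only for strings that are actually asked the question.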