[Python-ideas] Adding str.isascii() ?

M.-A. Lemburg mal at egenix.com
Fri Jan 26 05:12:42 EST 2018


On 26.01.2018 10:44, INADA Naoki wrote:
>> +1
>>
>> Just a note: checking the header in CPython will only give a hint,
>> since strings created using higher order kinds can still be 100%
>> ASCII.
>>
> 
> Oh, really?
> I think checking the header is enough for all ready Unicode objects.

No, because you can pass maxchar to PyUnicode_New(), and the
implementation takes it as a hint for the maximum code point
used in the string. No check is done that maxchar is indeed
the least upper bound of the code point ordinals.

The reason for doing this is simple: you don't want to have to
scan the string every time you create a Unicode object.
CPython itself often does do such a scan before calling
PyUnicode_New(), so in many cases, the header will be set
to ASCII, but not always.
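
To illustrate the point, here is a minimal sketch in plain C. It is
a toy model, not CPython's real object layout or API: toy_str,
toy_header_says_ascii() and toy_isascii() are hypothetical names made
up for this example. It assumes, as described above, that the maxchar
stored in the header is only a caller-supplied upper bound, so an
exact answer needs a scan whenever the header does not already say
ASCII.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a string object with a header hint.
   This is NOT the layout CPython uses. */
typedef struct {
    uint32_t maxchar_hint;  /* upper bound given by the creator, never verified */
    size_t length;          /* number of code points in data */
    const uint32_t *data;   /* the code points themselves */
} toy_str;

/* Header-only check: cheap, but only as accurate as the hint. */
int toy_header_says_ascii(const toy_str *s)
{
    return s->maxchar_hint < 128;
}

/* Exact check: trust the header when it says ASCII, otherwise scan,
   because a larger hint does not prove a non-ASCII code point exists. */
int toy_isascii(const toy_str *s)
{
    if (toy_header_says_ascii(s)) {
        return 1;
    }
    for (size_t i = 0; i < s->length; i++) {
        if (s->data[i] > 127) {
            return 0;
        }
    }
    return 1;
}
```

In this model, a string holding only "abc" but created with a hint
of 0xFFFF fails the header-only check and still passes the scanning
one. The scan is only paid for strings whose header is not ASCII.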

> For example, this is _PyUnicode_EqualToASCIIString implementation:
> 
>     if (PyUnicode_READY(unicode) == -1) {
>         /* Memory error or bad data */
>         PyErr_Clear();
>         return non_ready_unicode_equal_to_ascii_string(unicode, str);
>     }
>     if (!PyUnicode_IS_ASCII(unicode))
>         return 0;
> 
> And I think str.isascii() can be implemented as:
> 
>     if (PyUnicode_READY(unicode) == -1) {
>         return NULL;
>     }
>     if (PyUnicode_IS_ASCII(unicode)) {
>         Py_RETURN_TRUE;
>     }
>     else {
>         Py_RETURN_FALSE;
>     }
> 

-- 
Marc-Andre Lemburg
eGenix.com



