[Python-ideas] Adding str.isascii() ?

M.-A. Lemburg mal at egenix.com
Fri Jan 26 10:36:04 EST 2018

On 26.01.2018 16:16, Random832 wrote:
> On Fri, Jan 26, 2018, at 09:18, M.-A. Lemburg wrote:
>> Is there a way to call an API which fixes the setting
>> (a public version of unicode_adjust_maxchar()) ?
>> Without this, how would an extension be able to provide a
>> correct value upfront without knowing the content ?
> It obviously has to know the content before it can finally return the string (or pass it to any other function, etc), because strings are immutable. Why not then do all the intermediate work in an array of int32's (or perhaps a UCS-4 PyUnicode to be returned only if needed), then afterward scan and build the string?

The create, write data, resize approach is a standard way to build
(longer) Python string objects in the Python C API, since it
avoids temporary copies.

E.g. you don't want to first build a buffer to hold 100MB XML,
then scan it for the max code point being used, create a Python
string from it (which copies the data into a second 100MB
buffer) and then deallocate the first buffer again.
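
To make the cost of that extra pass concrete, here is a rough sketch of
the scan step in plain C. The function names are made up for
illustration and are not part of any Python API; the width thresholds
follow the PEP 393 string layout (1, 2 or 4 bytes per code point).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: walk a temporary UCS-4 buffer and return the
   largest code point found (0 for an empty buffer). This is the extra
   pass over the data that the two-step approach requires. */
static uint32_t max_code_point(const uint32_t *buf, size_t n)
{
    uint32_t maxchar = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] > maxchar)
            maxchar = buf[i];
    }
    return maxchar;
}

/* Hypothetical helper: bytes per code point that the PEP 393 compact
   layout uses for a string whose largest code point is maxchar. */
static int bytes_per_char(uint32_t maxchar)
{
    if (maxchar < 0x100)
        return 1;   /* ASCII / Latin-1 */
    if (maxchar < 0x10000)
        return 2;   /* UCS-2 */
    return 4;       /* UCS-4 */
}
```

Only after this scan can the final string be allocated with the right
width and the data copied over, which is where the second buffer comes
from.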

Instead you create an uninitialized Python Unicode object
and use PyUnicode_WRITE() to write the data directly into
the object, avoiding the 100MB temp buffer.

PS: Strings are immutable in Python, but they are not in C.
You can manipulate string objects provided you own the only
reference.

Marc-Andre Lemburg

