On Mon, May 16, 2022 at 2:11 PM Antoine Pitrou
PyUnicodeBuilder_Init(&builder);
// Overallocation is more efficient if the final length is unknown
PyUnicodeBuilder_EnableOverallocation(&builder);
PyUnicodeBuilder_WriteStr(&builder, key);
PyUnicodeBuilder_WriteChar(&builder, '=');

// Disable overallocation before the last write
PyUnicodeBuilder_DisableOverallocation(&builder);
Having to manually enable or disable overallocation doesn't sound right. Overallocation should be done *before* writing, not after. If there are N bytes remaining and you write N bytes, then no reallocation should occur.
Calling these functions has no immediate effect on the current buffer. EnableOverallocation() doesn't enlarge the buffer. Even if the buffer is currently "over allocated", DisableOverallocation() leaves the buffer unchanged. The setting only changes the allocation strategy used by subsequent writes. Only the Finish() function shrinks the buffer.

Currently, this setting is the _PyUnicodeWriter.overallocate member. If possible, I would prefer not to expose the structure members in the public C API.

Overallocation should be enabled before writing and disabled before the last write. It's disabled by default, because for some use cases it's more efficient to leave it off: always enabling overallocation makes the code less efficient. For example, a single write of 10 MB allocates 15 MB on Windows and then shrinks the final string to 10 MB.

Note: The current _PyUnicodeWriter implementation also has an optimization for the case where there is exactly one WriteStr(obj) operation: Finish() returns the input string object unchanged, even if overallocation is enabled.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
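
To make the semantics described above concrete, here is a minimal toy model in plain C. It is not CPython code: ToyWriter, toy_write(), toy_finish() and toy_demo() are hypothetical names, the buffer holds raw bytes rather than a Python string, and the 50% growth factor is an assumption taken from the 10 MB -> 15 MB example. It only illustrates that the flag is consulted when a write has to grow the buffer, and that only the finish step shrinks it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: every name below is hypothetical, not a CPython API. */
typedef struct {
    char *buf;
    size_t len;        /* bytes written so far */
    size_t cap;        /* bytes currently allocated */
    int overallocate;  /* consulted only when a write must grow the buffer */
} ToyWriter;

static void toy_init(ToyWriter *w)
{
    w->buf = NULL;
    w->len = 0;
    w->cap = 0;
    w->overallocate = 0;  /* disabled by default */
}

/* Append n bytes. Returns 0 on success, -1 on allocation failure. */
static int toy_write(ToyWriter *w, const char *data, size_t n)
{
    if (n == 0) {
        return 0;
    }
    size_t needed = w->len + n;
    if (needed > w->cap) {
        size_t newcap = needed;
        if (w->overallocate) {
            newcap += newcap / 2;  /* assumed 50% extra room */
        }
        char *p = realloc(w->buf, newcap);
        if (p == NULL) {
            return -1;
        }
        w->buf = p;
        w->cap = newcap;
    }
    memcpy(w->buf + w->len, data, n);
    w->len = needed;
    return 0;
}

/* Shrink the buffer to its exact length and hand it to the caller. */
static char *toy_finish(ToyWriter *w, size_t *out_len)
{
    char *p = w->buf;
    if (p != NULL && w->cap > w->len) {
        char *shrunk = realloc(p, w->len);
        if (shrunk != NULL) {
            p = shrunk;
        }
    }
    *out_len = w->len;
    toy_init(w);
    return p;  /* caller frees */
}

/* Usage mirroring the key=value snippet quoted above. */
static void toy_demo(void)
{
    ToyWriter w;
    size_t n;

    toy_init(&w);
    w.overallocate = 1;             /* no immediate effect on the buffer */
    assert(w.cap == 0);
    assert(toy_write(&w, "key", 3) == 0);
    assert(w.cap == 4);             /* 3 bytes needed + 3/2 extra */
    assert(toy_write(&w, "=", 1) == 0);
    assert(w.cap == 4);             /* fits in the spare room, no realloc */

    w.overallocate = 0;             /* buffer is left unchanged */
    assert(w.cap == 4);
    assert(toy_write(&w, "value", 5) == 0);
    assert(w.cap == 9);             /* last write grows to the exact size */

    char *s = toy_finish(&w, &n);
    assert(n == 9 && memcmp(s, "key=value", 9) == 0);
    free(s);
}
```

In this model, a write that fits in the remaining room never reallocates, whatever the flag says. The flag only decides how much extra room is requested the next time the buffer has to grow.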