[docs] [issue13604] update PEP 393 (match implementation)
Martin v. Löwis
report at bugs.python.org
Thu Dec 15 23:45:27 CET 2011
Martin v. Löwis <martin at v.loewis.de> added the comment:
> PyUnicode_AsUnicode(), PyUnicode_AS_UNICODE(), PyUnicode_GET_SIZE(),
> ... do reallocate a Py_UNICODE* string for a ready string, but I
> don't think that it is a usual use case.
Define "usual". There were certainly plenty of occurrences of that
in the Python code base, and I believe that extension modules also
use it, provided they care about the content of string objects at all.
> PyUnicode_AS_UNICODE() &
> friends are usually only used to build strings.
No. They are also used to inspect them.
> So even if a third party module uses the legacy Unicode API, PEP
> 393 will still optimize the memory usage thanks to implicit calls to
> PyUnicode_READY() (done everywhere in Python source code).
... unless they inspect a given Unicode string, in which case that
string will use twice the memory (or 1.5x).
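For reference, the compact baseline that the extra Py_UNICODE* copy is
measured against can be observed from Python itself. The following is only
a rough sketch, assuming CPython 3.3 or later; the helper name
bytes_per_char is made up for illustration, and the measurement does not
show the additional copy created by the legacy accessors, only the
per-character cost of the PEP 393 representation.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character (hypothetical helper).

    Compares the reported sizes of two strings that differ in length
    by exactly one character, so the fixed object header cancels out.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)


# One sample character from each range that PEP 393 stores differently.
ascii_cost = bytes_per_char("a")            # ASCII / Latin-1 range
bmp_cost = bytes_per_char("\u20ac")         # Basic Multilingual Plane
astral_cost = bytes_per_char("\U0001F600")  # beyond the BMP

print(ascii_cost, bmp_cost, astral_cost)
```

On a CPython build with PEP 393 the three values should come out as 1, 2
and 4 bytes per character; other implementations may report something
different.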
> "Resizing a Unicode string remains possible until it is finalized,
> generally by calling PyUnicode_READY."
> I changed PyUnicode_Resize(): it is now *always* possible to resize a
> string. The change was required because some decoders overallocate
> the string, and then resize after decoding the input.
> The sentence can be simply removed.
Well, I meant the resizing of strings that doesn't move the object
in memory (i.e. unicode_resize). You (apparently) changed its signature
to take PyUnicode_Object** (instead of PyUnicode_Object*). It's probably
irrelevant since that's a unicodeobject.c-internal function, anyway.
> This function was added in Python 3.3 and is immediately deprecated. Why
> add a function only to deprecate it? Were PyUnicode_AsUnicode() and
> PyUnicode_GET_SIZE() not enough?
If it was not in 3.2, we should certainly remove it right away.
Python tracker <report at bugs.python.org>