[docs] [issue13604] update PEP 393 (match implementation)

Thu Dec 15 23:45:27 CET 2011

Martin v. Löwis <martin at v.loewis.de> added the comment:

> PyUnicode_AsUnicode(), PyUnicode_AS_UNICODE(), PyUnicode_GET_SIZE(),
> ... do reallocate a Py_UNICODE* string for a ready string, but I
> don't think that it is a usual use case.

Define "usual". There were certainly plenty of occurrences of that
in the Python code base, and I believe that extension modules also
use it, provided they care about the content of string objects at all.

> PyUnicode_AS_UNICODE() &
> friends are usually only used to build strings.

No. They are also used to inspect them.

> So even if a third party module uses the legagy Unicode API, the PEP
> 393 will still optimize the memory usage thanks to implicit calls to
> PyUnicode_READY() (done everywhere in Python source code).

... unless they inspect a given Unicode string, in which case it
will use twice the memory (or 1.5x).

> "Resizing a Unicode string remains possible until it is finalized,
> generally by calling PyUnicode_READY."
> 
> I changed PyUnicode_Resize(): it is now *always* possible to resize a
> string. The change was required because some decoders overallocate
> the string, and then resize after decoding the input.
> 
> The sentence can be simply removed.

Well, I meant the resizing of strings that doesn't move the object
in memory (i.e. unicode_resize). You (apparently) changed its signature
to take PyUnicode_Object** (instead of PyUnicode_Object*). It's probably
irrelevant since that's a unicodeobject.c-internal function, anyway.

> "PyUnicode_AsUnicodeAndSize"
> 
> This function was added to Python 3.3 and is directly deprecated. Why
> adding a function to deprecate it? PyUnicode_AsUnicode() and
> PyUnicode_GET_SIZE() were not enough?

If it was not in 3.2, we should certainly remove it right away.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13604>
_______________________________________