[Python-checkins] cpython: Elaborate on representations and canonical/legacy unicode objects
antoine.pitrou
python-checkins at python.org
Sat Oct 22 22:12:06 CEST 2011
http://hg.python.org/cpython/rev/4c628ee78cd6
changeset: 73056:4c628ee78cd6
user: Antoine Pitrou <solipsis at pitrou.net>
date: Sat Oct 22 22:08:05 2011 +0200
summary:
Elaborate on representations and canonical/legacy unicode objects
files:
Doc/c-api/unicode.rst | 16 +++++++++++++++-
1 files changed, 15 insertions(+), 1 deletions(-)
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -18,7 +18,21 @@
points must be below 1114112 (which is the full Unicode range).
:c:type:`Py_UNICODE*` and UTF-8 representations are created on demand and cached
-in the Unicode object.
+in the Unicode object. The :c:type:`Py_UNICODE*` representation is deprecated
+and inefficient; it should be avoided in performance- or memory-sensitive
+situations.
+
+Due to the transition between the old APIs and the new APIs, unicode objects
+can internally be in two states depending on how they were created:
+
+* "canonical" unicode objects are all objects created by a non-deprecated
+ unicode API. They use the most efficient representation allowed by the
+ implementation.
+
+* "legacy" unicode objects have been created through one of the deprecated
+ APIs (typically :c:func:`PyUnicode_FromUnicode`) and only bear the
+ :c:type:`Py_UNICODE*` representation; you will have to call
+ :c:func:`PyUnicode_READY` on them before calling any other API.
Unicode Type
--
Repository URL: http://hg.python.org/cpython