[Python-checkins] r88892 - peps/trunk/pep-0393.txt

martin.v.loewis python-checkins at python.org
Sat Aug 27 07:25:20 CEST 2011


Author: martin.v.loewis
Date: Sat Aug 27 07:25:20 2011
New Revision: 88892

Log:
Revert checkins; they should have gone to hg.


Modified:
   peps/trunk/pep-0393.txt

Modified: peps/trunk/pep-0393.txt
==============================================================================
--- peps/trunk/pep-0393.txt	(original)
+++ peps/trunk/pep-0393.txt	Sat Aug 27 07:25:20 2011
@@ -145,17 +145,10 @@
 
 The canonical representation can be accessed using two macros
 PyUnicode_Kind and PyUnicode_Data. PyUnicode_Kind gives one of the
-value PyUnicode_WCHAR_KIND (0), PyUnicode_1BYTE_KIND (1),
-PyUnicode_2BYTE_KIND (2), or PyUnicode_4BYTE _KIND(3). PyUnicode_Data
-gives the void pointer to the data, masking out the pointer kind. All
-these functions call PyUnicode_Ready in case the canonical
-representation hasn't been computed yet. Access to individual
-characters should use PyUnicode_{READ|WRITE}[_CHAR]:
-
-  - PyUnciode_READ(kind, data, index)
-  - PyUnicode_WRITE(kind, data, index, value)
-  - PyUnicode_READ_CHAR(unicode, index)
-  - PyUnicode_WRITE_CHAR(unicode, index, value)
+value PyUnicode_1BYTE (1), PyUnicode_2BYTE (2), or PyUnicode_4BYTE
+(3). PyUnicode_Data gives the void pointer to the data, masking out
+the pointer kind. All these functions call PyUnicode_Ready
+in case the canonical representation hasn't been computed yet.
 
 A new function PyUnicode_AsUTF8 is provided to access the UTF-8
 representation. It is thus identical to the existing
@@ -170,6 +163,13 @@
 PyUnicode_AsUnicode is deprecated; it computes the wstr representation
 on first use.
 
+String Operations
+-----------------
+
+Various convenience functions will be provided to deal with the
+canonical representation, in particular with respect to concatenation
+and slicing.
+
 Stable ABI
 ----------
 
@@ -181,30 +181,6 @@
 about the internals of CPython's data types, include PyUnicodeObject
 instances.  It will need to be slightly updated to track the change.
 
-Open Issues
-===========
-
-- When an application uses the legacy API, it may hold onto
-  the Py_UNICODE* representation, and yet start calling Unicode
-  APIs, which would call PyUnicode_Ready, invalidating the 
-  Py_UNICODE* representation; this would be an incompatible change.
-  The following solutions can be considered:
-
-  * accept it as an incompatible change. Applications using the
-    legacy API will have to fill out the Py_UNICODE buffer completely
-    before calling any API on the string under construction.
-  * require explicit PyUnicode_Ready calls in such applications;
-    fail with a fatal error if a non-ready string is ever read.
-    This would also be an incompatible change, but one that is
-    more easily detected during testing.
-  * as a compromise between these approaches, implicit PyUnicode_Ready
-    calls (i.e. those not deliberately following the construction of
-    a PyUnicode object) could produce a warning if they convert an
-    object.
-
-- Which of the APIs created during the development of the PEP should
-  be public?
-
 Discussion
 ==========
 
@@ -213,16 +189,11 @@
 It makes the implementation more complex. That's true, but considered
 worth given the gains.
 
-The Py_UNICODE representation is not instantaneously available,
+The Py_Unicode representation is not instantaneously available,
 slowing down applications that request it. While this is also true,
 applications that care about this problem can be rewritten to use the
 str representation.
 
-The question was raised whether the wchar_t representation is
-discouraged, or scheduled for removal. This is not the intent of this
-PEP; applications that use them will see a performance penalty,
-though. Future versions of Python may consider to remove them.
-
 Copyright
 =========
 

