[Python-Dev] PEP 393 close to pronouncement

Tue Sep 27 00:19:02 CEST 2011

Hi,

On Monday 26 September 2011 at 23:00:06, Guido van Rossum wrote:
> So, if you have the time, please review PEP 393 and/or play with the
> code (the repo is linked from the PEP's References section now).

I played with the code. The full test suite passes on Linux and FreeBSD. On 
Windows, there is just one failure, in test_configparser, which I haven't 
investigated yet. I like the new API: a classic loop over the string length, 
and a macro to read the nth character. Backward compatibility is fully 
transparent and is already well tested, because some modules still use the 
legacy API.

It's quite easy to move from the legacy API to the new API. It's tedious 
work, but it's almost done in the core (unicodeobject.c, and also some modules 
like _io).

Since the introduction of PyASCIIObject, PEP 393 has a really good memory 
footprint, especially for ASCII-only strings, and Python code manipulates a 
lot of ASCII strings.
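
A rough way to see the footprint from Python itself, as a sketch that assumes 
a CPython build with the PEP 393 representation (3.3 or later). The helper 
name bytes_per_char and the sample characters are only illustrative:

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate the per-character storage cost of a string made of `ch`."""
    # Subtracting two sizes cancels the fixed object header, so only the
    # cost of the extra n characters remains.
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n


# Illustrative samples: one character from each storage class.
samples = {
    "ASCII": "a",
    "Latin-1": "\u00e9",
    "BMP": "\u20ac",
    "astral": "\U0001F600",
}

widths = {name: bytes_per_char(ch) for name, ch in samples.items()}

for name, width in widths.items():
    print(name, width)
```

The per-character cost should grow with the widest character in the string, 
so ASCII-only strings are the cheapest case.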

PEP
===

It's not clear what is deprecated. It would help to have a full list of the 
deprecated functions/macros.

Sometimes Martin wrote PyUnicode_Ready, sometimes PyUnicode_READY. It's 
confusing.

Typo: PyUnicode_FAST_READY => PyUnicode_READY.

"PyUnicode_WRITE_CHAR" is not listed in the New API section.

Typo in "PyUnicode_CONVERT_BYTES(from_type, tp_type, begin, end, to)": tp_type 
=> to_type.

"PyUnicode_Chr(ch)": Why introduce a new function? Wasn't 
PyUnicode_FromOrdinal enough?

"GDB Debugging Hooks": this is not done yet.

"None of the functions in this PEP become part of the stable ABI (PEP 384)." 
Why? Some functions don't depend on the internal representation, like 
PyUnicode_Substring or PyUnicode_FindChar.
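
To illustrate what I mean by "don't depend on the internal representation", 
here is a small sketch that calls both functions through ctypes.pythonapi. It 
assumes a CPython (3.3 or later) that exports them with the signatures given 
in the PEP. The probe helper and the test strings are hypothetical, only for 
the demonstration:

```python
import ctypes

api = ctypes.pythonapi

# Py_ssize_t PyUnicode_FindChar(PyObject *str, Py_UCS4 ch,
#                               Py_ssize_t start, Py_ssize_t end,
#                               int direction)
api.PyUnicode_FindChar.argtypes = [
    ctypes.py_object, ctypes.c_uint32,
    ctypes.c_ssize_t, ctypes.c_ssize_t, ctypes.c_int,
]
api.PyUnicode_FindChar.restype = ctypes.c_ssize_t

# PyObject *PyUnicode_Substring(PyObject *str,
#                               Py_ssize_t start, Py_ssize_t end)
api.PyUnicode_Substring.argtypes = [
    ctypes.py_object, ctypes.c_ssize_t, ctypes.c_ssize_t,
]
api.PyUnicode_Substring.restype = ctypes.py_object


def probe(marker):
    """Search for `marker` and slice around it (hypothetical helper)."""
    text = "abc" + marker + "def"
    # direction=1 means a forward search.
    index = api.PyUnicode_FindChar(text, ord(marker), 0, len(text), 1)
    piece = api.PyUnicode_Substring(text, 1, 5)
    return index, piece


# The marker forces a 1-byte, 2-byte or 4-byte representation in turn.
results = [probe(m) for m in ("x", "\u20ac", "\U0001F600")]
for index, piece in results:
    print(index, ascii(piece))
```

The caller only deals in code point indexes, whatever the storage width of 
the string is.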

Typo: "In order to port modules to the new API, try to eliminate the use of 
these API elements: ... PyUnicode_GET_LENGTH ..." PyUnicode_GET_LENGTH is part 
of the new API. I suppose that you mean PyUnicode_GET_SIZE.

Victor