[issue1943] improved allocation of PyUnicode objects

Marc-Andre Lemburg report at bugs.python.org
Tue Jun 2 18:38:55 CEST 2009


Marc-Andre Lemburg <mal at egenix.com> added the comment:

Antoine Pitrou wrote:
> Antoine Pitrou <pitrou at free.fr> added the comment:
> 
>> The patch breaks C API + binary compatibility for an essential Python
>> type - that's not something you can easily undo.
> 
> I don't see how it breaks C API compatibility. No officially documented
> function has changed, and the accessor macros still work. Am I missing
> something?

Yes: The layout and object type of the PyUnicodeObject object.

You cannot simply recompile your code and have it work. Instead,
you have to provide different sub-typing implementations depending
on whether PyUnicodeObject is a PyVarObject or a PyObject, since
these are inherently different in their structure.
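
To illustrate the structural difference, here is a minimal C sketch. It
does not use the real CPython headers: DemoObject, DemoVarObject,
FixedUnicode, VarUnicode and FixedSub are hypothetical, simplified
stand-ins, and the field names are assumptions chosen for illustration.
It only shows why a subtype can place extra fields at a constant offset
in the fixed-size case, but not in the variable-size case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PyObject and PyVarObject (not the real headers). */
typedef struct { long refcnt; void *type; } DemoObject;
typedef struct { DemoObject base; long size; } DemoVarObject;

/* Fixed-size layout: the character buffer is a separate allocation. */
typedef struct {
    DemoObject base;
    long length;
    unsigned short *str;
} FixedUnicode;

/* Variable-size layout: the characters are stored inline after the header. */
typedef struct {
    DemoVarObject base;
    long hash;
    unsigned short str[1];   /* conceptually `size` items long */
} VarUnicode;

/* Subtype of the fixed-size layout: the extra field has a constant offset. */
typedef struct {
    FixedUnicode base;
    int tag;
} FixedSub;

/* Offset of the subtype's extra field in the fixed-size case. */
size_t fixed_sub_tag_offset(void)
{
    return offsetof(FixedSub, tag);
}

/* First byte past the inline characters of a variable-size object that
 * holds n_chars characters. Extra subtype data could only start here,
 * so its position differs from instance to instance. */
size_t var_sub_extra_offset(size_t n_chars)
{
    assert(n_chars >= 1);
    return offsetof(VarUnicode, str) + n_chars * sizeof(unsigned short);
}
```

In this sketch fixed_sub_tag_offset() is the same for every instance,
while var_sub_extra_offset(4) and var_sub_extra_offset(8) differ, which
is why the two cases need different sub-typing code.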

Please note that all type objects documented in the header files
that are not explicitly declared for private use only are in fact
public APIs. You need access to those type objects in order to
be able to subclass them.

> As for binary compatibility, yes, it does break it, which isn't an
> exceptional situation in the development process. We have changed other
> "essential types" too -- for example, recently, the PyLong object got
> 30-bit digits on some systems. Why you think it is hard to undo, I don't
> understand.

That's a different kind of change. Even though it's very hard to
sub-type longs due to their PyVarObject nature and the fact that
longs even dig into the PyObject_VAR_HEAD, you can still recompile
your code and it will continue to work. The change was to a typedef -
the name of the typedef itself has not changed.

This is similar to compiling Python as UCS2 or UCS4 version - Py_UNICODE
will stay the same typedef, but on a UCS2 system it maps to 16 bits,
whereas on a UCS4 system it is set to 32 bits.

Note that the Unicode implementation takes great care not to hide
this binary incompatibility: it remaps all APIs to include the
UCS2/UCS4 hint in the exported name. As an aside: The long implementation
does not.
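
The typedef-plus-remapping idea can be sketched as follows. This is an
illustration only, not the actual unicodeobject.h: demo_unicode,
DEMO_UNICODE_SIZE, DemoUnicode_Length and demo_exported_name are
hypothetical names, and the build switch is an assumption. Source code
always uses the generic name, while the symbol the linker sees carries
the width, so a module built for the other width fails to link instead
of misreading memory at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build switch; assumed to be fixed when the library is built. */
#ifndef DEMO_UNICODE_SIZE
#define DEMO_UNICODE_SIZE 2
#endif

#if DEMO_UNICODE_SIZE == 4
typedef uint32_t demo_unicode;                        /* UCS4: 32-bit units */
#define DemoUnicode_Length DemoUnicodeUCS4_Length
#define DEMO_EXPORTED_NAME "DemoUnicodeUCS4_Length"
#else
typedef uint16_t demo_unicode;                        /* UCS2: 16-bit units */
#define DemoUnicode_Length DemoUnicodeUCS2_Length
#define DEMO_EXPORTED_NAME "DemoUnicodeUCS2_Length"
#endif

/* Callers write DemoUnicode_Length; the preprocessor turns this
 * definition into the width-tagged symbol selected above. */
size_t DemoUnicode_Length(const demo_unicode *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != 0)
        n++;
    return n;
}

/* Name of the symbol that is actually exported for this build. */
const char *demo_exported_name(void)
{
    return DEMO_EXPORTED_NAME;
}
```

The typedef name stays the same in both builds; only its width and the
exported symbol name change.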

> As for the future ABI PEP, which has not yet been accepted, it does not
> mention PyUnicodeObject as part of the structures which are guaranteed
> to remain binary-compatible :
> http://www.python.org/dev/peps/pep-0384/#structures

That's fine, but please note that the ABI PEP only addresses
applications that wish to benefit from the binary compatibility
across Python versions.

It has no implications on applications that don't want to use
the ABI or cannot, since they are too low-level, such as extensions
wishing to sub-class built-in types.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1943>
_______________________________________

