[Python-Dev] [Python-checkins] r79397 - in python/trunk: Doc/c-api/capsule.rst Doc/c-api/cobject.rst Doc/c-api/concrete.rst Doc/data/refcounts.dat Doc/extending/extending.rst Include/Python.h Include/cStringIO.h Include/cobject.h Include/datetime.h Include/py_curses.h Include/pycapsule.h Include/pyexpat.h Include/ucnhash.h Lib/test/test_sys.py Makefile.pre.in Misc/NEWS Modules/_ctypes/callproc.c Modules/_ctypes/cfield.c Modules/_ctypes/ctypes.h Modules/_cursesmodule.c Modules/_elementtree.c Modules/_testcapimodule.c Modules/cStringIO.c Modules/cjkcodecs/cjkcodecs.h Modules/cjkcodecs/multibytecodec.c Modules/cjkcodecs/multibytecodec.h Modules/datetimemodule.c Modules/pyexpat.c Modules/socketmodule.c Modules/socketmodule.h Modules/unicodedata.c Objects/capsule.c Objects/object.c Objects/unicodeobject.c PC/VS7.1/pythoncore.vcproj PC/VS8.0/pythoncore.vcproj PC/os2emx/python27.def PC/os2vacpp/python.def Python/compile.c Python/getargs.c

M.-A. Lemburg mal at egenix.com
Fri Mar 26 10:20:08 CET 2010


Larry Hastings wrote:
> 
> M.-A. Lemburg wrote:
>> Backporting PyCapsule is fine, but the changes you made to all
>> those PyCObject uses do not look backwards compatible.
>>
>> The C APIs exposed by the modules (e.g. the datetime module)
>> are used in lots of 3rd party extension modules and changing
>> them from PyCObject to PyCapsule is a major change in the
>> module API.
> 
> You're right, my changes aren't backwards compatible.  I thought it was
> reasonable for four reasons:

Just as a reminder of the process we have in place for such changes:
Please discuss any major breakage on python-dev before checking in
the patch.

> 1. The CObject API isn't safe.  It's easy to crash Python 2.6 in just a
> few lines by mixing and matching CObjects.  Switching Python to capsules
> prevents a class of exploits.  I've included a script at the bottom of
> this message that demonstrates three such crashes.  The script runs in
> Python 2 and 3, but 3.1 doesn't crash because it's using capsules.

That's good, but then again: deliberate wrong use of APIs will
always cause crashes and at least I don't know of any report about
PyCObjects posing a problem in all their years of existence.
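
To make the "mixing and matching" point concrete, here is a minimal
standalone sketch. It is a toy model, not the CPython structs or
functions: ToyCObject, ToyCapsule, toy_cobject_get() and
toy_capsule_get() are hypothetical names, and the name check is a
simplification of what a capsule does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins (assumptions, not CPython's types): a CObject-style
   wrapper carries only a pointer, a capsule-style wrapper also carries
   a name identifying the API it wraps. */
typedef struct { void *ptr; } ToyCObject;
typedef struct { void *ptr; const char *name; } ToyCapsule;

/* CObject-style access: returns whatever pointer is stored, so a wrapper
   taken from the wrong module is accepted silently. */
static void *toy_cobject_get(const ToyCObject *o)
{
    assert(o != NULL);
    return o->ptr;
}

/* Capsule-style access: returns NULL unless the caller names the API it
   expects, so a mismatched wrapper is rejected before it is used. */
static void *toy_capsule_get(const ToyCapsule *c, const char *expected)
{
    assert(c != NULL);
    if (c->name == NULL || expected == NULL ||
        strcmp(c->name, expected) != 0)
        return NULL;
    return c->ptr;
}
```

In this model a caller that asks for "pyexpat.expat_CAPI" but is handed
a wrapper named "datetime.datetime_CAPI" gets NULL back from
toy_capsule_get(), whereas toy_cobject_get() would return the pointer
and leave the caller to misinterpret it.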

> 2. As I just mentioned, Python 3.1 already uses capsules everywhere
> instead of CObjects.  Since part of the purpose of Python 2.7 is to
> prepare developers for the upgrade to 3.1, getting them to switch to
> capsules now is just one more way they are prepared.

Sure, but forcing them is not a good idea, even less so if you can easily
expose both a PyCObject and PyCapsule interface to the same C API.

> 3. Because CObject is unsafe, I want to deprecate it in 2.7, and if we
> ever made a 2.8 I want to remove it completely.

Please remember that PyCObjects are not only used internally
in CPython, but also in other 3rd party modules to expose C APIs
and those will generally have to support more than just the latest
Python release.

If you deprecate those C APIs, the process will at least have to cover
one more release, i.e. run through the whole deprecation process:

1. pending deprecation (2.7)
2. deprecation (2.8)
3. removal (2.9)

I think it's better to add a -3 warning when using PyCObjects in
2.7.

> 4. When Python publishes an API using a CObject, it describes the thing
> the CObject points to in a header file.  In nearly all cases that header
> file also provides a macro or inline function that does the importing
> work for you.  I changed those to use capsules too.  So if the
> third-party code uses the macro or inline function, all you need do is
> recompile it against 2.7 and it works fine.  Sadly I know of one
> exception: pyexpat.expat_CAPI.  The header file just describes the
> struct pointed to by the CObject, but callers

I know about those macros... I introduced that idea with mxDateTime
and then added the same logic in CPython in a couple of places
a long while ago, e.g. socketmodule.h. IIRC, Jim Fulton added
PyCObjects for exchanging module C APIs a few years before that.

Later on PyCObject_Import() was added to simplify the C API
import a bit.

A recompile is certainly a possibility to have 3rd party
modules switch to the capsule interfaces of the stdlib
modules, but we should still be more careful about this.

Wouldn't it be possible to have PyCObject_AsVoidPtr() et al.
work with PyCapsules as well ?

After all, the caller is only interested in the pointer and
doesn't really care about what object was used to wrap it.
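
A rough sketch of that idea, again as a standalone toy model rather
than a patch to CPython: Wrapped, WrapKind and wrapped_as_void_ptr()
are hypothetical names, and the enum tag stands in for the type check
the real accessor would have to perform on the Python object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag standing in for "is this object a CObject or a
   capsule?"; real code would inspect the object's type instead. */
typedef enum { WRAP_COBJECT, WRAP_CAPSULE } WrapKind;

typedef struct {
    WrapKind kind;
    void *ptr;
    const char *name;   /* meaningful for capsules only */
} Wrapped;

/* CObject-style accessor that also unpacks capsules: the caller gets the
   pointer either way, and the capsule's name is deliberately ignored. */
static void *wrapped_as_void_ptr(const Wrapped *w)
{
    assert(w != NULL);
    switch (w->kind) {
    case WRAP_COBJECT:
    case WRAP_CAPSULE:
        return w->ptr;
    }
    return NULL;        /* unknown kind: nothing to unpack */
}
```

The trade-off is visible in the code: because the name is never
compared, old callers keep working unchanged, but they also get none of
the checking that capsules were introduced to provide.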

> I can suggest four ways to ameliorate the problem.
> 
> First, we could do as Antoine Pitrou suggests on the bug (issue 7992):
> wherever the CObject used to be published as a module attribute to
> expose an API, we could provide both a CObject and a capsule; internally
> Python would only use the capsules.  This would allow third-party
> libraries to run against 2.7 unchanged.  The major problem with this is
> that third-party libraries would still be vulnerable to the
> mix-and-match CObject crash. 

True, but we've been happy with this vulnerability for years, just
as we've been happy with the fact that it's easy to crash the
VM by passing hand-crafted byte-code to it, or using ctypes to
call an OS function with the wrong parameters, etc.

Like I said: there are many ways to deliberately crash Python.

We don't have a concept of interface signatures in Python,
it's mostly based on trust.

> A secondary, minor concern: obviously we'd
> store the CObject attribute with the existing name, and the capsule
> attribute would have to get some new name.  But in Python 3.1, these
> attributes already expose a capsule.  Therefore, people who convert to
> using the capsules now would have to convert again when moving to 3.1.

This should be manageable with some aliasing.

> Second, we could make CObject internally support unpacking capsules.  If
> you gave a capsule to PyCObject_AsVoidPtr() it would unpack it and
> return the pointer within.  (We could probably also map the capsule
> "context" to the CObject "desc", if any of the Python use cases needed
> it.)  I wouldn't change anything else about CObjects; creating and using
> them would continue to work as normal.  This would also allow
> third-party libraries to run against Python 2.7 unchanged.  The only
> problem is that it's unsafe, as indeed allowing any use of
> PyCObject_AsVoidPtr() is unsafe.

That's a good idea. Yes, playing with fire is unsafe, but fire is
useful in a lot of places as well :-)

> Third, I've been pondering writing a set of preprocessor macros, shipped
> in their own header file distributed independently of Python and
> released to the public domain, that would make it easy to use either
> CObjects or capsules depending on what version of Python you were
> compiling against.  Obviously, using these macros would require a source
> code change in the third-party library.  But these macros would make it
> a five-minute change.  This could complement the first or second
> approaches.

I'm not sure whether that would be worth the effort. 3rd party modules
for Python 2.x are likely not going to use them, since they typically
have to support more than just Python 2.7.
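
For reference, the kind of version-switching shim under discussion
might look roughly like the sketch below. It is an illustration only:
CompatBox, COMPAT_BACKEND, COMPAT_NAME and compat_wrap() are
hypothetical names, and the fallback definition of PY_VERSION_HEX is an
assumption that lets the sketch compile without the Python headers,
which would normally define it.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in so the sketch builds without Python.h. Real code
   would take PY_VERSION_HEX from the Python headers. */
#ifndef PY_VERSION_HEX
#define PY_VERSION_HEX 0x02070000
#endif

/* Hypothetical wrapper used by both branches of the shim. */
typedef struct { void *ptr; const char *name; } CompatBox;

#if PY_VERSION_HEX >= 0x02070000
/* 2.7 and later: behave like a capsule and keep the name. */
#define COMPAT_BACKEND "capsule"
#define COMPAT_NAME(name) (name)
#else
/* Older releases: behave like a CObject, which has no name to store. */
#define COMPAT_BACKEND "cobject"
#define COMPAT_NAME(name) NULL
#endif

/* Single entry point for extension code; the preprocessor picks the
   backend, so the call site is the same on every Python version. */
static CompatBox compat_wrap(void *ptr, const char *name)
{
    CompatBox box;
    assert(ptr != NULL);    /* a wrapper around NULL would be useless */
    (void)name;             /* unused when the CObject branch is selected */
    box.ptr = ptr;
    box.name = COMPAT_NAME(name);
    return box;
}
```

Such a shim keeps one source tree building against several Python
versions, but a module that also has to run on 2.4 through 2.6 would
still end up on the CObject branch there.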

E.g. look at Plone - they are still running on Python 2.4 and use
lots of 3rd party modules. It took Zope years to switch from 2.4 to
2.6. You have the same situation with other large systems, whether
OSS or not.

If a required 3rd party module would suddenly only support Python 2.7,
developers would either have to find a replacement that (still) works
for them (rather unlikely), or try to port their whole app to 2.7,
which often enough is not easily possible due to other restrictions.

This aspect is often not considered when discussing such backwards
incompatible changes - at least not in the early stages. Fortunately,
we have so far - and most of the time - found a way to keep everyone
happy.

> Fourth, we could back out of the changes to published APIs and convert
> them back to CObjects.  -1.

See above. That's still an option as well, but I agree that either
making PyCObjects compatible with PyCapsules (via changes to
the PyCObject functions) or having them exist side-by-side is
a better option.

-- 
Marc-Andre Lemburg
eGenix.com


