[New-bugs-announce] [issue10977] Concrete object C API needs abstract path for subclasses of builtin types

Raymond Hettinger report at bugs.python.org
Sat Jan 22 00:49:39 CET 2011

New submission from Raymond Hettinger <rhettinger at users.sourceforge.net>:

Currently, the concrete object C API bypasses any methods defined on subclasses of builtin types.  

It has long been accepted that subclasses of builtin types need to override many methods rather than just a few because the type itself was implemented with direct internal calls.  This work-around requires extra work but still makes it possible to create a consistent subclass (a case-insensitive dictionary for example).
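To make the "override many methods" point concrete, here is a minimal sketch of a partial override. The class name LowerDict is hypothetical and is not taken from any library; it overrides only __setitem__, so the inherited update() and setdefault() still store keys directly.

```python
class LowerDict(dict):
    """Hypothetical case-insensitive dict that overrides only __setitem__."""

    def __setitem__(self, key, value):
        # Normalize the key before storing it.
        super().__setitem__(key.lower(), value)


d = LowerDict()
d["Alpha"] = 1            # subscript assignment goes through the override
d.update({"Beta": 2})     # dict.update() stores the key without calling it
d.setdefault("Gamma", 3)  # dict.setdefault() does the same

# Only the first key was normalized; the other two kept their case.
normalized = [key for key in d if key == key.lower()]
```

A consistent subclass would therefore also have to override update(), setdefault(), the constructor, and so on.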

However, there is another problem with this workaround.  Any subclassed builtin may still be bypassed by other functions and methods using the concrete C API.

For example, the OrderedDict class overrides enough dict methods to make itself internally consistent and fully usable by all pure python code.  However, any builtin or extension using PyDict_SetItem() or PyDict_DelItem() will update the OrderedDict's internal unordered dict and bypass all the logic that keeps the dict ordering invariants intact.  Note that this problem would still affect OrderedDict even if it were rewritten in C.  The concrete calls would directly access the original dict structure and ignore any of the logic implemented in the subclass (whether coded in pure python or in C).

Since OrderedDict is written in pure python, violating its invariants results in disorganization, but not in a crash.  However, if OrderedDict were rewritten in C, augmenting the inherited data structure with its own extra state, then this problem could result in a segfault.

Note that this is only one example.  Pretty much any subclass of a builtin type that adds extra state is vulnerable to a concrete C API call that updates only the inherited part of the state and leaves the extra state inconsistent with it.

Another example would be a list subclass that kept extra state tracking the number of None objects contained within.  There is no way to implement that subclass, either in C or pure python, even if every method were overridden, that wouldn't be rendered inconsistent by an external tool using PyList_SetItem().
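A minimal sketch of that list subclass follows. The class name NoneCountingList is hypothetical, and the unbound call list.__setitem__() is used here only as a pure-python stand-in for an extension calling PyList_SetItem(); it is not the C call itself, but it writes to the inherited storage in the same way, without consulting the subclass.

```python
class NoneCountingList(list):
    """Hypothetical list subclass that tracks how many items are None."""

    def __init__(self, iterable=()):
        super().__init__(iterable)
        self.none_count = sum(1 for item in self if item is None)

    def __setitem__(self, index, value):
        # Only integer indexes are handled, to keep the sketch short.
        old = self[index]
        super().__setitem__(index, value)
        self.none_count += (value is None) - (old is None)


items = NoneCountingList([1, None, 3])
items[0] = None                  # abstract path: the override runs
count_after_override = items.none_count

# Stand-in for PyList_SetItem(): bypass the subclass entirely.
list.__setitem__(items, 2, None)
actual_nones = sum(1 for item in items if item is None)
# items.none_count is now stale: it still reports the old count.
```

No amount of additional overriding in NoneCountingList can prevent the second write from leaving none_count out of date.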

My recommendation is to find all of the mutating methods for the concrete C API and add an alternate path for subclasses.  The alternate path should use the abstract API.

Pseudo-code for PyList_SetItem():

   if type(obj) is list:
      # proceed as currently implemented
   elif isinstance(obj, list):
      # use PyObject_SetItem() adapting the
      # function parameters and return values if necessary
      # to match the API for PyList_SetItem().
   else:
      raise BadInternalCall('object should be a list')
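The dispatch described above can be modeled in runnable python as a rough sketch. The function list_setitem() and the class LoggingList are hypothetical names made up for illustration; the real change would live in C inside PyList_SetItem(). SystemError is used because that is the exception PyErr_BadInternalCall() raises.

```python
import operator


def list_setitem(obj, index, value):
    """Hypothetical python model of the proposed PyList_SetItem() dispatch."""
    if type(obj) is list:
        # Exact list: take the concrete path, as currently implemented.
        list.__setitem__(obj, index, value)
    elif isinstance(obj, list):
        # Subclass: take the abstract path so that overrides run.
        operator.setitem(obj, index, value)
    else:
        raise SystemError("object should be a list")


class LoggingList(list):
    """Hypothetical subclass that records every index written to."""

    def __init__(self, iterable=()):
        super().__init__(iterable)
        self.writes = []

    def __setitem__(self, index, value):
        self.writes.append(index)
        super().__setitem__(index, value)


plain = [0, 0]
list_setitem(plain, 0, "a")      # concrete path

logged = LoggingList([0, 0])
list_setitem(logged, 1, "b")     # abstract path, override is called
```

The exact-type check comes first so that plain lists, by far the common case, pay only one extra pointer comparison.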

components: Interpreter Core
messages: 126800
nosy: rhettinger
priority: normal
severity: normal
status: open
title: Concrete object C API needs abstract path for subclasses of builtin types
type: behavior
versions: Python 3.3

Python tracker <report at bugs.python.org>
