[Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules

Sat Jul 9 20:15:33 EDT 2016

On 10 July 2016 at 05:10, Steven D'Aprano <steve at pearwood.info> wrote:
> The other side of the issue is that requiring exact correspondence is
> considerably more difficult and may be over-kill for some uses.
> Hypothetically speaking, if I have an object that only supports pickling
> by accident, and I replace it with one that doesn't support pickling,
> shouldn't it be my decision whether that counts as a functional
> regression (a bug) or a reliance on an undocumented and accidental
> implementation detail?

That's the proposed policy change, and the reason I figured it needed a
python-dev discussion: currently it's up to folks adding the Python
equivalents (or the C accelerators) to decide on a case-by-case basis
whether or not to care about compatibility for:

- string representations
- pickling (and hence multiprocessing support)
- subclassing
- isinstance checks
- descriptor behaviour

The main way for such discrepancies to arise is for the Python
implementation to be a function (or closure), while the C
implementation is a custom stateful callable.
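
As a rough sketch of how that plays out (py_adder and c_adder are
hypothetical names made up for illustration, with c_adder written in
Python as a stand-in for a C type), the two styles can be probed for
the behaviours listed above:

```python
import pickle


def py_adder(n):
    """Hypothetical pure-Python implementation: a factory returning a closure."""
    def add(x):
        return x + n
    return add


class c_adder:
    """Hypothetical stand-in for the C implementation: a stateful callable type."""

    def __init__(self, n):
        self.n = n

    def __call__(self, x):
        return x + self.n


def capabilities(impl):
    """Report which incidental behaviours a given implementation supports."""
    obj = impl(1)

    def works(check):
        # True if the probe runs without raising, False otherwise.
        try:
            check()
        except Exception:
            return False
        return True

    return {
        "call": obj(2) == 3,
        "pickle": works(lambda: pickle.dumps(obj)),
        "subclass": works(lambda: type("Sub", (impl,), {})),
        "isinstance": works(lambda: isinstance(obj, impl)),
        # Functions define __get__ and so bind like methods when stored on
        # a class; an ordinary callable instance does not.
        "descriptor": hasattr(type(obj), "__get__"),
    }


for impl in (py_adder, c_adder):
    print(impl.__name__, capabilities(impl))
```

Both versions agree on the "call" entry, which is typically all the
documentation promises, but they disagree on subclassing, isinstance
checks and descriptor behaviour, and the closure can't be pickled.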

The problem with the current "those are just technical implementation
details" approach is that a lot of Pythonistas learn standard library
API behaviour and capabilities through a mix of experimentation and
introspection rather than reading the documentation. If CPython uses
the accelerated version by default, then most folks aren't going to
notice the discrepancies until they (or their users) are trying to
debug a problem like "my library works fine in CPython, but breaks
when used with multiprocessing on PyPy" or "my doctests fail when
running under MicroPython".

For non-standard library code, those kinds of latent compatibility
defects are fine, but PEP 399 is specifically about setting design &
development policy for the *standard library* to help improve
cross-implementation compatibility not just of the standard library
itself, but of code *using* the standard library in ways that work on
CPython in its default configuration.

One example of a practical consequence of the change in policy would
be to say that if you don't want to support subclassing, then don't
give the *type* a public name - hide it behind a factory function, the
way contextlib.contextmanager hides
contextlib._GeneratorContextManager.
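
A minimal sketch of that pattern follows. The contextlib part uses the
real API; _Accumulator and accumulator are hypothetical names invented
for the example:

```python
import contextlib


@contextlib.contextmanager
def managed():
    yield "resource"


# The documented public name is a plain function, so there is no public
# type on offer to subclass.
print(type(contextlib.contextmanager).__name__)  # function
# The object doing the work is an instance of a private type.
print(type(managed()).__name__)  # _GeneratorContextManager


# The same pattern applied to a hypothetical API:
class _Accumulator:
    """Private implementation type; the leading underscore marks it as such."""

    def __init__(self, total=0):
        self.total = total

    def __call__(self, value):
        self.total += value
        return self.total


def accumulator(start=0):
    """Public factory: the only name callers are expected to rely on."""
    return _Accumulator(start)
```

Callers only ever see accumulator(), so an alternate implementation is
free to return a closure or a differently named type without breaking
any documented behaviour.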

That example also shows that accidentally making a type public (but
still undocumented) isn't necessarily a commitment to keeping it
public forever - that type was originally
contextlib.GeneratorContextManager, so when it was pointed out it
wasn't documented, I had to choose between supporting it as a public
API (which I didn't want to do) and adding the leading underscore to
its name to better reflect its implementation detail status.

In a similar fashion, the policy I'm proposing here also wouldn't
require that discrepancies always be resolved in favour of enhancing
the Python version to match the C version - in some cases, if the
module maintainer genuinely doesn't want to support a particular
behaviour, then they can make sure that behaviour isn't available for
the C version either. The key change is that it would become
officially *not* OK for the feature set of the C version to be a
superset of the feature set of the Python version - either the C
version has to be constrained to match the Python one, or the Python
one enhanced to match the C one, rather than leaving the latent
compatibility defect in place as a barrier to adoption for alternate
runtimes.
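
As one possible sketch of the "constrain it" option (restricted_adder
is a hypothetical name, written in Python as a stand-in for a C type,
and __init_subclass__ assumes Python 3.6 or later), a stateful callable
can explicitly refuse the behaviours a closure-based version wouldn't
offer:

```python
import pickle


class restricted_adder:
    """Hypothetical stateful callable that deliberately opts out of
    pickling and subclassing, so it offers no more than a closure would."""

    def __init__(self, n):
        self._n = n

    def __call__(self, x):
        return x + self._n

    def __reduce__(self):
        # Refuse pickling explicitly rather than supporting it by accident.
        raise TypeError("cannot pickle %r object" % type(self).__name__)

    def __init_subclass__(cls, **kwargs):
        # Refuse subclassing. A real C type would get the same effect by
        # leaving Py_TPFLAGS_BASETYPE out of its type flags.
        raise TypeError("restricted_adder is not an acceptable base type")


add_one = restricted_adder(1)
print(add_one(2))  # 3
```

The other direction, enhancing the Python version, would mean replacing
the closure with a class that supports the same operations as the C
type.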

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia