[Python-Dev] Proposal: explicitly disallow function/class mismatches in accelerator modules
steve at pearwood.info
Sun Jul 10 23:26:14 EDT 2016
On Sun, Jul 10, 2016 at 10:15:33AM +1000, Nick Coghlan wrote:
> On 10 July 2016 at 05:10, Steven D'Aprano <steve at pearwood.info> wrote:
> > The other side of the issue is that requiring exact correspondence is
> > considerably more difficult and may be over-kill for some uses.
> > Hypothetically speaking, if I have an object that only supports pickling
> > by accident, and I replace it with one that doesn't support pickling,
> > shouldn't it be my decision whether that counts as a functional
> > regression (a bug) or a reliance on an undocumented and accidental
> > implementation detail?
> That's the proposed policy change and the reason I figured it needed a
> python-dev discussion, as currently it's up to folks adding the Python
> equivalents (or the C accelerators) to decide on a case by case basis
> whether or not to care about compatibility for:
> - string representations
> - pickling (and hence multiprocessing support)
> - subclassing
> - isinstance checks
> - descriptor behaviour
Right... and that's what I'm saying *ought* to be the decision of the
maintainer. Do I understand that you agree with this, but you want to
ensure that such decisions are made up front rather than when and if
discrepancies are noticed?
> The main way for such discrepancies to arise is for the Python
> implementation to be a function (or closure), while the C
> implementation is a custom stateful callable.
> The problem with the current "those are just technical implementation
> details" approach is that a lot of Pythonistas learn standard library
> API behaviour and capabilities through a mix of experimentation and
> introspection rather than reading the documentation,
> so if CPython
> uses the accelerated version by default, then most folks aren't going
> to notice the discrepancies until they (or their users) are trying to
> debug a problem like "my library works fine in CPython, but breaks
> when used with multiprocessing on PyPy" or "my doctests fail when
> running under MicroPython".
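The closure-versus-stateful-callable discrepancy described above is easy to sketch. Here `make_adder` and `Adder` are hypothetical stand-ins for a pure-Python implementation and the shape its C accelerator would typically take:

```python
import pickle

def make_adder(n):
    # Pure-Python style: a closure capturing its state.
    def add(x):
        return x + n
    return add

class Adder:
    # C-accelerator style: a stateful callable object.
    def __init__(self, n):
        self.n = n
    def __call__(self, x):
        return x + self.n

closure_add = make_adder(5)
object_add = Adder(5)

# The two behave identically when called...
assert closure_add(10) == object_add(10) == 15

# ...but the closure cannot be pickled (which is what breaks
# multiprocessing), because pickle stores functions by qualified name
# and a local function is unreachable that way:
try:
    pickle.dumps(closure_add)
except (pickle.PicklingError, AttributeError):
    print("closure version is not picklable")

# ...and isinstance checks that work against one implementation
# silently fail against the other:
assert isinstance(object_add, Adder)
assert not isinstance(closure_add, Adder)
```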
Yes, and that's a problem, but is it a big enough problem to justify a
policy change and pre-emptive effort to prevent it?
The great majority of people are never going to run their Python code on
anything other than CPython, and while I always encourage people to
write in the most platform-independent fashion possible, I am also
realistic enough to recognise that platform independence is an ideal
that many people will fail to meet. (I'm sure that I've written code
that isn't as platform independent as I hope.)
My two core questions are:
(1) How much extra effort are we going to *mandate* that core devs put
in to hide the differences between C and Python code, for the benefit of
a small minority that will notice them?
(2) When should that effort be done? Upfront, or when and as problems
are reported or noticed?
My preferred answers are (1) not much, and (2) when problems are
reported. In other words, close to the status quo.
I can't speak for others, but I have a tendency towards over-analysing
my code, trying to pre-emptively spot and avoid even really obscure
failure modes before they occur. That's a trap: it makes it hard to
finish (as much as any code is finished) and harder to meet deadlines.
It's taken me a lot of effort, and much influence from TDD, to realise
that it's okay to release code with a bug you didn't spot. You can
always fix it in the next release.
I think the same applies here. I'm okay with something close to the
status quo: if a C accelerator doesn't quite have the same undocumented
interface as the pure Python one, then it's a bug in one or the other,
in which case it's okay to fix it when somebody notices.
But I don't think I'm okay with making it mandatory that we prevent such
possible incompatibilities ahead of time.
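For a concrete example of the kind of undocumented difference that already ships with CPython today: the `io` module is accelerated by default, while `_pyio` is its pure-Python counterpart, and code that leans on isinstance checks or reprs can tell them apart even though the documented behaviour matches:

```python
import io      # the C-accelerated implementation (on CPython)
import _pyio   # the pure-Python implementation

c_buf = io.StringIO()
py_buf = _pyio.StringIO()

# The documented behaviour matches...
c_buf.write("spam")
py_buf.write("spam")
assert c_buf.getvalue() == py_buf.getvalue() == "spam"

# ...but the two classes are unrelated, so an isinstance check written
# against the accelerated version quietly fails on the pure one:
assert isinstance(c_buf, io.StringIO)
assert not isinstance(py_buf, io.StringIO)

# ...and the reprs differ, which is exactly the sort of thing that
# breaks doctests (e.g. "<_io.StringIO ...>" vs "<_pyio.StringIO ...>"):
print(repr(c_buf))
print(repr(py_buf))
```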
If we do make this mandatory, how is it going to be enforced and
checked? The normal way to enforce that accelerated code has the same
behaviour as Python code is to see that they both pass the same tests.
But this can only check for features where a test has been written. If
you don't think of an incompatibility ahead of time, how do you write a
test for it?
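For what it's worth, the usual mechanics of that "same tests" approach are a shared test body run once per implementation, with the pure-Python version imported while its accelerator is blocked. The sketch below uses a simplified, hand-rolled version of the fresh-import helper CPython's own test suite uses for this, with heapq/_heapq as the example pair:

```python
import importlib
import sys
import unittest

def import_pure(name, accelerator):
    """Import `name` with its C accelerator blocked, forcing the
    pure-Python fallback (a simplified stand-in for the helper
    CPython's own test suite uses for PEP 399-style testing)."""
    saved = {mod: sys.modules.pop(mod, None) for mod in (name, accelerator)}
    sys.modules[accelerator] = None  # makes "import _heapq" raise ImportError
    try:
        return importlib.import_module(name)
    finally:
        for mod, original in saved.items():
            if original is not None:
                sys.modules[mod] = original
            else:
                sys.modules.pop(mod, None)

import heapq as c_heapq              # accelerated by _heapq on CPython
py_heapq = import_pure("heapq", "_heapq")

class HeapTests:
    """One shared test body, run once per implementation. Any behaviour
    without a test here is exactly where discrepancies can hide."""
    module = None

    def test_push_pop_sorts(self):
        data = [5, 1, 4, 2, 3]
        heap = []
        for item in data:
            self.module.heappush(heap, item)
        self.assertEqual([self.module.heappop(heap) for _ in data],
                         sorted(data))

class CHeapTests(HeapTests, unittest.TestCase):
    module = c_heapq

class PyHeapTests(HeapTests, unittest.TestCase):
    module = py_heapq

# Run both test classes programmatically:
loader = unittest.defaultTestLoader
suite = loader.loadTestsFromTestCase(CHeapTests)
suite.addTests(loader.loadTestsFromTestCase(PyHeapTests))
result = unittest.TextTestRunner(verbosity=0).run(suite)
assert result.wasSuccessful()
```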
I appreciate that the standard library should be held to a higher
standard of professionalism than external code, but I don't think that
*all* the burden should fall on the core developers. Reliance on
undocumented features is always a dubious thing to do. We all do it, and
when it turns out that the feature can't be counted on (because it
changes from one version to another, or isn't available on some
platforms), who is to blame for our application breaking?
As the programmer who relied on a promise that was never made, surely I
must take at least a bit of responsibility? It's not like the docs are
locked up in a filing cabinet in the basement behind a door with a sign
saying "Beware of the leopard".
I'm just not comfortable with mandating that core devs must do even more
work to protect programmers (including myself) from our own failures to
code defensively and read the docs, with respect to this specific issue.
We have to draw the line somewhere. I've seen people write code that
relies on the exact wording of error messages. I've even done that
myself, a long time ago. I hope that we would all agree that mirroring
error messages is crossing the line. But beyond that, I'm not sure where
the line lies, and I'd rather put off dealing with it until necessary,
and on a case-by-case basis.
> For non-standard library code, those kinds of latent compatibility
> defects are fine, but PEP 399 is specifically about setting design &
> development policy for the *standard library* to help improve
> cross-implementation compatibility not just of the standard library
> itself, but of code *using* the standard library in ways that work on
> CPython in its default configuration.
PEP 399 already raises this issue. Quote:
Any new accelerated code must act as a drop-in replacement as
close to the pure Python implementation as reasonable. Technical
details of the VM providing the accelerated code are allowed to
differ as necessary
Are you saying that's not good enough? If so, what's your proposed
alternative?