[Python-ideas] Why is design-by-contracts not widely adopted?

Wed Sep 26 04:36:23 EDT 2018

On Wed, 26 Sep 2018 at 06:41, Chris Angelico <rosuav at gmail.com> wrote:
>
> On Wed, Sep 26, 2018 at 2:47 PM Marko Ristin-Kaufmann
> <marko.ristin at gmail.com> wrote:
> >
> > Hi Chris,
> >
> >> An extraordinary claim is like "DbC can improve *every single project*
> >> on PyPI". That requires a TON of proof. Obviously we won't quibble if
> >> you can only demonstrate that 99.95% of them can be improved, but you
> >> have to at least show that the bulk of them can.
> >
> >
> > I tried to give the "proof" (not a formal one, though) in my previous message.
>
> (Formal proof isn't necessary here; we say "extraordinary proof", but
> it'd be more accurate to say "extraordinary evidence".)
>
> > The assumptions are that:
> > * There are always contracts; they can be either implicit or explicit. You always need to figure them out before you call a function or use its result.
>
> Not all code has such contracts. You could argue that code which does
> not is inferior to code which does, but not everything follows a
> strictly-definable pattern.

Also, the implicit contracts code currently has are typically pretty
loose. What you need to "figure out" is very general. Explicit
contracts are typically demonstrated as being relatively strict, and
figuring out and writing such contracts is more work than writing code
with loose implicit contracts. Whether the trade-off of defining tight
explicit contracts once vs inferring a loose implicit contract every
time you call the function is worth it depends on how often the
function is called. For most single-use (or infrequently used)
functions, I'd argue that the trade-off *isn't* worth it.

Here's a quick example from the pip codebase:

# Retry every half second for up to 3 seconds
@retry(stop_max_delay=3000, wait_fixed=500)
def rmtree(dir, ignore_errors=False):
    shutil.rmtree(dir, ignore_errors=ignore_errors,
                  onerror=rmtree_errorhandler)

What contract would you put on this code? The things I can think of:

1. dir is a directory: obvious from the name, not worth the runtime
cost of checking as shutil.rmtree will do that and we don't want to
duplicate work.
2. dir is a string: covered by type declarations, if we used them. No
need for contracts.
3. ignore_errors is a boolean: covered by type declarations.
4. dir should exist: Checked by shutil.rmtree, don't want to duplicate work.
5. After completion, dir won't exist. Obvious unless we have doubts
about what shutil.rmtree does (but that would have a contract too).
Also, we don't want the runtime overhead (again).

In addition, adding those contracts to the code would expand it
significantly, making readability suffer (as it is, rmtree is clearly
a thin wrapper around shutil.rmtree).
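
To make the expansion concrete, here is a minimal sketch (not pip's
actual code) of the same wrapper with items 1-5 above spelled out as
plain assert statements. The name rmtree_with_contracts is
hypothetical, and the retry decorator and error handler are left out
so that the sketch runs on its own:

```python
import os
import shutil
import tempfile


def rmtree_with_contracts(dir, ignore_errors=False):
    # Hypothetical variant for illustration only; pip's retry decorator
    # and rmtree_errorhandler are omitted to keep this self-contained.

    # Preconditions (items 1-4): types, and dir is an existing directory.
    assert isinstance(dir, str), "dir must be a string"
    assert isinstance(ignore_errors, bool), "ignore_errors must be a bool"
    assert os.path.isdir(dir), "dir must be an existing directory"

    shutil.rmtree(dir, ignore_errors=ignore_errors)

    # Postcondition (item 5): dir is gone, unless errors were ignored.
    assert ignore_errors or not os.path.exists(dir), "dir must be removed"


# Usage: remove a freshly created temporary directory.
target = tempfile.mkdtemp()
rmtree_with_contracts(target)
removed = not os.path.exists(target)
```

The one-line body is now surrounded by four checks, three of which
repeat work that shutil.rmtree or a type checker would do anyway.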

> > * Figuring out contracts by trial-and-error and reading the code (the implementation or the test code) is time consuming and hard.
>
> Agreed.

With provisos. Figuring out contracts in sufficient detail to use the
code is *in many cases* simple. For harder cases, agreed. But that's
why this is simply a proof that contracts *can* be useful, not that
100% of code would benefit from them.

> > * There are tools for formal contracts.
>
> That's the exact point you're trying to make, so it isn't evidence for
> itself. Tools for formal contracts exist as third party in Python, and
> if that were good enough for you, we wouldn't be discussing this.
> There are no such tools in the standard library or language that make
> formal contracts easy.
>
> > * The contracts written in documentation as human text inevitably rot and they are much harder to maintain than automatically verified formal contracts.
>
> Agreed.

Agreed, if contracts are automatically verified. But when runtime cost
comes up, people suggest that contracts can be disabled in production
code - which invalidates the "automatically verified" premise.
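
As a rough illustration, assuming contracts are written as plain
assert statements (contract libraries have their own switches, which
are not shown here), the sketch below runs the same snippet in a fresh
interpreter with and without Python's -O flag. The function halve is
hypothetical:

```python
import subprocess
import sys

# A hypothetical function whose only precondition is written as an assert.
SNIPPET = """
def halve(n):
    assert n % 2 == 0, "n must be even"
    return n // 2

print(halve(3))
"""


def run(*flags):
    # Run SNIPPET in a fresh interpreter with the given command-line flags.
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )


checked = run()        # assertions active: the bad call is rejected
unchecked = run("-O")  # assertions stripped: the bad call goes through
```

Without -O the call fails with an AssertionError; with -O the
precondition is never evaluated and the function returns a result for
an input its contract forbids.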

> > * The reader is familiar with formal statements, and hence reading formal statements is faster than reading the code or trial-and-error.
>
> Disagreed. I would most certainly NOT assume that every reader knows
> any particular syntax for such contracts. However, this is a weaker
> point.

Depends on what "formal statement" means. If it means "short snippet
of Python code", then yes, the reader will be familiar. But there's
only so much you can do in a short snippet of Python, without calling
out to other functions (which may or may not be "obvious" in their
behaviour), so whether it's easier to read a contract is somewhat in
conflict with wanting strong contracts.
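
A small sketch of that tension, using hypothetical names (merge and
is_sorted are made up for illustration and come from no particular
library): the contract lines stay short only because the real meaning
is pushed into a helper the reader has to look up.

```python
def is_sorted(items):
    # Hypothetical helper; the contract's actual meaning lives here.
    return all(a <= b for a, b in zip(items, items[1:]))


def merge(left, right):
    # Short, readable precondition that relies on the helper above.
    assert is_sorted(left) and is_sorted(right), "inputs must be sorted"

    result = sorted(left + right)

    # A stronger postcondition needs more code than the body itself.
    assert is_sorted(result) and len(result) == len(left) + len(right)
    return result


merged = merge([1, 4], [2, 3])
```

Reading the asserts is quick, but trusting them means reading
is_sorted as well, which is the "calling out to other functions"
cost described above.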

> So I'll give you two and two halves for that. Good enough to make do.
>
> > I then went on to show why I think, under these assumptions, that formal contracts are superior as a documentation tool and hence beneficial. Do you think that any of these assumptions are wrong? Is there a hole in my logical reasoning presented in my previous message? I would be very grateful for any pointers!
> >
> > If these assumptions hold and there is no mistake in my reasoning, wouldn't that qualify as a proof?
> >
>
[...]
> You might argue that a large proportion of PyPI projects will be
> "library-style" packages, where the main purpose is to export a bunch
> of functions. But even then, I'm not certain that they'd all benefit
> from DbC. Some would, and you've definitely made the case for that;
> but I'm still -0.5 on adding anything of the sort to the stdlib, as I
> don't yet see that *enough* projects would actually benefit.

The argument above, if it's a valid demonstration that all code would
benefit from contracts, would *also* imply that every function in the
stdlib should have contracts added. Are you proposing that, too, and
is your proposal not just for syntax for contracts, but *also* for
wholesale addition of contracts to the stdlib? If so, you should be
far more explicit that this is what you're proposing, because you'd
likely get even more pushback over that sort of churn in the stdlib
than over a syntax change to support contracts. Even Guido didn't push
that far with type annotations...

> People have said the same thing about type checking, too. Would
> *every* project on PyPI benefit from MyPy's type checks? No. Syntax
> for them was added, not because EVERYONE should use them, but because
> SOME will use them, and it's worth having some language support. You
> would probably do better to argue along those lines than to try to
> claim that every single project ought to be using contracts.

Precisely. To answer the original question, "Why is design by
contracts not widely adopted?" part of the answer is that I suspect
extreme claims like this have put many people off, seeing design by
contract as more of an evangelical stance than a practical tool. If it
were promoted more as a potentially useful addition to the
programmer's toolbox, and less as the solution to every problem, it
might have gained more traction. (Similar issues are why people are
skeptical of functional programming, and many other tools - even the
"strong typing vs weak typing" debate can have a flavour of this "my
proposal solves all the world's ills" attitude).

Personally, I'm open to the benefits of design by contract. But if I
need to buy into a whole philosophy to use it (or engage with its user
community) I'll pass.

Paul