[Python-ideas] Why is design-by-contracts not widely adopted?
robertc at robertcollins.net
Wed Sep 26 02:17:22 EDT 2018
On Wed, 26 Sep 2018 at 05:19, Marko Ristin-Kaufmann
<marko.ristin at gmail.com> wrote:
> Hi Robert,
>> Claiming that DbC annotations will improve the documentation of every
>> single library on PyPI is an extraordinary claim, and such claims
>> require extraordinary proof.
> I don't know what you mean by "extraordinary" claim and "extraordinary" proof, respectively.
Chris already addressed this.
> I tried to show that DbC is a great tool and far superior to any other tools currently used to document contracts in a library, please see my message https://groups.google.com/d/msg/python-ideas/dmXz_7LH4GI/5A9jbpQ8CAAJ. Let me re-use the enumeration I used in the message and give you a short summary.
> When you are documenting a method you have the following options:
> 1) Write preconditions and postconditions formally and include them automatically in the documentation (e.g., by using icontract library).
> 2) Write preconditions and postconditions in the docstring of the method as human text.
> 3) Write doctests in the docstring of the method.
> 4) Expect the user to read the actual implementation.
> 5) Expect the user to read the testing code.
So you can also:
0) Write clear documentation
e.g. just document the method without doctests. You can write doctests
in example documentation, or even in unit tests if desired: having use
cases be tested is often valuable.
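
As a rough illustration of option 0, here is a minimal sketch in which the
expectations are stated in plain prose in the docstring and the use cases are
exercised in a separate unit test rather than a doctest. The function `clamp`
and the test class are hypothetical, invented only for this example.

```python
import unittest


def clamp(value, low, high):
    """Return value limited to the closed range [low, high].

    The caller must pass low <= high; otherwise ValueError is raised.
    The result is always between low and high, inclusive.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


class ClampUseCases(unittest.TestCase):
    # Use cases live in the test suite, not in the docstring.
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_value_outside_range_is_limited(self):
        self.assertEqual(clamp(-3, 0, 10), 0)
        self.assertEqual(clamp(42, 0, 10), 10)

    def test_inverted_range_is_rejected(self):
        with self.assertRaises(ValueError):
            clamp(1, 10, 0)
```

The docstring stays short and readable, while the tested use cases can grow
without cluttering the documentation.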
> The implicit or explicit contracts are there willy-nilly. When you use a module, either you need to figure them out using trial-and-error or looking at the implementation (4), looking at the test cases and hoping that they generalize (5), write them as doctests (3) or write them in docstrings as human text (2); or you write them formally as explicit contracts (1).
> I could not identify any other methods that can help you with expectations when you call a function or use a class (apart from formal methods and proofs, which I omitted as they seem too esoteric for the current discussion).
> Given that:
> * There is no other method for representing contracts,
> * people are trained and can read formal statements and
> * there is tooling available to write, maintain and represent contracts in a nice way
> I see formal contracts (1) as a superior tool. The deficiencies of other approaches are:
> 2) Comments and docstrings inevitably rot and get disconnected from the implementation in my and many other people's experience and studies.
> 3) Doctests are much longer and hence more tedious to read and maintain, they need extra text to signal the intent (is it a simple test or an example how boundary conditions are handled or ...). In any non-trivial case, they need to include even the contract itself.
> 4) Looking at other people's code to figure out the contracts is tedious and usually difficult for any non-trivial function.
> 5) Test cases can be difficult to read since they include much broader testing logic (mocking, set up). Most libraries do not ship with the test code. Identifying test cases which demonstrate the contracts can be difficult.
I would say that contracts *are* a formal method. They are machine
interpretable rules about when the function may be called and about
what it may do and how it must leave things.
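
To make "machine interpretable rules" concrete, here is a minimal sketch of
precondition and postcondition decorators in plain Python. It is not the
icontract API; the names `require`, `ensure`, `ContractViolation` and
`largest` are assumptions made up for this illustration.

```python
import functools


class ContractViolation(AssertionError):
    """Raised when a precondition or postcondition does not hold."""


def require(predicate, description):
    """Precondition: checked against the arguments before the call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not predicate(*args, **kwargs):
                raise ContractViolation("precondition failed: " + description)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def ensure(predicate, description):
    """Postcondition: checked against the result (and arguments) after the call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            if not predicate(result, *args, **kwargs):
                raise ContractViolation("postcondition failed: " + description)
            return result
        return wrapper
    return decorator


# Hypothetical example function: the rules say when it may be called
# (non-empty input) and what it must return (one of the inputs).
@require(lambda items: len(items) > 0, "items must not be empty")
@ensure(lambda result, items: result in items, "result is one of the items")
def largest(items):
    return max(items)
```

Calling `largest([])` raises `ContractViolation` before `max` is ever reached.
Every call pays for the extra predicate evaluations, which is one part of the
overhead in question.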
The critique I offer of DbC in a Python context is the same as for
other formal methods: is the benefit worth the overhead. If you're
writing a rocket controller, Python might not be the best language for
it.
> Any function that is used by multiple developers which operates on the restricted range of input values and gives out structured output values benefits from contracts (1) since the user of the function needs to figure them out to properly call the function and handle its results correctly. I assume that every package on pypi is published to be used by wider audience, and not the developer herself. Hence every package on pypi would benefit from formal contracts.
In theory there is no difference between theory and practice, but in
practice there may be differences between practice and theory :).
Less opaquely, you're using a model to try and extrapolate human
behaviour. This is not an easy thing to do, and you are very likely to
be missing factors. For instance, perhaps training affects perceived
benefits. Perhaps a lack of experimental data affects uptake in more
data driven groups. Perhaps increased friction in changing systems is
felt to be a negative.
And crucially, perhaps some of these things are true. As previously
mentioned, Python has wonderful libraries; does it have more per
developer than languages with DbC built in? If so then that might
speak to developer productivity without these formal contracts: it may
be that where the risks of failure are below the threshold that formal
methods are needed, we're better off with the tradeoff Python
makes.
> Some predicates are hard to formulate, and we will never be able to formally write down all the contracts. But that doesn't imply for me to not use contracts at all (analogously, some functionality is untestable, but that doesn't mean that we don't test what we can).
> I would be very grateful if you could point me where this exposition is wrong (maybe referring to my original message, https://groups.google.com/d/msg/python-ideas/dmXz_7LH4GI/5A9jbpQ8CAAJ, which I spent more thought on formulating).
I think the underlying problem is that you're treating this as a logic
problem (what does logic say applies here), rather than an engineering
problem (what can we measure and what does it tell us about what's
going on). At least, that's how it appears to me.
> So far, I have neither been confronted with nor read on the internet a plausible argument against formal contracts (the only two exceptions being the lack of tools and that less-skilled programmers have a hard time reading formal statements as soon as they include boolean logic and quantifiers). I'm actively working on the former, and hope that the latter will improve with time as education in computer science improves.
> Another argument, which I did read often on the internet but don't really count, is that quality software is not a priority and most projects hence dispense with documentation or testing. This should, hopefully, not apply to public PyPI packages and is highly impractical for any medium-size project with multiple developers (and very costly in the long run).
I have looked for but could not find any studies into the developer
productivity (and correctness) tradeoffs that DbC creates, other than
stuff from 20 years ago which by its age clearly cannot contrast with
modern Python/Ruby/Rust etc.
Consider this: the goal of software development is to deliver
features, at some level of correctness. One very useful measure of
productivity then is a measure of how long it takes a given team to
produce those features at that level of correctness.
If DbC reduces the time it takes to get those features, it is
increasing productivity.
If it increases the time it takes to get those features, it is
decreasing productivity, *even if it increases correctness*. Being
more correct than needed is not beneficial much of the time.
Does DbC deliver higher productivity @ a given correctness level? I
don't know - that's why I went looking for research, but I couldn't
find any (I may have missed it of course, I'd be happy to read some
citations). I'm specifically looking for empirical data here, not
extrapolation or rationalisations.
>> I can think of many libraries where necessary pre and post conditions
>> (such as 'self is still locked') are going to be noisy, and at risk of
>> reducing comprehension if the DbC checks are used to enhance/extend
>> documentation.
> It is up to the developer to decide which contracts are enforced during testing, production or displayed in the documentation (you can pick the subset of the three, it's not an exclusion). This feature ("enabled" argument to a contract) has been already implemented in the icontract library.
>> Some of the examples you've been giving would be better expressed with
>> a more capable type system in my view (e.g. Rust's), but I have no
>> good idea about adding that into Python :/.
> I don't see how a type system would help, regardless of how strict it is, unless each input and each output were represented by a special type, which would be super confusing as soon as you put them in containers and had to struggle with invariance, contravariance and covariance. Please see https://github.com/rust-lang/rfcs/issues/1077 for a discussion about introducing DbC to Rust. Unfortunately, the discussion about contracts in Rust is also based on misconceptions (e.g., see https://github.com/rust-lang/rfcs/issues/1077#issuecomment-94582917) -- there seems to be something wrong in the way anybody proposing DbC presents contracts to the wider audience and fails to address these issues in a good way. So most people just react instinctively with "80% already covered with type systems" / "mere runtime type checks, use assert" and "that's only an extension to testing, so why bother" :(.
> I would now like to answer Hugh and withdraw from the discussion pro/contra formal contracts unless there is a rational, logical argument disputing the DbC in its entirety (not in one of its specific aspects or as a misconception/straw-man). A lot has been already said, many articles have been written (I linked some of the pages which I thought were short & good reads and I would gladly supply more reading material). I doubt I can find a better way to contribute to the discussion.
Sure; like I said, I think the fundamental question about DbC is
actually whether it helps:
a) all programs
b) all nontrivial programs
c) high assurance programs
My suspicion, for which I have only anecdata, is that it's really in c)
today. Kind of where TDD was in the early 2000s (and as I understand
the research, it's been shown to be a wash: you do get more tests than
test-last or test-during, and more tests is correlated with quality
and ease of evolution, but if you add that test coverage in
test-during or test-last, you end up with the same benefits).