Re: [Python-ideas] Why is design-by-contracts not widely adopted?
Date: Mon, 24 Sep 2018 09:46:16 +0200 From: Marko Ristin-Kaufmann
To: Python-Ideas Subject: Re: [Python-ideas] Why is design-by-contracts not widely adopted?
[munch]
Python is easier to write and read, and there are no libraries which are close in quality in Eiffel space (notably, Numpy, OpenCV, nltk and sklearn). I really don't see how the quality of these libraries has anything to do with the lack (or presence) of contracts. OpenCV and Numpy have contracts all over their code (written as assertions and not documented), albeit with very non-informative violation messages. And they are great libraries. Their users would hugely benefit from a more mature and standardized contracts library with informative violation messages.
I would say the most likely outcome of adding Design by Contract would be no change in the quality or usefulness of these libraries, with a small but not insignificant chance of a decline in quality.
Fred Brooks in his "No Silver Bullet" paper distinguished between essential complexity, which is the problem we try to solve with software, and accidental complexity, solving the problems caused by your tools and/or process that get in the way of solving the actual problem. "Yak shaving" is a similar, less formal term for accidental complexity, when you have to do something before you can do something before you can actually do some useful work.
Adding new syntax or semantics to a programming language very often adds accidental complexity.
C and Python (currently) are known as simple languages. When starting a programming project in C or Python, there's maybe a brief discussion about C99 or C11, or Python 3.5 or 3.6, but that's it. There's one way to do it.
On the other hand C++ is notorious for having been designed with a shovel rather than a chisel. The people adding all the "features" were well intentioned, but it's still a mess. C++ programming projects often start by specifying exactly which bits of the language the programming team will be allowed to use. I've seen these reach hundreds of pages in length, consuming God knows how many hours to create, without actually creating a single line of useful software.
I think a major reason that Design by Contract hasn't been widely adopted in the three decades since its introduction is that, mostly, it creates more accidental complexity than it reduces essential complexity, so the costs outweigh any benefits.
Software projects, in any language, never have enough time to do everything. By your own example, the Python developers of numpy, OpenCV, nltk, and sklearn, who most certainly weren't writing contracts, produced better quality software than the Eiffel equivalent developers who (I assume) did use DbC. Shouldn't the Eiffel developers be changing their development method, not the Python developers?
Maybe in a world with infinite resources contracts could be added to those Python packages, or everything in PyPI, and it would be an improvement. But we don't have infinite resources. So I'd like to see the developers of numpy etc keep doing whatever it is that they're doing now.
--
cheers, Hugh Fisher
On 25/09/18 12:59, Hugh Fisher wrote:
Thank you for a very well thought out post, Hugh. I completely agree. I just wanted to pull out one comment:
Adding new syntax or semantics to a programming language very often adds accidental complexity.
This is, in my view, the main reason why the bar for adding new syntax to Python is and should be so high. People advocating new syntax often remark that programmers can choose not to use it; they don't have to write their Python using the new syntax. That is true as far as it goes. However, programmers do have to *read* Python using the new syntax, so it does impact on them. The additional accidental complexity isn't something you can just dismiss because not everyone will have to use it.
--
Rhodri James *-* Kynesim Ltd
Those arguments are rules of thumb, which may or may not apply to DbC,
plus speculation that starts from the fact that DbC isn't more popular
in order to explain why DbC isn't more popular. They are general
arguments about language features, whereas Marko has been giving
arguments for why DbC in particular is good or why it isn't more
popular. The general arguments don't address the specific arguments.
I don't use DbC, but I do use Numpy. Numpy is a very mathematical
library, with many pure functions. It has lots of similarities between
its functions and methods. I can easily see how design-by-contract can
help Numpy users read the documentation and compare functions.
Text is often less structured, so it is less likely to come out
consistent. After all, isn't that why we keep adding structure to it,
such as with Javadocs and Sphinx? Those examples add more syntax,
while Marko's proposal doesn't necessarily require more syntax.
Hi Hugh,
Software projects, in any language, never have enough time to do everything. By your own example, the Python developers of numpy, OpenCV, nltk, and sklearn; *who most certainly weren't writing contracts;* produced better quality software than the Eiffel equivalent developers who (I assume) did use DbC. Shouldn't the Eiffel developers be changing their development method, not the Python developers?
(emphasis mine)
This is *absolutely* *not true* as you can notice if you multiply any two
matrices of wrong dimensions in numpy or opencv (or use them as weights in
sklearn).
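To make the point concrete, here is a minimal pure-Python sketch of a dimension precondition with an informative violation message. It is an illustration only, not the code numpy or OpenCV actually run; the helper name `matmul` and the message wording are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])

    # Precondition: the inner dimensions must agree.
    if cols_a != rows_b:
        raise ValueError(
            f"matmul precondition violated: a is {rows_a}x{cols_a}, "
            f"b is {rows_b}x{cols_b}; need a.cols == b.rows"
        )

    result = [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]

    # Postcondition: the result has rows_a rows and cols_b columns.
    assert len(result) == rows_a and all(len(r) == cols_b for r in result)
    return result
```

Calling `matmul([[1, 2, 3]], [[1, 2, 3]])` fails immediately and names both shapes, instead of failing somewhere deeper with a less helpful message.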
For example, have a look at OpenCV functions. *Most of them include
preconditions and postconditions* (e.g.,
https://docs.opencv.org/3.4.3/dc/d8c/namespacecvflann.html#a57191110b01f200e...).
I would even go as far as to claim that OpenCV would be unusable without the
contracts. Imagine having to figure out the dimensions of the matrix
after each operation because the documentation did not state them. That would
make development sluggish as a snail.
Numpy provides contracts in text. See, e.g.,
https://www.numpy.org/devdocs/reference/generated/numpy.ndarray.transpose.ht...
which reads:
ndarray.transpose(*axes)
Returns a view of the array with axes transposed.
For a 1-D array, this has no effect. (To change between column and row
vectors, first cast the 1-D array into a matrix object.) For a 2-D array,
this is the usual matrix transpose. For an n-D array, if axes are given,
their order indicates how the axes are permuted (see Examples). If axes are
not provided and a.shape = (i[0], i[1], ... i[n-2], i[n-1]), then
a.transpose().shape = (i[n-1], i[n-2], ... i[1], i[0]).
As you can see, there are three contracts: 1) transposing a 1-D array has
no effect, 2) for a 2-D array, the result is the usual matrix transpose, and
3) for an n-D array, the order of the axes indicates the permutation.
Contract 3) is written out formally. What exactly is meant in 2) is not
stated very clearly or precisely; formalizing it (at least to a certain
extent) would remove many doubts.
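A rough sketch of how the three textual contracts could be written as executable checks. To stay runnable without numpy it models shapes as tuples and 2-D arrays as lists of rows; `transposed_shape` and `transpose_2d` are hypothetical names, not part of numpy.

```python
def transposed_shape(shape, axes=None):
    """Shape after transposing an n-D array (hypothetical model, shapes only)."""
    shape = tuple(shape)
    n = len(shape)
    axes = tuple(reversed(range(n))) if axes is None else tuple(axes)

    # Precondition: axes must be a permutation of 0..n-1.
    assert sorted(axes) == list(range(n)), (
        f"axes {axes} is not a permutation of range({n})"
    )

    result = tuple(shape[ax] for ax in axes)

    # Contract 1: transposing a 1-D array has no effect on its shape.
    if n == 1:
        assert result == shape
    # Contract 3: output axis k takes its length from input axis axes[k].
    assert all(result[k] == shape[axes[k]] for k in range(n))
    return result


def transpose_2d(a):
    """Transpose a 2-D array given as a list of rows (hypothetical helper)."""
    rows, cols = len(a), len(a[0])
    result = [[a[i][j] for i in range(rows)] for j in range(cols)]

    # Contract 2, formalized: result[j][i] == a[i][j] for all valid i, j.
    assert all(
        result[j][i] == a[i][j] for i in range(rows) for j in range(cols)
    )
    return result
```

Written this way, contract 2 stops depending on what the reader takes "the usual matrix transpose" to mean: the postcondition states it element by element.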
It is obvious to me that supplementing or replacing these contracts *in
text* with *formal* contracts (1 and 2, since 3 is already formal) is
extremely beneficial since: a) many developers use numpy, so an improvement
in documentation (such as higher precision and clarity) has a large impact
on the users, and b) enforcing the contracts automatically (be it only
during testing or also in production) prevents bugs related to contract
violations in numpy, so that the users can effectively rely on the
contracts. Argument b) is important since, as it stands, I simply trust
that these statements are true whenever I use numpy. If there is an error
in numpy, it takes a long time to track down, since numpy is the last place
I would look for an error, and it takes even longer before I suspect numpy
of not satisfying its written contracts.
Please note that contracts can be toggled on or off whenever performance
is important, so slow execution is not an argument against formal
contracts.
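One way such a toggle might look, as a sketch under stated assumptions: the decorator `require`, the `CONTRACTS` environment variable and the `enabled` parameter are all hypothetical and do not come from any particular contracts library. When contracts are disabled the function is returned unwrapped, so calls carry no extra cost.

```python
import functools
import os

# Hypothetical switch: contracts are on unless CONTRACTS=0 is set.
CONTRACTS_ENABLED = os.environ.get("CONTRACTS", "1") != "0"


def require(condition, message, enabled=None):
    """Attach a precondition to a function; return it untouched when disabled."""
    def decorator(func):
        active = CONTRACTS_ENABLED if enabled is None else enabled
        if not active:
            return func  # no wrapper at all, so no per-call overhead

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not condition(*args, **kwargs):
                raise ValueError(
                    f"{func.__name__}: precondition violated: {message}"
                )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@require(lambda x: x >= 0, "x must be non-negative", enabled=True)
def checked_sqrt(x):
    return x ** 0.5


@require(lambda x: x >= 0, "x must be non-negative", enabled=False)
def unchecked_sqrt(x):
    return x ** 0.5
```

Plain `assert` statements give a coarser version of the same switch, since Python skips them when run with `-O`.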
Cheers,
Marko
On Tue, Sep 25, 2018 at 09:59:53PM +1000, Hugh Fisher wrote:
C and Python (currently) are known as simple languages.
o_O
That's a usage of "simple" I haven't come across before. Especially in the case of C, which is a minefield of *intentionally* underspecified behaviour which makes it near to impossible for the developer to tell what a piece of syntactically legal C code will actually do in practice.
--
Steve
participants (5)
- Franklin? Lee
- Hugh Fisher
- Marko Ristin-Kaufmann
- Rhodri James
- Steven D'Aprano