[Python-ideas] Repurpose `assert' into a general-purpose check

Tue Nov 28 11:34:02 EST 2017

On Tue, Nov 28, 2017 at 10:11:46AM +0300, Ivan Pozdeev via Python-ideas wrote:

> I invite you to show me a single use case for those "assertions" because 
> after ~20 years of experience in coding (that included fairly large 
> projects), I've yet to see one.

I already mentioned not one but multiple use-cases for assertions:

- checked comments (assertions are documentation, not just code)
- checks of internal algorithm logic
- design-by-contract style pre- and post-conditions
- checking program invariants
- defensive programming against conditions that "cannot happen".

John Regehr wrote an excellent post on this from the perspective of a 
systems programmer:

https://blog.regehr.org/archives/1091

(he even links to a post by our own Ned Batchelder) but many of his 
use-cases for assert applies just as well to Python as C. Assertions 
also work great with fuzzers:

http://www.squarefree.com/2014/02/03/fuzzers-love-assertions/

I've also written on assertions before:

https://import-that.dreamwidth.org/676.html

Assertions can also be used as a kind of "Continuous Testing" that 
operates in debug builds (or in Python in __debug__ mode): every time 
you run the code, it tests its own internal state by running the 
asserts. As Regehr puts it:

    "When we write an assertion, we are teaching a program to 
    diagnose bugs in itself."

If the code you have worked with needed none of these things, assertions 
might not be useful to you. But that doesn't mean there are no use-cases 
for assertions. Not everyone is lucky to work with code that is so 
self-documenting that checked comments are redundant, or code so 
obviously correct that there is no point to checking its internal state.

In your opening post, you gave a use-case for assertions:

    "a check to only do in debug mode, when you can't yet trust your
    code to manage and pass around internal data correctly."

Now you say there are none. Would you like to revise one of those 
statements?

> Any, every check that you make at debug time either
> * belongs in production as well (all the more because it's harder to 
> diagnose there), or

I'm not going to say that claim is entirely wrong. For example, 
Microsoft's research project Midori used two kinds of assertions:

  Debug.Assert only runs under debugging;

  Release.Assert always runs

and it also promised that all contracts will either be checked at 
runtime, or the compiler can prove that the contract is always satisfied 
and so can skip the check. Otherwise they cannot be disabled.

http://joeduffyblog.com/2016/02/07/the-error-model/

But Midori was written in a custom systems language based on C#, and the 
lessons from it probably don't apply directly to a rapid application 
development language / scripting language like Python. In any case, 
Midori's approach was rather unusual even compared to other systems 
languages.

More conventionally, Eiffel (for example) allows the developer to enable 
or disable contract checking. By default, pre-conditions are checked in 
release builds and post-conditions and invariants are skipped, but each 
one can be enabled or disabled individually. We have the choice to 
choose faster code or more extensive error checking, depending on which 
we value more.

I think that's an excellent tradeoff to have. Python currently only has 
a very coarse switch that can only turn assertions on or off, but even 
that coarse switch is better than nothing.

> * belongs in a test -- something coded independently from the program 
> (if your code as a whole cannot be trusted, how any specific part of it 
> can?), or

I love tests. I write unit tests and doc tests all the time. (Well, 
except when I'm being lazy, when I only write doc tests.) But tests 
cannot replace assertions. There are at least two problems:

- external tests don't have access to the function locals;

- and even if you can write a test for something, its in the wrong place 
to be useful as a replacement of an assertion.

John Regehr discusses this exact issue (see link above) and writes that 
"there is a strong synergy between assertions and unit tests". If you do 
them right, they complement each other. And I would argue that 
assertions are a kind of test integrated in the code.

Neither assertions nor unit tests can find all bugs. The wise programmer 
uses both.

In an earlier post, I gave an actual assertion from one of my functions. 
Here it is again:

    assert 0 <= r < abs(y)

Its not obvious from that line in isolation, but both r and y are local 
variables of a function that calculates a mathematical result (hence the 
short and undescriptive names). How can I possibly write a test to check 
this? They are *local* to the function, so I have no access to them from 
outside of the test! 

(Actually, in this *specific case*, r is part of the return value and 
could be extracted by a test function. But y isn't.)

In general, assert is great for tests that occur inside the body of a 
function, checking assertions about the function internals. You cannot 
replace that with an external test. Here's an assertion from another 
function:

    c = collections.Counter(symbols)
    assert c

The Counter c is an internal implementation detail of this one function. 
It would be silly to expose it (how?) so I can write a test to check it. 

Even if I could write such a test, that would be the wrong place for the 
check. I want the check there inside the function, not in a completely 
different .py file. The assertion is as much for me, the reader of the 
code, as for the interpreter. After reading that assertion, I can read 
the rest of the function knowing that it is safe to assume that c will 
always have at least one key. It is a checked comment: `assert c` is 
better than:

    # c will not be empty

because the assertion is checked at runtime unless I disable assert 
checking, while comments are "lies in code" that rapidly become 
obsolete or inaccurate.

> * isn't needed at all because a fault will inevitably surface somewhere 
> down the line (as some exception or an incorrect result that a test will 
> catch).

You don't know that an error will be raised. The program may simply do 
the wrong thing and silently return garbage that you have no way of 
knowing is garbage.

Nor do you know that a test will catch the problem. Most real world code 
does not have even close to 100% test coverage -- which is why people 
are always discovering new bugs.

But even if there is an obvious failure later on, or a failing test, it 
is better to catch errors sooner rather than later. The longer it takes 
to discover the error, the harder it is to debug. Assertions can reduce 
the distance between where the bug occurs and where you notice it.

> Finally, I've got much experience using existing code outside its 
> original use cases, where the original author's assumptions may no 
> longer hold but the specific logic can be gauded to produce the desired 
> result. Coding these assumptions in would undermine that goal.

Of course assertions can be misused. The same applies to code that 
defeats duck-typing with excessive isinstance() checks, or people who 
make everything private and all classes final (in languages that support 
that). John Regehr also discusses some poor ways to misuse assertions. 
This is not a good argument for changing assert.

> So, I see "debug assertions" as either intentionally compromizing 
> correctness for performance (a direct opposite of Python's design 
> principles),

You have that backwards: assertions compromise performance for 
correctness.

I'm very aware that every time I write an assert, that's a runtime check 
that compromises performance. But I do so when I believe that the gain 
in correctness, or the advantage in finding bugs closer to their origin, 
or even their value as documentation, outweighs the performance cost.

> or as an inferiour, faulty, half-measure rudiment from 
> times when CI wasn't a thing (thus not something that should be taught 
> and promoted as a best practice any longer).

If you think that Continuous Integration is a suitable replacement for 
assertions, then I think you have misunderstood either CI or assertions 
or both. That's like saying that now that we have CI, we don't need to 
check the return code on C functions, or catch exceptions. CI and 
assertions are complementary, not in opposition.

CI is a development practice to ensure that the master is always in a 
working state. That's great. But how do you know the master is working? 
You need *tests*, and assertions complement tests. Unit tests are not a 
replacement for assertions.

You wouldn't say "unit tests are obsolete now that we have integration 
tests", or "we don't need fuzzers, we have regression tests". Why would 
you say that you don't need assertions just because you are using CI?

-- 
Steve