[Python-ideas] Repurpose `assert' into a general-purpose check

Steven D'Aprano steve at pearwood.info
Tue Jan 16 20:42:47 EST 2018


On Tue, Jan 16, 2018 at 07:37:29AM -0800, smarie wrote:

[...]
> > The problem with a statement called "validate" is that it will break a 
> > huge number of programs that already include functions and methods using 
> > that name.
> >
> 
> You definitely make a point here. But that would be the case for absolutely 
> *any* language evolution as soon as the proposed statements are plain old 
> english words. Should it be a show-stopper ? I dont think so.

It is not a show-stopper, but it is a very, very large barrier to 
adding new keywords. If there is a solution to the problem that doesn't 
require a new keyword, that is almost always preferred over breaking 
people's code when they upgrade.


> > But apart from the use of a keyword, we already have a way to do almost 
> > exactly what you want:
> >
> >     if not expression: raise ValidationError(message)
> >
> > after defining some appropriate ValidationError class. And it is only a 
> > few key presses longer than the proposed:
> >
> >     validate expression, ValidationError, message
> >
> 
> This is precisely what is not good in my opinion: here you do not separate 
> <validation means> from <validation intent>. Of course if <validation 
> means> is just a "x > 0" statement, it works, but now what if you rely on a 
> 3d-party provided validation function (or even yours) such as e.g. 
> "is_foo_compliant" ? 

There's no need to invent a third-party validation function. It might be 
my own validation function, or it might be a simple statement like:

    if variable is None: ...

which can fail with NameError if "variable" is not defined. Or "x > 0" 
can fail if x is not a number.
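To make that concrete, here is a minimal sketch showing that in Python 3 
an ordering comparison between a string and a number raises rather than 
quietly evaluating to True or False:

    # In Python 3, ordering comparisons between str and int raise
    # TypeError instead of returning an arbitrary result as Python 2 did.
    x = "hello"
    try:
        outcome = x > 0              # the check neither passes nor fails...
    except TypeError as exc:
        outcome = f"raised: {exc}"   # ...it raises

    print(outcome)
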

Regardless of what the validation check does, there are two ways it can 
not pass:

- the check can fail;

- or the check can raise an exception.

The second generally means that the check code itself is buggy or 
incomplete, which is why unittest reports these categories separately. 
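A minimal illustration of how unittest keeps those two categories apart 
(the test case and its method names here are invented for the example):

    import unittest

    class TestChecks(unittest.TestCase):
        def test_check_fails(self):
            # The check runs to completion and is simply false:
            # unittest counts this as a *failure*.
            self.assertTrue(-1 >= 0)

        def test_check_errors(self):
            # The check blows up before producing True/False:
            # unittest counts this as an *error*, not a failure.
            self.assertTrue("x" >= 0)

    result = unittest.TestResult()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestChecks)
    suite.run(result)
    print(len(result.failures), len(result.errors))  # 1 1
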


That is a good thing, not a problem to be fixed. For example:


    # if x < 0: raise ValueError('x must not be negative')
    validate x >= 0, ValueError, 'x must not be negative'


Somehow my code passes a string as x. Wouldn't you, the developer, want 
to know that there is a code path that somehow results in x being a 
string? I know I would. Maybe that will become obvious later on, but it 
is best to catch errors as close to their source as we can.

With your proposed validate keyword, the interpreter lies to me: it says 
that the check x >= 0 *fails* rather than raises, which implies that x 
is a negative number. Now I waste my time trying to debug how x could 
possibly be a negative number when the symptom is actually very 
different (x is a string).

Hiding the exception is normally a bad thing, but if I really want to do 
that, I can write a helper function:

def is_larger_or_equal(x, y):
    try:
        return x >= y
    except:
        return False
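Used on the string case from above, the helper quietly turns the 
TypeError into an ordinary failed check (definition repeated here so the 
sketch stands alone):

    def is_larger_or_equal(x, y):
        try:
            return x >= y
        except:
            return False

    print(is_larger_or_equal(5, 0))        # True
    print(is_larger_or_equal("hello", 0))  # False: the TypeError is swallowed
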


If I find myself writing lots of such helper functions, that's probably 
a hint that I am hiding too much information. Bare excepts have been 
called the most diabolic Python anti-pattern:

https://realpython.com/blog/python/the-most-diabolical-python-antipattern/

so hiding exceptions *by default* (as your proposed validate statement 
would do) is probably not a great idea.

The bottom line is, if my check raises an exception instead of passing 
or failing, I want to know that it raised. I don't want the error to 
be hidden as a failed check.


> if not is_foo_compliant(x): raise ValidationError(message)
> 
> What if this third part method raises an exception instead of returning 
> False in some cases ? 

Great! I would hope it did raise an exception if it were passed 
something that it wasn't expecting and can't deal with.

There may be some cases where I want a validation function to ignore all 
errors, but if so, I will handle them individually with a wrapper 
function, which lets me decide how to handle individual errors:

def my_foo_compliant(x):
    try:
        return is_foo_compliant(x)
    except (SpamError, EggsError):
        return True
    except CheeseError:
        return False
    except:
        raise


But I can count the number of times I've done that in practice on the 
fingers of one hand.

[...]
> The goal is really to let developers express their applicative intent 
> (what should be checked and what is the outcome if anything goes wrong), 
> and give them confidence that the statement will always fail the same way,
> whatever the failure modes /behaviour of the checkers used in the 
> statement.

I don't agree that this is a useful goal for the Python interpreter to 
support as a keyword or built-in function.

If you want to create your own library to do this, I wish you good luck, 
but I would not use it and I honestly think that it is a trap: something 
that seems to be convenient and useful but actually makes maintaining 
code harder by hiding unexpected, unhandled cases as if they were 
expected failures.



-- 
Steve