[Python-ideas] Pre-conditions and post-conditions

Thu Sep 20 16:52:26 EDT 2018

Hi,
Again a brief update.

* icontract supports now static and class methods (thanks to my colleague
Adam Radomski) which came very handy when defining a group of functions as
an interface *via* an abstract (stateless) class. The implementors then
need to all satisfy the contracts without needing to re-write them. You
could implement the same behavior with *_impl or _* ("protected") methods
where public methods would add the contracts as asserts, but we find the
contracts-as-decorators more elegant (N functions instead of 2*N; see the
snippet below).

* We implemented a linter to statically check that the contract arguments
are defined correctly. It is available as a separate Pypi package
pyicontract-lint (https://github.com/Parquery/pyicontract-lint/). Next step
will be to use asteroid to infer that the return type of the condition
function is boolean. Does it make sense to include PEX in the release on
github?

* We plan to implement a sphinx plugin so that contracts can be readily
visible in the documentation. Is there any guideline or standard/preferred
approach how you would expect this plugin to be implemented? My colleagues
and I don't have any experience with sphinx plugins, so any guidance is
very welcome.

class Component(abc.ABC, icontract.DBC):
    """Initialize a single component."""

    @staticmethod
    @abc.abstractmethod
    def user() -> str:
        """
        Get the user name.

        :return: user which executes this component.
        """
        pass

    @staticmethod
    @abc.abstractmethod
    @icontract.post(lambda result: result in groups())
    def primary_group() -> str:
        """
        Get the primary group.

        :return: primary group of this component
        """
        pass

    @staticmethod
    @abc.abstractmethod
    @icontract.post(lambda result: result.issubset(groups()))
    def secondary_groups() -> Set[str]:
        """
        Get the secondary groups.

        :return: list of secondary groups
        """
        pass

    @staticmethod
    @abc.abstractmethod
    @icontract.post(lambda result: all(not pth.is_absolute() for pth in result))
    def bin_paths(config: mapried.config.Config) -> List[pathlib.Path]:
        """
        Get list of binary paths used by this component.

        :param config: of the instance
        :return: list of paths to binaries used by this component
        """
        pass

    @staticmethod
    @abc.abstractmethod
    @icontract.post(lambda result: all(not pth.is_absolute() for pth in result))
    def py_paths(config: mapried.config.Config) -> List[pathlib.Path]:
        """
        Get list of py paths used by this component.

        :param config: of the instance
        :return: list of paths to python executables used by this component
        """
        pass

    @staticmethod
    @abc.abstractmethod
    @icontract.post(lambda result: all(not pth.is_absolute() for pth in result))
    def dirs(config: mapried.config.Config) -> List[pathlib.Path]:
        """
        Get directories used by this component.

        :param config: of the instance
        :return: list of paths to directories used by this component
        """
        pass

On Sat, 15 Sep 2018 at 22:14, Marko Ristin-Kaufmann <marko.ristin at gmail.com>
wrote:

> Hi David Maertz and Michael Lee,
>
> Thank you for raising the points. Please let me respond to your comments
> in separation. Please let me know if I missed or misunderstood anything.
>
> *Assertions versus contracts.* David wrote:
>
>> I'm afraid that in reading the examples provided it is difficulties for
>> me not simply to think that EVERY SINGLE ONE of them would be FAR easier to
>> read if it were an `assert` instead.
>>
>
> I think there are two misunderstandings on the role of the contracts.
> First, they are part of the function signature, and not of the
> implementation. In contrast, the assertions are part of the implementation
> and are completely obscured in the signature. To see the contracts of a
> function or a class written as assertions, you need to visually inspect the
> implementation. The contracts are instead engraved in the signature and
> immediately visible. For example, you can test the distinction by pressing
> Ctrl + q in Pycharm.
>
> Second, assertions are only suitable for preconditions. Postconditions are
> practically unmaintainable as assertions as soon as you have multiple early
> returns in a function. The invariants implemented as assertions are always
> unmaintainable in practice (except for very, very small classes) -- you
> need to inspect each function of the class and all their return statements
> and manually add assertions for each invariant. Removing or changing
> invariants manually is totally impractical in my view.
>
> *Efficiency and Evidency. *David wrote:
>
>> The API of the library is a bit noisy, but I think the obstacle it's more
>> in the higher level design for me. Adding many layers of expensive runtime
>> checks and many lines of code in order to assure simple predicates that a
>> glance at the code or unit tests would do better seems wasteful.
>
>
> I'm not very sure what you mean by expensive runtime checks -- every
> single contract can be disabled at any point. Once a contract is disabled,
> there is literally no runtime computational cost incurred. The complexity
> of a contract during testing is also exactly the same as if you wrote it in
> the unit test. There is a constant overhead due to the extra function call
> to check the condition, but there's no more time complexity to it. The
> overhead of an additional function call is negligible in most practical
> test cases.
>
> When you say "a glance at the code", this implies to me that you referring
> to your own code and not to legacy code. In my experience, even simple
> predicates are often not obvious to see in other people's code as one might
> think (*e.g. *I had to struggle with even most simple ones like whether
> the result ends in a newline or not -- often having to actually run the
> code to check experimentally what happens with different inputs).
> Postconditions prove very useful in such situations: they let us know that
> whenever a function returns, the result must satisfy its postconditions.
> They are formal and obvious to read in the function signature, and hence
> spare us the need to parse the function's implementation or run it.
>
> Contracts in the unit tests.
>
>> The API of the library is a bit noisy, but I think the obstacle it's more
>> in the higher level design for me. Adding many layers of expensive runtime
>> checks and many lines of code in order to assure simple predicates that a
>> glance at the code or *unit tests would do better* seems wasteful.
>>
> (emphasis mine)
>
> Defining contracts in a unit test is, as I already mentioned in my
> previous message, problematic due to two reasons. First, the contract
> resides in a place far away from the function definition which might make
> it hard to find and maintain. Second, defining the contract in the unit
> test makes it impossible to put the contract in the production or test it
> in a call from a different function. In contrast, introducing the contract
> as a decorator works perfectly fine in all the three above-mentioned cases
> (smoke unit test, production, deeper testing).
>
> *Library. *Michael wrote:
>
>> I just want to point out that you don't need permission from anybody to
>> start a library. I think developing and popularizing a contracts library is
>> a reasonable goal -- but that's something you can start doing at any time
>> without waiting for consensus.
>
>
> As a matter of fact, I already implemented the library which covers most
> of the design-by-contract including the inheritance of the contracts. (The
> only missing parts are retrieval of "old" values in postconditions and loop
> invariants.) It's published on pypi as "icontract" package (the website is
> https://github.com/Parquery/icontract/). I'd like to gauge the interest
> before I/we even try to make a proposal to make it into the standard
> library.
>
> The discussions in this thread are an immense help for me to crystallize
> the points that would need to be addressed explicitly in such a proposal.
> If the proposal never comes about, it would at least flow into the
> documentation of the library and help me identify and explain better the
> important points.
>
> *Observation of contracts. *Michael wrote:
>
>> Your contracts are only checked when the function is evaluated, so you'd
>> still need to write that unit test that confirms the function actually
>> observes the contract. I don't think you necessarily get to reduce the
>> number of tests you'd need to write.
>
>
> Assuming that a contracts library is working correctly, there is no need
> to test whether a contract is observed or not -- you assume it is. The same
> applies to any testing library -- otherwise, you would have to test the
> tester, and so on *ad infinitum.*
>
> You still need to evaluate the function during testing, of course. But you
> don't need to document the contracts in your tests nor check that the
> postconditions are enforced -- you assume that they hold. For example, if
> you introduce a postcondition that the result of a function ends in a
> newline, there is no point of making a unit test, passing it some value and
> then checking that the result value ends in a newline in the test.
> Normally, it is sufficient to smoke-test the function. For example, you
> write a smoke unit test that gives a range of inputs to the function by
> using hypothesis library and let the postconditions be automatically
> checked. You can view each postcondition as an additional test case in this
> scenario -- but one that is also embedded in the function signature and
> also applicable in production.
>
> Not all tests can be written like this, of course. Dealing with a complex
> function involves writing testing logic which is too complex to fit in
> postconditions. Contracts are not a panacea, but they absolute us from
> implementing trivial testing logic while keeping the important bits of the
> documentation close to the function and allowing for deeper tests.
>
> *Accurate contracts. *Michael wrote:
>
>> There's also no guarantee that your contracts will necessarily be
>> *accurate*. It's entirely possible that your preconditions/postconditions
>> might hold for every test case you can think of, but end up failing when
>> running in production due to some edge case that you missed.
>>
>
> Unfortunately, there is no practical exit from this dilemma -- and it
> applies all the same for the tests. Who guarantees that the testing logic
> of the unit tests are correct? Unless you can formally prove that the code
> does what it should, there is no way around it. Whether you write contracts
> in the tests or in the decorators, it makes no difference to accuracy.
>
> If you missed to test an edge case, well, you missed it :). The
> design-by-contract does not make the code bug-free, but makes the bugs *much
> less likely* and *easier *to detect *early*. In practice, if there is a
> complex contract, I encapsulate its complex parts in separate functions
> (often with their own contracts), test these functions in separation and
> then, once the tests pass and I'm confident about their correctness, put
> them into contracts.
>
> (And if you decide to disable those pre/post conditions to avoid the
>> efficiency hit, you're back to square zero.)
>>
>
> In practice, we at Parquery AG let the critical contracts to run in
> production to ensure that the program blows up before it exercises
> undefined behavior in a critical situation. The informative violation
> errors of the icontract library help us to trace the bugs more easily since
> the relevant values are part of the error log.
>
> However, if some of the contracts are too inefficient to check in
> production, alas you have to turn them off and they can't be checked since
> they are inefficient. This seems like a tautology to me -- could you please
> clarify a bit what you meant? If a check is critical and inefficient at the
> same time then your problem is unsolvable (or at least ill-defined);
> contracts as well as any other approach can not solve it.
>
> *Ergonimical assertions. *Michael wrote:
>
>> Or I guess to put it another way -- it seems what all of these contract
>> libraries are doing is basically adding syntax to try and make adding
>> asserts in various places more ergonomic, and not much else. I agree those
>> kinds of libraries can be useful, but I don't think they're necessarily
>> useful enough to be part of the standard library or to be a technique
>> Python programmers should automatically use by default.
>
>
> From the point of view of the *behavior, *that is exactly the case. The
> contracts (*e.g. *as function decorators) make postconditions and
> invariants possible in practice. As I already noted above, postconditions
> are very hard and invariants almost impossible to maintain manually without
> the contracts. This is even more so when contracts are inherited in a class
> hierarchy.
>
> Please do not underestimate another aspect of the contracts, namely the
> value of contracts as verifiable documentation. Please note that the only
> alternative that I observe in practice without design-by-contract is to
> write contracts in docstrings in *natural language*. Most often, they are
> just assumed, so the next programmer burns her fingers expecting the
> contracts to hold when they actually differ from the class or function
> description, but nobody bothered to update the docstrings (which is a
> common pitfall in any code base over a longer period of time).
>
> *Automatic generation of tests.* Michael wrote:
>
>> What might be interesting is somebody wrote a library that does something
>> more then just adding asserts. For example, one idea might be to try
>> hooking up a contracts library to hypothesis (or any other library that
>> does quickcheck-style testing). That might be a good way of partially
>> addressing the problems up above -- you write out your invariants, and a
>> testing library extracts that information and uses it to automatically
>> synthesize interesting test cases.
>
>
> This is the final goal and my main motivation to push for
> design-by-contract in Python :). There is a whole research community that
> tries to come up with automatic test generations, and contracts are of
> great utility there. Mind that generating the tests based on contracts is
> not trivial: hypothesis just picks elements for each input independently
> which is a much easier problem. However, preconditions can define how the
> arguments are *related*. Assume a function takes two numbers as
> arguments, x and y. If the precondition is y < x < (y + x)  * 10, it is not
> trivial even for this simple example to come up with concrete samples of x
> and y unless you simply brute-force the problem by densely sampling all the
> numbers and checking the precondition.
>
> I see a chicken-and-egg problem here. If design-by-contract is not widely
> adopted, there will also be fewer or no libraries for automatic test
> generation. Honestly, I have absolutely no idea how you could approach
> automatic generation of test cases without contracts (in one form or the
> other). For example, how could you automatically mock a class without
> knowing its invariants?
>
> Since generating test cases for functions with non-trivial contracts is
> hard (and involves collaboration of many people), I don't expect anybody to
> start even thinking about it if the tool can only be applied to almost
> anywhere due to lack of contracts. Formal proofs and static analysis are
> even harder beasts to tame -- and I'd say the argument holds true for them
> even more.
>
> David and Michael, thank you again for your comments! I welcome very much
> your opinion and any follow-ups as well as from other participants on this
> mail list.
>
> Cheers,
> Marko
>
> On Sat, 15 Sep 2018 at 10:42, Michael Lee <michael.lee.0x2a at gmail.com>
> wrote:
>
>> I just want to point out that you don't need permission from anybody to
>> start a library. I think developing and popularizing a contracts library is
>> a reasonable goal -- but that's something you can start doing at any time
>> without waiting for consensus.
>>
>> And if it gets popular enough, maybe it'll be added to the standard
>> library in some form. That's what happened with attrs, iirc -- it got
>> fairly popular and demonstrated there was an unfilled niche, and so Python
>> acquired dataclasses..
>>
>>
>> The contracts make merely tests obsolete that test that the function or
>>> class actually observes the contracts.
>>>
>>
>> Is this actually the case? Your contracts are only checked when the
>> function is evaluated, so you'd still need to write that unit test that
>> confirms the function actually observes the contract. I don't think you
>> necessarily get to reduce the number of tests you'd need to write.
>>
>>
>> Please let me know what points *do not *convince you that Python needs
>>> contracts
>>>
>>
>> While I agree that contracts are a useful tool, I don't think they're
>> going to be necessarily useful for *all* Python programmers. For example,
>> contracts aren't particularly useful if you're writing fairly
>> straightforward code with relatively simple invariants.
>>
>> I'm also not convinced that libraries where contracts are checked
>> specifically *at runtime* actually give you that much added power and
>> impact. For example, you still need to write a decent number of unit tests
>> to make sure your contracts are being upheld (unless you plan on checking
>> this by just deploying your code and letting it run, which seems
>> suboptimal). There's also no guarantee that your contracts will necessarily
>> be *accurate*. It's entirely possible that your
>> preconditions/postconditions might hold for every test case you can think
>> of, but end up failing when running in production due to some edge case
>> that you missed. (And if you decide to disable those pre/post conditions to
>> avoid the efficiency hit, you're back to square zero.)
>>
>> Or I guess to put it another way -- it seems what all of these contract
>> libraries are doing is basically adding syntax to try and make adding
>> asserts in various places more ergonomic, and not much else. I agree those
>> kinds of libraries can be useful, but I don't think they're necessarily
>> useful enough to be part of the standard library or to be a technique
>> Python programmers should automatically use by default.
>>
>> What might be interesting is somebody wrote a library that does something
>> more then just adding asserts. For example, one idea might be to try
>> hooking up a contracts library to hypothesis (or any other library that
>> does quickcheck-style testing). That might be a good way of partially
>> addressing the problems up above -- you write out your invariants, and a
>> testing library extracts that information and uses it to automatically
>> synthesize interesting test cases.
>>
>> (And of course, what would be very cool is if the contracts could be
>> verified statically like you can do in languages like dafny -- that way,
>> you genuinely would be able to avoid writing many kinds of tests and could
>> have confidence your contracts are upheld. But I understanding implementing
>> such verifiers are extremely challenging and would probably have too-steep
>> of a learning curve to be usable by most people anyways.)
>>
>> -- Michael
>>
>>
>>
>> On Fri, Sep 14, 2018 at 11:51 PM, Marko Ristin-Kaufmann <
>> marko.ristin at gmail.com> wrote:
>>
>>> Hi,
>>> Let me make a couple of practical examples from the work-in-progress (
>>> https://github.com/Parquery/pypackagery, branch
>>> mristin/initial-version) to illustrate again the usefulness of the
>>> contracts and why they are, in my opinion, superior to assertions and unit
>>> tests.
>>>
>>> What follows is a list of function signatures decorated with contracts
>>> from pypackagery library preceded by a human-readable description of the
>>> contracts.
>>>
>>> The invariants tell us what format to expect from the related string
>>> properties.
>>>
>>> @icontract.inv(lambda self: self.name.strip() == self.name)
>>> @icontract.inv(lambda self: self.line.endswith("\n"))
>>> class Requirement:
>>>     """Represent a requirement in requirements.txt."""
>>>
>>>     def __init__(self, name: str, line: str) -> None:
>>>         """
>>>         Initialize.
>>>
>>>         :param name: package name
>>>         :param line: line in the requirements.txt file
>>>         """
>>>         ...
>>>
>>> The postcondition tells us that the resulting map keys the values on
>>> their name property.
>>>
>>> @icontract.post(lambda result: all(val.name == key for key, val in result.items()))
>>> def parse_requirements(text: str, filename: str = '<unknown>') -> Mapping[str, Requirement]:
>>>     """
>>>     Parse requirements file and return package name -> package requirement as in requirements.txt
>>>
>>>     :param text: content of the ``requirements.txt``
>>>     :param filename: where we got the ``requirements.txt`` from (URL or path)
>>>     :return: name of the requirement (*i.e.* pip package) -> parsed requirement
>>>     """
>>>     ...
>>>
>>>
>>> The postcondition ensures that the resulting list contains only unique
>>> elements. Mind that if you returned a set, the order would have been lost.
>>>
>>> @icontract.post(lambda result: len(result) == len(set(result)), enabled=icontract.SLOW)
>>> def missing_requirements(module_to_requirement: Mapping[str, str],
>>>                          requirements: Mapping[str, Requirement]) -> List[str]:
>>>     """
>>>     List requirements from module_to_requirement missing in the ``requirements``.
>>>
>>>     :param module_to_requirement: parsed ``module_to_requiremnt.tsv``
>>>     :param requirements: parsed ``requirements.txt``
>>>     :return: list of requirement names
>>>     """
>>>     ...
>>>
>>> Here is a bit more complex example.
>>> - The precondition A requires that all the supplied relative paths
>>> (rel_paths) are indeed relative (as opposed to absolute).
>>> - The postcondition B ensures that the initial set of paths (given in
>>> rel_paths) is included in the results.
>>> - The postcondition C ensures that the requirements in the results are
>>> the subset of the given requirements.
>>> - The precondition D requires that there are no missing requirements (*i.e.
>>> *that each requirement in the given module_to_requirement is also
>>> defined in the given requirements).
>>>
>>> @icontract.pre(lambda rel_paths: all(rel_pth.root == "" for rel_pth in rel_paths))  # A
>>> @icontract.post(
>>>     lambda rel_paths, result: all(pth in result.rel_paths for pth in rel_paths),
>>>     enabled=icontract.SLOW,
>>>     description="Initial relative paths included")  # B
>>> @icontract.post(
>>>     lambda requirements, result: all(req.name in requirements for req in result.requirements),
>>>     enabled=icontract.SLOW)  # C
>>> @icontract.pre(
>>>     lambda requirements, module_to_requirement: missing_requirements(module_to_requirement, requirements) == [],
>>>     enabled=icontract.SLOW)  # D
>>> def collect_dependency_graph(root_dir: pathlib.Path, rel_paths: List[pathlib.Path],
>>>                              requirements: Mapping[str, Requirement],
>>>                              module_to_requirement: Mapping[str, str]) -> Package:
>>>
>>>     """
>>>     Collect the dependency graph of the initial set of python files from the code base.
>>>
>>>     :param root_dir: root directory of the codebase such as "/home/marko/workspace/pqry/production/src/py"
>>>     :param rel_paths: initial set of python files that we want to package. These paths are relative to root_dir.
>>>     :param requirements: requirements of the whole code base, mapped by package name
>>>     :param module_to_requirement: module to requirement correspondence of the whole code base
>>>     :return: resolved depedendency graph including the given initial relative paths,
>>>     """
>>>
>>> I hope these examples convince you (at least a little bit :-)) that
>>> contracts are easier and clearer to write than asserts. As noted before in
>>> this thread, you can have the same *behavior* with asserts as long as
>>> you don't need to inherit the contracts. But the contract decorators make
>>> it very explicit what conditions should hold *without* having to look
>>> into the implementation. Moreover, it is very hard to ensure the
>>> postconditions with asserts as soon as you have a complex control flow since
>>> you would need to duplicate the assert at every return statement. (You
>>> could implement a context manager that ensures the postconditions, but a
>>> context manager is not more readable than decorators and you have to
>>> duplicate them as documentation in the docstring).
>>>
>>> In my view, contracts are also superior to many kinds of tests. As the
>>> contracts are *always* enforced, they also enforce the correctness
>>> throughout the program execution whereas the unit tests and doctests only
>>> cover a list of selected cases. Furthermore, writing the contracts in these
>>> examples as doctests or unit tests would escape the attention of most less
>>> experienced programmers which are not  used to read unit tests as
>>> documentation. Finally, these unit tests would be much harder to read than
>>> the decorators (*e.g.*, the unit test would supply invalid arguments
>>> and then check for ValueError which is already a much more convoluted piece
>>> of code than the preconditions and postconditions as decorators. Such
>>> testing code also lives in a file separate from the original implementation
>>> making it much harder to locate and maintain).
>>>
>>> Mind that the contracts *do not* *replace* the unit tests or the
>>> doctests. The contracts make merely tests obsolete that test that the
>>> function or class actually observes the contracts. Design-by-contract helps
>>> you skip those tests and focus on the more complex ones that test the
>>> behavior. Another positive effect of the contracts is that they make your
>>> tests deeper: if you specified the contracts throughout the code base, a
>>> test of a function that calls other functions in its implementation will
>>> also make sure that all the contracts of that other functions hold. This
>>> can be difficult to implement  with standard unit test frameworks.
>>>
>>> Another aspect of the design-by-contract, which is IMO ignored quite
>>> often, is the educational one. Contracts force the programmer to actually
>>> sit down and think *formally* about the inputs and the outputs
>>> (hopefully?) *before* she starts to implement a function. Since many
>>> schools use Python to teach programming (especially at high school level),
>>> I imagine writing contracts of a function to be a very good exercise in
>>> formal thinking for the students.
>>>
>>> Please let me know what points *do not *convince you that Python needs
>>> contracts (in whatever form -- be it as a standard library, be it as a
>>> language construct, be it as a widely adopted and collectively maintained
>>> third-party library). I would be very glad to address these points in my
>>> next message(s).
>>>
>>> Cheers,
>>> Marko
>>>
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> https://mail.python.org/mailman/listinfo/python-ideas
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20180920/ad60cacd/attachment-0001.html>