[Python-ideas] "old" values in postconditions
Marko Ristin-Kaufmann
marko.ristin at gmail.com
Sun Sep 30 01:28:25 EDT 2018
Hi James,
I copy/pasted the discussion re the readability tool to an issue on github:
https://github.com/Parquery/icontract/issues/48
Would you mind opening a separate issue and copy/pasting whatever you find
relevant about the MockP approach into it?
I think it's time to fork the issues and have separate threads with code
highlighting etc. Is that OK with you?
Cheers,
Marko
On Sat, 29 Sep 2018 at 21:22, Marko Ristin-Kaufmann <marko.ristin at gmail.com>
wrote:
> Hi James,
> I reread the proposal with MockP. I still don't get the details, but I
> think I understand the basic idea. You put a placeholder and whenever one
> of its methods is called (including dunders), you record it and finally
> assemble an AST and compile a lambda function to be executed later, at the
> actual call.
>
> But that would still fail if you want to have:
> @snapshot(var1=some_func(MockP.arg1, MockP.arg2))
> right? Or is there a way to record that?
>
> Cheers,
> Marko
>
> On Sat, 29 Sep 2018 at 00:35, James Lu <jamtlu at gmail.com> wrote:
>
>> I am fine with your proposed syntax. It’s certainly lucid. Perhaps it
>> would be a good idea to get people accustomed to “non-magic” syntax.
>>
>> > I still have a feeling that most developers would like to store the state
>> > in many different custom ways.
>>
>> Please explain. (Expressions like thunk(all)(a == b for a, b in
>> P.arg.meth()) would be valid.)
>>
>> > I'm thinking mostly about all the edge cases which we would not be able
>> > to cover (and how complex that would be to cover them).
>>
>> Except for a > b > c being one flat expression with 5 members, it seems
>> fairly easy to recreate an AST, which can then be compiled down to a code
>> object. The code object can be run with a custom “locals()”.
>>
>> Below is my concept code for such a P object.
>>
>> import icontract
>> from ast import *
>>
>> # not done: enforce Singleton property on EmptySymbolType
>> class EmptySymbolType(object): ...
>>
>> EmptySymbol = EmptySymbolType()  # empty symbols are placeholders
>>
>> class MockP(object):
>>     # "^" is xor: exactly one of symbol and astnode must be given
>>     @icontract.pre(lambda symbol, astnode: (symbol is None) ^ (astnode is None))
>>     def __init__(self, symbol=None, value=EmptySymbol, astnode=None,
>>                  initsymtable=()):
>>         self.symtable = dict(initsymtable)
>>         if symbol is not None:
>>             # keep the bare expression node so it can be nested in a BinOp
>>             self.expr = Name(id=symbol, ctx=Load())
>>             self.symtable = {symbol: value}
>>         else:
>>             self.expr = astnode
>>         self.frozen = False
>>
>>     def __add__(self, other):
>>         wrapped = MockP.wrap_value(other)
>>         return MockP(astnode=BinOp(self.expr, Add(), wrapped.expr),
>>                      initsymtable={**self.symtable, **wrapped.symtable})
>>
>>     def compile(self): ...
>>
>>     def freeze(self):
>>         # frozen objects wouldn’t have an overridden __getattr__, allowing
>>         # icontract to manipulate the MockP object using its public interface
>>         self.frozen = True
>>
>>     @classmethod
>>     def wrap_value(cls, obj):
>>         # create a MockP object from a value. Generate a random identifier
>>         # and set that as the key in symtable; the AST node is the name of
>>         # that identifier, retrieving its value through simple expression
>>         # evaluation.
>>         ...
>>
>> thunk = MockP.wrap_value
>> P = MockP('P')
>>
>> # elsewhere: ensure P is only accessed via valid “dot attribute access”
>> # inside @snapshot so contracts fail early, or don’t and allow magic like
>> # __dict__ to occur on P.
>>
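To make the record-and-replay idea concrete, here is a minimal, self-contained sketch. It is not icontract's API and it differs from the concept code above in its details: the class name `Recorder`, its `symbol`, `wrap` and `evaluate` methods, and the restriction to attribute access, `+`, `*` and calls are all assumptions made for illustration.

```python
import ast
import itertools

_counter = itertools.count()  # source of fresh names for wrapped values


class Recorder:
    """Hypothetical placeholder that records operations as an AST."""

    def __init__(self, node, symtable=None):
        self._node = node                      # expression recorded so far
        self._symtable = dict(symtable or {})  # names -> wrapped values

    @classmethod
    def symbol(cls, name):
        """Create a placeholder whose value is supplied at evaluation time."""
        return cls(ast.Name(id=name, ctx=ast.Load()))

    @classmethod
    def wrap(cls, value):
        """Wrap a concrete value (the role "thunk" plays in the thread)."""
        if isinstance(value, cls):
            return value
        name = "_const{}".format(next(_counter))
        return cls(ast.Name(id=name, ctx=ast.Load()), {name: value})

    def _binop(self, op, other):
        other = Recorder.wrap(other)
        node = ast.BinOp(left=self._node, op=op, right=other._node)
        return Recorder(node, {**self._symtable, **other._symtable})

    def __add__(self, other):
        return self._binop(ast.Add(), other)

    def __mul__(self, other):
        return self._binop(ast.Mult(), other)

    def __getattr__(self, name):
        # Only reached for attributes that are not found the normal way.
        if name.startswith("_"):
            raise AttributeError(name)
        node = ast.Attribute(value=self._node, attr=name, ctx=ast.Load())
        return Recorder(node, self._symtable)

    def __call__(self, *args):
        wrapped = [Recorder.wrap(arg) for arg in args]
        symtable = dict(self._symtable)
        for item in wrapped:
            symtable.update(item._symtable)
        node = ast.Call(func=self._node,
                        args=[item._node for item in wrapped], keywords=[])
        return Recorder(node, symtable)

    def evaluate(self, **bindings):
        """Compile the recorded expression and run it with the bindings."""
        tree = ast.fix_missing_locations(ast.Expression(body=self._node))
        code = compile(tree, "<recorded>", "eval")
        return eval(code, {}, {**self._symtable, **bindings})


class Box:
    def __init__(self, items):
        self.items = items


P = Recorder.symbol("P")
expression = Recorder.wrap(len)(P.items) + 1  # nothing is evaluated yet
print(expression.evaluate(P=Box([1, 2, 3])))  # prints 4
```

In this sketch a plain function called directly on placeholders, as in `some_func(P.arg1, P.arg2)`, runs immediately on the `Recorder` objects instead of being recorded; only a wrapped function, as in `Recorder.wrap(some_func)(P.arg1, P.arg2)`, ends up in the AST. Note also that `P.evaluate` or `P.wrap` cannot be recorded as attribute accesses here, which is the kind of edge case the "frozen" flag in the concept code is meant to address.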
>> On Sep 27, 2018, at 9:49 PM, Marko Ristin-Kaufmann <
>> marko.ristin at gmail.com> wrote:
>>
>> Hi James,
>>
>> I still have a feeling that most developers would like to store the state
>> in many different custom ways. I also see thunk and snapshot with wrapper
>> objects as much more complicated to implement and maintain; I'm thinking
>> mostly about all the edge cases which we would not be able to cover (and
>> how complex that would be to cover them). Then the linters also need to
>> work around such wrappers... It might also scare users off since it looks
>> like too much magic. Another concern I have is that it's probably very
>> hard to integrate these wrappers with mypy later, but I don't really have
>> a clue about that; it is only my gut feeling.
>>
>> What if we accepted repeating the "lambda P, " prefix, and had something
>> like this:
>>
>> @snapshot(
>>     lambda P, some_name: len(P.some_property),
>>     lambda P, another_name: hash(P.another_property)
>> )
>>
>> It's not too verbose for me and you can still explain in three or four
>> sentences what happens under the hood in the library's docs.
>> PyCharm/PyDev/Vim/Emacs plugins could hide the verbose parts.
>>
>> I performed a small experiment to test how this solution plays with
>> pylint and it seems OK that arguments are not used in lambdas.
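As a rough illustration of how a library could read such lambdas, here is a small sketch. The helper name `take_snapshots` and the use of `SimpleNamespace` for `P` are assumptions, not part of icontract; the point is only that the second lambda parameter is never given a meaningful value and merely contributes its name as the snapshot identifier.

```python
import inspect
from types import SimpleNamespace


def take_snapshots(captures, arguments):
    """Evaluate each ``lambda P, <identifier>: ...`` before the call.

    Hypothetical helper: only the *name* of the second parameter is used.
    """
    P = SimpleNamespace(**arguments)  # arguments exposed as attributes
    old = {}
    for capture in captures:
        # Iterating the parameters mapping yields the two parameter names.
        _, identifier = inspect.signature(capture).parameters
        old[identifier] = capture(P, None)
    return SimpleNamespace(**old)


items = [3, 1, 2]
old = take_snapshots(
    [
        lambda P, old_len: len(P.items),
        lambda P, old_first: P.items[0],
    ],
    {"items": items},
)
items.append(4)  # stands in for the decorated function mutating its argument
print(old.old_len, len(items))  # prints: 3 4
```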
>>
>> Cheers,
>> Marko
>>
>>
>> On Thu, 27 Sep 2018 at 12:27, James Lu <jamtlu at gmail.com> wrote:
>>
>>> Why couldn’t we record the operations done to a special object and
>>> replay them?
>>>
>>>> Actually, I think there is probably no way around a decorator that
>>>> captures/snapshots the data before the function call with a lambda (or even
>>>> a separate function). The "old" construct, if we are to parse it somehow from
>>>> the condition function, would limit us only to shallow copies (and be
>>>> complex to implement as soon as we are capturing out-of-argument values
>>>> such as globals etc.). Moreover, what if we don't need shallow
>>>> copies? I could imagine a dozen cases where a shallow copy is not what the
>>>> programmer wants: for example, s/he might need to make deep copies, hash or
>>>> otherwise transform the input data to hold only part of it instead of
>>>> copying (e.g., so as to allow an equality check without a double copy
>>>> of the data, or capture only the value of a certain property transformed in
>>>> some way).
>>>>
>>>>
>>> from icontract import snapshot, P, thunk
>>> @snapshot(some_identifier=P.self.some_method(P.some_argument.some_attr))
>>>
>>> P is an object of our own type, let’s call the type MockP. MockP returns
>>> new MockP objects when any operation is done to it. MockP * MockP = MockP.
>>> MockP.attr = MockP. MockP objects remember all the operations done to them,
>>> and allow the owner of a MockP object to re-apply the same operations.
>>>
>>> “thunk” converts a function or object or class to a MockP object,
>>> storing the function or object for when the operation is done.
>>>
>>> thunk(function)(<MockP expression>)
>>>
>>> Of course, you could also thunk objects like so: thunk(3) * P.number.
>>> (Though it might be better to keep the 3 after P.number in this case so
>>> P.number’s __mul__ would be invoked before 3’s __mul__ is invoked.)
>>>
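The operand-order remark can be illustrated with a small sketch. The class `Operand` and the helper `_text` are hypothetical and only record the multiplication as text; the sketch shows that a placeholder on the left is handled by `__mul__`, while a plain number on the left falls back to the placeholder's `__rmul__`.

```python
class Operand:
    """Hypothetical placeholder that records multiplications as text."""

    def __init__(self, text):
        self.text = text

    def __mul__(self, other):   # handles  Operand * other
        return Operand("({} * {})".format(self.text, _text(other)))

    def __rmul__(self, other):  # handles  other * Operand
        return Operand("({} * {})".format(_text(other), self.text))


def _text(value):
    return value.text if isinstance(value, Operand) else repr(value)


number = Operand("P.number")
print((number * 3).text)  # prints (P.number * 3)
print((3 * number).text)  # prints (3 * P.number)
```

So a recording object that defines the reflected methods can capture both spellings without wrapping the 3 first.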
>>>
>>> In most cases, you’d save any operations that can be done on a copy of
>>> the data as generated by @snapshot in @postcondition. thunk is for rare
>>> scenarios where 1) it’s hard to capture the state, for example an object
>>> that manages network state (or database connectivity etc.) and whose state
>>> can only be read by an external classmethod, or 2) you want to avoid using
>>> copy.deepcopy.
>>>
>>> I’m sure there’s some way to override isinstance through a metaclass or
>>> the __subclasshook__ dunder.
>>>
>>> I suppose this mocking method could be a shorthand for when you don’t
>>> need the full power of a lambda. It’s arguably more succinct and readable,
>>> though YMMV.
>>>
>>> I look forward to reading your opinion on this and any ideas you might
>>> have.
>>>
>>> On Sep 26, 2018, at 3:56 PM, James Lu <jamtlu at gmail.com> wrote:
>>>
>>> Hi Marko,
>>>
>>> Actually, following on #A4, you could also write those as multiple
>>> decorators:
>>> @snapshot(lambda _, some_identifier:
>>> some_func(_.some_argument.some_attr))
>>> @snapshot(lambda _, other_identifier: other_func(_.self))
>>>
>>> Yes, though if we’re talking syntax using kwargs would probably be
>>> better.
>>> Using “P” instead of “_”: (I agree that _ smells of ignored arguments)
>>>
>>> @snapshot(some_identifier=lambda P: ..., some_identifier2=lambda P: ...)
>>>
>>> Kwargs has the advantage that you can extend multiple lines without
>>> repeating @snapshot, though many lines of @capture would probably be more
>>> intuitive since each decorator captures one variable.
>>>
>>> Why uppercase "P" and not lowercase (uppercase implies a constant for
>>> me)?
>>>
>>> To me, the capital letters are more prominent and explicit, and easier to
>>> see when reading code. It also implies it's a constant for you: you
>>> shouldn’t be modifying it, because then you’d be interfering with the
>>> function itself.
>>>
>>> Side note: maybe it would be good to have an @icontract.nomutate
>>> (probably use a different name, maybe @icontract.readonly) that makes sure
>>> a method doesn’t mutate its own __dict__ (and maybe the __dict__ of the
>>> members of its __dict__). It wouldn’t be necessary to put the decorator on
>>> every read-only function, just the ones you’re worried might mutate.
>>>
>>> Maybe a @icontract.nomutate(param=“paramname”) that ensures the __dict__
>>> of all members of the param name have the same equality or identity before
>>> and after. The semantics would need to be worked out.
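One possible reading of the "nomutate" idea is sketched below. The decorator is hypothetical and does not exist in icontract; it assumes that "no mutation" means the instance's `__dict__` compares equal to a deep copy taken before the call, which is only one of the semantics that could be chosen.

```python
import copy
import functools


def nomutate(method):
    """Hypothetical check that a method leaves ``self.__dict__`` equal."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        before = copy.deepcopy(vars(self))  # snapshot of the instance state
        result = method(self, *args, **kwargs)
        if vars(self) != before:
            raise AssertionError(
                "{} mutated its instance".format(method.__name__))
        return result
    return wrapper


class Counter:
    def __init__(self):
        self.count = 0

    @nomutate
    def peek(self):
        return self.count

    @nomutate
    def sneaky_peek(self):
        self.count += 1  # violates the read-only promise
        return self.count
```

Because the comparison relies on `==`, members that do not define `__eq__` would compare unequal to their deep copies and trigger false alarms, which is part of the semantics that would need to be worked out.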
>>>
>>> On Sep 26, 2018, at 8:58 AM, Marko Ristin-Kaufmann <
>>> marko.ristin at gmail.com> wrote:
>>>
>>> Hi James,
>>>
>>> Actually, following on #A4, you could also write those as multiple
>>> decorators:
>>> @snapshot(lambda _, some_identifier:
>>> some_func(_.some_argument.some_attr))
>>> @snapshot(lambda _, other_identifier: other_func(_.self))
>>>
>>> Am I correct?
>>>
>>> "_" looks a bit hard to read for me (implying ignored arguments).
>>>
>>> Why uppercase "P" and not lowercase (uppercase implies a constant for
>>> me)? Then "O" for "old" and "P" for parameters in a condition:
>>> @post(lambda O, P: ...)
>>> ?
>>>
>>> It also has the nice property that it follows both the temporal and the
>>> alphabetical order :)
>>>
>>> On Wed, 26 Sep 2018 at 14:30, James Lu <jamtlu at gmail.com> wrote:
>>>
>>>> I still prefer snapshot, though capture is a good name too. We could
>>>> use generator syntax and inspect the argument names.
>>>>
>>>> Instead of “a”, perhaps use “_”. Or maybe use “A.”, for arguments. Some
>>>> people might prefer “P” for parameters, since parameters sometimes means
>>>> the value received while the argument means the value passed.
>>>>
>>>> (#A1)
>>>>
>>>> from icontract import snapshot, __
>>>> @snapshot(some_func(_.some_argument.some_attr) for some_identifier, _
>>>> in __)
>>>>
>>>> Or (#A2)
>>>>
>>>> @snapshot(some_func(some_argument.some_attr) for some_identifier, _,
>>>> some_argument in __)
>>>>
>>>> —
>>>> Or (#A3)
>>>>
>>>> @snapshot(lambda some_argument,_,some_identifier:
>>>> some_func(some_argument.some_attr))
>>>>
>>>> Or (#A4)
>>>>
>>>> @snapshot(lambda _,some_identifier:
>>>> some_func(_.some_argument.some_attr))
>>>> @snapshot(lambda _,some_identifier, other_identifier:
>>>> some_func(_.some_argument.some_attr), other_func(_.self))
>>>>
>>>> I like #A4 the most because it’s fairly DRY and avoids the extra
>>>> punctuation of
>>>>
>>>> @capture(lambda a: {"some_identifier": some_func(a.some_argument.some_attr)})
>>>>
>>>>
>>>> On Sep 26, 2018, at 12:23 AM, Marko Ristin-Kaufmann <
>>>> marko.ristin at gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Franklin wrote:
>>>>
>>>>> The name "before" is a confusing name. It's not just something that
>>>>> happens before. It's really a pre-`let`, adding names to the scope of
>>>>> things after it, but with values taken before the function call. Based
>>>>> on that description, other possible names are `prelet`, `letbefore`,
>>>>> `predef`, `defpre`, `beforescope`. Better a name that is clearly
>>>>> confusing than one that is obvious but misleading.
>>>>
>>>>
>>>> James wrote:
>>>>
>>>>> I suggest that instead of “@before” it’s “@snapshot” and instead of “
>>>>> old” it’s “snapshot”.
>>>>
>>>>
>>>> I like "snapshot", it's a bit clearer than prefixing/postfixing verbs
>>>> with "pre" which might be misread (e.g., "prelet" has a meaning in
>>>> Slavic languages and could be subconsciously misread, "predef" implies to
>>>> me a pre-definition rather than prior-to-definition, "beforescope"
>>>> is very clear for me, but it might be confusing for others as to what it
>>>> actually refers to). What about "@capture" (7 letters for capture versus
>>>> 8 for snapshot)? I suppose "@let" would be playing with fire, since it
>>>> could conflict with new Python keywords: I assume "let" to be one of the
>>>> candidates.
>>>>
>>>> Actually, I think there is probably no way around a decorator that
>>>> captures/snapshots the data before the function call with a lambda (or even
>>>> a separate function). The "old" construct, if we are to parse it somehow from
>>>> the condition function, would limit us only to shallow copies (and be
>>>> complex to implement as soon as we are capturing out-of-argument values
>>>> such as globals etc.). Moreover, what if we don't need shallow
>>>> copies? I could imagine a dozen cases where a shallow copy is not what the
>>>> programmer wants: for example, s/he might need to make deep copies, hash or
>>>> otherwise transform the input data to hold only part of it instead of
>>>> copying (e.g., so as to allow an equality check without a double copy
>>>> of the data, or capture only the value of a certain property transformed in
>>>> some way).
>>>>
>>>> I'd still go with the dictionary to allow for this extra freedom. We
>>>> could have a convention: "a" denotes the current arguments, and "b"
>>>> denotes the captured values. It might make an interesting hint that we put
>>>> "b" before "a" in the condition. You could also interpret "b" as "before"
>>>> and "a" as "after", but also "a" as "arguments".
>>>>
>>>> @capture(lambda a: {"some_identifier": some_func(a.some_argument.some_attr)})
>>>> @post(lambda b, a, result: b.some_identifier > result + a.another_argument.another_attr)
>>>> def some_func(some_argument: SomeClass, another_argument: AnotherClass) -> SomeResult:
>>>>     ...
>>>>
>>>> "b" can be omitted if it is not used. Under the hood, all the arguments
>>>> to the condition would be passed by keywords.
>>>>
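A minimal sketch of this convention follows. It is not icontract's implementation: the single `contract` decorator stands in for the stacked `@capture` and `@post` decorators, and the names `contract` and `_call_condition` are assumptions. It shows the captured dictionary exposed as "b", the arguments as "a", conditions receiving only the keywords they name, and a run-time check against capturing the same identifier twice.

```python
import functools
import inspect
from types import SimpleNamespace


def _call_condition(condition, available):
    """Pass only the keyword arguments that the condition names."""
    names = inspect.signature(condition).parameters
    return condition(**{k: v for k, v in available.items() if k in names})


def contract(captures, posts):
    """Hypothetical stand-in for stacked @capture and @post decorators."""
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            a = SimpleNamespace(**bound.arguments)
            captured = {}
            for capture in captures:  # evaluated before the call
                for name, value in capture(a).items():
                    if name in captured:
                        raise ValueError("captured twice: " + name)
                    captured[name] = value
            result = func(*args, **kwargs)
            available = {"a": a, "b": SimpleNamespace(**captured),
                         "result": result}
            for post in posts:        # evaluated after the call
                if not _call_condition(post, available):
                    raise AssertionError("postcondition violated")
            return result
        return wrapper
    return decorator


@contract(
    captures=[lambda a: {"total": sum(a.numbers)}],
    posts=[
        lambda b, a, result: result == b.total + a.extra,
        lambda result: result >= 0,  # "b" and "a" omitted because unused
    ],
)
def add_extra(numbers, extra):
    numbers.append(extra)
    return sum(numbers)


print(add_extra([1, 2], 3))  # prints 6
```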
>>>> In case of inheritance, captures would be inherited as well. Hence the
>>>> library would check at run-time that the returned dictionary with captured
>>>> values has no identifier that has been already captured, and the linter
>>>> checks that statically, before running the code. Reading values captured in
>>>> the parent at the code of the child class might be a bit hard -- but that
>>>> is the case with any inherited methods/properties. In documentation, I'd list
>>>> all the captures of both the ancestors and the current class.
>>>>
>>>> I'm looking forward to reading your opinion on this and alternative
>>>> suggestions :)
>>>> Marko
>>>>
>>>> On Tue, 25 Sep 2018 at 18:12, Franklin? Lee <
>>>> leewangzhong+python at gmail.com> wrote:
>>>>
>>>>> On Sun, Sep 23, 2018 at 2:05 AM Marko Ristin-Kaufmann
>>>>> <marko.ristin at gmail.com> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> > (I'd like to fork from a previous thread, "Pre-conditions and
>>>>> > post-conditions", since it got long and we started discussing a couple of
>>>>> > different things. Let's discuss in this thread the implementation of a
>>>>> > library for design-by-contract and how to push it forward to hopefully add
>>>>> > it to the standard library one day.)
>>>>> >
>>>>> > For those unfamiliar with contracts and the current state of the
>>>>> > discussion in the previous thread, here's a short summary. The discussion
>>>>> > started by me inquiring about the possibility to add design-by-contract
>>>>> > concepts into the core language. The idea was rejected by the participants
>>>>> > mainly because they thought that the merit of the feature does not justify
>>>>> > its costs. This is quite debatable and seems to reflect many a discussion
>>>>> > about design-by-contract in general. Please see the other thread, "Why is
>>>>> > design-by-contract not widely adopted?" if you are interested in that
>>>>> > debate.
>>>>> >
>>>>> > We (a colleague of mine and I) decided to implement a library to
>>>>> > bring design-by-contract to Python since we don't believe that the concept
>>>>> > will make it into the core language anytime soon and we badly needed a tool
>>>>> > to facilitate our work with a growing code base.
>>>>> >
>>>>> > The library is available at http://github.com/Parquery/icontract.
>>>>> > The hope is to polish it so that the wider community could use it and, once
>>>>> > the quality is high enough, make a proposal to add it to the standard
>>>>> > Python libraries. We do need a standard library for contracts, otherwise
>>>>> > projects with conflicting contract libraries cannot integrate (e.g., the
>>>>> > contracts cannot be inherited between two different contract libraries).
>>>>> >
>>>>> > So far, the most important bits have been implemented in icontract:
>>>>> >
>>>>> > - Preconditions, postconditions, class invariants
>>>>> > - Inheritance of the contracts (including strengthening and weakening
>>>>> >   of the inherited contracts)
>>>>> > - Informative violation messages (including information about the
>>>>> >   values involved in the contract condition)
>>>>> > - Sphinx extension to include contracts in the automatically generated
>>>>> >   documentation (sphinx-icontract)
>>>>> > - Linter to statically check that the arguments of the conditions are
>>>>> >   correct (pyicontract-lint)
>>>>> >
>>>>> > We are successfully using it in our code base and have been quite
>>>>> > happy about the implementation so far.
>>>>> >
>>>>> > There is one bit still missing: accessing "old" values in the
>>>>> > postcondition (i.e., shallow copies of the values prior to the execution of
>>>>> > the function). This feature is necessary in order to allow us to verify
>>>>> > state transitions.
>>>>> >
>>>>> > For example, consider a new dictionary class that has "get" and
>>>>> > "put" methods:
>>>>> >
>>>>> > from typing import Optional
>>>>> >
>>>>> > from icontract import post
>>>>> >
>>>>> > class NovelDict:
>>>>> >     def length(self) -> int:
>>>>> >         ...
>>>>> >
>>>>> >     def get(self, key: str) -> Optional[str]:
>>>>> >         ...
>>>>> >
>>>>> >     @post(lambda self, key, value: self.get(key) == value)
>>>>> >     @post(lambda self, key: old(self.get(key)) is None and old(self.length()) + 1 == self.length(),
>>>>> >           "length increased with a new key")
>>>>> >     @post(lambda self, key: old(self.get(key)) is not None and old(self.length()) == self.length(),
>>>>> >           "length stable with an existing key")
>>>>> >     def put(self, key: str, value: str) -> None:
>>>>> >         ...
>>>>> >
>>>>> > How could we possibly implement this "old" function?
>>>>> >
>>>>> > Here is my suggestion. I'd introduce a decorator "before" that would
>>>>> > allow you to store whatever values in a dictionary object "old" (i.e. an
>>>>> > object whose properties correspond to the key/value pairs). The "old" is
>>>>> > then passed to the condition. Here it is in code:
>>>>> >
>>>>> > # omitted contracts for brevity
>>>>> > class NovelDict:
>>>>> >     def length(self) -> int:
>>>>> >         ...
>>>>> >
>>>>> >     # omitted contracts for brevity
>>>>> >     def get(self, key: str) -> Optional[str]:
>>>>> >         ...
>>>>> >
>>>>> >     @before(lambda self, key: {"length": self.length(), "get": self.get(key)})
>>>>> >     @post(lambda self, key, value: self.get(key) == value)
>>>>> >     @post(lambda self, key, old: old.get is None and old.length + 1 == self.length(),
>>>>> >           "length increased with a new key")
>>>>> >     @post(lambda self, key, old: old.get is not None and old.length == self.length(),
>>>>> >           "length stable with an existing key")
>>>>> >     def put(self, key: str, value: str) -> None:
>>>>> >         ...
>>>>> >
>>>>> > The linter would statically check that all attributes accessed in
>>>>> > "old" are defined in the decorator "before" so that attribute errors
>>>>> > would be caught early. The current implementation of the linter is fast
>>>>> > enough to be run at save time so such errors should usually not happen with
>>>>> > a properly set up IDE.
>>>>> >
>>>>> > The "before" decorator would also have an "enabled" property, so that you
>>>>> > can turn it off (e.g., if you only want to run a postcondition in testing).
>>>>> > The "before" decorators can be stacked so that you can also have a more
>>>>> > fine-grained control when each one of them is running (some during test,
>>>>> > some during test and in production). The linter would enforce that before's
>>>>> > "enabled" is a disjunction of all the "enabled"'s of the corresponding
>>>>> > postconditions where the old value appears.
>>>>> >
>>>>> > Is this a sane approach to "old" values? Any alternative approach
>>>>> > you would prefer? What about better naming? Is "before" a confusing name?
>>>>>
>>>>> The dict can be splatted into the postconditions, so that no special
>>>>> name is required. This would require either that the lambdas handle
>>>>> **kws, or that their caller inspect them to see what names they take.
>>>>> Perhaps add a function to functools which only passes kwargs that fit.
>>>>> Then the precondition mechanism can pass `self`, `key`, and `value` as
>>>>> kwargs instead of args.
>>>>>
>>>>> For functions that have *args and **kwargs, it may be necessary to
>>>>> pass them to the conditions as args and kwargs instead.
>>>>>
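The "only passes kwargs that fit" helper suggested above could look roughly like the sketch below. No such function exists in functools; the name `call_with_fitting_kwargs` is hypothetical, and the sketch relies only on `inspect.signature` from the standard library.

```python
import inspect


def call_with_fitting_kwargs(func, **available):
    """Call ``func`` with only the keyword arguments its signature accepts."""
    params = inspect.signature(func).parameters.values()
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params):
        return func(**available)  # the function takes **kws: pass everything
    accepted = {
        p.name for p in params
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    }
    return func(**{k: v for k, v in available.items() if k in accepted})


# The dict returned by the "before" lambda is splatted next to the arguments.
captured = {"length": 2, "get": None}
arguments = {"self": object(), "key": "k", "value": "v"}

condition = lambda key, length: (key, length)
print(call_with_fitting_kwargs(condition, **arguments, **captured))  # prints ('k', 2)
```

With this approach a captured name that collides with an argument name (for example a captured "key") raises a TypeError at the call site for the duplicate keyword, so the library would still have to check for clashes. A decorated function's own *args and **kwargs would have to be passed under those names, as noted above.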
>>>>> The name "before" is a confusing name. It's not just something that
>>>>> happens before. It's really a pre-`let`, adding names to the scope of
>>>>> things after it, but with values taken before the function call. Based
>>>>> on that description, other possible names are `prelet`, `letbefore`,
>>>>> `predef`, `defpre`, `beforescope`. Better a name that is clearly
>>>>> confusing than one that is obvious but misleading.
>>>>>
>>>>> By the way, should the first postcondition be `self.get(key) is
>>>>> value`, checking for identity rather than equality?
>>>>>
>>>> _______________________________________________
>>>> Python-ideas mailing list
>>>> Python-ideas at python.org
>>>> https://mail.python.org/mailman/listinfo/python-ideas
>>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>>
>>>>