Statically checking purity of functions

Hello, As far as I can tell the following has not been discussed yet on the searchable internet. It may seem that my idea concerns only mypy, but since "typing" is part of the stdlib, I believe it is appropriate for this mailing list. Even though Python is a very dynamic language, in practice many routines (when called with arguments of the intended type) are pure in the functional programming sense, i.e. they have no observable side effects and the result reproducibly depends on the arguments. I think there could be many interesting applications of knowing whether a function is pure or not. Example: Long-running computations often need to be parallelized. There are plenty of ways to parallelize Python code (think of concurrent.futures or ipyparallel) and most of them make implicit assumptions that some functions are pure. Another example: if the purity of functions would be known, Jupyter notebooks could automatically deduce the side effects of code cells and thus calculate their dependencies on each other. Changes to a cell could thus invalidate other (dependent) cells. With plain Python (without type annotations) there is no way to verify the purity of a function as has been well explained here: http://stackoverflow.com/a/31664102 But couldn't this be done with a static type checker like mypy? If mypy would track functions as pure or not pure, it should be able to verify the claimed purity of any new function. For example, the following function would pass the test, @pure def add(a: int, b: int) -> int: return a + b while the next one would produce an error @pure def add_random(a: int) -> int: return a + random.randrange(10) because the static checker would know that random.randrange() is not pure (it modifies the state of the random module). Am I overseeing any problems with the above? Christoph

On 9 June 2016 at 14:37, Christoph Groth <christoph@grothesque.org> wrote:
That sounds both reasonable and potentially useful, albeit of relatively niche interest. It's quite similar to Perl's "taint checks" that track whether data is affected by external input (although the latter marks *data*, not functions). I've not looked into mypy yet, so all I know about it is from skimming some of the typing discussions here. But if it's possible to write a plugin for mypy to do something like this, that would be pretty awesome. I'm not sure how much a capability like that would affect Python itself, though. (In fact, I'd say it ought not to - being able to do something like this without any changes to the language would be ideal). Paul

Paul Moore wrote:
Indeed! Thanks for the pointer, I haven't heard of taint checks before.
Testing for purity, like testing for type correctness, could be kept separate from the language. (Although I have the gut feeling that one day there will be a switch to CPython that enables mypy-like testing at runtime.) There is one exception: there must be some syntax to declare (non-)purity. Function annotations (used by mypy for type checking) set the __annotations__ attribute. In the same way there could be, for example, a __pure__ attribute that, if present, can be True or False. It could be, for example, set by the decorators "typing.pure" and "typing.impure". The problem with using decorators, as compared to having a dedicated syntax, is that the name of a decorator is just a name, and can be assigned some different value. But I do not see a clear majority for adding syntax to Python for declaring purity right now... ;-) Perhaps this is going too far, but I note that there are other traits of functions that could be tracked as well. For example, one could track whether a function is potentially blocking. If such alternative traits are judged important enough (I do not consider "blocking" important enough, it's just an example of the concept), one could also store several such traits in a __traits__ attribute.

Here’s where this falls down (NB, the print is just a simple example of non-purity, but you can imagine that you can just do arbitrary code). class SideEffectyInt(int): def __add__(self, other): if other == 3: print(“YAY") return super().__add__(other) This SideEffectyInt() object is suitable for passing into your pure “add” function, but has an impure add logic. I’m not sure how mypy could resolve this problem: I presume it’d need purity declarations for every possible function on every type *and* the ability to see all function invocations, which is pretty difficult given that Python’s dynamism makes it possible to invoke arbitrary functions. So I don’t really see how this pure decorator could *enforce* purity. Cory

Cory Benfield wrote:
Of course, an instance of SideEffectInt is also an instance of int and as such mypy will not complain when it is passed to the "add" function. However, using the above technique any type checking by mypy can be circumvented. Mypy will happily nod through the following program: class FloatingInt(int): def __add__(self, other): return float(super().__add__(other)) def add(a: int, b: int) -> int: return a + b print(add(FloatingInt(1), 2)) However, if we change the above class definition to class FloatingInt(int): def __add__(self, other) -> float: return float(super().__add__(other)) mypy will start to complain with the following error: test.py:2: error: Return type of "__add__" incompatible with supertype "int" Now, mypy could be probably more strict and flag already the first definition of FloatingInt.__add__() as an error, because it knows (as the above error message shows) what its return type should be From inheritance. It also knows that float() produces a float, since it does not accept def foo() -> int: return float(5) I guess it's a choice of mypy not to complain about bits of code that don't use type declarations. In an exactly analogous way, mypy could detect that calling print() in SideEffectlyInt.__add__ is incompatible with the (supposedly declared) purity of int.__add__() if SideEffectlyInt was declared as: class SideEffectyInt(int): @pure def __add__(self, other): if other == 3: print(“YAY") return super().__add__(other) And just like before, if mypy was more strict, it could detect the purity-breaking of SideEffectyInt.__add__() from inheritance.
This is correct, but I believe your statement remains correct when one replaces "purity" by "typing": In order to verify with certainty that the typing is correct for a function, mypy would have to know the typing declaration of every function that is called inside it.
So I don’t really see how this pure decorator could *enforce* purity.
It seems to me that purity could be enforced to the same degree that correct typing can be enforced. In other words, the goal of mypy seems not to be to prove that typing of a program is consistent (Python is too dynamic for that), but rather to help finding possible errors. It seems to me that the same service could be provided for purity. That should be good enough for many applications. In that spirit, perhaps it would make more sense to assume purity by default if unspecified (just like mypy assumes type correctness when no typing information is present), and have both "pure" and "impure" (dirty? ;-) decorators.

On 10 June 2016 at 01:37, Christoph Groth <christoph@grothesque.org> wrote:
The problem, as Cory demonstrates, is that purity is transitive: you have to know that all the runtime interactions are also pure, or your function stops being pure. external names like random in add_random can never be pure (because any piece of impure code can change the module globals to rebind random to something else). This means that any pure function would have to accept as parameters everything it operates on, and you'd need to error on any call to a pure function with impure arguments. It would be an interesting exercise to see how it plays out in practice though. -Rob

Perhaps this would work better with something like property based testing of some sort. You would take a function and say for the allowed types, run a memoization series and check if the return value is the same every time for the arguments. E.g., if you know the function can only take subclasses of numbers.Number whose arithmetic operators are pure, the type checker / test suite can verify. Code such as:
would allow for a test: @given(integers(), integers()) def add_testing(a, b): result_1 = add(a,b) assert isinstance(result_1, int) for _ in range(100): assert result_1 == add(a,b) - Ed

Perhaps this would work better with something like property based testing of some sort. You would take a function and say for the allowed types, run a memoization series and check if the return value is the same every time for the arguments. E.g., if you know the function can only take subclasses of numbers.Number whose arithmetic operators are pure, the type checker / test suite can verify. Code such as:
would allow for a test: @given(integers(), integers()) def add_testing(a, b): result_1 = add(a,b) assert isinstance(result_1, int) for _ in range(100): assert result_1 == add(a,b) - Ed

Robert Collins wrote:
True, but when checking for type correctness one faces exactly the same problem. Consider the following script. Uncommenting any of the commented-out blocks will make it fail in the same way, but mypy will only detect the first failure. That's already not bad at all, since most of such renaming of global variables will typically happen at global scope. Sufficiently insidious monkey-patching can never be detected by a linter that only looks at the Python source. (Hey, it could happen inside a C extension module!) import math def bad_sin(x: float) -> str: return str(x) ### Mypy will detect this as a problem: # math.sin = bad_sin ### Mypy will not mind the following: # def do_evil(): # math.sin = bad_sin # do_evil() def sin_squared(x: float) -> float: return math.sin(x)**2 print(sin_squared(math.pi / 6))
You mean because one can never be sure about the purity of global variables? That's true, but exactly to the same extent that one cannot be sure about the typing annotations of global variables. I believe that such "weak" checking for purity would still be useful, just as the equally weak verification of typing by mypy is useful. Christoph

On Fri, Jun 10, 2016 at 11:57 AM, Christoph Groth <christoph@grothesque.org> wrote:
I'm late to the discussion, but I agree with Christoph. I doubt purity checking can reliably enforce anything without making it impractically restrictive, but even best-effort, partial purity checking could be useful, similar to how mypy is useful. A checker could catch many (but not all) purity violations and purity annotations would make programs easier to reason about for humans and tools. I've thought about this a bit in the past -- mostly in the context of mypy. I'm not sure whether purity checking would be more than marginally useful, but at least in some use cases it might be a valuable a tool. For example, purity checking might help with reasoning about concurrency and parallelism. However, writing a purity checking tool for Python that is flexible enough that many programmers would actually want to use it sounds like a difficult (research?) project. Here are some practical considerations: 1) We'd need to annotate library functions with respect to purity, as Python derives much of its value from the available libraries, and we'd want pure functions to be able to use (pure) libraries. Perhaps stubs in typeshed would have purity annotations for individual functions. Alternatively, there could be a typeshed-like repository for purity information only, and a tool could use information from both repositories. The safest thing would be to default to impure, unless a function is explicitly annotated as pure. 2) Type annotations would likely be needed for practical checking of purity, as otherwise things like method calls would be hard to check for purity. Thus purity checking would be a stricter form of static type checking. 3) For consistency with PEP 484 style gradual typing, we may want to allow functions without annotations to be called from pure functions. Tools might make this user-configurable, as this is clearly a big hole in a purity checking system. 4) It may be necessary to allow impure operations to be used within a pure function as long the impurity won't be visible outside the function. For example, perhaps this function should be fine, as an external observer can't tell whether the function uses pure operations only: def squares(n: int) -> List[int]: a = [] for i in range(n): a.append(i * i) return a I`m not sure if I know how to implement this in a useful way, though. I believe that there is some prior work in this area (not specific to Python). Jukka

On 22 June 2016 at 13:23, Jukka Lehtosalo <jlehtosalo@gmail.com> wrote:
we could have eg: def parallel_map(f, data): if is_pure(f): # <impl that does special optimisations to take advantage of the fact f is truly pure> # special case that uses light weight threads or sth (maybe can run bits of it GIL free or sth, less overhead or sth) # its ok to share immutable data else: #<impl that has no issue with impure f> # eg: multiprocessing process pools.. #its ok to mutate everything becauuse we no longer truly share anything. in other words, even if only a small subset of code can be proven pure it can be useful to help optimise code (in this example to avoid locking by knowing that no 2 parrallel instances of the callback can mock with mutually shared state) for some scenarios it - might be - ok even if we leave the proving to the programmer. (eg a C extension could implement the optimised parallel map in this example and a user defined decorator is enough to shove a flag on to the function f to trigger the optimised code path) if this it is not truly pure tho despite the assertion it might cause issues so in that case it probably is not something that should ship with python by default?

Le 9 juin 2016 9:48 PM, "Robert Collins" <robertc@robertcollins.net> a écrit :
FAT Python implements optimizations on pure functions like inlining but disable optimizations as soon as an assumption becomes false. There is no magic, it runs checks at runtime using guards. JIT compilers do the same. Some projects like numba or pythran don't implement guards, but it's a deliberate choice to maximize performance. It's acceptable because they use opt-in optimizations. Victor

On 9 June 2016 at 14:37, Christoph Groth <christoph@grothesque.org> wrote:
That sounds both reasonable and potentially useful, albeit of relatively niche interest. It's quite similar to Perl's "taint checks" that track whether data is affected by external input (although the latter marks *data*, not functions). I've not looked into mypy yet, so all I know about it is from skimming some of the typing discussions here. But if it's possible to write a plugin for mypy to do something like this, that would be pretty awesome. I'm not sure how much a capability like that would affect Python itself, though. (In fact, I'd say it ought not to - being able to do something like this without any changes to the language would be ideal). Paul

Paul Moore wrote:
Indeed! Thanks for the pointer, I haven't heard of taint checks before.
Testing for purity, like testing for type correctness, could be kept separate from the language. (Although I have the gut feeling that one day there will be a switch to CPython that enables mypy-like testing at runtime.) There is one exception: there must be some syntax to declare (non-)purity. Function annotations (used by mypy for type checking) set the __annotations__ attribute. In the same way there could be, for example, a __pure__ attribute that, if present, can be True or False. It could be, for example, set by the decorators "typing.pure" and "typing.impure". The problem with using decorators, as compared to having a dedicated syntax, is that the name of a decorator is just a name, and can be assigned some different value. But I do not see a clear majority for adding syntax to Python for declaring purity right now... ;-) Perhaps this is going too far, but I note that there are other traits of functions that could be tracked as well. For example, one could track whether a function is potentially blocking. If such alternative traits are judged important enough (I do not consider "blocking" important enough, it's just an example of the concept), one could also store several such traits in a __traits__ attribute.

Here’s where this falls down (NB, the print is just a simple example of non-purity, but you can imagine that you can just do arbitrary code). class SideEffectyInt(int): def __add__(self, other): if other == 3: print(“YAY") return super().__add__(other) This SideEffectyInt() object is suitable for passing into your pure “add” function, but has an impure add logic. I’m not sure how mypy could resolve this problem: I presume it’d need purity declarations for every possible function on every type *and* the ability to see all function invocations, which is pretty difficult given that Python’s dynamism makes it possible to invoke arbitrary functions. So I don’t really see how this pure decorator could *enforce* purity. Cory

Cory Benfield wrote:
Of course, an instance of SideEffectInt is also an instance of int and as such mypy will not complain when it is passed to the "add" function. However, using the above technique any type checking by mypy can be circumvented. Mypy will happily nod through the following program: class FloatingInt(int): def __add__(self, other): return float(super().__add__(other)) def add(a: int, b: int) -> int: return a + b print(add(FloatingInt(1), 2)) However, if we change the above class definition to class FloatingInt(int): def __add__(self, other) -> float: return float(super().__add__(other)) mypy will start to complain with the following error: test.py:2: error: Return type of "__add__" incompatible with supertype "int" Now, mypy could be probably more strict and flag already the first definition of FloatingInt.__add__() as an error, because it knows (as the above error message shows) what its return type should be From inheritance. It also knows that float() produces a float, since it does not accept def foo() -> int: return float(5) I guess it's a choice of mypy not to complain about bits of code that don't use type declarations. In an exactly analogous way, mypy could detect that calling print() in SideEffectlyInt.__add__ is incompatible with the (supposedly declared) purity of int.__add__() if SideEffectlyInt was declared as: class SideEffectyInt(int): @pure def __add__(self, other): if other == 3: print(“YAY") return super().__add__(other) And just like before, if mypy was more strict, it could detect the purity-breaking of SideEffectyInt.__add__() from inheritance.
This is correct, but I believe your statement remains correct when one replaces "purity" by "typing": In order to verify with certainty that the typing is correct for a function, mypy would have to know the typing declaration of every function that is called inside it.
So I don’t really see how this pure decorator could *enforce* purity.
It seems to me that purity could be enforced to the same degree that correct typing can be enforced. In other words, the goal of mypy seems not to be to prove that typing of a program is consistent (Python is too dynamic for that), but rather to help finding possible errors. It seems to me that the same service could be provided for purity. That should be good enough for many applications. In that spirit, perhaps it would make more sense to assume purity by default if unspecified (just like mypy assumes type correctness when no typing information is present), and have both "pure" and "impure" (dirty? ;-) decorators.

On 10 June 2016 at 01:37, Christoph Groth <christoph@grothesque.org> wrote:
The problem, as Cory demonstrates, is that purity is transitive: you have to know that all the runtime interactions are also pure, or your function stops being pure. external names like random in add_random can never be pure (because any piece of impure code can change the module globals to rebind random to something else). This means that any pure function would have to accept as parameters everything it operates on, and you'd need to error on any call to a pure function with impure arguments. It would be an interesting exercise to see how it plays out in practice though. -Rob

Perhaps this would work better with something like property based testing of some sort. You would take a function and say for the allowed types, run a memoization series and check if the return value is the same every time for the arguments. E.g., if you know the function can only take subclasses of numbers.Number whose arithmetic operators are pure, the type checker / test suite can verify. Code such as:
would allow for a test: @given(integers(), integers()) def add_testing(a, b): result_1 = add(a,b) assert isinstance(result_1, int) for _ in range(100): assert result_1 == add(a,b) - Ed

Perhaps this would work better with something like property based testing of some sort. You would take a function and say for the allowed types, run a memoization series and check if the return value is the same every time for the arguments. E.g., if you know the function can only take subclasses of numbers.Number whose arithmetic operators are pure, the type checker / test suite can verify. Code such as:
would allow for a test: @given(integers(), integers()) def add_testing(a, b): result_1 = add(a,b) assert isinstance(result_1, int) for _ in range(100): assert result_1 == add(a,b) - Ed

Robert Collins wrote:
True, but when checking for type correctness one faces exactly the same problem. Consider the following script. Uncommenting any of the commented-out blocks will make it fail in the same way, but mypy will only detect the first failure. That's already not bad at all, since most of such renaming of global variables will typically happen at global scope. Sufficiently insidious monkey-patching can never be detected by a linter that only looks at the Python source. (Hey, it could happen inside a C extension module!) import math def bad_sin(x: float) -> str: return str(x) ### Mypy will detect this as a problem: # math.sin = bad_sin ### Mypy will not mind the following: # def do_evil(): # math.sin = bad_sin # do_evil() def sin_squared(x: float) -> float: return math.sin(x)**2 print(sin_squared(math.pi / 6))
You mean because one can never be sure about the purity of global variables? That's true, but exactly to the same extent that one cannot be sure about the typing annotations of global variables. I believe that such "weak" checking for purity would still be useful, just as the equally weak verification of typing by mypy is useful. Christoph

On Fri, Jun 10, 2016 at 11:57 AM, Christoph Groth <christoph@grothesque.org> wrote:
I'm late to the discussion, but I agree with Christoph. I doubt purity checking can reliably enforce anything without making it impractically restrictive, but even best-effort, partial purity checking could be useful, similar to how mypy is useful. A checker could catch many (but not all) purity violations and purity annotations would make programs easier to reason about for humans and tools. I've thought about this a bit in the past -- mostly in the context of mypy. I'm not sure whether purity checking would be more than marginally useful, but at least in some use cases it might be a valuable a tool. For example, purity checking might help with reasoning about concurrency and parallelism. However, writing a purity checking tool for Python that is flexible enough that many programmers would actually want to use it sounds like a difficult (research?) project. Here are some practical considerations: 1) We'd need to annotate library functions with respect to purity, as Python derives much of its value from the available libraries, and we'd want pure functions to be able to use (pure) libraries. Perhaps stubs in typeshed would have purity annotations for individual functions. Alternatively, there could be a typeshed-like repository for purity information only, and a tool could use information from both repositories. The safest thing would be to default to impure, unless a function is explicitly annotated as pure. 2) Type annotations would likely be needed for practical checking of purity, as otherwise things like method calls would be hard to check for purity. Thus purity checking would be a stricter form of static type checking. 3) For consistency with PEP 484 style gradual typing, we may want to allow functions without annotations to be called from pure functions. Tools might make this user-configurable, as this is clearly a big hole in a purity checking system. 4) It may be necessary to allow impure operations to be used within a pure function as long the impurity won't be visible outside the function. For example, perhaps this function should be fine, as an external observer can't tell whether the function uses pure operations only: def squares(n: int) -> List[int]: a = [] for i in range(n): a.append(i * i) return a I`m not sure if I know how to implement this in a useful way, though. I believe that there is some prior work in this area (not specific to Python). Jukka

On 22 June 2016 at 13:23, Jukka Lehtosalo <jlehtosalo@gmail.com> wrote:
we could have eg: def parallel_map(f, data): if is_pure(f): # <impl that does special optimisations to take advantage of the fact f is truly pure> # special case that uses light weight threads or sth (maybe can run bits of it GIL free or sth, less overhead or sth) # its ok to share immutable data else: #<impl that has no issue with impure f> # eg: multiprocessing process pools.. #its ok to mutate everything becauuse we no longer truly share anything. in other words, even if only a small subset of code can be proven pure it can be useful to help optimise code (in this example to avoid locking by knowing that no 2 parrallel instances of the callback can mock with mutually shared state) for some scenarios it - might be - ok even if we leave the proving to the programmer. (eg a C extension could implement the optimised parallel map in this example and a user defined decorator is enough to shove a flag on to the function f to trigger the optimised code path) if this it is not truly pure tho despite the assertion it might cause issues so in that case it probably is not something that should ship with python by default?

Le 9 juin 2016 9:48 PM, "Robert Collins" <robertc@robertcollins.net> a écrit :
FAT Python implements optimizations on pure functions like inlining but disable optimizations as soon as an assumption becomes false. There is no magic, it runs checks at runtime using guards. JIT compilers do the same. Some projects like numba or pythran don't implement guards, but it's a deliberate choice to maximize performance. It's acceptable because they use opt-in optimizations. Victor
participants (11)
-
Christoph Groth
-
Cory Benfield
-
Ed Minnix
-
Joonas Liik
-
Joseph Jevnik
-
Jukka Lehtosalo
-
Paul Moore
-
Random832
-
Robert Collins
-
Sven R. Kunze
-
Victor Stinner