Python-based monads essay part 2
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Thu Oct 20 03:42:48 EDT 2016
On Thursday 20 October 2016 04:57, Chris Angelico wrote:
> Short-circuiting depends only on what's known *prior* to that
> evaluation. Dead code elimination removes the notion of time:
Surely not. I understand "dead code" to mean the same as *unreachable* code:
code which cannot be reached by any path through the program:
if False:
spam() # dead code
not merely code that doesn't affect the return result:
x = spam()
x = 999
Calling spam is not dead code, even if the result is immediately thrown away!
In any case, whichever definition you use, for spam() to be dead code in the
later example, it has to be known to have no side-effects.
> int(input("Enter something:")) * 0 + 5
>
> This can simply return 5, because (barring an exception being thrown)
> the value entered cannot affect the result. Python does not do this.
> And any system that has optimization flags (PostgreSQL, Pike, etc),
> this would be marked "has side effects", and therefore is not
> considered dead code. But if you (maybe mistakenly) mark input() as a
> pure function, then yes, this could indeed be optimized out.
*Assuming* that the compiler can distinguish between functions with side-
effects and those that aren't, and *assuming* that the compiler knows that
input() returns something which can be cast to an int with no side-effects or
errors, then in principle the compiler could optimize that to just 5.
But I don't think that's terribly likely, even for an aggressive optimizing
compiler. Maybe C, because there's nothing so terrible I wouldn't believe about
C compilers *wink*
This is not the best example, because its obvious that the call to int() might
fail, and hence the effect of this code may be to raise ValueError. A better
example would be:
len(input("Enter something:")) * 0 + 5
Assuming the compiler knows that len() and input() have their standard meaning,
is it justified to optimized that expression to just 5? I don't think so,
because the side effect of input() displaying text on the screen and waiting
for the user to type something is the whole reason for the function to exist.
Regardless of whether your language is imperative, or functional with monads, I
think that expression has to write text to the screen and wait for the user to
type something, or the compiler is buggy. But it doesn't have to happen
straight away! Consider:
L = [0, 1, 2, 3, 4, len(input("A")) * 0 + 5, 6]
print("B")
print(L[5])
Does this program print "A B 5", or "B A 5"? In Python, it prints "A" first,
but that may not be the case for all languages. My guess is that, like
declarative, asynchronous, concurrent and parallel programming (did I miss any
buzzwords?), functional programming doesn't guarantee the order in which code
is executed.
Actually, neither does imperative programming, if the language supports "call
by name" or similar parameter passing mechanisms:
def call_by_name(x):
print("B")
print(x)
call_by_name(len(input("A")) * 0 + 5)
If x is a thunk, then this too will print "B" first.
>> However, if we think of the I/O interaction (aka *the process*) to be
>> the return value of a function, every bead in the I/O chain counts.
>
> And that's what I mean about making function purity meaningless. Here,
> look - this is a pure function!
>
> magic_numbers = [1, 2, 4, 8]
> def do_magic(x):
> magic_numbers[x % 4] += x / 4
> return magic_numbers[(x + 1) % 4]
>
> Since the previous state of magic_numbers can be considered an input,
> and the new state can be considered an output, this is a pure
> function! Right?
No, but you know that :-)
Inputs are explicitly arguments to the function, not external state. Here's how
we can turn that into real input:
def do_magic(state, x):
state = calculate_new_state(state) # work it out for yourself *wink*
value = state[(x + 1) % 4]
return state, value
and then use a monad to make the state invisible, so that you actually only
need to call do_magic(x).
I'm hand-waving here because I don't remember all the fine details from Greg's
essay. Call it magic if you want, but the compiler can do it, because the
details of how state is stored is invisible to the caller, the compiler is free
to do the optimization:
change the functional but inefficient idiom
state = calculate_new_state(state)
into the imperative but efficient idiom:
modify_in_place(state)
That's allowed, because (1) nobody and nothing except the do_magic() function
can access the state; (2) the use of a monad makes this completely transparent
and mathematically vigorous, which means the compiler can do it without human
intervention; and (3) we want the program to run in less than a billion
petabytes of memory. Practicality beats purity.
All joking aside, the whole point of functional programming is to avoid the
human programmer having to track external state. It's not actually to avoid
external state: given our existing hardware, such a thing would be (1)
inefficient and (2) impossible because we're not doing a Platonic Ideal
computation, we're manipulating bits in RAM with a CPU that has to bang
trillions of electrons around really fast, generating heat. We just have to
avoid external state *where the programmer can trip over it* and end up with
buggy code.
And also where, in practice, you actually need and want it. We have to handle
input and output, otherwise what's the point of the computation? If your code
doesn't interact with the outside universe, it might as well not exist.
As I understand it, monads are just a way to put these essential I/O things
into a mathematically vigorous footing so that the compiler can handle them as
if they were functionally pure. It doesn't mean they *actually are*
functionally pure -- obviously if you delete a file or print output to the
screen or pause for the user to enter data, you're making changes of state to
the outside world. So what? We don't live in Plato's Ideal World of forms.
> Technical means nothing if you're destroying the meaning of functional
> purity in order to squeeze I/O into it.
I think that's an extreme over-reaction.
--
Steven
git gets easier once you get the basic idea that branches are homeomorphic
endofunctors mapping submanifolds of a Hilbert space.
More information about the Python-list
mailing list