[Tutor] Text processing and functional programming?
Clay Shirky
clay at shirky.com
Wed Aug 13 00:12:42 EDT 2003
> Hmmm. Yikes. Ok, I do agree with you here. The statement:
>
> indent = lambda s: reduce(min,map(len,findall('(?m)^ *(?=\S)',s)))
>
> is overloaded. *grin*
You know, I picked up that book expecting to love it (almost everything I do
involves processing text in some way or other), but after reading Ch 1, I
mostly learned that functional programming is a way of making Python look
like Perl.
> For those who haven't seen it before: 'lambda' is an operator that creates
> simple functions.
My impression is that lambda and map/filter/reduce can often be replaced by
list comprehensions. Is this correct?
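
As a rough sketch of what that replacement might look like for the one-liner quoted above (the `re` import and the raw-string prefix are additions, and this is only one possible rewrite, not anything from the book):

```python
import re

def indent(s):
    # Leading spaces of every line that contains a non-whitespace character.
    leading = re.findall(r'(?m)^ *(?=\S)', s)
    # The list comprehension stands in for map(len, ...);
    # the builtin min() stands in for reduce(min, ...).
    # Note: min() raises ValueError if no line matched at all.
    return min([len(spaces) for spaces in leading])
```

Called on a string whose lines are indented by 4, 2 and 6 spaces, this returns 2.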
> ###
> def indent(s):
>     results = findall('(?m)^ *(?=\S)',s)
>     minimum = len(results[0])
>     for r in results:
>         if len(r) < minimum:
>             minimum = len(r)
>     return minimum
> ###
This is *so* much more readable than the above.
> This explicit looping approach works, but it mixes together applying a
> function across a sequence with finding the minimum of the sequence.
> There's more potential for programmer error with this approach.
I'm skeptical about this. This assumes that most programmer error comes from
writing rather than maintaining code. If you think the code will get written
once and read often, the risk of error from semantic density grows.
> Of course, it's not difficult in this example to pinpoint our confusion
> down to the findall() regular expression statement (since it's the very
> first statement... *grin*), but I hope the point about the advantage of
> debugging a functional program is more clear.
Well this is the ironic part. The problem in this example is in the regex,
and since the regex is an explicit assignment and first, you'll catch it
*faster* with a step-through approach than you would in the functional
approach, esp if you assume that the person doing the debugging is different
from the person doing the coding. Furthermore, since regexes are usually
more problematic than assignments or loops, being able to zero in on that is
easier with an explicit assignment from findall.
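
A small sketch of what zeroing in on the regex can look like when its result is bound to a name. The tab-indented input is a hypothetical chosen for illustration, not necessarily the problem discussed in this thread, and `indent_steps` is a made-up helper name:

```python
import re

def indent_steps(s):
    # Step 1: the regex on its own line, so its output can be inspected.
    results = re.findall(r'(?m)^ *(?=\S)', s)
    # Step 2: the lengths, kept separate from the search for the minimum.
    lengths = [len(r) for r in results]
    return results, lengths

# Hypothetical input: the second line is indented with a tab, not spaces.
results, lengths = indent_steps("    a\n\tb\n  c")
print(results)   # ['    ', '  ']  (the tab-indented line produced no match)
print(lengths)   # [4, 2]
```

Three lines went in and only two matches came out, which points at the pattern before the loop is even reached.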
From looking at TPiP, it looks to me like FP takes some basically sensible
ideas -- the easiest loop to debug is the loop you don't write, nested
transformations can make the operations on an object clear -- and makes them
cardinal virtues, at the risk of losing things like readability and obvious
flow.
-clay