[Tutor] extracting numbers from a list
Danny Yoo
dyoo at hkn.eecs.berkeley.edu
Wed Oct 18 04:23:25 CEST 2006
On Mon, 16 Oct 2006, kumar s wrote:
> I have a simple question to ask tutors:
>
> list A :
>
> a = [10,15,18,20,25,30,40]
Hi Kumar,
If you're concerned about correctness, I'd recommend that you try thinking
about the problem inductively. An inductive definition for what you're
asking is straightforward to state in about three or four lines of code.
I'll try to go through it slowly so you see what the reasoning behind it
is. The code sketch above uses a technique that you should already know
called "mathematical induction."
http://en.wikipedia.org/wiki/Mathematical_induction
Let's say we're designing a function called getSpans(). Here are some
sample behavior we'd like from it:
getSpans([10, 15]) = [(10, 15)]
getSpans([10, 15, 18]) = [(10, 15), (16, 18)]
getSpans([10, 15, 18, 20]) = [(10, 15), (16, 18), (19, 20)]
Would you agree that this is reasonable output for a function like this?
getSpans() takes a list of numbers, and returns a list of pairs of
numbers.
There is one "base" case to this problem. The smallest list we'd like to
consider is a list of two elements. If we see that, then we're happy,
because the answer is really simple:
getSpans([a, b]) = [(a, b)]
Otherwise, let's imagine a list that's a bit longer, with three elements.
Concretely, we know that this is going to look like:
getSpans([a, b, c]) = [(a, b), (b+1, c)]
But another way to say this, though is that:
getSpans([a, b, c]) = [(a, b)] + getSpans([b+1, c])
That is, we try to restate the problem in terms of smaller subproblems.
Let's look at what the case for four elements might look like:
getSpans([a, b, c, d]) = [(a, b), (b+1, c), (c+1, d)]
Concretely, we know that that's the list of spans we'd like to see. But
if we think about it, we might also restate this as:
getSpans([a, b, c, d]) = [a, b] + getSpans([b+1, c, d])
because getSpans([b+1, c, d]) is going to give us:
[(b+1, c), (c+1, d)]
All we need to do is add on [(a, b)] to that to get the complete answer to
getSpans([a, b, c, d]).
Generally, for any particular list L that's longer than two elements:
getExons(L) = [L[0:2]] + getExons([L[1] + 1] + L[2:])
When we work inductively, all we really need to think about is "base case"
and "inductive case": the solution will often just fall through from
stating those two cases. An inductively-designed function is going to
look something like:
def solve(input):
if input looks like a base-case:
handle that directly in a base-case way
else:
break up the problem into smaller pieces
that we assume can be solve()d by induction
The inductive definition above is slightly inefficient because we're doing
physical list slicing. Rewriting it to use loops and list indicies
instead of slicing is a little harder, but not much harder.
Another example: how do we add up a list of numbers? If there's just one
number, that must be the sum. Otherwise, we can add up the first number
to the sum of the rest of the numbers.
#################################
def mysum(L):
if len(L) == 1:
return L[0]
else:
return L[0] + mysum(L[1:])
#################################
It's a funky way of doing this, but this is a real definition that works
(modulo limits in Python's recursion implementation). It's inefficient,
but it's easy to state and reason about. I'm assuming you're more
interested in correctness than efficiency at the moment. Get it correct
first, then if you really need to, work to get it fast.
More information about the Tutor
mailing list