[Tutor] How does this work?

Christopher Spears cspears2002 at yahoo.com
Tue Mar 14 02:08:37 CET 2006


From Learning Python, I was given this text:

This is a paragraph that mentions bell peppers multiple times. For
one, here is a red pepper and dried tomato salad recipe. I don't like
to use green peppers in my salads as much because they have a harsher
flavor.

This second paragraph mentions red peppers and green peppers but not
the "s" word (s-a-l-a-d), so no bells should show up.

This third paragraph mentions red peppercorns and green peppercorns,
which aren't vegetables but spices (by the way, bell peppers really
aren't peppers, they're chilies, but would you rather have a good cook
or a good botanist prepare your salad?).

I am supposed to write a program to replace the
strings "green pepper" and "red pepper" with "bell
pepper" only if they occur together in a paragraph
before the word "salad" and not if they are followed
(with no space) by the string "corn".  I'm supposed to
do this without regular expressions.

Here is the solution that was given to me:

file = open('pepper.txt')
text = file.read()
paragraphs = text.split('\n\n')

def find_indices_for(big, small):
    indices = []
    cum = 0
    while 1:
        index = big.find(small)
        if index == -1:
            return indices
        indices.append(index+cum)
        big = big[index+len(small):]
        cum = cum + index + len(small)

def fix_paragraphs_with_word(paragraphs, word):
    lenword = len(word)
    for par_no in range(len(paragraphs)):
        p = paragraphs[par_no]
        wordpositions = find_indices_for(p, word)
        if wordpositions == []: continue  # word not in this paragraph; try the next one
        for start in wordpositions:
            # Look for 'pepper' ahead.
            indexpepper = p.find('pepper')
            if indexpepper == -1: break  # no 'pepper' in this paragraph; move on
            if p[start:indexpepper].strip():
                # Something other than whitespace in between!
                continue
            where = indexpepper+len('pepper')
            if p[where:where+len('corn')] == 'corn':
                # It's immediately followed by 'corn'!
                continue
            if p.find('salad') < where:
                # It's not followed by 'salad'.
                continue
            # Finally! We get to do a change!
            p = p[:start] + 'bell' + p[start+lenword:]
            paragraphs[par_no] = p  # Change mutable argument!

fix_paragraphs_with_word(paragraphs, 'red')
fix_paragraphs_with_word(paragraphs, 'green')

for paragraph in paragraphs:
    print paragraph+'\n'
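
The helper that collects the positions can be tried on its own. What
follows is a minimal sketch of the same idea, using the optional start
argument of str.find instead of slicing the string down each time. The
name find_all and the sample string are made up for illustration and
are not from the book.

```python
def find_all(big, small):
    """Return the start index of each non-overlapping occurrence of small in big."""
    indices = []
    start = 0
    while True:
        index = big.find(small, start)  # search from 'start' onward
        if index == -1:
            return indices
        indices.append(index)
        start = index + len(small)      # resume just past this match


sample = 'red pepper, dried tomato, red peppercorn'
print(find_all(sample, 'red'))   # [0, 26]
```

Because the search always runs against the full string, the indices
come back already relative to it, so no running offset like cum is
needed.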

When I first read the question and saw the solution, I
was overwhelmed!  What was going on?!  However, I took
the program apart line by line and understand most of
it now.  I am still fuzzy on a few parts.

if p[start:indexpepper].strip():
    continue

What is that supposed to accomplish?  If the program
can remove whitespace between red and pepper, it's
supposed to move on?
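
One way to poke at this in isolation: strip() does not modify p. It
returns a new string with the leading and trailing whitespace removed,
and an empty string counts as false in an if test. So the check seems
to read "if the slice holds anything besides whitespace, skip this
occurrence". A minimal sketch, where the helper name
has_non_whitespace and the sample strings are hypothetical:

```python
def has_non_whitespace(s):
    """True if s contains anything other than whitespace."""
    return bool(s.strip())   # '' is false, any non-empty string is true


for gap in ['', '   ', '\n', ' hot ']:
    if has_non_whitespace(gap):
        print('%r -> something else in between, so "continue" would run' % gap)
    else:
        print('%r -> empty or only whitespace, so the check is skipped' % gap)
```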

p = p[:start] + 'bell' + p[start+lenword:]
paragraphs[par_no] = p # Change mutable argument!

Not sure how the above lines are supposed to change
the text... Give me regular expressions any day!
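
For what it is worth, the two lines can be tried on a tiny example.
Strings are immutable, so the first line builds a new string from
three pieces: everything before the word, the replacement, and
everything after the word. The second line stores that new string
back into the list, and since a list is mutable and shared with the
caller, the caller sees the change. A minimal sketch, assuming
made-up names (replace_at, fix_first) and made-up sample text:

```python
def replace_at(p, start, old, new):
    """Return a copy of p with old (beginning at index start) replaced by new."""
    return p[:start] + new + p[start + len(old):]


def fix_first(pars):
    """Replace the first 'red' in the first paragraph, changing the list in place."""
    p = pars[0]
    start = p.find('red')
    if start != -1:
        pars[0] = replace_at(p, start, 'red', 'bell')  # assign back into the list


paragraphs = ['a red pepper salad', 'no match here']
fix_first(paragraphs)        # nothing is returned...
print(paragraphs[0])         # ...yet the caller's list now holds: a bell pepper salad
```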

Clarifications are welcome!

