[Tutor] word challenge

Raymond Hettinger python@rcn.com
Sat Jul 19 02:35:02 2003


----- Original Message -----
From: "gyro funch" <gyromagnetic@excite.com>
To: <Tutor@python.org>
Sent: Friday, July 18, 2003 2:37 PM
Subject: [Tutor] word challenge


>
> Hi,
> I thought that this might be an interesting challenge for those of you with a 'puzzling' disposition. ;-)
>
> Suppose I have a phrase from which I want to extract a word made up of at least one letter from each of the words (more than one
letter may be taken from each word if desired). The letters must be aligned in the order in which they were taken from the words.
>
> For instance, if I have the phrase: "handy cool phrase"
> I might extract the "h" from "handy", one "o" from "cool", and the "r","s",and "e" from "phrase" to give "horse".

Since the number of paths through the phrase is very large,
a good bet is to start with a dictionary of available words
and test to see if they can be found in the phrase (while
paying attention to letter order and the requirement that
each group be represented).

For speed, there are three quick tests to narrow down the
field:
 * the first letter should be in the first group
 * the last letter should be in the last group
 * each letter must be in the set of letters in the groups

After ruling out words that don't pass the three tests, the
next step is to find possible paths through phrase.  Some
of the complexities and backtracking are illustrated by the test:
   "hot" in "hoo ot otel"

A recursive algorithm is used.  To find where a word is in the
remaining phrase starting from a given position and group,
only the first letter is checked to see if it is in the current group
or next group.  If so, the remainder of the word is checked
starting at the next matching group and position.  If that doesn't
word out, other instances of the first letter are checked for until
there are no more in the current or next group.

class Phrase:
    def __init__(self, s):
        self.words = s.split()
        self.first = dict.fromkeys(self.words[0])
        self.last = dict.fromkeys(self.words[-1])
        self.charmap = {}
        self.conseq = []
        self.group = []
        for i, word in enumerate(self.words):
            for char in word:
                self.charmap.setdefault(char, []).append(i)
                self.conseq.append(char)
                self.group.append(i)
        self.conseq = ''.join(self.conseq)
        self.topgroup = len(self.words)-1

    def __contains__(self, word):
        if word[0] not in self.first: return False
        if word[-1] not in self.last: return False
        for char in word[1:-1]:
            if char not in self.charmap:
                return False
        return self.paths(word)

    def paths(self, word, startpos=0, group=0):
        #print word, startpos, group
        if not word:
            return group == self.topgroup
        char = word[0]
        i = startpos-1
        while 1:
            i = self.conseq.find(char, i+1)
            if i < 0:
                return False
            g = self.group[i]
            if g > group + 1:
                return False
            if g == group and self.paths(word[1:], i+1, group):
                return True
            if g == group+1 and self.paths(word[1:], i+1, group+1):
                return True

P = Phrase("handy cool phrase")
print 'horse' in P
print 'jump' in P

for word in open('mydictionary.txt'):
    if word in P:
       print word


Raymond Hettinger


P.S.  Nice puzzle!