searching algorithm

Neil Cerutti horpner at yahoo.com
Thu May 10 14:32:46 EDT 2007


On 2007-05-10, Gigs_ <gigs at hi.t-com.hr> wrote:
> Hi all!
>
> I have text file (english-croatian dictionary) with words in it
> in alphabetical order. This file contains 179999 words in this
> format: english word: croatian word
>
> I want to make instant search for my gui Instant search, i mean
> that my program search words and show words to user as user
> type letters. yes, it needs to be fast
>
> Can someone give me some points (approaches) how to make this
> Should i make indexes and go with file.seek
>
> or should breake dictionary in peaces and put words that start
> with a in one and with b in another...
>
> So if i have this words
> [abridged dictionary below]
> absinth:pelin
> absinthe:pelin
> absolute:apsolutan
> absolute:apsolutni kod
> absolve:odrije?iti
> absolve:osloboditi
> absorb:absorbirati
> absorb:apsorbirati
> absorb:crpsti
>
> if user type: "abs" program should list all words above in
> english and in croatian if user type: "absorb" than program
> should list last 3 words in english and in croatian

A solution that solves the problem with a data structure might be
a multi-tree.

Each node points at a set of following letters, and a set of
croatian translations, either of which might be empty, making the
node a leaf.

For the above (abrideged) dictionary, you would generate (use a
fixed-width "programmers" font so the tree looks good):

                 a
                 |
                 b
                 |
                 s
                / \
               i   o
              /   / \
             n   l   r
            /   / \   \
           t   u   v   b->(absorbirati, crpisti)
          /    |   |
(pelin)<-h     t   e->(odrije?iti, osloboditi)
         |     |
(pelin)<-e     e->(apsolutan, apsolutni kod)

As the user enter letters, you just march down the tree, printing
all the words held in leaf nodes held in the current node.

-- 
Neil Cerutti
We shall reach greater and greater platitudes of achievement. --Richard J.
Daley



More information about the Python-list mailing list