Checking if string starts with list element

Alex Martelli alex at magenta.com
Mon Aug 14 12:31:14 EDT 2000


"Simon Brunning" <SBrunning at trisystems.co.uk> wrote in message
news:31575A892FF6D1118F5800600846864D4C71D3 at intrepid...
> I have a list of strings:
> romans = ['aqueduct', 'sanitation', 'roads', 'irrigation', 'medicine',
> 'education', 'wine', 'public baths', 'order']
>
> And I have a single string:
> us = 'wine, women and song.'
>
> I want to know whether my sting *starts with* one of the strings in my
list.
> The best that I can do is something like:
>
> for roman in romans:
>     if string.find(us, roman) == 0:
>         # do stuff
>         break
>
> The trouble is that in the code I'm writing, the list has a hundred or so
> elements, and it's running too slowly. Any suggestions?

If you're using the same 100-element list for many searches (or the changes
to the list between searches are small, at least) you can probably gain by
preprocessing the list into a dictionary of smaller lists based on the first
character:

romandict={}
for r in romans:
    c=r[0]
    if romandict.has_key(c):
        romandict[c].append(r)
    else:
        romandict[c]=[r]

and then the search becomes:

c=us[0]
searchlist=romandict.get(c,[])
for r in searchlist:
    if string.find(us, roman) == 0:
        # do stuff
        break

For truly huge lists, or one with many common-starts, you could
use a longer prefix than just 1 character in a similar approach.


Another approach would be to limit your list's preprocessing to a
sort, and then do a binary-search of the 'us' string in the sorted
list; this, in log(N) time, identifies one or two list items which
might be found at the start of 'us'; unfortunately, I don't think
there's a binary-search of a sorted list in Python's library (hope
I'm wrong, but...?), and coding it yourself in Python is not going
to be all that fast (it's also slightly tricky to code, famously).


Alex






More information about the Python-list mailing list