[Tutor] List comprehensions

Thu Apr 10 14:20:39 CEST 2008

Dinesh B Vadhia wrote:
> Kent
>  
> I'm using a Javascript autocomplete plugin for an online web 
> application/service.  Each time a user inputs a character, the character 
> is sent to the backend Python program which searches for the character 
> in a list of >10,000 string items.  Once it finds the character, the 
> backend will return that string and N other adjacent string items where 
> N can vary from 20 to 150.  Each string item is sent back to the JS in 
> separate print statements.  Hence, the for loop.

Ok, this sounds a little closer to a real spec. What kind of search are 
you doing? Do you really just search for individual characters or are 
you looking for the entire string entered so far as a prefix? Is the 
list of 10,000 items sorted? Can it be?

You need to look at your real problem and find an appropriate data 
structure, rather than showing us what you think is the solution and 
asking how to make it faster.

For example, if what you have a sorted list of strings and you want to 
find the first string that starts with a given prefix and return the N 
adjacent strings, you could use the bisect module to do a binary search 
rather than a linear search. Binary search of 10,000 items will take 
13-14 comparisons to find the correct location. Your linear search will 
take an average of 5,000 comparisons.

You might also want to use a trie structure though I'm not sure if that 
will let you find adjacent items.
http://www.cs.mcgill.ca/~cs251/OldCourses/1997/topic7/
http://jtauber.com/blog/2005/02/10/updated_python_trie_implementation/

> I haven't done any profiling yet as we are still building the system but 
> it seemed sensible that replacing the for loop with a built-in would 
> help.  Maybe not?

Not. An algorithm with poor "big O" performance should be *replaced*, 
not optimized.

Kent