[Tutor] List comprehensions to search a list--amazing!
Steven D'Aprano
steve at pearwood.info
Tue Mar 24 03:21:26 CET 2015
On Mon, Mar 23, 2015 at 08:42:23PM -0500, boB Stepp wrote:
> On Thu, Mar 19, 2015 at 12:10 AM, Dave Angel <davea at davea.name> wrote:
> > The catch to a list comprehension is it has to visit all the elements, while
> > a binary search would visit log-base-2 of them. So instead of 10000
> > elements, you'd be searching about 14 items.
>
> I suspected as much, but had not verified this. Nonetheless, this may
> prove sufficiently fast. I will have to test this with my final data
> files. Right now I am using test cases, while I continue to design,
> check, rewrite, etc.
>
> > For large lists, it'd probably be much quicker to use the bisect module.
> > https://docs.python.org/3.4/library/bisect.html
>
> Can you give me a ballpark number for "large", where this would start
> making a meaningful difference?
Tell us what you consider a meaningful difference :-)
What counts as "too slow" will depend on:
- what you are doing
- how often you are doing it
- what else you're doing at the same time
- what hardware you have to run it on
- whether you are running the program only once, or repeatedly
etc. E.g. taking 30 seconds to iterate over a billion items in a list is
insignificant if this is part of a bigger program which takes twenty
minutes to run, but critical if you are hoping to run the program
thirty times a second, hundreds of times a day.
But I've never heard anyone complain that a program was too fast :-)
(Actually, that's not quite true. Sometimes if the user is expecting
a program function to take a while, say "Rebuild database", and it
actually runs near instantaneously, it is indeed *too fast* because it
can give the user the idea that the rebuild function isn't working. But
that's a UI issue, not a speed issue.)
But if you twist my arm, and force me to pluck some round numbers from
thin air, I would guess:
- for under ten items, a linear search will be insignificantly faster;
- for under a hundred items, there's no meaningful difference;
- for under a million items, binary search will be typically better but
a linear search still acceptable ("everything is fast for small N");
- for over a million items, linear search will typically be
unacceptably slow.
For certain definitions of what's acceptable or not :-)
--
Steve
More information about the Tutor
mailing list