making a valid file name...
bearophileHUGS at lycos.com
Wed Oct 18 19:32:41 EDT 2006
Tim Chase:
> In practice, however, for such small strings as the given
> whitelist, the underlying find() operation likely doesn't put a
> blip on the radar. If your whitelist were some huge document
> that you were searching repeatedly, it could have worse
> performance. Additionally, the find() in the underlying C code
> is likely about as bare-metal as it gets, whereas the set
> membership aspect of things may go through some more convoluted
> setup/teardown/hashing and spend a lot more time further from the
> processor's op-codes.
With this specific test (half good, half bad), on Py2.5, on my PC, sets
start to be faster than the string search when the string "good" is
about 5-6 chars long (this means sets are quite fast, I presume).

from random import choice, seed
from time import clock

def main(choice=choice):
    seed(1)
    n = 100000
    for good in ("ab", "abc", "abcdef", "abcdefgh",
                 "abcdefghijklmnopqrstuvwxyz"):
        # Half of the candidate chars are in "good", half are not
        poss = good + good.upper()
        data = [choice(poss) for _ in xrange(n)] * 10
        print "len(good) = ", len(good)

        # Membership test against the string
        t = clock()
        for c in data:
            c in good
        print round(clock()-t, 2)

        # Membership test against a set (set creation is timed too)
        t = clock()
        sgood = set(good)
        for c in data:
            c in sgood
        print round(clock()-t, 2), "\n"

main()

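
For anyone re-running the test above on Python 3, here is a rough port. This is only a sketch under some assumptions: time.perf_counter stands in for time.clock (which was removed in Python 3.8), range replaces xrange, and the helper names time_membership and compare are made up for this example. Unlike the script above, the set is built outside the timed region. Timings will differ between machines and interpreter versions, so the 5-6 char crossover may not reproduce.

```python
from random import choice, seed
from time import perf_counter  # assumption: replaces time.clock, removed in Python 3.8


def time_membership(container, data):
    """Return the seconds spent evaluating `c in container` for every c in data."""
    start = perf_counter()
    for c in data:
        c in container  # result discarded; only the lookup cost matters
    return perf_counter() - start


def compare(goods=("ab", "abc", "abcdef", "abcdefgh",
                   "abcdefghijklmnopqrstuvwxyz"), n=100000):
    """Time str vs. set membership for each whitelist string.

    Returns a list of (len(good), str_seconds, set_seconds) tuples.
    """
    seed(1)
    results = []
    for good in goods:
        # Half of the candidate chars are in "good", half are not
        poss = good + good.upper()
        data = [choice(poss) for _ in range(n)] * 10
        results.append((len(good),
                        time_membership(good, data),
                        time_membership(set(good), data)))
    return results


if __name__ == "__main__":
    for size, t_str, t_set in compare():
        print("len(good) = %d  str: %.2f  set: %.2f" % (size, t_str, t_set))
```

For steadier numbers the standard timeit module is probably a better tool than a hand-rolled loop, but the version here stays close to the original script.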
Bye,
bearophile