Replacing words from strings except 'and' / 'or' / 'and not'
John Machin
sjmachin at lexicon.net
Sun Nov 28 18:33:13 EST 2004
Skip Montanaro <skip at pobox.com> wrote in message news:<mailman.6853.1101656845.5135.python-list at python.org>...
> >> > Is there a reason to use sets here? I think lists will do as well.
> >>
> >> Sets are implemented using dictionaries, so the "if w in KEYWORDS"
> >> part would be O(1) instead of O(n) as with lists...
> >>
> >> (I.e. searching a list is a brute-force operation, whereas
> >> sets are not.)
>
> Jp> And yet... using sets here is slower in every possible case:
> ...
> Jp> This is a pretty clear example of premature optimization.
>
> I think the set concept is correct. The keywords of interest are best
> thought of as an unordered collection. Lists imply some ordering (or at
> least that potential). Premature optimization would have been realizing
> that scanning a short list of strings was faster than testing for set
> membership and choosing to use lists instead of sets.
>
> Skip
Jp scores extra points for prematurity by not trying out version 2.4
and by not reading the bit about sets now being built in and based on
dicts, dicts being one of the timbot's optimise-the-snot-out-of
targets. Herewith some results from a box with a 1.4 GHz Athlon chip
running Windows 2000:
C:\junk>\python24\python \python24\lib\timeit.py -s "from sets import Set; x = Set(['and', 'or', 'not'])" "None in x"
1000000 loops, best of 3: 1.81 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "from sets import Set; x = Set(['and', 'or', 'not'])" "None in x"
1000000 loops, best of 3: 1.77 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "x = set(['and', 'or', 'not'])" "None in x"
1000000 loops, best of 3: 0.29 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "x = set(['and', 'or', 'not'])" "None in x"
1000000 loops, best of 3: 0.289 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "x = ['and', 'or', 'not']" "None in x"
1000000 loops, best of 3: 0.804 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "x = ['and', 'or', 'not']" "None in x"
1000000 loops, best of 3: 0.81 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "from sets import Set; x = Set(['and', 'or', 'not'])" "'and' in x"
1000000 loops, best of 3: 1.69 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "x = set(['and', 'or', 'not'])" "'and' in x"
1000000 loops, best of 3: 0.243 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "x = set(['and', 'or', 'not'])" "'and' in x"
1000000 loops, best of 3: 0.245 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "x = ['and', 'or', 'not']" "'and' in x"
1000000 loops, best of 3: 0.22 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "x = ['and', 'or', 'not']" "'and' in x"
1000000 loops, best of 3: 0.22 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "x = set(['and', 'or', 'not'])" "'not' in x"
1000000 loops, best of 3: 0.257 usec per loop
C:\junk>\python24\python \python24\lib\timeit.py -s "x = ['and', 'or', 'not']" "'not' in x"
1000000 loops, best of 3: 0.34 usec per loop
tee hee ...
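For anyone who would rather rerun the comparison from a script than from
the command line, here is a rough sketch using the timeit module's Timer
class. It assumes a current Python 3 (where the old sets module no longer
exists, so only the built-in set and a list are compared). The names
KEYWORDS, usec_per_loop and run_benchmark are made up for this
illustration, and the lambda adds call overhead to every measurement, so
the absolute numbers will not match the command-line runs above.

```python
import timeit

# The three keywords from the thread; the probes below cover a miss (None),
# the first list element ('and') and the last list element ('not').
KEYWORDS = ['and', 'or', 'not']


def usec_per_loop(container, probe, number=100000, repeat=3):
    """Best-of-`repeat` time for `probe in container`, in usec per loop."""
    timer = timeit.Timer(lambda: probe in container)
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e6


def run_benchmark(number=100000):
    """Time membership tests against a list and a set of the keywords."""
    containers = {'list': list(KEYWORDS), 'set': set(KEYWORDS)}
    probes = [None, 'and', 'not']
    results = {}
    for name, container in containers.items():
        for probe in probes:
            results[(name, probe)] = usec_per_loop(container, probe, number)
    return results


if __name__ == '__main__':
    # Sort by repr so that keys mixing None and str can be ordered.
    for (name, probe), usec in sorted(run_benchmark().items(), key=repr):
        print('%-4s %-6r %.3f usec per loop' % (name, probe, usec))
```

Results will vary with machine and Python version, so treat the output as
a way to check the ordering on your own box rather than as a replacement
for the figures above.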