[Python-Dev] Prospective Peephole Transformation

M.-A. Lemburg mal at egenix.com
Sat Feb 19 00:42:35 CET 2005


Raymond Hettinger wrote:
>>Wouldn't it help a lot more if the compiler would detect that
>>(1,2,3) is immutable and convert it into a constant at
>>compile time ?!
> 
> 
> Yes.  We've already gotten it to that point:
> 
> Python 2.5a0 (#46, Feb 15 2005, 19:11:35) [MSC v.1200 32 bit (Intel)] on
> win32
> 
>>>>import dis
>>>>dis.dis(compile('x in ("xml", "html", "css")', '', 'eval'))
> 
>   0           0 LOAD_NAME                0 (x)
>               3 LOAD_CONST               3 (('xml', 'html', 'css'))
>               6 COMPARE_OP               6 (in)
>               9 RETURN_VALUE  

Cool. Does that work for all tuples in the program ?

> The question is whether to go a step further to replace the linear
> search with a single hashed lookup:
> 
>   0           0 LOAD_NAME                0 (x)
>               3 LOAD_CONST               3 (searchset(['xml', 'html',
> 'css']))
>               6 COMPARE_OP               6 (in)
>               9 RETURN_VALUE  
> 
> This situation seems to arise often in source code.  You can see the
> cases in the standard library with:   grep 'in ("' *.py

I did a search on our code and Python's std lib. It turns
out that by far most such usages use either 2 or 3
values in the tuple. If you look at the types of the
values, the most common usages are strings and integers.

I'd assume that you'll get somewhat different results
from your benchmark if you had integers in the tuple.

> The transformation is easy to make at compile time.  The part holding me
> back is the introduction of searchset as a frozenset subtype and
> teaching marshal how to put it a pyc file.

Hmm, what if you'd teach tuples to do faster contains lookups for
string or integer only content, e.g. by introducing sub-types for
string-only and integer-only tuples ?!

> FWIW, some sample timings are included below (using frozenset to
> approximate what searchset would do).  The summary is that the tuple
> search takes .49usec plus .12usec for each item searched until a match
> is found.  The frozenset lookup takes a constant .53 usec.
> 
> 
> 
> Raymond
> 
> 
> 
> ------------------------------------------------------------------------
> 
> C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='xml'"
> "x in s"
> 1000000 loops, best of 9: 0.49 usec per loop
> 
> C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='css'"
> "x in s"
> 1000000 loops, best of 9: 0.621 usec per loop
> 
> C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='html'"
> "x in s"
> 1000000 loops, best of 9: 0.747 usec per loop
> 
> C:\py25>python -m timeit -r9 -s "s=('xml', 'css', 'html')" -s "x='pdf'"
> "x in s"
> 100000 loops, best of 9: 0.851 usec per loop
> 
> C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s
> "x='xml'" "x in s"
> 1000000 loops, best of 9: 0.529 usec per loop
> 
> C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s
> "x='css'" "x in s"
> 1000000 loops, best of 9: 0.522 usec per loop
> 
> C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s
> "x='html'" "x in s"
> 1000000 loops, best of 9: 0.53 usec per loop
> 
> C:\py25>python -m timeit -r9 -s "s=frozenset(['xml', 'css', 'html'])" -s
> "x='pdf'" "x in s"
> 1000000 loops, best of 9: 0.523 usec per loop

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Feb 19 2005)
 >>> Python/Zope Consulting and Support ...        http://www.egenix.com/
 >>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::


More information about the Python-Dev mailing list