Dive Into Python: call for comments (long)

Alex Martelli aleaxit at yahoo.com
Tue Apr 24 11:38:04 EDT 2001


"Andrew Dalke" <dalke at acm.org> wrote in message
news:9c2lt9$81g$1 at slb6.atl.mindspring.net...
> Steve Lamb wrote:
> >    Well, generally {}'s are less efficient but a little easier to read.
> [that is, M{0,3} is less efficient than M?M?M?]
>
> I disagree with your statement, but not enough to actually *test*
> it to see if I'm right.  I would expect M{0,3} to be of
> comparable performance to (M(MM?)?)? and faster than M?M?M? - in

I've tried a simple test on re:

import re, time, sys, glob

tre1 = re.compile(r'M{0,3}')
tre2 = re.compile(r'M?M?M?')

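# Baseline: same file reading and looping as nonempty(), but no regex
# work.  The pattern argument is accepted only to keep the signatures
# alike, and is unused.
def emptyfun(filenames, pattern):
    count = 0
    for aglob in filenames:
        for filename in glob.glob(aglob):
            lines = open(filename).readlines()
            for line in lines:
                if line:
                    count += 1
    return count

# Same loop, with one search per line.  The parameter is named pattern
# rather than re so that it does not shadow the re module.
def nonempty(filenames, pattern):
    count = 0
    for aglob in filenames:
        for filename in glob.glob(aglob):
            lines = open(filename).readlines()
            for line in lines:
                if pattern.search(line):
                    count += 1
    return count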

def timeit(header, fun, funargs):
    start = time.clock()
    result = fun(*funargs)
    stend = time.clock()
    print header, stend-start
    return result

x = timeit("Empty 1", emptyfun, (sys.argv[1:], tre1))
print "result:",x
x = timeit("Empty 2", emptyfun, (sys.argv[1:], tre2))
print "result:",x
x = timeit("Braces ", nonempty, (sys.argv[1:], tre1))
print "result:",x
x = timeit("Qmarks ", nonempty, (sys.argv[1:], tre2))
print "result:",x
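
A side note on what the test counts: both patterns can match the empty string, so a search succeeds on every line, and the counts are simply the number of lines read. A minimal sketch of that behaviour follows, written for a current Python 3 rather than the 2.1 used above. The sample lines are made up for illustration.

```python
import re

braces = re.compile(r'M{0,3}')
qmarks = re.compile(r'M?M?M?')

# Made-up sample lines: one without any M, one starting with M's, one blank
samples = ["import re\n", "MMXIV\n", "\n"]

for line in samples:
    for pattern in (braces, qmarks):
        m = pattern.search(line)
        # search() never returns None here; the match may simply be empty
        print(repr(line), pattern.pattern, repr(m.group()))
```

So the timings compare how quickly each pattern finds its match at the start of every line, not how often it matches.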


After running it a few times to prime the caches, so
that file-reading time would stop mattering:

D:\Python21>python tre.py Lib\*.py
Empty 1 0.496439771971
result: 54155
Empty 2 0.502022323501
result: 54155
Braces  1.0938423476
result: 54155
Qmarks  1.08212745415
result: 54155

D:\Python21>python tre.py Lib\*.py
Empty 1 0.524246929639
result: 54155
Empty 2 0.476897070187
result: 54155
Braces  1.04125355562
result: 54155
Qmarks  1.08087198768
result: 54155

D:\Python21>python tre.py Lib\*.py
Empty 1 0.494505448456
result: 54155
Empty 2 0.477975698594
result: 54155
Braces  1.01070331265
result: 54155
Qmarks  1.04960852577
result: 54155

D:\Python21>
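
To make the comparison easier to read, here is a small sketch (Python 3 syntax) that takes the figures from the three runs above and prints, for each run, the braces-minus-qmarks difference next to the spread between the two "Empty" baselines. The baselines do identical work, so their spread is used here as a rough gauge of timing noise; that reading is my assumption, not a rigorous measure.

```python
# (empty1, empty2, braces, qmarks) in seconds, copied from the three runs above
runs = [
    (0.496439771971, 0.502022323501, 1.0938423476, 1.08212745415),
    (0.524246929639, 0.476897070187, 1.04125355562, 1.08087198768),
    (0.494505448456, 0.477975698594, 1.01070331265, 1.04960852577),
]

diffs = []
spreads = []
for empty1, empty2, braces, qmarks in runs:
    diff = braces - qmarks            # positive: qmarks faster; negative: braces faster
    spread = abs(empty1 - empty2)     # identical work, no regex: rough noise gauge
    diffs.append(diff)
    spreads.append(spread)
    print("braces - qmarks = %+.4f s, baseline spread = %.4f s" % (diff, spread))
```

The differences come out at roughly +0.012, -0.040 and -0.039 seconds, while the baseline spreads are roughly 0.006, 0.047 and 0.017 seconds, which is the same order of magnitude.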


The performance difference between the question
marks and the braces, over a few tens of
thousands of searches, seems to be within noise
in this test -- once the qmarks are faster by
about 1/100th of a second, twice the braces are
faster by about 4/100th (over 50,000+ searches
in each case).

Choosing between them based strictly on
clarity and readability thus seems justified.
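
For anyone who wants to repeat the comparison on a current Python 3, where time.clock() no longer exists, here is a minimal sketch using the standard timeit module. The in-memory list of lines is a hypothetical stand-in for the Lib\*.py files, so absolute numbers will not be comparable to the ones above, and taking the best of several repeats is my choice rather than part of the original test.

```python
import re
import timeit

braces = re.compile(r'M{0,3}')
qmarks = re.compile(r'M?M?M?')

# Hypothetical stand-in for the lines of Lib\*.py used in the original test
lines = ["MMXIV is a year\n", "def spam(eggs):\n", "    return eggs\n", "\n"] * 2500

def count_matches(pattern, lines):
    """Count the lines on which pattern.search() finds a (possibly empty) match."""
    count = 0
    for line in lines:
        if pattern.search(line):
            count += 1
    return count

def best_time(pattern, repeat=5):
    """Return the fastest of `repeat` timings of one pass over all the lines."""
    timings = timeit.repeat(lambda: count_matches(pattern, lines),
                            number=1, repeat=repeat)
    return min(timings)

if __name__ == "__main__":
    print("Braces", best_time(braces))
    print("Qmarks", best_time(qmarks))
```

Keeping the lines in memory removes the file-reading cost entirely, so no separate empty baseline is needed in this variant.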


Alex





