longest sequence

Mike C. Fletcher mcfletch at rogers.com
Mon Feb 17 21:17:23 CET 2003

This approach, though far less interesting, should be faster than the 
decorate-sort-undecorate one:

def longest( seq ):
    maxLength = -1
    maxItem = None
    for item in seq:
        itemLength = len(item)
        if itemLength > maxLength:
            maxItem = item
            maxLength = itemLength
    return maxItem

My testing gives the following...
Sequence timing
size       forloop    decorate
     10 -> 0.000022   0.000038
    100 -> 0.000113   0.000408
   1000 -> 0.001162   0.007917
  10000 -> 0.014557   0.303996

Which is to say that, for most apps, it's not likely to make a 
difference, but with really big lists you can get slightly better 
performance.  Of course, creating a huge list takes 100s of times longer 
than finding the longest item, so optimising creation is probably going 
to be a much greater win.

Oh, a note, this algo returns the _first_ item of greatest length, spec 
didn't say what to do with multiple longest items (or no items at all 
(this version returns None in that case)).


PS: for the curious, the testing code...

def test(size):
    import random
    seq = [ " "*random.randint(0,32000) for item in range(size) ]
    random.shuffle( seq )
    t = time.clock()
    long = longest( seq )
    t = time.clock()-t
    ts = time.clock()
    s = [(len(i),i) for i in seq ]
    ts = time.clock()-ts
    assert len(long) == len(s[-1][1]), """Didn't get the longest 
    return t,ts

if __name__ == "__main__":
    import time
    print 'Sequence timing'
    print 'size       forloop    decorate'
    for size in [10,100,1000,10000]:
        t,ts = test( size )
        print '%7i -> %0.6f   %0.6f'%(size, t,ts)

Mark McEahern wrote:

>>Yay, optimizations time! :-)
>>def longest2(*args):
>>   if args:
>>      decorated = [ (len(S),S) for S in args ]
>>      decorated.sort()
>>      return decorated[-1][1]
>>This should be a bit quicker. :-)
>Yup, my testing shows your decorate-sort-undecorate approach is twice as
>fast.  Is the trick here that sort() sorts sequences of tuples on the first
>element of each tuple in the sequence?
>// m
  Mike C. Fletcher
  Designer, VR Plumber, Coder

More information about the Python-list mailing list