What is wrong in my list comprehension?

Peter Otten __peter__ at web.de
Mon Feb 2 16:55:05 EST 2009


Jason Scheirer wrote:

> On Feb 1, 3:37 am, Peter Otten <__pete... at web.de> wrote:
>> Hussein B wrote:
>> > Hey,
>> > I have a log file that doesn't contain the word "Haskell" at all, I'm
>> > just trying to do a little performance comparison:
>> > ++++++++++++++
>> > from datetime import time, timedelta, datetime
>> > start = datetime.now()
>> > print start
>> > lines = [line for line in file('/media/sda4/Servers/Apache/
>> > Tomcat-6.0.14/logs/catalina.out') if line.find('Haskell')]
>> > print 'Number of lines contains "Haskell" = ' +  str(len(lines))
>> > end = datetime.now()
>> > print end
>> > ++++++++++++++
>> > Well, the script is returning the whole file's lines number !!
>> > What is wrong in my logic?
>> > Thanks.
>>
>> """
>> find(...)
>> S.find(sub [,start [,end]]) -> int
>>
>> Return the lowest index in S where substring sub is found,
>> such that sub is contained within s[start:end].  Optional
>> arguments start and end are interpreted as in slice notation.
>>
>> Return -1 on failure.
>> """
>>
>> a.find(b) returns -1 if b is not found. -1 evaluates to True in a boolean
>> context.
>>
>> Use
>>
>> [line for line in open(...) if line.find("Haskell") != -1]
>>
>> or, better
>>
>> [line for line in open(...) if "Haskell" in line]
>>
>> to get the expected result.
>>
>> Peter
> 
> Or better, group them together in a generator:
> 
> sum(line for line in open(...) if "Haskell" in line)

You probably mean 

sum(1 for line in open(...) if "Haskell" in line)

if you want to count the lines containing "Haskell", or

sum(line.count("Haskell") for line in open(...) if "Haskell" in line)

if you want to count the occurrences of "Haskell" (where the if clause is
logically superfluous, but may improve performance).
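
To make the difference between the two counts concrete, here is a minimal
sketch. The sample lines and the helper names count_matching_lines and
count_occurrences are made up for illustration (they stand in for the log
file, which I don't have), and it is written for Python 3. Summing the lines
themselves, as in sum(line for ...), would raise a TypeError as soon as one
line matches, because sum() starts from the integer 0.

```python
# Hypothetical sample data standing in for the lines of a log file.
sample_lines = [
    "Haskell and more Haskell\n",
    "nothing to see here\n",
    "one Haskell\n",
]

def count_matching_lines(lines, word):
    """Count the lines that contain word at least once."""
    return sum(1 for line in lines if word in line)

def count_occurrences(lines, word):
    """Count every non-overlapping occurrence of word across all lines."""
    # No if clause here: str.count() already returns 0 for non-matching lines.
    return sum(line.count(word) for line in lines)

print(count_matching_lines(sample_lines, "Haskell"))  # lines with a match
print(count_occurrences(sample_lines, "Haskell"))     # total occurrences
```

Both helpers accept any iterable of strings, so an open file object can be
passed directly and no intermediate list is built.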
 
> and avoid allocating a new list with every line that contains Haskell
> in it.

But note that the OP stated that there were no such lines.
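
For a file without any match, the original test and the corrected ones can be
compared in a small sketch. The two sample lines below are invented (assumed
to stand in for a log without "Haskell"), and the code targets Python 3:

```python
# Hypothetical stand-in for a log file that never mentions "Haskell".
lines = ["first line\n", "second line\n"]

# Original test: find() returns -1 for every line, and -1 is truthy,
# so every line passes the filter.
by_find = [line for line in lines if line.find("Haskell")]

# Corrected tests: compare against -1 explicitly, or use the in operator.
by_find_fixed = [line for line in lines if line.find("Haskell") != -1]
by_in = [line for line in lines if "Haskell" in line]

print(len(by_find), len(by_find_fixed), len(by_in))

# The only lines the original test rejects are those that *start* with the
# word, because find() then returns index 0, which is falsy.
starts_with = "Haskell rocks\n".find("Haskell")
print(starts_with, bool(starts_with))
```

So with the "in" test the resulting list is empty for such a file, and the
cost of building it is negligible either way.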

Peter




