What is wrong in my list comprehension?
Jason Scheirer
jason.scheirer at gmail.com
Mon Feb 2 16:41:33 EST 2009
On Feb 1, 3:37 am, Peter Otten <__pete... at web.de> wrote:
> Hussein B wrote:
> > Hey,
> > I have a log file that doesn't contain the word "Haskell" at all, I'm
> > just trying to do a little performance comparison:
> > ++++++++++++++
> > from datetime import time, timedelta, datetime
> > start = datetime.now()
> > print start
> > lines = [line for line in file('/media/sda4/Servers/Apache/
> > Tomcat-6.0.14/logs/catalina.out') if line.find('Haskell')]
> > print 'Number of lines contains "Haskell" = ' + str(len(lines))
> > end = datetime.now()
> > print end
> > ++++++++++++++
> > Well, the script is returning the whole file's lines number !!
> > What is wrong in my logic?
> > Thanks.
>
> """
> find(...)
> S.find(sub [,start [,end]]) -> int
>
> Return the lowest index in S where substring sub is found,
> such that sub is contained within s[start:end]. Optional
> arguments start and end are interpreted as in slice notation.
>
> Return -1 on failure.
> """
>
> a.find(b) returns -1 if b is not found. -1 evaluates to True in a boolean
> context.
>
> Use
>
> [line for line in open(...) if line.find("Haskell") != -1]
>
> or, better
>
> [line for line in open(...) if "Haskell" in line]
>
> to get the expected result.
>
> Peter
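To make the point about find() concrete, here is a small sketch. The sample lines are made up for illustration and stand in for the real log file. It shows that the find() test keeps the non-matching line (-1 is truthy) and also drops a line that starts with the word (index 0 is falsy), while the "in" test does what was intended.

```python
# Hypothetical sample data standing in for the log file's lines.
lines = [
    "Haskell is mentioned first\n",   # find() returns 0  -> falsy
    "learning Haskell today\n",       # find() returns 9  -> truthy
    "nothing relevant here\n",        # find() returns -1 -> truthy
]

# Original test: truthiness of the index, which is not a membership test.
buggy = [line for line in lines if line.find("Haskell")]

# Suggested test: plain substring membership.
fixed = [line for line in lines if "Haskell" in line]

print(len(buggy))
print(len(fixed))
```

Both lists happen to have the same length here, but they hold different lines, which is why comparing counts alone can hide the bug.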
Or better, since only the count is needed, use a generator expression:
sum(1 for line in open(...) if "Haskell" in line)
and avoid allocating a list holding every line that contains "Haskell".
http://www.python.org/dev/peps/pep-0289/
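A minimal sketch of counting with a generator expression, assuming only the number of matching lines is wanted. The in-memory text below is a made-up stand-in for the real log file; with a real file you would iterate over open(...) instead.

```python
import io

# Hypothetical log contents; io.StringIO iterates line by line like a file.
log = io.StringIO(
    u"INFO starting server\n"
    u"DEBUG Haskell parser loaded\n"
    u"INFO request handled\n"
    u"WARN Haskell module slow\n"
)

# Yield 1 per matching line and add them up; no list of lines is built.
count = sum(1 for line in log if "Haskell" in line)

print(count)
```

Note that the generator yields 1 rather than the line itself: sum() starts from the integer 0, so summing the strings directly would raise a TypeError.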