VERY simple string comparison issue
MRAB
google at mrabarnett.plus.com
Wed Dec 24 14:10:51 EST 2008
Brad Causey wrote:
> Python Version: Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC
> v.1310 32 bit (Intel)] on win32
>
> List,
>
> I am trying to do some basic log parsing, and well, I am absolutely
> floored at this seemingly simple problem. I am by no means a novice in
> Python, yet this is really stumping me. I have extracted the
> pertinent code snippets and modified them to function as a standalone
> script. Basically I am reading a log file (in this case, testlog.log)
> for entries and comparing them to entries in a safe list (in this case,
> safelist.lst). I have spent numerous hours doing this several ways and
> this is the simplest way I can come up with:
>
> <code>
> import string
>
> safelistfh = file('safelist.lst', 'r')
> safelist = safelistfh.readlines()
>
> logfh = file('testlog.log', 'r')
> loglines = logfh.readlines()
>
> def safecheck(line):
>     for entry in safelist:
>         print 'I am searching for\n'
>         print entry
>         print '\n'
>         print 'to exist in\n'
>         print line
>         comp = line.find(entry)
>         if comp <> -1:
>             out = 'Failed'
>         else:
>             out = 'Passed'
>     return out
>
Unless I've misunderstood what you're doing, wouldn't it be better as:
def safecheck(line):
    for entry in safelist:
        print 'I am searching for\n'
        print entry
        print '\n'
        print 'to exist in\n'
        print line
        if entry in line:
            return 'Passed'
    return 'Failed'
> for log in loglines:
>     finalentry = safecheck(log)
>     if finalentry == 'Failed':
>         print 'This is an internal site'
>     else:
>         print 'This is an external site'
> </code>
>
Actually, I think it would be better to use True and False instead of
'Passed' and 'Failed'.
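A minimal sketch of that idea, assuming the safelist entries have already
had their trailing newlines removed. The name is_safe and the safelist
parameter are illustrative only and are not part of the original script:

```python
def is_safe(line, safelist):
    """Return True if any safelist entry occurs in line, else False."""
    for entry in safelist:
        if entry in line:
            return True
    return False

# Sample data mirroring the two files in the post (newlines already removed).
safelist = ['http://www.mysite.com']
log = 'http://www.mysite.com/images/homepage/xmlslideshow-personal.swf'

if is_safe(log, safelist):
    print('This is an internal site')
else:
    print('This is an external site')
```

Returning a boolean lets the caller test the result directly instead of
comparing it against a string.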
> The contents of the two files are as follows:
>
> <safelist.lst>
> http://www.mysite.com
> </safelist.lst>
>
> <testlog.log>
> http://www.mysite.com/images/homepage/xmlslideshow-personal.swf
> </testlog.log>
>
> It seems that no matter what I do, I can't get this to fail the "if
> comp <> -1:" check. (My goal is for the check to fail so that I know
> this is just a URL to a safe [internal] site)
> My assumption is that the HTTP:// is somehow affecting the searching
> capabilities of the string.find function. But I can't seem to locate any
> documentation online that outlines restrictions when using special
> characters.
>
> Any thoughts?
>
Either way, you'll still need to strip the trailing '\n' off each safelist
entry.
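To see why, here is a small sketch that uses in-memory strings in place of
the two files (the sample values copy the file contents quoted above, and
the variable names are only for illustration). readlines() leaves the line
terminator on each item, so the safelist entry ends in '\n' and could only
match at the very end of a log line:

```python
# What readlines() would return for the two one-line files:
# each item keeps its trailing '\n'.
safelist = ['http://www.mysite.com\n']
loglines = ['http://www.mysite.com/images/homepage/xmlslideshow-personal.swf\n']

entry = safelist[0]
line = loglines[0]

# The raw entry includes the newline, so it is not a substring of the log line
# (in the log line, 'com' is followed by '/', not '\n').
raw_match = entry in line

# Stripping surrounding whitespace from the entry first lets the test succeed.
stripped_match = entry.strip() in line

print(raw_match)
print(stripped_match)
```

In the script itself, one option would be to strip each entry once, right
after reading safelist.lst, rather than on every comparison.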