[Tutor] Omitting lines matching a list of strings from a file

Thu Feb 25 08:15:20 CET 2010

galaxywatcher at gmail.com wrote:
> I am trying to output a list of addresses that do not match a list of 
> State abbreviations. What I have so far is:
>
> def main():
>     infile = open("list.txt", "r")
>     for line in infile:
>         state = line[146:148]
>         omit_states = ['KS', 'KY', 'MA', 'ND', 'NE', 'NJ', 'PR', 'RI', 
> 'SD', 'VI', 'VT', 'WI']
>         for n in omit_states:
>             if state != n:
>                 print line
>     infile.close()
> main()
>
> This outputs multiple duplicate lines. The strange thing is that if I 
> change 'if state == n:' then I correctly output all matching lines. 
> But I don't want that. I want to output all lines that do NOT match 
> the States in the omit_states list.
>
> I am probably overlooking something very simple. Thanks in advance.
>
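Your inner loop prints the line once for every abbreviation in 
omit_states that differs from state, so a non-matching line is printed 
12 times and even a matching line is printed 11 times.  The more 
Pythonic way is to test membership once with `not in`, as listed 
below.  You should also consider normalizing your input (state) with 
.upper() unless you know for certain it is always uppercase.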

state = line[146:148]
omit_states = ['KS', 'KY', ..., 'VT', 'WI']
if state not in omit_states:
    print line
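
For reference, here is a rough sketch of how the whole script might look with the membership test and the .upper() normalization folded in. It is written for Python 3 (print is a function there), and the names filter_lines, OMIT_STATES, start and end are made up for illustration; the column offsets 146:148 and the file name list.txt are taken from your original script.

```python
# Sketch only: filter_lines, OMIT_STATES, start and end are illustrative
# names, not part of the original script.
OMIT_STATES = {'KS', 'KY', 'MA', 'ND', 'NE', 'NJ', 'PR', 'RI',
               'SD', 'VI', 'VT', 'WI'}


def filter_lines(lines, omit=OMIT_STATES, start=146, end=148):
    """Yield each line whose state field is not in omit."""
    for line in lines:
        # Normalize case so 'ks' and 'KS' are treated the same.
        state = line[start:end].upper()
        # One membership test per line, so each line is printed at most once.
        if state not in omit:
            yield line


def main(path="list.txt"):
    # The with block closes the file even if an error occurs.
    with open(path, "r") as infile:
        for line in filter_lines(infile):
            # Each line already ends with a newline, so suppress print's own.
            print(line, end="")
```

Calling main() prints the filtered file. A set is used instead of a list only because membership tests on a set do not scan every element; with twelve entries a list would work just as well.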

-- 
Kind Regards,
Christian Witts
Business Intelligence
