simple text filter

John Machin sjmachin at
Thu Jun 12 15:28:33 CEST 2003

boutrosp at wrote in message news:<f903b9dd.0306111236.1ca87c93 at>...
> I need some help on a simple text filter. The problem I am having is
> when the file comes to the end it stays in the while loop and does not
> exit. I cannot figure this out. I would use a for loop with the
> readlines() but my datasets can range from 5 to 80 MB of text data.
> Here is the code I am using. Please help.
> import sys, re
> p1 = re.compile('ADT100')
> p8 = re.compile('ATAP')
> f=open('adt100_0489.rpt.txt', 'r')
> junky = 1
> done = False
> while not done :
>         junky = f.readline()
>         if :
>                 continue
>         elif :
>                 continue
>         elif junky == None :
>                 done = True
>         else :
>                 print junky
> f.close()

Try this:

import sys, re
good_stuff = re.compile(
   # list these in descending frequency order
for aline in file(sys.argv[1]):
   # hardcoded file names not a good idea
   if not good_stuff.search(aline):
      # trailing comma: aline already ends with its own newline
      print aline,
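
As for why the quoted loop never exits: readline() returns an empty string at end of file, never None, so the test "junky == None" is never true. Iterating over the file object avoids the problem and still reads one line at a time, so an 80 MB file is no trouble. Below is a minimal, self-contained sketch of the same idea in current Python syntax. The names UNWANTED and filter_lines are made up for illustration, and the pattern holds only the two strings visible in the quoted code; the full list is not shown in the post.

```python
import io
import re

# Only ADT100 and ATAP are visible in the quoted code; extend the
# alternation with the remaining patterns as needed.
UNWANTED = re.compile(r"ADT100|ATAP")


def filter_lines(lines, unwanted=UNWANTED):
    """Yield only the lines that contain none of the unwanted patterns."""
    for line in lines:
        if not unwanted.search(line):
            yield line


# readline() signals end of file with an empty string, not None.
empty = io.StringIO("")
assert empty.readline() == ""

# Stand-in for the real report file, so the sketch runs on its own.
sample = io.StringIO("ADT100 header\nkeep me\nATAP footer\nkeep me too\n")
kept = list(filter_lines(sample))
print("".join(kept), end="")
```

With a real file, something like sys.stdout.writelines(filter_lines(open(sys.argv[1]))) would do the same job without loading the whole file into memory.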

You may want to ensure that you don't match e.g DROPKICK when you only
Note carefully the r prefix (raw string).
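
To illustrate the point about partial matches and raw strings, here is a small sketch. The keyword DROP is an assumption chosen only to pair with the DROPKICK example; it does not come from the original data.

```python
import re

# Hypothetical keyword, assumed purely for illustration.
loose = re.compile(r"DROP")
strict = re.compile(r"\bDROP\b")  # \b = word boundary

# The loose pattern also fires inside a longer word.
assert loose.search("DROPKICK to the head") is not None

# The word-boundary version matches only the whole word.
assert strict.search("DROPKICK to the head") is None
assert strict.search("DROP table") is not None

# Without the r prefix, "\b" is a backspace character (0x08),
# so the pattern silently stops meaning "word boundary".
assert "\b" == "\x08"
not_raw = re.compile("\bDROP\b")
assert not_raw.search("DROP table") is None
```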
