How to remove subset from a file efficiently?

Fredrik Lundh fredrik at
Sat Jan 14 02:52:06 EST 2006

"fynali" wrote:

> Is a rewrite possible of Raymond's or Fredrik's suggestions above which
> will still give me the time saving made?

Python 2.2 doesn't have a readymade set type (new in 2.3), and it doesn't
support generator expressions (the thing that caused the syntax error).

however, using a dictionary instead of the set

    barred = {}
    for number in open('/home/sjd/python/wip/CBR0000319.dat'):
        barred[number] = None # just add it as a key

and a list comprehension instead of the generator expression

    outfile.writelines([number for number in infile if number not in barred])

(note the extra brackets)

should give you decent performance under 2.2.
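
put together, the two pieces might look something like the sketch below.
the function name remove_barred and its three path arguments are made up
for illustration (they are not from the original script), and it assumes
every line in both files ends with a newline, since lines are compared
with the newline included.

```python
def remove_barred(barred_path, in_path, out_path):
    # hypothetical helper: copy in_path to out_path, skipping every line
    # that also appears in barred_path

    # a dictionary standing in for the set; only the keys matter
    barred = {}
    barred_file = open(barred_path)
    for number in barred_file:
        barred[number] = None  # just add it as a key
    barred_file.close()

    infile = open(in_path)
    outfile = open(out_path, 'w')
    # list comprehension (note the brackets) instead of a generator expression
    outfile.writelines([number for number in infile if number not in barred])
    outfile.close()
    infile.close()
```

the "not in barred" test is a dictionary key lookup, so the cost per line
stays roughly constant no matter how many barred numbers there are.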

