How to remove subset from a file efficiently?

Raymond Hettinger python at
Fri Jan 13 01:29:22 EST 2006

AJL wrote:
> How fast does this run?
> a = set(file('PSP0000320.dat'))
> b = set(file('CBR0000319.dat'))
> file('PSP-CBR.dat', 'w').writelines(a.difference(b))

Turning PSP into a set takes extra time, consumes unnecessary memory,
eliminates duplicates (possibly a bad thing), and loses the original
input ordering (probably a bad thing).
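
As a rough illustration of those last two points, here is a small sketch using made-up in-memory lines in place of the two files (the names psp_lines and cbr_lines are hypothetical stand-ins, not anything from the original post):

```python
# Hypothetical stand-ins for the lines of PSP and CBR.
psp_lines = ["c\n", "a\n", "b\n", "a\n"]
cbr_lines = ["b\n"]

# Set difference: the repeated "a\n" collapses to one entry,
# and the iteration order of the result is not the input order.
as_set = set(psp_lines).difference(cbr_lines)

# Only CBR needs to be a set. Filtering the PSP lines against it
# keeps both the original ordering and any duplicates.
exclude = set(cbr_lines)
filtered = [line for line in psp_lines if line not in exclude]
```

Here as_set holds two lines, while filtered holds three, in their original order.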

To jam the action into a couple lines, try this:

b = set(file('CBR0000319.dat'))
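
One way the same idea might look as a complete function is sketched below. This is an assumption-laden illustration rather than the original continuation of the snippet: the function name remove_subset and its parameters are hypothetical, and it uses open() and with-blocks (current Python) in place of the file() built-in shown above.

```python
def remove_subset(source_path, exclude_path, output_path):
    """Copy the lines of source_path that do not occur in exclude_path.

    Hypothetical helper: only the exclusion file is loaded into a set,
    the source file is streamed line by line, so the original order and
    any duplicate lines in the source are preserved.
    """
    # Load the lines to exclude; set membership tests are fast.
    with open(exclude_path) as exclude_file:
        exclude = set(exclude_file)

    # Stream the source and write out every line not in the exclusion set.
    with open(source_path) as source, open(output_path, "w") as out:
        out.writelines(line for line in source if line not in exclude)
```

With the file names from the question, the call would be remove_subset('PSP0000320.dat', 'CBR0000319.dat', 'PSP-CBR.dat'). Lines are compared including their trailing newline, so a final line that lacks one will not match an otherwise identical line that has one.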

