How to remove subset from a file efficiently?
fynali
iladijas at gmail.com
Sat Jan 14 01:16:11 EST 2006
$ time fgrep -x -v -f CBR0000333 PSP0000333 > PSP-CBR.dat.fgrep
real 0m31.551s
user 0m16.841s
sys 0m0.912s
--
$ time ./cleanup.py
real 0m6.080s
user 0m4.836s
sys 0m0.408s
--
$ wc -l PSP-CBR.dat.fgrep PSP-CBR.dat.python
3872421 PSP-CBR.dat.fgrep
3872421 PSP-CBR.dat.python
Fantastic! At any rate, the time is down from my initial ~4 min.
Thank you Chris. The fgrep approach is clean and to the point, and one
more reason to love the *nix approach to handling everyday problems.
Fredrik's set|dict approach in Python above gives me one more reason to
love Python. And it is indeed fast: about 5x faster than fgrep here
(roughly 6 s versus 31 s real time).
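
For anyone reading this later, the set-based idea can be sketched roughly
as follows. This is only an illustration, not the actual cleanup.py (which
is not included in this message): the function name remove_subset and its
parameters are made up for the example, and it assumes both files hold one
record per line and that the smaller file fits in memory.

```python
def remove_subset(full_path, subset_path, out_path):
    """Write every line of full_path that does not appear, as a whole
    line, in subset_path. Returns the number of lines written.

    Hypothetical helper for illustration; intended to mirror what
    `fgrep -x -v -f subset full` does for newline-terminated records.
    """
    # Load the lines to drop into a set so each membership test is
    # a hash lookup rather than a scan of the whole subset file.
    with open(subset_path) as subset_file:
        unwanted = {line.rstrip("\n") for line in subset_file}

    kept = 0
    # Stream the large file line by line so it never has to be held
    # in memory all at once.
    with open(full_path) as src, open(out_path, "w") as dst:
        for line in src:
            record = line.rstrip("\n")
            if record not in unwanted:
                dst.write(record + "\n")
                kept += 1
    return kept
```

With the file names from the timings above, a call would look something
like remove_subset("PSP0000333", "CBR0000333", "PSP-CBR.dat.python").
Only the subset file is loaded into the set; the big file is streamed.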
Thank you all for all your help.
--
fynali