Something keeps nibbling on my list
Alex Martelli
aleaxit at yahoo.com
Wed Apr 18 10:27:58 EDT 2001
"Steve Purcell" <stephen_purcell at yahoo.com> wrote in message
news:mailman.987537806.6571.python-list at python.org...
[snip]
> FILENAME = '/etc/hosts.deny'
> OUTPUT = 'hosts.deny.test'
>
> orig = open(FILENAME)
> lines = orig.readlines()
> orig.close()
>
> def skip(line):
>     return line[:1] == '#'
>
> filtered = open(OUTPUT,'w')
> for linenum in range(len(lines)):
>     line = lines[linenum]
>     if skip(line):
>         continue
>     if line in lines[:linenum]: # duplicate
>         continue
>     filtered.write(line)
>
> filtered.close()
This will work fine. The following also works, and it
may be faster on long files: instead of rescanning
lines[:linenum] for every line, it detects duplicates
with an auxiliary dictionary:
    lines_seen = {}
    def skip(line):
        # true for duplicates and for comment lines
        if line in lines_seen or \
           line.startswith('#'): return 1
        lines_seen[line] = 1
        return 0
    filtered = open(OUTPUT, 'w')
    for line in open(FILENAME).readlines():
        if not skip(line):
            filtered.write(line)
    filtered.close()
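For anyone who wants to try the idea without touching
/etc/hosts.deny, here is a minimal sketch that works on a
plain list of lines. It assumes a reasonably modern Python,
where a set can stand in for the auxiliary dictionary; the
name filter_lines and the sample data are made up for
illustration and are not part of the original script.

```python
def filter_lines(lines):
    """Return lines minus '#' comments and repeats, keeping first-seen order."""
    lines_seen = set()  # stands in for the auxiliary dictionary
    kept = []
    for line in lines:
        # Skip comments and anything already written once.
        if line.startswith('#') or line in lines_seen:
            continue
        lines_seen.add(line)
        kept.append(line)
    return kept

# Hypothetical sample input, not taken from a real hosts.deny
sample = [
    "# comment\n",
    "ALL: 10.0.0.1\n",
    "ALL: 10.0.0.2\n",
    "ALL: 10.0.0.1\n",
]
result = filter_lines(sample)
print(result)
```

Each membership test is a hash lookup, so the cost grows
roughly linearly with the number of lines, where the
slice-scanning version grows roughly quadratically.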
Some would enjoy coding the last block as
    open(OUTPUT, 'w').writelines(
        [line for line in open(FILENAME).readlines()
            if not skip(line)
        ]
    )
but I personally think that's the kind of thing
that gives list comprehensions a bad name, just as:
    open(OUTPUT, 'w').writelines(
        filter(keep, open(FILENAME).readlines())
    )
(with keep defined like skip, but negated) would
be slightly too much of a good (functional) thing.
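For completeness, one way keep might look. This is only a
sketch: keep is not spelled out above, so the body below is
an assumption (skip with the return values flipped), and the
sample list is made up so the snippet runs without any files.

```python
lines_seen = {}

def keep(line):
    """Assumed negation of skip: true for lines that should be written."""
    if line in lines_seen or line.startswith('#'):
        return False
    lines_seen[line] = 1
    return True

# Hypothetical sample input standing in for open(FILENAME).readlines()
sample = [
    "# comment\n",
    "ALL: 10.0.0.1\n",
    "ALL: 10.0.0.1\n",
    "ALL: 10.0.0.2\n",
]
kept = list(filter(keep, sample))
print(kept)
```

Note that keep, like skip, mutates lines_seen as a side
effect, which is part of why stuffing it into filter reads
as a bit too clever.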
Alex