Efficient grep using Python?
tim.peters at gmail.com
Thu Dec 16 05:54:51 CET 2004
> fromkeys(open(f).readlines()) and fromkeys(open(f)) seem to be
Semantically, yes; pragmatically, no, in the way explained before.
> When I pass an iterator instance(or a generator iterator) to the
> dict.fromkeys, it is expanded at that moment,
I don't know what "expanded at that moment" means to you. The CPython
implementation of dict.fromkeys() alternates between getting the next
value from its iterable argument, and storing that value as a dict
key. It does that regardless of whether a list, or any other kind of
iterable object, is passed to it. So the difference isn't in
fromkeys(), it's in what's passed to fromkeys().
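
To make the alternation concrete, here is a rough pure-Python sketch of that loop, plus a small tracer. This is not the actual C code: fromkeys_sketch, trace, Key and produce are names made up for this illustration, and the "store" event is approximated by logging when the dict hashes the key.

```python
def fromkeys_sketch(iterable, value=None):
    """Rough pure-Python approximation of dict.fromkeys() (not the C source)."""
    result = {}
    for key in iterable:       # get the next value from the iterable ...
        result[key] = value    # ... and store it as a key before asking for another
    return result


def trace(wrap, fromkeys=dict.fromkeys):
    """Log the order of 'next' and 'store' events for one fromkeys() call.

    `wrap` decides what is passed in: the generator itself, or a list built from it.
    """
    log = []

    class Key:
        def __init__(self, name):
            self.name = name

        def __hash__(self):
            # The dict hashes a key when it stores it, so this marks the store.
            log.append(("store", self.name))
            return hash(self.name)

        def __eq__(self, other):
            return self.name == other.name

    def produce():
        for name in "ab":
            log.append(("next", name))
            yield Key(name)

    fromkeys(wrap(produce()))
    return log


lazy_log = trace(lambda gen: gen)   # pass the generator straight through
list_log = trace(list)              # materialize a list first
sketch_log = trace(lambda gen: gen, fromkeys_sketch)

print("generator:", lazy_log)
print("list:     ", list_log)
```

On CPython I'd expect the generator run to interleave "next" and "store" events, while the list run shows every "next" before the first "store". fromkeys() runs the same loop both times; only the argument differs.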
> thus fromkeys(open(f)) is effectively same with
> fromkeys(list(open(f))) and fromkeys(open(f).readlines()).
Semantically, yes; and the last two are pragmatically the same too.
The first is pragmatically different.
> Am I missing something?
You at least were <wink>.
Build a file containing a million long identical lines (so the dict
only has 1 entry in the end). Try all 3 spellings and watch their
memory use. Report what you find.
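
If you'd rather script the experiment than watch a process monitor, something along these lines should do. It's only a sketch under stated assumptions: it uses the tracemalloc module from later Python versions, a much smaller file than a million lines so it runs quickly, and helper names (peak_bytes, spell_iter, spell_list, spell_readlines) invented here. tracemalloc only sees allocations made through Python's allocators, so treat the numbers as indicative.

```python
import os
import tempfile
import tracemalloc

N_LINES = 10_000     # assumption: scaled down from "a million" to keep the run short
LINE_LEN = 1_000     # assumption: what counts as a "long" line


def spell_iter(path):
    with open(path) as f:
        return dict.fromkeys(f)              # file object passed directly


def spell_list(path):
    with open(path) as f:
        return dict.fromkeys(list(f))        # whole file turned into a list first


def spell_readlines(path):
    with open(path) as f:
        return dict.fromkeys(f.readlines())  # likewise a full list of lines


def peak_bytes(build, path):
    """Return (number of dict entries, peak traced bytes) for one spelling."""
    tracemalloc.start()
    try:
        d = build(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return len(d), peak


# Build a file of identical long lines, so the dict ends up with one entry.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    line = "x" * LINE_LEN + "\n"
    for _ in range(N_LINES):
        out.write(line)

try:
    results = {
        build.__name__: peak_bytes(build, path)
        for build in (spell_iter, spell_list, spell_readlines)
    }
finally:
    os.remove(path)

for name, (entries, peak) in results.items():
    print(f"{name:16s} entries={entries}  peak={peak:,} bytes")
```

All three spellings should report a single entry. The peak figures are where they part ways, and raising N_LINES toward a million makes the gap hard to miss.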