[Python-Dev] Currently baking idea for dict.sequpdate(iterable,
Mon, 25 Nov 2002 21:07:39 -0500
> Perhaps the current set implementation could be made faster by
> limiting it somewhat more? The current API attempts to be fast *and*
> flexible, but tends to favor correctness over speed where a trade-off
> has to be made. But maybe that's a poor way of selling a new built-in
> data type, and we would do better by having a truly fast
> implementation that is more limited? It's easier to remove
> limitations than to add them.
I don't think you *can* get "fast" membership testing unless sets inherit
the C-level dict.__contains__ directly. The time burden of going thru a
Python __contains__ method can't be overcome.
For all the rest, it's plenty fast enough for me. Note that the spambayes
project uses sets freely, mostly for uniquification, but also for
convenience in manipulating usually-small sets of email.Message objects. If
you're going to *do* something with the set elements (which that project
does), the time to uniquify the original sequence is likely (as it is in
that project) trivial compared to the rest. For contexts requiring
heavy-duty repeated membership testing, though, spambayes code still used a
straight dict with values 1.
premature-etc-ly y'rs - tim