[issue21507] set and frozenset constructor should use operator.length_hint to guess the size of the iterator

Josh Rosenberg report at bugs.python.org
Thu May 15 04:08:21 CEST 2014


Josh Rosenberg added the comment:

I think the argument against using PyObject_LengthHint for the general iterable case is that, for inputs other than sets or dicts, you have to assume significant deduplication may occur. With your patch, if I run:

myset = frozenset([0] * 1000000)

it will allocate space for, if I'm reading it correctly, two million entries, and use exactly one of them. Obviously not the common case, but preallocating when you have no idea how much duplication will occur seems like a bad idea.
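
To make the mismatch concrete, here is a small Python-level sketch (not part of the patch; the variable names are just illustrative). It compares what operator.length_hint reports for that list with the number of entries the resulting frozenset actually holds:

```python
import operator

# Illustrative input: one million references to the same value.
data = [0] * 1000000

# For a list, length_hint simply reports len(data).
hint = operator.length_hint(data)

# Building the frozenset collapses every duplicate into one entry.
result = frozenset(data)

print("length hint:", hint)
print("entries actually stored:", len(result))
```

The hint only describes how many items the iterable will yield, not how many distinct values it contains, so any preallocation based on it is an upper bound.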

With a set or dict, you know it's already deduplicated, so preallocation is always a good thing, but for the general case, you'll be using more memory than necessary much of the time.
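
As a rough illustration of why sets and dicts are the safe case (again only a sketch, with made-up variable names), the hint for those inputs always matches the size of the resulting frozenset, since their elements or keys are already unique:

```python
import operator

# Illustrative inputs whose elements (or keys) are already deduplicated.
source_set = set(range(1000))
source_dict = dict.fromkeys(range(1000))

for source in (source_set, source_dict):
    hint = operator.length_hint(source)
    result = frozenset(source)  # iterating a dict yields its keys
    # The hint is exact here: no entry is lost to deduplication.
    assert hint == len(result)
    print(type(source).__name__, hint, len(result))
```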

----------
nosy: +josh.rosenberg

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue21507>
_______________________________________

