Interesting list() un-optimization
Terry Reedy
tjreedy at udel.edu
Thu Mar 7 17:53:27 EST 2013
On 3/7/2013 3:41 PM, Wolfgang Maier wrote:
>
>>>>> Iterators do not generally have __len__ methods.
>>>>>
>>>>>>>> len(iter(range(10)))
>>>>> Traceback (most recent call last):
>>>>> File "<stdin>", line 1, in <module>
>>>>> TypeError: object of type 'range_iterator' has no len()
>>>>
>>>> But iterators have a length hint method that are used for some
>>>> optimizations and preallocations, too.
>>>>
>>>>>>> i = iter(range(10))
>>>>>>> i.__length_hint__()
>>>> 10
>>>>
>>>> See http://www.python.org/dev/peps/pep-0424/
>>>
>
> very interesting (hadn't heard of it)! Just checked the PEP,
> then tested list()'s behavior, and it is just as described:
>
> class stupid(list):
> def __len__(self):
> print ('len() called')
> return NotImplemented
>
> def __length_hint__(self):
> print ('hint requested')
> l=iter(self).__length_hint__()
> print (l)
> return l
>
> a=stupid((1,2,3))
> len(d)
> ======>
> len() called
>
> Traceback (most recent call last):
> File "<pyshell#79>", line 1, in <module>
> len(d)
> TypeError: an integer is required
>
> list(d)
> ======>
> len() called
> hint requested
> 3
> [1, 2, 3]
>
> so list() first tries to call the iterable's __len__ method. If that raises a
> TypeError it falls back to __length_hint__ .
> What I still don't know is how the listiterator object's __length_hint__ works.
> Why, in this case, does it know that it has a length of 3 ? The PEP does not
> provide any hint how a reasonable hint could be calculated.
'How' depends on the iterator, but when it has an underlying concrete
iterable of known length, it should be rather trivial as .__next__ will
explicitly or implicitly use the count remaining in its operation. Part
of the justification of adding __length_hint__ is that in many cases it
is so easy. Iterators based on an iterator with a length_hint can just
pass it along.
The list iterator might work with a C pointer to the next item and a
countdown count initialized to the list length. The __next__ method
might be something like this mixed C and Python pseudocode:
if (countdown--) return *(current--);
else raise StopIteration
(For non-C coders, postfix -- decrements value *after* retrieving it.)
Then __length_hint__ would just return countdown. Tuples would work the
same way. Sets and dicts would need current-- elaborated to skip over
empty hash table slots. Range iterators would add the step to current,
instead of 1.
--
Terry Jan Reedy
More information about the Python-list
mailing list