[C++-sig] Re: Adding __len__ to range objects

Raoul Gough RaoulGough at yahoo.co.uk
Mon Aug 11 16:25:54 CEST 2003


David Abrahams <dave at boost-consulting.com> writes:

> Raoul Gough <RaoulGough at yahoo.co.uk> writes:
[snip]
>> On the other hand, the following should work OK, unless I'm
>> mistaken:
>>
>> iterator_copy = some_iterator
>> for x in some_iterator: print x
>> for x in iterator_copy: print x    # Prints same
>
> You're mistaken.  iterator_copy and some_iterator both refer to the
> same object.  Try it:
>
>     >>> i = iter(range(5))
>     >>> j = i
>     >>> for x in i: print x
>     ...
>     0
>     1
>     2
>     3
>     4
>     >>> for x in j: print x
>     ...

Of course - silly of me, but I still get confused sometimes by
Python's reference copying.
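
For the record, a minimal sketch of the difference in current Python. itertools.tee is my own addition here and was not part of the discussion above; the variable names are only illustrative. Plain assignment binds a second name to the same iterator, while tee hands back two iterators that can be consumed independently:

```python
import itertools

# Plain assignment: j is just another name for the same iterator object.
i = iter(range(5))
j = i
same_object = j is i
first_pass = list(i)    # walks the shared iterator to the end
second_pass = list(j)   # nothing left, because i and j are one object

# itertools.tee returns independent iterators over the same underlying data.
a, b = itertools.tee(iter(range(5)))
tee_first = list(a)
tee_second = list(b)    # still sees every element
```

Here second_pass comes back empty, whereas tee_second holds the same five values as tee_first.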

>> One real drawback
>> that I thought of is that adding this might break existing code. A
>> (broken) user-defined iterator might not provide enough smarts for
>> std::distance to work, and yet still be compatible with the existing
>> range support. 
>
> It would always supply enough smarts.  However, I am nervous about
> supplying __len__ for anything below random-access, especially for
> input iterators where it is probably destructive.

Well, I'm not so sure - std::distance almost certainly requires
iterator_traits<>::iterator_category, which (AFAIK) range doesn't
require at the moment. Since range currently only needs next()
functionality, it doesn't need to know the iterator's category, but
len() would need that information to be computed efficiently.
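
A rough Python analogy of why the category matters. This is not Boost.Python code, and count_remaining is a made-up helper: measuring a one-pass iterator by walking it uses it up, much as a linear std::distance would on an input iterator, whereas a random-access sequence can report its length without touching the elements.

```python
def count_remaining(it):
    """Count items by stepping through the iterator (linear, and consuming)."""
    return sum(1 for _ in it)


# One-pass case: counting is the traversal, so nothing is left afterwards.
one_pass = iter([10, 20, 30])
n = count_remaining(one_pass)
left_over = list(one_pass)

# Random-access case: the length is known without consuming anything.
seq = [10, 20, 30]
m = len(seq)
still_there = list(seq)
```

Both counts agree, but only seq can still be iterated afterwards.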

>
>> Adding len would break this, since the distance function would get
>> generated even if it is never used or wanted.
>
> I don't think that's a problem.

Well, it certainly wouldn't be hard to fix code that does break.

>
>> I can think of two ways around this: add a new range_with_len type
>> (seems excessive), or provide some way for the client code to access
>> the generated class_ object to add their own extensions (probably
>> difficult?)
>
> I think you're barking up the wrong tree.  Maybe we ought to simply
> change our tune and say that range() produces an iterable-returning
> function rather than an iterator-returning function.  If we did that
> we could always generate __len__, and for that matter we could also
> generate __getitem__/__setitem__ for random-access ranges.

Sounds good to me. I think Andreas was also suggesting something in
this direction. I was almost going to suggest adding a _new_ C++
sequence wrapper (e.g. called view or sequence_view) that provides the
extra stuff as well as __iter__ support via the existing range
code. The only real benefit would be that the client code could then
choose whether to include the more sophisticated support. Maybe you
could achieve this via an optional iterator_category anyway (as you
pointed out, __len__ is probably a bad idea for input iterators).
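
To make the idea concrete, here is a pure-Python sketch of what such an iterable might look like from the Python side. SequenceView and its constructor arguments are hypothetical names of mine, not an existing Boost.Python class, and a list stands in for the wrapped random-access C++ range. Negative indices and slices are left out for brevity.

```python
class SequenceView:
    """Hypothetical stand-in for a wrapped random-access range [start, stop)."""

    def __init__(self, data, start, stop):
        self._data = data
        self._start = start
        self._stop = stop

    def __iter__(self):
        # A fresh iterator on every call, so the view can be traversed repeatedly.
        return (self._data[k] for k in range(self._start, self._stop))

    def __len__(self):
        # Constant time, which is only reasonable for random-access ranges.
        return self._stop - self._start

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        return self._data[self._start + index]

    def __setitem__(self, index, value):
        if not 0 <= index < len(self):
            raise IndexError(index)
        self._data[self._start + index] = value


storage = [1, 2, 3, 4, 5]
view = SequenceView(storage, 1, 4)
first = list(view)
second = list(view)   # a second traversal sees the same elements
view[0] = 20          # writes through to the underlying storage
```

Because __iter__ returns a new iterator each time, the two-loop example from earlier in the thread would print the same values twice with an object like this.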

-- 
Raoul Gough
"Let there be one measure for wine throughout our kingdom, and one
measure for ale, and one measure for corn" - Magna Carta




