[C++-sig] Re: Adding __len__ to range objects

Raoul Gough RaoulGough at yahoo.co.uk
Tue Aug 19 14:05:28 CEST 2003


"Joel de Guzman" <djowel at gmx.co.uk> writes:

> Raoul Gough <RaoulGough at yahoo.co.uk> wrote:
>
>> I've had a chance to look at Joel's indexing_suite, and there
>> should be quite a lot of overlap in the functionality between
>> indexing_suite and the extended iterator_range. I've been thinking
>> about how to provide the range-as-iterable support within the
>> indexing_suite framework, but it looks like there would have to be
>> some changes to the base classes.
>
> Right. This is one of the things that's holding me back to formally
> announcing the suite. It isn't as simple as it started out to be
> anymore. It is very useful as it is, but the bottomline is that more
> experience with the sub-framework is needed to ascertain that the
> interface will hold. Sure enough, when I added the map-indexing
> suite (just recently), the interface needed to be changed.

I see - so the slicing had to be factored out, because you can't slice
an associative container. In theory you could use lower_bound and
upper_bound to do some kind of slicing using the key type rather than
positional indexing. The Python mapping types don't seem to support
this, though (I guess the Python dictionary type is hash-based, so it
couldn't easily support this idea). Actually, that's a whole other
kettle of fish, looking at the Python mapping functions!

>
>> I'm basically considering an indexing_suite-derived class template
>> that would use two iterators for construction and provide only the
>> relevant support functions (from __iter__ alone for input_iterators
>> through to __getitem__ and __setitem__ for random access iterators
>> with non-const value types). The problem is that the indexing_suite
>> base classes assume that all container functionality is always
>> present, all the way up to __delitem__ (impossible for
>> iterators). Thus, it is currently unsuitable for use with iterator
>> pairs, especially if one considers const_iterators. For instance,
>> the following doesn't come close to compiling:
>> 
>> boost::python::class_<std::vector<int> > ("Vector")
>>  .def (boost::python::vector_indexing_suite<std::vector<int> const>());
>> 
>> For obvious reasons (note the const qualifier). I haven't got far
>> enough to see how many changes it would require to allow containers
>> with less functionality than is currently assumed. Maybe Joel has
>> some comments about the feasibility of this idea?
>
> I've already considered wrapping iterator ranges. I would appreciate
> it a lot if you are interested to help. As to how many changes it
> would require to allow containers with less functionality, not
> much. Right now, I have factored out 1) proxyability and 2)
> slicability protocols. I think what needs to be factored out, in
> addition to above are 3) resizability and perhaps 4) mutability
> (constness).

Well, I wonder about taking the same approach further, e.g. with any
iterator range extensions - it seems like a lot of complexity involved
in *removing* expected functionality. Maybe it would be possible to
split the base class into separate functional units which can be
recombined as appropriate, rather than having an all-in-one base
class?

I used something like that approach in my iterator_range extensions
(at http://home.clara.net/raoulgough/boost/). Basically, I use the MPL
to choose what def() functions to call, and rely on the compiler not
instantiating unused templated functions. e.g.

  template<>
  struct maybe_add_len<true> {
      template<typename Range>
      static void apply (class_<Range> &class_)
      {
          // Specialized version *does* add __len__
          class_.def ("__len__", &distance<Range>);
      }
  };

  // ... in the main class_ setup function:

  maybe_add_len
    <is_convertible<Category, std::forward_iterator_tag>::value>
    ::apply (result);

This only does the class_.def() for __len__ when the iterator is (at
least) a forward_iterator. The unspecialized maybe_add_len has an
empty apply function, so if the <true> specialization is not used:

1. In Python, len() automatically produces "TypeError: len() of
   unsized object"

2. distance<Range> is never used (i.e. its address is never taken) and
   won't even be instantiated.

This could be extended using a traits class that defined things like
has_len, has_slicing, has_append, etc...

I vaguely remember reading something about over-zealous compilers that
instantiate things when they shouldn't, but I would count that as too
broken to work with :-)

>
> You can see the factored out code in <indexing_suite_detail.hpp>
> (see no_proxy_helper, proxy_helper, slice_helper and
> no_slice_helper).  The indexing suite chooses the appropriate
> handler based on some flags and the types involved. The same system
> will need to be set in place to support resizability and mutability.

Could you explain the role of the element proxy vectors (proxy_group)?
I wrote a test case which echoes construction and destruction of
contained elements to stdout, and it looks to me like the elements are
being copied even when the proxy is in use (i.e. a copy happens if I
do "print vector[0]" from Python). Initially, I thought the proxy was
some way to avoid extra copying of UDTs, but I guess that isn't it.

>
> Is it feasible? Of course ;-)

I thought the same thing about extending iterator_range to support len
and getitem, and look where that got me! (I tried three separate
implementations before arriving at the current one). Maybe it was
*too* feasible :-)

-- 
Raoul Gough
"Let there be one measure for wine throughout our kingdom, and one
measure for ale, and one measure for corn" - Magna Carta





More information about the Cplusplus-sig mailing list