[C++-sig] Re: Adding __len__ to range objects

Fri Aug 22 21:27:16 CEST 2003

"Joel de Guzman" <djowel at gmx.co.uk> writes:

> Raoul Gough <RaoulGough at yahoo.co.uk> wrote:
>> "Joel de Guzman" <djowel at gmx.co.uk> writes:

[snip digression into associative mappings]

>> Well, I wonder about taking the same approach further, e.g. with
>> any iterator range extensions - it seems like a lot of complexity
>> involved in *removing* expected functionality. Maybe it would be
>> possible to split the base class into separate functional units
>> which can be recombined as appropriate, rather than having an
>> all-in-one base class?
>
> That's a possibility, yes. We can contain the functionality into a
> couple of protocol groups. Making the suite finer-grained, but not
> too much.  Can you think of a nice way to group the existing and
> future functionality?  As it is, the protocols I outlined
> (proxyability, slicability, resizability and mutability) are
> somewhat implementation-centric.

The more I think about it, the more I see this as two separate domains of
freedom. Firstly, there is the raw functionality of the C++ container
(e.g. random access indexing, resizability, etc.) and then there are
the Python interfacing aspects (i.e. proxies and return policies).

For the container functionality aspects, we could probably break the
suite up simply in terms of what Python methods are implemented (like
the maybe_define_len, maybe_define_getitem functions I wrote for
iterator_range). A likely problem with this approach is achieving good
reuse of common code, but maybe extending the existing container_utils
could provide an answer.

Take the __len__ method for example. There are probably only two
different implementations of this on the C++ side - either using the
container::size member function, or std::distance(begin_iter,
end_iter). This could mean something like:

  template<typename Container>
  struct memfn_len {
    // Length via the container's own size() member function.
    static typename Container::size_type
    apply (typename call_traits<Container>::param_type c) {
      return c.size();
    }
  };

  template<typename Iterator>
  struct iterator_len {
    // Length by counting the distance between the two iterators.
    static typename std::iterator_traits<Iterator>::difference_type
    apply (iterator_pair<Iterator> const &c) {
      return std::distance (c.begin(), c.end());
    }
  };

This still requires some glue to choose the right version and convert
the return value to Python. In general, there may also need to be glue
to extract parameters from Python.
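
As a rough sketch of the "choose the right version" part of that glue,
a small selector template could map the argument type onto one of the
two implementations. Everything here is an assumption for illustration:
iterator_pair is a minimal stand-in, len_chooser and generic_len are
made-up names, the container is taken by const reference instead of
call_traits, and the conversion of the result to Python is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical stand-in for the iterator_pair used above.
template<typename Iterator>
struct iterator_pair {
  Iterator first, last;
  Iterator begin() const { return first; }
  Iterator end() const { return last; }
};

// Length via the container's size() member function.
template<typename Container>
struct memfn_len {
  static std::size_t apply (Container const &c) { return c.size(); }
};

// Length via std::distance over the iterator pair.
template<typename Iterator>
struct iterator_len {
  static std::size_t apply (iterator_pair<Iterator> const &c) {
    typename std::iterator_traits<Iterator>::difference_type
      n = std::distance (c.begin(), c.end());
    assert (n >= 0);  // a valid range never has negative length
    return static_cast<std::size_t> (n);
  }
};

// Glue (hypothetical): by default assume a size() member ...
template<typename T>
struct len_chooser { typedef memfn_len<T> type; };

// ... but route iterator pairs through std::distance.
template<typename Iterator>
struct len_chooser<iterator_pair<Iterator> > {
  typedef iterator_len<Iterator> type;
};

// Single entry point that a __len__ wrapper could forward to.
template<typename T>
std::size_t generic_len (T const &c) {
  return len_chooser<T>::type::apply (c);
}
```

With this, generic_len on a std::vector goes through size(), while
generic_len on an iterator_pair over a std::list walks the range.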

[snip]
>> This could be extended using a traits class that defined things
>> like has_len, has_slicing, has_append, etc...
>
> In some cases, hard-coding the enabling/disabling of features to
> types is not desirable. Sometimes, for example, you don't want
> proxying for vectors, even though they could support it. I opt for
> more user control over which is and which is not enabled.

This ties in with what I was thinking about the separation of
container functionality from the interfacing aspects of the
parameterization.

Using a traits class doesn't necessarily hard-code the determination
of what features to provide, it just alters the packaging of that
information. My personal preference is to package the information into
as few template parameters as possible. For example,

  template <class Container
           , class Traits = indexing_traits<Container>
           , bool UseProxy = Traits::default_use_proxy>
    class indexing_suite { /*... */ };

The client code could always substitute custom_indexing_traits for the
second parameter (possibly deriving the custom traits from an existing
traits instance). I guess I would still keep the proxy parameter
separate, since to proxy or not is somewhat orthogonal to the features
of the container type itself (which is what you were getting at, I
think).

Some of the advantages for traits over multiple parameters are (IMO):

1. the traits class provides the parameters as named constants, rather
   than having bare true/false values in client code.

2. it removes any particular ordering of the parameters, so you don't
   have to provide explicitly all parameters up to the one you want to
   override.

Basically, I think a traits class scales better as the number of
parameters grows. It also might provide some support functions (I'm
wondering about the kind of functions that std::char_traits has, like
compare and find).
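
To make the traits idea a bit more concrete, here is a minimal sketch
of how a custom traits class might override a single named constant
while inheriting the rest. The constant names follow the has_len,
has_slicing and has_append suggestion above; no_append_traits, the
provides_* members and check_traits_example are hypothetical names
for this example, and the particular default values are arbitrary.

```cpp
#include <cassert>
#include <vector>

// Hypothetical default traits: conservative settings for any container.
template<typename Container>
struct indexing_traits {
  static const bool has_len = true;
  static const bool has_slicing = false;
  static const bool has_append = false;
  static const bool default_use_proxy = false;
};

// Hypothetical specialization: std::vector supports everything.
template<typename T, typename A>
struct indexing_traits<std::vector<T, A> > {
  static const bool has_len = true;
  static const bool has_slicing = true;
  static const bool has_append = true;
  static const bool default_use_proxy = true;
};

// Client-side custom traits: derive from the existing traits and
// override one named constant (the derived name hides the base one).
struct no_append_traits : indexing_traits<std::vector<int> > {
  static const bool has_append = false;
};

template <class Container
         , class Traits = indexing_traits<Container>
         , bool UseProxy = Traits::default_use_proxy>
struct indexing_suite {
  // A real suite would define Python methods based on these flags.
  static const bool provides_len = Traits::has_len;
  static const bool provides_append = Traits::has_append;
  static const bool uses_proxy = UseProxy;
};

// Usage: only the overridden feature changes; the proxy default is
// still picked up from the base traits.
inline void check_traits_example() {
  typedef indexing_suite<std::vector<int> > default_suite;
  typedef indexing_suite<std::vector<int>, no_append_traits> custom_suite;
  assert (default_suite::provides_append);
  assert (!custom_suite::provides_append);
  assert (custom_suite::uses_proxy);
}
```

Note that the client never spells out a bare true/false for the
features it leaves alone, and the order of the constants is irrelevant.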

>>> You can see the factored out code in <indexing_suite_detail.hpp>
>>> (see no_proxy_helper, proxy_helper, slice_helper and
>>> no_slice_helper).  The indexing suite chooses the appropriate
>>> handler based on some flags and the types involved. The same
>>> system will need to be set in place to support resizability and
>>> mutability.
>> 
>> Could you explain the role of the element proxy vectors
>> (proxy_group)?  I wrote a test case which echoes construction and
>> destruction of contained elements to stdout, and it looks to me
>> like the elements are being copied even when the proxy is in use
>> (i.e. a copy happens if I do "print vector[0]" from
>> Python). Initially, I thought the proxy was some way to avoid extra
>> copying of UDTs, but I guess that isn't it.
>
> Following Python semantics, basically, we want this:
>
>     val = c[i]
>     c[i].m()
>     val == c[i]
>
> where m is a non-const (mutating) member function (method). Yet, we
> also want this:
>
>     val = c[i]
>     del c[i]
>     # do something with val
>
> No dangling references == no return_internal_reference
>
>
> A proxy is basically a smart-pointer that has a reference to a
> container and an index. When *attached*, dereferencing the proxy
> will index the container. A proxy can be detached from the
> container. In the *detached* state, it holds the actual object which
> it returns when dereferenced. A proxy will be detached whenever you
> remove or replace that element from the container (e.g. del C[I],
> C[0:I] = [a,b,c], C[I] = x).
>
> The proxy_group manages the proxies and is responsible for detaching
> a proxy appropriately when necessary.
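
As I read that description, the attached/detached behaviour could be
sketched roughly as below. This is only an illustration of the idea,
not the code in <indexing_suite_detail.hpp>: element_proxy and its
members are hypothetical names, and the sketch leaves out the
proxy_group bookkeeping entirely, so the caller has to call detach()
by hand before removing or replacing the element.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical proxy: refers to container[index] while attached and
// owns a private copy of the element once detached.
template<typename Container>
class element_proxy {
public:
  typedef typename Container::value_type value_type;

  element_proxy (Container &c, std::size_t i)
    : container_ (&c), index_ (i) {}

  bool attached() const { return container_ != nullptr; }

  // Attached: index the container. Detached: return the owned copy.
  value_type &operator*() {
    if (attached()) return (*container_)[index_];
    assert (detached_);  // set by detach() before container_ is cleared
    return *detached_;
  }

  // Must be called before the element is removed or replaced.
  void detach() {
    if (attached()) {
      detached_.reset (new value_type ((*container_)[index_]));
      container_ = nullptr;
    }
  }

private:
  Container *container_;                  // null once detached
  std::size_t index_;
  std::unique_ptr<value_type> detached_;  // copy held when detached
};
```

While attached, writes through the proxy land in the container; after
detach() the proxy keeps the last value and no longer touches the
container, which is what makes "val = c[i]; del c[i]" safe.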

Well, I've got to admire that for dedication to Python semantics! I
can think of an alternative, though, that would simplify the suite
considerably: document a requirement that if the client code wants
Python-style semantics it should use a container of shared_ptr. This
obviously has some disadvantages over the proxies, but I think it
achieves the same semantics without any additional complexity in the
indexing suite.
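
A small sketch of what I mean, using std::shared_ptr as a stand-in for
whichever shared pointer the client code actually uses. The item type
and the get_item helper are hypothetical names for this example; the
point is only that returning a copy of the pointer gives both the
aliasing and the no-dangling behaviour from the two Python snippets
quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical element type with a mutating member function.
struct item {
  int value;
  explicit item (int v) : value (v) {}
  void m() { ++value; }
};

typedef std::vector<std::shared_ptr<item> > item_vector;

// What a __getitem__ wrapper could return: a copy of the shared
// pointer, so the caller shares ownership of the element.
inline std::shared_ptr<item> get_item (item_vector const &c, std::size_t i) {
  assert (i < c.size());
  return c[i];
}
```

With val = get_item(c, 0), a later get_item(c, 0)->m() is visible
through val, and erasing the element from the vector leaves val
pointing at a live object.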

>
>>> Is it feasible? Of course ;-)
>> 
>> I thought the same thing about extending iterator_range to support
>> len and getitem, and look where that got me! (I tried three
>> separate implementations before arriving at the current one). Maybe
>> it was *too* feasible :-)
>
> Haha! ;-) 

I guess my suggestions so far are still fairly abstract, but what I
*have* done is try to figure out a matrix of Python functionality
versus STL container type (and iterator category). You can see the
results on my site for now (http://home.clara.net/raoulgough/boost/),
but maybe we should create a directory in the sandbox CVS if we're
really going to work on this?

Any comments on my ramblings up until now are welcome, of course.

-- 
Raoul Gough
"Let there be one measure for wine throughout our kingdom, and one
measure for ale, and one measure for corn" - Magna Carta
