[C++-sig] Support for Indexing and Iteration Safety
David Abrahams
dave at boost-consulting.com
Wed May 28 20:05:26 CEST 2003
I have some ideas about how to better support the indexing operator[]
and related thoughts about the safety of iterators. Rather than
following my usual practice of posting a big tome about it which will
overwhelm everyone so they don't respond ;->, let me just point out a
few problems with the current Boost.Python implementation. You let me
know if you care about any of these issues:
1. Unlike most of the other operators, which can be wrapped using
"self op other", there's no easy way to wrap the indexing
operator. You end up having to write your own __getitem__
and __setitem__ functions.
2. You also have to implement the code which raises an exception
if the index is out of range or the key is not found (in case
you're implementing a mapping). If you forget, your extension
class is prone to crashes and accessing invalid data.
3. Even if you get that right, you may still be prone to crashes.
Suppose you're wrapping a std::vector<SomeBigObject>. If you
want to support syntax like:
>>> foo = vector_SomeBigObject(15)
>>> foo[10].bar = 3 # modify an element in-place
or if you just want to avoid unneccessarily copying the
SomeBigObject element when you access it:
>>> x = foo[10].bar
then you have to use return_internal_reference in your
__getitem__. All well and good. Even if you do:
>>> del foo
>>> print x
your x is still valid because of the call policy; it is kept
alive. On the other hand, what if you do:
>>> del foo[10] # erase the 10th item
>>> print x
In this case it crashes, or prints bogus data, because there's
no linkage between modifications to the *state* of foo and
the x object. Another case:
>>> x = foo[10]
>>> foo[10] = y
>>> print x, y
Now you'll see that x and y are always identical, because x
refers to the place in the array where you've written the
value of y.
4. Similar problem with exposed iterators:
>>> for x in some_vec:
... if x.test_something():
... x.foo = 3
... else:
... some_vec.pop_back()
Even if your iterator uses return_internal_reference to keep
some_vec alive during iteration and allow in-place
modification of the data it references, it does nothing to
ensure that things are OK when the *contents* of some_vec
change out from under you. The above could lead to a crash
or bogus results.
These problems are not vector-specific. I have some ideas about
addressing them but I'm not sure how far to go, and first I want to
know how many people care.
Thanks for reading,
--
Dave Abrahams
Boost Consulting
www.boost-consulting.com
More information about the Cplusplus-sig
mailing list