Interface of the set classes
Pierre Barbier de Reuille
pierre.barbier at cirad.fr
Fri Oct 29 18:19:58 CEST 2004
Ok, I'll try to answer and better explain what I want to understand.
Alex Martelli a écrit :
> Pierre Barbier de Reuille <pierre.barbier at cirad.fr> wrote:
> Nope -- to use std::set<T>, you need to be able to define a < comparator
> that obeys the required semantic rules (either as an operator< on T, or
> separately in various possible ways); to use std::vector<T>, you don't.
> So, you cannot "switch freely" -- there are constraints.
Right ... but take hash_set and you have exactly the same kind of "set".
In my case, I already had the '<' operator ... so it wasn't a problem.
Then, you'll argue that you need to be hashable. True, but as I said (I
don't mean here I was clear about that :o) ), the choice of the
container usually depend on the data set.
> That's because a std::set<...> is based on ordering (a Python's set is
> based on hashing, so it imposes different constraints -- namely, that
> the items be hashable, rather than having them be comparable).
> Moreover, the set of valid operations on a std::vector<T>, call it V, is
> extremely different from the set of valid operations on an std::set<T>,
> call it S. You can write V, but you can't write S, for example.
> Here, too, your assertion that "you can switch freely" is utterly
Yes, that's why I needed __getitem__ only because some things cannot be
done (in Python 2.3) without.
>>I don't know why this is different in Python !
> It's not: neither in Python nor in C++ can you switch freely.
True, but in part ^_^ In C++, you can easyly design algorithm working
with whatever sequence you have. But for that you'll have to use
templates, iterators, probably tags, and on. But it's possible ... I
don't see a way to do it in Python ... and that's my whole point.
>>Using "add" instead of "append" and "update" instead of "extend" does
>>not makes sense to me.
> Try, in C++'s example above mentioned, calling S.push_back(...), then...
> does the inability to use *THAT* make sense to you? It sure does to
> _me_: V has an order based on insertion, so "pushing a new item at the
> back" makes sense (though Python's 'append' terminology is clearer). S
> doesn't (no order in Python, order based on item values in C++), so it
> makes no sense to even think of "pushing onto the back" (or equivalently
> "appending"). Basically, the commonality between S and V is: you can
> loop on either (net of order &c issues). And that holds in C++ as well
> as in Python.
Yes, but use insert_iterator and you can insert with both vector and set
with the same syntax ... even if, obviously, the position you gave for
set will be ignored. So :
inserter(result, result.end()) = 3;
will insert 3 into result, at the end if result is "ordered" (as the
list in python), or where it has to be for "sorted" or "unordered"
containers. Genericity in C++ comes from template functions most of the
>>It's even worse because Python rely more on
>>interface than in inheritence ... so making classes similar helps a lot
> Not when the semantics are too different, as in the case of S and V.
Ok, that's something I don't agree ... for me there is a common semantic
between S and V which is "storing values in a linear container". After
that, you can put the properties you need in that container, but if I
need to write an algorithm whose only assumption is "a linear container"
I want to be able to use whatever linear container I have ... and I
don't want to reimplement it for each.
>>>>Then, why can't you at all access sets element by position ? As Python
>>>>iterator are so poor (mainly, you cannot copy them ...) you have things
>>>That's what itertools.tee is for (in Python 2.4).
>>Oky, thanks ^_^ So I should not need the __getitem__ anymore in sets.
> Well, you surely didn't get operator on C++'s std::set<...>s, that's
> for sure...
Well, no ... but it's not needed ^_^ And if I really need it you have
the "advance" template function that do the __getitem__ stuff in the
most efficient way, whatever your _linear_ container is.
>>>It would be false (in 2.4) -- set is a builtin type, side by side with
>>>list and dict, not relying on either of them.
>>Yes, I know it will become a built-ins type ! But that does not explain
> It is one now -- just download 2.4.
>>this interface incompatible with lists. I would expect maximum
>>compatibility to be able to exchange the types. Even compatibility
>>between dictionnary and list is usefull ... because at some point you
>>can consider a dictionnary as a list of tuples. And then, too bad you
>>cannot because of simple design decisions.
> You make claims, above, about the container templates in C++'s standard
> library, that leave me in doubt. The very fact that you make strong
> assertions without even hedging them in any way appears to indicate you
> are experienced with those templates and familiar with them -- yet what
> you say is false, appearing to indicate you aren't. So, I don't know if
> I can take for granted all the rich conceptual framework of generic
> programming, and just point to the fact that the interface differences
> between Python's sequences, mappings and sets reflect exactly their
> conceptual semantic differences -- but then, that should be prima facie
> obvious!; or, do I need to try to forget the fact that you ever
> mentioned C++, and try to teach and explain both the conceptual
> underpinnings and the practical implications of generic programming (or
> more generally signature-based polymorphism) in purely Pythonic terms.
So I hope I made my point clearer this time ^_^ I used Python for a bit
less than 2 years now (so that's not much) and C++ for 5 years with big
projects in C++ (so that's more). So I'm no reusability specialist, but
I used a lot the Boost library, where I learned a lot about reusability
in C++. What I read about Python reusability was based on
signature-based polymorphism as you call it. But then, it fails with
lists, sets and dictionnary. And I don't really see any other way to do
that (but to add methods to the sets ... and I don't want to do so).
> You can probably clarify your background, and therefore which approach
> would help you most, more effectively, than I can just "stab in the
> dark" at it. Do you know generic programming, and the offerings in
> C++'s standard library, solidly and in-depth? If so, do you feel that
> C++'s standard library offers you the right amount of polymorphism among
> vectors, sets and maps, while you perceive a lesser amount among
> Python's lists, sets and dicts, and in what aspects precisely? Whether
> this is the case, or not, can you present a simplified problem where you
> feel there isn't enough polymorphism for your purposes? I may perhaps
> be able to offer help in any case, both practical and theoretical, but
> there is such a huge gulf among the various possibilities here that I
> think it's probably better for you to explain exactly where you come
> from and what you're after, first -- a theoretical background to help
> you understand the rationale behind the choices, a practical approach to
> use the best solutions for specific problems, etc, etc...!
Ok, so how do you write an algorithm on a container without specifying
which, knowing that you cannot use the iterators, neither the signature
of the class, nor the derivation ?
That was all the methods I imagined for Python's genericity. But none of
them work when you see the base containers.
>>No, what I want is not a new container type. I just want to be able to
>>change the container I use without changing all my codes ! It's too
> It depends on what operations you're doing on that container, of course.
>>expensive. At the end, I finish using sets and hashsets from the STL
>>exported to Python as sequences. The efficiency is the same and it just
>>works great :) I just feel bad not using the "standard" sets class ...
> But -- there IS no operator in the C++ standard library's sets! And
> hashset (spelled with an underscore, I believe) is indeed a SGI
> extension (so, in STL stricto sensu), but it's not in the C++ standard,
> and AFAIK doesn't support indexing either...
As I said higher in this mail, that's true, set and hashsets does not
support operator, but you can use the "advance" function to emulate
it. It inefficient with those, but when I have no other choices using
Python technics ... At least those parts are efficients with lists and
as inefficient as enything else with sets.
Pierre, who hopes to be clearer that time :)
More information about the Python-list