[Python-Dev] Retrieve an arbitrary element from a set without removing it

Steven D'Aprano steve at pearwood.info
Sat Oct 31 02:29:28 CET 2009


On Sat, 31 Oct 2009 07:27:22 am A.M. Kuchling wrote:
> On Fri, Oct 30, 2009 at 09:37:36PM +0100, Georg Brandl wrote:
> > I don't like this.  It gives a set object a hidden state, something
> > that AFAICS no other builtin has.

All objects have a reference count field which is hidden from Python 
code. The C API for objects has a flags field which specifies whether 
objects are read-only or read/write from Python code.

As of Python 2.6, type objects have an internal method cache. C code can 
clear it with PyType_ClearCache(); Python code can't even see it.

Lists and dicts pre-allocate extra space, and record hidden state of how 
much of the space is actually in use. Sets may do the same. File 
objects may use internal buffers, with all the state that implies.
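
A rough sketch of how the pre-allocation can be observed indirectly, 
assuming CPython (the exact growth pattern is an implementation detail 
and differs between versions). The helper name growth_points is 
hypothetical, made up for this illustration:

```python
import sys

def growth_points(n):
    """Return the list lengths at which the reported size changes.

    Assumption: CPython, where sys.getsizeof() reflects the allocated
    capacity of a list rather than only the slots currently in use.
    """
    items = []
    last_size = sys.getsizeof(items)
    points = []
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The list was reallocated here; appends in between
            # reused space that had already been set aside.
            points.append(len(items))
            last_size = size
    return points

points = growth_points(1000)
print(points)
```

If the list were resized on every append there would be 1000 entries; 
on CPython there are far fewer, because the spare capacity is tracked 
internally and never exposed as an attribute.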


> > Iterating over an iterable is 
> > what iterators are for.

set.get(), or set.pick() as Wikipedia calls it, isn't for iterating over 
sets. It is for getting an arbitrary element from the set.
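
For concreteness, a minimal sketch of what such a method would do, 
written as a free function since set has no get() or pick() method. 
The name pick and the choice of KeyError for an empty set are 
assumptions made for this illustration only:

```python
def pick(s):
    """Return an arbitrary element of s without removing it."""
    for element in s:
        # Stop at the first element the iterator yields.
        return element
    # Assumption: mirror set.pop(), which raises KeyError when empty.
    raise KeyError("pick from an empty set")

colours = {"red", "green", "blue"}
chosen = pick(colours)
print(chosen in colours, len(colours))
```

This version keeps no state at all, so it returns the same element on 
every call for as long as the set is not modified.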

If the requirement that get/pick() cycles through the set's elements is 
the killer objection to this proposal, I'd be willing to drop it. I 
thought that was part of the OP's request, but apparently it isn't. I 
would argue that it's hardly "arbitrary" if you get the same element 
every time you call the method, but if others are happy with that 
behaviour, I'm not going to stand in the way.
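
The cycling variant can be sketched as a wrapper that remembers an 
iterator between calls. CyclingPicker is a hypothetical class written 
only to show where the extra state would live; it is not a proposed 
implementation, and it assumes the underlying set is not modified 
between calls:

```python
class CyclingPicker:
    """Wrap a set and return a different element on each pick()."""

    def __init__(self, items):
        self._items = set(items)
        # This saved iterator is the hidden state under discussion.
        self._iterator = iter(self._items)

    def pick(self):
        if not self._items:
            raise KeyError("pick from an empty set")
        try:
            return next(self._iterator)
        except StopIteration:
            # Every element has been returned once; start over.
            self._iterator = iter(self._items)
            return next(self._iterator)

picker = CyclingPicker({1, 2, 3})
first_round = {picker.pick() for _ in range(3)}
print(first_round)
```

Three calls visit all three elements, and the fourth wraps around to 
the start again.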


> It also makes the object thread-unsafe; there's no way for two
> threads to iterate over it at the same time.  It's a terrible idea to
> introduce new things that won't work under threaded usage.

I would agree if the purpose of get/pick() was to iterate over the 
elements of the set, but that's not the purpose. The purpose is to 
return an arbitrary item each time it is called. If two threads get the 
same element, that's not a problem; if one thread misses an element 
because another thread grabbed it first, that's not a problem either. 
If people prefer a random element instead, I have no problem with 
that -- personally I think that's overkill, but maybe that's just me.
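
For comparison, a sketch of the random alternative. The name 
pick_random is hypothetical. Copying the set into a tuple first is an 
assumption of this sketch: random.choice() needs a sequence, so each 
call costs time proportional to the size of the set:

```python
import random

def pick_random(s):
    """Return a uniformly random element of s without removing it."""
    if not s:
        raise KeyError("pick from an empty set")
    # Sets cannot be indexed, so copy to a sequence before choosing.
    return random.choice(tuple(s))

letters = {"a", "b", "c"}
letter = pick_random(letters)
print(letter in letters)
```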


-- 
Steven D'Aprano

