[Python-Dev] A few questions about setobject

Wed Dec 28 11:24:45 CET 2005

Noam Raphael wrote:
> Is this desirable? 

Not sure what "this" refers to in your message: the text of the C API
documentation certainly is desirable as it stands (although it should
be clearer as to whether struct names should be prefixed).

The setentry typedef clearly violates the principles of the API, so
it should be renamed.

> Even if it is, it seems that the second sentence
> contradicts the first sentence.

Why does that seem so? To quote, the first sentence is

'All user visible names defined by Python.h (except those defined by
the included standard headers) have one of the prefixes "Py" or "_Py".'

and the second sentence is

'Names beginning with "_Py" are for internal use by the Python
implementation and should not be used by extension writers.'

I cannot see any contradiction between these.

> Perhaps the header file should stick
> with writing "struct { long hash; PyObject *key; }" three times (or
> define it in a macro and then undefine it), and the typedef be left to
> the .c file?

That would not conform to the C language standard (although most
compilers accept it).

> I think it should be ok because it's never used
> really as a PyObject. Am I missing something? (Ok, I now thought that
> maybe it's because some parts don't treat dummy elements specially.
> But it seems to me that most parts do treat them specially, so perhaps
> it would be better to make a few changes so that all parts will treat
> them specially?)

In principle, you are right. One place that doesn't special-case the
dummy is set_clear_internal (in fact, the Py_DEBUG assertion is
completely useless there, AFAICT).

The tricky question is: can we be certain that we find all places,
in all code paths, where we have to special-case the dummy? Having
PyObject* values that don't point to a PyObject is error-prone.
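To make the concern concrete, here is a minimal sketch of a toy
open-addressing table that uses a dummy marker for removed keys. It is
not the setobject.c implementation: the names ToySet, EMPTY, DUMMY and
_find are hypothetical, the table has a fixed size, and it probes
linearly. It only illustrates that lookup, insertion, removal and
iteration each have to handle the dummy explicitly.

```python
EMPTY = object()   # hypothetical marker: slot was never used
DUMMY = object()   # hypothetical marker: slot held a key that was removed


class ToySet:
    """Fixed-size open-addressing set with linear probing (illustration only)."""

    def __init__(self, size=8):
        self.slots = [EMPTY] * size

    def _find(self, key):
        """Return (index of key or None, first reusable index or None)."""
        size = len(self.slots)
        start = hash(key) % size
        reusable = None
        for step in range(size):
            i = (start + step) % size
            entry = self.slots[i]
            if entry is EMPTY:
                # End of the probe chain: the key is absent.
                return None, (i if reusable is None else reusable)
            if entry is DUMMY:
                # Special case 1: skip the dummy, but remember it for reuse.
                if reusable is None:
                    reusable = i
                continue
            if entry == key:
                return i, reusable
        return None, reusable

    def add(self, key):
        found, reusable = self._find(key)
        if found is None:
            if reusable is None:
                raise RuntimeError("table full (this sketch never resizes)")
            self.slots[reusable] = key

    def discard(self, key):
        found, _ = self._find(key)
        if found is not None:
            # Leave a dummy so later probes continue past this slot.
            self.slots[found] = DUMMY

    def __contains__(self, key):
        return self._find(key)[0] is not None

    def __iter__(self):
        # Special case 2: iteration must not yield the dummy.
        return (e for e in self.slots if e is not EMPTY and e is not DUMMY)
```

Forgetting the dummy check in any one of these methods gives a wrong
result rather than an obvious crash, which is the kind of mistake that
is hard to rule out across all code paths.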

Also, what would we gain? If you think it is speed: I doubt it. In
one place, a comment suggests that actually seeing the dummy element
is so much less likely than the other cases that, for performance,
the test for the dummy is done last. We would lose additional speed
in the cases where the dummy isn't yet considered.

> 3) The type of the result of a binary operator applied on a set and a
> frozenset is the type of the left set. You are welcomed to ignore
> this, but I just wanted to say that it seems to me better to make the
> operator symmetric, and to return a frozenset only if both sets are
> frozen.

How would you implement this? The result is obtained from copying the
left operand, and then applying the other operand. This is done so
that set subtyping becomes possible:

>>> class myset(set):pass
...
>>> x=myset([2,6])
>>> y=set([2,6])
>>> x.union(y)
myset([2, 6])

So if the result is not obtained by copying the left operand first,
how would you compute the result type, so that this example still
works?
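For the set/frozenset case raised in question 3, the left-operand rule
can be checked directly. This is a small sketch using only the
built-in types; the variable names are illustrative, and it
deliberately leaves subclasses out.

```python
# Mixed set/frozenset operators: the result takes the type of the
# left operand, as described in the quoted question.
mutable = set([2, 6])
frozen = frozenset([6, 9])

left_mutable = mutable | frozen   # left operand is a set
left_frozen = frozen | mutable    # left operand is a frozenset

print(type(left_mutable).__name__)
print(type(left_frozen).__name__)
```

Both results contain the same elements and compare equal; only the
type differs with the order of the operands.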

Regards,
Martin