[Python-Dev] A few questions about setobject

Noam Raphael noamraph at gmail.com
Mon Dec 26 23:16:19 CET 2005


Hello,

I'm going over setobject.c/setobject.h, while trying to change them to
support cheap frozen-copying. I have a few questions.

1) This is a part of setobject.h:

typedef struct {
	long hash;
	PyObject *key;
} setentry;

typedef struct _setobject PySetObject;
struct _setobject {
	...
	setentry *table;
	setentry *(*lookup)(PySetObject *so, PyObject *key, long hash);
	setentry smalltable[PySet_MINSIZE];
	...
};

It seems to me that setentry and _setobject are defined for every file
that includes Python.h. In the Python C API, in the section about
include files, it is written that:
"All user visible names defined by Python.h (except those defined by
the included standard headers) have one of the prefixes "Py" or "_Py".
Names beginning with "_Py" are for internal use by the Python
implementation and should not be used by extension writers. Structure
member names do not have a reserved prefix."

Is this desirable? Even if it is, it seems that the second sentence
contradicts the first sentence. Perhaps the header file should stick
with writing "struct { long hash; PyObject *key; }" three times (or
define it in a macro and then undefine it), and the typedef be left to
the .c file?


2) The hash table used by sets uses a dummy element for deleted
entries. The implementation goes into the trouble of allocating it,
managing its reference count, and deallocating it at the end. What is
the reason for that? It seems to me that the only requirement of the
dummy element is that it shouldn't be a pointer to a valid PyObject,
and as such I would think that defining it like

int dummy_int;
PyObject *dummy = (PyObject *)(&dummy_int);

would be enough, and that it shouldn't be INCREFed or DECREFed every
time it is used. I think it should be ok because it's never used
really as a PyObject. Am I missing something? (Ok, I now thought that
maybe it's because some parts don't treat dummy elements specially.
But it seems to me that most parts do treat them specially, so perhaps
it would be better to make a few changes so that all parts will treat
them specially?)


3) The type of the result of a binary operator applied on a set and a
frozenset is the type of the left set. You are welcomed to ignore
this, but I just wanted to say that it seems to me better to make the
operator symmetric, and to return a frozenset only if both sets are
frozen.


If you think that these questions belong to c.l.py, then please say so
and I will go away.

Have a good day,
Noam


More information about the Python-Dev mailing list