[Python-3000] A better way to initialize PyTypeObject

Brett Cannon brett at python.org
Wed Nov 29 20:35:17 CET 2006


On 11/28/06, Talin <talin at acm.org> wrote:
>
> Guido van Rossum wrote:
> > On 11/28/06, Talin <talin at acm.org> wrote:
> >> Guido van Rossum wrote:
> >> > Some comments:
> >> >
> >> > - Fredrik's solution makes one call per registered method. (I don't
> >> > know if the patch he refers to follows that model.) That seems a fair
> >> > amount of code for an average type -- I'm wondering if it's too early
> >> > to worry about code bloat (I don't think the speed is going to
> >> > matter).
> >>
> >> One other thought: The special constants could themselves be nothing
> >> more than the offset into the PyTypeObject struct, i.e.:
> >>
> >>     #define SPECMETHOD_NEW ((const char*)offsetof(PyTypeObject,tp_new))
> >
> > I think this would cause too many issues with backwards compatibility.
> >
> > I like the idea much better to use special names (e.g. starting with a
> > ".").
> >
> >> In the PyType_Ready code, you would see if the method name had a value
> >> of less than sizeof(PyTypeObject); If so, then it's a special method
> >> name, and you fill in the struct at the specified offset.
> >>
> >> So the interpretation of the table could be very simple and fast. It has
> >> a slight disadvantage from the approach of using actual string names for
> >> special methods, in that it doesn't allow the VM to silently
> >> promote/demote methods to 'special' status.
> >
> > I think the interpretation will be fast enough (or else what you said
> > about premature optimization earlier wouldn't be correct. :-)
>
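
For readers following the offset idea quoted above, here is a minimal sketch of how such a lookup might work. It uses a stand-in struct rather than the real PyTypeObject, and every name in it (mock_type, mock_method, MOCK_SPECMETHOD_NEW, fill_slot_if_special) is made up for illustration. It also assumes that no real string lives at an address smaller than the struct size, which holds on common platforms but is not guaranteed by the C standard:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PyTypeObject; illustration only, not the real layout. */
typedef void (*slot_fn)(void);

struct mock_type {
    slot_fn tp_alloc;
    slot_fn tp_new;
    slot_fn tp_free;
};

/* The "name" is really the byte offset of the slot, disguised as a pointer. */
#define MOCK_SPECMETHOD_NEW ((const char *)offsetof(struct mock_type, tp_new))

/* Hypothetical method table entry: a name (or disguised offset) and a function. */
struct mock_method {
    const char *name;
    slot_fn fn;
};

/* Placeholder implementation used to populate a slot in examples. */
static void example_new(void) {}

/*
 * If the name's pointer value is smaller than the struct size, treat it as a
 * slot offset and store the function there. Returns 1 if a slot was filled,
 * 0 if the name is an ordinary string.
 * Note: in this mock, offset 0 would be indistinguishable from a NULL name.
 */
static int fill_slot_if_special(struct mock_type *type,
                                const struct mock_method *m)
{
    uintptr_t value = (uintptr_t)m->name;

    if (value >= sizeof(struct mock_type))
        return 0;
    *(slot_fn *)((char *)type + value) = m->fn;
    return 1;
}
```

An entry such as { MOCK_SPECMETHOD_NEW, example_new } would land in tp_new, while an entry named "next" would be left for the class dict.
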
> OK, based on these comments and the other feedback from this thread,
> here's a more concrete proposal:
>
> == Method Table ==
>
> Method definitions are stored in a static table, identical in format to
> the existing PyMethodDef table.
>
> For non-method initializers, the most commonly-used ones will be passed
> in as parameters to the type creation function. Those that are less
> commonly used can be written in as a secondary step after the type has
> been created, or in some cases represented in the tp_members table.
>
> == Method Names ==
>
> As suggested by Guido, we use a naming convention to determine how a
> method in the method table is handled. I propose that methods be divided
> into three categories, which are "Normal", "Special", and "Internal"
> methods, and which are interpreted slightly differently at type
> initialization time.
>
> * Internal methods are those that have no equivalent Python name, such
> as tp_free/tp_alloc. Internal method names start with a dot ("."), so
> tp_alloc would be represented by the string ".tp_alloc".


Haven't we had various arguments about how it's bad to use a leading dot to
have a special meaning?  I understand why we need some way to flag internal
methods on a type and I support going with an explicit way of specifying,
but is a dot really the best solution?  I mean something like INTERNAL_METH
"tp_alloc" would even work using C's automatic string concatenation and
doing::

  #define INTERNAL_METH "."

or whatever string we wanted that was not valid in a method name.  I don't
think this would lead us down the road of tons of macros and it makes things
very visible.
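
A minimal sketch of what I mean follows. The table entry type and the helper are hypothetical and exist only for illustration. The point is that the compiler merges adjacent string literals, so the marker stays visible in the source but costs nothing at runtime:

```c
#include <string.h>

/* Marker prefix; any string that cannot appear in a Python method name works. */
#define INTERNAL_METH "."

/* Hypothetical table entry, for illustration only. */
struct example_entry {
    const char *name;
};

/* The compiler concatenates the two adjacent literals into ".tp_alloc". */
static const struct example_entry example_alloc = { INTERNAL_METH "tp_alloc" };

/* Return 1 if the name starts with the internal-method marker, else 0. */
static int is_internal_name(const char *name)
{
    return strncmp(name, INTERNAL_METH, sizeof(INTERNAL_METH) - 1) == 0;
}
```

Changing the marker later would then mean editing one #define rather than every table.
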

> Internal methods are always stored into a slot in the PyTypeObject. If
> there is no corresponding slot for a given name, that is a runtime error.
>
> * Special methods have the double-underscore (__special__) naming
> convention. A special method may or may not have a slot definition in
> PyTypeObject. If there is such a slot, the method pointer will be stored
> into it; if there is no such slot, then the method pointer is stored
> into the class dict just like a normal method.
>
> Because the decision whether to put the method into a slot is made by
> the VM, the set of available slots can be modified in future Python
> releases without breaking existing code.
>
> * Normal methods are any methods that are neither special nor internal.
> They are not placed in a slot, but are simply stored in the class dict.
>
> Brett Cannon brought up the point about __getitem__ being ambiguous,
> since there are two slots, one for lists and one for mappings. This is
> handled as follows:
>
> The "mapping" version of __getitem__ is a special method, named
> "__getitem__". The "list" version, however, is considered an internal
> method (since it's more specialized), and has the name ".tp_getitem".


Or the other option is that in the future we just don't have the distinction
and make sure that the __getitem__ methods do the requisite type checks.
The type check is done at some point in the C code anyway so it isn't like
there is a performance reason for the different slots.  And as for providing
a C-level function that provides a __getitem__ that takes Py_ssize_t, that
can still be provided, it just isn't what is put into the struct.

The one problem this does cause is testing for the interface support at the
C level.  But that could be a C function that looks for specific defined
functions.  Plus this would help make the C code less distinct from the way
things expose themselves at the Python level (which I personally think is a
good thing).

> Greg Ewing's point about "next" is handled as follows: A function named
> "next" will never be treated as a special method name, since it does not
> follow the naming convention of either internal or special names.
> However, if you want to fill in the "tp_next" slot of the PyTypeObject,
> you can use the string ".tp_next" rather than "next".
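
To make the three categories concrete, here is a rough sketch of how a name could be classified at type initialization time. The enum and function names are invented for illustration and do not exist in CPython. The sketch only looks at the naming convention and does not check whether a matching slot exists:

```c
#include <string.h>

/* Hypothetical categories matching the proposal's naming convention. */
enum method_kind { METHOD_NORMAL, METHOD_SPECIAL, METHOD_INTERNAL };

/* Classify a method table name by its spelling alone. */
static enum method_kind classify_method_name(const char *name)
{
    size_t len = strlen(name);

    /* Internal: leading dot, e.g. ".tp_alloc" or ".tp_next". */
    if (len > 0 && name[0] == '.')
        return METHOD_INTERNAL;

    /* Special: double underscores on both ends, e.g. "__getitem__". */
    if (len > 4 && strncmp(name, "__", 2) == 0 &&
        strcmp(name + len - 2, "__") == 0)
        return METHOD_SPECIAL;

    /* Everything else, including "next", goes into the class dict. */
    return METHOD_NORMAL;
}
```

Under this reading, "next" is always a normal method and ".tp_next" is always an internal one, so the two never collide.
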
>
> == Type Creation ==
>
> For backwards compatibility, the existing PyType_Ready function will
> continue to work on statically-declared PyTypeObject structures. A new
> function, 'PyType_Create' will be added that creates a new type from the
> input parameters and the method initialization tables as described
> previously. The actual type record may be allocated dynamically, as
> suggested by Greg Ewing.
>
> Structures such as tp_as_sequence which extend the PyTypeObject will be
> created as needed, if there are any methods that require those extension
> structures.
>
> == Backwards compatibility ==
>
> The existing PyType_Ready and C-style static initialization mechanism
> will continue to work - the new method for type creation will coexist
> alongside the old.
>
> It is an open question as to whether PyType_Ready should attempt to
> interpret the special method names and fill in the PyTypeObject slots.
> If it does, then PyType_Create can call PyType_Ready as a subroutine
> during the type creation process.
>
> Otherwise, the only modifications to the interpreter will be the
> creation of the new PyType_Create function and any required subroutines.
> Existing code should be unaffected.


Overall sounds good to me!

-Brett