[New-bugs-announce] [issue35331] Incorrect __module__ attribute for _struct.Struct and perhaps a few others

Dan Snider report at bugs.python.org
Tue Nov 27 13:22:25 EST 2018

New submission from Dan Snider <mr.assume.away at gmail.com>:

_struct.Struct does not define a valid __module__, because its tp_name slot is not prefixed with "_struct". This is inconsistent with every other extension type that is available in its module's globals.

From the documentation of the `tp_name` slot:

Pointer to a NUL-terminated string containing the name of the type. For types that are accessible as module globals, the string should be the full module name, followed by a dot, followed by the type name; for built-in types, it should be just the type name. If the module is a submodule of a package, the full package name is part of the full module name. For example, a type named T defined in module M in subpackage Q in package P should have the tp_name initializer "P.Q.M.T".

For dynamically allocated type objects, this should just be the type name, and the module name explicitly stored in the type dict as the value for key '__module__'.
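
As a rough illustration of the rule quoted above, here is a minimal Python sketch. It assumes CPython's behaviour for statically allocated types, where __module__ is whatever precedes the last dot in tp_name and falls back to "builtins" when there is no dot. The helper split_tp_name is hypothetical and is not a CPython API.

```python
import collections

def split_tp_name(tp_name):
    """Hypothetical helper mirroring how a static type's tp_name is
    split into (__module__, __name__)."""
    module, dot, name = tp_name.rpartition(".")
    # No dot in tp_name: the type reports "builtins" as its module.
    return (module if dot else "builtins"), name

print(split_tp_name("P.Q.M.T"))
print(split_tp_name("Struct"))

# Types that follow the documented convention report their own module.
print(collections.deque.__module__, int.__module__)

# A dynamically created (heap) type takes __module__ from its type dict.
Dynamic = type("Dynamic", (), {"__module__": "some_module"})
print(Dynamic.__module__)
```

Under that assumption, a tp_name of plain "Struct" is indistinguishable from the name of a real built-in type, which is why the missing prefix matters.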


I know that this is also a way to make something unpicklable, but that seems like a poor way to do it, and since _struct.Struct was relatively alone in this, I figured it was an oversight.
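
To show why a wrong __module__ interferes with pickling, here is a small sketch that uses a pure-Python stand-in rather than _struct.Struct itself. pickle stores a class by reference (module name plus qualified name), so the lookup fails when the class cannot be found under the module it claims. The class Misplaced and the helper can_pickle are made up for this illustration.

```python
import collections
import pickle

class Misplaced:
    """Pure-Python stand-in for a type whose module is reported wrongly."""
    # Assumption for illustration: claim to live in "builtins", where no
    # attribute named "Misplaced" exists.
    __module__ = "builtins"

def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError):
        return False
    return True

print(can_pickle(collections.OrderedDict()))
print(can_pickle(Misplaced()))
```

An extension type can be unpicklable for other reasons as well, so this only demonstrates the module-lookup part of the problem.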

Below is the script I used to list every live class that has been through PyType_Ready and reports its __module__ as "builtins". For brevity, I manually filtered out the obvious cases where specifying a module would be inappropriate.

The main point is that I think the new contextvars classes, _struct.Struct, and the weakref classes are missing the "_contextvars", "_struct", and "_weakref" prefixes in their tp_name slots, respectively. Since _contextvars is one of the few extension modules using the multiphase initialization protocol, maybe the module name should go in its types' dicts instead (although the prefix method still works), though I think the docs were referring to heap-allocated types.
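
A narrower check than the full script can be sketched as follows: for a single module, list the types in its globals that report "builtins" as their module without actually being exposed by builtins. The function name types_reporting_builtins is hypothetical, and the printed lists depend on the interpreter version, so no expected output is shown.

```python
import builtins
import importlib

def types_reporting_builtins(module_name):
    """Names of types in a module's globals whose __module__ is
    'builtins' even though they are not exposed by builtins itself."""
    module = importlib.import_module(module_name)
    real_builtins = {v for v in vars(builtins).values() if isinstance(v, type)}
    return sorted(
        name
        for name, obj in vars(module).items()
        if isinstance(obj, type)
        and obj not in real_builtins
        and obj.__module__ == "builtins"
    )

for module_name in ("_struct", "_contextvars", "_weakref"):
    try:
        print(module_name, types_reporting_builtins(module_name))
    except ImportError:
        # These are CPython implementation modules and may be absent elsewhere.
        print(module_name, "not available")
```

An empty list for a module means every type it exposes reports a module other than "builtins".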

if __name__=='__main__':
    import sys, collections
    subclassesof = type.__subclasses__
    def get_types(*names):
        # Collect every type found in the globals of the named modules.
        r = {"__builtins__":{'__import__':__import__, 'globals':globals}}
        for name in names:
            exec(f'from {name} import __dict__ as d; globals().update(d)', r)
        return dict.fromkeys(r[k] for k in r if isinstance(r[k],type)).keys()
    def derivative_classes(cls):
        # Walk the subclass tree level by level: every live class derived from cls.
        a = b = r = {*subclassesof(cls)}
        while b:
            r, a, b = r|b, b, set().union(*map(subclassesof, b))
        return r | a
    classes = derivative_classes(object)
    singles = None, NotImplemented, ...
    od = collections.OrderedDict()
    odtypes = iter(od), od.keys(), od.items(), od.values()
    # Keep classes that report 'builtins' as their module, then drop those
    # found in builtins, types and _collections_abc, plus the OrderedDict
    # view/iterator types and the types of the singletons.
    bltns = {cls for cls in classes if cls.__module__=='builtins'}
    bltns-= get_types('builtins', 'types', '_collections_abc')
    bltns-= {*map(type, odtypes)} | {*map(type, singles)}
    # Sort by type.__name__ (read through the descriptor) and print refcounts.
    for cls in sorted(bltns, key=vars(type)['__name__'].__get__):
        print(f'# {sys.getrefcount(cls):4} {cls.__name__}')

# all of these are in _contextvars.__dict__ but have their __module__=='builtins':
#   25 Context
#   15 ContextVar
#   12 Token

# from _struct
#   23 Struct # IS in _struct.__dict__
#   11 unpack_iterator # no tp_new, so it makes sense to leave as-is

# These are here because it's a mystery how they were included in the results
# without importing _testcapi:
#   25 hamt
#    8 hamt_array_node
#    8 hamt_bitmap_node
#    8 hamt_collision_node

# no idea what these are:
#   11 items
#   11 values
#   11 keys

# these are all in _weakref.__dict__
#   76 weakcallableproxy
#   76 weakproxy
#   32 weakref

messages: 330544
nosy: bup
priority: normal
severity: normal
status: open
title: Incorrect __module__ attribute for _struct.Struct and perhaps a few others
versions: Python 3.7, Python 3.8

Python tracker <report at bugs.python.org>
