[Python-ideas] namedtuple literals [Was: RE a new namedtuple]

Nick Coghlan ncoghlan at gmail.com
Mon Jul 24 23:20:42 EDT 2017


On 25 July 2017 at 11:57, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Having such a builtin implicitly create and cache new namedtuple type
> definitions so the end user doesn't need to care about pre-declaring
> them is still fine, and remains the most straightforward way of
> building a capability like this atop the underlying
> `collections.namedtuple` type.

I've updated the example I posted in the other thread with the
fiddling needed for full pickle compatibility with auto-generated
collections.namedtuple type definitions:
https://gist.github.com/ncoghlan/a79e7a1b3f7dac11c6cfbbf59b189621

This shows that given ordered keyword arguments as a building block,
most of the actual implementation complexity now lies in designing an
implicit type cache that plays nicely with the way pickle works:

    from collections import namedtuple

    class _AutoNamedTupleTypeCache(dict):
        """Pickle compatibility helper for autogenerated
        collections.namedtuple type definitions"""
        def __new__(cls):
            # Ensure that unpickling reuses the existing cache instance
            self = globals().get("_AUTO_NTUPLE_TYPE_CACHE")
            if self is None:
                maybe_self = super().__new__(cls)
                self = globals().setdefault(
                    "_AUTO_NTUPLE_TYPE_CACHE", maybe_self)
            return self

        def __missing__(self, fields):
            cls_name = "_ntuple_" + "_".join(fields)
            return self._define_new_type(cls_name, fields)

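        def __getattr__(self, cls_name):
            # pickle looks types up by qualified name on the cache instance,
            # so rebuild the field tuple from the generated class name
            parts = cls_name.split("_")
            if parts[:2] != ["", "ntuple"]:
                raise AttributeError(cls_name)
            fields = tuple(parts[2:])
            return self._define_new_type(cls_name, fields)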

        def _define_new_type(self, cls_name, fields):
            cls = namedtuple(cls_name, fields)
            cls.__module__ = __name__
            cls.__qualname__ = "_AUTO_NTUPLE_TYPE_CACHE." + cls_name
            # Rely on setdefault to handle race conditions between threads
            return self.setdefault(fields, cls)

    _AUTO_NTUPLE_TYPE_CACHE = _AutoNamedTupleTypeCache()

    def auto_ntuple(**items):
        cls = _AUTO_NTUPLE_TYPE_CACHE[tuple(items)]
        return cls(*items.values())
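
One caveat with the naming scheme, as far as I can tell: `__missing__`
joins the field names with underscores and `__getattr__` splits on
them again, so the round trip only works for field names that contain
no underscores themselves. A minimal sketch of just the string
handling (`encode_name` and `decode_name` are illustrative helpers,
not part of the code above):

```python
PREFIX = "_ntuple_"

def encode_name(fields):
    # Same scheme as __missing__ above: prefix plus underscore-joined fields
    return PREFIX + "_".join(fields)

def decode_name(cls_name):
    # Same parsing as __getattr__ above: split on every underscore
    parts = cls_name.split("_")
    if parts[:2] != ["", "ntuple"]:
        raise AttributeError(cls_name)
    return tuple(parts[2:])

simple = ("x", "y")
awkward = ("first_name", "age")

print(decode_name(encode_name(simple)) == simple)  # True
print(decode_name(encode_name(awkward)))           # ('first', 'name', 'age')
```

For a field such as `first_name`, the lookup by name would yield a
different type than the one the instance was created with, so a real
implementation would presumably need a less ambiguous encoding.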

But given such a cache, you get implicitly defined types that are
automatically shared between instances that want to use the same field
names:

    >>> p1 = auto_ntuple(x=1, y=2)
    >>> p2 = auto_ntuple(x=4, y=5)
    >>> type(p1) is type(p2)
    True
    >>>
    >>> import pickle
    >>> p3 = pickle.loads(pickle.dumps(p1))
    >>> p1 == p3
    True
    >>> type(p1) is type(p3)
    True
    >>>
    >>> p1, p2, p3
    (_ntuple_x_y(x=1, y=2), _ntuple_x_y(x=4, y=5), _ntuple_x_y(x=1, y=2))
    >>> type(p1)
    <class '__main__._AUTO_NTUPLE_TYPE_CACHE._ntuple_x_y'>

And writing the pickle out to a file and reloading it also works
without needing to explicitly predefine that particular named tuple
variant:

    >>> with open("auto_ntuple.pkl", "wb") as f:
    ...     pickle.dump(p1, f)
    ...
    >>> with open("auto_ntuple.pkl", "rb") as f:
    ...     p1 = pickle.load(f)
    ...
    >>> p1
    _ntuple_x_y(x=1, y=2)
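
The reload works because a pickle does not contain the class itself,
only its module name and qualified name, which pickle resolves again
at load time by walking attributes from the module. A stripped-down
sketch of that mechanism, under some assumptions: `ntuple_demo`,
`LazyTypes` and `TYPES` are invented names for illustration, and a
plain object with `__getattr__` stands in for the dict-based cache
above:

```python
import pickle
import sys
import types
from collections import namedtuple

# "ntuple_demo" is a made-up module name, registered by hand so that the
# sketch does not depend on where it is executed from.
demo = types.ModuleType("ntuple_demo")
sys.modules["ntuple_demo"] = demo

class LazyTypes:
    """Creates a namedtuple type the first time its name is looked up."""

    def __getattr__(self, cls_name):
        prefix = "_ntuple_"
        if not cls_name.startswith(prefix):
            raise AttributeError(cls_name)
        fields = cls_name[len(prefix):].split("_")
        cls = namedtuple(cls_name, fields)
        cls.__module__ = "ntuple_demo"
        cls.__qualname__ = "TYPES." + cls_name
        # Store the type on the instance so later lookups return the same object
        self.__dict__[cls_name] = cls
        return cls

demo.TYPES = LazyTypes()

point = demo.TYPES._ntuple_x_y(1, 2)
# Protocol 4 records the dotted qualified name directly
data = pickle.dumps(point, protocol=4)

# Simulate a fresh process: forget every type generated so far
vars(demo.TYPES).clear()

restored = pickle.loads(data)
print(restored)                       # _ntuple_x_y(x=1, y=2)
print(restored == point)              # True
print(type(restored) is type(point))  # False: the type was regenerated
```

The last line differs from the session above only because the sketch
deliberately throws the generated types away before loading.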

In effect, implicitly named tuples would be like key-sharing
dictionaries, but sharing at the level of full type objects rather
than key sets.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

