> Besides performance, I don’t think it fits with Guido’s conception of the protocols as being more minimal than the builtin types. For example, set has not just a & operator but also an intersection method that takes 0 or more arbitrary iterables; the set protocol has no such method, so collections.abc.Set neither specifies nor provides an intersection method. It’s a bit muddy of a conception at the edges, but I think this goes over the line, and it may have been explicitly thought about and rejected for the same reason as Set.intersection.

Making __missing__ a first-class part of how __getitem__ works seems more analogous to __getattr__ and __getattribute__ than to the intersection method. Is there any other dunder that is only implicitly called on builtin types?
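For reference, here is a minimal sketch of the current behavior as I understand it: dict.__getitem__ consults __missing__ on dict subclasses, while a plain collections.abc.Mapping subclass gets no such hook. The class names are made up for illustration.

```python
from collections.abc import Mapping


class DictWithMissing(dict):
    """dict subclass: dict.__getitem__ calls __missing__ for absent keys."""

    def __missing__(self, key):
        return f"default for {key!r}"


class MappingWithMissing(Mapping):
    """Mapping subclass: nothing calls __missing__ on our behalf."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]  # plain dict lookup, raises KeyError

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __missing__(self, key):
        return f"default for {key!r}"


from_dict = DictWithMissing()["x"]

try:
    MappingWithMissing({})["x"]
except KeyError:
    mapping_used_missing = False
else:
    mapping_used_missing = True
```

Here from_dict comes from __missing__, while the Mapping lookup ends in a KeyError even though the class defines the same method.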

I mostly agree with everything else you said.



On Tue, Apr 14, 2020 at 11:34 AM Steele Farnsworth <swfarnsworth@gmail.com> wrote:
I've implemented the class as a stand-alone module here: https://github.com/swfarnsworth/dynamicdict

It could in theory be made significantly more concise if `defdict_type` were the base for this class instead of `PyDict_Type`.



On Tue, Apr 14, 2020 at 1:32 PM Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
On Apr 13, 2020, at 18:44, Caleb Donovick <donovick@cs.stanford.edu> wrote:

I have built this data structure countless times. So I am in favor.

Maybe you can give a concrete example of what you need it for, then? I think that would really help the proposal. Especially if your example needs a per-instance rather than per-class factory function.

> Why can’t you just subclass dict and override that?  

Because TypeError: multiple bases have instance lay-out conflict is one of my least favorite errors.

But defaultdict, being a subclass of dict, has the same problem in the same situations, and (although I haven’t checked) I assume the same is true for the OP’s dynamicdict.
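A quick way to see the error in question, sketched with a hypothetical helper try_bases: combining dict (or defaultdict) with another builtin that has its own C-level instance layout, such as list, fails at class creation time.

```python
from collections import defaultdict


def try_bases(*bases):
    """Return the TypeError message from combining bases, or None if it works."""
    try:
        type("Combined", bases, {})
    except TypeError as exc:
        return str(exc)
    return None


dict_and_list = try_bases(dict, list)                # layout conflict
defaultdict_and_list = try_bases(defaultdict, list)  # same conflict
dict_alone = try_bases(dict)                         # works, so None
```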

Perhaps `__missing__` could be a first-class part of the getitem protocol, instead of a `dict`-specific feature. So that
```
r = x[key]
```
means:
```
try:
  r = x.__getitem__(key)
except KeyError as e: # should we also catch IndexError?
  try:
    missing = x.__missing__
  except AttributeError:
    raise e from None
  r = missing(key)

```

Obviously this would come at some performance cost for non dict mappings so I don't know if this would fly.

Besides performance, I don’t think it fits with Guido’s conception of the protocols as being more minimal than the builtin types. For example, set has not just a & operator but also an intersection method that takes 0 or more arbitrary iterables; the set protocol has no such method, so collections.abc.Set neither specifies nor provides an intersection method. It’s a bit muddy of a conception at the edges, but I think this goes over the line, and it may have been explicitly thought about and rejected for the same reason as Set.intersection.
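The set comparison can be checked directly. This is only an illustration of the builtin versus the ABC, not part of any proposal:

```python
from collections.abc import Set

# The builtin type has the richer method; the ABC does not define it.
builtin_has_method = hasattr(set, "intersection")
abc_has_method = hasattr(Set, "intersection")

# set.intersection accepts zero or more arbitrary iterables.
common = {1, 2, 3}.intersection([2, 3, 4], (3, 4, 5))
no_args = {1, 2}.intersection()
```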

On the other hand, none of that is an argument of any kind against your method decorator:

So instead maybe there could be a standard decorator to get the same behavior?
```
from functools import wraps

def usemissing(getitem):
  @wraps(getitem)
  def wrapped(self, key):
    try:
      return getitem(self, key)
    except KeyError as e:
      # Fall back to __missing__ if the object defines one.
      try:
        missing = self.__missing__
      except AttributeError:
        raise e from None
    return missing(key)
  return wrapped
```
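A usage sketch for the method decorator, repeated here so the example is self-contained. Squares and NoFallback are hypothetical classes, not anything from the thread.

```python
from functools import wraps


def usemissing(getitem):
    @wraps(getitem)
    def wrapped(self, key):
        try:
            return getitem(self, key)
        except KeyError as e:
            try:
                missing = self.__missing__
            except AttributeError:
                raise e from None
        return missing(key)
    return wrapped


class Squares:
    """Hypothetical mapping-like class that computes absent values on demand."""

    def __init__(self):
        self._cache = {}

    @usemissing
    def __getitem__(self, key):
        return self._cache[key]

    def __missing__(self, key):
        value = self._cache[key] = key * key
        return value


class NoFallback:
    """No __missing__ defined, so the original KeyError propagates."""

    @usemissing
    def __getitem__(self, key):
        raise KeyError(key)


sq = Squares()
first = sq[4]           # computed by __missing__ and cached
cached = dict(sq._cache)
```

Note that this version looks up __missing__ on the instance, unlike dict, which looks it up on the type.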

This seems like a great idea, although maybe it would be easier to use as a class decorator rather than a method decorator. Either this:

    def usemissing(cls):
        missing = cls.__missing__
        getitem = cls.__getitem__
        def __getitem__(self, key):
            try:
                return getitem(self, key)
            except KeyError:
                return missing(self, key)
        cls.__getitem__ = __getitem__
        return cls

Or this:

    def usemissing(cls):
        getitem = cls.__getitem__
        def __getitem__(self, key):
            try:
                return getitem(self, key)
            except KeyError:
                return type(self).__missing__(self, key)
        cls.__getitem__ = __getitem__
        return cls

This also preserves the usual class-based rather than instance-based lookup for most special methods (including __missing__ on dict subclasses).

The first one has the advantage of failing at class decoration time rather than at first missing lookup time if you forget to include a __missing__, but it has the cost that (unlike a dict subclass) you can’t monkeypatch __missing__ after construction time. So I think I’d almost always prefer the first, but the second might be a better fit for the stdlib anyway?
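To make the tradeoff concrete, here is a sketch with the two variants renamed usemissing_eager and usemissing_late (hypothetical names, only so both can live in one file). Store and its subclasses are stand-ins for a user mapping.

```python
def usemissing_eager(cls):
    missing = cls.__missing__  # AttributeError here if the class lacks one
    getitem = cls.__getitem__
    def __getitem__(self, key):
        try:
            return getitem(self, key)
        except KeyError:
            return missing(self, key)
    cls.__getitem__ = __getitem__
    return cls


def usemissing_late(cls):
    getitem = cls.__getitem__
    def __getitem__(self, key):
        try:
            return getitem(self, key)
        except KeyError:
            return type(self).__missing__(self, key)  # looked up per call
    cls.__getitem__ = __getitem__
    return cls


class Store:
    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]


# Eager variant: forgetting __missing__ fails at decoration time.
try:
    usemissing_eager(Store)
except AttributeError:
    eager_failed_early = True
else:
    eager_failed_early = False


@usemissing_eager
class EagerStore(Store):
    def __missing__(self, key):
        return "old"


@usemissing_late
class LateStore(Store):
    def __missing__(self, key):
        return "old"


# Monkeypatch both classes after decoration.
EagerStore.__missing__ = lambda self, key: "new"
LateStore.__missing__ = lambda self, key: "new"

eager_result = EagerStore()["k"]  # still uses the captured function
late_result = LateStore()["k"]    # sees the patched function
```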

I think either the method decorator or the class decorator makes sense for the stdlib. The only question is where to put it. Either fits in nicely with things like cached_property and total_ordering in functools. I’m not sure people will think to look for it there, as opposed to in collections or something else in the Data Types chapter in the docs, but that’s true of most of functools, and at least once people discover it (from a Python-list or StackOverflow question or whatever) they’ll learn where it is and be able to use it easily, just like the rest of that module.

It’s simple, but something many Python programmers couldn’t write for themselves, or would get wrong and have a hard time debugging, and it seems like the most flexible and least obtrusive way to do it. (It does still need actual motivating examples, though. Historically, the bar seems to be lower for new decorators in functools than new classes in collections, but it’s still not no bar…)

Additionally—I’m a lot less sure if this one belongs in the stdlib like @usemissing, but if you were going to put this on PyPI as a mappingtools or collections2 or more-functools or whatever—you could have an @addmissing(missingfunc) decorator, to handle cases where you want to adapt some third-party mapping type without modifying the code or subclassing:

    from sometreelib import SortedDict
    def missing(self, key):
        # ... whatever ...
    SortedDict = addmissing(missing)(SortedDict)

And if the common use cases are the same kinds of trivial functions as defaultdict, you could also do this:

    @addmissing(lambda self, key: key)
    class MyDict…
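One possible shape for addmissing, as a rough sketch only. I am assuming here that the decorator returns a new subclass rather than patching the third-party class in place (patching would affect every other user of that class, and is not possible for builtin types anyway). ThirdPartyMapping is a hypothetical stand-in for something like a library's sorted dict.

```python
def addmissing(missingfunc):
    """Return a class decorator that adds a KeyError fallback to __getitem__."""
    def decorator(cls):
        class WithMissing(cls):
            def __getitem__(self, key):
                try:
                    return super().__getitem__(key)
                except KeyError:
                    return missingfunc(self, key)
        # Keep the original name so reprs and error messages stay familiar.
        WithMissing.__name__ = cls.__name__
        WithMissing.__qualname__ = cls.__qualname__
        return WithMissing
    return decorator


class ThirdPartyMapping:
    """Hypothetical stand-in for a mapping type from some library."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]


Adapted = addmissing(lambda self, key: key)(ThirdPartyMapping)

m = Adapted({"a": 1})
present = m["a"]      # ordinary lookup
absent = m["zzz"]     # falls back to the supplied function


@addmissing(lambda self, key: key)
class EchoDict(dict):
    pass


echoed = EchoDict()["anything"]
```

The original ThirdPartyMapping is left untouched, so code elsewhere that relies on its KeyError keeps working.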

Alternatively, it could be implemented as part of one of the ABCs, maybe something like:
```
from abc import abstractmethod
from collections.abc import Mapping

class MissingMapping(Mapping):
  # Could also give MissingMapping its own metaclass
  # and do the modification of __getitem__ there.
  def __init_subclass__(cls, **kwargs):
    super().__init_subclass__(**kwargs)
    cls.__getitem__ = usemissing(cls.__getitem__)

  @abstractmethod
  def __missing__(self, key): pass
```

Presumably you’d also want to precompose a MutableMissingMapping ABC. Most user mappings are mutable, and I suspect that’s even more true for those that need __missing__, given that most uses of defaultdict are things like building up a multidict without knowing all the keys in advance.

As for the implementation, I think __init_subclass__ makes more sense than a metaclass (presumably a subclass of ABCMeta). Since mixins are all about composing, often with multiple inheritance, it’s hard to add metaclasses without interfering with user subclasses. (Or even with your own future: imagine if someone realizes MutableMapping needs its own metaclass, and MissingMapping already has one; now there’s no way to write MutableMissingMapping.) Composing ABCs with non-ABC mixins is already more of a pain than would be ideal, and I think a new submetaclass would make it worse.

But at any rate, I’m not sure this is a good idea. None of the other ABCs hide methods like this. For example, Mapping will give you a __contains__ if you don’t have one, but if you do write one that does things differently, yours overrides the default; here, there’d be no way to override the default __getitem__ to do things differently, because that would just get wrapped and replaced. That isn’t unreasonable behavior for a mixin in general, but I think it is confusing for an ABC/mixin hybrid in collections.abc.
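To make that concern concrete, here is a sketch that reuses the usemissing and MissingMapping definitions from earlier in the thread. Strict is a hypothetical subclass whose author wants a plain lookup with no fallback, but the wrapping applied in __init_subclass__ still routes misses through __missing__.

```python
from abc import abstractmethod
from collections.abc import Mapping
from functools import wraps


def usemissing(getitem):
    @wraps(getitem)
    def wrapped(self, key):
        try:
            return getitem(self, key)
        except KeyError as e:
            try:
                missing = self.__missing__
            except AttributeError:
                raise e from None
        return missing(key)
    return wrapped


class MissingMapping(Mapping):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.__getitem__ = usemissing(cls.__getitem__)

    @abstractmethod
    def __missing__(self, key):
        raise KeyError(key)


class Strict(MissingMapping):
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        # The author intends a plain lookup here, with no fallback.
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __missing__(self, key):
        return "fallback"


s = Strict({"a": 1})
hit = s["a"]
miss = s["b"]  # the override was wrapped, so __missing__ still runs
```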

Also, a protocol or ABC is something you can check for compliance (at runtime, or statically in mypy, or just in theory even if not in the actual code); is there ever any point in asking whether an object complies with MissingMapping? It’s something you can use an object as, but is there any way you can use a MissingMapping differently from a Mapping? So I think it’s not an ABC because it’s not a protocol.

Of course you could just make it a pure mixin that isn’t an ABC (and maybe isn’t in collections.abc), which also solves all of the problems above (except the optional one with the metaclass, but you already avoided that). But at that point, are there any advantages over the method or class decorator?

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/N6K2PLYJ7ZWEAN6FZWUGNJH23JBQQM33/