Sounds good to me.

If you want a second opinion als Raymond Hettinger.

On Wed, Jan 8, 2020 at 11:09 Bar Harel <bzvi7919@gmail.com> wrote:
Well it depends on the consensus for __missing__, and the consensus for get().

I believe modifying get() is out of the question as it will lead to breakage. If so, whether it's a quirk or not doesn't matter as we can't change it. Adding a comment to the docs should suffice, along with updating the data model to explicitly state dict.__missing__ instead of object.__missing__.

As for __missing__, if it's only a dict thing, I don't believe Mapping should be aware of it's existence, which means the patch should center on UserDict behaving differently (thus mimicking dict's quirky behavior).

Since we cannot override get() to access self.data, (as it will bypass and break custom __getitem__) I think we might as well modify it to use the __contains__ it already created.

A simple check of "if k in D __getitem__" will slow only UserDict, not all mappings, and be an acceptable solution.

That's all as long as we accept the 2 facts that __missing__ is dict-only and .get() will stay somewhat quirky and unexpected but documented.

On Wed, Jan 8, 2020 at 8:11 PM Guido van Rossum <guido@python.org> wrote:
Hum, it looks like the UserDict.get() behavior is an accident due to the way Mapping.get() is implemented. The comment there says

    'D.get(k[,d]) -> D[k] if k in D, else d.  d defaults to None.'

but the implementation just tries D[k] and catches KeyError. The UserDict.__contains__() implementation is explicitly code to avoid __missing__, since the default Mapping.__contains__() implementation *also* just calls D[k] and catches KeyError.

More indication that the UserDict.get() behavior is an accident: in test_userdict.py there's a test for __missing__, test_missing() but it doesn't test get() at all, only __getitem__().

How were you proposing to fix the situation? I could imagine changing Mapping.get() to call __contains__() first. That would slow down Mapping.get() slightly (the default implementation would end up calling __getitem__() twice if the key is present), but I don't think that's a grave concern (if people want it faster there are a zillion ways to do that).

On Wed, Jan 8, 2020 at 9:49 AM Bar Harel <bzvi7919@gmail.com> wrote:
Sorry for the double post then :-)

As for get(). Current implementation of UserDict returns the default value for get() if __missing__() raises KeyError. Which sounds reasonable to me. After all, you would logically expect __missing__ to be called for get() as you tried to get a non-existing value.

If __missing__ is only a dict thing, then I'll add a note to dict's .get() method saying it doesn't call __missing__ for missing values.
I can also modify UserDict to behave in a consistent manner.

Does that make sense?

On Wed, Jan 8, 2020 at 7:42 PM Guido van Rossum <guido@python.org> wrote:
(Note: We received your email twice.)

I don't think __missing__ should be called by get() -- get() already has a way to deal with missing keys, and making it use two different mechanisms would be weird (e.g. if get() calls __missing__, is the default value ever used?).

To answer your second question, __missing__ is only a dict thing, it is not part of Mapping.

On Wed, Jan 8, 2020 at 8:07 AM Bar Harel <bzvi7919@gmail.com> wrote:
Hey guys,

I was just about to fix dict's get() to call __missing__ if a key doesn't exist (before returning the default value) but realized that although small, that patch can cause future issues.
Right now there's an inconsistency:

>>> from collections import UserDict
>>> class A(dict):
...  def __missing__(self, key):
...   print(key)
...
>>> class B(UserDict):
...  def __missing__(self, key):
...   print("UserDict", key)
...
>>> a = A()
>>> b = B()
>>> a.get(123)
>>> b.get(123)
UserDict 123
>>> a.get(123, "abc")
'abc'
>>> b.get(123, "abc")
UserDict 123

The reason for this inconsistency is because the Mapping abc and dict behave differently.
Dict's get doesn't call __getitem__ which causes the call not to route to __missing__.
MutableMapping's get calls __getitem__, which UserDict implements as a check to __missing__ as well.

According to the doc, the specification requires dict's __getitem__ to call __missing__. It doesn't say anything about get().

Should get() call __missing__?

If it does, things like defaultdict.get() might break. It will however be more consistent with dict's specification.
If it doesn't, we expect Mapping to not care about __missing__ as it's only a dict thing, which will require UserDict to override get(). Dict's get() will need to receive a doc update as well stating __missing__ is not called.

Second question is: Is __missing__ only a dict thing, or is it part of the Mapping ABC?

I would expect it to be a part of the Mapping ABC, with subclasses not having to implement it. Right now it's not.

Looking forward for your inputs,
Bar Harel
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/SDXOEMAEM6KQ3CQCJVBVRT5QNSPAVU6X/
Code of Conduct: http://python.org/psf/codeofconduct/


--
--Guido van Rossum (python.org/~guido)


--
--Guido van Rossum (python.org/~guido)
--
--Guido (mobile)