On Thu, Jun 25, 2020 at 9:02 AM Steele Farnsworth <swfarnsworth@gmail.com> wrote:
My point was only that, as far as I know, all the methods on built-in container types that serve only to change what is contained return None, and that this was an intentional design choice, so changing it in one case would have to prompt a larger discussion about what those sorts of methods should return.

indeed -- and that is pretty darn baked into Python, so I don't think it's going to change.

Note: I'm not sure your example with setdefault is correct:

seen = {}
for i in iterable:
  if seen.set_default(i, some_value) is not None:
    ...  # do something in case of duplicates
  else:
    ... # do something in case of first visit

A) you spelled setdefault() wrong -- no underscore

B) seen.setdefault(i, some_value) will return some_value if the key is not there, and whatever value is already stored if it is (in this case, starting with an empty dict, it will always be some_value).
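A quick check of that return behaviour, using nothing but a plain dict (the key "k" and the names first and second are just for illustration):

```python
d = {}

# Key missing: the default is stored and returned.
first = d.setdefault("k", "sentinel")

# Key present: the stored value is returned and the new default is ignored.
second = d.setdefault("k", "other")

print(first, second, d)
```

Either way the call hands back a non-None value here, so an "is not None" test on it cannot tell a first visit from a duplicate.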

Running this code:

some_value = "sentinel"

iterable = [3, 2, 4, 2, 3]

seen = {}
for i in iterable:
    if seen.setdefault(i, some_value) is not None:
        # do something in case of duplicates
        print(f'{i} was already in there')
    else:
        # do something in case of first visit
        print(f'{i} was not already there')

results in:

In [9]: run in_set.py                                                          
3 was already in there
2 was already in there
4 was already in there
2 was already in there
3 was already in there

So that doesn't work: every item is reported as a duplicate.

But you can make it work if you reset the value:

some_value = "sentinel"

iterable = [3, 2, 4, 2, 3]

seen = {}
for i in iterable:
    if seen.setdefault(i, None) is not None:
        # do something in case of duplicates
        print(f'{i} was already in there')
    else:
        # do something in case of first visit
        print(f'{i} was not already there')
        seen[i] = some_value

In [11]: run in_set.py                                                          
3 was not already there
2 was not already there
4 was not already there
2 was already in there
3 was already in there

But this is a bit clunky as well, and not really any better than the set version.
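A variation that avoids the second assignment, offered only as a sketch: pass a brand-new object() as the default on every iteration. setdefault hands back that same object only when the key was missing, so an identity check separates the two cases. The names token, first_visits and duplicates are made up for this example.

```python
iterable = [3, 2, 4, 2, 3]

seen = {}
first_visits = []
duplicates = []
for i in iterable:
    token = object()  # fresh, unique default for this iteration
    if seen.setdefault(i, token) is token:
        # the key was missing, so setdefault stored and returned our token
        first_visits.append(i)
    else:
        # an older token was already stored under this key
        duplicates.append(i)

print(first_visits, duplicates)
```

It is arguably still no clearer than the plain set version, since the dict values are throwaway objects that exist only for the identity test.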




However, for the case at hand, adding a method similar to dict.setdefault() would be a reasonable thing to do. I'm not sure what to call it, or what the API should be, but maybe:

class my_set(set):

    def add_if_not_there(self, item):
        """Add item; return True if it was already in the set, else False."""
        if item in self:
            return True
        self.add(item)
        return False

seen = my_set()

for i in iterable:
    if seen.add_if_not_there(i):
        print(f'{i} was already in there')
    else:
        print(f'{i} was not already there')


However, while dict.setdefault does clean up and clarify otherwise somewhat ugly code, I'm not sure this is that much better than:

for i in iterable:
    if i in seen:
        print(f'{i} was already in there')
    else:
        seen.add(i)
        print(f'{i} was not already there')

But feel free to make the case :-)

Note that setdefault is in the MutableMapping ABC, so there could be some debate about whether to add this new method to the MutableSet ABC.
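To make the ABC question concrete, here is a rough sketch of how such a method could be written as a mixin on top of collections.abc.MutableSet, relying only on the abstract __contains__ and add. The names AddIfMissingMixin, add_if_not_there and ListSet are hypothetical; nothing like this exists in collections.abc today.

```python
from collections.abc import MutableSet


class AddIfMissingMixin(MutableSet):
    """Hypothetical mixin; not part of collections.abc."""

    def add_if_not_there(self, item):
        # Uses only __contains__ and add, which every MutableSet must provide.
        if item in self:
            return True
        self.add(item)
        return False


class ListSet(AddIfMissingMixin):
    """Toy list-backed set, just enough to exercise the mixin."""

    def __init__(self, items=()):
        self._items = []
        for item in items:
            self.add(item)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, item):
        if item not in self._items:
            self._items.append(item)

    def discard(self, item):
        if item in self._items:
            self._items.remove(item)


s = ListSet()
print(s.add_if_not_there(1), s.add_if_not_there(1), list(s))
```

A generic version like this does two lookups (the membership test, then the add), so a concrete set type would presumably want to override it with a single-lookup implementation.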

-CHB


 

I wouldn't be opposed to that discussion happening, or to any resulting changes landing within 3.x, because I doubt that much existing code depends on these methods returning None, or even uses what they return at all.

On Thu, Jun 25, 2020, 10:28 AM Ben Avrahami <avrahami.ben@gmail.com> wrote:
Hey all,
Often I've found this kind of code:

seen = set()
for i in iterable:
  if i in seen:
    ...  # do something in case of duplicates
  else:
    seen.add(i)
    ... # do something in case of first visit

This kind of code appears whenever one needs to check for duplicates in case of a user-submitted iterable, or when we loop over a recursive iteration that may involve cycles (graph search or the like). This code could be improved if one could ensure an item is in the set, and get whether it was there before in one operation. This may seem overly specific, but dicts do do this:

seen = {}
for i in iterable:
  if seen.set_default(i, some_value) is not None:
    ...  # do something in case of duplicates
  else:
    ... # do something in case of first visit

I think the set type would benefit greatly from its add method having a return value. set.add would return True if the item was already in the set prior to insertion, and False otherwise.

Looking at the CPython code, set_add_entry already detects existing entries, so adding a return value would require no additional complexity.

Any thoughts?
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/6WYNYNG5J5HBD3PA7PW75RP4PMLOMH4C/
Code of Conduct: http://python.org/psf/codeofconduct/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ILNANLAGZR3S6VBMK7FJXUZZUMKGKJOV/


--
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython