Many times I have the need for a dictionary and its inverse form (keys and values swapped). For static dictionaries, it is easy, but sometimes I wish there were a class to hold and update both dictionaries at the same time. Also it will avoid creating 2 variables. Sample: # Current code d = {'foo': 10, 'bar': 12, 'baz': 17} d_inv = {v: k for k, v in d.items()} # My proposal d = InverseDict({'foo': 10, 'bar': 12, 'baz': 17}) d['foo'] == 10 d.inv[10] == 'foo' This class could exist in the collections module. The way the inverse dictionary is accessed could be different (another attribute name or another form): d.I[10] # Like numpy's matrix.T for transpose d.inverse[10] # More verbose d[10:1] # d[10:0] and d[10] would mean the direct dict # while d[10:1] the inverse one. # This would limit the usage to get/set/del only... (-d)[10] # negative symbol but it would require parenthesis. # (~d)[10] should be more correct because of # the operator's name The inverse version of the dict may or may not allow repeated items (overwrite vs error). The important part would be to remove keys from the inverse dict when the value is changed in the direct one: d['foo'] = 1 d.inv[1] == 'foo' d.inv[10] -> will raise KeyError instead of returning 'foo' Regards, João Bernardo
On Fri, Feb 12, 2016 at 1:14 PM, João Bernardo <jbvsmo@gmail.com> wrote:
Many times I have the need for a dictionary and its inverse form (keys and values swapped). ... The important part would be to remove keys from the inverse dict when the value is changed in the direct one:
d['foo'] == 10 d.inv[10] == 'foo' d['foo'] = 1 d.inv[1] == 'foo' d.inv[10] -> will raise KeyError instead of returning 'foo'
Sounds like what you're looking for is a view, rather than an additional concrete dictionary. In other words, it's a (presumably more efficient) way of doing this: def d.inv[x]: for key in d: if d[key] == x: return key raise KeyError Does that sound right? ChrisA
On Fri, Feb 12, 2016 at 12:18 AM, Chris Angelico <rosuav@gmail.com> wrote:
Sounds like what you're looking for is a view, rather than an additional concrete dictionary.
If the view offers the same O(1) complexity on amortized case, then it is OK. For the typical use in my experience (small dictionaries and several accesses) space is not a concern, so really adding a concrete dictionary like I've been doing is also fine. João Bernardo
You hit the nail on the head that inverting dictionaries has a big problem with collisions, as values become keys. One solution to this is to use a MultiDict, a la boltons' OrderedMultiDict: https://boltons.readthedocs.org/en/latest/dictutils.html#boltons.dictutils.O... But yes, I use inversions all the time, and would love to see a MultiDict in the collections module to support that use case. Mahmoud https://github.com/mahmoud https://twitter.com/mhashemi http://sedimental.org On Thu, Feb 11, 2016 at 6:14 PM, João Bernardo <jbvsmo@gmail.com> wrote:
Many times I have the need for a dictionary and its inverse form (keys and values swapped). For static dictionaries, it is easy, but sometimes I wish there were a class to hold and update both dictionaries at the same time. Also it will avoid creating 2 variables.
Sample:
# Current code d = {'foo': 10, 'bar': 12, 'baz': 17} d_inv = {v: k for k, v in d.items()}
# My proposal d = InverseDict({'foo': 10, 'bar': 12, 'baz': 17})
d['foo'] == 10 d.inv[10] == 'foo'
This class could exist in the collections module. The way the inverse dictionary is accessed could be different (another attribute name or another form):
d.I[10] # Like numpy's matrix.T for transpose d.inverse[10] # More verbose
d[10:1] # d[10:0] and d[10] would mean the direct dict # while d[10:1] the inverse one. # This would limit the usage to get/set/del only...
(-d)[10] # negative symbol but it would require parenthesis. # (~d)[10] should be more correct because of # the operator's name
The inverse version of the dict may or may not allow repeated items (overwrite vs error). The important part would be to remove keys from the inverse dict when the value is changed in the direct one:
d['foo'] = 1 d.inv[1] == 'foo' d.inv[10] -> will raise KeyError instead of returning 'foo'
Regards, João Bernardo
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Thursday, February 11, 2016 6:15 PM, João Bernardo <jbvsmo@gmail.com> wrote:
Many times I have the need for a dictionary and its inverse form (keys and values swapped). For static dictionaries, it is easy, but sometimes I wish there were a class to hold and update both dictionaries at the same time. Also it will avoid creating 2 variables.
This really sounds like something you should just flesh out and put on PyPI, and then suggest for stdlib inclusion once it's stable. And not just for the usual reasons, but because I think there are some design decisions that you'll want to work through and then play with in real code before you're sure you're happy with it. For example, when I've built something like this, sometimes I want errors on duplicate values, sometimes just pick one arbitrarily, sometimes I want the inverse dict to store sets of values, and sometimes I even want _both_ to store sets of values. Do you want to do just one of those, or two or more? Controlled by flags? Or different attribute names? Or different classes?
On Fri, Feb 12, 2016 at 2:07 AM, Andrew Barnert <abarnert@yahoo.com> wrote:
For example, when I've built something like this, sometimes I want errors on duplicate values, sometimes just pick one arbitrarily, sometimes I want the inverse dict to store sets of values, and sometimes I even want _both_ to store sets of values. Do you want to do just one of those, or two or more? Controlled by flags? Or different attribute names? Or different classes?
My use case is quite simple. I believe I always used this pattern on a one-to-one mapping, so allowing repeated values to overwrite or throwing an error is not exactly important to me, but it may be to someone else. Someone suggested a MultiDict sort of thing, but I don't think I need that amount of complexity here. I will write a concrete class and post on PyPI then. João Bernardo
the other restriction is that both keys and values (or is it values and keys?) both need to be immutable. but yes, I've needed this a lot, and I think that fact that both sets of keys need to be unique isn't really a restriction -- if you need to look it up either way, then it's probably naturally a one-to-one mapping anyway. I've needed this a few times, and have always kludged something together, rather than make a proper, robust tested class. So I look forward to seeing what you come up with. -CHB On Fri, Feb 12, 2016 at 5:27 AM, João Bernardo <jbvsmo@gmail.com> wrote:
On Fri, Feb 12, 2016 at 2:07 AM, Andrew Barnert <abarnert@yahoo.com> wrote:
For example, when I've built something like this, sometimes I want errors on duplicate values, sometimes just pick one arbitrarily, sometimes I want the inverse dict to store sets of values, and sometimes I even want _both_ to store sets of values. Do you want to do just one of those, or two or more? Controlled by flags? Or different attribute names? Or different classes?
My use case is quite simple. I believe I always used this pattern on a one-to-one mapping, so allowing repeated values to overwrite or throwing an error is not exactly important to me, but it may be to someone else.
Someone suggested a MultiDict sort of thing, but I don't think I need that amount of complexity here.
I will write a concrete class and post on PyPI then.
João Bernardo
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On Feb 12, 2016, at 05:27, João Bernardo <jbvsmo@gmail.com> wrote:
On Fri, Feb 12, 2016 at 2:07 AM, Andrew Barnert <abarnert@yahoo.com> wrote: For example, when I've built something like this, sometimes I want errors on duplicate values, sometimes just pick one arbitrarily, sometimes I want the inverse dict to store sets of values, and sometimes I even want _both_ to store sets of values. Do you want to do just one of those, or two or more? Controlled by flags? Or different attribute names? Or different classes?
My use case is quite simple. I believe I always used this pattern on a one-to-one mapping, so allowing repeated values to overwrite or throwing an error is not exactly important to me, but it may be to someone else.
Someone suggested a MultiDict sort of thing, but I don't think I need that amount of complexity here.
I will write a concrete class and post on PyPI then.
I look forward to it. Next time I need some variation of this, even if it *isn't* the same variation you end up building, the fact that there's a potential de facto standard for what to call the ".inv" or whatever still helps me, right? Also, even if nobody ever agrees to put this in the stdlib, the collections module linked to some outside recipes/modules like OrderedSet long before Nick and the other packaging guys started pushing the idea that it was a good idea for the stdlib in general. I think if this gets sufficient uptake, it deserves such a link. As you say, it's something many people end up building for themselves.
On Fri, Feb 12, 2016 at 12:27 PM, Andrew Barnert via Python-ideas < python-ideas@python.org> wrote:
I look forward to it. Next time I need some variation of this, even if it *isn't* the same variation you end up building, the fact that there's a potential de facto standard for what to call the ".inv" or whatever still helps me, right?
yeah, though I'm not sure I like that name... (can't think of a better one right now, though). But what I would like is for the "inverse" to be available as an object itself, so: my_double_dict = DoubleDict( ( (1:'a'), (2:'b'), (3:'c) ) ) my_inverse = my_double_dict.inv my_double_dict[1] == 'a' my_inverse['a'] == 1 i.e you now have two objects, which are essentially the same object, but with inverse referencing semantics. Note how I cleverly introduced a name for this object -- I think it more as doubling than inversing, or really a one-to-one mapping, but haven't thought of a clever name for from that... And for my part, re-using a key on either side should simply write-over the existing item, just like a dict. -CHB
Also, even if nobody ever agrees to put this in the stdlib, the collections module linked to some outside recipes/modules like OrderedSet long before Nick and the other packaging guys started pushing the idea that it was a good idea for the stdlib in general. I think if this gets sufficient uptake, it deserves such a link. As you say, it's something many people end up building for themselves.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On Fri, Feb 12, 2016 at 1:36 PM, Chris Barker <chris.barker@noaa.gov> wrote:
On Fri, Feb 12, 2016 at 12:27 PM, Andrew Barnert via Python-ideas <python-ideas@python.org> wrote:
I look forward to it. Next time I need some variation of this, even if it *isn't* the same variation you end up building, the fact that there's a potential de facto standard for what to call the ".inv" or whatever still helps me, right?
yeah, though I'm not sure I like that name... (can't think of a better one right now, though).
But what I would like is for the "inverse" to be available as an object itself, so:
my_double_dict = DoubleDict( ( (1:'a'), (2:'b'), (3:'c) ) ) my_inverse = my_double_dict.inv
my_double_dict[1] == 'a' my_inverse['a'] == 1
i.e you now have two objects, which are essentially the same object, but with inverse referencing semantics.
I have some unpublished (lightly tested) code that basically does this (inspired by Guava's BiMap). class BiDict(dict): def __init__(self, *args, **kwargs): self._inverse = _BiDictInverse(self) self._super_inverse = super(BiDict, self._inverse) self.update(*args, **kwargs) @property def inverse(self): return self._inverse def __repr__(self): return 'BiDict(%s)' % super().__repr__() def __setitem__(self, key, value): if value in self._inverse and self._inverse[value] != key: raise ValueError( '%r already bound to %r' % (value, self._inverse[value])) if key in self: self._super_inverse.__delitem__(self[key]) super().__setitem__(key, value) self._super_inverse.__setitem__(value, key) def __delitem__(self, key): self._super_inverse.__delitem__(self[key]) super().__delitem__(key) def clear(self): super().clear() self._super_inverse.clear() def pop(self, key, *args): key_was_present = key in self value = super().pop(key, *args) if key_was_present: self._super_inverse.__delitem__(value) return value def popitem(self): (key, value) = super().popitem() self._super_inverse.__delitem__(value) return (key, value) def setdefault(self, key, *args): key_was_present = key in self value = super().setdefault(key, *args) if not key_was_present: self._super_inverse.__setitem__(value, key) return value def update(self, *args, **kwargs): if len(args) > 1: raise TypeError( 'update expected at most 1 arguments, got %d' % len(args)) if args and hasattr(args[0], 'keys'): for key in args[0]: self[key] = args[0][key] elif args: for key, value in args[0]: self[key] = value for key in kwargs: self[key] = kwargs[key] class _BiDictInverse(BiDict): def __init__(self, forward_dict): self._inverse = forward_dict self._super_inverse = super(BiDict, self._inverse)
On Feb 12 2016, Chris Barker <chris.barker-32lpuo7BZBA@public.gmane.org> wrote:
On Fri, Feb 12, 2016 at 12:27 PM, Andrew Barnert via Python-ideas < python-ideas-+ZN9ApsXKcEdnm+yROfE0A@public.gmane.org> wrote:
I look forward to it. Next time I need some variation of this, even if it *isn't* the same variation you end up building, the fact that there's a potential de facto standard for what to call the ".inv" or whatever still helps me, right?
yeah, though I'm not sure I like that name... (can't think of a better one right now, though).
Are there use cases where the set of keys and the set of values is overlapping? In most cases you can probably get away by not offering an "inverse" object at all, just let regular lookup fall back on value lookup. Best, -Nikolaus (No Cc on replies please, I'm reading the list) -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F »Time flies like an arrow, fruit flies like a Banana.«
On Fri, Feb 12, 2016 at 3:02 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
Are there use cases where the set of keys and the set of values is overlapping?
In most cases you can probably get away by not offering an "inverse" object at all, just let regular lookup fall back on value lookup.
hmm, interesting. my use case at hand is mapping color name strings to RGB values -- it should be a one to one relationship, and I need to look for either one. But you are quite right, they would never overlap, unless someone decided a nice name for a color would be "(125, 15, 235)" -- even then they wouldn't because the RGB value sare a tuple, not a string. So that may be the way to get the easiest interface. -Thanks, - Chris
Best, -Nikolaus
(No Cc on replies please, I'm reading the list) -- GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.« _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
On Feb 12, 2016, at 15:59, Chris Barker <chris.barker@noaa.gov> wrote:
On Fri, Feb 12, 2016 at 3:02 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
Are there use cases where the set of keys and the set of values is overlapping?
In most cases you can probably get away by not offering an "inverse" object at all, just let regular lookup fall back on value lookup.
hmm, interesting. my use case at hand is mapping color name strings to RGB values -- it should be a one to one relationship, and I need to look for either one.
But you are quite right, they would never overlap, unless someone decided a nice name for a color would be "(125, 15, 235)" -- even then they wouldn't because the RGB value sare a tuple, not a string.
So that may be the way to get the easiest interface.
If that's your use case, there's an even easier interface: just toss the forward and reverse mappings in the same dict. If you don't need to iterate the dict[^1] or print in user-friendly form, it's hard to beat that for simplicity or compactness.[^2] Of the top of my head (untested): class SelfInvertingDict(dict): def __delitem__(self, key): super().__delitem__(self[key]) super().__delitem__(key) def __setitem__(self, key, value): if value in self and self[value] != key: raise ValueError("duplicate key: '{}'".format(value) if key in self: del self[key] super().__setitem__(key, value) super().__setitem__(value, key) [^1]: Even if you do need to iterate, you can always do "(or don't mind iterating with a type switch like "names = (key for key in d if isinstance(key, str))" [^2]: On the other hand, it's pretty easy to beat for static-type-checking purposes. The actual type of that dual dict is pretty ugly, and probably not describable in mypy terms, so presumably you'd just punt and label it Dict[Any, Any], or at best stick Union types in for key and value. But that's true for the double-dict-with-fall-through design too.
participants (7)
-
Andrew Barnert
-
Chris Angelico
-
Chris Barker
-
Ian Kelly
-
João Bernardo
-
Mahmoud Hashemi
-
Nikolaus Rath