Mapping with continguous ranges of keys
Peter Otten
__peter__ at web.de
Thu Dec 15 12:54:48 EST 2016
Steve D'Aprano wrote:
> I have some key:value data where the keys often are found in contiguous
> ranges with identical values. For example:
>
> {1: "foo",
> 2: "foo",
> 3: "foo",
> # same for keys 4 through 99
> 100: "foo",
> 101: "bar",
> 102: "bar",
> 103: "foobar",
> 104: "bar",
> 105: "foo",
> }
>
>
> So in this case, the keys 1 through 100, plus 105, all have the same
> value.
>
> I have about a million or two keys, with a few hundred or perhaps a few
> thousand distinct values. The size of each contiguous group of keys with
> the same value can vary from 1 to perhaps a hundred or so.
>
> Has anyone dealt with data like this and could give a recommendation of
> the right data structure to use?
>
> The obvious way is to explicitly list each key, in a regular dict. But I
> wonder whether there are alternatives which may be better (faster, more
> memory efficient)?
Use a list, either directly if there are no big holes
>>> r = range(10**5)
>>> sys.getsizeof(list(r))/sys.getsizeof(dict(zip(r, r)))
0.14306676635590074
or indirectly:
ranges_list = [
1,
101,
103,
104,
105,
106,
]
index_to_value = {
1: "foo",
2: "bar",
3: "foobar",
4: "bar",
5: "foo",
}
def get(key, default="missing"):
index = bisect.bisect(ranges_list, key)
return index_to_value.get(index, default)
More information about the Python-list
mailing list