Mapping with continguous ranges of keys
mbg1708 at planetmail.com
mbg1708 at planetmail.com
Fri Dec 16 13:39:34 EST 2016
On Thursday, 15 December 2016 17:06:39 UTC, Steve D'Aprano wrote:
> I have some key:value data where the keys often are found in contiguous
> ranges with identical values. For example:
>
> {1: "foo",
> 2: "foo",
> 3: "foo",
> # same for keys 4 through 99
> 100: "foo",
> 101: "bar",
> 102: "bar",
> 103: "foobar",
> 104: "bar",
> 105: "foo",
> }
> ...
> --
> Steve
All the answers seem to rely on in-memory solutions. But isn't the problem a classic data design problem (cf Codd) with two tables.
CREATE TABLE keys (id INTEGER NOT NULL PRIMARY KEY,
kkey INTEGER,
UNIQUE (kkey) );
## eg id = 999, kkey=101
CREATE TABLE values (id INTEGER NOT NULL PRIMARY KEY,
k_id INTEGER,
value VARCHAR,
UNIQUE (k_id, value),
FOREIGN KEY (k_id) REFERENCES keys(id));
## eg k_id = 999, value = "bar"
For example, Python/SQLITE can parse the list of key:value pairs. key is looked up in keys -- and a keys row is added if the key is new. The keys id is saved.
k_id--value pair is looked up -- and a row is added if the pair is new.
Some of this can be simplified by relying on SQL to handle non-UNIQUE errors.
This approach may be slower than in-memory processing, but it has almost no database size limits.
More information about the Python-list
mailing list