factory for efficient creation of many dicts with the same keys

Hi all,

Sometimes you need to create many dicts with the same keys but different values, for example when returning rows from a database as dicts. I think a special type could be added to handle this more efficiently.

I created a proof of concept, and here are the benchmarks:

    # currently the fastest way to do it AFAIK
    $ ./python -m timeit -s "nkeys = 5; nrows = 1000; rows = [(i,)*nkeys for i in range(nrows)]; enumerated = list(enumerate(range(nkeys)))" "for row in rows: {key: row[i] for i, key in enumerated}"
    500 loops, best of 5: 645 usec per loop

    $ ./python -m timeit -s "nkeys = 5; nrows = 1000; rows = [(i,)*nkeys for i in range(nrows)]; factory = dict.factory(*range(nkeys)); from itertools import starmap" "for d in starmap(factory, rows): d"
    5000 loops, best of 5: 81.1 usec per loop

I'd like to write a patch if this idea is accepted.
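For readers without a patched interpreter, the baseline side of this benchmark can be approximated with the standard `timeit` module. The sketch below is only an approximation under stated assumptions: `make_rows`, `build_dicts_comprehension` and `build_dicts_zip` are hypothetical helper names, the results are collected into a list (the original one-liner discards them), and `dict.factory` is not timed because it does not exist in stock CPython. No timings are quoted since they depend on the machine and Python version.

```python
import timeit

NKEYS = 5
NROWS = 1000

def make_rows(nkeys=NKEYS, nrows=NROWS):
    """Build rows like the benchmark: row i is the tuple (i, i, ..., i)."""
    return [(i,) * nkeys for i in range(nrows)]

def build_dicts_comprehension(rows, keys):
    """Baseline from the post: index into each row in a dict comprehension."""
    enumerated = list(enumerate(keys))
    return [{key: row[i] for i, key in enumerated} for row in rows]

def build_dicts_zip(rows, keys):
    """Alternative baseline: pair keys with values via zip."""
    return [dict(zip(keys, row)) for row in rows]

if __name__ == "__main__":
    rows = make_rows()
    keys = list(range(NKEYS))
    loops = 100
    for func in (build_dicts_comprehension, build_dicts_zip):
        # Take the best of 5 repeats and report the time per loop.
        best = min(timeit.repeat(lambda: func(rows, keys), number=loops, repeat=5))
        print(f"{func.__name__}: {best / loops * 1e6:.1f} usec per loop")
```

Both helpers produce the same dicts, so any difference in the printed numbers reflects construction cost only.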

I think you've got it backwards -- if you send the patch the idea *may* be accepted. You ought to at least show us the docs for your proposed factory, it's a little murky from your example.

On Fri, Sep 8, 2017 at 6:34 AM, Sergey Fedoseev <fedoseev.sergey@gmail.com> wrote:

--
--Guido van Rossum (python.org/~guido)

Here are the docs:

.. staticmethod:: factory(*keys)

   Return a callable object that creates a dictionary from *keys* and its operands. For example:

   * ``dict.factory('1', 2, (3,))({1}, [2], {3: None})`` returns ``{'1': {1}, 2: [2], (3,): {3: None}}``.
   * ``dict.factory((3,), '1', 2)({1}, [2], {3: None})`` returns ``{(3,): {1}, '1': [2], 2: {3: None}}``.

   Equivalent to::

      def factory(*keys):
          def f(*values):
              return dict(zip(keys, values))
          return f

I hope this makes my idea clearer.

Link to the patch (I guess it's too big to paste here): https://github.com/sir-sigurd/cpython/commit/a0fe1a80f6e192368180a32e849771c...

2017-09-08 19:56 GMT+05:00 Guido van Rossum <guido@python.org>:
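The pure-Python equivalent given in these docs runs on any interpreter. The sketch below uses a free function named `dict_factory` (a hypothetical name, since `dict.factory` is only a proposal) and applies it to the first documented example and to some illustrative database-style rows. It shows the semantics only and has none of the speed benefit claimed for the C patch.

```python
def dict_factory(*keys):
    """Return a callable that maps positional values onto the fixed keys."""
    def make(*values):
        # zip stops at the shorter input, so extra or missing values are
        # silently dropped here; the proposed C implementation may differ.
        return dict(zip(keys, values))
    return make

# First example from the proposed docs.
factory = dict_factory('1', 2, (3,))
print(factory({1}, [2], {3: None}))

# Usage with database-style rows (illustrative data, not from the thread).
row_factory = dict_factory('id', 'name')
rows = [(1, 'spam'), (2, 'eggs')]
records = [row_factory(*row) for row in rows]
print(records)
```

The factory is created once and reused for every row, which is the part a C implementation could exploit by preparing the key layout ahead of time.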

Thanks for your suggestion.

FYI, you can use a "key-sharing dict" (PEP 412: https://www.python.org/dev/peps/pep-0412/) when all keys are strings. It saves not only creation time but also memory. I think it would be nice for a CSV parser and, as you said, for DB records.

One question: how useful is it? When working with large datasets, I think lists or tuples (or namedtuples) are recommended for records.

If it's useful enough, it's worth adding to dict. It can't be implemented as a third-party package because it relies on many private details of dict.

Regards,
INADA Naoki <songofacandy@gmail.com>

On Sat, Sep 9, 2017 at 1:24 AM, Sergey Fedoseev <fedoseev.sergey@gmail.com> wrote:
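As an illustration of the namedtuple alternative mentioned above, here is a minimal sketch using `collections.namedtuple` from the standard library. The record type `Row`, its field names, and the sample data are hypothetical and chosen only for the example.

```python
from collections import namedtuple

# Hypothetical record type; the field names are illustrative only.
Row = namedtuple('Row', ['id', 'name', 'price'])

raw_rows = [(1, 'spam', 2.5), (2, 'eggs', 1.0)]

# _make builds a Row from any iterable. The field names live on the class,
# so each record stores only its values.
records = [Row._make(r) for r in raw_rows]

first = records[0]
print(first.name)       # attribute access by field name
print(first[2])         # positional access still works
print(first._asdict())  # convert to a mapping only when one is needed
```

With this approach the keys are defined once per record type rather than once per record, which is the same goal the proposed factory pursues for dicts.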

Participants (3): Guido van Rossum, INADA Naoki, Sergey Fedoseev