Compact dict implementations (was: PEP 468
Hi, developers.
I'm trying to implement compact dict.
https://github.com/methane/cpython/pull/1
Current status is passing most of tests.
Some tests are failing because of I haven't updated `sizeof` until layout fix.
And I haven't dropped OrderedDict has linked list.
Before finishing implementation, I want to see comments and tests from core
developers.
Please come to core-mentorship ML or pull request and try it if you
interested in.
Regards,
--
INADA Naoki
Now I fixed failing tests (some tests relying to underlying layout).
Before posting it to bugs.python.org, I want to confirm I have chance to
it merged.
First big problem is language spec.
If builtin dict in both of PyPy and CPython is ordered, many people
will relying it.
It will force other Python implementations to implement it for compatibility.
In other words, it may be de-facto "Python Language", even if Python
Language spec
say it's an implementation detail.
Is it OK?
Second problem is performance.
Quick benchmark on my laptop (Sorry, I don't have dedicated hardware
for long running stable
benchmarking), It reduces 3% memory usage and increase 3% cpu time.
I'll run longer benchmark in next week.
I think I can't avoid the penalty because index hashtable and (hash,
key, value) is not in
same cacheline. (I hope my thought is wrong and there is way to optimize more.)
pybench: https://gist.github.com/methane/cfad1427d87ceff9310350e78a214880
benchmark: https://gist.github.com/methane/5eb11fdd93863813b222e795ca0bfc1f
Is it acceptable?
I have some other minor problems (e.g. How I can use 2byte integer?
Using int16_t in stdint.h
is OK?). I'll discuss them in core-mentor ML or bugs.python.org.
Thanks
--
INADA Naoki
In the original discussion, I think they decided to reimplement set before
dict.
The original discussion is here, for anyone else:
https://mail.python.org/pipermail/python-dev/2012-December/123028.html
On Jun 18, 2016 3:15 AM, "INADA Naoki"
If builtin dict in both of PyPy and CPython is ordered, many people will relying it. It will force other Python implementations to implement it for compatibility. In other words, it may be de-facto "Python Language", even if Python Language spec say it's an implementation detail.
Is it OK?
Ordered, or just initially ordered? I mean, "ordered if no deletion". They discussed scrambling the order. (Subdiscussion was here: https://mail.python.org/pipermail/python-dev/2012-December/123041.html)
Ordered, or just initially ordered? I mean, "ordered if no deletion".
I implemented "ordered". Because:
* "orderd" is easier to explain than "ordered if no deletion".
* I don't want to split sparse index hash and dense entry array.
In case of very small dict, index hash (8byte) and first two entries
(24*2=48byte)
can be on one cache line.
* Easy to implement "split dictionary" (aka. key sharing dictionary).
You can see what I implemented in here.
https://github.com/methane/cpython/pull/1/files
--
INADA Naoki
On Jun 18, 2016, at 9:57 AM, Franklin Lee
wrote: In the original discussion, I think they decided to reimplement set before dict.
I ended-up going in a different direction with sets (using linear probes to reduce the cost of collisions). Also, after the original discussion, PyPy implemented the idea for dicts and achieved some nice improvements. So, I think Inada Naoki is going in the right direction by focusing on compact dicts. Raymond
pybench: https://gist.github.com/methane/cfad1427d87ceff9310350e78a214880 benchmark: https://gist.github.com/methane/5eb11fdd93863813b222e795ca0bfc1f
Is it acceptable?
latest result is here
https://gist.github.com/methane/22cf5d1dadb62bc87a15e9244a9d0ab8
--
INADA Naoki
I've sent my patch to issue tracker, since I can't fix some remains
TODOs by myself.
http://bugs.python.org/issue27350
On Fri, Jun 17, 2016 at 6:15 PM, INADA Naoki
Hi, developers.
I'm trying to implement compact dict. https://github.com/methane/cpython/pull/1
Current status is passing most of tests. Some tests are failing because of I haven't updated `sizeof` until layout fix. And I haven't dropped OrderedDict has linked list.
Before finishing implementation, I want to see comments and tests from core developers. Please come to core-mentorship ML or pull request and try it if you interested in.
Regards, -- INADA Naoki
_______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com
--
INADA Naoki
participants (3)
-
Franklin Lee
-
INADA Naoki
-
Raymond Hettinger