
Well, it seems to work now, but creation from a combined dict without holes is definitely no longer faster. On the contrary, it is 2x slower.
This is probably because:
1. combined dicts need only one memcpy
2. I probably got something wrong in my code, since I need TWO Py_INCREF calls, for keys and values: https://github.com/Marco-Sulla/cpython/blob/2eea9ff796685127fc03fcc30ff6c652... (it's frozendict_clone, and it's used in frozendict_merge)
Iteration continues to be faster. Creation from a dict with holes probably is too, but I did not test it.
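A rough way to test the untested case might look like the sketch below. It uses plain dict as a stand-in, since the frozendict branch is not assumed to be installed. The helper names make_dicts and time_copy are hypothetical, and the absolute numbers will depend on the interpreter version and the machine.

```python
import timeit


def make_dicts(n=1000):
    """Return two dicts of n items: one without holes, one with holes."""
    # Built in one go, so the entries array has no deleted slots.
    compact = {str(i): i for i in range(n)}
    # Built twice as large, then every other key is deleted,
    # which leaves holes in the entries array.
    holey = {str(i): i for i in range(2 * n)}
    for i in range(0, 2 * n, 2):
        del holey[str(i)]
    return compact, holey


def time_copy(d, number=200, repeat=3):
    """Best-of-repeat time, in seconds, for `number` copies of d."""
    return min(timeit.repeat(lambda: dict(d), number=number, repeat=repeat))


compact, holey = make_dicts()
t_compact = time_copy(compact)
t_holey = time_copy(holey)
print(f"no holes:   {t_compact:.6f} s")
print(f"with holes: {t_holey:.6f} s")
```

Replacing dict(d) in time_copy with the frozendict constructor would give the comparison discussed here.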
I suppose frozendict could reduce memory usage by using shared keys and shared frozendicts.
I'll probably try to write a C extension, though I'll need a lot of help, which I'll ask for on another mailing list.
I have some random remarks about possible improvements to dict performance:
A. lookdict functions that are unicode-only could return zero immediately if the searched key is not a string, instead of falling back to the basic lookdict
B. every time the dict is changed, the keys could be checked to see whether they are all unicode, using an internal version of _PyDict_HasOnlyStringKeys (I created one for frozendict, it's dict_has_only_unicode_keys_exact), and dk_lookup could be changed accordingly. If the function changes only one value, it's sufficient to check whether that value is unicode or not, and what the original lookup func was
C. can USABLE_FRACTION be substituted with mp->ma_keys->dk_usable?
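To make the check in remark B concrete, here is a Python-level analogue. It is only an illustration of the idea, not the C code: has_only_exact_str_keys is a hypothetical name, and "exact" is taken to mean that str subclasses do not count.

```python
def has_only_exact_str_keys(d):
    """Return True if every key of d is exactly a str (no subclasses)."""
    return all(type(k) is str for k in d)


class MyStr(str):
    """A str subclass, used to show that the check is exact."""


print(has_only_exact_str_keys({"a": 1, "b": 2}))  # True
print(has_only_exact_str_keys({"a": 1, 2: "b"}))  # False
print(has_only_exact_str_keys({MyStr("a"): 1}))   # False
```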