[Python-3000] bytes and dicts (was: PEP 3137: Immutable Bytes and Mutable Buffer)
jimjjewett at gmail.com
Fri Sep 28 20:33:04 CEST 2007
On 9/28/07, Guido van Rossum <guido at python.org> wrote:
> Well, maybe this is a good enough argument to give up.
Not quite yet... I still see two potential solutions, depending on
whether or not the exclusion is sticky. Details below.
If the exclusion is sticky, then add (implicit) flags saying "seen a
string" and "seen a byte". Similar logic is already there, in that
"seen a non-string" replaces the lookdict function.
The most common case (exact unicode in an exact unicode-only dict)
would stay the same as today, but the other cases would have some
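
As a rough illustration of the sticky variant, here is a minimal Python-level sketch. It is not the proposed C change to dictobject.c: the class name StickyKeyGuardDict and the attributes _seen_str and _seen_bytes are hypothetical, and only item assignment is guarded.

```python
class StickyKeyGuardDict(dict):
    """Hypothetical sketch: a dict that never mixes str and bytes keys.

    The flags are sticky: once a key type has been seen, the other type
    stays excluded even after all keys of the first type are deleted.
    Only __setitem__ is guarded here; update(), setdefault() and the
    constructor are deliberately left out to keep the sketch short.
    """

    def __init__(self):
        super().__init__()
        self._seen_str = False    # "seen a string"
        self._seen_bytes = False  # "seen a bytes"

    def __setitem__(self, key, value):
        if isinstance(key, str):
            if self._seen_bytes:
                raise TypeError("str key in a dict that has held bytes keys")
            self._seen_str = True
        elif isinstance(key, bytes):
            if self._seen_str:
                raise TypeError("bytes key in a dict that has held str keys")
            self._seen_bytes = True
        # Keys of any other type pass through unchanged.
        super().__setitem__(key, value)
```

Deleting a key does not clear either flag, which is what makes the exclusion sticky in this sketch.
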
If the exclusion is based on current contents, then we can add a
count; my concern is that keeping this efficient may be too hacky.
It looks like there is room for exactly one more pointer-sized field (the
count variable) before small dicts bleed into a third cacheline. Because of
the exclusion guard, bytes and strings can never appear in the same dict, so
at least one of the two counts is always zero. Because dict entries are 3
pointers long, there can never be more than (the maximum Py_ssize_t value / 2)
entries, so the sign bit can be repurposed to indicate whether the count
refers to strings or bytes. (count==0 means no bytes or string keys;
count==5 means 5 string keys; count==-32 means 32 bytes keys.)