[ python-Bugs-1646068 ] Dict lookups fail if sizeof(Py_ssize_t) < sizeof(long)
SourceForge.net
noreply at sourceforge.net
Fri Feb 2 21:20:16 CET 2007
Bugs item #1646068, was opened at 2007-01-27 13:23
Message generated for change (Comment added) made by jimjjewett
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1646068&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
Priority: 6
Private: No
Submitted By: ked-tao (ked-tao)
Assigned to: Tim Peters (tim_one)
Summary: Dict lookups fail if sizeof(Py_ssize_t) < sizeof(long)
Initial Comment:
Portation problem.
Include/dictobject.h defines PyDictEntry.me_hash as a Py_ssize_t. Everywhere else uses a C 'long' for hashes.
On the system I'm porting to, ints and pointers (and ssize_t) are 32-bit, but longs and long longs are 64-bit. Therefore, the assignments to me_hash truncate the hash and subsequent lookups fail.
I've changed the definition of me_hash to 'long' and (in Objects/dictobject.c) removed the casting from the various assignments and changed the definition of 'i' in dict_popitem(). This has fixed my immediate problems, but I guess I've just reintroduced whatever problem it got changed for. The comment in the header says:
/* Cached hash code of me_key. Note that hash codes are C longs.
* We have to use Py_ssize_t instead because dict_popitem() abuses
* me_hash to hold a search finger.
*/
... but that doesn't really explain what it is about dict_popitem() that requires the different type.
Thanks. Kev.
----------------------------------------------------------------------
Comment By: Jim Jewett (jimjjewett)
Date: 2007-02-02 15:20
Message:
Logged In: YES
user_id=764593
Originator: NO
The whole point of a hash is that if it doesn't match, you can skip an
expensive comparison. How big to make the hash is a tradeoff between how
much you'll waste calculating and storing it vs how often it will save a
"real" comparison.
The comment means that, as an implementation detail, popitem assumes it
can store a pointer there instead, so hashes need to be at least as big as
a pointer.
Going to the larger of the two sizes will certainly solve your problem; it
just wastes some space, and maybe some time calculating the hash.
If you want to get that space back, just make sure the truncation is
correct and consistent. I *suspect* your problem is that when there is a
collision, either
(1) It is comparing a truncated value to an untruncated value, or
(2) The perturbation to find the next slot is going wrong, because of
when the truncation happens.
----------------------------------------------------------------------
Comment By: Georg Brandl (gbrandl)
Date: 2007-01-27 14:40
Message:
Logged In: YES
user_id=849994
Originator: NO
This is your code, Tim.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1646068&group_id=5470
More information about the Python-bugs-list
mailing list