[ python-Bugs-1054139 ] serious string hashing error in 2.4b1

SourceForge.net noreply at sourceforge.net
Wed Oct 27 02:51:10 CEST 2004


Bugs item #1054139, was opened at 2004-10-25 18:32
Message generated for change (Comment added) made by tim_one
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1054139&group_id=5470

Category: Python Interpreter Core
Group: Python 2.4
Status: Closed
Resolution: Fixed
Priority: 9
Submitted By: Bob Halley (rthalley)
Assigned to: Raymond Hettinger (rhettinger)
Summary: serious string hashing error in 2.4b1

Initial Comment:
There is a serious hashing error in 2.4b1.  I don't know if the 
error is confined to 64-bit systems, or is a general problem. 
 
The program attached to this report produces, as expected, 
this output when run with python 2.3.3: 
 
Python: 20303f0 
System: Linux localhost 2.6.8-1.521 #1 Mon Aug 16 09:01:00 
EDT 2004 x86 
_64 x86_64 x86_64 GNU/Linux 
 
hash("DNSSEC") == -7602892900506903802 
 
hash("D") == 8704026181 
hash("DN") == 8704052292078464 
hash("DNS") == -2784798555566127274 
hash("DNSS") == 5809125768486327656 
hash("DNSSE") == 5232635463381381892 
hash("DNSSEC") == -7602892900506903802 
 
When run with 2.4b1, I get the following output: 
 
Python: 20400b1 
System: Linux localhost 2.6.8-1.521 #1 Mon Aug 16 09:01:00 
EDT 2004 x86 
_64 x86_64 x86_64 GNU/Linux 
 
hash("DNSSEC") == -7602892900506903802 
 
hash("D") == 8704026181 
hash("DN") == 8704052292078464 
hash("DNS") == 8704052292078464 
hash("DNSS") == 8704052292078464 
hash("DNSSE") == 8704052292078464 
hash("DNSSEC") == 8704052292078464 
Traceback (most recent call last): 
  File "foo.py", line 22, in ? 
    assert hb == ha, 'hashes do not match!' 
AssertionError: hashes do not match! 
 
The way I discovered this was that dnspython's regression 
suite started failing because the string 'DNSSEC' constructed 
character-by-character was not being found in a dictionary 
which had a 'DNSSEC' key. 
 
I have not yet found the underlying bug; I was focussing on 
getting the info needed to demonstrate the bug first, since 
it's so serious.  If I make any progress fixing it, I'll send a 
patch. 
 
/Bob 
 

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2004-10-26 20:51

Message:
Logged In: YES 
user_id=31435

Na, it's optimization, all right -- there's no conceivable reason 
other than optimization to try to mutate an immutable object, 
esp. one that's already been visible to user code.

But that's OK.  The more optimizations we do, the more 
optimization bugs we'll get.  We'd be the only language 
implementation project in the history of the universe to avoid 
that otherwise.  Don't be so touchy <wink>.

I'll note that the string object invariants aren't clearly written 
down in one place.  One use for making invariants explicit is 
as a sanity checklist when doing optimizations.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-10-26 19:42

Message:
Logged In: YES 
user_id=80475

IMO, it was my fault.  I reviewed, tested and applied the patch.

Also, I don't think optimization evils have much to do with
it.  It was an oversight that could affect any patch. 
Fortunately, a good beta process helped catch it quickly.

There may even be a silver lining.  By invalidating the
cache in the resize call, we may have fixed other possibly
broken code and will have prevented other errors in the future.

----------------------------------------------------------------------

Comment By: Armin Rigo (arigo)
Date: 2004-10-26 09:26

Message:
Logged In: YES 
user_id=4771

Mea culpa.  Not only premature optimizations but even the ones that are apparently reasonably small-reaching are the root of... etc.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-10-25 21:53

Message:
Logged In: YES 
user_id=80475

Fixed.
See:
   Objects/stringobject.c 2.226
   Lib/test/string_tests.py 1.42

Thanks for the highly specific, timely bug report.


----------------------------------------------------------------------

Comment By: Bob Halley (rthalley)
Date: 2004-10-25 19:26

Message:
Logged In: YES 
user_id=671513

Patch #1054197 fixes this. 
 
The problem was that ceval.c wasn't invalidating the cached value. 
 

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-10-25 19:15

Message:
Logged In: YES 
user_id=31435

FYI, I think ceval.c's string_concatenate() needs to reset 
ob_shash to -1 after it extends a string in-place.  Alas, I 
can't make time to test that now.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-10-25 19:08

Message:
Logged In: YES 
user_id=31435

Boosted to highest priority.  I don't get exactly the same 
numbers on my 32-bit box, but the problem exists there too.  
Suspect it has to do with new optimizations for string "+=".

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1054139&group_id=5470


More information about the Python-bugs-list mailing list