string hash, and an email to stimpy.

Ori Berger orib at writeme.com
Wed Aug 15 19:21:38 EDT 2001


Disclaimer: It's been over 2 years since I was last regularly reading
c.l.p (drop "regularly" for all practical purposes); I've gone through
the last 200 posts or so, and saw no mention of what I'm about to post,
but it _may_ be old news. (I didn't see any mention of Guido's time 
machine either, nor Moshe's "which fun to me, I'm not signing 
permanent" signature. At least Tim's <wink>-ly yours style is still
here).

And to the point:

I was browsing the 2.2a1 CVS repository through SourceForge's web 
interface. I looked at exactly two files, and found exactly two
puzzling things:

--------------
File: [Development] / python / python / dist / src / Objects /
stringobject.c (download) (as text) 
 Revision 2.122 , Thu Aug 2 04:15:00 2001 UTC (13 days, 18 hours ago) by
tim_one 
 Branch: MAIN 
 CVS Tags: after-descr-branch-merge, HEAD 
 Changes since 2.121: +50 -26 lines 

 Merge of descr-branch back into trunk.
---
(somewhere down below, inside string_hash)
        len = a->ob_size;
        p = (unsigned char *) a->ob_sval;
        x = *p << 7;
        while (--len >= 0)
                x = (1000003*x) ^ *p++;
        x ^= a->ob_size;
---

Do you really mean that hash function? Do you trust compilers to 
optimize that multiplication well enough? Won't a hashpjw-style 
hash be faster to compute? Will the direction of entropy reverse?
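
To make the comparison concrete, here is a rough Python model of the two hashes. It is only a sketch under stated assumptions: the C long is taken to be 32 bits and viewed as unsigned (MASK32 is a made-up helper name), only the lines quoted above are modelled, and hashpjw is the textbook version with a 4-bit shift and top-nibble fold. It says nothing about relative speed in C; it just shows how much work each loop does per character.

```python
MASK32 = 0xFFFFFFFF  # assumption: model a 32-bit C long, viewed as unsigned


def string_hash_model(data: bytes) -> int:
    """Rough model of the quoted string_hash loop with 32-bit wraparound."""
    size = len(data)
    # For an empty string the C code reads the terminating NUL, so x starts at 0.
    x = (data[0] << 7) & MASK32 if size else 0
    for byte in data:
        # One multiply and one XOR per character.
        x = ((1000003 * x) & MASK32) ^ byte
    return x ^ size


def hashpjw(data: bytes) -> int:
    """Textbook hashpjw: shift in 4 bits, fold the top nibble back down."""
    h = 0
    for byte in data:
        h = ((h << 4) + byte) & MASK32
        g = h & 0xF0000000
        if g:
            h ^= g >> 24
            h ^= g  # clear the top four bits
    return h


if __name__ == "__main__":
    for sample in (b"", b"a", b"ab", b"spam"):
        print(sample, string_hash_model(sample), hashpjw(sample))
```

The hashpjw loop uses only shifts, adds, XORs and a branch, which is the basis for guessing it could be cheaper where multiplication is slow; whether that holds on a given compiler and CPU would have to be measured.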

The following CVS entry is the only reference I could find (and it 
took me quite some time to find, since I searched for "hash" rather
than "hach" ....):
---
Revision 2.31 / (download) / (as text) - annotate - [select for diffs] ,
Wed Sep 11 20:22:48 1996 UTC (4 years, 11 months ago) by guido 
Branch: MAIN 
Changes since 2.30: +1 -1 lines 
Diff to previous 2.30 
---
Multiply by 1000003 instead of 3 in string hach
---

I'm quite sure it gives a more uniform distribution, but 1000003
(0xF4243) has 9 bits set, which means it will probably compile to a 
real multiplication on most architectures.
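
The set-bit count is easy to check directly (counting gives 9), and it matters because a compiler that avoids a multiply instruction has to emit roughly one shift-and-add per set bit. The sketch below is illustrative only; popcount and shift_add_multiply are hypothetical helper names, not anything from the Python source.

```python
MULTIPLIER = 1000003


def popcount(n: int) -> int:
    """Number of 1 bits in a non-negative integer."""
    return bin(n).count("1")


def shift_add_multiply(x: int, constant: int = MULTIPLIER) -> int:
    """Multiply x by constant using one shift-and-add per set bit."""
    total = 0
    shift = 0
    while constant:
        if constant & 1:
            total += x << shift
        constant >>= 1
        shift += 1
    return total


if __name__ == "__main__":
    print(hex(MULTIPLIER), popcount(MULTIPLIER))  # 0xf4243 9
    print(shift_add_multiply(12416) == 12416 * MULTIPLIER)  # True
```

By comparison, the old multiplier 3 has two set bits, so x*3 reduces to (x << 1) + x.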


Second thing:

---
File: [Development] / python / python / dist / src / Lib / smtpd.py
(download) (as text) 
 Revision 1.7 , Mon Aug 13 21:18:01 2001 UTC (2 days, 2 hours ago) by
bwarsaw 
 Branch: MAIN 
 CVS Tags: HEAD 
 Changes since 1.6: +2 -1 lines 

 found_terminator(): Add a debug print showing the data.
---
(somewhere down below, inside smtp_RCPT...)
...
        if address.lower().startswith('stimpy'):
            self.push('503 You suck %s' % address)
            return
...
---

Who's the aforementioned stimpy figure? (No, I don't watch TV if it's
related - you'll have to give me all the details).

OriB.

(please cc:me on your reply - I suspect my news server is not that
good).


