
On Fri, 30 May 2003 15:00:21 -0500, Jeff Epler <jepler@unpythonic.net> writes:
On Fri, May 30, 2003 at 11:18:14AM -0500, Scott A Crosby wrote:
On
http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/LengthEffec...
UHASH exceeds the performance of perl's hash function, which is simpler than your own.
I notice that you say "with strings longer than around 44-bytes, UHASH dominates all the other hash functions, due in no small part to its extensive performance tuning and *hand-coded assembly routines.*" [emphasis mine] It's all well and good for people who can run your
We benchmarked it, and without assembly optimizations, uhash still exceeds perl. Also please note that we did not create uhash. We merely used it as a high performance universal hash which we could cite and benchmark. Freshly computed raw benchmarks on a P2-450 are at the end of this email. Looking at them now. I think we may have slightly err'ed and used the non-assembly version of the hash in constructing the graphs, because the crossover point compared to perl looks to be about 20 bytes with assembly, and about 48 without. Roughly, they show that uhash is about half the speed on a P2-450 without assembly. I do not have benchmarks on other platforms to compare it to. However, CW is known be about 10 times worse, relatively, than jenkin's on a SPARC. The python community will have to judge whether the performance difference of the current hash is worth the risk of the attack. Also, I'd like to thank Tim Peters for telling me about the potential of degradation that regular expressions may offer. Scott Time benchmarking actual hash (including benchmarking overhead) with a working set size of 12kb. Time(perl-5.8.0): 12.787 Mbytes/sec with string length 4, buf 12000 Time(uh_cw-1024): 6.010 Mbytes/sec with string length 4, buf 12000 Time(python): 14.952 Mbytes/sec with string length 4, buf 12000 Time(test32out_uhash): 4.584 Mbytes/sec with string length 4, buf 12000 Time(test32out_assembly_uhash): 6.014 Mbytes/sec with string length 4, buf 12000 Time(perl-5.8.0): 29.125 Mbytes/sec with string length 16, buf 12000 Time(uh_cw-1024): 11.898 Mbytes/sec with string length 16, buf 12000 Time(python): 36.445 Mbytes/sec with string length 16, buf 12000 Time(test32out_uhash): 19.169 Mbytes/sec with string length 16, buf 12000 Time(test32out_assembly_uhash): 25.660 Mbytes/sec with string length 16, buf 12000 Time(perl-5.8.0): 45.440 Mbytes/sec with string length 64, buf 12000 Time(uh_cw-1024): 16.168 Mbytes/sec with string length 64, buf 12000 Time(python): 62.213 Mbytes/sec with string length 64, buf 12000 Time(test32out_uhash): 71.396 Mbytes/sec with string length 64, buf 12000 Time(test32out_assembly_uhash): 106.873 Mbytes/sec with string length 64, buf 12000 Time benchmarking actual hash (Including benchmarking overhead) with a working set size of 6mb. Time(perl-5.8.0): 8.099 Mbytes/sec with string length 4, buf 6000000 Time(uh_cw-1024): 4.660 Mbytes/sec with string length 4, buf 6000000 Time(python): 8.840 Mbytes/sec with string length 4, buf 6000000 Time(test32out_uhash): 3.932 Mbytes/sec with string length 4, buf 6000000 Time(test32out_assembly_uhash): 4.859 Mbytes/sec with string length 4, buf 6000000 Time(perl-5.8.0): 20.878 Mbytes/sec with string length 16, buf 6000000 Time(uh_cw-1024): 9.964 Mbytes/sec with string length 16, buf 6000000 Time(python): 24.450 Mbytes/sec with string length 16, buf 6000000 Time(test32out_uhash): 16.168 Mbytes/sec with string length 16, buf 6000000 Time(test32out_assembly_uhash): 19.929 Mbytes/sec with string length 16, buf 6000000 Time(perl-5.8.0): 35.265 Mbytes/sec with string length 64, buf 6000000 Time(uh_cw-1024): 14.400 Mbytes/sec with string length 64, buf 6000000 Time(python): 46.650 Mbytes/sec with string length 64, buf 6000000 Time(test32out_uhash): 48.719 Mbytes/sec with string length 64, buf 6000000 Time(test32out_assembly_uhash): 63.523 Mbytes/sec with string length 64, buf 6000000