[Python-Dev] PEP 456
Christian Heimes
christian at python.org
Thu Oct 3 22:49:13 CEST 2013
On 03.10.2013 21:53, Serhiy Storchaka wrote:
>> the first time time with a bit shift of 7
>
> Double "time".
thx, fixed
>> with a 128bit seed and 64-bit output
>
> Inconsistency in hyphenation. The same issue occurs in other places.
I have unified the use of hyphens, thx!
>> bytes_hash provides the tp_hash slot function for unicode.
>
> Typo. Should be "unicode_hash".
Fixed
> x = _PyHash_Func->hashfunc(PyUnicode_BYTE_DATA(self),
> PyUnicode_GET_LENGTH(self) * PyUnicode_KIND(self));
Oh nice, that's easier to read. The macro is called PyUnicode_DATA(), though.
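
The length argument in the call quoted above is a byte count: number of characters times bytes per character. Below is a minimal standalone sketch of that arithmetic, assuming the PEP 393 layout in which the kind value (1, 2 or 4) is also the width of one character in bytes. The name `unicode_byte_length` and the `KIND_*` constants are made up for illustration and are not part of the CPython API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PyUnicode_1BYTE_KIND and friends; under
   PEP 393 the kind value doubles as the character width in bytes. */
enum { KIND_1BYTE = 1, KIND_2BYTE = 2, KIND_4BYTE = 4 };

/* Number of bytes a hash function has to read for a string of
   `length` characters stored with the given kind. */
static size_t
unicode_byte_length(size_t length, int kind)
{
    assert(kind == KIND_1BYTE || kind == KIND_2BYTE || kind == KIND_4BYTE);
    return length * (size_t)kind;
}
```

So a five-character string stored with the 2-byte kind hands ten bytes to the hash function.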
> I have doubts about this. If one collects bytes and strings in one dictionary,
> this equality will only double the number of collisions (for a DoS attack
> one needs to increase it by a factor of thousands or millions). So it doesn't
> matter. On the other hand, if one deliberately uses bytes and str
> subclasses with overridden equality, the same hash for ASCII bytes and
> strings may be needed.
It's not a big problem. I merely wanted to point out that there is a
simple possibility for a minor optimization. That's all. :)
>> For very short strings the setup costs for SipHash dominates its speed
>> but it is still in the same order of magnitude as the current FNV code.
>
> We could use another algorithm for very short strings if it matters.
I thought of that, too. The threshold is rather small, though. As far as
I remember, an effective hash collision DoS works with 7 or 8 chars.
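
A rough sketch of what such a length cutoff could look like, purely as an illustration and not code from the PEP or from CPython. The names `hash_with_cutoff`, `fnv1a_64` and `placeholder_keyed_hash` are hypothetical. Textbook 64-bit FNV-1a stands in for the cheap hash (it is not the FNV variant CPython currently uses), and the "keyed" function is only a placeholder where SipHash would be plugged in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*hash_fn)(const unsigned char *data, size_t len);

/* Textbook 64-bit FNV-1a: cheap, no setup cost, used here for tiny inputs. */
static uint64_t
fnv1a_64(const unsigned char *data, size_t len)
{
    uint64_t h = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= UINT64_C(0x100000001b3);            /* FNV prime */
    }
    return h;
}

/* Placeholder for the keyed hash. This is NOT SipHash; it only gives the
   dispatcher a second, distinct function to call. */
static uint64_t
placeholder_keyed_hash(const unsigned char *data, size_t len)
{
    return ~fnv1a_64(data, len);
}

/* Inputs shorter than `cutoff` bytes go to short_hash, all others to
   long_hash. */
static uint64_t
hash_with_cutoff(const unsigned char *data, size_t len, size_t cutoff,
                 hash_fn short_hash, hash_fn long_hash)
{
    assert(short_hash != NULL && long_hash != NULL);
    return (len < cutoff) ? short_hash(data, len) : long_hash(data, len);
}
```

If collisions really become practical at 7 or 8 chars, the cutoff would have to stay below that, which leaves only very few string lengths for the cheap path.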
>> The summarized total runtime of the benchmark is within 1% of the
>> runtime of an unmodified Python 3.4 binary.
>
> What about deviations of individual tests?
Here you go.
http://pastebin.com/dKdnBCgb
http://pastebin.com/wtfUS5Zz
Christian