Note that hashing in Python 2.7 and in Python 3 before 3.4 is simply broken, and
the randomization does not do nearly enough; see
https://bugs.python.org/issue14621
On Wed, Feb 17, 2016 at 4:45 AM, Shell Xu wrote:
I think you are right. Here is the source code in Python 2.7.11:
long
PyObject_Hash(PyObject *v)
{
    PyTypeObject *tp = v->ob_type;
    if (tp->tp_hash != NULL)
        return (*tp->tp_hash)(v);
    /* To keep to the general practice that inheriting
     * solely from object in C code should work without
     * an explicit call to PyType_Ready, we implicitly call
     * PyType_Ready here and then check the tp_hash slot again
     */
    if (tp->tp_dict == NULL) {
        if (PyType_Ready(tp) < 0)
            return -1;
        if (tp->tp_hash != NULL)
            return (*tp->tp_hash)(v);
    }
    if (tp->tp_compare == NULL && RICHCOMPARE(tp) == NULL) {
        return _Py_HashPointer(v); /* Use address as hash value */
    }
    /* If there's a cmp but no hash defined, the object can't be hashed */
    return PyObject_HashNotImplemented(v);
}
If the object's type defines a hash function, that function is used. If it does not, and the type defines no comparison either, _Py_HashPointer is used, so _Py_HashSecret plays no part. I also checked the references to _Py_HashSecret: only bufferobject, unicodeobject and stringobject use it.
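The same dispatch can be observed from the interpreter without reading the C source. The following is only a rough sketch in Python 3, where the "cmp but no hash" case corresponds to a class that defines __eq__ without __hash__; the class names and the describe_hash helper are made up for illustration and do not appear in CPython.

```python
class WithHash:
    """Defines __hash__, so hash() calls it directly."""

    def __hash__(self):
        return 1234


class Plain:
    """Defines neither __eq__ nor __hash__; inherits object's default hash."""


class EqOnly:
    """Defines __eq__ without __hash__; Python 3 then sets __hash__ to None."""

    def __eq__(self, other):
        return isinstance(other, EqOnly)


def describe_hash(obj):
    """Return hash(obj), or the string 'unhashable' if hashing raises TypeError."""
    try:
        return hash(obj)
    except TypeError:
        return "unhashable"


plain = Plain()
print(describe_hash(WithHash()))                     # value from the custom __hash__
print(describe_hash(plain) == describe_hash(plain))  # default hash is stable for a live object
print(describe_hash(EqOnly()))                       # comparison without hash: not hashable
```

In CPython the default hash used for Plain is derived from the object's address, which is consistent with the observation that the hash secret is not involved on that path.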
On Wed, Feb 17, 2016 at 9:54 AM, Steven D'Aprano wrote:
On Tue, Feb 16, 2016 at 11:56:55AM -0800, Glenn Linderman wrote:
On 2/16/2016 1:48 AM, Christoph Groth wrote:
Hello,
Recent Python versions randomize the hashes of str, bytes and datetime objects. I suppose that the choice of these three types is the result of a compromise. Has this been discussed somewhere publicly?
Search archives of this list... it was discussed at length.
There's a lot of discussion on the mailing list. I think that this is the very start of it, in Dec 2011:
https://mail.python.org/pipermail/python-dev/2011-December/115116.html
and continuing into 2012, for example:
https://mail.python.org/pipermail/python-dev/2012-January/115577.html
https://mail.python.org/pipermail/python-dev/2012-January/115690.html
and a LOT more, spread over many different threads and subject lines.
You should also read the issue on the bug tracker:
http://bugs.python.org/issue13703
My recollection is that it was decided that only strings and bytes need to have their hashes randomized, because only strings and bytes can be used directly from user-input without first having a conversion step with likely input range validation. In addition, changing the hash for ints would break too much code for too little benefit: unlike strings, where hash collision attacks on web apps are proven and easy, hash collision attacks based on ints are more difficult and rare.
See also the comment here:
http://bugs.python.org/issue13703#msg151847
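To make the int case concrete: since int hashes are not randomized, colliding ints are easy to write down once the hashing rule is known. The sketch below assumes the CPython 3 behaviour that the hash of a non-negative int is its value reduced modulo sys.hash_info.modulus; the helper name colliding_ints is hypothetical. It shows only that collisions exist, not that an attack is practical, since the ints would still have to survive whatever conversion and range validation the application performs.

```python
import sys


def colliding_ints(base, count):
    """Return `count` distinct ints that hash to the same value as `base`.

    Assumes CPython 3, where hash(n) == n % sys.hash_info.modulus for
    non-negative n; other implementations may behave differently.
    """
    modulus = sys.hash_info.modulus
    return [base + k * modulus for k in range(count)]


keys = colliding_ints(42, 5)
print([hash(k) for k in keys])  # all entries equal hash(42)
print(len(set(keys)))           # yet the keys themselves are distinct
```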
I'm not a web programmer, but don't web applications also use dictionaries that are indexed by, say, tuples of integers?
Sure, and that is the biggest part of the reason they were randomized.
But they aren't, as far as I can see:
[steve@ando 3.6]$ ./python -c "print(hash((23, 42, 99, 100)))"
1071302475
[steve@ando 3.6]$ ./python -c "print(hash((23, 42, 99, 100)))"
1071302475
Web apps can use dicts indexed by anything that they like, but unless there is an actual attack, what does it matter? Guido makes a good point about security here:
https://mail.python.org/pipermail/python-dev/2013-October/129181.html
I think hashes of all types have been randomized, not _just_ the list you mentioned.
I'm pretty sure that's not actually the case. Using 3.6 from the repo (admittedly not fully up to date though), I can see hash randomization working for strings:
[steve@ando 3.6]$ ./python -c "print(hash('abc'))"
11601873
[steve@ando 3.6]$ ./python -c "print(hash('abc'))"
-2009889747
but not for ints:
[steve@ando 3.6]$ ./python -c "print(hash(42))"
42
[steve@ando 3.6]$ ./python -c "print(hash(42))"
42
which agrees with my recollection that only strings and bytes would be randomized.
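The shell experiments above can be repeated in one script. This is a minimal sketch, assuming Python 3.7 or later and the documented PYTHONHASHSEED environment variable; the function name hash_in_fresh_interpreter is made up for illustration. Fixed seeds are used instead of the default random seed so that the comparison is reproducible.

```python
import os
import subprocess
import sys


def hash_in_fresh_interpreter(expr, seed):
    """Evaluate hash(<expr>) in a new interpreter started with PYTHONHASHSEED=seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({expr}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


for expr in ("'abc'", "42", "(23, 42, 99, 100)"):
    # Collect the hash seen under three different seeds.
    values = {hash_in_fresh_interpreter(expr, seed) for seed in (1, 2, 3)}
    status = "randomized" if len(values) > 1 else "not randomized"
    print(f"hash({expr}): {status}")
```

A tuple's hash is built from the hashes of its elements, so a tuple of ints stays fixed across seeds, while a tuple containing a str would be expected to vary.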
-- Steve
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/shell909090%40gmail.com
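-- There are spaces between the joints, and the blade of the knife has no thickness; put what has no thickness into such spaces, and there is more than enough room for the blade to move about.
blog: http://shell909090.org/blog/
twitter: @shell909090
about.me: http://about.me/shell909090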