======================================================================================== the historical success of collision attacks does not imply a danger of pre-image attacks ======================================================================================== by Zooko Wilcox-O'Hearn, LeastAuthority.com, 2013-07-29 Summary ======= Most of the secure hash functions designed before 2007 have turned out to be vulnerable to collision attacks. This includes the widely-used secure hash functions MD5 and SHA-1. Newer hash functions, including those designed since the SHA-3 project began in 2007, do not appear to be vulnerable to collision attacks, but since they are newer, there has also been less time for cryptanalysts to find flaws in them. A widely cited web page shows a graphical representation of the history of various hash functions being broken: http://valerieaurora.org/hash.html The advice on that web page is that if you are relying on your hash function for collision-resistance, then you should be prepared to migrate to a new hash function every few years. .. insert table based on the main table below, but showing only vulnerability to collisions instead of pre-images What about pre-image attacks or second pre-image attacks? Some systems require their secure hash function to resist pre-image and 2nd-pre-image attacks, but do not require their secure hash function to resist collision attacks. For such systems, an interesting question is whether many or few of the secure hash functions published in the last 23 years—since the advent of modern cryptography—have turned out to be vulnerable to pre-image or 2nd-pre-image attacks. The answer is that with a single exception, no secure hash function has ever been shown to be vulnerable to (2nd-)pre-image attacks. That single exception is the second-oldest hash function ever designed, Snefru, which was designed in 1989 and 1990, and which turned out to be vulnerable to differential cryptanalysis. Differential cryptanalysis was discovered (by the open research community) in 1990. Preliminaries ============= The input to a secure hash function is called the *pre-image* and the output is called the *image*. A hash function *collision* is two different inputs (pre-images) which result in the same output. A hash function is *collision-resistant* if an adversary can't find any collision. A hash function is *pre-image resistant* if, given an output (image), an adversary can't find any input (pre-image) which results in that output. A hash function is *second pre-image resistant* if, given a pre-image, an adversary can't find any *other* pre-image which results in the same image. Motivation ========== My motivation for caring about pre-image resistance and 2nd-pre-image resistance is that it is possible to build digital signatures from secure hash functions. Some of these *hash-based digital signatures* have been proven to be secure (resistant to forgery) as long as the hash function they are built out of has second-pre-image resistance, e.g. Buchmann-2011_. Such a hash-based digital signature would fail if its underlying hash function failed at second-pre-image resistance, but this is the *only* way that it could be broken—any attack which was able to forge digital signatures against such a scheme would *have* to violate the second-pre-image resistance of the underlying hash function. One reason that hash-based digital signatures might be useful is that if an attacker has a sufficiently large quantum computer, they could forge digital signatures that rely on factorization or discrete log, such as RSA, DSA, ECDSA, or Ed25519. There is not any reason to think that such a quantum computer would enable them to break secure hash functions, however. Survey of attacks on hash functions =================================== We know that practical, widely-deployed secure hash functions have turned out to be vulnerable to collision attacks. MD5 is the most widely used secure hash function which turns out to admit collisions, but many other secure hash functions have likewise been found vulnerable to collisions. For some uses, such as the hash-based digital signatures mentioned above, it would be harmless to be able to generate collisions, but harmful to be able to generate pre-images or 2nd-pre-images [*]. For such systems, the relevant question is not whether hash function designs have historically been revealed to be vulnerable to collisions but instead whether they've been revealed to be vulnerable to (2nd-)pre-images. Here are the results of my search for pre-image or 2nd-pre-image attacks on widely-studied hash functions. *The bottom line is that no widely-studied hash function has ever succumbed to a (2nd-)pre-image attack except for Snefru.* Snefru was designed by Merkle in 1989 and proved vulnerable to differential cryptanalysis by Biham and Shamir in 1992. (Differential cryptanalysis had just been discovered in the open world by Biham and Shamir in 1990.) No other widely-studied hash function has been shown to be vulnerable to a practical (2nd-)pre-image attack. Furthermore, no other widely-studied hash function has been shown to be vulnerable to a (2nd-)pre-image attack that is more efficient than brute force, even if we were to count attacks too expensive for anyone to actually implement! The history of (2nd-)pre-image attacks is therefore quite different from the history of collision attacks. Most hash functions have been proven vulnerable to collision attacks more efficient than brute force, and even to collision attacks that could be implemented in practice. Here I omit papers with attacks which are less efficient than other published attacks (of course). I omit attacks on reduced-round variants of hash functions (there are a lot of those). I omit attacks which have unrealistic requirements, like attacks which require which require 2¹²⁸ precomputation or which require the messages to be 2⁵⁶ blocks long. "---" in a row means that I haven't found any published attack that fit these criteria. • *bit*: the number of bits of output • *cpb*: cycles per byte on ebash's amd64-sandy0_ for 4096 bytes, worst quartile • *comp*: approximate computation required for the attack • *mem*: approximate memory required for the attack • *⩻ brute*: does the attack comp * the attack mem cost less than brute force search (see Bernstein-2005_) • *possible?*: could the attack actually be implemented +-------------+------+-----+-----+--------------------------------------------------------+--------------------------------------------------------+ | hash | year | bit | cpb | (2nd-)preimage attacks | collision attacks | | | | | +------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | | | | | comp | mem | ⩻ brute | possible? | reference | comp | mem | ⩻ brute | possible? | reference | +=============+======+=====+=====+======+=====+=========+============+====================+======+=====+=========+============+====================+ | MD2 | 1989 | 128 | 638 | 2⁷² | 2⁷² | no | --- | Knudsen-2007_ | 2⁶³ | 2⁵² | no | --- | Knudsen-2007_ | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | Snefru_ -2 | 1990 | 128 | ? | 2²⁵ | 2⁰ | yes | yes | Biham-2008_ | 2¹³ | 2⁰ | yes | yes | Biham-2008_ | +-------------+ | +-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+ | | -3 | | | ? | 2⁵⁶ | 2⁰ | yes | yes | | 2²⁹ | 2⁰ | yes | yes | | +-------------+ | +-----+------+-----+---------+------------+ +------+-----+---------+------------+ | | -4 | | | ? | ≥2⁸⁸ | 2⁰ | yes | no | | ≥2⁴⁵ | 2⁰ | yes | yes | | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | MD4 | 1990 | 128 | 4 | 2⁹⁵ | 2³⁸ | no | --- | Zhong-2010_ | 2² | 2⁰ | yes | yes | Naito-2006_ | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | RIPEMD | 1990 | 128 | ? | --- | --- | --- | --- | --- | 2¹⁸ | 2⁰ | yes | yes | Wang-2005a_ | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | MD5 | 1991 | 128 | 6 | 2¹²³ | 2⁴⁸ | no | --- | Sasaki-2009_ | 2²⁴ | 2⁰ | yes | yes | Stevens-2007_ | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | HAVAL-256-3 | 1992 | 256 | ? | 2²²⁵ | 2⁶⁸ | no | --- | Sasaki-2008_ | 2²⁹ | 2⁰ | yes | yes | `Van_Rompay-2003`_ | +-------------+ + +-----+------+-----+---------+------------+ +------+-----+---------+------------+--------------------+ | -4 | | | ? | 2²⁵⁴ | 2⁶⁸ | no | --- | | 2³⁶ | 2⁰ | yes | yes | Yu-2006_ | +-------------+ + +-----+------+-----+---------+------------+ +------+-----+---------+------------+ + | -5 | | | ? | 2²⁵⁵ | 2⁶⁸ | no | --- | | 2¹²³ | 2⁰ | yes | no | | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | SHA-0 | 1993 | 160 | ? | 2¹⁸⁹ | 2⁸ | no | --- | --- | 2³⁴ | 2⁰ | yes | yes | Manuel-2008_ | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | GOST | 1994 | 256 | ? | 2¹⁹² | 2⁷⁰ | no | --- | Mendel-2008_ | 2¹⁰⁵ | 2⁰ | yes | no | Mendel-2008_ | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | SHA-1 | 1995 | 160 | 8 | --- | --- | --- | --- | | 2⁶⁹ | 2⁰ | yes | yes | Wang-2005b_ | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | RIPEMD-160 | 1996 | 160 | 16 | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | Tiger | 1996 | 192 | 7 | 2¹⁸⁹ | 2⁸ | no | --- | Guo-2010_ | --- | --- | --- | --- | --- | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | Whirlpool | 2000 | 512 | 25 | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | Panama | 2000 | 512 | ? | --- | --- | --- | --- | --- | 2⁶ | 2⁰ | yes | yes | Daemen-2007_ | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | SHA-256 | 2002 | 256 | 18 | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ | RadioGatún | 2006 | 256 | ? | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +-------------+------+-----+-----+------+-----+---------+------------+--------------------+------+-----+---------+------------+--------------------+ .. _Daemen-2007: http://radiogatun.noekeon.org/panama/PanamaAttack.pdf Then we get to the SHA-3 competition, which began in 2007. None of the 14 second-round candidates for SHA-3 have been shown to be vulnerable to any attack better than brute force, either to find collisions or to find (2nd-)pre-images [SHA-3-Zoo_ ]. Here are rows for the five SHA-3 finalists. The cpb is potentially relevant because a hash function which used *too* few cycles per byte would fail at its goals, including its goal of (2nd-)pre-image-resistance. Of course we don't know how few is too few. The designers of these hash functions were probably choosing a number of cycles per byte to make their function competitive with SHA-256, and with other SHA-3 candidates, while not accidentally losing collision-resistance like so many of their predecessors had. +------------+------+-----+-----+---------------------------------------------------+---------------------------------------------------+ | hash | year | bit | cpb | (2nd-)preimage attacks | collision attacks | | | | | +------+-----+---------+------------+---------------+------+-----+---------+------------+---------------+ | | | | | comp | mem | ⩻ brute | possible? | reference | comp | mem | ⩻ brute | possible? | reference | +============+======+=====+=====+======+=====+=========+============+===============+======+=====+=========+============+===============+ | Skein | 2010 | 256 | 7 | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +------------+------+-----+-----+------+-----+---------+------------+---------------+------+-----+---------+------------+---------------+ | Blake | 2010 | 256 | 8 | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +------------+------+-----+-----+------+-----+---------+------------+---------------+------+-----+---------+------------+---------------+ | Grøstl | 2010 | 256 | 10 | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +------------+------+-----+-----+------+-----+---------+------------+---------------+------+-----+---------+------------+---------------+ | Keccak | 2010 | 256 | 12 | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +------------+------+-----+-----+------+-----+---------+------------+---------------+------+-----+---------+------------+---------------+ | JH | 2010 | 256 | 14 | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | +------------+------+-----+-----+------+-----+---------+------------+---------------+------+-----+---------+------------+---------------+ If you are aware of any other papers which fit these criteria, or if you spot an error in this document, please write to me: zooko@LeastAuthority.com. [*] Be very careful about this—the ability to generate collisions can be surprisingly harmful to some systems. This is one of those subtleties of cryptographic engineering which frequently trip up engineers who are not cryptography experts. The famous "Internet Root Cert" attack [Sotirov-2009_ ] is an example of engineers incorrectly thinking that their system was not threatened by collisions absent 2nd-pre-images. Thanks to Daira Hopwood for comments on this note. .. _Snefru: http://www.springerlink.com/content/t10683l407363633/ .. _Van_Rompay-2003: http://academic.research.microsoft.com/Publication/676305/cryptanalysis-of-3pass-haval .. _Bernstein-2005: http://cr.yp.to/papers.html#bruteforce .. _Wang-2005a: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.106.4759 .. _Wang-2005b: http://people.csail.mit.edu/yiqun/SHA1AttackProceedingVersion.pdf .. _Yu-2006: http://www.springerlink.com/content/0n9018738x721090/ .. _Naito-2006: http://www.springerlink.com/content/v6526284mu858v37/ .. _Stevens-2007: http://marc-stevens.nl/research/papers/MTh%20Marc%20Stevens%20-%20On%20Collisions%20for%20MD5.pdf .. _Knudsen-2007: http://www.springerlink.com/content/qn746388035614r1/ .. _Biham-2008: http://www.springerlink.com/content/208q118x13181g32/ .. _Sasaki-2008: http://www.springerlink.com/content/d382324nl16251pp/ .. _Mendel-2008: http://www.cosic.esat.kuleuven.be/publications/article-2091.pdf .. _Manuel-2008: http://www.springerlink.com/content/3810jp9730369045/ .. _Leurent-2008: http://www.di.ens.fr/~leurent/files/MD4_FSE08.pdf .. _Sasaki-2009: http://www.springerlink.com/content/d7pm142n58853467/ .. _Sotirov-2009: http://www.win.tue.nl/hashclash/rogue-ca/ .. _Zhong-2010: http://eprint.iacr.org/2010/583 .. _Guo-2010: http://eprint.iacr.org/2010/016 .. _Buchmann-2011: http://eprint.iacr.org/2011/484 .. _SHA-3-Zoo: http://ehash.iaik.tugraz.at/wiki/The_SHA-3_Zoo .. _amd64-sandy0: http://bench.cr.yp.to/results-hash.html#amd64-sandy0