This page uses content from Wikipedia and is licensed under CC BY-SA.
Secure Hash Algorithm  

Concepts  
hash functions · SHA · DSA  
Main standards  
SHA-0 · SHA-1 · SHA-2 · SHA-3


General  

Designers  National Security Agency 
First published  1993 (SHA-0), 1995 (SHA-1) 
Series  (SHA-0), SHA-1, SHA-2, SHA-3 
Certification  FIPS PUB 180-4, CRYPTREC (Monitored) 
Cipher detail  
Digest sizes  160 bits 
Block sizes  512 bits 
Structure  Merkle–Damgård construction 
Rounds  80 
Best public cryptanalysis  
A 2011 attack by Marc Stevens can produce hash collisions with a complexity between 2^{60.3} and 2^{65.3} operations.^{[1]} The first public collision was published on 23 February 2017.^{[2]} SHA-1 is prone to length extension attacks. 
In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographic hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest, typically rendered as a hexadecimal number, 40 digits long. It was designed by the United States National Security Agency, and is a U.S. Federal Information Processing Standard.^{[3]}
Since 2005 SHA-1 has not been considered secure against well-funded opponents,^{[4]} and since 2010 many organizations have recommended its replacement by SHA-2 or SHA-3.^{[5]}^{[6]}^{[7]} Microsoft, Google, Apple and Mozilla have all announced that their respective browsers will stop accepting SHA-1 SSL certificates by 2017.^{[8]}^{[9]}^{[10]}^{[11]}^{[12]}^{[13]}
In 2017 CWI Amsterdam and Google announced they had performed a collision attack against SHA-1, publishing two dissimilar PDF files which produced the same SHA-1 hash.^{[14]}^{[15]}^{[16]}
SHA-1 produces a message digest based on principles similar to those used by Ronald L. Rivest of MIT in the design of the MD4 and MD5 message digest algorithms, but has a more conservative design.
SHA-1 was developed as part of the U.S. Government's Capstone project.^{[17]} The original specification of the algorithm was published in 1993 under the title Secure Hash Standard, FIPS PUB 180, by U.S. government standards agency NIST (National Institute of Standards and Technology).^{[18]}^{[19]} This version is now often named SHA-0. It was withdrawn by the NSA shortly after publication and was superseded by the revised version, published in 1995 in FIPS PUB 180-1 and commonly designated SHA-1. SHA-1 differs from SHA-0 only by a single bitwise rotation in the message schedule of its compression function. According to the NSA, this was done to correct a flaw in the original algorithm which reduced its cryptographic security, but they did not provide any further explanation.^{[citation needed]} Publicly available techniques did indeed compromise SHA-0 before SHA-1.^{[citation needed]}
SHA-1 forms part of several widely used security applications and protocols, including TLS and SSL, PGP, SSH, S/MIME, and IPsec. Those applications can also use MD5; both MD5 and SHA-1 are descended from MD4. SHA-1 hashing is also used in distributed revision control systems like Git, Mercurial, and Monotone to identify revisions, and to detect data corruption or tampering. The algorithm has also been used on Nintendo's Wii gaming console for signature verification when booting, but a significant flaw in the first implementations of the firmware allowed an attacker to bypass the system's security scheme.^{[20]}
SHA-1 and SHA-2 are the hash algorithms required by law for use in certain U.S. government applications, including use within other cryptographic algorithms and protocols, for the protection of sensitive unclassified information. FIPS PUB 180-1 also encouraged adoption and use of SHA-1 by private and commercial organizations. SHA-1 is being retired from most government uses; the U.S. National Institute of Standards and Technology said, "Federal agencies should stop using SHA-1 for...applications that require collision resistance as soon as practical, and must use the SHA-2 family of hash functions for these applications after 2010" (emphasis in original),^{[21]} though that was later relaxed.^{[22]}
A prime motivation for the publication of the Secure Hash Algorithm was the Digital Signature Standard, in which it is incorporated.
The SHA hash functions have been used for the basis of the SHACAL block ciphers.
Revision control systems such as Git and Mercurial use SHA-1 not for security but to ensure that data has not changed through accidental corruption, a point Linus Torvalds has made about Git.
For a hash function for which L is the number of bits in the message digest, finding a message that corresponds to a given message digest can always be done using a brute-force search in approximately 2^{L} evaluations. This is called a preimage attack and may or may not be practical depending on L and the particular computing environment. However, a collision, consisting of finding two different messages that produce the same message digest, requires on average only about 1.2 × 2^{L/2} evaluations using a birthday attack. Thus the strength of a hash function is usually compared to a symmetric cipher of half the message digest length. SHA-1, which has a 160-bit message digest, was originally thought to have 80-bit strength.
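As a rough illustration of the birthday bound, the sketch below truncates SHA-1 to its first 24 bits and searches for two inputs with the same truncated digest. The truncation is an assumption made purely so the search finishes in a fraction of a second, and `truncated_sha1`, `find_collision` and `birthday_estimate` are hypothetical helper names. For L = 24 the estimate 1.2 × 2^{L/2} is about 4,900 evaluations; the count actually observed will vary around that figure.

```python
import hashlib
from typing import Tuple


def truncated_sha1(message: bytes, n_bytes: int = 3) -> bytes:
    """Return only the first n_bytes of the SHA-1 digest (a deliberately weakened hash)."""
    return hashlib.sha1(message).digest()[:n_bytes]


def find_collision(n_bytes: int = 3) -> Tuple[bytes, bytes, int]:
    """Birthday search: remember every truncated digest until one repeats.

    Returns the two colliding messages and the number of hash evaluations used.
    """
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        tag = truncated_sha1(message, n_bytes)
        if tag in seen:
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1


def birthday_estimate(bits: int) -> float:
    """Expected number of evaluations for a collision, about 1.2 * 2^(bits/2)."""
    return 1.2 * 2 ** (bits / 2)


if __name__ == "__main__":
    first, second, evaluations = find_collision()
    print("colliding inputs:", first, second)
    print("evaluations used:", evaluations)
    print("estimate for 24 bits:", birthday_estimate(24))
    print("estimate for 160 bits: %.3e" % birthday_estimate(160))
```

The same search against the full 160-bit digest would need on the order of 2^{80} evaluations, which is why the generic attack was considered infeasible.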
In 2005, cryptographers Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu produced collision pairs for SHA-0 and found algorithms that should produce SHA-1 collisions in far fewer than the originally expected 2^{80} evaluations.^{[25]}
In terms of practical security, a major concern about these new attacks is that they might pave the way to more efficient ones. Whether this is the case is yet to be seen, but a migration to stronger hashes is believed^{[by whom?]} to be prudent. Some of the applications that use cryptographic hashes, like password storage, are only minimally affected by a collision attack. Constructing a password that works for a given account requires a preimage attack, as well as access to the hash of the original password, which may or may not be trivial. Reversing a password hash (e.g. to obtain a password to try against a user's account elsewhere) is not made possible by the attacks. (However, even a secure password hash can't prevent brute-force attacks on weak passwords.)
In the case of document signing, an attacker could not simply fake a signature from an existing document: The attacker would have to produce a pair of documents, one innocuous and one damaging, and get the private key holder to sign the innocuous document. There are practical circumstances in which this is possible; until the end of 2008, it was possible to create forged SSL certificates using an MD5 collision.^{[26]}
Due to the block and iterative structure of the algorithms and the absence of additional final steps, all SHA functions (except SHA-3^{[27]}) are vulnerable to length-extension and partial-message collision attacks.^{[28]} These attacks allow an attacker to forge a message signed only by a keyed hash, SHA(message || key) or SHA(key || message), by extending the message and recalculating the hash without knowing the key. A simple improvement to prevent these attacks is to hash twice: SHA_{d}(message) = SHA(SHA(0^{b} || message)) (the length of 0^{b}, the zero block, is equal to the block size of the hash function).
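The double-hash construction can be sketched with Python's standard `hashlib`, taking SHA-1 as the underlying function. The name `sha1_d` is a hypothetical helper chosen for this illustration; the sketch only transcribes the formula and is not a vetted message authentication scheme.

```python
import hashlib


def sha1_d(message: bytes) -> bytes:
    """SHA_d(message) = SHA(SHA(0^b || message)), with SHA-1 as the underlying hash."""
    block_size = hashlib.sha1().block_size      # 64 bytes (512 bits) for SHA-1
    zero_block = b"\x00" * block_size           # 0^b: one full block of zero bytes
    inner = hashlib.sha1(zero_block + message).digest()
    # The outer hash hides the inner chaining state, so the inner digest
    # cannot be used directly as a starting point for extending the message.
    return hashlib.sha1(inner).digest()


if __name__ == "__main__":
    print(sha1_d(b"example message").hex())
```

The outer call always hashes exactly 20 bytes, so its cost is one extra compression function evaluation regardless of the message length.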
In early 2005, Rijmen and Oswald published an attack on a reduced version of SHA-1 (53 out of 80 rounds) which finds collisions with a computational effort of fewer than 2^{80} operations.^{[29]}
In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu was announced.^{[30]} The attacks can find collisions in the full version of SHA-1, requiring fewer than 2^{69} operations. (A brute-force search would require 2^{80} operations.)
The authors write: "In particular, our analysis is built upon the original differential attack on SHA-0, the near collision attack on SHA-0, the multiblock collision techniques, as well as the message modification techniques used in the collision search attack on MD5. Breaking SHA-1 would not be possible without these powerful analytical techniques."^{[31]} The authors have presented a collision for 58-round SHA-1, found with 2^{33} hash operations. The paper with the full attack description was published in August 2005 at the CRYPTO conference.
In an interview, Yin states that, "Roughly, we exploit the following two weaknesses: One is that the file preprocessing step is not complicated enough; another is that certain math operations in the first 20 rounds have unexpected security problems."^{[32]}
On 17 August 2005, an improvement on the SHA-1 attack was announced on behalf of Xiaoyun Wang, Andrew Yao and Frances Yao at the CRYPTO 2005 Rump Session, lowering the complexity required for finding a collision in SHA-1 to 2^{63}.^{[33]} On 18 December 2007 the details of this result were explained and verified by Martin Cochran.^{[34]}
Christophe De Cannière and Christian Rechberger further improved the attack on SHA-1 in "Finding SHA-1 Characteristics: General Results and Applications,"^{[35]} receiving the Best Paper Award at ASIACRYPT 2006. A two-block collision for 64-round SHA-1 was presented, found using unoptimized methods with 2^{35} compression function evaluations. Since this attack requires the equivalent of about 2^{35} evaluations, it is considered to be a significant theoretical break.^{[36]} Their attack was extended further to 73 rounds (of 80) in 2010 by Grechnikov.^{[37]} In order to find an actual collision in the full 80 rounds of the hash function, however, tremendous amounts of computer time are required. To that end, a collision search for SHA-1 using the distributed computing platform BOINC began August 8, 2007, organized by the Graz University of Technology. The effort was abandoned May 12, 2009 due to lack of progress.^{[38]}
At the Rump Session of CRYPTO 2006, Christian Rechberger and Christophe De Cannière claimed to have discovered a collision attack on SHA-1 that would allow an attacker to select at least parts of the message.^{[39]}^{[40]}
In 2008, an attack methodology by Stéphane Manuel reported hash collisions with an estimated theoretical complexity of 2^{51} to 2^{57} operations.^{[41]} He later retracted that claim after finding that the local collision paths were not actually independent, and ultimately cited as the most efficient a collision vector that was already known before this work.^{[42]}
Cameron McDonald, Philip Hawkes and Josef Pieprzyk presented a hash collision attack with claimed complexity 2^{52} at the Rump Session of Eurocrypt 2009.^{[43]} However, the accompanying paper, "Differential Path for SHA-1 with complexity O(2^{52})", was withdrawn after the authors discovered that their estimate was incorrect.^{[44]}
One attack against SHA-1 was by Marc Stevens,^{[45]} with an estimated cost of $2.77M to break a single hash value by renting CPU power from cloud servers.^{[46]} Stevens developed this attack in a project called HashClash,^{[47]} implementing a differential path attack. On 8 November 2010, he claimed he had a fully working near-collision attack against full SHA-1 with an estimated complexity equivalent to 2^{57.5} SHA-1 compressions. He estimated this attack could be extended to a full collision with a complexity around 2^{61}.
On 8 October 2015, Marc Stevens, Pierre Karpman, and Thomas Peyrin published a free-start collision attack on SHA-1's compression function that requires only 2^{57} SHA-1 evaluations. This does not directly translate into a collision on the full SHA-1 hash function (where an attacker is not able to freely choose the initial internal state), but undermines the security claims for SHA-1. In particular, it was the first time that an attack on full SHA-1 had been demonstrated; all earlier attacks were too expensive for their authors to carry them out. The authors named this significant breakthrough in the cryptanalysis of SHA-1 "The SHAppening".^{[6]}
The method was based on their earlier work, as well as the auxiliary paths (or boomerangs) speed-up technique from Joux and Peyrin, and on high-performance, cost-efficient GPU cards from NVIDIA. The collision was found on a 16-node cluster with a total of 64 graphics cards. The authors estimated that a similar collision could be found by buying US$2,000 of GPU time on EC2.^{[6]}
The authors estimated that the cost of renting enough EC2 CPU/GPU time to generate a full collision for SHA-1 at the time of publication was between US$75K–120K, and noted that this was well within the budget of criminal organizations, not to mention national intelligence agencies. As such, the authors recommended that SHA-1 be deprecated as quickly as possible.^{[6]}
On 23 February 2017, the CWI (Centrum Wiskunde & Informatica) and Google announced the SHAttered attack, in which they generated two different PDF files with the same SHA-1 hash in roughly 2^{63.1} SHA-1 evaluations. This attack is about 100,000 times faster than brute forcing a SHA-1 collision with a birthday attack, which was estimated to take 2^{80} SHA-1 evaluations. The attack required "the equivalent processing power as 6,500 years of single-CPU computations and 110 years of single-GPU computations".^{[2]}^{[16]}
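The "about 100,000 times faster" figure follows from the two exponents quoted above; a short check of the arithmetic (the helper name `speedup_factor` is hypothetical):

```python
def speedup_factor(brute_force_log2: float, attack_log2: float) -> float:
    """Ratio of two work factors given as base-2 logarithms."""
    return 2.0 ** (brute_force_log2 - attack_log2)


if __name__ == "__main__":
    # 2^80 / 2^63.1 = 2^16.9, i.e. on the order of 10^5
    factor = speedup_factor(80, 63.1)
    print(f"2^{80 - 63.1:.1f} is about {factor:,.0f}")
```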
At CRYPTO 98, two French researchers, Florent Chabaud and Antoine Joux, presented an attack on SHA-0: collisions can be found with complexity 2^{61}, fewer than the 2^{80} for an ideal hash function of the same size.^{[48]}
In 2004, Biham and Chen found near-collisions for SHA-0 (two messages that hash to nearly the same value); in this case, 142 out of the 160 bits are equal. They also found full collisions of SHA-0 reduced to 62 out of its 80 rounds.^{[49]}
Subsequently, on 12 August 2004, a collision for the full SHA-0 algorithm was announced by Joux, Carribault, Lemuet, and Jalby. This was done by using a generalization of the Chabaud and Joux attack. Finding the collision had complexity 2^{51} and took about 80,000 processor-hours on a supercomputer with 256 Itanium 2 processors (equivalent to 13 days of full-time use of the computer).
On 17 August 2004, at the Rump Session of CRYPTO 2004, preliminary results were announced by Wang, Feng, Lai, and Yu, about an attack on MD5, SHA-0 and other hash functions. The complexity of their attack on SHA-0 is 2^{40}, significantly better than the attack by Joux et al.^{[50]}^{[51]}
In February 2005, an attack by Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu was announced which could find collisions in SHA-0 in 2^{39} operations.^{[30]}^{[52]}
Another attack in 2008 applying the boomerang attack brought the complexity of finding collisions down to 2^{33.6}, which is estimated to take 1 hour on an average PC.^{[53]}
In light of the results for SHA-0, some experts^{[who?]} suggested that plans for the use of SHA-1 in new cryptosystems should be reconsidered. After the CRYPTO 2004 results were published, NIST announced that they planned to phase out the use of SHA-1 by 2010 in favor of the SHA-2 variants.^{[54]}
Implementations of all FIPS-approved security functions can be officially validated through the CMVP program, jointly run by the National Institute of Standards and Technology (NIST) and the Communications Security Establishment (CSE). For informal verification, a package to generate a high number of test vectors is made available for download on the NIST site; the resulting verification, however, does not replace the formal CMVP validation, which is required by law for certain applications.
As of December 2013, there were over 2,000 validated implementations of SHA-1, 14 of them capable of handling messages whose length in bits is not a multiple of eight (see SHS Validation List).
These are examples of SHA1 message digests in hexadecimal and in Base64 binary to ASCII text encoding.
SHA1("The quick brown fox jumps over the lazy dog")
gives hexadecimal: 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
gives Base64 binary to ASCII text encoding: L9ThxnotKPzthJ7hu3bnORuT6xI=
Even a small change in the message will, with overwhelming probability, result in many bits changing due to the avalanche effect. For example, changing "dog" to "cog" produces a hash with different values for 87 of the 160 bits:
SHA1("The quick brown fox jumps over the lazy cog")
gives hexadecimal: de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3
gives Base64 binary to ASCII text encoding: 3p8sf9JeGzr60+haC9F9mxANtLM=
The hash of the zero-length string is:
SHA1("")
gives hexadecimal: da39a3ee5e6b4b0d3255bfef95601890afd80709
gives Base64 binary to ASCII text encoding: 2jmj7l5rSw0yVb/vlWAYkK/YBwk=
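The example digests above can be reproduced with Python's standard `hashlib` and `base64` modules. This is a minimal sketch; the helper names `sha1_hex`, `sha1_base64` and `differing_bits` are hypothetical, and the inputs are assumed to be encoded as UTF-8 (identical to ASCII for these strings).

```python
import base64
import hashlib


def sha1_hex(text: str) -> str:
    """SHA-1 digest of the UTF-8 encoded text, as 40 hexadecimal digits."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def sha1_base64(text: str) -> str:
    """The same 20-byte digest, rendered in Base64."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Hamming distance between two digests given in hexadecimal."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


if __name__ == "__main__":
    dog = "The quick brown fox jumps over the lazy dog"
    cog = "The quick brown fox jumps over the lazy cog"
    for text in (dog, cog, ""):
        print(repr(text), sha1_hex(text), sha1_base64(text))
    print("bits that differ between dog and cog:",
          differing_bits(sha1_hex(dog), sha1_hex(cog)))
```

Counting the differing bits this way is a quick check of the avalanche effect: for an ideal hash the distance between two unrelated 160-bit digests is expected to be close to 80.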
Pseudocode for the SHA-1 algorithm follows:

    Note 1: All variables are unsigned 32-bit quantities and wrap modulo 2^{32} when calculating, except for
            ml, the message length, which is a 64-bit quantity, and
            hh, the message digest, which is a 160-bit quantity.
    Note 2: All constants in this pseudo code are in big endian.
            Within each word, the most significant byte is stored in the leftmost byte position

    Initialize variables:

    h0 = 0x67452301
    h1 = 0xEFCDAB89
    h2 = 0x98BADCFE
    h3 = 0x10325476
    h4 = 0xC3D2E1F0

    ml = message length in bits (always a multiple of the number of bits in a character).

    Pre-processing:
    append the bit '1' to the message e.g. by adding 0x80 if message length is a multiple of 8 bits.
    append 0 ≤ k < 512 bits '0', such that the resulting message length in bits
       is congruent to −64 ≡ 448 (mod 512)
    append ml, the original message length, as a 64-bit big-endian integer.
       Thus, the total length is a multiple of 512 bits.

    Process the message in successive 512-bit chunks:
    break message into 512-bit chunks
    for each chunk
        break chunk into sixteen 32-bit big-endian words w[i], 0 ≤ i ≤ 15

        Extend the sixteen 32-bit words into eighty 32-bit words:
        for i from 16 to 79
            w[i] = (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1

        Initialize hash value for this chunk:
        a = h0
        b = h1
        c = h2
        d = h3
        e = h4

        Main loop:^{[3]}^{[55]}
        for i from 0 to 79
            if 0 ≤ i ≤ 19 then
                f = (b and c) or ((not b) and d)
                k = 0x5A827999
            else if 20 ≤ i ≤ 39
                f = b xor c xor d
                k = 0x6ED9EBA1
            else if 40 ≤ i ≤ 59
                f = (b and c) or (b and d) or (c and d)
                k = 0x8F1BBCDC
            else if 60 ≤ i ≤ 79
                f = b xor c xor d
                k = 0xCA62C1D6

            temp = (a leftrotate 5) + f + e + k + w[i]
            e = d
            d = c
            c = b leftrotate 30
            b = a
            a = temp

        Add this chunk's hash to result so far:
        h0 = h0 + a
        h1 = h1 + b
        h2 = h2 + c
        h3 = h3 + d
        h4 = h4 + e

    Produce the final hash value (big-endian) as a 160-bit number:
    hh = (h0 leftshift 128) or (h1 leftshift 96) or (h2 leftshift 64) or (h3 leftshift 32) or h4
The number hh is the message digest, which can be written in hexadecimal (base 16), but is often written using Base64 binary to ASCII text encoding.
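The pseudocode can be transcribed into Python almost line for line. The following is a minimal, unoptimized sketch for illustration only: it assumes whole-byte messages, the names `sha1` and `_rotl` are chosen here rather than taken from any library, and real applications should use a maintained implementation such as `hashlib`.

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF


def _rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def sha1(message: bytes) -> bytes:
    """Return the 20-byte SHA-1 digest of a whole-byte message."""
    h0, h1, h2, h3, h4 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0
    ml = len(message) * 8                                  # message length in bits

    # Pre-processing: the bit '1' (0x80), zero padding up to 448 mod 512 bits,
    # then the original length as a 64-bit big-endian integer.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", ml)

    for offset in range(0, len(padded), 64):               # 512-bit chunks
        w = list(struct.unpack(">16I", padded[offset:offset + 64]))
        for i in range(16, 80):                            # message schedule
            w.append(_rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))

        a, b, c, d, e = h0, h1, h2, h3, h4
        for i in range(80):                                # main loop
            if i < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif i < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif i < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (_rotl(a, 5) + f + e + k + w[i]) & MASK32
            a, b, c, d, e = temp, a, _rotl(b, 30), c, d

        # Add this chunk's result to the running hash (mod 2^32).
        h0 = (h0 + a) & MASK32
        h1 = (h1 + b) & MASK32
        h2 = (h2 + c) & MASK32
        h3 = (h3 + d) & MASK32
        h4 = (h4 + e) & MASK32

    return struct.pack(">5I", h0, h1, h2, h3, h4)          # hh, big-endian


if __name__ == "__main__":
    text = b"The quick brown fox jumps over the lazy dog"
    print(sha1(text).hex())
    print(sha1(text) == hashlib.sha1(text).digest())       # cross-check against hashlib
```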
The constant values used are chosen to be nothing-up-my-sleeve numbers: the four round constants k are the integer parts of 2^{30} times the square roots of 2, 3, 5 and 10. The first four starting values, h0 through h3, are the same as in the MD5 algorithm, and the fifth (for h4) is similar.
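The origin of the round constants is easy to check with integer arithmetic. In the sketch below (Python 3.8 or later for `math.isqrt`; `round_constant` is a hypothetical helper name), the integer square root of n × 2^{60} equals the integer part of 2^{30} × √n, which avoids floating-point rounding.

```python
import math

# Round constants as listed in the pseudocode, paired with the number under the root.
ROUND_CONSTANTS = {2: 0x5A827999, 3: 0x6ED9EBA1, 5: 0x8F1BBCDC, 10: 0xCA62C1D6}


def round_constant(n: int) -> int:
    """Integer part of 2^30 * sqrt(n), computed exactly as isqrt(n * 2^60)."""
    return math.isqrt(n << 60)


if __name__ == "__main__":
    for n, k in ROUND_CONSTANTS.items():
        print(f"sqrt({n}): computed 0x{round_constant(n):08X}, listed 0x{k:08X}")
```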
Instead of the formulation from the original FIPS PUB 180-1 shown, the following equivalent expressions may be used to compute f in the main loop above:

    Bitwise choice between c and d, controlled by b.
    (0 ≤ i ≤ 19): f = d xor (b and (c xor d))                 (alternative 1)
    (0 ≤ i ≤ 19): f = (b and c) xor ((not b) and d)           (alternative 2)
    (0 ≤ i ≤ 19): f = (b and c) + ((not b) and d)             (alternative 3)
    (0 ≤ i ≤ 19): f = vec_sel(d, c, b)                        (alternative 4)

    Bitwise majority function.
    (40 ≤ i ≤ 59): f = (b and c) or (d and (b or c))          (alternative 1)
    (40 ≤ i ≤ 59): f = (b and c) or (d and (b xor c))         (alternative 2)
    (40 ≤ i ≤ 59): f = (b and c) + (d and (b xor c))          (alternative 3)
    (40 ≤ i ≤ 59): f = (b and c) xor (b and d) xor (c and d)  (alternative 4)
    (40 ≤ i ≤ 59): f = vec_sel(c, b, c xor d)                 (alternative 5)
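The equivalence of these expressions can be checked mechanically. The sketch below compares the forms on one fixed input whose bit columns cover all eight combinations of (b, c, d), plus random 32-bit words. The `vec_sel` variants are left out because they rely on a vector-select intrinsic rather than on plain integer operators; the function names are hypothetical.

```python
import random

MASK32 = 0xFFFFFFFF


def choice_forms(b: int, c: int, d: int) -> list:
    """FIPS formulation of the choice function followed by alternatives 1-3."""
    not_b = ~b & MASK32
    return [
        (b & c) | (not_b & d),
        d ^ (b & (c ^ d)),
        (b & c) ^ (not_b & d),
        ((b & c) + (not_b & d)) & MASK32,   # the two terms never share a set bit
    ]


def majority_forms(b: int, c: int, d: int) -> list:
    """FIPS formulation of the majority function followed by alternatives 1-4."""
    return [
        (b & c) | (b & d) | (c & d),
        (b & c) | (d & (b | c)),
        (b & c) | (d & (b ^ c)),
        ((b & c) + (d & (b ^ c))) & MASK32,  # the two terms never share a set bit
        (b & c) ^ (b & d) ^ (c & d),
    ]


def all_forms_agree(trials: int = 1000, seed: int = 0) -> bool:
    """True if every form yields the same word on all sampled inputs."""
    rng = random.Random(seed)
    samples = [(0xF0F0F0F0, 0xCCCCCCCC, 0xAAAAAAAA)]  # all 8 bit combinations
    samples += [tuple(rng.getrandbits(32) for _ in range(3)) for _ in range(trials)]
    return all(
        len(set(choice_forms(*s))) == 1 and len(set(majority_forms(*s))) == 1
        for s in samples
    )


if __name__ == "__main__":
    print(all_forms_agree())
```

Because all operators except "+" act on each bit position independently, the single fixed sample already covers every case for them; the random words additionally exercise the additive forms, where no carries occur because the summands have disjoint bits.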
It was also shown^{[56]} that for the rounds 32–79 the computation of:

    w[i] = (w[i-3] xor w[i-8] xor w[i-14] xor w[i-16]) leftrotate 1

can be replaced with:

    w[i] = (w[i-6] xor w[i-16] xor w[i-28] xor w[i-32]) leftrotate 2

This transformation keeps all operands 64-bit aligned and, by removing the dependency of w[i] on w[i-3], allows efficient SIMD implementation with a vector length of 4 like x86 SSE instructions.
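The alternative recurrence follows from substituting the original one into itself: the terms w[i-11], w[i-17], w[i-19], w[i-22], w[i-24] and w[i-30] each appear twice and cancel under xor, and the two single rotations combine into a rotation by 2. A small scalar check of the identity (no SIMD is involved; `expand_standard` and `expand_alternative` are hypothetical names):

```python
import random

MASK32 = 0xFFFFFFFF


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def expand_standard(block_words: list) -> list:
    """Message schedule exactly as in the pseudocode (rounds 16-79)."""
    w = list(block_words)
    for i in range(16, 80):
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    return w


def expand_alternative(block_words: list) -> list:
    """Same schedule, using the rotate-by-2 form for rounds 32-79."""
    w = list(block_words)
    for i in range(16, 32):
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    for i in range(32, 80):
        w.append(rotl(w[i - 6] ^ w[i - 16] ^ w[i - 28] ^ w[i - 32], 2))
    return w


if __name__ == "__main__":
    rng = random.Random(1)
    block = [rng.getrandbits(32) for _ in range(16)]
    print(expand_standard(block) == expand_alternative(block))
```

In the alternative form the nearest input is six words back, so four consecutive schedule words can be computed together without any of them depending on another word of the same group.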
In the table below, internal state means the "internal hash sum" after each compression of a data block.
Note that performance will vary not only between algorithms, but also with the specific implementation and hardware used. The OpenSSL tool has a built-in "speed" command that benchmarks the various algorithms on the user's system.
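| Algorithm and variant | Output size (bits) | Internal state size (bits) | Block size (bits) | Max message size (bits) | Rounds | Operations | Security bits (Info) | Capacity against length extension attacks | Performance on Skylake (median cpb)^{[57]}, long messages | Performance on Skylake (median cpb)^{[57]}, 8 bytes | First published |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MD5 (as reference) | 128 | 128 (4 × 32) | 512 | Unlimited^{[58]} | 64 | And, Xor, Rot, Add (mod 2^{32}), Or | <64 (collisions found) | 0 | 4.99 | 55.00 | 1992 |
| SHA-0 | 160 | 160 (5 × 32) | 512 | 2^{64} − 1 | 80 | And, Xor, Rot, Add (mod 2^{32}), Or | <34 (collisions found) | 0 | ≈ SHA-1 | ≈ SHA-1 | 1993 |
| SHA-1 | 160 | 160 (5 × 32) | 512 | 2^{64} − 1 | 80 | And, Xor, Rot, Add (mod 2^{32}), Or | <63 (collisions found^{[59]}) | 0 | 3.47 | 52.00 | 1995 |
| SHA-2: SHA-224 | 224 | 256 (8 × 32) | 512 | 2^{64} − 1 | 64 | And, Xor, Rot, Add (mod 2^{32}), Or, Shr | 112 | 32 | 7.62 | 84.50 | 2004 |
| SHA-2: SHA-256 | 256 | 256 (8 × 32) | 512 | 2^{64} − 1 | 64 | And, Xor, Rot, Add (mod 2^{32}), Or, Shr | 128 | 0 | 7.63 | 85.25 | 2001 |
| SHA-2: SHA-384 | 384 | 512 (8 × 64) | 1024 | 2^{128} − 1 | 80 | And, Xor, Rot, Add (mod 2^{64}), Or, Shr | 192 | 128 (≤ 384) | 5.12 | 135.75 | |
| SHA-2: SHA-512 | 512 | 512 (8 × 64) | 1024 | 2^{128} − 1 | 80 | And, Xor, Rot, Add (mod 2^{64}), Or, Shr | 256 | 0 | 5.06 | 135.50 | |
| SHA-2: SHA-512/224 | 224 | 512 (8 × 64) | 1024 | 2^{128} − 1 | 80 | And, Xor, Rot, Add (mod 2^{64}), Or, Shr | 112 | 288 | ≈ SHA-384 | ≈ SHA-384 | |
| SHA-2: SHA-512/256 | 256 | 512 (8 × 64) | 1024 | 2^{128} − 1 | 80 | And, Xor, Rot, Add (mod 2^{64}), Or, Shr | 128 | 256 | ≈ SHA-384 | ≈ SHA-384 | |
| SHA-3: SHA3-224 | 224 | 1600 (5 × 5 × 64) | 1152 | Unlimited^{[60]} | 24^{[61]} | And, Xor, Rot, Not | 112 | 448 | 8.12 | 154.25 | 2015 |
| SHA-3: SHA3-256 | 256 | 1600 (5 × 5 × 64) | 1088 | Unlimited^{[60]} | 24^{[61]} | And, Xor, Rot, Not | 128 | 512 | 8.59 | 155.50 | 2015 |
| SHA-3: SHA3-384 | 384 | 1600 (5 × 5 × 64) | 832 | Unlimited^{[60]} | 24^{[61]} | And, Xor, Rot, Not | 192 | 768 | 11.06 | 164.00 | 2015 |
| SHA-3: SHA3-512 | 512 | 1600 (5 × 5 × 64) | 576 | Unlimited^{[60]} | 24^{[61]} | And, Xor, Rot, Not | 256 | 1024 | 15.88 | 164.00 | 2015 |
| SHA-3: SHAKE128 | d (arbitrary) | 1600 (5 × 5 × 64) | 1344 | Unlimited^{[60]} | 24^{[61]} | And, Xor, Rot, Not | min(d/2, 128) | 256 | 7.08 | 155.25 | 2015 |
| SHA-3: SHAKE256 | d (arbitrary) | 1600 (5 × 5 × 64) | 1088 | Unlimited^{[60]} | 24^{[61]} | And, Xor, Rot, Not | min(d/2, 256) | 512 | 8.59 | 155.50 | 2015 |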
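Since performance depends on the implementation and hardware, a quick local measurement is often more informative than published figures. The sketch below times Python's `hashlib` on a buffer of zero bytes; the helper name `throughput_mib_per_s`, the buffer size and the repeat count are arbitrary choices for illustration. It reports MiB/s rather than cycles per byte, so the numbers are not directly comparable with the table.

```python
import hashlib
import timeit


def throughput_mib_per_s(name: str, size: int = 1 << 20, repeats: int = 5) -> float:
    """Best-of-N throughput of hashlib's implementation of `name` on `size` zero bytes."""
    data = bytes(size)
    seconds = min(
        timeit.repeat(lambda: hashlib.new(name, data).digest(), number=1, repeat=repeats)
    )
    return (size / (1 << 20)) / seconds


if __name__ == "__main__":
    for name in ("sha1", "sha256", "sha512", "sha3_256"):
        print(f"{name:10s} {throughput_mib_per_s(name):10.1f} MiB/s")
```

Taking the minimum over several runs reduces the influence of scheduling noise; short inputs would instead be dominated by per-call overhead, which is what the "8 bytes" column of the table reflects.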
For verifying the hash (which is the only thing they verify in the signature), they have chosen to use a function (strncmp) which stops on the first null-byte – with a positive result. Out of the 160 bits of the SHA1-hash, up to 152 bits are thrown away.
Unlike SHA-1 and SHA-2, Keccak does not have the length-extension weakness, hence does not need the HMAC nested construction. Instead, MAC computation can be performed by simply prepending the message with the key.
In the unlikely event that b is greater than 2^{64}, then only the low-order 64 bits of b are used.