Putting the cracking of SHA-1 in perspective
Summary: SHA-1 is one of the most prevalent forms of a secure hash algorithm used in the legal and security industry. Now that Professor Xiaoyun Wang and her associates in Tsinghua University and Shandong University of Technology have officially cracked the SHA-1 hashing algorithm, the fallout will begin.
SHA-1 is one of the most prevalent forms of a secure hash algorithm used in the legal and security industry. Now that Professor Xiaoyun Wang and her associates in Tsinghua University and Shandong University of Technology have officially cracked the SHA-1 hashing algorithm, the fallout will begin. This won't actually be due to security concerns for the most part, but the legal ramifications may be severe.
Background:
A digital hash is basically a fingerprint of a data file. The perfect hashing algorithm will always produce a unique-enough finger print for a particular data stream that it is practically impossible to find a different data stream matching that finger print. Professor Wang did just that and found a different data stream with an identical finger print that matches the SHA-1 hash of the original data stream. While hashes have been broken before, the SHA-1 hash was published by the NIST in the1995 and was believed to be solid for a long time to come. But professor Wang surprised the cryptographic community in early 2005 with the announcement that she and her team had figured out a way to speed up the cracking process by more than 11 orders of magnitude.
Before I continue, I want to make it clear that the work of Professor Wang and her team is probably one of the biggest accomplishments in the field of cryptanalysis in recent years and is very well respected by her peers. But to put this event in the proper perspective, the finding of a hash collision does not mean the end of the world if your current security products use the SHA-1 hashing algorithm. Just because a hash collision is found doesn't necessarily mean hackers can start exploiting this. Not only does it still requires a massive amount of computing fire power to find a single hash collision but more importantly; finding a hash collision doesn't necessarily mean that a hacker has something useful.
For example: If a legal contract was digitally signed using SHA-1, the fact that you can produce a hash collision doesn't mean you have the ability to generate an arbitrary contract that conflicts with the original contract. Just because you have a blob of data that happens to have an identical SHA-1 hash isn't the same as being able to produce a different legal contract that calls the original contract in to question. That blog you produce will most likely not even be a legible computer document by any sense of the imagination. That's not to say it couldn't ever be done, but it's certainly not trivial. So if John Doe had a prenuptial agreement with his wife that was digitally signed with SHA-1 and time stamped which stated that his wife only gets $10,000, the wife can't reasonably claim that because one SHA-1 hash collision was found by professor Wang that the original prenuptial agreement was actually a forgery. She certainly won't be able to produce another prenuptial document that showed that she was suppose to get $1,000,000 which has an identical SHA-1 hash. The science of finger print forensics or even genetic DNA matching is far less reliable than SHA-1 hashing but perfectly legitimate in the courts.
The problem is that lawyers can certainly try to use the argument that SHA-1 is flawed and juries and even courts have proven to be extremely gullible in the past regardless of what science says. Take the infamous case of the "MD5 defense". A Sydney Magistrate threw out the digitally time stamped photos in a speeding ticket case because the Roads and Traffic authority failed to produce an expert to testify that its speed camera images were secure. The motorist's defense lawyer took advantage of the courts ignorance and argued that the MD5 hashing algorithm was a discredited piece of technology and therefore the speeding photos were invalid. Never mind that the defense never proved any actual tampering by the police department or explained how hash collisions in MD5 could possibly be used to fake photographs, it didn't matter because the judge was ignorant and the traffic authority was incompetent in their prosecution of the case. We lock people away for life with photographs and audio recordings all the time that have NO digital signatures but because a piece of police evidence used a less than perfect MD5 hashing algorithm in the digital signature the entire case was thrown out. With SHA-1 being officially cracked by Chinese researchers, the "MD5 defense" just became the MD5/SHA-1 defense.
It would be interesting how lawyers view this case and I'm going to ask some of them what they think in terms of the legal ramifications. There seems to be some legal precedent in Australia because of one stupid court room but I don't think that's supposed to affect the United States or any other country. The problem is that the US Supreme Court recently cited a foreign legal precedent though not without protest from Justice Scalia and other constitutionalists. Hopefully we can get some sanity back in to the legal system.
Kick off your day with ZDNet's daily email newsletter. It's the freshest tech news and opinion, served hot. Get it.
Talkback
More or less?
Are fingerprints and DNA more or less reliable than SHA-1?
More reliable
You're wrong by many orders of magnitude.
Finger prints are even less with today's readers
Then again, take in account the recent Mythbusters Episode where they were able to pick up the fingerprints, and apply it to a lock.
Basically saying, that fooling a finger print reader is not nearly as difficult.
far worse ...
Sorry, MUCH LESS. I fixed the sentense in the blog
2^16 is 16.8 million possiblities. SHA-1 hash collisions in its new weakened state has hash collisions in 2^68.
2^16 = 65536
Sorry for the quick and dirty math, meant 24 bits
I suppose it depends on how the crack works.
But modifying a file so that it has the same SHA-1 sum [i]and[/i] the same file size afterwards? That would be a trick worth knowing.
Same SHA-1 sum and the same file size.
You can't just tinker with the bits in a compressed archive file.
uncompress, change then add random stuff until SHA worked?
uncompress, change random, compress, test.
That's brute force.
The chinese scientist found some easier way than brute force to break SHA, so there could be an easy way to predict what random stuff to add to make the SHA the same as original.
You're making some big leaps
No, the scientist found A collision, not necessarily a SPECIAL collision that would allow you to create a fake signature for a fake document. There's a world of difference there.
I don't think you realise how easy it would be
If your intent was to find a way to modify a message and the hash to indicate it was authentic, you wouldn't need to spend that much money.
Put it this way. Do you really think the NSA would give out an encryption mechanism that would allow terrorists to encrypt their hard drive, and the NSA have no way to 'bypass' the encryption? (bypass as in with relatively little hardware).
Nope. They'd find a very devious mathematical way of obscuring
the weakness.
(Via Padlock uses SHA) http://www.via.com.tw/en/initiatives/padlock/features.jsp
Do you have any idea how silly that statement is
NSA doesn't set the AES encryption standard, the AES was produced by an international consortium and the winning design wasn't even from an American. Sheesh.
http://articles.techrepublic.com.com/5100-1009_11-6145490.html
Try this from wikipedia
"SHA-1 differs from SHA-0 only by a single bitwise rotation in the message schedule of its compression function; this was done, according to the NSA, to correct a flaw in the original algorithm which reduced its cryptographic security. However, the NSA did not provide any further explanation or identify what flaw was corrected. Weaknesses have subsequently been reported in both SHA-0 and SHA-1. SHA-1 appears to provide greater resistance to attacks, supporting the NSA?s assertion that the change increased the security"
Perhaps you'll note the bit where the final algorithm was defined by NSA.
Heh, I'm not afraid of the "brute force" approach :-)
But isn't the problem of "How do you change a file so that [i]when you later compress it[/i], it still has the same SHA-1 sum as the original compressed file?" [b]much[/b] harder than the uncompressed case?
well you wouldn't use a computer
With lots of parallel purpose-designed hardware (as in NSA) you could find lots of collisions very quickly. Eventually you'd have the entire private key. Once you've got this, you can sign your own message and it appear to come from the original sender.
The answer to your other question is probably yes. It kind of depends what the original file is really. There could be ways to do this.
C'mon, now you're getting ridiculous
Hash collisions have nothing to do with revealing the private key, sheesh! Hash algorithm != asymmetric encryption algorithm. You can find hash collisions till you're blue in the face and you're NOT going to find a private key in there. Sheesh.
Bruce Scheider "this pretty much puts a bullet in SHA-1"
Here is a link:
http://www.computerweekly.com/Articles/2005/03/08/208736/companies-forced-to-reconsider-security-as-sha-1-code-is.htm
Where Bruce Schneider (Counterpane Internet Security) is reported to say this.
It pretty much blows away your whole article don't you think?
I do.
In wikipedia on many pages, it is stated that the NSA developed SHA-1 etc:
http://en.wikipedia.org/wiki/Cryptographic_hash_function#List_of_cryptographic_hash_functions
You also clearly don't understand the inter-relationship between block and hash ciphers. Note: SHA-1 can be used to give a block cipher (it's called SHACAL-1). Similarily, a block cipher can be used to hide a function of a document that verifies authenticity. (on the page linked to above, see the following:
"Just as block ciphers can be used to build hash functions, hash functions can be used to build block ciphers. Examples of such block ciphers are SHACAL, BEAR and LION."
If you grokked the basic math, this is obvious.