Putting the cracking of SHA-1 in perspective

Putting the cracking of SHA-1 in perspective

Summary: SHA-1 is one of the most prevalent forms of a secure hash algorithm used in the legal and security industry.  Now that Professor Xiaoyun Wang and her associates in Tsinghua University and Shandong University of Technology have officially cracked the SHA-1 hashing algorithm, the fallout will begin.

SHARE:
TOPICS: Legal
50

SHA-1 is one of the most prevalent forms of a secure hash algorithm used in the legal and security industry.  Now that Professor Xiaoyun Wang and her associates in Tsinghua University and Shandong University of Technology have officially cracked the SHA-1 hashing algorithm, the fallout will begin.  This won't actually be due to security concerns for the most part, but the legal ramifications may be severe.

Background:
A digital hash is basically a fingerprint of a data file.  The perfect hashing algorithm will always produce a unique-enough finger print for a particular data stream that it is practically impossible to find a different data stream matching that finger print.  Professor Wang did just that and found a different data stream with an identical finger print that matches the SHA-1 hash of the original data stream.  While hashes have been broken before, the SHA-1 hash was published by the NIST in the1995 and was believed to be solid for a long time to come.  But professor Wang surprised the cryptographic community in early 2005 with the announcement that she and her team had figured out a way to speed up the cracking process by more than 11 orders of magnitude.

Before I continue, I want to make it clear that the work of Professor Wang and her team is probably one of the biggest accomplishments in the field of cryptanalysis in recent years and is very well respected by her peers.  But to put this event in the proper perspective, the finding of a hash collision does not mean the end of the world if your current security products use the SHA-1 hashing algorithm.  Just because a hash collision is found doesn't necessarily mean hackers can start exploiting this.  Not only does it still requires a massive amount of computing fire power to find a single hash collision but more importantly; finding a hash collision doesn't necessarily mean that a hacker has something useful.

For example: If a legal contract was digitally signed using SHA-1, the fact that you can produce a hash collision doesn't mean you have the ability to generate an arbitrary contract that conflicts with the original contract.  Just because you have a blob of data that happens to have an identical SHA-1 hash isn't the same as being able to produce a different legal contract that calls the original contract in to question.  That blog you produce will most likely not even be a legible computer document by any sense of the imagination.  That's not to say it couldn't ever be done, but it's certainly not trivial.  So if John Doe had a prenuptial agreement with his wife that was digitally signed with SHA-1 and time stamped which stated that his wife only gets $10,000, the wife can't reasonably claim that because one SHA-1 hash collision was found by professor Wang that the original prenuptial agreement was actually a forgery.  She certainly won't be able to produce another prenuptial document that showed that she was suppose to get $1,000,000 which has an identical SHA-1 hash.  The science of finger print forensics or even genetic DNA matching is far less reliable than SHA-1 hashing but perfectly legitimate in the courts.

The problem is that lawyers can certainly try to use the argument that SHA-1 is flawed and juries and even courts have proven to be extremely gullible in the past regardless of what science says.  Take the infamous case of the "MD5 defense".  A Sydney Magistrate threw out the digitally time stamped photos in a speeding ticket case because the Roads and Traffic authority failed to produce an expert to testify that its speed camera images were secure.  The motorist's defense lawyer took advantage of the courts ignorance and argued that the MD5 hashing algorithm was a discredited piece of technology and therefore the speeding photos were invalid.  Never mind that the defense never proved any actual tampering by the police department or explained how hash collisions in MD5 could possibly be used to fake photographs, it didn't matter because the judge was ignorant and the traffic authority was incompetent in their prosecution of the case.  We lock people away for life with photographs and audio recordings all the time that have NO digital signatures but because a piece of police evidence used a less than perfect MD5 hashing algorithm in the digital signature the entire case was thrown out.  With SHA-1 being officially cracked by Chinese researchers, the "MD5 defense" just became the MD5/SHA-1 defense.

It would be interesting how lawyers view this case and I'm going to ask some of them what they think in terms of the legal ramifications.  There seems to be some legal precedent in Australia because of one stupid court room but I don't think that's supposed to affect the United States or any other country.  The problem is that the US Supreme Court recently cited a foreign legal precedent though not without protest from Justice Scalia and other constitutionalists.  Hopefully we can get some sanity back in to the legal system.

Topic: Legal

Kick off your day with ZDNet's daily email newsletter. It's the freshest tech news and opinion, served hot. Get it.

Talkback

50 comments
Log in or register to join the discussion
  • More or less?

    "The science of finger print forensics or even genetic DNA matching is still far reliable than SHA-1 hashing."

    Are fingerprints and DNA more or less reliable than SHA-1?
    Letophoro
    • More reliable

      Fingerprints and DNA are more reliable than SHA-1. No two fingerprints between two or more persons have ever been found to be identical. DNA is mildly less reliable in that true identical twins have the same DNA makeup, but never the same fingerprints.
      ken_ballard9
      • You're wrong by many orders of magnitude.

        Fingerprints and DNA results colide on a order of far less than 2^32 power. SHA-1 even in its weakened state will collide in 2^68.
        georgeou
        • Finger prints are even less with today's readers

          Keep in mind, many of the modern readers only pick a certain amount of matching points, which could be systematically less than 2^32.

          Then again, take in account the recent Mythbusters Episode where they were able to pick up the fingerprints, and apply it to a lock.

          Basically saying, that fooling a finger print reader is not nearly as difficult.
          nucrash
          • far worse ...

            If you have one of those laptops with fingerprint ID, walk around at work and see how many of your workmates will be granted access. "Fingerprint security" is one of the biggest pieces of marketing propaganda I've seen for personal computers in the past 5 years. I don't waste my time activating it.
            zoroaster
    • Sorry, MUCH LESS. I fixed the sentense in the blog

      DNA tests can have collisions of 1 in a million depending on the particular test. It's still reliable because when used in conjunction with other evidence the odds of an error is so small that it's not reasonable doubt. That doesn't prevent stupid juries like the OJ jury from turning "reasonalbe doubt" in to "any doubt".

      2^16 is 16.8 million possiblities. SHA-1 hash collisions in its new weakened state has hash collisions in 2^68.
      georgeou
      • 2^16 = 65536

        N/T
        rpmyers1
        • Sorry for the quick and dirty math, meant 24 bits

          24 bit is 16.8 million, sorry for the quick and dirty math. DNA tests can be as accurate as 1 in 10 billion with some of the more modern testing, so that's getting past 2^33. While that's nowhere close to 2^68 for SHA-1, it's an insanely accurate number.
          georgeou
  • I suppose it depends on how the crack works.

    If someone can download a binary file, tamper with its contents and then append a custom blob to it so that it still has the same SHA-1 sum as the untampered version then that might fool some people (provided the blob were small enough). Does the crack provide a methodology for creating such a blob?

    But modifying a file so that it has the same SHA-1 sum [i]and[/i] the same file size afterwards? That would be a trick worth knowing.
    Zogg
    • Same SHA-1 sum and the same file size.

      Probably very tough on text files, but I could see it image files where you have millions of least-significant color bits to play with. As in steganographic techniques, tinkering with the low-order bits in pixels make almost no perceptible difference in the appearance of the image.
      Henry Miller
      • You can't just tinker with the bits in a compressed archive file.

        Well, you can I suppose. But I wouldn't expect the file to uncompress afterwards ;-) .
        Zogg
        • uncompress, change then add random stuff until SHA worked?

          Or something similar to the above, maybe
          uncompress, change random, compress, test.
          That's brute force.

          The chinese scientist found some easier way than brute force to break SHA, so there could be an easy way to predict what random stuff to add to make the SHA the same as original.
          stevey_d
          • You're making some big leaps

            "The chinese scientist found some easier way than brute force to break SHA, so there could be an easy way to predict what random stuff to add to make the SHA the same as original."

            No, the scientist found A collision, not necessarily a SPECIAL collision that would allow you to create a fake signature for a fake document. There's a world of difference there.
            georgeou
          • I don't think you realise how easy it would be

            To buy several thousand FPGAs, put them on some PCBs, and run the collision detection algorithm at orders of magnitude faster than you could conceive of on a thousand PCs.

            If your intent was to find a way to modify a message and the hash to indicate it was authentic, you wouldn't need to spend that much money.

            Put it this way. Do you really think the NSA would give out an encryption mechanism that would allow terrorists to encrypt their hard drive, and the NSA have no way to 'bypass' the encryption? (bypass as in with relatively little hardware).

            Nope. They'd find a very devious mathematical way of obscuring
            the weakness.

            (Via Padlock uses SHA) http://www.via.com.tw/en/initiatives/padlock/features.jsp
            stevey_d
          • Do you have any idea how silly that statement is

            I'm going to tell you one last time but it doesn't sound like you're capable of understanding. The hash has nothing to do with encryption. Encryption is used on a hash to produce a digital siganture, but you can find hash collisions till you're blue in the face and you're still not going to reveal the private key of an RSA or DH key pair.

            NSA doesn't set the AES encryption standard, the AES was produced by an international consortium and the winning design wasn't even from an American. Sheesh.
            http://articles.techrepublic.com.com/5100-1009_11-6145490.html
            georgeou
          • Try this from wikipedia

            Look up SHA-1

            "SHA-1 differs from SHA-0 only by a single bitwise rotation in the message schedule of its compression function; this was done, according to the NSA, to correct a flaw in the original algorithm which reduced its cryptographic security. However, the NSA did not provide any further explanation or identify what flaw was corrected. Weaknesses have subsequently been reported in both SHA-0 and SHA-1. SHA-1 appears to provide greater resistance to attacks, supporting the NSA?s assertion that the change increased the security"

            Perhaps you'll note the bit where the final algorithm was defined by NSA.
            stevey_d
          • Heh, I'm not afraid of the "brute force" approach :-)

            I'm thinking you'd need a very fast computer to make that method worthwhile, although it was [u]always[/u] guaranteed to be [i]possible[/i] to do it that way.

            But isn't the problem of "How do you change a file so that [i]when you later compress it[/i], it still has the same SHA-1 sum as the original compressed file?" [b]much[/b] harder than the uncompressed case?
            Zogg
          • well you wouldn't use a computer

            You'd build a brute force engine out of FPGAs or ASICs. You can even program these in C if you want (Verilog is more common).
            With lots of parallel purpose-designed hardware (as in NSA) you could find lots of collisions very quickly. Eventually you'd have the entire private key. Once you've got this, you can sign your own message and it appear to come from the original sender.

            The answer to your other question is probably yes. It kind of depends what the original file is really. There could be ways to do this.
            stevey_d
          • C'mon, now you're getting ridiculous

            "With lots of parallel purpose-designed hardware (as in NSA) you could find lots of collisions very quickly. Eventually you'd have the entire private key."

            Hash collisions have nothing to do with revealing the private key, sheesh! Hash algorithm != asymmetric encryption algorithm. You can find hash collisions till you're blue in the face and you're NOT going to find a private key in there. Sheesh.
            georgeou
          • Bruce Scheider "this pretty much puts a bullet in SHA-1"

            Maybe you don't understand what you're writing about George.

            Here is a link:
            http://www.computerweekly.com/Articles/2005/03/08/208736/companies-forced-to-reconsider-security-as-sha-1-code-is.htm

            Where Bruce Schneider (Counterpane Internet Security) is reported to say this.

            It pretty much blows away your whole article don't you think?
            I do.

            In wikipedia on many pages, it is stated that the NSA developed SHA-1 etc:
            http://en.wikipedia.org/wiki/Cryptographic_hash_function#List_of_cryptographic_hash_functions

            You also clearly don't understand the inter-relationship between block and hash ciphers. Note: SHA-1 can be used to give a block cipher (it's called SHACAL-1). Similarily, a block cipher can be used to hide a function of a document that verifies authenticity. (on the page linked to above, see the following:

            "Just as block ciphers can be used to build hash functions, hash functions can be used to build block ciphers. Examples of such block ciphers are SHACAL, BEAR and LION."

            If you grokked the basic math, this is obvious.
            stevey_d