90% of all statistics can be made to say anything... 50% of the time, aka my thoughts on the Verizon report


TOPICS: Verizon, Security


** Update 06/23/2008: I realize I didn't do a very good job of explaining what we're reviewing here.  This is a response to the statistics Verizon gathered from forensic analysis of data breaches over a four-year span.

First off, let me start by saying that I am in no way calling Verizon's numbers into question.  What I intend here is simply to provide a second opinion on a complex set of data that can easily be misinterpreted.  To get you started, here's the report I'm talking about.  When reading this, keep in mind that I've worked the last five years of my life as a computer security consultant, the last three with Ernst & Young's Advanced Security Center, and that may certainly put a spin on these numbers that others won't agree with.  That's fine; the comments here are simply meant to stimulate thought.  I'm looking forward to talkbacks on this one, so if you agree, if you disagree, if you think I'm a bastard... feel free to comment.

As you dig into the report, keep in mind that this is a massive compilation of more than four years' worth of study, drawn from over 500 forensic investigations.  Read on... if you dare.

The first thing that will jump out of the report at you is the first question, "Who is behind data breaches?", which led to the following stats:

  • 73% resulted from external sources
  • 18% were caused by insiders
  • 39% implicated business partners
  • 30% involved multiple parties 

Your first thought is probably, "Wow, my consultant has been lying to me about internal threats!"  The thing is, that's not necessarily true.  The context around "implicated business partners" and "involved multiple parties" leaves something to be desired in terms of clarification.  Here's why:

  1. Many "business partners" have access roughly equivalent to internal access due to the VPN, B2B, and extranet connections that are shared between partners.  This could lead to an insider threat from an outside source.  My personal experience (which admittedly is not based on four years of forensic investigations, but is sizeable) is that most clients I work with do NOT have sufficient internal network segregation controls, whether by firewall, VLAN, etc.  I can't say how many times I've been able to hack into a business partner's weaker applications or network and use that access to compromise another company's intranet.  Even more discouraging is how many of my clients I still see using a "flat" network structure.
  2. The term "business partners" is pretty ambiguous.  Does this simply mean people I do business with, like third-party vendors, or does it mean business partners that my company owns?  There's a big difference there: in one case you have true outsiders; in the other, you could count the business partner as an insider as well.
  3. What exactly does "multiple parties" mean?  Does that mean several conspiring external forces, or outsiders conspiring with insiders?  The question this brings up is whether things are being double counted.  Obviously, if you total up the above numbers, they do NOT equal 100%, so there must be some overlap, but where and how much?  Hell, multiple parties could simply mean more than one person.
  4. Does "external sources" mean exclusively external sources, or external-only attacks + external attacks with internal help + external business partner attacks?  For that matter, what does "external sources" mean?  Does it mean attackers that are not a part of the business, or does it simply mean the attack came from outside the victim's network?
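
To put a rough number on that overlap, here's a back-of-the-envelope sketch in Python.  It uses only the four percentages quoted above; the variable names are mine, and it assumes every breach falls into at least one of the four categories, which may or may not be how Verizon counted.

```python
# Percentages quoted from the report's "Who is behind data breaches?" list.
sources = {
    "external sources": 73,
    "insiders": 18,
    "business partners": 39,
    "multiple parties": 30,
}

# If the categories were mutually exclusive, this would be 100.
total = sum(sources.values())

# Assumption: every breach lands in at least one category. Under that
# assumption, anything above 100 must come from breaches that were
# counted in more than one category.
minimum_overlap = total - 100

print(f"Categories add up to {total}%")
print(f"At least {minimum_overlap} percentage points are double counted")
```

This says nothing about where the overlap sits (external plus partner, partner plus insider, and so on), only that a good chunk of it has to exist.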

Furthermore, as Verizon states:

"Breaches attributed to insiders, though fewer in number, were much larger than those caused by outsiders when they did occur.  As a reminder of risks inherent to the extended enterprise, business partners were behind well over a third of breaches, a number that rose five-fold over the time period of the study."

OK, so all this said, you might choose to redefine the numbers that Verizon has provided.  In fact, depending on how this data was actually collected, and on Verizon's definitions of its own statistics, you might be able to say the following just as easily, and possibly more accurately:

  • 34% to 73+% resulted from external sources (assuming that some portion of the 39% of implicated business partners were counted here and really should've been considered insiders as they are truly a part of the greater business and not really external entities)
  • 18% to 87% were caused by insiders (assuming that some portion of the 39% of implicated business partners are really internal to the network and that some portion of the 30% of involved multiple parties could've included internal resources)
  • 39% implicated business partners
  • 30% involved multiple parties 
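
The arithmetic behind those two ranges is simple enough to sketch in Python.  This is only my reading of the numbers, not Verizon's method: the lower bound for external sources assumes every partner breach was counted as external and gets reclassified, and the upper bound for insiders assumes every partner breach and every multi-party breach involved an insider, with no overlap between those groups.

```python
# Percentages quoted from the report.
external, insiders, partners, multiple = 73, 18, 39, 30

# Lower bound for "truly external": assume all partner breaches were
# counted as external and should really be treated as insiders.
external_low = external - partners
external_high = external  # the "73+" end: nothing gets reclassified

# Upper bound for insiders: assume every partner breach and every
# multi-party breach involved an insider, and none of them overlap.
insiders_low = insiders
insiders_high = min(100, insiders + partners + multiple)

print(f"external sources: {external_low}% to {external_high}+%")
print(f"insiders: {insiders_low}% to {insiders_high}%")
```

The truth presumably sits somewhere inside those ranges, and the width of the ranges is the point.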

That really changes the way you look at it.  Certainly my analysis could be flawed, but keep this kind of thing in mind when you are looking at the numbers: despite Verizon's best efforts to keep us all on the same page, you can't truly know the context in which Verizon wrote some of this.  That is NOT to criticize Verizon; they did an amazing job of cataloging this information and actually making a LOT of sense of it.  Again, I reiterate, I love this study... I just have more questions that I hope Verizon will seek to answer.

To be fair, Verizon does try to cover this in the section entitled "Sources of Data Breaches" on page 10 of the 29-page PDF file.  Also worth noting, there's some more clarification of what "business partners" means on page 14, where Verizon states:

"Partner-side information assets and connections were compromised and used by an external entity to attack the victim’s systems in 57 percent of breaches involving a business partner. Though not a willing accomplice, the partner’s lax security practices—often outside the victim’s control—undeniably allow such attacks to take place."

The second question raises more questions of its own and warrants further analysis.  It asks, "How do breaches occur?" and captures the following numbers:

  • 62% were attributed to a significant error
  • 59% resulted from hacking and intrusions
  • 31% incorporated malicious code
  • 22% exploited a vulnerability
  • 15% were due to physical threats

So the first one that kind of blows my mind a bit is "62% were attributed to a significant error"... so what were the other 38% attributed to?  My general thought on a data breach is that somebody, somewhere, jacked something up.  Maybe it wasn't the victim company's fault, because tapes were dropped off a truck or a third-party application had a stack overflow, but somebody messed up.  I'm really struggling with that one.  The next one that bothers me is that "59% resulted from hacking and intrusions" but only "22% exploited a vulnerability".  I guess I'm looking for a definition of what hacking, intrusions, and vulnerability mean to Verizon, because I'd expect that far more than 22% of data breaches are due to a vulnerability.  The confusion for me goes on, as Verizon states:

"Intrusion attempts targeted the application layer more than the operating system and less than a quarter of attacks exploited vulnerabilities.  Ninety percent of known vulnerabilities exploited by these attacks had patches available for at least six months prior to the breach."

Umm... 90% of the vulnerabilities exploited had patches, but you just said that the intrusions targeted the application layer more than the operating system.  That leaves me with one of three conclusions, summarized below:

  1. Verizon does not consider SQL injection, XSS, CSRF, and other application layer intrusions to be vulnerabilities, as there are no patches for most application layer flaws.
  2. OR, their statement that 90% of these issues had patches is way off.
  3. OR, I'm somehow missing a critical explanation that makes this straightforward (it is a large report).

Before I move further, Verizon later characterizes this information as relating only to those exploits that involved "known vulnerabilities".  This doesn't really help us though.  It still raises the question: are they referring simply to those things for which we have CVE reference numbers?  Because SQL injection is a known exploit at this time (God, I hope we can say that now), and it most certainly does not have a patch you can apply.
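
To illustrate why there's no patch to apply for something like SQL injection, here is a minimal sketch using Python's built-in sqlite3 module.  The table, the data, and the function names are all made up for illustration; nothing here comes from the report.  The flaw lives in the application's own query-building code, so the fix is a code change (a parameterized query), not a vendor patch.

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database with two fake records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, ssn TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "111-11-1111"), ("bob", "222-22-2222")],
    )
    return conn


def lookup_vulnerable(conn, name):
    # Vulnerable: user input is pasted straight into the SQL text,
    # so quotes in the input can rewrite the query itself.
    query = "SELECT name, ssn FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, name):
    # Safer: the input is bound as a parameter and is only ever
    # treated as data, never as SQL.
    return conn.execute(
        "SELECT name, ssn FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = build_db()
payload = "x' OR '1'='1"

leaked = lookup_vulnerable(conn, payload)  # WHERE clause is always true
blocked = lookup_safe(conn, payload)       # no user has this literal name

print(f"vulnerable lookup returned {len(leaked)} rows")
print(f"parameterized lookup returned {len(blocked)} rows")
```

No operating system or database patch changes the behavior of the first function; only rewriting it does.  That's why counting "vulnerabilities with patches available" seems likely to miss this whole class of flaw.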

It gets stranger at a later point in the report, where it looks to me as if they are counting OS level attacks not once but twice.  On page 16 of the 29 page report, the pie chart represents that:

  • 39% are Application/Service Layer exploits
  • 23% are OS/Platform Layer exploits
  • 18% Exploit known vulnerabilities
  • 5% Exploit unknown vulnerabilities

So, somehow, this totals 100% and is represented as a single pie chart... however, there really should be two charts here in my eyes: one that represents application/service layer exploits vs. OS/platform layer exploits, and one that represents known vulnerabilities vs. unknown vulnerabilities.
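
Here's a quick sanity check on those slices, sketched in Python.  One caveat up front: the four slices as listed in this post add up to 85, not 100, so I'm assuming the chart in the report has at least one more slice that isn't reproduced here.  The split into two charts is my suggestion, not anything Verizon published, and the names in the code are mine.

```python
# The four slices as quoted in the list above.
layer_slices = {"application/service layer": 39, "OS/platform layer": 23}
vuln_slices = {"known vulnerabilities": 18, "unknown vulnerabilities": 5}

listed_total = sum(layer_slices.values()) + sum(vuln_slices.values())
not_listed = 100 - listed_total  # share of the pie not accounted for here


def as_own_chart(slices):
    """Rescale a group of slices so the group totals 100% by itself."""
    group_total = sum(slices.values())
    return {label: round(100 * value / group_total, 1)
            for label, value in slices.items()}


layer_chart = as_own_chart(layer_slices)
vuln_chart = as_own_chart(vuln_slices)

print(f"listed slices total {listed_total}%, leaving {not_listed}% unlisted")
print("by layer:", layer_chart)
print("by vulnerability type:", vuln_chart)
```

Rescaled this way, each chart answers one question at a time, instead of mixing "which layer?" with "known or unknown?" in a single pie.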

You can actually muddy the waters even further, as there are so many joint attacks now.  Do you count one of my protocol handler attacks, which exploit software such as browsers, third-party apps, and operating system components, as an OS/platform layer exploit?  It's got to be delivered somehow; what if it was delivered through an XSS exposure?

Continuing, the next question asks, "What commonalities exist?" and collects the following numbers:

  • 66% involved data the victim did not know was on the system
  • 75% of breaches were not discovered by the victim
  • 83% of attacks were not highly difficult
  • 85% of breaches were the result of opportunistic attacks
  • 87% were considered avoidable through reasonable controls

Yikes, alright, I don't have much to argue about with these stats, but I will say this... they make me really, really sad.  "66% involved data the victim did not know was on the system"... argh.  Great real-life example of this: your HR department puts a salary spreadsheet for all employees on a Windows file share, and it includes detailed information such as Social Security numbers and bank account numbers (for direct deposits).

"75% of breaches were not discovered by the victim" and they go on to say, "Most breaches go undetected for quite a while and are discovered by a third party rather than the victim organization"... yikes.  BUT, this brings up a really interesting point... these statistics are all shaded by the fact that there are probably a ton of data breaches that no one ever hears about (except the attacker).  "83% of attacks were not highly difficult" is honestly a relative statement that's hard to puzzle out, because it raises the question: not highly difficult to whom?  I actually expect this number is much higher, due to the large number of web application attacks leading to data breaches, which all tend to involve issues that are pretty easy to find and exploit.

God, this report is just stuffed with wonderful information.  I truly commend Verizon on their analysis.  I am looking right now at a pie chart on page 13 of the document that might turn some heads.  It would seem that China may not be our biggest threat, as I'm showing a combined total of 47% of the investigated data breaches tied to Eastern Europe and right here in our own backyard, North America.  In fact, ALL OF ASIA only accounts for 35% of the data breaches.  I'm not saying that China isn't a threat; again, keep in mind the nature of the statistics, which are all about data breaches.  There's a whole lot more that hackers are interested in.

Ah, now to my favorite section of all... this one is nasty.  Starting on page 22 of the 29-page document, Verizon begins characterizing numbers around the type of data being breached.  In 84% of the data breaches observed, payment card data was compromised.  To that, I simply say, stay gold PCI, stay gold.  Don't expect this number to change much as we move forward, as PCI still lacks the bite to really help.  The decisions around allowing web application firewalls as a last line of defense should keep this problem around for quite some time.

So, my final conclusions from the Verizon report... there's lots to exploit, there's tons of data to steal, there are a lot of misconceptions and confusion, and there's still tons of snake oil that's not helping us out at all.  Oh, and I'm still bitter about PCI's decision around WAFs.  Finally, while this report is great, we still have some unknowns to consider and need to keep that in mind when making strategic decisions about where to focus.


  • How many breaches from External...

    sources were facilitated by poor practices of inside sources? Weak passwords, poor surfing habits, poor security implementations, etc. External breaches only occur when an insider allows it to happen. Thus, I would say more like 95% are insider related breaches.
  • Agreed, Blog post to follow

    @ Nate:

    Great job on this. I wanted to say many of the same things, and I'm glad that someone was willing to say them besides me.

    I have a few things I want to talk about with regards to your post, but I'll leave it for a blog post on the TS/SCI Security blog (tssci-security.com) in the next few days.

    For example, lack of firewall/VLAN on partner sites has little to do with the ability of a partner to access your or their networks, compared to, say, authentication and authorization requirements. What we see all too often is that partners just set up IP-based authentication sans individual users, which is certainly contrary to OWASP T10:A7 recommendations. Network security concepts just seem to worsen the scenario... now you have a vulnerable firewall, vulnerable Ethernet switch, and a vulnerable web application.

    Another thing missing from the Verizon breach reports was the "we have no idea" bucket. I'd like to assume that the Verizon incident responders are not so good that they don't even know their own weaknesses.
    • Good point

      I did not see any confidence levels specified for their findings.
      • Bitter about WAFs? In what way?

        I have been trying to get our Network and security group to buy a Web Appliance for some time since most of our threats (excluding infected systems via VPN. No we don't have NAC) are web based.

        Fred Dunn
        • Read some of my old posts...

          WAFs are NOT an effective method of preventing web application flaws. They certainly help, but they are only good at finding things that a machine can be taught to find, like SQL Inj. and XSS.

          They should NEVER be used as a last or only line of defense.

          PCI suggests they should.

  • seems everyone has an idea to share, now ya ready for the truth?

    secure or not, i know that knowledge is power. i am human and may be wrong in some angle. it seems everyone is still blind about what's going on. EXAMPLE: the conficker decoys (along with lots of others) were phase one. the backdoor that was actually installed laughs at firewalls, detection systems. strange but true, there is more to it. the backdoor injects radio packets into the bus of the motherboard. this is also used to spread through my phone systems.

    the lags that yall notice and prolly the mouse jumping around for some of you are the early signs of the main worm infection i got 2 years ago. the worm can be best described as gears of a watch. when it's time for one to shut down, another turns on elsewhere. february is the date that the DNS to ICMP packets will be used if this isn't addressed. the lag from experiments when the worm was made was caused when another infected machine was turned on. when i was lagged, all i had to do is go to the backroom and see when my roommate turned on her machine. JUST NOW. it's more fine tuned now. another link to this concerns infected hard drives. through my tests, the early stages of any infected machine showed me that if you have more than 1 hard drive on anywhere in a local area, the worm will cut out any program that tries to format (even low level) it with an access violation error. this means that if you have 2 drives connected in 1 machine, or 1 drive in your computer (while on), and a roommate 3 rooms away also has her computer on, it won't let you touch the worm on the drive.
    again, it's more advanced now. it now uses algorithms to reset its offset if messed with in memory. the lag isn't as bad as it was 2 years ago. it would take 30 minutes at some times just to turn on the machine.

    also note that a layer of the worm hijacks firmware/kernel/bios and uses your hardware in such a way where it runs as soon as the power is on and independent of your operating system.

    one thing never mentioned is the string injection into texts that you write (such as i am doing now). they are the same as the sql injections cept that all i know is that it's annoying and makes letters and words disappear and twist around and even messes with my visual studio programming.

    the only ways i noticed so far to determine if you're infected is by trying to replace your video/audio/network card's drivers from the main manufacturer. if they fail, you're infected. the only other is through anonymous logon using logonsessions.exe which originally used 0x3e7 as logon and same seen in icmp packets.

    the hubs and routers also are altered. but they failed to encrypt the logs, and can be used to monitor.

    what some of you may have seen under certain situations is numbers attached to strings. the purpose and why you've seen it is linked to how the worm responds to situations. the high and almighty botnet backdoor binds numbers to components such as textboxes and listviews. when a string that the worm doesn't want you to see in memory is recognized, the numbers tell the worm how to handle the panic. if it's a textbox, then it will simply disappear the text before your eyes. if it's a listview, then it knows to go further and log off a session so the hacker doesn't get traced.

    again, they create 2 connections. 1 from chip in motherboard using radio packet injection in airwaves. and the 2nd through an open port.

    the laws of how not to get infected by certain flaws don't apply anymore..

    hope this wakes some of ya up. i have people in my machine in the last 2 days removing parts of the worm that intercept discriminating data that was filtered....

    one such intercept would be a response from an antivirus

    take care