PRISM: Here's how the NSA wiretapped the Internet

Summary: UPDATED 5: The National Security Agency's "PRISM" program is able to collect, in real time, intelligence not limited to social networks and email accounts. But the seven tech companies accused of opening 'back doors' to the spy agency could well be proven innocent.

By tapping into the connection between the Tier 1 network and the edge connection, the NSA would be able to view and copy, in real time, the data transmitted over every single session from a user to an application, and then store and process it as needed.

You can't walk into, say, Apple's iCloud datacenter and install a wiretap. Apple would notice it. It would have to be done out of band: such as when the data leaves the datacenter and begins its journey on the way to the user sitting at home on their laptop or mobile device.

Microsoft's Hotmail service, since rebranded as Outlook.com, was on the list of PRISM services that were being accessed by the NSA. But the NSA didn't need to seek Microsoft's permission, or even to serve it with a court order or a ruling from the FISC. Had it done so, the sheer size of the company means someone would eventually have either said something to someone else, breaking the law by breaching the gagging clauses in the process, or noticed a backdoor in the systems somewhere.

And, using Hotmail as an example, if the NSA was acquiring all the data since September 2007 — the time the leaked slides show the data harvesting began — the NSA would in theory now have all of everyone's Hotmail data to date.

But that would be almost useless to the NSA. The agency wants to know about the "here and now," not "then." They want information that is immediately actionable.

There's the issue of encryption, such as an SSL connection, which offers an HTTPS secure pipe between the user's computer and the website providing the service. It's like a metal pipe that stretches end-to-end. The port that's opened up on your computer is encrypted and everything that flows through it is completely unreadable.

But if the NSA were intercepting traffic and decrypting it somehow on the edge connection between the application service provider — such as Facebook, Gmail, Amazon, for example — and the Tier 1 network, the application service provider would be unaware that this was happening.

There are a number of wiretap-related laws, and which one is used depends on the case. Of course the main one is the Wiretap Act. But it all depends on which law may sway the judge that must hand the order down to authorize such an act.

According to the Electronic Frontier Foundation (EFF), the Wiretap Act requires police, law enforcement or intelligence agencies to seek a warrant — often called a "super-warrant" — to intercept "electronic communications," such as Internet activity and cell activity. This includes emails, Web history, text messages and instant messaging, and more.

The privacy group states that under the Wiretap Act, although a wiretap order is needed to intercept your electronic communications, only your oral and wire communications — such as voice communications — are covered by the statute's "exclusionary rule." If your phone calls are illegally intercepted, such as without a warrant, that evidence can't be introduced against you in a criminal trial. But, the statute will not prevent the introduction of illegally intercepted emails and text messages in court.

Section 215 of the Patriot Act, which amended the Foreign Intelligence Surveillance Act of 1978, allows the government to acquire "tangible things," so long as the FISC is aware that it is for an "authorized investigation" to "prevent terrorism" or "clandestine intelligence activities."

Also, the Communications Assistance for Law Enforcement Act (CALEA), passed in 1994, requires U.S. telecoms firms and manufacturers to ensure their equipment is able to implement government wiretaps. This not only includes traditional telephone lines and broadband connections, but also voice-over-IP (VoIP) traffic.

This should be enough for the NSA to wiretap Tier 1 companies.

Update at 9:00 a.m. ET on June 8: SSL section edited for clarification. Thanks for the feedback; we'll address this in the comments section.

Although SSL-encrypted data is still unreadable at its current destination, the NSA likely has the capabilities to break this encryption later at its datacenter, presumably using vast computational resources. This would have to be done for each session, and likely only for targets of interest since the ability to do this would be extremely computationally expensive, as both public key and symmetric keys would have to be cracked.
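
To give a rough sense of why per-session cracking would be so expensive, here is a minimal back-of-envelope sketch in Python. It covers only an exhaustive search of a symmetric session key; attacks on public keys work differently (by factoring, for RSA) and are not modeled here. The trial rate is a purely hypothetical assumption, not a known figure for any NSA system.

```python
# Back-of-envelope estimate of exhaustive (brute-force) key search time.
# GUESSES_PER_SECOND is a hypothetical assumption, not a known capability.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60
GUESSES_PER_SECOND = 10**18  # assumed: one quintillion key trials per second


def years_to_exhaust(key_bits: int, guesses_per_second: int = GUESSES_PER_SECOND) -> float:
    """Return the years needed to try every possible key of the given bit length."""
    keyspace = 2 ** key_bits
    return keyspace / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for bits in (40, 56, 128, 256):
        print(f"{bits}-bit symmetric key: {years_to_exhaust(bits):.3e} years")
```

Under that assumed rate, every extra bit doubles the work, which is why any practical attack would more plausibly target keys or certificates than try every combination.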

Alternatively, the U.S. government could issue a FISA order against the certificate authorities themselves. FISA may well negate the need for any SSL-decryption methods, whether they exist or not.

Let's explore both.

Today most Web services use 128-bit or 256-bit key encryption, both of which would be child's play for an advanced NSA supercomputer optimized specifically for cryptographic work to crack. Facebook and Google, for example, use 1024-bit RSA keys with TLS 1.1 connections for their Web servers. (Google is planning to move to a 2048-bit RSA key later this year.)

Issuing certificate authorities, as well as the National Institute of Standards and Technology (NIST), have already recommended that businesses move their Web servers to more complex encryption methods. This is because these sessions are crackable by conventional computer technology, let alone something exotic that the NSA might have in its possession.

However, the SSL encryption appliances and co-processors which must do this at the Web server end without significantly compromising application or server performance are extremely expensive. Corporations have been lax in moving to these standards. 

Cracking the encrypted SSL sessions could also be achieved through compromised certificates from the issuing certificate authority, making decryption of vast amounts of sessions that much easier.

Recommendations for digital signature encryption hash length, NIST

Having someone on the inside leaking the certificate's private key to a third party like the NSA is unlikely. It could also be discovered relatively easily. What's more likely is that the U.S. government could petition the FISC in order to seek a secret warrant against the certificate authorities. 

The chief executives of Google, Facebook, Microsoft and so on would be none the wiser because they would not have been told by the certificate authorities, as per the gagging order. It's possible that a FISC order could even prevent the certificate authority's chief executives from knowing. From here, the NSA could forge any SSL certificate. This would allow the NSA to conduct a man-in-the-middle attack without the user or the company involved even knowing.
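
One way a forged certificate of this kind could, in principle, be noticed is if the client remembers ("pins") the fingerprint of the certificate it expects and compares it on each connection. The following is a minimal sketch of that comparison only, under the assumption that the expected fingerprint was recorded earlier over a trusted channel. The function names and sample byte strings are hypothetical placeholders, not real certificates.

```python
import hashlib
import hmac


def fingerprint(der_cert: bytes) -> str:
    """Return the SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_hex: str) -> bool:
    """Check a presented certificate against a previously recorded fingerprint."""
    return hmac.compare_digest(fingerprint(der_cert), pinned_hex.lower())


if __name__ == "__main__":
    # Placeholder bytes standing in for real DER-encoded certificates.
    genuine = b"placeholder bytes for the genuine certificate"
    forged = b"placeholder bytes for a forged certificate"

    pin = fingerprint(genuine)        # recorded earlier, assumed trustworthy
    print(matches_pin(genuine, pin))  # same certificate as pinned
    print(matches_pin(forged, pin))   # different certificate, even if validly signed
```

In a real client the certificate bytes could come from Python's `ssl.SSLSocket.getpeercert(binary_form=True)`. A forged certificate signed by a trusted authority would pass normal validation but would still have a different fingerprint.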

[Update ends.]

In addition to the direct tap of the Tier 1 edge connections, the NSA is also likely making direct copies of application databases, their contents and files stored at content delivery networks (CDNs).

In many cases this is the exact same thing as the Tier 1 edge peer because this is how the content is distributed in the first place. Ultimately this would allow the NSA to reverse engineer the information as it was stored in the original application, and would not require nearly as much computational power as breaking individual SSL sessions, one at a time.

There are two main benefits to wiretapping the Tier 1 edge connections.

Firstly, the companies involved that provide Web services and applications are unaware of the data gathering because it happens outside of their networks. Secondly, these Tier 1 network providers have a far smaller employee base working in these divisions than the aforementioned companies. This allows the NSA to send its own employees in as "virtual" employees, working under the guise of these companies, while the NSA gags those companies from disclosing this fact to other staff. They could look like special contractors that only work with the special wiretapping routers.

With this technique, the number of people who actually know about the wiretapping would remain low.

Only those who actively use the PRISM system to examine the wiretap-collected data as well as a few people within the Tier 1 companies and the network equipment manufacturer that develops the wiretapping hardware would be directly complicit in this scheme.

All of those involved could be delivered gagging orders under a FISC order, such as the one Verizon received in April, which was published by The Guardian, and face prosecution and jail time if they talk.

The likelihood is that, should this theory prove true, other governments and nations may also be complicit in NSA's wiretapping scheme. The U.K. government has already been implicated with its listening station, the Government Communications Headquarters (GCHQ), reportedly using PRISM, in spite of intelligence sharing and mutual legal assistance treaties between the two countries.

Perhaps this is even happening as far as the UKUSA Agreement, in which the U.K., the U.S., Canada, Australia, and New Zealand agreed on signals intelligence sharing. Or, it could go as far as NATO countries. But the "NOFORN" classification tags on some of the leaked documents cast some doubt on this: they indicate that foreign nations, including those in the UKUSA Agreement, are not allowed to view them.

PRISM and data mining: What data is being collected?

PRISM could be considered "ECHELON 2.0," a successor to the signals intelligence gathering system shared between the countries in the UKUSA Agreement. ECHELON allegedly monitors almost anything carried over telephone wires and intercepts satellite communications.

But telephone wires are in a dying category. Datacenters and cloud services are the new norm. And amid the privacy scandal, it still isn't clear what the wiretapped data actually is.

PRISM is probably more like a Web-based application — like a search engine — than a "program" or an "operation." Behind the scenes there will be a vast big data operation that uses algorithms, natural language queries and search syntax to extrapolate the data the NSA operative needs.

Like the tabbed range of services you see across the top of a Google page, PRISM is likely just one application out of many in the NSA's toolshed.

It probably sits like a layer on top of the NSA's shared resources and infrastructure. Everything the NSA is doing — at least data pertinent to the operative or analyst — could be fed in, like a data mining application. The NSA collects all kinds of data — from phone call metadata and email content data to radio waves and satellite communications. It may just need a FISC warrant to actively access some of it.

What's also not clear is just how much data is being harvested and stored. The NSA started to capture data from Microsoft in 2007, the leaked documents say. Following on from that, Yahoo was next in 2008 and Google, Facebook and PalTalk in 2009. And so on.

But does the NSA still hold that data? Or does it wipe its storage after six months or so, once the data has proven to be no longer relevant or useful? It's possible that the agency takes daily snapshots, as per the Verizon court order, and keeps a copy for later on-the-fly searches. Or perhaps an algorithm is constantly searching for key terms or user-specific data, using only a portion of the overall space, like a surgical search rather than downloading everything.

We simply don't know, and can only hypothesize until further leaks emerge, if any.

One source speaking to ZDNet under the condition of anonymity said $20 million, the amount quoted in the leaked document as the cost of the PRISM program, wouldn't even cover the air conditioning costs and the electrical bill for the datacenter. Taking the datacenter out of the equation, $20 million would not even cover the 3-6 months' worth of data storage required to keep copies of the wiretap data, they said. The storage the NSA would be procuring is most likely the most expensive, high-speed storage taxpayer money can buy.

That said, the Digital Collection System Network (DCSNet) wiretapping system, which connects, stores, indexes and analyzes metadata — such as sender and recipient email addresses, outgoing and incoming phone numbers, and time and date details — cost the Federal Bureau of Investigation (FBI) double that at $39 million by 2007.

The NSA and what it does with telephone call data

But what about the NSA harvesting of Verizon telephony data that sparked off this entire controversy in the first place? PRISM may not have necessarily been designed for that. PRISM may well be just one application out of a suite of NSA-created applications that perform different things. While PRISM focuses entirely on application sessions in the cloud, another application may in fact focus on the recording of phone calls.

President Obama, speaking on Thursday, said: "When it comes to telephone calls, nobody is listening to your telephone calls. That's not what this program was about. As indicated, what the intelligence community is doing is looking at phone numbers and durations of calls."

It would be easy to suggest that in fact nobody is listening to phone calls, at least semantically speaking. It was likely a very carefully considered sentence. But even logistically, there are too many calls to listen to anyway. It's entirely possible that algorithms are being used to transcribe and detect certain words, but in this case that's not too important and goes a little off base.

Obama also referenced "the program." This could strictly mean the examining of telephony metadata, such as the phone numbers and the durations of calls. This could be part of a broader set of tools developed by the NSA to distribute wiretapped data to the appropriate databases, completely separate from what PRISM does, which is to wiretap Internet sessions to cloud-based applications.

In practice, not everyone working at the NSA wants to listen to actual phone calls because this is a labor and time intensive activity. All they need is the metadata which describes how the call occurred. This is enough to establish a connection between two people, and therefore "reasonable suspicion." From here, it's enough to seek a warrant and prosecute as and where necessary using the legal tools that the U.S. government has at its disposal.
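
To illustrate how metadata alone can link people, here is a minimal Python sketch that builds a contact graph from call records and checks whether two numbers are connected within a given number of "hops." The records, phone numbers and function names are all hypothetical; nothing here reflects how any NSA system actually works.

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee, duration_seconds).
CALLS = [
    ("555-0100", "555-0101", 120),
    ("555-0101", "555-0102", 45),
    ("555-0100", "555-0101", 300),
    ("555-0103", "555-0104", 60),
]


def build_contact_graph(calls):
    """Map each number to the set of numbers it has exchanged calls with."""
    graph = defaultdict(set)
    for caller, callee, _duration in calls:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def linked(graph, start, target, max_hops=2):
    """Return True if target is reachable from start within max_hops calls."""
    if start == target:
        return True
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        # Expand one hop outward, skipping numbers already visited.
        frontier = {n for node in frontier for n in graph.get(node, ()) if n not in seen}
        if target in frontier:
            return True
        seen |= frontier
    return False


if __name__ == "__main__":
    graph = build_contact_graph(CALLS)
    print(linked(graph, "555-0100", "555-0102"))  # connected via 555-0101
    print(linked(graph, "555-0100", "555-0104"))  # no chain of calls between them
```

No call content is needed for this: numbers and the fact that a call took place are enough to draw the connections.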

The NSA will have different applications doing different things. If there is the ability to record voice conversations — so long as the law allowed it, under FISA or the Wiretap Act — it would probably be in there somewhere. 

What makes PRISM tick?

While we can never be completely sure what infrastructure and software make up the PRISM system, we have some reasonable ideas. While there would certainly be some custom and exotic hardware involved, as the NSA has its own chip-making capabilities, many of the components would be the same off-the-shelf enterprise hardware and software that powers line of business applications and services at major corporations.

In some cases, they may be versions of these off-the-shelf systems on "steroids" using early or bleeding edge versions of processors and other components built by vendors under secret contracts.

IBM, for example, has already demonstrated sophisticated natural language abilities in Watson, which participated on the "Jeopardy!" game show in 2011. All the NSA would need to do is buy a tremendous amount of this equipment. Searching through wiretapped application data is a relatively simple exercise compared to having Watson participate on the fly in a trivia game show. 

The NSA supercomputer at the heart of PRISM likely resembles a gigantic Watson using advanced cryptographic co-processors, which may employ nanophotonics like those announced by IBM in 2010. It would dig through this information at incredible speeds, at petascale and exascale levels, so that only the "cream" rises to the top, using key phrases and other patterns as triggers. This would be an evolution of what already existed in ECHELON, but would have more advanced natural language processing capabilities.

And why should you believe this? It's been done before

In 2006, Room 641A became headline news. It was inside a building in San Francisco that AT&T owned, which fed in fiber optic cables from other telecoms switch buildings carrying Internet backbone traffic. Though this building was only three floors high, it had "the capability to enable surveillance and analysis of internet content on a massive scale, including both overseas and purely domestic traffic," according to Internet expert J. Scott Marcus at the time.

The "beam splitter" was used to — quite simply — split the fiber optic beam to redirect duplicate copies of all phone calls, Web traffic and email content into the clandestine room. That copied data was then handed to the NSA. According to the EFF, one expert said: "This isn't a wiretap. It's a country-tap."

It was effectively a gigantic wiretap on a huge portion of the Internet flowing in and out of the U.S. This led to an almighty class action lawsuit led by the EFF. Perhaps the more worrying part of this is that the wiretap included vast amounts of U.S. resident data, which falls in breach of FISA. Obama said on Thursday: "With respect to the Internet and emails, this does not apply to U.S. citizens and it does not apply to people living in the United States."

Given the nature of a beam splitter, "prism" therefore seems like an apt name for what appears to be a logical progression of Room 641A.

In simple terms, this could be exactly what is happening at Tier 1 edge devices, which split the beam and redirect it to equipment monitored by the NSA. Granted, in this day and age it would be a little obvious to do exactly the same thing that transpired in Room 641A. Instead, a special beam-splitting router installed at the edge connection could perform this and siphon off the data.
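
As a software analogy for the beam splitter idea, the sketch below passes every "packet" on to its normal destination while handing an identical copy to a second sink. It is a simplified illustration only: the names are hypothetical, and a real optical splitter or router works on light or frames in hardware, not on Python lists.

```python
from typing import Callable, Iterable, List

Packet = bytes


def split_stream(
    packets: Iterable[Packet],
    forward: Callable[[Packet], None],
    mirror: Callable[[Packet], None],
) -> None:
    """Deliver each packet to its destination and pass a duplicate to a mirror."""
    for packet in packets:
        forward(packet)  # normal delivery continues unchanged
        mirror(packet)   # identical copy goes to the monitoring side


if __name__ == "__main__":
    delivered: List[Packet] = []
    copied: List[Packet] = []

    traffic = [b"GET /inbox", b"POST /message", b"GET /photos"]
    split_stream(traffic, delivered.append, copied.append)

    print(delivered == traffic)  # the destination sees everything as usual
    print(copied == traffic)     # the mirror holds a full duplicate
```

The point of the analogy is that the sender and receiver see nothing unusual, because delivery is not altered in any way.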

PRISM is likely a massive application, part signals intelligence (SIGINT) and part big data, that has the active and knowing, if involuntary, participation of the U.S.' largest telecom firms, network equipment makers, supercomputer builders, and government-outsourcing professional services companies as the moving cogs in the privacy-invading machine.

Topics: Cloud, Government, Government US, Networking, Security

Talkback

39 comments
  • Very informative

    article on the subject of the NSA, PRISM and how this could work. With all the articles here on privacy, mostly focused lately on Google Glass there have been some interesting comments. Now with this new revelation of the deep reach of the NSA's power to harvest massive data sets and in real time, many of those comments seem to fade in comparison.

    Now I am waiting for the SNL skit where Obama's on a Verizon phone saying: "Can you hear me now? Because I can hear you!"
    DancesWithTrolls
  • Do de decrypt SSL ?

    The article does mention the hurdle of SSL but doesn't really explain whether they have broken SSL or not.
    Do they really have the machine from Dan Brown's Digital Fortress... and/or are quantum computers a reality? :)
    ajax123
    • Re: Do de decrypt SSL

      Some say that the NSA has "backdoor" keys to all of the major CA's that issue SSL certs, including the ones used by Facebook and Outlook.com. If that is the case, decrypting SSL is not difficult at all.
      alpbosch
      • Re: Do de decrypt SSL

        A "backdoor" at a CA won't help the NSA decrypt SSL. There is a private key used to generate the session keys that they would need to break. The private key stays on your computer and the CA merely signs the associated public key. There is nothing the CA has or knows that will reveal the private key. That is one of the strengths of SSL.
        purplesuit1
        • Child's play

          Quantum computers can decrypt anything in real time. Think massive parallel decryption to the nth degree.

          You forget that these guys are at least 50 years ahead of the private sector when it comes to technology.
          Astringent
          • you can't decrypt stuff if you don't even know if it's encrypted

            that is the premise of encryption technology.
            if natural language encryption is used, there is no supercomputer that can work out the encryption method because it won't even know it's encrypted simply because it appears unencrypted.
            Low signal to noise encryption is also next to impossible to crack.
            certainly can't crack these in REAL TIME.
            Unless the feds have some form of alien hardware, they don't have access to any exclusive technology.
            Hell, they don't even know how to prevent their own servers from getting breached and you think they are 50 years ahead and can decrypt anything in real time!
            warboat
  • SSL Analogy is bogus

    Your description of how SSL works is completely erroneous. For god sakes, you could at least Wikipedia it. The keys exist only within the client and the server and have nothing to do with the network involved. A secure handshake must occur between the two endpoints before any data can even be transmitted. Putting a middleman in between invalidates the whole design.
    ralphwiggum13
    • Exactly

      The analogy presented is completely bogus. They cannot put a middle man in between and intercept SSL traffic and make sense out of it. They say that they siphon off data at "the edge of the Tier 1 networks where it gets decrypted" but that is not where it gets decrypted. It only gets decrypted at the end nodes which are either in the host network or inside the end user's local network. Doesn't matter where outside these networks they put middle men, they wont get the raw decrypted data.

      But it is possible to intercept and strip SSL from within the host/end-user's network using something like SSLStrip. But this would be noticeable to the user because the connection would now be http instead of https and that should ring an alarm..
      vinaybharadwaj
      • SSLStrip is out

        And it wouldn't work for Google or Facebook, which use Http Strict Transport Security on a list that exists when you first install Chrome.
        charleslmunger
    • ralph, you da man

      gimme five!!!
      Randy Butler
  • SSL analogy and TIER1 connection flow

    There are two major flaws in your theory, ignoring the minor ones.

    1) There is no central/singular bottleneck where NSA can wiretap. One of the main reasons is that bigger providers like Google/Facebook and others use Anycast to place servers closest to you for best performance, in many cases enduser requests may end up being processed before even reaching Verizon

    2) The writers don't even seem to have an idea of how SSL works, which is ignominious on the part of ZDNet as it claims many editors contributed to this, this does say something about ZDNet as a whole. Now on to the point, SSL is built so as to prevent any middleman capture of unencrypted data, only the client's computer (the browser) and the server (e.g. a Web server in the Google data center) would be able to decrypt the data.
    There is a caveat, all this security is expected unless NSA has somehow managed to steal the private keys of SSL certificates used by various companies, which would allow NSA to decrypt pretty much all data. Mind you that even Verisign (and others) that issue these certificates do not have the private keys, but the private keys have to be deployed at every single one of the web servers (hundreds of thousands of them) that google uses, all NSA would need is a single google employee for example who would be willing to compromise their integrity.
    zdnet-check
    • Correct, plus

      There's also no way that the NSA is using a man-in-the-middle attack on SSL, because all the logs of IPs used to log in to Facebook and Google would contain IPs the user didn't use.
      charleslmunger
      • technically

        if they were intercepting content, they don't need to leave a trace or login.
        warboat
  • CA certificate servers are the weak links

    Posters here have noted that, given the way SSL works in theory, intercepting communications over SSL links as described in the article would be problematic at best. However, there have been more and more successful defeats of the certificate authority system, usually by simply hacking into or infecting the systems belonging to CA issuers, the most notable examples probably being Comodo and DigiNotar.

    Now the NSA hacking into such systems would be extremely illegal, but then you have the situation from last year involving the CA authority, Trustwave -- this is the UK Register's description of it:

    "Certificate Authority Trustwave has revoked a digital certificate that allowed one of its clients to issue valid certificates for any server, thereby allowing one of its customers to intercept their employees' private email communication.

    "The skeleton-key CA certificate was supplied in a tamper-proof hardware security module (HSM) designed to be used within a data loss prevention (DLP) system. DLP systems are designed to block the accidental or deliberate leaking of company secrets or confidential information.

    "Using the system, a user's browser or email client would be fooled into thinking it was talking over a secure encrypted link to Gmail, Skype or Hotmail. In reality it was talking to a server on the firm's premises that tapped into communications before relaying them to the genuine server. The DLP system needed to be able to issue different digital certificates from different services on the fly to pull off this approach, which amounts to a man-in-the-middle attack.

    "The same principle approach might be used in government monitoring activities, such as spying on its own citizens using web services such as Gmail and Skype. Evidence suggests that digital certificates issued by Netherlands-based firm DigiNotar last year were used in this way to eavesdrop on the webmail communications of Iran users last year, although no firm state-sponsored connection has been established."

    Now this sounds more like the approach the NSA *could* have taken to defeat SSL, but still....

    In 2011, the Electronic Frontier Foundation (EFF) started putting out a series of reports about the weakening of the CA system -- Google up "How secure is HTTPS today? How often is it attacked?" by Peter Eckersley.
    JustCallMeBC
    • Us geeks use self-signed certificates

      It triggers a security exception in the browser, but it removes the "auto-trusted" CAs from the mix entirely.
      cryptikonline
  • NSA monitoring

    I imagine that tapping all of this data also includes SIP traffic used by such phone providers like Vonage and Skype.
    alpbosch
  • I'll check back in a month for the update...

    quote "[Editor's note: SSL section edited for clarification, 09:00 EST, July 8]" unquote
    Til then, guess we can only wait and speculate...sorry, couldn't resist!
    wizard57m-cnet
  • Here's an update, all

    Good morning, all. We've updated the section on SSL/HTTPS security. It's still a work in progress -- of course -- and we think we may have included some inaccuracies first time around. We hope we've clarified these statements and we'll keep the piece up-to-date as transparently as we can.
    zwhittaker