PRISM: Here's how the NSA wiretapped the Internet

Summary: UPDATED 5: The National Security Agency's "PRISM" program is able to collect, in real time, intelligence not limited to social networks and email accounts. But the seven tech companies accused of opening 'back doors' to the spy agency could well be proven innocent.


By tapping into the connection between the Tier 1 network and the edge connection, the NSA would be able to view and copy, in real time, the data transmitted over every single session from a user to an application, and then store and process it as needed.

You can't walk into, say, Apple's iCloud datacenter and install a wiretap. Apple would notice it. It would have to be done out of band: such as when the data leaves the datacenter and begins its journey on the way to the user sitting at home on their laptop or mobile device.

Microsoft's Hotmail service (now defunct, and rebranded as Outlook.com) was on the list of PRISM services that were being accessed by the NSA. But the NSA didn't need to seek Microsoft's permission, or even to serve it with a court order or a ruling from the FISC. Had it done so, the sheer size of the company means someone would eventually have said something to someone else, breaking the law by breaching the gagging clauses in the process, or someone would have noticed a backdoor in the systems somewhere.

And, using Hotmail as an example, if the NSA was acquiring all the data since September 2007 — the time the leaked slides show the data harvesting began — the NSA would in theory now have all of everyone's Hotmail data to date.

But that would be almost useless to the NSA. The agency wants to know about the "here and now," not "then." They want information that is immediately actionable.

There's the issue of encryption, such as an SSL connection, which offers an HTTPS secure pipe between the user's computer and the website providing the service. It's like a metal pipe that stretches end-to-end. The connection opened from your computer is encrypted, and everything that flows through it is unreadable to anyone in between.

But if the NSA were intercepting traffic and decrypting it somehow on the edge connection between the application service provider (such as Facebook, Gmail or Amazon) and the Tier 1 network, the application service provider would be unaware that this was happening.

There are a number of wiretap-related laws, and which one is used depends on the case. Of course the main one is the Wiretap Act. But it all depends on which law may sway the judge that must hand the order down to authorize such an act.

According to the Electronic Frontier Foundation (EFF), the Wiretap Act requires police, law enforcement or intelligence agencies to seek a warrant — often called a "super-warrant" — to intercept "electronic communications," such as Internet activity and cell activity. This includes emails, Web history, text messages and instant messaging, and more.

The privacy group states that under the Wiretap Act, although a wiretap order is needed to intercept your electronic communications, only your oral and wire communications — such as voice communications — are covered by the statute's "exclusionary rule." If your phone calls are illegally intercepted, such as without a warrant, that evidence can't be introduced against you in a criminal trial. But, the statute will not prevent the introduction of illegally intercepted emails and text messages in court.

Section 215 of the Patriot Act, which amended the Foreign Intelligence Surveillance Act of 1978, allows the government to acquire "tangible things," so long as the FISC is aware that it is for an "authorized investigation" to "prevent terrorism" or "clandestine intelligence activities."

Also, the Communications Assistance for Law Enforcement Act (CALEA), passed in 1994, requires U.S. telecoms firms and manufacturers to ensure their equipment is able to implement government wiretaps. This not only includes traditional telephone lines and broadband connections, but also voice-over-IP (VoIP) traffic.

This should be enough for the NSA to wiretap Tier 1 companies.

Update at 9:00 a.m. ET on June 8: SSL section edited for clarification. Thanks for the feedback; we'll address this in the comments section.

Although SSL-encrypted data is still unreadable when it is captured, the NSA likely has the capability to break this encryption later at its datacenter, presumably using vast computational resources. This would have to be done for each session, and likely only for targets of interest, since doing so would be extremely computationally expensive: both the public key and the symmetric keys would have to be cracked.
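To give a rough sense of what "extremely computationally expensive" could mean, here is a back-of-the-envelope sketch that estimates how long an exhaustive search of a symmetric key space would take. The guess rate is purely a hypothetical assumption and not a known NSA capability, and the function name is ours. RSA keys are attacked by factoring rather than exhaustive search, so this covers only the symmetric session key.

```python
# Back-of-the-envelope estimate: how long would an exhaustive search of a
# symmetric key space take? The guess rate below is a hypothetical assumption.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_search(key_bits: int, guesses_per_second: float) -> float:
    """Average years to find a key by brute force (half the key space)."""
    keyspace = 2 ** key_bits
    average_guesses = keyspace / 2
    return average_guesses / guesses_per_second / SECONDS_PER_YEAR


# Assumption: an exascale-class machine testing 10**18 keys per second.
ASSUMED_RATE = 1e18

for bits in (40, 56, 128, 256):
    print(f"{bits}-bit key: ~{years_to_search(bits, ASSUMED_RATE):.3g} years")
```

With these assumed numbers, a 56-bit search finishes in a fraction of a second while a 128-bit search runs to trillions of years, which suggests any practical attack would lean on the key exchange or on the certificates rather than on the session key alone.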

Alternatively, the U.S. government could issue a FISA order against the certificate authorities themselves. A FISA order may well negate the need for any SSL-decryption methods, whether they exist or not.

Let's explore both.

Today most Web services use 128-bit or 256-bit session encryption, with the session keys exchanged under an RSA key pair. That key exchange, rather than the session key itself, is the likelier target for an advanced NSA supercomputer optimized specifically for cryptographic work, because an RSA key offers far less strength than a symmetric key of the same length. Facebook and Google, for example, use 128-bit encryption with TLS 1.1 connections for their Web servers. (Google is planning to move to a 2048-bit RSA key later this year.)

Issuing certificate authorities, as well as the National Institute of Standards and Technology (NIST), have already recommended that businesses move their Web servers to more complex encryption methods. This is because these sessions are crackable by conventional computer technology, let alone something exotic that the NSA might have in its possession.

However, the SSL encryption appliances and co-processors which must do this at the Web server end without significantly compromising application or server performance are extremely expensive. Corporations have been lax in moving to these standards. 

Cracking the encrypted SSL sessions could also be achieved through compromised certificates from the issuing certificate authority, making decryption of vast amounts of sessions that much easier.

[Figure: Recommendations for digital signature encryption hash length, NIST]

Having someone on the inside leaking the certificate's private key to a third party like the NSA is unlikely. It could also be discovered relatively easily. What's more likely is that the U.S. government could petition the FISC in order to seek a secret warrant against the certificate authorities. 

The chief executives of Google, Facebook, Microsoft and so on would be none the wiser because they would not have been told by the certificate authorities, as per the gagging order. It's possible that a FISC order could even prevent the certificate authorities' own chief executives from knowing. With a certificate authority's signing key in hand, the NSA could forge any SSL certificate. This would allow it to conduct a man-in-the-middle attack without the user or the company involved even knowing.
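Why would a forged certificate go unnoticed? The toy model below is one way to picture it. It is a deliberately simplified sketch: HMAC stands in for the certificate authority's signature, and the verifier reuses the same secret, whereas real certificates use public-key signatures and browsers hold only the authority's public key. All names, hostnames and keys are hypothetical.

```python
# Toy model of certificate trust. HMAC stands in for the CA's signature;
# real certificates use public-key signatures (e.g. RSA), not HMAC.
import hashlib
import hmac


def sign_certificate(ca_key: bytes, hostname: str, server_public_key: bytes) -> bytes:
    """Return the CA's 'signature' binding a hostname to a public key."""
    message = hostname.encode() + b"|" + server_public_key
    return hmac.new(ca_key, message, hashlib.sha256).digest()


def browser_accepts(ca_key: bytes, hostname: str,
                    server_public_key: bytes, signature: bytes) -> bool:
    """The client checks only that a trusted CA vouched for this binding."""
    expected = sign_certificate(ca_key, hostname, server_public_key)
    return hmac.compare_digest(expected, signature)


CA_KEY = b"hypothetical-ca-signing-key"

# The genuine certificate issued to the service.
genuine_key = b"service-public-key"
genuine_sig = sign_certificate(CA_KEY, "mail.example.com", genuine_key)

# Anyone holding the CA's signing key can vouch for a different key.
interceptor_key = b"interceptor-public-key"
forged_sig = sign_certificate(CA_KEY, "mail.example.com", interceptor_key)

genuine_ok = browser_accepts(CA_KEY, "mail.example.com", genuine_key, genuine_sig)
forged_ok = browser_accepts(CA_KEY, "mail.example.com", interceptor_key, forged_sig)
untrusted_ok = browser_accepts(b"unrelated-key", "mail.example.com",
                               interceptor_key, forged_sig)
print(genuine_ok, forged_ok, untrusted_ok)
```

In this model the client accepts the genuine and the forged certificate alike, because both carry a valid signature from an authority it trusts. Only a certificate signed with some other key is rejected.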

[Update ends.]

In addition to the direct tap of the Tier 1 edge connections, the NSA is also likely making direct copies of application databases, their contents and files stored at content delivery networks (CDNs).

In many cases this is the exact same thing as the Tier 1 edge peer because this is how the content is distributed in the first place. Ultimately this would allow the NSA to reverse engineer the information as it was stored in the original application, and would require far less computational power than breaking individual SSL sessions, one at a time.

There are two main benefits to wiretapping the Tier 1 edge connections.

Firstly, the companies involved that provide Web services and applications are unaware of the data gathering because it happens outside of their networks. Secondly, these Tier 1 network providers have a far smaller employee base working in these divisions than the aforementioned companies. This allows the NSA to send its own employees in as "virtual" employees, working under the guise of these companies, while the NSA gags those companies from disclosing this fact to other staff. They could look like special contractors that only work with the special wiretapping routers.

With this technique, the number of people who actually know about the wiretapping would remain low.

Only those who actively use the PRISM system to examine the wiretap-collected data as well as a few people within the Tier 1 companies and the network equipment manufacturer that develops the wiretapping hardware would be directly complicit in this scheme.

All of those involved could be delivered gagging orders under a FISC order, such as the one Verizon received in April, which was published by The Guardian, and face prosecution and jail time if they talk.

The likelihood is that, should this theory prove true, other governments and nations may also be complicit in the NSA's wiretapping scheme. The U.K. government has already been implicated through its listening station, the Government Communications Headquarters (GCHQ), which is reportedly using PRISM in spite of intelligence sharing and mutual legal assistance treaties between the two countries.

Perhaps this is even happening as far as the UKUSA Agreement, in which the U.K., the U.S., Canada, Australia, and New Zealand agreed on signals intelligence sharing. Or, it could go as far as NATO countries. But there are some doubts over the "NOFORN" classification tags on some of the leaked documents, which indicate that foreign nations, including those in the UKUSA Agreement, are not allowed to view them.

PRISM and data mining: What data is being collected?

PRISM could be considered the "ECHELON 2.0" signals intelligence gathering system between the countries in the UKUSA Agreement. The intelligence system allegedly monitors almost anything carried over telephone wires and intercepts satellite communications.

But telephone wires are in a dying category. Datacenters and cloud services are the new norm. And amid the privacy scandal, it still isn't clear what the wiretapped data actually is.

PRISM is probably more like a Web-based application, such as a search engine, than a "program" or an "operation." Behind the scenes there will be a vast big data operation that uses algorithms, natural language queries and search syntax to extract the data the NSA operative needs.

It's also likely that PRISM is just one application out of many in the NSA's toolshed, much like one tab in the range of Google services listed across the top of the page.

It probably sits like a layer on top of the NSA's shared resources and infrastructure. Everything the NSA is doing — at least data pertinent to the operative or analyst — could be fed in, like a data mining application. The NSA collects all kinds of data — from phone call metadata and email content data to radio waves and satellite communications. It may just need a FISC warrant to actively access some of it.

What's also not clear is just how much data is being harvested and stored. The NSA started to capture data from Microsoft in 2007, the leaked documents say. Following on from that, Yahoo was next in 2008 and Google, Facebook and PalTalk in 2009. And so on.

But does the NSA still hold that data? Or does it wipe its storage after six months or so, once the data has proven to be no longer relevant or useful? It's possible that the NSA takes daily snapshots, as per the Verizon court order, and keeps a copy for later on-the-fly searches. Or perhaps an algorithm is constantly searching for key terms or user-specific data, using only a portion of the overall space, like a surgical search rather than downloading everything.
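The "surgical search" option can be pictured as a streaming filter that retains only what matches and lets everything else pass by. The sketch below is purely illustrative: the message fields, the selectors and the function name are hypothetical, and nothing here reflects how any real system is built.

```python
# Illustrative sketch only: a streaming filter that keeps matching messages
# and discards the rest. Field names and selectors are hypothetical.
from typing import Dict, Iterable, Iterator


def surgical_search(messages: Iterable[Dict[str, str]],
                    selectors: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield only messages whose sender, recipient or body matches a selector."""
    wanted = {selector.lower() for selector in selectors}
    for message in messages:
        parties = {message["sender"].lower(), message["recipient"].lower()}
        words = set(message["body"].lower().split())
        if wanted & (parties | words):
            yield message


stream = [
    {"sender": "alice@example.com", "recipient": "bob@example.com",
     "body": "lunch on friday"},
    {"sender": "carol@example.com", "recipient": "dave@example.com",
     "body": "the shipment arrives tonight"},
    {"sender": "erin@example.com", "recipient": "alice@example.com",
     "body": "see you soon"},
]

kept = list(surgical_search(stream, ["dave@example.com", "friday"]))
print(len(kept), "of", len(stream), "messages retained")
```

Because the filter works on one message at a time, storage grows only with the number of matches. The snapshot approach, by contrast, would have to keep the whole stream.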

We simply don't know, and can only hypothesize until further leaks emerge, if any.

One source speaking to ZDNet under the condition of anonymity said $20 million (the amount quoted by the NSA in the leaked document that covers the cost of the PRISM program) wouldn't even cover the air conditioning costs and the electrical bill for the datacenter. Taking the datacenter out of the equation, $20 million would not even cover three to six months' worth of the data storage required to keep copies of the wiretap data, they said. The storage the NSA would be procuring is most likely the most expensive, high-speed storage taxpayer money can buy.

That said, the Digital Collection System Network (DCSNet) wiretapping system, which connects, stores, indexes and analyzes metadata (such as sender and recipient email addresses, outgoing and incoming phone numbers, and time and date details), had cost the Federal Bureau of Investigation (FBI) nearly double that, at $39 million, by 2007.

The NSA and what it does with telephone call data

But what about the NSA harvesting of Verizon telephony data that sparked off this entire controversy in the first place? PRISM may not have necessarily been designed for that. PRISM may well be just one application out of a suite of NSA-created applications that perform different things. While PRISM focuses entirely on application sessions in the cloud, another application may in fact focus on the recording of phone calls.

President Obama, speaking on Thursday, said: "When it comes to telephone calls, nobody is listening to your telephone calls. That's not what this program was about. As indicated, what the intelligence community is doing is looking at phone numbers and durations of calls."

It would be easy to suggest that in fact nobody is listening to phone calls, at least semantically speaking. It was likely a very carefully considered sentence. But even logistically, there are too many calls to listen to anyway. It's entirely possible that algorithms are being used to transcribe and detect certain words, but in this case that's not too important and goes a little off base.

Obama also referenced "the program." This could strictly mean the examining of telephony metadata, such as the phone numbers and the durations of calls. This could be part of a broader set of tools developed by the NSA to distribute wiretapped data to the appropriate databases, completely separate from what PRISM does, which is to wiretap Internet sessions to cloud-based applications.

In practice, not everyone working at the NSA wants to listen to actual phone calls because this is a labor and time intensive activity. All they need is the metadata which describes how the call occurred. This is enough to establish a connection between two people, and therefore "reasonable suspicion." From here, it's enough to seek a warrant and prosecute as and where necessary using the legal tools that the U.S. government has at its disposal.
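To see how metadata alone can establish a connection between people, consider the minimal sketch below. It builds a contact graph from call records and lists who is reachable within a given number of hops. The record layout, phone numbers and function names are all hypothetical.

```python
# Minimal sketch: link people using call metadata only (no call content).
# The record layout and phone numbers are hypothetical.
from collections import defaultdict, deque


def build_contact_graph(call_records):
    """Map each number to the set of numbers it exchanged calls with."""
    graph = defaultdict(set)
    for caller, callee, _duration_seconds in call_records:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def contacts_within(graph, start, max_hops):
    """Breadth-first search returning {number: hops} within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        number = queue.popleft()
        if seen[number] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(number, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[number] + 1
                queue.append(neighbor)
    del seen[start]
    return seen


records = [
    ("555-0100", "555-0101", 62),
    ("555-0101", "555-0102", 340),
    ("555-0102", "555-0103", 15),
]

graph = build_contact_graph(records)
print(contacts_within(graph, "555-0100", 2))
```

Starting from a single number, two hops already reach a contact of a contact, without anyone having listened to a single call.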

The NSA will have different applications doing different things. If there is the ability to record voice conversations — so long as the law allowed it, under FISA or the Wiretap Act — it would probably be in there somewhere. 

What makes PRISM tick?

While we can never be completely sure what infrastructure and software make up the PRISM system, we have some reasonable ideas. While there would certainly be some custom and exotic hardware involved (the NSA has its own chip-making capabilities), many of the components would be the same off-the-shelf enterprise hardware and software that powers line of business applications and services at major corporations.

In some cases, they may be versions of these off-the-shelf systems on "steroids" using early or bleeding edge versions of processors and other components built by vendors under secret contracts.

IBM, for example, has already demonstrated sophisticated natural language abilities in Watson, which participated on the "Jeopardy!" game show in 2011. All the NSA would need to do is buy a tremendous amount of this equipment. Searching through wiretapped application data is a relatively simple exercise compared to having Watson participate on the fly in a trivia game show. 

The NSA supercomputer at the heart of PRISM likely resembles a gigantic Watson using advanced cryptographic co-processors, which may employ nanophotonics like those announced by IBM in 2010. It would dig through this information at incredible speeds, at petascale and exascale levels, so that only the "cream" rises to the top, using key phrases and other patterns as triggers. This would be an evolution of what already existed in ECHELON, but would have more advanced natural language processing capabilities.
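The idea of triggers letting the "cream" rise to the top can be sketched as a weighted scoring pass followed by a ranking. The patterns, weights and function names below are hypothetical placeholders, chosen only to show the mechanism at a trivial scale.

```python
# Sketch of trigger-based ranking: score each text by weighted pattern hits,
# then keep the highest scorers. Patterns and weights are hypothetical.
import heapq
import re

TRIGGERS = [
    (re.compile(r"\bwire transfer\b", re.IGNORECASE), 3),
    (re.compile(r"\bmeeting\b", re.IGNORECASE), 1),
]


def score(text):
    """Sum of weight multiplied by number of matches for every trigger."""
    return sum(weight * len(pattern.findall(text)) for pattern, weight in TRIGGERS)


def cream(documents, top_n):
    """Return up to top_n (score, text) pairs with a nonzero score, best first."""
    hits = [(score(text), text) for text in documents]
    hits = [item for item in hits if item[0] > 0]
    return heapq.nlargest(top_n, hits)


documents = [
    "Lunch meeting at noon",
    "Confirm the wire transfer before the meeting",
    "The weather is nice today",
]

for points, text in cream(documents, 2):
    print(points, text)
```

A production system would presumably use far richer language analysis than regular expressions, but the shape is the same: score everything cheaply and surface only the few items that score highest.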

And why should you believe this? It's been done before

In 2006, Room 641A became headline news. It was inside a building in San Francisco that AT&T owned, which fed in fiber optic cables from other telecoms switch buildings carrying Internet backbone traffic. Though this building was only three floors high, it had "the capability to enable surveillance and analysis of internet content on a massive scale, including both overseas and purely domestic traffic," according to Internet expert J. Scott Marcus at the time.

The "beam splitter" was used to — quite simply — split the fiber optic beam to redirect duplicate copies of all phone calls, Web traffic and email content into the clandestine room. That copied data was then handed to the NSA. According to the EFF, one expert said: "This isn't a wiretap. It's a country-tap."

It was effectively a gigantic wiretap on a huge portion of the Internet flowing in and out of the U.S. This led to an almighty class action lawsuit led by the EFF. Perhaps the more worrying part of this is that the wiretap included vast amounts of U.S. resident data, which falls in breach of FISA. Obama said on Thursday: "With respect to the Internet and emails, this does not apply to U.S. citizens and it does not apply to people living in the United States."

Given the nature of a beam splitter (a "prism"), the name therefore seems apt for what appears to be a logical progression of Room 641A.

In simple terms, this could be exactly what is happening at Tier 1 edge devices, which split the beam and redirect it to equipment monitored by the NSA. Granted, in this day and age it would be a little obvious to do exactly the same thing that transpired in Room 641A. Instead, a special beam-splitting router installed at the edge connection could perform this and siphon off the data.
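In software terms, a splitter behaves like a "tee": every packet continues to its destination untouched while a duplicate goes to a second consumer. The following is a minimal sketch of that idea, with hypothetical names and made-up packet contents. It is not a description of any actual router.

```python
# Minimal "tee" sketch: forward every packet unchanged and hand a duplicate
# to a monitoring callback. Names and packet contents are hypothetical.
from typing import Callable, Iterable, Iterator, List


def splitter(packets: Iterable[bytes],
             tap: Callable[[bytes], None]) -> Iterator[bytes]:
    """Pass each packet through while giving the tap its own copy."""
    for packet in packets:
        tap(bytes(packet))  # duplicate goes to the monitoring side
        yield packet        # original continues to its destination


captured: List[bytes] = []
traffic = [b"GET /inbox HTTP/1.1", b"HTTP/1.1 200 OK", b"payload"]

delivered = list(splitter(traffic, captured.append))
print(delivered == traffic, captured == traffic)
```

Neither endpoint sees any difference in what it sends or receives, which is the point the article makes about the service providers being unaware of the tap.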

PRISM is likely a massive application, part signals intelligence (SIGINT) and part big data, that has the active and knowing, if involuntary, participation of the largest U.S. telecom firms, network equipment makers, supercomputer builders, and government-outsourcing professional services companies as the moving cogs in the privacy-invading machine.



Zack Whittaker is a writer-editor for ZDNet, and sister sites CNET and CBS News. He is based in the New York newsroom. His PGP key is: EB6CEEA5.
