It's time that 'metadata' met an end

Mandatory data retention is back on the political agenda, and Australian law enforcement agencies are presenting a new round of ambit claims. Watch out, meanings are being twisted.
Written by Stilgherrian, Contributor

"I think the journalism profession should push back on the use of the term 'metadata' by surveillance agencies. It's data. It's private," tweeted high-profile network engineer Mark Newton last Friday. Those who use the term are maintaining a fiction, he added; namely, the notion that some kinds of data about an individual's communication and online activities are less deserving of protection than others, and don't need to be shielded from unwarranted prying by police and spooks by the requirement for, erm, a warrant.

Newton is right. So here's my contribution to that push-back.

The word "metadata" is supposed to refer to any data associated with a communication, other than the "content" of the communication itself. This distinction is intended to parallel the distinction made with telephone calls, where police need a warrant to access the conversation itself through a "lawful intercept" (or, as Americans call it, a "wiretap"), but not to access any information about the call that was recorded by the telco — such as the time the call was made, its duration, and the number called.

That distinction is down to an accident of technological history. Listening to a telephone call requires an intrusive act in real time, and it has to be organised in advance or the conversation is lost. The other information was being recorded for billing purposes, and kept long enough to resolve any customer billing disputes. Providing that information to the police was seen as no big deal.

Things are different on the internet. Email, for example, continues to exist even after it's been sent. The same goes for chat logs and file transfers. Routing information exists within the communication itself — think of email headers. And while many activities are logged, those logs are kept to investigate technical faults, not for billing — so they can be thrown out much sooner.
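To make the email-header point concrete, here is a minimal sketch using Python's standard `email` module. The message, addresses, and host names are invented for illustration, and `split_headers_and_body` is a hypothetical helper, not anything an ISP or agency is known to run. It simply shows that the routing information and the content arrive as one blob, and that separating them is a parsing choice rather than a physical boundary.

```python
from email import message_from_string

# Hypothetical message, invented for illustration only.
RAW_MESSAGE = """\
Received: from mail.example.org (mail.example.org [203.0.113.7])
\tby mx.example.net with ESMTP; Mon, 10 Mar 2014 09:15:02 +1100
Received: from laptop.home (unknown [198.51.100.23])
\tby mail.example.org with ESMTPSA; Mon, 10 Mar 2014 09:15:00 +1100
From: alice@example.org
To: clinic@example.net
Subject: Appointment
Date: Mon, 10 Mar 2014 09:14:58 +1100

See you Thursday.
"""


def split_headers_and_body(raw):
    """Split one raw message into routing headers and body text."""
    msg = message_from_string(raw)
    headers = {
        "from": msg["From"],
        "to": msg["To"],
        "date": msg["Date"],
        # Each mail server that handled the message adds a Received line.
        "hops": msg.get_all("Received", []),
    }
    return headers, msg.get_payload()


headers, body = split_headers_and_body(RAW_MESSAGE)
```

Even with the body thrown away, `headers` still says who wrote to whom, when, and from which network address. Note that the Subject line also sits in the headers, which shows how blurry the supposed line between "metadata" and "content" already is.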

Mandatory data retention is simply the idea that all of that log data be kept, possibly for years, on the off chance that it might, perhaps, maybe one day be useful for investigating a crime — not just in our own country, but in any country that's signatory to the Council of Europe Convention on Cybercrime.

In 2011, attorneys-general from the Quintet nations — the law-officer counterpart of the "Five Eyes" intelligence-gathering nations of the US, the UK, Australia, Canada, and New Zealand — agreed to persuade the whole planet to adopt the convention. Data retention is at the heart of that treaty.

Australia's favourite attorney-general, Senator George Brandis, is pushing the data retention barrow because that's the plan globally — or at least amongst the English-speaking nations that think they run the joint. He'll push it even harder than his predecessors, because he intends to bring a "strong national security focus" to his office, as he said soon after the election.

Data retention supporters argue that because metadata isn't the content of the communication, it doesn't invade people's privacy to anywhere near the same extent — just as with telephone calls. In December, Brandis backed Prime Minister Tony Abbott's characterisation of metadata as "essentially billing data". It's just a few innocuous numbers.

Attorney-General Brandis is wrong. The police want this data precisely because it can reveal so much. Otherwise, why would they want to have it? And it's not even used for billing.

Attorney-General Brandis is clearly either ignorant or wilfully disingenuous. Doesn't he know that the entire commercial economy of the internet is built on the ability to construct or infer all manner of detailed information about people's personal lives by aggregating and data mining the myriad digital footprints they leave behind?

Researchers at Stanford University, for example, found that they could predict people's medical conditions, gun ownership, hobbies, relationships, and religious views simply by looking at the metadata.

"There's a participant in our study who had an early morning call with someone we were able to identify as her sister. And then, a couple of days later, had some calls with the local Planned Parenthood organisation, and then a couple of weeks after had some more calls, and then about a month after had a final call," Jonathan Mayer, the graduate student running the research, told ABC Radio.

"I think it raises the plausible inference that that participant had an abortion, and that in and of itself, even if it's not accurate, should give rise to some privacy concerns."

Another example? "We had a participant who in short order had calls with a lumber yard and a locksmith and a hydroponics dealer and a bong shop. Again, don't need a PhD in computer science to have some sense of what could be going on there."
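The kind of inference Mayer describes takes very little machinery. The following is a rough sketch in Python, not the Stanford researchers' actual method: the call records, participant labels, and both function names are hypothetical, and it assumes each dialled number has already been matched to a type of business through some directory lookup. No call content appears anywhere in it.

```python
from collections import defaultdict

# Hypothetical call records, invented for illustration:
# (participant, type of business called, call duration in seconds)
CALLS = [
    ("p1", "lumber yard", 120),
    ("p1", "locksmith", 95),
    ("p1", "hydroponics dealer", 240),
    ("p1", "bong shop", 60),
    ("p2", "pizza shop", 45),
]


def categories_by_participant(calls):
    """Collect the set of business types each participant has called."""
    profile = defaultdict(set)
    for participant, category, _duration in calls:
        profile[participant].add(category)
    return dict(profile)


def matches_pattern(profile, participant, pattern):
    """True if the participant has called every business type in pattern."""
    return pattern <= profile.get(participant, set())


profiles = categories_by_participant(CALLS)
suspicious = {"lumber yard", "locksmith", "hydroponics dealer", "bong shop"}
flagged = [p for p in profiles if matches_pattern(profiles, p, suspicious)]
```

A dozen lines of grouping and a set comparison are enough to single out one participant. Whether the inference is correct is beside the point; as Mayer says, the fact that it can be drawn at all is the privacy problem.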

"Data retention is one of the most important political issues relating to our use of the internet now, and as far into the future as you care to imagine," I said on a Patch Monday podcast in October 2012, the last time it was on the political agenda.

Well, it's back on the political agenda right now, because the Senate is reviewing the Telecommunications Interception and Access Act. Police services are starting to issue a new round of ambit claims. The Northern Territory Police even want everyone's web browser history.

In law enforcement, as in every other part of society, the internet is changing information flows — and because information is power, that's triggering a power struggle. Cops and spooks need enough power to do their jobs effectively, sure, but not so much that it intrudes on people's quiet enjoyment of life, or leads to oppression.

Since Attorney-General Brandis is a history buff, he'll probably remember what 17th-century French clergyman and statesman Cardinal Richelieu said, or is supposed to have said. "If one would give me six lines written by the hand of the most honest man, I would find something in them to have him hanged." Under these 21st-century data retention proposals, Richelieu would have 6 terabytes of big data, plus the data mining tools to help construct the incriminating narrative. Is that what we want?

We'll need an intelligent and informed debate to find the new power balance. We won't get there if Brandis, and others of his ilk, continue the metadata fiction, whether they be fools or knaves. It's all data. It's all private.
