Brandis swings his golden hammer, misses mark

Criminalising the re-identification of de-identified government data will hinder legitimate researchers and do nothing to improve citizens' privacy.
Written by Stilgherrian, Contributor

It's called the law of the instrument, or the golden hammer, and it's been attributed to everyone from philosopher Abraham Kaplan, to psychologist Abraham Maslow, to novelist Mark Twain. "I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail," is how Maslow put it in 1966.

Australia's favourite Attorney-General, Senator George Brandis QC, is swinging his golden hammer hard this week, in the process illustrating just how far behind the pace his thinking is.

Just weeks ago, the government published a massive dataset of medical transactions, supposedly made anonymous through "confidentiality measures including encryption, perturbation, and exclusion of rare events".

But it didn't take long for researchers to re-identify all of the medical service provider identification numbers -- although not the patient ID numbers. The dataset was pulled, and the Attorney-General squirted out a press release announcing the criminalisation of all re-identification of published government datasets.

The researchers are the University of Melbourne's Dr Benjamin Rubinstein, Dr Vanessa Teague, and, as Teague told ZDNet, "my postdoc, Dr Chris Culnane, who did all the real work". Their write-up is titled "Understanding the maths is crucial for protecting privacy".

It took some creative thinking, but I mean no disservice to the researchers when I say that the knowledge and skills are more widespread than the Attorney-General may realise.

"The encryption algorithm was described online at data.gov.au. That was the right thing to do, because it made it possible for us to identify weaknesses in the encryption method. Leaving out some of the algorithmic details didn't keep the data secure -- if we can reverse-engineer the details in a few days, then there is a risk that others could do so too," they wrote.

"Security through obscurity doesn't work -- keeping the algorithm secret wouldn't have made the encryption secure, it just would have taken longer for security researchers to identify the problem. It is much better for such problems to be found and addressed than to remain unnoticed."

Even newcomers to the information security realm will recognise there the "many eyes" argument.

The researchers don't explain precisely what they did, noting only that "neither the exact algorithm nor the details of subsequent processing were described in detail" but that "we could guess those details for provider IDs and use the dataset to check our hypothesis".

The researchers were able to decrypt "every service provider ID" in the Medicare Benefits Schedule (MBS) data -- although, as I say, they did not attack the different scheme used for patient ID.

This doesn't surprise me. Even Brandis is aware that "with advances of technology, methods that were sufficient to de-identify data in the past may become susceptible to re-identification in the future". Data re-identification is a hot research topic these days, what with the big data n'all.

Brandis has responded with a classic golden hammer. As a lawmaker, he can make laws. As Attorney-General, he can make things a crime. Data re-identification led to a bad situation, and embarrassment. So make it a crime.

It's quick. It's easy, provided you don't bother thinking too much as you draft the legislation. And it gives the impression you're solving the problem.

But of course it won't.

As Brandis said in his press release, "Our ability to deliver better policies and to solve many of the great challenges of our time rests on the effective sharing and analysis of data."

That means making good, technically informed decisions about what data to publish, how to de-identify it, and how to handle problems as they arise.

Criminalising honest but curious researchers will mean that the only ones researching re-identification will be criminals. A sub-optimal outcome, as they say. The researchers would agree.

"It wouldn't make the encryption any more correct, but it would make it harder for well-intentioned Australian security researchers to identify and fix problems," Teague told ZDNet.

"The better policy would be to make the algorithms open, long in advance, and encourage people to find, explain, and correct weaknesses."

But of course it's much harder to draft legislation that enables effective, efficient, and agile government processes than it is to bring down the crime hammer.

And speaking of agile...

Last December I expressed the view that the Attorney-General's office is far from ready for nimble government. These events have the same whiff of wobbliness about them.

Surely the government had thought through the implications of de-identification and re-identification before putting such a massive medical dataset online? Surely they'd kept themselves informed about advances in re-identification? Surely they had a better process ready, rather than a rush to make people criminals?

Surely I jest.
