NSW Data Analytics Centre privacy guidelines under fire from private sector

The NSW Data Analytics Centre has been called out by the former state Deputy Privacy Commissioner for its definition of what de-identified data means for large-scale datasets.
Written by Asha Barbaschow, Contributor

Former New South Wales Deputy Privacy Commissioner Anna Johnston has called out the state's newly formed Data Analytics Centre (DAC) for not giving government agencies a clear definition of de-identified data when it collects their data.

The DAC was first announced last year by NSW Minister for Innovation and Better Regulation Victor Dominello, with the catchphrase that data is one of the greatest assets held by government when it is not buried away in bureaucracy.

Dominello then introduced a bill that requires each of the agencies and state-owned amenities to give his department their data, with the power to direct that they hand it over within 14 days.

Johnston, who is now the director at Salinger Privacy, explained that the practice the DAC undertakes does not override the privacy statutes that affect government agencies in NSW.

"So basically the DAC cannot collect personal information from agencies if the agency handing it to the DAC would otherwise be in breach of privacy disclosure obligations," she said.

"Now the problem I've seen is that some of the requests from the DAC to other government agencies -- my client agencies -- have said, 'we have to protect privacy and you need to comply with the privacy legislation, so just give us anonymised data'.

"But what they're actually asking for is datasets with large numbers -- hundreds of thousands -- of people's records and only the names exempt, so enough detail left in there such as home addresses, of the hundreds and thousands of people to enable re-identification.

"The problem here is the IT people are not using the same language as the legal people as to what de-identifiable means, so there's a risk that agencies hand over this data thinking it's been de-identified to the point of [the Privacy Commissioner's] definition -- but, I don't think it necessarily meets that definition."

Johnston's remarks were made at the data sharing and interoperability workshop hosted by Australian Information and Privacy Commissioner Timothy Pilgrim.

During the workshop, Pilgrim said that building trust with the public is key to the challenges big data presents for organisations, including government, and highlighted that trust is further challenged by the nature of secondary uses of data.

"Part of the solution, potentially a significant part I suggest, lies in getting de-identification right," he said. "This includes ensuring that government agencies, regulators, businesses, and technology professionals have a common understanding as to what 'getting it right' means.

"At the moment, that common clarity is not evident."

Ian Oppermann, CEO and chief data scientist at the DAC, noted he mostly agreed with what Johnston had said.

"Data is not released into a vacuum; data is always surrounded by other data," he said. "It always is possible to combine whatever data is released with other data."
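The risk both Johnston and Oppermann describe can be illustrated with a toy example. The sketch below is not the DAC's method; the records, the field names, and the outside "directory" are all hypothetical. It assumes a released dataset with names removed but home addresses kept, and shows how joining it to any outside dataset that carries names and addresses can put the names back.

```python
from collections import Counter

# Hypothetical "de-identified" release: names removed, home address kept.
released = [
    {"address": "1 Example St", "birth_year": 1980, "service": "housing"},
    {"address": "2 Example St", "birth_year": 1975, "service": "health"},
    {"address": "2 Example St", "birth_year": 1975, "service": "transport"},
]

# Hypothetical outside data that still carries names.
directory = [
    {"name": "Resident A", "address": "1 Example St"},
    {"name": "Resident B", "address": "2 Example St"},
    {"name": "Resident C", "address": "3 Example St"},
]


def smallest_group(records, keys):
    """Size of the smallest group of records sharing the same values for `keys`."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return min(counts.values())


def link(records, outside, key):
    """Attach a name to each record whose `key` matches exactly one outside entry."""
    names_by_key = {}
    for row in outside:
        names_by_key.setdefault(row[key], []).append(row["name"])
    return [
        {**r, "name": names_by_key[r[key]][0]}
        for r in records
        if len(names_by_key.get(r[key], [])) == 1
    ]


# A smallest group of 1 means at least one record is unique on these fields.
k = smallest_group(released, ["address", "birth_year"])

# Every released record whose address appears once in the directory gets a name.
reidentified = link(released, directory, "address")
```

In this made-up case the smallest group has a single record, and every released row picks up a name from the directory, which is the sense in which removing names alone does not de-identify a dataset.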

Oppermann explained that one of the initiatives the DAC is working on with the information commissioner, the privacy commissioner, and a range of organisations including Data61, is to try to understand what sort of data-related services the DAC could create based on the different sorts of datasets agencies and organisations hold.

As part of that, Oppermann said the consortium will look to determine the best way to ensure the information handed over is appropriately de-identified.

"So that ultimately everybody knows what their obligations are, everybody knows what their responsibilities are, so everybody knows what side the personally identifiable information line is drawn," he said.

Previously, Oppermann outlined a Big Brother-like project he and the DAC were undertaking near Randwick, southeast of Sydney, to determine who lives where and with whom. By feeding in data such as utility connections and disconnections, and rental bonds, the DAC wants to get down to an update interval of 30 minutes.
