Risk vs. Opportunity: Data use and availability in Australia

Australian technology entrepreneurs and government officials weigh in on the Productivity Commission's Data Availability and Use draft report.
Written by Tas Bindi, Contributor

In a report released in early November, the Australian Productivity Commission called for a new Data Sharing and Release Act, a National Data Custodian body, and for individuals to have greater control of the data that is collected about them by both the public and private sectors.

The Commission's Data Availability and Use draft report sparked mixed reactions. Some praised the Commission for opening up a national dialogue about data sharing as a critical first step towards building a data-driven economy, while others criticised the Commission for not adequately addressing the risks associated with data breaches.

Kat Lane, chair of the Australian Privacy Foundation, said a productive discussion about data sharing and use is impossible until Australians have a statutory right to redress, as well as compensation when their data is misused or released without explicit consent. She also said the Commission needs to introduce regulations like mandatory data breach notifications and a clear dispute resolution mechanism.

"The report is one big advertisement for data sharing with no real substance," Lane told ZDNet. "It downplays data breaches that seem to be regular occurrences. We know already that data breaches can lead to people committing suicide or being prosecuted in countries that outlaw homosexuality or adultery."

While the Productivity Commission acknowledged that Australians should have control of their personal information including the ability to opt-out of data collecting activities, Lane said the report doesn't delve into how this control would be achieved.

"There are real problems with accessing your own data. It's a bureaucratic nightmare," said Lane. "We need to have genuine and meaningful control over our own information which we currently don't have -- I don't know what is happening with my data, I can't access it easily and I've certainly tried. [The Commission] hasn't put enough of a blueprint for us to have any confidence that that would occur."

The Commission also acknowledged that trust between consumers and the public and private sectors is critical to any exploration of data use; however, Lane insisted trust cannot be built while the government seems determined to conduct societal-level surveillance.

"We're supposed to trust the government, yet in the meantime they've got metadata legislation that clearly involves tracking and surveillance," Lane said. "If you're going to build trust with the public, the first thing they have to be certain of is if you anonymise their data, it can never be re-identified, which has happened very recently."

Lane also lamented that the report fails to address the August Census debacle where the Australian Bureau of Statistics (ABS) chose to take down the Census site to ensure the security of citizen data following multiple minor "foreign-sourced" DDoS attacks.


"The online Census system was hosted by IBM under contract to the ABS, and the DDoS attack should not have been able to disrupt the system," the ABS said. "Despite extensive planning and preparation by the ABS for the 2016 Census, this risk was not adequately addressed by IBM and the ABS will be more comprehensive in its management of risk in the future.

"We will have the advantage of all the learnings from the new approach first adopted in the 2016 Census, and desirably have five clear years to plan and implement a successful 2021 Census."

Lane said the report should have also addressed a recent case where encrypted Medicare Benefits Schedule and Pharmaceutical Benefits Scheme data published by the Health Department was found to be compromised.

The department was alerted to the security lapse by Melbourne University Department of Computing and Information researcher Dr Vanessa Teague, who found she was able to decrypt some service provider ID numbers in a dataset being used by her and several of her colleagues. The department immediately removed the dataset from the website.

The encryption algorithm was described online at data.gov.au, which Melbourne University's research team said was the right thing to do as it made it possible for them to identify weaknesses in the encryption method.

"Leaving out some of the algorithmic details didn't keep the data secure -- if we can reverse-engineer the details in a few days, then there is a risk that others could do so too," Teague's team said. "Security through obscurity doesn't work -- keeping the algorithm secret wouldn't have made the encryption secure, it just would have taken longer for security researchers to identify the problem."

In October, a 1.74GB MySQL database backup containing 1.3 million rows and 647 different tables from the Australian Red Cross was found to be publicly available, security researcher Troy Hunt revealed. The data came from an online donor application form that contains details including name, gender, address, email, phone number, date of birth, country of birth, blood type, and other donation-related data, as well as appointments donors had made.

Hunt, who has since deleted his copy of the Red Cross data, highlighted the personal nature of the information available, as people had provided answers to the donor eligibility questions.

"Each donor is asked questions such as whether or not they're on antibiotics, if they're under or over weight, and if they've had any recent surgical procedures. They're personal questions, no doubt, but one of them particularly stands out in terms of sensitivity: 'In the last 12 months, have you engaged in at-risk sexual behaviour?'" he said.

"Clearly that is a deeply personal, private attribute that could be enormously sensitive if the answer is in the affirmative. Because there are many eligibility questions for each donor, there are a total of 7,343,537 answers in the system and naturally, many of these relate to the question of at-risk sexual behaviour."

In light of these incidents, Lane said a whole range of measures will need to be introduced before the public is comfortable with the Productivity Commission's proposal.

Shannon Sedgwick, CEO of Global Media Risk, a company that assists startups and larger firms with risk management and cybersecurity, agreed with this view, saying that in furthering the ideas presented in the Productivity Commission's draft report, there will need to be greater transparency around what security measures will be introduced.

"To establish accessible public and private data sets, data owners would need to have a great deal of trust in the security and control of their data and the screening procedures for those termed 'a trusted user'," said Sedgwick.

Lisa Schutz, founder of Verifier, an Australian-founded platform that allows consumers to share data between organisations without compromising their credentials, also agreed with Lane's view that the risks associated with data sharing cannot be solved without big investment and enforced standards.

"We can't allow too much discretion on personally identifiable data because frankly people are not well educated enough about the risks they are taking," Schutz said. "The biggest risk ... is the handling of de-identified data. There are plenty of examples of where data can get re-identified."

She added that the idea of eliminating the requirement to delete linked data sets created for research purposes will create "plenty of unexploded land mines".

"Just think of the potential harm of the blood donor data breach -- what happens if 10 percent of new donors choose not to donate in the future? If you think about the process of blood donation, there is arguably no need for sexual history and personal identifying information to be stored, linked, and potentially available for discovery."

Former New South Wales Deputy Privacy Commissioner Anna Johnston has called out the state's newly formed Data Analytics Centre (DAC) for not providing a clear definition of de-identified data to government agencies when collecting its data.

The DAC was first announced last year by NSW Minister for Innovation and Better Regulation Victor Dominello with the catchphrase of data being one of the greatest assets held by government when it is not buried away in bureaucracy.

Dominello introduced a bill that requires each of the agencies and state-owned amenities to give his department their data, with the power to demand that they hand it over within 14 days.

While the practice does not override the privacy statutes that affect government agencies in NSW, Johnston said technical people do not use the same language as lawyers to describe what de-identifiable means.

"So there's a risk that agencies hand over this data thinking it's been de-identified to the point of [the Privacy Commissioner's] definition -- but, I don't think it necessarily meets that definition," Johnston said at a data sharing and interoperability workshop hosted by Australian Information and Privacy Commissioner Timothy Pilgrim in Canberra.

The Australian Parliament is, however, currently considering laws that criminalise the re-identification of de-identified datasets that are collected and published by the Commonwealth.

"With advances in technology, methods that were sufficient to de-identify data in the past may become susceptible to re-identification in the future," the explanatory memorandum [PDF] to the Privacy Act amendment states. "The Bill is intended to act as a deterrent against attempts to re-identify de-identified personal information in government datasets and introduces criminal and civil penalties for the prohibited conduct."

For his part, Pilgrim said that by and large, people do want their personal information to work for them, provided that they know about it. He also noted that when there is transparency in how personal information is used, citizens should feel a sense of clarity, choice, and confidence that their privacy rights are being respected.

Schutz praised the concept of a comprehensive right to machine-readable access to consumer data that is mentioned in the Data Availability and Use report.

"This prevents us having to go down the path of recommending open banking API mandates like the UK or continuing to use black letter law to nominate fields of sharing, as done in the Comprehensive Credit Reporting legislation," Schutz said. "Instead, they are recommending an environment where consumer controlled data sharing is the default."

Speaking at this year's StartCon in Sydney, Atlassian co-founder Mike Cannon-Brookes said that the lack of a banking API in Australia is "crazy" and that because banks have access to so much data, it's not easy for consumers to switch vendors.

"We should be able to import our bank accounts just like we can import our mobile phone numbers," he said.

Cannon-Brookes reminded the audience that 10 years ago, telecommunications providers were reluctant to import phone numbers, claiming that it was too difficult -- but now it's easy to transfer phone numbers when switching telcos.

"I think the same thing should happen with banking," Cannon-Brookes said. "Your banking transaction history should be your data that you can take with you if you choose to switch [banks] or if you choose to allow a fintech company to access that to provide some sort of service.

"It's quite obvious strategically why the banks have no interest in that being the case."

Yanir Yakutiel, CEO of Sail, told ZDNet that banks like to use excuses like privacy concerns and vested interests to prevent the open access to data sources that could benefit Australian consumers.

"I believe that it's the role of regulators to make it crystal clear that information relating to particular persons or business belongs to them," Yakutiel said. "And information that has been collated by various companies, most notably those protected by anti-competitive frameworks like the banks, should be forced to open the access to the relevant information and to treat it as a public utility."

Recently in the Review of the Four Major Banks: First Report, the House of Representatives Standing Committee on Economics recommended that banks be forced to provide open access to customer and small business data by July 2018 for competing banks, startups, and other financial institutions.

The committee suggested that the Australian Securities and Investments Commission (ASIC) be charged with developing a binding framework to facilitate this sharing of data, making use of APIs, and ensuring that appropriate privacy safeguards are in place to allow such a practice.

"Increased access to financial sector data, as noted by the Productivity Commission, should also intensify competition in the financial sector. This is because markets work best when customers are informed. At present, banks, not consumers, hold the data. This gives banks a significant degree of power," the report [PDF] states.

While the Productivity Commission has not articulated how it would enable consumers to have access to and control over their data, Schutz believes the discussion will open up opportunities for innovation all over the economy.


Paul Chan, founder of Pureprofile, a publicly-listed online profile marketing and insights technology company, agreed with this view, saying that the Commission is "not trying to paint and define the business models of the future".

"They're trying to paint practical opportunity," Chan said. "The same thing happened to privacy about 10 years ago. The value of sharing and being connected on platforms like Facebook -- the whole social revolution -- threw privacy out the window. People started taking photos inside their house and photos of what they're wearing and sharing it on social media. It became very logical for people to do that and natural rules started to form around these activities."

Chan advised that we keep an open mind about the value that data sharing will bring to consumers' lives and not see the Productivity Commission's proposal as black and white.

"No-one would have bought into [social media] if you tried to explain it to them years ago. They had to see it, experience it, and feel it. All of a sudden we're connected to people better and our lives are different. It's very hard to take that away. It's unstoppable," Chan said. "Right now, data needs to find a home and the government is proposing that the home, from an ownership perspective, should be accessible to the consumer. How can that not make sense?"

Katryna Dow, founder and CEO of Meeco, an Australian-founded platform that enables data to flow compliantly and consensually between an individual and organisation, said the fact that the Commission has opened up a nationwide dialogue ahead of any regulatory intervention is "very positive" as it allows us the opportunity to shape regulation.

"If we get the right kind of stakeholder engagement now, we may be able to find the right balance between privacy, transparency, permission, consent, and the opportunity to create mutual value between individuals and organisations," said Dow.

Dow pointed out, however, that there needs to be greater generational awareness of the "extreme value" of people's identities and personal information.

There is also very limited understanding of how access to our personal information by organisations impacts us, Dow said.

"Most people don't understand that it's not the single points of collection that are concerning -- such as your license plate being photographed at a car park," said Dow. "It's the fact that where my car is, together with the information around my social footprint, together with the information my bank has, together with my credit card transactions, together with a host of other things, that's concerning. Aggregating personal data and bringing that information together into a dossier is very simple for organisations to do.

"We don't have a widespread understanding of how that may impact our ability to apply for things like credit and other services, as well as the pricing of services based on information like the browser we're using. There are far-reaching economic and service access issues."


Michael Jankie, founder and CEO of PoweredLocal, which provides free social Wi-Fi, marketing opportunities, and data gathering for small businesses, said we are already "willingly" handing over our data all the time when we apply to be a part of loyalty programs, credit cards, insurance, and so forth.

"Where the risk lies in this framework is who has access to identifiable and confidential data," Jankie said. "A part of me feels like some government agencies are getting some new powers slipped into this program. Once a human is involved, the use of highly identifiable information will always be at risk of a breach or misuse.

"My feeling is that identifiable and confidential data should be split from de-identified or non-personal data. Should the police know the statistics of how many people send text messages while driving? Yes. That can be used to better target enforcement and education.

"Let's just hope that going to the footy doesn't require us to tick a box upon entry confirming terms and conditions to collecting private data on us."

Consent vs. explicit informed consent

Dow stressed that although we seem to be "willingly" providing our information, consent is not the same as explicit informed consent. This is where a big opportunity for consumers and businesses lies.

When consumers are actively driving the flow of their personal information, then data collection and use moves from being creepy to cool, Dow said.

"If my bank or airline is working with me to help me understand the information it requires and is using it to give me an optimal experience, then I might consider that as being really cool," said Dow.


One of the most obvious benefits of consumers being able to access and control their data is efficiency, Dow said.

"Your personal information -- such as your employment status, salary, citizenship, your daily commute, the balance of your bank account, and so forth -- stay entirely contextual to you and they remain the same regardless of whether or not you're connecting with a bank or an insurance company," said Dow. "When you consider the number of forms we have to fill out, there is an opportunity for us to cut down on the reuse of our information."

Nathan Kinch, head of experience and labs at Meeco, said providing consumers with greater control over their data also means data accuracy is maintained, which translates to cost savings for business.

"The only way you can make sure data is accurate is by allowing people to have immediate access to it all the time. How would that work?" Lane questioned.

Meeco's answer to this is a platform that allows individuals to save, share, and sync their personal information across their devices. The platform, which was founded in Australia, plugs directly into government and enterprise systems, so consumers have direct access to and control over the information that third parties hold.

"A lot of the data is quite inaccurate. There's this big trend -- particularly amongst millennials -- to falsify the information they enter into online products and services. Organisations then pay to acquire this information and it's just fundamentally incorrect," said Kinch.

This "trend" is in part driven by the fact that data is being illegally acquired and sold on the black market, he added, as evidenced by the slew of data breach scandals facing the world's biggest companies as of late.

In September, Yahoo revealed that at least 500 million of its users' accounts had been hacked in 2014, marking the biggest data breach in history. The company admitted that passwords and other information were stolen, but insisted that payment and bank information stayed safe. A hacker was reportedly selling the stolen data on an illegal dark web marketplace called "The Real Deal" for 5 bitcoins (roughly $2,200).

A huge cache of personal data from Dropbox containing the usernames and passwords of nearly 69 million account holders was also discovered online in September. Troy Hunt said the users' credentials had likely been sitting in the dark web for years.

Chan, who founded Pureprofile 15 years ago on the principle that a person's profile is an economic asset, said centralised self-management of data is not only better from a data accuracy perspective, it also presents new business opportunities.

"There is not only an opportunity for businesses to have a much smaller expense, but a portion of those savings can also go back into incentivising consumers to share and keep the details on their profile up to date," said Chan. "For example, they could pay for the consumer's Netflix or Spotify membership because content is currency. Incentivising the consumer creates a good business opportunity for businesses that would normally be investing in separate databases, database management, and other centrally-managed non-consumer-connected type approaches."

Chan added that allowing people control over their personal information and profile means less "guesstimating" on behalf of companies.

"Big data is everywhere. It's being created all over the place. But fundamentally, the data is very hard to work out. In five or 10 questions you can know more about someone than you could ever guesstimate through big data," Chan said. "You will never be able to work out why someone did something or when in the future someone will do something accurately."

Kinch said the opportunity to deliver products and services to the right person at the right time increases exponentially if organisations are able to acquire and utilise timely and accurate consumer-controlled data.

At the moment, the onboarding process for most financial businesses is costly, time-consuming, and often requires the acquisition of third-party datasets. There is also manual data processing that occurs within large teams, costing banks a lot of money.

"It's all very complex, time-consuming, and creates a bad experience for the consumer. They feel uncomfortable with the amount of information that the bank is asking them to give," said Kinch. "Somewhere between 40 and 50 percent of applicants drop off during the process, which can have a significant impact on the business. As drop-off increases, cost of acquisition goes up and the unit economics for that proposition end up being not great."

Kinch added that a lot of companies around the world -- particularly in the financial services, insurance, consumer electronics, and telecommunications industries -- are working on solutions to enable individuals to verify themselves by reusing an existing digital identity credential they have. For example, it could be a third-party identity assurance provider or a government-driven digital identity standard like GOV.UK Verify. The New South Wales government has kicked off a similar scheme with the launch of digital drivers' licenses.

If financial service providers accept that credential, then the onboarding process can be cut down from 18 minutes to potentially less than 1 minute, Kinch said.

"The costs saved can be invested into improving customer experience or creating better incentives for [customers] to engage with your brand," said Kinch.

Following the EU lead

Forrester's 2016 Data Privacy Heatmap shows that countries around the world are moving toward the European standard for data protection. The General Data Protection Regulation (GDPR), which is yet to be enforced and imposes a higher standard of personal data protection, has already started raising the legislative tide within the EU and abroad.

Facebook has come under scrutiny in Europe for how it captures and uses people's data for commercial purposes. Earlier this month, Facebook agreed to pause its collection of WhatsApp user data in the UK for advertising purposes, following a probe by the Information Commissioner's Office.

In a blog post, information commissioner Elizabeth Denham said she began investigating the WhatsApp privacy policy change with "concerns" that WhatsApp users weren't being properly protected.

"It's important that we have control over our personal information, even if services don't charge us a fee," Denham wrote. "We might agree to a company using our information in a certain way in return for us getting a service for free, but if that information is then exploited more than agreed, for a purpose we don't like, then we're entitled to be concerned."

At the end of September, officials in Germany ordered WhatsApp to stop sharing collected user data with Facebook and for the social network to delete data already collected from the 35 million WhatsApp users within the country.

In a similar case against Facebook in December 2015, a Belgian court ordered the social network to stop placing data-tracking cookies on non-Facebook users' computers unless they explicitly agree to the social network's updated privacy policies.

The GDPR will come into effect in May 2018, replacing the EU's current privacy rules that have been in place since 1995. The updated rules aim to unify data privacy across the EU to simplify regulations for international businesses operating in Europe.

Kinch, who currently works in Meeco's London office, said there are a number of components in the GDPR that can be replicated in Australia.

For example, the GDPR broadens the definition of personal data. If analysis has been conducted on someone's personal data and insights were obtained, those insights also fall into the category of personal data. This means any regulation pertaining to personal data also pertains to secondary datasets or insights gained from raw data.

The GDPR also tightens the rules around explicit informed consent. Organisations need to state the purpose for which they are using an individual's information in unambiguous language.

"Saying something like 'we have a big data platform that enables us to crunch analysis and figure out how to sell you stuff' won't cut it. It has to be really explicit. For example, you'd have to say something like 'we are using your information to figure out how to most effectively match you to a doctor at your time in need'," said Kinch.

"A 100-page privacy policy or an end-user license agreement won't cut it anymore. The language needs to be clear and unambiguous."

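Under the GDPR, organisations must report a data breach to the relevant regulator within 72 hours of finding out about it and, where the breach poses a high risk to individuals, notify those affected without undue delay; individuals also have the right to be forgotten should they choose.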

The Australian Parliament is currently undertaking its third attempt to pass data breach notification laws, following previous attempts being shelved in the Senate by both Labor and Coalition governments. Due to commencement provisions in the legislation, unless otherwise proclaimed, any laws passed would take effect 12 months after gaining Royal Assent, which means a working notification scheme is unlikely to be introduced before 2018.

The GDPR also requires organisations to appoint a data protection officer if they don't already have one, and to embed privacy in the core foundational design of future products, services, and systems, as well as into their existing capabilities or offerings in the market.

Data breaches in the EU have serious consequences including fines of up to 4 percent of the company's global annual revenue or up to €20 million, whichever is greater.
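To put the "whichever is greater" rule in concrete terms, the sketch below computes the fine ceiling for a few revenue figures. The function name and the revenue values are hypothetical, chosen only for illustration; the 4 percent and €20 million figures are the ones cited above.

```python
def max_fine_eur(global_annual_revenue_eur: int) -> float:
    """Ceiling on a fine: the greater of 4% of global annual revenue or EUR 20 million."""
    four_percent = global_annual_revenue_eur * 4 / 100
    return max(four_percent, 20_000_000)


# Hypothetical revenue figures, not taken from any real company.
for revenue in (100_000_000, 500_000_000, 2_000_000_000):
    print(f"revenue EUR {revenue:,} -> ceiling EUR {max_fine_eur(revenue):,.0f}")
```

The two limits meet at €500 million in annual revenue: below that, the flat €20 million figure is the larger of the two; above it, the 4 percent figure takes over.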

Speaking at the InnovationAus Open Opportunity Forum, Adrian Turner, CEO of Data61, said the agency will be helping the government define a methodology for releasing open datasets, ensuring citizen privacy is upheld. Data61 is working on data perturbation technology, which injects noise into data so that it's not personally identifiable.
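As a rough illustration of what "injecting noise" can mean, the sketch below adds random Laplace-distributed noise to a table of counts before release. This is not Data61's method, which the article does not describe; the choice of Laplace noise, the function names, and the suburb counts are all assumptions made for the example.

```python
import random
from typing import Dict, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_counts(counts: Dict[str, int], scale: float,
                   seed: Optional[int] = None) -> Dict[str, float]:
    """Return a copy of `counts` with independent noise added to every value."""
    rng = random.Random(seed)
    return {key: value + laplace_noise(scale, rng) for key, value in counts.items()}


# Hypothetical counts of residents with some attribute, per suburb.
true_counts = {"Suburb A": 12, "Suburb B": 3, "Suburb C": 57}
print(perturb_counts(true_counts, scale=2.0, seed=1))
```

A larger `scale` hides individual contributions more thoroughly but makes the published figures less accurate, which is the trade-off any perturbation scheme has to manage.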

Paul Shetler, former chief digital officer at the Digital Transformation Agency (DTA), who recently resigned, told ZDNet that the Department of the Prime Minister and Cabinet is developing a process for agencies to follow before they can publish confidentialised open datasets.

"This process will work in tandem with the government's recently introduced legislation, the Privacy Amendment (Re-identification Offence) Bill 2016, which creates penalties for the re-identification of de-identified datasets released by the government as open data," Shetler said.

Pia Waugh, director for Strategic and International NFIC Initiatives at the Australian Transaction Reports and Analysis Centre (AUSTRAC), said we need to define the premise of what we're trying to achieve before we can discuss actual solutions.

"If you start from the premise that a citizen should have control of their own information, then that dramatically changes how you design and develop systems ... it would naturally lead to a lot of the government's systems implementing that premise by default," said Waugh.

There also needs to be a shift in how we, as citizens, see the government and how the government sees itself, Waugh said.

"If we constantly see the government as a king in a castle who is trying to control everything, then it naturally becomes the behaviour of the government to try to control [everything] on behalf of other people, on behalf of other organisations, on behalf of companies," said Waugh.

"If government turns its mentality and philosophy towards being a node in a network, towards being a partner in solving problems -- and not just government problems, but also societal problems -- and starts to take a collaborative approach, then we can start to codesign [processes and solutions]. The natural needs, considerations, and concerns of citizens, businesses, organisations, and so forth naturally become a part of the design rather than an afterthought."

At the Open Opportunity Forum in Sydney, Assistant Minister for Smart Cities and Digital Transformation Angus Taylor said the government held its first High-Value Data Roundtable with researchers on October 25. The next high-value data roundtable, to be held in 2017, will include businesses, shortly followed by non-profits.

"For the first time we had a robust process where concerns can actually be raised with government," said Taylor, encouraging community members to contribute their thoughts and help shape the future of Australia's data-driven economy.

"There is an unspoken bond that I firmly believe exists between the government and all Australians. So long that we genuinely treat people with respect and respect their data, Australians will accept us trying to do new and innovative things in the ICT space."

Taylor pointed out that the reason the government hasn't gone ahead and made data accessible to the private sector is that it needs to carefully consider when it should own the solution, when it should partner with a private sector solution, and when it should provide secure and reliable APIs to private sector developers wanting to develop solutions.

He added that the integration of non-specific data across and within government can significantly increase the accuracy and quality of the government's spend in areas such as infrastructure and transport. If the government has better access to housing and traffic data, it would lead to better urban planning, Taylor said.

Shetler agreed with this sentiment, saying the government is well aware that analysing data can provide evidence to drive better policy decisions.

Dominello, who is a big proponent of open data and called data the currency of the digital age, will be pushing for open data initiatives in 2017 as part of the NSW Innovation Strategy.

At the Reimagination Thought Leaders Summit 2016, he alluded to the creation of a data marketplace in 2017, describing it as "big and bold and nothing the country has seen before". The details of the initiative, however, were not disclosed.

Shetler said the government is under no illusion that it has the solutions to all the problems it faces internally and those that Australian citizens face.

"Collaborating with an army of innovators, be they in startups, SMEs, or established firms, is the way for the DTA to make change happen quickly," Shetler said.

Submissions to the inquiry into data use and availability close on December 12.
