Staff at the Australian Federal Police (AFP) are faced with the often confronting task of trawling through thousands of photographs to identify and categorise content that frequently includes child exploitation and extremist materials.
It is difficult work that can take a serious toll on staff. Analysts at organisations like the AFP are also often not police, which creates a legal and ethical dilemma when trying to gain insights from the data held.
Janis Dalins, a Senior Digital Forensic Examiner at the AFP, was undertaking his PhD at Monash University and was faced with a problem: He needed to give his supervisor access to the data he was analysing.
"I wanted to improve the performance of existing cryptographic and perceptual hashing methods -- in plain English, to speed up the process of identifying images through their content rather than their technical attributes," he wrote in a blog published on the Digital Transformation Agency's (DTA) website.
Dalins approached the Commonwealth Scientific and Industrial Research Organisation's (CSIRO) Data61 to help him develop a solution that would give his supervisor access to the data he was analysing and could also be used by the AFP going forward.
Dr Yuriy Tyshetskiy, a senior data scientist at Data61, volunteered his help with the project.
Speaking with ZDNet at D61+ LIVE in Brisbane on Wednesday, Dr Raj Gaire from Data61's Distributed Systems Security group detailed the solution that came out of this problem, Data Airlock.
"In a standard data analysis environment, a solution is employed where the data comes to you and you analyse the data and you get the result," Gaire explained. "With Data Airlock, the data remains as far away as possible, we don't want to touch that, the solution goes to the data."
Data Airlock starts with an execution environment that is deployed to the storage location where the data sits in encrypted form.
"The execution manager of the system creates a contained environment where the encrypted data goes in, the model goes in, the execution goes in, the encryption happens within that environment, execution happens within that environment, results come out, and then the data is kept," Gaire continued.
The results then go to a vetter -- a human being at the AFP who decides if the results should be released and, if so, who gets access to them.
During an investigation, the solution analyses materials and, if they are considered to be extreme or exploitative images, the reviewer can be warned.
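The flow Gaire describes, in which the model travels to the data, runs inside a contained environment, and only vetted results come out, can be illustrated with a minimal Python sketch. This is not Data Airlock's actual code: the names `Airlock`, `vet`, and `toy_cipher` are hypothetical, and the XOR "cipher" is an insecure stand-in for whatever encryption the real system uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR stand-in for real encryption (assumption, NOT secure).
    Applying it twice with the same key restores the original bytes."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


@dataclass
class Result:
    summary: Dict[str, int]   # aggregate output only, never raw items
    released: bool = False


class Airlock:
    """Hypothetical contained environment: data stays inside, the model comes in."""

    def __init__(self, encrypted_items: List[bytes], key: bytes) -> None:
        self._items = encrypted_items
        self._key = key

    def run(self, model: Callable[[bytes], str]) -> Result:
        counts: Dict[str, int] = {}
        for blob in self._items:
            plain = toy_cipher(blob, self._key)  # plaintext exists only in this scope
            label = model(plain)
            counts[label] = counts.get(label, 0) + 1
        return Result(summary=counts)


def vet(result: Result, approve: Callable[[Dict[str, int]], bool]) -> Optional[Dict[str, int]]:
    """A human vetter (modelled here as a callback) decides whether results leave."""
    result.released = approve(result.summary)
    return result.summary if result.released else None


# Usage: the researcher supplies only the model and never touches the store.
key = b"k"
store = [toy_cipher(b"benign holiday photo", key), toy_cipher(b"flagged item", key)]
researcher_model = lambda raw: "flagged" if b"flagged" in raw else "benign"
result = Airlock(store, key).run(researcher_model)
released = vet(result, approve=lambda summary: True)
```

In this sketch the researcher only ever sees `released`, a set of label counts, and a vetter who declines approval leaves them with `None`.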
Thanks to the partnership between the AFP and Data61, Data Airlock has become a service that manages the legal and ethical restrictions in the field by giving researchers indirect access to offensive materials on which they can develop and test deep learning-based tools.
Disclosure: Asha McLean travelled to D61+ LIVE as a guest of Data61