Data61 using machine learning to track human infectious diseases in Australia

The tool uses statistical machine learning with the aim of preventing the spread of infectious diseases in Australia.

The Commonwealth Scientific and Industrial Research Organisation's (CSIRO) Data61 has developed a tool to track infectious diseases and how they might spread to Australia. It uses Bayesian inference, a statistical machine learning method, to estimate the propensity of one region to spread disease to other regions.

Using data from dengue virus outbreaks in Queensland as a case study, the tool traces new cases of infection back to their original source in Australia and maps how the disease has passed between people.

According to Data61 computer scientist Raja Jurdak, traditional methods of tracking infection routes often depend on time-consuming site investigations or interviews relating to travel routes of infected patients.

Data61 has partnered with Queensland Health to obtain fully anonymised records of the reported dengue cases over a 15-year period. Jurdak told ZDNet these records serve as the ground-truth used to train the models.

"We use multiple sources of information on people movement, including airline passenger data, geo-tagged social media, and tourist surveys, in order to understand how people move between regions," he explained.

"Using the human movement trends as a starting point, our approach learns how the disease spreads among the regions, and uses the actual reported cases to validate the results."
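
As a rough illustration of the idea Jurdak describes, and not Data61's actual model, the sketch below treats movement data as a Bayesian prior and updates it with reported cases. It assumes a simple Dirichlet-multinomial setup; the function name `posterior_source_shares`, the `prior_strength` parameter, and the region names and numbers are all hypothetical.

```python
from typing import Dict


def posterior_source_shares(
    movement_flows: Dict[str, float],
    traced_cases: Dict[str, int],
    prior_strength: float = 10.0,
) -> Dict[str, float]:
    """Posterior mean share of cases attributed to each source region.

    Hypothetical sketch: movement volumes set a Dirichlet prior, and
    cases traced to each source region update it.
    """
    total_flow = sum(movement_flows.values())
    if total_flow <= 0:
        raise ValueError("movement_flows must contain positive flow")

    # Prior pseudo-counts proportional to each region's share of movement.
    prior = {r: prior_strength * f / total_flow for r, f in movement_flows.items()}

    # Conjugate update: add observed case counts to the prior pseudo-counts.
    # Regions that do not appear in movement_flows are ignored.
    posterior = {r: prior[r] + traced_cases.get(r, 0) for r in prior}

    norm = sum(posterior.values())
    return {r: v / norm for r, v in posterior.items()}


if __name__ == "__main__":
    # Made-up numbers: travellers per period and cases traced to each source.
    flows = {"Region A": 300.0, "Region B": 100.0}
    cases = {"Region A": 2, "Region B": 8}
    print(posterior_source_shares(flows, cases))
```

In this toy setup, movement alone would point to Region A as the likelier source, but the reported cases shift the estimate towards Region B. A larger `prior_strength` would make the movement data harder to override.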

Jurdak said this methodology allows Data61 to look into the past and identify the sources of infection, and also predict the potential future spread of disease.

Understanding how infections spread once they reach Australia will provide an opportunity to predict when and where an outbreak is likely to occur, allowing hospitals and biosecurity agencies to be as prepared as possible.

However, one of the challenges of the project is that Data61 is working with movement data that does not cover the whole population.

"This is why we chose to combine multiple data sources on how people move, so that any biases from one data set could be offset by the others," Jurdak told ZDNet. "Because we used the actual reported cases to train and validate our model, we are confident that the model was robust to the limitations of individual datasets."
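
One simple way to picture combining several movement data sets is sketched below. It is an assumption for illustration, not the method Data61 reports using. Each source is normalised to shares of total movement so that sources of very different sizes are comparable, then the shares are averaged. The function name `combine_mobility_sources`, the optional `weights` argument, and the sample figures are hypothetical.

```python
from typing import Dict, Optional, Tuple

Route = Tuple[str, str]  # (origin region, destination region)


def combine_mobility_sources(
    sources: Dict[str, Dict[Route, float]],
    weights: Optional[Dict[str, float]] = None,
) -> Dict[Route, float]:
    """Weighted average of per-source route shares (hypothetical sketch)."""
    if weights is None:
        weights = {name: 1.0 for name in sources}
    total_weight = sum(weights[name] for name in sources)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    combined: Dict[Route, float] = {}
    for name, flows in sources.items():
        source_total = sum(flows.values())
        if source_total <= 0:
            raise ValueError(f"source {name!r} has no movement recorded")
        for route, count in flows.items():
            # Convert raw counts to a share within this source, then weight it.
            share = count / source_total
            combined[route] = combined.get(route, 0.0) + share * weights[name] / total_weight
    return combined


if __name__ == "__main__":
    # Made-up counts from two sources with very different sample sizes.
    sources = {
        "airline": {("X", "Y"): 80.0, ("X", "Z"): 20.0},
        "social_media": {("X", "Y"): 1.0, ("X", "Z"): 1.0},
    }
    print(combine_mobility_sources(sources))
```

Averaging shares rather than raw counts stops the largest data set from dominating, which is one way a bias in a single source could be partly offset by the others.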

The tool is part of the broader Disease Networks and Mobility (DiNeMo) project aimed at developing a real-time alert and surveillance system for human infectious diseases. Data61 expects it will provide new insight into the behaviour of human diseases brought into Australia.

According to Jurdak, the tool can be scaled to monitor and track outbreaks of other diseases, whether they are transmitted among humans or animals, or spread through vectors such as parasites, viruses, and bacteria.

"Because we use the movement of people and animals as the underlying driver of the spread of disease, we can apply our approach to a broad range of outbreaks where we have some information about the movement of the agents -- humans and animals -- involved in the spreading process," he added.

Other diseases the tool could be applied to include malaria, which like dengue is spread by mosquitoes, and foot-and-mouth disease in animals.
