Data scientists: White House issues a call to arms

A COVID-19 dataset may hold the key to flattening the curve. Data scientists are being asked to unlock valuable insights.
Written by Greg Nichols, Contributing Writer

There's been a call to arms for data scientists. U.S. health and tech leaders in the White House Office of Science and Technology Policy want qualified data scientists to mine terabytes of available research data on COVID-19.

To give scientists easy access to the research, a database has been uploaded to a centralized hub named the COVID-19 Open Research Dataset.

It's an opportunity for data scientists to serve, a way to help healthcare workers and policymakers understand a growing dataset that holds the key to making informed decisions. At the moment, we lack the most basic knowledge about COVID-19, including an answer to the most fundamental question: how many people have been infected? Health experts agree that reliable data answering this and other fundamental questions are needed to guide the difficult decisions ahead.

What role do data scientists have to play in the response to the pandemic? 

To answer that question, I reached out to Gordon McDonald, CEO of Capice, a Florida-based team of AI experts whose tools and deep learning network are used by corporate clients to quickly train their models and generate predictive insights into things like customer buying habits, product pricing, and employee attrition. After the White House call, McDonald decided to temporarily divert his company's expertise and resources to helping with the COVID-19 data.

"The good news is we have lots of data," says McDonald. "The bad news is the organization and accessibility of that data is very spread out or difficult to access."

Given the difficulties with the dataset, McDonald points to AI deep learning as a necessary tool.

"Deep Learning is not a typical algorithm. A user literally 'teaches' the platform with hundreds of examples of the various classifications or predictions. Once taught, then future classifications and predictions are in the hands of the deep learning platform."

The same approach can be applied to health data in general, a growing trend in data-driven medicine.

"Does this CAT scan report any issues in any scan frame?" McDonald asks by way of example. "Listen to audio and find instances of sleep apnea. Predict the patient's quality of life as good, medium, issues after they have an upcoming surgery."

What progress, if any, has been made so far? How might the effort evolve in the next weeks or months?

"There is at least one company, engine.is, attempting to link data science researchers with data with technology," says McDonald. "I have offered my company's full services to that effort. But all Deep Learning starts with data and data is what we need. I am aware of one COVID-19 data set that has been published for others to use."

More coordination is necessary, McDonald points out. It's possible that a new framework will emerge from this pandemic for deploying data scientists as first responders to urgent, developing problems like this one. Until then, the ad hoc response is the best we have.
