When 'code rot' becomes a matter of life or death, especially in the Internet of Things

Code rot leads to under-performing enterprise systems. In today's device-laden edge world, it can be devastating. Another issue lurking: inaccurate AI algorithms.

The possibilities opened up to us by the rise of the Internet of Things (IoT) are a beautiful thing. However, not enough attention is being paid to the software that goes into the things of IoT. This can be a daunting challenge since, unlike centralized IT infrastructure, IoT is vast and scattered: by one estimate, there are at least 30 billion IoT devices now in the world, and every second, 127 new IoT devices are connected to the internet.

[Image: Internet of Things at CeBIT, March 2017]

Photo: Joe McKendrick

Many of these devices aren't dumb. They are growing increasingly sophisticated and intelligent in their own right, housing significant amounts of local code. The catch is that this means a lot of software needs tending. Gartner estimates that right now, 10 percent of enterprise-generated data is created and processed at the edge, and within five years, that figure will reach 75 percent.

For sensors inside a refrigerator or washing machine, software issues mean inconvenience. Inside automobiles or other vehicles, they mean trouble. For software running medical devices, they could mean life or death.

"Code rot" is one source of potential trouble for these devices. There's nothing new about code rot; it's a scourge that has been with us for some time. It happens when the environment surrounding software changes, when software degrades, or when technical debt accumulates as software is loaded down with enhancements or updates.

Code rot can bog down even the most well-designed enterprise systems. However, as increasingly sophisticated code gets deployed at the edges, more attention needs to be paid to IoT devices and highly distributed systems, especially those with critical functions. Jeremy Vaughan, founder and CEO of TauruSeer, recently sounded the alarm on the code running in medical edge environments.

Vaughan was spurred into action by the failure of the continuous glucose monitor (CGM) function on a mobile app used by his daughter, who has had Type 1 diabetes her entire life. "Features were disappearing, critical alerts weren't working, and notifications just stopped," he stated. As a result, his nine-year-old daughter, who relied on the CGM alerts, had to rely on her own instincts.

The apps, which Vaughan had downloaded in 2016, were "completely useless" by the end of 2018. "The Vaughans felt alone, but suspected they weren't. They took to the reviews on Google Play and Apple App store and discovered hundreds of patients and caregivers complaining about similar issues."

Code rot isn't the only issue lurking in medical device software. A recent study out of Stanford University finds that the training data used for the AI algorithms in medical devices is based on only a small sample of patients. Most algorithms, 71 percent, are trained on datasets from patients in only three geographic areas -- California, Massachusetts and New York -- and, the study adds, "the majority of states have no represented patients whatsoever." While the Stanford research didn't expose bad outcomes from AI trained on those geographies, it raised questions about the validity of the algorithms for patients in other areas.

"We need to understand the impact of these biases and whether considerable investments should be made to remove them," says Russ Altman, associate director of the Stanford Institute for Human-Centered Artificial Intelligence. "Geography correlates to a zillion things relative to health. It correlates to lifestyle and what you eat and the diet you are exposed to; it can correlate to weather exposure and other exposures depending on if you live in an area with fracking or high EPA levels of toxic chemicals - all of that is correlated with geography."

The Stanford study urges the employment of larger and more diverse datasets for the development of AI algorithms that go into devices. However, the researchers caution, obtaining large datasets is an expensive process. "The public also should be skeptical when medical AI systems are developed from narrow training datasets. And regulators must scrutinize the training methods for these new machine learning systems," they urge.
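For teams that want to run this kind of check on their own training data, the basic measurement is simple to express. The following is only a minimal sketch in Python, not the Stanford researchers' method: the function names, the state lists, and the cohort data are all made up for illustration.

```python
from collections import Counter


def geographic_share(cohort_states, focus_states):
    """Return the fraction of training cohorts drawn from focus_states."""
    if not cohort_states:
        raise ValueError("no cohorts supplied")
    counts = Counter(cohort_states)
    in_focus = sum(counts[state] for state in focus_states)
    return in_focus / len(cohort_states)


def unrepresented(cohort_states, all_states):
    """Return the states that contribute no cohorts at all, sorted by name."""
    return sorted(set(all_states) - set(cohort_states))


# Hypothetical cohort origins, one entry per training cohort.
# This is illustrative data, NOT the dataset from the Stanford study.
cohorts = ["CA", "CA", "MA", "NY", "TX", "CA", "NY", "MN"]

share = geographic_share(cohorts, {"CA", "MA", "NY"})
missing = unrepresented(cohorts, ["CA", "MA", "NY", "TX", "MN", "OH", "FL"])

print(f"{share:.0%} of cohorts come from CA, MA or NY")
print("States with no cohorts:", missing)
```

A report like this says nothing about whether an algorithm performs worse elsewhere; it only makes the concentration visible so that the question can be asked.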

In terms of the viability of the software itself, Vaughan cites technical debt accumulated within medical device and app software that can severely reduce their accuracy and efficacy. "After two years, we blindly trusted that the [glucose monitoring] app had been rebuilt," he relates. "Unfortunately, the only improvements were quick fixes and patchwork. Technical debt wasn't addressed. We validated errors on all devices and still found reviews sharing similar stories." He urges transparency on the components within these devices and apps, including following US Food and Drug Administration guidelines that call for a "Cybersecurity Bill of Materials (CBOM)" that lists out "commercial, open source, and off-the-shelf software and hardware components that are or could become susceptible to vulnerabilities."
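To make the bill-of-materials idea concrete, here is a rough sketch in Python of what such a component list might look like in code and how it could be checked against known advisories. It does not follow any official FDA schema; the field names, component names, versions, and the advisory set are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One entry in a hypothetical bill of materials."""
    name: str
    version: str
    kind: str  # e.g. "commercial", "open source", or "off-the-shelf"


def flag_components(bom, advisories):
    """Return the components whose (name, version) pair has an advisory."""
    return [c for c in bom if (c.name, c.version) in advisories]


# Hypothetical component list for a device app (names are invented).
bom = [
    Component("ble-stack", "2.1.0", "commercial"),
    Component("json-parser", "1.4.2", "open source"),
    Component("sensor-firmware", "0.9.7", "off-the-shelf"),
]

# Hypothetical set of (name, version) pairs with known vulnerabilities.
advisories = {("json-parser", "1.4.2")}

flagged = flag_components(bom, advisories)
for component in flagged:
    print(f"review needed: {component.name} {component.version} ({component.kind})")
```

The point of keeping the list explicit is that a check like this can be rerun whenever new advisories appear, rather than waiting for users to report failures.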

More and more computing and software development is moving to the edge. The challenge is applying the principles of agile development, software lifecycle management and quality control learned over the years in the data center to the edges, and applying automation on a vaster scale to keep billions of devices current.