Humanizing AI communication: What's needed to make IoT devices sound better
There was an interesting blend in the audience at O'Reilly's AI conference that just wrapped up in New York. Beyond the usual crowd of unicorns, a.k.a. data scientists, there was a sufficiently sized management crowd to fill the room at the executive sessions. With AI all over the media and popular entertainment, you'd have to be living under a rock to not be familiar with the topic of AI, even if the definitions are as fuzzy as the logic that machines synthesize. And executives want to get an idea of what this new boardroom buzzword is all about.
Executives have certainly heard about AI, but their organizations are still at the early stages of implementing it, according to a 2017 study of 3,000 executives presented by MIT Sloan Management Review executive editor David Kiron. Only 23 percent of companies have actually deployed AI, with the upper 5 percent now starting to embed it across their enterprises. Some in the room objected that the survey may have undercounted adoption because executives might not be in close touch with the practitioners who are blazing the trails for AI. But a similarly sized survey of 3,000 respondents conducted by McKinsey last year showed only 20 percent of companies using AI-related technologies, with commercial adoption in about 12 percent of the cases.
The successes are not hard to find. Wells Fargo goes beyond chatbots to tap AI to improve fraud detection and add context to the customer experience. Google found that building an ML model that tracked usage of trial versions of G Suite enabled it to predict in as little as two days who is likely to become a paying customer after the 45-day free trial ends. Comcast uses deep learning to provide more contextual service as it also tracks the working status of its customers' devices.
But as we waded through the success stories and deep dives on methodology, we wondered how humans can get a grip on such newfound power. While analytics have expanded our ability to gain insights, humans still made the decisions on how to interpret data. With AI, that burden gets shared. A couple of points struck us. First was the cornerstone belief in data -- the more data, the better the machine learning or deep learning model. Indeed, the explosion of data is one of the factors that has taken AI from winter to spring. Next was accountability.
Data is not the only reason that, for AI, the seasons are changing from winter to spring. The cloud, which lowers the barriers to entry (you don't need to buy your own HPC grid); optimized hardware (like GPUs and TPUs); connectivity; and open source (you don't have to reinvent the wheel in devising algorithms) are certainly playing their roles. And we're seeing AI being used to help practitioners conduct AI -- witness the emerging generation of non-specialist-friendly services like Amazon SageMaker, or tools like Oneclick.ai. With these, you may not always need data scientists to do AI rocket science work.
But the nagging question is: at what point will the mounting volumes of data generate diminishing returns for AI? Over on the storage side, the Hadoop community has already started dealing with that question with erasure coding, as we noted while reviewing Hadoop 3. Just as the internet and email were not originally conceived with security in mind, the awareness that information requires a lifecycle was not one of the considerations when Yahoo, Facebook, and others were developing HDFS based on published Google research.
At the conference, we didn't find any speakers raising the issue of when enough data is enough, but a presentation from Greg Zaharchuk, associate radiology professor at Stanford University, provided hints that successful AI might not always require seemingly infinite torrents of data. In this case, it was the need to optimize the use of medical imaging, especially CT or MRI scans, which insurers and patients alike prefer to minimize because the procedures are costly and unpleasant. And so, you get that cloudy CT image of blood flow into the brain that's a symptom of data sparsity. Ideally, it would be better to either send the patient in for another or a longer scan, but that's not practical: it's too unpleasant for the patient and costly for the insurer to get, literally, picture perfect.
Zaharchuk's team was looking at the potential of deep learning to reduce patient exposure to either costly or harmful radiological imaging procedures. Working from a rather small data set (about 100 patients), they conducted tests for combining "reference image" data with actual patient images from MRI, CT, and PET scans and found that using a collection of deep learning approaches offered promise in, literally, filling in the blanks. And best of all, it didn't require assembling nationwide samples to get workable results.
As to accountability, is it realistic to expect that we should be able to explain what the models do and the rationale behind them? We recalled SAS founder Dr. Jim Goodnight voicing his concern during an analyst conference a few months back about the accountability of ML and DL models. Especially with the use of neural networks, where multiple models may act in concert, establishing the chain of command can be challenging; consider pinpointing the actual algorithm or data set that was responsible for approving or denying a loan application. It's an issue that's getting more airing. The stakes may not be critical if you're looking for the logic behind why Netflix recommends a movie or Amazon suggests a related product, but when it comes to weighty matters like planning brain surgery, that's another story.
It's a question that the community is still grappling with.
Zoubin Ghahramani, a professor at the University of Cambridge's machine learning program, postulated that there are legal liability and privacy issues that could be at stake regarding the use of an algorithm. Kathryn Hume, venture capitalist and vice president of product and strategy for integrate.ai, a startup applying AI to customer interactions for B2C companies, maintained that the real challenges for accountability are explaining the inputs that are fed to the models and the outputs that they generate.
"The blind spots in data collection can lead to greater problems," she stated, adding that focusing on outcomes (are we getting the right results for the right targets) might be more germane. Danny Lange, vice president of AI and machine learning at Unity Technologies, pointed to the difficulty of explaining models even for everyday functions such as product recommendations. How to explain the models? "Maybe we should borrow some ideas from human psychology," he ventured.