Data is the missing piece of the AI puzzle. Here's how to fill the gap

Technology professionals and industry leaders are sounding an alarm that organizational data might not be ready to support growing AI ambitions.
Written by Joe McKendrick, Contributing Writer

The skills gap that's holding back progress in artificial intelligence (AI) is well documented, but another factor looms large: data complexity. The two leading obstacles to AI success, a new IBM study reveals, are limited AI skills and expertise (cited by 33% of respondents), followed by too much data complexity (25%). 

To date, a majority of companies (58%) are not yet actively implementing AI, according to the survey of 8,584 IT professionals. At these non-AI-enabled companies, the biggest inhibitors of generative AI include data privacy (57%) and trust and transparency (43%).

Among the companies already deploying AI, the key barriers are often related to data, with some organizations taking steps toward trustworthy AI, such as tracking data provenance (37%) and reducing bias (27%). Around a quarter (24%) of companies are seeking to develop their business analytics or intelligence capabilities, which depends on consistent, high-quality data.

However, some industry leaders are sounding the alarm that organizational data may not be ready to support growing AI ambitions. "To remain competitive, CIOs and technology leaders must adapt their data strategies as they integrate gen AI into their technology stacks," says Matt Labovich, US data, analytics and AI leader for PwC. "This involves understanding data and preparing for the transformative impact of emerging technologies."

Technology professionals and their organizations need to address "data security, AI decision-making ethics, and AI literacy," says Shipra Sharma, head of AI and analytics at Bristlecone. "With limited AI education due to the newness of this technology, many individuals are left to figure out how to use it on their own." 

She says that actively engaging with the technology "to educate employees and implementing appropriate safeguards will allow organizations to realize the benefits of generative AI for data management while mitigating the risks. With these protocols in place, advanced data capabilities will grant organizations a notable advantage in their ability to scale their operations."

Companies looking to make progress in AI, says Labovich, must "strike a balance and acknowledge the significant role of unstructured data in the advancement of gen AI."

Sharma agrees with these sentiments: "It is not necessarily true that organizations must use gen AI on top of structured data to solve highly complex problems. Oftentimes the simplest applications can lead to the greatest savings in terms of efficiency."

The wide variety of data that AI requires can be a vexing piece of the puzzle. For example, data at the edge is becoming a major source for large language models and repositories. "There will be significant growth of data at the edge as AI continues to evolve and organizations continue to innovate around their digital transformation to grow revenue and profits," says Bruce Kornfeld, chief marketing and product officer at StorMagic. 

Currently, he continues, "there is too much data in too many different formats, which is causing an influx of internal strife as companies struggle to determine what is business-critical versus what can be archived or removed from their data sets."

Kornfeld says it's urgent that companies "determine approaches and solutions that can, in a cost-effective manner, filter out the noise and unnecessary information that's being stored to make room for what's essential."

Another consideration is that training data comes from a variety of sources, incorporating both public sources and an organization's intellectual property, says Osmar Olivo, vice president of product management at Inrupt, a company co-founded by Sir Tim Berners-Lee.

The choice for many organizations often comes down to deciding "between the competitive advantage companies can get by leveraging AI and protecting their most sensitive data," says Olivo. "This does not need to be a binary choice, however. I expect 2024 to see innovative data management and data privacy solutions emerge, particularly with a focus on protecting data that is being used by AI models."

Establishing a data-first approach, along with "a robust, centralized data repository," is critical to the successful adoption of AI for both corporate and internal IT processes, says Rakesh Jayaprakash, chief analytics evangelist with ManageEngine, the IT management division of Zoho Corp. "This revolves around the meticulous capture of every organizational event and process, with machine-learning algorithms employed to discern valuable patterns."

Still, "while the future promises features with generative AI at their core, we are still some time away from seeing these capabilities translate into tangible benefits for users," Jayaprakash adds. "In light of this, businesses must exercise prudence when investing significant resources in attention-grabbing features that may not offer enduring value." He says AI capabilities must be "seamlessly woven into a platform's fabric."

And as organizations develop data strategies to accommodate the rise of gen AI, "there are some no-regret moves that everyone can take to prepare for the inevitable change brought on by emerging technology," says Labovich. 

"Organizations can streamline operations and make short-term improvements, such as gen AI to generate critical operational and financial documentation, external customer and marketing communications, and sharing of organization knowledge across critical employees. These moves can yield benefits like enhanced productivity and cost savings, all while larger data and technology initiatives are underway."
