Linux Foundation partners with Microsoft and Target to create standards for voice technology

The Open Voice Network will look to protect the voices behind the technology and establish trust with consumers.
Written by Jonathan Greig, Contributor

The Linux Foundation is teaming up with companies like Target, Microsoft and Veritone to create the Open Voice Network, an initiative designed to "prioritize trust and standards" in voice-focused technology.

Jon Stine, executive director of the Open Voice Network, told ZDNet that the rapid growth in both the availability and adoption of voice assistance worldwide, together with the future potential of voice as an interface and data source in an artificial intelligence-driven world, makes it important for certain standards to be communally developed.

Devices and applications are increasingly incorporating voice activation and navigation functions. Mike Dolan, senior vice president at the Linux Foundation, said the network was a "proactive response to combating deep fakes in AI-based voice technology."

"Voice is expected to be a primary interface to the digital world, connecting users to billions of sites, smart environments and AI bots. It is already increasingly being used beyond smart speakers to include applications in automobiles, smartphones and home electronics devices of all types. Key to enabling enterprise adoption of these capabilities and consumer comfort and familiarity is the implementation of open standards," Dolan said, adding that the organization was "excited to bring it under the open governance model of the Linux Foundation to grow the community and pave a way forward."

The nonprofit said the open-source association would be dedicated to promoting open standards that support the adoption of AI-enabled voice assistance systems.

In addition to Target, Microsoft and Veritone, the Linux Foundation said it is working with Schwarz Gruppe, Wegmans Food Markets and Deutsche Telekom. 

Ryan Steelberg, president and co-founder of Veritone, said self-regulation of the creation and use of synthetic voice content, both to protect the voice owner and to establish trust with the consumer, is "foundational."

"Having an open network through the Open Voice Network for education and global standards is the only way to keep pace with the rate of innovation and demand for influencer marketing," Steelberg said. "Veritone's MARVEL.ai, a Voice as a Service solution, is proud to partner with OVN on building the best practices to protect the voice brands we work with across sports, media and entertainment."

Thousands of companies and organizations have created voice assistant systems independent of today's general-purpose voice platforms as a way to streamline services and improve user experience. 

Linux Foundation representatives said the Open Voice Network would support the platforms by "delivering standards and usage guidelines for voice assistant systems that are trustworthy, inclusive and open." The organization will also provide guidance on voice-specific protection of user privacy and data security and ways to make voice assistants interoperable between platforms. 

"To speak is human, and voice is rapidly becoming the primary interaction modality between users and their devices and services at home and work," said Ali Dalloul, a general manager at Microsoft Azure. 

"The more devices and services can interact openly and safely with one another, the more value we unlock for consumers and businesses across a wide spectrum of use cases, such as Conversational AI for customer service and commerce." 

The Linux Foundation compared the effort to the open standards that were introduced in the earliest days of the internet, noting that those initiatives helped create uniform ways for websites to connect and exchange information.  

Voice assistants are now reliant on a variety of technologies, including Automatic Speech Recognition, Natural Language Processing, Advanced Dialog Management and machine learning.  

Steelberg added that voice technologies and interfaces would be fully integrated into the majority of digital applications, devices, and workflows in five years. As voice proliferation and adoption increase, he noted, it is imperative that organizations like the Open Voice Network and other participating voice tech providers and developers stay diligent on consumer and data protection, as well as on protecting the trademark, copyright and uses of people's voices.

Voice technology began to emerge around 2011 with the introduction of Siri to iPhone users, according to Steelberg. Now, he said, 1 in every 4 US adults owns some kind of smart speaker, and studies have shown that almost all smartphone users will be using some form of voice assistant within the next two years.

Stine added that data from January shows there are about 3 billion active conversational agents worldwide, and the number is expected to jump to 8.4 billion by 2024. 

"The number of IoT devices such as smart thermostats, appliances, and speakers are giving voice assistants more utility in a connected user's life," Steelberg said. 

"Smart speakers are the number one way we are seeing voice being used. However, it only starts there. Many industry experts even predict that nearly every application will integrate voice technology in some way in the next five years."
