
Amazon's new tools for Alexa developers hint at what's next for the voice assistant

Amazon is rolling out a slew of new features for Alexa developers, including tools that reveal Amazon's aspirations for the AI assistant.
Written by Stephanie Condon, Senior Writer

Amazon on Wednesday is rolling out a slew of new features and tools to help developers build skills for Alexa, its AI-powered voice assistant. The updates to the Alexa Skills Kit (ASK) range from sophisticated improvements to Alexa's foundational voice technology to features that hint at the future of Alexa -- such as tools that facilitate voice-based experiences outside of the smart home.

When improving the Alexa Skills Kit, "we try to think in terms of the experiences we enable," Nedim Fresko, Amazon's VP of Alexa Devices and Developer Technologies, said to ZDNet, "but also where we're established and what's next -- where we would like to be established and how we could get that started." 

All told, Amazon is rolling out 31 new features. They fit into a few different themes, according to Fresko. First, there are improvements focused on creating more natural and conversational experiences with Alexa. Next, there are new features focused on creating multi-modal Alexa skills -- skills that are accessible via smart speakers but also screens and other smart home devices. Then, there are new features that focus on taking Alexa outside the home -- to your Uber ride, for instance. Lastly, there are new features focused on helping consumers discover the Alexa skills they're looking for. 

"You can view [the new features] based on where we are in our progression," Fresko said. "When you look at our natural language improvement features, voice has always been our foundation -- we have a duty to make it better and better every year... Voice has to be excellent and natural, and that is the prime imperative. Other areas, like screen and multimedia, you can consider it as adding to a seed we've already planted... that we want to accelerate to the next level. And then on-the-go [Alexa experiences] are relatively new." 

Amazon has long led the smart speaker market, and the proliferation of devices connected to Alexa, as well as the variety of skills available to the AI assistant, should help fend off rival efforts, such as the Google Assistant ecosystem. Customers now have "hundreds of millions" of Alexa devices, Amazon says, including smart speakers, smart TVs, headphones, and PCs. Customer engagement with Alexa has nearly doubled over the last year. Meanwhile, more than 700,000 developers have built Alexa skills, the company says, and more than 100,000 skills have been published.

Popular categories of Alexa skills include entertainment, health, and wellness, Fresko said, and more recently, COVID-19 information. Meanwhile, some big brands have tapped Alexa as an additional channel through which to reach customers. As brands like Uber, Domino's, Capital One, and Starbucks look for ways to extend their business strategy, "we definitely do see potential there," Fresko said. 

Here's a look at some of the new Alexa Skills Kit features:

Deep Neural Networks (DNN): Amazon is adopting DNNs that will improve Alexa's natural language understanding (NLU) of individual words and sentences, so that custom skills interpret requests more accurately. Based on the early results of this improvement, Amazon expects that skills using the improved NLU will see a 15% improvement in accuracy. The improvements will come automatically once the DNNs are rolled out -- they'll cover hundreds of skills by the end of the year, Fresko said.

Alexa Conversations (in beta) is a new dialog manager that will help developers create more natural conversations with customers. 

"Natural language is actually a very difficult thing to emulate," Fresko said. "When people speak naturally, they change direction, they make contextual references to things they said. Sometimes they over-supply information, sometimes they under-supply it -- when that happens, consumers revert to robotic language and simple phrases, and developers just give up." 

Alexa Conversations uses a deep learning-based approach. Developers give Amazon a few sample phrases from their conversations, along with the APIs that implement the services they want to offer. From those samples, Amazon's AI system tries to anticipate all the possible conversation paths the user might take. This reduces the amount of back-end code developers have to write and the amount of training data they have to provide.

Web API for Games (moving into GA this year):  This API makes it easier to create multi-modal, Alexa-activated games for devices with screens. Developers can use web technologies and tools (such as Canvas 2D, WebAudio, WebGL, JavaScript, and CSS) to build interactive gaming experiences for Echo Show and select Fire TV devices. 

Skill Resumption (in preview): This feature will keep a skill running in the background, so a user can return to it after using Alexa for something else, or after a certain period of time. If the user grants the appropriate permissions, the feature will even trigger new engagement proactively. For instance, if a customer uses Alexa to book an Uber ride, the ride-booking app (via Alexa) could proactively tell the customer at the appropriate time, "Your Uber has arrived."

While the feature is in preview mode, only one skill will be able to run in the background at a time. Additionally, the initial timeout for skills running in the background will be one hour, Fresko said.
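To make the preview limits concrete, here is a minimal Python sketch of a background "slot" that holds at most one skill and drops it after an hour. It is a toy model, not Amazon's implementation or any part of the Alexa Skills Kit: the class and method names are hypothetical, and the choice to let a newly backgrounded skill displace the previous one is an assumption, since Amazon has not said how a second skill would be handled.

```python
from dataclasses import dataclass
from typing import Optional

# One-hour initial timeout, as described for the preview.
BACKGROUND_TIMEOUT_SECONDS = 60 * 60


@dataclass
class BackgroundSkill:
    name: str
    paused_at: float  # time (in seconds) at which the skill was backgrounded


class BackgroundSlot:
    """Hypothetical model of the preview limits: one backgrounded skill at a time."""

    def __init__(self) -> None:
        self._skill: Optional[BackgroundSkill] = None

    def pause(self, name: str, now: float) -> Optional[str]:
        """Background a skill and return the name of any skill it displaced.

        Displacing the previous skill is an assumption made for this sketch.
        """
        displaced = self._skill.name if self._skill else None
        self._skill = BackgroundSkill(name, now)
        return displaced

    def resume(self, now: float) -> Optional[str]:
        """Return the backgrounded skill's name, or None if empty or timed out."""
        if self._skill is None:
            return None
        skill, self._skill = self._skill, None
        if now - skill.paused_at > BACKGROUND_TIMEOUT_SECONDS:
            return None
        return skill.name
```

In this model, a ride-booking skill paused at time 0 can be resumed any time in the following hour, but not after it.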

"We're starting this in preview mode, so we don't have final answers to these questions, but we do think about them as the next things to determine," he said to ZDNet. "At this point, we want to see how the feature resonates with developers and users and what they can imagine they can do with it, and until then we'll be conservative about it."

The Skill Resumption feature, Fresko said, is an example of the way Amazon is thinking about what's needed to take Alexa outside of the home. Inside the home, Alexa is often used for one simple task, he noted. Outside the home -- when customers may be using Alexa via Echo Buds or Echo Auto, for instance -- conversations become more "multi-threaded," he said.

Alexa for Apps (in preview): This feature is also geared towards taking Alexa on the go. It lets developers combine their Alexa skill with their iOS and Android mobile apps, so users can start an interaction with Alexa and then finish their task with a mobile app. For example, customers could previously ask the Twitter skill for trends or mentions and hear a voice response. Now, when they ask, "Alexa, ask Twitter to search for #Alexa Live," the skill can also open the mobile Twitter app and display the results.

Quick Links for Alexa (in beta): Developers can use this feature to drive traffic from their websites and mobile apps to their Alexa skill. In addition to encouraging more customer engagement, the feature can help developers track conversion from online ads. 

NFI Toolkit (in preview): The name-free interactions (NFI) toolkit lets developers provide additional signals that Alexa can consider when deciding whether to launch their skill. Developers can suggest up to five launch phrases. For instance, for a trip planning skill, one of these phrases could be, "Alexa, help me plan my trip."

Developers can also indicate specific intents in their skill that Alexa can consider. For example, in the trip planning skill, developers can enable a "get train schedule" intent for name-free interaction. Then, if a customer asks, "When is the next train to Union Station?" that skill would be considered as a candidate to answer.

In an early pilot, users of the NFI toolkit saw an increase in dialogs of up to 15%.
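For developers wondering what those signals might look like in practice, the following Python sketch models them as a small configuration object with the five-phrase limit enforced. It is illustrative only: the `NameFreeConfig` class, its field names, and the `GetTrainScheduleIntent` identifier are hypothetical and do not reflect the toolkit's actual format.

```python
from dataclasses import dataclass, field
from typing import List

# The article states developers can suggest up to five launch phrases.
MAX_LAUNCH_PHRASES = 5


@dataclass
class NameFreeConfig:
    """Hypothetical container for the name-free signals a skill might supply."""

    launch_phrases: List[str] = field(default_factory=list)
    name_free_intents: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if more launch phrases are supplied than allowed."""
        if len(self.launch_phrases) > MAX_LAUNCH_PHRASES:
            raise ValueError(
                f"at most {MAX_LAUNCH_PHRASES} launch phrases are allowed, "
                f"got {len(self.launch_phrases)}"
            )


# Example for the trip planning skill mentioned in the article.
trip_planner = NameFreeConfig(
    launch_phrases=["Alexa, help me plan my trip"],
    name_free_intents=["GetTrainScheduleIntent"],  # hypothetical intent name
)
trip_planner.validate()
```

Keeping the limit in a single constant makes it easy to adjust if the cap changes once the toolkit leaves preview.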
