Amazon Alexa – Part 1, Intents and Utterances


In this post I’ll discuss what Intents are, how Intents, Custom Slots and Sample Utterances are related, and how you can easily configure your own.
Note that this post only touches on these topics briefly, but it should be enough for you to get the samples running.

All the code shown in the snippets below for the Intent Schema, Custom Slots and Sample Utterances can be found in the GitHub repo for this post, under src/main/resources/speechAssets/

The Intent Schema

The Intent Schema is a JSON structure which declares the set of intents your service can accept and process.
It tells Alexa what your Skill can do and what Slots are available.
I like to compare an Intent to a function and a Slot to a parameter of that function. Let’s look at the example from the above mentioned Sample Repository:
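The original embedded snippet didn’t survive here, so the following is a sketch of its shape: the Intent name GetEventData is from the repo, but the slot names date and eventType and the custom type EVENT_TYPE are illustrative stand-ins – see the repository for the exact file.

    {
      "intents": [
        {
          "intent": "GetEventData",
          "slots": [
            { "name": "date",      "type": "AMAZON.DATE" },
            { "name": "eventType", "type": "EVENT_TYPE" }
          ]
        },
        { "intent": "AMAZON.HelpIntent" },
        { "intent": "AMAZON.StopIntent" },
        { "intent": "AMAZON.CancelIntent" }
      ]
    }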

Here we declare an Intent “GetEventData” with an array of Slots (one of them is a custom Slot, more on that later). So in other words, we have a function with two parameters; this analogy will become clearer when we discuss the skill backend.
Intents can also have no Slots at all, if your Intent doesn’t need any additional information to do its thing.

Built-in Intents


If you look at the sample code in the repository, you’ll notice three additional Intents; these are the built-in Intents provided by the Alexa Skills Kit. They cover common actions that you can choose to implement in your custom skill without providing any sample utterances.
I highly recommend implementing these. I don’t really understand why Amazon designed them as optional, especially the CancelIntent. I had to learn this the hard way when I could not figure out why I couldn’t cancel requests to my skill – because I hadn’t implemented the CancelIntent.

The good thing about the built-in Intents is that most of the configuration is already done for you; for example, you don’t need to put them into your sample utterances (more on that below). But you do have to account for them when implementing your backend, for example by running any cleanup code when a CancelIntent arrives.
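To make that concrete, here is a minimal sketch of such a handler, assuming the legacy Java Alexa Skills Kit (com.amazon.speech) that a Maven layout like src/main/resources/speechAssets suggests; the class name EventSpeechlet and the spoken responses are made up for illustration.

    import com.amazon.speech.speechlet.*;
    import com.amazon.speech.ui.PlainTextOutputSpeech;

    // Hypothetical Speechlet that dispatches on the Intent name,
    // including the built-in AMAZON.* Intents.
    public class EventSpeechlet implements Speechlet {

        @Override
        public SpeechletResponse onIntent(IntentRequest request, Session session)
                throws SpeechletException {
            String name = request.getIntent().getName();
            switch (name) {
                case "AMAZON.CancelIntent":
                case "AMAZON.StopIntent":
                    // A good place for any cleanup code before the session ends.
                    return tell("Okay, cancelled.");
                case "AMAZON.HelpIntent":
                    return tell("You can ask me about upcoming events.");
                case "GetEventData":
                    // Slot values arrive as strings, e.g.
                    // request.getIntent().getSlot("date").getValue()
                    return tell("Here is your event data.");
                default:
                    throw new SpeechletException("Unknown intent: " + name);
            }
        }

        // Wraps plain text into a "tell" response, which speaks and ends the session.
        private SpeechletResponse tell(String text) {
            PlainTextOutputSpeech speech = new PlainTextOutputSpeech();
            speech.setText(text);
            return SpeechletResponse.newTellResponse(speech);
        }

        @Override
        public SpeechletResponse onLaunch(LaunchRequest request, Session session) {
            return tell("Welcome.");
        }

        @Override
        public void onSessionStarted(SessionStartedRequest request, Session session) { }

        @Override
        public void onSessionEnded(SessionEndedRequest request, Session session) { }
    }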

Slots

Amazon has a quite large list of built-in slot types, for example AMAZON.DATE (translates an utterance like “tomorrow” into tomorrow’s calendar date), AMAZON.Airport (names of a variety of airports), AMAZON.AT_CITY (provides recognition for over 5,000 Austrian and world cities commonly used by speakers in Germany and Austria), AMAZON.VideoGame (titles of video games) and many others.

It’s important that you define your slots as precisely as possible, as this reduces the “search area” considerably when the Alexa service translates speech to text, making the natural language recognition more accurate.

Custom Slot Types

Often you’ll want to define your own slot types, like a list of manufacturers or zodiac signs or whatever you need.
For this you can define custom slot types. The syntax is fairly easy: it’s just a list of possible values, separated by newlines.
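Sticking with the hypothetical EVENT_TYPE slot from the schema sketch above, its value file would simply look like this:

    concert
    lecture
    workshop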

Sample Utterances

To enable the Alexa service to understand how a user might phrase requests for your Intents and how it should fill in slot values, you have to provide some sample utterances. In my experience this is the most time-consuming part of the skill configuration, because you have to think of all the ways a user could ask for an action. It’s impossible to cover every combination, but try to include as many as possible.

The syntax is also fairly easy: each sample utterance is prefixed with the Intent name, and the slots appear in curly braces.
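For example, utterances for the GetEventData Intent sketched above might look like this (again, the slot names are illustrative):

    GetEventData what {eventType} is happening on {date}
    GetEventData is there a {eventType} on {date}
    GetEventData tell me about events on {date}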

Designing the voice UI

The questions and answers your skill provides will come from your backend, but you should start thinking about them from the beginning.
I like to do this in a team and play out the conversations with real people; this helps you reduce the awkwardness of your speech output (the process of playing them out can be awkward too, so just bring some beers to the meeting).

There are some guidelines you should follow when designing your voice UI, such as never assuming that the user knows what to do.
I strongly recommend reading the Do’s and Don’ts in Amazon’s documentation. For your first skill, you can just use the sample code in the repo to get started.

In part 2 of the series we’ll discuss how to get the Java backend source up and running.
