Speech Recognition

Introduction

Using speech recognition

Enabling the feature in Dashboard

Adding the CAF to ChatMapper

Matching speech

Matching different text than what is displayed

Reacting to repeated errors

Events

Example CMP

Introduction

Since version 2.9, LearnBrite supports speech recognition through Google Chrome’s built-in speech recognition feature.

The HTML5 Speech Input API in Chrome uses Google's private endpoint (the recognition is done on Google servers). This means speech data is sent to Google’s servers and is handled under Google's data privacy policy.

Where the browser does not support native HTML5 speech recognition, Google Cloud Speech-to-Text is used as a fallback.

No official list of supported languages is available, but it can reasonably be assumed to match the list published here: https://cloud.google.com/speech-to-text/docs/languages

Browsers that support HTML5 speech recognition include:

  • Chrome Desktop
  • Chrome Android
  • Samsung Internet
  • Oculus Browser on GearVR

Oculus Go and Oculus Quest do not support native HTML5 speech recognition at the time of writing (June 10th, 2019), so these devices use the fallback recognition service.
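To check by hand whether a given browser exposes native recognition, a minimal sketch along these lines may help. It relies only on the standard `SpeechRecognition` constructor and the `webkitSpeechRecognition` prefixed name that Chrome uses. The helper name `getNativeSpeechRecognition` is hypothetical, and this is not necessarily how LearnBrite itself decides when to use the fallback.

```javascript
// Hypothetical helper: returns the native recognition constructor, or null
// when neither the standard nor the webkit-prefixed name is available.
function getNativeSpeechRecognition(globalObject) {
  return (
    globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null
  );
}

// In a browser console, pass `window`:
//   const Recognition = getNativeSpeechRecognition(window);
//   if (!Recognition) { /* this browser would need the fallback service */ }
```

A `null` result only means the browser lacks the native API; it says nothing about whether the fallback service is configured for the space.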

Using speech recognition

The following will enable speech recognition on every choice.

Enabling the feature in Dashboard

The feature is enabled by turning on a setting in Dashboard.

  1. Access a space’s edit page
  2. Show advanced settings
  3. Find “Enable Speech Recognition” in the “Experimental” section and change it to Enabled
  4. Click Save at the bottom afterwards

Adding the CAF to ChatMapper

The next step is to add the required Custom Asset Fields (CAFs) in ChatMapper.

To do so, open Project > Project Settings, then select the Custom Asset Fields tab, and finally select the Conversations tab.

To add a new CAF, click on Add New Field at the bottom, and use the following settings:

CAF 1:

  • Title: speechRecognition_enabled
  • Type: Boolean
  • Default value: True

CAF 2:

  • Title: speechRecognition_language
  • Type: Text
  • Default value: en-US

A note on speechRecognition_language: “en-US” is the default value, so adding the field is not required when that language is used. To use another language, take its language code from https://cloud.google.com/speech-to-text/docs/languages, for example “it-IT” for Italian, or “he-IL” or “he” for Hebrew.
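Before typing a language code into the CAF, its shape can be sanity-checked with the standard `Intl.getCanonicalLocales` function. The sketch below is only a structural check: the helper name `normalizeLanguageCode` is hypothetical, and a well-formed code is not guaranteed to be supported by the recognition service.

```javascript
// Hypothetical helper: returns the canonical form of a BCP 47 language tag,
// or null when the tag is structurally invalid (for example "en_US").
function normalizeLanguageCode(code) {
  try {
    return Intl.getCanonicalLocales(code)[0] || null;
  } catch (error) {
    // Intl.getCanonicalLocales throws a RangeError for malformed tags.
    if (error instanceof RangeError) {
      return null;
    }
    throw error;
  }
}

// normalizeLanguageCode("it-it") returns "it-IT"
// normalizeLanguageCode("en_US") returns null
```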

Matching speech

The text to be recognized will be matched against the Dialogue Text from the node or the Menu Text if the Dialogue Text field is empty. For example, in the first image “Pizza” will be the command to be recognized, whereas in the second it will be “Pasta”.

Matching different text than what is displayed

Perhaps what you want to display is different from the command to be recognized. This is fairly common when using ChatMapper, as many choices might have [f] or [a] tokens in the Menu Text.

To cover this case, a third CAF lets you specify the exact command to be recognized. It is added in the same way as the CAFs in the previous section.

CAF:

  • Title: speechRecognition_command
  • Type: Text

The text in the speechRecognition_command for that node will now be used to match against rather than the Dialogue Text or Menu Text.
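The precedence between the three fields can be sketched as follows. This is an illustration of the rule described in this article, not LearnBrite's actual implementation: the node shape (`dialogueText`, `menuText`, `fields`) and the function name `resolveCommandText` are hypothetical, and trimming whitespace is an assumption.

```javascript
// Hypothetical node shape:
//   { dialogueText: string, menuText: string,
//     fields: { speechRecognition_command: string } }
// Returns the text that speech would be matched against for this node.
function resolveCommandText(node) {
  const fields = node.fields || {};

  // 1. An explicit speechRecognition_command overrides everything else.
  const command = (fields.speechRecognition_command || "").trim();
  if (command !== "") {
    return command;
  }

  // 2. Otherwise the Dialogue Text is used...
  const dialogue = (node.dialogueText || "").trim();
  if (dialogue !== "") {
    return dialogue;
  }

  // 3. ...and the Menu Text only when the Dialogue Text is empty.
  return (node.menuText || "").trim();
}
```

For instance, a node with Menu Text “[f] Pizza”, an empty Dialogue Text, and speechRecognition_command set to “pizza” resolves to “pizza” under this sketch.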

In the following example, “pizza” will be the text that speech is matched against: both Dialogue Text and Menu Text are ignored.

Reacting to repeated errors

A “not understood” counter is available for use in ChatMapper conditions. It can be accessed as

LB.cmPlayer.speechRecognition.notUnderstoodCount

Note that this value is reset to 0 when changing nodes.

This is especially useful when giving feedback to the user: an avatar could for instance interject if notUnderstoodCount > 2 to give pointers on what to say, or a new button could be shown allowing the user to skip to another part of the dialogue.
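The same counter can also be read from script. The sketch below combines it with the SpeechRecognition.NotUnderstood event listed under Events. It assumes the counter has already been incremented by the time the event fires, which this article does not confirm. The function name `watchRepeatedErrors` and its callback are hypothetical, and `LB` is passed in as a parameter so the sketch can be tried with a stand-in object.

```javascript
// Hypothetical helper: calls onTooManyErrors(count) each time speech is not
// understood and the counter on the current node exceeds the threshold.
function watchRepeatedErrors(LB, onTooManyErrors, threshold = 2) {
  LB.scenarioEvents.AddEventListener(
    "SpeechRecognition.NotUnderstood",
    function () {
      // Assumption: the counter is updated before the event is broadcast.
      const count = LB.cmPlayer.speechRecognition.notUnderstoodCount;
      if (count > threshold) {
        onTooManyErrors(count);
      }
    }
  );
}

// Possible usage inside a space, where the global LB exists:
//   watchRepeatedErrors(LB, function (count) {
//     console.log("Not understood " + count + " times, showing a hint");
//   });
```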

Events

Speech recognition broadcasts events that reflect the outcome of each recognition attempt. To listen for an event, register a callback as follows:

LB.scenarioEvents.AddEventListener("EVENT_NAME", function() {
    // This will execute when EVENT_NAME is fired.
});

Existing events are:

SpeechRecognition.Success

Fired if speech recognition is successful

SpeechRecognition.Error

Fired as a catch-all event for any error

SpeechRecognition.NoSpeech

Fired when no speech is recorded. Included in SpeechRecognition.Error

SpeechRecognition.NotUnderstood

Fired when speech is not understood. Included in SpeechRecognition.Error
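While testing a dialogue it can help to log every outcome. The following sketch registers one listener per event listed above, using the AddEventListener call shown in this section. The function name `logSpeechEvents` and the `log` parameter are hypothetical. Because NoSpeech and NotUnderstood are described as included in Error, expect the Error listener to fire alongside them.

```javascript
// The four event names documented in this article.
const SPEECH_EVENTS = [
  "SpeechRecognition.Success",
  "SpeechRecognition.Error",
  "SpeechRecognition.NoSpeech",
  "SpeechRecognition.NotUnderstood",
];

// Hypothetical helper: logs a line whenever one of the events is fired.
// `scenarioEvents` is expected to be LB.scenarioEvents.
function logSpeechEvents(scenarioEvents, log = console.log) {
  SPEECH_EVENTS.forEach(function (eventName) {
    scenarioEvents.AddEventListener(eventName, function () {
      log("Speech recognition event: " + eventName);
    });
  });
}

// Possible usage inside a space, where the global LB exists:
//   logSpeechEvents(LB.scenarioEvents);
```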

Example CMP

You can download an example CMP here.

© 2019 LearnBrite – Commercial In Confidence

Trademarks & Copyrights are property of their respective owners. Pictures are indicative only & may not reflect final production.
