How to use Speech Recognition with Xamarin.Android
A few weeks back we looked at using the new Speech API that Apple added in its iOS 10 release. Google has something similar for Android that gives you some of the same opportunities for incorporating speech recognition into your Android app. In this article we'll examine how you can integrate speech recognition into your Xamarin.Android application using the Android.Speech API, building a FlexGrid whose filtering is driven by voice input.
Speech Recognition using Android.Speech
Unlike Apple's Speech API, which was added only recently, Android.Speech has been around for many years. It works somewhat similarly to Apple's solution in that it also uses an off-device server to handle recognition (at least by default). There isn't any element of interpretation in Google's implementation; it is more explicitly for converting speech to text and vice versa. Google and Xamarin both have documentation for Android.Speech that can be helpful when starting out.
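As a quick sanity check before going further, the Android.Speech namespace lets you ask whether any recognition service is installed at all. This is a small sketch rather than part of the sample itself, and it assumes it runs inside an Activity:
// requires using Android.Speech;
// returns true if at least one speech recognition service is available on this device
bool available = SpeechRecognizer.IsRecognitionAvailable(this);
if (!available)
{
    // no recognizer installed; hide or disable any speech-driven UI
}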
Using Speech Recognition on Android
Much like we did in our Xamarin.iOS sample, we'll modify the FullFilter sample so that it supports voice recognition on Xamarin.Android. We'll add a couple of new members to the sample: a record button and a bool that tracks the current recording state.
[Activity(Label = "Full Filter Sample", Icon = "@drawable/flexgrid_filter")]
public class FullFilterActivity : Activity
{
    private FlexGrid mGrid;
    private EditText mFilterText;
    private FilterCellFactory mFilterCellFactory;
    // new members for speech recognition
    private Button mRecordButton;
    private bool isRecording;
    ...
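For context, these fields still need to be wired up when the activity starts. Here's a minimal sketch of the top of OnCreate; the layout and resource ids (FullFilter, filterText, recordButton) are placeholders rather than names taken from the actual sample:
protected override void OnCreate(Bundle savedInstanceState)
{
    base.OnCreate(savedInstanceState);
    // placeholder resource names; substitute whatever your layout actually defines
    SetContentView(Resource.Layout.FullFilter);
    mFilterText = FindViewById<EditText>(Resource.Id.filterText);
    mRecordButton = FindViewById<Button>(Resource.Id.recordButton);
    isRecording = false;
    // ... microphone check and record button handler follow (shown next) ...
}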
Next we'll need to make some changes to the OnCreate method. We'll do some setup here for voice recognition, including checking for the existence of a mic on the device. Assuming that there is a microphone, we can set up a new voice intent, a RecognizerIntent. The RecognizerIntent is really the core of what converts speech to text. It takes a number of different parameters that determine how it records: the language, the prompt, the duration of silence before recording stops, and any extra languages to interpret. Once the intent is fully set up, we can call StartActivityForResult to eventually retrieve our result.
// checking for a microphone on the device
bool hasMicrophone = PackageManager.HasSystemFeature(Android.Content.PM.PackageManager.FeatureMicrophone);
if (!hasMicrophone)
{
    var micAlert = new AlertDialog.Builder(mRecordButton.Context);
    micAlert.SetTitle("Device doesn't have a mic for recording");
    micAlert.SetPositiveButton("OK", (sender, e) =>
    {
        return;
    });
    micAlert.Show();
}
else
{
    mRecordButton.Click += delegate
    {
        isRecording = !isRecording;
        if (isRecording)
        {
            // create the voice intent
            var voiceIntent = new Intent(RecognizerIntent.ActionRecognizeSpeech);
            voiceIntent.PutExtra(RecognizerIntent.ExtraLanguageModel, RecognizerIntent.LanguageModelFreeForm);
            // message shown in the recognizer's modal dialog
            voiceIntent.PutExtra(RecognizerIntent.ExtraPrompt, "Speak now");
            // end capturing speech if there is 3 seconds of silence
            voiceIntent.PutExtra(RecognizerIntent.ExtraSpeechInputCompleteSilenceLengthMillis, 3000);
            voiceIntent.PutExtra(RecognizerIntent.ExtraSpeechInputPossiblyCompleteSilenceLengthMillis, 3000);
            // keep listening for at least 30 seconds
            voiceIntent.PutExtra(RecognizerIntent.ExtraSpeechInputMinimumLengthMillis, 30000);
            // we only need the single best match for the filter text
            voiceIntent.PutExtra(RecognizerIntent.ExtraMaxResults, 1);
            // specify the language to recognize (the device default here); add others if desired
            voiceIntent.PutExtra(RecognizerIntent.ExtraLanguage, Java.Util.Locale.Default);
            StartActivityForResult(voiceIntent, 10);
        }
    };
}
...
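One note on languages: the sample above simply passes the device's default locale. If you want a specific language instead, the same extras accept a BCP-47 language tag; the snippet below is only an illustration using en-US and is not part of the original sample:
// request US English explicitly (BCP-47 tag) instead of the device default
voiceIntent.PutExtra(RecognizerIntent.ExtraLanguage, "en-US");
// preferred fallback language if the requested one isn't available
voiceIntent.PutExtra(RecognizerIntent.ExtraLanguagePreference, "en-US");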
After the intent has completed, the interpreted text will be retrievable in the OnActivityResult method. If speech was recognized, we'll get the text and pass it back to the textbox that handles filtering input. If nothing was recognized, we'll display a message saying as much.
protected override void OnActivityResult(int requestCode, Result result, Intent data)
{
    // 10 is the request code we passed to StartActivityForResult
    if (requestCode == 10)
    {
        if (result == Result.Ok)
        {
            var matches = data.GetStringArrayListExtra(RecognizerIntent.ExtraResults);
            if (matches.Count != 0)
            {
                // append the recognized speech to the filter textbox
                string textInput = mFilterText.Text + matches[0];
                mFilterText.Text = textInput;
            }
            else
                mFilterText.Text = "Nothing was recognized";
        }
    }
    base.OnActivityResult(requestCode, result, data);
}
That's it. At this point you should be able to run the modified FullFilter sample with speech recognition.
Wrap up
Adding this type of speech-to-text functionality makes it easier than ever for users to interact with the controls in their applications. As the pursuit of more natural forms of communication between computing devices and users continues to evolve, this type of functionality will become even more common. Right now, speech recognition is a convenience feature, though it is likely to become ever more prevalent.