Get started with Google's Text to Speech & Speech to Text in Android

Do you want your phone to read text aloud, or do you wonder how the Google mic works? This blog answers those questions and more. Android provides two related features: Text to Speech (TTS) and Speech to Text (STT). Text to Speech has been available since Android 1.6.

It's Time To Free Up Your Hands With Speech To Text

The Android platform's Text-to-Speech (TTS) capability lets your app produce speech output, while Speech to Text (STT) lets it accept speech input. TTS enables an Android device to "speak" text in various languages. Android-powered devices that support TTS normally ship with a built-in TTS engine (for example, Pico), but some devices with limited storage may lack the language-specific resource files.

As mentioned above, the TTS feature enables your Android device to "speak" text in different languages. To do this, the TTS engine needs to know which language to speak. According to Wikipedia, the supported languages include Cantonese (Hong Kong), Chinese (China), Dutch, English (India), English (United Kingdom), English (United States), French, German, Hindi, Indonesian, Italian, Japanese, Korean, Polish, Portuguese (Brazil), Russian, Spanish (Spain), Spanish (United States), Thai (Thailand), and Turkish (Turkey).
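
Android's `TextToSpeech.setLanguage` takes a `java.util.Locale`, so each language name above has to be expressed as a locale before the engine can use it. The following is a minimal sketch in plain Java that runs without Android. The class name `TtsLocales`, the helper `localeFor`, and the selection of language tags are illustrative assumptions, not part of the Android API.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps a few of the listed languages to java.util.Locale values.
final class TtsLocales {
    // BCP 47 language tags for some of the languages named in the text.
    static final Map<String, String> TAGS = new LinkedHashMap<>();
    static {
        TAGS.put("English (India)", "en-IN");
        TAGS.put("English (United Kingdom)", "en-GB");
        TAGS.put("English (United States)", "en-US");
        TAGS.put("Portuguese (Brazil)", "pt-BR");
        TAGS.put("Spanish (Spain)", "es-ES");
        TAGS.put("Hindi", "hi");
    }

    // Returns the Locale for a listed language name, or null if the name is unknown.
    static Locale localeFor(String languageName) {
        String tag = TAGS.get(languageName);
        return tag == null ? null : Locale.forLanguageTag(tag);
    }

    public static void main(String[] args) {
        // Print each language name next to the tag of the Locale it maps to.
        for (String name : TAGS.keySet()) {
            System.out.println(name + " -> " + localeFor(name).toLanguageTag());
        }
    }
}
```

In an Android app, the resulting `Locale` would be passed to `setLanguage`, whose return value can be compared with `TextToSpeech.LANG_MISSING_DATA` and `TextToSpeech.LANG_NOT_SUPPORTED` to detect a language the installed engine cannot speak.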

TTS is also known as "speech synthesis", the artificial production of human speech. It is used to turn written information into spoken information, especially in mobile applications such as voice-enabled e-mail and unified messaging.

With that in mind, let us look at the process of speech generation. Your application initiates speech generation by passing a string (a sequence of characters) to the speech synthesis framework. The framework sends the text to a speech synthesizer, which contains the executable code that manages communication between the framework and the platform's audio layer to produce speech through the audio hardware.
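
The workflow described above can be pictured with a toy model in plain Java. This is only a sketch: every name in it (`SpeechPipelineSketch`, `Framework`, `Synthesizer`, `drain`) is hypothetical and none of it is Android code. The `flush` flag loosely mirrors the difference between `TextToSpeech.QUEUE_FLUSH` and `TextToSpeech.QUEUE_ADD`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of the workflow; every name here is hypothetical, not an Android API.
final class SpeechPipelineSketch {
    // Stands in for the synthesizer that turns text into audio.
    interface Synthesizer {
        void synthesize(String text);
    }

    // Stands in for the speech synthesis framework that receives text from the app.
    static final class Framework {
        private final Deque<String> pending = new ArrayDeque<>();
        private final Synthesizer synthesizer;

        Framework(Synthesizer synthesizer) {
            this.synthesizer = synthesizer;
        }

        // The application hands over a string; flush=true drops anything still waiting.
        void speak(String text, boolean flush) {
            if (flush) {
                pending.clear();
            }
            pending.addLast(text);
        }

        // Forwards each queued utterance to the synthesizer in order.
        void drain() {
            while (!pending.isEmpty()) {
                synthesizer.synthesize(pending.pollFirst());
            }
        }
    }

    // Runs a small scenario and returns what the fake synthesizer received.
    static List<String> demo() {
        List<String> spoken = new ArrayList<>();
        Framework framework = new Framework(spoken::add);
        framework.speak("Hello", false);
        framework.speak("world", false);   // queued behind "Hello"
        framework.speak("Goodbye", true);  // replaces everything still pending
        framework.drain();
        return spoken;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[Goodbye]"
    }
}
```

Because the last request flushes the queue before anything was drained, only "Goodbye" reaches the synthesizer. With `flush` set to `false` on every call, all three strings would be forwarded in order.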

Figure 1 : Speech Synthesis Workflow

Application of TTS

The major applications of TTS are assistive technology that helps blind people hear written words, and automated attendants in telephone answering devices.

Speech recognition on Android, however, normally relies on an Internet connection. If your phone is not online around the clock, you may run into issues with speech input. To fix this, do one of the following:
• Subscribe to an active data plan so that Internet service is available.
• Enable offline speech recognition on your phone by changing the following settings:
1. Go to Settings > Input & Language.
2. Make sure the Google voice typing option is checked.

3. Now scroll to the SPEECH section and tap on Voice Search.

4. Tap on Offline speech recognition – Manage downloaded languages

5. Tap on All to download all the language packs.

If speech input still does not work, check the microphone permission: Security app > Permissions > select the app > grant the audio or mic permission from the list of options.

Note: The steps above vary from manufacturer to manufacturer.

Implementation Example for TTS
Let's look at an implementation example of text to speech and speech to text. The screenshot given below shows the resulting application, which has one text view, "@+id/textView", and two image buttons, "@+id/btnRecord" and "@+id/btnSpeak".

Now add the following code, which implements TextToSpeech.OnInitListener. OnInitListener is a callback interface that signals the completion of the TextToSpeech engine initialization.

// Imports for the Activity, the speech recognizer intent, and the TTS engine
import android.app.Activity;
import android.content.ActivityNotFoundException;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import android.speech.tts.TextToSpeech;
import android.view.View;
import android.widget.ImageButton;
import android.widget.TextView;
import android.widget.Toast;
import java.util.ArrayList;
import java.util.Locale;

public class MainActivity extends Activity {
    // Request code that identifies the speech recognition result
    protected static final int RESULT_SPEECH = 1;
    // btnSpeak starts speech recognition; btnRecord reads the recognized text aloud
    private ImageButton btnSpeak, btnRecord;
    // Displays the recognized text
    private TextView txtText;
    // Text-to-speech engine
    private TextToSpeech t1;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // Assumption: the layout file is named activity_main.xml
        setContentView(R.layout.activity_main);

        // Initialize the TextView and the two ImageButtons
        txtText = (TextView) findViewById(R.id.textView);
        btnSpeak = (ImageButton) findViewById(R.id.btnSpeak);
        btnRecord = (ImageButton) findViewById(R.id.btnRecord);

        // OnInitListener is invoked when the TextToSpeech engine finishes initializing
        t1 = new TextToSpeech(getApplicationContext(), new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(int status) {
                if (status != TextToSpeech.ERROR) {
                    // Assumption: Locale.US, chosen to match the "en-US" recognition language
                    t1.setLanguage(Locale.US);
                }
            }
        });

        // Start speech recognition when btnSpeak is tapped
        btnSpeak.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
                // The language model and the language are two separate extras
                intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
                intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en-US");
                try {
                    startActivityForResult(intent, RESULT_SPEECH);
                    // Clear the previously recognized text
                    txtText.setText("");
                } catch (ActivityNotFoundException a) {
                    // Show a Toast if the device has no speech recognizer
                    Toast.makeText(getApplicationContext(),
                            "Oops! Your device doesn't support Speech to Text",
                            Toast.LENGTH_SHORT).show();
                }
            }
        });
    }

    /** Receives the recognized speech as text and wires btnRecord to read it back aloud. */
    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        switch (requestCode) {
            case RESULT_SPEECH: {
                if (resultCode == RESULT_OK && null != data) {
                    final ArrayList<String> text = data.getStringArrayListExtra(
                            RecognizerIntent.EXTRA_RESULTS);
                    if (text == null || text.isEmpty()) {
                        break;
                    }
                    // Show the first candidate returned by the recognizer
                    txtText.setText(text.get(0));
                    // Convert the recognized text back to speech when btnRecord is tapped
                    btnRecord.setOnClickListener(new View.OnClickListener() {
                        @Override
                        public void onClick(View v) {
                            String toSpeak = text.get(0);
                            Toast.makeText(getApplicationContext(), toSpeak,
                                    Toast.LENGTH_SHORT).show();
                            // QUEUE_FLUSH replaces anything still waiting to be spoken
                            t1.speak(toSpeak, TextToSpeech.QUEUE_FLUSH, null);
                        }
                    });
                }
                break;
            }
        }
    }

    @Override
    protected void onDestroy() {
        // Release the TTS engine when the Activity goes away
        if (t1 != null) {
            t1.shutdown();
        }
        super.onDestroy();
    }
}
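
The recognizer hands its candidates back as an `ArrayList<String>`. Calling `toString()` on that list produces a bracketed, comma-separated string, so it is usually better to pick a single entry before passing it to `speak()`. The sketch below is plain Java and runs without Android. The class `RecognitionResults` and the method `bestMatch` are hypothetical helpers, and the sketch assumes the first usable entry is the one you want.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Android API.
final class RecognitionResults {
    // Returns the first non-blank candidate, trimmed, or "" when there is none.
    static String bestMatch(List<String> candidates) {
        if (candidates == null) {
            return "";
        }
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate.trim();
            }
        }
        return "";
    }

    public static void main(String[] args) {
        List<String> results = new ArrayList<>(Arrays.asList("hello world", "hello word"));
        System.out.println(results.toString());  // prints "[hello world, hello word]"
        System.out.println(bestMatch(results));  // prints "hello world"
    }
}
```

Returning an empty string instead of throwing keeps the click handler simple: the caller can skip speaking when the result is empty.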

Tips :
1. Do speak clearly and precisely
2. Do use voice commands often and wisely

We hope this blog was informative. An android is, by name, a robot, so it needs commands to work, and TTS and STT are the features that prove it. Remember that to synthesize speech intelligibly, the language you select must match the language of the text being synthesized. We hope you enjoy integrating this Android feature into your applications. Keep visiting for more updates on the courses.
