android

speech-recognition

android-4.2-jelly-bean

I have managed to get continuous speech recognition working (using the SpeechRecognizer class) as a service on all Android versions up to 4.1. My question concerns getting it working on versions 4.1 and 4.2 as it is known there is a problem in that the API doesn't do as documented in that a few seconds after voice recognition is started, if no voice input has been detected then it's as if the speech recogniser dies silently. (http://code.google.com/p/android/issues/detail?id=37883)

I have found a question which proposes a work-around to this problem (Voice Recognition stops listening after a few seconds), but I am unsure as how to implement the Handler required for this solution. I am aware of the 'beep' that will happen every few seconds that this workaround will cause, but getting continuous voice recognition is more important for me.

If anyone has any other alternative workarounds then I'd like to hear those too.

Solution 1

This is a work around for android version 4.1.1.

public class MyService extends Service
{
    protected AudioManager mAudioManager; 
    protected SpeechRecognizer mSpeechRecognizer;
    protected Intent mSpeechRecognizerIntent;
    protected final Messenger mServerMessenger = new Messenger(new IncomingHandler(this));

    protected boolean mIsListening;
    protected volatile boolean mIsCountDownOn;
    private boolean mIsStreamSolo;

    static final int MSG_RECOGNIZER_START_LISTENING = 1;
    static final int MSG_RECOGNIZER_CANCEL = 2;

    @Override
    public void onCreate()
    {
        super.onCreate();
        mAudioManager = (AudioManager) getSystemService(Context.AUDIO_SERVICE); 
        mSpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(this);
        mSpeechRecognizer.setRecognitionListener(new SpeechRecognitionListener());
        mSpeechRecognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                                         RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
                                         this.getPackageName());
    }

    protected static class IncomingHandler extends Handler
    {
        private WeakReference<MyService> mtarget;

        IncomingHandler(MyService target)
        {
            mtarget = new WeakReference<MyService>(target);
        }


        @Override
        public void handleMessage(Message msg)
        {
            final MyService target = mtarget.get();

            switch (msg.what)
            {
                case MSG_RECOGNIZER_START_LISTENING:

                    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.JELLY_BEAN)
                    {
                        // turn off beep sound  
                        if (!mIsStreamSolo)
                        {
                            mAudioManager.setStreamSolo(AudioManager.STREAM_VOICE_CALL, true);
                            mIsStreamSolo = true;
                        }
                    }
                     if (!target.mIsListening)
                     {
                         target.mSpeechRecognizer.startListening(target.mSpeechRecognizerIntent);
                         target.mIsListening = true;
                        //Log.d(TAG, "message start listening"); //$NON-NLS-1$
                     }
                     break;

                 case MSG_RECOGNIZER_CANCEL:
                    if (mIsStreamSolo)
                   {
                        mAudioManager.setStreamSolo(AudioManager.STREAM_VOICE_CALL, false);
                        mIsStreamSolo = false;
                   }
                      target.mSpeechRecognizer.cancel();
                      target.mIsListening = false;
                      //Log.d(TAG, "message canceled recognizer"); //$NON-NLS-1$
                      break;
             }
       } 
    } 

    // Count down timer for Jelly Bean work around
    protected CountDownTimer mNoSpeechCountDown = new CountDownTimer(5000, 5000)
    {

        @Override
        public void onTick(long millisUntilFinished)
        {
            // TODO Auto-generated method stub

        }

        @Override
        public void onFinish()
        {
            mIsCountDownOn = false;
            Message message = Message.obtain(null, MSG_RECOGNIZER_CANCEL);
            try
            {
                mServerMessenger.send(message);
                message = Message.obtain(null, MSG_RECOGNIZER_START_LISTENING);
                mServerMessenger.send(message);
            }
            catch (RemoteException e)
            {

            }
        }
    };

    @Override
    public void onDestroy()
    {
        super.onDestroy();

        if (mIsCountDownOn)
        {
            mNoSpeechCountDown.cancel();
        }
        if (mSpeechRecognizer != null)
        {
            mSpeechRecognizer.destroy();
        }
    }

    protected class SpeechRecognitionListener implements RecognitionListener
    {

        @Override
        public void onBeginningOfSpeech()
        {
            // speech input will be processed, so there is no need for count down anymore
            if (mIsCountDownOn)
            {
                mIsCountDownOn = false;
                mNoSpeechCountDown.cancel();
            }               
            //Log.d(TAG, "onBeginingOfSpeech"); //$NON-NLS-1$
        }

        @Override
        public void onBufferReceived(byte[] buffer)
        {

        }

        @Override
        public void onEndOfSpeech()
        {
            //Log.d(TAG, "onEndOfSpeech"); //$NON-NLS-1$
         }

        @Override
        public void onError(int error)
        {
            if (mIsCountDownOn)
            {
                mIsCountDownOn = false;
                mNoSpeechCountDown.cancel();
            }
             mIsListening = false;
             Message message = Message.obtain(null, MSG_RECOGNIZER_START_LISTENING);
             try
             {
                    mServerMessenger.send(message);
             }
             catch (RemoteException e)
             {

             }
            //Log.d(TAG, "error = " + error); //$NON-NLS-1$
        }

        @Override
        public void onEvent(int eventType, Bundle params)
        {

        }

        @Override
        public void onPartialResults(Bundle partialResults)
        {

        }

        @Override
        public void onReadyForSpeech(Bundle params)
        {
            if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.JELLY_BEAN)
            {
                mIsCountDownOn = true;
                mNoSpeechCountDown.start();

            }
            Log.d(TAG, "onReadyForSpeech"); //$NON-NLS-1$
        }

        @Override
        public void onResults(Bundle results)
        {
            //Log.d(TAG, "onResults"); //$NON-NLS-1$

        }

        @Override
        public void onRmsChanged(float rmsdB)
        {

        }

    }
}

02/16/2013 - Fix beep sound if you use Text To Speech in your app make sure to turn off Solo stream in onResults

Solution 2

If you really want to implement continuous listening without internet connection you need to consider third-party packages, one of them is CMUSphinx, check Pocketsphinx android demo for example how to listen for keyword efficiently in offline and react on the specific commands like a key phrase "oh mighty computer". The code to do that is simple:

you create a recognizer and just add keyword spotting search:

recognizer = defaultSetup()
        .setAcousticModel(new File(modelsDir, "hmm/en-us-semi"))
        .setDictionary(new File(modelsDir, "lm/cmu07a.dic"))
        .setKeywordThreshold(1e-5f)
        .getRecognizer();

recognizer.addListener(this);
recognizer.addKeywordSearch(KWS_SEARCH_NAME, KEYPHRASE);
switchSearch(KWS_SEARCH_NAME);

and define a listener:

@Override
public void onPartialResult(Hypothesis hypothesis) {
    String text = hypothesis.getHypstr();
    if (text.equals(KEYPHRASE))
      //  do something
} 

Solution 3

For any of you who are trying to silence the beep sound, regrading the @HoanNguyen answer which is very good but be careful as said in the api set setStreamSolo is cumulative so if there is in error in the speech recognition and on error is called(for example no internet connection) then setStremSolo true is called again and again which will result in your app silencing the whole phone (very bad)! the solution to that is to add the setStremMute(false) to the speechRecognizer onError.

Solution 4

check out my demo app : https://github.com/galrom/ContinuesVoiceRecognition

I recommand to use both PockeySphix and SpeechRecognizer.