Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - Highlighting the Text while Speech is Progressing

I'm developing an app that has a TextView containing a string, plus two buttons. When I click the Speak button, the text is converted to speech. I want to highlight each word as it is spoken.


This is my text-to-speech initialization:

textToSpeech = new TextToSpeech(this, new TextToSpeech.OnInitListener() {

        @Override
        public void onInit(int status) {

            if (status == TextToSpeech.SUCCESS) {
                result = textToSpeech.setLanguage(Locale.ENGLISH);
                textToSpeech.setOnUtteranceProgressListener(new UtteranceProgressListener() {
                    @Override
                    public void onStart(String utteranceId) {
                        Log.d(utteranceId, "TTS start");}

                    @Override
                    public void onDone(String utteranceId) {
                        Log.d(utteranceId, "TTS done");}

                    @Override
                    public void onError(String utteranceId) {
                        Log.d(utteranceId, "TTS error");}
                });
            } else {
                Toast.makeText(getApplicationContext(), "Feature is not Available", Toast.LENGTH_SHORT).show();
            }
        }
    });

And the rest of the code:

private void speak() {
    if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
        Toast.makeText(getApplicationContext(), "Feature is not Available", Toast.LENGTH_SHORT).show();
    } else {
        textToSpeech.setPitch(1f);
        textToSpeech.setSpeechRate(0.8f);
        HashMap<String, String> params = new HashMap<>();
        params.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, "utteranceId");
        textToSpeech.speak(getString(R.string.storytxt), TextToSpeech.QUEUE_FLUSH, params);

    }
}

@Override
protected void onDestroy() {
    super.onDestroy();
    if (textToSpeech != null) {
        textToSpeech.shutdown();
    }
}

Up to this point I have had no problems. Now I want to highlight the text, but I don't know how to do it. I've searched everywhere and still have no lead.

The string is stored in strings.xml.


1 Answer


For Android API 26 and above, with a TTS engine that supports onRangeStart (in this case, Google TTS):

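import android.graphics.Color;
import android.os.Bundle;
import android.speech.tts.TextToSpeech;
import android.speech.tts.UtteranceProgressListener;
import android.text.Spannable;
import android.text.SpannableString;
import android.text.Spanned;
import android.text.style.ForegroundColorSpan;
import android.util.Log;
import android.view.View;
import android.widget.TextView;

import androidx.appcompat.app.AppCompatActivity;

public class MainActivity extends AppCompatActivity implements TextToSpeech.OnInitListener {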

    TextToSpeech tts;

    String sentence = "The Quick Brown Fox Jumps Over The Lazy Dog.";

    TextView textView;

    @Override
    protected void onCreate(Bundle savedInstanceState) {

        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        textView = findViewById(R.id.textView);
        textView.setText(sentence);
        tts = new TextToSpeech(this, this);

    }

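    // TextToSpeech.OnInitListener (for our purposes, the "main method" of this activity)
    @Override
    public void onInit(int status) {

        // Without a working engine there is nothing to listen to.
        if (status != TextToSpeech.SUCCESS) {
            Log.e("XXX", "TTS initialization failed");
            return;
        }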

        tts.setOnUtteranceProgressListener(new UtteranceProgressListener() {

            @Override
            public void onStart(String utteranceId) {
                Log.i("XXX", "utterance started");
            }

            @Override
            public void onDone(String utteranceId) {
                Log.i("XXX", "utterance done");
            }

            @Override
            public void onError(String utteranceId) {
                Log.i("XXX", "utterance error");
            }

            @Override
            public void onRangeStart(String utteranceId,
                                     final int start,
                                     final int end,
                                     int frame) {
                Log.i("XXX", "onRangeStart() ... utteranceId: " + utteranceId + ", start: " + start
                        + ", end: " + end + ", frame: " + frame);

                // onRangeStart (and all UtteranceProgressListener callbacks) do not run on main thread
                // ... so we explicitly manipulate views on the main thread:
                runOnUiThread(new Runnable() {
                    @Override
                    public void run() {

                        Spannable textWithHighlights = new SpannableString(sentence);
                        textWithHighlights.setSpan(new ForegroundColorSpan(Color.YELLOW), start, end, Spanned.SPAN_INCLUSIVE_INCLUSIVE);
                        textView.setText(textWithHighlights);

                    }
                });

            }

        });

    }

    public void startClicked(View ignored) {

        tts.speak(sentence, TextToSpeech.QUEUE_FLUSH, null, "doesn't matter yet");

    }

}

// -------------------------------------------------------------------

Android API 25 and below:

In theory, the most intuitive way to accomplish this would be to:

1) Break the string into pieces

2) Detect when each piece has been/is being spoken

3) Highlight that piece accordingly

Unfortunately, the Android TextToSpeech class generates speech output in real time, and the smallest unit of speech whose progress you can detect precisely (using UtteranceProgressListener) is an utterance, that is, whatever string you sent to the TTS, which is not necessarily a single word.

On these API levels there is no mechanism for sending a multi-word string as an utterance and then detecting exactly when each word is spoken.

Therefore, to (easily) highlight each word in turn, you would have to either:

A) Send each word to the TTS individually as a single utterance (but this will cause disjointed pronunciation), or

B) Highlight sentence-by-sentence instead, sending each sentence as an utterance (easiest method, but not your desired behaviour).
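
As a rough illustration of option B, the sketch below splits a text into sentence ranges that could each be sent as one utterance. It uses only java.text.BreakIterator from the standard library; the class name SentenceRanges and its Range type are hypothetical names invented for this example, and real text (abbreviations, quotes) may be split differently than you expect.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: finds the character range of each sentence in a text. */
final class SentenceRanges {

    /** Character range [start, end) of one sentence inside the full text. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    static List<Range> split(String text, Locale locale) {
        List<Range> ranges = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);
        int start = it.first();
        int end = it.next();
        while (end != BreakIterator.DONE) {
            int s = start;
            int e = end;
            // Trim surrounding whitespace so the highlight hugs the sentence.
            while (s < e && Character.isWhitespace(text.charAt(s))) s++;
            while (e > s && Character.isWhitespace(text.charAt(e - 1))) e--;
            if (s < e) {
                ranges.add(new Range(s, e));
            }
            start = end;
            end = it.next();
        }
        return ranges;
    }

    public static void main(String[] args) {
        String text = "The fox jumps. The dog sleeps. The end.";
        List<Range> ranges = split(text, Locale.ENGLISH);
        for (int i = 0; i < ranges.size(); i++) {
            Range r = ranges.get(i);
            // The list index can double as the utterance ID for that sentence.
            System.out.println(i + ": " + text.substring(r.start, r.end));
        }
    }
}
```

One possible way to use this: queue each sentence with speak() and QUEUE_ADD, passing the list index as the utterance ID, then in onStart(utteranceId) parse the ID back to an index and apply the span over that range on the UI thread.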

If you insist on word-by-word highlighting, the only way I can think of (using Android TextToSpeech) is to use sentence-sized utterances but call synthesizeToFile() instead of speak(), then play the resulting audio file back with a media player and approximate the timing of the highlights from where the nth word lies relative to the total length of the audio. For example, if the sentence is 10 words long and playback is 30% complete, three words' worth of audio has elapsed, so you would highlight the 4th word. This would be difficult and inexact, but theoretically possible.
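
The position-to-word approximation can be sketched in plain Java as follows. This is only an illustration under a strong assumption, namely that every word takes the same share of the audio; the class WordProgressEstimator and its method are hypothetical names, and the playback fraction would have to come from your own player (for example, current position divided by duration).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: guesses which word is being spoken from playback progress. */
final class WordProgressEstimator {

    /**
     * Returns {start, end} (end exclusive) of the word assumed to be playing
     * at the given fraction of the audio, or null if the sentence has no words.
     */
    static int[] wordRangeAt(String sentence, double fraction) {
        List<int[]> words = new ArrayList<>();
        int i = 0;
        int n = sentence.length();
        while (i < n) {
            // Skip whitespace, then consume one whitespace-delimited word.
            while (i < n && Character.isWhitespace(sentence.charAt(i))) i++;
            int start = i;
            while (i < n && !Character.isWhitespace(sentence.charAt(i))) i++;
            if (i > start) {
                words.add(new int[] {start, i});
            }
        }
        if (words.isEmpty()) {
            return null;
        }
        // Clamp the fraction to [0, 1] and map it to a word index,
        // assuming (crudely) that all words have equal duration.
        double clamped = Math.max(0.0, Math.min(1.0, fraction));
        int index = Math.min((int) (clamped * words.size()), words.size() - 1);
        return words.get(index);
    }
}
```

The returned offsets can be passed straight to setSpan(). Weighting each word by its character count instead of treating words equally might track the audio more closely, but that is also only a guess about how the engine paces its speech.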

There are obviously apps and games that already do this, such as Parappa the Rapper or karaoke apps, but I think they rely on pre-recorded (static) audio files with markers encoded at exact times that trigger the highlights. If your text content is always going to be the same, and in only one language, you could do this too.

However, if the spoken text is user-entered or unknown until runtime, requiring a TTS, then I don't know of any straightforward solution.
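
For the pre-recorded case, a marker lookup might look like the sketch below. It assumes you have authored, by hand or with some tool, a strictly ascending array of start times in milliseconds, one per word; the class HighlightMarkers and the sample timings are made up for illustration and are not taken from any real app.

```java
import java.util.Arrays;

/** Hypothetical helper: maps a playback position to the word marker in effect. */
final class HighlightMarkers {

    /**
     * Returns the index of the last marker whose start time is at or before
     * positionMs, or -1 if playback has not reached the first marker yet.
     * startTimesMs must be sorted in strictly ascending order.
     */
    static int activeIndex(long[] startTimesMs, long positionMs) {
        int found = Arrays.binarySearch(startTimesMs, positionMs);
        // When the key is absent, binarySearch returns -(insertionPoint) - 1,
        // so the marker just before the insertion point is -found - 2.
        return found >= 0 ? found : -found - 2;
    }

    public static void main(String[] args) {
        long[] markers = {0L, 400L, 900L, 1500L}; // made-up word start times
        System.out.println(activeIndex(markers, 1000L)); // third word, index 2
    }
}
```

Polling the player's current position on a timer and highlighting the word at the returned index would give the karaoke-style effect, with accuracy limited by how often you poll.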

If you decide on one of these more narrowed-down approaches, then I would suggest posting a new question accordingly.

