2017-06-05

Android Bluemix not showing speaker labels

I am using IBM Bluemix to transcribe some audio, and I want to use the API's speaker recognition.

I set up the recognizer like this:

private RecognizeOptions getRecognizeOptions() { 
    return new RecognizeOptions.Builder() 
      .continuous(true) 
      .contentType(ContentType.OPUS.toString()) 
      //.model("en-US") 
      .model("en-US_BroadbandModel") 
      .timestamps(true) 
      .smartFormatting(true) 
      .interimResults(true) 
      .speakerLabels(true) 
      .build(); 
} 

However, the returned JSON does not include speaker labels. How can I get speaker labels returned with the Bluemix Java API on Android?

My recorder looks like this:

private void recordMessage() { 
    //mic.setEnabled(false); 
    speechService = new SpeechToText(); 
    speechService.setUsernameAndPassword("usr", "pwd"); 
    if(listening != true) { 
     capture = new MicrophoneInputStream(true); 
     new Thread(new Runnable() { 
      @Override public void run() { 
       try { 
        speechService.recognizeUsingWebSocket(capture, getRecognizeOptions(), new MicrophoneRecognizeDelegate()); 
       } catch (Exception e) { 
        showError(e); 
       } 
      } 
     }).start(); 
     Log.v("TAG",getRecognizeOptions().toString()); 
     listening = true; 
     Toast.makeText(MainActivity.this,"Listening....Click to Stop", Toast.LENGTH_LONG).show(); 
    } else { 
     try { 
      capture.close(); 
      listening = false; 
      Toast.makeText(MainActivity.this,"Stopped Listening....Click to Start", Toast.LENGTH_LONG).show(); 
     } catch (Exception e) { 
      e.printStackTrace(); 
     } 
    } 
} 
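
I think you mean he should add the speech-to-text tag rather than text-to-speech ;)

@bear What is the audio file, and which recognize method are you using? Are you using WebSockets?

@GermanAttanasio I am using the Watson Android audio streaming API; please see my updated code snippet. – bear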

Answer

Based on your example, I wrote a sample app and got speaker labels working. Please use Java SDK 4.2.1. Add the following to your build.gradle:

compile 'com.ibm.watson.developer_cloud:java-sdk:4.2.1' 

Below is a code snippet that uses WebSockets, interim results, and speaker labels to recognize a WAV file from the assets folder.

RecognizeOptions options = new RecognizeOptions.Builder() 
    .contentType("audio/wav") 
    .model(SpeechModel.EN_US_NARROWBANDMODEL.getName()) 
    .interimResults(true) 
    .speakerLabels(true) 
    .build(); 

SpeechToText service = new SpeechToText(); 
service.setUsernameAndPassword("SPEECH-TO-TEXT-USERNAME", "SPEECH-TO-TEXT-PASSWORD"); 

InputStream audio = loadInputStreamFromAssetFile("speaker_label.wav"); 

service.recognizeUsingWebSocket(audio, options, new BaseRecognizeCallback() { 
    @Override 
    public void onTranscription(SpeechResults speechResults) { 
     Assert.assertNotNull(speechResults); 
     System.out.println(speechResults.getResults().get(0).getAlternatives().get(0).getTranscript()); 
     System.out.println(speechResults.getSpeakerLabels()); 
    } 
}); 

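where loadInputStreamFromAssetFile() is:

// Define this inside an Activity (or another Context subclass) so that
// getAssets() is available; it is an instance method, so this helper cannot be static.
public InputStream loadInputStreamFromAssetFile(String fileName) {
    AssetManager assetManager = getAssets(); // From Context
    try {
        return assetManager.open(fileName);
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null; // The asset could not be opened
}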

Application log:

I/System.out: so how are you doing these days 
I/System.out: so how are you doing these days things are going very well glad to hear 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there's a company now that I'm 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there's a company now that I'm working with which is very much 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there's a company now that I'm working with which is very much just just myself and Chris now 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there's a company now that I'm working with which is very much just just myself and Chris now you had mentioned that %HESITATION okay 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there's a company now that I'm working with which is very much just just myself and Chris now you had mentioned that %HESITATION okay 
I/System.out: [{ 
I/System.out: "confidence": 0.487, 
I/System.out: "final": false, 
I/System.out: "from": 0.03, 
I/System.out: "speaker": 0, 
I/System.out: "to": 0.34 
I/System.out: }, { 
I/System.out: "confidence": 0.487, 
I/System.out: "final": false, 
I/System.out: "from": 0.34, 
I/System.out: "speaker": 0, 
I/System.out: "to": 0.54 
I/System.out: }, { 
I/System.out: "confidence": 0.487, 
I/System.out: "final": false, 
I/System.out: "from": 0.54, 
I/System.out: "speaker": 0, 
I/System.out: "to": 0.63 
I/System.out: }, { 
...... blah blah blah 
I/System.out: }, { 
I/System.out: "confidence": 0.343, 
I/System.out: "final": false, 
I/System.out: "from": 13.39, 
I/System.out: "speaker": 1, 
I/System.out: "to": 13.84 
I/System.out: }]
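
The speaker labels in the log above arrive as many short time spans, several in a row for the same speaker. If you want one entry per speaker turn, a rough sketch like the following could merge consecutive spans. It uses plain Java only; `SpeakerTurns`, `Label`, `Turn`, and `mergeTurns` are hypothetical names made up for this illustration and are not part of the Watson SDK. You would have to copy the `from`, `to`, and `speaker` values out of the SDK's result objects yourself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SpeakerTurns {

    /** Hypothetical holder for one speaker-label entry (from, to, speaker). */
    public static final class Label {
        public final double from;
        public final double to;
        public final int speaker;

        public Label(double from, double to, int speaker) {
            this.from = from;
            this.to = to;
            this.speaker = speaker;
        }
    }

    /** Hypothetical holder for a merged run of labels from a single speaker. */
    public static final class Turn {
        public final int speaker;
        public final double from;
        public double to;

        Turn(int speaker, double from, double to) {
            this.speaker = speaker;
            this.from = from;
            this.to = to;
        }
    }

    /** Merges consecutive labels that share a speaker into one turn each. */
    public static List<Turn> mergeTurns(List<Label> labels) {
        List<Turn> turns = new ArrayList<>();
        Turn current = null;
        for (Label label : labels) {
            if (current != null && current.speaker == label.speaker) {
                current.to = label.to; // Same speaker: extend the open turn
            } else {
                current = new Turn(label.speaker, label.from, label.to);
                turns.add(current);    // Speaker changed: start a new turn
            }
        }
        return turns;
    }

    public static void main(String[] args) {
        // Values copied from the log above; the elided middle entries are left out.
        List<Label> labels = new ArrayList<>();
        labels.add(new Label(0.03, 0.34, 0));
        labels.add(new Label(0.34, 0.54, 0));
        labels.add(new Label(0.54, 0.63, 0));
        labels.add(new Label(13.39, 13.84, 1));

        for (Turn turn : mergeTurns(labels)) {
            System.out.println(String.format(Locale.US,
                    "speaker %d: %.2f to %.2f", turn.speaker, turn.from, turn.to));
        }
    }
}
```

Note that the labels in the log are all marked "final": false, so with interim results enabled it is probably safer to rebuild the turns each time a new set of labels arrives rather than treating earlier ones as settled.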