2014-09-03 86 views

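Sphinx4 - IllegalArgumentException

I have created all the files needed to run Sphinx4 (language model, dictionary, and acoustic model). However, when I run it in Eclipse, the following exception is thrown: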

00:16:12.707 INFO unitManager   CI Unit: AE 
00:16:12.713 INFO unitManager   CI Unit: AH 
00:16:12.714 INFO unitManager   CI Unit: B 
00:16:12.714 INFO unitManager   CI Unit: EY 
00:16:12.715 INFO unitManager   CI Unit: F 
00:16:12.715 INFO unitManager   CI Unit: IY 
00:16:12.716 INFO unitManager   CI Unit: JH 
00:16:12.716 INFO unitManager   CI Unit: L 
00:16:12.717 INFO unitManager   CI Unit: M 
00:16:12.722 INFO autoCepstrum   Cepstrum component auto-configured as follows: autoCepstrum {MelFrequencyFilterBank, DiscreteCosineTransform} 
00:16:12.853 INFO dictionary   Loading dictionary from: file:Alphabets/tutorial/alphabets/etc/alphabets.dic 
00:16:12.853 INFO dictionary   Loading filler dictionary from: file:Alphabets/tutorial/alphabets/model_parameters/alphabets.ci_cont/noisedict 
00:16:12.854 INFO acousticModelLoader Loading tied-state acoustic model from: file:Alphabets/tutorial/alphabets/model_parameters/alphabets.ci_cont 
00:16:12.854 INFO acousticModelLoader Pool means Entries: 30 
00:16:12.855 INFO acousticModelLoader Pool variances Entries: 30 
00:16:12.855 INFO acousticModelLoader Pool transition_matrices Entries: 10 
00:16:12.855 INFO acousticModelLoader Pool senones Entries: 30 
00:16:12.855 INFO acousticModelLoader Pool mixture_weights Entries: 30 
00:16:12.856 INFO acousticModelLoader Pool senones Entries: 30 
00:16:12.856 INFO acousticModelLoader Context Independent Unit Entries: 10 
00:16:12.856 INFO acousticModelLoader HMM Manager: 10 hmms 
00:16:12.860 INFO acousticModel  CompositeSenoneSequences: 0 
00:16:12.861 INFO largeTrigramModel Loading n-gram language model from: file:Alphabets/tutorial/alphabets/etc/alphabets.lm.dmp 
00:16:12.867 INFO largeTrigramModel 1-grams: 3 
00:16:12.867 INFO largeTrigramModel 2-grams: 1 
00:16:12.867 INFO largeTrigramModel 3-grams: 1 
00:16:13.094 INFO lexTreeLinguist  Max CI Units 11 
00:16:13.095 INFO lexTreeLinguist  Unit table size 1331 
Exception in thread "main" java.lang.IllegalArgumentException 
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:111) 
    at edu.cmu.sphinx.linguist.WordSequence.getWord(WordSequence.java:179) 
    at edu.cmu.sphinx.linguist.language.ngram.large.LargeNGramModel.getNGramProbDepth(LargeNGramModel.java:409) 
    at edu.cmu.sphinx.linguist.language.ngram.large.LargeNGramModel.getNGramProbDepth(LargeNGramModel.java:412) 
    at edu.cmu.sphinx.linguist.language.ngram.large.LargeNGramModel.getNGramProbDepth(LargeNGramModel.java:412) 
    at edu.cmu.sphinx.linguist.language.ngram.large.LargeNGramModel.getProbDepth(LargeNGramModel.java:393) 
    at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist$LexTreeState.createWordStateArc(LexTreeLinguist.java:720) 
    at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist$LexTreeWordState.getSuccessors(LexTreeLinguist.java:1491) 
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.collectSuccessorTokens(WordPruningBreadthFirstSearchManager.java:635) 
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.growBranches(WordPruningBreadthFirstSearchManager.java:387) 
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.localStart(WordPruningBreadthFirstSearchManager.java:359) 
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.startRecognition(WordPruningBreadthFirstSearchManager.java:262) 
    at edu.cmu.sphinx.decoder.Decoder.decode(Decoder.java:62) 
    at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:109) 
    at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:125) 
    at edu.cmu.sphinx.api.AbstractSpeechRecognizer.getResult(AbstractSpeechRecognizer.java:50) 
    at Main.main(Main.java:30) 

This is the program I am running, taken from the official website:

import java.io.IOException;
import java.util.Scanner;

import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.LiveSpeechRecognizer;
import edu.cmu.sphinx.api.SpeechResult;

public class Main {

    public static void main(String[] args) {

        Configuration configuration = new Configuration();

        configuration.setAcousticModelPath("Alphabets/tutorial/alphabets/model_parameters/alphabets.ci_cont");

        configuration.setDictionaryPath("Alphabets/tutorial/alphabets/etc/alphabets.dic");

        configuration.setLanguageModelPath("Alphabets/tutorial/alphabets/etc/alphabets.lm.dmp");

        LiveSpeechRecognizer recognizer = null;
        try {
            recognizer = new LiveSpeechRecognizer(configuration);
        } catch (IOException e) {
            e.printStackTrace();
        }
        recognizer.startRecognition(true);

        SpeechResult result = recognizer.getResult();

        recognizer.stopRecognition();

        System.out.println(result.getHypothesis());
        result.getLattice().dumpDot("lattice.dot", "lattice");

    }
}

Any help is greatly appreciated!


Just to add: I have also tried feeding it a WAV file instead of the microphone, and the same problem occurs. – coding4fun 2014-09-04 20:25:57


Please share your files. – 2014-09-05 06:05:41


Here is the link to my Eclipse project: https://drive.google.com/file/d/0B2hLLX_4snjbNDFWRE8tNkNYYjg/edit?usp=sharing Thanks! – coding4fun 2014-09-05 11:20:18

Answers


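Your language model /Alphabets/tutorial/alphabets/etc/alphabets.lm.dmp is in the text ARPA format, but you gave it a dmp extension. That manual edit confuses the recognizer: your log shows the file being loaded by largeTrigramModel, which appears to be the loader for binary DMP models rather than text ones. To fix the problem, rename alphabets.lm.dmp to alphabets.lm, dropping the dmp extension, and update the name in your code. Just use: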

configuration.setLanguageModelPath("Alphabets/tutorial/alphabets/etc/alphabets.lm");
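If you are not sure which format a language model file is really in, you can check it before choosing the extension. The sketch below is only an illustration and is not part of Sphinx4: the class and method names are hypothetical, and the 200-line scan limit is an arbitrary assumption. It relies on the fact that a text ARPA file contains a line consisting of the `\data\` section marker near the top, which a binary file normally will not.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of Sphinx4.
public class LanguageModelFormatCheck {

    // Assumption: the "\data\" marker appears within the first 200 lines.
    private static final int MAX_LINES = 200;

    /** Returns true if one of the first MAX_LINES lines is the ARPA "\data\" marker. */
    static boolean hasArpaHeader(Stream<String> lines) {
        return lines.limit(MAX_LINES)
                    .anyMatch(line -> line.trim().equals("\\data\\"));
    }

    /** Checks a file on disk for the ARPA header. */
    static boolean looksLikeArpa(Path file) throws IOException {
        // ISO_8859_1 maps every byte to a character, so binary content
        // cannot trigger a decoding error while we scan it.
        try (Stream<String> lines = Files.lines(file, StandardCharsets.ISO_8859_1)) {
            return hasArpaHeader(lines);
        }
    }

    /** Drops a trailing ".dmp" from "*.lm.dmp" when the content is text ARPA. */
    static String suggestName(String fileName, boolean isArpa) {
        if (isArpa && fileName.endsWith(".lm.dmp")) {
            return fileName.substring(0, fileName.length() - ".dmp".length());
        }
        return fileName;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: LanguageModelFormatCheck <language-model-file>");
            return;
        }
        Path file = Paths.get(args[0]);
        boolean arpa = looksLikeArpa(file);
        System.out.println(file + (arpa ? " looks like text ARPA" : " does not look like text ARPA"));
        System.out.println("suggested name: " + suggestName(file.getFileName().toString(), arpa));
    }
}
```

Running it on Alphabets/tutorial/alphabets/etc/alphabets.lm.dmp should tell you whether the file is text ARPA and, if it is, suggest the name alphabets.lm.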

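You also do not have enough data to train the model, so the model will not work. Training requires a large amount of data. You can find the details in the acoustic model training tutorial: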

http://cmusphinx.sourceforge.net/wiki/tutorialam