2009-11-05 53 views
5

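Streaming input to System.Speech.Recognition.SpeechRecognitionEngine

I'm trying to do "streaming" speech recognition in C# from a TCP socket. The problem I'm running into is that SpeechRecognitionEngine.SetInputToAudioStream() seems to require a Stream of defined length that supports seeking. Right now the only way I can think of to make this work is to repeatedly run the recognizer on a MemoryStream as more input comes in.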

Here is some code to illustrate:

SpeechRecognitionEngine appRecognizer = new SpeechRecognitionEngine();

System.Speech.AudioFormat.SpeechAudioFormatInfo formatInfo = new System.Speech.AudioFormat.SpeechAudioFormatInfo(8000, System.Speech.AudioFormat.AudioBitsPerSample.Sixteen, System.Speech.AudioFormat.AudioChannel.Mono);

NetworkStream stream = new NetworkStream(socket, true);
appRecognizer.SetInputToAudioStream(stream, formatInfo);
// The line above throws a NotSupportedException: "This stream does not support seek operations."

Does anyone know how to get around this? It must support streaming input of some kind, since it works fine with the microphone using SetInputToDefaultAudioDevice().

Thanks, Sean

+0

Maybe 'SetInputToDefaultAudioDevice()' is Microsoft "black magic" (as usual), or it does some kind of batching like you suggest. – 2009-11-05 19:21:13

Answers

2

Have you tried wrapping the network stream in a System.IO.BufferedStream?

NetworkStream netStream = new NetworkStream(socket, true);
// The buffer size is in bytes: 8000 samples/s * 2 bytes per 16-bit sample * 1 channel = 1 second of audio
BufferedStream buffStream = new BufferedStream(netStream, 8000 * 2 * 1);
appRecognizer.SetInputToAudioStream(buffStream, formatInfo);
+1

Just tried it, and I get the same error. – spurserh 2009-11-05 21:45:18

+0

Have you verified that the buffered stream supports seeking? That is, in the code above, does buffStream.CanSeek return true? – 2009-11-16 20:11:09
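The check asked about above can be tried without a socket. As far as I know, BufferedStream reports the seekability of the stream it wraps, so wrapping a non-seekable stream should not make it seekable. The sketch below assumes a hypothetical ForwardOnlyStream class as a stand-in for NetworkStream; the class and the helper name BufferedCanSeek are illustrative and not part of any library.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for a NetworkStream: readable, but not seekable.
class ForwardOnlyStream : Stream
{
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) { return 0; }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

static class CanSeekCheck
{
    // Wraps `inner` in a BufferedStream and reports whether the wrapper can seek.
    // Disposing the wrapper also closes `inner`.
    public static bool BufferedCanSeek(Stream inner)
    {
        using (var buffered = new BufferedStream(inner, 16000))
        {
            return buffered.CanSeek;
        }
    }
}
```

Calling CanSeekCheck.BufferedCanSeek(new ForwardOnlyStream()) returns false, while passing a MemoryStream returns true, which is consistent with the BufferedStream wrapper producing the same NotSupportedException.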

1

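I ended up buffering the input and then sending it to the speech recognition engine in successively larger chunks. For example, I might first send the first 0.25 seconds, then the first 0.5 seconds, then the first 0.75 seconds, and so on until I get a result. I'm not sure this is the most efficient approach, but it has produced satisfactory results for me.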
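A minimal sketch of that growing-prefix loop, under stated assumptions: the audio received so far is held in a byte array, and the `recognize` delegate is a hypothetical stand-in for one recognition pass over a finite, seekable stream (it returns null when nothing was recognized). The names GrowingPrefix and RecognizeIncrementally are illustrative only.

```csharp
using System;
using System.IO;

static class GrowingPrefix
{
    // Runs `recognize` on the first 0.25 s, 0.5 s, 0.75 s, ... of `audio`
    // and returns the first non-null result, or null if no prefix yields one.
    public static string RecognizeIncrementally(
        byte[] audio, int bytesPerSecond, int blockAlign, Func<Stream, string> recognize)
    {
        // 0.25 s of audio, rounded down to a whole number of sample frames.
        int step = bytesPerSecond / 4;
        step -= step % blockAlign;
        if (step <= 0)
            throw new ArgumentException("bytesPerSecond is too small");

        for (int length = step; length <= audio.Length; length += step)
        {
            // Read-only, seekable view over the first `length` bytes.
            using (var prefix = new MemoryStream(audio, 0, length, false))
            {
                string text = recognize(prefix);
                if (text != null)
                    return text;
            }
        }
        return null; // no prefix produced a result
    }
}
```

For 8000 Hz, 16-bit mono audio, bytesPerSecond would be 16000 and blockAlign 2, which gives 4000-byte steps. With System.Speech, the delegate would presumably call SetInputToAudioStream(prefix, formatInfo) followed by Recognize() on an engine that already has a grammar loaded, and return the result's Text, or null when Recognize() returns null. Every pass re-processes the audio from the start, so the cost grows with each step.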

Good luck, Sean

+0

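I'm also having trouble with SAPI and MemoryStreams. I can't get it to work, even though it works fine from the default input or from a file. When you say you got it working with buffering, do you mean you used the BufferedStream approach Serguei suggested, or that you just hold off recognition until the MemoryStream is larger? I've tried both without success. Do you use the SpeechHypothesized and SpeechRecognized events, or do you force a RecognitionResult rr = recognizer.Recognize()? Could you post more code to help? It would be much appreciated. – timemirror 2012-08-23 09:28:52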

9

I got live speech recognition working by overriding the Stream class:

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

class SpeechStreamer : Stream
{
    private AutoResetEvent _writeEvent;
    private List<byte> _buffer;
    private int _buffersize;
    private int _readposition;
    private int _writeposition;
    private bool _reset; // true while the writer has wrapped around and the reader has not

    public SpeechStreamer(int bufferSize)
    {
        _writeEvent = new AutoResetEvent(false);
        _buffersize = bufferSize;
        _buffer = new List<byte>(_buffersize);
        for (int i = 0; i < _buffersize; i++)
            _buffer.Add(new byte());
        _readposition = 0;
        _writeposition = 0;
    }

    public override bool CanRead
    {
        get { return true; }
    }

    public override bool CanSeek
    {
        get { return false; }
    }

    public override bool CanWrite
    {
        get { return true; }
    }

    // Never report a real length.
    public override long Length
    {
        get { return -1L; }
    }

    public override long Position
    {
        get { return 0L; }
        set { }
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        return 0L;
    }

    public override void SetLength(long value)
    {
    }

    // Blocks until `count` bytes have been read, so the caller never sees a short read
    // while the stream is open.
    public override int Read(byte[] buffer, int offset, int count)
    {
        int i = 0;
        while (i < count && _writeEvent != null)
        {
            if (!_reset && _readposition >= _writeposition)
            {
                // Nothing new to read yet: wait up to 100 ms for a write, then re-check.
                _writeEvent.WaitOne(100, true);
                continue;
            }
            // `offset` indexes the caller's buffer, not the ring buffer.
            buffer[offset + i] = _buffer[_readposition];
            _readposition++;
            if (_readposition == _buffersize)
            {
                _readposition = 0;
                _reset = false;
            }
            i++;
        }

        // Equals count unless the stream was closed in the middle of the read.
        return i;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        for (int i = offset; i < offset + count; i++)
        {
            _buffer[_writeposition] = buffer[i];
            _writeposition++;
            if (_writeposition == _buffersize)
            {
                _writeposition = 0;
                _reset = true;
            }
        }
        // Wake a reader that is waiting for data.
        _writeEvent.Set();
    }

    public override void Close()
    {
        _writeEvent.Close();
        _writeEvent = null;
        base.Close();
    }

    public override void Flush()
    {
    }
}

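...and use an instance of that as the stream input to the SetInputToAudioStream method. As soon as the stream returns a length, or a read returns fewer bytes than requested, the recognition engine thinks the input has finished. This class sets up a circular buffer that never finishes.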
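A minimal sketch of how such a stream might be wired up with event-based recognition. It is untested against a live engine and rests on several assumptions: `audioBuffer` is a blocking stream such as the SpeechStreamer above, the audio is 8000 Hz 16-bit mono as in the question, and a DictationGrammar is an acceptable grammar. The class name LiveRecognition and its methods are illustrative only.

```csharp
using System;
using System.IO;
using System.Speech.AudioFormat;
using System.Speech.Recognition;
using System.Threading;

static class LiveRecognition
{
    // Copies audio from `source` (for example a NetworkStream) into `sink`
    // until the source reports end of stream.
    public static void Pump(Stream source, Stream sink)
    {
        byte[] chunk = new byte[1600]; // 0.1 s of 8000 Hz, 16-bit mono audio
        int read;
        while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
            sink.Write(chunk, 0, read);
    }

    // Starts continuous recognition on `audioBuffer` and feeds it from `network`
    // on a background thread. Assumption: reads on `audioBuffer` block until data is written.
    public static SpeechRecognitionEngine Start(Stream network, Stream audioBuffer)
    {
        var format = new SpeechAudioFormatInfo(8000, AudioBitsPerSample.Sixteen, AudioChannel.Mono);
        var engine = new SpeechRecognitionEngine();
        engine.LoadGrammar(new DictationGrammar());

        // Results arrive through events instead of a blocking Recognize() call.
        engine.SpeechHypothesized += (s, e) => Console.WriteLine("partial: " + e.Result.Text);
        engine.SpeechRecognized += (s, e) => Console.WriteLine("final: " + e.Result.Text);

        engine.SetInputToAudioStream(audioBuffer, format);
        engine.RecognizeAsync(RecognizeMode.Multiple);

        var pump = new Thread(() => Pump(network, audioBuffer));
        pump.IsBackground = true;
        pump.Start();
        return engine;
    }
}
```

The buffer passed in should be large enough that the network writer does not lap the recognizer's reads, since a circular buffer overwrites unread data once the writer wraps around.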

+0

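Hi Sean, I've been trying to get your solution working, but so far I haven't managed it. As with others, everything works fine from a file on disk, but it just doesn't work with a MemoryStream. Do you issue recognition requests periodically, or are you able to use the SpeechHypothesized and SpeechRecognized events? Could you post more code to help? Thanks! – timemirror 2012-08-23 09:23:00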

+0

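Sorry, I missed your question; there you go. With this I can do real-time speech recognition on audio streamed over the network (part of my open source project ispy - http://www.ispyconnect.com) – Sean 2012-10-23 04:51:11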

+0

Thanks Sean... great project. – timemirror 2012-10-24 11:59:48

1

Here is my solution.

using System;
using System.IO;
using System.Net.Sockets;

class FakeStreamer : Stream
{
    // Set to true from another thread to make a pending Read return.
    public volatile bool bExit = false;
    Stream stream;
    TcpClient client;

    public FakeStreamer(TcpClient client)
    {
        this.client = client;
        this.stream = client.GetStream();
        this.stream.ReadTimeout = 100; // 100 ms
    }

    public override bool CanRead
    {
        get { return stream.CanRead; }
    }

    public override bool CanSeek
    {
        get { return false; }
    }

    public override bool CanWrite
    {
        get { return stream.CanWrite; }
    }

    public override long Length
    {
        get { return -1L; }
    }

    public override long Position
    {
        get { return 0L; }
        set { }
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        return 0L;
    }

    public override void SetLength(long value)
    {
        stream.SetLength(value);
    }

    // Keeps reading from the socket until `count` bytes have arrived,
    // so the recognizer does not see a short read while audio is still coming.
    public override int Read(byte[] buffer, int offset, int count)
    {
        int len = 0, c = count;
        while (c > 0 && !bExit)
        {
            try
            {
                len = stream.Read(buffer, offset, c);
            }
            catch (Exception e)
            {
                // A read timeout surfaces as an IOException wrapping a SocketException.
                SocketException se = e.InnerException as SocketException;
                if (e is IOException && se != null && se.SocketErrorCode == SocketError.TimedOut)
                {
                    continue; // no data yet, keep waiting
                }
                else
                {
                    // Exit read loop
                    break;
                }
            }
            if (!client.Connected || len == 0)
            {
                // Connection closed: exit read loop
                return 0;
            }
            offset += len;
            c -= len;
        }
        // Bytes actually read; less than count only after bExit or an error.
        return count - c;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        stream.Write(buffer, offset, count);
    }

    public override void Close()
    {
        stream.Close();
        base.Close();
    }

    public override void Flush()
    {
        stream.Flush();
    }
}

How to use:

//client connect in 
TcpClient clientSocket = ServerSocket.AcceptTcpClient(); 
FakeStreamer buffStream = new FakeStreamer(clientSocket); 
... 
//recognizer init 
m_recognizer.SetInputToAudioStream(buffStream , audioFormat); 
... 
//recognizer end 
if (buffStream != null) 
    buffStream.bExit = true;