How to make sound signals the same length in MATLAB?

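I found this speech recognition code, which I downloaded from a blog. It works fine: it asks you to record sounds to create a dataset, and then you have to call a function that trains the system using a neural network.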

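I want to use this code to train on a dataset of the 20 words I want to recognize.

Problem: I have a dataset of 800 files covering 20 words, i.e. 40 recordings of each word from different people. I collected the files with Windows Sound Recorder. The problem is that the code always assumes an input of 8000 samples, whereas my recordings are not of constant length: some files are 2 seconds long and some are 3, so the number of samples differs from file to file.

If the number of samples varies between input signals, it will probably cause errors. I want to train the system with my own files. How can I do that?

Code: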

clc;clear all; 
load('voicetrainfinal.mat'); 
Fs=8000; 
for l=1:20 
clear y1 y2 y3; 
display('record voice'); 
pause(); 
x=wavrecord(Fs,Fs);  % wavrecord(n,Fs) records n samples at a sampling rate of Fs 
maxval = max(x); 
if maxval<0.04 
    display('Threshold value is too large!'); 
end 
t=0.04; 
j=1; 
for i=1:8000 
    if(abs(x(i))>t) 
     y1(j)=x(i); 
     j=j+1; 
    end 
end 
y2=y1/(max(abs(y1))); 
y3=[y2,zeros(1,3120-length(y2))]; 
y=filter([1 -0.9],1,y3');%high pass filter to boost the high frequency components 
%%frame blocking 
blocklen=240;%30ms block 
overlap=80; 
block(1,:)=y(1:240); 
for i=1:18 
    block(i+1,:)=y(i*160:(i*160+blocklen-1)); 
end 
w=hamming(blocklen); 
for i=1:19 
    a=xcorr((block(i,:).*w'),12);%finding auto correlation from lag -12 to 12 
    for j=1:12 
     auto(j,:)=fliplr(a(j+1:j+12));%forming autocorrelation matrix from lag 0 to 11 
    end 
    z=fliplr(a(1:12));%forming a column matrix of autocorrelations for lags 1 to 12 
    alpha=pinv(auto)*z'; 
    lpc(:,i)=alpha; 
end 
wavplay(x,Fs); 
X1=reshape(lpc,1,228); 
a1=sigmoid(Theta1*[1;X1']); 
    h=sigmoid(Theta2*[1;a1]); 
    m=max(h); 
    p1=find(h==m); 
    if(p1==10) 
     P=0 
    else 
     P=p1 
    end 
end

Source

2016-01-24 Mughees Ismail

[How to train on and make a serialized feature vector for a neural network?](http://stackoverflow.com/questions/19419098/how-to-train-on-and-make-a-serialized-feature-vector-for-a-neural-network)

In your code you have:

Fs=8000; 
wavrecord(n,Fs) % records n samples at a sampling rate Fs 
for i=1:8000 
    if(abs(x(i))>t) 
     y1(j)=x(i); 
     j=j+1; 
    end 
end

It seems that, instead of recording, you want to import your sound files (here, .wav files):

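[x, Fs] = wavread(filename);  % read into x, the name the loop below uses (audioread in newer MATLAB releases)

Instead of hard-coding the value 8000, you can read the length of the file:

n = length(x);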

and then simply use this variable n in the for loop:

for i=1:n 
    if(abs(x(i))>t) 
     y1(j)=x(i); 
     j=j+1; 
    end 
end
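As a side note, the same thresholding can be written without a loop. The following is only a sketch on a synthetic signal (the test tone and its length are assumptions made for illustration, not part of the original code); it keeps exactly the samples the loop above would keep:

```matlab
% Synthetic stand-in for one recording (assumption: mono column vector in [-1, 1]).
Fs = 8000;
x  = [zeros(100,1); 0.5*sin(2*pi*440*(0:199)'/Fs); zeros(100,1)];

t    = 0.04;           % same amplitude threshold as in the question
keep = abs(x) > t;     % logical mask: true where the sample exceeds the threshold
y1   = x(keep).';      % selected samples as a row vector, like the loop builds
```

Because the mask is computed over the whole vector, the signal length never has to be written out explicitly.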

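The rest of the code seems to be independent of the value 8000. If you are worried that the file lengths are not constant, compute the maximum length over all of your audio recordings, call it n_max. Recordings shorter than n_max are then padded with zeros so that they are all n_max samples long.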

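n_max = 0; 
files = {'file1', 'file2', 'filen'};  % placeholder names: list every recording here 
for k = 1:numel(files) 
    [y, Fs] = wavread(files{k});      % read the k-th recording 
    n_max = max(n_max, length(y));    % keep the longest length seen so far 
end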
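If the recordings all sit in one folder, the file list does not have to be typed by hand. Here is a rough sketch, assuming a MATLAB release that provides `audiowrite` and `audioinfo`; the temporary folder and the two test tones are made up so that the snippet runs on its own:

```matlab
% Write two synthetic recordings (2 s and 3 s) into a temporary folder.
% In practice, point wavDir at the folder holding your dataset instead.
Fs = 8000;
wavDir = tempname;
mkdir(wavDir);
audiowrite(fullfile(wavDir, 'a.wav'), 0.5*sin(2*pi*440*(0:2*Fs-1)'/Fs), Fs);
audiowrite(fullfile(wavDir, 'b.wav'), 0.5*sin(2*pi*440*(0:3*Fs-1)'/Fs), Fs);

files = dir(fullfile(wavDir, '*.wav'));   % every .wav file in the folder
n_max = 0;
for k = 1:numel(files)
    info  = audioinfo(fullfile(wavDir, files(k).name));
    n_max = max(n_max, info.TotalSamples);  % sample count from the header, no full read
end
```

This only works as intended if all files share the same sampling rate; otherwise equal sample counts would not mean equal durations.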

Then, each time you process a sound vector, you can pad it with zeros (this is harmless, since 0 represents silence) like this:

y = [y(:).', zeros(1, n_max - length(y))];  % y(:).' forces a row (wavread returns a column) before appending zeros
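If the network expects one fixed length (8000 in the question) rather than the longest recording, padding alone is not enough for longer files. One possible sketch is below; `fix_length` is a hypothetical helper name, and cutting long signals down to their first n samples is an assumption that goes beyond this answer:

```matlab
% Hypothetical helper: force a mono signal to exactly n samples (row vector).
% Short signals are zero-padded at the end, long ones are cut after n samples.
fix_length = @(y, n) [reshape(y(1:min(numel(y), n)), 1, []), ...
                      zeros(1, max(0, n - numel(y)))];

n_target = 8000;                                   % assumed length the training code expects
short = fix_length(ones(5000, 1), n_target);       % column input, padded with 3000 zeros
long  = fix_length(ones(1, 12000), n_target);      % row input, cut to the first 8000 samples
```

Whether cutting is acceptable depends on where the word sits in the recording, so trimming silence first (as the threshold loop does) is probably the safer order.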

Source

2016-01-24 07:05:00 vrleboss

n = noOfFiles;                              % total number of input files 
for k = 1:n 
    M(k, 1:length(filedata{k})) = filedata{k};  % row k holds file k; shorter rows are zero-filled 
end

:P

Source

2016-01-24 08:56:46

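It is always best to provide some explanation in addition to the code.

It stores signals of different lengths in the same matrix. `n = noOfFiles` takes the total number of inputs. `M(k,1:length(filedata{k})) = filedata{k}` stores all the data of the k-th file in row k of M, no matter how much the input lengths differ.

It accepts every length and fills the shorter rows with zeros. There is no need to pad separately or to compute the maximum length first.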
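To illustrate the zero-filling described in these comments, here is a small sketch with made-up data (the three short vectors stand in for recordings). It also preallocates M, which the snippet above does not do:

```matlab
filedata = {[1 2 3], [4 5], [6 7 8 9]};   % synthetic stand-ins for three recordings
n   = numel(filedata);
len = cellfun(@numel, filedata);          % length of each signal
M   = zeros(n, max(len));                 % preallocate one row per signal
for k = 1:n
    M(k, 1:len(k)) = filedata{k};         % the remaining entries of row k stay 0
end
disp(M)
```

Each row of M then has the same length, so the rows can be passed on as equal-length inputs.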
