2015-08-14 222 views
4

How do I mix PCM audio sources (Java)?

Here is what I'm working with right now:

for (int i = 0, numSamples = soundBytes.length/2; i < numSamples; i += 2) 
{ 
    // Get the samples. 
    int sample1 = ((soundBytes[i] & 0xFF) << 8) | (soundBytes[i + 1] & 0xFF); // Automatically converts to unsigned int 0...65535         
    int sample2 = ((outputBytes[i] & 0xFF) << 8) | (outputBytes[i + 1] & 0xFF); // Automatically converts to unsigned int 0...65535 

    // Normalize for simplicity. 
    float normalizedSample1 = sample1/65535.0f; 
    float normalizedSample2 = sample2/65535.0f; 

    float normalizedMixedSample = 0.0f; 

    // Apply the algorithm. 
    if (normalizedSample1 < 0.5f && normalizedSample2 < 0.5f) 
     normalizedMixedSample = 2.0f * normalizedSample1 * normalizedSample2; 
    else 
     normalizedMixedSample = 2.0f * (normalizedSample1 + normalizedSample2) - (2.0f * normalizedSample1 * normalizedSample2) - 1.0f; 

    int mixedSample = (int)(normalizedMixedSample * 65535); 

    // Replace the sample in soundBytes array with this mixed sample. 
    soundBytes[i] = (byte)((mixedSample >> 8) & 0xFF); 
    soundBytes[i + 1] = (byte)(mixedSample & 0xFF); 
} 

As far as I can tell, this is an exact representation of the algorithm defined on this page: http://www.vttoth.com/CMS/index.php/technical-notes/68

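However, simply mixing a sound with silence (all 0s) produces output that very obviously sounds wrong; it is perhaps best described as higher-pitched and louder.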

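I would appreciate help determining whether I have implemented the algorithm correctly, or whether I simply need to go about this a different way (a different algorithm or method).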

Answer

3

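The author of the linked article assumes that A and B represent entire streams of audio. More specifically, X means the maximum absolute value of all the samples in stream X, where X is either A or B. So his algorithm scans the whole of both streams to compute the maximum absolute sample of each, and then scales things so that the output theoretically peaks at 1.0. You need to make multiple passes over the data to implement this algorithm, and if your data is streaming in, it simply will not work.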

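Here is an example of how I think the algorithm works. It assumes the samples have already been converted to floating point, to sidestep the issue of your conversion code being wrong. I'll explain what is wrong with it later: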

double[] samplesA = ConvertToDouble(samples1);
double[] samplesB = ConvertToDouble(samples2);
double A = ComputeMaxPeak(samplesA);
double B = ComputeMaxPeak(samplesB);

// Z is 1 whenever either peak is 1.0, so it carries little useful information here.
double Z = A + B - A * B;

// What we really need is a value x such that x*A + x*B = 1, i.e. x = 1/(A + B).
// Fall back to 1.0 when both streams are silent (A + B == 0).
double x = (A + B) > 0 ? 1.0 / (A + B) : 1.0;

// Now mix and scale the samples
double[] samples = MixAndScale(samplesA, samplesB, x);
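
As a sanity check on the scaling factor, the bound below is written out in LaTeX. The symbols a_i and b_i for the individual samples of the two streams are introduced here for illustration only; A and B are the peak absolute values as in the snippet above, and x is assumed to be positive.

```latex
\[
|x\,(a_i + b_i)| \;\le\; x\,\bigl(|a_i| + |b_i|\bigr) \;\le\; x\,(A + B),
\qquad A = \max_i |a_i|,\quad B = \max_i |b_i| .
\]
\[
x\,(A + B) = 1 \;\Longrightarrow\; x = \frac{1}{A + B}.
\]
```

With this x, no mixed sample exceeds 1.0 in magnitude. The bound is reached only if both streams hit their peaks at the same index with the same sign, so in practice the output is often quieter than full scale.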

Mix and scale:

double[] MixAndScale(double[] samplesA, double[] samplesB, double scalingFactor)
{
    // Assumes both arrays have the same length.
    double[] result = new double[samplesA.length];
    for (int i = 0; i < samplesA.length; i++)
        result[i] = scalingFactor * (samplesA[i] + samplesB[i]);
    return result;
}

Compute the max peak:

double ComputeMaxPeak(double[] samples) 
{ 
    double max = 0; 
    for (int i = 0; i < samples.length; i++) 
    { 
     double x = Math.abs(samples[i]); 
     if (x > max) 
      max = x; 
    } 
    return max; 
} 

And the conversion. Notice how I use a short so that the sign bit is maintained correctly:

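double[] ConvertToDouble(byte[] bytes)
{
    double[] samples = new double[bytes.length/2];
    for (int i = 0; i < samples.length; i++)
    {
        // Big-endian signed 16-bit, as in the question: the high byte keeps its sign,
        // the low byte is masked so its sign extension cannot corrupt the high bits.
        // Swap the two indices for little-endian data.
        short tmp = (short)((bytes[i*2] << 8) | (bytes[i*2+1] & 0xFF));
        samples[i] = tmp/32768.0; // maps to [-1.0, 1.0)
    }
    return samples;
}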
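
To make the sign issue concrete, here is a minimal sketch comparing the two ways of reading the same pair of bytes. The class and method names (PcmDecodeDemo, decodeSigned, decodeUnsigned) are hypothetical, and the data is assumed to be big-endian 16-bit PCM as in the question's code.

```java
final class PcmDecodeDemo {

    // Reads one big-endian sample as a signed 16-bit value.
    static short decodeSigned(byte[] bytes, int offset) {
        return (short) ((bytes[offset] << 8) | (bytes[offset + 1] & 0xFF));
    }

    // Reads the same two bytes the way the question's code does: as 0..65535.
    static int decodeUnsigned(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xFF) << 8) | (bytes[offset + 1] & 0xFF);
    }

    public static void main(String[] args) {
        // A very quiet negative sample (-1 in signed 16-bit PCM).
        byte[] quiet = {(byte) 0xFF, (byte) 0xFF};
        System.out.println("signed:   " + decodeSigned(quiet, 0));
        System.out.println("unsigned: " + decodeUnsigned(quiet, 0));
    }
}
```

With the unsigned reading, a sample just below zero is treated as the loudest possible value, and digital silence (bytes 0x00 0x00) normalizes to 0.0. Plugging 0.0 into the question's formula gives either 0 or 2*s - 1 for the other sample s, rather than s itself, which would explain why mixing with silence changes the sound.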
+0

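Tried this code. After fixing a few compile errors and a missing parenthesis, there is still white noise in the background when the two audio sources are mixed. Is anything else missing?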

+0

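After dealing with this problem for a long time, I decided not to use this conversion approach. Instead I use 'ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shorts);' and work on the shorts. Then, to get back to bytes: 'ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(shorts);'. This works perfectly.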
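
For reference, the ByteBuffer approach described in the comment above might look roughly like the sketch below. It assumes signed 16-bit little-endian PCM, and it mixes by plain addition with clamping, which is a swapped-in technique and not the linked article's algorithm. The class name ShortBufferMixer and the method mix are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ShortBufferMixer {

    // Mixes two buffers of signed 16-bit little-endian PCM (assumed format).
    static byte[] mix(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length) / 2; // whole samples present in both inputs

        // View the bytes as shorts; the buffer handles byte order and sign.
        short[] sa = new short[n];
        short[] sb = new short[n];
        ByteBuffer.wrap(a).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(sa);
        ByteBuffer.wrap(b).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(sb);

        short[] mixed = new short[n];
        for (int i = 0; i < n; i++) {
            int sum = sa[i] + sb[i]; // int arithmetic, so the sum cannot wrap around
            // Clamp to the 16-bit range instead of letting the cast wrap.
            mixed[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }

        // Write the shorts back through a short view of the output array.
        byte[] out = new byte[n * 2];
        ByteBuffer.wrap(out).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(mixed);
        return out;
    }

    public static void main(String[] args) {
        byte[] tone = {0x10, 0x00, (byte) 0xF0, (byte) 0xFF}; // samples 16 and -16
        byte[] silence = new byte[tone.length];
        // Mixing with silence should leave the tone unchanged.
        System.out.println(java.util.Arrays.equals(mix(tone, silence), tone));
    }
}
```

Because this works one sample at a time, it also suits streamed data, at the cost of hard clipping when both sources are loud at the same moment.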