cuFFT錯誤的結果只有當從複雜的開始

我以前幫助this答案實現就地轉換，它運作良好，但只有當我從真實的數據開始。如果我從複雜的數據開始，IFT + FFT之後的結果是錯誤的，而且這隻發生在原地版本中，我用這個變換的不合適版本得到了完美的結果。這是代碼：cuFFT錯誤的結果只有當從複雜的開始

#include <stdio.h> 
#include <stdlib.h> 
#include <cuda_runtime.h> 
#include <complex.h> 
#include <cuComplex.h> 
#include <cufft.h> 
#include <cufftXt.h> 

#define N 4 
#define N_PAD (2*(N/2+1)) 

void print_3D_Real(double *array){ 
    printf("\nPrinting 3D real matrix \n"); 
    unsigned long int idx; 
    for (int z = 0; z < N; z++){ 
     printf("---------------------------------------------------------------------------- plane %d below\n", z); 
     for (int x = 0; x < N; x++){ 
      for (int y = 0; y < N; y++){ 
       idx = z + N_PAD * (y + x * N); 
       printf("%.3f \t", array[idx]); 
      } 
      printf("\n"); 
     } 
    } 
} 

void print_3D_Comp(cuDoubleComplex *array){ 
    printf("\nPrinting 3D complex matrix \n"); 
    unsigned long int idx; 
    for (int z = 0; z < (N/2+1); z++){ 
     printf("---------------------------------------------------------------------------- plane %d below\n", z); 
     for (int x = 0; x < N; x++){ 
      for (int y = 0; y < N; y++){ 
       idx = z + (N/2+1) * (y + x * N); 
       printf("%+.3f%+.3fi \t", array[idx].x, array[idx].y); 
      } 
      printf("\n"); 
     } 
    } 
} 

// Main function 
int main(int argc, char **argv){ 
    CU_ERR_CHECK(cudaSetDevice(0)); 
    unsigned long int idx, in_mem_size, out_mem_size; 
    cuDoubleComplex *in = NULL, *d_in = NULL; 
    double *out = NULL, *d_out = NULL; 
    cufftHandle plan_r2c, plan_c2r; 
    in_mem_size = sizeof(cuDoubleComplex) * N*N*(N/2+1); 
    out_mem_size = in_mem_size; 

    in = (cuDoubleComplex *) malloc (in_mem_size); 
    out = (double *) in; 

    cudaMalloc((void **)&d_in, in_mem_size); 
    d_out = (double *) d_in; 

    cufftPlan3d(&plan_c2r, N, N, N, CUFFT_Z2D); 
    cufftPlan3d(&plan_r2c, N, N, N, CUFFT_D2Z); 

    memset(in, 0, in_mem_size); 
    memset(out, 0, out_mem_size); 

    // Initial complex data 
    for (int x = 0; x < N; x++){ 
     for (int y = 0; y < N; y++){ 
      for (int z = 0; z < (N/2+1); z++){ 
       idx = z + (N/2+1) * (y + x * N); 
       in[idx].x = idx; 
      } 
     } 
    } 
    print_3D_Comp(in); 
    cudaMemcpy(d_in, in, in_mem_size, cudaMemcpyHostToDevice); 

    cufftExecZ2D(plan_c2r, (cufftDoubleComplex *)d_in, (cufftDoubleReal *)d_out); 

    cudaMemcpy(out, d_out, out_mem_size, cudaMemcpyDeviceToHost); 
    // Normalisation 
    for (int i = 0; i < N*N*N_PAD; i++) 
     out[i] /= (N*N*N); 
    print_3D_Real(out); 
    cudaMemcpy(d_out, out, out_mem_size, cudaMemcpyHostToDevice); 

    cufftExecD2Z(plan_r2c, (cufftDoubleReal *)d_out, (cufftDoubleComplex *)d_in); 

    cudaMemcpy(in, d_in, in_mem_size, cudaMemcpyDeviceToHost)); 

    print_3D_Comp(in); 

    cudaDeviceReset(); 
    return 0; 
}

我的程序的輸出是這個pastebin。

有人可以指導我在正確的道路上？非常感謝你提前。

來源

2015-11-06 JohnWil

首先，你的代碼不能編譯。

在其最一般的定義中，傅立葉變換執行從一個複雜域到另一個複雜域的映射，並且此操作應爲reversible。

然而，C2R和R2C是特殊情況，假設信號在2個域之一（「時間」域）中可以完全表示爲純真實信號（所有虛部爲零）。

但是，很明顯，會出現一些複雜的「頻率」域表示，它們不能由純粹實時的時域信號表示。如果計數器情況爲真（任何複頻域信號都可以表示爲純粹的實時域信號），那麼FFT對於複雜的時域信號不可能是可逆的（因爲所有的頻域數據集映射到純粹的實時域數據）

因此，您不能選擇頻域中的任意數據，並期望它能夠正確映射到純粹的實時域信號中。（*）

作爲示範，改變你的輸入數據設置如下：

  in[idx].x = (idx)?0:1;

，我相信你會得到一個「合格」的測試用例。如果你實際上是在發佈你的問題時使用了這個特定的數據集，我認爲這是不能被支持的。「此外，你的指控」我有這個轉換的不合適版本的完美結果「，我相信不能支持。如果您不同意，請發佈一個完整的代碼，演示您的通過測試用例的場外轉換，這與您的發佈代碼完全相同。

最後，我們可以用fftw來測試它。你的程序的轉換使用FFTW而不是CUFFT產生完全相同的輸出：

$ cat t355.cpp 
#include <stdio.h> 
#include <stdlib.h> 
#include <fftw3.h> 
#include <string.h> 

#define N 4 
#define N_PAD (2*(N/2+1)) 

void print_3D_Real(double *array){ 
    printf("\nPrinting 3D real matrix \n"); 
    unsigned long int idx; 
    for (int z = 0; z < N; z++){ 
     printf("---------------------------------------------------------------------------- plane %d below\n", z); 
     for (int x = 0; x < N; x++){ 
      for (int y = 0; y < N; y++){ 
       idx = z + N_PAD * (y + x * N); 
       printf("%.3f \t", array[idx]); 
      } 
      printf("\n"); 
     } 
    } 
} 

void print_3D_Comp(fftw_complex *array){ 
    printf("\nPrinting 3D complex matrix \n"); 
    unsigned long int idx; 
    for (int z = 0; z < (N/2+1); z++){ 
     printf("---------------------------------------------------------------------------- plane %d below\n", z); 
     for (int x = 0; x < N; x++){ 
      for (int y = 0; y < N; y++){ 
       idx = z + (N/2+1) * (y + x * N); 
       printf("%+.3f%+.3fi \t", array[idx][0], array[idx][1]); 
      } 
      printf("\n"); 
     } 
    } 
} 

// Main function 
int main(int argc, char **argv){ 
    unsigned long int idx, in_mem_size, out_mem_size; 
    fftw_complex *in = NULL; 
    double *out = NULL; 
    in_mem_size = sizeof(fftw_complex) * N*N*(N/2+1); 
    out_mem_size = in_mem_size; 

    in = (fftw_complex *) malloc (in_mem_size); 
    out = (double *) in; 

    memset(in, 0, in_mem_size); 
    memset(out, 0, out_mem_size); 

    // Initial complex data 
    for (int x = 0; x < N; x++){ 
     for (int y = 0; y < N; y++){ 
      for (int z = 0; z < (N/2+1); z++){ 
       idx = z + (N/2+1) * (y + x * N); 
       in[idx][0] = idx; 
      } 
     } 
    } 
    print_3D_Comp(in); 

    fftw_plan plan_c2r = fftw_plan_dft_c2r_3d(N, N, N, in, out, FFTW_ESTIMATE); 
    fftw_plan plan_r2c = fftw_plan_dft_r2c_3d(N, N, N, out, in, FFTW_ESTIMATE); 

    fftw_execute(plan_c2r); 
    // Normalisation 
    for (int i = 0; i < N*N*N_PAD; i++) 
     out[i] /= (N*N*N); 
    print_3D_Real(out); 
    fftw_execute(plan_r2c); 



    print_3D_Comp(in); 

    return 0; 
} 
$ g++ t355.cpp -o t355 -lfftw3 
$ ./t355 

Printing 3D complex matrix 
---------------------------------------------------------------------------- plane 0 below 
+0.000+0.000i +3.000+0.000i +6.000+0.000i +9.000+0.000i 
+12.000+0.000i +15.000+0.000i +18.000+0.000i +21.000+0.000i 
+24.000+0.000i +27.000+0.000i +30.000+0.000i +33.000+0.000i 
+36.000+0.000i +39.000+0.000i +42.000+0.000i +45.000+0.000i 
---------------------------------------------------------------------------- plane 1 below 
+1.000+0.000i +4.000+0.000i +7.000+0.000i +10.000+0.000i 
+13.000+0.000i +16.000+0.000i +19.000+0.000i +22.000+0.000i 
+25.000+0.000i +28.000+0.000i +31.000+0.000i +34.000+0.000i 
+37.000+0.000i +40.000+0.000i +43.000+0.000i +46.000+0.000i 
---------------------------------------------------------------------------- plane 2 below 
+2.000+0.000i +5.000+0.000i +8.000+0.000i +11.000+0.000i 
+14.000+0.000i +17.000+0.000i +20.000+0.000i +23.000+0.000i 
+26.000+0.000i +29.000+0.000i +32.000+0.000i +35.000+0.000i 
+38.000+0.000i +41.000+0.000i +44.000+0.000i +47.000+0.000i 

Printing 3D real matrix 
---------------------------------------------------------------------------- plane 0 below 
23.500 -1.500 -1.500 -1.500 
-6.000 0.000 0.000 0.000 
-6.000 0.000 0.000 0.000 
-6.000 0.000 0.000 0.000 
---------------------------------------------------------------------------- plane 1 below 
-0.500 0.750 0.000 -0.750 
3.000 0.000 0.000 0.000 
0.000 0.000 0.000 0.000 
-3.000 0.000 0.000 0.000 
---------------------------------------------------------------------------- plane 2 below 
0.000 0.000 0.000 0.000 
0.000 0.000 0.000 0.000 
0.000 0.000 0.000 0.000 
0.000 0.000 0.000 0.000 
---------------------------------------------------------------------------- plane 3 below 
-0.500 -0.750 0.000 0.750 
-3.000 0.000 0.000 0.000 
0.000 0.000 0.000 0.000 
3.000 0.000 0.000 0.000 

Printing 3D complex matrix 
---------------------------------------------------------------------------- plane 0 below 
+0.000+0.000i +6.000+0.000i +6.000+0.000i +6.000+0.000i 
+24.000+0.000i +30.000+0.000i +30.000+0.000i +30.000+0.000i 
+24.000+0.000i +30.000+0.000i +30.000+0.000i +30.000+0.000i 
+24.000+0.000i +30.000+0.000i +30.000+0.000i +30.000+0.000i 
---------------------------------------------------------------------------- plane 1 below 
+1.000+0.000i +4.000+0.000i +7.000+0.000i +10.000+0.000i 
+13.000+0.000i +16.000+0.000i +19.000+0.000i +22.000+0.000i 
+25.000+0.000i +28.000+0.000i +31.000+0.000i +34.000+0.000i 
+37.000+0.000i +40.000+0.000i +43.000+0.000i +46.000+0.000i 
---------------------------------------------------------------------------- plane 2 below 
+2.000+0.000i +8.000+0.000i +8.000+0.000i +8.000+0.000i 
+26.000+0.000i +32.000+0.000i +32.000+0.000i +32.000+0.000i 
+26.000+0.000i +32.000+0.000i +32.000+0.000i +32.000+0.000i 
+26.000+0.000i +32.000+0.000i +32.000+0.000i +32.000+0.000i 
$

（*），你可以說，如果你願意的話，那該C2R和R2C變換的複共軛對稱特徵應該將所有可能的複雜「頻率」域信號正確映射爲唯一的，純粹實際的「時間」域信號。我聲明，沒有證據，它不具有2個數據點：

這個問題中的示例代碼。
由於C2R或R2C變換中的複數空間在數值上大於實際空間（因子爲(2*(N/2+1))/N），所以有理由認爲，不可能將所有可能的複數信號映射爲唯一實數信號。獨特的1：1映射對於完全可逆性是必需的。

有關的缺乏隨機數據對稱的可能性更多的背景，請注意cufft documentation周圍CUFFT_COMPATIBILITY_FFTW_ASYMMETRIC討論。

來源

2015-11-15 23:28:24

感謝您的回答羅伯特。你的確是正確的，我的不合適的版本不會產生與該輸入信號正確的結果，但它產生了與您發佈的輸入信號相同的結果，並且我以前使用它來測試它。傅立葉變換我不是很好，並不知道並非所有的複雜信號都可以用純粹的真實信號表示，因此我認爲我的代碼是錯誤的。我現在知道它對其他信號是正確的，如果它沒有編譯它一定是因爲我在這裏發佈時刪除了太多東西。再次感謝您的好評！ – JohnWil

cuFFT錯誤的結果只有當從複雜的開始

回答

相關問題