2016-10-10 82 views
11

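Unified Memory profiling failed

I am trying to profile CUDA code on Ubuntu 16.04 with CUDA 8.0, but it returns "Unable to profile application. Unified Memory profiling failed." I tried profiling both from the terminal and from Nsight Eclipse. The code compiles and runs, but I cannot get a profile.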

Code -

cusparseHandle_t handle;
// create the cuSPARSE handle once, with error checking
cusparseSafeCall(cusparseCreate(&handle));

//set the parameters 
const int n_i = 10; 
const int d = 18; 
const int n_t = 40; 
const int n_tau = 2; 
const int n_k = 10; 

float *data = generate_matrix3_1(d, n_i, n_t); 
//float* data = get_data1(d, n_i,n_t); 
float* a = generate_matrix3_1(n_i,n_k,n_tau); 
float* b = sparse_generate_matrix1(n_k,d,0.5); 
float* c = sparse_generate_matrix1(n_k,d,0.5); 

float* previous_a = generate_matrix3_1(n_i,n_k,n_tau); 
float* previous_b = sparse_generate_matrix1(n_k,d,0.1); 
float* previous_c = sparse_generate_matrix1(n_k,d,0.1); 

// accumulate the squared norm of data (sum of squares) over time steps t >= n_tau
float norm_data = 0;
for (int i = 0; i < n_i; i++)
{
    for (int t = n_tau; t < n_t; t++)
    {
        for (int p = 0; p < d; p++)
        {
            // data is indexed as [p][i][t], with t varying fastest
            const float v = data[p*n_i*n_t + i*n_t + t];
            norm_data += v * v;
        }
    }
}

// set lambda and gamma parameter 
float lambda = 0.0001; 
float gamma_a = 2; 
float gamma_b = 3; 
float gamma_c = 4; 

float updated_t = 1; 
float updated_t1 = 0; 

float rel_error = 0; 
int loop = 1; 
float objective = 0; 

// create sparse format for the data 
float **h_data = new float*[1]; 
int **h_data_RowIndices = new int*[1]; 
int **h_data_ColIndices = new int*[1]; 
int nnz_data = create_sparse_MY(data,d,n_i*n_t,h_data,h_data_RowIndices,h_data_ColIndices); 

// transfer the sparse index arrays to device memory
int *d_data_RowIndices;
cudaMalloc(&d_data_RowIndices, (d+1) * sizeof(int));
cudaMemcpy(d_data_RowIndices, h_data_RowIndices[0], (d+1) * sizeof(int), cudaMemcpyHostToDevice);
int *d_data_ColIndices;
cudaMalloc(&d_data_ColIndices, nnz_data * sizeof(int));
cudaMemcpy(d_data_ColIndices, h_data_ColIndices[0], nnz_data * sizeof(int), cudaMemcpyHostToDevice);

Command used to compile the code -

nvcc -lcusparse main.cu -o hello.out

Profiling -

nvprof -o prof ./hello.out

Error -

==13621== NVPROF is profiling process 13621, command: ./hello.out
======== Error: unified memory profiling failed.

Can someone help me?

+0

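Please provide a short, complete test case: the program you are trying to profile, how you compiled it, the full command you used to profile it, and the complete output message.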

+0

Updated.

Answers

20

I had the same misleading error, and the profiler only worked when run with root privileges, e.g. sudo nvprof or sudo nvvp.

5

I ran into the same error. However, it persisted even after I added sudo: the terminal returned sudo: nvprof: command not found

Try this command, it worked for me: nvprof --unified-memory-profiling off ./hello.out
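The workaround above can be wrapped so that Unified Memory profiling is only switched off when it actually fails. The following is a minimal sketch, not a tested recipe: the function name profile_with_fallback is hypothetical, and it assumes nvprof reports the failure on stderr with the same "unified memory profiling failed" text quoted in the question. Because stderr is captured first and replayed afterwards, the profiler output of the first attempt appears only after that attempt finishes.

```shell
# Run nvprof with the given arguments; if it reports the unified memory
# error, run it again with unified memory profiling disabled.
# Assumption: the error text matches the message quoted in the question.
profile_with_fallback() {
    log=$(mktemp) || return 1

    # First attempt with default settings; keep a copy of stderr to inspect.
    status=0
    nvprof "$@" 2>"$log" || status=$?
    cat "$log" >&2

    if grep -qi 'unified memory profiling failed' "$log"; then
        rm -f "$log"
        echo "Retrying with unified memory profiling disabled" >&2
        nvprof --unified-memory-profiling off "$@"
        return $?
    fi

    rm -f "$log"
    return "$status"
}
```

It would be called the same way as nvprof itself, for example: profile_with_fallback -o prof ./hello.out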

+1

You should either add the CUDA path to root's PATH, or give the exact path after sudo, like "sudo /usr/local/cuda-8.0/bin/nvprof" –

+0

Some users suggest: *If you get this error from the terminal: sudo: nvprof: command not found, you can try sudo -su.* – GhostCat
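The "command not found" problem discussed in these comments typically arises because sudo looks up commands with its own PATH rather than yours. One way around it is to resolve the absolute path of nvprof as the normal user and hand that path to sudo. This is a sketch under assumptions: the function names find_nvprof and run_nvprof_as_root are hypothetical, CUDA_HOME is an assumed environment variable, and the fallback directory /usr/local/cuda-8.0 is simply the one mentioned in the comment above.

```shell
# Print the absolute path of nvprof, or return 1 if it cannot be found.
find_nvprof() {
    # Prefer whatever the current user's PATH resolves to.
    if command -v nvprof >/dev/null 2>&1; then
        command -v nvprof
        return 0
    fi
    # Otherwise fall back to the CUDA install directory (assumed location).
    cuda_home="${CUDA_HOME:-/usr/local/cuda-8.0}"
    if [ -x "$cuda_home/bin/nvprof" ]; then
        printf '%s\n' "$cuda_home/bin/nvprof"
        return 0
    fi
    return 1
}

# Run nvprof under sudo using its absolute path, so that root's PATH
# does not matter. All arguments are passed through to nvprof.
run_nvprof_as_root() {
    nvprof_bin=$(find_nvprof) || {
        echo "nvprof not found; set CUDA_HOME to your CUDA install directory" >&2
        return 1
    }
    sudo "$nvprof_bin" "$@"
}
```

With these definitions, the profiling command from the question becomes: run_nvprof_as_root -o prof ./hello.out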