2015-11-02

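I have five different .csv files that I need to extract data from. After extracting the data from each one, I want to put it all into a single table. How do I extract data from multiple csv files and put the data into one common table? [MATLAB]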

My .csv directory:

2015.csv 
2014.csv 
2013.csv 
2012.csv 
2011.csv 

Here is what I tried:

csvfiles = dir('.../*.csv'); 
dataArray = {} 
table = table(dataArray{1:end-1}, 'VariableNames', {'WEEK','2015','2014', '2013', '2012', '2011'}); 

for file = csvfiles 
     delimiter = ','; 
     startRow = 2; 
     formatSpec = '%f%f%f%f%f%f%f%f%f%f%f%[^\n\r]'; 

     fileID = fopen(file,'r'); % 
     dataArray = {dataArray, textscan(fileID, formatSpec, 'Delimiter', delimiter, 'HeaderLines' ,startRow-1, 'ReturnOnError', false)}; 
     fclose(fileID); 
     extracting_data = file.column1 + file.column3 
end 

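However, not only does the fileID line reject file as an invalid argument, but I also don't know how to extract the data and store it in the table. I can make the fileID valid by using file(1).name, but then textscan() throws an error saying Invalid file identifier. Use fopen to generate a valid file identifier.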

Basically, my goals are:

1. Open each file in the directory. 
2. Extract all necessary data from the known columns 
3. Put all 52 values inside that file into their own column (one column per file). 

Edit 1: Here is my updated code. I changed the data structure of dataArray to a matrix.

csvfiles = dir('/.../Data/*.csv'); 
data_matrix = zeros(52, 5); % Create empty matrix and format the matrix like the table. 
iter = 0; 

for file = 1:numel(csvfiles) 
    iter = iter + 1; 
    delimiter = ','; 
    startRow = 2; 
    formatSpec = '%f%f%f%f%f%f%f%f%f%f%f%[^\n\r]'; 

    fileID = fopen(csvfiles(file).name,'r'); 
    data = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'HeaderLines' ,startRow-1, 'ReturnOnError', false); 
    fclose(fileID); 

    extracted_data = file.column1 + file.column2; % I make sure to use the column headers. 

    % Add all 52 values from the extracted data to a single column. 
    data_matrix[:,iter] = influenza_a; 
end 


%% Create output table. 
Influenza = table(data_matrix{1:end-1}, 'VariableNames',{'WEEK','2015','2014','2013','2012','2011'}); 

Changing 'for file = csvfiles' to 'for file = 1:numel(csvfiles)' and using 'fileID = fopen(csvfiles(file),'r');' should get you started. – Adriaan


@Adriaan I get an error on the 'fopen()' line saying 'First input must be a file name of type char, or a file identifier of type double.' When I disp(csvfiles(file)), it returns 'name, date, bytes, isdir, datenum'. – Hunter


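Did you change the 'fopen' line as I said in my previous comment? I mean the 'fopen' line, not the 'for' line; I was a bit brief. Change 'fileID = fopen(file,'r');' to 'fileID = fopen(csvfiles(file),'r');', in addition to the 'for' loop change I suggested earlier. – Adriaan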

Answers

1

I explained the changes I made to the code in the comments marked %1) through %5); everything should match now.

%1) The path is needed again later, so it's put in a separate variable. Also
%use only two dots here.
directory='/Users/user/folder/folder/MATLAB/folder/Data'; 
csvfiles = dir(fullfile(directory,'*.csv')); 

%2) numel(csvfiles) to avoid unnecessary constants. 
data_matrix = zeros(52, numel(csvfiles)); % Create empty matrix and format the matrix like the table. 
iter = 0; 

for file = 1:numel(csvfiles) 
    iter = iter + 1; 
    delimiter = ','; 
    startRow = 2; 
    formatSpec = '%f%f%f%f%f%f%f%f%f%f%f%[^\n\r]'; 
    %3) Use absolute path here, otherwise file is not found 
    filename= fullfile(directory,csvfiles(file).name); 
    fileID = fopen(filename,'r'); 
    %4) Inserted error check 
    if fileID<0 
     error('failed to open file %s',filename) 
    end 
    data = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'HeaderLines' ,startRow-1, 'ReturnOnError', false); 
    fclose(fileID); 

    extracted_data = file.column1 + file.column2; % I make sure to use the column headers. 

    % Add all 52 values from the extracted data to a single column. 
    %5) Indexing was wrong, should be right this way: 
    data_matrix(:,file) = influenza_a; 
end 


%% Create output table. 
Influenza = table(data_matrix{1:end-1}, 'VariableNames',{'WEEK','2015','2014','2013','2012','2011'}); 

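When trying to access columns 1 and 2, it throws the error 'Struct contents reference from a non-struct array object.' Instead of 'file.', I changed it to 'data.', but it still does not access the columns. – Hunter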


My workaround was to replace that line with 'influenza_a = data{:,6} + data{:,8};', but 'data_matrix(:,file) = influenza_a' throws the error 'Cell contents assignment to a non-cell array object.' – Hunter
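
Both errors in the comments above are indexing errors. textscan returns a 1-by-N cell array with one cell per column, so a column is read with braces (data{6}), not with a field name. data_matrix is an ordinary numeric matrix, so it takes parentheses on assignment and cannot be expanded with braces in the table() call. Below is a minimal sketch of the whole loop under those rules. Several things in it are assumptions rather than part of the original posts: the demo files written to a temporary folder, the choice of columns 6 and 8 (taken from the last comment), the WEEK column being simply 1 to 52, and the Y prefix on the year names (table variable names had to be valid MATLAB identifiers in releases from that period, and '2015' is not one).

```matlab
% Demo setup (assumption): write two small files so the sketch runs on its own.
% Layout assumed from the question's formatSpec: one header line, 11 numeric columns.
directory = tempname;
mkdir(directory);
years = {'2014', '2015'};
nWeeks = 52;
for k = 1:numel(years)
    fid = fopen(fullfile(directory, [years{k} '.csv']), 'w');
    fprintf(fid, 'c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11\n');
    % Each row repeats the week number in all 11 columns.
    fprintf(fid, [repmat('%d,', 1, 10) '%d\n'], repmat((1:nWeeks)', 1, 11)');
    fclose(fid);
end

csvfiles = dir(fullfile(directory, '*.csv'));
data_matrix = zeros(nWeeks, numel(csvfiles));   % one column per file
names = cell(1, numel(csvfiles));               % one variable name per file
formatSpec = '%f%f%f%f%f%f%f%f%f%f%f%[^\n\r]';

for file = 1:numel(csvfiles)
    filename = fullfile(directory, csvfiles(file).name);
    fileID = fopen(filename, 'r');
    if fileID < 0
        error('failed to open file %s', filename);
    end
    data = textscan(fileID, formatSpec, 'Delimiter', ',', ...
        'HeaderLines', 1, 'ReturnOnError', false);
    fclose(fileID);

    % data is a cell array; data{k} is the k-th column as a numeric vector.
    % Columns 6 and 8 are an assumption taken from the asker's comment.
    influenza_a = data{6} + data{8};

    % data_matrix is numeric, so index it with parentheses, not braces.
    data_matrix(:, file) = influenza_a;

    % Derive the column name from the file name, e.g. 2015.csv -> Y2015.
    [~, base] = fileparts(csvfiles(file).name);
    names{file} = ['Y' base];
end

% array2table converts the numeric matrix column by column.
Influenza = array2table([(1:nWeeks)', data_matrix], ...
    'VariableNames', [{'WEEK'}, names]);
```

Taking the variable names from the file names, instead of hard-coding '2015' through '2011', keeps each label attached to its own data whatever order dir returns the files in.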
