2012-03-28 1329 views
1

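Grouping and summarizing in Matlab

I have a simple matrix with repeated values in some columns. I need to group the data by name and week, and sum the prices for each week. Here is an example: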

name day week price 
John 12 12 200 
John 14 12 70 
John 25 13 150 
John 1 14 10 
Ann 13 12 100 
Ann 15 12 100 
Ann 20 13 50 

The desired output would be:

name week sum
John 12 270
John 13 150
John 14 10
Ann 12 200
Ann 13 50

Is there a good way to do this? I used loops, but I am not sure that is the best way to do it:

% Assumes data is a numeric matrix [name day week price], sorted by name and week
names = unique(data(:,1));          % unique names in the data
n = size(names, 1);                 % number of unique names
m = size(data, 1);                  % total number of rows
result = [];                        % output matrix: [name week sum]
s = 1;                              % next free row in result
for i = 1:n
    temp = [];                      % temporary matrix holding the rows of one name
    k = 1;
    for j = 1:m
        if names(i) == data(j,1)    % go through all rows and copy those with
            temp(k,:) = data(j,:);  % the current name into the temporary matrix
            k = k + 1;
        end
    end
    count = 0;
    for l = 1:size(temp,1)          % go through the rows of one name (e.g. John)
        count = count + temp(l,4);  % add the price (4th column) to the running sum
        % when the week (3rd column) changes, or on the last row, write the total
        if l == size(temp,1) || temp(l,3) ~= temp(l+1,3)
            result(s,1:3) = [names(i) temp(l,3) count];
            count = 0;              % start a new weekly total
            s = s + 1;
        end
    end
end
+0

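The single-letter variable names combined with the lack of comments make your code very hard to follow. Can you expand the variable names and comment the intent of the code? – 2012-03-28 18:27:39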

Answers

3

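Use accumarray. It groups and sums values like this. You can take the numeric indices returned as the third output argument of unique(data(:,1)) and pass them to accumarray as its subs argument. See doc accumarray for details.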
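As a rough illustration of this suggestion, here is a minimal sketch. It assumes the columns are held in separate variables (name as a cell array of strings, week and price as numeric column vectors), which the question does not state. Because the grouping is by two columns, the sketch also calls unique with the 'rows' option to turn each (name, week) pair into a single index; that step is an addition, not something the answer spells out.

```matlab
% Sample data from the question, stored as separate columns (an assumption)
name  = {'John';'John';'John';'John';'Ann';'Ann';'Ann'};
week  = [12;12;13;14;12;12;13];
price = [200;70;150;10;100;100;50];

% Third output of unique: an integer group index for every row
[uniqueNames, ~, nameIdx] = unique(name);
[uniqueWeeks, ~, weekIdx] = unique(week);

% Collapse each (name, week) pair into one index per row
[pairs, ~, pairIdx] = unique([nameIdx(:) weekIdx(:)], 'rows');

% Sum the prices that share the same pair index
priceSum = accumarray(pairIdx(:), price);

% Assemble a readable result with columns: name, week, sum
result = [uniqueNames(pairs(:,1)), ...
          num2cell(uniqueWeeks(pairs(:,2))), ...
          num2cell(priceSum)];
disp(result)
```

Since unique sorts its output, the rows come out ordered by name and then by week, so Ann's rows precede John's rather than following the order of the input.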

1

Perhaps the simplest way is to use the GRPSTATS function from the Statistics Toolbox. You first have to combine name and week to form the groups:

[name_week priceSum] = grpstats(price, strcat(name(:), '@', week(:)), {'gname','sum'});
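The strcat call above only works if both name and week are cell arrays of strings. If week is numeric, as the sample data suggests, it has to be converted first. The following is a sketch under that assumption, using the question's sample values; it needs the Statistics Toolbox for grpstats.

```matlab
% Sample data from the question; week is assumed to be numeric here
name  = {'John';'John';'John';'John';'Ann';'Ann';'Ann'};
week  = [12;12;13;14;12;12;13];
price = [200;70;150;10;100;100;50];

% Convert each week number to a string so it can be joined to the name
weekStr = arrayfun(@num2str, week(:), 'UniformOutput', false);

% One combined key per row, e.g. 'John@12'
keys = strcat(name(:), '@', weekStr);

% Group by the combined key; return the group names and the sum per group
[name_week, priceSum] = grpstats(price, keys, {'gname','sum'});
```

The '@' separator is arbitrary; any character that cannot occur in a name would do, since its only job is to keep the two parts of the key distinguishable.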