2011-06-28

I have a situation where there is a label matrix of dimension N x 1. Example entries of the label matrix are given below:

Label = [1; 3; 5; ....... 6] 

I would like to randomly sample 'm1' records of label 1, 'm2' records of label 2, and so on, so that the output LabelIndicatorMatrix (N x 1) looks something like

LabelIndicatorMatrix = [1; 1; 0;.....1] 

1 means the record was chosen and 0 means it was not chosen during sampling. The output matrix satisfies the following condition:

Sum(LabelIndicatorMatrix) = m1+m2...m6 

Answers

1

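You can start with this small code sample. It picks a random sample of positions in your label vector and marks the entries that were selected at least once: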

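Label = [1; 3; 5; ....... 6]; 
N = numel(Label);               %# number of records 
index = randi(N,m1,1);          %# m1 random positions in 1..N (drawn with replacement) 
index = unique(index);          %# drop repeated positions 
LabelIndicatorMatrix = zeros(N,1); 
LabelIndicatorMatrix(index)=1;  %# mark the sampled positions 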

That said, I'm not sure I understand the final condition on LabelIndicatorMatrix.
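
Note that `randi` draws with replacement, so after `unique` the snippet above can mark fewer than `m1` records, and it does not restrict the draw to one label. Below is a minimal sketch of drawing exactly `m1` distinct records from those carrying label 1, assuming at least `m1` such records exist. The example `Label` vector and the value of `m1` are made up for illustration.

```matlab
Label = [1; 3; 5; 1; 1; 6; 3; 1];   %# hypothetical label vector (illustration only)
m1 = 2;                             %# hypothetical number of label-1 records to pick
N = numel(Label);

cand = find(Label == 1);            %# positions of all records with label 1
perm = randperm(numel(cand));       %# random order, no repeats
pick = cand(perm(1:m1));            %# first m1 entries: distinct positions

LabelIndicatorMatrix = zeros(N,1);
LabelIndicatorMatrix(pick) = 1;     %# mark the sampled records
```

Repeating the same steps for each label and its own count gives the per-label sampling the question asks for.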

2

One possible solution:

Label = randi([1 6], [100 1]); %# random Nx1 vector of labels 
m = [2 3 1 0 1 2];    %# number of records to sample from each category 

LabelIndicatorMatrix = false(size(Label)); %# marks selected records 
uniqL = unique(Label);      %# unique labels: 1,2,3,4,5,6 
for i=1:numel(uniqL) 
    idx = find(Label == uniqL(i));   %# indices where label == uniqL(i) 
    ord = randperm(length(idx));    %# random permutation 
    ord = ord(1:min(m(i),end));    %# keep the first m(i), or all if fewer exist 
    LabelIndicatorMatrix(idx(ord)) = true; %# mark them as selected 
end 

To make sure we meet the requirement, we check:

>> sum(LabelIndicatorMatrix) == sum(m) 
ans = 
    1 
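
The check above compares totals only. Here is a sketch of a per-label check, assuming the labels are the integers 1 to 6 and `m(k)` is the count requested for label `k`. The data is a made-up vector in which every label occurs 10 times, so each label has enough records. The sampling loop is repeated so the block runs on its own.

```matlab
Label = repmat((1:6)', 10, 1);      %# hypothetical data: each label 1..6 appears 10 times
m = [2 3 1 0 1 2];                  %# requested count per label

LabelIndicatorMatrix = false(size(Label));
for k = 1:numel(m)
    idx = find(Label == k);         %# records carrying label k
    ord = randperm(numel(idx));     %# random order
    LabelIndicatorMatrix(idx(ord(1:min(m(k),end)))) = true;
end

%# count the selected records per label and compare with m
picked = accumarray(Label(LabelIndicatorMatrix), 1, [numel(m) 1])';
assert(isequal(picked, m))
```

`accumarray` tallies how many selected records fall under each label, so a mismatch in any single category is caught even when the totals happen to agree.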

Here is my attempt at a vectorized solution:

Label = randi([1 6], [100 1]); %# random Nx1 vector of labels 
m = [2 3 1 0 1 2];    %# number of records to sample from each category 

%# some helper functions 
firstN = @(V,n) V(1:min(n,end));     %# first n elements from vector 
pickN = @(V,n) firstN(V(randperm(length(V))), n); %# pick n elements from vector 

%# randomly sample labels, and get indices 
idx = bsxfun(@eq, Label, unique(Label)'); %'# idx(:,k) indicates where label==k 
[r c] = find(idx);       %# row/column indices 
idx = arrayfun(@(k) pickN(r(c==k),m(k)), 1:size(idx,2), ... 
       'UniformOutput',false);  %# sample m(k) from labels==k 

%# mark selected records 
LabelIndicatorMatrix = false(size(Label)); 
LabelIndicatorMatrix(vertcat(idx{:})) = true; 

%# check results are correct 
assert(sum(LabelIndicatorMatrix)==sum(m))
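

One caveat that may be worth keeping in mind: both versions above index `m` by position in `unique(Label)`, which matches the label value only when every label from 1 to 6 actually occurs. Below is a sketch of the loop variant that indexes `m` by label value instead, assuming the labels are positive integers no larger than `numel(m)`. The small `Label` vector is made up so that labels 3 and 4 are missing.

```matlab
Label = [1; 1; 2; 2; 2; 5; 5; 6; 6; 6];   %# hypothetical data: labels 3 and 4 never occur
m = [1 2 1 0 1 2];                        %# m(k) = records wanted for label k

LabelIndicatorMatrix = false(size(Label));
uniqL = unique(Label);                    %# labels actually present: 1,2,5,6
for i = 1:numel(uniqL)
    k = uniqL(i);                         %# label value, used to index m
    idx = find(Label == k);               %# records carrying label k
    ord = randperm(numel(idx));           %# random order
    LabelIndicatorMatrix(idx(ord(1:min(m(k),end)))) = true;
end

%# a label can contribute at most as many records as it has
counts = accumarray(Label, 1, [numel(m) 1])';   %# records available per label
expected = sum(min(m, counts));
assert(sum(LabelIndicatorMatrix) == expected)
```

With missing or under-populated labels the total equals `sum(min(m, counts))` rather than `sum(m)`, so the `sum(m)` assertion is appropriate only when every label has at least `m(k)` records.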