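This is a very basic solution, not very pretty, but I tried to stick to base functions. It returns a sample of size b from a that contains at least one row where a[,4] == "b".
Edit: updated to use only base functions, as required, and to handle both cases: at least one "a" must be drawn and at least one "b" must be drawn.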
# example data: 12 rows, with ten "a" and two "b" in the fourth column
a <- data.frame(matrix(1:36, ncol = 3), rep(c("a", "b"), times = c(10, 2)))
names(a) <- c("X1","X2","X3","X4")
b <- 5
draw <- sample(seq_len(nrow(a)), b - 1, replace = FALSE) # draw b-1 row indices
a2 <- a[draw, ]  # rows drawn so far
a3 <- a[-draw, ] # rows not drawn yet
if (sum(a2[, 4] == "b") == 0) {
  # a2 has no "b": draw one row name from the rows with "b" in column 4
  extra <- sample(rownames(a[which(a$X4 == "b"), ]), 1)
} else if (sum(a2[, 4] == "a") == 0) {
  # a2 has no "a": draw one row name from the rows with "a" in column 4
  extra <- sample(rownames(a[which(a$X4 == "a"), ]), 1)
} else {
  # a2 already has both: draw one row name from the rows of a that are not in a2
  extra <- sample(rownames(a3), 1)
}
# combine everything as row names so indices and names are not mixed
draw2 <- c(rownames(a)[draw], extra)
a2 <- a[draw2, ] # final sample of size b
a2
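If you need this more than once, the same steps can be wrapped in a function. What follows is only a sketch under my own assumptions: the name draw_with_both and its arguments are made up for illustration, the group labels "a" and "b" are hard-coded as in the example data, and it works on row indices instead of row names. The replicate() call at the end repeats the draw to check that both groups always show up.

```r
# Hypothetical wrapper around the steps above; the name and arguments are
# illustrative and not part of the original answer.
draw_with_both <- function(a, b, col = "X4") {
  draw <- sample(seq_len(nrow(a)), b - 1)   # first b - 1 row indices
  picked <- a[[col]][draw]                  # group labels drawn so far
  rest <- setdiff(seq_len(nrow(a)), draw)   # row indices not drawn yet
  if (!any(picked == "b")) {
    pool <- intersect(rest, which(a[[col]] == "b"))  # force a "b"
  } else if (!any(picked == "a")) {
    pool <- intersect(rest, which(a[[col]] == "a"))  # force an "a"
  } else {
    pool <- rest                                     # anything left will do
  }
  # Index through sample.int so a one-element pool is not expanded to 1:x
  last <- pool[sample.int(length(pool), 1)]
  a[c(draw, last), ]
}

a <- data.frame(matrix(1:36, ncol = 3), X4 = rep(c("a", "b"), times = c(10, 2)))
set.seed(1)
ok <- replicate(1000, {
  s <- draw_with_both(a, 5)
  nrow(s) == 5 && all(c("a", "b") %in% s$X4)
})
all(ok)  # TRUE when every draw had 5 rows and contained both groups
```

Going through sample.int() on the pool length sidesteps the known sample() behaviour where a single number x is treated as 1:x.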
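Stratified sampling: sample from each group separately, choosing the size of each subsample according to some rule (e.g., 90% from group a and 10% from group b). – lmo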
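A minimal base-R sketch of what that comment describes, assuming the example data from the answer and per-group sizes I picked purely for illustration (n_per_group is not from the original post):

```r
a <- data.frame(matrix(1:36, ncol = 3), X4 = rep(c("a", "b"), times = c(10, 2)))

# Assumed per-group sample sizes (illustrative; use whatever rule you need)
n_per_group <- c(a = 4, b = 1)

set.seed(42)
# Split the row indices by group, then sample inside each group separately
idx_by_group <- split(seq_len(nrow(a)), a$X4)
picked <- unlist(lapply(names(n_per_group), function(g) {
  idx <- idx_by_group[[g]]
  idx[sample.int(length(idx), n_per_group[[g]])]
}))
stratified <- a[picked, ]
table(stratified$X4)  # counts per group match n_per_group
```

Unlike the answer above, the number of rows per group is fixed in advance here, so the result always has exactly four "a" rows and one "b" row.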
You can get stratified sampling from the function strata in the sampling package. – G5W
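For reference, a hedged sketch of how that might look, assuming the sampling package is installed and that the stratum sizes (4 and 1) are what you want; those sizes are my own choice, not from the post. As far as I know, size has to follow the order in which the strata appear in the data, which is "a" then "b" here.

```r
a <- data.frame(matrix(1:36, ncol = 3), X4 = rep(c("a", "b"), times = c(10, 2)))

# Only runs if the sampling package is available
if (requireNamespace("sampling", quietly = TRUE)) {
  set.seed(42)
  # Simple random sampling without replacement within each stratum of X4
  s <- sampling::strata(a, stratanames = "X4", size = c(4, 1), method = "srswor")
  # ID_unit holds the selected row positions in the original data
  stratified <- a[s$ID_unit, ]
  print(table(stratified$X4))
}
```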