Subsetting with dat[13:24, 2] does not return what I expect in this case

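I have a data set with two columns named Diet and Bodyweight. Rows 1:12 are the control observations, and I want the mean of the Bodyweight column for the control group.

So I use this: mean(dat[1:12,2])

Next I want to find how many of the non-control observations (rows 13:24) fall below the mean of the control observations (rows 1:12).

So I use this: dat[dat[13:24,2] < mean(dat[1:12,2]), ]

This gives me: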

Diet Bodyweight 
3 chow  24.04 
10 chow  20.10 
12 chow  26.25 
15 hf  22.80 
22 hf  21.90 
24 hf  20.73

But I expected it to return something like this, which excludes rows 1:12:

Diet Bodyweight 
15 hf  22.80 
22 hf  21.90 
24 hf  20.73

How can I do this?

*Edit: dput() result:

> dput(dat) 
structure(list(Diet = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L), .Label = c("chow", "hf"), class = "factor"), Bodyweight = c(21.51, 
28.14, 24.04, 23.45, 23.68, 19.79, 28.4, 20.98, 22.51, 20.1, 
26.91, 26.25, 25.71, 26.37, 22.8, 25.34, 24.97, 28.14, 29.58, 
30.92, 34.02, 21.9, 31.53, 20.73)), .Names = c("Diet", "Bodyweight" 
), class = "data.frame", row.names = c(NA, -24L))

2015-01-27 Chris

**Please post the output of dput()** – smci 2015-01-27 01:26:12

Do it in two steps. First get the targeted rows, then apply the logical selection:

> dat[ 13:24, ][dat[13:24,2] < mean(dat[1:12,2]), ] 
    Diet Bodyweight 
15 hf  22.80 
22 hf  21.90 
24 hf  20.73

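You can chain calls to "[". The second call to "[" simply selects rows with a logical vector of 12 elements, and because that vector was built from the same set of rows, it is "in sync" with them. The single call in the question fails because the same 12-element vector is applied to all 24 rows of dat and gets recycled, so it also picks up rows 3, 10 and 12.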
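To make the recycling visible, here is a minimal sketch. It rebuilds dat from the dput() values in the question; the helper names below, recycled, rows and fixed are only for illustration, and rep(..., length.out = ) is used here to mimic what the row index does with a logical vector that is too short.

```r
# Rebuild the question's data (values copied from the dput() output)
dat <- data.frame(
  Diet = factor(rep(c("chow", "hf"), each = 12)),
  Bodyweight = c(21.51, 28.14, 24.04, 23.45, 23.68, 19.79, 28.4, 20.98,
                 22.51, 20.1, 26.91, 26.25, 25.71, 26.37, 22.8, 25.34,
                 24.97, 28.14, 29.58, 30.92, 34.02, 21.9, 31.53, 20.73)
)

# Logical vector built from rows 13:24 only, so it has 12 elements
below <- dat[13:24, 2] < mean(dat[1:12, 2])
length(below)   # 12, while dat has 24 rows

# Stretch it to 24 elements the way a too-short logical index is recycled
recycled <- rep(below, length.out = nrow(dat))
which(recycled) # 3 10 12 15 22 24: the unexpected rows from the question

# Translate the TRUE positions back to absolute row numbers instead
rows <- (13:24)[below]
fixed <- dat[rows, ]
print(fixed)    # rows 15, 22 and 24 only
```

Indexing with (13:24)[below] is just another way to keep the logical vector and the rows it was computed from aligned; the chained "[" call above does the same thing.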

2015-01-27 01:40:12

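If you don't know the row numbers and don't know the names of the other test groups (in case you have more test groups than just "hf"), but you do know that the "chow" diet is your control, you can do this: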

library(dplyr) 
dat %>% filter(Diet != "chow", Bodyweight < mean(Bodyweight[Diet == "chow"]))

which returns:

# Diet Bodyweight 
#1 hf  22.80 
#2 hf  21.90 
#3 hf  20.73

2015-01-27 01:37:22

If you don't know the row numbers, you can also work with the names and values.

dat[with(dat, Diet == "hf" & Bodyweight < mean(Bodyweight[Diet == "chow"])), ] 
# Diet Bodyweight 
# 15 hf  22.80 
# 22 hf  21.90 
# 24 hf  20.73
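Since the question asks how many non-control observations fall below the control mean, the same condition can be counted directly. This is only a sketch: it rebuilds dat from the dput() values in the question, and ctrl_mean, is_low and n_low are names made up for the example.

```r
# Rebuild the question's data (values copied from the dput() output)
dat <- data.frame(
  Diet = factor(rep(c("chow", "hf"), each = 12)),
  Bodyweight = c(21.51, 28.14, 24.04, 23.45, 23.68, 19.79, 28.4, 20.98,
                 22.51, 20.1, 26.91, 26.25, 25.71, 26.37, 22.8, 25.34,
                 24.97, 28.14, 29.58, 30.92, 34.02, 21.9, 31.53, 20.73)
)

# Mean body weight of the control ("chow") group
ctrl_mean <- mean(dat$Bodyweight[dat$Diet == "chow"])

# TRUE for every non-control row that is lighter than the control mean
is_low <- dat$Diet != "chow" & dat$Bodyweight < ctrl_mean

# Summing a logical vector counts its TRUE values
n_low <- sum(is_low)
n_low  # 3
```

The logical vector here has one element per row of dat, so there is nothing to recycle and is_low can also be used as a row index.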

2015-01-27 01:44:12

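For the sake of reproducible examples, this is probably the better way to do it in future. Still, it is beyond my ability for now. – Chris 2015-01-27 01:48:44

Sure, no worries. It is always good to know the alternatives – 2015-01-27 01:50:08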

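To seriously improve readability, partition the data set. This helps if you compare non-control against control across many columns. Magic row indices are considered bad practice because it is too easy to make a mistake: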

# Add a column that flags control vs. non-control rows
dat$control <- rep(c(TRUE, FALSE), each = 12)

# Name the two partitions (each subset is a copy of its rows)
dat_n <- dat[!dat$control, ]
dat_c <- dat[dat$control, ]

# Now the expression is far more legible and self-explanatory
dat_n[dat_n$Bodyweight < mean(dat_c$Bodyweight), ]
#    Diet Bodyweight control
# 15   hf      22.80   FALSE
# 22   hf      21.90   FALSE
# 24   hf      20.73   FALSE
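If the Diet column already identifies the groups, the partition can be derived from it instead of from a hand-built flag. A rough sketch with base R's split(), assuming "chow" is the control as in the question; groups, ctrl_mean and low_hf are illustrative names only.

```r
# Rebuild the question's data (values copied from the dput() output)
dat <- data.frame(
  Diet = factor(rep(c("chow", "hf"), each = 12)),
  Bodyweight = c(21.51, 28.14, 24.04, 23.45, 23.68, 19.79, 28.4, 20.98,
                 22.51, 20.1, 26.91, 26.25, 25.71, 26.37, 22.8, 25.34,
                 24.97, 28.14, 29.58, 30.92, 34.02, 21.9, 31.53, 20.73)
)

# One data frame per diet, stored in a named list
groups <- split(dat, dat$Diet)
names(groups)  # "chow" "hf"

# Control mean, then the "hf" rows that fall below it
ctrl_mean <- mean(groups$chow$Bodyweight)
low_hf <- groups$hf[groups$hf$Bodyweight < ctrl_mean, ]
print(low_hf)  # original row names 15, 22 and 24 are kept
```

No row numbers appear anywhere, so the code keeps working if the rows are reordered or more diets are added.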

2015-01-27 01:57:53 smci
