Compare consecutive rows under specific conditions in R

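For each participant and each trial, I need to check all consecutive rows (by CURRENT_ID): whether the first row has a value of 0 in column A and the last row has a value of 0 in column B. If both conditions are met, I want a value of 0 in a new column C; if they are not, I want a value of 1.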

I would appreciate any suggestions.

Here are some sample rows of the data:

A B participant trial CURRENT_ID  C 
0 1 ppt01   45  3    0 
1 0 ppt01   45  4    0 
0 1 ppt01   45  10    0 
0 0 ppt01   45  11    0 
1 0 ppt01   45  12    0 
0 1 ppt01   87  2    0 
1 0 ppt01   87  3    0 
1 1 ppt01   87  4    1 
1 1 ppt01   87  5    1 
0 1 ppt01   34  6    0 
0 0 ppt01   34  7    0 
0 0 ppt01   34  8    0 
0 0 ppt01   34  9    0 
0 0 ppt01   34  10    0 
1 0 ppt01   34  11    0 
0 1 ppt01   8  5    0 
1 0 ppt01   8  6    0 
0 1 ppt01   8  9    0 
0 0 ppt01   8  10    0 
0 0 ppt01   8  11    0 
1 0 ppt01   8  12    0 
0 1 ppt02   87  2    0 
0 0 ppt02   87  3    0 
0 0 ppt02   87  4    0 
1 0 ppt02   87  5    0 
0 1 ppt02   55  5    0 
1 0 ppt02   55  6    0 
0 1 ppt02   55  9    0 
1 0 ppt02   55  10    0 
0 1 ppt02   55  11    1 
1 0 ppt02   55  12    0 
0 1 ppt02   22  2    0 
1 0 ppt02   22  3    0 
0 1 ppt02   22  4    1 
0 1 ppt02   22  10    0 
1 0 ppt02   22  11    1 
1 1 ppt02   22  12    1

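EDIT: I need to consider each pair of consecutive rows (consecutive based on the value of CURRENT_ID) for each participant and trial. In the example above, rows 8 and 9 have a value of 1 in the new column C because row 8 has 1 in column A (instead of 0) and row 9 has 1 in column B (instead of 0).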

A B participant trial CURRENT_ID  C 
1 1 ppt01   87  4    1 
1 1 ppt01   87  5    1

EDIT2: Below is how I need to consider the pairs of rows:

A B participant trial CURRENT_ID  C 
0 1 ppt01   45  3    0 
1 0 ppt01   45  4    0 

0 1 ppt01   45  10    0 
0 0 ppt01   45  11    0 

0 0 ppt01   45  11    0 
1 0 ppt01   45  12    0 

0 1 ppt01   87  2    0 
1 0 ppt01   87  3    0 

1 0 ppt01   87  3    0 
1 1 ppt01   87  4    1 

1 1 ppt01   87  4    1 
1 1 ppt01   87  5    1 

0 1 ppt01   34  6    0 
0 0 ppt01   34  7    0 

0 0 ppt01   34  7    0 
0 0 ppt01   34  8    0 

0 0 ppt01   34  8    0 
0 0 ppt01   34  9    0 

0 0 ppt01   34  9    0 
0 0 ppt01   34  10    0 

0 0 ppt01   34  10    0 
1 0 ppt01   34  11    0

Source

2017-06-14 dede

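Do you want to group by participant and/or trial? Are you trying to do this for consecutive values of 'CURRENT_ID'? – akash87

Is the data in the provided column C correct? Why would rows 8 and 9 get a 1? That doesn't match your description!? – BigDataScientist

@akash87 I need to consider consecutive values of CURRENT_ID for each participant and trial. – dede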

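If you want to check the rows in pairs within each participant/trial group, this should work:

library(dplyr)

d %>%
  group_by(participant, trial) %>%
  mutate(AB = ceiling(row_number() / 2)) %>%  # pair index: 1, 1, 2, 2, 3, ...
  group_by(participant, trial, AB) %>%
  # an unpaired last row gets 0
  mutate(newC = ifelse(n() == 1 | (A[1] == 0 & B[2] == 0), 0, 1))

I have left the helper column AB in so you can see how this is done.

Output: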

# A tibble: 15 x 8 
     A  B participant trial CURRENT_ID  C AB newC 
    <int> <int>  <chr> <int>  <int> <int> <dbl> <dbl> 
1  0  1  ppt01 45   3  0  1  0 
2  1  0  ppt01 45   4  0  1  0 
3  0  1  ppt01 45   10  0  2  0 
4  0  0  ppt01 45   11  0  2  0 
5  1  0  ppt01 45   12  0  3  0 
6  0  1  ppt01 87   2  0  1  0 
7  1  0  ppt01 87   3  0  1  0 
8  1  1  ppt01 87   4  1  2  1 
9  1  1  ppt01 87   5  1  2  1 
10  0  1  ppt01 34   6  0  1  0 
11  0  0  ppt01 34   7  0  1  0 
12  0  0  ppt01 34   8  0  2  0 
13  0  0  ppt01 34   9  0  2  0 
14  0  0  ppt01 34   10  0  3  0 
15  1  0  ppt01 34   11  0  3  0
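
Note that the AB grouping above forms non-overlapping pairs (rows 1-2, 3-4, and so on) and ignores gaps in CURRENT_ID. If every pair of rows whose CURRENT_ID values differ by exactly 1 should be checked instead, as EDIT2 in the question suggests, one possible sketch uses lag(). It rests on assumptions the question does not confirm: the second row of a failing pair is the one that gets 1, rows without a consecutive predecessor get 0, and the names has_prev, pair_ok and pairC are made up for illustration. Only the first nine rows of the sample are used here.

```r
library(dplyr)

# First nine rows of the question's sample (ppt01, trials 45 and 87).
d <- data.frame(
  A           = c(0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L),
  B           = c(1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L),
  participant = "ppt01",
  trial       = c(45L, 45L, 45L, 45L, 45L, 87L, 87L, 87L, 87L),
  CURRENT_ID  = c(3L, 4L, 10L, 11L, 12L, 2L, 3L, 4L, 5L)
)

res <- d %>%
  group_by(participant, trial) %>%
  arrange(CURRENT_ID, .by_group = TRUE) %>%
  mutate(
    # TRUE when the previous row in the group has CURRENT_ID exactly one lower
    has_prev = !is.na(lag(CURRENT_ID)) & CURRENT_ID - lag(CURRENT_ID) == 1L,
    # the pair passes when the previous row has A == 0 and this row has B == 0
    pair_ok  = lag(A) == 0L & B == 0L,
    # assumption: only the second row of a failing consecutive pair is flagged
    pairC    = ifelse(has_prev & !pair_ok, 1L, 0L)
  ) %>%
  ungroup()

res$pairC
```

For these nine rows pairC comes out as 0 0 0 0 0 0 0 1 1, which agrees with the C column in the question. This is only one reading of the rule and may not reproduce every value of C in the full sample.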

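Otherwise, as originally described:

library(dplyr)

d %>%
  group_by(participant, trial) %>%
  # compare the first A and the last B of each participant/trial group
  mutate(newC = ifelse(A[1] == 0 & B[n()] == 0, 0, 1))

Output: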

Source: local data frame [15 x 7] 
Groups: participant, trial [3] 

# A tibble: 15 x 7 
     A  B participant trial CURRENT_ID  C newC 
    <int> <int>  <chr> <int>  <int> <int> <dbl> 
1  0  1  ppt01 45   3  0  0 
2  1  0  ppt01 45   4  0  0 
3  0  1  ppt01 45   10  0  0 
4  0  0  ppt01 45   11  0  0 
5  1  0  ppt01 45   12  0  0 
6  0  1  ppt01 87   2  0  1 
7  1  0  ppt01 87   3  0  1 
8  1  1  ppt01 87   4  1  1 
9  1  1  ppt01 87   5  1  1 
10  0  1  ppt01 34   6  0  0 
11  0  0  ppt01 34   7  0  0 
12  0  0  ppt01 34   8  0  0 
13  0  0  ppt01 34   9  0  0 
14  0  0  ppt01 34   10  0  0 
15  1  0  ppt01 34   11  0  0
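
To see where this group-level rule departs from the C column supplied in the question, the two columns can be compared directly. The following is a minimal sketch that rebuilds only the columns it needs from the same 15 rows; the names res and mismatch are introduced here for illustration.

```r
library(dplyr)

# Same 15 rows as the subset used in this answer (CURRENT_ID omitted).
d <- data.frame(
  A = c(0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L),
  B = c(1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L),
  participant = "ppt01",
  trial = rep(c(45L, 87L, 34L), times = c(5L, 4L, 6L)),
  C = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L)
)

res <- d %>%
  group_by(participant, trial) %>%
  mutate(newC = ifelse(A[1] == 0 & B[n()] == 0, 0, 1)) %>%
  ungroup()

# Row numbers where the group-level result disagrees with the supplied C
mismatch <- which(res$newC != res$C)
mismatch
```

Here mismatch contains rows 6 and 7: the whole of trial 87 gets newC = 1 because its last row has B = 1, whereas the question's C marks only rows 8 and 9.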

I used dput() on a subset of your data:

d <- structure(
    list(
    A = c(0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 
      0L, 0L, 0L, 0L, 1L), 
    B = c(1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 
      1L, 0L, 0L, 0L, 0L, 0L), 
    participant = c(
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01", 
     "ppt01" 
    ), 
    trial = c(
     45L, 
     45L, 
     45L, 
     45L, 
     45L, 
     87L, 
     87L, 
     87L, 
     87L, 
     34L, 
     34L, 
     34L, 
     34L, 
     34L, 
     34L 
    ), 
    CURRENT_ID = c(3L, 4L, 10L, 11L, 12L, 2L, 3L, 4L, 5L, 6L, 
        7L, 8L, 9L, 10L, 11L), 
    C = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
      1L, 0L, 0L, 0L, 0L, 0L, 0L) 
), 
    .Names = c("A", "B", "participant", 
      "trial", "CURRENT_ID", "C"), 
    class = "data.frame", 
    row.names = c(NA,-15L) 
)

Source

2017-06-14 21:05:02 ssp3nc3r

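As I understand it, rows 6-9 shouldn't get a 1 in 'newC'. Would you also group by 'trial'? Or am I missing something? – BigDataScientist

Yes, he said he wants both conditions to be met. If he wants something else, he will have to clarify. – ssp3nc3r

Oh boy, time for bed. Sorry to bother you ;) – BigDataScientist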
