dplyr: mutate a new column based on multiple columns selected by a variable string

Given this data:

df=data.frame(
    x1=c(2,0,0,NA,0,1,1,NA,0,1), 
    x2=c(3,2,NA,5,3,2,NA,NA,4,5), 
    x3=c(0,1,0,1,3,0,NA,NA,0,1), 
    x4=c(1,0,NA,3,0,0,NA,0,0,1), 
    x5=c(1,1,NA,1,3,4,NA,3,3,1))

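I want to use dplyr to create an additional column, min, holding the row-wise minimum of a selection of columns. This is easy when the column names are written out directly: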

df <- df %>% rowwise() %>% mutate(min = min(x2,x5))

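But I have a large df with varying column names, so I need to match the columns against a vector of strings, mycols. Other threads told me to use the select helper functions, but I must be missing something. Here is matches: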

mycols <- c("x2","x5") 
df <- df %>% rowwise() %>% 
    mutate(min = min(select(matches(mycols)))) 
Error: is.string(match) is not TRUE

And one_of:

mycols <- c("x2","x5") 
df <- df %>% 
rowwise() %>% 
mutate(min = min(select(one_of(mycols)))) 
Error: no applicable method for 'select' applied to an object of class "c('integer', 'numeric')" 
In addition: Warning message: 
In one_of(c("x2", "x5")) : Unknown variables: `x2`, `x5`

What am I overlooking? Should select_ work? It does not in the following:

df <- df %>% 
    rowwise() %>% 
    mutate(min = min(select_(mycols))) 
Error: no applicable method for 'select_' applied to an object of class "character"

And likewise:

df <- df %>% 
    rowwise() %>% 
    mutate(min = min(select_(matches(mycols)))) 
Error: is.string(match) is not TRUE

Source

2017-02-19 strangeloop

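You need to use the SE (standard evaluation) versions of the dplyr verbs when working with strings. In this case, use 'select_()' –

That does not work the way I expected: 'df <- df %>% rowwise() %>% mutate(min = min(select_(mycols)))' yields "Error: no applicable method for 'select_' applied to an object of class "character"" – strangeloop

The 'matches' error occurs because it takes a single string (a regular expression) as its argument, not a vector of strings. – cderv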

This is a bit tricky. With SE evaluation, you need to pass the operation itself as a string.

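library(dplyr)

mycols <- '(x2,x5)' 
# build the call as a string: "min(x2,x5)"
f <- paste0('min',mycols) 
# mutate_() is the SE variant of mutate(); assign the result so that df is updated
df <- df %>% rowwise() %>% mutate_(min = f) 
df 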
# A tibble: 10 × 6 
#  x1 x2 x3 x4 x5 min 
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 
#1  2  3  0  1  1  1 
#2  0  2  1  0  1  1 
#3  0 NA  0 NA NA NA 
#4  NA  5  1  3  1  1 
#5  0  3  3  0  3  3 
#6  1  2  0  0  4  2 
#7  1 NA NA NA NA NA 
#8  NA NA NA  0  3 NA 
#9  0  4  0  0  3  3 
#10  1  5  1  1  1  1
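
In dplyr releases where the underscore verbs such as mutate_() are deprecated, the same string-based idea can be sketched with rlang::parse_expr() and the !! operator. This is an illustrative alternative rather than part of the answer above. The names mycols_vec and res are made up for the example, and only the two relevant columns are recreated.

```r
library(dplyr)

df <- data.frame(
    x2 = c(3, 2, NA, 5, 3, 2, NA, NA, 4, 5),
    x5 = c(1, 1, NA, 1, 3, 4, NA, 3, 3, 1))

# hypothetical name: the columns of interest as a character vector
mycols_vec <- c("x2", "x5")

# build the call as a string: "min(x2,x5)"
f <- paste0("min(", paste(mycols_vec, collapse = ","), ")")

# parse the string into an expression and unquote it inside mutate()
res <- df %>%
    rowwise() %>%
    mutate(min = !!rlang::parse_expr(f)) %>%
    ungroup()

res$min
```

Building the string from a character vector with paste(collapse = ",") means the column list does not have to be hand-written as '(x2,x5)'.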

Source

2017-02-19 20:55:24

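Thanks! Now I want the lowest non-NA value, so I need to tweak this code slightly. It looks like changing 'min' to 'pmin(na.rm = T)' works (adding na.rm = T to 'min()' does not seem to work): 'f <- paste0('pmin(', mycols, ', na.rm = T)')' 'df <- df %>% rowwise() %>% mutate_(min = f)' – strangeloop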
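
The pmin(na.rm = T) idea from the comment above can also be sketched without building a string. This is a minimal example that assumes mycols is a character vector of column names; the name res is made up for the example, and only the two relevant columns are recreated.

```r
library(dplyr)

df <- data.frame(
    x2 = c(3, 2, NA, 5, 3, 2, NA, NA, 4, 5),
    x5 = c(1, 1, NA, 1, 3, 4, NA, 3, 3, 1))

mycols <- c("x2", "x5")

# pmin() is vectorised across its arguments, so rowwise() is not needed;
# do.call() passes the selected columns plus na.rm = TRUE as the arguments
res <- df %>%
    mutate(min = do.call(pmin, c(df[mycols], na.rm = TRUE)))

res$min
```

With na.rm = TRUE, pmin() still returns NA for a row in which every selected value is NA, and otherwise returns the smallest non-missing value.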

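Here is another, somewhat more technical solution, with help from the purrr package, the tidyverse package designed for functional programming.

First, the matches helper from dplyr takes a regular expression string as its argument, not a vector. It is a good approach if you can find a regular expression that matches all of your columns. (In the code below you can use whichever dplyr select helper you like, once you understand the basic scheme of functional programming.)

Second, purrr functions work very well with dplyr.

The solution to your problem: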
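
If the column names are only available as a character vector, one way to get a single regular expression for matches is to join them with "|" and anchor the result. This is a sketch under that assumption. The extra column x25 and the names pattern and selected are made up for the example, and names containing regex metacharacters would need escaping first.

```r
library(dplyr)

df <- data.frame(
    x2  = c(3, 2, NA),
    x5  = c(1, 1, NA),
    x25 = c(9, 9, 9))   # hypothetical column, to show why anchoring matters

mycols <- c("x2", "x5")

# "^(x2|x5)$": the anchors stop "x2" from also matching "x25"
pattern <- paste0("^(", paste(mycols, collapse = "|"), ")$")

selected <- df %>% select(matches(pattern))
names(selected)
```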

df=data.frame(
    x1=c(2,0,0,NA,0,1,1,NA,0,1), 
    x2=c(3,2,NA,5,3,2,NA,NA,4,5), 
    x3=c(0,1,0,1,3,0,NA,NA,0,1), 
    x4=c(1,0,NA,3,0,0,NA,0,0,1), 
    x5=c(1,1,NA,1,3,4,NA,3,3,1)) 


# regex to get only x2 and x5 column 
mycols <- "x[25]" 

library(dplyr) 

df %>% 
    mutate(min_x2_x5 = 
      # select columns that you want in df 
      select(., matches(mycols)) %>% 
      # use pmap on this subset to get a vector holding the minimum of each row. 
      # a data frame is a list of columns; pmap walks those columns in parallel, that is to say row by row 
      purrr::pmap_dbl(min) 
     ) 
#> x1 x2 x3 x4 x5 min_x2_x5 
#> 1 2 3 0 1 1   1 
#> 2 0 2 1 0 1   1 
#> 3 0 NA 0 NA NA  NA 
#> 4 NA 5 1 3 1   1 
#> 5 0 3 3 0 3   3 
#> 6 1 2 0 0 4   2 
#> 7 1 NA NA NA NA  NA 
#> 8 NA NA NA 0 3  NA 
#> 9 0 4 0 0 3   3 
#> 10 1 5 1 1 1   1

I won't explain purrr any further here, but it works fine in your case.
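
For readers who want to see what the pmap_dbl step is doing, here is a small sketch on a toy data frame. The names small and row_min are made up for the example.

```r
library(purrr)

small <- data.frame(a = c(1, 2), b = c(3, 0))

# a data frame is a list of columns; pmap_dbl() walks the columns in parallel,
# so the first call is min(a = 1, b = 3) and the second is min(a = 2, b = 0)
row_min <- pmap_dbl(small, min)

row_min
```

Each call receives one value per column, so the result has one element per row of the data frame.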

Source

2017-02-19 21:37:30 cderv
