2017-07-14 67 views

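How can mutate-like chained evaluation be implemented? dplyr's mutate function can evaluate "chained" expressions, where a later expression refers to a column created by an earlier one, for example: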

library(dplyr) 

data.frame(a = 1) %>% 
    mutate(b = a + 1, c = b * 2) 
## a b c 
## 1 1 2 4 

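How is this implemented? A quick look at dplyr's source code suggests the following basic structure for a candidate implementation: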

library(lazyeval) 
library(rlang) 

compat_as_lazy <- function(quo) { 
    structure(class = "lazy", list(
        expr = f_rhs(quo), 
        env = f_env(quo) 
    )) 
} 

compat_as_lazy_dots <- function(...) { 
    structure(class = "lazy_dots", lapply(quos(...), compat_as_lazy)) 
} 

my_mutate <- function(.data, ...) { 
    lazy_eval(compat_as_lazy_dots(...), data = .data) 
} 

data.frame(a = 1) %>% 
    my_mutate(b = a + 1, c = b * 2) 
## Error in eval(x$expr, data, x$env) : object 'b' not found 

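...but this "naive" implementation does not work, and the C++ code behind mutate_impl is quite complicated. I understand why it fails: lazy_eval on a "lazy_dots" object uses lapply, so each expression is evaluated independently of the others, whereas I need chained evaluation in which each result is written back to a shared environment before the next expression runs. How can I make it work?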


Oh, you are trying to write your own mutate function.... – CPak

Answers

2

I'm not entirely sure this is what you want, but here are 3 mutate clones in base R that work with your example:

mutate_transform <- function(df,...){ 
    lhs <- names(match.call())[-1:-2] 
    rhs <- as.character(substitute(list(...)))[-1] 
    args = paste(lhs,"=",rhs) 
    for(arg in args){ 
        df <- eval(parse(text=paste("transform(df,",arg,")"))) 
    } 
    df 
} 

mutate_within <- function(df,...){ 
    lhs <- names(match.call())[-1:-2] 
    rhs <- as.character(substitute(list(...)))[-1] 
    args = paste(lhs,"=",rhs) 
    df <- eval(parse(text=paste("within(df,{",paste(args,collapse=";"),"})"))) 
    df 
} 

mutate_attach <- function(df,...){ 
    lhs <- names(match.call())[-1:-2] 
    rhs <- as.character(substitute(list(...)))[-1] 
    new_env <- new.env() 
    with(data = new_env,attach(df,warn.conflicts = FALSE)) 
    # seq_along() avoids iterating over c(1, 0) when no expressions are passed 
    for(i in seq_along(lhs)){ 
        assign(lhs[i],eval(parse(text=rhs[i]),envir=new_env),envir=new_env) 
    } 
    add_vars <- setdiff(lhs,names(df)) 
    with(data = new_env,detach(df)) 
    for(var in add_vars){ 
        df[[var]] <- new_env[[var]] 
    } 
    df 
} 

data.frame(a = 1) %>% mutate_transform(b = a + 1, c = b * 2) 
# a b c 
# 1 1 2 4 
data.frame(a = 1) %>% mutate_within(b = a + 1, c = b * 2) 
# a c b <--- order is different here 
# 1 1 4 2 
data.frame(a = 1) %>% mutate_attach(b = a + 1, c = b * 2) 
# a b c 
# 1 1 2 4 
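
All three clones share one idea: evaluate the arguments one at a time, each against the data frame as updated by the previous one. As a rough sketch of that idea without going through parse(text = ...), the unevaluated expressions can be captured with substitute() and evaluated in a loop. The name mutate_seq is hypothetical and not part of the answer above, and the sketch assumes every argument is named:

```r
# Hypothetical helper: evaluate each named expression in turn,
# writing the result into df before the next expression is evaluated.
mutate_seq <- function(df, ...) {
    exprs <- as.list(substitute(list(...)))[-1]  # unevaluated arguments, names kept
    caller <- parent.frame()                     # where non-column symbols are looked up
    for (i in seq_along(exprs)) {
        # columns of df (including ones just created) are visible to the expression
        df[[names(exprs)[i]]] <- eval(exprs[[i]], df, caller)
    }
    df
}

mutate_seq(data.frame(a = 1), b = a + 1, c = b * 2)
#   a b c
# 1 1 2 4
```

Because the loop assigns into df before moving on, c = b * 2 can see the freshly created b, which is exactly what the lapply-based version in the question cannot do.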
0

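After reading Moody_Mudskipper's answer, I came up with my own solution: a re-implementation of lazyeval::lazy_eval for a list of expressions that "remembers" the previous evaluations: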

my_eval <- function(expr, .data = NULL) { 
    idx <- structure(seq_along(expr), 
        names = names(expr)) 
    lapply(idx, function(i) { 
        evl <- lazy_eval(expr[[i]], data = .data) 
        .data[names(expr)[i]] <<- evl 
        evl 
    }) 
} 

Then the lazy_eval call in my_mutate just needs to be replaced with my_eval, and everything works as expected.
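
Put together, the corrected version might look like the sketch below. It assumes the lazyeval and rlang packages are installed. It also uses rlang::quo_get_expr() and rlang::quo_get_env() where the question used f_rhs() and f_env(), which is a substitution made here and not part of the original code. As in the question, the result is a named list of the evaluated expressions, not a data frame.

```r
# Convert one rlang quosure into a lazyeval "lazy" object.
compat_as_lazy <- function(quo) {
    structure(class = "lazy", list(
        expr = rlang::quo_get_expr(quo),  # assumption: used in place of f_rhs()
        env = rlang::quo_get_env(quo)     # assumption: used in place of f_env()
    ))
}

compat_as_lazy_dots <- function(...) {
    structure(class = "lazy_dots", lapply(rlang::quos(...), compat_as_lazy))
}

# Evaluate the expressions in order, storing each result in .data
# so that later expressions can refer to earlier ones.
my_eval <- function(expr, .data = NULL) {
    idx <- structure(seq_along(expr), names = names(expr))
    lapply(idx, function(i) {
        evl <- lazyeval::lazy_eval(expr[[i]], data = .data)
        .data[names(expr)[i]] <<- evl  # updates .data in my_eval's frame
        evl
    })
}

my_mutate <- function(.data, ...) {
    my_eval(compat_as_lazy_dots(...), .data = .data)
}

res <- my_mutate(data.frame(a = 1), b = a + 1, c = b * 2)
str(res)
```

The <<- assignment works because the anonymous function passed to lapply is defined inside my_eval, so it modifies my_eval's local copy of .data and leaves the caller's data frame untouched.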