2017-07-26 91 views
0

I want to write a function that produces model diagnostic plots. How do I pass column names as function arguments when using dplyr and ggplot2?

to_plot <- function(df, model, response_variable, indep_variable) { 
    resp_plot <- 
    df %>% 
    mutate(model_resp = predict.glm(model, df, type = 'response')) %>% 
    group_by(indep_variable) %>% 
    summarize(actual_response = mean(response_variable), 
       predicted_response = mean(model_resp)) %>% 
    ggplot(aes(indep_variable)) + 
    geom_line(aes(x = indep_variable, y = actual_response, colour = "actual")) + 
    geom_line(aes(x = indep_variable, y = predicted_response, colour = "predicted")) + 
    ylab(label = 'Response') 

} 

When I run this on a dataset, dplyr throws an error I don't understand:

fit <- glm(data = mtcars, mpg ~ wt + qsec + am, family = gaussian(link = 'identity'))
to_plot(mtcars, fit, mpg, wt) 

Error in grouped_df_impl(data, unname(vars), drop) : 
    Column `indep_variable` is unknown 

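Based on some crude debugging, I found that the error occurs in the group_by step, so it is probably related to how I refer to the columns inside the function. Thanks!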

+1

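You need another layer of complexity to handle *standard evaluation* (i.e., using the value that `indep_variable` represents, rather than looking for a column called `indep_variable` itself): https://stackoverflow.com/questions/44593596/how-to-pass-strings-denoting-expressions-to-dplyr-0-7-verbs/44593617#44593617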

+5

This is because dplyr uses non-standard evaluation (NSE). Hadley explains NSE here: http://dplyr.tidyverse.org/articles/programming.html and there is also a pretty good webinar: https://www.rstudio.com/resources/webinars/whats-new-in-dplyr-0-7-0/ – biomiha

+0

Thanks. Based on your replies, I've added a proposed answer below, but I'd welcome feedback from anyone who understands this better. – joe

Answer

1

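This code seems to fix it. As the commenters above mentioned, the variables passed to the function must be captured with `enquo()` and then unquoted with `!!`. Note that `aes()` becomes `aes_()` here, because `aes_()` accepts quoted expressions (such as `quote(actual_response)`) instead of bare column names.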

library(tidyverse) 

to_plot <- function(df, model, response_variable, indep_variable) { 
    response_variable <- enquo(response_variable) 
    indep_variable <- enquo(indep_variable) 

    resp_plot <- 
    df %>% 
    mutate(model_resp = predict.glm(model, df, type = 'response')) %>% 
    group_by(!!indep_variable) %>% 
    summarize(actual_response = mean(!!response_variable), 
       predicted_response = mean(model_resp)) %>% 
    ggplot(aes_(indep_variable)) + 
    geom_line(aes_(x = indep_variable, y = quote(actual_response)), colour = "blue") + 
    geom_line(aes_(x = indep_variable, y = quote(predicted_response)), colour = "red") + 
    ylab(label = 'Response') 

    return(resp_plot) 
} 

fit <- glm(data = mtcars, mpg ~ wt + qsec + am, family = gaussian(link = 'identity')) 
to_plot(mtcars, fit, mpg, wt) 
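
For readers on newer package versions, the same idea can be written more compactly. This is only a sketch, not part of the original answer: it assumes dplyr >= 1.0.0 (for the `.groups` argument and the re-exported `{{ }}` embrace operator from rlang >= 0.4.0) and ggplot2 >= 3.0.0 (where `aes()` supports tidy evaluation). The name `to_plot_embraced` is made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical variant of to_plot(): {{ }} captures and unquotes the
# column argument in one step, replacing the enquo() + !! pair.
to_plot_embraced <- function(df, model, response_variable, indep_variable) {
  df %>%
    # Add the model's predictions on the response scale
    mutate(model_resp = predict(model, newdata = df, type = "response")) %>%
    # Group by the column the caller passed as a bare name
    group_by({{ indep_variable }}) %>%
    summarize(
      actual_response = mean({{ response_variable }}),
      predicted_response = mean(model_resp),
      .groups = "drop"
    ) %>%
    # aes() understands {{ }} in ggplot2 >= 3.0.0, so aes_() is not needed
    ggplot(aes(x = {{ indep_variable }})) +
    geom_line(aes(y = actual_response, colour = "actual")) +
    geom_line(aes(y = predicted_response, colour = "predicted")) +
    ylab("Response")
}

fit <- glm(mpg ~ wt + qsec + am, data = mtcars,
           family = gaussian(link = "identity"))
p <- to_plot_embraced(mtcars, fit, mpg, wt)
# print(p) draws the plot
```

Mapping `colour` inside `aes()`, as in the original question, yields a legend labelled "actual" and "predicted"; setting fixed colours outside `aes()`, as in the answer above, drops the legend.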
+0

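This works, but it isn't very elegant. Please feel free to edit and improve it; I don't think I've fully wrapped my head around this yet. – joe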
