2017-10-21 101 views

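How can I use the apply functions in R when the function needs both the column name and the values from that same column as arguments?

I want to create a function that, for a given data frame, takes a column name as its first argument and a value from that column (the value in a particular row) as its second argument. The second argument is then converted to a numeric value according to the mapping set up in a switch statement.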

Here is the function I have been working on.

# print("ERROR in Question") is the fallback when no question matches at all
scoreraw <- function(Question, Answer) {

    switch(Question,
        "Today is my favourite day?" =
            switch(Answer, "Strongly Agree" = 3, "Agree" = 2, "Disagree" = 1,
                   "Strongly Disagree" = 0),
        "I hate Tuesdays?" =
            switch(Answer, "Strongly Agree" = 0, "Agree" = 1, "Disagree" = 2,
                   "Strongly Disagree" = 3),
        print("ERROR in Question"))
}

Here is a quick test of the function showing how it works:

# We expect the value to be 3 based on the Question and Answer argument 
scoreraw("Today is my favourite day?","Strongly Agree") 
    # [1] 3 



#Let us now create a dummy dataset of questions 

x <- c("Strongly Agree","Agree","Disagree","Strongly Disagree") 
y <- c("Strongly Agree","Agree","Disagree","Strongly Disagree") 

c <- data.frame(x,y) 

# Just changing the names to match the questions in the switch statement 
colnames(c) <- c("Today is my favourite day?", "I hate Tuesdays?") 

# The two factors were converted to characters since factors are treated as 
# integers by default (I may be incorrect here) 
c$`Today is my favourite day?` <- as.character(c$`Today is my favourite day?`)
c$`I hate Tuesdays?` <- as.character(c$`I hate Tuesdays?`)

# > c
#   Today is my favourite day?  I hate Tuesdays?
# 1             Strongly Agree    Strongly Agree
# 2                      Agree             Agree
# 3                   Disagree          Disagree
# 4          Strongly Disagree Strongly Disagree

This is what I want the data frame to look like after applying my function:

#   Today is my favourite day? I hate Tuesdays?
# 1                          3                0
# 2                          2                1
# 3                          1                2
# 4                          0                3

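I then tried to use the apply functions, but my problem is how to pick an arbitrary column name and apply the function to every row value in that particular column. At the moment I can only apply the function by manually selecting a column name and a single row value.
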
#Example of selecting column name and row value manually 
scoreraw(colnames(c)[2],c[1,2]) 
# [1] 0 

Edit: code that currently works for selecting an arbitrary column

# print("ERROR in Question") is the fallback when no question matches at all
scoreraw <- function(Question, Answer) {

    switch(Question,
        "Today is my favourite day?" =
            switch(Answer, "Strongly Agree" = 3, "Agree" = 2, "Disagree" = 1,
                   "Strongly Disagree" = 0),
        "I hate Tuesdays?" =
            switch(Answer, "Strongly Agree" = 0, "Agree" = 1, "Disagree" = 2,
                   "Strongly Disagree" = 3),
        print("ERROR in Question"))
}


#Let us now create a dummy dataset of questions 

x <- c("Strongly Agree","Agree","Disagree","Strongly Disagree") 
y <- c("Strongly Agree","Agree","Disagree","Strongly Disagree") 

c <- data.frame(x,y) 

# Just changing the names to match the questions in the switch statement 
colnames(c) <- c("Today is my favourite day?", "I hate Tuesdays?") 

# The two factors were converted to characters since factors are treated as 
# integers by default (I may be incorrect here) 
c$`Today is my favourite day?` <- as.character(c$`Today is my favourite day?`)
c$`I hate Tuesdays?` <- as.character(c$`I hate Tuesdays?`)



call_scoreraw <- function(n, DF) { 
    sapply(DF[[n]], function(x) scoreraw(colnames(DF)[n], x)) 
} 

#I included unlist as I noticed the output can also be a list 
a <- unlist(call_scoreraw(1, c)) 
b <- as.data.frame(a) 

I am now trying to put a for loop inside the call_scoreraw function so that scoreraw is applied to any column or columns.

call_scoreraw <- function(n, DF) { 
    Storage <- numeric(ncol(DF)) 
    for (i in n:ncol(DF)){ 
    Storage[i] <- sapply(DF[,i], function(x) scoreraw(colnames(DF)[i], x)) 
    } 
} 

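As you can see, I currently need a way to store the values produced by the for loop. I have not been able to do this with the Storage variable I defined. Any suggestions on how to do this?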
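One possible way to store the loop results, sketched under some assumptions: each column yields a whole vector of scores, so a list with one slot per column can hold them, whereas a `numeric` vector has room for only one number per slot. The names `survey` and `call_scoreraw_loop` are made up for this illustration, and `stop()` replaces the `print()` fallback so that an unknown question fails loudly.

```r
# Scoring function from the question, with the answer labels on one line.
scoreraw <- function(Question, Answer) {
  switch(Question,
    "Today is my favourite day?" =
      switch(Answer, "Strongly Agree" = 3, "Agree" = 2,
             "Disagree" = 1, "Strongly Disagree" = 0),
    "I hate Tuesdays?" =
      switch(Answer, "Strongly Agree" = 0, "Agree" = 1,
             "Disagree" = 2, "Strongly Disagree" = 3),
    stop("unknown question: ", Question))  # assumption: error instead of print()
}

# Hypothetical example data, mirroring the data frame in the question.
answers <- c("Strongly Agree", "Agree", "Disagree", "Strongly Disagree")
survey <- data.frame(answers, answers, stringsAsFactors = FALSE)
colnames(survey) <- c("Today is my favourite day?", "I hate Tuesdays?")

call_scoreraw_loop <- function(cols, DF) {
  # A list can hold one full vector of scores per selected column.
  storage <- vector("list", length(cols))
  for (k in seq_along(cols)) {
    i <- cols[k]
    storage[[k]] <- vapply(DF[[i]],
                           function(x) scoreraw(colnames(DF)[i], x),
                           numeric(1), USE.NAMES = FALSE)
  }
  # The function must return the result; the original loop returned nothing.
  out <- as.data.frame(storage)
  colnames(out) <- colnames(DF)[cols]  # restore the question text as names
  out
}

scored <- call_scoreraw_loop(1:2, survey)
```

Here `scored` is a data frame with one column per question, which is the layout asked for above.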


There is a typo in the function 'scoreraw': it should be 'Tuesdays' rather than 'tuesdays'. –


Thanks, I have fixed the typo now. @RuiBarradas – MrReference

Answers

0

Define another function that calls scoreraw. Something like this:

call_scoreraw <- function(n, DF) { 
    if(length(n) > 1){ 
     t(sapply(n, function(i){ 
      sapply(DF[[i]], function(x) scoreraw(colnames(DF)[i], x)) 
     })) 
    } else { 
     sapply(DF[[n]], function(x) scoreraw(colnames(DF)[n], x)) 
    } 
} 

call_scoreraw(2, c) 
#    Strongly Agree             Agree          Disagree Strongly Disagree
#                 0                 1                 2                 3

call_scoreraw(1:2, c) 
#      Strongly Agree Agree Disagree Strongly Disagree
# [1,]              3     2        1                 0
# [2,]              0     1        2                 3

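Note that calling it with a vector of values for n returns an object of class matrix. If you want, you can coerce the result of the call to a data.frame.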

res <- call_scoreraw(1:2, c) 
res2 <- as.data.frame(res) 

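What if you want to apply this call_scoreraw function to an arbitrary number of columns? I would like to automate the function so that I don't have to manually index the columns I want to select. In other words, an implicit for loop. @RuiBarradas – MrReference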
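A minimal sketch of one way to score every column without indexing by hand, assuming every column name is a question that `scoreraw` knows about. It uses `Map()` to walk the column names and the columns in parallel. The helper name `score_all` and the data frame `survey` are made up for this illustration.

```r
# Compact copy of the scoring function so the example runs on its own.
scoreraw <- function(Question, Answer) {
  switch(Question,
    "Today is my favourite day?" =
      switch(Answer, "Strongly Agree" = 3, "Agree" = 2,
             "Disagree" = 1, "Strongly Disagree" = 0),
    "I hate Tuesdays?" =
      switch(Answer, "Strongly Agree" = 0, "Agree" = 1,
             "Disagree" = 2, "Strongly Disagree" = 3),
    stop("unknown question: ", Question))  # assumption: fail on unknown names
}

# Hypothetical example data with the same shape as in the question.
answers <- c("Strongly Agree", "Agree", "Disagree", "Strongly Disagree")
survey <- data.frame(answers, answers, stringsAsFactors = FALSE)
colnames(survey) <- c("Today is my favourite day?", "I hate Tuesdays?")

score_all <- function(DF) {
  # Map() pairs each column name with its column; no manual index is needed.
  scored <- Map(function(question, column) {
    vapply(column, function(a) scoreraw(question, a),
           numeric(1), USE.NAMES = FALSE)
  }, colnames(DF), DF)
  out <- as.data.frame(scored)
  colnames(out) <- colnames(DF)  # as.data.frame() alters non-syntactic names
  out
}

result <- score_all(survey)
```

Unlike the matrix returned for `n = 1:2`, `result` keeps one column per question, so it does not need to be transposed.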


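I tried the code and it works well. For my own learning, could you explain this part of the code: 'sapply(n, function(i) sapply(DF[[i]], function(x) scoreraw(colnames(DF)[i], x)))'? In particular the 'function(i)' part, that is, 'function(i) sapply(DF[[i]], function(x) scoreraw(colnames(DF)[i], x))'. – MrReference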


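@MrReference The first 'sapply' loops over the vector 'n', and the second loops over the vector 'DF[[i]]'. So for each 'i' in 'n' and each 'x' in 'DF[[i]]', the function 'scoreraw' is applied; it takes two arguments, a column name and a scalar. –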
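To make the two loops visible, here is a toy sketch that swaps 'scoreraw' for a made-up function 'describe', which simply pastes its two arguments together. The data frame 'toy' and its column names are also made up for this illustration.

```r
# Hypothetical two-column data frame; the values are arbitrary labels.
toy <- data.frame(a = c("x", "y"), b = c("p", "q"), stringsAsFactors = FALSE)

# Stand-in for scoreraw(): takes a column name and one scalar value.
describe <- function(col_name, value) paste(col_name, value, sep = ":")

res <- sapply(1:2, function(i) {           # outer loop: one pass per column index
  sapply(toy[[i]], function(x) {           # inner loop: one pass per value
    describe(colnames(toy)[i], x)          # sees both the name and the value
  })
})
```

Each column of 'res' holds the results for one column of 'toy'; wrapping the call in 't()', as the answer does, turns those into rows.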
