2017-09-15

How do I keep only unique rows while ignoring a column?

Suppose I have data like this:

df1 <- data.frame(name = c("apple", "apple", "apple", "orange", "orange"), 
     ID = c(1, 2, 3, 4, 5), 
     is_fruit = c("yes", "yes", "yes", "yes", "yes")) 

I want to keep only the unique rows, ignoring the ID column when comparing them, so that the output looks like this:

df2 <- data.frame(name = c("apple", "orange"), 
     ID = c(1, 4), 
     is_fruit = c("yes", "yes")) 

df2 
# name ID is_fruit 
#1 apple 1  yes 
#2 orange 4  yes 

How can I do this, preferably with dplyr?

Comment: in base R, df1[!duplicated(df1[-2]), ]

Answers

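You can use the distinct function. By naming the variables explicitly, you keep the rows that are unique with respect to those columns only. Also, from ?distinct: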

If there are multiple rows for a given combination of inputs, only the first row will be preserved.

library(dplyr)
distinct(df1, name, is_fruit, .keep_all = TRUE)
# name ID is_fruit 
#1 apple 1  yes 
#2 orange 4  yes 
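
When listing every column to compare is inconvenient, the ignored column can be excluded by name instead. The following is a minimal sketch, assuming dplyr 1.0.0 or later (where across() is available); the data frame is the one from the question, and `res` is just an illustrative variable name.

```r
library(dplyr)

df1 <- data.frame(name = c("apple", "apple", "apple", "orange", "orange"),
                  ID = c(1, 2, 3, 4, 5),
                  is_fruit = c("yes", "yes", "yes", "yes", "yes"))

# Compare rows on every column except ID. With .keep_all = TRUE the ID
# column stays in the result, taken from the first row of each group.
res <- distinct(df1, across(-ID), .keep_all = TRUE)
print(res)
```

This should give the same rows as naming `name` and `is_fruit` explicitly, but it does not need updating if more columns are added to the data.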

Base R

df1[!duplicated(df1[!names(df1) %in% c("ID")]),] 
# name ID is_fruit 
#1 apple 1  yes 
#4 orange 4  yes 

Replace c("ID") with the names of the columns you want to ignore.
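
If this is needed in more than one place, the same idea can be wrapped in a small function. This is only a sketch: `unique_ignoring` and its `ignore` argument are hypothetical names introduced here, not part of base R.

```r
# Hypothetical helper: keep the first row of each group of rows that are
# identical on all columns except those named in `ignore`.
unique_ignoring <- function(df, ignore) {
  compare_cols <- setdiff(names(df), ignore)
  df[!duplicated(df[compare_cols]), , drop = FALSE]
}

df1 <- data.frame(name = c("apple", "apple", "apple", "orange", "orange"),
                  ID = c(1, 2, 3, 4, 5),
                  is_fruit = c("yes", "yes", "yes", "yes", "yes"))

result <- unique_ignoring(df1, "ID")
print(result)
```

As in the one-liner above, the original row names are kept, so the second row of the result is labelled 4 rather than 2.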
