2017-07-03 52 views

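Update a column value and delete a row in R

I am reading a CSV file and trying to update the value of a column named 'added'. When two consecutive rows have the same bug_id and bug_when, and the 'added' value in row i is "RESOLVED", I want to update the 'added' value in row i + 1 by concatenating the 'added' values of rows i and i + 1, and then delete row i. I tried, but it does not work properly. The file contains the following information: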

bug_id bug_when   field  added 
1141327 2015-03-09 16:21:30 Status  RESOLVED 
1141327 2015-03-09 16:21:30 Resolution DUPLICATE 
1142623 2015-03-24 18:15:22 Status  RESOLVED 
1142623 2015-03-24 18:15:22 Resolution FIXED 
1143179 2015-07-30 09:37:56 Status  RESOLVED 
1143179 2015-07-30 09:37:56 Resolution FIXED 

Here is my code:

dataframe <- read.csv("prototype.csv", header = TRUE) 
start <- 1 
end <- nrow(dataframe)-1 

for(i in start:end) 
{ 
    if(dataframe$bug_id[i]==dataframe$bug_id[i+1] & dataframe$bug_when[i]==dataframe$bug_when[i+1]) 
    { 
    if(dataframe$added[i]=="RESOLVED") 
    { 
     df <- paste(dataframe$added[i],"-",dataframe$added[i+1]) 
     dataframe$added[i+1] <- df 
     dataframe <- dataframe[!(dataframe[i,])] 
    } 

    } 

} 

Any suggestions would be highly appreciated. Desired result:

bug_id bug_when   field  added 
1141327 2015-03-09 16:21:30 Resolution RESOLVED-DUPLICATE 
1142623 2015-03-24 18:15:22 Resolution RESOLVED-FIXED 
1143179 2015-07-30 09:37:56 Resolution RESOLVED-FIXED 

Can you add the desired result for the example data you provided?

@PLapointe The desired result has been added. – user2293224

Answers

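Here is how to do this with dplyr. Basically, within each (bug_id, bug_when) group, whenever the previous row (t-1) has added equal to "RESOLVED", the two added strings are concatenated with paste. Then filter keeps only the rows whose field is "Resolution".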

library(dplyr) 
df%>% 
    group_by(bug_id,bug_when)%>% 
    mutate(added=ifelse(lag(added) =="RESOLVED" & !is.na(lag(added)), 
        paste(lag(added),(added),sep="-"), 
        added))%>% 
    filter(field=="Resolution") 

    bug_id   bug_when  field    added 
    <int>    <chr>  <chr>    <chr> 
1 1141327 2015-03-09 16:21:30 Resolution RESOLVED-DUPLICATE 
2 1142623 2015-03-24 18:15:22 Resolution  RESOLVED-FIXED 
3 1143179 2015-07-30 09:37:56 Resolution  RESOLVED-FIXED 

Data

df <- read.table(text="bug_id bug_when   field  added 
1141327 '2015-03-09 16:21:30' Status  RESOLVED 
1141327 '2015-03-09 16:21:30' Resolution DUPLICATE 
1142623 '2015-03-24 18:15:22' Status  RESOLVED 
1142623 '2015-03-24 18:15:22' Resolution FIXED 
1143179 '2015-07-30 09:37:56' Status  RESOLVED 
1143179 '2015-07-30 09:37:56' Resolution FIXED", 
       header=TRUE,stringsAsFactors=FALSE) 

I think you want to combine aggregate and paste, like this:

df <- read.table(text="bug_id bug_when   field  added 
1141327 '2015-03-09 16:21:30' Status  RESOLVED 
1141327 '2015-03-09 16:21:30' Resolution DUPLICATE 
1142623 '2015-03-24 18:15:22' Status  RESOLVED 
1142623 '2015-03-24 18:15:22' Resolution FIXED 
1143179 '2015-07-30 09:37:56' Status  RESOLVED 
1143179 '2015-07-30 09:37:56' Resolution FIXED",stringsAsFactors = FALSE,header=TRUE) 

df2 <- aggregate(added ~ bug_id + bug_when, df,paste,collapse = "-") 
df2$field <- "Resolution" 

# bug_id   bug_when    added  field 
# 1 1141327 2015-03-09 16:21:30 RESOLVED-DUPLICATE Resolution 
# 2 1142623 2015-03-24 18:15:22  RESOLVED-FIXED Resolution 
# 3 1143179 2015-07-30 09:37:56  RESOLVED-FIXED Resolution
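
For comparison, here is a minimal base R sketch that follows the condition stated in the question more literally: it only merges a pair when row i and row i + 1 share bug_id and bug_when and row i holds "RESOLVED". (The aggregate approach above pastes together every row in a group, whatever the values are.) The names merge_up, hit and result are made up for this illustration. It also sidesteps three likely problems in the original loop: paste() uses a space as its default separator, so paste(a, "-", b) gives "RESOLVED - DUPLICATE"; dataframe[!(dataframe[i,])] is not a valid way to drop row i (dataframe[-i, ] would be), and deleting rows inside the loop shifts the indices while end stays fixed; and if added was read as a factor (the read.csv default before R 4.0.0), assigning a new string to it yields NA.

```r
# Sample data from the question, read as character columns (not factors).
df <- read.table(text = "bug_id bug_when field added
1141327 '2015-03-09 16:21:30' Status RESOLVED
1141327 '2015-03-09 16:21:30' Resolution DUPLICATE
1142623 '2015-03-24 18:15:22' Status RESOLVED
1142623 '2015-03-24 18:15:22' Resolution FIXED
1143179 '2015-07-30 09:37:56' Status RESOLVED
1143179 '2015-07-30 09:37:56' Resolution FIXED",
                 header = TRUE, stringsAsFactors = FALSE)

n <- nrow(df)
i <- seq_len(max(n - 1, 0))  # first row of each consecutive pair

# TRUE where rows i and i + 1 share bug_id and bug_when
# and row i holds "RESOLVED" (hypothetical helper name).
merge_up <- df$bug_id[i] == df$bug_id[i + 1] &
  df$bug_when[i] == df$bug_when[i + 1] &
  df$added[i] == "RESOLVED"

hit <- which(merge_up)  # which() also drops any NA comparisons

# Prefix row i + 1 with the value from row i, using "-" as separator.
df$added[hit + 1] <- paste(df$added[hit], df$added[hit + 1], sep = "-")

# Drop the matched rows in one step, after all updates are done.
result <- if (length(hit) > 0) df[-hit, ] else df
rownames(result) <- NULL
print(result)
```

Because all comparisons are done on whole vectors and the rows are removed once at the end, no index shifts while the data frame is being processed.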