2017-10-12 81 views
-1

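Expanding a date value within a group to fill the NAs below

My data has two general groups, each containing multiple observations, some of which are NA in the DLA field. The DLA date is the same for all records within a group. How can I expand the DLA value so that the NA rows are filled with the corresponding date? I am working within dplyr, and I suspect there is a solution I just can't find. These data are a small subset of a larger dataset with about 5k rows and about 500 individuals. Many thanks.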

dat <- structure(list(GenIndID = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("BHS_106", 
"BHS_164"), class = "factor"), IndID = structure(c(1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 3L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 7L, 
8L), .Label = c("BHS_106_A", "BHS_106_B", "BHS_106_C", "BHS_106_D", 
"BHS_164_A", "BHS_164_B", "BHS_164_C", "BHS_164_D"), class = "factor"), 
    DLA = structure(c(1507010400, 1507010400, 1507010400, 1507010400, 
    1507010400, 1507010400, 1507010400, 1507010400, NA, NA, 1499061600, 
    1499061600, 1499061600, 1499061600, 1499061600, 1499061600, 
    1499061600, NA, NA, NA), tzone = "", class = c("POSIXct", 
    "POSIXt"))), .Names = c("GenIndID", "IndID", "DLA"), row.names = c(411L, 
412L, 413L, 414L, 415L, 416L, 417L, 418L, 419L, 420L, 442L, 443L, 
444L, 445L, 446L, 447L, 448L, 449L, 450L, 451L), class = "data.frame") 

> dat 
    GenIndID  IndID  DLA 
411 BHS_106 BHS_106_A 2017-10-03 
412 BHS_106 BHS_106_A 2017-10-03 
413 BHS_106 BHS_106_A 2017-10-03 
414 BHS_106 BHS_106_A 2017-10-03 
415 BHS_106 BHS_106_B 2017-10-03 
416 BHS_106 BHS_106_B 2017-10-03 
417 BHS_106 BHS_106_B 2017-10-03 
418 BHS_106 BHS_106_B 2017-10-03 
419 BHS_106 BHS_106_C  <NA> 
420 BHS_106 BHS_106_D  <NA> 
442 BHS_164 BHS_164_A 2017-07-03 
443 BHS_164 BHS_164_A 2017-07-03 
444 BHS_164 BHS_164_A 2017-07-03 
445 BHS_164 BHS_164_A 2017-07-03 
446 BHS_164 BHS_164_A 2017-07-03 
447 BHS_164 BHS_164_A 2017-07-03 
448 BHS_164 BHS_164_A 2017-07-03 
449 BHS_164 BHS_164_B  <NA> 
450 BHS_164 BHS_164_C  <NA> 
451 BHS_164 BHS_164_D  <NA> 
+2

Possible duplicate of [How to replace NA with most recent non-NA by group?](https://stackoverflow.com/questions/39063253/how-to-replace-na-with-most-recent-non-na-by-group) or [Replace NA value with the group value](https://stackoverflow.com/questions/23583739/replace-na-value-with-the-group-value)

+0

Yes, this is a duplicate. Apologies. How do I delete it? Because it already has an answer, SO won't allow it (at least not at my reputation level).

Answer

0

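After grouping by 'GenIndID', we need fill (from the tidyr package). Because the NAs sit at the bottom of each group, the default .direction = 'down' already fills them, so we don't need to specify it.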

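library(dplyr)
library(tidyr)  # provides fill()

dat %>% 
    group_by(GenIndID) %>%   # fill within each GenIndID separately
    fill(DLA)                # default .direction = "down"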
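
If the NAs are not guaranteed to sit below the known date in every group, a direction that fills both ways may be safer. The following is a minimal sketch on made-up toy data (the `toy` data frame and its values are hypothetical, not taken from the question), and it assumes a reasonably recent tidyr in which `.direction = "downup"` is available:

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical values): one known date per group, NA elsewhere.
# In group "g2" the first row is NA, so a plain downward fill would miss it.
toy <- data.frame(
  GenIndID = c("g1", "g1", "g1", "g2", "g2", "g2"),
  DLA = as.Date(c("2017-10-03", NA, NA, NA, "2017-07-03", NA))
)

filled <- toy %>%
  group_by(GenIndID) %>%
  # "downup" fills downwards first, then upwards, within each group
  fill(DLA, .direction = "downup") %>%
  ungroup()

print(filled)
```

This relies on the assumption stated in the question that each group carries a single DLA date; if a group held several distinct dates, each NA would simply take its nearest neighbouring value.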