2013-01-15

Transforming data by time intervals

I want to transform a data set. Here is one row of data, for the individual with id = 1:

id time status 
-------------- 
1 t status 

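t is the time of some event, and status is 1 if the event occurred or 0 if it did not (in which case t is the duration of the study).

Suppose t lies between a2 and a3.

My goal is to transform my data into the following: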

id period start stop status 
--------------------------- 
1 1  0  a1 0  
1 2  a1 a2 0  
1 3  a2 t status 

The total time of individual 1 is split into three intervals, with no event occurring in (0, a1) or (a1, a2).
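
The split described above can be sketched for a single individual as follows. The numeric values and the names `t_obs`, `cuts` and `one_person` are assumptions chosen for illustration; the idea is only that the cut points below the observed time become interval boundaries, and the last interval ends at the observed time.

```r
# Illustrative values (assumptions for this sketch)
a <- c(1, 3, 7)      # cut points a1, a2, a3
t_obs <- 5           # observed time, lying between a2 and a3
status <- 1          # 1 = event occurred at t_obs, 0 = no event

cuts <- a[a < t_obs]         # cut points strictly before the observed time
start <- c(0, cuts)          # lower bound of each interval
stop <- c(cuts, t_obs)       # upper bound; the last interval ends at t_obs
k <- length(start)           # number of intervals for this individual

one_person <- data.frame(
    id = 1,
    period = seq_len(k),
    start = start,
    stop = stop,
    status = c(rep(0, k - 1), status)  # only the last interval carries the status
)
print(one_person)
```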

Question

Can you think of an efficient way to write an R function that takes a data set and a vector as input and outputs the transformed data set?


Edit

Part 1: I have been asked for a concrete example. Here is one:

id time status 
-------------- 
1 5 1 

a1 = 1, a2 = 3, a3 = 7

Part 2: I was also asked to show my attempt. Here it is:

> data <- data.frame(id=1, time=5, status=1) 
> a <- c(1, 3, 7) 
> N <- nrow(data) 
> data$period <- ifelse(data$time < a[1], 1, 
+      ifelse(data$time < a[2], 2, 
+        ifelse(data$time < a[3], 3, 4))) 
> 
> 
> dataTemp1 <- data.frame(matrix(nrow=N, ncol=ncol(data))) 
> names(dataTemp1) <- names(data) 
> dataTemp2 <- data.frame(matrix(nrow=N, ncol=ncol(data))) 
> names(dataTemp2) <- names(data) 
> dataTemp3 <- data.frame(matrix(nrow=N, ncol=ncol(data))) 
> names(dataTemp3) <- names(data) 
> dataTemp4 <- data.frame(matrix(nrow=N, ncol=ncol(data))) 
> names(dataTemp4) <- names(data) 
> 
> for(j in 1:N) 
+ { 
+ if(data[j, "period"] == 1){ 
+  data[j, "start"] <- 0 
+  data[j, "stop"] <- data[j, "time"] 
+ } else if(data[j, "period"] == 2){ 
+  dataTemp1[j, c("id", "time", "period")] <- 
+  data[j, c("id", "time", "period")] 
+  dataTemp1[j, "start"] <- 0 
+  dataTemp1[j, "stop"] <- a[1] 
+  dataTemp1[j, "status"] <- 0 
+  
+  data[j, "start"] <- a[1] 
+  data[j, "stop"] <- data[j, "time"] 
+ } else if(data[j, "period"] == 3){ 
+  dataTemp1[j, c("id", "time", "period")] <- 
+  data[j, c("id", "time", "period")] 
+  dataTemp1[j, "start"] <- 0 
+  dataTemp1[j, "stop"] <- a[1] 
+  dataTemp1[j, "status"] <- 0 
+  
+  dataTemp2[j, c("id", "time", "period")] <- 
+  data[j, c("id", "time", "period")] 
+  dataTemp2[j, "start"] <- a[1] 
+  dataTemp2[j, "stop"] <- a[2] 
+  dataTemp2[j, "status"] <- 0 
+  
+  data[j, "start"] <- a[2] 
+  data[j, "stop"] <- data[j, "time"]  
+ } else if(data[j, "period"] == 4){ 
+  dataTemp1[j, c("id", "time", "period")] <- 
+  data[j, c("id", "time", "period")] 
+  dataTemp1[j, "start"] <- 0 
+  dataTemp1[j, "stop"] <- a[1] 
+  dataTemp1[j, "status"] <- 0 
+  
+  dataTemp2[j, c("id", "time", "period")] <- 
+  data[j, c("id", "time", "period")] 
+  dataTemp2[j, "start"] <- a[1] 
+  dataTemp2[j, "stop"] <- a[2] 
+  dataTemp2[j, "status"] <- 0 
+  
+  dataTemp3[j, c("id", "time", "period")] <- 
+  data[j, c("id", "time", "period")] 
+  dataTemp3[j, "start"] <- a[2] 
+  dataTemp3[j, "stop"] <- a[3] 
+  dataTemp3[j, "status"] <- 0 
+  
+  data[j, "start"] <- a[3] 
+  data[j, "stop"] <- data[j, "time"] 
+ } 
+ } 
> 
> dataTemp1 <- dataTemp1[complete.cases(dataTemp1), ] 
> dataTemp2 <- dataTemp2[complete.cases(dataTemp2), ] 
> dataTemp3 <- dataTemp3[complete.cases(dataTemp3), ] 
> dataTemp4 <- dataTemp4[complete.cases(dataTemp4), ] 
> 
> data <- rbind(data, dataTemp1, dataTemp2, dataTemp3, dataTemp4) 
> data[, "period"] <- ifelse(data[, "start"] == 0, 1, 
+       ifelse(data[, "start"] == a[1], 2, 
+         ifelse(data[, "start"] == a[2], 3, 
+           ifelse(data[, "start"] == a[3], 4, 
+             5)))) 
> data <- data[order(data$id, data$start), 
+    c("id", "period", "start", "stop", "status")] 
> data 
    id period start stop status 
2 1  1  0 1  0 
3 1  2  1 3  0 
1 1  3  3 5  1 
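
The nested `ifelse` calls that assign `period` in the attempt above grow with the number of cut points. As a hedged alternative, base R's `findInterval` counts how many cut points are less than or equal to each time, so adding 1 gives the same period index for any length of `a`. The `time` values below are hypothetical and only serve to compare the two approaches.

```r
a <- c(1, 3, 7)
time <- c(0.5, 1, 2, 5, 7, 10)   # hypothetical observed times

# Period index as computed in the attempt (hard-coded for three cut points)
period_nested <- ifelse(time < a[1], 1,
                 ifelse(time < a[2], 2,
                 ifelse(time < a[3], 3, 4)))

# Same index for any number of sorted cut points
period_fi <- findInterval(time, a) + 1

print(cbind(time, period_nested, period_fi))
```

Note that `findInterval` requires `a` to be sorted in increasing order.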
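
You should provide a reproducible example. What are the ai, dates? Why not provide some numeric values rather than just symbols, and also show what you have tried? – agstudy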

@agstudy: I have made the edit. However, I want a function rather than a program that only works for one example. – user7064

@Arun: Wow, thanks! If you post it as an answer, I will accept it! – user7064

Answer

I'll write it up as a proper reproducible solution:

df <- data.frame(id=1, time=5, status=2) 
a <- c(1, 3, 7) 

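# Assumes df has a single row and that df$time falls in the last interval,
# i.e. between a[length(a) - 1] and a[length(a)].
res.fn <- function(df, a) { 
    n <- length(a) 
    id <- rep(df$id, n)                    # take the id from the data instead of hard-coding 1 
    period <- seq_len(n) 
    start <- c(0, a[-n])                   # lower bounds: 0, a1, ..., a[n-1] 
    stop <- c(a[-n], df$time)              # the last interval ends at the observed time 
    status <- c(rep(0, n - 1), df$status)  # no event before the last interval 
    data.frame(id, period, start, stop, status) 
} 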
> res.fn(df, a) 

    id period start stop status 
1 1  1  0 1  0 
2 1  2  1 3  0 
3 1  3  3 5  2 
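
The function above handles one row whose time lies in the last interval. A more general sketch, assuming a data frame with columns `id`, `time` and `status`, positive times, and any number of rows, could look like the following. The name `split_by_cuts` and the helper variables are hypothetical, and the `left.open` argument of `findInterval` requires R 3.3.0 or later.

```r
split_by_cuts <- function(df, a) {
    a <- sort(a)
    # Number of intervals per row: one for each cut point strictly below the
    # observed time, plus the final interval that ends at the observed time
    n_int <- findInterval(df$time, a, left.open = TRUE) + 1

    src <- rep(seq_len(nrow(df)), times = n_int)  # source row of each output row
    period <- sequence(n_int)                     # 1, 2, ..., n_int within each source row

    bounds <- c(0, a, Inf)
    start <- bounds[period]
    stop <- pmin(bounds[period + 1], df$time[src])  # truncate the last interval at time

    # Only the last interval of each individual carries the original status
    status <- ifelse(period == n_int[src], df$status[src], 0)

    data.frame(id = df$id[src], period, start, stop, status)
}

# Hypothetical input: three individuals with times in different intervals
df_multi <- data.frame(id = c(1, 2, 3), time = c(5, 0.5, 10), status = c(1, 0, 1))
out <- split_by_cuts(df_multi, a = c(1, 3, 7))
print(out)
```

This avoids the per-row loop by building all output rows at once with `rep` and `sequence`. For real survival analyses, the `survSplit` function in the survival package performs this kind of splitting and may be preferable to a hand-written version.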

This works perfectly! – user7064

Yes, of course, I will. Thanks – user7064