2017-09-28

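If statements and loops to append new data to the original data frame

I have some model output that I want to merge back into the original data file. I can do this with nested ifelse() calls, but I would like to generalize the process so that I can run it as a batch job across multiple data sets. Here is what I have tried so far.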

The model output corresponds to blocks of time, whereas each raw data point is associated with a discrete time.

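To start with, I ran one day at a time by hand (here is an example for one parameter on the first day). This very large, ugly ifelse aggregates the data correctly.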

track[,"phase"]= ifelse((phaseTable1$start[1]<=track$Time)& (track$Time< phaseTable1$end[1]), phaseTable1$phase[1], 
        ifelse((phaseTable1$start[2]<=track$Time)& (track$Time< phaseTable1$end[2]), phaseTable1$phase[2], 
         ifelse((phaseTable1$start[3]<=track$Time)& (track$Time< phaseTable1$end[3]), phaseTable1$phase[3], 
           ifelse((phaseTable1$start[4]<=track$Time)& (track$Time< phaseTable1$end[4]), phaseTable1$phase[4], 
             ifelse((phaseTable1$start[5]<=track$Time)& (track$Time< phaseTable1$end[5]), phaseTable1$phase[5], 
               ifelse((phaseTable1$start[6]<=track$Time)& (track$Time< phaseTable1$end[6]), phaseTable1$phase[6], 
                ifelse((phaseTable1$start[7]<=track$Time)& (track$Time< phaseTable1$end[7]), phaseTable1$phase[7], 
                  ifelse((phaseTable1$start[8]<=track$Time)& (track$Time< phaseTable1$end[8]), phaseTable1$phase[8], 
                    ifelse((phaseTable1$start[9]<=track$Time)& (track$Time< phaseTable1$end[9]), phaseTable1$phase[9], 
                      ifelse((phaseTable1$start[10]<=track$Time)& (track$Time< phaseTable1$end[10]), phaseTable1$phase[10], 
                       ifelse((phaseTable1$start[11]<=track$Time)& (track$Time< phaseTable1$end[11]), phaseTable1$phase[11], 
                         ifelse((phaseTable1$start[12]<=track$Time)& (track$Time< phaseTable1$end[12]), phaseTable1$phase[12], 
                           ifelse((phaseTable1$start[13]<=track$Time)& (track$Time< phaseTable1$end[13]), phaseTable1$phase[13], 
                             ifelse((phaseTable1$start[14]<=track$Time)& (track$Time<phaseTable1$end[14]), phaseTable1$phase[14], 
                              ifelse((phaseTable1$start[15]<=track$Time)& (track$Time< phaseTable1$end[15]), phaseTable1$phase[15], 
                                ifelse((phaseTable1$start[16]<=track$Time)& (track$Time< phaseTable1$end[16]), phaseTable1$phase[16], 
                                  ifelse((phaseTable1$start[17]<=track$Time)& (track$Time< phaseTable1$end[17]), phaseTable1$phase[17], 
                                    ifelse((phaseTable1$start[18]<=track$Time)& (track$Time< phaseTable1$end[18]), phaseTable1$phase[18], 
                                     ifelse((phaseTable1$start[19]<=track$Time)& (track$Time< phaseTable1$end[19]), phaseTable1$phase[19], 
                                       ifelse((phaseTable1$start[20]<=track$Time)& (track$Time< phaseTable1$end[20]), phaseTable1$phase[20], 
                                         ifelse((phaseTable1$start[21]<=track$Time)& (track$Time< phaseTable1$end[21]), phaseTable1$phase[21], 
                                           ifelse((phaseTable1$start[22]<=track$Time)& (track$Time< phaseTable1$end[22]), phaseTable1$phase[22], 
                                            ifelse((phaseTable1$start[23]<=track$Time)& (track$Time< phaseTable1$end[23]), phaseTable1$phase[23], 
                                              ifelse((phaseTable1$start[24]<=track$Time)& (track$Time< phaseTable1$end[24]), phaseTable1$phase[24], 
                                                ifelse((phaseTable1$start[25]<=track$Time)& (track$Time< phaseTable1$end[25]), phaseTable1$phase[25], 
                                                  ifelse((phaseTable1$start[26]<=track$Time)& (track$Time< phaseTable1$end[26]), phaseTable1$phase[26], 
                                                   ifelse((phaseTable1$start[27]<=track$Time)& (track$Time< phaseTable1$end[27]), phaseTable1$phase[27], 
                                                     ifelse((phaseTable1$start[28]<=track$Time)& (track$Time< phaseTable1$end[28]), phaseTable1$phase[28], 
                                                       ifelse((phaseTable1$start[29]<=track$Time)& (track$Time< phaseTable1$end[29]), phaseTable1$phase[29], 
                                                         ifelse((phaseTable1$start[30]<=track$Time)& (track$Time< phaseTable1$end[30]), phaseTable1$phase[30], 
                                                          ifelse((phaseTable1$start[31]<=track$Time)& (track$Time< phaseTable1$end[31]), phaseTable1$phase[31], 
                                                            ifelse((phaseTable1$start[32]<=track$Time)& (track$Time< phaseTable1$end[32]), phaseTable1$phase[32], 
                                                              ifelse((phaseTable1$start[33]<=track$Time)& (track$Time< phaseTable1$end[33]), phaseTable1$phase[33], 
                                                                ifelse((phaseTable1$start[34]<=track$Time)& (track$Time< phaseTable1$end[34]), phaseTable1$phase[34], 
                                                                 ifelse((phaseTable1$start[35]<=track$Time)& (track$Time< phaseTable1$end[35]), phaseTable1$phase[35],phaseTable1$phase[35] 

                                                        ))))))))))))))))))))))))))))))))))) 

This works, but it is quite clumsy, and the number of nested conditions varies from one day of data to the next.

I tried to rework this into a more practical loop:

for (j in 1:nrow(phaseTable1)){ 
if((phaseTable1$start[j]<=track$Time)&(track$Time< phaseTable1$end[j])){track$tau== phaseTable1$tau[j]} 

} 

and kept getting this warning, with no data written:

In if ((phaseTable1$start[j] <= track$Time) & (track$Time < ... the condition has length > 1 and only the first element will be used 

I also tried aggregating like this:

for (j in 1:nrow(phaseTable1)){ 
     track$phase<-ifelse(((phaseTable1$star [j]<=track$Time)&(track$Time< phaseTable1$end[j])), phaseTable1$phase[j],""))) 
} 

New columns appeared in the data frame, but they were empty.

I then tried again with the ifelse wrappers suggested in a blog post (thatssorandom), which also resulted in an error.

for (j in 1:nrow(phaseTable1)){ 
ie(
    i(((phaseTable1$start[j]<=track$Time)&(track$Time< phaseTable1$end[j])),track$phase<- phaseTable1$phase[j]), 
e("na")) 

    } 

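Is there an obvious mistake in what I am doing, or is there another way to achieve what I am trying to do? I admit I am a relatively amateur user. I have looked through other ifelse questions on the forum but have not been able to work out what I am doing wrong. I already have a working loop that lets me run my model for each day in the data frame. If I can get this next loop to run, I will be able to nest it inside the first loop and aggregate the data in batches. Any insight into a solution would be greatly appreciated!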

Answers


Without a data set to work with, here is a made-up example. This can be done with findInterval:

df1 <- data.frame(start = seq(as.POSIXct("2017-08-07 00:00:00"), by = "hour", length.out = 24)) 
df1$end <- df1$start + 3600 
df1$phase <- letters[seq_len(nrow(df1))] 

v <- findInterval(c(as.POSIXct("2017-08-07 02:38:24"), as.POSIXct("2017-08-07 21:59:59")), df1$start) 
df1$phase[v] 
[1] "c" "v" 

The end times are not needed unless there are gaps between the intervals.
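If the intervals do have gaps, the end times can be used as a second check after findInterval. The following is only a sketch on made-up data: the column names follow the question (phaseTable1, track, Time, phase, tau), but the values are invented, and it assumes the start column is sorted in increasing order, which findInterval requires.

```r
# Hypothetical data: three phases with a gap between 20 and 30
phaseTable1 <- data.frame(start = c(0, 10, 30),
                          end   = c(10, 20, 40),
                          phase = c("a", "b", "c"),
                          tau   = c(0.1, 0.2, 0.3),
                          stringsAsFactors = FALSE)
track <- data.frame(Time = c(3, 12, 25, 35))

# Index of the last interval whose start is <= Time (0 if before the first start)
idx <- findInterval(track$Time, phaseTable1$start)
idx[idx == 0] <- NA

# Keep a match only if Time also falls before that interval's end
inside <- !is.na(idx) & track$Time < phaseTable1$end[idx]
idx[!inside] <- NA

# Copy the matching values; rows in a gap get NA
track$phase <- phaseTable1$phase[idx]
track$tau   <- phaseTable1$tau[idx]
track
```

Here the row with Time = 25 falls in the gap and is left as NA, while the other rows pick up the phase and tau of their interval.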


For the first error, see ?&:

& and && indicate logical AND and | and || indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators. The longer form evaluates left to right examining only the first element of each vector. Evaluation proceeds only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in if clauses.
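To illustrate the elementwise form, here is a small sketch on invented values (the bounds 3 and 7 and the value 0.5 are placeholders, not from the question). Comparing a whole column with & gives one TRUE/FALSE per row, which is why it cannot serve directly as the single condition that if expects, but it works well as a row index.

```r
# Hypothetical track with three time points
track <- data.frame(Time = c(1, 5, 9))

# Elementwise comparison: one logical value per row
in_block <- (3 <= track$Time) & (track$Time < 7)
length(in_block)  # 3, not 1

# Use the logical vector to assign only to the matching rows
track$tau <- NA_real_
track$tau[in_block] <- 0.5
track
```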

Second error: a typo. phaseTable1$star [j] should be phaseTable1$start[j].
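One more thing worth checking in that loop, which is my own observation and not something stated in the thread: each pass reassigns the whole track$phase column, so rows matched on an earlier pass are overwritten with "" on later passes. A minimal sketch that assigns only to the matching rows, using made-up data with the question's column names:

```r
# Hypothetical phase table and track
phaseTable1 <- data.frame(start = c(0, 10, 20),
                          end   = c(10, 20, 30),
                          phase = c("a", "b", "c"),
                          stringsAsFactors = FALSE)
track <- data.frame(Time = c(3, 12, 25, 19.5))

# Create the column once, then fill it interval by interval
track$phase <- NA_character_
for (j in seq_len(nrow(phaseTable1))) {
  hit <- phaseTable1$start[j] <= track$Time & track$Time < phaseTable1$end[j]
  track$phase[hit] <- phaseTable1$phase[j]  # rows outside the interval are untouched
}
track
```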

Third error: a typo. i should be if.


This is great, thank you! I didn't know about findInterval(). The distinction between the long and short forms is also very helpful. – sea83


I found a solution that seems to work. I had to rethink how I set up the loop.

for (j in 1:nrow(phaseTable1)){ 
for (k in 1:nrow(track)){ 
if((phaseTable1$start[j]<=track$Time[k])&(track$Time[k]< phaseTable1$end[j])){track$model[k]= phaseTable1$model[j]} 

if((phaseTable1$start[j]<=track$Time[k])&(track$Time[k]< phaseTable1$end[j])){track$phase[k]= phaseTable1$phase[j]} 

if((phaseTable1$start[j]<=track$Time[k])&(track$Time[k]< phaseTable1$end[j])){track$tau[k]= phaseTable1$tau[j]} 

if((phaseTable1$start[j]<=track$Time[k])&(track$Time[k]< phaseTable1$end[j])){track$eta[k]= phaseTable1$eta[j]} 

} 

} 

Nested 'for' loops should not be needed for this. If you post some sample data with the desired output, there may be a better way to do it. – manotheshark
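For the batch case, one possible shape is a small helper applied to each day's data. This is only a sketch under assumptions: add_phase_columns, the lists tracks and tables, and all values are hypothetical, and the intervals are assumed to be contiguous. Like the fallback in the original nested ifelse, times after the last start get the last phase, and times before the first start get NA.

```r
# Hypothetical helper: copy the chosen columns from the phase table onto the track
add_phase_columns <- function(track, phase_table, cols) {
  phase_table <- phase_table[order(phase_table$start), ]  # findInterval needs sorted starts
  idx <- findInterval(track$Time, phase_table$start)
  idx[idx == 0] <- NA                                     # before the first interval
  for (col in cols) {
    track[[col]] <- phase_table[[col]][idx]
  }
  track
}

# Made-up data for two days
tracks <- list(day1 = data.frame(Time = c(1, 11)),
               day2 = data.frame(Time = c(5, 25)))
tables <- list(
  day1 = data.frame(start = c(0, 10), end = c(10, 20),
                    phase = c("a", "b"), tau = c(0.1, 0.2),
                    stringsAsFactors = FALSE),
  day2 = data.frame(start = c(0, 20), end = c(20, 30),
                    phase = c("x", "y"), tau = c(1, 2),
                    stringsAsFactors = FALSE)
)

# Apply the helper to each day's track with its matching phase table
merged <- Map(function(tr, pt) add_phase_columns(tr, pt, cols = c("phase", "tau")),
              tracks, tables)
merged$day2
```

Each element of merged is that day's track with the extra columns attached, so the per-day results can be combined afterwards with do.call(rbind, merged) if a single data frame is wanted.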