2017-02-17 48 views
-1

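Reassign rolling-year values to actual months

I have a data.frame with the values of the last 12 months for 3 observations. A date variable corresponds to month.m0 (the most recent month), and the remaining columns go back in time, one month further per column: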


date <- c("2017-01-01", "2016-12-01", "2016-10-01") 
month.m0 <- c(1, 2, 3) 
month.m1 <- c(4, 5, 6) 
month.m2 <- c(7, 8, 9) 
month.m3 <- c(10, 11, 12) 
month.m4 <- c(13, 14, 15) 
month.m5 <- c(16, 17, 18) 
month.m6 <- c(19, 20, 21) 
month.m7 <- c(22, 23, 24) 
month.m8 <- c(25, 26, 27) 
month.m9 <- c(28, 29, 30) 
month.m10 <- c(31, 32, 33) 
month.m11 <- c(34, 35, 36) 

df <- data.frame(date, month.m0, month.m1, month.m2, month.m3, month.m4, month.m5, month.m6, month.m7, month.m8, month.m9, month.m10, month.m11) 

The input would be:

 date month.m0 month.m1 month.m2 month.m3 month.m4 month.m5 month.m6 month.m7 month.m8 month.m9 month.m10 month.m11 
1 2017-01-01  1  4  7  10  13  16  19  22  25  28  31  34 
2 2016-12-01  2  5  8  11  14  17  20  23  26  29  32  35 
3 2016-10-01  3  6  9  12  15  18  21  24  27  30  33  36 

The problem is that I don't know the actual month of each value, because the column numbers are ordinal and depend on the date variable.

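For the first row, the initial value (month.m0) corresponds to January, because the date is in January (regardless of day or year). For the second row, the date indicates that month.m0 corresponds to December, and for the third row to October. From there, month.m1 corresponds to (month(date) - months(1)), month.m2 to (month(date) - months(2)), and so on, going back in time from the initial value.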
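The backward stepping described above can be sketched with modular arithmetic in base R. The helper name `month_back` is hypothetical and only serves to illustrate the mapping; it is not part of the original post.

```r
# Month number (1-12) that month.m0 refers to in each row
dates <- as.Date(c("2017-01-01", "2016-12-01", "2016-10-01"))
start_month <- as.integer(format(dates, "%m"))

# Hypothetical helper: calendar month reached after stepping back k months.
# Shifting to 0-11 before %% and back to 1-12 afterwards handles the
# wrap-around from January to December.
month_back <- function(start, k) ((start - k - 1) %% 12) + 1

month_back(start_month, 0)              # 1 12 10  (Jan, Dec, Oct)
month_back(start_month, 1)              # 12 11 9  (Dec, Nov, Sep)
month.abb[month_back(start_month, 11)]  # "Feb" "Jan" "Nov"
```

In R, `%%` returns a non-negative result for a positive divisor, so no separate check for values below 1 is needed.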

EDITED OUTPUT:

I want to assign each value to its actual month, so the output would be:

 date Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec 
1 2017-01-01 1 34 31 28 25 22 19 16 13 10 7 4 
2 2016-12-01 35 32 29 26 23 20 17 14 11 8 5 2 
3 2016-10-01 30 27 24 21 18 15 12 9 6 3 36 33 

Assigning the first month for each observation is easy, but it gets complicated once I have to step back n times. Any ideas?

EDITED SOLUTION:

In the end you gave me the key: I took @AntoniosK's answer and changed one small operator to get the solution:

library(dplyr) 
library(tidyr) 
library(lubridate) 

df %>% 
    gather(month_num, value, -date) %>%                                   # reshape dataset to long format 
    mutate(month_num = as.numeric(gsub("month.m", "", month_num)),       # keep only the number (the step) 
     date = ymd(date),                                                    # transform date to date object 
     month_actual = month(date),                                          # number of the actual month (baseline) 
     month_now = month_actual - month_num,                                # step back from the baseline (baseline - step) 
     month_now_upd = ifelse(month_now < 1, month_now + 12, month_now),    # wrap month numbers below 1 back into 1..12 
     month_now_upd_name = month(month_now_upd, label = TRUE)) %>%         # get name of the month 
    select(date, month_now_upd_name, value) %>%                           # keep useful columns 
    spread(month_now_upd_name, value) %>%                                 # reshape back to wide format 
    arrange(desc(date))                                                   # start from most recent date 
+0

Thanks for the edit, @Sotos. I was looking for the edit button and then realized I don't have that privilege yet... – phariza

+0

You're welcome. However, I don't understand what you mean. Could you try to explain it better? Maybe include any attempts you have made so far. – Sotos

+0

Updated the post with an explanation. Hope it's enough! – phariza

Answers

0

Assuming df is the data frame you provided...

library(dplyr) 
library(tidyr) 
library(lubridate) 

df %>% 
    gather(month_num, value, -date) %>%                                   # reshape dataset to long format 
    mutate(month_num = as.numeric(gsub("month.m", "", month_num)),       # keep only the number (as your step) 
     date = ymd(date),                                                    # transform date to date object 
     month_actual = month(date),                                          # keep the number of the actual month (baseline) 
     month_now = month_actual + month_num,                                # create the current month (baseline + step) 
     month_now_upd = ifelse(month_now > 12, month_now - 12, month_now),   # update month number (for numbers > 12) 
     month_now_upd_name = month(month_now_upd, label = TRUE)) %>%         # get name of the month 
    select(date, month_now_upd_name, value) %>%                           # keep useful columns 
    spread(month_now_upd_name, value) %>%                                 # reshape again 
    arrange(desc(date))                                                   # start from recent month 

#   date Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec 
# 1 2017-01-01 1 4 7 10 13 16 19 22 25 28 31 34 
# 2 2016-12-01 5 8 11 14 17 20 23 26 29 32 35 2 
# 3 2016-10-01 12 15 18 21 24 27 30 33 36 3 6 9 

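Note that I create several (helpful) variables that you won't need in the end, but they will help you understand the process when you run the chained commands step by step. If you like, you can shorten the code above by combining some of the commands inside mutate.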
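As a rough illustration of that shortening, the sketch below folds the intermediate variables into a single expression inside mutate. It is not the answerer's code: the `%%` wrap-around and the `month.abb` factor (used instead of `month(..., label = TRUE)` so that the column names do not depend on the locale) are assumptions made for this example, as are the names `step`, `month_name` and `res`.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Rebuild the example data: columns month.m0 ... month.m11 hold 1..36
vals <- matrix(1:36, nrow = 3,
               dimnames = list(NULL, paste0("month.m", 0:11)))
df <- data.frame(date = c("2017-01-01", "2016-12-01", "2016-10-01"),
                 vals, stringsAsFactors = FALSE)

res <- df %>%
  gather(month_num, value, -date) %>%
  mutate(date = ymd(date),
         step = as.numeric(sub("month.m", "", month_num, fixed = TRUE)),
         # baseline + step, wrapped into 1..12 in one expression
         month_name = factor(month.abb[(month(date) + step - 1) %% 12 + 1],
                             levels = month.abb)) %>%
  select(date, month_name, value) %>%
  spread(month_name, value) %>%
  arrange(desc(date))

res
```

This keeps the forward stepping (`+ step`) of the answer above. Replacing `+ step` with `- step` steps backward from the baseline month instead, which is the direction used in the question's edited solution.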

+0

Damn, you're absolutely right, @AntoniosK. I'll edit the question so that it doesn't mislead others. Thanks for your answer, it's very neat. – phariza

+1

Edited and updated. Your solution was the key to getting the desired output. Thanks! – phariza

0

Your explanation isn't very clear to me, so my output is not exactly the same as yours. But this is how I would do it:

library(dplyr) 
library(tidyr) 
library(stringr)   # needed for str_replace 
df %>% 
    # First create a new variable containing the month as a number between 1 and 12 
    mutate(month = strftime(date, "%m")) %>% 
    # Make the data tidy: a new column col contains 
    # month.m0, month.m1, month.m2, ... and a column val contains 
    # the values 
    gather(col, val, -date, -month) %>% 
    # Remove "month.m" so the col column holds only the step number 
    mutate_at("col", str_replace, pattern = "month.m", replacement = "") %>% 
    mutate_at(c("month", "col"), as.numeric) %>% 
    # Compute the difference between the month column and the col column 
    mutate(col = abs((col - month + 1) %% 12)) %>% 
    # Sort the data frame according to the new col column 
    arrange(month, col) %>% 
    # Add "month.m" back to the col column to redefine the column names 
    mutate(col = paste0("month.m", col), month = NULL) %>% 
    # Untidy the data frame 
    spread(col, val) 
+0

The output is different because the third row differs from the first and second rows. My bad! I'll edit it. Thanks! – phariza