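What you are looking for is a panel data structure. Panel data (also known as cross-sectional time-series data) is data that varies both over time and across entities. In your case, the value of each wave varies over time within an entity, while group varies across entities. A simple gather and join is enough to get the data into a typical panel format.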
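library(tidyr)
library(dplyr)
panel_df = df %>%
  gather(index, value) %>%               # wide to long: one row per (wave, observation)
  inner_join(lookup, by = "index") %>%   # attach each wave's group
  group_by(index) %>%
  mutate(time = 1:n())                   # time index within each wave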
# index value group time
# <chr> <dbl> <chr> <int>
# 1 waves1 0.0000000 healthy 1
# 2 waves1 0.2474040 healthy 2
# 3 waves1 0.4794255 healthy 3
# 4 waves1 0.6816388 healthy 4
# 5 waves1 0.8414710 healthy 5
# 6 waves1 0.9489846 healthy 6
# 7 waves1 0.9974950 healthy 7
# 8 waves1 0.9839859 healthy 8
# 9 waves1 0.9092974 healthy 9
# 10 waves1 0.7780732 healthy 10
# # ... with 476 more rows
Here, index represents the entity dimension, and I have manually created a time variable to indicate the time dimension of the panel data.
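The same reshaping can also be sketched with `pivot_longer`, which tidyr (1.0.0 and later) offers as the successor to `gather`. This is only an alternative sketch, not the method used in the rest of this answer: it assumes a reduced two-wave version of the data, and `t_grid` and `panel_alt` are names made up for the illustration.

```r
library(tidyr)
library(dplyr)

# Reduced version of the example data (two waves only), so the block runs on its own
t_grid <- seq(0, 20, 0.25)
df <- data.frame(waves1 = sin(t_grid), waves2 = cos(t_grid))
lookup <- data.frame(index = c("waves1", "waves2"),
                     group = c("healthy", "unhealthy"),
                     stringsAsFactors = FALSE)

panel_alt <- df %>%
  mutate(time = row_number()) %>%     # time index taken from the row order
  pivot_longer(-time, names_to = "index", values_to = "value") %>%
  inner_join(lookup, by = "index") %>%   # attach each wave's group
  arrange(index, time)
```

Creating the time column before reshaping avoids the grouped mutate, and the result is an ungrouped tibble with one row per wave and time point.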
To visualize the panel data, you can do something like the following with ggplot2:
library(ggplot2)
# Visualize all waves, grouped by health status
ggplot(panel_df, aes(x = time, y = value, group = index)) +
  geom_line(aes(color = group))
# Only Healthy people
panel_df %>%
  filter(group == "healthy") %>%
  ggplot(aes(x = time, y = value, color = index)) +
  geom_line()
# Compare healthy and unhealthy people's waves
panel_df %>%
  ggplot(aes(x = time, y = value, color = index)) +
  geom_line() +
  facet_grid(. ~ group)
Working with the time dimension:
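# plot acf for each entity `value` time series
par(mfrow = c(3, 2))
by(panel_df$value, panel_df$index, function(x) acf(x))
library(forecast)
# ggplot-style acf for a single entity; plot = FALSE avoids drawing the base plot as well
panel_df %>%
  filter(index == "waves1") %>%
  {autoplot(acf(.$value, plot = FALSE))}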
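If you want numbers rather than plots, the per-entity autocorrelation can be pulled out of the `acf` object directly. The sketch below assumes a reduced two-wave long-format data frame (`toy` and `lag1` are hypothetical names) and extracts only the lag-1 value for each entity.

```r
# Reduced long-format data (two waves), so the block runs on its own
t_grid <- seq(0, 20, 0.25)
toy <- data.frame(
  index = rep(c("waves1", "waves2"), each = length(t_grid)),
  value = c(sin(t_grid), cos(t_grid)),
  stringsAsFactors = FALSE
)

# acf(...)$acf holds lag 0 in its first element, so element 2 is lag 1
lag1 <- tapply(toy$value, toy$index,
               function(x) acf(x, plot = FALSE)$acf[2])
lag1
```

The result is a named vector with one lag-1 autocorrelation per entity, which is easier to compare across groups than a grid of plots.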
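Finally, the plm package is excellent for working with panel data. It implements a wide range of panel regression models from econometrics, but to keep this answer from getting any longer, I will just leave some links for your own research. pdim tells you the entity and time dimensions of your panel data, and whether the panel is balanced: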
library(plm)
# Check dimension of Panel
pdim(panel_df, index = c("index", "time"))
# Balanced Panel: n=6, T=81, N=486
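To make the meaning of "balanced" concrete, here is a rough sketch of the same check done by hand with dplyr. It is not how `pdim` is implemented; it assumes a hypothetical toy panel (`toy_panel`) and treats a panel as balanced when every entity is observed in every time period.

```r
library(dplyr)

# Hypothetical toy panel: 3 entities, each observed in 4 periods
toy_panel <- expand.grid(time = 1:4, index = c("a", "b", "c"),
                         stringsAsFactors = FALSE)

n_entities <- n_distinct(toy_panel$index)   # n in the pdim output
n_periods  <- n_distinct(toy_panel$time)    # T in the pdim output
n_obs      <- nrow(toy_panel)               # N in the pdim output

# Balanced: every (entity, period) combination appears exactly once
is_balanced <- nrow(distinct(toy_panel, index, time)) == n_entities * n_periods &&
  n_obs == n_entities * n_periods
```

Dropping any single row from `toy_panel` would make `is_balanced` evaluate to `FALSE`, which is the situation `pdim` reports as an unbalanced panel.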
- What is Panel Data?
- Getting Started in Fixed/Random Effects Models using R
- Regressions with Panel Data
I have modified your data slightly to make the demonstration clearer.
Data:
library(zoo)
w1 <- sin(seq(0,20,0.25))
w2 <- cos(seq(0,20,0.25))
w3 = w1*2
w4 = w2*0.5
w5 = w1*w2
w6 = w2^2
df <- data.frame(w1,w2,w3,w4,w5,w6, stringsAsFactors = FALSE)
names(df) <- paste("waves", 1:6, sep="")
waves <- zoo(df)
lookup <- data.frame(index = paste("waves", 1:6, sep=""),
                     group = c("healthy", "unhealthy"),
                     stringsAsFactors = FALSE)
You could use 'data.table' here. – agstudy