Reshape a data.frame, adding two columns

I have a data.frame that looks like this:

  timestamp value.x station value.y parameter.x  value parameter.y
1 1/1/2010      0.6     abc 188,000 AREA PLANTED    22 PROGRESS
2 1/1/2010      0.6     abc   156.3 YIELD           NA NA
3 1/1/2010      -10     def 188,000 AREA PLANTED    22 PROGRESS
4 1/1/2010      -10     def   156.3 YIELD           NA NA

I would like to use reshape to make it look like this:

  timestamp value.x station AREA PLANTED YIELD PROGRESS
1 1/1/2010      0.6     abc 188,000      156.3 22
3 1/1/2010      -10     def 188,000      156.3 22

I tried

reshape(data = b, varying = list(c('value.y', 'parameter.x', 'value', 'parameter.y')), 
     v.names = c('AREA PLANTED', 'YIELD', 'PROGRESS'), 
     timevar = row.names(b), 
     times = b$timestamp, direction = 'wide', idvar = b$station)

but it says

Error in `[.data.frame`(data, , idvar) : undefined columns selected

I tried changing it around a bit, but no matter what I did, it kept throwing this error.

Source

2017-04-27 R.M.

Your reshape call has 'b$station' (lowercase 's'), but the data frame's column name is 'Station' (uppercase 'S')? – neilfws

Typo, fixed.... –

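This is a bit all over the place - you don't specify 'idvar = b$station' - you've already said 'data = b' - you want 'idvar = "station"' I think. Same with 'timevar ='. You also have multiple values for each station and timestamp combination, which won't work. You can get around it with 'reshape(transform(b, time = ave(as.character(Station), Station, FUN = seq_along)), direction = "wide", idvar = c("timestamp", "Station", "value.x"))' – thelatemail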
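The comment above can be sketched as follows. This is only an illustration under stated assumptions: the data frame b is rebuilt by hand from the table in the question, with a lower-case 'station' column as shown there, and the names err, b2 and wide are made up for this example. It first shows that passing column values to idvar fails, then applies the suggested workaround of numbering the rows within each station.

```r
# Sample data rebuilt from the question's table (assumption: lower-case "station").
b <- data.frame(
  timestamp   = "1/1/2010",
  value.x     = c(0.6, 0.6, -10, -10),
  station     = c("abc", "abc", "def", "def"),
  value.y     = c("188,000", "156.3", "188,000", "156.3"),
  parameter.x = c("AREA PLANTED", "YIELD", "AREA PLANTED", "YIELD"),
  value       = c(22, NA, 22, NA),
  parameter.y = c("PROGRESS", NA, "PROGRESS", NA),
  stringsAsFactors = FALSE
)

# Passing column *values* as idvar makes reshape() look for columns
# named "abc" and "def", which do not exist, so the call fails.
err <- tryCatch(
  reshape(b, idvar = b$station, timevar = "parameter.x", direction = "wide"),
  error = function(e) conditionMessage(e)
)
print(err)

# Number the rows within each station so that each (id columns, time)
# combination identifies exactly one row.
b2 <- transform(b, time = ave(as.character(station), station, FUN = seq_along))

# idvar takes column *names*; timevar is left at its default, "time".
wide <- reshape(b2, direction = "wide",
                idvar = c("timestamp", "value.x", "station"))
print(wide)
```

The result has one row per station, but the parameter names end up as data in columns such as 'parameter.x.1' rather than as column headers, so this alone does not produce the layout asked for in the question.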

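This uses reshape2. I don't think it is possible to cast the data frame in a single step. Note that the input looks like the result of some other join operation (because some of the names have .x and .y suffixes). I think that join could be improved to avoid this complication.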

df <- read.table(header=TRUE, stringsAsFactors = FALSE, text = 
"timestamp value.x station value.y parameter.x value parameter.y 
1/1/2010 0.6  abc  188,000 AREAPLANTED 22 PROGRESS 
1/1/2010 0.6  abc  156.3  YIELD   NA NA 
1/1/2010 -10  def  188,000 AREAPLANTED 22 PROGRESS 
1/1/2010 -10  def  156.3  YIELD   NA NA 
") 

library(reshape2) 

# extract the last two columns into a variable/value and make unique 
df1 <- unique(df[!is.na(df$value),c("timestamp", "value.x", "station", "parameter.y", "value")]) 
names(df1) <- c("timestamp", "value.x", "station", "variable", "value") 

# extract columns 4,5 into a variable value 
df2 <- df[,c("timestamp", "value.x", "station", "parameter.x", "value.y")] 
names(df2) <- c("timestamp", "value.x", "station", "variable", "value") 

# cast 
dcast(rbind(df1, df2), timestamp + value.x + station ~ variable, value.var = "value") 

# timestamp value.x station AREAPLANTED PROGRESS YIELD 
# 1 1/1/2010 -10.0  def  188,000  22 156.3 
# 2 1/1/2010  0.6  abc  188,000  22 156.3

Source

2017-04-27 02:08:38 epi99

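I agree with @epi99 that the task needs to be broken into steps and recombined. Here is a tidyverse way of doing it, assuming your data frame is called b as in your sample code: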

library(tidyverse) 
b = read.csv("C:\\Temp\\stack_overflow_sample_data_which_I_hacked_together_in_Excel.csv") 
df1 = b %>% select(timestamp, value.x, station, value.y, parameter.x) %>% spread(key = parameter.x, value = value.y) 
df2 = b %>% select(timestamp, value.x, station, value, parameter.y) %>% filter(!is.na(value)) %>% spread(key = parameter.y, value = value) 
df.answer = merge(df1, df2, by = c("timestamp", "value.x", "station"))

Source

2017-04-27 02:45:01 lebelinoz

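Staying in base R, consider merging two reshaped data frames. Your current call uses arguments (such as 'varying' and 'times') that are meant for wide-to-long reshaping, not the other way around.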

mdf <- merge(
    reshape(b, timevar="parameter.x", 
     v.names = c("value.y"), 
     idvar = c("timestamp", "value.x", "station"), 
     direction = "wide", 
     drop = c("value", "parameter.y")), 

    reshape(b[!is.na(b$value),], timevar="parameter.y", 
     v.names = c("value"), 
     idvar = c("timestamp", "value.x", "station"), 
     direction = "wide", 
     drop = c("value.y", "parameter.x")), 
    by=c("timestamp", "value.x", "station") 
) 

names(mdf) <- gsub("(value\\.y\\.|value\\.)", "", names(mdf)) 

mdf  
# timestamp  x station AREA PLANTED YIELD PROGRESS 
# 1 1/1/2010 -10.0  def  188,000 156.3  22 
# 2 1/1/2010 0.6  abc  188,000 156.3  22
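One side effect of the gsub pattern above is visible in the printed header: 'value.x' also matches 'value\\.', so that key column is renamed to 'x'. A minimal sketch of one way to avoid this, assuming the key columns should keep their names; b is rebuilt by hand from the question's table, and id_cols and is_id are names introduced here for illustration only:

```r
# Sample data rebuilt from the question's table.
b <- data.frame(
  timestamp   = "1/1/2010",
  value.x     = c(0.6, 0.6, -10, -10),
  station     = c("abc", "abc", "def", "def"),
  value.y     = c("188,000", "156.3", "188,000", "156.3"),
  parameter.x = c("AREA PLANTED", "YIELD", "AREA PLANTED", "YIELD"),
  value       = c(22, NA, 22, NA),
  parameter.y = c("PROGRESS", NA, "PROGRESS", NA),
  stringsAsFactors = FALSE
)

id_cols <- c("timestamp", "value.x", "station")

# Same two long-to-wide reshapes as in the answer, merged on the key columns.
mdf <- merge(
  reshape(b, timevar = "parameter.x", v.names = "value.y",
          idvar = id_cols, direction = "wide",
          drop = c("value", "parameter.y")),
  reshape(b[!is.na(b$value), ], timevar = "parameter.y", v.names = "value",
          idvar = id_cols, direction = "wide",
          drop = c("value.y", "parameter.x")),
  by = id_cols
)

# Strip the "value." / "value.y." prefix only from the non-key columns,
# so that "value.x" is left untouched.
is_id <- names(mdf) %in% id_cols
names(mdf)[!is_id] <- sub("^value(\\.y)?\\.", "", names(mdf)[!is_id])

print(names(mdf))
```

Restricting the rename to the non-key columns, and anchoring the pattern with ^, keeps the substitution from touching anything other than the prefixes that reshape added.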

Source

2017-04-27 02:56:53 Parfait
