2017-06-13 47 views
3

Expanding a data frame

I have the following data frame:

df <- data.frame(
    month=c("July", "August", "August"), 
    day=c(31, 1, 2), 
    time=c(12, 12, 12)) 

    month day time 
1 July 31 12 
2 August 1 12 
3 August 2 12 

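I have a text file of times (in decimal format), and I want to replace the "time" column with all of the times from that text file. The text file contains multiple dates, and each date has more than 300 records.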

7-31-2016 #the days are all concatenated together, this line represents the beginning of one day (July 31) 
13.12344 
13.66445 
13.76892 
... 
8-1-2016 #here is another day (August 1) 
14.50333 
14.52000 
14.53639 
... 

However, the text file is much longer than the current data frame: it has 393 records. So I would like the resulting data frame to look like this:

month day  time 
5 July 31 13.12344 
6 July 31 13.66445 
7 July 31 13.76892 
..... 
393 August 1 14.50333 
394 August 1 14.52000 
394 August 1 14.53639 

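Basically, I just need to expand my current data frame to match the number of records in the new file, while keeping each record attached to the same day. I hope this makes sense.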

+1

How is your text file structured? –

+0

Please provide the data frame or list as read from the text file. – OmaymaS

+0

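@NicoCoallier The text file is structured exactly as I listed it in the post. It is basically just a series of times concatenated together. A date line marks the start of a new day (e.g. July 31, August 1, etc.). – ale19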

Answers

2
# Create txt data 
txt <- data.frame(x = c('7-31-2016', '13.12344', '13.66445', '13.76892', '8-1-2016', '14.50333', '14.52000', '14.53639')) 
# Load Your data 
df <- data.frame(
    month=c("July", "August", "August"), 
    day=c(31, 1, 2), 
    time=c(12, 12, 12)) 

# Need a year to join dates 
df$year <- 2016 

# Create date column 
df$date <- as.Date(paste0(df$month, "/", df$day, "/", df$year), format = "%B/%d/%Y") 

# Find values with dashes, then replace the dashes with /
txt$dash <- grepl('-', txt$x) 
txt$x <- gsub("-", "/", txt$x) 

# Add new columns; ifelse() drops the Date class, so pick the strings first and convert afterwards
library(dplyr) 
txt <- mutate(txt, date = as.Date(ifelse(dash, x, NA), format = "%m/%d/%Y")) 
txt <- mutate(txt, time = as.numeric(ifelse(dash, NA, x))) 

# Fill down values 
library(zoo) 
txt$date <- na.locf(txt$date) 

# Removes NA and keeps necessary columns 
txt <- txt[!is.na(txt$time),] 
txt <- txt[c("date", "time")] 

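# Merge, leaving out the placeholder time column of df so that only the
# times from the text file end up in the result
output <- merge(df[c("month", "day", "date")], txt, by = "date") 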
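
The fill-down step can also be sketched in base R, without dplyr or zoo. This is only an illustration: it assumes the file starts with a date line, that date lines are the only ones containing a dash, and that the lines were read with something like readLines() (the vector below stands in for that call).

```r
# Stand-in for the lines of the text file (hypothetical input vector)
lines <- c("7-31-2016", "13.12344", "13.66445", "13.76892",
           "8-1-2016", "14.50333", "14.52000", "14.53639")

# Date lines are the ones that contain a dash
is_date <- grepl("-", lines, fixed = TRUE)

# cumsum() counts the date lines seen so far, so it tells us which
# date each line belongs to
dates <- as.Date(lines[is_date], format = "%m-%d-%Y")[cumsum(is_date)]

# Keep only the time rows, each paired with its date
txt <- data.frame(date = dates[!is_date],
                  time = as.numeric(lines[!is_date]))
```

The resulting txt has the same date and time columns as in the answer above, so it can be passed to the same merge() call.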
0

So you want to merge your existing data.frame df, which has only 3 rows, with new_text, which has many. Use:

merge(df, new_text, all.y = TRUE) # all.y = TRUE keeps the rows of new_text that have no match in df and fills the df-only columns with NA

For more information, see ?merge
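
A small sketch of how that call might look in practice. The shape of new_text is an assumption here (one row per recorded time, already carrying month and day columns). By default merge() joins on every column name the two data frames share, which would include the placeholder time column, so the keys are spelled out explicitly.

```r
df <- data.frame(month = c("July", "August", "August"),
                 day   = c(31, 1, 2),
                 time  = c(12, 12, 12))

# Hypothetical layout for new_text, using values from the question
new_text <- data.frame(month = c("July", "July", "August"),
                       day   = c(31, 31, 1),
                       time  = c(13.12344, 13.66445, 14.50333))

# Join on month and day only; drop df's placeholder time column first
merged <- merge(df[c("month", "day")], new_text,
                by = c("month", "day"), all.y = TRUE)
```

Every row of new_text survives the merge, while August 2, which has no times in new_text, is dropped because all.x is left at its default of FALSE.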

0

Turn the txt file into a mergeable data frame:

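library(zoo)  # for na.locf()

# df is the text file read into a single column V1
num <- suppressWarnings(as.numeric(as.character(df$V1)))
Temp <- is.na(num)                        # TRUE for the date lines
df$V2 <- NA
df$V2[Temp] <- as.character(df$V1[Temp])  # copy the dates into V2
df$V2 <- na.locf(df$V2)                   # carry each date down to its times
df <- df[!Temp, ]                         # keep only the time rows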

        V1        V2
2 13.12344 7/31/2016
3 13.66445 7/31/2016
4 13.76892 7/31/2016
6 14.50333  8/1/2016
7    14.52  8/1/2016
8 14.53639  8/1/2016
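
From here, one possible way to reach the month / day / time layout asked for in the question is sketched below. The data frame is rebuilt by hand so the snippet runs on its own, and the name parsed is made up for the illustration.

```r
# The V1/V2 result shown above, recreated so the sketch is self-contained
parsed <- data.frame(
  V1 = c("13.12344", "13.66445", "13.76892", "14.50333", "14.52", "14.53639"),
  V2 = c("7/31/2016", "7/31/2016", "7/31/2016", "8/1/2016", "8/1/2016", "8/1/2016"),
  stringsAsFactors = FALSE
)

date <- as.Date(parsed$V2, format = "%m/%d/%Y")

result <- data.frame(
  # month.name is a base R constant with the English month names, so
  # this does not depend on the session locale
  month = month.name[as.integer(format(date, "%m"))],
  day   = as.integer(format(date, "%d")),
  time  = as.numeric(parsed$V1)
)
```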