2014-10-04 93 views

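Unlist a data frame column while preserving information from another column

I have a data frame with a character vector column, col1, and a list column, col2.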

myVector <- c("A","B","C","D") 

myList <- list() 
myList[[1]] <- c(1, 4, 6, 7) 
myList[[2]] <- c(2, 7, 3) 
myList[[3]] <- c(5, 5, 3, 9, 6) 
myList[[4]] <- c(7, 9) 

myDataFrame <- data.frame(row = c(1,2,3,4)) 

myDataFrame$col1 <- myVector 
myDataFrame$col2 <- myList 

myDataFrame 
#   row col1          col2
# 1   1    A    1, 4, 6, 7
# 2   2    B       2, 7, 3
# 3   3    C 5, 5, 3, 9, 6
# 4   4    D          7, 9

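I would like to unlist col2 while keeping, for each element of the vectors in the list, the information stored in col1. To put it differently, in common data frame reshaping terminology: the "wide" list column should be converted to a "long" format.

In the end, I want two vectors whose length equals length(unlist(myDataFrame$col2)). In code: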

# unlist myList 
unlist.col2 <- unlist(myDataFrame$col2) 
unlist.col2 
# [1] 1 4 6 7 2 7 3 5 5 3 9 6 7 9 

# unlist myVector to obtain 
# unlist.col1 <- ??? 
# unlist.col1 
# [1] A A A A B B B C C C C C D D 

I can't think of any straightforward way to get it.

Answers


The idea here is to first get the length of each list element with sapply, and then use rep to replicate each element of col1 that many times:

l1 <- sapply(myDataFrame$col2, length)
unlist.col1 <- rep(myDataFrame$col1, l1)
unlist.col1
# [1] "A" "A" "A" "A" "B" "B" "B" "C" "C" "C" "C" "C" "D" "D"
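
To make the mechanics of rep explicit, here is a minimal sketch on toy data. The names labels, values, n, long_labels and long_values are illustrative assumptions and do not appear in the question. When the second argument of rep is a vector with one count per element, each element is repeated by its own count, which is what keeps the labels aligned with the unlisted values.

```r
# Toy data mirroring the structure in the question (hypothetical names).
labels <- c("A", "B", "C")
values <- list(c(1, 4), 7, c(5, 5, 3))

# Number of elements in each list entry: 2, 1, 3.
n <- sapply(values, length)

# With a vector of counts, rep() repeats each label by its own count.
long_labels <- rep(labels, n)

# Flatten the list in the same order.
long_values <- unlist(values)

long_labels
long_values
```

Both results have the same length, so they can be placed side by side as two columns.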

Or, as suggested by @Ananda Mahto, the same can be done with vapply:

with(myDataFrame, rep(col1, vapply(col2, length, 1L)))
# [1] "A" "A" "A" "A" "B" "B" "B" "C" "C" "C" "C" "C" "D" "D"
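
One possible reason to prefer vapply, sketched below under the assumption that the list column could be empty in some inputs: vapply always returns the declared type, whereas sapply falls back to an empty list for zero-length input. The variable names are hypothetical.

```r
# For ordinary input, both approaches give the same integer vector.
values <- list(c(1, 4, 6, 7), c(2, 7, 3))
same_result <- identical(sapply(values, length), vapply(values, length, 1L))
same_result

# For zero-length input the return types differ.
empty <- list()
counts_sapply <- sapply(empty, length)      # an empty list
counts_vapply <- vapply(empty, length, 1L)  # an empty integer vector

class(counts_sapply)
class(counts_vapply)
```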

You can also use "data.table" to expand the entire data.frame and then extract the column of interest.

library(data.table) 
## expand the entire data.frame (uncomment to see) 
# as.data.table(myDataFrame)[, unlist(col2), by = list(row, col1)] 

## expand and select the column of interest: 
as.data.table(myDataFrame)[, unlist(col2), by = list(row, col1)]$col1 
# [1] "A" "A" "A" "A" "B" "B" "B" "C" "C" "C" "C" "C" "D" "D" 

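In newer versions of R (3.2.0 and later), you can use the lengths function instead of the sapply(list, length) approach. The lengths function is considerably faster.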

with(myDataFrame, rep(col1, lengths(col2))) 
# [1] "A" "A" "A" "A" "B" "B" "B" "C" "C" "C" "C" "C" "D" "D" 
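
If the whole "long" data frame is wanted rather than just the two vectors, the same idea extends to a row index. The following is a base R sketch, not part of the original answers; idx and longDataFrame are hypothetical names, and it assumes R 3.2.0 or later for lengths.

```r
# Rebuild the example data from the question.
myDataFrame <- data.frame(row = c(1, 2, 3, 4))
myDataFrame$col1 <- c("A", "B", "C", "D")
myDataFrame$col2 <- list(c(1, 4, 6, 7), c(2, 7, 3), c(5, 5, 3, 9, 6), c(7, 9))

# Repeat each row index once per element of its list entry.
idx <- rep(seq_len(nrow(myDataFrame)), lengths(myDataFrame$col2))

# Index the ordinary columns with idx and flatten the list column.
longDataFrame <- data.frame(
  row  = myDataFrame$row[idx],
  col1 = myDataFrame$col1[idx],
  col2 = unlist(myDataFrame$col2),
  stringsAsFactors = FALSE
)

nrow(longDataFrame)
head(longDataFrame)
```

Indexing by row position rather than by col1 directly means any number of additional columns can be carried along in the same way.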

You can also use unnest from the tidyr package:

library(tidyr) 
unnest(myDataFrame, col2) 

#      row  col1  col2
#    (dbl) (chr) (dbl)
# 1      1     A     1
# 2      1     A     4
# 3      1     A     6
# 4      1     A     7
# 5      2     B     2
# 6      2     B     7
# 7      2     B     3
# 8      3     C     5
# 9      3     C     5
# 10     3     C     3
# 11     3     C     9
# 12     3     C     6
# 13     4     D     7
# 14     4     D     9