2017-06-20 75 views
2

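I have a list of lists (I will call the inner lists "sublists" to avoid confusion) containing named elements. Not every sublist contains every named element. I want to add the missing elements to each sublist as NA.

Example: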

l <- list(list(a = 1, b = 2, c = 3), 
    list(a = 4, b = 5, c = 6), 
    list(a = 7, b = 8), 
    list(a = 9, c = 10)) 

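As you can see, the third and fourth sublists are missing the c and b elements, respectively. I would like those elements added to these sublists as NA, i.e.: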

res <- list(list(a = 1, b = 2, c = 3), 
    list(a = 4, b = 5, c = 6), 
    list(a = 7, b = 8, c = NA), 
    list(a = 9, b = NA, c = 10)) 

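In reality, if it makes things easier, only the last k elements are ever missing from a sublist (i.e. I never have the case shown in the fourth sublist, where a middle element b is missing), but while we are at it, let's find a general solution.

Update: I received 3 excellent solutions for this specific scenario, where the sublist elements are ints. But the elements can be chr, or even lists! For example: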

l <- list(list(a = list(1,2), b = 2, c = 3), 
     list(b = 5, c = 6), 
     list(a = list(5,6), b = 8), 
     list(a = list(7,8), c = 10)) 

The a element is a list and should stay that way in the res list. If it is missing, I want an NA, as usual:

res <- list(list(a = list(1,2), b = 2, c = 3), 
    list(a = NA, b = 5, c = 6), 
    list(a = list(5,6), b = 8, c = NA), 
    list(a = list(7,8), b = NA, c = 10)) 
+1

Welp, this turned into a chameleon question, so I'm out. By the way, you should also show the desired output for your new input. –

+0

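Subsetting a sublist with a character vector returns NULL for names that do not exist, so a starting point could be nms = unique(unlist(lapply(l, names))); lapply(l, "[", nms), followed by restoring the names and replacing the NULL values –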
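The comment's suggestion can be sketched as follows. This is only one way to finish the idea; the variable names sub and out are illustrative, and the data is the second example from the question.

```r
# Second example from the question: element a may itself be a list
l <- list(list(a = list(1, 2), b = 2, c = 3),
          list(b = 5, c = 6),
          list(a = list(5, 6), b = 8),
          list(a = list(7, 8), c = 10))

# All element names that occur in any sublist, in order of first appearance
nms <- unique(unlist(lapply(l, names)))

res <- lapply(l, function(sub) {
  out <- sub[nms]        # names absent from sub come back as NULL entries
  names(out) <- nms      # restore the names lost for those entries
  # replace the NULL entries with NA (single-bracket assignment keeps the slot)
  out[vapply(out, is.null, logical(1))] <- NA
  out
})

str(res[[2]])
```

Single-bracket subsetting keeps one slot per requested name, which is why the NULL entries can be overwritten in place afterwards.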

Answers

1

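Update: We can collect the unique names, then loop through the list and subset each sublist by name. Names that are not in a sublist return NULL, which we replace with NA. This should work for all inputs.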

# data 
l <- list(list(a = list(1,2), b = 2, c = 3), 
     list(b = 5, c = 6), 
     list(a = list(5,6), b = 8), 
     list(a = list(7,8), c = 10)) 

# lapply (not sapply) so the names never get simplified into a matrix 
myNames <- unique(unlist(lapply(l, names))) 

res <- lapply(l, function(i){ 
    x2 <- lapply(myNames, function(j){ 
        x1 <- i[[ j ]]               # NULL when name j is absent 
        if(is.null(x1)){ x1 <- NA} 
        x1 
    }) 
    names(x2) <- myNames 
    x2 
}) 

# check results 
identical(res, 
      #expected output 
      list(list(a = list(1,2), b = 2, c = 3), 
       list(a = NA, b = 5, c = 6), 
       list(a = list(5,6), b = 8, c = NA), 
       list(a = list(7,8), b = NA, c = 10))) 
# [1] TRUE 
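
The same idea can be written without the inner loop by adding only the absent names and then reordering. The sketch below is an alternative, not the answer's own code; fill_missing, all_names and absent are hypothetical names chosen for illustration.

```r
# Hypothetical helper: add every name in all_names that sublist lacks, as NA
fill_missing <- function(sublist, all_names) {
  absent <- setdiff(all_names, names(sublist))
  sublist[absent] <- NA      # no-op when nothing is absent
  sublist[all_names]         # put the elements into a common order
}

# First example from the question (atomic elements)
l1 <- list(list(a = 1, b = 2, c = 3),
           list(a = 4, b = 5, c = 6),
           list(a = 7, b = 8),
           list(a = 9, c = 10))

# Second example from the question (element a may be a list)
l2 <- list(list(a = list(1, 2), b = 2, c = 3),
           list(b = 5, c = 6),
           list(a = list(5, 6), b = 8),
           list(a = list(7, 8), c = 10))

res1 <- lapply(l1, fill_missing, all_names = unique(unlist(lapply(l1, names))))
res2 <- lapply(l2, fill_missing, all_names = unique(unlist(lapply(l2, names))))

str(res2[[2]])
```

Because existing elements are never copied through a data frame or unlisted, list-valued elements such as a stay lists.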

Original: We can convert the sublists to data frames, rbind them with fill for the missing columns, and then split them again:

# data: first example from the question (atomic elements only) 
l <- list(list(a = 1, b = 2, c = 3), 
          list(a = 4, b = 5, c = 6), 
          list(a = 7, b = 8), 
          list(a = 9, c = 10)) 

library(dplyr) 

# convert to dataframe and rbind with fill on missing columns 
# (as_tibble replaces the deprecated as_data_frame) 
x <- bind_rows(lapply(l, as_tibble)) 

# then convert it back to list 
res <- lapply(split(x, seq(nrow(x))), as.list) 

# drop names, we can skip this step if we want to keep names as 1,2,3,4... 
names(res) <- NULL 

# result 
res 

# [[1]] 
# [[1]]$a 
# [1] 1 
# 
# [[1]]$b 
# [1] 2 
# 
# [[1]]$c 
# [1] 3 
# 
# 
# [[2]] 
# [[2]]$a 
# [1] 4 
# 
# [[2]]$b 
# [1] 5 
# 
# [[2]]$c 
# [1] 6 
# 
# 
# [[3]] 
# [[3]]$a 
# [1] 7 
# 
# [[3]]$b 
# [1] 8 
# 
# [[3]]$c 
# [1] NA 
# 
# 
# [[4]] 
# [[4]]$a 
# [1] 9 
# 
# [[4]]$b 
# [1] NA 
# 
# [[4]]$c 
# [1] 10 
+0

Besides 'stringsAsFactors = FALSE', you could change 'data.frame' to 'as_tibble' –

+0

@GioraSimchoni Yes, that is what I just edited, thanks. – zx8754

+0

See the update. Elements are not necessarily 'int's; they can be lists. –

0

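There is surely a better way to do it, but this works for both examples.

res and res2 are the example results you provided.

l.res and l2.res are the results of the code.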

l <- list(list(a = 1, b = 2, c = 3), 
      list(a = 4, b = 5, c = 6), 
      list(a = 7, b = 8), 
      list(a = 9, c = 10)) 

res <- list(list(a = 1, b = 2, c = 3), 
      list(a = 4, b = 5, c = 6), 
      list(a = 7, b = 8, c = NA), 
      list(a = 9, b = NA, c = 10)) 

l2 <- list(list(a = list(1,2), b = 2, c = 3), 
      list(b = 5, c = 6), 
      list(a = list(5,6), b = 8), 
      list(a = list(7,8), c = 10)) 
res2 <- list(list(a = list(1,2), b = 2, c = 3), 
      list(a = NA, b = 5, c = 6), 
      list(a = list(5,6), b = 8, c = NA), 
      list(a = list(7,8), b = NA, c = 10)) 


#vector with 'column names' to be checked 

aux=c("a","b","c") 

#function that checks whether every sublist has all the elements 
#if not, creates the element and assigns it NA 
myfunction <- function(l.list, n.names){ 

    for(i in seq_along(l.list)){ 
        for(j in seq_along(n.names)){ 
            if (!(n.names[j] %in% names(l.list[[i]]))) { 
                l.list[[i]][n.names[j]] <- NA 
                #re-sort the sublist alphabetically by element name 
                l.list[[i]] <- l.list[[i]][order(names(l.list[[i]]))] 
            } 
        } 
    } 

    return(l.list) 
} 

#Applying to example 1 
l.res<-myfunction(l,aux) 

data.frame(l.res) #as a data frame just for comparison purpose 
## a b c a.1 b.1 c.1 a.2 b.2 c.2 a.3 b.3 c.3 
## 1 1 2 3 4 5 6 7 8 NA 9 NA 10 
data.frame(res) 
## a b c a.1 b.1 c.1 a.2 b.2 c.2 a.3 b.3 c.3 
## 1 1 2 3 4 5 6 7 8 NA 9 NA 10 


#Applying to example 2 
l2.res<-myfunction(l2,aux) 

data.frame(l2.res) #as a data frame just for comparison purpose 
## a.1 a.2 b c a b.1 c.1 a.5 a.6 b.2 c.2 a.7 a.8 b.3 c.3 
## 1 1 2 2 3 NA 5 6 5 6 8 NA 7 8 NA 10 
data.frame(res2) 
## a.1 a.2 b c a b.1 c.1 a.5 a.6 b.2 c.2 a.7 a.8 b.3 c.3 
## 1 1 2 2 3 NA 5 6 5 6 8 NA 7 8 NA 10 

Hope it helps.