2015-02-10 68 views
4

R - Merge and melt a list with a data frame

I have a data frame of employees, df_employees:

df_employees <- structure(list(empNo = c(1001, 1002, 1003)), .Names = "empNo", row.names = c(NA, 
    -3L), class = "data.frame") 

> df_employees 
    empNo 
1 1001 
2 1002 
3 1003 

and a list of skills, l_skills:

l_skills <- list(c("skill1", "skill2", "skill3"), c("skill1", "skill2"), 
      "skill1") 


> l_skills 
[[1]] 
[1] "skill1" "skill2" "skill3" 

[[2]] 
[1] "skill1" "skill2" 

[[3]] 
[1] "skill1" 

Question

How can I merge and melt the data to get the resulting data frame df_result?

> df_result 
    empNo skills 
1 1001 skill1 
2 1001 skill2 
3 1001 skill3 
4 1002 skill1 
5 1002 skill2 
6 1003 skill1 

Attempts

I thought I could use an approach similar to this cSplit function, but I get an error when trying to install cSplit:

> install.packages("cSplit") 
Installing package into ‘C:/Users/<username>/Documents/R/win-library/3.1’ 
(as ‘lib’ is unspecified) 
Warning in install.packages : 
    package ‘cSplit’ is not available (for R version 3.1.2) 

What criterion do you use to match 'empNo' and 'skills'? – 2015-02-10 03:51:00


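@MaratTalipov In this particular example I don't have a matching ID, but each list element corresponds to the row of the data frame at the same position. – tospig 2015-02-10 03:52:50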


@MaratTalipov For example, 'df_employees[1,]' corresponds to 'l_skills[[1]]', 'df_employees[2,]' corresponds to 'l_skills[[2]]', and so on. – tospig 2015-02-10 03:55:43

Answers

5

You can use the melt function from the reshape2 package:

library(reshape2) 

L <- l_skills 
names(L) <- df_employees$empNo 

result <- melt(L) 
colnames(result) <- c('skills','empNo') 

result 
# skills empNo 
# 1 skill1 1001 
# 2 skill2 1001 
# 3 skill3 1001 
# 4 skill1 1002 
# 5 skill2 1002 
# 6 skill1 1003 

Base R solution (this returns a character matrix rather than a data frame):

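# SIMPLIFY = FALSE keeps mapply() returning a list even when every employee has the same number of skills
do.call(rbind, mapply(cbind, df_employees$empNo, l_skills, SIMPLIFY = FALSE))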
#  [,1] [,2]  
#[1,] "1001" "skill1" 
#[2,] "1001" "skill2" 
#[3,] "1001" "skill3" 
#[4,] "1002" "skill1" 
#[5,] "1002" "skill2" 
#[6,] "1003" "skill1" 
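
If a data frame with a numeric empNo column is wanted rather than a character matrix, the same idea can be sketched with rep() and unlist(). This is only a minimal sketch, assuming (as stated in the comments on the question) that l_skills[[i]] belongs to row i of df_employees; the names n_skills and df_result_base are introduced here for illustration and are not from the original posts.

```r
# Example data copied from the question
df_employees <- data.frame(empNo = c(1001, 1002, 1003))
l_skills <- list(c("skill1", "skill2", "skill3"), c("skill1", "skill2"), "skill1")

# Assumption: list element i corresponds to data frame row i
stopifnot(length(l_skills) == nrow(df_employees))

# Number of skills per employee (hypothetical helper name)
n_skills <- vapply(l_skills, length, integer(1))

# Repeat each empNo once per skill, then flatten the skills list
df_result_base <- data.frame(
  empNo  = rep(df_employees$empNo, n_skills),
  skills = unlist(l_skills),
  stringsAsFactors = FALSE
)

df_result_base
```

Because empNo is repeated directly from the data frame, it stays numeric and no conversion from character is needed afterwards.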

Perfect, thanks! I suspected I would need to use 'melt' at some point, but couldn't get my head around it. – tospig 2015-02-10 04:01:13

4

You could also use stack from base R:

setNames(stack(setNames(L, df_employees$empNo)), c('skills', 'empNo')) 
# skills empNo 
#1 skill1 1001 
#2 skill2 1001 
#3 skill3 1001 
#4 skill1 1002 
#5 skill2 1002 
#6 skill1 1003 
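
Note that stack() returns the names as a factor, and the column order above is the reverse of df_result in the question. A possible follow-up, sketched here under the assumption that a numeric empNo in the first column is wanted, converts the factor back and reorders the columns; stacked and df_result_stack are illustrative names, not part of the original answer.

```r
# Example data copied from the question
df_employees <- data.frame(empNo = c(1001, 1002, 1003))
l_skills <- list(c("skill1", "skill2", "skill3"), c("skill1", "skill2"), "skill1")

# stack() needs a named list; it returns the columns 'values' and 'ind'
stacked <- stack(setNames(l_skills, df_employees$empNo))

# 'ind' is a factor, so go through character before converting to numeric
df_result_stack <- data.frame(
  empNo  = as.numeric(as.character(stacked$ind)),
  skills = as.character(stacked$values),
  stringsAsFactors = FALSE
)

df_result_stack
```

Calling as.numeric() directly on the factor would return the internal level codes (1, 2, 3) instead of the employee numbers, which is why the intermediate as.character() step is there.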

Or splitstackshape:

library(splitstackshape) 
listCol_l(transform(df_employees, skills=I(L)), 'skills')[] 
# empNo skills_ul 
#1: 1001 skill1 
#2: 1001 skill2 
#3: 1001 skill3 
#4: 1002 skill1 
#5: 1002 skill2 
#6: 1003 skill1 

where

L <- l_skills