2017-08-14
2

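R - Split strings in a list of data frames

I have never worked with a list of data frames in R. It is probably not complicated, but I am stuck right now.

So I have these data frames: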

df1 <- data.frame(v5 = c(0.5,0.6,0.7,0.96),v6 = c("Tiny|Marsian|Worker", "Tiny|Human|Student", "Tiny|Goblin|Soldier", "Tiny|Horse|Guardian")) 
df2 <- data.frame(v5 = c(0.56,0.32,0.55),v6 = c("Tiny|Human|Worker", "Tiny|Marsian|Student", "Tiny|Goblin|Soldier")) 

ldf <- list(df1,df2) 

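Each data frame contains 6 columns (in this example, only 2), and the number of rows differs for each df in the list. Column v6 contains three different pieces of information, each separated by a pipe. What I need to do now is split this information on the pipe and make three separate columns. This is how I would do it for a single df: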

library(stringr) 
split = str_split_fixed(string = df1$v6, pattern = "\\|", n = 3) 

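After this, I want to append the information that now sits in column 2 of the split to the end of the individual data frames in ldf.

I want my data frames to look like this: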

df1 <- data.frame(v5 = c(0.5,0.6,0.7,0.96),
                  v6 = c("Tiny|Marsian|Worker", "Tiny|Human|Student", "Tiny|Goblin|Soldier", "Tiny|Horse|Guardian"),
                  v7 = c("Marsian","Human","Goblin","Horse"))
df2 <- data.frame(v5 = c(0.56,0.32,0.55),
                  v6 = c("Tiny|Human|Worker", "Tiny|Marsian|Student", "Tiny|Goblin|Soldier"),
                  v7 = c("Human", "Marsian", "Goblin"))

How can I achieve this? I have tried a few things, such as

x <- lapply(ldf, `[`, 6) 

but I run into problems when I use the split functions! Please help me.

+0

Thx, included the library 'stringr' in the code –

+1

How did the Tiny Horse Guardian come up? :/ – Sotos

+0

Fixed the Tiny Horse Guardian issue –

Answers

0

With dplyr and purrr:

library('dplyr')
library('purrr')
library('stringr')  # needed for str_split_fixed()
ldf2 <- map(ldf, mutate, v7 = str_split_fixed(string = v6, pattern = "\\|", n = 3)[, 2])

ldf2 

[[1]]
    v5                  v6      v7
1 0.50 Tiny|Marsian|Worker Marsian
2 0.60  Tiny|Human|Student   Human
3 0.70 Tiny|Goblin|Soldier  Goblin
4 0.96 Tiny|Horse|Guardian   Horse

[[2]]
    v5                   v6      v7
1 0.56    Tiny|Human|Worker   Human
2 0.32 Tiny|Marsian|Student Marsian
3 0.55  Tiny|Goblin|Soldier  Goblin

mutate() adds a new column to the data.frame based on the string split, and map() applies this mutate() to each element of ldf.
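If you would rather not load any packages, the same idea can be sketched in base R with lapply() and strsplit(). This is only a sketch: the helper name add_middle_field and the result name ldf2_base are made up for illustration, and the sample data is created with stringsAsFactors = FALSE so that v6 is a plain character column.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps v6 as character
df1 <- data.frame(v5 = c(0.5, 0.6, 0.7, 0.96),
                  v6 = c("Tiny|Marsian|Worker", "Tiny|Human|Student",
                         "Tiny|Goblin|Soldier", "Tiny|Horse|Guardian"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(v5 = c(0.56, 0.32, 0.55),
                  v6 = c("Tiny|Human|Worker", "Tiny|Marsian|Student",
                         "Tiny|Goblin|Soldier"),
                  stringsAsFactors = FALSE)
ldf <- list(df1, df2)

# Hypothetical helper: adds a column v7 holding the second pipe-separated field
add_middle_field <- function(df) {
  # fixed = TRUE treats "|" literally, so no regex escaping is needed
  parts <- strsplit(as.character(df$v6), "|", fixed = TRUE)
  df$v7 <- vapply(parts, function(p) p[2], character(1))
  df
}

# Apply the helper to every data frame in the list
ldf2_base <- lapply(ldf, add_middle_field)
```

lapply() plays the role of map() here, and the helper plays the role of the mutate() call.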

Edit:

If you want three different columns, you should use tidyr::separate:

library('tidyr')  # provides separate()
ldf2 <- map(ldf, separate, col = 'v6', into = c('Col1', 'Col2', 'Col3'), sep = '\\|')
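Note that separate() drops the original v6 column by default. If you want to keep v6 next to the three new columns, a possible variant is to pass remove = FALSE. The following is a sketch under that assumption; the result name ldf3 is made up, and lapply() is used in place of map() only to keep the dependencies down to tidyr.

```r
library(tidyr)

# Sample data from the question; stringsAsFactors = FALSE keeps v6 as character
df1 <- data.frame(v5 = c(0.5, 0.6, 0.7, 0.96),
                  v6 = c("Tiny|Marsian|Worker", "Tiny|Human|Student",
                         "Tiny|Goblin|Soldier", "Tiny|Horse|Guardian"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(v5 = c(0.56, 0.32, 0.55),
                  v6 = c("Tiny|Human|Worker", "Tiny|Marsian|Student",
                         "Tiny|Goblin|Soldier"),
                  stringsAsFactors = FALSE)
ldf <- list(df1, df2)

# remove = FALSE keeps v6 alongside the three new columns
ldf3 <- lapply(ldf, function(df) {
  separate(df, v6, into = c("Col1", "Col2", "Col3"),
           sep = "\\|", remove = FALSE)
})

names(ldf3[[1]])
```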
+0

Works perfectly as expected. –

0

You can do it like this, with lapply, tidyr::separate and do.call:

library(dplyr)  # provides %>% and select()

combinedDF <- do.call(rbind, lapply(ldf, function(x) {
  x %>%
    tidyr::separate(v6, c("v70", "v7", "v72"), sep = "\\|", remove = FALSE) %>%
    dplyr::select(-c(v70, v72))
}))

Without lapply/rbind (thanks @Sotos):

combinedDF <- bind_rows(ldf) %>%
  tidyr::separate(v6, c("v70", "v7", "v72"), sep = "\\|", remove = FALSE) %>%
  select(-c(v70, v72))


combinedDF
#     v5                   v6      v7
# 1 0.50  Tiny|Marsian|Worker Marsian
# 2 0.60   Tiny|Human|Student   Human
# 3 0.70  Tiny|Goblin|Soldier  Goblin
# 4 0.96  Tiny|Horse|Guardian   Horse
# 5 0.56    Tiny|Human|Worker   Human
# 6 0.32 Tiny|Marsian|Student Marsian
# 7 0.55  Tiny|Goblin|Soldier  Goblin
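If the result should end up as a list again rather than one big data frame, one possible approach is to tag each row with its origin while binding and split on that tag afterwards. This is a rough sketch, not part of the answer above: the column name source and the result names combined and ldf_again are invented for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data from the question; stringsAsFactors = FALSE keeps v6 as character
df1 <- data.frame(v5 = c(0.5, 0.6, 0.7, 0.96),
                  v6 = c("Tiny|Marsian|Worker", "Tiny|Human|Student",
                         "Tiny|Goblin|Soldier", "Tiny|Horse|Guardian"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(v5 = c(0.56, 0.32, 0.55),
                  v6 = c("Tiny|Human|Worker", "Tiny|Marsian|Student",
                         "Tiny|Goblin|Soldier"),
                  stringsAsFactors = FALSE)
ldf <- list(df1, df2)

# .id = "source" records which list element each row came from
combined <- bind_rows(ldf, .id = "source") %>%
  separate(v6, c("v70", "v7", "v72"), sep = "\\|", remove = FALSE) %>%
  select(-c(v70, v72))

# Split back into one data frame per original list element,
# dropping the helper column first
ldf_again <- split(select(combined, -source), combined$source)
```

This does the string split only once on the combined data, at the cost of an extra helper column.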
+0

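If you intend the final result to be one big data frame (which does not seem to be what the OP wants), then you should use 'bind_rows', i.e. 'bind_rows(ldf) %>% separate(v6, c("col1", "col2", "col3"), sep = "\\|", remove = FALSE) %>% select(-c(col1, col3))'. I also added another select statement to remove the unwanted columns – Sotos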

+0

Thanks, included in the edit – OdeToMyFiddle