R: split a column into multiple columns by pattern

I want to separate the letters from the numbers in one column of a data frame d.df:

col1 
ab 12 14 56 
xb 23 234 2342 2 
ad 23 45

Expected output:

col1 col2 
ab  12 14 56 
xb  23 234 2342 2 
ad  23 45

I assume it will be something like the following, but I don't know which separator to use:

t <- as.data.frame(str_match(d$col1,"^(.*)"))

I have tried many approaches, but the output I get is:

col1  col2  
a   b 12 14 56 
x   b 23 234 2342 2 
a   d 23 45

Source

2015-08-09 Lucia

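The right approach depends a lot on whether this is what your strings really look like or just an example. If they are always two letters followed by numbers, you can use substring: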

> df <- data.frame(col1 = c("ab 12 14 56", "xb 23 234 2342 2", "ad 23 45"),
+                  stringsAsFactors = FALSE)
> # substring() is vectorized, so sapply() is not needed
> df$col1.1 <- substring(df$col1, 1, 2)  # the first two characters
> df$col1.2 <- substring(df$col1, 4)     # everything after the separating space
> df
              col1 col1.1        col1.2
1      ab 12 14 56     ab      12 14 56
2 xb 23 234 2342 2     xb 23 234 2342 2
3         ad 23 45     ad         23 45

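If the lengths and positions within the strings vary, a regular expression is probably a better fit. Using base R, you can extract only the letters or only the digits (keeping the spaces):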

> df <- data.frame(col1 = c("ab 12 14 56", "xb 23 234 2342 2", "ad 23 45"),
+                  stringsAsFactors = FALSE)
> df$col1.1 <- sapply(regmatches(df$col1, gregexpr("[a-zA-Z]", df$col1)), paste, collapse = "")
> df$col1.2 <- sapply(regmatches(df$col1, gregexpr("[0-9]\\s*", df$col1)), paste, collapse = "")
> df
              col1 col1.1        col1.2
1      ab 12 14 56     ab      12 14 56
2 xb 23 234 2342 2     xb 23 234 2342 2
3         ad 23 45     ad         23 45
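
If every string is expected to follow one layout (letters, whitespace, then digits and spaces), a single anchored pattern with two capture groups can do both extractions and also flag rows that do not fit. This is only a sketch under that assumption; the column names letters_part and numbers_part are made up for illustration.

```r
# Same sample data as above.
df <- data.frame(col1 = c("ab 12 14 56", "xb 23 234 2342 2", "ad 23 45"),
                 stringsAsFactors = FALSE)

# Assumed layout: one or more letters, whitespace, then digits and spaces.
pattern <- "^([A-Za-z]+)\\s+([0-9 ]+)$"

# Rows that do not match the assumed layout become NA instead of being mangled.
ok <- grepl(pattern, df$col1)

# \\1 is the letter group, \\2 is the numeric group (hypothetical column names).
df$letters_part <- ifelse(ok, sub(pattern, "\\1", df$col1), NA_character_)
df$numbers_part <- ifelse(ok, trimws(sub(pattern, "\\2", df$col1)), NA_character_)
```

Unlike the gregexpr() version, which collects letters and digits from anywhere in the string, this one only accepts strings where the letters come first.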

Source

2015-08-09 04:24:26 Molx

It works! Thanks! – Lucia

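You can use separate from tidyr.

library(tidyr)
# sample data from the question
d.df <- data.frame(col1 = c("ab 12 14 56", "xb 23 234 2342 2", "ad 23 45"))
d.df %>% separate(col1, c("col1", "col2"), sep = "(?<=[a-z]{2}) ")
#   col1          col2
# 1   ab      12 14 56
# 2   xb 23 234 2342 2
# 3   ad         23 45

The regex "(?<=[a-z]{2}) " uses a look-behind and means "split the string at the space that comes right after two lowercase letters". tidyr seems to have a limit on the length of a look-behind (the underlying regex engine needs it to be bounded), so {2} is used to state the number of letters explicitly.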
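
To see what the look-behind actually matches, here is a small base R sketch (not part of the tidyr answer, and the variable names are just for illustration). It assumes the separator is a single space after exactly two lowercase letters. The look-behind only checks what precedes the space; the match itself is the space alone, so the letters are left intact.

```r
x <- c("ab 12 14 56", "xb 23 234 2342 2", "ad 23 45")

# Locate the first space that is preceded by two lowercase letters.
# perl = TRUE is required for look-behind support in base R.
m <- regexpr("(?<=[a-z]{2}) ", x, perl = TRUE)

# The match starts at the space and is one character long;
# the two letters are asserted but not consumed.
start <- as.integer(m)
len <- attr(m, "match.length")

# Cut each string around the matched space.
left <- substr(x, 1, start - 1)
right <- substr(x, start + len, nchar(x))
```

Here left holds the letter part and right holds the numeric part of each string.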

Source

2015-08-09 04:15:52 jenesaisquoi

Here is an option using data.table.

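library(data.table) # v1.9.5+
# sample data from the question
df1 <- data.frame(col1 = c("ab 12 14 56", "xb 23 234 2342 2", "ad 23 45"))
setnames(setDT(df1)[, tstrsplit(col1,
     '(?<=[^0-9]) (?=[0-9])', perl=TRUE)], paste0('col', 1:2))[]
#    col1          col2
# 1:   ab      12 14 56
# 2:   xb 23 234 2342 2
# 3:   ad         23 45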

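We convert the 'data.frame' to a 'data.table' (setDT(df1)). Then tstrsplit, available in the development version of 'data.table' (v1.9.5) at the time of writing, splits 'col1' on the one space that follows the letters and precedes the numeric part. The regex look-arounds (?<=[^0-9]) and (?=[0-9]) pick out that space.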
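
For readers without data.table, the same idea can be roughly sketched in base R: split each string with the look-around pattern, then turn the list of pieces "sideways" into columns, which is approximately the job tstrsplit does. This is an illustration only and assumes every string splits into exactly two pieces.

```r
x <- c("ab 12 14 56", "xb 23 234 2342 2", "ad 23 45")

# Split on the space that has a non-digit before it and a digit after it.
pieces <- strsplit(x, "(?<=[^0-9]) (?=[0-9])", perl = TRUE)

# Assumption: each string yields exactly two pieces.
stopifnot(all(lengths(pieces) == 2))

# Stack the pieces row by row, so each column holds one part of every string.
mat <- do.call(rbind, pieces)
out <- data.frame(col1 = mat[, 1], col2 = mat[, 2], stringsAsFactors = FALSE)
```

Only the first space in each string qualifies, because every later space has a digit in front of it.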

Source

2015-08-09 06:49:56 akrun
