How to create a hash from a data frame in R

Given the following data (myinput.txt):

A q,y,h 
B y,f,g 
C n,r,q 
### more rows

How can I convert it into this data structure in R?

$A 
[1] "q" "y" "h" 
$B 
[1] "y" "f" "g" 
$C 
[1] "n" "r" "q"

Source

2013-02-15 neversaint

I assume this is your data:

dat <- read.table(text="q,y,h
y,f,g
n,r,q", header=FALSE, sep=",", row.names=c("A", "B", "C"))

If you want an automatic approach:

as.list(as.data.frame((t(dat)), stringsAsFactors=FALSE)) 

## $A 
## [1] "q" "y" "h" 
## 
## $B 
## [1] "y" "f" "g" 
## 
## $C 
## [1] "n" "r" "q"

A couple of other methods that also work:

lapply(apply(dat, 1, list), "[[", 1) 

unlist(apply(dat, 1, list), recursive=FALSE)
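A quick way to convince yourself that the three variants agree is to run them side by side. This is only a sketch: the variable names `by_df`, `by_lapply`, and `by_unlist` are made up for illustration, and `stringsAsFactors = FALSE` is passed explicitly so the result does not depend on the R version. One difference worth knowing is that the `apply()`-based versions keep the column names (V1, V2, V3) on each vector, so they are dropped with `unname()` before comparing.

```r
dat <- read.table(text = "q,y,h
y,f,g
n,r,q", header = FALSE, sep = ",", row.names = c("A", "B", "C"),
                  stringsAsFactors = FALSE)

# Variant 1: transpose, turn into a data frame, then into a list of columns
by_df <- as.list(as.data.frame(t(dat), stringsAsFactors = FALSE))

# Variants 2 and 3: wrap each row in a list, then unwrap one level
by_lapply <- lapply(apply(dat, 1, list), "[[", 1)
by_unlist <- unlist(apply(dat, 1, list), recursive = FALSE)

# apply() leaves the column names on each vector; strip them for comparison
by_lapply <- lapply(by_lapply, unname)
by_unlist <- lapply(by_unlist, unname)

str(by_df)
```

If the names on the individual vectors do not bother you, the `unname()` step can be skipped.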

Source

2013-02-15 04:22:08

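@Sebastian-C: Thanks a lot. Is there a way to make 'dat' pick up the row.names automatically, i.e. without assigning them by hand? – neversaint 2013-02-15 04:41:09

@neversaint I only did that to recreate your data. I should have used 'row.names = 1', so an example would be: 'read.csv("dat.csv", row.names = 1)'. You may also want to add 'colClasses = "character"' or 'stringsAsFactors = FALSE' to 'read.table'. – 2013-02-15 04:45:56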
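For what it's worth, here is a minimal sketch of the `row.names = 1` suggestion applied to the layout shown in the question (a name, whitespace, then comma-separated values). The inline `txt` string is just a stand-in for myinput.txt, and `result` is an illustrative name. Because the default separator of `read.table` is whitespace, the values arrive as a single comma-joined column that still has to be split.

```r
# Inline stand-in for myinput.txt (assumed layout: name, whitespace, values)
txt <- "A q,y,h
B y,f,g
C n,r,q"

# The first column becomes the row names; the remaining column holds "q,y,h" etc.
dat <- read.table(text = txt, header = FALSE, row.names = 1,
                  colClasses = "character")

# Split the comma-joined values and carry the row names over as list names
result <- setNames(strsplit(dat[[1]], ",", fixed = TRUE), rownames(dat))
str(result)
```

With a real file you would pass the path as the first argument instead of `text = txt`.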

Using a bit of readLines, strsplit, and a regular expression to split the name off the start of each line:

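dat <- readLines(textConnection("A q,y,h
B y,f,g
C n,r,q"))

# Split on any run of whitespace or on commas, then drop the leading name
result <- lapply(strsplit(dat,"\\s+|,"),function(x) x[-1])
# The name is everything before the first run of whitespace
names(result) <- sub("^(\\S+)\\s+.*$","\\1",dat)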

> result 
$A 
[1] "q" "y" "h" 

$B 
[1] "y" "f" "g" 

$C 
[1] "n" "r" "q"

Or with less regex and more steps:

result <- strsplit(dat,"\\s+|,")
# The first field of each line is the name
names(result) <- sapply(result,"[",1)
# Keep everything after the name
result <- lapply(result,function(x) x[-1])

> result 
$A 
[1] "q" "y" "h" 

$B 
[1] "y" "f" "g" 

$C 
[1] "n" "r" "q"
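If you need this for the actual file rather than an inline string, the same idea can be wrapped in a small function. This is a sketch under assumptions: `read_hash` is a hypothetical helper name, the temporary file merely stands in for myinput.txt, and the separator between the name and the values is assumed to be any run of whitespace. The last two lines show one way to get hashed lookup by key, by copying the list into an environment.

```r
# read_hash() is a hypothetical helper, not part of base R
read_hash <- function(path) {
  lines <- trimws(readLines(path))
  lines <- lines[nzchar(lines)]                 # skip blank lines
  fields <- strsplit(lines, "\\s+|,")           # name first, then the values
  keys <- vapply(fields, `[`, character(1), 1)  # first field of each line
  setNames(lapply(fields, `[`, -1), keys)       # everything after the name
}

# Write the sample rows to a temporary file standing in for myinput.txt
path <- tempfile(fileext = ".txt")
writeLines(c("A q,y,h", "B y,f,g", "C n,r,q"), path)

h <- read_hash(path)
h[["B"]]             # look up the values stored under one key
"q" %in% h[["C"]]    # test whether a value occurs under a key

# Optional: copy the list into a hashed environment for lookup by name
e <- list2env(h, envir = new.env(hash = TRUE))
get("A", envir = e)
```

A plain named list is usually enough for a few hundred keys; the environment only starts to pay off when there are many keys and lookups.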

Source

2013-02-15 05:06:52 thelatemail
