2014-10-06 80 views
R: Tree of overlapping strings

My data looks like the following matrix:

verkoop   V621 
verkoopcode  V62123 
verkoopcodenaam V6212355 
verkoopdatum  V621335 
verkoopdatumchar V62133526 
verkooppr  V6216 
verkoopprijs  V62162 
verkoopsafdeling V621213452 
verkoopsartikel V62126324 

Now I would like to build a tree in R like this:

V621 --> V62123 --> V6212355 
     --> V621335 --> V62133526 
     --> V6216 --> V62162 
     --> V621213452 
     --> V62126324 

Or something similar, so that the overlapping substrings in the names are taken into account.

Answer

You can use the minimum.spanning.tree function from the igraph package to create such a tree.

# load data 
df <- read.table(text='verkoop   V621 
verkoopcode  V62123 
verkoopcodenaam V6212355 
verkoopdatum  V621335 
verkoopdatumchar V62133526 
verkooppr  V6216 
verkoopprijs  V62162 
verkoopsafdeling V621213452 
verkoopsartikel V62126324', stringsAsFactors=FALSE)
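# use igraph package
library(igraph)
# create adjacency matrix: adj[i, j] is the number of characters left in
# name i after removing name j from it
adj <- nchar(sapply(df$V1, gsub, x=df$V1, replacement=''))
# keep a weight only where name j actually occurs in name i
adj[!sapply(df$V1, grepl, x=df$V1)] <- 0
# name adjacency matrix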
colnames(adj) <- df$V2 
# original graph 
gr <- graph.adjacency(adj, mode='directed', weighted=TRUE) 
# minimum spanning tree (newer igraph versions call these functions
# graph_from_adjacency_matrix() and mst())
mst <- minimum.spanning.tree(gr) 
# e.g. for graphical representation 
plot(mst, vertex.size=40)
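
If you only need the parent/child pairs and not a plot, the same idea can be sketched in base R without igraph. This is a minimal sketch under one assumption that differs from the code above: it treats a name as a parent only when it is a prefix of the child (via startsWith), and not when it occurs anywhere inside it. That holds for the sample data but may not hold for yours. The helper find_parent and the column names name and code are made up for this illustration.

```r
# Sample data from the question; the column names are chosen for this sketch
df <- read.table(text = "verkoop V621
verkoopcode V62123
verkoopcodenaam V6212355
verkoopdatum V621335
verkoopdatumchar V62133526
verkooppr V6216
verkoopprijs V62162
verkoopsafdeling V621213452
verkoopsartikel V62126324",
  col.names = c("name", "code"), stringsAsFactors = FALSE)

# Hypothetical helper: return the longest other name that is a prefix of x,
# or NA when there is none (x is then a root of the tree)
find_parent <- function(x, candidates) {
  is_prefix <- startsWith(x, candidates) & candidates != x
  if (!any(is_prefix)) return(NA_character_)
  hits <- candidates[is_prefix]
  hits[which.max(nchar(hits))]
}

parent_name <- vapply(df$name, find_parent, character(1),
                      candidates = df$name, USE.NAMES = FALSE)

# Edge list expressed in codes: one row per parent --> child link
edges <- data.frame(from = df$code[match(parent_name, df$name)],
                    to = df$code,
                    stringsAsFactors = FALSE)
edges <- edges[!is.na(edges$from), ]
print(edges, row.names = FALSE)
```

Each row of edges is one arrow of the tree drawn in the question, for example V621 --> V62123 and V62123 --> V6212355. Picking the longest prefix plays the role of the minimum spanning tree here: it links every name to its closest ancestor.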