R string parsing challenge

I am working with a column that contains strings like the following:

 Col1 
     ------------------------------------------------------------------ 
     Department of Mechanical Engineering, Department of Computer Science 
     Division of Advanced Machining, Center for Mining and Metallurgy 
     Department of Aerospace, Center for Science and Delivery

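What I am trying to do is pull out every substring that starts with Department, Division, or Center and runs up to the comma (the separator), and give each one its own column. The final output should look like this: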

 Dept_Mechanical_Eng  Dept_Computer_Science  Div_Adv_Machining  Cntr_Mining_Metallurgy  Dept_Aerospace  Cntr_Science_Delivery
                   1                      1                  0                       0               0                      0
                   0                      0                  1                       1               0                      0
                   0                      0                  0                       0               1                      1

I butchered the actual names in the expected output for aesthetic purposes. Any help parsing this string is much appreciated.

Source

2016-04-27 Lilla Bulten

`library(splitstackshape); cSplit_e(mydf, "Col1", ",", type = "character", drop = TRUE, fill = 0)`. You could also look at `strsplit` + `mtabulate` from "qdapTools". – A5C1D2H2I1M1N2O1R2T1
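
The split-then-tabulate idea in this comment can also be sketched in base R, without either package. This is only an illustration, not the `mtabulate` call itself: the data frame `mydf` and column `Col1` follow the comment's naming, the sample rows are copied from the question, and `parts`, `all_names`, and `result` are names made up for the example.

```r
# Sample data from the question; `mydf` and `Col1` follow the comment's naming
mydf <- data.frame(
  Col1 = c(
    "Department of Mechanical Engineering, Department of Computer Science",
    "Division of Advanced Machining, Center for Mining and Metallurgy",
    "Department of Aerospace, Center for Science and Delivery"
  ),
  stringsAsFactors = FALSE
)

# Split each row on commas and trim the surrounding whitespace
parts <- lapply(strsplit(mydf$Col1, ",", fixed = TRUE), trimws)

# Every distinct name, in order of first appearance
all_names <- unique(unlist(parts))

# One row per input string, one 0/1 column per name
indicators <- t(vapply(
  parts,
  function(p) as.integer(all_names %in% p),
  integer(length(all_names))
))
colnames(indicators) <- all_names

result <- as.data.frame(indicators)
print(result)
```

Because each row is compared against the split pieces with `%in%`, a name only counts when it matches a whole piece, so one name being a substring of another would not produce a false 1.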

This is very similar to a question I just answered using another text example. Are you and that asker in the same class? Count the number of times (frequency) a string occurs

inp <- "Department of Mechanical Engineering, Department of Computer Science 
     Division of Advanced Machining, Center for Mining and Metallurgy 
     Department of Aerospace, Center for Science and Delivery" 
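# Split on commas and newlines; the six distinct names become the factor levels
inp2 <- factor(scan(text=inp,what="",sep=","))
#Read 6 items
# One element per input row
inp3 <- readLines(textConnection(inp))

# For each name, flag the rows that contain it (1) or not (0); output follows
as.data.frame(setNames(lapply(levels(inp2), function(ll) as.numeric(grepl(ll, inp3))), trimws(levels(inp2))))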
    Department.of.Aerospace Division.of.Advanced.Machining 
1      0        0 
2      0        1 
3      1        0 
    Center.for.Mining.and.Metallurgy Center.for.Science.and.Delivery 
1        0        0 
2        1        0 
3        0        1 
    Department.of.Computer.Science Department.of.Mechanical.Engineering 
1        1         1 
2        0         0 
3        0         0
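
The columns above carry the dotted names that `as.data.frame` generates, whereas the question's expected output used shorter underscore-style names. One way to get something similar is a small renaming helper. The sketch below is an assumption-laden illustration: `shorten_name` is a hypothetical function, and its abbreviation rules (Dept, Div, Cntr, dropping "and") are guessed from the header in the question rather than taken from the answer.

```r
# Full names as they appear in the question's data
full_names <- c(
  "Department of Mechanical Engineering",
  "Department of Computer Science",
  "Division of Advanced Machining",
  "Center for Mining and Metallurgy",
  "Department of Aerospace",
  "Center for Science and Delivery"
)

# Hypothetical helper: abbreviate the leading unit type, drop "and",
# and join the remaining words with underscores
shorten_name <- function(x) {
  x <- sub("^Department of ", "Dept ", x)
  x <- sub("^Division of ", "Div ", x)
  x <- sub("^Center for ", "Cntr ", x)
  x <- gsub(" and ", " ", x, fixed = TRUE)
  gsub(" ", "_", x, fixed = TRUE)
}

short_names <- shorten_name(full_names)
print(short_names)
```

For a data frame whose columns are the full names, `names(df) <- shorten_name(names(df))` would apply the same renaming. The dotted names printed above would first need their dots turned back into spaces.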

Source

2016-04-27 04:56:24

Ah :) Thanks 42, that works.
